Motherboard Forums



SB 2000 or SB 2500: Which would you buy?

Benjamin Gawert
09-22-2009, 05:22 PM


* Bart:
> Can you point me to any documentation that explains this any better?
> I've never heard of this about the 1k/2k blades. Most any multiprocess
> system architecture I am familiar with has some overhead with the
> memory controller.


Standard multiprocessor computers have a common memory controller and
common memory. This is called UMA (Uniform Memory Access) architecture.
It looks like this:

[I/O]
|
[CPU0]---[CPU1]
|
[MEMORY CONTROLLER]-[MEMORY]

Good examples of UMA machines are most servers and workstations with
older Intel Xeon processors (before the Xeon 5500 series). These Xeons
use a shared FSB (or, with the Xeon 5000 series, separate FSBs) to
communicate with an external memory controller (the northbridge). This
means memory performance is the same for every CPU, no matter which CPU
does the access and no matter which area of system memory is accessed.
However, because the memory controller sits outside the CPU, accessing
memory takes longer (higher latency), and especially on older Xeons
with a single shared FSB, the FSB limits the bandwidth actually
available to system memory.


However, the UltraSPARC III-based Sun machines like the SB1000/2000/2500
as well as AMD Opteron-based computers use a NUMA[1] (Non-Uniform Memory
Access) architecture. NUMA means that every processor has its own memory
controller (which in the case of UltraSPARC III and AMD Opteron is built
into the CPU) and its own local memory. NUMA looks like this:

[I/O]
|
[CPU0]-[MEMORY CONTROLLER]-[MEMORY]
|
[CPU1]-[MEMORY CONTROLLER]-[MEMORY]
|
[I/O]

The advantage is that every CPU has very fast access to its local RAM
(low latency) and doesn't have to share that bandwidth with the other
processor. However, as soon as a CPU has to access memory connected to
another CPU, things get much slower because it has to go through the
other processor to reach that memory. Memory performance now depends on
which part of system memory is accessed: if it is local it is fast, if
it is connected to another processor it is slow. NUMA therefore needs a
NUMA-aware OS (like Solaris, Windows or Linux) that schedules processes
and allocates memory so that each process uses RAM connected to the
processor it runs on.
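
To put rough numbers on the local/remote distinction described above,
here is a minimal Python sketch. The latency figures and the function
name are assumptions picked for easy arithmetic, not measurements from
any Sun, AMD or Intel machine.

```python
def average_latency_ns(local_fraction, local_ns, remote_ns):
    """Expected latency of one memory access when a given fraction is local."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Hypothetical latencies, chosen only to make the arithmetic easy to follow.
LOCAL_NS = 100.0
REMOTE_NS = 200.0

for fraction in (1.0, 0.5, 0.0):
    avg = average_latency_ns(fraction, LOCAL_NS, REMOTE_NS)
    print(f"{fraction:.0%} local -> {avg:.0f} ns average")
```

With these made-up figures, a process whose memory is entirely remote
sees twice the latency of one whose memory is entirely local, which is
why a NUMA-aware OS tries to keep the local fraction high.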

As to the Sun Blade 1000/2000/2500, it is a crippled NUMA system which
basically looks like this:

[I/O]
|
[CPU0]-[MEMORY CONTROLLER]-[MEMORY]
|
[CPU1]-[MEMORY CONTROLLER]

While both processors have memory controllers, only the first CPU can
actually have physical RAM attached. This means all processes running on
the second processor have to go through the primary one to access RAM,
as the second CPU has no local memory. This has quite a big impact on
memory-intensive multiprocessor applications.
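
The one-sided layout described above can be sketched as a small lookup
table. This is illustrative only: the bank names and both layouts are
hypothetical, and the code simply counts how much of the RAM is directly
attached to each CPU.

```python
# Which CPU's memory controller each RAM bank hangs off (hypothetical layouts).
FULL_NUMA = {"cpu0": ["bank0"], "cpu1": ["bank1"]}
ONE_SIDED = {"cpu0": ["bank0", "bank1"], "cpu1": []}


def local_share(topology, cpu):
    """Fraction of all RAM banks that are directly attached to `cpu`."""
    total = sum(len(banks) for banks in topology.values())
    return len(topology[cpu]) / total


for name, topo in (("full NUMA", FULL_NUMA), ("one-sided", ONE_SIDED)):
    for cpu in sorted(topo):
        print(f"{name}: {cpu} has {local_share(topo, cpu):.0%} of RAM local")
```

In the one-sided case every access from the second CPU is a remote
access, regardless of how cleverly the OS places processes.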

Ben





[1] http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access

 
ChrisQ
09-22-2009, 08:54 PM
Benjamin Gawert wrote:

> While both processors do have memory controllers, only the first CPU can
> actually have physical RAM. This means all processes running on the
> second processor have to go over the primary one to access RAM as the
> second CPU doesn't have local memory. This has quite a huge impact on
> memory-intensive multiprocessor applications.
>
> Ben
>


That's interesting, but I doubt it would have much of an impact on the
sort of work done by many users, other than seriously compute-intensive
applications. A B1000 is used here both as the lab server and for
software development. I recently removed one of the CPUs to save power
and noticed no difference at all in interactive response or compilation
times. Juice is expensive now, so why pay to use more if there's little
benefit? The Blade 1k and 2k series are quite power hungry (~200 watts)
compared to earlier SPARC workstation-class machines.

On Windows, using Task Manager on a dual-Xeon machine, the only apps
that really make use of multithreading or the two CPUs are those like
Photoshop or Lightroom, but even then the processors are just loafing
along. Might just as well remove one of the CPUs there as well...

Regards,

Chris
 
Benjamin Gawert
09-23-2009, 06:30 AM
* erik magnuson:

> Are you sure that applies to the SB-2500? The max RAM for the SB-2500
> was twice that of the SB-1500. Bear in mind that the SB1k/2k used the
> US-III processor, while the SB-2500 used the US-IIIi processor.


I checked again and you are right: this doesn't apply to the SB2500,
which has both CPUs connected to local memory (with only one CPU
installed, only half of the memory slots can be used).

Benjamin
 
Benjamin Gawert
09-23-2009, 06:35 AM
* ChrisQ:

> That's interesting, but I doubt that would have much of an impact on the
> sort of work done by many users, other than for seriously compute
> intensive applications.


It affects all multiprocessor (multithreaded) applications that use
a lot of memory, and it affects single-threaded applications as
well if the scheduler moves them to the second processor.

But considering that these machines are now close to a decade old and
really slow by now, I doubt this is a problem any more, as the overall
performance of these critters is low anyway. Also, the memory interface
of the US-III is not great.

> A B1000 is used here both as the lab server and
> also for software development. Recently removed one of the cpu's to save
> power and notice no difference at all in terms of interactive response
> or compilation times. Juice is expensive now, so why pay to use more if
> there's little benefit ?. The blade 1 and 2k series are quite power
> hungry (~200 watts) compared to earlier Sparc workstation class machines.


If power is a concern then an SB1000/2000 is probably a bad choice.

> On windows, using task manager on a dual xeon machine, the only apps
> that really make use of mt or the two cpu's are those like photoshop or
> lightroom, but even then the processors are really just loafing along.
> Might just as well remove one of the cpu's there as well...


Don't know about Lightroom but IIRC Photoshop only uses multiple
processors for certain filters.

It doesn't seem any of your applications make use of multiple
processors, so you might as well just use single processor machines
which use less power.

Benjamin
 
ChrisQ
09-23-2009, 04:01 PM
Benjamin Gawert wrote:

>
> But considering that these machines are now becoming close to a decade
> old and being really slow now I doubt this is a problem any more as the
> overall performance of these critters is low. Also, the memory interface
> of the USIII is not great.


It's still the fastest SPARC-based machine here. At last, a SPARC
machine that equals my old 1997-vintage Alpha box in terms of
interactive response and compile times. It's just a real shame Sun
stopped doing SPARC-based workstations. Diversity improves the breed and
the CPU gene pool is shrinking fast.

> It doesn't seem any of your applications make use of multiple
> processors, so you might as well just use single processor machines
> which use less power.
>


The second CPU can always be refitted if the need arises, and I would
probably get more benefit for my present work from upgrading to a single
1200 MHz CPU than from a dual 900 MHz config.
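
Whether a single 1200 MHz CPU beats two 900 MHz CPUs depends on how much
of the workload can run in parallel. A rough Amdahl's-law sketch,
assuming performance scales linearly with clock speed and ignoring any
memory-access penalty on the second CPU (both are simplifications, and
the function names are made up for illustration):

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Amdahl's law: speedup over one CPU for a partly parallel workload."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


def effective_mhz(clock_mhz, n_cpus, parallel_fraction):
    """Clock of a single CPU that would deliver the same throughput.

    Assumes throughput is proportional to clock speed, which is only a
    crude approximation for real workloads.
    """
    return clock_mhz * amdahl_speedup(parallel_fraction, n_cpus)


for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    dual = effective_mhz(900, 2, p)
    single = effective_mhz(1200, 1, p)
    print(f"parallel fraction {p:.2f}: dual 900 ~ {dual:.0f} MHz, "
          f"single 1200 ~ {single:.0f} MHz")
```

Under these assumptions the break-even point is a workload that is 50%
parallel; below that, the single faster CPU comes out ahead.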

What other options are there, without mentioning x86, of course :-)...

Regards,

Chris
 
Bart
11-18-2009, 06:49 PM


Here is a document I found that provides the details on the Sun Blade
1000/2000 memory and CPU architecture (starts on page 30).

http://ru.sun.com/products/workstati...s/sb1000wp.pdf

Apparently the US-III Cu has a built-in memory controller, and
multiprocessor systems use a bus arbitration system. I can see how
this might have a small impact on performance, but the obvious benefit
is for scaling, so I can also see why Sun chose this path. It provides
direct access for one CPU (because the US-III has a memory controller
built in) while subsequent processors use the shared interconnect bus
(Sun called it Fireplane, I guess).
 
Thomas Maier-Komor
11-18-2009, 09:58 PM
Benjamin Gawert wrote:
> As to the Sun Blade 1000/2000/2500, it is a crippled NUMA system which
> basically looks like this:
>
> [I/O]
> |
> [CPU0]-[MEMORY CONTROLLER]-[MEMORY]
> |
> [CPU1]-[MEMORY CONTROLLER]
>
> While both processors do have memory controllers, only the first CPU can
> actually have physical RAM. This means all processes running on the
> second processor have to go over the primary one to access RAM as the
> second CPU doesn't have local memory. This has quite a huge impact on
> memory-intensive multiprocessor applications.
>
> Ben
>
>
>
>
>
> [1] http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access
>



If you take a look at the block diagram of the Blade 2500 in
817-5117-11.pdf, page C-3, you'll see that each CPU is connected to
two memory banks of its own and that both communicate with two I/O
bridges over a common J-Bus.

- Thomas
 
Benjamin Gawert
11-18-2009, 10:59 PM
* Bart:

> Here is a document I found that provides the details on the Sunblade
> 1000/2000 memory and CPU architecture (Starts on Page 30).
>
> http://ru.sun.com/products/workstati...s/sb1000wp.pdf
>
> Apparently the US IIIcu has a built in memory controller and
> multiprocessor systems use a buss arbitration system. I can see how
> this might have a small impact in performance but the obvious benefit
> is for scaling


Nope, it has no benefit "for scaling". In fact, it is a huge bottleneck
in multiprocessing and scales like crap.

> so I can also see why Sun chose this path.


I can, too. It was very likely a cost cutting measure and nothing else.

> It provides
> direct access for 1 cpu (because the USIII has a memory controller
> built-in) while subsequent processors use the shared interconnect bus
> (Sun called it Fireplane I guess).


It's not a bus, it's a crossbar switch, which means it adds noticeable
latency to the I/O.

For a workstation in this price range (when I got my first SB1000 new,
the machine cost around 30k EUR) I would have expected a design that
doesn't cut corners in important areas.

Of course today this is probably irrelevant, as even the fastest SB1000
(and SB2000) is slow as hell by today's standards.

Benjamin
 
Benjamin Gawert
11-18-2009, 11:02 PM
* Thomas Maier-Komor:

> If you take a look at the block diagram of the Blade 2500 in
> 817-5117-11.pdf, page C-3, you'll see that both cpus are connected to
> two memory banks of their own and communicate with two IO bridges over a
> common J-Bus.


And if you read my posting from September 23rd (almost two months ago!)
you will find that I already corrected myself re. Blade 2500 memory.

Benjamin
 
Thomas Maier-Komor
11-18-2009, 11:46 PM
Benjamin Gawert wrote:
> * Thomas Maier-Komor:
>
>> If you take a look at the block diagram of the Blade 2500 in
>> 817-5117-11.pdf, page C-3, you'll see that both cpus are connected to
>> two memory banks of their own and communicate with two IO bridges over a
>> common J-Bus.

>
> And if you read my posting from September 23rd (almost two months ago!)
> you will find that I already corrected myself re. Blade 2500 memory.
>
> Benjamin


Sorry Benjamin, I overlooked your reply to Erik in the other branch...
 