SGI to Scale Linux Across 1024 CPUs
im333mfg writes "ComputerWorld has an article up about an upcoming SGI machine, being built for the National Center for Supercomputing Applications, "that will run a single Linux operating system image across 1,024 Intel Corp. Itanium 2 processors and 3TB of shared memory.""
Re:Solaris (Score:5, Interesting)
Solaris scales to hundreds of processors out of the box. Until the vanilla Linux kernel accepts these changes and scales, Solaris still has a big edge in this area.
Lame analogy: many people have demonstrated that they can hack their Honda Civic to outperform a Corvette, but I can walk into a dealership and purchase the latter, which performs quite well without mods.
Re:What happened to RISC? (Score:5, Interesting)
Quick examples: RISC uses less power because it has less logic? No; it has to run at a higher clock frequency to match the speed of a slower-clocked CISC chip.
RISC is easier to program? Depends on the person. A compiler can take very good advantage of large, hardware-optimized instructions.
RISC easier to develop/manage? I'll say yes for RISC on this one. There's simply less logic on the chip, so fewer logic errors are possible. There's plenty more cache, which can also break, but broken cache blocks can be fused off.
RISC is physically smaller? No. RISC code contains many more instructions that need to be executed, so a much larger instruction cache is needed on chip (and a higher clock frequency to get through them all).
I don't remember every comparison, but it pretty much comes out that neither is clearly better than the other. That said, RISC is better than x86 -- everything is better than x86. CISC vs. RISC in general is much harder to judge. Having done x86, 68k, and MIPS, I must say that RISC is a pleasure.
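The instruction-count point above can be sketched in a few lines. This is a toy illustration with MIPS-ish mnemonics, not a real ISA or a real compiler: one CISC-style memory-to-memory add expands into a RISC-style load/load/add/store sequence, which is exactly why RISC code tends to need a bigger i-cache.

```python
# Toy sketch (illustrative only, not real ISA definitions): expand one
# CISC-style memory-operand instruction into its RISC load/store form.

def expand_to_risc(dest_addr, src_addr):
    """'add [dest], [src]' becomes a load/load/add/store sequence."""
    return [
        ("lw", "r1", dest_addr),    # load destination operand
        ("lw", "r2", src_addr),     # load source operand
        ("add", "r1", "r1", "r2"),  # operate on registers only
        ("sw", "r1", dest_addr),    # store the result back
    ]

print(len(expand_to_risc(0x1000, 0x2000)))  # 4 RISC instructions for 1 CISC op
```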
Re:Solaris (Score:5, Interesting)
I wouldn't be surprised to see these changes in the 2.8 kernel. And what will people do until then, I hear some ask. I can tell you that right now very few people actually need to scale to 1024 CPUs, and that will probably still be true by the time Linux 2.8.0 is released. AFAIK Linux 2.6 scales well to 128 CPUs, but I don't have the hardware to test it, and neither do any of my friends. So I'd say there's no need to rush this into mainstream; the few people who need it can patch their kernels. My guess is that between now and the release of 2.8.0, we will see fewer than 1000 such machines worldwide.
Re:Solaris (Score:3, Interesting)
If someone buys one of these clusters from SGI, then it does scale "out of the box" as far as they're concerned.
Re:Solaris (Score:3, Interesting)
A better retort would be the IBM dude's "There is a world market for maybe five computers."
Claims are very difficult to make, and impossible to prove. Putting a time limit on a claim, however, is easy: 2.8.0 will be released in '05 or '06. Maybe we'll all have 1024-CPU boxes in 20 years, but in 20 months?
Re:Ok (Score:2, Interesting)
Alright, pass around the hat.
Re:Sun does more than that (Score:4, Interesting)
The systems I've seen that have hot-swap PCI cards have plastic partitions between the slots to prevent the cards from touching each other when hot swapping them.
I'm not sure why the hypothetical tech would have a screwdriver in hand. Many systems have screwless means of retaining memory, PCI cards, CPUs, and such.
Re:What happened to RISC? (Score:1, Interesting)
The fears of RISC instruction bloat are unfounded: the instructions are going to be in L1 i-cache 99% of the time, and won't slow anything down.
What shorter/simpler instructions enable is much shorter pipelines. My G4 does a fused multiply-add op in 7 stages; a P4 does it in 2 passes through a 20-stage pipeline (40 cycles, since the result of the multiply isn't available until the end). The P4 pipeline also has to fetch operands from memory (e.g. the stack) and write them back. This means CISC CPUs are more prone to memory bottlenecking in worst-case scenarios (of course, in most cases, the working data set for both archs will be in L1).
In conclusion, CISC vs. RISC is EASY to tell apart: if it's operating on data in registers and memory simultaneously, it's CISC. If it's loading the working data into an expansive register set, operating on it locally, and then storing it back, it's RISC.
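Taking the pipeline figures above at face value, the dependent multiply-add cost tallies up like this (back-of-the-envelope arithmetic only, using the comment's own numbers, not measured latencies):

```python
# Latency of one dependent multiply-add, per the figures claimed above.
g4_fmadd_stages = 7   # fused multiply-add: one trip through a 7-stage pipe
p4_stages = 20        # separate multiply, then a dependent add:
p4_passes = 2         # two full trips through a 20-stage pipeline

g4_cycles = g4_fmadd_stages
p4_cycles = p4_stages * p4_passes

print(g4_cycles, p4_cycles)  # prints: 7 40
```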
Re:Sun does more than that (Score:2, Interesting)
Re:Sun and/or IBM zseries hardware (Score:1, Interesting)
For instance, where I work we have an older S/390 mainframe that runs a database.
We have: (1) a Win2000 server running IIS and MS SQL, used online to form queries automagically against the mainframe for our customers; (2) a Linux-based firewall; (3) other Linux servers; (4) routers; (5) networks; (6) numerous other industry Linux machines.
All of this could be replaced by Linux running in a single partition on the mainframe. All the networking, all the servers.
So don't be a dipshit. Obviously there are reasons for running Linux on a mainframe, especially WHEN YOU ALREADY OWN ONE FOR DOING SOMETHING ELSE.
Now ZSeries isn't just a mainframe. It makes a great server. There are different pricing levels, different setups.
Now go find a big corporate Windows server farm (rarer than you'd think). Look at the hundreds of Windows servers, the hundreds of support personnel and experts, and then the rest of the A+-certified service geeks.
Now delete all that and replace it with one server running various things in its many partitions, run by 2 admins and some assistants.
It will be faster, more reliable, and probably much cheaper. But the benefits go far beyond eliminating hundreds of redundant personnel and dozens of high-maintenance PC servers running an unreliable OS: you get something that is easy to deal with, supported by a company that will bend over backwards for you, instead of being beholden to the assholes at MS.
NOW, if you don't end up liking it, you could move to Solaris, or run a server cluster of Linux PCs. And since you're already running Linux, moving to any other Unix platform on any other hardware, or to Linux on commodity hardware, is much, much easier than migrating from Windows in the first place.
Re:Sun does more than that, but SGI always has (Score:2, Interesting)
My opinion is that Linux on a 1024-way is a spectacularly stupid idea, introduced more for the sexiness of having a 1024-way machine than for any practical benefits. Linux is simply not designed for scaling that large. And there is a huge difference between an OS designed to scale that large, and an OS hacked up to support something that large, without actually making the appropriate design choices. SGI may know about those choices (and probably better than Sun), but I highly doubt they'd throw them into a GPLed Linux kernel - they still want to sell their own version of Unix!
I expect (yes, a wild pie-in-the-sky guess) that the advantage of a 1024-way machine over a 512-way machine, both running Linux, is going to be maybe 20-30% performance, far from the 100% the numbers might claim or the 70-80% that might be tolerable. For a supercomputer where that 20-30% is irreplaceable because no other machine can crunch the data, cool; for everyone else, two 512-ways running unconnected will be better, cheaper, and faster. [At least, until Linux can scale that large... maybe in 5 years or so?]
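For what it's worth, an Amdahl's-law sketch gives a feel for why doubling CPUs doubles nothing. The 0.2% serial fraction below is an assumed, purely illustrative number (nothing from the article), but it lands in the same ballpark as the guess above:

```python
def amdahl_speedup(serial_frac, n_cpus):
    """Speedup over one CPU with a fixed serial fraction (Amdahl's law)."""
    return 1.0 / (serial_frac + (1.0 - serial_frac) / n_cpus)

s = 0.002  # assumed 0.2% serial/communication fraction -- illustrative only
gain = amdahl_speedup(s, 1024) / amdahl_speedup(s, 512)
print(f"1024-way over 512-way: {gain:.2f}x")  # ~1.33x, not the naive 2.00x
```

Bump the serial fraction up and the 1024-way's edge over the 512-way shrinks even further.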
Not likely - Same Machine for $1k in 14 years. (Score:2, Interesting)
The problem with 1024 CPUs is much more than just the operating system. It is a mess of communication hardware needed to wire everything together. It is about special power feeds and air conditioning, and sometimes floor-loading requirements.
Take a quick look at the end of this PDF [sgi.com]. It talks about heat output and the need for 3-phase 240V power coming into this computer. It is not unusual to hire both an electrician and a cooling expert when you talk about installing one of these babies. Not for the home user, and never will be. However, identical compute power is coming in just 14 years, so get ready...
Re:Sun does more than that (Score:3, Interesting)
As far as the E10000 being NUMA or SMP -- it depends on how you look at it. The Origin line used a bristled-hypercube interconnect topology, so memory on the same node as a CPU was one hop through the fabric, memory on another node connected to the same router was three hops, and memory on a distant node might be multiple router hops away. The E10K (and I think the E15K) used a star topology where memory was either on the same bus as the CPU or on another bus reached through the switch. So the Sun has basically two levels of memory latency, whereas the SGI could have many. The SGI is definitely NUMA; the Sun is either SMP or "slightly NUMA", however you want to parse it.
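The hop-count difference is easy to see in code. This is a generic sketch of a binary hypercube versus a star, with made-up node numbering; it is not SGI's actual routing (the Origin's "bristled" variant hangs two nodes off each router), but it shows why one topology has many latency levels and the other has two:

```python
def hypercube_hops(a, b):
    """Minimal router hops between nodes a and b in a binary hypercube:
    the Hamming distance between their node IDs."""
    return bin(a ^ b).count("1")

def star_hops(a, b):
    """Star/crossbar topology: everything else is one switch traversal away."""
    return 0 if a == b else 1

# In a 16-node (4-D) hypercube, latency depends on which node you ask:
print(hypercube_hops(0b0000, 0b0001))  # 1 hop: a direct neighbor
print(hypercube_hops(0b0000, 0b1111))  # 4 hops: the far corner
print(star_hops(3, 7))                 # 1 hop: always, in a star
```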
If you've never seen it, the tech papers on how the SGI NUMA systems work are worth reading. Build a fast 8-port crossbar chip (the "spyder chip"), then use it to glue CPUs, memory, and peripherals together. Keep a couple ports open, and you can glue the crossbars together in a fabric. Presto, you can now build a system with 200 CPUs or 100 PCI busses. Pretty cool, even if it was expensive, proprietary, and all the rest.