
Top 500 Supercomputers
Anonymous Coward writes "sendmail.net has a piece on the new Top500 list of supercomputers. 'So who came out on top? Well, three US Department of Energy machines have taken spots one, two, and three to lead the list: ASCI Red (manufactured by Intel) at Sandia National Labs in Albuquerque, ASCI Blue-Pacific (IBM) at Lawrence Livermore Labs in Berkeley, and ASCI Blue Mountain (SGI) at Los Alamos. These are the only three systems to exceed 1 TF/s on the Linpack benchmark, and represent 7.4 percent of the total Flop/s on the list.' The story notes that the average growth rates for the list exceed the number set by Moore's Law. "
Re:Moore's law doesn't strictly apply here (Score:1)
Overall, though, this just keeps an upper bound on realistic growth.
Supercomputer Metrics (Score:3)
A better metric for computational power is given by this formula concerning the memory hierarchy:
memory bandwidth * memory size * memory speed
where:
memory bandwidth is the average speed at which data streams to or from memory
memory size is the amount of memory
memory speed is the responsiveness to random access
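To make the metric concrete, here is a minimal sketch (the machines and figures are hypothetical, purely to show how the metric penalizes deep hierarchies):

def memory_metric(bandwidth_gb_s, size_gb, random_accesses_per_s):
    # bandwidth * size * speed, as defined above
    return bandwidth_gb_s * size_gb * random_accesses_per_s

# A big flat semiconductor memory vs. a hierarchy padded out with a
# CD-ROM jukebox: the jukebox adds size, but the robot arm makes
# random-access responsiveness collapse, dragging the metric down.
flat = memory_metric(10, 100, 1e7)
jukebox = memory_metric(10, 10000, 1e2)
print(flat, jukebox)  # the smaller, faster memory scores far higher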
What the massively parallel processor advocates frequently forget is that locality of reference is an expensive assumption. A similar mistake is made by memory hierarchy advocates. For example, in many systems where CD-ROM jukeboxes were included to expand the size of the memory, the architects overestimated "locality of reference" and therefore underestimated the profound impact that moving the robot arm around would have on latency. Such designs are convenient for the hardware designer who wants "good numbers" and a nightmare for an advanced software application that needs unpredictable access to lots of information at a high rate in order to get the solution out of the machine before the solution is obsolete. The operands have to come together through that maze of wiring. If you have partitioned the memory, it profoundly affects both latency and bandwidth. The critical thing is to allow _shared memory_, and that means advanced memory control units.
Seymour Cray kept ahead of the supercomputer pack for more than two decades by focusing his best talent on fast, high bandwidth memory control units and building the biggest semiconductor memories to match.
Re:Beowolf (Score:1)
Depends on how you define a Beowulf cluster, really.
Not the same kind of parallel processing... (Score:1)
Anyway, all I meant in the comment about 3D gaming was that PCs are better than Macs for 3D gaming; no one, except maybe Apple's marketing department, would deny this. And a comparable PC would be cheaper than a comparable Mac.
Yes, there are other uses for floating point, but the primary use in consumer situations is games. If a scientist really needed high-power floating point, they could get an Alpha or something.
--
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
Re:No. 5 spot held by a 128-cpu machine?? (Score:2)
Re:No. 5 spot held by a 128-cpu machine?? (Score:1)
Moore's Law is based on the fastest processor existing today, not the most economical. Of course there are processors that will do specific functionality faster, but their semiconductor density has not changed. Speed is not the issue at hand here; it's the symptom of increasing density.
Re:Distributed.net (Score:1)
Re:SGI installs Linux Supercomputer at OSC (Score:1)
Re:Checking to see if pi terminates or repeats... (Score:1)
Have a great day...
Re:Research v.s. Academic uses of Supercomputers (Score:1)
Re:No. 5 spot held by a 128-cpu machine?? (Score:2)
Re:No. 5 spot held by a 128-cpu machine?? (Score:1)
Re:How many run Unix? (Score:3)
Re:Note these machine are all old. (Score:3)
The Sandia/Intel ASCI-red TFLOPS machine has proven to be one of the more technically successful efforts in massively parallel, high-performance computing. However, large MPP systems have drawbacks. Among these are:
Applications that require high levels of compute performance will continue to grow in size, variety, and complexity. While cluster-based projects have firmly established a foundation upon which small- and medium-scale clusters can be based, the current state of cluster technology does not support scaling to the level of compute performance, usability, and reliability of large MPP systems. In contrast, large-scale MPP systems have addressed the problems related to scalability, but are limited by their use of custom components. In order to scale clusters to thousands of nodes, the following must be addressed:
- Use of non-scalable technology must be bounded or eliminated. Technologies like TCP/IP, NFS, and rsh have inherent scalability limitations.
- Scalable management and maintenance is critical. The complexity of maintaining the cluster should not increase as it grows.
- Usability of the machine is critical. Users should not be required to know detailed information about the cluster, such as the name of each node or which nodes are operational, to effectively use the machine.
-----------------------------------------------
Re:How many run Unix? (Score:2)
The IBM SP series machines I've run into all ran unix.
The Suns of course run unix.
I would guess the HPs run unix, although it might not be HP-UX.
The SGIs probably run IRIX, unless they are "Cray/SGI" T3Es, in which case they run Unicos.
I don't know about the NEC or Fujitsu machines.
~276 times more efficiency per cpu (Score:1)
Vector processors (Score:1)
Re:How many run Unix? (Score:1)
Their own supercomputer OS for vector machines.
SGI:
Irix for SMP
Sun:
Solaris for SMP
Hitachi:
MPP version of HI-UX; this is a variant of HP-UX optimised for a non-shared-memory system.
Intel:
Some flavour of unix, I believe. However, with these machines each node executes its own copy of the OS and does SMP on that node. The Intel machines should not really be classified as one computer; they're more like a few thousand clustered together.
Linux does not really scale beyond 4 processors on SMP systems. The most powerful Linux systems are the Beowulf clusters like the ones that NASA has. I don't know why these don't appear on the list, as they are surely more powerful than some of the lower-end Suns. However, I doubt that a Beowulf cluster counts as one computer.
Re:Standard Processors? (Score:1)
Re:There are 2 Linux systems there (Score:1)
Livermore is not Berkeley (Score:2)
Livermore NL is in Livermore. Berkeley NL is the one in Berkeley.
Massive computing power (Score:1)
Massive computing power, sometimes using generic technology, sometimes THE LATEST in buses and network technologies.
Quake at 100000 FPS... running OpenGL in software... I wouldn't be surprised, but then, these things run nuclear bomb simulations.
Quick question, if you linked these up, how long would it take them to crack RC5? DES? Probably why the USGov doesn't want them exported...
Ft. Meade (Score:1)
--
Quake! Give me QUAKE! (Score:1)
2048x1532 resolution simultaneously on one machine?
How many run Unix? (Score:2)
I teach Unix courses (on Linux) and in the first class I try to give an idea of where Unix is used.
I always say "the fastest computers in the world run Unix", but I'd rather be able to say "480 of the top 500 computers run Unix" - it sounds more impressive. The problem is that, although I can identify most of the operating systems on the list quite easily, I'm not sure about some of the more esoteric ones. Does anybody know exactly what all these systems are running?
Moore's law doesn't strictly apply here (Score:1)
Re:SGI now builds Linux SuperComputers (Score:1)
Re:Simulated nuclear explosions (Score:1)
They should build a super computer and have it run all the time calculating pi, just to see if eventually it terminates or starts repeating... :)
Re:Massive computing power (Score:1)
Quick question, if you linked these up, how long would it take them to crack RC5? DES? Probably why the USGov doesn't want them exported...
Silly rabbit, the government has SPECIAL PURPOSE computers to crack rc5, DES, and every widespread block encryption algorithm. They can crack them faster than you would believe.
These general-purpose supercomputers are put to much more nefarious uses.
Clearly a flawed study (Score:1)
What gives?
Re:Standard Processors? (Score:1)
Yeah, Cray still uses alphas in their T3Ds and T3Es.
The T90s and SV1s use Cray's special vector processors.
Company winners and losers (Score:2)
Manuals for ASCI Blue Pacific here (Score:1)
Manuals for ASCI Blue Pacific here (Score:2)
Re:Does Moore's Law Apply? (Score:1)
Re:And the prizes for weirdest number of processor (Score:2)
170 NEC NLR 8 - fastest computer with a number of processors less than 10
101 SGI "Government" 1024 - PRESUMED slowest computer with a number of processors greater than 1000
teach me to consider a less-than symbol "Plain Old Text."
Re:And the prizes for weirdest number of processor (Score:1)
IBM SPs Are Not Super Beowulfs (Score:1)
At our center, we're installing a batch of new SMP nodes [mhpcc.edu], so I'll be interested to see just where we place in the standings when we rerun the benchmark.
Re:Thank You! (Score:1)
Re:Massive computing power (Score:1)
Q3a is multiprocesor ready.... (Score:1)
--
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
Re:CPU Comparison! VERY INTERESTING!~~~~ (Score:1)
Think about this for just a sec... If you take a fairly communications-intensive benchmark (such as Linpack), which clusters do you expect to give you the best "bang for the buck"? We know that performance will be a function of 1) processor speed, 2) number of processors, and 3) communication speed/bandwidth. Obviously, those clusters with the least comms overhead will have the advantage. Now, do you expect a machine with 10^2 procs to have the same comms speed/bandwidth available per processor as a machine with 10^4 procs?
So on one hand, you could say ASCI Red is an inefficient POS, and with respect to a benchmark like Linpack, you'd be somewhat correct. On the other hand, given a less communications-bound benchmark (like a prime-number sieve, or something distributed.net-esque), ASCI Red would look a lot better.
Now this brings up one more topic: why do they use Linpack as a benchmark, and not something with fewer comms? For "real-world" applications on supercomputers (as they apply to science/engineering), most of the computational effort is spent doing operations on sparse matrices (i.e., matrices that are mostly zeros). One way of handling these operations is to do them in the same manner as a dense matrix (multiplying/adding all the zeros), which is horridly inefficient. The preferable alternative is to spend a great deal of time "looking" for work to do on nonzero entries. This "looking" is faster than performing all the unnecessary computation, but obviously implies more communication, thus putting the machines with more processors at a disadvantage.
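To make the dense-vs-sparse point concrete, here's a toy sketch in Python (a simplified CSR-style layout; the format and matrix are illustrative, not what any ASCI code actually uses):

def dense_matvec(A, x):
    # Multiplies every entry, zeros included -- wasted flops on a
    # mostly-zero matrix.
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def sparse_matvec(values, cols, row_ptr, x, n_rows):
    # Touches only the nonzero entries; the "looking" (indirection)
    # is what turns into communication on a distributed machine.
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[cols[k]]
    return y

# A 3x3 matrix with just three nonzeros:
A = [[4.0, 0.0, 0.0], [0.0, 0.0, 2.0], [0.0, 1.0, 0.0]]
values, cols, row_ptr = [4.0, 2.0, 1.0], [0, 2, 1], [0, 1, 2, 3]
x = [1.0, 2.0, 3.0]
assert dense_matvec(A, x) == sparse_matvec(values, cols, row_ptr, x, 3)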
Happily, though, there are plenty of people with computational needs that aren't terribly comms-intensive, who would rather have the 9000 processors than awesome comms speed, because for us, that's what makes our codes run faster. (That, and we don't sit in a queue all day waiting for a block of CPUs to free up.)
Quake Analogy:
Your pals across the street share a T1 and play on the same server with the same number of people all the time. With their nice ping, they each average 1024 frags/hour (gotta be even powers o' 2). Now you have a LAN party, and 64 other people (of equal "talent") get on the same server, abusing your T1, lagging it out, and you each average 16 frags/hour. In the first situation, they look like a butt-stomping fragging machine, whereas in the second, you look like a bunch of pansies. However, is it really fair to say that you're a worse quaker just because your ping sucks compared to that LPB across the street?
The same principle (a variant of Amdahl's Law, as it's known in academic circles) applies to supercomputers.
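For the non-Quakers, here's a toy Python version of that principle (the overhead model and numbers are made up, just to show the shape of the curve):

def effective_speedup(n_procs, serial_fraction, comms_cost_per_proc):
    # Amdahl's Law, plus a crude linear communication penalty
    t = (serial_fraction
         + (1 - serial_fraction) / n_procs
         + comms_cost_per_proc * n_procs)
    return 1.0 / t

for n in (10**2, 10**3, 10**4):
    print(n, round(effective_speedup(n, 0.001, 1e-7), 1))
# 10^2 procs gets most of the ideal speedup; by 10^4 the comms term
# dominates and the extra processors have stopped paying off.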
Avalon Cluster at #265 (Score:1)
Manufacturer: Self-made. Nice.
Those ASCI machines are hot! Think of the... (Score:2)
(Score: -1, Unoriginal)
Seriously, all we really want to know is which of the machines on the list are Linux clusters of some sort. This is still Slashdot, after all...
--
Re:How many run Unix? (Score:2)
Re:Vector computing IS supercomputing (Score:1)
Re:No. 5 spot held by a 128-cpu machine?? (Score:2)
Throwing money at the problem isn't fair...perhaps they should have normalized these systems based on their price...
Re:Disturbing info/Echelon? (Score:1)
Hajo
Nope: #44 is the first beowulf (Score:1)
Not a truly accurate list.... (Score:1)
As one example of such a computer, Professor John Koza has a 1000-node (Pentium II 350MHz) Beowulf machine for his Genetic Programming Inc. (GPI's web site [genetic-programming.com]) research group. He's running genetic programming applied to difficult problems on the machine (such as automatic analog circuit design), and is getting a nearly linear speedup because of the embarrassingly parallel nature of GP.
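To see why GP is embarrassingly parallel, here's a minimal Python sketch (the population and fitness function are hypothetical stand-ins, not Koza's actual code):

from multiprocessing import Pool

def fitness(individual):
    # Stand-in for an expensive, independent evaluation (e.g. scoring
    # one evolved circuit). No communication with other individuals.
    return sum(gene * gene for gene in individual)

if __name__ == "__main__":
    population = [[i, i + 1, i + 2] for i in range(1000)]
    with Pool() as pool:                        # one worker per CPU by default
        scores = pool.map(fitness, population)  # embarrassingly parallel
    print(max(scores))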
Cheers,
David Andre
my web site [berkeley.edu]
disclaimer: I worked with Professor Koza for several years and helped him build some of his previous machines.
go al! (Score:1)
go weird al!
Damn, you should benchmark it... (Score:1)
A funny story about that box: when I went down there and saw it, the first thing I thought was that all those lights represented CPUs. Then I figured it would have been impossible, since it would certainly have made it one of the fastest computers on earth...
--
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
Re:CPU Comparison! VERY INTERESTING!~~~~ (Score:1)
That isn't even close to right... (Score:1)
Also, just like the PowerPCs, the Intel chips used are very old Pentium Pros, probably running at about 200MHz. I'd be willing to bet that an Athlon running at 800MHz, the fastest you could buy, would easily beat a G4 at 450MHz, the fastest you can buy...
You Mac freaks never realize that it's not performance per box, it's price/performance, and the PC kicks the crap out of a Mac (esp. for 3D gaming, which is really the only need consumers have for all that FP).
--
"Subtle mind control? Why do all these HTML buttons say 'Submit' ?"
Re:Disturbing info/Echelon? (Score:1)
I don't think any NSA computers are on that list. I have yet to hear any real specifics regarding what the NSA has at their disposal, except the word CRAY a few times. Suffice it to say that they have the potential to have way cooler machines than any on this list, due to their undisclosed budget.
Re:Nope: #44 is the first beowulf (Score:1)
Avalon is the "real" Beowulf Linux supercomputer, with Linux on the nodes.
Maybe Cplant was using the word Beowulf for cachet. Then again, I might be misunderstanding what a Beowulf is.
commodity processors my ass (Score:1)
Re:Interesting (Score:1)
Re:SGI installs new Linux Supercomputer at OSC (Score:1)
So where's the Linpack #'s???
We're workin' on it. There should be something official announced at SC99 next week.
mcurie.nercs.gov (Score:1)
Re:No. 5 spot held by a 128-cpu machine?? (Score:1)
And the pressing question is whether next year we can expect to see a Transmeta-based supercomputer in the top 500.
Re:How many run Unix? (Score:1)
Re:Does Moore's Law Apply? (Score:1)
Re:Massive computing power (Score:1)
Trust me, heavy iron does not a good Quake machine make...
Oh, did I mention I was on the console of a Reality Engine 2 at the time? Some guy sat next to me, stared at the screen and pronounced: "How come the Reality Engine thing is so slow?"
engineers never lie; we just approximate the truth.
Re:Avalon Cluster at #265 (Score:2)
Also, it would be nice if the table added a column for Rmax/AcquisitionCost, and let you sort by that column. I'll bet that would launch some Beowulfen toward the top.
--
It's October 6th. Where's W2K? Over the horizon again, eh?
Re:Amiga Linux is Cool (Score:1)
It does not run on Amigas.
Re:Moore's law doesn't strictly apply here (Score:1)
So yes, there is some limit to how much one problem/program can be parallelized(sp?). However, I suspect that each of those machines is running more than one simulation at a time, so overall they are each somewhat efficient in their use of the processors n'stuff.
m'kay?
Re:Note these machine are all old. (Score:1)
Re:Pixar?? (Score:1)
Don Negro
Re:Time to get pedantic... (Score:1)
Re:Massive computing power (Score:1)
Re:Those ASCI machines are hot! Think of the... (Score:2)
I know many people that frequent this site who detest Linux and run Windows NT, FreeBSD, hell, DOS.
Re:And the winner for the oldest goes to (Score:1)
--
http://www.beroute.tzo.com
cpus vs nodes (Score:1)
Re:Vector computing IS crap (Score:1)
the obvious question (Score:1)
cable and wireless (Score:1)
...
hdj jewboy
Re:That isn't even close to right... (Score:1)
Well, there are other consumer uses
Also, it is not always the case that a single-chip system will get much higher flops/chip than a multi-chip system -- it depends on how much communication is required in the particular application. If it is something like parallel search, for example, you need hardly any communication between chips, and therefore you get good performance. However, for most consumer applications, you don't get as nice a speedup.
Cheers,
David Andre
But how User Friendly are they? (Score:1)
*ducks flying tomatoes*
Deosyne
Re:commodity processors my ass (Score:1)
Re:There are 2 Linux systems there (Score:1)
Re:Nope: #44 is the first beowulf (Score:1)
FAQ 2: Is it another Beowulf machine?
Not really. The Cplant project has some broader goals than traditional Beowulf systems. We are not trying to build a machine for a small number of users to run a small number of applications on a small number of machines. We are trying to build a production machine for hundreds of users to run all types of parallel applications on potentially thousands of nodes. We are essentially trying to build a commodity-based machine patterned after the design of the Intel TeraFLOPS machine.
Re:Beowolf (Score:2)
#44 CPlant Cluster
#265 Avalon Cluster
#454 Parnass2 Cluster
Re:Time to get pedantic... (Score:1)
Re:How many run Unix? (Score:1)
Re:Quake! Give me QUAKE! (Score:1)
Hitachi Architecture (Score:3)
1) Their Interconnect
2) Their Processors
The interconnect is a hyper-crossbar network with a bandwidth of 1 GByte/s. They are also able to get sustained message-passing performance of about 90% of peak, as they did on their previous machine, the SR2201; other vendors would provide 60-65% of peak.
The number listed in the Top500 for processors is a bit misleading; it is in fact the number of nodes. The Hitachi nodes are made up of a number of processors, each with pseudo-vector optimisation (allowing them to bypass the cache when loading large memory blocks). This optimisation means the chip can have high sustained performance on large-scale numeric problems. The nodes can be configured as either SMP or vector, which allows the machine to address a much wider range of domain problems.
Hitachi have a very brief page describing their machines: SR8000 Product Page [hitachi.co.jp]
I would love to see what a fully configured machine could do (6 TFlops!).
BTW, Linpack is not a great gauge of a supercomputer's performance. When there are a lot of nodes it becomes message-bound and does not reflect the true performance of the machine. When looking at machines like this it is important to look at benchmarks related to domain problems. E.g., it does not really matter what interconnect you have if you are doing ray-tracing, but it matters a great deal when doing astrophysics.
Does Moore's Law Apply? (Score:2)
I wouldn't think so... In 1965 Moore was at Fairchild (he co-founded Intel a few years later), and I think his "law" was an observation on the rate of progress in chip density, and what advancement was possible within the technology of single Von Neumann-bottleneck-style systems.
-schmaltz
Moore's Law Not Broken (Score:4)
Secondly, Moore's Law is the following (from http://www.intel.com/intel/museum/25anniv/hof/moo
In 1965, Gordon Moore was preparing a speech and made a memorable observation. When he started to graph data about the growth in memory chip performance, he realized there was a striking trend. Each new chip contained roughly twice as much capacity as its predecessor, and each chip was released within 18-24 months of the previous chip. If this trend continued, he reasoned, computing power would rise exponentially over relatively brief periods of time.
Moore's observation, now known as Moore's Law, described a trend that has continued and is still remarkably accurate. It is the basis for many planners' performance forecasts. In 26 years the number of transistors on a chip has increased more than 3,200 times, from 2,300 on the 4004 in 1971 to 7.5 million on the Pentium® II processor.
Since the CPUs in supercomputers are standard processors, and Moore's Law applies to those processors, his law is still intact. His law is about CPUs, not systems.
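You can even back the implied doubling time out of the quoted figures (a quick sanity check, nothing more):

from math import log2

# Figures from the quote above: 2,300 transistors (4004, 1971) to
# 7.5 million (Pentium II), 26 years later.
doublings = log2(7.5e6 / 2300)   # about 11.7 doublings
print(26 / doublings)            # about 2.2 years per doubling
# ...close to, if slightly above, the 18-24 month range in the quote.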
The movie factor (Score:2)
Kevin Costner (1 megaflop/year)
--
Note these machine are all old. (Score:2)
The real action is lower down: Avalon [lanl.gov] was top 100, now it's down to 265. Cplant [sandia.gov] takes the award for top cluster now.
Re:Someone Tell Apple (Score:2)
NIVRAM
Re:And the prizes for weirdest number of processor (Score:2)
hmm, 243 is 3^5. I wonder what strange architecture dictated that number.
Re:Massive computing power (Score:3)
No. The sordid truth is that the Republicans are marginally behind the Democrats and the Pentagon, in the Mega Deathmatch, currently being played on a network of supercomputers and an enhanced Quake server.
The Republicans are desperate not to lose precious cycles to simulations, which would give the other two teams a decisive advantage. The lag might even cost them the tournament.
These would be OK at cracking DES, but really, given that DES can be cracked in less than a day on a kit computer (and within a week by assembling kitchen equipment), DES is essentially dead, as far as the US Government is concerned. Only people like NASA and Boeing still use DES to any degree. There are FAR worse vulnerabilities in their approaches, though, than mere crackability. Key management is - to be blunt - pathetic.
And the prizes for weirdest number of processors.. (Score:3)
12 IBM Charles Schwab 2000 - fastest computer with a number of processors evenly divisible by 100
15 Fujitsu Kyoto 63 - fastest computer with a number of processors not evenly divisible by 2
46 Fujitsu NAL 167 - fastest computer with a number of processors neither evenly divisible by 2 nor equal to (2^x)-1
94 IBM MHPCC 243 - fastest computer with a number of processors in no way related to common powers of 2
170 NEC NLR 8 - fastest computer with a number of processors 1000
(Yeah, yeah, I know you've got 2k TRS-80s in a Beowulf cluster in your back yard.)
No. 5 spot held by a 128-cpu machine?? (Score:3)
Re:Moore's law doesn't strictly apply here (Score:2)
--------------------------------------------
Time to get pedantic... (Score:2)
Standard Processors? (Score:2)
Then there's IBM, which seems to be using PowerPC 604e's.
Next, SGI uses MIPS
SGI/Cray - have they moved to MIPS, or are they still using Alphas? (I'm not 100% sure that's what they used before, but I'm 95% sure it is.)
My main puzzler here is NEC. WHAT ARE THEY USING??? If you go down to #73 on the list, there's a machine that was deployed in 1999 with just 16 processors. Okay, its performance is 1/19th that of Intel's #1 offering, but it uses just 1/602 the number of CPUs??? That's not any standard processor I've ever heard of.
NEC has a bunch of listings below that, too. Some use just 5 processors (though those are all in the high 400s). What chips is it using? Can anyone explain what this machine is?
Number of processors (Score:2)