The Supercomputer Race
CWmike writes "Every June and November a new list of the world's fastest supercomputers is revealed. The latest Top 500 list marked the scaling of computing's Mount Everest: the petaflops barrier. IBM's 'Roadrunner' topped the list, burning up the bytes at 1.026 petaflops. A computer to die for if you are a supercomputer user for whom no machine ever seems fast enough? Maybe not, says Richard Loft, director of supercomputing research at the National Center for Atmospheric Research in Boulder, Colo. 'The Top 500 list is only useful in telling you the absolute upper bound of the capabilities of the computers ... It's not useful in terms of telling you their utility in real scientific calculations.' The problem with the rankings: a decades-old benchmark called Linpack, which is Fortran code that measures the speed of processors on floating-point math operations. One possible fix: invoking specialization. Loft says of petaflops, peak performance, benchmark results, positions on a list: 'It's a little shell game that everybody plays. ... All we care about is the number of years of climate we can simulate in one day of wall-clock computer time. That tells you what kinds of experiments you can do.' State-of-the-art systems today can simulate about five years per day of computer time, he says, but some climatologists yearn to simulate 100 years in a day."
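To put Loft's metric in concrete terms, here is a small sketch of the arithmetic. Only the two rates (5 and 100 simulated years per day) come from the summary; the 100-year experiment length and the function name are assumptions made for illustration.

```python
# Throughput metric from the summary: simulated years per wall-clock day.
CURRENT_YEARS_PER_DAY = 5    # "about five years per day" (from the summary)
TARGET_YEARS_PER_DAY = 100   # what some climatologists want (from the summary)


def wall_clock_days(simulated_years, years_per_day):
    """Wall-clock days needed to simulate the given number of model years."""
    return simulated_years / years_per_day


# Hypothetical experiment length: one simulated century (an assumption).
century = 100
days_now = wall_clock_days(century, CURRENT_YEARS_PER_DAY)
days_target = wall_clock_days(century, TARGET_YEARS_PER_DAY)
speedup_needed = TARGET_YEARS_PER_DAY / CURRENT_YEARS_PER_DAY

print(f"{century}-year run at today's rate: {days_now:.0f} days")
print(f"{century}-year run at the target rate: {days_target:.0f} day")
print(f"Sustained speedup needed: {speedup_needed:.0f}x")
```

The point of the metric is that the 20x is a speedup of the whole application, not of peak flops.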
Flops not useful? (Score:5, Informative)
But.. The whole point is to test the model, and the models change, don't they? Surely we're not just simulating more "years" of climate with the current batch, but improving resolution, making fewer simplifying assumptions, and hopefully, finding ways to do the exact same operations with fewer cycles.
How can you possibly evaluate supercomputers in any other way except how many mathematical operations can be performed in some reference time? And.. some serial metric if the math is highly parallel, since just reducing the size of vectors in those cases wouldn't actually result in those flops being useful for other tasks.
Benchmark your application (Score:5, Informative)
I agree (Score:3, Informative)
I write massively parallel scientific code that runs on these supercomputers for a living... and this is what I've been preaching all along.
The thing about RoadRunner and others (such as Red Storm at Sandia) is that they are special pieces of hardware that run highly specialized operating systems. I can say from experience that these are an _enormous_ pain in the ass to code for... and reaching anything near the theoretical computing limit on these machines with real world engineering applications is essentially impossible... not to mention all of the extra time it costs you just getting your application to compile on the machine and debugging it...
My "day-to-day" supercomputer is a 2048-processor machine made up of generic Intel cores all running a slightly modified version of SUSE Linux. This is a great machine for development _and_ for execution. My users have no trouble using my software and the machine... because it's just Linux.
When looking at a supercomputer I always think in terms of utility... not in terms of Flops. It's for this reason that I think the guys down at the Texas Advanced Computing Center got it right when they built Ranger ( http://www.tacc.utexas.edu/resources/hpcsystems/#constellation ). It's about half a petaflop... but guess what? It runs Linux! And is actually made up of a bunch of Opteron cores... the machine itself is also a huge, awesome-looking beast (I've been inside it... the 2 huge InfiniBand switches are really something to see). I haven't used it myself (yet), but I have friends working at TACC and everyone really likes the machine a lot. It definitely strikes that chord between ultra-powerful and ultra-useful.
Friedmud
Re:Flops not useful? (Score:5, Informative)
Flops wouldn't test how well the interconnects work.
Since you say "increase the resolution of the model", you are expanding the size of the model, and how much data must be used by all of the nodes of the computer.
How much the interconnect matters depends on the model. It ranges from problems that need almost no communication, like F@H, to problems where every node needs access to a single shared set of data, so it would be very hard to quantify performance in one number.
Unfortunately, there are more than a few fields where marketers want a single number to advertise in a "mine is bigger than yours" competition, and come up with a metric that is almost worthless.
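A toy model makes the interconnect point concrete. This is only a sketch under stated assumptions: the cost formula (compute time plus latency and bandwidth terms per message) is a common simplification, and every number and name below is invented for illustration, not measured on any real machine.

```python
def step_time(flops_per_step, nodes, node_flops,
              msgs_per_step, bytes_per_msg, latency_s, bandwidth_bytes_s):
    """Estimated seconds per time step: compute time plus communication time.

    Assumes perfect load balance and no overlap of compute and communication.
    """
    compute = flops_per_step / (nodes * node_flops)
    comm = msgs_per_step * (latency_s + bytes_per_msg / bandwidth_bytes_s)
    return compute + comm


# Two hypothetical machines with identical peak flops but different interconnects.
peak = dict(nodes=1000, node_flops=1e10)
fast_net = dict(latency_s=2e-6, bandwidth_bytes_s=2e9)
slow_net = dict(latency_s=50e-6, bandwidth_bytes_s=1e8)

# Two hypothetical workloads with the same arithmetic per step.
independent = dict(flops_per_step=1e12, msgs_per_step=0, bytes_per_msg=0)      # F@H-like
coupled = dict(flops_per_step=1e12, msgs_per_step=100, bytes_per_msg=1e6)      # shared data

indep_fast = step_time(**independent, **peak, **fast_net)
indep_slow = step_time(**independent, **peak, **slow_net)
coupled_fast = step_time(**coupled, **peak, **fast_net)
coupled_slow = step_time(**coupled, **peak, **slow_net)

print(f"independent: fast net {indep_fast:.4f} s, slow net {indep_slow:.4f} s")
print(f"coupled:     fast net {coupled_fast:.4f} s, slow net {coupled_slow:.4f} s")
```

With these made-up numbers the independent job runs equally fast on both machines, while the coupled job is several times slower on the slow network, even though a flops-only figure rates the two machines the same.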
Re:Simulation (Score:4, Informative)
From the webpage: rl is a command-line tool that reads lines from an input file or stdin, randomizes the lines and outputs a specified number of lines. It does this with only a single pass over the input while trying to use as little memory as possible.
Didn't know about it either. Seems marginally useful.
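For the curious: the quoted description (pick a fixed number of random lines in a single pass with little memory) matches what reservoir sampling does. The sketch below is the textbook Algorithm R, not rl's actual source, which isn't shown here; the function name `sample_lines` is made up.

```python
import random


def sample_lines(lines, k, rng=random):
    """Return up to k lines chosen uniformly at random from an iterable.

    Makes one pass over the input and keeps at most k lines in memory.
    """
    reservoir = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Keep line i with probability k / (i + 1), evicting a random slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = line
    # Survivors from the first k lines are still in input order, so shuffle.
    rng.shuffle(reservoir)
    return reservoir


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python sample.py 3 < input.txt
    count = int(sys.argv[1]) if len(sys.argv) > 1 else 1
    for picked in sample_lines(sys.stdin, count):
        sys.stdout.write(picked)
```

Memory use depends on the number of lines requested, not on the size of the input, which is why it works on stdin streams of unknown length.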
Re:Flops not useful? (Score:5, Informative)
Simple: you evaluate how much actual work it can perform across the entire system per unit time, where "actual work" means a mix of operations similar to some real application of interest. The whole problem here is that practically no real application is as purely focused on arithmetic operations as Linpack. Even the people who developed Linpack know this, which is why they developed the HPCC suite as its successor. It's composed of seven benchmarks, including some (e.g. stream triad) that mostly stress memory and some (e.g. matrix transpose) that mostly stress interconnects. If you want to get an idea how your application will perform on various machines, you determine what mix of those seven numbers best approximates your application, assign appropriate weights, and then apply those weights to the vendor numbers. Then you negotiate with the two or three most promising vendors to run your application for real. SPEC should have put an end to simplistic "single figure of merit" comparisons, or if not them then TPC, SPC, etc. Sadly, though, there's still always someone who comes along and tries to revive the corpse.
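The weighting step described above can be sketched roughly like this. The seven keys are the HPCC components; the vendor figures, the weights, and the choice of a weighted geometric mean are all assumptions made for illustration. Each figure is taken to be a "higher is better" ratio against some reference machine.

```python
import math

# Hypothetical weights for a memory- and interconnect-bound application.
WEIGHTS = {
    "HPL": 0.05, "DGEMM": 0.10, "STREAM": 0.35, "PTRANS": 0.20,
    "RandomAccess": 0.05, "FFT": 0.10, "b_eff": 0.15,
}

# Made-up vendor results, each relative to a reference machine (1.0 = equal).
VENDOR_A = {"HPL": 2.0, "DGEMM": 2.0, "STREAM": 0.8, "PTRANS": 0.7,
            "RandomAccess": 0.9, "FFT": 1.0, "b_eff": 0.8}
VENDOR_B = {"HPL": 1.2, "DGEMM": 1.2, "STREAM": 1.5, "PTRANS": 1.6,
            "RandomAccess": 1.3, "FFT": 1.3, "b_eff": 1.5}


def weighted_score(relative_results, weights):
    """Weighted geometric mean of per-benchmark ratios (assumed scoring rule)."""
    total_weight = sum(weights.values())
    log_sum = sum(w * math.log(relative_results[name])
                  for name, w in weights.items())
    return math.exp(log_sum / total_weight)


score_a = weighted_score(VENDOR_A, WEIGHTS)
score_b = weighted_score(VENDOR_B, WEIGHTS)
print(f"vendor A: HPL ratio {VENDOR_A['HPL']:.1f}, weighted score {score_a:.2f}")
print(f"vendor B: HPL ratio {VENDOR_B['HPL']:.1f}, weighted score {score_b:.2f}")
```

With these invented numbers, vendor A wins on HPL alone but loses once the weights reflect an application that mostly stresses memory and interconnect. A score like this only narrows the field; running the real application is still the last step.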
Re:Uhh, do you have a model? (Score:5, Informative)
Re:Flops not useful? (Score:5, Informative)
Of course, if it's actually the case that people are dumb, lazy or in marketing, then that would explain why we don't get a full range of stats, even though the tools have existed for many years and are certainly widely known.
Re:Uhh, do you have a model? (Score:4, Informative)
IANAM (I am not a meteorologist)
That's for sure.
Here's an analogy: Say you pour two different colored cans of paint into a bucket and start stirring. Weather is like predicting the exact patterns of swirls that you'll see as the colors mix. Very hard to do looking ahead more than a couple of stirs.
Climate is more like predicting the final color that will result after the mixing is done. Not nearly so intractable. The summary is talking about climate, not weather.
Re:Flops not useful? (Score:3, Informative)
Actually, Linpack is not embarrassingly parallel so it DOES test how well the interconnects work, to some extent.
The Top 500 list is interesting, but if you're building a supercomputer to make a certain rank you have too much money and you should really give me some.
You build a supercomputer to perform some task or class of tasks. If it gets you on the list, cool.
Re:RoadRunner (Score:3, Informative)
The specialization of the hardware / software combo is what I was referring to.
Have you ever coded for one of these special architectures? It really is a bitch. Yes, Red Storm is even worse (special OS that doesn't even allow dynamic linking!)... but the non-generality of the Cell processors is going to kill the real world impact of Roadrunner.
ASCI Purple was one of the previous machines at LANL that was a "one-off" build from IBM. It was a complete disaster. Porting code to the machine took much longer than usual and any person who could show that they were successfully running _anything_ on the machine got a pat on the back. I had the luxury of porting some software to it... good god, just thinking about it makes me want to blow my brains out...
I can't believe they've gone down that same path again.
Friedmud
Re:RoadRunner (Score:3, Informative)
Sorry... got my supercomputers mixed up... ASCI Purple was at LLNL...
I was thinking of ASCI Q, but that was made by HP...
Oh... just never mind... I screwed it up well enough, just forget it ;-)
Need to get some sleep.
Friedmud
Re:RoadRunner (Score:3, Informative)
As far as I know (as of 3 months ago) they're still running Catamount at Sandia... and it's for the reason you state: they developed it.
Friedmud