Supercomputing Technology

A Look Inside Oak Ridge Lab's Supercomputing Facility

1sockchuck writes "Three of the world's most powerful supercomputers live in adjacent aisles within a single data center at Oak Ridge National Laboratory in Tennessee. Inside this facility, technicians are busy installing new GPUs into the Jaguar supercomputer, the final step in its transformation into a more powerful system that will be known as Titan. The Oak Ridge team expects the GPU-accelerated machine to reach 20 petaflops, which should make it the fastest supercomputer in the Top 500. Data Center Knowledge has a story and photos looking at this unique facility, which also houses the Kraken machine from the University of Tennessee and NOAA's Gaea supercomputer."

Comments Filter:
  • by PPH ( 736903 ) on Tuesday September 11, 2012 @10:07AM (#41300075)

    ... a Beowulf cluster of these?

  • the final step in its transformation into a more powerful system that will be known as Titan.

    Oh, that is not even its final form.

  • " which should make it the fastest supercomputer in the Top 500"

    At first I thought this was redundant, but then I wondered if there are faster supercomputers that simply aren't independently benchmarked for the Top 500 list. Anyone have any more info, or am I just overthinking this?

    • by Anonymous Coward

      Obviously a number of classified systems don't share their benchmark results, for one....

      • by Anonymous Coward

        You know what they say about hiding things in plain sight....

    • Different benchmarks will produce a different fastest-supercomputer list. 'Top 500' is a specific list that uses a specific benchmark, one this particular machine is aiming to top. Using a different benchmark could be just as valid and produce a completely different list.

    • by tomhath ( 637240 )
      Google would probably top some benchmarks with their data centers. But they don't do the kind of number crunching the Top 500 measures.
  • I went there my sophomore year to check out Oak Ridge. I didn't go for computing, but since my guide knew I liked computing, he took me to look at the supercomputers. It's a huge room, visible through glass windows, that looks essentially like a big, clean, white office floor with all the cubicles removed and supercomputers in their place.

    At that time (2009?) I heard it wasn't really the fastest supercomputer, but it's awesome to hear they're revving it up to that. If I didn't hat

  • by bratmobile ( 550334 ) on Tuesday September 11, 2012 @10:38AM (#41300467)

    I really, really wish articles would stop saying that computer X has Y GFLOPS. It's almost meaningless, because when you're dealing with that much CPU power, the real challenge is to make the communications topology match the computational topology. That is, you need the physical structure of the computer to be very similar to the structure of the problem you are working on. If you're doing parallel processing (and of course you are, for systems like this), then you need to be able to break your problem into chunks and map each chunk to a processor. Some problems are more easily divided into chunks than others. (Go read up on the Berkeley "parallel dwarfs" for a description of how things can be divided up, if you're curious; there's also a rough sketch of the idea at the end of this comment.)

    I'll drill into an example. If you're doing a problem that can be spatially decomposed (fluid dynamics, molecular dynamics, etc.), then you can map regions of space to different processors. Then you run your simulation by having all the processors run for X time period (on your simulated timescale). At the end of the time period, each processor sends its results to its neighbors, and possibly to "far" neighbors if the forces exceed some threshold. In the worst case, every processor has to send a message to every other processor. Then you run the simulation for the next time chunk. Depending on your data set, you may spend *FAR* more time sending the intermediate results between all the different processors than you do actually running the simulation. That's what I mean by matching the physical topology to the computational topology. In a system where the communication cost dominates the computation cost, adding more processors usually doesn't help you *at all*, and can even slow the entire system down further. So it's really meaningless to say "my cluster can do 500 GFLOPS" unless you are talking about the time that is actually spent doing productive simulation, not time wasted waiting for communication.

    Here's a (somewhat dumb) analogy. Let's say a Formula 1 race car can do a nominal 250 MPH. (The real number doesn't matter.) If you had 1000 F1 cars lined up, side by side, then how fast can you go? You're not going 250,000 MPH, that's for sure.

    I'm not saying that this is not a real advance in supercomputing. What I am saying is that you cannot measure the performance of any supercomputer with a single GFLOPS number. It's not an apples-to-apples comparison unless you really are working on the exact same problem (like molecular dynamics). And in that case, you need some unit of measurement that is specific to that kind of problem. Maybe for molecular dynamics you could quantify the number of atoms being simulated, the average bond count, and the length of time in every "tick" (the simulation time unit). THEN you could talk about how many of that unit your system can do per second, rather than a meaningless number like GFLOPS.
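
    A minimal, single-process sketch of the decompose-compute-exchange loop described above; plain Python lists stand in for MPI ranks, and the chunk count, stencil, and sizes are invented for illustration rather than taken from any real machine:

    # Toy 1-D domain decomposition with ghost-cell ("halo") exchange.
    # Each chunk stands in for one processor; the exchange step models the
    # neighbor communication described above.
    import numpy as np

    NCHUNKS = 8             # pretend processors
    CELLS_PER_CHUNK = 1000  # interior cells owned by each "rank"
    STEPS = 100
    ALPHA = 0.1             # diffusion coefficient for a simple explicit stencil

    # Each chunk owns its interior cells plus one ghost cell on each side.
    chunks = [np.zeros(CELLS_PER_CHUNK + 2) for _ in range(NCHUNKS)]
    chunks[0][1] = 1000.0   # a hot spot somewhere in the global domain

    flops = 0
    words_sent = 0

    for _ in range(STEPS):
        # 1. Halo exchange: copy edge cells into neighbors' ghost cells.
        #    On a real machine this is the MPI send/recv step; here it is a copy.
        for i in range(NCHUNKS - 1):
            chunks[i + 1][0] = chunks[i][-2]  # my right edge -> neighbor's left ghost
            chunks[i][-1] = chunks[i + 1][1]  # neighbor's left edge -> my right ghost
            words_sent += 2

        # 2. Local compute: explicit diffusion update on interior cells only.
        for c in chunks:
            c[1:-1] += ALPHA * (c[:-2] - 2.0 * c[1:-1] + c[2:])
            flops += 5 * CELLS_PER_CHUNK  # ~2 multiplies + 3 adds per cell

    print(f"compute: ~{flops} flops, communication: {words_sent} boundary words")
    # With large chunks the compute term dominates; shrink CELLS_PER_CHUNK (i.e.
    # spread the same problem over more processors) and the communication share
    # grows -- which is why a peak GFLOPS figure alone says little about
    # delivered performance.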

    • Here's a (somewhat dumb) analogy. Let's say a Formula 1 race car can do a nominal 250 MPH. (The real number doesn't matter.) If you had 1000 F1 cars lined up, side by side, then how fast can you go? You're not going 250,000 MPH, that's for sure.

      No... but collectively you cover the same distance, right?

    • Re: (Score:3, Informative)

      I'll drill into an example. If you're doing a problem that can be spatially decomposed (fluid dynamics, molecular dynamics, etc.), then you can map regions of space to different processors. Then you run your simulation by having all the processors run for X time period (on your simulated timescale). At the end of the time period, each processor sends its results to its neighbors, and possibly to "far" neighbors if the forces exceed some threshold. In the worst case, every processor has to send a message to every other processor. Then you run the simulation for the next time chunk. Depending on your data set, you may spend *FAR* more time sending the intermediate results between all the different processors than you do actually running the simulation. That's what I mean by matching the physical topology to the computational topology. In a system where the communication cost dominates the computation cost, adding more processors usually doesn't help you *at all*, and can even slow the entire system down further. So it's really meaningless to say "my cluster can do 500 GFLOPS" unless you are talking about the time that is actually spent doing productive simulation, not time wasted waiting for communication.

      Considering that computational fluid dynamics, molecular dynamics, etc., break down into linear algebra operations, I'd say that the FLOPS count on a LINPACK benchmark is probably the best single metric available. In massively parallel CFD, we don't match the physical topology to the computational topology, because we don't (usually) build the physical topology. But I can and do match the computational topology to the physical one.
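
      A toy illustration of that point; this is not HPL itself, just a hedged single-node sketch that times one dense kernel and divides a nominal flop count by the wall time (the matrix size n is arbitrary):

      # Rough single-node FLOP/s estimate in the spirit of a dense linear
      # algebra benchmark: time a matrix multiply and divide a nominal
      # operation count by the elapsed wall time.
      import time
      import numpy as np

      n = 2000
      a = np.random.rand(n, n)
      b = np.random.rand(n, n)

      start = time.perf_counter()
      c = a @ b               # dense matrix multiply, ~2*n**3 floating-point ops
      elapsed = time.perf_counter() - start

      gflops = 2 * n**3 / elapsed / 1e9
      print(f"{n}x{n} matmul: {elapsed:.3f} s, ~{gflops:.1f} GFLOP/s")
      # HPL itself times a distributed LU factorization and solve
      # (~(2/3)*n**3 + 2*n**2 ops) across the whole machine, so its number
      # already bakes in the interconnect and the software stack as well as
      # the raw chips.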

      • Yes, many problems can be expressed as dense linear algebra, and so measuring and comparing LINPACK perf for these makes sense for those problems. However, many problems don't map well to dense linear algebra. The Berkeley "parallel dwarfs" paper expresses this idea better than I ever could: http://view.eecs.berkeley.edu/wiki/Dwarfs [berkeley.edu]

        • Yes, many problems can be expressed as dense linear algebra, and so measuring and comparing LINPACK perf for these makes sense for those problems. However, many problems don't map well to dense linear algebra.

          Sure, but as far as I've seen, linear algebra problems dominate the runtime of these very large systems. That's what I use them for.

          At least the first 6 on that dwarfs list are done daily on top500 machines. I write parallel spectral methods, and use structured and unstructured grids. Achieving high scaling on these on massively parallel machines is not at all what I would call an open problem (as far as correctly using a given network for a problem, or designing a network for a given problem). For any give

    • by Anonymous Coward

      You're doing it wrong. Line them up end-to-end, not side-by-side.

  • The US still has these Big Science centers left over from the glory years. There's Oak Ridge, Los Alamos, and the Lawrence Livermore Senior Activity Center (er, "stockpile stewardship"), plus the NASA centers. Their original missions (designing bombs, sending people to the Moon) are long gone, but nobody turned off the money, so they keep looking for something, anything, to justify the pork.

    The atomic centers are all located in the middle of nowhere. This was originally done for good reasons - their exist

    • by ks*nut ( 985334 )
      Umm, what is this "post-nuke" era? There's a reason they have huge computing capability - the nukes haven't gone away, we just don't talk about them anymore. And they don't just sit around gathering dust; they must be carefully maintained, and a huge amount of computing power is expended in "improving" them. And you may rest assured that the nuclear establishment is developing new tactical and strategic nuclear weapons for specialized applications, again using vast amounts of computing power.
    • Please do your homework first. While the supercomputers at Lawrence Livermore, Los Alamos, and Sandia National Laboratories are primarily used for nuclear weapons work, the work of keeping the country's huge stockpile safe and reliable is a gigantic job, especially if you don't want to actually detonate any of the warheads. Yep, that's the trick. Simulate the ENTIRE weapon, from high explosive initiation all the way to final weapon delivery. With all of the hydrodynamics, chemistry, materials science, nucle

  • Looks like they clustered some Pepsi machines
  • While it's certainly fascinating to hear about the machine itself, it's easy to forget part of why it exists: simulating destruction. Oak Ridge itself grew out of the Manhattan Project, if you recall.

    As someone who lives in the region, I can say nobody is particularly keen on what might go on at these places. There are various "secret" military installations scattered around here, from Oak Ridge to Holston Army Ammunition. Between what we factually know is buried under and developed at these places, and what is r
