Supercomputing Technology

Sandia Wants To Build Exaflop Computer

Dan100 brings us an announcement that Sandia and Oak Ridge National Laboratories are setting their sights on an exaflop supercomputer. Researchers from the two laboratories jointly launched the Institute for Advanced Architectures to facilitate development. One of the problems they hope to solve is how to provide each core of each processor with enough data so that cycles aren't going to waste. "The idea behind the institute — under consideration for a year and a half prior to its opening — is 'to close critical gaps between theoretical peak performance and actual performance on current supercomputers,' says Sandia project lead Sudip Dosanjh. 'We believe this can be done by developing novel and innovative computer architectures.' The institute is funded in FY08 by congressional mandate at $7.4 million."
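
One way existing codes already attack exactly this gap is to overlap communication with computation, so a core has useful work to do while its data is still in flight. A minimal MPI sketch of the pattern (a hedged illustration, not anything from the article; it assumes a simple ring exchange between neighboring ranks):

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N 1000000

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double *mine   = malloc(N * sizeof(double));
        double *theirs = malloc(N * sizeof(double));
        for (int i = 0; i < N; i++) mine[i] = rank + i * 1e-6;

        int right = (rank + 1) % size, left = (rank + size - 1) % size;
        MPI_Request reqs[2];

        /* Start the neighbor exchange, then do useful work while it is in flight. */
        MPI_Irecv(theirs, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
        MPI_Isend(mine,   N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

        double local = 0.0;
        for (int i = 0; i < N; i++) local += mine[i] * mine[i]; /* overlapped work */

        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE); /* now the remote data is here */
        for (int i = 0; i < N; i++) local += theirs[i];

        printf("rank %d: %f\n", rank, local);
        free(mine); free(theirs);
        MPI_Finalize();
        return 0;
    }
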
  • by curmudgeon99 ( 1040054 ) on Friday February 22, 2008 @08:23AM (#22513660)
    Aren't we getting a little bit ahead of ourselves, Sandia? What program would you run on this? This brings up the essential issue: what kind of program would YOU write to take advantage of this? I can only think of one: AI.
    • by morgan_greywolf ( 835522 ) on Friday February 22, 2008 @08:29AM (#22513702) Homepage Journal

      Aren't we getting a little bit ahead of ourselves, Sandia? What program would you run on this? This brings up the essential issue: what kind of program would YOU write to take advantage of this? I can only think of one: AI.
      Military simulations. That's what Sandia spends most of its supercomputing clock cycles doing. The Department of Energy funds supercomputing centers like Sandia National Laboratories in order to run simulations on military vehicles, nuclear weapons simulations, etc.
      • So basically, Sandia wants to run better games...
        • by Albert Sandberg ( 315235 ) on Friday February 22, 2008 @09:01AM (#22513928) Homepage
          ... they will be the best prepared for Duke Nukem Forever, that's for sure.
        • Lots and lots of Tic-Tac-Toe [wikipedia.org]. All jokes aside, I'd like to think we could get more than $7.4 million allocated to these sorts of projects. Accelerate the singularity, I say...
        • by gnick ( 1211984 )

          So basically, Sandia wants to run better games...
          No. Weapons modeling is actually fairly dull. But, since the kind of weapons that the DoE cares about can no longer be tested, detailed and computationally hungry simulations are the only way to predict performance.
          • This does not, of course, address the question of whether nuclear weapons modeling is the most worthwhile use of all those cycles, paid for by the American taxpayer.

            I don't think it is.
            • by gnick ( 1211984 )
              Maybe not - That's another debate entirely that I don't care to weigh in on. But, it is an application that demands a lot of processing and that will almost certainly be one of the many applications that SNL and ORNL will be using this system for. An AC below who claims to work at SNL points out that SNL has also done asteroid impact modeling and says that SNL makes a lot of its resources available for outside research.
              • I'm involved in a lattice QCD project that uses some of the computer time from Lawrence Livermore.

                We have one guy who has a security clearance, and he's not allowed to actually take any data out of the lab. He has to go there with a PIECE OF PAPER and write down a handful of numbers on it, and that's all the data we can get out.

                It's a nightmare.
                • by gnick ( 1211984 )
                  I understand entirely. I'm involved in a project that involves several players from both the DoD and the DoE. We only have one effective channel of communication so, when the folks on the east coast want to talk with the folks here in NM, the poor guy in Albq has to receive the communication, print it out, and then key it in manually on our side. When we want to respond, he does the same thing in reverse.

                  On the down side, his time is extremely expensive and it's being paid by our tax dollars. On the up
                  • Is this real research or military stuff?
                    • by gnick ( 1211984 )
                      Well, it's DoD and DoE, so you can assume that it is aimed at an application as opposed to being purely ethereal (not that they've never explored the latter). Is that what you meant?

                      No, it's not military stuff. It's just directed at understanding something usable instead of strict academic masturbation.
                    • Careful, you won't win many friends here being contemptuous of masturbation, academic or otherwise.
                    • Strict academic masturbation? You mean like those German guys who sat around in the '20s and '30s working out quantum mechanics, right?

                      Oh, right -- that thing that makes transistors possible, and thus these here Intertubes, and modern chemistry...

                      DoD doesn't do research aimed at "applications", if by "application" you mean "something that might make your life better".
                    • by gnick ( 1211984 )

                      DoD doesn't do research aimed at "applications", if by "application" you mean "something that might make your life better".
                      Know how long it takes to fly across the globe in a prop-plane?
      • True. Sandia does a lot of nuclear systems simulation.

        However, Oak Ridge is an unclassified facility doing mostly academic research on climate change, fusion energy, biological systems modeling, geological systems, ... the list goes on, but almost any US researcher can get an account on their systems, and purchase cpu time. The trouble is that neither of these labs has quite enough resources to dramatically change computer architecture directions. They can both afford to have 1 or 2 very high end machines a
      • by Rolgar ( 556636 )
        They probably need it to run Windows for Warships properly. You wouldn't want to run out of cycles in the middle of a battlefield.
      • "Simulations" sums it up. Just the word "simulations" is enough to suck up any conceivable amount of processing power they will ever have.

        Folding@Home runs at about a petaflop [stanford.edu] these days, so they're planning to build the equivalent of about 1,000 Folding@Homes. But Folding@Home is barely making a dent in the number of proteins scientists want to fold. Just this one existing simulation project could probably saturate the proposed exaflop computer.

        What if they want to simulate battlefield conditions? Surely
    • by uuxququex ( 1175981 ) on Friday February 22, 2008 @08:30AM (#22513706)
      So far the advances in the field of AI have been non-spectacular. Yes, there have been somewhat successful reasoning systems (rule based or probability based) and neural networks have made classification easier.

      However, at the moment there are no serious applications that will only become feasible by having more computer power.

      More speed in calculation has plenty of benefits, but AI as a research field will not be making major announcements soon because of this new machine.

      • Re: (Score:2, Interesting)

        Excellent point. I might add that I have been working on just such a code set for a few years but that's another discussion. The real reason that AI has been stuck is because it has only attempted to replicate the functions of one hemisphere: the left (linear sequential, where language is processed). The visual-simultaneous right hemisphere is the one that no computer today replicates. THAT, my esteemed friends, is where the work needs to be done. I have spent the better part of four years on just that pro
        • Re: (Score:3, Informative)

          by kestasjk ( 933987 )
          Many people have worked on the area which they think AI is lacking in, but until we understand the brain (or even come up with a good definition of "intelligence") I don't think we'll get very far. But who knows, keep on looking!

          There are a lot of uses for extra computing power though, it's not like we've reached a point where we have too much. Protein folding and climate models are the first that come to mind, but I'm sure there are many others. Companies aren't building these things for fun.
          • by Alef ( 605149 )

            Many people have worked on the area which they think AI is lacking in, but until we understand the brain (or even come up with a good definition of "intelligence") I don't think we'll get very far. But who knows, keep on looking!

            It could also be the case that the wonders of the human brain are not so much a result of clever "design" (no pun intended) as it is of raw processing power and brute force.

            Taken to its extreme, it could very well be the case that strong AI, at least to some degree, is a kind

            • I agree that creating an AI probably won't involve programming the intelligence itself, but programming some sort of framework that will allow the intelligence to come into being by some other process. But even that is a long way off, and I think understanding what sort of framework the brain uses will be vital.

              I've read "How the Mind Works" by Steven Pinker, a cognitive scientist, and he makes the case that consciousness works using replicating patterns in endlessly repeated parallel trunks of neurons
              • by Alef ( 605149 )

                Anyway my point is we should try and mimic the brain's workings if we want to achieve the brain's results, and I think trying to leap ahead to AI development is probably going to be premature if we don't first understand how the brain does it.

                Yes, I agree that our ability to emulate the human brain is probably tightly coupled with our understanding of it. However, I'm not sure one must come before the other. I still think AI research is merited, in part because it has other applications than achieving huma

            • Actually, I think you're right on the money. Though I too am not a neuroscientist, I can read and research. The precise solution to the visual-simultaneous processing is exactly as you described: massively parallel processing of fairly simple processes. Again, the solution to AI will NOT be found in making ever more powerful left brain [linear-sequential processing].
          • Actually, there is a lot of information on AI but you won't find it in computer science. You need to read the research of Roger Sperry [won a Nobel Prize in 1981 for his work].
        • IMHO, AI won't progress until we take a serious look at how intelligence arose on this planet, namely us. How did intelligence start with us? Where did it evolve from? Take, for instance, the instinct for survival: I would say that this spurred the sapien family tree to grow more intelligent, letting us become more aware of our surroundings and ourselves (self-awareness), and the pre-frontal cortex to simulate what we see and hear and feel, and do some trial-and-error simulations in our head to determine the outcome (
    • by PowerEdge ( 648673 ) on Friday February 22, 2008 @08:31AM (#22513712)
      You don't usually run one program on these types of systems. The compute cycles are bid out to researchers and they get x number of compute hours. The system is partitioned out to a few nodes and given to the researcher to run their codes on. On a system like this you could have hundreds of jobs running simultaneously. Also, with the tens of thousands of cores needed to reach this status, a node failure or other hardware failure is inevitable. Right now if a node fails in the middle of the job, everything since the last checkpoint is lost. The chances of failures impeding work go up greatly the more nodes and cores you run the job on.
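
      A bare-bones sketch of the checkpoint/restart pattern described here, assuming a single scalar of state (real codes dump far more, to parallel file systems):

        #include <stdio.h>

        #define STEPS 1000000L
        #define CHECKPOINT_EVERY 10000

        int main(void) {
            long step = 0;
            double state = 0.0;

            /* On startup, resume from the last checkpoint if one exists. */
            FILE *f = fopen("checkpoint.dat", "rb");
            if (f) {
                if (fread(&step, sizeof step, 1, f) != 1 ||
                    fread(&state, sizeof state, 1, f) != 1) {
                    step = 0; state = 0.0;  /* corrupt checkpoint: start over */
                }
                fclose(f);
            }

            for (; step < STEPS; step++) {
                state += 1e-6 * step;  /* stand-in for the real computation */

                if (step % CHECKPOINT_EVERY == 0) {
                    FILE *c = fopen("checkpoint.tmp", "wb");
                    if (c) {
                        fwrite(&step, sizeof step, 1, c);
                        fwrite(&state, sizeof state, 1, c);
                        fclose(c);
                        rename("checkpoint.tmp", "checkpoint.dat"); /* swap in whole */
                    }
                }
            }
            printf("final state: %f\n", state);
            return 0;
        }
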
      • Cool! Reminds me of the beginning of my career when I had to work on main frames. So much system code designed to do that parsing. Thank you for an interesting reply!
      • Sort of.

        Ten years ago the largest systems had ~1000 processors, and jobs would usually run on 100-300 nodes. Now they have ~20,000 processors, and jobs typically use 4000 nodes. Presumably an exaflop machine will have ~1,000,000 processor cores, and typical jobs will use 200,000 nodes.

        I think this institute is being funded to deal with issues exactly like the problem you present. Checkpoint/restart was a decent solution for a YMP, but it has outlived its usefulness. I imagine there will someday be
    • by HuguesT ( 84078 )
      What AI program, pray? I was unaware we had an AI program ready to think for us, only asking for a few hexaflops here or there.

      I think Sandia would probably like to run lattice QCD simulations. Those can chew through any amount of hexaflops you can throw at them. Otherwise we have the ever-demanding weather bureau for these elusive 15-day forecasts. It's not difficult to conjure up a problem that would take weeks to run on current hardware. Indeed neural simulations are a possibility, but not the only one.
      • Re: (Score:2, Funny)

        by Anonymous Coward
        I am a flopped hex, you insensitive clod!

        (or did you mean exaflops, as in 10^18 FLoating point Operations Per Second?)
          • Just in case someone needs a reminder (I always get confused):

          10^24 yotta Y 1 000 000 000 000 000 000 000 000
          10^21 zetta Z 1 000 000 000 000 000 000 000
          10^18 exa E 1 000 000 000 000 000 000
          10^15 peta P 1 000 000 000 000 000
          10^12 tera T 1 000 000 000 000
          10^9 giga G 1 000 000 000
          10^6 mega M 1 000 000
          10^3 kilo k 1 000
          10^2 hecto h 100
          10^1 deka da 10
          10^0 1
          http://en.wikipedia.org/wiki/SI_prefix

        • I am a flopped hex, you insensitive clod!
          Wouldn't a hexxed clod of dirt actually have nerve endings, and therefore be sensitive?
      • Re: (Score:3, Informative)

        by olafva ( 188481 )
        Try Climate & Weather Codes, Fusion, Combustion, CFD, Bio (genomics), and a host of large science/engineering, partial-differential-equation-based applications requiring the solution of large systems of matrix equations... Check out:

        http://www.ornl.gov/info/press_releases/get_press_release.cfm?ReleaseNumber=mr20061025-00 [ornl.gov]
    • by Huntr ( 951770 ) on Friday February 22, 2008 @08:37AM (#22513756)
      What program would you run on this?

      Vista, with Aero enabled.
      • by GregPK ( 991973 )
        Some well written code... Just might go so fast that the space-time continuum bends to accommodate a new class of ultimate power in the universe. ;) Unfortunately, you aren't going to find a whole lot of that in Vista. =P
      • Well, obviously with some effects disabled.
      • Either way (successful or not), either (EFC or ms) will be multi-mega-flops...
    • by macrom ( 537566 )
      Um, the Towers of Hanoi?
    • by Anonymous Coward on Friday February 22, 2008 @08:51AM (#22513854)
      I happen to work at Sandia and can assure you that much more than weapons work is done on the computers. In fact, recently a lot of work was done in modeling the huge asteroid that smashed into Russia in the early 20th century. The researchers were able to develop a new understanding of the dynamics of such an event and discovered that much smaller asteroids than previously thought could do such damage.

      Also, a large portion of the computers are available to outside research (besides research done at the Labs).

    • by hbean ( 144582 )
      It might (just might) run windows vista well enough to make it usable.
    • by Etrias ( 1121031 )
      I bet they partnered with 3D Realms, who needed help finishing Duke Nukem Forever.
    • If they had AI that could run on fast computers, then they'd have AI that could run on slow computers, just slowly. They don't, sorry.
      • by bnenning ( 58349 )
        If they had AI that could run on fast computers, then they'd have AI that could run on slow computers, just slowly.

        True. But fast computers may help quite a bit in developing AI. This simulation [nsi.edu] of 100 billion neurons and a quadrillion synapses took 50 days to process one second of simulation-time. An interesting proof of concept, but not exactly ideal for experimentation; you get 7 tests a year. But increase the CPU power by 1000x, and now it only takes an hour to simulate a second and you get to do a lot
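
        The arithmetic behind that, assuming the speedup scales perfectly (real codes never quite manage it):

          #include <stdio.h>

          int main(void) {
              double days_per_sim_second = 50.0;  /* from the simulation cited */
              double speedup = 1000.0;            /* hypothetical 1000x CPU power */
              double hours = days_per_sim_second * 24.0 / speedup;
              printf("%.1f hours per simulated second\n", hours); /* 1.2 */
              return 0;
          }
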
    • Crysis. With a rig like this you might be able to turn three or four of the effects up to "high".
    • Crysis at 2560 x 1600, All settings maxed.
    • They could always download BOINC and have their pick
    • I'd run me. Kurzweil suggests we run around 12 petaflops or so. The problem is getting the data out of my head. I'm not letting them slice my brain up until they figure out how to scan the synaptic weightings in addition to the connections. Histography, confocal laser scanning, electron microscopy, those seem to be the holdups to uploading now, not processing time.
    • "What program would you run on this?" SKYNET!
    • by JerkBoB ( 7130 )
      What program would you run on this?

      The thing about supercomputers these days is that they aren't a single node. You don't boot them up and run a "program". Typically, what happens is that you've got a whole bunch of folks who want to run their codes on a distinct set of processor cores, and the more total processing power a supercomputer has, the more individual codes can be run simultaneously. At least, this is how it works on massively-parallel supercomputers like BG/L, Cray XT, etc. The vector-based
    • Duke Nukem Forever
    • It's not hard to come up with programs that need a lot of processing power to run. Most of the stuff currently being run on supercomputers is (relatively) small programs with huge data sets that are easily parallelized. Pretty much any kind of simulation falls into this category: climate, genetics, biology/pharmaceutical, plasma physics, particle physics, nuclear physics.

      You don't need a "smart" program to utilize a fast computer - in fact, they are most useful in situations where the smartest people
    • Lattice QCD.

      These guys casually throw around numbers like "32 million cpu-hours on this machine, 40 million cpu-hours on that machine", and are always needing more power.
    • by bugnuts ( 94678 )
      All those flops have traditionally been used by modeling codes for nuclear explosions and impacts (e.g. firing a tank shell at 3" steel). Simulations are done because of the treaty against above-ground nuclear explosions the US signed long ago. They generally gather experimental data by using high explosives and other physics experiments, which they use to determine constants. For instance, they might get data for shooting 3" armor, but then want to know if 4" armor will stop the shell... That's a g
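
      That calibrate-then-extrapolate workflow, as a toy sketch (the penetration law and its exponent below are invented purely for illustration; only the pattern is the point):

        #include <stdio.h>
        #include <math.h>

        /* Toy model: penetration p = C * v^1.4. Both the law and the
           exponent are made up; the workflow is what matters. */
        int main(void) {
            /* Experiment we can run: a shell at 800 m/s penetrates 3 inches. */
            double v_exp = 800.0, p_exp = 3.0;
            double C = p_exp / pow(v_exp, 1.4);   /* calibrate the constant */

            /* Question we can't test directly: what velocity defeats 4 inches? */
            double v_needed = pow(4.0 / C, 1.0 / 1.4);
            printf("model: ~%.0f m/s to defeat 4-inch armor\n", v_needed);
            return 0;
        }
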
    • This brings up the essential issue: what kind of program would YOU write to take advantage of this?
      Any problem that has to work against a variational limit, like (many) computational chemistry applications and also problems in quantum chromodynamics.
    • Actually, the plan is to use it to finally get a three-day weather forecast that's more reliable than rolling dice.

      When they've got that figured out, they plan to go to work on a five-year climate forecast.

  • Seems to me that an SSD-based 4Gb Fibre Channel SAN might be a novel way to ensure that each node gets pretty much continuous access to large data. Seems to me the throughput would run closer to peak all the time, as opposed to using a traditional HDD-based SAN. Combine that with some sort of clustering technique and I think you could achieve really good performance.
    • Re: (Score:2, Informative)

      by Digi-John ( 692918 )

      If your work tends to be I/O bound... it doesn't belong on an exaflop cluster. You know BlueGene/L? Most of its nodes don't even talk directly to the storage system--they're connected to special I/O nodes which then talk to the storage system.

      Scientific computing doesn't really deal with THAT much data. The scientists here at Sandia (yeah I work at Sandia CA) think they are just HUGE data creators. "We generate a PETABYTE per YEAR!" they say... not realizing that a petabyte is a drop in the bucket for the

  • As long as they keep making the world's fastest new peripheral buses and networks to keep up with supercomputing speeds, which we can then buy to use with our PCs, it's a great investment.

    But since the American people will have bought the new tech for the world, it should be released into the public domain, after maybe 5 years of being patent-licensed only to American corporations (who cannot sublicense it abroad). That's what investments in American tech should be like. Not just subsidies to p
  • Flow Down? (Score:3, Funny)

    by webword ( 82711 ) on Friday February 22, 2008 @08:35AM (#22513730) Homepage
    How long does it typically take for memory and storage advances to make it to end consumers? For example, when we first heard about "gigabytes" back in the day, how long did it take to get there once it was being done in the laboratory?

    OK, here's the truth. I'm just wondering since I need more memory to carry around the entire internet in my pocket. Right now, I can only fit Ron Paul fanatic postings on my USB stick. They are taking up a lot of room. (Nothing against Ron Paul, mind you.)
    • by Nullav ( 1053766 )
      Shh, we already have the record/movie companies pissed off. Just imagine the lawsuits when people start torrenting whole internets!
  • by kazade84 ( 1078337 ) on Friday February 22, 2008 @08:36AM (#22513738)
    At first I thought it said Santa was building a supercomputer :)
  • "In an exascale computer, data might be tens of thousands of processors away from the processor that wants it," says Sandia computer architect Doug Doerfler. "But until that processor gets its data, it has nothing useful to do. One key to scalability is to make sure all processors have something to work on at all times."

    This sounds like a very interesting OR optimization problem, but I am not sure what the variables are... If a processor is working on a particular piece of a problem and the data required to sol
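
    Within a single node, the usual answer to "make sure all processors have something to work on" is dynamic scheduling: an idle core grabs the next chunk from a shared queue instead of stalling on a fixed, pre-assigned slice. A small OpenMP sketch of the idea (illustrative only):

      #include <omp.h>
      #include <stdio.h>

      #define TASKS 100000

      /* Stand-in for work whose cost varies wildly from item to item. */
      static double do_task(int i) {
          double x = 0.0;
          for (int k = 0; k < (i % 100) * 1000; k++) x += k * 1e-9;
          return x;
      }

      int main(void) {
          double total = 0.0;
          /* schedule(dynamic) hands out chunks of 64 iterations on demand,
             so no core sits idle while others still hold unstarted work. */
          #pragma omp parallel for schedule(dynamic, 64) reduction(+:total)
          for (int i = 0; i < TASKS; i++)
              total += do_task(i);
          printf("total = %f\n", total);
          return 0;
      }
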

  • by vstat ( 456161 ) on Friday February 22, 2008 @08:37AM (#22513754)
    No cycles wasted there!
  • I want one (Score:4, Interesting)

    by sm62704 ( 957197 ) on Friday February 22, 2008 @08:44AM (#22513808) Journal
    I don't know why, but I want one.

    Twenty years ago we had a Compaq portable that ran on a 16 MHz 286 at work, and it was HOT. Blazingly fast, could do anything. That is, for its time. The supercomputers then weren't as powerful as your laptop today.

    So if I can manage to stay alive for another 20 years, I'll probably have a laptop more powerful than the supercomputer in TFA. I guess I'll just have to wait a while.

    -mcgrew [kuro5hin.org] (link is to "Growing Up With Computers", a 2 year old K5 article)
  • Not enough money (Score:1, Informative)

    by ThoreauHD ( 213527 )
    $7 million isn't a lot for a datacenter with a supercomputer housing a novel architecture. They'll need InfiniBand or gig fiber, and other high-end equipment. That in itself will take $2 million at least. I dunno. Maybe they'll do a low-rent Google and call it unique.
  • by Lars Clausen ( 1208 ) on Friday February 22, 2008 @09:00AM (#22513912)
    It'd be cool if TFA's headline were actually correct: then we'd have a machine whose performance actually accelerated by 10^18 floating point operations per second *per second*. That gets to be a lot of FLOPS real fast.

    OTOH, it might just be the singularity happening. We wouldn't notice until it was too late.
  • Great... (Score:1, Redundant)

    by owlnation ( 858981 )
    ... the first computer with enough specs to be able to run Vista! (just)
  • by Guybrush_T ( 980074 ) on Friday February 22, 2008 @09:21AM (#22514134)

    Reading the article, the goal is nowhere near building a real exaflop computer; it's more about thinking through the issues (like feeding processors data).

    In a year and a half, we shouldn't have more than 100 GFlops per socket, which means you would still need 10 million processors (not cores!) to achieve the exaflop computer. There's no chance of building a cluster that big (at least not for some years).

    The all-time progression of the top500 [top500.org] shows that exaflop computers should arrive around 2020, definitely not tomorrow. (x10 every ~4 years: 2008: 1 PF, 2012: 10 PF, 2016: 100 PF, 2020: 1 EF)
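
    The same projection in code, assuming the historical 10x-every-~4-years trend simply continues:

      #include <stdio.h>

      int main(void) {
          double peak = 1e15; /* ~1 PFlops in 2008 */
          int year = 2008;
          while (peak < 1e18) { /* one decade of FLOPS every ~4 years */
              peak *= 10.0;
              year += 4;
          }
          printf("~1 EFlops around %d\n", year); /* prints 2020 */
          return 0;
      }
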

    • Intel [wikipedia.org] has recently managed to squeeze 80 cores onto a single chip, resulting in up to 2 TFlops [wikipedia.org] per socket. That's about 700 times as much as a single 700 MHz BlueGene/L processor or 350 times as fast as a BlueGene processing node. 250 (500) of those chips could take on LLNL's BlueGene at a mere 66 (31) kW of power consumption. (Values in brackets are for the slower, more power-efficient version).
      Back in the real world I don't know how real the TeraScale chips are. Yield probably is as low as it gets, I wou
    • I had a chance to listen to talk by a guy involved in this just the other day. You're right about the first part; it is more about thinking. He said a big problem now is that, while they greatly appreciate being funded to build faster computers, there isn't enough storage space and especially *bandwidth* to deal with it!!! In other words, our networking infrastructure in the United States needs a massive upgrade, and so far no one seems to care. Congress will fund a new supercomputer every few years, but fo
  • petaflop exaflop floppity flop flop flop - whatever - install Vista and watch it puke blood and die...

    obSlashdot: but will it run linux?

    RS

  • For instance, the Cell/BE processor allows C/C++ programmers to manage memory directly with the Memory Flow Controller to perform double-buffered asynchronous transfers of data between main memory and the processor's local memory. Using Direct Memory Access, Cell users can achieve 98% of the peak performance on some applications: http://www-128.ibm.com/developerworks/power/library/pa-cellperf/ [ibm.com] .
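
    The double-buffering pattern in generic C, as a sketch: async_fetch/wait_fetch below are hypothetical stand-ins for a platform's DMA primitives (on Cell, the MFC calls), stubbed out synchronously just so the skeleton runs. While the core computes on one buffer, the transfer engine fills the other.

      #include <stdio.h>
      #include <string.h>

      #define CHUNK  1024
      #define CHUNKS 16

      static double main_mem[CHUNKS * CHUNK]; /* stand-in for main memory */

      /* Hypothetical async-DMA primitives; a real implementation would
         return immediately and complete the copy in the background. */
      static void async_fetch(double *dst, size_t off, int tag) {
          memcpy(dst, &main_mem[off], CHUNK * sizeof(double));
          (void)tag;
      }
      static void wait_fetch(int tag) { (void)tag; }

      static double process(const double *buf) {
          double s = 0.0;
          for (int i = 0; i < CHUNK; i++) s += buf[i];
          return s;
      }

      int main(void) {
          for (int i = 0; i < CHUNKS * CHUNK; i++) main_mem[i] = i * 1e-3;

          static double buf[2][CHUNK];
          double total = 0.0;

          async_fetch(buf[0], 0, 0);                /* prime the pipeline */
          for (int i = 0; i < CHUNKS; i++) {
              int cur = i & 1, nxt = cur ^ 1;
              if (i + 1 < CHUNKS)                   /* kick off the next chunk... */
                  async_fetch(buf[nxt], (size_t)(i + 1) * CHUNK, nxt);
              wait_fetch(cur);                      /* ...while this one lands */
              total += process(buf[cur]);           /* compute overlaps transfer */
          }
          printf("sum = %f\n", total);
          return 0;
      }
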
  • A lot of so-called supercomputers only have some parts that can run at the theoretical speed, because they stint on other parts such as memory or bus speed. A viable general-purpose computer usually has one flop = one byte of core = one second to completely write core. That's pretty much the case with desktops in the single-gigaflop range.
  • Sandia and Oak Ridge are not coming up with an exaflops computer. They are contemplating how to write software in a way that will effectively use exaflop computers when they become available. This little group has a budget of $7.4 million, which is enough to pay a dozen high-level research scientists and a half dozen software developers for a year. They're not going to reinvent high performance technical computing.

    They're going to rework some fundamental math libraries to deal with the obvious trend in HPT
  • Has the "Just imagine a Beowulf cluster of these" joke gone out of fashion? In the old days, someone would have posted it before the ink was dry on the story (and virtual ink dries really fast).
