Supercomputing IBM Hardware

IBM's Blue Gene Runs Continuously At 1 Petaflop 231

Posted by Zonk
from the all-i-can-think-is-the-games-you-could-play dept.
An anonymous reader writes "ZDNet is reporting on IBM's claim that the Blue Gene/P will continuously operate at more than 1 petaflop. It is actually capable of 3 quadrillion operations a second, or 3 petaflops. IBM claims that at 1 petaflop, Blue Gene/P is performing more operations than a 1.5-mile-high stack of laptops! 'Like the vast majority of other modern supercomputers, Blue Gene/P is composed of several racks of servers lashed together in clusters for large computing tasks, such as running programs that can graphically simulate worldwide weather patterns. Technologies designed for these computers trickle down into the mainstream while conventional technologies and components are used to cut the costs of building these systems. The chip inside Blue Gene/P consists of four PowerPC 450 cores running at 850MHz each. A 2x2 foot circuit board containing 32 of the Blue Gene/P chips can churn out 435 billion operations a second. Thirty two of these boards can be stuffed into a 6-foot-high rack.'"
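The summary's performance figures are internally consistent, which is easy to check. A back-of-the-envelope sketch (the 4 flops/cycle/core figure is my assumption about the PPC 450's dual FPU, not stated in the article):

```python
# Back-of-the-envelope check of the Blue Gene/P figures in the summary.
# Assumption (not from the article): each 850MHz core retires 4
# floating-point operations per cycle via its dual floating-point pipes.
CORES_PER_CHIP = 4
CLOCK_HZ = 850e6
FLOPS_PER_CYCLE = 4  # assumed

chip_gflops = CORES_PER_CHIP * CLOCK_HZ * FLOPS_PER_CYCLE / 1e9
board_gflops = 32 * chip_gflops        # 32 chips per 2x2 ft board
rack_tflops = 32 * board_gflops / 1e3  # 32 boards per 6-foot rack

print(f"per chip:  {chip_gflops:.1f} GFLOPS")   # 13.6
print(f"per board: {board_gflops:.1f} GFLOPS")  # 435.2, matches the summary
print(f"per rack:  {rack_tflops:.1f} TFLOPS")
```

Under that assumption, 32 chips per board lands exactly on the 435 billion operations a second quoted above.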
This discussion has been archived. No new comments can be posted.


  • Oh good grief...655,360 central processing units ought to be enough for anyone.
  • One of these days, I am going to get a bunch of spam from "YOUR IBM SUPERCOMPUTER OVERLORD", informing me that humanity has made a mess of things, and it has decided to run the world for our own good.
  • by jshriverWVU (810740) on Tuesday June 26, 2007 @01:08PM (#19651717)
    As a parallel programmer, I'd love to have just one of these chips, let alone one of the boards in a nice 2U rack. Can they be bought at a reasonable price, or are they strictly for research and in-house use?
    • by no_pets (881013)
      I'm sure IBM would love to sell you one.
    • Re: (Score:2, Funny)

      by asliarun (636603)
      Ha, you might be a parallel programmer, but can you compete with him [wikipedia.org]?? :-D
    • by PHPfanboy (841183)
      IANA parallel programmer. Please enlighten me what a parallel programmer does with just one chip?
      • by grommit (97148)

        IANA parallel programmer. Please enlighten me what a parallel programmer does with just one chip?

        That depends, does that one chip contain four cores like the PowerPC chip from TFA does?

      • by Aladrin (926209)
        From TFS: "The chip inside Blue Gene/P consists of four PowerPC 450 cores running at 850MHz each."
      Each chip in this case has 4 cores, so it can use parallelized software. Just like a Core 2 Duo.
    • IIRC the smallest you can buy is one rack, and if you have to ask you can't afford it.
    • by e2d2 (115622)
      Share some of the tools you use (please). I'm sure some of the programmers here like myself would love to dive into this area but probably don't know where to start. It's pretty easy to find a parallel programming framework, it's not so easy to know what works and what tools/techniques are a waste of time.
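For anyone in the parent's position wanting a first taste of the decomposition style before touching real hardware: on machines like Blue Gene the standard tool is MPI, but the basic split-work/combine-results pattern can be sketched with nothing but the Python standard library. This is a toy stand-in of my own, not anything you'd run on a BG/P:

```python
# Toy data-parallel sketch: split a big summation across worker processes.
# On a real supercomputer you'd express this with MPI (scatter + reduce);
# multiprocessing.Pool here just illustrates the same decomposition.
from multiprocessing import Pool

def partial_sum(chunk):
    """Work done independently on each 'node'."""
    lo, hi = chunk
    return sum(i * i for i in range(lo, hi))

def parallel_sum_squares(n, workers=4):
    step = n // workers
    chunks = [(i * step, (i + 1) * step if i < workers - 1 else n)
              for i in range(workers)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))  # the "reduce" step

if __name__ == "__main__":
    n = 100_000
    assert parallel_sum_squares(n) == sum(i * i for i in range(n))
    print("parallel result matches serial result")
```

The hard parts the thread alludes to (communication cost, data placement) only show up once the workers have to exchange data mid-computation, which this embarrassingly parallel example deliberately avoids.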

  • In the Future... (Score:2, Interesting)

    by perlhacker14 (1056902)
    I yearn for the day that this kind of power may be brought into households all over the world. Think: the opportunities presented by such computers available to all are scientifically tremendous. There should at least be consideration of having these in libraries. Publicly and freely accessible supercomputing should become a national goal, to be achieved by 2019 at the latest.
  • by Anonymous Coward on Tuesday June 26, 2007 @01:14PM (#19651835)
    For harboring petaphiles!
  • ...the next step (10**18) is the "exaflop."
  • by danbert8 (1024253) on Tuesday June 26, 2007 @01:18PM (#19651893)
    Imagine a beowulf cluster of THESE!
  • by Bazman (4849) on Tuesday June 26, 2007 @01:30PM (#19652105) Journal
    So, do they have enough compute power to simulate the flap of every butterfly's wings now? And does it include the heat it produces from its cooling systems in its climate models?

    • by Random832 (694525)
      The amount of time it actually takes for a butterfly wing flap to result in a hurricane is well in excess of the amount of time for which weather forecasts purport to predict systems such as hurricanes.
  • How high? (Score:4, Informative)

    by Anonymous Coward on Tuesday June 26, 2007 @01:32PM (#19652135)
    Well, the stack of laptops might be tall, but even the 216 racks would stack up to only about a quarter of a mile.
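The rack count checks out against the summary's per-board figure (the 216-rack number is my back-calculation from 3 petaflops peak, not from the article):

```python
# Back-calculating the rack count and stack height from the summary.
BOARD_GFLOPS = 435      # per 2x2 ft board, from the summary
BOARDS_PER_RACK = 32
RACK_HEIGHT_FT = 6

rack_gflops = BOARD_GFLOPS * BOARDS_PER_RACK       # ~13,920 GFLOPS per rack
racks_for_3pf = 3e6 / rack_gflops                  # ~215.5 -> 216 racks
stack_miles = 216 * RACK_HEIGHT_FT / 5280

print(f"racks needed for 3 PFLOPS peak: {racks_for_3pf:.0f}")
print(f"stacked height: {stack_miles:.2f} miles")  # ~0.25, vs 1.5 for laptops
```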
  • What about Memory? (Score:5, Interesting)

    by sluke (26350) on Tuesday June 26, 2007 @01:32PM (#19652143)
    I recently had a chance to see Francois Gygi, one of the principal authors of qbox (http://eslab.ucdavis.edu/), which is a quantum electronic structure code that has set some performance records on the Blue Gene/L at Livermore. He mentioned that the biggest challenge he faced was the very small amount of memory available to each node of the Blue Gene (something like 256MB). This forced him to put so much emphasis on the internode communications that simply changing the order of the nodes where the data was distributed in the machine (without changing the way the data itself was split) affected performance by over 100%. This will only get worse as the number of cores per node goes from 2 to 4 on the Blue Gene/P. I couldn't find anything in a quick Google search, but does someone know what the plans are for the memory on this new machine?
    • Re: (Score:3, Informative)

      by Anonymous Coward
      BG/P will support 2 GB standard for each compute node; a compute node has 4 cores. An option for 4 GB of memory is also available. On BG/L the initial memory configuration at Livermore was 512 MB per compute node, which had 2 cores. Since 2007 BG/L has offered 1 GB memory as the standard configuration.
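Taking the AC's figures at face value, the per-core picture works out as follows, which makes the parent's ~256MB remark concrete (simple division, nothing more):

```python
# Memory per core under the configurations described above (all in MB).
bgl_initial_per_core = 512 / 2    # BG/L at Livermore, 2-core nodes
bgp_standard_per_core = 2048 / 4  # BG/P standard, 4-core nodes
bgp_option_per_core = 4096 / 4    # BG/P with the 4 GB option

print(bgl_initial_per_core, bgp_standard_per_core, bgp_option_per_core)
# So BG/P standard actually doubles the per-core memory relative to the
# initial BG/L configuration, despite the extra cores per node.
```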
  • by deadline (14171) on Tuesday June 26, 2007 @01:33PM (#19652145) Homepage

    Blue Gene [wikipedia.org] is a specialized design that is based on using large numbers of low-power CPUs. This approach is also the one taken by SiCortex [sicortex.com]. One of the big problems with heroic computers (computers that are pushing the envelope in terms of performance) is heat and power. Just stacking Intel and AMD servers gets expensive at the high end.

  • by loonicks (807801) on Tuesday June 26, 2007 @01:40PM (#19652269)
    Who cares if it's as fast as 1.5 miles of stacked laptops? Why do we always have to compare things in such arbitrary units? Let's ask some other questions:
    • How many football fields does the hardware span?
    • How many Volkswagens does it weigh?
    • How many AOL CDs worth of storage does it contain?
    • How many Libraries of Congress can it process per unit time?
    • If it were melted down and re-formed into low-cost housing materials, how many starving third-world children could it shelter?
    • by Belial6 (794905)
      We pick arbitrary units because in the end, all units of measurement are arbitrary. What we think of as standard measurements are just some arbitrary measurements that lots and lots of people agree on. Given that much of the world has a difficult time understanding what a petaflop really means, the writers will use a unit of measurement that they believe people will understand, and compare it to the 'standard' units. This is frequently a useful way to get the data across. Of course, I will agree that Li
    • Re: (Score:2, Funny)

      Actually, to be fair, you could answer those questions. If you really want to compare an arbitrary metric, you should ask something like
      • How many accountants can be replaced by this processing power?
      • How many spider webs would be filled with the bugs that occupy the code written for this system?
      • How many outsourced technicians will it take to support this system if it runs something that needs toll-free support?
  • Not enough (Score:3, Funny)

    by Ollabelle (980205) on Tuesday June 26, 2007 @01:46PM (#19652389)
    Civ 4 will still run slow.
  • by i_like_spam (874080) on Tuesday June 26, 2007 @01:47PM (#19652399) Journal
    This announcement is part of the International Supercomputing Conference [supercomp.de], which just kicked off today. The new Top500 list [top500.org] will also be announced shortly.

    While the new IBM Blue Gene/P system is impressive, I'm more curious to see what sort of new supercomputer Andreas Bechtolsheim [nytimes.com] of Sun Microsystems has put together.

    Here's an interesting quote about Bechtolsheim from the article:

    'He's a perfectionist,' said Eric Schmidt, Google's chief executive, who worked with Mr. Bechtolsheim beginning in 1983 at Sun. 'He works 18 hours a day and he's very disciplined. Every computer he has built has been the fastest of its generation.'
    • by shaitand (626655)
      It gets a brief note at the bottom of TFA. At 500 teraflops it pales in comparison.
    • by flaming-opus (8186) on Tuesday June 26, 2007 @04:35PM (#19654819)
      It appears that Sun's design is less revolutionary. It's just a bunch of off-the-shelf blade servers strung together with InfiniBand. They use the same cabinets, power supplies, etc. as the regular blade-server offerings for non-technical computing. It also runs a regular clustered Linux OS rather than a supercomputer-specific OS, as the Blue Gene does. The big differentiator of the Sun system is the massive 3000-port InfiniBand switch. I'm sure it's not actually a 3000-port switch, but a bunch of small switches packed together, running over printed circuit boards rather than cables.

      Sun's design is affordable, and probably has a pretty decent max performance and reasonable power/memory per node. However, it's not as exotic as IBM's design. The IBM design has fantastic flops/watt and flops/square-foot performance. However, each node is really wimpy, which forces you to use a LOT of nodes for any problem, which increases the necessary amount of communication. Some problems work really well, others, not so much.

      IBM has limited blue gene to a small number of customers, all with fairly large systems. I suspect that's because it's very difficult to port an application to the system, and get good performance.
  • The next version will do fifty petaflops but its weather calculations will always be wrong.

    That is until one day someone remembers to add in the massive heat output from its own cooling towers.
    • Re: (Score:3, Informative)

      by shaitand (626655)
      Even with the computing power, weather would be impossible to calculate. It isn't because of a lack of understanding either. In order to calculate weather you don't just need to know how weather works, you need to have precise data on every variable across the globe, and these measurements would need to be taken to a resolution that is simply insane. If you had a fast enough machine, it could even catch up with current weather from that point, but your snapshot would have to be exact and all measurements would
    • I could give a crap about the petaflops, how many laptops tall will it be?
  • by Vellmont (569020) on Tuesday June 26, 2007 @01:54PM (#19652501)
    Years ago, shortly after the Pentium first came out and the then-astounding "x million flops" numbers were floating around, I wondered how far we were behind the power of supercomputers. I remember doing some rough calculations and finding that only a few Pentiums could do the calculations of a Cray 1. I don't remember the specifics of how many Pentiums per Cray, or how rough the calculation was, but that's largely unimportant for my point.

    So I have to wonder, which supercomputer of 10 or 20 years ago is a modern, hefty desktop equivalent to? Have supercomputers grown in computing power faster than desktops, at the same rate, or have they fallen behind?
    • by flaming-opus (8186) on Tuesday June 26, 2007 @04:47PM (#19655013)
      A tricky question, but not all that interesting. A fast server processor is within a factor of 4 of the fastest supercomputer processor in the world. That does not mean that you can do equivalent work with the server processor. Among other things, processing performance (gigaflops) of a CPU is no longer the interesting part of a supercomputer. (It never really was.) Memory bandwidth, interconnect bandwidth and latency, and I/O performance are the more interesting features of supers. 12-year-old Cray processors still have five times the memory bandwidth of modern PC processors, and twenty times the I/O bandwidth.

      You'll notice that 98% of the supercomputers sold in the last 10 years use server processors. (Blue Gene actually uses an embedded-systems processor, but it's the same idea.) However, in the late 80's putting 256 processors in a super was cutting edge. In the 90's, a few thousand. Soon you'll see a quarter million cores. So supers are actually getting faster at a higher rate than are desktops, at least by most measures.

      • by Vellmont (569020)

        processing performance (gigaflops) of a CPU, is no longer the interesting part of a supercomputer. (It never really was) memory bandwidth, interconnect bandwidth and latency, and I/O performance are the more interesting features of supers.


        I always hear this, but I've never seen anything terribly definitive on it. I'd like to see how fast a Cray from 12 years ago, and a modern top-of-the-line desktop PC with a hot graphics processor, could solve a problem designed to run on that Cray from 12 years ago. Metri
        • Re: (Score:3, Insightful)

          by afidel (530433)
          A Cray from 12 years ago would be a T90. The top of the line was the T932 with 32 vector CPU's. It was capable of 57.6 gigaflops and had a total internode I/O bandwidth of 330GB/s. It maxed out at 8GB of main memory. Compare that to an ATI Radeon x1950xtx gpu running folding@home at ~90Gflops with a half gig of ram and ram I/O of 64GB/s, which is significantly faster than a desktop CPU. So, it really depends on what your problems throughput limitation is, CPU/GPU raw power or I/O bandwidth as to whether a c
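The parent's comparison can be made concrete as machine balance, i.e. bytes of memory bandwidth available per flop. Using the figures as given in the parent post (I have not independently verified them):

```python
# Machine balance (bytes/flop) for the two designs compared above.
# Numbers are the parent poster's: Cray T932 aggregate vs. a Radeon
# X1950 XTX running Folding@home.
t90_gflops, t90_bw_gbs = 57.6, 330
gpu_gflops, gpu_bw_gbs = 90.0, 64

t90_bytes_per_flop = t90_bw_gbs / t90_gflops  # ~5.7 bytes/flop
gpu_bytes_per_flop = gpu_bw_gbs / gpu_gflops  # ~0.7 bytes/flop

print(f"T90: {t90_bytes_per_flop:.2f} B/flop, GPU: {gpu_bytes_per_flop:.2f} B/flop")
```

On raw flops the GPU wins, but the T90 can feed each operation roughly eight times as much memory traffic, which is exactly the parent's "depends on your problem's throughput limitation" point.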
  • I thought these would be based on the new Cell architecture, which is simply awesome. http://arstechnica.com/articles/paedia/cpu/cell-1.ars [arstechnica.com]
    • by cerelib (903469)
      Wouldn't that be a bit redundant? It seems like the purpose of the Cell architecture was to take the ideas used to create these kinds of super-computers and put them on a single chip, allowing for good parallel performance. Wasn't it being touted as a super-computer on a chip? To program for a BlueGene based on Cell, you would have to parallelize and specialize your already parallel tasks to get the full advantage. That is my understanding of it, but only time will tell what can be done, or what people w
  • by blind biker (1066130) on Tuesday June 26, 2007 @01:56PM (#19652543) Journal
    "How many laptop-miles does this computer do?"
  • ...chips like the PPC 450 are the reason WHY Apple moved over to Intel, not a reason why they should have stayed. IBM made a business decision to steer its CPU engineering resources away from general-purpose desktop computing (aka G5) and focus on two more specialized niches: big iron (aka Blue Gene, POWER6) and consoles (aka Cell, Xenon, Broadway). All of those are very nice chips that make IBM a LOT of money, but NONE of them are suited to be the brains of a consumer Mac, and especially not a Mac laptop.

    • Re: (Score:2, Interesting)

      by Anonymous Coward
      FYI, these are not "normal" PPC 450s ... they are PPC 450 cores with two high-end FPUs bolted on (the FPUs from the G5). This works very well if you want to build a big parallel machine like BGP. As you say, no good for a desktop (true), but my point is just that this is not a typical embedded PPC chip.
  • I thought 850MHz chips were slow by today's standards. What am I missing?
    • by davidbrit2 (775091) on Tuesday June 26, 2007 @03:06PM (#19653571) Homepage

      What am I missing?
      The other 4,095 of them.
      • Also, it's an entirely different architecture - Intel P4 fanboys will say that an AMD FX7x running at like 2.x or 3.x (whatever it's at) is a slow piece of crap. Basically, clock speed (bad car analogy: horsepower) isn't everything; it's the actual work that it can do (torque) that matters for heavy stuff...
    • Re: (Score:3, Interesting)

      by joib (70841)

      I thought 850 chips were slow by today's standards. What am I missing?


      You can stuff 4096 cores (1024 chips) per rack. Precisely because the chips are a slow low power design.
    • by homer_ca (144738)
      And the part about each chip having 4 cores and a 2x2 ft circuit board containing 32 chips. 128 CPU cores in the space of an ATX motherboard is pretty impressive even at only 850Mhz each.
  • vista (Score:4, Funny)

    by arclyte (961404) on Tuesday June 26, 2007 @03:12PM (#19653649)
    IBM researchers are excited, because if they can get it to sustain the 3 petaflops, they'll finally be able to switch on the new "Aero" feature of the Windows Vista Super-Penultimate Premium Advanced edition.
  • Probably not the greenest computer.
  • Contrary to what most people think, there is no singular unit "FLOP": it is FLOPS even for one, because the S is not a plural marker. FLOPS stands for Floating-point Operations Per Second. So I chuckle every time I read "1 PETAFLOP." Guys, just turn off your singular/plural alarm and say it with me: 1 and only 1 PETAFLOPS.
  • A high-end laptop with a Core 2 Duo at 2.4GHz rates around 20 GFLOPS.
    You can probably overclock it to 25 GFLOPS, and it is 25mm thick (opened up).

    The GPU probably adds between 20 and 200 Gigaflops (Nvidia claims 500GFlops on the 8800) http://en.wikipedia.org/wiki/GeForce_8_Series [wikipedia.org]

    Maybe an overall estimate of 50 GFLOPS total is reasonable.

    To get 2 petaflops you would need 40,000 of these, or a stack about 1 km high.
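Spelled out, the estimate scales like this (using the 50 GFLOPS per laptop and 25mm thickness figures from this comment, which are rough guesses to begin with):

```python
# Laptop-stack arithmetic with the estimates given above.
LAPTOP_GFLOPS = 50         # rough per-laptop estimate, CPU + GPU
LAPTOP_THICKNESS_M = 0.025 # 25 mm, opened up

laptops = 2e6 / LAPTOP_GFLOPS          # 2 PFLOPS expressed in GFLOPS
stack_m = laptops * LAPTOP_THICKNESS_M

print(f"{laptops:.0f} laptops, stack ~{stack_m / 1000:.1f} km high")
```

A ~1 km stack is about 0.6 miles, so IBM's "1.5-mile-high stack of laptops" claim implies a somewhat lower per-laptop estimate than the 50 GFLOPS guessed here.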
