Supercomputing IBM

IBM's Eight-Core, 4-GHz Power7 Chip (425 comments)

pacopico writes "The first details on IBM's upcoming Power7 chip have emerged. The Register is reporting that IBM will ship an eight-core chip running at 4.0 GHz. The chip will support four threads per core and fit into some huge systems. For example, the University of Illinois is going to house a 300,000-core machine that can hit 10 petaflops. It'll have 620 TB of memory and support 5 PB/s of memory bandwidth. Optical interconnects, anyone?"
This discussion has been archived. No new comments can be posted.

  • by inject_hotmail.com ( 843637 ) on Monday July 14, 2008 @10:13PM (#24190667)

    Seriously:

    I have a couple dual-core PCs. I notice that some won't ever use 100% CPU even though they easily could. I check "set affinity" in task manager, which says the process should use both cores...but it only ever hits 50% of total CPU. Looking at the CPU graph, it shows that as usage goes up on one core, usage goes down on the other.

    Is there any way to force a process to run over 2 cores at 100%?

    If not...how would 300,000 cores help unless you are running 300,000 processes, or an app that you know will scale over that many cores?

    The preceding was in fact a serious question.
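
    A minimal sketch of why this happens: a single process only pegs both cores if it actually runs its work on two (or more) threads, and affinity only tells the scheduler where the process may run, not how many cores it will use. Assuming a Linux box with POSIX threads (compile with gcc -pthread):

        /* CPU-bound work split across two threads; with two cores free,
         * total CPU usage should sit near 100% instead of 50%. */
        #include <pthread.h>
        #include <stdio.h>

        #define NTHREADS 2

        static void *spin(void *arg)
        {
            (void)arg;
            volatile unsigned long x = 0;
            for (unsigned long i = 0; i < 2000000000UL; i++)
                x += i;                     /* busy work, one core per thread */
            return NULL;
        }

        int main(void)
        {
            pthread_t t[NTHREADS];
            for (int i = 0; i < NTHREADS; i++)
                pthread_create(&t[i], NULL, spin, NULL);
            for (int i = 0; i < NTHREADS; i++)
                pthread_join(t[i], NULL);
            puts("done");
            return 0;
        }

    A single-threaded program gets no such benefit; 300,000 cores only help an application written to spread its work across them, or 300,000 separate jobs.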

  • Chances are IBM will still have a problem supplying them, plus new game consoles will get priority in shipping in 2010, when that XBox 720 or Playstation 4 comes out.

    It is also possible that the eight-core chip will be really expensive, and that in order to keep up with it a PowerMac would cost $4000 or more, just to eliminate bottlenecks and use the kind of optical technology supercomputers use so the chip could be fed properly. Not that anything stops Apple from bringing out PowerPC-based Macs in 2010: Mac OS X already runs on PowerPC code and would only have to be modified to run on the Power7 instruction set, which is very doable. Apple could have Intel Macs as low-cost systems for homes and small businesses, and Power7 Macs as high-end workstations and servers for mid-size to large businesses. I don't see why Apple couldn't bring back PowerMacs and sell them next to Intel Macs, unless IBM starts to have production problems again and can't supply Apple the number of PowerPC chips it needs.

  • Great (Score:5, Interesting)

    by afidel ( 530433 ) on Monday July 14, 2008 @10:30PM (#24190827)
    So you can get 16 cores in a low-end box, but it still won't have enough I/O slots, so you will have to buy a shelf at $obscene_amount. Seriously, why does IBM put so few I/O slots in the lower-end P series boxes?
  • 4 Threads per core? (Score:5, Interesting)

    by jdb2 ( 800046 ) * on Monday July 14, 2008 @10:35PM (#24190875) Journal
    It should be noted that previous POWER architectures had 2 threads per core. They also had SMT ( Simultaneous Multi-Threading [wikipedia.org] ) support, which gave them an "effective" 4 threads per core. I wonder: are all the threads on the POWER7 "true" threads (i.e. 4 execution units, 1 per thread), or is it a 2-thread setup with SMT? On the other hand, if the POWER7 really does have 4 "true" threads, then with SMT you'd get an "effective" *8* threads per core.

    jdb2
  • by jcarkeys ( 925469 ) on Monday July 14, 2008 @10:39PM (#24190907) Homepage
    You, sir, are correct. Most things aren't written to run in parallel and thus don't get the gains. Gains come from optimized code (obviously), but also from doing multiple tasks. Leave Photoshop to render an HDR image and play a game, if you want. Though to be fair, it's not "just 1/4th" performance; the other cores are able to handle some of the other CPU tasks, such as running hardware controllers.

    And the reason it kind of oscillates between cores is that "Set Affinity" tells the process that it's allowed to use that core, not that it has to or even should. If you want something to use both cores, open up two processes, set the first to core 1 and the second to core 2. Most of the time that isn't practical, but I recently transcoded my entire music library and set one process to do songs from A-M and the other from N-Z. It really helped.
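
    A rough Linux-flavoured sketch of that two-process trick (sched_setaffinity is the counterpart of Task Manager's "Set Affinity"); the worker command here is just a hypothetical CPU-bound placeholder:

        /* Fork two workers and pin each one to its own core, so two
         * independent single-threaded jobs (e.g. transcoding A-M and N-Z)
         * run side by side. Linux-specific. */
        #define _GNU_SOURCE
        #include <sched.h>
        #include <unistd.h>
        #include <sys/wait.h>

        static void pin_to_core(int core)
        {
            cpu_set_t set;
            CPU_ZERO(&set);
            CPU_SET(core, &set);
            sched_setaffinity(0, sizeof(set), &set);   /* 0 = calling process */
        }

        int main(void)
        {
            for (int core = 0; core < 2; core++) {
                if (fork() == 0) {               /* child: one worker per core */
                    pin_to_core(core);
                    /* placeholder CPU-bound job; swap in the real encoder */
                    execlp("sh", "sh", "-c",
                           "i=0; while [ $i -lt 2000000 ]; do i=$((i+1)); done",
                           (char *)NULL);
                    _exit(1);
                }
            }
            while (wait(NULL) > 0)
                ;                                /* reap both workers */
            return 0;
        }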

  • Memory Bandwidth (Score:5, Interesting)

    by Brad1138 ( 590148 ) * <brad1138@yahoo.com> on Monday July 14, 2008 @10:50PM (#24190989)

    It'll have 620 TB of memory and support 5 PB/s

    Is that kind of memory bandwidth possible? You could read the entire 620 TB in about 125 milliseconds (620 TB / 5 PB/s). I guess nothing is ever too fast; it just seems unrealistically fast.

  • Re:PPC Linux (Score:3, Interesting)

    by rbanffy ( 584143 ) on Monday July 14, 2008 @11:11PM (#24191175) Homepage Journal

    The problem is Sony crippled the environment in ways that make it very hard to use a PS3 as a computer.

    I still think one could build a cheap computer with a Cell processor and make a decent profit. Those über-DSPs could do a whole lot to help the relatively puny PPC cores, and having such a box readily available would foster a lot of research on asymmetric multi-processors. It's really sad to see future compsci graduates who never really used anything not descended from an IBM 5150.

    That said, I think there is some interesting stuff coming to the x86 world. That Larrabee x86 thing Intel is readying could be very interesting in itself and even generate some more interesting spin-offs. Imagine having a couple of cores that could, in an emergency, run the same binaries but that were tuned for different applications. One out-of-order core plus 4 in-order multi-threading cores would make a very interesting desktop processor.

  • Re:Memory Bandwidth (Score:3, Interesting)

    by irtza ( 893217 ) on Monday July 14, 2008 @11:15PM (#24191209) Homepage

    I believe the memory is aggregate and so is the bandwidth, so the per-core memory bandwidth is only 5 PB/s spread over 300K cores, roughly 17 GB/s per core.

    The real question is how memory is allocated per core: does each core have unrestricted access to the full 620 TB, or is it a cluster, with each machine having unrestricted access to a subset and a software interface to move data to other nodes?

    If anyone here has insight on this, please fill in the giant blank.

  • by rbanffy ( 584143 ) on Monday July 14, 2008 @11:24PM (#24191257) Homepage Journal

    "Aren't a lot of games and apps single-threaded?"

    And that's one more thing we can thank Microsoft for.

    Had DOS and the PC clones, crippled with the mono-processor/mono-threaded DOS/Windows stack, not become the dominant architecture for most of the '90s, we would have rock-solid, secure, multi-processor, 64-bit RISC boxes running some flavor of Unix on our desktops by now.

    Thanks Bill.

  • by Detritus ( 11846 ) on Monday July 14, 2008 @11:49PM (#24191435) Homepage

    Single threading, as used in old versions of Windows, does have some advantages. It avoids a lot of concurrency-related problems that most programmers are not properly trained to deal with. If everyone follows the rules, it's efficient and performs well.

    I was recently reading a paper on multi-core processors and the future of programming. It pointed out that many multi-threaded programs work fine until they are run on a real multi-processor system where multiple threads can actually run simultaneously. At that point, strange timing-related bugs often appear that are very difficult to replicate and diagnose.
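
    A sketch of the kind of latent bug being described, assuming pthreads: an unsynchronized shared counter often looks fine when the threads are time-sliced on one CPU, but loses updates as soon as they really run at the same time.

        /* Two threads increment a shared counter without any locking.
         * On a real multi-core machine the final value is almost never
         * the expected 2,000,000. */
        #include <pthread.h>
        #include <stdio.h>

        static long counter = 0;                 /* shared, unprotected */

        static void *bump(void *arg)
        {
            (void)arg;
            for (int i = 0; i < 1000000; i++)
                counter++;                       /* racy read-modify-write */
            return NULL;
        }

        int main(void)
        {
            pthread_t a, b;
            pthread_create(&a, NULL, bump, NULL);
            pthread_create(&b, NULL, bump, NULL);
            pthread_join(a, NULL);
            pthread_join(b, NULL);
            printf("expected 2000000, got %ld\n", counter);
            return 0;
        }

    Guarding the increment with a mutex or an atomic add makes the result deterministic again, which is exactly the discipline most programmers aren't trained in.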

  • Re:Memory Bandwidth (Score:3, Interesting)

    by 777v777 ( 730694 ) on Tuesday July 15, 2008 @12:05AM (#24191563)
    It would be incredibly unlikely that each core could directly access the full 620 TB. The current largest machines on the Top500 list are all distributed-memory machines (clusters). However, the trend in modern interconnect networks is to increase the capabilities for doing things like remote direct memory access (RDMA). In such a scheme, the remote memory is not addressable (with load/store instructions), but data can be transferred between the memories of different nodes by the network hardware. The codes commonly run on Top500 machines are likely written in MPI or MPI/OpenMP, which means they don't need to directly access remote memories.
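
    A minimal MPI sketch of that distributed-memory model (assuming an MPI implementation such as mpicc/mpirun is available): each rank fills only its own slice of memory, and rank 0 reaches rank 1's data by message passing rather than by a load/store to a remote address.

        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int rank, size;
            double local[1024];                  /* this node's slice only */

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            for (int i = 0; i < 1024; i++)
                local[i] = rank;                 /* fill our own memory */

            if (rank == 0 && size > 1) {
                /* rank 0 cannot dereference rank 1's array; it asks for it */
                double remote[1024];
                MPI_Recv(remote, 1024, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                printf("got %g from rank 1\n", remote[0]);
            } else if (rank == 1) {
                MPI_Send(local, 1024, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
            }

            MPI_Finalize();
            return 0;
        }
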
  • by symbolset ( 646467 ) on Tuesday July 15, 2008 @12:14AM (#24191635) Journal

    iometer [iometer.org]

    Properly configured it can stress all the cores on all the nodes in your cluster.

    Oh you wanted to do something useful...

    Intel released it as open source in 2001. Edit the source for the dynamo so that it does something useful. Compile and install. Done.

    Or you could load Vista and play a light game. That ought to peg both cores.

    Actually, dual core is what it's cracked up to be. While your single threaded application is grinding away you can still interact with your computer instead of staring at the hourglass like you used to do. Since you like playing with the affinity you can launch several long single threaded tasks and set their affinity for different cores. Transcode a .AVI into a DVD of the family picnic? Render an animation in POVRay? Compute a few billion prime numbers. Fold some proteins. Calculate the propagation of thermal energy through single fibers in a carbon-fiber fabric. Whatever you want.

    Soon almost all non-trivial applications will be multithreaded, and then you'll be cursing the hourglass again. Until then enjoy your vacation from its tyranny.

  • by hkfczrqj ( 671146 ) on Tuesday July 15, 2008 @01:57AM (#24192279)
    Being a grad student at Illinois, I can tell you something: you really don't know about the University's accounting system. It can literally index every atom on campus (not that they need to). That's why 640 won't be enough :) Also, the supercomputer will require the construction of a new power plant. Seriously.
  • by Fweeky ( 41046 ) on Tuesday July 15, 2008 @02:27AM (#24192421) Homepage

    The benefit? Unless your app uses more than around 3GB of RAM, basically zero

    Plenty of things quite enjoy being able to perform operations on 64 bits at a time, actually. Especially when it comes to media, crypto, compression, and indeed games; on top of that you get 2-3x the usable number of general-purpose registers, which certainly isn't something to sneer at given how awful x86 has traditionally been in this area. Plenty of things whose performance you actually care about are likely to get a nice boost.

    64-bits as a waste of address space, UNLESS you're accessing large amounts of memory (>3GB per program!)

    Well, you generally only get 3GB when you've performed tricks to ask the OS to allow that; e.g. /3GB boot flag, fiddling with MAXDSIZ, or recompiling with a different user/kernel space split.

    On top of that, it's not all about RAM, it's about address space; if you've only got 32 bits to play with, you need to be very careful about allocating it, since any wastage can lead to exhausting your virtual address space before your physical space; like with filesystems, fragmentation becomes more of a concern the closer you get to your maximum capacity.

    Large virtual spaces are also useful when it comes to doing things like mmapping large files; for instance, a database might like to mmap data files to avoid unnecessary copying you get with read()/write(), but mmapping a 1GB file means you need 1GB of address space, even if you don't touch any of it. When it's common to access disk using memory addresses, 3GB starts looking small very fast.

    You also very quickly eat into it using modern graphics cards; 512MB is common, having two isn't that uncommon, and things are moving towards 1GB; bang goes your 3GB, all that frame buffer needs addressing too, on top of the kernel's other needs.

    Really, 32-bit needs to die screaming, sooner rather than later.
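
    A small sketch of the address-space point, using a hypothetical 1 GB file named big.db: the mapping consumes a gigabyte of virtual address space the moment mmap() returns, whether or not any page is ever touched, which is why a 32-bit process runs out of room long before it runs out of RAM.

        #include <sys/mman.h>
        #include <sys/stat.h>
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int main(void)
        {
            int fd = open("big.db", O_RDONLY);   /* hypothetical 1 GB data file */
            if (fd < 0) { perror("open"); return 1; }

            struct stat st;
            fstat(fd, &st);

            /* reserves st.st_size bytes of virtual address space immediately,
             * but no physical RAM until pages are actually read */
            void *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) { perror("mmap"); return 1; }

            printf("mapped %lld bytes at %p\n", (long long)st.st_size, p);
            munmap(p, st.st_size);
            close(fd);
            return 0;
        }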

  • by Ilgaz ( 86384 ) on Tuesday July 15, 2008 @05:42AM (#24193351) Homepage

    The funny thing is that people actually thought IBM couldn't deliver a 3 GHz or cool-running G5. No, they just chose not to deliver it to Apple. Their focus is enterprise, servers, and massive scientific computing. The early warning came when they sold their superbly prestigious, brand-advertising laptop division to Lenovo.

    Just imagine if they had cancelled this CPU to deliver a 3 GHz G5 to Apple. For what? Apple fans turned x86 fanatics almost overnight, happily buying Parallels to run Windows applications on OS X and buying overpriced Windows games masked as OS X applications.

    At least IBM and Apple took away the "endian" excuse from developers and GPU vendors. They still shamelessly sell 20-30% more expensive graphics cards to Mac users running Intel on standard PCI-X mainboards! The new excuse is... EFI!

  • by LionMage ( 318500 ) on Tuesday July 15, 2008 @05:19PM (#24203471) Homepage

    Besides the increased number of general purpose registers on x86-64, there's also the change in calling convention -- on 32-bit x86, function arguments are pushed onto the stack, whereas on x86-64, the arguments are passed via register. That's another reason that apps like Photoshop run faster when compiled as 64-bit x86 code.
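
    A tiny illustration of the difference (the assembly below is a hand-written sketch, not actual compiler output): under the System V AMD64 ABI the first few integer arguments arrive in registers (rdi, rsi, rdx, ...), while 32-bit cdecl fetches each one from the stack.

        long add3(long a, long b, long c)
        {
            return a + b + c;
        }

        /* Roughly what a compiler emits:
         *
         *   x86-64 (arguments already in rdi/rsi/rdx):
         *       lea  rax, [rdi + rsi]
         *       add  rax, rdx
         *       ret
         *
         *   32-bit x86 (arguments fetched from the stack):
         *       mov  eax, [esp+4]
         *       add  eax, [esp+8]
         *       add  eax, [esp+12]
         *       ret
         */
        int main(void) { return (int)add3(1, 2, 3); }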
