IBM's Eight-Core, 4-GHz Power7 Chip 425
pacopico writes "The first details on IBM's upcoming Power7 chip have emerged. The Register is reporting that IBM will ship an eight-core chip running at 4.0 GHz. The chip will support four threads per core and fit into some huge systems. For example, the University of Illinois is going to house a 300,000-core machine that can hit 10 petaflops. It'll have 620 TB of memory and support 5 PB/s of memory bandwidth. Optical interconnects anyone?"
I have a serious question: (Score:1, Interesting)
Seriously:
I have a couple dual-core PCs. I notice that some won't ever use 100% CPU even though they easily could. I check "set affinity" in task manager, which says the process should use both cores...but it only ever hits 50% of total CPU. Looking at the CPU graph, it shows that as usage goes up on one core, usage goes down on the other.
Is there any way to force a process to run over 2 cores at 100%?
If not...how would 300,000 cores help unless you are running 300,000 processes, or an app that you know will scale over that many cores?
The preceding was in fact a serious question.
Re:Steve Jobs is crying in his pillow tonight. (Score:3, Interesting)
Chances are IBM will still have a problem supplying them, and new game consoles will get priority in shipping in 2010, when that XBox 720 or Playstation 4 comes out.
It is also possible that the eight-core chip will be really expensive, and that in order to keep up with it a PowerMac would cost $4000 or more, just to eliminate bottlenecks and use the kind of optical technology supercomputers use, so the chip could be used properly. That said, nothing stops Apple from bringing out PowerPC-based Macs in 2010: Mac OS X already runs on PowerPC code and would only have to be modified to run on the Power7 instruction set, which is very doable. Apple could offer Intel Macs as low-cost systems for home and small business, and Power7 Macs as high-end workstations and servers for mid-size to large businesses. I don't see why Apple couldn't bring back PowerMacs and sell them next to Intel Macs, unless IBM starts to have production problems again and can't supply Apple with the number of PowerPC chips it needs.
Great (Score:5, Interesting)
4 Threads per core? (Score:5, Interesting)
jdb2
Re:I have a serious question: (Score:5, Interesting)
And the reason it kind of oscillates between cores is that "Set Affinity" tells the process that it's allowed to use that core, not that it has to or even should. If you want something to use both cores, open up two processes, set the first to core 1 and the second to core 2. Most of the time splitting work like that isn't practical, but I recently transcoded my entire music library with one process doing songs from A-M and another doing N-Z. It really helped.
Memory Bandwidth (Score:5, Interesting)
It'll have 620 TB of memory and support 5 PB/s
Is that kind of memory bandwidth possible? You could access the entire 620 TB in ~120 milliseconds. I guess nothing is ever too fast; it just seems unrealistically fast.
Re:PPC Linux (Score:3, Interesting)
The problem is Sony crippled the environment in ways that make it very hard to use a PS3 as a computer.
I still think one could build a cheap computer with a Cell processor and make a decent profit. Those über-DSPs could do a whole lot to help the relatively puny PPC cores, and having such a box readily available would foster a lot of research into asymmetric multi-processors. It's really sad to see future compsci graduates who have never really used anything not descended from an IBM 5150.
That said, I think there is some interesting stuff coming to the x86 world. That Larrabee x86 thing Intel is readying could be very interesting in itself and even generate some more interesting spin-offs. Imagine having a couple of cores that could, in an emergency, run the same binaries but that were tuned for different applications. One out-of-order core plus 4 in-order multi-threading cores would make a very interesting desktop processor.
Re:Memory Bandwidth (Score:3, Interesting)
I believe the memory is aggregate, and so is the bandwidth... so per-core memory bandwidth is only 5 PB/s divided across 300K cores.
The real question is how memory is allocated per core: does each core have unrestricted access to the full 620 TB, or is it a cluster with each machine having unrestricted access to a subset and a software interface to move data to other nodes?
If anyone here has insight on this, please fill in the giant blank.
Re:I have a serious question: (Score:4, Interesting)
"Aren't a lot of games and apps single-threaded?"
And that's one more thing we can thank Microsoft for.
Had DOS and the PC clones, crippled by the mono-processor, mono-threaded DOS/Windows stack, not become the dominant architecture for most of the '90s, we would have rock-solid, secure, multi-processor, 64-bit RISC boxes running some flavor of Unix on our desktops by now.
Thanks Bill.
Re:I have a serious question: (Score:3, Interesting)
Single threading, as used in old versions of Windows, does have some advantages. It avoids a lot of concurrency-related problems that most programmers are not properly trained to deal with. If everyone follows the rules, it's efficient and performs well.
I was recently reading a paper on multi-core processors and the future of programming. It pointed out that many multi-threaded programs work fine until they are run on a real multi-processor system where multiple threads can actually run simultaneously. At that point, strange timing-related bugs often appear that are very difficult to replicate and diagnose.
Re:Memory Bandwidth (Score:3, Interesting)
Re:I have a serious question: (Score:3, Interesting)
iometer [iometer.org]
Properly configured it can stress all the cores on all the nodes in your cluster.
Oh you wanted to do something useful...
Intel released it as open source in 2001. Edit the source for the dynamo so that it does something useful. Compile and install. Done.
Or you could load Vista and play a light game. That ought to peg both cores.
Actually, dual core is what it's cracked up to be. While your single threaded application is grinding away you can still interact with your computer instead of staring at the hourglass like you used to do. Since you like playing with the affinity you can launch several long single threaded tasks and set their affinity for different cores. Transcode a .AVI into a DVD of the family picnic? Render an animation in POVRay? Compute a few billion prime numbers. Fold some proteins. Calculate the propagation of thermal energy through single fibers in a carbon-fiber fabric. Whatever you want.
Soon almost all non-trivial applications will be multithreaded, and then you'll be cursing the hourglass again. Until then enjoy your vacation from its tyranny.
Re:UofI machine is a bit low on memory (Score:3, Interesting)
Re:I have a serious question: (Score:4, Interesting)
The benefit? Unless your app uses more than around 3GB of RAM, basically zero
Plenty of things quite enjoy being able to perform operations 64 bits at a time, actually; especially media, crypto, compression, and indeed games. That's on top of having 2-3x the usable number of general-purpose registers, which certainly isn't something to sneer at given how awful x86 has traditionally been in this area. Plenty of things you actually care about the performance of are likely to get a nice boost.
64-bits as a waste of address space, UNLESS you're accessing large amounts of memory (>3GB per program!)
Well, you generally only get 3GB when you've performed tricks to ask the OS to allow that; e.g. /3GB boot flag, fiddling with MAXDSIZ, or recompiling with a different user/kernel space split.
On top of that, it's not all about RAM, it's about address space; if you've only got 32 bits to play with, you need to be very careful about allocating it, since any wastage can lead to exhausting your virtual address space before your physical space; like with filesystems, fragmentation becomes more of a concern the closer you get to your maximum capacity.
Large virtual spaces are also useful for things like mmapping large files; for instance, a database might like to mmap its data files to avoid the unnecessary copying you get with read()/write(), but mmapping a 1GB file means you need 1GB of address space even if you never touch any of it. When it's common to access disk through memory addresses, 3GB starts looking small very fast.
You also very quickly eat into it using modern graphics cards; 512MB is common, having two isn't that uncommon, and things are moving towards 1GB; bang goes your 3GB, all that frame buffer needs addressing too, on top of the kernel's other needs.
Really, 32bit needs to die screaming, sooner rather than later.
Re:Sorry Mac fans . . . (Score:3, Interesting)
Funny how people actually thought IBM couldn't deliver a 3 GHz or cool-running G5. No, they just chose not to deliver it to Apple. Their focus is enterprise: servers and massive scientific computing. The early warning came when they sold their superbly prestigious, brand-advertising laptop division to Lenovo.
Just imagine if they had cancelled this CPU to deliver a 3 GHz G5 to Apple. For what? Apple fans turned into x86 fanatics almost overnight, happily buying Parallels to run Windows applications on OS X and buying overpriced Windows games masked as OS X applications.
At least IBM and Apple took away the "endian" excuse from developers and GPU vendors. They still shamelessly sell graphics cards 20-30% more expensive to Mac users, who are now running Intel on standard PCI-X mainboards! The new excuse is... EFI!
Re:I have a serious question: (Score:3, Interesting)
Besides the increased number of general-purpose registers on x86-64, there's also the change in calling convention: on 32-bit x86, function arguments are pushed onto the stack, whereas on x86-64 they are passed in registers. That's another reason apps like Photoshop run faster when compiled as 64-bit x86 code.