IBM's Eight-Core, 4-GHz Power7 Chip
Posted by
kdawson
on Mon Jul 14, 2008 08:58 PM
from the 100-racks-of-goodness dept.
pacopico writes "The first details on IBM's upcoming Power7 chip have emerged. The Register is reporting that IBM will ship an eight-core chip running at 4.0 GHz. The chip will support four threads per core and fit into some huge systems. For example, University of Illinois is going to house a 300,000-core machine that can hit 10 petaflops. It'll have 620 TB of memory and support 5 PB/s of memory bandwidth. Optical interconnects anyone?"
Related Stories
IT: POWER7 To Ship In First Half of 2010 73 comments
BBCWatcher writes "In CPU news, IBM says that its POWER7 servers will start shipping in the first half of 2010, on schedule or perhaps even a few months early if you believe Wikipedia. Moreover, upgrades from a wide variety of POWER6 models will be mere CPU swaps, with the upgraded servers keeping their same serial numbers. (Bean counters like that.) POWER7 sports up to 8 cores per die, 4 threads per core, a clock speed a Hertz or two above 4 GHz, 45 nm process manufacturing, on-chip DDR3, and up to 1,000 micropartitions per machine. IBM claims that POWER7 will offer about 256 Gflops per die and two to three times the performance per watt as POWER6. IBM wants to keep taking orders now for its POWER6 gear (duh), so its sales reps are allegedly ready and eager to deal on 6-cum-7 packages. And it looks like that cunning plan could work rather well given Sun's Rock CPU cancellation and HP's delay of Tukwila Itanium to 2010. (Is anybody still in the server CPU race except IBM, Intel, and maybe AMD?) In 2006, POWER7 won the contest for a DARPA supercomputing R&D grant of $244 million, so you could say that each US citizen is in for about a dollar already."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Finally (Score:4, Funny)
Re:Finally (Score:5, Funny)
Parent
Toasty. (Score:5, Funny)
In other news, temperatures on the University of Illinois campus have mysteriously risen ten degrees. Scientists are still examining possible causes..
Re:Toasty. (Score:5, Funny)
Good thing they have a brand-new supercomputer, analyzing this temperature anomaly will be much faster!
Parent
Re:Toasty. (Score:5, Funny)
An African or European Heisenberg?
Parent
Re:Toasty. (Score:5, Funny)
> Scientists are still examining possible causes..
Nah. If something gets warmer it is caused by Global Warming and the solution is to eliminate Western industrial civilization.
If something gets colder it is Global Climate Change and the solution is to eliminate Western industrial civilization.
If we have more hurricanes it is Global Warming. Fewer and it is Climate Change. More tornadoes? Global Warming. Floods caused by increased snowfall? Somehow that was also Global Warming. I'd have thought they would have gone with Global Climate Change, but every rule seems to need an exception.
Parent
Re:Toasty. (Score:5, Insightful)
There's a difference between scientists and the mainstream media's coverage of that science. And I don't speak for jmorris here, but that just might be what he's satirizing.
Parent
Re: (Score:3, Insightful)
I think the people that complain about Global Warming hype are not complaining about the science but the dumbed down and politically motivated 'summary of what the vast majority of respectable scientists believe' from people who are activists not scientists.
Re:Toasty. (Score:5, Insightful)
BTW: If you think a post is clueless then put some information in your reply. AC ad-homs won't convince anyone of anything except perhaps that you're a witless arsehole.
Parent
Re:Toasty. (Score:4, Informative)
> not to mention, in 2007 that the northwest passage was completely ice free for the first time in recorded history.
Yea, right. Pull the other one. First time in recorded history huh? Except for 1906, 1944, 1957, 1969, 1977, 1984, 1985, 1988 and 2000 in wooden ships, catamarans, naval vessels, cruise ships, etc.
Stop believing the propaganda and do some googling before you open yer piehole and end up looking like a retard.
btw, here is the link I got from Google searching for "northwest passage ice free"
Classically Liberal: Bad reporting about the Northwest Passage issue [blogspot.com]
Parent
Re:Toasty. (Score:4, Informative)
not to mention, in 2007 that the northwest passage was completely ice free for the first time in recorded history.
Yea, right. Pull the other one. First time in recorded history huh? Except for 1906, 1944, 1957, 1969, 1977, 1984, 1985, 1988 and 2000 in wooden ships, catamarans, naval vessels, cruise ships, etc.
One should note that in 1906 at least it wasn't exactly ice free, which is why it took Amundsen and his Gjoa three years to pass through (1903-1906). Your list is basically a list of years when some vessel finished sailing through the North-West Passage, but it doesn't really say anything about how much ice they encountered on the way.
That's also the basic fallacy of the blog you're linking to - it mentions an ice-free North-West Passage, but only for 2000. For the other years it just mentions a couple of vessels, while not really saying anything about ice (except for 1984, where it says that the ice was "in retreat", implying that there still was some ice there).
So, for a comment like "Stop believing the propaganda and do some googling before you open yer piehole and end up looking like a retard." you are leaning out of the window a bit too much for my taste.
Parent
Core pron (Score:5, Funny)
"For example, University of Illinois is going to house a 300,000-core machine that can hit 10 petaflops. It'll have 620 TB of memory and support 5 PB/s of memory bandwidth."
I came.
Re:Core pron (Score:5, Funny)
I came.
I saw.
Parent
Re:Core pron (Score:5, Funny)
I came.
I saw.
I compiled.
Parent
Re:Core pron (Score:4, Informative)
I...
KHaaaaaaaaaaaaaaaaaaaaaaaaaaaNnnnn!!!!!!!!!!!!
Parent
Re:Core pron (Score:5, Funny)
I, Caesar, when I learned of the fame
Of Cleopatra, I straightway laid claim.
Ahead of my legions,
I invaded her regions,
I saw, I conquered, I came.
Parent
Re: (Score:3, Informative)
Veni, vini, vomi.
Re: (Score:3, Funny)
Is that a new Apple product?
PPC Linux (Score:4, Insightful)
I'd be a lot more excited about these PPC lines if Ubuntu 8.04 would install and run properly on the PS3, whose PPC+6xDSP architecture would be a great entry level platform for coming up with parallel techniques for the bigger and more parallel PPC chips.
Re: (Score:3, Interesting)
The problem is Sony crippled the environment in ways that make it very hard to use a PS3 as a computer.
I still think one could build a cheap computer with a Cell processor and make a decent profit. Those über-DSPs could do a whole lot to help the relatively puny PPC cores, and having such a box readily available would foster a lot of research with asymmetric multi-processors. It's really sad to see future compsci graduates who never really used anything not descending from an IBM 5150.
That said, I t
Re:PPC Linux (Score:5, Insightful)
No, that is not a problem. Linux, including Ubuntu, has been running on PS3 since the PS3 was released. Every Ubuntu since 6.10 has run on it. And current releases of other Linux (PPC) distros usually do install, especially the Yellow Dog that is the one officially supported by Sony. But the problem is that the Ubuntu team has too few developers, and new features in Ubuntu releases break the installation in ways that the small Cell/PPC team can't keep up with.
Also, there's nothing really "puny" about the Cell's PPC core, which is a 3.2GHz dual-hyperthread Power core.
The Cell/Linux platform has already got video drivers that offload graphics from the PPC to the DSPs the same way most distros run graphics on separate VGA chips. It's a little buggy, in beta, but that's why the project just needs more developers. Not more FUD.
I don't think you really know anything about how Linux actually runs on the Cell, on the PS3. I think you're just repeating the most whiny posts about it you've heard. Because the reality is very different from what you describe, even if it's still got problems. Problems that don't require waiting for more x86 HW revisions, but rather just a little more work on the Cell Linux that Ubuntu is releasing.
Parent
Great (Score:5, Interesting)
4 Threads per core? (Score:5, Interesting)
jdb2
Re:4 Threads per core? (Score:5, Informative)
Correct.
No, they do not "also" have SMT. It is the SMT that gives them 2 threads per core in the first place.
Power 5 & 6 have 2-way SMT. Power 7 has 4-way SMT.
Parent
Re:4 Threads per core? (Score:5, Informative)
> I always thought of a "core" as whatever minimal set of hardware is required to run a single thread at "full power". By my logic, anytime you run more than one thread on a core, you're doing what SMT does.
> Someone please tell me if I'm wrong (and how).
A core is a set of registers and function units, among other hardware. Each core is, effectively, a completely separate processor (though it likely shares some things, such as the L2 cache and FSB with other cores). Since processor usually refers to a whole chip (encased in a plastic or ceramic package, and soldered on the motherboard), the term "core" refers to when there is more than one inside a single package. The ultimate goal is usually to have all cores on a single piece of silicon, but often multi-chip modules are used (especially early in production), where a four-core processor might contain two silicon dies, each with 2 cores. This can help increase yield (by reducing the die size), and reduce production cost. After improving the yield of the processor, or changing to a reduced feature size (eg 90 nm to 65 nm), a switch back to a single die is possible, reducing the packaging cost.
Simultaneous Multithreading (SMT), on the other hand, works on a single processor/core. It is a feature that allows sharing of the processor resources, such as registers and functional units. A PowerPC 970 (which never had SMT support) could issue 4 instructions and 1 branch every clock cycle. Because of that, plus the deep pipelines, up to 216 instructions can be in various stages of completion at any given time. However, on average, a program branches every 4 instructions - this means that the processor would have to correctly predict 54 branches to keep the pipeline full, AND that the instructions would be (mostly) independent of each other. This isn't easy to do. So, what many processors do is split the available resources. One might issue 2 instructions from one thread and 2 instructions from another in each clock cycle, or alternate clock cycles issuing 4 from one, then 4 from the other. This shares most of the CPU's resources, while requiring a fairly minimal amount of extra logic to track the second thread.
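To put rough numbers on the "54 branches" point, here is a small sketch. It takes the 216 in-flight instructions and the one-branch-per-4-instructions figure from the comment; the predictor accuracies are made-up illustrative values, and treating predictions as independent is a simplifying assumption, not how real predictors behave.

```python
# Figures quoted in the comment: 216 instructions in flight,
# roughly one branch every 4 instructions.
IN_FLIGHT = 216
INSTRUCTIONS_PER_BRANCH = 4


def branches_in_flight(in_flight: int = IN_FLIGHT,
                       per_branch: int = INSTRUCTIONS_PER_BRANCH) -> int:
    """Number of unresolved branches when the pipeline is completely full."""
    return in_flight // per_branch


def chance_all_predicted(accuracy: float, branches: int) -> float:
    """Probability that every in-flight branch was predicted correctly,
    assuming (simplistically) that each prediction is independent."""
    return accuracy ** branches


if __name__ == "__main__":
    n = branches_in_flight()
    # Hypothetical per-branch predictor accuracies, for illustration only.
    for accuracy in (0.90, 0.95, 0.99):
        odds = chance_all_predicted(accuracy, n)
        print(f"{accuracy:.0%} per branch -> {odds:.1%} chance all {n} are right")
```

Even a predictor that is right 95% of the time would, under this toy model, get all 54 right only a small fraction of the time, which is the motivation for filling the idle slots with a second thread.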
So, cores are extra hardware that can perform more calculations. SMT is taking better advantage of what is already present.
However, the disadvantage of SMT is that it can slow a single-threaded program down, because now it has to share resources. Some processors actually do away with superscalar (aka issuing multiple instructions at once) and out-of-order execution and bypass logic, and instead rotate through many threads. For example, if the pipeline is 8 stages deep and supports 8 threads in hardware, it can issue one instruction from each thread in each clock cycle. Then it never needs to check if an instruction is dependent on an earlier one, because an instruction is completed before the next one from the same program is issued. Having this many threads can also minimize the cost of a branch misprediction, cache miss, or other long-latency events. Also, removing the bypass, dependency checking, multiple-issue, and instruction reordering logic can give a significant reduction in power and area on the chip. The performance hit by these eliminations can then be made up by adding many more cores than you'd see in a superscalar out-of-order processor like a Pentium or Core architecture. The catch is that a processor like this is only faster if you have enough threads to keep the processor busy. However, if your problem is big enough to need a supercomputer, you're darn well going to spend time writing it to take advantage of as many threads as you can.
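The round-robin scheme described above (8 pipeline stages, 8 hardware threads) can be checked with a toy model. This is only a sketch: it assumes an instruction issued in cycle c occupies the pipeline for cycles c through c + depth - 1, and the function names are invented for illustration.

```python
def issue_schedule(threads: int, cycles: int) -> list:
    """Round-robin issue: cycle c issues one instruction from thread c % threads."""
    return [cycle % threads for cycle in range(cycles)]


def same_thread_overlaps(depth: int, threads: int, cycles: int) -> int:
    """Count issues that happen while the previous instruction from the
    same thread is still somewhere in the pipeline."""
    last_issue = {}  # thread id -> cycle of its most recent issue
    overlaps = 0
    for cycle, thread in enumerate(issue_schedule(threads, cycles)):
        previous = last_issue.get(thread)
        if previous is not None and cycle - previous < depth:
            overlaps += 1
        last_issue[thread] = cycle
    return overlaps


if __name__ == "__main__":
    # 8 stages, 8 threads: each thread's previous instruction has always retired.
    print(same_thread_overlaps(depth=8, threads=8, cycles=64))
    # 8 stages, only 4 threads: instructions from one thread overlap in the pipe.
    print(same_thread_overlaps(depth=8, threads=4, cycles=64))
```

When the thread count is at least the pipeline depth, the model never sees two instructions from one thread in flight together, so no dependency-checking or bypass logic would be needed in this simplified picture.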
Parent
Memory Bandwidth (Score:5, Interesting)
It'll have 620 TB of memory and support 5 PB/s
Is that kind of memory bandwidth possible? You could access the entire 620TB in ~120 milliseconds. I guess nothing is ever too fast; it just seems unrealistically fast.
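The ~120 ms figure is easy to check. A quick sketch, assuming decimal units (1 TB = 10^12 bytes, 1 PB = 10^15 bytes) and treating the quoted 5 PB/s as an aggregate number for the whole machine:

```python
TB = 10 ** 12  # decimal terabyte (assumption about the units in the summary)
PB = 10 ** 15  # decimal petabyte


def full_sweep_seconds(memory_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Time to touch every byte of memory once at the given aggregate bandwidth."""
    return memory_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    seconds = full_sweep_seconds(620 * TB, 5 * PB)
    print(f"{seconds * 1000:.0f} ms to sweep all of memory")
```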
Re: (Score:3, Interesting)
I believe the memory is aggregate and so is the bandwidth, so per-core memory bandwidth is only (5 PB/s) / 300K cores, or roughly 17 GB/s.
The real question is how memory allocation is done per core: does each core have unrestricted access to the full 620 TB, or is it a cluster with each machine having unrestricted access to a subset and a software interface to move data to other nodes?
If anyone here has insight on this, please fill in the giant blank.
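For scale, here is the per-core split worked out. It assumes decimal units and an even division of memory and bandwidth across all 300,000 cores, which the story summary does not actually confirm.

```python
CORES = 300_000
TOTAL_MEMORY_BYTES = 620 * 10 ** 12      # 620 TB, decimal units assumed
TOTAL_BANDWIDTH_BYTES_S = 5 * 10 ** 15   # 5 PB/s, decimal units assumed


def per_core(total: float, cores: int = CORES) -> float:
    """Even split of an aggregate figure across all cores (an assumption)."""
    return total / cores


if __name__ == "__main__":
    print(f"{per_core(TOTAL_BANDWIDTH_BYTES_S) / 1e9:.1f} GB/s of bandwidth per core")
    print(f"{per_core(TOTAL_MEMORY_BYTES) / 1e9:.2f} GB of memory per core")
```

Seen per core, both numbers land in the range of an ordinary server, which is why the aggregate figures are less outlandish than they first look.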
Re: (Score:3, Interesting)
620 TB of memory? (Score:5, Funny)
Surely no one would ever need more...
UofI machine is a bit low on memory (Score:5, Funny)
That University of Illinois machine sounds like it needs more memory.
Only 620TB? Why not bump it up to 640? That should be enough for anybody.
Re: (Score:3, Funny)
Re:I have a serious question: (Score:5, Insightful)
Parent
Re:I have a serious question: (Score:5, Informative)
If your process looks like this:
int main()
{
    while (something)
    {
        doSomething();
    }
}
It will hit 100% on one core and that's it. It's not multithreaded - one CPU will churn on it forever and the others will sit around waiting for a task from the OS. With 2 cores or 200,000 cores, the results will be the same. These machines are made for tasks that are broken up into lots of smaller jobs and processed individually. It's not magic - more cores won't get a single-threaded process done faster.
Seriously.
Parent
Re:I have a serious question: (Score:5, Interesting)
And the reason that it kind of oscillates between cores is because "Set Affinity" tells the process that it's allowed to use that core, not that it has to or even should. If you want something to use both cores, open up two processes, set the first to core 1 and the second to core 2. Most of the time that's unusable like that, but I recently transcoded my entire music library and set one process to do songs from A-M, and the other from N-Z. It really helped.
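The A-M / N-Z trick amounts to partitioning the work list by hand. A minimal sketch of that split follows; the function name and the sample titles are invented, and actually launching one transcoder process per half is left out.

```python
def split_by_initial(titles, boundary="M"):
    """Split titles into two work lists by first letter (case-insensitive).

    Titles starting at or before `boundary` go in the first list, the rest
    in the second. Titles starting with digits or punctuation sort before
    letters, so they land in the first list.
    """
    first, second = [], []
    for title in titles:
        initial = title.lstrip()[:1].upper()
        (first if initial <= boundary else second).append(title)
    return first, second


if __name__ == "__main__":
    library = ["Abbey Road", "Nevermind", "moon safari", "Zooropa"]
    a_to_m, n_to_z = split_by_initial(library)
    print(a_to_m)  # work list for the first process
    print(n_to_z)  # work list for the second process
```

An alphabetical split is rarely even, so one process may finish well before the other; splitting by total file size would balance better.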
Parent
Re: (Score:3, Insightful)
It's not that simple. While one single task generally is not coded to take advantage of the entire system (single threaded on a dual system, dual thread on a quad system, whatever), you are able to actually use your computer while said task is underway. Ever encoded a DVD on a single core machine? Not so fun - half the time, you can't even use your mouse. Slap the same task on a dual-core box, and suddenly you can continue to work (or play) while that goes on in the background. Alternately, you can enc
Re: (Score:3, Informative)
Re:I have a serious question: (Score:5, Informative)
The funny thing is that it teeter-totters back and forth from one core to the other. I wish I knew what made it do that.
The OS runs the process a few milliseconds at a time, then kicks the process off the CPU for another process to run (if there is one, including OS tasks such as I/O routines). When the OS starts up the process again for a few more milliseconds, it may start it up on a different core. That is why both cores will show 50% average utilization.
Now if you set CPU affinity for that process to be on one core, then it will max that core out at 100% and the other core will be idle. This may result in better performance, because you get better cache utilization if the process stays on the same core.
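On Linux the same pinning can be done from inside a program. The sketch below uses Python's os.sched_getaffinity and os.sched_setaffinity, which exist only on some platforms (Linux, not Windows or macOS), so it checks for them first. The choice of the lowest-numbered allowed core is arbitrary.

```python
import os


def pin_to_one_core():
    """Pin the calling process to a single core it is already allowed to use.

    Returns (before, after) affinity sets, or None on platforms where
    os.sched_setaffinity is unavailable.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    before = os.sched_getaffinity(0)   # 0 means "the calling process"
    target = min(before)               # arbitrary pick: lowest allowed core
    os.sched_setaffinity(0, {target})
    after = os.sched_getaffinity(0)
    return before, after


if __name__ == "__main__":
    result = pin_to_one_core()
    if result is None:
        print("CPU affinity calls are not available on this platform")
    else:
        before, after = result
        print(f"allowed cores before: {sorted(before)}, after: {sorted(after)}")
        os.sched_setaffinity(0, before)  # restore the original mask
```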
On a related topic, this can also be the case if the app is multithreaded -- sometimes it is more efficient to run multiple threads on the same CPU instead of across CPUs, if each thread is accessing the same region of memory. Otherwise, if the threads are on different CPUs or cores, then the threads are constantly invalidating the cache on the other core, causing more (expensive) reads/writes to main memory.
Parent
Re:I have a serious question: (Score:5, Informative)
Aren't a lot of games and apps single-threaded? Hmmm. I figured that dual/quad-core wasn't all it's cracked up to be. So, essentially, if I have a single-threaded app on a quad core, it'll perform at 1/4th the potential speed.
Yes, although, most high end games and game engines actually are multi-threaded. Few are designed to take advantage of more than 2 cores though, and none that I know of will use 8 or 300,000...
So, essentially, if I have a single-threaded app on a quad core, it'll perform at 1/4th the potential speed.
Not necessarily. If you have 3 women can you make a baby in 3 months instead of 9? Given that it still takes 9 months and 2 of the women are idle, would you say that these women are performing at 1/3rd the potential speed? Same sort of logic applies here. If the task is inherently sequential, having more cores (or ladies) won't make it any faster.
Some things -are- highly parallelizable, like ray-tracing or cutting down all the trees in a forest... and other things are partly parallelizable, like changing tires (a pit crew can change 4 tires at once, but adding more staff to allow you to change 5 tires at once doesn't make your team any faster).
That doesn't leave me with a warm and fuzzy feeling inside.
Yes, in general computing applications, an 8 GHz CPU would be faster than a quad-core 2 GHz. (And even under optimal parallelizable situations the 2 GHz quad-core would just barely surpass the 8 GHz CPU due to lower task-switching overhead.) So the faster single CPU is almost always better. The reason we have quad-core 2 GHz CPUs is that they are much, much more practical to actually make, and a lot of the stuff that takes a long time (rendering 3D, encoding movies, etc.) is actually highly parallelizable, so we do see a benefit. And much of the single-threaded sequential stuff we see is waiting on hard drive performance, network bandwidth, or user input... so CPU isn't the bottleneck there anyway.
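The comparison between one fast core and several slow ones is usually framed with Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the core count. A rough sketch follows; it treats clock speed as a direct proxy for per-core performance and ignores caches, memory and task-switching overhead, so it is a back-of-the-envelope model only.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when `parallel_fraction` of the
    work can be spread perfectly across `cores` cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_throughput(clock_ghz: float, cores: int, parallel_fraction: float) -> float:
    """Clock speed scaled by the Amdahl speedup (simplified: clock speed
    stands in for per-core performance)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 1.0):
        single = relative_throughput(8.0, 1, p)
        quad = relative_throughput(2.0, 4, p)
        print(f"parallel fraction {p:.0%}: 1x8 GHz = {single:.2f}, 4x2 GHz = {quad:.2f}")
```

In this model the quad-core only ties the single fast core when the work is 100% parallel; anything less and the single core wins, which matches the comment's "almost always better".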
The funny thing is that it teeter-totters back and forth from one core to the other. I wish I knew what made it do that.
If you look at Task Manager, there are what, some 40+ processes running? The OS rotates them on and off of the 2 cores based on what they all need in terms of CPU time. So your 'CPU-heavy task' gets pulled off a core to give another task a timeslice, and then once it's off, it can be scheduled back onto either core. Ideally it should stay on one core to maximize level-one cache hits, etc., but if it's been off the core long enough for the other processes to cache all new memory, it doesn't really matter which one it gets assigned to, and in any case flipping from one to the other every now and then makes an almost immeasurably small performance difference.
btw - the 'set processor affinity' feature tells the OS that you really want this process to run on a given CPU/core, instead of hopping around. But in most cases it's not something one needs to do (or gains any benefit from doing).
Parent
Re:I have a serious question: (Score:4, Interesting)
"Aren't a lot of games and apps single-threaded?"
And that's one more thing we can thank Microsoft for.
Had DOS and the PC clones, crippled by the mono-processor, mono-threaded DOS/Windows stack, not become the dominant architecture for most of the '90s, we would have rock-solid, secure, multi-processor, 64-bit RISC boxes running some flavor of Unix on our desktops by now.
Thanks Bill.
Parent
Re: (Score:3, Interesting)
Single threading, like that used on old versions of Windows, does have some advantages. It avoids a lot of concurrency-related problems that most programmers are not properly trained to deal with. If everyone follows the rules, it's efficient and performs well.
I was recently reading a paper on multi-core processors and the future of programming. It pointed out that many multi-threaded programs work fine until they are run on a real multi-processor system where multiple threads can actually run simultaneously. At
Re:I have a serious question: (Score:5, Insightful)
A couple things:
x86 chips today are 99% RISC-like. (The term RISC is rarely used today, since basically no modern CPUs are "pure RISC" in design: reduced as in not even having a multiply instruction, like older SPARCs.) Sure, the exposed architecture is ugly x86, but that's the compiler's job to worry about, not yours. It doesn't really affect the chip performance. Don't forget x86 chips are still the fastest out there, despite the weird interface.
Also, for Joe Sixpack, 64-bit is pointless - especially when the 32-bit version works on the same OS! If you recompile a program as 64-bit (and often that is all there is to it; a recompile), you'll notice that the binaries are larger. In fact, most pointers (memory addresses) now take up twice the space, so your program also uses more memory. The benefit? Unless your app uses more than around 3GB of RAM, basically zero. (On x86 there is sometimes a slight performance benefit, not because 64 bits is "faster" or anything, but because AMD added some more registers to the x86-64 spec.)
Anyhow, I generally view 64 bits as a waste of address space, UNLESS you're accessing large amounts of memory (>3GB per program!). This will be more of a concern in the next few years, but there isn't any rush. I use 64-bit Vista for development (because I have 4GB of RAM) but otherwise probably wouldn't care. Even Visual Studio (the dev platform for 64-bit code) is mostly a 32-bit app, nor should they change it.
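The "pointers take up twice the space" point is easy to see from a running program. A small sketch: struct.calcsize("P") reports the size of a native pointer for whatever build of Python is running, and the million-entry table is a made-up example.

```python
import struct


def pointer_bytes() -> int:
    """Size in bytes of a native pointer (void *) for this Python build."""
    return struct.calcsize("P")


def pointer_table_bytes(count: int, pointer_size: int) -> int:
    """Memory used by `count` raw pointers of the given size."""
    return count * pointer_size


if __name__ == "__main__":
    print(f"this interpreter uses {pointer_bytes()}-byte pointers")
    entries = 1_000_000  # hypothetical pointer-heavy data structure
    for size in (4, 8):
        megabytes = pointer_table_bytes(entries, size) / 1e6
        print(f"{entries} pointers at {size} bytes each: {megabytes:.0f} MB")
```

The doubling applies only to pointers (and pointer-sized integers); plain ints, floats and byte buffers stay the same size, so real programs grow by less than 2x.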
Parent
Re:I have a serious question: (Score:4, Interesting)
The benefit? Unless your app uses more than around 3GB of RAM, basically zero
Plenty of things quite enjoy being able to perform operations in 64bits at a time, actually. Especially when it comes to media, crypto, compression, and indeed games; on top of having 2-3x the usable number of general purpose registers, which certainly isn't something to sneer at given how awful x86 has traditionally been in this area. Plenty of things you're likely to actually care about the performance of are likely to get a nice boost.
64-bits as a waste of address space, UNLESS you're accessing large amounts of memory (>3GB per program!)
Well, you generally only get 3GB when you've performed tricks to ask the OS to allow that; e.g. /3GB boot flag, fiddling with MAXDSIZ, or recompiling with a different user/kernel space split.
On top of that, it's not all about RAM, it's about address space; if you've only got 32 bits to play with, you need to be very careful about allocating it, since any wastage can lead to exhausting your virtual address space before your physical space; like with filesystems, fragmentation becomes more of a concern the closer you get to your maximum capacity.
Large virtual spaces are also useful when it comes to doing things like mmapping large files; for instance, a database might like to mmap data files to avoid unnecessary copying you get with read()/write(), but mmapping a 1GB file means you need 1GB of address space, even if you don't touch any of it. When it's common to access disk using memory addresses, 3GB starts looking small very fast.
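To make the mmap point concrete, here is a minimal sketch using Python's mmap module on a throwaway temp file. The helper names are invented for illustration. Mapping with length 0 maps the whole file, so the full file size is reserved in the process's address space even though only a few bytes are ever read.

```python
import mmap
import os
import tempfile


def make_sample_file(size: int) -> str:
    """Create a temp file of `size` bytes holding the pattern 0, 1, ..., 255, 0, ..."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(bytes(i % 256 for i in range(size)))
    return path


def read_slice_via_mmap(path: str, offset: int, length: int) -> bytes:
    """Map the whole file read-only and copy out just one slice of it."""
    with open(path, "rb") as handle:
        # Length 0 maps the entire file into the address space.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return view[offset:offset + length]


if __name__ == "__main__":
    path = make_sample_file(4096)
    try:
        print(read_slice_via_mmap(path, 256, 4))
    finally:
        os.remove(path)
```

With a 1GB data file the same call would claim 1GB of virtual addresses up front, which is the pressure on a 32-bit address space that the comment describes.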
You also very quickly eat into it using modern graphics cards; 512MB is common, having two isn't that uncommon, and things are moving towards 1GB; bang goes your 3GB, all that frame buffer needs addressing too, on top of the kernel's other needs.
Really, 32bit needs to die screaming, sooner rather than later.
Parent
Re: (Score:3, Interesting)
iometer [iometer.org]
Properly configured it can stress all the cores on all the nodes in your cluster.
Oh you wanted to do something useful...
Intel released it as open source in 2001. Edit the source for the dynamo so that it does something useful. Compile and install. Done.
Or you could load Vista and play a light game. That ought to peg both cores.
Actually, dual core is what it's cracked up to be. While your single threaded application is grinding away you can still interact with your computer instead of stari
Re:I have a serious question: (Score:5, Informative)
No, each core is running at 4 GHz. That does not total up to 16 GHz of processing power, though, because only multithreaded programs can take advantage of more than one core at once, and they still have to wait if they're sharing data.
Parent
Re: (Score:3, Interesting)
Chances are IBM will still have a problem supplying them, plus new game consoles will get a priority in shipping in 2010, when that XBox 720 or Playstation 4 comes out.
It is also possible that the eight core chip will be really expensive, and in order to keep up with it a PowerMac would cost $4000 or more just to eliminate bottlenecks and use optical technology like super computers use to be able to use the chip properly. Not to say that nothing stops Apple from bringing out PowerPC based Macs in 2010 as Ma
Re:Steve Jobs is crying in his pillow tonight. (Score:4, Insightful)
Look at the heatsink in a PS3 and you have your answer.
Parent
Re:Apple Got Dumped By IBM. (Score:4, Insightful)
Actually, it was the other way around.
Intel chips outperform the PowerPC CPUs without a doubt. PowerPC CPUs were horrible. The first MacBook Pros with Intel chips were 2-3 times faster than the ones before with PowerPC chips. If anything, it was a good move for Apple to start using Intel. I'm not a huge Mac fan. I own one Apple product, a Nano with RockBox currently on it. However, I do hate when people don't do their fact checking and simply want to troll about a company they hate without justification.
Parent
Re:Apple Got Dumped By IBM. (Score:5, Insightful)
I remember something from that time that suggested it was simply a supply issue - AMD weren't big enough to guarantee supply. I remember looking at the figures and being surprised (about the capacity of AMD).
I also remember Jobs saying Intel had shown him _very_ exciting things, hint hint. And they were too.
Parent
Re: (Score:3, Funny)
Re: (Score:3, Insightful)
History is absolutely full of people who don't follow the mainstream theory or have financial backing and end up creating the next mainstream theory which receives all of the financial backing.
History is also full of people such as yourself, AC, w