IBM Technology

Cell Architecture Explained

IdiotOnMyLeft writes "OSNews features an article written by Nicholas Blachford about the new processor developed by IBM and Sony for their PlayStation 3 console. The article goes deep inside the Cell architecture and describes why it is a revolutionary step forward in technology and, so far, the most serious threat to x86. '5 dual-core Opterons directly connected via HyperTransport should be able to achieve a similar level of performance in stream processing - as a single Cell. The PlayStation 3 is expected to have 4 Cells.'"
  • Seeing is believing (Score:3, Informative)

    by Anonymous Coward on Friday January 21, 2005 @04:40AM (#11429627)

    It's not like we haven't heard it before. It usually turns out to be halfish-truish for some restricted subset of operations in a theoretical setting, you know, the kind where you discount buses, memory and latencies.

  • by FRAGaLOT ( 239725 ) on Friday January 21, 2005 @04:53AM (#11429681) Homepage
    Actually, I find the opposite to be true. Take for example an Xbox, which is basically a PC from a few years ago (sub-gigahertz P3, 64 MB RAM, GeForce3 video).

    But it plays all the popular PC games of today with little to no lag, whereas you need a very high-end PC to play the same games!

    This is mostly because the path to the video hardware is more direct than it is on a PC. There's no AGP bus or any other bottleneck when accessing video RAM, which is probably why an Xbox can perform as well as a current PC rig.

    But then an Xbox is only running at 800x600. LOL
  • by xylix ( 447915 ) on Friday January 21, 2005 @04:55AM (#11429687)
    Unless the thing has an x86 emulation layer, it's dead in the water in regards to the PC market. Even Apple, with their much vaunted G-chips, have to emulate the x86 hardware so that users can run their Windows programs.

    Huh?? What are you smoking? Since when does Apple emulate x86 hardware? Perhaps you are confused by the fact that you can buy Microsoft Office for the Macintosh, or run Internet Explorer. Here's a news flash - they aren't emulating the x86, they are native Mac code.

    Most likely you are confusing one of two different things. On OS X, programs can be either Cocoa (the new way) or Carbon (the old way) apps, but that isn't really emulation. What is emulation is that Mac OS X can run OS 9 software through emulation - nothing to do with x86 here.

    Or you can buy Virtual PC, which is now owned by Microsoft. It will allow you to emulate x86 to run Windows (or Linux) and associated software. But note - this isn't Apple, and it isn't something that Mac users "have to do".

    Lastly, why on earth would it HAVE to be able to run Windows programs in order for a PlayStation processor to be successful? Last time I checked, the PlayStation was still wildly successful - more so than the MS Xbox, I think.

  • by drgonzo59 ( 747139 ) on Friday January 21, 2005 @05:16AM (#11429773)
    The idea of having many processing units in a personal workstation is not new. People thought that Moore's law was going to fail years ago and predicted that by now we would all have massively parallel machines at home on our desks. Well, it turned out that Moore's law didn't fail and, more importantly, that many software algorithms are not easily parallelizable. So what if I can have 100 Cells at home in my workstation? I could run SETI, weather or some other kind of simulation, but I couldn't really play my video games much faster or have a more responsive user interface if I ever install Longhorn. I just can't think of too many programs run on home users' machines that would benefit from a parallel architecture.

    Now if they can be made very fast and only a few (2-8) coupled together... well, as was said, that is what a nice Opteron machine does anyway nowadays.

  • Compiler technology (Score:5, Informative)

    by sifi ( 170630 ) on Friday January 21, 2005 @05:17AM (#11429777)
    One question which was not addressed fully in the article is how you compile and test programs for this thing.

    The potential of parallel architectures has never been in doubt since the early days of the Cray monsters - but how to compile code to use all the features efficiently has.

    I don't believe we'll see the full advantage of these types of architecture exploited without some similar breakthrough in software tools.

    Mind you, the hardware rocks...
  • by CoolGuySteve ( 264277 ) on Friday January 21, 2005 @06:00AM (#11429895)
    When it was released, the Emotion Engine in the PS2 actually would have been pretty wicked for supercomputing applications if Sony had sold a version with faster interconnects and more RAM. The processors in the PS2 are designed almost entirely to crunch vector operations, which is what most scientific codes rely on. It's really an excellent computer, it just sucks at graphics. The 4MB of uncompressed video memory and lack of hardware texture support are particularly ugly.

    I suspect that the main reason there was never an Emotion Engine based cluster product was because the high performance market is tiny, especially compared to the console market, and Sony was already having trouble meeting demand with their exotic chipset when it first came out.

    Anyway, I think the guy does go overboard about this new architecture. It probably will be a lot faster than PCs at certain tasks, but you can only fit so many transistors on a chip. The Cell stuff is cool, though; it seems to fit a lot better with what most computers spend their time processing, unless you're doing a lot of compiling or database operations.
  • by LarsWestergren ( 9033 ) on Friday January 21, 2005 @07:12AM (#11430119) Homepage Journal
    Take for example an Xbox, which is basically a PC from a few years ago (sub-gigahertz P3, 64 MB RAM, GeForce3 video). But it plays all the popular PC games of today with little to no lag, whereas you need a very high-end PC to play the same games! This is mostly because the path to the video hardware is more direct than it is on a PC. There's no AGP bus or any other bottleneck when accessing video RAM, which is probably why an Xbox can perform as well as a current PC rig.


    Well no, not exactly. The reason console games don't suffer from lag is that, unlike on a PC, the hardware specs are not a moving target during development. Developers can optimize textures, audio, algorithms etc. with a specific platform in mind. This makes it much easier to create content that you know won't overwhelm the machine.

    Compare this with a PC developer. They have to estimate the time it takes to develop the game. Then they have to estimate the average gamer's hardware and the cutting-edge gaming hardware at the time the game is released. They have to take into consideration at the very least processor speeds, main memory size and speed, and graphics card speed and memory size.

    If the developers overestimate, the game will be unplayable (when the first System Shock came out, for instance, I remember reviewers writing "You actually need a PENTIUM to play this game, it's insane!"). If they underestimate the hardware or take too long, they will be killed by reviews complaining about "outdated graphics". Oh, and preferably there shouldn't be any problems with any special configuration.

    This is extremely difficult to achieve. Half-Life 2, for instance, was praised for the fact that it managed to scale its graphics so it was playable on low-end yet good-looking on high-end machines. However, some people experienced audio stuttering as levels started; this was even more noticeable in Vampire: Bloodlines, a great game that uses the HL2 engine. I think this had something to do with the hard drive loading textures or level geometry (I noticed it especially when loading the huge LA Downtown level in Vampire - the sound stuttered for 10 seconds after the level loaded). People with fast hard drives, especially those who chose or had to choose low-resolution textures, didn't suffer from it as much as those with graphics cards with a lot of memory for high-resolution textures and comparatively slow hard drives.

    So, the interaction between the many different hardware configurations on PCs makes it difficult to optimize, and that is what causes the lag, not the lack of an AGP bus or anything. A console developer can test on just one console and be fairly certain it will run the same on all target machines.
  • by jovetoo ( 629494 ) on Friday January 21, 2005 @07:22AM (#11430170) Journal
    First of all, as another post says, GPUs contain a video controller, a DAC and so on. Second, the Cell will still be able to accelerate graphics performance by doing all kinds of vector pre-processing. Last, it will be a lot easier for software companies to build PS3 games fast if they have somewhat the same computing/graphics environment as on x86. Reasons enough, I think.

    But what struck me most is that you seem to have missed the whole point the author seeks to make. Yes, Moore's law will double the performance of the GPU within 18 months. So? It still does not give them the raw processing power of those Cells. Nor the scalability. (Damn! These things will be in your TV, your DVD player, your stereo, and they all cluster...) If these Cells really become low-cost chips, I seriously doubt x86 will survive.

  • No longer true (Score:2, Informative)

    by Moraelin ( 679338 ) on Friday January 21, 2005 @08:11AM (#11430357) Journal
    A long time ago, in a galaxy far away, CPU transistor budgets were measured in tens of thousands. You had _barely_ enough of them for either more registers _or_ a more complex decoder, but never both.

    This begat RISC. A CISC computer had a more complex instruction set, but that barely left it with enough transistors for a couple of general-purpose registers. A RISC computer, on the other hand, went by the mantra "never do in hardware what the compiler can do for you", so it had an over-simplified instruction set, but then it had enough transistors left for more registers.

    In a sense each of the two was too expensive for _someone_. For CISC, registers were too expensive. For RISC the decoder was too expensive. In truth, both were expensive, and the grand unified theory ;) is simply that you just couldn't have _both_.

    Fast-forward a bit, and registers are _not_ expensive for CISC any more. You mention "what will prevent Intel/AMD from adding a technology which could use multiple sets of registers", and the answer is: they already do. Both have large internal register files they use for renaming. (E.g., when you swap EAX and EBX, the data isn't really copied; the physical register that is currently EAX and the one that is currently EBX just get renamed to EBX and EAX respectively.)

    Either way, they already have the Cell's 128 registers, and some even have 256 registers. You just don't see them from the outside. (Which is a pity, since compilers could really use them.)

    Is there something to stop them from exposing more of those registers to the outside world? Nope. The AMD Athlon 64's "64-bit extensions" (now also adopted by Intel) already do just that: they double the number of general-purpose registers visible to the program. That's largely what gives an Athlon 64 its speed boost when running 64-bit code: the extra registers.

    Is there something to keep them from doubling or quadrupling them again? From a technical point of view, nothing whatsoever.

    What's been keeping them so far is software backward compatibility. A Pentium 4 still has to run code written for a 486, so whatever changes they make to the instruction set must leave the old pre-existing instructions unchanged. And there simply aren't enough bits left in the instruction encoding to address new registers without changing the whole instruction set.

    The migration to 64 bits was a rare good excuse to come up with a new instruction encoding with more general-purpose registers. But such excuses are few and far between.

    As for RISC... it died; it lost the battle. Yes, Apple and IBM still use it as a marketing buzzword, but that's it. There are _no_ RISC CPUs still being produced.

    The G5 in Macs is simply a CISC with more registers and a better instruction set, but it's CISC nevertheless. Its internal structure is _not_ RISC, and AltiVec is _not_ RISC. They're in fact contrary to everything RISC stood for.

    Ditto for Sun's UltraSparc.

    (And everyone arguing that a G5 is RISC, has obviously never programmed a RISC CPU before. You had to take care of every single detail in software, because of the mantra "never do in hardware what the compiler can do for you." Even recovering from a pipeline overflow when an interrupt came, you had to do that in software.)

    Hope this helps.
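    To make the extra-registers point concrete, here is a minimal C sketch (an illustrative example assuming a GCC-style compiler, not anything from the article or the posts above): 32-bit x86 exposes only 8 general-purpose registers while x86-64 exposes 16, so a loop with many simultaneously live values tends to spill to the stack when built for 32-bit but can usually keep them all in registers when built for 64-bit.

        /* A loop with more live values than a 32-bit x86 register set can hold.
         * Built with something like "gcc -O2 -m32" the compiler will typically
         * spill several of a..h to the stack each iteration; built with
         * "gcc -O2 -m64" (16 general-purpose registers) they can usually all
         * stay in registers, which is much of the 64-bit speed boost described
         * above. */
        long mix(const long *p, long n)
        {
            long a = 0, b = 1, c = 2, d = 3, e = 4, f = 5, g = 6, h = 7;
            long i;

            for (i = 0; i < n; i++) {
                a += p[i]; b ^= a;
                c += b;    d ^= c;
                e += d;    f ^= e;
                g += f;    h ^= g;
            }
            return a + b + c + d + e + f + g + h;
        }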
  • by Anonymous Coward on Friday January 21, 2005 @08:29AM (#11430428)
    If Cell is so almighty, why does Sony use an NVidia GPU instead of more Cells for graphics processing?

    That's not what he's saying, if you take the quote in context. He's saying that a PC using the GPU for general-purpose computation won't beat the Cell at the same computations.

    What he's specifically not saying is that the Cell is better than a GPU for doing graphics. The GPU is a special-purpose chip for graphics, so of course it's going to outperform something more generic, on cost if on nothing else.

  • by Anonymous Coward on Friday January 21, 2005 @09:27AM (#11430771)
    The keyword here is:

    "...should be able to achieve a similar level of performance in stream processing - as a single Cell. "

    "in stream processing" .. wtf? Well Duh. That's already the case versus your graphics card. Imagine trying to run Dooom II without a graphics card in software rendering mode and used an opteron to do the work of a graphics card.. so, even with two opterons in a box ..what sort of performance degradation will you expect in Doom III? (more than 5x I'd bet).

    Yes Sony, Opterons make shitty graphics card. We already knew that.
  • by Anonymous Coward on Friday January 21, 2005 @09:29AM (#11430796)
    Microsoft is still dicking around with porting Windows to AMD64... a platform mostly compatible with x86. (Don't give me crap about NT running on Alpha. It ran in 32-bit mode, and there was an early beta of W2K that ran 64-bit native, but the Win32 API and everything else you use on your computer is and always has been x86-only.)

    There are two operating systems Microsoft have developed called Windows. DOS/Windows, the original one, was based on an x86 clone of CP/M that Microsoft bought. The first version, "Windows 1.0", was released in 1985. The last version, called "Windows Me", was released in 2000, IIRC. This OS was always x86-only, originally ran on archaic CPUs without memory protection and never supported full protected memory, symmetric multiprocessing or other (now) basic OS features.

    The second OS developed by Microsoft that's marketed as Windows is Windows NT (now just called "Windows"). It was started in 1988, and never had any relation to DOS/Windows, except insofar as it can (to some extent) emulate it for compatibility reasons (including an x86 emulator on hardware that can't natively execute x86 code). Windows NT was developed on the MIPS platform, not the x86. The original plan had been to use the Intel i860 (an LIW architecture completely different from the x86) as the development platform, but the i860 hardware never met its promise, so MIPS was chosen instead.

    The first version of Windows NT was released in 1993, and called "Windows NT 3.1" (3.1 was used for marketing reasons, since that was the latest version of DOS/Windows at the time). Like UNIX, it was mostly written in C, with assembly at the low level to handle hardware dependencies. At its release, Windows NT 3.1 ran on 32-bit MIPS (the development platform) and 32-bit x86 (the first port).

    The second version of Windows NT (3.5) was released in 1994, and planned to add 64-bit Alpha (in a semi-crippled, 32-bit mode) and 32-bit PowerPC. However, IBM and Motorola ran into problems with the hardware (in part because of ongoing disagreements with Apple, who wanted to use their own, proprietary platform), so Windows NT 3.5 only added Alpha support. In 1995, after IBM and Motorola had managed to (mostly) sort out their problems (but with Apple declining to follow the IBM/Motorola PReP standard), the PowerPC port of Windows NT was completed, and released as version 3.51. At this point, the OS ran on MIPS, x86, Alpha and PowerPC.

    In 1996, the user interface of Windows NT was upgraded to match the user interface of the popular 4.0 release of DOS/Windows (called Windows 95). Windows NT 4.0, which copied the user interface of DOS/Windows 4.0, ran on MIPS, x86, Alpha and PowerPC.

    By the late 1990s, as Microsoft continued work on version 5.0 of Windows NT, the market had lost confidence in non-x86 systems for general-purpose PCs (apart from Apple Macs, which didn't follow the PReP standard, so couldn't run OSes ported to it, like AIX and Windows NT). As a result, Microsoft and the vendors of MIPS and PowerPC workstations agreed to cease development and marketing of NT 5.0 for those platforms. Windows NT 5.0 continued to be developed for the x86 and DEC Alpha architectures, into the beta releases.

    DEC (which was taken over by Compaq) had continued to hold out hope for the Alpha as a general-purpose alternative to the x86, but financial difficulties led to the port being abandoned towards the end of the development cycle for Windows NT 5.0 (marketed as "Windows 2000"). As a result, Windows NT 5.0, completed at the end of 1999, was the first version of NT that only ran on one platform (the x86).

    A port of Windows NT 5.0 to the 64-bit Intel Itanium, including 64-bit versions of the Windows APIs (unlike the earlier Alpha port), was released in 2001, but only to select customers.

    Windows NT 5.1 (marketed as "Windows XP") was also released in 2001, and again only ran on the x86, apart from another 64-bit limited release for Itanium (in 2002, IIRC).

    Windows NT 5.2 (marketed as "Windows Se

  • by S3D ( 745318 ) on Friday January 21, 2005 @10:16AM (#11431224)

    One question which was not addressed fully in the article is how you compile and test programs for this thing. The answer is OpenMP [openmp.org]. OpenMP is a multithreading API which can hide parallelization from the user almost completely. It's embarrassingly easy to use - only one line of code is enough to parallelize a loop. All thread creation and synchronization remains hidden from the user. It's extremely efficient too - I was never able to achieve the same level of performance when doing the multithreading myself.
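    A minimal sketch of the kind of one-line loop parallelization described above, assuming a C compiler with OpenMP support (e.g. gcc -fopenmp); the single pragma is the only parallel-specific line, and the runtime handles thread creation, work splitting and the final join:

        #include <omp.h>
        #include <stdio.h>

        #define N 1000000

        static double a[N], b[N], c[N];

        int main(void)
        {
            int i;

            for (i = 0; i < N; i++) {       /* set up some input data */
                a[i] = i;
                b[i] = 2.0 * i;
            }

            /* One line parallelizes the loop: OpenMP spawns the threads,
             * splits the iteration range between them and joins at the end. */
            #pragma omp parallel for
            for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];

            printf("c[%d] = %f\n", N - 1, c[N - 1]);
            return 0;
        }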
  • by KirkH ( 148427 ) on Friday January 21, 2005 @10:29AM (#11431352)
    Eh? MS is leaving x86 behind for the Xbox 2. They're going with some type of PowerPC based chip from IBM, rumored to be multi-core. ATI provides a custom graphics chipset that will not have a PC counterpart.

    Sony is going with Cell from IBM and an nVidia graphics chipset. So I don't see a huge difference. My guess is that both consoles will have extremely similar performance and this next generation of consoles will be the most boring ever -- lots of multi-platform games that look identical.
  • by rob_osx ( 851996 ) on Friday January 21, 2005 @10:38AM (#11431465)
    My link to the analysis of Apple's use of the Cell was wrong. http://www.tweet2.org/wordpress/index.php?p=13 [tweet2.org]
  • by fitten ( 521191 ) on Friday January 21, 2005 @11:51AM (#11432275)
    Lots of people have been working on auto-parallelizing compilers. The idea is to take existing code that isn't parallel and during compile time (or run time) make those decisions intelligently and speed up processing. So far, there have been zero successes at it without explicit user directives to tell the compilers where good targets for parallelization are and how to do it (specifically creating threads and/or marking loops that can be parallelized).

    If you (or anyone) can solve this problem well, you'd be famous and wealthy beyond the dreams of avarice (assuming you patent it and license it out :))
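    A rough C illustration of why this is so hard (an illustrative sketch, not from the post above): the first loop has independent iterations and is safe to split across threads, while the second carries a dependency from one iteration to the next; a compiler generally cannot prove on its own which kind a given loop is, or that the pointers don't alias, which is why it still needs the programmer's directives.

        #include <stddef.h>

        /* Independent iterations: each c[i] depends only on a[i] and b[i],
         * so the iterations can run in any order, or in parallel. */
        void add_arrays(double *c, const double *a, const double *b, size_t n)
        {
            size_t i;
            for (i = 0; i < n; i++)
                c[i] = a[i] + b[i];
        }

        /* Loop-carried dependency: x[i] needs x[i-1] from the previous
         * iteration, so the iterations cannot simply be handed to different
         * threads. Proving the absence of such dependencies (and of aliasing
         * between the pointers) is the part compilers still struggle with. */
        void running_sum(double *x, const double *a, size_t n)
        {
            size_t i;
            for (i = 1; i < n; i++)
                x[i] = x[i - 1] + a[i];
        }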
  • by Anonymous Coward on Friday January 21, 2005 @05:45PM (#11436247)
    quote: No cache for CPUs? A breakthrough? Hello! Both the PSone and PS2 have the so-called scratchpad, which is what the Cell seems to have: a cache which has to be managed explicitly by the programmer. Breaking news: this is a royal pain in the ass. And calculating bandwidth when reading from these tiny scratchpads makes about as much sense as calculating the speed at which an x86 processor can execute MOV EAX, EBX.

    Sheeesh... If you stopped for a moment to think about it, you'd notice that this architecture is vastly different from your average PS2 architecture. The "caches" are actually one per APU, which means each APU has its own kind of locked-down cache memory - this is basically what most modern signal-processing CPUs do. Take a look at some DSP code specifically tailored for the PPC440 CPU: it explicitly takes advantage of the ability to lock down the CPU's caches. I'm sure the main processing unit WILL have an L2 cache, though.
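    As a rough sketch of what explicitly managed local memory looks like in practice, here is the classic double-buffering pattern in C; the dma_get/dma_put/dma_wait helpers are hypothetical stand-ins (implemented with plain memcpy so the sketch is self-contained), not any real Cell or PS2 API:

        #include <string.h>

        #define CHUNK 4096

        /* Hypothetical DMA helpers, standing in for whatever asynchronous
         * transfer primitives the real hardware exposes. Here they just copy
         * synchronously so the sketch compiles and runs on its own. */
        static void dma_get(void *local, const void *remote, size_t bytes, int tag)
        { (void)tag; memcpy(local, remote, bytes); }
        static void dma_put(void *remote, const void *local, size_t bytes, int tag)
        { (void)tag; memcpy(remote, local, bytes); }
        static void dma_wait(int tag) { (void)tag; /* copies above already done */ }

        static float buf[2][CHUNK];   /* two buffers living in the local scratchpad */

        /* Walk through main-memory data in chunks, overlapping the fetch of the
         * next chunk with the computation on the current one. Software, not a
         * hardware cache, decides what is resident in local memory and when. */
        void process(float *src, size_t nchunks)
        {
            size_t i, j;
            unsigned cur = 0;

            dma_get(buf[cur], src, sizeof buf[cur], cur);       /* prefetch chunk 0 */
            for (i = 0; i < nchunks; i++) {
                unsigned next = cur ^ 1;
                if (i + 1 < nchunks) {
                    dma_wait(next);                             /* buffer no longer in flight */
                    dma_get(buf[next], src + (i + 1) * CHUNK, sizeof buf[next], next);
                }
                dma_wait(cur);                                  /* current chunk has arrived */
                for (j = 0; j < CHUNK; j++)
                    buf[cur][j] *= 2.0f;                        /* the actual work */
                dma_put(src + i * CHUNK, buf[cur], sizeof buf[cur], cur);
                cur = next;
            }
            dma_wait(cur ^ 1);                                  /* drain the final write-back */
        }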

  • by DeadScreenSky ( 666442 ) on Friday January 21, 2005 @09:59PM (#11438249)
    I think that a much bigger factor is that Warren Spector wasn't the lead designer anymore. A game like DE simply isn't something that can reasonably be accomplished by someone new to that position. The game has all the hallmarks of an inexperienced dev team (in particular, note the tonal and pacing issues).

    Anyway, hardware really didn't affect the game the way some people pretend it did. The streamlined gameplay was there because Harvey Smith and his team wanted it that way (the Xbox has certainly seen plenty of gameplay systems more complicated than Deus Ex 1's!), and the vast majority of those changes would have been made even if the game had been PC-only. I disagree with the decision as well, but the devs wanted the gameplay to be simpler and more focused.

    Most of the engine limitations were simply because they chose poor technology - the hacked-up UT2k3 engine didn't scale or perform well on the PC, either. Lots of Xbox games feature huge levels with minimal or even non-existent load times (see Riddick, Halo series, Ninja Gaiden...).

    (And I would point out that the original DE was pretty spotty when it came to tech as well. Very slow on most systems when it was released, without particularly nice graphics to compensate. I suspect they just don't have the caliber of 3D programmer required...)

    And as much as I love DE1, it really didn't leave room for a sequel. Near-future stuff works fine, but once you get close to or past the Singularity, it is almost impossible to create a realistic or interesting setting (since any human just wouldn't be close to where the real action is). Most of the truly interesting conspiracy theories were already dealt with (seriously, you already used the Templars, the Illuminati, and Majestic 12). Most of the interesting future tech was used (nanotechnology especially, though I do think Invisible War expanded on it in interesting areas). And you couldn't reasonably expand on the cyberpunk theme too much, because the world had already been pulled back from the brink in the original. (The globalization issues that Invisible War brought up were a good attempt, but that is really hard to address in an FPS - there are no real masses, you understand?)
  • by evilviper ( 135110 ) on Sunday January 23, 2005 @03:05AM (#11446450) Journal
    Yes, wasn't this the fate of the Sega Saturn? Or was it the Atari Jaguar? It used multiple processors, which made it a fast and cheap system, but developers steered toward the ease of programming on the N64

    The Sega Saturn used dual processors and was nearly a clone of top-end Sega arcade systems. Unfortunately, it was terribly hard to program for, so only in-house Sega titles such as Virtua Fighter were developed to use the full potential of the hardware, while other titles used only half the performance of the system.

    It was not, however, cheap.

    The Jaguar was cheap and was hard to program for (primarily because of bugs in the cheap hardware), but it could hardly be considered a powerful system. It was well ahead of the NES/Genesis, but it came late in the game, when they were both well established and the PlayStation and Sega Saturn were just around the corner.
