How To Build a Homebrew PS3 Cluster Supercomputer

eldavojohn writes "UMass Dartmouth Physics Professor Gaurav Khanna and UMass Dartmouth Principal Investigator Chris Poulin have created a step-by-step guide designed to show you how to build your own supercomputer for about $4,000. They are also hoping that by publishing this guide they will bring about a new kind of software development targeting this architecture and grid (I know a few failed NLP projects of my own that could use some new hardware). If this catches on at research institutions it may increase Sony's console sales, but Sony might not see a corresponding spike in game sales (where it makes most of its profit)."
  • by makapuf ( 412290 ) on Wednesday December 17, 2008 @07:41PM (#26152741)

    AGAIN: the profit from console sales is not N*const (positive or negative), but N*const2 - const1, where const2 is the margin per console sold and const1 is the big upfront cost (R&D, licenses, ...). So a negative total does not imply the per-console margin is negative; mostly it means N*const2 has not yet grown past const1. (I hope this makes sense to some at least.)
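The fixed-vs-marginal cost point above can be sketched numerically; the figures below are invented for illustration, not Sony's actual costs:

```python
def console_profit(n_sold, upfront_cost, margin_per_unit):
    """Total profit = N * per-unit margin - fixed upfront costs."""
    return n_sold * margin_per_unit - upfront_cost

# Hypothetical numbers: $1B in R&D/licensing, $20 margin per console.
UPFRONT = 1_000_000_000
MARGIN = 20.0

# Early in the console's life, total profit is negative even though
# each individual sale contributes a positive margin.
assert console_profit(10_000_000, UPFRONT, MARGIN) < 0
# Past the break-even point (UPFRONT / MARGIN = 50M units) it turns positive.
assert console_profit(60_000_000, UPFRONT, MARGIN) > 0
```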

  • Re:Why use PS3s? (Score:5, Informative)

    by drinkypoo ( 153816 ) <drink@hyperlogos.org> on Wednesday December 17, 2008 @07:55PM (#26152875) Homepage Journal

    Why would you want to use PS3s for a homebrew supercomputing cluster if it means you have to write and optimize code for the SPEs to get benefit out of it? The PS3's linux environment doesn't let you utilize the GPU or all of the built-in SPEs and it doesn't have a lot of RAM available either.

    Well, I'll bite; if the cell is the fastest processor for your workload, the PS3 is the cheapest way to get one, even at only six usable SPEs and no GPU. Doesn't the PS3 have GigE? That's plenty fast enough to shovel data in and out of the system.

  • Re:Why PS3s? (Score:5, Informative)

    by Darkness404 ( 1287218 ) on Wednesday December 17, 2008 @08:01PM (#26152959)
    A) Although the Cell is a pain to code for, it is much better than whatever PC you can get for ~$400, which will probably contain a mid-to-low-range dual-core x86 CPU, whereas the PS3 gives you a Cell CPU that is much, much faster than the x86 CPU.

    B) PS3s are uniform. Other than HD differences, a PS3 built in 2008 will be the same as a PS3 built in 2012 (assuming the PS3 lasts that long). This allows for a uniform cluster without worrying about differing parts (for example, a Core i7 built in 2008 will not be the same as a Core i7 built in 2012, and getting a 2008 Core i7 by then is going to be a pain).

    C) PS3s are the new fad. It isn't going to be hard to set up a supercomputer cluster with PS3s compared to using a mismatch of older computers because, again, the PS3 is uniform.
  • Re:"super" computer: (Score:5, Informative)

    by steveha ( 103154 ) on Wednesday December 17, 2008 @08:07PM (#26153009) Homepage

    I'm not trying to be a smartass, but why did he mention in TFA that his supercomputer cost $4000 if the 8 consoles were "Sony-donated"?

    Oh come on, you are being pedantic. Clearly what he meant was "$4000 worth of consoles", never mind that they were donated. $X worth of consoles is a useful number if someone is considering buying PS3s and setting up a supercomputer; it's also a fun number to compare to the cost of renting time on some large supercomputer.

    The original Wired article is informative:

    http://www.wired.com/techbiz/it/news/2007/10/ps3_supercomputer [wired.com]

    He asked for Sony to donate the PS3s because he didn't think the NSF would give him grant money to buy video game systems. Now that he has actually built the supercomputer and it does everything he hoped it would do, perhaps other researchers will be able to justify the money to set up their own clusters (without donations from Sony).

    The numbers are a no-brainer: he used to spend $5000 to do a single simulation run using rented supercomputer time. For less than the cost of a single simulation run, you can set up your own supercomputer and make simulation runs whenever you feel like it.

    Also, like the iPod example at the top of the post, most research use of the technology won't come from actual iPods or consoles.

    Um, he is using actual PS3 consoles to do actual research.

    If one wanted to build one's own home "super" computer, then why not just use CUDA and a few Nvidia cards?

    If you think that is a good way to make a super computer, why don't you go ahead and do it, and make a web site explaining how it is done?

    Meanwhile, he thought he had a good way to go with the PS3, and it did in fact work as he expected, so what's the problem?

    Anyway, here's why he thought it was a good idea. From the above linked Wired article:

    According to Rimon, the Cell processor was designed as a parallel processing device, so he's not all that surprised the research community has embraced it. "It has a general purpose processor, as well as eight additional processing cores, each of which has two processing pipelines and can process multiple numbers, all at the same time," Rimon says.

    Khanna says that his gravity grid has been up and running for a little over a month now and that, crudely speaking, his eight consoles are equal to about 200 of the supercomputing nodes he used to rely on.

    steveha

  • Re:"super" computer: (Score:5, Informative)

    by afidel ( 530433 ) on Wednesday December 17, 2008 @08:20PM (#26153135)
    Old CPUs have much lower MIPS/watt and lower MIPS per interconnect, so they have a higher cost. Many organizations have found it's cheaper to retire an old supercomputer and add a few nodes to the new one, even at greater capital outlay, to get the same performance. Basically, a Cell does many times as much useful work as a P4 at a fraction of the power budget.
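The perf-per-watt argument can be made concrete; all numbers below are illustrative placeholders, not measured figures for real Cell or P4 nodes:

```python
def nodes_and_power(total_work, work_per_node, watts_per_node):
    """Nodes needed for a fixed amount of sustained work, and their power draw."""
    nodes = -(-total_work // work_per_node)  # ceiling division
    return nodes, nodes * watts_per_node

WORK = 1000  # arbitrary units of sustained throughput required

# Hypothetical: one Cell node sustains 10x the useful work of one P4
# node at roughly comparable power draw.
p4_nodes, p4_watts = nodes_and_power(WORK, 1, 100)
cell_nodes, cell_watts = nodes_and_power(WORK, 10, 120)

assert p4_nodes == 1000 and cell_nodes == 100
# Fewer nodes means less power and less interconnect for the same work.
assert cell_watts < p4_watts
```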
  • Re:Why use PS3s? (Score:5, Informative)

    by ASBands ( 1087159 ) on Wednesday December 17, 2008 @08:32PM (#26153267) Homepage

    since CUDA is roughly C

    Not quite. CUDA looks a lot like C in that it has C-family syntax, but its biggest limitation is that there is no application stack, which means no recursion. CUDA also lacks the idea of a pointer, although you can bypass this by doing number-to-address translation (as in, the number 78 means look up tex2D(tex, 0.7, 0.8)). The GPU also has other shortcomings: most architectures like to have all their shaders running the same instruction at the same time. Consider this code:

    if (pixel.r < pixel.g) {
        // do stuff A
    } else if (pixel.g < pixel.b) {
        // do stuff B
    } else {
        // do stuff C
    }

    The GPU will slow down a ton if the pixel color causes different pixels to branch in different directions: the three sets of shaders following the three branches of that code will each be inactive 2/3 of the time.

    On the Cell, you really do just program in C, with a number of extensions added on, like the SPE SIMD intrinsics and the DMA transfer commands (check it out [ibm.com]). The Cell really is 9 (10 logical) processors all working together in a single chip (except in the PS3, where only 7 SPEs are enabled, and only 6 of those are available under Linux). Furthermore, your 8 SPEs can be running completely different programs -- they're just little processors. Granted, you have to be smart when you program them to deal with race conditions and all the other crap you have to deal with in multithreaded programming. The Cell also takes about 14 times longer to calculate a double-precision floating-point result than a single-precision one (and there are no SPE instructions to do four at once as there are for singles).

    So which is more powerful? It really depends on what you're doing. If your task is ridiculously parallelizable and doesn't require recursion, pointers, or heavy branching, the GPU is most likely your best bet. If your program needs any of those, use a Cell.
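The branch-divergence penalty described above can be modeled crudely: a SIMD group executes every branch path some lane takes, one after another, and lanes are idle while a path they didn't take is running. A minimal sketch, assuming equal-cost branches (the three-way split is the hypothetical if/else-if/else from the comment):

```python
def simd_utilization(lane_fractions):
    """Crude model of SIMD branch divergence: the group serially executes
    every branch some lane takes (assumed equal cost), and during each
    branch only the lanes on that path do useful work.  lane_fractions
    gives the fraction of lanes taking each branch (summing to 1).
    Returns the fraction of lane-cycles doing useful work."""
    taken = [f for f in lane_fractions if f > 0]
    if not taken:
        return 0.0
    # Total time = number of distinct branches executed; useful work = 1.
    return sum(taken) / len(taken)

# All lanes take the same branch: full utilization.
assert simd_utilization([1.0, 0.0, 0.0]) == 1.0
# Lanes split evenly across the three branches: each set of shaders is
# idle 2/3 of the time, as the comment says.
assert abs(simd_utilization([1/3, 1/3, 1/3]) - 1/3) < 1e-9
```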

  • I wish I had one (Score:3, Informative)

    by uassholes ( 1179143 ) on Wednesday December 17, 2008 @09:23PM (#26153689)
    For the dick licks that say it's useless, I guess you missed all the previous articles about scientists who have been doing the same thing:

    http://www.physorg.com/news92674403.html [physorg.com]

    http://dgl.com/itinfo/2003/it030528.html [dgl.com]

    http://www.lbl.gov/Science-Articles/Archive/sabl/2006/Jul/06.html [lbl.gov]

    http://folding.stanford.edu/English/FAQ-PS3 [stanford.edu]

  • Re:"super" computer: (Score:4, Informative)

    by lysergic.acid ( 845423 ) on Wednesday December 17, 2008 @10:28PM (#26154229) Homepage

    it's not that simple. sure you can make up for a lack of per-CPU processing power through cluster computing, but at some point it becomes more practical or even cheaper to go with a smaller cluster using a better processor architecture.

    you could use hundreds of P3s or even P4s and still not achieve the same real-world performance as a couple dozen cell processors or modern GPGPU stream processors. that's because P3s & P4s are general-purpose CPUs designed for SISD/scalar processing. they're great for the bulk of general-purpose commodity computing applications like running an OS, web browser, word processor, etc., but high-performance computing problems typically involve processing very large data sets that greatly benefit from data parallelism. so if you had two processors, one scalar and one vector, each with the same power consumption and clock rate, the vector processor would be an order of magnitude faster at performing HPC tasks than the processor with the scalar architecture.

    and the combined use of parallelization at multiple levels will always be more efficient than relying solely on a single form of parallelism. blindly adding more cheap 32-bit scalar CPUs won't get you results as good as building a smaller cluster comprised of 64-bit fully-pipelined stream processors with multithreaded superscalar cores that support VLIW. in the former case, you're only employing task-level parallelism, whereas in the latter case you're taking advantage of bit-level, instruction-level (pipelining + superscalar + VLIW), data-level, and task-level (multiprocessing + multithreading) parallelism. you'd not only save power by using fewer (more power-efficient) processors, but you'd also reduce memory-coherence & bandwidth problems, not to mention the space savings.
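A toy illustration of the scalar-vs-vector point: a SIMD unit retires one instruction per group of elements, so at the same clock rate it issues far fewer instructions for the same data-parallel work. The widths here are illustrative (4-wide matches the SPE's single-precision pipelines, 1-wide a purely scalar SISD core):

```python
def instructions_needed(n_elements, simd_width):
    """Instructions required to apply one operation to n_elements,
    processing simd_width elements per instruction."""
    return -(-n_elements // simd_width)  # ceiling division

N = 1_000_000
scalar = instructions_needed(N, 1)  # SISD: one element per instruction
vector = instructions_needed(N, 4)  # 4-wide SIMD

assert scalar == 1_000_000
assert vector == 250_000
assert scalar / vector == 4.0  # same work, a quarter of the instructions
```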

  • Re:What a ripoff! (Score:2, Informative)

    by enslaved_robot_boy ( 774973 ) on Wednesday December 17, 2008 @10:36PM (#26154333)

    If you clicked on some of the links you would find some quantitative data hotshot.

  • by CronoCloud ( 590650 ) <cronocloudauron AT gmail DOT com> on Wednesday December 17, 2008 @11:07PM (#26154729)

    Meaning the 80GB systems are NOT 100% backwards-compatible.

    Not 100%, but still very high. Out of my collection of 64 PS2 games, only 2 have enough problems that I consider them unplayable on the PS3: Tekken Tag Tournament (doesn't run at full speed) and Fallout: Brotherhood of Steel (very pronounced texture glitching).

    But now, the PS3 ships without hardware or software emulation to play PS2/PS1 games.

    Sigh, why do people keep getting this wrong? Although the latest PS3 consoles can't play PS2 games, ALL PS3s can play PS1 games, since PS1 support is entirely software emulation.

  • Re:Limited use (Score:5, Informative)

    by raftpeople ( 844215 ) on Wednesday December 17, 2008 @11:34PM (#26155001)
    This is why:
    8 PS3s = 8 Cells
    8 Cells × 7 available SPEs per Cell = 56 SPEs
    56 SPEs × 4 simultaneous FP calcs = 224 FP calcs per cycle

    You would need quite a few of those x86 dual-core kits to match that performance.
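The arithmetic above, written out (note the 7-SPEs-per-Cell figure is this poster's; other comments in the thread put the number usable under Linux at 6). The clock rate used for the rough per-second figure is an assumption:

```python
CONSOLES = 8
SPES_PER_CELL = 7         # this comment's figure; 6 usable under Linux
FP_PER_SPE_PER_CYCLE = 4  # 4-wide single-precision SIMD

fp_per_cycle = CONSOLES * SPES_PER_CELL * FP_PER_SPE_PER_CYCLE
assert fp_per_cycle == 224  # matches the comment

# At an assumed 3.2 GHz clock, a rough single-precision peak
# (counting SIMD lanes only, no fused multiply-add):
CLOCK_HZ = 3.2e9
peak_ops_per_sec = fp_per_cycle * CLOCK_HZ
```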
  • Re:ibm (Score:3, Informative)

    by dbIII ( 701233 ) on Wednesday December 17, 2008 @11:36PM (#26155031)
    If you wait out the two weeks of used-car-salesman tactics from a place like Mercury, which can sell you Cell processors in systems with more than 256 MB of RAM, you'll find that unless you have an endless budget you are probably better off with the ten quad-core Xeon systems you could get for the same price.

    That is why a system made of game consoles makes a lot more sense than very similar hardware in a rackmount case. Other Cell hardware has been priced into complete irrelevance by salesfolk having too much control over the process.

    On the other hand, there are the Nvidia CUDA solutions: hardware doing things a slightly different way, but proudly printing their prices on the net instead of requiring two weeks of mindless chatty emails before you get a price.

  • Re:Why use PS3s? (Score:1, Informative)

    by Anonymous Coward on Thursday December 18, 2008 @12:32AM (#26155619)

    The heap/recursion thing is occasionally annoying but not a blocker. If you can "be smart" when programming a Cell, you can be smart and unroll your recursion, as I did to code non-recursive heap operations with CUDA. The payoff is that I now have a ridiculously fast and awesome heap. Using hardware acceleration in the first place means you have a problem where you deem the machine time more valuable than the dev time needed to code for it over just the CPU. Recursive implementations are dev-friendly but not usually the best solution for the machine. If it's worth the dev time to code for the custom hardware, then it's worth the effort to unroll the recursion (that is to say, since you are going for high performance, you'd probably unroll your recursion on such an important problem on a CPU as well).

    Branches: that's really nitpicking, and from what I've seen a modest price. I don't sweat branches; if you need them, you need them, and none of them have killed my performance yet.

    Pointers: huh? CUDA is rife with pointers, especially with image processing. See the image-processing samples among Nvidia's CUDA samples, e.g. boxFilter: the image gets chopped up, and stream processors process pieces of it, indexed by pointers into the image. You do a cudaMalloc and it returns a pointer to device memory. Pointer math works fine. It's not a pointer to host memory, but it is a pointer.

    Having slogged through the APIs for other hardware-accelerated solutions, I think the claim that "CUDA is roughly C" is accurate. So it's a relative thing. Not perfect, but oh yes, it's very nice.
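The "unroll your recursion" trick this poster describes is the standard explicit-stack transformation: device code with no call stack can't recurse, so the algorithm keeps its own stack of pending work. The poster's heap code isn't shown in the thread, so here is a minimal sketch of the transformation on a simpler example, a tree sum:

```python
def sum_tree_recursive(node):
    """Recursive form: relies on the hardware call stack."""
    if node is None:
        return 0
    value, left, right = node
    return value + sum_tree_recursive(left) + sum_tree_recursive(right)

def sum_tree_iterative(node):
    """Same traversal with an explicit stack -- the shape of the rewrite
    needed when the target hardware has no call stack."""
    total, stack = 0, [node]
    while stack:
        n = stack.pop()
        if n is None:
            continue
        value, left, right = n
        total += value
        stack.append(left)
        stack.append(right)
    return total

tree = (1, (2, None, None), (3, (4, None, None), None))
assert sum_tree_recursive(tree) == sum_tree_iterative(tree) == 10
```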
