Slashdot
How To Build a Homebrew PS3 Cluster Supercomputer
Posted by
timothy
on Wednesday December 17, @06:18PM
from the slot-a-tab-b dept.
eldavojohn writes "UMass Dartmouth Physics Professor Gaurav Khanna and UMass Dartmouth Principal Investigator Chris Poulin have created a step-by-step guide designed to show you how to build your own supercomputer for about $4,000. They are also hoping that by publishing this guide they will bring about a new kind of software development targeting this architecture & grid (I know a few failed NLP projects of my own that could use some new hardware). If this catches on for research institutions it may increase Sony's sales, but they might not be seeing the corresponding sale of games spike (where they make the most profit)."
Related Stories
Inside Tsubame, Japan's GPU-Based Supercomputer 75 comments
Startled Hippo writes "Japan's Tsubame supercomputer was ranked 29th-fastest in the world in the latest Top 500 ranking with a speed of 77.48T Flops (floating point operations per second) on the industry-standard Linpack benchmark. Why is it so special? It uses NVIDIA GPUs. Tsubame includes hundreds of graphics processors of the same type used in consumer PCs, working alongside CPUs in a mixed environment that some say is a model for future supercomputers serving disciplines like material chemistry." Unlike the GPU-based Tesla, Tsubame definitely won't be mistaken for a personal computer.

Why use PS3s? (Score:4, Insightful)
Why would you want to use PS3s for a homebrew supercomputing cluster if it means you have to write and optimize code for the SPEs to get benefit out of it? The PS3's linux environment doesn't let you utilize the GPU or all of the built-in SPEs and it doesn't have a lot of RAM available either. It seems like it would be cheaper to build a cluster out of commodity PC parts, and maybe use GPUs+CUDA to get more muscle without having to completely hand-roll your own accelerated computation code (since CUDA is roughly C). I can't imagine that the PS3 would end up cheaper for these purposes, considering it includes a Blu-Ray player along with a bunch of other things you're not going to be using.
Re:Why use PS3s? (Score:5, Informative)
Why would you want to use PS3s for a homebrew supercomputing cluster if it means you have to write and optimize code for the SPEs to get benefit out of it? The PS3's linux environment doesn't let you utilize the GPU or all of the built-in SPEs and it doesn't have a lot of RAM available either.
Well, I'll bite; if the Cell is the fastest processor for your workload, the PS3 is the cheapest way to get one, even at only six usable SPEs and no GPU. Doesn't the PS3 have GigE? That's plenty fast enough to shovel data in and out of the system.
Re:Why use PS3s? (Score:4, Interesting)
Re:Why use PS3s? (Score:5, Informative)
Not quite. CUDA looks a lot like C in that it has C-family syntax but the biggest limitation it has is that there is no application stack - which means no recursion. CUDA also lacks the idea of a pointer, although you can bypass this by doing number to address translation (as in, the number 78 means look up tex2D(tex, 0.7, 0.8)). The GPU also has other shortcomings, in that most architectures like to have all their shaders running the same instruction at the same time. For this code
if (pixel.r < pixel.g){
//do stuff A
}else if (pixel.g < pixel.b){
//do stuff B
}else{
//do stuff C
}
The GPU will slow down a ton if the pixel color causes different pixels to branch in different directions. Basically, the three sets of shaders following different branches of that code will be inactive 2/3 of the time.
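To put a rough number on that, here is a minimal sketch in plain Python (not real shader or CUDA code) that models a group of lanes forced to step through every branch that any lane takes. The names `branch_for` and `lockstep_utilization` are made up for illustration, and the model assumes all three branches cost the same and that the hardware simply serializes them, which is a simplification.

```python
def branch_for(pixel):
    """Return which of the three branches (A, B or C) a pixel takes."""
    r, g, b = pixel
    if r < g:
        return "A"
    elif g < b:
        return "B"
    return "C"


def lockstep_utilization(pixels):
    """Fraction of lane-slots doing useful work when every lane in the
    group has to sit through each branch that any lane takes."""
    taken = [branch_for(p) for p in pixels]
    passes = len(set(taken))   # one serialized pass per distinct branch
    useful = len(pixels)       # each lane does real work in exactly one pass
    return useful / (passes * len(pixels))


uniform = [(1, 2, 3)] * 6                       # every lane takes branch A
mixed = [(1, 2, 3), (3, 2, 5), (3, 2, 1)] * 2   # lanes split across A, B and C

print(lockstep_utilization(uniform))  # all lanes busy
print(lockstep_utilization(mixed))    # each lane busy in one pass out of three
```

Under those assumptions the mixed group is busy one third of the time, which is the "inactive 2/3 of the time" figure above.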
In the Cell, you really do just program in C with a number of extensions added onto it like the SPE SIMD intrinsics and the DMA transfer commands (check it out [ibm.com]). The Cell really is 9 (10 logical) processors all working together in a single chip (except in PS3, where there are only 7 working SPEs). Furthermore, your 8 SPEs can be running completely different programs -- they're just little processors. Granted, you have to be smart when you program them to deal with race conditions and all the other crap you have to deal with for multithreaded programming. The Cell takes about 14 times longer to calculate a double-precision floating-point operation than a single-precision one (and there aren't SPE commands to do four at once like you can with singles).
So which is more powerful? It really depends what you're doing. If your task is ridiculously parallelizable and doesn't require the use of recursion, pointers or multiple branches, the GPU is most likely your best bet. If your program falls into any of those categories, use a Cell.
Re: (Score:3, Interesting)
Limited use (Score:5, Insightful)
Couple issues with this as an alternative to the garden-variety x86 cluster connected with InfiniBand:
Slow network interconnect. For problems that are not trivially parallel, network latency is usually a big deal. Ethernet doesn't cut it.
Lack of RAM. 'Nuff said.
Have to care about Cell and PS3 architecture. The codes ("codes" has a slightly different meaning in the context of supercomputing) have to be modified to take advantage of this very specific architecture. Software always outlives hardware, so in the long run the effort may not be worth it.
That said, it's really cheap. If your application isn't held back too much by these issues then enjoy your insanely cheap cluster!
Re:Limited use (Score:5, Informative)
8 PS3s = 8 Cells
8 Cells x 7 available SPEs per Cell = 56 SPEs
56 SPEs x 4 simultaneous FP calcs = 224 FP calcs per cycle
You would need to get quite a few of those x86 dual-core kits to match that performance.
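The arithmetic above is easy to redo with other assumptions. A minimal sketch follows; the function name `fp_lanes_per_cycle` is made up for illustration. The seven-SPE count is this comment's figure, while the six-SPE count is the "only six usable SPEs" figure quoted earlier in the thread for Linux on the PS3. It counts SIMD lanes only and says nothing about clock speed or how well real code keeps the lanes fed.

```python
def fp_lanes_per_cycle(consoles, spes_per_cell, lanes_per_spe=4):
    """Single-precision SIMD lanes available per cycle across the cluster."""
    return consoles * spes_per_cell * lanes_per_spe


print(fp_lanes_per_cycle(8, 7))  # 224, the figure in the comment above
print(fp_lanes_per_cycle(8, 6))  # 192, if only six SPEs per console are usable
```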
Re:Limited use (Score:4, Interesting)
I tell you what: you go ahead and buy $4000 of those Dual core kits, and we'll compare your output from a well-written algorithm versus the Cell system designed by this team.
Some interesting code examples for using the Cell have been demonstrated and it has immense processing power that most people don't recognize immediately. Check out this Dr Dobb's Journal article [uni-erlangen.de] for an example.
Power and maintenance? (Score:3, Interesting)
I wish I had one (Score:3, Informative)
http://www.physorg.com/news92674403.html [physorg.com]
http://dgl.com/itinfo/2003/it030528.html [dgl.com]
http://www.lbl.gov/Science-Articles/Archive/sabl/2006/Jul/06.html [lbl.gov]
http://folding.stanford.edu/English/FAQ-PS3 [stanford.edu]
Re: (Score:3)
Re: (Score:3, Funny)
This isn't really the place to start criticizing grammar and spelling, unless you REALLY want to live a life full of frustration and torment....?
(Though it could be worse, I suppose - you could go to digg etc. instead...:p)
Re:ibm (Score:5, Insightful)
If your application leans almost entirely on the CPU with very little need for RAM, and you have an army of screwdriver monkeys (or grad students) to do all the legwork, the PS3 is an excellent deal. If you need something with RAM capacity that wasn't a joke in 2001, and/or management features that won't have you tearing your eyes out when you have 10,000 of them, then IBM smells opportunity.
Re: (Score:3, Informative)
AGAIN, the net cost of console sales is not N*const (positive or negative), but const1 + N*const2, where const2 is negative (it's a gain per console) but the upfront costs const1 (R&D, licences ...) are big. So people see that the total is a loss and conclude that every console sells at a loss, but in fact it's mostly that the per-console gains, N*|const2|, still add up to less than const1. (I hope this makes sense to some at least)
Re:Subsidized Supercomputers (Score:5, Insightful)
Re:Why PS3s? (Score:5, Informative)
B) PS3s are uniform. Other than HD differences, a PS3 built in 2008 will be the same as a PS3 built in 2012 (assuming the PS3 lasts that long). This allows for a uniform cluster without worrying about differing parts (for example, a Core i7 built in 2008 will not be the same as a Core i7 built in 2012, and getting a 2008 Core i7 is going to be a pain).
C) PS3s are the new fad. It isn't going to be hard to set up a supercomputer cluster with PS3s compared to using a mismatched set of older computers because, again, the PS3 is uniform.
Re:"super" computer: (Score:5, Informative)
I'm not trying to be a smartass, but why did he mention in TFA that his supercomputer cost $4000 if the 8 consoles were "Sony-donated"?
Oh come on, you are being pedantic. Clearly what he meant was "$4000 worth of consoles", never mind that they were donated. $X worth of consoles is a useful number if someone is considering buying PS3s and setting up a supercomputer; it's also a fun number to compare to the cost of renting time on some large supercomputer.
The original Wired article is informative:
http://www.wired.com/techbiz/it/news/2007/10/ps3_supercomputer [wired.com]
He asked for Sony to donate the PS3s because he didn't think the NSF would give him grant money to buy video game systems. Now that he has actually built the supercomputer and it does everything he hoped it would do, perhaps other researchers will be able to justify the money to set up their own clusters (without donations from Sony).
The numbers are a no-brainer: he used to spend $5000 to do a single simulation run using rented supercomputer time. For less than the cost of a single simulation run, you can set up your own supercomputer and make simulation runs whenever you feel like it.
Also, like the iPod example at the top of the post, most research use of the technology won't come from actual iPods or consoles
Um, he is using actual PS3 consoles to do actual research.
If one wanted to build their own home "super" computer then why not just use CUDA and a few Nvidia cards?
If you think that is a good way to make a super computer, why don't you go ahead and do it, and make a web site explaining how it is done?
Meanwhile, he thought he had a good way to go with the PS3, and it did in fact work as he expected, so what's the problem?
Anyway, here's why he thought it was a good idea. From the above linked Wired article:
steveha
Re:"super" computer: (Score:5, Informative)
Re: (Score:3, Interesting)
basically it comes down to the costs of having your own personal power station in the TCO to run a cluster.
this started (well, really hit it off) a few years back, when the pentium M and centrino tech became widespread. basically, to my knowledge, it was the first time you could actually have more processors with less jiggahertz, that consumed less power in total and still had more flops than the others. it swayed everyone from "more powerful cpus plz" train of thought to the "more cp
Re:"super" computer: (Score:4, Informative)
it's not that simple. sure you can make up for a lack of per-CPU processing power through cluster computing, but at some point it becomes more practical or even cheaper to go with a smaller cluster using a better processor architecture.
you could use hundreds of P3s or even P4s and still not achieve the same real-world performance as a couple dozen cell processors or modern GPGPU stream processors. that's because P3s & P4s are general-purpose CPUs designed for SISD/scalar processing. they're great for the bulk of general-purpose commodity computing applications like running an OS, web browser, word processor, etc., but high-performance computing problems typically involve processing very large data sets that greatly benefit from data parallelism. so if you had two processors, one scalar and one vector, each with the same power consumption and clock rate, the vector processor would be an order of magnitude faster at performing HPC tasks than the processor with the scalar architecture.
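as a toy illustration of the data-parallelism point, here is a minimal Python sketch that counts how many simulated instructions it takes to add two arrays when each instruction handles one element versus several. the function `add_arrays` and the lane counts are made up for the example; real speedups depend on memory bandwidth, pipelining and the actual instruction set, so read the counts as an upper bound on the benefit, not a benchmark.

```python
def add_arrays(a, b, lanes):
    """Add two equal-length lists, processing `lanes` elements per
    simulated instruction. Returns (result, instructions_issued)."""
    out = []
    instructions = 0
    for start in range(0, len(a), lanes):
        # one simulated instruction operates on up to `lanes` elements
        out.extend(x + y for x, y in zip(a[start:start + lanes],
                                         b[start:start + lanes]))
        instructions += 1
    return out, instructions


a = list(range(10))
b = [1] * 10

scalar_result, scalar_ops = add_arrays(a, b, lanes=1)   # scalar: one element at a time
vector_result, vector_ops = add_arrays(a, b, lanes=4)   # 4-wide vector unit

print(scalar_ops, vector_ops)
print(scalar_result == vector_result)
```

same answer either way; the wider unit just issues fewer instructions at the same clock rate.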
and the combined use of parallelization at multiple levels will always be more efficient than relying solely on a single form of parallelism. blindly adding more cheap 32-bit scalar CPUs won't get you results as good as building a smaller cluster composed of 64-bit fully-pipelined stream processors with multithreaded superscalar cores that support VLIW. in the former case, you're only employing task-level parallelism, whereas in the latter case you're taking advantage of bit-level, instruction-level (pipelining + superscalar + VLIW), data, and task-level (multiprocessing + multithreading) parallelism. you'd not only save power by using fewer (more power-efficient) processors, but you'd also reduce memory coherence & bandwidth problems, not to mention the space savings.
Re:"super" computer: (Score:4, Funny)
or bank reclaimed assets from a sunken business?
What type of processor do Woolworth's POS tills use?
Invader Zim is non-free (Score:3, Funny)
It's not my fault you didn't catch the Invader Zim reference.
Invader Zim is non-free. It's easier to catch pop culture references if they are pre-1923 or otherwise free [freedomdefined.org].
Re:Invader Zim is non-free 23-skidoo! (Score:5, Funny)
sorry, but that's stupid -how many pop culture references from 1923 are relevant to TODAY's pop culture:
seeya snookums, me and the squeeze are the bees knees in our raccoon coats, we're gonna get jazzed up in our hupmobile on hootch and go check out Mary Astor's horse after we hit the blind pig.
I agree it's unfortunate that this stuff is non-free, but pre-1923 means that most talkies would be out of bounds as well, including stuff you can see on TV all the time.
I'm just sayin'
Re: (Score:3, Insightful)
sorry, but that's stupid -how many pop culture references from 1923 are relevant to TODAY's pop culture:
A perfect illustration of the fact that copyright terms are way too long.
Re:Pretty much useless (Score:5, Interesting)
How is it useless, when the guy who built it, used it already for a month? And it has replaced 200 supercomputer nodes, for his purpose? I'd say that's very fucking useful.
But you know what, maybe you should send him an e-mail and try to convince him how his cluster is useless. Make it a nice, insightful and intelligent e-mail, like your post.