NVIDIA's $10K Tesla GPU-Based Personal Supercomputer
gupg writes "NVIDIA announced a new category of supercomputers — the Tesla Personal Supercomputer — a 4 TeraFLOPS desktop for under $10,000. This desktop machine has 4 of the Tesla C1060 computing processors. These GPUs have no graphics output and are used only for computing. Each Tesla GPU has 240 cores and delivers about 1 TeraFLOPS single-precision and about 80 GigaFLOPS double-precision floating point performance. The CPU + GPU system is programmed in C with a few added keywords, using a parallel programming model called CUDA. The CUDA C compiler/development toolchain is free to download. Many applications have been ported to CUDA, including Mathematica, LabVIEW, ANSYS Mechanical, and scientific codes from molecular dynamics, quantum chemistry, and electromagnetics; they're listed on CUDA Zone."
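The "C with added keywords" model is data-parallel: the programmer writes a kernel that runs once per element, and the GPU launches one lightweight thread per index. As a rough illustration only (plain Python, not CUDA C — the function names here are made up), the classic SAXPY pattern, where the loop body is what each GPU thread would execute independently:

```python
# Illustration of the CUDA data-parallel style: each "thread" runs the
# same kernel body on a different element index. In real CUDA C this
# loop is replaced by a grid of hardware threads, one per index i.
def saxpy_kernel(i, a, x, y, out):
    out[i] = a * x[i] + y[i]   # one thread's share of the work

def launch(a, x, y):
    n = len(x)
    out = [0.0] * n
    for i in range(n):         # on the GPU these iterations run in parallel
        saxpy_kernel(i, a, x, y, out)
    return out

print(launch(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))
# -> [12.0, 24.0, 36.0]
```

Because every index is independent, the same source scales from a handful of cores to the 240 cores per Tesla GPU without restructuring.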
Heartening... (Score:3, Interesting)
...to see a company established in one market branch out so aggressively and boldly into something... well, completely new, really.
Does anyone know if Comsol Multiphysics can be ported to CUDA?
Re:Ooooooo! Ahhhh! (Score:2, Interesting)
Port john the ripper/aircrack-ng? Buy a few terabyte drives and start generating hash tables?
What a disappointment (Score:2, Interesting)
Turns out "Tesla" is just the name of the product.
Drat. I demand a refund.
Scientist speak (Score:2, Interesting)
So many scientists use the word "codes" when they mean "program(s)".
Why is this?
Re:And the worst timing ever award goes to... (Score:2, Interesting)
According to http://folding.stanford.edu/English/Stats, about 250,000 "normal" users are folding proteins at home.
Personally, I would use it as a render farm, but Blender compatibility could take a while if Nvidia keeps the drivers and specification locked up.
What they don't seem to mention is the amount of memory/core (at 960 cores). I'd guess about 32 MB/core, and 240 cores sharing the same memory bus...
Re:Scientist speak (Score:3, Interesting)
It's cultural.
You're not even allowed to say that you're "coding", but only that you produce "codes".
Maybe it's because analytic science is based on equations, which become algorithms in computing, and you can't say that you're "equationing" or "algorithming".
In practice it's actually dishonest, because the algorithms don't have the conceptual power of the equations that they represent (they would if programmed in LISP, but "codes" are mostly written in Fortran and C), so the computations are often questionable. Even worse, it's almost impossible for one research group to compare the "codes" that yielded their results against those produced by another group when numerical computing is used, whereas equations are universally portable.
The theoretical half of the scientific method has lost some of the firm foundations upon which it used to build in recent years, as a result of theorizing through numerical simulation. Fortunately it doesn't matter too much in most sciences because experiment soon demolishes any incorrect predictions. However, those sciences which deal with long-term or historic or otherwise untestable areas are suffering, as a fair bit of unsubstantiated nonsense is popping out of poorly approximated simulations and being claimed as "fact", even though reality hasn't agreed yet.
Things are probably going to get worse in this area before they get better.
Re:weak DP performance (Score:1, Interesting)
> Each Tesla GPU has 240 cores and delivers about 1 Teraflop single precision and about 80 Gigaflops double-precision floating point performance.
The 80 GFLOPS figure is per card, so with four cards you end up with 320 GFLOPS total.
Not much better, but still better than nothing ;)
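A quick sanity check of the per-card versus per-system figures, using only the numbers from the summary:

```python
# Throughput totals from the article's per-card figures.
cards = 4
dp_per_card_gflops = 80    # double precision, per Tesla C1060
sp_per_card_tflops = 1.0   # single precision, per Tesla C1060

dp_total = cards * dp_per_card_gflops
sp_total = cards * sp_per_card_tflops
print(dp_total)   # 320 GFLOPS double precision for the whole box
print(sp_total)   # 4.0 TFLOPS single precision for the whole box
```

So the headline "4 TeraFLOPS desktop" refers to single precision; double precision is about 12.5x slower.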
Re:Heartening... (Score:5, Interesting)
Yes, I can. My first thought when I saw the article was to calculate how many of them one would need to simulate a human brain in real time. The answer is: with 2500 of these machines one could simulate a hundred billion neurons with a thousand synapses each, firing a hundred times per second, which is the approximate capacity of a human brain.
People have paid $20 million to visit the space station, now who will be the first millionaire hobbyist to pay $25 million to have his own simulated human brain?
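The arithmetic behind that claim can be checked directly. Note it assumes a hypothetical model where one synaptic event costs exactly one floating-point operation:

```python
# Back-of-envelope from the post above (assumption: 1 flop per synaptic event).
neurons  = 1e11    # ~100 billion neurons
synapses = 1e3     # per neuron
rate_hz  = 1e2     # firings per second
machines = 2500
flops_per_machine = 4e12   # 4 TFLOPS per Tesla Personal Supercomputer

required  = neurons * synapses * rate_hz        # synaptic events/s = ops/s
available = machines * flops_per_machine
print(required, available)   # 1e16 vs 1e16 -- exactly at capacity
```

Under that (very optimistic) one-op-per-synapse assumption, 2500 machines land exactly on the 10^16 ops/s budget, with nothing left over.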
Erlang (Score:3, Interesting)
Re:Heartening... (Score:3, Interesting)
Would the interconnects be fast enough? There's a lot of non-locality in the synaptic connections, so you're going to need some pretty heavy comms between the cores.
Also a selection of neurons are far more heavily connected than 1000s of synapses, and they're fairly essential ones. Might these be a critical path?
Sure would be cool to build such a beast, do some random connections, and see what happens...
Re:Heartening... (Score:1, Interesting)
Short answer: no.
Long answer: there is no direct interconnect between the cards, so any data would have to go down the PCIe bus to the host and then back up to an interconnect card, across the network and back across the PCIe bus twice to get to the other. With a specially designed PCIe root complex you could probably eliminate some of the overhead and allow the card to send direct to the interconnect without having to share the bus with the host, but you couldn't do that currently. Even then, there isn't any interconnect currently available that even comes close to the bandwidth you'd need.
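To put a rough number on the path described above: assume (hypothetically, but era-appropriate) a PCIe 2.0 x16 link peaking around 8 GB/s, and a payload that crosses a PCIe link four times (card to host, host to NIC, then NIC to host and host to card on the far side):

```python
# Rough data-path cost for card-to-card traffic over the described route.
# Assumed figures (hypothetical): PCIe 2.0 x16 one-way peak ~8 GB/s,
# payload crosses a PCIe link four times; network time is ignored.
pcie_gb_per_s = 8.0
payload_gb = 1.0        # 1 GB of state to exchange
crossings = 4

transfer_s = crossings * (payload_gb / pcie_gb_per_s)
print(transfer_s)       # 0.5 s of pure PCIe time, before the network
```

Half a second of bus time per gigabyte exchanged, before counting the network hop at all, is why tightly coupled simulations don't scale across these boxes.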
cold hard facts about cuda (Score:3, Interesting)
Re:Heartening... (Score:5, Interesting)
Your figures are off by several orders of magnitude. 2,500 of these machines is roughly 10,000 TFLOPS. As a TFLOPS is 10^12 operations per second, and we have 10^11 neurons, that leaves 10^5 floating-point operations per neuron per second. If each has 1,000 synapses to process, then we are down to 100 operations per connection, per second.
At this point it seems obvious that you've assumed a really simplistic model of a neuron that can compute a synaptic value in a single floating point operation. These simple neuron models don't behave like a real brain, and scaling up simulations of them doesn't produce anything interesting. Real neurons are capable of computing much more complex functions than these models. The throughput on the interconnect is going to be a major factor, and simulating each neuron will require from 10s to 1000000s of operations depending on the level of biological realism that is required. The Blue Brain project has a lot of interesting material on different models of the neuron and the tradeoff between performance and realism.
Their end goal is to dedicate a large IBM Blue Gene to simulating an entire column within the brain (roughly 1,000,000 neurons) using a biologically-realistic model.
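The budget squeeze in the reply above can be made concrete. Dividing the aggregate throughput down to the per-neuron and per-synapse level, then applying the 10 to 10^6 ops-per-neuron range for more realistic models (the 1e6 midpoint below is an illustrative assumption, not a measured figure):

```python
# Ops budget per neuron and per synapse, per the reply's arithmetic.
total_ops = 2500 * 4e12        # 1e16 ops/s across 2500 machines
neurons   = 1e11
synapses  = 1e3

per_neuron = total_ops / neurons       # ops per neuron per second
per_syn    = per_neuron / synapses     # ops per synapse per second
print(per_neuron, per_syn)             # 1e5 and 100

# If a biologically realistic neuron needs ~1e6 ops per simulated second
# (assumed; the upper end of the range quoted above), the shortfall is:
print(1e6 / per_neuron)                # 10x too slow at that realism level
```

So even ignoring interconnect costs entirely, a realistic model would need roughly an order of magnitude more machines than the simplistic estimate suggests.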
Nor turbine. (Score:2, Interesting)
Patmos International (Score:3, Interesting)
Re: Is that all you got? (Score:3, Interesting)
Neural nets.
This setup sounds ideal for a training bed for fann programs. I can't recall if there's a port of fann for CUDA, but I think there might be.
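For a sense of the workload involved, here is a toy training loop (this is not FANN's API, just the kind of repetitive multiply-accumulate arithmetic — one independent update per weight — that a CUDA port would parallelize across cores):

```python
# Toy perceptron learning the OR gate. Each weight update is independent
# arithmetic on array elements: exactly the shape of work GPUs are good at.
def train_or_gate(epochs=50, lr=0.1):
    w = [0.0, 0.0]
    b = 0.0
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
    for _ in range(epochs):
        for x, target in data:
            out = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
            err = target - out          # perceptron error signal
            w[0] += lr * err * x[0]     # these updates are data-parallel
            w[1] += lr * err * x[1]
            b += lr * err
    return w, b

w, b = train_or_gate()
for x, t in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]:
    assert (1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0) == t
```

Scale the two weights up to millions and the four training rows up to a large dataset, and the per-element independence is what makes 240 cores per GPU useful.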
Re:Graphics (Score:2, Interesting)
Comment removed (Score:4, Interesting)
Re:4 TFLOPS? (Score:2, Interesting)
A single Radeon 4870 X2 uses two chips:
2.4 / 2 = 1.2 TFLOPS per chip.
Each Tesla GPU has 240 cores and delivers about 1 TeraFLOPS single precision...
Each Radeon HD 4870 produces 1.2 TFLOPS, about 0.2 more than one Tesla GPU.
"NVIDIA announced...the Tesla Personal Supercomputer -- a 4 TeraFLOPS desktop...
Two 4870 X2s equal 4.8 TFLOPS, 0.8 more than four Tesla GPUs.
I think the parent's point was that even though an HD Radeon 4870 X2 is made up of two GPUs, they're still connected and recognized as one card. Thus, with "fewer" cards and fewer slots you could achieve more performance. Or you could use the other two vacant slots for yet another two 4870 X2s: four of them in CrossFire would equal 9.6 TFLOPS, 5.6 more than four Tesla GPUs.
Furthermore, I would assume two GPUs that are closely interconnected as a "single" card (4870 X2) would do better than a pair of GPUs connected through a combination of the motherboard (two Tesla GPUs) and custom interconnects.
I'm not implying that an HD 4870 is a viable alternative to a Tesla GPU, but the raw performance is more than comparable. As mentioned before, the Tesla hardware is aimed at precision rather than speed. Then again, you could compensate for inaccuracy by using all that computing power to make multiple passes rather than relying on your initial calculations being accurate.
Note: Emphasis by me in all quotes provided.
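The Radeon-versus-Tesla numbers traded back and forth above, checked in one place:

```python
# Single-precision figures quoted in this thread.
radeon_4870_tflops = 1.2   # per GPU chip (an X2 card carries two)
tesla_c1060_tflops = 1.0   # per Tesla GPU

two_x2_cards  = 2 * 2 * radeon_4870_tflops   # 4 chips  -> 4.8 TFLOPS
four_teslas   = 4 * tesla_c1060_tflops       # 4 GPUs   -> 4.0 TFLOPS
four_x2_cards = 4 * 2 * radeon_4870_tflops   # 8 chips  -> 9.6 TFLOPS
print(two_x2_cards, four_teslas, four_x2_cards)
```

The chip counts matter: the "two X2s beat four Teslas" comparison is really eight Radeon chips' worth of peak throughput in four slots' worth of cards.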
It's news because... (Score:3, Interesting)
So, while in theory you could put together some Radeons, work with their API, and achieve the same thing, NVIDIA has significantly reduced the level of effort required to make it happen.
Re:cold hard facts about cuda- unbalanced (Score:1, Interesting)
Re:Heartening... (Score:3, Interesting)
I think your post was intended humorously, but I'm going to pretend otherwise. (Note, I'm not a specialist in computational mentalistics, or whatever the field would be called, but:)
I'm fairly certain the interconnects are fast enough. The brain is no speed demon on individual connections. It's basically chemical, with only a little electrical stuff on top that's still based on ions floating in liquid.
The problem is the software. And the sensoria. And the effectors.
Each of those problems is being addressed separately. What do you want to bet that when they all come to "good enough" solutions, interfacing them is going to be a MASSIVE kludge.
And even if you could, you can't just copy how people did it. A camera is basically different from a retina. It extracts different information. You can use complex processing to convert one into a simulation of the other, but there's no straightforward mapping. Each conversion involves loss of information...so you need to ensure that the correct information is lost.
Just as a silly example of the difference, a recent experimental hearing aid uses infra-red lasers to stimulate the nerves in the cochlea. You KNOW that people use electric signals, but artificially generated electrical signals spread too much in the interface, so you can't get decent tone resolution. With infra-red lasers, though, you can stimulate any particular neuron you choose.
Guaranteed: random connections will give you a crashed program. Secondary chance is an infinite loop.
Mind you, there are neural nets that are initialized with random initial values, but they have strict boundary conditions. Otherwise you never get better than garbage out of them.
Also: There are lots of groups of neurons that are more highly connected than average. These are "functional specialists". There often isn't anything special about the neurons, but only about the way that their connections have been reinforced. I'm not sure about the neurons that branch outside of the column, but I suspect the same of them.
My projection for a human-mind-equivalent computer remains at around 2020-2030. This announcement drops my estimate of the cost, but that was never an exact number of dollars, so I can't quantify it. Also note that I said equivalent. I'm not going to assert that it would enjoy watching Star Wars, or even 2001. Its emotions are unlikely to be similar in nature to those of a mammal... unless that's necessary in order to understand human language... and only to the extent necessary.
For that matter, we wouldn't WANT it to have the same emotional structure that we have. That would be very dangerous. If we did that, then it might have "take over the world!" as an innate goal, rather than as a tactical move. Even as a tactical move it's rather dangerous, so we would probably want to design its goals so that such a tactical move would appear extremely distasteful, and best accomplished by manipulating willing proxies. (This would ensure that there was room for people where people would be comfortable.)
OTOH, I don't see a human-mind-equivalent AI as remaining merely human-equivalent. Progress rarely stops. But if its motivational structure is designed so that there's plenty of comfortable room for people, I don't see this as a problem. Entities rarely want to alter their motivational structure unless it's giving them severe problems, and often not then. But don't expect it to be passive or a mere recipient of orders. It would, however, be reasonable to expect it to be a lot more considerate of human needs and desires than the current bureaucracy... in any country. (Note that individual office holders may well be sympathetic, but the system itself isn't.)