NVIDIA To Push Into Supercomputing
RedEaredSlider writes "NVIDIA outlined a plan to become 'the computing company,' moving well beyond its traditional focus on graphics and into high-profile areas such as supercomputing. NVIDIA is making heavy investments in several fields. Its Tegra product will be featured in several mobile devices, including a number of tablets that have either hit the market already or are planned for release this year. Its GeForce lineup is gaming-focused while Quadro is all about computer-aided design workstations. The Tesla product line is at the center of NVIDIA's supercomputing push."
more nukes :/ (Score:3, Insightful)
I just hope enough nuclear power plants come online before their first supercomputer customer turns on a new rig. The latest GPUs already use more power than the hungriest Intel or AMD x86 ever did.
Re: (Score:1)
Um, yes, of course. Because they have 292 cores instead of 4/6/8. While those two designs do remarkably different things, the point remains: for the tasks to which GPUs are well suited, you cannot possibly beat them with an Intel/AMD CPU.
Re: (Score:2)
No they don't, let's stop such silliness.
The GTX580 has 16 cores. The GTX280 has 32. The AMD 6970 has 24. The AMD Magny-Cours CPUs can have up to 16 (ish, if you don't mind that it's an MCM).
292 indeed. NVIDIA does an even better job of marketing than they do of building chips.
Re: (Score:2)
Texture units clearly aren't cores; they're largely passive data pipelines. If you look at a GPU more closely you can of course get far more complicated. The AMD architecture at the high end has two control flow cores with 24 SIMD coprocessors that execute blocks of non-control-flow, non-memory work. It is true that even those are hard to qualify as cores given their limited capabilities.
Without question a single SIMD lane is not a core, though.
Re: (Score:2)
You're definitely a nerd. All you do is bitch without providing the information you say the parent is lacking.
Re: (Score:2)
er... each of those cores has 16 things that aren't cores so they have lots of cores?
Each of the cores on an i7 has an AVX pipe (or two, depending on how you look at it) with 8 ALUs in it. Does that mean a quad i7 has 32 cores?
Re: (Score:2)
Throwing a shitton of SIMD units on a chip isn't that cool anymore. DSPs have been doing it forever. Real workloads require fast sequential code performance, and a GPU will have truly embarrassing results on such workloads.
Re: (Score:2)
"some useful algorithms are sequential" != "no useful algorithms are parallel"
Care to define "real workloads" for us, cowboy?
Re: (Score:1)
Re:more nukes :/ (Score:5, Insightful)
The latest GPUs already use more power than the hungriest Intel or AMD x86 ever did.
And when used for the types of tasks they were designed for, they pump out 10x the performance for maybe two or three times the power.
Re:more nukes :/ (Score:5, Informative)
Re: (Score:2)
As far as I know, while present GPUs do use a lot of power, they also produce a massive number of FLOPS compared to general-purpose processors. This means they actually have a lower power cost per FLOP.
Re: (Score:2)
Would you rather have to power a supercomputer sporting 1024 Intel CPUs? Which is going to be a bigger power hog? Which will scale better?
Re:more nukes :/ (Score:4, Informative)
If all you're measuring is pure FLOPS, then here are some numbers: Cray X1, 250MFLOPS. nVidia Fermi: 1GFLOPS. ARM Cortex A8: 10MFLOPS. Of course, that doesn't tell the whole story. Getting 250MFLOPS out of the Cray required writing everything using huge vectors. Getting 1GFLOPS from Fermi requires using vectors within independent concurrent processing tasks which access memory in a predictable pattern and rarely branch.
GPUs are not magic, they are just optimised for different workloads. CPUs are designed to work well with algorithms that branch frequently (every 7 instructions or so - so they devote a lot of die area to branch prediction), have good locality of reference (cache speeds up memory in these cases), and have an integer-heavy workload. GPUs generally lack branch prediction, so a branch causes a pipeline stall (and, on something like Fermi, if two kernels take different branches then you drop to 50% throughput immediately). Their memory architecture is designed to stream large blocks of data in a few different orders (e.g. a texture cube, pixels in order along any axis). So, depending on your workload, either one may be faster than the other.
Re: (Score:2)
Re: (Score:2)
nVidia has some serious processing power.
Re: (Score:2)
Nvidia Fermi (GTX 400 series [wikipedia.org])
GTX 470: 1088.64 GFLOPS (32-bit); 215 W (mfg. claim); $350; 3e9 transistors; 1280 MB GDDR5; 448 unified shaders : 56 texture mapping units : 40 render output units.
GTX 480: 1344.96 GFLOPS (32-bit); 250 W (mfg. claim) to 500 W (tested max.); $500; 3e9 transistors; 1536 MB GDDR5; 480 unified shaders : 60 texture mapping units : 48 render output units.
Tesla M2050: 1030 GFLOPS (32-bit), 515 GFLOPS (64-bit); 3 GB ECC (M2070 is the same but with 6 GB ECC GDDR5).
IBM linpack test May 2009 [hpcwire.com]: $7K Xeon, 48GB : 80.1 GFLOPS, 11GFLP
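For anyone who wants the performance-per-watt comparison spelled out, here is a rough sketch using only the claimed peak figures quoted above for the two GeForce cards. The helper name `gflops_per_watt` is made up for illustration, and these are vendor peak numbers, not measured throughput, so treat the result as an upper bound at best.

```python
# Peak single-precision GFLOPS and board power (watts) as quoted in the
# comment above. These are claimed peaks, not benchmark results.
cards = {
    "GTX 470": {"gflops": 1088.64, "watts": 215.0},
    "GTX 480": {"gflops": 1344.96, "watts": 250.0},
}

# The comment also quotes a 500 W "tested max." figure for the GTX 480.
GTX480_TESTED_MAX_WATTS = 500.0


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Hypothetical helper: peak GFLOPS divided by power draw in watts."""
    if watts <= 0:
        raise ValueError("watts must be positive")
    return gflops / watts


for name, spec in cards.items():
    ratio = gflops_per_watt(spec["gflops"], spec["watts"])
    print(f"{name}: {ratio:.2f} peak GFLOPS/W at the claimed board power")

worst_case = gflops_per_watt(cards["GTX 480"]["gflops"], GTX480_TESTED_MAX_WATTS)
print(f"GTX 480 at the 500 W tested maximum: {worst_case:.2f} peak GFLOPS/W")
```

The GTX 480 figure halves if you use the tested maximum instead of the manufacturer's claim, which is the main reason to be careful about which power number goes in the denominator.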
Re:more nukes :/ (Score:4, Informative)
5 of Top 10 most green supercomputers use GPUs:
Green 500 List [green500.org]
Each GPU is very high performance and so high power. Performance per watt is what counts, and here GPUs beat CPUs by 4 to 5 times. This is why so many of the new supercomputers are using GPUs / heterogeneous computing.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Exactly, CUDA has become a major player in the field of supercomputing, just like IBM's PowerPC/BlueGene systems. With support for floats/doubles, amazingly fast math functions, and tons of data in matrices, the only other way to do all that math fast is an FPGA or a PowerPC chip.
CUDA, Supercomputing for the Masses: Part 1/20
http://drdobbs.com/cpp/207200659 [drdobbs.com]
Re: (Score:2)
Re: (Score:2)
On several of my distributed tasks, the SSE2 version takes about 36 hours on one core, or about 6 hours per task on average if using all 8 threads (assuming an optimistic 50% scaling from hyper-threading). My GPU, on the same work units, takes only 1 min 40 sec. The GPU is about 216 times faster at slightly under twice the power: 60% more power draw, 21,600% of the performance.
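The arithmetic behind those figures can be checked in a few lines. This is only a sketch of the poster's own numbers; the one assumption added here is that "8 threads" means 4 physical cores with 2 hyper-threads each, which is what makes 36 hours come out to 6.

```python
# All inputs come from the comment above; nothing here is measured.
cpu_one_core_hours = 36.0   # SSE2 build on a single core
physical_cores = 4          # assumption: 8 threads = 4 cores x 2 hyper-threads
ht_scaling = 0.5            # the "optimistic 50% scaling" from hyper-threading
gpu_seconds = 1 * 60 + 40   # 1 min 40 sec per work unit
power_ratio = 1.6           # "60% more power draw" than the CPU

# Each core counts as 1.5 cores' worth of throughput with hyper-threading.
effective_cores = physical_cores * (1 + ht_scaling)
cpu_hours_per_task = cpu_one_core_hours / effective_cores

# Speedup is CPU time per task over GPU time per task, in the same units.
speedup = cpu_hours_per_task * 3600 / gpu_seconds

# Work done per unit of power, relative to the CPU.
perf_per_watt_gain = speedup / power_ratio

print(f"CPU time per task with all threads: {cpu_hours_per_task:.1f} h")
print(f"GPU speedup: {speedup:.0f}x")
print(f"Throughput per watt relative to the CPU: about {perf_per_watt_gain:.0f}x")
```

Even with the power draw counted against it, the GPU comes out two orders of magnitude ahead on this workload.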
I wouldn't compare GPU vs CPU for power draw
Re: (Score:2)
Re: (Score:1)
More like parallel supercomputing... (Score:1)
Re: (Score:2)
Yes, but if you check out the top500.org - the list of the 500 known fastest 'supercomputers' you'll see that they all achieve their benchmark through parallelizing their tasks across multiple cores.
I think it is safe to say that all modern supercomputers achieve their 'power' in this way - I've not seen any terahertz single-core/processor systems on the horizon, and don't expect to see them.
SGI (Score:5, Funny)
Re: (Score:2)
No, it's SGI version 4, which is NVIDIA supercomputing version 3 (at least, or so).
They outline a plan like this at least once every two years...
Oh how rich the irony... (Score:1)
The company that was set up by disgruntled Silicon Graphics gfx division employees because the SGI gfx tech was suffering from toxic internal politics and the push into Big Iron and Storage... is now moving into 'Supercomputing'. Hope they bring back the Cube Logo :)
Massive parallel coprocessor (Score:2)
I doubt it would be truly useful, but I'd like to see a 2 million core processor. Arranged in, let's see, a 1920 x 1080 grid. The 8008 used 3500 transistors per core, so even before memory, it'd be a 7 billion transistor chip.
More practical might be a 128 x 128 core processor, using a modified 386 or 68020 for cores. That could be less than 5 billion transistors. Each processor is simple and well known enough that hand optimized assembly begins to make sense again.
Run the little bastard at just 1 GHz a
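A quick sanity check of the transistor budgets above, as a sketch. The 3,500 figure for the 8008 is the poster's; the 275,000 figure for the 80386 is an assumption added here (the commonly cited count for the original part), and the helper name is made up. Memory, interconnect, and I/O are ignored entirely.

```python
I8008_TRANSISTORS = 3_500     # per-core figure quoted in the comment above
I386_TRANSISTORS = 275_000    # assumption: commonly cited count for the original 80386


def chip_transistors(cols: int, rows: int, per_core: int) -> int:
    """Hypothetical helper: transistors for a cols x rows grid of identical cores."""
    return cols * rows * per_core


pixel_grid = chip_transistors(1920, 1080, I8008_TRANSISTORS)
small_grid = chip_transistors(128, 128, I386_TRANSISTORS)

print(f"1920 x 1080 grid of 8008-class cores: {pixel_grid / 1e9:.2f} billion transistors")
print(f"128 x 128 grid of 386-class cores:    {small_grid / 1e9:.2f} billion transistors")
```

Both totals land where the comment says they do: a bit over 7 billion for the one-core-per-pixel grid and under 5 billion for the 128 x 128 grid.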
Re: (Score:2)
Re: (Score:2)
The 80386 and 68020 didn't have any caches to speak of. Put 16,384 of them together and you'll find yourself several orders of magnitude short of the DRAM bandwidth necessary to keep them occupied.
Yet another reason to begrudge nVidia (Score:2, Interesting)
"Supercomputing" almost always means "massive Linux deployment and development." I will spare critics the Wikipedia link on the subject, but the numbers reported there almost say "Supercomputing is the exclusive domain of Linux now."
Why am I offended that nVidia would use Linux to do their Supercomputing thing? Because their GPU side copulates Linux users in the posterior orifice. So they can take, take, take from the community and when the community wants something from them, they say "sorry, there's n
Re:Yet another reason to begrudge nVidia (Score:5, Insightful)
Gotta love the rabid GPL fans. The GPL doesn't mean freedom for everyone to do things the way you think they should be done.
Re: (Score:2)
I know they publish a driver for Linux. Trouble is, I can't use it because they won't tell us how to make it work through their "Optimus technology." I had high hopes for my newest machine only to have them dashed to bits with the words "we have no plans to support Optimus under Linux..."
Re: (Score:2)
Re: (Score:2)
Let's just let the market forces do their thing here. Personally, I tell anybody I hear thinking about buying NVIDIA to buy AMD instead. Sure, you might get a few more fps today, but tomorrow you may find your card unsupported by the manufacturer with no documentation available to end users on how to fix problems they may encounter in the future. NVIDIA dug their grave, let them sleep in it.
Re: (Score:2)
I tell anybody I hear thinking about buying NVIDIA to buy AMD instead. Sure, you might get a few more fps today, but tomorrow you may find your card unsupported by the manufacturer with no documentation available to end users on how to fix problems they may encounter in the future.
AMD no longer support my integrated ATI GPU; I had to manually patch the driver wrapper source to make it work after recent kernel changes and I'm guessing that before long it will be too rotted to work at all.
There is an open source driver but it doesn't work with my monitor resolution and performance is awful. So my solution before I discovered I could patch the source was going to be buying the cheapest Nvidia card I could fit into the computer.
ATI - no Linux heaven (Score:2)
I bought an ATI card (HD 3800) and its Linux driver sucks; I can't use it for gaming or 3D art. (If I try to run Blender, it won't display some menu elements, and looks totally broken.) It only works decently on Windows. So the funny thing is, I can't use open source software (Blender) with a video card that's supposedly open source friendly on an open source operating system (Linux; I tried it with several distros).
The funny thing is that only nVidia and Intel have decent drivers for Linux. So it's not a
Re: (Score:2)
Look, if you tell everyone something like that, you have to change your stance every few years as the companies' offerings change. S3s made sense at one point in time.
Who cares about tomorrow? TNT2s are as worthless as Matrox Millenniums.
Re: (Score:2)
Re: (Score:2)
Maybe (I'm not a guru by very far, so I'm going out on a limb here) you could even have them running on different ttys and switch semi-on the fly. Would the discrete GPU be shut down if you are on the integrated tty?
Some explanation: I reserved some space on my nettop with ION2. Some day I might want
short little span of attention (Score:3)
If we all buy AMD's product on the virtue of their openness, it won't be long before AMD holds the upper hand on features and stability. I think they're heading in a good direction already.
How much entrenched advantage does inferior need before you lock in? Your personal FIR filter on "what have you done for me lately" seems to have unit delay of hours rather than years.
Re: (Score:2)
haters hate (Score:2)
I'm sorry if nvidia won't gut their business to satisfy your irrational request.
3 of Top5 Supercomputers already use NVIDIA GPUs (Score:2)
3 of the Top 5 supercomputers are already using NVIDIA GPUs:
NVIDIA press release [nvidia.com]
Bill Dally outlined NVIDIA's plans for Exascale computing at Supercomputing in Nov 2010:
Bill Dally Keynote [nvidia.com]
Re: (Score:2)
One thing I've been really keen to know is what the utilisation is like on those supercomputers. We know they can do LINPACK really fast and more efficiently than the CPUs do; that's what you get for having a high ALU density, a few threads per core and wide SIMD structures. The question is: out of the algorithms that people intended to run on those supercomputers, what level of efficiency are they hitting?
Are they still a net gain over a standard opteron-based machine? They may be, but I don't know th
Re: (Score:2)
Double-precision linpack performance increase over a CPU-only system is ~288GFLOPS per Tesla M2050 card (up to 4 cards per system - adding more doesn't help without going to exotic motherboards. See news report of IBM study [hpcwire.com]). Raw performance is 515GFLOPS/card (double precision), so you're looking at ~56% utilization. Others report 53% overall on a massively parallel setup ( See: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5470353 [ieee.org] )
A rough rule of thumb for linpack double-precision is 25% of the
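The utilization figure quoted above is just the ratio of the two numbers in that comment. A minimal check, using only the poster's figures (variable names are made up for illustration):

```python
# Both figures are taken from the comment above (Tesla M2050, double precision).
linpack_gain_gflops = 288.0   # LINPACK gain over a CPU-only system, per card
peak_dp_gflops = 515.0        # raw double-precision peak, per card

utilization = linpack_gain_gflops / peak_dp_gflops

print(f"LINPACK utilization per card: {utilization:.1%}")
```

That comes out at roughly 56%, matching the comment, and sits close to the 53% reported for the massively parallel setup.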
Re: (Score:2)
Maybe they will need better branch prediction in their pipeline?  :p
eh.. (Score:2, Interesting)
I've been working with their GPGPU push for a couple of years now. What I notice is they are very good at data parallelism with highly regular data access patterns and very few branches. While they are technically general purpose, they don't perform well on a large portion of high performance tasks that are critical even in scientific computing which are generally compute-bound. This creates some really annoying bottlenecks that simply cannot be resolved. They can give tremendous speedup to a very limited s
Re: (Score:2)
where the programmer time involved was increase exponentially
Exponentially with what variable?
Amdahl's "law" of parallel speedup revisited (Score:2)
http://en.wikipedia.org/wiki/Amdahl's_law [wikipedia.org] Really, can anyone educated enough to do scientific programming NOT know what to expect?
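For readers who have not run the numbers, here is a small sketch of the standard Amdahl's law formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the runtime that can be parallelized and n is the number of processors. The function name and the example fractions are illustrative only, not taken from any comment here.

```python
def amdahl_speedup(parallel_fraction: float, n_processors: float) -> float:
    """Upper bound on speedup when only part of a program can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


# Illustrative values only: how the bound behaves with 448 parallel units.
for p in (0.50, 0.90, 0.95, 0.99):
    bound = amdahl_speedup(p, 448)
    print(f"parallel fraction {p:.2f}: at most {bound:.1f}x on 448 units")
```

With 5% of the runtime stuck in serial code, the bound stays just under 20x no matter how many units are thrown at it, which is the point the subject line is making.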
NVIDIA? please AMD (Score:2)
The current lineup of AMD GPUs has far more stream processors than the NVIDIA models, and runs at roughly the same clock speed. Why would anybody buy the NVIDIA ones?
Re: (Score:2)
The number of "stream processors" doesn't necessarily scale as a linear performance metric. As an example (using dated lower midrange hardware, as it's what I still know), a Radeon 3850 sports 320 stream processors. My Geforce 9600GT advertises 64 SPs, yet pulls ahead of the 3850 in many benchmarks. It's not as simple as quoting a number used in marketing material as a universal metric, any more than a 3 GHz Pentium 4 is 50% faster in real-world performance than a 2 GHz Athlon64.
As for the other issue, N
Re: (Score:1)
Re: (Score:2)
Because nVidia has CUDA firmly entrenched in the scientific community by now. And CUDA almost works by now, that is, the most glaring bugs have been eradicated. Oh, and it even works on Linux!
Does AMD have support for doubles on their chips by now? Honest question here. It's a practically useless feature for graphics, but it makes a lot of sense for scientific computing.
Re: (Score:1)
The HD 5870 GPU has very good DP performance. Auto-tuned DGEMM reaches about 65% device utilization in my experience. This agrees with benchmarks done by Dongarra's group, I believe.
AMD hardware is powerful. But the software stack is relatively behind in supporting it. However, I don't think this is the dominating cost of adoption.
If you come from the nVidia world, kernels for AMD look completely different. I don't have enough experience to say it is harder in the AMD world compared with nVidia. I can say i
Re: (Score:3)
Re: (Score:2)
Well, I'm still failing to see nVidia putting their money where the mouth is on that one. The last time I checked their OpenCL implementation, a lot of the demos that were ported over from CUDA ran slower - 10 times slower in the case of the volume rendering example. So this is not how you get to impress people who are solely concerned with performance. Oh, and unlike the CUDA compiler, the in-process OpenCL compiler even segfaulted on me within about 4 hours of playing with nVidia's OpenCL implementation (
Re: (Score:1)
NVIDIA is dragging their heels with OpenCL. They have yet to publicly release an OpenCL 1.1 compliant driver, despite the fact they have had a beta version for about 8 months. They are also slow to respond in their forums, and many problems/bugs that were reported at least a year ago still have not been fixed. They are throwing their weight behind CUDA, plain and simple. CUDA 4.0 just came out, and has some phenomenal technologies that make me wonder if OpenCL has a fighting chance.
I think it does, but it i
NVIDIA just needs to die (Score:2)
They chose to not release the necessary specs to allow others to utilize their hardware the way Intel and to a lesser extent AMD did, and as the current smartphone trend has shown, locked in is the same as being locked out.
Re: (Score:2)
Curious to mention Intel; their GPUs offer less realistic programmability.
And back in the day... (Score:2)
... when GPGPU was in its infancy and I was lusting to play with that stuff; that's about 5 yrs ago, at most.
Alas, our semiconductor department was so content with its orthodoxy and cluster running Fortran WTF hairballs...  :`(  :>
Ah well, no point crying over that spilt milk... it just takes patience and pig headedness...
Re: (Score:2)
Eh, it's a logical step. Graphics is, has been, and will always be about parallelism and matrices. Supercomputing is almost always about simulation and high-order computation, which works out to the same thing. Really good graphics hardware, thirty years ago, or now, or thirty years in the future, will always be good science hardware, and supercomputing is driven by science.
Linux support for Nvidia (Score:1)
Re: (Score:2)
Let us know why you think this is a bad idea.
I think it's a great idea. Intel keeps putting out chipsets with video on-board, and this has to hurt nVidia's core business. If they make inroads into other areas where Intel is now dominant, and can do it without going broke, then that puts them in a nicer position.
Re: (Score:2)
Didn't their licenses expire on some bus or other, preventing them from making chipsets for Intel CPUs? The press release I saw said the recent $1.5b deal excluded certain chipsets. They probably aren't too interested in making AMD chipsets these days. Large racks of MIPS/ARM CPU & Fermi GPU systems make sense to me. Top-end graphics cards will die off soon thanks to consoles & Hollywood. Even multi-monitor gaming won't slow that by much. In a generation or two even low-end graphics cards will probab
Re: (Score:1)
Re: (Score:2)
"1080p" isn't that impressive of a resolution for PC games.
Exactly, but because most newer games are made with 'console capable' engines designed to run at 1080p, you'll see fewer and fewer games making use of the extra power PCs have (especially as not everyone has higher-spec PCs). The same is true for console games, with most games being designed for Xbox 360 capabilities and not as much effort going into improving that for PS3 gameplay.
It's not entirely a bad thing, as there is a very small chance some of that effort might be re-channelled into improving gameplay and not just
1080p is more than enough for Nethack (Score:2)
1024x768 wasn't wide enough to play the graphical version of Nethack without scrolling. 1280x1024 is almost but not quite enough, and 1440 or above works just fine.
Of course, 24x80 was enough for the real version.
Re: (Score:2)
In a generation or two even low-end graphics cards will probably have the power to play 1080p games at full detail.
I suspect you are right, and that there will be a race for power efficiency like there is today on tablets/phones. "High end" will still exist, but the definition will change to power/performance rather than just raw performance.
And of course, powerful video cards will always be appreciated in the rendering world.
Re: (Score:2)
In a generation or two even low-end graphics cards will probably have the power to play 1080p games at full detail.
They do already, so long as you're playing games from 2003. The reason why you can play many modern games on max settings on mid-range cards is that those games have been crippled for the console market and simply cannot benefit from the power of a high-end card.
Re: (Score:2)
It's obviously a bad idea because it makes you extremely stressed out, which in turn reduces your life expectancy.
Other than that, GPGPUs are great for certain things, like supercomputers are good for certain things.
Why did you come here to post that anyway?
Re:Fuck you moderators (Score:5, Funny)
Re:Fuck you moderators (Score:4, Insightful)
I can NOT fucking believe there was not already a troll account called moderators long, long, LONG before UID 2M.
Re: (Score:2)
The kid obviously has a bright future ahead of him. He's got moxie.