Larrabee ISA Revealed
David Greene writes "Intel has released information on Larrabee's ISA. Far more than an instruction set for graphics, Larrabee's ISA provides x86 users with a vector architecture reminiscent of the top supercomputers of the late 1990s and early 2000s. '... Intel has also been applying additional transistors in a different way — by adding more cores. This approach has the great advantage that, given software that can parallelize across many such cores, performance can scale nearly linearly as more and more cores get packed onto chips in the future. Larrabee takes this approach to its logical conclusion, with lots of power-efficient in-order cores clocked at the power/performance sweet spot. Furthermore, these cores are optimized for running not single-threaded scalar code, but rather multiple threads of streaming vector code, with both the threads and the vector units further extending the benefits of parallelization.' Things are going to get interesting."
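For the non-graphics crowd, the model the summary describes is easy to picture in plain C++: many threads, each streaming over wide vector chunks. A minimal sketch, assuming nothing Larrabee-specific (std::thread stands in for hardware threads, the 16-float width matches the 512-bit registers, and every name below is mine):

    // Sketch of "multiple threads of streaming vector code": the outer loop
    // scales with core count, the inner 16-wide loop is what a vectorizing
    // compiler can map onto a single vector instruction.
    #include <cstddef>
    #include <thread>
    #include <vector>

    constexpr std::size_t kVectorWidth = 16;  // 512-bit registers hold 16 floats

    void scale_chunk(float* data, std::size_t begin, std::size_t end, float k) {
        for (std::size_t i = begin; i < end; i += kVectorWidth)
            for (std::size_t j = 0; j < kVectorWidth; ++j)
                data[i + j] *= k;
    }

    void scale_parallel(std::vector<float>& data, float k, unsigned cores) {
        // Assumes data.size() divides evenly by cores * kVectorWidth.
        std::vector<std::thread> pool;
        const std::size_t chunk = data.size() / cores;
        for (unsigned c = 0; c < cores; ++c)
            pool.emplace_back(scale_chunk, data.data(), c * chunk, (c + 1) * chunk, k);
        for (auto& t : pool) t.join();  // throughput scales ~linearly with cores
    }

Given software shaped like this, adding cores really does add performance, which is the whole bet.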
Duh (Score:5, Interesting)
That's what libraries, toolsets, and custom compilers are for. If the problem were just silicon, we'd have Larrabee by now. What's holding up the train is the software toolchain and software licensing issues.
Don't worry, though. On launch day the tools will be mature enough to use, and game vendors will have new ray tracing games that look fabulous on nothing but this.
I'm hoping the tools will be open but that's a long bet. If they are, Microsoft is done as the game platform for the serious gamer and Intel will make billions as they take the entire graphics market. Intel will make hundreds of millions regardless and a bird in the hand is worth two in the bush, so they might partner in a way that limits their upside to limit their downside risk. That would be the safe play. We'll see if they still have the appetite for risk that used to be their signature. I'm hoping they still dare enough to reach for the brass ring.
Structural engineering welcomes this. (Score:5, Interesting)
As a structural engineer in training who is starting to cut his teeth writing structural analysis software, these are truly interesting times in the personal computer world. Technologies like CUDA, OpenCL, and perhaps Larrabee are making it possible to put on any engineer's desk a system capable of analysing complex structures practically instantaneously. They will also push the boundaries of that sort of software: for example, modeling composite materials such as reinforced concrete through the plastic limit. That task involves simulating random cracks through a structure in order to find the lowest supported load, and with today's personal computers it takes hours just to run the test on a simple simply-supported, single-span beam.
So, to put this in perspective: this sort of technology will end up making construction projects cheaper, safer, and quicker to finish, all in exchange for a couple hundred dollars of hardware that a while back was intended for playing games. Good times.
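To make the crack-simulation point concrete, here is a toy sketch of that kind of embarrassingly parallel Monte Carlo load test in C++. The failure model and every name in it are invented for illustration; real software would swap in an actual finite-element solve per trial:

    // Toy model: each trial seeds random cracks along a beam and returns the
    // load at which the weakest section yields. Purely illustrative.
    #include <algorithm>
    #include <random>
    #include <thread>
    #include <vector>

    double trial_failure_load(std::mt19937& rng) {
        std::uniform_real_distribution<double> crack_depth(0.0, 0.3);
        double capacity = 1.0;  // normalized section capacity
        for (int section = 0; section < 100; ++section)
            capacity = std::min(capacity, 1.0 - crack_depth(rng));
        return capacity;
    }

    double lowest_supported_load(unsigned trials, unsigned workers) {
        std::vector<double> worst(workers, 1.0);
        std::vector<std::thread> pool;
        for (unsigned w = 0; w < workers; ++w)
            pool.emplace_back([&, w] {
                std::mt19937 rng(w + 1);  // independent seed per worker
                for (unsigned t = 0; t < trials / workers; ++t)
                    worst[w] = std::min(worst[w], trial_failure_load(rng));
            });
        for (auto& t : pool) t.join();
        return *std::min_element(worst.begin(), worst.end());
    }

Because every trial is independent, the work divides cleanly across however many cores (or vector lanes) the hardware offers, which is exactly what makes a Larrabee-style part attractive here.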
Re:WTF. I do not want moar x86. (Score:4, Interesting)
Isn't this exactly what Gallium3D + the LLVM GLSL compiler gives you? Heck, even with simple shader ISAs you probably want an optimizing compiler anyway in order to get good GLSL performance, no?
Wouldn't this actually be a good thing? Instead of spending all that time developing new drivers for each generation of hardware (changing every six months, poorly documented if at all), you could just keep developing the architecture and improving the x86 backend.
Re:Structural engineering welcomes this. (Score:3, Interesting)
That's a problem with the animator. You don't need complicated software to make good animation--Toy Story should be sufficient evidence of that. You just need talent. Less and less talent these days, actually: if you're playing a game where the avatars are floating, it's because the designers don't give a^H^H^H^H^H^H^H care enough to simulate motion properly.
As an aside, realism is frequently not a goal in animation. You tend to run up against the uncanny valley: all the characters look like zombies. Realism is what made "A Scanner Darkly" so painful to watch, especially as contrasted with "Waking Life".
I think Larrabee has somewhat more potential to improve ray tracing. Lighting in games these days seems like layers of, well, kludges. The code works, and it's fast, but it's an ugly, ugly solution.
Transcendental functions? (Score:3, Interesting)
The article states that there's hardware support for transcendental functions, but the list of instructions doesn't include any. Anyone know what is/isn't supported in this line?
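No idea what made the cut, but for context: vector hardware that lacks a full-precision transcendental instruction typically builds one from range reduction plus a short polynomial, since that is all multiply-adds. A scalar C++ sketch of the idea for exp(), using standard Taylor coefficients and emphatically not Intel's actual method:

    // Sketch: approximate exp(x) without a dedicated instruction.
    // Range-reduce so |r| is small, evaluate a short polynomial, scale by 2^n.
    #include <cmath>

    float approx_exp(float x) {
        const float log2e = 1.4426950408889634f;
        float n = std::floor(x * log2e + 0.5f);   // x = n*ln2 + r
        float r = x - n * 0.6931471805599453f;
        // 4th-order Taylor polynomial in r; each step is a multiply-add,
        // exactly the operation wide vector units are built around.
        float p = 1.0f + r * (1.0f + r * (0.5f + r * (1.0f / 6.0f + r * (1.0f / 24.0f))));
        return std::ldexp(p, static_cast<int>(n));
    }

So "hardware support" may just mean instructions that make sequences like this fast, rather than a single EXP opcode.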
Re:End of an era (Score:2, Interesting)
Plus Itanium killed off most of the RISC architectures, and x64 looks likely to kill off or nicheify Itanium.
This is misinformed B.S. Itanium didn't kill anything.
What killed them was (and still is) the triumphant march of Linux/x64.
It is true that Intel and HP made sacrificial lambs of PA-RISC and Alpha on the Itanic's altar. Yet Itanic never caught up (and never will) to the levels where PA-RISC and Alpha were in their time.
I bet a Larrabee-like CPU would be great in a server too, and it scales trivially by changing the number of cores.
Servers are I/O heavy; CPU parallelism is very much secondary. I doubt Larrabee would make any dent in the server market, unless of course OnLive or something similar catches on, or Intel adds something interesting for, e.g., XML processing.
Re:Not really x86 (Score:3, Interesting)
This isn't really x86, in my opinion; it's x86 with a separate set of very obviously graphics-oriented instructions bolted on top. Since getting decent performance will require using the new instructions and a new programming model almost exclusively, what's the point of the x86 bit?
The point is that there's stuff those graphics-oriented instructions are really not very good at, like indirect memory referencing and branching logic, both of which x86 excels at handling. Now, that kind of workload isn't common on GPUs _at the moment_, but both of those are common operations, for example, in ray tracing, so you may see them become more important over the next few years. What Intel are doing here is defining the GPU architecture for the next decade, and it's one that allows more complex algorithms to be implemented than can easily be done using the specialized stream processing systems we have at the moment.
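A concrete illustration of that workload: BVH traversal for ray tracing, which is mostly data-dependent loads and divergent branches. The types and stubs below are invented for illustration, not taken from the article:

    // Why ray tracing stresses branching and pointer-chasing: traversal
    // follows indices read from memory, and each ray branches differently.
    #include <cstdint>
    #include <vector>

    struct BvhNode {
        float bounds[6];         // AABB min/max
        std::int32_t left;       // child index, or -1 if leaf
        std::int32_t right;
        std::int32_t primitive;  // triangle index when leaf
    };

    bool hits_box(const float (&)[6]) { return true; }   // stub intersection tests
    bool hits_triangle(std::int32_t)  { return false; }

    bool trace(const std::vector<BvhNode>& nodes, std::int32_t root) {
        std::int32_t stack[64];  // fixed-depth stack, fine for illustration
        int top = 0;
        stack[top++] = root;
        while (top > 0) {
            const BvhNode& n = nodes[stack[--top]];  // indirect, data-dependent load
            if (!hits_box(n.bounds)) continue;       // branch diverges per ray
            if (n.left < 0) {
                if (hits_triangle(n.primitive)) return true;
            } else {
                stack[top++] = n.left;               // traversal order depends on data
                stack[top++] = n.right;
            }
        }
        return false;
    }

Each iteration's address depends on the previous load, and each ray takes its own path through the tree; that favors a general-purpose core over a pure stream processor.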
The other point behind the x86 bit is that not only did Intel already have core designs that implemented it (Larrabee simply has the new registers & instructions bolted on to an existing low-power Pentium-class core), enabling a faster time to market than if they'd developed entirely new hardware, they also have a massive amount of software support for the architecture, including one of the best optimizing C++ compilers there is. A new ISA would have required a new compiler, further complicating the project. As it is, only extensions to their existing compiler have been necessary.
Re:If Intel are smart they will mix Core and Larab (Score:3, Interesting)
Oddly enough your post ranks quite highly in that search. Drilling through the forums that show up reveals speculation that a 32-core Larrabee design will have a 300W TDP, or roughly 10W per core. There doesn't seem to be any justification for that number, although Larrabee looks like an Atom plus a stonking huge vector array. The Atom only uses 2W; it seems hard to believe that the 16-way vector array would use as much power per FLOP as the entire Atom power budget spent to deliver that FLOP. Or perhaps it will; it's all just speculation at this point.
So that 32-core processor would deliver 16x32 = 512 FLOP/clock peak. Judging by the efficiency of Intel's floating point units across the whole range (from Atom up to i7), I would guess they could deliver a low-power part clocked at 1GHz. That part would hit 512 GFLOP/s peak. Then it's just a guessing game of what clock speed they could ramp it up to within that 300W TDP: 2GHz? 3?
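Spelling out that arithmetic as a trivial C++ snippet (the configuration numbers are the speculated ones above; one FLOP per lane per clock is assumed, and a fused multiply-add would double everything):

    #include <cstdio>

    // Back-of-envelope peak throughput from the speculated configuration.
    int main() {
        const int cores = 32;
        const int lanes = 16;                          // 16-wide vector unit
        const double flops_per_clock = cores * lanes;  // = 512
        const double clocks_ghz[] = {1.0, 2.0, 3.0};
        for (double ghz : clocks_ghz)
            std::printf("%.0f GHz -> %4.0f GFLOP/s peak\n", ghz, flops_per_clock * ghz);
        return 0;
    }

At 1GHz that prints the 512 GFLOP/s figure above; 2GHz and 3GHz give 1024 and 1536.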
The real killer could be how much sustained throughput can be achieved on an x86 derivative. The Core 2's sustained throughputs were mental, but it used every OoO trick that Intel could throw at it. Without that advantage, the peak:sustained ratio will be closer to that of AMD's and Nvidia's current offerings.
Re:End of an era (Score:3, Interesting)
And what about on a per-Watt basis? (Honest question; though I do suspect the i7 is quite a bit more competitive there.)
Re:Missed it by *that* much (Score:3, Interesting)
If you watch large teams of programmers, management actually forces the developers to write slow code, claiming that maintainability is more important than any other factor!
I've worked in a couple of companies like that. Usually the programmers were limited to working on technology that the management (ex-programmers) were familiar with. Management also didn't want the programmers learning "high-demand skills" (i.e. hardware programming) that would boost the chances of their staff leaving for a better-paid environment. Or there was the politics of favoritism, where the directors wanted to give a leg up the seniority ladder to their best mate: everyone else who was qualified "didn't have the skills or was busy on another project" while of course their mate "had applied at just the right time with the right skills". Another problem was that if management gave only one programmer a new hardware system (e.g. a CPU porting project), everyone else would get so cheesed off at falling behind that they would leave. Alternatively, there are also quota-based systems which would set one nationality against another.
Invariably these companies gain a bad reputation and implode after a slow death spiral, where they are forced to lay off staff and sell off equipment to cover debts. With fewer staff, they can't take on new projects, and the cycle continues until the last project is cancelled.