
There's No Such Thing As a General-Purpose Processor 181

Posted by Soulskill on Saturday November 08, 2014 @08:57PM from the or-ghosts dept.

CowboyRobot writes: David Chisnall of the University of Cambridge argues that despite the current trend of categorizing processors and accelerators as "general purpose," there really is no such thing and believing in such a device is harmful.

"The problem of dark silicon (the portion of a chip that must be left unpowered) means that it is going to be increasingly viable to have lots of different cores on the same die, as long as most of them are not constantly powered. Efficient designs in such a world will require admitting that there is no one-size-fits-all processor design and that there is a large spectrum, with different trade-offs at different points."




There are different workloads, duh. (Score:4, Insightful)

by Anonymous Coward writes: on Saturday November 08, 2014 @09:00PM (#48342891)

David Chisnall says: parallel is not the same as in series. World gasps.

- Re: (Score:3)
  
  by Jane Q. Public ( 1010737 ) writes:
  
  David Chisnall says: parallel is not the same as in series. World gasps.
  It's worse than that. TFA's basic premise that "there is no such thing as a general-purpose processor" is just flat wrong. Of course there is. His real argument is about how to make them efficient, which is a different thing and very much contrary to his title and introduction.
  
  Anything that can implement a Turing machine *IS* a general-purpose processor, by definition. And any general-purpose processor can do what any other general-purpose processor can do... although not necessarily fast or efficiently
Specialization is for insects (Score:3)

by rossdee ( 243626 ) writes: on Saturday November 08, 2014 @09:00PM (#48342895)

According to Lazarus Long.
The same should be true for AI

- Re: (Score:3)
  
  by rasmusbr ( 2186518 ) writes:
  
  According to Lazarus Long.
  The same should be true for AI
  If that analogy holds in more than one way, then I suppose that specialized AI models will appear earlier in history and will be vastly more numerous, resilient and long-lived than more generalized AI models.
  The more generalized AIs will probably want to reach for a specialized-AI swatter every now and then.
  - Re: (Score:2)
    
    by wierd_w ( 1375923 ) writes:
    
    and that would seem to be on the money...
    Imagine the reaction of a hypothetical future smart-AI responding to the niggling bullshit of a spambot...
- Re: (Score:2)
  
  by Zorpheus ( 857617 ) writes:
  
  The current "general purpose" processors are also specialised though, for example for algorithms with a low number of threads. A processor with the combination of several specialised cores is less specialised, since it is good at everything.
Saturday is Semantics Day (Score:3, Insightful)

by Crashmarik ( 635988 ) writes: on Saturday November 08, 2014 @09:11PM (#48342939)

It isn't hard to understand: when people talk about specialty processors, they are talking about things that are explicitly designed to make particular tasks easier,
vs. general processors, which are designed to have wider application while not being as fit for any particular task.
Not going to say this is correct, but it's pretty easy to put together the exact opposite of the author's argument: that specialty processors should be treated very carefully and their use limited. After all each time you add a different type of specialty processor into an environment, you introduce another codebase for the application, another toolchain to learn and another set of communication / OS support issues.

- Re: (Score:3)
  
  by phantomfive ( 622387 ) writes:
  
  It isn't hard to understand: when people talk about specialty processors, they are talking about things that are explicitly designed to make particular tasks easier, vs. general processors, which are designed to have wider application while not being as fit for any particular task.
  I believe his point (in the context of high performance algorithms) is that 'standard' processors vary so much in their performance, and in the cases they optimize for, that there is no general-purpose processor you can assume will behave in a certain way for a given algorithm. Some algorithms will work better on processor A, and others on processor B, even though they both are 'general purpose' processors.
  
  In other words, if you are studying high-performance algorithms, and are writing papers about high-
- Re: (Score:2)
  
  by __aaltlg1547 ( 2541114 ) writes:
  
  As an electronics designer, I believe he's dead wrong. Specialty ICs become obsolete almost immediately on deployment. You can replace a general purpose processor, or a special purpose processor with a general purpose processor. And you can upgrade the function of a product.
  Fuck overspecialization.
  - Re: (Score:2)
    
    by Crashmarik ( 635988 ) writes:
    
    So true, remember when C-Cube was going to be the only way to encode/decode video ?
    - Re: (Score:2)
      
      by Zontar The Mindless ( 9002 ) writes:
      
      Oh, yeah, the video-on-a-chip-and-only-on-a-chip people. Was always surprised that Sun didn't try to sue them over their logo.
  - Re: (Score:2)
    
    by solidraven ( 1633185 ) writes:
    
    True, there is a need for speciality IP blocks though for those few applications where it does matter. But at that point using an ASIC is probably not the best choice.
    
    What I do expect to see, given the recent Intel announcement, is FPGAs showing up more and more as co-processors. There is a lot of speed to be gained by reconfiguring the hardware when you have to crunch through a few gigabytes of data, like decoding/encoding a video stream or running a query on a massive database. The only real "speciality
    - Re: (Score:2)
      
      by __aaltlg1547 ( 2541114 ) writes:
      
      There are a lot of FPGA-based SOCs that contain embedded microprocessor cores (e.g. Xilinx's Virtex and Zynq, Altera's Cyclone, Arria and Stratix families, Microsemi's SmartFusion2). We may see that flip the other way so there's a very high-end core or several of them in a SOC, with FPGA logic to allow pin reconfiguration and large CLBs for speeding up or offloading processes.
      That might spawn new operating systems that manage how you use and configure the CLB and pin configuration. Do they have
- Re: (Score:2)
  
  by Sqr(twg) ( 2126054 ) writes:
  
  After all each time you add a different type of specialty processor into an environment, you introduce another codebase for the application, another toolchain to learn and another set of communication / OS support issues.
  That will be an issue only for the OS and library developers. To the applications developer there will be no noticeable difference. It is already the case that you need to use specialized libraries to get maximum performance on common types of tasks.
  For example, if you want to use an FFT on a modern "general purpose" processor, you will get much better performance using a standard library function than you would if you wrote your own. There are so many issues with memory access patterns, core and cache utili
If it's fast enough, "general purpose" is fine (Score:4, Insightful)

by Anonymous Coward writes: on Saturday November 08, 2014 @09:21PM (#48342983)

If a "general purpose" processor solves your problems fast enough, it's good enough.
How the fuck is that "harmful"?
Geez, you'd think TFA is just a blowhard looking for page hits.

- Re: (Score:3)
  
  by JaredOfEuropa ( 526365 ) writes:
  
  How the fuck is that "harmful"?
  Because every time you believe in a general purpose processor, a kitten dies
- It's just a bullshit semantics game (Score:3)
  
  by Sycraft-fu ( 314770 ) writes:
  
  Guy is trying to play silly distinction games. Really, everyone in tech understands what people mean when they say "general purpose processor." Yes, said unit may have some specialized circuits and such, but it is made to be good at dealing with all kinds of problems. Integer, FP, branching, linear, etc.: it doesn't matter, its design can handle them all reasonably well.
  That compares to something specialized like a GPU. For certain kinds of problems, specifically single precision vector math with fairly consisten
- Re: (Score:2)
  
  by Stephan Schulz ( 948 ) writes:
  
  If a "general purpose" processor solves your problems fast enough, it's good enough.
  How the fuck is that "harmful"?
  You miss the point. It's not the "general purpose processor" that is harmful per se. What is harmful is the labelling of a certain class of processors as "general purpose", when, in the view of the author, they are not really general purpose, but specialised for executing C code with, at most, mid-sized working sets and little inter-processor communication. By assuming this workload as the default and calling processors good for it "general purpose", we may miss other approaches that might be more suitable
  - Re: (Score:2)
    
    by ChrisMaple ( 607946 ) writes:
    
    C is a general purpose language, perhaps the most general purpose language. Processors optimized for C code are by default general purpose.
    - Re: (Score:2)
      
      by jedidiah ( 1196 ) writes:
      
      A general purpose processor is intended to do anything.
      General purpose processors are based on the idea that they aren't superstars at any one particular task. So they are pushed to perform as well as the tech will allow. This allows them to beat even the speciality silicon.
      Also, not all speciality coprocessors are created equal.
      A weak (but cheap) special purpose coprocessor will still underperform a general purpose CPU that's not hamstrung with certain peculiar engineering considerations.
      A general purpose p
- - Re: (Score:3, Informative)
    
    by Anonymous Coward writes:
    
    Because it is inefficient. In addition to higher energy bills, a less efficient architecture means shorter battery life in a mobile device, more noise in a desktop PC, and fewer servers per rack in a datacenter.
    There are "general purpose" microcontrollers that use microwatts of power and can run on one tiny watch battery for years.
    For example, http://www.microchip.com/wwwpr... [microchip.com]
    From datasheet,
    * 30 microAmps per MHz
    * 20 nanoAmps in sleep
    So a 100mAh 3V watch battery would last 570 years on sleep mode and 3-5 months operating at 1MHz. Or at 31kHz, with some sleep it should operate for years on a button cell.
    And it's a fully programmable, general purpose microcontroller.
    So what's the problem? Too ineff
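The battery-life estimate in the comment above can be checked with a few lines of arithmetic. This is a rough sketch only: the function and constant names are made up for illustration, the 31 kHz figure assumes active current scales linearly with clock frequency (the comment does not say so), and battery self-discharge is ignored.

```python
# Rough battery-life check using the figures quoted from the datasheet:
# 30 uA per MHz when running, 20 nA in sleep, 100 mAh watch battery.
HOURS_PER_YEAR = 24 * 365.25
HOURS_PER_MONTH = HOURS_PER_YEAR / 12


def battery_life_hours(capacity_mah, current_ua):
    """Hours a battery of capacity_mah lasts at a constant draw of current_ua."""
    return capacity_mah * 1000.0 / current_ua  # mAh -> uAh, then divide by uA


sleep_years = battery_life_hours(100, 0.020) / HOURS_PER_YEAR  # 20 nA = 0.020 uA
active_months = battery_life_hours(100, 30.0) / HOURS_PER_MONTH  # 1 MHz -> 30 uA

# Assumption: current scales linearly with clock, so 31 kHz draws 30 * 0.031 uA.
slow_years = battery_life_hours(100, 30.0 * 0.031) / HOURS_PER_YEAR

print(f"sleep only:      {sleep_years:.0f} years")
print(f"running at 1MHz: {active_months:.1f} months")
print(f"running at 31kHz (linear-scaling assumption): {slow_years:.1f} years")
```

Under these assumptions the sleep-only figure comes out at roughly 570 years and continuous 1 MHz operation at between four and five months, in line with the numbers in the comment.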
  - Re: (Score:3, Insightful)
    
    by Ihlosi ( 895663 ) writes:
    
    Because it is inefficient.
    And building a separate processor for each of the nearly infinite number of possible tasks out there isn't?
    Especially when it comes to integrated circuits, mass production of one product is what makes the production process cheap and efficient.
    Also, at some point, you need to say "good enough/fulfills our requirements". Yes, you might save a bit of power by coming up with your own chip design, but designing an ASIC is not a trivial task and in the end your product might be
- - Re: (Score:2)
    
    by Thor Ablestar ( 321949 ) writes:
    
    You are not right. There are lots of useful strictly sequential algos. The mplayer, for instance: it loads one of 8 cores to 100% while the other cores are idle. Some of its codecs can be parallel, most not. Some programs are intentionally sequential, litecoin miner for instance.
    And since I spend power for heating, the 100W processor gives as much heat as 100W resistor. Heat pump cannot operate since temperature difference is usually big enough.
    - Re: (Score:2)
      
      by TechyImmigrant ( 175943 ) writes:
      
      >And since I spend power for heating, the 100W processor gives as much heat as 100W resistor. Heat pump cannot operate since temperature difference is usually big enough
      Yes. This is why the power efficiency of a computer is strictly a function of how you define 'useful work'.
Clickbait Caption, but Valid Arguments (Score:4, Insightful)

by gentryx ( 759438 ) writes: on Saturday November 08, 2014 @09:24PM (#48342997) Homepage Journal

Of course general purpose CPUs exist, simply because we call them that way. But it is also true that each design has its own strengths, and "dark silicon" is another driver for special purpose hardware. Efficiency is another. Andrew Chien has published some interesting research [illinois.edu] on this subject. In his 10x10 approach he suggests using 10 different types of domain-specific compute units (e.g. for n-body, graphics, tree-walking...), each of which is 10x more efficient than "general purpose CPUs" in its domain (YMMV). Those compute units, bundled together, make up one core of the 10x10 design. Multiple cores can be connected via a NoC.
Let's see how software will cope with this development...
ps: can special purpose hardware exist if general purpose hardware doesn't?

- Re: (Score:2)
  
  by Firethorn ( 177587 ) writes:
  
  ps: can special purpose hardware exist if general purpose hardware doesn't?
  
  Yes it can. After all the first 'computers' were dedicated code-breakers and such.
  However, there are many different levels of 'general purpose' or 'specialized'. Earlier 'cars' were mentioned as 'general purpose', but think about it, they're actually pretty specialized - they're generally designed for a max of 4-6 passengers, carry maybe a couple hundred pounds of cargo besides the passengers, drive mostly on paved roads, have a range of somewhere between 300 and 400 miles, etc... But within that you hav
  - Re: (Score:2)
    
    by drinkypoo ( 153816 ) writes:
    
    Earlier 'cars' were mentioned as 'general purpose', but think about it, they're actually pretty specialized - they're generally designed for a max of 4-6 passengers, carry maybe a couple hundred pounds of cargo besides the passengers, drive mostly on paved roads, have a range of somewhere between 300 and 400 miles, etc...
    The first automobiles were curiosities. But as soon as someone figured out you could move people, they knew you could move cargo, and the ones immediately following the first passenger cars were trucks. And a pickup truck is a general-purpose vehicle. It's not particularly good at anything, but it can do a little bit of everything. A sedan is a special-purpose vehicle for moving people and their personal cargo.
- Re: (Score:2)
  
  by Blaskowicz ( 634489 ) writes:
  
  I believe the Atari Jaguar was a console made of special purpose chips, with an "orchestrator" 68000 CPU tacked on. The propaganda said you're not meant to use that CPU or something to that effect but of course the games heavily relied on it, some of them Amiga ports. The console was a big failure, lacking a usable SDK and documentation. Now consoles have an OS, APIs, middleware.
  - Re: (Score:2)
    
    by dbIII ( 701233 ) writes:
    
    I'd say by that point Atari was a huge failure as an entity instead of the actual device, as seen with other things where they developed products but did not follow through.
- Re: (Score:2)
  
  by kesuki ( 321456 ) writes:
  
  "Of course general purpose CPUs exist, simply because we call them that way."
  the wisdom of those real world coders gone from this world is thus. a jack of all trades, is a master of none.
  this means simply that a general purpose FPGA that can modulate its functions can do a lot of different things but not at the same time, and in etched hardware the trade off is having dark silicon for all the tasks a true jack of all trades cpu can do.
  for instance, when i was doing video games all day the more games i play
  - Re: Clickbait Caption, but Valid Arguments (Score:2)
    
    by O('_')O_Bush ( 1162487 ) writes:
    
    "jack of all trades, is a master of none"
    
    You do know that is just an idiom, and an incorrect one, at that, not a law?
    
    The idiom is "A jack of all trades is better than a master of one", which was later shoehorned into a description of someone who is "a Jack of all trades but a master of none". It is not intended to be folk wisdom that there is a tradeoff between mastering something and being proficient at lots of things... in fact, there are numerous examples of people being masterful in many arenas and also
    - Re: (Score:2)
      
      by jkflying ( 2190798 ) writes:
      
      "Jack of all trades, master of none" is the correct saying, it is just missing the ending: "still better than a master of one."
      - Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        What a load of bollocks. If you've got a tooth abscess you want a master of dentisfuckingtry and you won't give a rat's ass how good he is at carpentry or playing the guitar.
- Re: (Score:2)
  
  by mlts ( 1038732 ) writes:
  
  What we might have happen is that we end up with a mix, where a core is weighted towards a task... but compared to running a job at say, 80% as effectively as a core that is built for the job, versus not running the task at all, the scheduler [1] would drop tasks on non-optimal cores if it would help performance. If it is something definitely not optimal (FPU instructions on an integer-only core), the weighting would account for that and might not even place a task on there come the next quantum.
  The 10x10
- Re: (Score:2)
  
  by K. S. Kyosuke ( 729550 ) writes:
  
  I've had similar ideas about specializing the Forth cores of the F18 design in a grid. The good thing is that the specific instruction set extensions can be simply substituted by subroutine calls if you need to use them on other cores in cases of low dynamic frequency of their usage - the code is essentially concatenative, so it's a matter of simple string substitution. Even the OS loader or scheduler could do it very quickly on the fly.
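The substitution idea in the comment above can be sketched in a few lines. This is only an illustration under stated assumptions: the word names `mac` and `mac_soft`, the `retarget` function and the `SOFT_FALLBACKS` table are all hypothetical and are not taken from the F18 instruction set; the sketch just treats a concatenative program as whitespace-separated words and swaps extension words for subroutine names on cores that lack them.

```python
# Hypothetical table: extension word -> name of an equivalent software subroutine.
SOFT_FALLBACKS = {"mac": "mac_soft"}


def retarget(code, core_extensions, fallbacks=SOFT_FALLBACKS):
    """Rewrite a concatenative program for a core with the given extensions.

    Words the core implements natively are kept; extension words it lacks are
    replaced by the name of a subroutine that does the same job in software.
    """
    rewritten = []
    for word in code.split():
        if word in fallbacks and word not in core_extensions:
            rewritten.append(fallbacks[word])
        else:
            rewritten.append(word)
    return " ".join(rewritten)


program = "dup mac drop"
print(retarget(program, core_extensions={"mac"}))  # core has the extension
print(retarget(program, core_extensions=set()))    # plain core, uses the fallback
```

Because calling a word in a concatenative language is just naming it, the rewrite is a word-for-word replacement, which is why a loader or scheduler could plausibly do it on the fly.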
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
- - Re:Clickbait Caption, but Valid Arguments (Score:4, Insightful)
    
    by Thor Ablestar ( 321949 ) writes: on Sunday November 09, 2014 @02:17AM (#48343821)
    
    Of course. But the very origins of microprocessor revolution are being forgotten:
    Intel was to make 10 specialized chips. Intel made ONE universal programmable chip i4004 that replaced them all and spent development money ONCE.
    It's possible to make 10 special-purpose chips but it will cost 10 times more.
    
    - We're no longer at the origin (Score:2)
      
      by gentryx ( 759438 ) writes:
      
      Architectural improvements for general purpose CPUs yield less and less benefits: Even more registers? Even better branch prediction? Even larger caches? It'll all yield but a few percent, at least for current Intel designs. So, the way to go is currently more and more cores, but what good is it to have many cores that can't all fire simultaneously?
- - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    Actually, that's not exactly a good analogy. The SPEs were identical. They wouldn't be in this case.
What a useless paper (Score:2)

by iamacat ( 583406 ) writes:

Basically it's making a big deal out of the fact that today's commonly available hardware is optimized for today's commonly available software. Duh! General purpose is a term relative to purposes a particular person has in mind. Nobody is suggesting that Core i7 is capable of running Lt Cmdr Data.
A genuinely interesting paper would have specific ideas for architecture capable of solving problems beyond the scope of current CPUs and GPUs.
- - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    The parts bin of failures is full to overflowing with more specialized designs, Transputer, Cell, Itanic.
    
    Yes, but those are specialized symmetrical designs. It's not like you get a few extra transputers and one extra Itanic core with every two-core Haswell.
    By the time the specialized processor has realized its theoretical performance the generalized one is two generations ahead, cheaper and just as fast.
    Again, a blatantly false analogy. These extra units would have more in common with vector SSE/AVX units in your "general purpose" CPU. You don't drop them "two generations ahead" - Intel never did, and they're not going to drop AVX2 "two generations" after Haswell where they introduced them. You just don't power them if you don't need them for a while, and t
General purpose: Efficiency not required (Score:5, Insightful)

by davidwr ( 791652 ) writes: on Saturday November 08, 2014 @10:01PM (#48343133) Homepage Journal

From TFA:
It's therefore not enough for a processor to be Turing complete in order to be classified as general purpose; it must be able to run all programs efficiently. [emphasis added]

Um, nope.
A general-purpose anything is rarely as efficient at a given task as a special-purpose version of the same thing. Sometimes you really do want your computer chip to be a "Jack of all trades, master of none."

- Re: (Score:2)
  
  by 91degrees ( 207121 ) writes:
  
  I think there is a certain efficiency argument. A GPU may be able to run a C compiler but nobody would consider using it for that. A CPU can run an OpenGL implementation and it would be slow but you'd at least be able to do it without any fiddly hacks, and there could be a reason to do so.
  
  The article seems to be trying to find a hard and fast rule as to what "general purpose" means and then realising that that doesn't actually apply to general purpose processors.
ACM Queue brand decay (Score:2)

by fche ( 36607 ) writes:

There have been many mediocre articles on the ACM Queue in the last few years ... this fits perfectly.
It's not all that many years ago (Score:5, Insightful)

by msobkow ( 48369 ) writes: on Saturday November 08, 2014 @10:44PM (#48343307) Homepage Journal

It isn't all that many years ago that the floating point was handled by either software emulation or a co-processor. Now we're using GPUs as co-processors. There are also audio designs that act as co-processors. Several enterprise systems have encryption co-processors. IBM is notorious for putting specialized processors in their mainframes. Several chips have the GPUs embedded on-chip already.
I'd argue that putting specialized chips on-die doesn't affect the general-purpose nature of the compute core that controls those resources at all. The whole article is a red herring, trying to establish a distinction between on-chip and off-chip processing that has more to do with the scalability of silicon manufacturing techniques than with any distinguishing feature of the designs.
Let's face it -- if you want to really accelerate a task, you design silicon specifically for that task and interface it to a general purpose core. The article discusses nothing new in the world of computing.

- Re: (Score:3)
  
  by SuricouRaven ( 1897204 ) writes:
  
  AES encryption is built into all modern Intel CPUs, except a few Atoms.
  The enterprise crypto co-processors are mostly for RSA key generation. Something that's only done during connection setup, but can be a substantial load on a high-traffic SSL server that creates hundreds of connections each second.
  - Re: (Score:2)
    
    by K. S. Kyosuke ( 729550 ) writes:
    
    You're sure it's RSA key generation? Or RSA encryption of the generated session key? I thought that RSA keys were generated when certificates are generated, not when the certificate is being used.
    - Re: (Score:2)
      
      by SuricouRaven ( 1897204 ) writes:
      
      No, I'm not sure. I'm not a cryptographer, so I'm not sure exactly what maths it's doing. All I can tell you is the practical side: There is something computationally intensive that happens during the setup stage of an SSL connection, and the main purpose of a hardware cryptographic accelerator is to do that something. Usually so that a webserver may handle many more SSL connections per second than CPU alone could handle. The other approach is an appliance that sits before the webserver and does the SSL stu
      - Re: (Score:2)
        
        by msobkow ( 48369 ) writes:
        
        When an SSL or HTTPS connection is established, the existing RSA key is used to negotiate the connection, but a connection-specific key is generated and shared over the RSA-keyed initial connection. It's that generation of the connection-specific key that is compute-intensive. If I recall correctly, that secondary key is usually done using a symmetric algorithm that can be processed faster than AES encryption can be, with the caveat that it requires sharing the key, so it can only be safely used if the i
- Re: (Score:2)
  
  by K. S. Kyosuke ( 729550 ) writes:
  
  Note that a part of AMD's HSA is exactly this - allowing multiple heterogeneous units to do memory-based communication on top of the paged virtual memory, and providing a mechanism for the units to communicate with each other (by sending "command packets"). Now the current implementation in Kaveri is mostly about having a CPU and a GPU on the same chip, but adding other units - even almost-fixed-purpose ones like one that would be able to accept commands of the form of "encrypt [this block of memory] using
Problem vs Processor (Score:2)

by jimbrooking ( 1909170 ) writes:

Back in the day (when I was actually PAID for buying supercomputers) I devised Jim's First Law of Supercomputing: For every computer architecture there is a problem that will solve on that particular architecture "better" than on any other architecture. And conversely, for every problem there is an architecture that will solve this problem "better" than any other architecture. (You get to define "better".) You didn't have to talk to too many computer sales-people to accept this as fact. I believe the point
Transmeta (Score:3)

by Sebastopol ( 189276 ) writes: on Sunday November 09, 2014 @12:15AM (#48343521) Homepage

This whole discussion just made me laugh whilst remembering the hype around the Transmeta / Torvalds code-morphing engine.
Ah, the 90's. They were fun.
CPUs have been "general purpose" since day one. The only non-general purpose hardware are ASICs (like the article says). Everything else is just marketing hype from Intel, et al.
This is such an amazing rehash of what Intel used to call *T technologies in the 90's, starting from the 80's, when coprocessors started appearing (x87). The big trend was toward DSPs in the 90's, but that never happened, instead they pushed on new hardware like MMX, SSE and now vector processors. That's why we have graphics processors as non-general-purpose CPUs.
To call something a GPGPU is just an egregious assault on common sense.
"Dark silicon", while a catchy name, is simply a side effect of latency, something the article mostly skips (hints at it with locality): the memory hierarchy exists and dark silicon is a result. When latency is zero, more of the silicon will be engaged.
While one could easily claim that because parts of any chip power down that means it's not general purpose, that's an oversimplification: 100% utilization is fundamentally impossible because problems aren't solved that way, there is no infinite parallelism.
I really think the author's analysis isn't fully developed. While the conclusion that hardware looks like the software may be a pleasant tautology, it overlooks Turing's thesis entirely. Which is odd, because that's what the author -started- with!
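The remark above that 100% utilization is impossible without infinite parallelism is usually quantified with Amdahl's law, which the comment does not name. As a rough sketch (the function names and the 90% figure are illustrative assumptions, not numbers from the article):

```python
def amdahl_speedup(parallel_fraction, units):
    """Speedup from Amdahl's law: 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if units < 1:
        raise ValueError("units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / units)


def utilization(parallel_fraction, units):
    """Average fraction of the units kept busy (speedup divided by unit count)."""
    return amdahl_speedup(parallel_fraction, units) / units


# Illustrative workload: 90% of the work can run in parallel.
for n in (1, 10, 100):
    print(n, round(amdahl_speedup(0.9, n), 2), round(utilization(0.9, n), 3))
```

With any serial fraction at all, the speedup is capped at 1 / (1 - p) and utilization falls as units are added, which is the sense in which idle silicon is unavoidable for such workloads.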

- Re: (Score:2)
  
  by drinkypoo ( 153816 ) writes:
  
  The big trend was toward DSPs in the 90's, but that never happened, instead they pushed on new hardware like MMX, SSE and now vector processors
  No, it did happen. We got DSP-like technology in our PCs in the form of MMX, SSE, and vector processors. So we got both DSP and general-purpose processors.
  To call something a GPGPU is just an egregious assault on common sense.
  If our system bus designs permitted we could have separate graphics output and vector processor. But they don't. You've got to call it something.
An interesting paragraph from the article (Score:2)

by Jeremi ( 14640 ) writes:

"[Superscalar architectures] translate the architectural instruction encodings into something more akin to a static single assignment form (ironically, the compiler spends a lot of effort translating from such a form into a finite-register encoding)
Which makes me wonder, would it (in principle) be worth designing a chip with an ISA that is based explicitly on single-assignment-form, thereby avoiding both the need for transformations by the compiler and (more importantly) transformations by the CPU at run-time?
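For readers unfamiliar with the term, here is a minimal sketch of what single-assignment form means for straight-line code. It is an illustration only: the tuple encoding, the `to_ssa` name and the `x_1` naming scheme are assumptions made here, and real compilers also handle branches with phi nodes, which this sketch leaves out.

```python
def to_ssa(instructions):
    """Rename straight-line three-address code so each name is assigned once.

    Each instruction is (dest, op, left, right). Operands are variable names
    (str) or integer constants. Names never assigned here are treated as inputs
    and left unchanged.
    """
    version = {}  # variable name -> latest version number
    result = []

    def current(operand):
        # Refer to the most recent definition of a variable, if there is one.
        if isinstance(operand, str) and operand in version:
            return f"{operand}_{version[operand]}"
        return operand

    for dest, op, left, right in instructions:
        left, right = current(left), current(right)  # read old versions first
        version[dest] = version.get(dest, 0) + 1     # then create a new one
        result.append((f"{dest}_{version[dest]}", op, left, right))
    return result


code = [
    ("x", "+", "a", "b"),  # x = a + b
    ("x", "*", "x", 2),    # x = x * 2
    ("y", "+", "x", "a"),  # y = x + a
]
for line in to_ssa(code):
    print(line)
```

The second write to `x` becomes a fresh name, so no storage location is ever overwritten; a finite-register encoding has to undo exactly this by mapping many names back onto a few registers.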
- Re: (Score:2)
  
  by linuxrocks123 ( 905424 ) writes:
  
  That is a very cool idea, and is explored in this research project: http://en.wikipedia.org/wiki/T... [wikipedia.org]
- Compiling for single-assignment (Score:2)
  
  by Ottibus ( 753944 ) writes:
  
  Which makes me wonder, would it (in principle) be worth designing a chip with an ISA that is based explicitly on single-assignment-form, thereby avoiding both the need for transformations by the compiler and (more importantly) transformations by the CPU at run-time?
  If the ISA is based on single assignment then the compiler will still have to transform the code into this form. In practice the compiler does this anyway when it can, and superscalar hardware does a good job of executing this kind of code. The problem (returning to the subject of the thread) is that not all code fits this pattern, and in these cases a single-assignment ISA will perform particularly badly, so it is not suitable for general purpose processors.
  Researchers have been developing ISAs for decades (in
- Re: (Score:2)
  
  by TheRaven64 ( 641858 ) writes:
  
  It would be interesting, but it's also a question of encoding density. Having a fixed number of architectural registers (and a much larger number of microarchitectural registers) is a technique that works reasonably well. Adding more architectural registers makes your operand size very large. You could imagine something like Dalvik bytecode, with 2^32 SSA registers and a CPU able to interpret it by either using internal registers or spilling to RAM, but you'd likely end up needing huge instruction caches
Yes, there is... (Score:2)

by ka9dgx ( 72702 ) writes:

Yes, you can use almost 100% of the silicon, if you use a BitGrid [blogspot.com] to process information instead of von Neumann architectures.
Um, what? (Score:3)

by wonkey_monkey ( 2592601 ) writes: on Sunday November 09, 2014 @05:37AM (#48344235) Homepage

Here is one definition of a general-purpose processor: if it can run any algorithm, then it is general purpose. This is not a particularly interesting definition, because it ignores the performance aspect that has been the driving goal for most processor development.

Well, I'm sorry you don't find the definition interesting, but that doesn't mean you can redefine it however you want.
It's therefore not enough for a processor to be Turing complete in order to be classified as general purpose; it must be able to run all programs efficiently.
I assume there's a name for a logical fallacy where you redefine terms in order to make your point.
With this in mind, let's explore what people really mean when they refer to a general-purpose processor: the specific category of workloads that these devices are optimized for and what those optimizations are.

That's not what I mean when I refer to a general-purpose processor.
Efficient designs in such a world will require admitting that there is no one-size-fits-all processor design and that there is a large spectrum, with different trade-offs at different points.
I didn't realise anyone was denying this.

Wrong (Score:2)

by Kim0 ( 106623 ) writes:

I have designs for general and efficient processors, or rather computing structures, and it is provable they are both efficient and general.
The generality of computing has been known for a long time, through emulation of Turing machines, but this has not been efficient.
Programming complexity (Score:3)

by rockmuelle ( 575982 ) writes: on Sunday November 09, 2014 @11:54AM (#48345235)

A big reason we accept the trade-offs of modern processors is that it's generally easy to program a broad range of applications for them.
In the mid aughts (not very long ago, actually), there was a big push for heterogeneous multi-core processors and systems in the HPC space. Roadrunner at Los Alamos was a culmination of this effort (one of the first petascale systems). It was a mix of processor types including IBM's Cell (itself a heterogeneous chip). Programming Roadrunner was a bitch. With different processor families, you had to decompose your algorithm to target the right processor for a given task. Then you had to worry about moving data efficiently between different processors.
This type of development is fun as an intellectual exercise, but very difficult and time consuming in practice. It's also something compilers will never be good at, requiring experts in the architectures, domains, and applications to effectively use the system.
Another lesson from the period (and one that anyone who's done ASICs has known for years) is that general purpose hardware generally evolves fast enough to catch up with specialized hardware within a reasonable timeframe (usually 6-18 months, see DE Shaw's ASIC for protein folding as an example).
While custom processors are cool (I love hacking on them), they're rarely practical.
-Chris

The real problem. (Score:2)

by jedidiah ( 1196 ) writes:

The real problem is that you never know what you may end up doing. The potential of a speciality bit of silicon is by its very nature limited. It may completely fail at some new task that you didn't think of when you were building it.
It's great for some really well defined problem but as soon as that definition is no longer valid, the speciality silicon is useless.
"General Purpose" means that you can address any problem including the ones where your specialty silicon fails.
- Re: (Score:2)
  
  by tepples ( 727027 ) writes:
  
  Yeah, when I read the headline it sounded like someone was trying to undermine the concept of a "general-purpose computer" in favor of locked-down appliances for specific tasks.
  - Re: (Score:2)
    
    by Thor Ablestar ( 321949 ) writes:
    
    You are right.
    Quite often the salesmen approach me with their attempt to sell me an Ipad. I usually answer: "Well, but please add the development kit". Their reaction is epic.
    - - Re: (Score:3)
        
        by Zontar The Mindless ( 9002 ) writes:
        
        Come to Scandinavia, where there are lots of folks named "Thor" or "Tor", including the guy who lives just above me. He's a handyman and swings a very real hammer.
  - Re:Hyperbolic headlines strike again (Score:4, Informative)
    
    by TheRaven64 ( 641858 ) writes: on Sunday November 09, 2014 @06:28AM (#48344315) Journal
    
    I'm the author of TFA. There's a big difference between a general purpose processor and a general purpose computer. A lot of current research in computer architecture is focussed on the idea that you have a sharp divide between accelerators and general purpose CPUs. The point of the article is that different CPU microarchitectures are specialised for different workloads (one of the cited results was that in a big.LITTLE arrangement, the A7 core runs one of the SPEC benchmarks faster than the A15 because of its lower cache access time, for example) and that there are a lot of assumptions about the kind of code that the general purpose core will run. Many of these are true for C code, but a lot less true for code written in other languages. The communication patterns that mainstream multicore processors are optimised for are heavily tied to C, to the extent that if you have a language with a shared-nothing abstraction and message passing then the only way of implementing it is horrendously inefficient at the hardware level.
    
    - Re:Hyperbolic headlines strike again (Score:5, Insightful)
      
      by smallfries ( 601545 ) writes: on Sunday November 09, 2014 @07:00AM (#48344399) Homepage
      
      A lot of the value in your article is lost by trying to shoehorn "general purpose processors" into an argument about task-optimisation. The difference between properties relating to computational power and those relating to performance is really basic textbook stuff that we teach to undergraduates. Being able to run any program, and being able to run any program efficiently, is a difference taught in undergraduate architecture courses.
      The parts of your article that are interesting and valuable would have been better served by a narrative that does not rely on a straw man. Cleanly separating the issue of power / performance and explaining that task-neutral optimisation is impossible would have been a better article, and one that would have been easier to write. There is a natural analogy with representation-bias in machine learning that would have provided more explanatory power without the unnecessary rhetoric. I know it's the Queue, but even so I am a little disappointed in your reviewers.
      
    - Re: (Score:2)
      
      by BadDreamer ( 196188 ) writes:
      
      This is why we at one time had Lisp Machines with specialized hardware optimized for running Lisp efficiently. Message based machines were tried for Smalltalk.
      But people do not use these kinds of languages enough. Operating systems and applications are largely written in C and its derivatives. That is why processors optimized for C won out.
      So yes, it is in a way a vicious circle. Most of our software is C, so most of our hardware is optimized for C, so writing software in C makes the most efficient use of i
      - Re: (Score:2)
        
        by TheRaven64 ( 641858 ) writes:
        
        This is why we at one time had Lisp Machines with specialized hardware optimized for running Lisp efficiently. Message based machines were tried for Smalltalk
        The main reason that Lisp machines lost out was that they were stack based. Stack-based instruction sets don't (easily) expose any instruction-level parallelism, which means that you can't easily extend them to take advantage of pipelining. That wouldn't have been such a problem if Lisp had been parallel (a barrel-scheduled multithreaded stack-based CPU can be very simple to design, have very good instruction cache usage, and get good power / performance ratios), but Lisp machines ran a single-threaded e
    - Re: (Score:2)
      
      by Dutch Gun ( 899105 ) writes:
      
      Interesting article.
      IMO, over-specialization was the reason the PlayStation 3 and its Cell processors never really lived up to their potential. While they were amazing at crunching raw numbers in highly-serialized batches (they were originally designed for video processing, remember), they're not really so great at processing the type of wildly diverse data and tasks that videogames typically require. These processors were simply designed for the wrong types of tasks - too specialized, essentially. In t
    - Re: (Score:3)
      
      by Kjella ( 173770 ) writes:
      
      Still don't get it. The difference is that accelerators try to do one thing, or at least one class of problems, well at the expense of everything else. They optimize for the best case. CPUs do the same when they incorporate specialized instructions as "mini-accelerators" like AES-NI. But what sets general purpose processors apart is that they assume the worst and try to make all code perform, no matter how ugly. They optimize for flexibility, with an emphasis on minimizing the worst cases. Those are two b
      - Re: (Score:2)
        
        by TheRaven64 ( 641858 ) writes:
        
        But what sets general purpose processors apart is that they assume the worst and try to make all code perform, no matter how ugly. They optimize for flexibility, with an emphasis on minimizing the worst cases
        Read TFA. They optimise for a specific category of algorithm, that is branch heavy (although comparatively light in computed branches), has strong locality of reference, is either single-threaded or has shared-everything parallelism, and a few other constraints. That's not a general purpose processor, that's something optimised for a specific workload and, because they've been the cheapest way of buying processing power for a few decades, people put a lot of effort into trying to shoehorn algorithms to ha
        
        Re: (Score:2)
        
        by Kjella ( 173770 ) writes:
        
        Read TFA. They optimise for a specific category of algorithm, that is branch heavy (although comparatively light in computed branches), has strong locality of reference, is either single-threaded or has shared-everything parallelism, and a few other constraints. That's not a general purpose processor
        I did, you're still being silly. It's easy to run non-branching code on a branching processor, it's almost impossible to do the opposite. That is why we call branching processors general purpose. It's easy to run code with weak locality on a processor with strong locality, it's almost impossible to do the opposite. That is why we call processors with strong locality general purpose. It's easy to run parallel code on a sequential processor, it's almost impossible to run sequential code on a parallel processor. T
  - Re: (Score:2)
    
    by TechyImmigrant ( 175943 ) writes:
    
    I thought the most interesting part of the article was the observation that MMUs on all modern processors are geared to the unix model of memory protection and virtualized memory and they let go of the Multics model, which was about 99.9% of the OS course when I was in college back in the stone age.
- Re: (Score:2)
  
  by Bengie ( 1121981 ) writes:
  
  A "general purpose processor" is really a processor with a bunch of specialized execution units, each one processing data serially.
- Re: (Score:3, Informative)
  
  by jones_supa ( 887896 ) writes:
  
  This seems to be how the human brain works, and it runs on less than 100 watts (100 watts corresponds to 2000 Calories per day).
  A whole woman consumes 100 watts. Of that, the brain is about 20 watts. Also, watt represents momentary consumption and calories are a fixed mass of energy, so you can't directly compare them.
  - Re: (Score:3)
    
    by jones_supa ( 887896 ) writes:
    
    A whole woman consumes 100 watts.
    D'oh! Human, not woman.
    - Re:Efficiency (Score:5, Funny)
      
      by TechyImmigrant ( 175943 ) writes: on Sunday November 09, 2014 @01:42AM (#48343755) Homepage Journal
      
      A whole woman consumes 100 watts.
      D'oh! Human, not woman.
      I got myself an nvidia woman. She takes 400 watts.
      
      - Re:Efficiency (Score:5, Funny)
        
        by Anonymous Coward writes: on Sunday November 09, 2014 @09:01AM (#48344641)
        
        Sounds like a hottie
        
        
        Re: (Score:3, Funny)
        
        by Anonymous Coward writes:
        
        Unfortunately she requires two inputs. :(
    - - Re: (Score:2)
        
        by jones_supa ( 887896 ) writes:
        
        Of course they are. I simply meant to use the word "human" instead of "woman". The word "human" still encompasses women.
  - 1 Calorie per day = 48.4 mW (Score:5, Informative)
    
    by tepples ( 727027 ) writes: <tepples@NOsPAm.gmail.com> on Saturday November 08, 2014 @09:25PM (#48342999) Homepage Journal
    
    100 watts corresponds to 2000 Calories per day
    Also, watt represents momentary consumption and calories are a fixed mass of energy
    Calories and Calories per day are not the same unit. A Calorie (a kilocalorie, the unit used on food labels) is 4.18 kJ, and a Calorie per day is 4.18 kJ / 86.4 ks = 48.4 mW. Multiply this by 2000 and you'll end up very close to 100 W.
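The arithmetic above can be checked with a short sketch. It assumes the food Calorie is the thermochemical kilocalorie (4184 J) and a day is 86,400 s; the function name `kcal_per_day_to_watts` is made up for this illustration.

```python
JOULES_PER_KCAL = 4184.0   # assumption: thermochemical kilocalorie
SECONDS_PER_DAY = 86400.0  # 24 h * 60 min * 60 s


def kcal_per_day_to_watts(kcal_per_day: float) -> float:
    """Convert an energy rate in food Calories per day to watts (J/s)."""
    return kcal_per_day * JOULES_PER_KCAL / SECONDS_PER_DAY


one = kcal_per_day_to_watts(1)
diet = kcal_per_day_to_watts(2000)
print(f"1 Calorie/day    = {one * 1000:.1f} mW")
print(f"2000 Calories/day = {diet:.1f} W")
```

The point is that "Calories per day" is already energy divided by time, so it converts to watts with a single constant factor.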
    
    - Re: (Score:2)
      
      by jones_supa ( 887896 ) writes:
      
      Oh, good point. Then it makes sense indeed.
    - - Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        I figured something was wrong, given that I can (or rather could) produce 250 watts on an exercise bike.
  - Re: (Score:2)
    
    by jonnythan ( 79727 ) writes:
    
    A watt is a unit of energy divided by time.
    Now.... what are the units of "calories per day"? Take your time.
    - In Soviet Russia... (Score:2)
      
      by Thor Ablestar ( 321949 ) writes:
      
      ... there are lots of idiots that instead of kWh write kW/h.
      It's quite understandable: our Education Minister Fursenko once said:
      "Defect of the Soviet education system was an attempt to form a creative man, and now the challenge is to cultivate qualified consumers, qualified to benefit from the creative work of others"
      "The ideology of education remains the same - we must prepare creators. But we need above all to inculcate the culture of using already existing developments, following the existing standards."
  - - Re: (Score:2)
      
      by linuxrocks123 ( 905424 ) writes:
      
      Not disagreeing with you, but it's a little misleading to say "a processor needs some electricity and that's it". A processor needs a very precise voltage level of DC current supplied continuously. To get that precise voltage level, you need regulators, AC/DC converters, etc. Moreover, you need to burn coal, oil, natural gas, or sustain a nuclear reaction in order to provide this electricity. Finding carbohydrates is comparatively easy compared to maintaining an entire electrical grid. After all, they
      - Re: (Score:2)
        
        by itzly ( 3699663 ) writes:
        
        Similarly, arbitrary cells in your body can't just run on fruit from trees. They need a very precisely regulated supply of certain substances, which needs to be regulated by very complex mechanisms in the body.
  - - Re: Efficiency (Score:2)
      
      by GigaplexNZ ( 1233886 ) writes:
      
      > watt represents momentary consumption and calories are a fixed mass of energy, so you can't directly compare them.
      > You can, if you declare a rate of energy use, and a timeframe that the rate of energy is used over, you can work out how much energy is used
      That's an indirect comparison.
  - - Re: (Score:2)
      
      by Geeky ( 90998 ) writes:
      
      Or "B logically follows from A. Therefore B is true if I want it to be. Unless I do really but don't want to tell you I do, or I can make a drama out of it not being true."
      I'm trying not to be misogynistic but sometimes it really is hard to follow the logic. Maybe it's just the one I'm seeing. I sort of assume attacking the logic of a certain action is somehow preferable to simply saying "I don't want to".
      I should just accept that logic and relationships are non-overlapping magisteria [wikipedia.org].
      Meh. Bad weekend.
      - Re: (Score:2)
        
        by Thor Ablestar ( 321949 ) writes:
        
        I don't mean misogyny. There are lots of classic philosophy works that are accepted as classic but are step by step reducible to Kolmogorov's lemma. And "results" or "follows" are a necessary part of the lemma. "A is true because I want A to be true" has no "follows".
        And btw I think it's as difficult to produce a computer that wants as the computer that feels pleasure.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  And you think there are people that can write good software for such a thing? Most developers fail even when there is one benign execution mechanism, such as a virtual machine. This is just an academic that wants attention.
  - - Re: (Score:2)
      
      by gweihir ( 88907 ) writes:
      
      Compilers are not smart enough for this.
- Emulation (Score:2)
  
  by tepples ( 727027 ) writes:
  
  So, a chip that can do everything in Excel or Mathematica would be the true General Purpose Processor.
  Yet such a chip can be emulated on a processor with reduced instruction set complexity because they're Turing equivalent [wikipedia.org]: both are linear bounded automata. If you compile the same program in a high level language for a complex processor and a simple processor, they'll produce the same result. Each operation on the complex processor may correspond to several instructions on the simple processor, but ARM's bet with big.LITTLE is that reduced power consumption in a simple processor's instruction decoder makes
  - Re: (Score:2)
    
    by Guy Harris ( 3803 ) writes:
    
    If you compile the same program in a high level language for a complex processor and a simple processor, they'll produce the same result. Each operation on the complex processor may correspond to several instructions on the simple processor, but ARM's bet with big.LITTLE is that reduced power consumption in a simple processor's instruction decoder makes up for that difference.
    For big.LITTLE, the difference between the instruction decoders isn't an issue of different instruction sets; to quote their big.LITTLE Processing with ARM Cortex-A15 & Cortex-A7 [arm.com] white paper:
    The central tenet of big.LITTLE is that the processors are architecturally identical. Both Cortex-A15 and Cortex-A7 implement the full ARM v7A architecture including Virtualization and Large Physical Address Extensions. Accordingly all instructions will execute in an architecturally consistent way on both Cortex- A1
    - Re: (Score:2)
      
      by tepples ( 727027 ) writes:
      
      For big.LITTLE, the difference between the instruction decoders isn't an issue of different instruction sets
      I agree that the analogy is inexact, but there is still a difference in complexity between a big core and a little core. An in-order processor needs a less complex decoder than an out-of-order superscalar processor, just as RISC needs a less complex decoder than CISC. Thus the little cores use a less complex decoder compared to the big cores' more complex decoder to decode the same ARM instructions.
- Re: (Score:2)
  
  by DivineKnight ( 3763507 ) writes:
  
  Speaking of RISC, I have an OT question (but still technology related, so there's that): does anyone know where I can pick up a DEC Alpha CPU + MB (or machine) in about a month's time, with a CPU speed of >=500 MHz? I have this undying urge to tinker with one, but eBay's listing stuff for 266 MHz, or at a price that well exceeds my inner geek's wallet.
- Re: (Score:2)
  
  by tepples ( 727027 ) writes:
  
  Sure there is: a British citizen who resides in Scotland.
  - - The value of a good definition (Score:2)
      
      by tepples ( 727027 ) writes:
      
      Aren't all Scotsmen inherently British citizens
      Yes. But not all British citizens are Scotsmen, only those residing in Scotland.
      unless you have some other angle I think you failed to produce a workable joke.
      "No true Scotsman" is about changing definitions in mid-argument, and an effective way to avoid this is to agree on definitions early on. Likewise, one resolves the heap paradox [wikipedia.org] by defining a heap as a contiguous collection of grains where at least one grain is supported solely by other grains. I was aiming for a bit of an anti-joke by defining "Scotsman" the way the law probably defines it, if the definition of "citizen of a s
- Re: (Score:3)
  
  by PPH ( 736903 ) writes:
  
  Good point. But if I can take (patentable) software targeted to a special purpose processor and port it to a different (possibly general purpose) processor, I have bypassed the patent.
  The goal of a 'well written' patent is to be as general as possible without getting tossed out of a USPTO examiner's office.
