There's No Such Thing As a General-Purpose Processor

CowboyRobot writes: David Chisnall of the University of Cambridge argues that despite the current trend of categorizing processors and accelerators as "general purpose," there really is no such thing and believing in such a device is harmful.

"The problem of dark silicon (the portion of a chip that must be left unpowered) means that it is going to be increasingly viable to have lots of different cores on the same die, as long as most of them are not constantly powered. Efficient designs in such a world will require admitting that there is no one-size-fits-all processor design and that there is a large spectrum, with different trade-offs at different points."

  • by Anonymous Coward on Saturday November 08, 2014 @08:00PM (#48342891)

    David Chisnall says: parallel is not the same as in series. World gasps.

    • David Chisnall says: parallel is not the same as in series. World gasps.

      It's worse than that. TFA's basic premise that "there is no such thing as a general-purpose processor" is just flat wrong. Of course there are. His real argument is about how to make them efficient, which is a different thing and very much contrary to his title and introduction.

      Anything that can implement a Turing machine *IS* a general-purpose processor, by definition. And any general-purpose processor can do what any other general-purpose processor can do... although not necessarily quickly or efficiently.

  • by rossdee ( 243626 ) on Saturday November 08, 2014 @08:00PM (#48342895)

    According to Lazarus Long
    The same should be true for AI

    • According to Lazarus Long
      The same should be true for AI

      If that analogy holds in more than one way then I suppose that specialized AI models will appear earlier in history, will be vastly more numerous and resilient and long-lived than more generalized AI models.

      The more generalized AIs will probably want to reach for a specialized-AI swatter every now and then.

      • and that would seem to be on the money...

        Imagine the reaction of a hypothetical future smart-AI responding to the niggling bullshit of a spambot...

    • The current "general purpose" processors are also specialised though, for example for algorithms with a low number of threads. A processor with the combination of several specialised cores is less specialised, since it is good at everything.
  • by Crashmarik ( 635988 ) on Saturday November 08, 2014 @08:11PM (#48342939)

    It isn't hard to understand: when people talk about specialty processors, they are talking about things that are explicitly designed to make particular tasks easier.
    Versus general processors, which are designed to have wider application while not being as fit for any particular task.

    Not going to say this is correct, but it's pretty easy to put together the exact opposite of the author's argument: that specialty processors should be treated very carefully and their use limited. After all, each time you add a different type of specialty processor into an environment, you introduce another codebase for the application, another toolchain to learn and another set of communication / OS support issues.

    • It isn't hard to understand: when people talk about specialty processors, they are talking about things that are explicitly designed to make particular tasks easier. Versus general processors, which are designed to have wider application while not being as fit for any particular task.

      I believe his point (in the context of high performance algorithms) is that 'standard' processors vary so much in their performance, and in the cases they optimize for, that there is no general purpose optimizer that you can assume will work in a certain way for a given algorithm. Some algorithms will work better on processor A, and others on processor B, even though they both are 'general purpose' processors.

      In other words, if you are studying high-performance algorithms, and are writing papers about high-

    • As an electronics designer, I believe he's dead wrong. Specialty ICs become obsolete almost immediately on deployment. You can replace a general purpose processor, or a special purpose processor with a general purpose processor. And you can upgrade the function of a product.

      Fuck overspecialization.

      • So true, remember when C-Cube was going to be the only way to encode/decode video?

      • True, there is a need for speciality IP blocks though for those few applications where it does matter. But at that point using an ASIC is probably not the best choice.

        What I do expect to see, given the recent Intel announcement, is FPGAs showing up more and more as co-processors. There is a lot of speed to be gained by reconfiguring the hardware for when you have to crunch through a few gigabytes of data like decoding/encoding a video stream or running a query on a massive database. The only real "speciality
        • There are a lot of FPGA-based SoCs that contain embedded microprocessor cores in them (e.g. Xilinx's Virtex and Zynq, Altera's Cyclone, Arria and Stratix families, Microsemi's SmartFusion2). We may see that flip the other way so there's a very high-end core or several of them in an SoC, with FPGA logic to allow pin reconfiguration and large CLBs for speeding up or offloading processes.

          That might spawn new operating systems that manage how you use and configure the CLB and pin configuration. Do they have

    • After all each time you add a different type of specialty processor into an environment, you introduce another codebase for the application, another toolchain to learn and another set of communication / OS support issues.

      That will be an issue only for the OS and library developers. To the applications developer there will be no noticeable difference. It is already the case that you need to use specialized libraries to get maximum performance on common types of tasks.

      For example, if you want to use an FFT on a modern "general purpose" processor, you will get much better performance using a standard library function than you would if you wrote your own. There are so many issues with memory access patterns, core and cache utili
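
The algorithmic half of that FFT point can be sketched in a few lines. This is only an illustration in plain Python, with hypothetical function names (`naive_dft`, `fft_radix2`) that do not come from the comment or from any particular library: the first function is the O(n^2) transform most people would write by hand, the second is the textbook radix-2 Cooley-Tukey recursion. A tuned library goes further still, handling the memory access patterns and cache behaviour the comment alludes to, which this sketch makes no attempt to do.

```python
import cmath


def naive_dft(x):
    """Direct O(n^2) discrete Fourier transform, straight from the definition."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; the length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a non-zero power of two")
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])  # transform of the even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 0, 0, 0, 0]
    slow = naive_dft(samples)
    fast = fft_radix2(samples)
    # Both routes compute the same transform, up to rounding error.
    print(max(abs(a - b) for a, b in zip(slow, fast)) < 1e-9)
```

Both functions return the same values; the difference is only in how much work they do, and a production library adds a further layer of machine-specific tuning on top.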

  • by Anonymous Coward on Saturday November 08, 2014 @08:21PM (#48342983)

    If a "general purpose" processor solves your problems fast enough, it's good enough.

    How the fuck is that "harmful"?

    Geez, you'd think TFA is just a blowhard looking for page hits.

    • How the fuck is that "harmful"?

      Because every time you believe in a general purpose processor, a kitten dies

    • Guy is trying to play silly distinction games. Really, everyone in tech understands what people mean when they say "general purpose processor." Yes, said unit may have some specialized circuits and such, but it is made to be good at dealing with all kinds of problems. Integer, FP, branching, linear, etc.: it doesn't matter, its design can handle them all reasonably well.

      That compares to something specialized like a GPU. For certain kinds of problems, specifically single precision vector math with fairly consisten

    • If a "general purpose" processor solves your problems fast enough, it's good enough.

      How the fuck is that "harmful"?

      You miss the point. It's not the "general purpose processor" that is harmful per se. What is harmful is the labelling of a certain class of processors as "general purpose", when, in the view of the author, they are not really general purpose, but specialised for executing C code with, at most, mid-sized working sets and little inter-processor communication. By assuming this workload as the default and calling processors good for it "general purpose", we may miss other approaches that might be more suitable

      • C is a general purpose language, perhaps the most general purpose language. Processors optimized for C code are by default general purpose.
        • by jedidiah ( 1196 )

          A general purpose processor is intended to do anything.

          General purpose processors are based on the idea that they aren't superstars at any one particular task. So they are pushed to perform as well as the tech will allow. This allows them to beat even the speciality silicon.

          Also, not all speciality coprocessors are created equal.

          A weak (but cheap) special purpose coprocessor will still underperform a general purpose CPU that's not hamstrung with certain peculiar engineering considerations.

          A general purpose p

  • by gentryx ( 759438 ) on Saturday November 08, 2014 @08:24PM (#48342997) Homepage Journal

    Of course general purpose CPUs exist, simply because we call them that way. But it is also true that each design has its own strengths, and "dark silicon" is another driver for special purpose hardware. Efficiency is another. Andrew Chien has published some interesting research [illinois.edu] on this subject. In his 10x10 approach he suggests using 10 different types of domain-specific compute units (e.g. for n-body, graphics, tree-walking...), each of which is 10x more efficient than "general purpose CPUs" in its domain (YMMV). Those compute units, bundled together, make up one core of the 10x10 design. Multiple cores can be connected via a NoC.

    Let's see how software will cope with this development...

    ps: can special purpose hardware exist if general purpose hardware doesn't?

    • ps: can special purpose hardware exist if general purpose hardware doesn't?

      Yes it can. After all the first 'computers' were dedicated code-breakers and such.

      However, there are many different levels of 'general purpose' or 'specialized'. Earlier 'cars' were mentioned as 'general purpose', but think about it, they're actually pretty specialized - they're generally designed for a max of 4-6 passengers, carry maybe a couple hundred pounds of cargo besides the passengers, drive mostly on paved roads, have a range of somewhere between 300 and 400 miles, etc... But within that you hav

      • Earlier 'cars' were mentioned as 'general purpose', but think about it, they're actually pretty specialized - they're generally designed for a max of 4-6 passengers, carry maybe a couple hundred pounds of cargo besides the passengers, drive mostly on paved roads, have a range of somewhere between 300 and 400 miles, etc...

        The first automobiles were curiosities. But as soon as someone figured out you could move people, they knew you could move cargo, and the ones immediately following the first passenger cars were trucks. And a pickup truck is a general-purpose vehicle. It's not particularly good at anything, but it can do a little bit of everything. A sedan is a special-purpose vehicle for moving people and their personal cargo.

    • I believe the Atari Jaguar was a console made of special purpose chips, with an "orchestrator" 68000 CPU tacked on. The propaganda said you're not meant to use that CPU or something to that effect but of course the games heavily relied on it, some of them Amiga ports. The console was a big failure, lacking a usable SDK and documentation. Now consoles have an OS, APIs, middleware.

      • by dbIII ( 701233 )
        I'd say by that point Atari as an entity was the huge failure rather than the actual device, as seen with other things where they developed products but did not follow through.
    • by kesuki ( 321456 )

      "Of course general purpose CPUs exist, simply because we call them that way."

      the wisdom of those real world coders gone from this world is thus. a jack of all trades, is a master of none.

      this means simply that a general purpose FPGA that can modulate its functions can do a lot of different things but not at the same time, and in etched hardware the trade off is having dark silicon for all the tasks a true jack of all trades cpu can do.

      for instance, when i was doing video games all day the more games i play

      • "jack of all trades, is a master of none"

        You do know that is just an idiom, and an incorrect one, at that, not a law?

        The idiom is "A jack of all trades is better than a master of one", which was later shoehorned into a description of someone who is "a Jack of all trades but a master of none". It is not intended to be folk wisdom that there is a tradeoff between mastering something and being proficient at lots of things... in fact, there are numerous examples of people being masterful in many arenas and also
        • "Jack of all trades, master of none" is the correct saying, it is just missing the ending: "still better than a master of one."

          • What a load of bollocks. If you've got a tooth abscess you want a master of dentisfuckingtry and you won't give a rat's ass how good he is at carpentry or playing the guitar.

    • by mlts ( 1038732 )

      What we might have happen is that we end up with a mix, where a core is weighted towards a task... but compared to running a job at say, 80% as effectively as a core that is built for the job, versus not running the task at all, the scheduler [1] would drop tasks on non-optimal cores if it would help performance. If it is something definitely not optimal (FPU instructions on an integer-only core), the weighting would account for that and might not even place a task on there come the next quantum.

      The 10x10
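
The weighted placement described in this comment can be sketched as a toy model. Everything here is an assumption made for illustration: the `Core` class, the `place` function, the efficiency table and the numbers are hypothetical and do not describe any real scheduler. Each core advertises a relative speed per task kind (1.0 for the kind it was built for, 0.8 for a kind it runs "80% as effectively", absent for a kind it cannot run at all), and a task goes to whichever core would finish it soonest given its current backlog.

```python
from dataclasses import dataclass


@dataclass
class Core:
    name: str
    efficiency: dict      # task kind -> relative speed; a missing kind is unsupported
    backlog: float = 0.0  # time units of work already queued on this core

    def finish_time(self, kind, work):
        """Estimated completion time if the task were queued on this core."""
        speed = self.efficiency.get(kind, 0.0)
        if speed <= 0.0:
            return float("inf")  # e.g. FPU instructions on an integer-only core
        return self.backlog + work / speed


def place(cores, kind, work):
    """Queue the task on the core that would finish it earliest."""
    best = min(cores, key=lambda core: core.finish_time(kind, work))
    finish = best.finish_time(kind, work)
    if finish == float("inf"):
        raise ValueError(f"no core can run task kind {kind!r}")
    best.backlog = finish
    return best.name


if __name__ == "__main__":
    cores = [
        Core("int_core", {"int": 1.0}),                # integer-only
        Core("fpu_core", {"float": 1.0, "int": 0.8}),  # runs integer work at 80%
    ]
    tasks = [("int", 10), ("int", 10), ("float", 4), ("int", 10)]
    print([place(cores, kind, work) for kind, work in tasks])
    # prints ['int_core', 'fpu_core', 'fpu_core', 'int_core']
```

The second integer task lands on the non-optimal core because waiting behind the first one would take longer than running at 80% speed, which is the trade-off the comment describes.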

    • I've had similar ideas about specializing the Forth cores of the F18 design in a grid. The good thing is that the specific instruction set extensions can be simply substituted by subroutine calls if you need to use them on other cores in cases of low dynamic frequency of their usage - the code is essentially concatenative, so it's a matter of simple string substitution. Even the OS loader or scheduler could do it very quickly on the fly.
    • Sheeeit, even the guys that build X86 chips can't design software that makes using just a handful of X86 cores as easy to use as writing for just one core, and he expects them to magically come up with libraries and compilers that will seamlessly switch between dozens of specialized cores, probably hundreds of times per second depending on the task?
  • Basically it's making a big deal out of the fact that today's commonly available hardware is optimized for today's commonly available software. Duh! General purpose is a term relative to purposes a particular person has in mind. Nobody is suggesting that a Core i7 is capable of running Lt Cmdr Data.

    A genuinely interesting paper would have specific ideas for architecture capable of solving problems beyond the scope of current CPUs and GPUs.

  • by davidwr ( 791652 ) on Saturday November 08, 2014 @09:01PM (#48343133) Homepage Journal

    From TFA:

    It's therefore not enough for a processor to be Turing complete in order to be classified as general purpose; it must be able to run all programs efficiently. [emphasis added]

    Um, nope.

    A general-purpose anything is rarely as efficient at a given task as a special-purpose version of the same thing. Sometimes you really do want your computer chip to be a "Jack of all trades, master of none."

    • I think there is a certain efficiency argument. A GPU may be able to run a C compiler but nobody would consider using it for that. A CPU can run an OpenGL implementation and it would be slow but you'd at least be able to do it without any fiddly hacks, and there could be a reason to do so.

      The article seems to be trying to find a hard and fast rule as to what "general purpose" means and then realising that that doesn't actually apply to general purpose processors.
  • There have been many mediocre articles on the ACM Queue in the last few years ... this fits perfectly.

  • by msobkow ( 48369 ) on Saturday November 08, 2014 @09:44PM (#48343307) Homepage Journal

    It isn't all that many years ago that the floating point was handled by either software emulation or a co-processor. Now we're using GPUs as co-processors. There are also audio designs that act as co-processors. Several enterprise systems have encryption co-processors. IBM is notorious for putting specialized processors in their mainframes. Several chips have the GPUs embedded on-chip already.

    I'd argue that putting specialized chips on-die doesn't affect the general-purpose nature of the compute core that controls those resources at all. The whole article is a red herring, trying to establish a distinction between on-chip and off-chip processing that has more to do with the scalability of silicon manufacturing techniques than it does with any distinguishing feature of the designs.

    Let's face it -- if you want to really accelerate a task, you design silicon specifically for that task and interface it to a general purpose core. The article discusses nothing new in the world of computing.

    • AES encryption is built into all modern Intel CPUs, except a few Atoms.

      The enterprise crypto co-processors are mostly for RSA key generation. Something that's only done during connection setup, but can be a substantial load on a high-traffic SSL server that creates hundreds of connections each second.

      • You're sure it's RSA key generation? Or RSA encryption of the generated session key? I thought that RSA keys were generated when certificates are generated, not when the certificate is being used.
        • No, I'm not sure. I'm not a cryptographer, so I'm not sure exactly what maths it's doing. All I can tell you is the practical side: There is something computationally intensive that happens during the setup stage of an SSL connection, and the main purpose of a hardware cryptographic accelerator is to do that something. Usually so that a webserver may handle many more SSL connections per second than CPU alone could handle. The other approach is an appliance that sits before the webserver and does the SSL stu

          • by msobkow ( 48369 )

            When an SSL or HTTPS connection is established, the existing RSA key is used to negotiate the connection, but a connection-specific key is generated and shared over the RSA-keyed initial connection. It's that generation of the connection-specific key that is compute-intensive. If I recall correctly, that secondary key is usually done using a symmetric algorithm that can be processed faster than AES encryption can be, with the caveat that it requires sharing the key, so it can only be safely used if the i

    • Note that a part of AMD's HSA is exactly this - allowing multiple heterogeneous units to do memory-based communication on top of the paged virtual memory, and providing a mechanism for the units to communicate with each other (by sending "command packets"). Now the current implementation in Kaveri is mostly about having a CPU and a GPU on the same chip, but adding other units - even almost-fixed-purpose ones like one that would be able to accept commands of the form of "encrypt [this block of memory] using
  • Back in the day (when I was actually PAID for buying supercomputers) I devised Jim's First Law of Supercomputing: For every computer architecture there is a problem that will solve on that particular architecture "better" than on any other architecture. And conversely, for every problem there is an architecture that will solve this problem "better" than any other architecture. (You get to define "better".) You didn't have to talk to too many computer sales-people to accept this as fact. I believe the point
  • by Sebastopol ( 189276 ) on Saturday November 08, 2014 @11:15PM (#48343521) Homepage

    This whole discussion just made me laugh whilst remembering the hype around the Transmeta / Torvalds code-morphing engine.

    Ah, the 90's. They were fun.

    CPUs have been "general purpose" since day one. The only non-general purpose hardware are ASICs (like the article says). Everything else is just marketing hype from Intel, et al.

    This is such an amazing rehash of what Intel used to call *T technologies in the 90's, starting from the 80's, when coprocessors started appearing (x87). The big trend was toward DSPs in the 90's, but that never happened, instead they pushed on new hardware like MMX, SSE and now vector processors. That's why we have graphics processors as non-general-purpose CPUs.

    To call something a GPGPU is just an egregious assault on common sense.

    "Dark silicon", while a catchy name, is simply a side effect of latency, something the article mostly skips (hints at it with locality): the memory hierarchy exists and dark silicon is a result. When latency is zero, more of the silicon will be engaged.

    While one could easily claim that because parts of any chip power down that means it's not general purpose, that's an oversimplification: 100% utilization is fundamentally impossible because problems aren't solved that way, there is no infinite parallelism.

    I really think the author's analysis isn't fully developed. While the conclusion that hardware looks like the software may be a pleasant tautology, it overlooks Turing's thesis entirely. Which is odd, because that's what the author -started- with!

    • The big trend was toward DSPs in the 90's, but that never happened, instead they pushed on new hardware like MMX, SSE and now vector processors

      No, it did happen. We got DSP-like technology in our PCs in the form of MMX, SSE, and vector processors. So we got both DSP and general-purpose processors.

      To call something a GPGPU is just an egregious assault on common sense.

      If our system bus designs permitted we could have separate graphics output and vector processor. But they don't. You've got to call it something.

  • "[Superscalar architectures] translate the architectural instruction encodings into something more akin to a static single assignment form (ironically, the compiler spends a lot of effort translating from such a form into a finite-register encoding)."

    Which makes me wonder, would it (in principle) be worth designing a chip with an ISA that is based explicitly on single-assignment-form, thereby avoiding both the need for transformations by the compiler and (more importantly) transformations by the CPU at run-time?
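
For readers unfamiliar with the transformation being discussed, here is a minimal sketch of it, assuming straight-line three-address code with no branches (so no phi nodes are needed). The tuple layout, the `to_ssa` name and the `_n` version suffix are invented for this example; they are not taken from the article or from any real compiler or CPU. The input reuses two registers, as a finite-register encoding must; the output gives every write a fresh name, which is roughly the form the quoted passage says superscalar hardware recovers through register renaming.

```python
def to_ssa(instructions):
    """Rename straight-line code so that every destination is written exactly once.

    Each instruction is a (dest, op, src1, src2) tuple. Sources that have not
    been written yet are treated as inputs and left unchanged.
    """
    version = {}  # register name -> number of times it has been written so far
    renamed = []

    def current(name):
        # Read the latest version of a register, or the raw input name.
        return f"{name}_{version[name]}" if name in version else name

    for dest, op, src1, src2 in instructions:
        a, b = current(src1), current(src2)  # resolve reads before the write
        version[dest] = version.get(dest, 0) + 1
        renamed.append((f"{dest}_{version[dest]}", op, a, b))
    return renamed


if __name__ == "__main__":
    program = [
        ("r1", "add", "a", "b"),
        ("r2", "mul", "r1", "c"),
        ("r1", "sub", "r2", "a"),   # r1 is reused here
        ("r2", "add", "r1", "r1"),  # and r2 here
    ]
    for dest, op, s1, s2 in to_ssa(program):
        print(f"{dest} = {op} {s1}, {s2}")
    # prints:
    # r1_1 = add a, b
    # r2_1 = mul r1_1, c
    # r1_2 = sub r2_1, a
    # r2_2 = add r1_2, r1_2
```

Once the names are unique, the dependencies between instructions are explicit: an instruction can issue as soon as the specific values it reads are ready, regardless of which architectural register they once shared.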

    • That is a very cool idea, and is explored in this research project: http://en.wikipedia.org/wiki/T... [wikipedia.org]

    • Which makes me wonder, would it (in principle) be worth designing a chip with an ISA that is based explicitly on single-assignment-form, thereby avoiding both the need for transformations by the compiler and (more importantly) transformations by the CPU at run-time?

      If the ISA is based on single assignment then the compiler will still have to transform the code into this form. In practice the compiler does this anyway when it can, and superscalar hardware does a good job of executing this kind of code. The problem (returning to the subject of the thread) is that not all code fits this pattern, and in these cases a single-assignment ISA will perform particularly badly, so it is not suitable for general purpose processors.

      Researchers have been developing ISAs for decades (in

    • It would be interesting, but it's also a question of encoding density. Having a fixed number of architectural registers (and a much larger number of microarchitectural registers) is a technique that works reasonably well. Adding more architectural registers makes your operand size very large. You could imagine something like Dalvik bytecode, with 2^32 SSA registers and a CPU able to interpret it by either using internal registers or spilling to RAM, but you'd likely end up needing huge instruction caches
  • Yes, you can use almost 100% of the silicon, if you use a BitGrid [blogspot.com] to process information instead of von Neumann architectures.

  • by wonkey_monkey ( 2592601 ) on Sunday November 09, 2014 @04:37AM (#48344235) Homepage

    Here is one definition of a general-purpose processor: if it can run any algorithm, then it is general purpose. This is not a particularly interesting definition, because it ignores the performance aspect that has been the driving goal for most processor development.

    Well, I'm sorry you don't find the definition interesting, but that doesn't mean you can redefine it however you want.

    It's therefore not enough for a processor to be Turing complete in order to be classified as general purpose; it must be able to run all programs efficiently.

    I assume there's a name for a logical fallacy where you redefine terms in order to make your point.

    With this in mind, let's explore what people really mean when they refer to a general-purpose processor: the specific category of workloads that these devices are optimized for and what those optimizations are.

    That's not what I mean when I refer to a general-purpose processor.

    Efficient designs in such a world will require admitting that there is no one-size-fits-all processor design and that there is a large spectrum, with different trade-offs at different points.

    I didn't realise anyone was denying this.

  • by Kim0 ( 106623 )

    I have designs for general and efficient processors, or rather computing structures, and it is provable they are both efficient and general.

    The generality of computing has been known for a long time, through emulation of Turing machines, but this has not been efficient.

  • by rockmuelle ( 575982 ) on Sunday November 09, 2014 @10:54AM (#48345235)

    A big reason we accept the trade offs of modern processors is that it's generally easy to program a broad range of applications for them.

    In the mid aughts (not very long ago, actually), there was a big push for heterogeneous multi-core processors and systems in the HPC space. Roadrunner at Los Alamos was a culmination of this effort (one of the first petascale systems). It was a mix of processor types including IBM's Cell (itself a heterogeneous chip). Programming Roadrunner was a bitch. In having different processor families, you had to decompose your algorithm to target the right processor for a given task. Then you had to worry about moving data efficiently between different processors.

    This type of development is fun as an intellectual exercise, but very difficult and time consuming in practice. It's also something compilers will never be good at, requiring experts in the architectures, domains, and applications to effectively use the system.

    Another lesson from the period (and one that anyone who's done ASICs has known for years) is that general purpose hardware generally evolves fast enough to catch up with specialized hardware within a reasonable timeframe (usually 6-18 months, see DE Shaw's ASIC for protein folding as an example).

    While custom processors are cool (I love hacking on them), they're rarely practical.

    -Chris

  • The real problem is that you never know what you may end up doing. The potential of a speciality bit of silicon is by its very nature limited. It may completely fail at some new task that you didn't think of when you were building it.

    It's great for some really well defined problem, but as soon as that definition is no longer valid, the speciality silicon is useless.

    "General Purpose" means that you can address any problem, including the ones where your specialty silicon fails.
