Scaling To a Million Cores and Beyond

Posted by kdawson
from the can't-get-there-from-here dept.
mattaw writes "In my blog post I describe a system designed to test a route to the potential future of computing. What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion? None of our current programming models or computer architecture models apply to machines of this complexity (and with their corresponding component failure rate and other scaling issues). The current model of coherent memory/identical time/everything can route to everywhere just can't scale to machines of this size. So the scientists at the University of Manchester (including Steve Furber, one of the ARM founders) and the University of Southampton turned to the brain for a new model. Our brains just don't work like any computers we currently make. Our brains have a lot more than 1 million processing elements (more like 100 billion), all of which don't have any precise idea of time (vague ordering of events maybe) nor a shared memory; and not everything routes to everything else. But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly. In effect, modern computing bears as much relation to biological computing as the ordered world of sudoku does to the statistical chaos of quantum mechanics."
This discussion has been archived. No new comments can be posted.

  • With his Connection Machine.

    Don't remember how many cores that one had...

    • Re: (Score:3, Informative)

      by headhot (137860)

      The CM-1 had 65,536 one-bit processors. The CM-5 was in Jurassic Park, and some phone companies.

    • Re: (Score:2, Insightful)

      by teazen (876487)
      Exactly! New is the new old. A million processors? Pah! Old hat. Lots of interesting research has been done into parallel processing in the past. Read the Connection Machine book [google.com.my]. It's a great read.

      Feynman was also involved with the machine at a certain point. There's a great writeup [longnow.org] on him and it for a quick introduction: '.. It was a complicated device; by comparison, the processors themselves were simple. Connecting a separate communication wire between each pair of processors was impractical si
    • by Z00L00K (682162)

      And I would probably select Erlang [erlang.org] as programming language for a massive amount of cores.

      • by pieterh (196118) on Wednesday June 30, 2010 @05:19AM (#32741880) Homepage

        You don't even need Erlang; you can use a lightweight message-passing library like ZeroMQ [zeromq.org] that lets you build fast concurrent applications in 20 or so languages. It looks like sockets but implements Actors that connect in various patterns (pubsub, request-reply, butterfly), and works with Ruby, Python, C, C++, Java, Ada, CLisp, Go, Haskell, Perl, and even Erlang. You can even mix components in any language.

        You get concurrent apps with no shared state, no shared clock, and components that can come and go at any time, and communicate only by sending each other messages.

        In hardware terms it lets you run one thread per core, at full efficiency, with no wait states. In software terms it lets you build at any scale, even to the scale of the human brain, which is basically a message-passing concurrent architecture.
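The no-shared-state pattern described here can be sketched with nothing but Python's standard library. To be clear, this is not ZeroMQ: there are no sockets and no pubsub, just queues between threads standing in for message channels, and `worker` and `run` are made-up names for illustration. Python threads also won't give you one busy core each, so treat it as a picture of the pattern rather than of the performance.

```python
import queue
import threading

def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical actor: holds no shared state, reacts only to messages."""
    while True:
        msg = inbox.get()
        if msg is None:           # sentinel message: shut down
            break
        outbox.put(msg * msg)     # reply by message, never via shared memory

def run(numbers, n_workers=4):
    """Send each number to the worker pool and collect the squared replies."""
    numbers = list(numbers)
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for n in numbers:
        inbox.put(n)
    for _ in threads:
        inbox.put(None)           # one shutdown message per worker
    for t in threads:
        t.join()
    # Replies arrive in no guaranteed order, so sort before returning.
    return sorted(outbox.get() for _ in numbers)

print(run([3, 1, 2]))
```

Workers can be added or removed by changing `n_workers` without touching the rest of the code, which is roughly the "components can come and go" property the comment is describing.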

  • 1 billion cores (Score:3, Informative)

    by should_be_linear (779431) on Wednesday June 30, 2010 @02:20AM (#32741042)
    That's about 30 forks(), and there you go.
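The arithmetic behind the joke, as a quick Python sketch: if every existing process calls fork() once per round, the process count doubles each round. Nothing below actually forks anything, and the function names are invented for illustration.

```python
def processes_after(rounds: int) -> int:
    """Process count if every existing process forks once per round."""
    return 2 ** rounds

def rounds_needed(target: int) -> int:
    """Smallest number of doubling rounds that reaches at least `target`."""
    return (target - 1).bit_length()

print(processes_after(30))           # 1073741824
print(rounds_needed(1_000_000))      # 20
print(rounds_needed(1_000_000_000))  # 30
```

So 30 rounds gets you a little over a billion processes; finding a billion cores to run them on is the part left as an exercise.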
  • multi core design (Score:5, Insightful)

    by girlintraining (1395911) on Wednesday June 30, 2010 @02:22AM (#32741054)

    Simply put, there are some computational problems that work well with parallelization. And there are some that no matter how you try to approach it, you come back to a serial-based model. You could have a billion-core machine running at 1 GHz get stomped by a single-core machine running at 1.7 GHz for certain computational processes. We have yet to find a way computationally or mathematically to make intrinsically serialized problems into parallel ones. If we did, it would probably open up a whole new field of mathematics.

    • Re:multi core design (Score:5, Interesting)

      by jd (1658) <imipak&yahoo,com> on Wednesday June 30, 2010 @02:30AM (#32741090) Homepage Journal

      You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second. On the other hand, there are MANY tasks that are inherently parallel but which are serialized because either the programmers aren't up to the task, the OS isn't up to the task or the CPU isn't up to the task.

      (I don't know if kernel threads under Linux will be divided between CPUs in an SMP system; they certainly can't migrate across motherboards in any MOSIX-type project. That limits how parallel the bottlenecks in the program can ever become. And it's one of the best OSes out there.)

      • Re: (Score:3, Interesting)

        by Your.Master (1088569)

        Actually, you can to some extent serialize a parallel task, with sufficiently many cores.

        For instance, you could just guess at all the intermediate results of halfway through a long sequence of operations, and execute from there, but discard the information if it's wrong. With lots of cores and a good chokepoint, you might be able to gain a 2x speedup a significant percent of the time (for a lower average speedup). 2x, that is, from billions of cores.

        Kind of like branch prediction, or a dynamically genera

        • Ugh, I meant "parallelize a serial task".

        • by gknoy (899301)

          you could just guess at all the intermediate results of halfway through a long sequence of operations, and execute from there, but discard the information if it's wrong.

          How would you know that the calculations were wrong?
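One possible answer, as a Python sketch: you find out when the real first half finishes and its result either matches a guess or doesn't. The toy below assumes a state space small enough (`MOD` values) that every midpoint can be guessed; `step`, `run_chain` and `speculative_chain` are invented names, and Python threads only illustrate the structure, they don't give the 2x.

```python
from concurrent.futures import ThreadPoolExecutor

MOD = 16  # assumption: tiny state space, so every midpoint can be guessed

def step(x: int) -> int:
    """One inherently serial step: each output feeds the next input."""
    return (5 * x + 3) % MOD

def run_chain(x: int, steps: int) -> int:
    """Apply `step` to x, `steps` times in a row."""
    for _ in range(steps):
        x = step(x)
    return x

def speculative_chain(start: int, steps: int) -> int:
    """`start` must be in range(MOD). Second half runs from every guess."""
    half = steps // 2
    with ThreadPoolExecutor() as pool:
        # The real first half is submitted alongside all the guesses.
        first = pool.submit(run_chain, start, half)
        guesses = {g: pool.submit(run_chain, g, steps - half)
                   for g in range(MOD)}
        midpoint = first.result()           # the only verified value
        return guesses[midpoint].result()   # every other guess is thrown away

print(speculative_chain(7, 1000) == run_chain(7, 1000))
```

Sixteen workers for at best a 2x gain is exactly the lopsided trade the parent describes, and it only works because the midpoint could be enumerated.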

      • have 60 people dig one posthole in one second

        Where there's a will, there's a way.

      • by roman_mir (125474)

        You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second.

        - the appropriate analogy is 9 women giving birth to one baby 1 month after conception.

        Then again, this is slashdot, but it's good for learning new stuffs [slashdot.org]

        • by Ihlosi (895663)
          - the appropriate analogy is 9 women giving birth to one baby 1 month after conception. Then again, this is slashdot, but it's good for learning new stuffs

          The woman-month is even more mythical than the man-month in this case.

          *SCNR*

      • by WillDraven (760005) on Wednesday June 30, 2010 @10:38AM (#32744362) Homepage

        You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second.

        We do it all the time around here:
        1 to operate the pile driver
        2 holding up stop/slow signs
        3 riding in the "follow me" vehicle
        4 standing around supervising
        5 cops writing tickets in the surrounding 8 mile work zone
        10 administrators to approve the project
        15 residents jumping out of bed at 6am thinking it's a bomb going off
        20 people sitting in their cars honking their horns for motivational support

        Of course the whole procedure and traffic carnage can last for months or years, but the actual post being rammed in only takes a second. ;-)

    • by anarche (1525323)

      that sounds like a mathematics phd in the making to me...

    • Re: (Score:3, Interesting)

      by Anonymous Coward

      If one is willing to use the transistors of a billion-core machine to speed up a single problem anyway, some transistors could be better used to accelerate the specific problem in the form of custom circuits. There could be a similar layered structure in the system as the brain uses. The lowest, fastest and task customized levels could be built using hard-wired logic, the layer above it using slowly reconfigurable circuits with very fast switching speeds, the layers above easily reconfigurable circuits wit

      • mod parent up. i'm not sure whether it's funny, insightful or plain stupid, but it needs to be seen by others.
        as a sidenote, our brain is perfectly capable of performing complex numerical computations. the problem is that we don't have a way of consciously accessing those functions. similar problems will arise for any system designed on "levels".

    • Re: (Score:3, Interesting)

      by smallfries (601545)

      While turning intrinsically serial problems into a parallel form would certainly open up a new field, it is doubtful that it would be "a whole new field of mathematics". Or did you just like the sound of your hyperbole?

      On a slightly different note; every time there is any article about parallel architecture on slashdot someone raises the problem of inherently serial tasks. Can you name any? Or more to the point can you name an inherently serial task on a realistically sized data-set that can't be tackled by

    • As far as I know the E programming language is designed for this problem.

  • Bluudy Blogs (Score:4, Informative)

    by jd (1658) <imipak&yahoo,com> on Wednesday June 30, 2010 @02:25AM (#32741064) Homepage Journal

    I've left out links to some projects, by request, but everything can be found on their homepage anyway. Anyways, it is this combination that is important, NOT one component alone.

  • don't have any precise idea of time (vague ordering of events maybe) nor a shared memory; and not everything routes to everything else

    Sounds like a large scale distributed system. Maybe somebody should ask google about this.

  • What you are describing is the problems around distributed systems. What would I do with a billion cores? Run tens of millions of instances of VMWare (x8 or 16 each) and write distributed code that runs on millions of machines. No shared memory, communication channels which are slow compared with computation? Basically, that's the line between distributed systems and non-distributed systems. Not that most distributed systems problems are solved, but this is the model that we would be investigating assu
    • What would I do with a billion cores? Run tens of millions of instances of VMWare (x8 or 16 each) and write distributed code that runs on millions of machines.

      Not without very serious disk, memory, and network subsystems. CPU cores are not the only bottleneck in a VM setup.

  • "Our brains have a lot more than 1 million processing elements and not everything routes to everything else"

    That's why I can watch pr0n and code at the same time!

  • by Anonymous Coward on Wednesday June 30, 2010 @02:43AM (#32741140)

    The problem posed by the author is somewhat of a straw man argument: "The trouble is once you go to more than a few thousand cores the shared memory - shared time concept falls to bits."

    Multiple processors in a single multicore aren't required even today to be in lockstep in time (it is actually very difficult to do this). Yes, locally within each core and private caches they do maintain a synchronous clock, but cores can run in their own clock domains. So I don't buy the argument about scaling with "shared time".

    Secondly, the author states that the "future" of computing should automatically be massively parallel. Clearly they are forgetting about Amdahl's Law (http://en.wikipedia.org/wiki/Amdahl's_law). If your application is 99.9% parallelizable, the MOST speedup I can expect to achieve is 1000X, forget about millions. High sequential performance (à la out-of-order execution, etc.) will not be going away anytime in the near future, simply because it is best equipped to deal with serial regions of an application.
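For anyone who wants to check the 1000X figure, here is Amdahl's formula as a small Python sketch, speedup = 1 / (serial + parallel / cores). The function name is made up; the 0.999 is the 99.9% from the paragraph above.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Fixed-size speedup on `cores` cores under Amdahl's Law."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

for cores in (1_000, 1_000_000, 1_000_000_000):
    print(cores, amdahl_speedup(0.999, cores))
```

With a 0.1% serial part, a thousand cores already get you about half of the 1000X ceiling, and going from a million cores to a billion buys almost nothing.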

    Finally, I was under the impression that they were talking about fitting "millions" of cores onto a single die, until I read to the end of the post that they are connecting multiple boards via multi-gigabit links. Each chip on a board has about 20 or so cores with private caches or local store. They talk to other cores on other boards through off-chip links... So isn't this just a plain old message passing computer?! What's the novelty here? Am I missing something?

    • Re: (Score:3, Insightful)

      by jd (1658)

      The main problem is that it's horribly hard to pass that many messages around without the overheads of the network exceeding the benefit from the parallelization. If they have found a way to reduce this problem, I'd call that a major novelty.

    • Re: (Score:2, Interesting)

      by GPSguy (62002)

      In weather forecasting, we find ourselves starting, stopping and waiting. Some of the tiles on which we compute will be trivially simple to complete, while others will not run to something approaching a numerically complete solution for some larger number of iterations. Therefore, we have to wait for the slowest computation/solution before we advect results to the surrounding tiles and begin the process all over again.
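A back-of-the-envelope version of that waiting, sketched in Python. The per-tile times are invented numbers and the function names are illustrative; the only thing taken from the comment is that every timestep ends at a barrier set by the slowest tile.

```python
def step_time(tile_times):
    """With a barrier, one timestep costs as much as the slowest tile."""
    return max(tile_times)

def idle_fraction(tile_times):
    """Share of total core-time spent waiting at the barrier."""
    busy = sum(tile_times)
    total = step_time(tile_times) * len(tile_times)
    return 1.0 - busy / total

# Hypothetical solve times for four tiles, one of them slow to converge.
tiles = [1.0, 1.0, 1.0, 5.0]
print(step_time(tiles), idle_fraction(tiles))
```

One slow tile out of four leaves the machine idle for 60% of its core-time in this made-up case, which is why load balance matters as much as core count.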

      The nature of parallel problems isn't so simple that you generalize about how synchroniza

  • The things we use computers for are different from the things we use humans for.

    Computers are consistent and predictable. The human brain is not.

    We have billions of human brains cheaply available, so let's use those when we want a human brain. And let's use computers when we want computers.

    • by iammani (1392285)
      I would say Brain = Computer; only the Input Output ports/devices are different. Say your task is to find a picture of Megan Fox among a bundle of photos, you could use a brain (along with necessary I/O devices of course) to perform this task. You need to input these photos (along with a few marked snaps of Fox, if the brain does not already know her), through the eyes attached to it and you could ask the brain to output the result using the mouth attached to it.
    • by dargaud (518470)

      The things we use computers for are different from the things we use humans for.

      Yeah, zombies can't eat computers...

    • I would not call computers entirely predictable, there are too many influences which can derail them, for instance an accidental bit flip, hardware design issues etc...

    • by wvmarle (1070040)

      I think this discussion is mainly relevant considering we are reaching end points hardware-wise. There are only so many transistors one can put in a processor. There are only so many GHz silicon can manage. So we're reaching the end point of the processor as a single unit - the logical step is not to increase the one processor, but to take more of those processors and try to let them work together. The human brain may be an interesting model for that.

      And of course in the end Intel et al. have to think of t

    • Re: (Score:2, Interesting)

      by Xest (935314)

      I was going to post about this, but you've already covered part of it, so I'll reply instead.

      I take issue with this statement from the summary:

      "But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly."

      No, they don't look silly, they look smarter and more knowledgeable about the topic than the idiot who made this comment. The human brain has multiple flaws, and whilst it's excellent for some things, it's terrible for others. The human brain relies on emergence, wh

      • by Wescotte (732385) on Wednesday June 30, 2010 @06:15AM (#32742162)

        but hopeless for calculating with a reasonable degree of accuracy the actual distance to that object- the margin of error for most people is on average going to be quite large.

        I disagree. How can we learn to throw a basketball into a tiny hoop from far away without having very accurate estimates? Think of any sport and just how many good estimates are done VERY quickly and pretty damn accurately. How can a painter look at any scene and recreate (to scale) what they see on canvas? I'd say our brains are pretty damn good at calculating with very high accuracy.

        Just because I can't say the hoop is exactly 32.74578453 feet from me doesn't mean I don't know how far it is away. If I can throw the ball into the hoop then I have accurately calculated/predicted the distance.

        Look at how sometimes people are mid-conversation, talking about something they know in depth and suddenly they forget what they were going to say- this is because processing in the brain has gone completely off track.

        I'm having a hard time coming up with a good analogy but I suspect these situations are similar to interrupts in computers. Something more important requires the brain's resources at that time. It's not like the information is forgotten; it's simply not accessible at that moment in time. The information is never "lost", it's just unavailable for a time. If it was lost you wouldn't have the "oh yeah" moments when you remember it or look it up again. You recognize it because you already knew it.

        While I agree the brain isn't as effective at large scale number crunching I do believe it's something the brain can be trained to do. There are plenty of people out there who can do insanely complex arithmetic in their heads. I suspect the reason we all don't have such skills is because we don't need them.

        but there's a lot that current computers can do that the brain can't- serious large scale number crunching for example

        There is no real reason in the survival of the fittest terms for us to be able to accomplish such tasks. So those resources in the brain were put to use on other tasks like accurately processing visual and audio data. I can hear or spot a predator very quickly and accurately in all types of environments and lighting conditions. If we use a computer to perform these tasks we realize just how much computation is required. There is no reason these resources couldn't be allocated to general number crunching. It's just evolution says they are better used for other tasks.

        • Re: (Score:3, Interesting)

          by Xest (935314)

          "I disagree. How can we learn to throw a basketball into a tiny hoop from far away without having very accurate estimates?"

          That was precisely my point, they're still estimates. The human brain can judge based on experience how to throw the ball to get it through a hoop, but can it calculate the distance well enough, and consistently enough, to calculate the angle from the feet of the thrower up to the net to perform some action such as a precise manufacturing task?

          These are two very different things, and are

          • by Wescotte (732385)

            it's quite easy in computers, because we have explicit memory addresses

            Yes, it's true you can find something quickly if you have the memory address. However, you still need to know the address of what you want. A dictionary is a great way to verify the spelling of a word but you still need to have a good idea of how the word is spelled to find it.

            A computer could simply check every address via brute force to find what it needs but that isn't generally what happens. I assume our brain works somewhat similar to a database. You make a query and get results. Those results can b

  • Damaged Brains (Score:4, Insightful)

    by b4upoo (166390) on Wednesday June 30, 2010 @02:48AM (#32741168)

    Some folks with severely damaged brains seem to make better human computers than people with healthy brains. Rain Man leaps to mind, as well as other savants. It seems that when some parts of the brain are impaired the energy of thought is diverted to narrower functions. Perhaps we need to think of delivering more energy to fewer cores to make machines that do tasks that normal humans are not so good at doing.

    • Isn't that the status quo? I currently have a relatively low-power computer that can calculate Pi to an impressive number of decimal places in less than a second - considerably better than I can! However it still can't hold a decent conversation.

    • by Wescotte (732385)
      I think I've decided to no longer consider their brains severely damaged. Their brains seem to be wired in a manner that puts communication (and other things) at a lower priority. The link below is a video demonstrating how a person originally classified as severely mentally disabled was simply unable to communicate in the manner we are accustomed to.
      http://www.wimp.com/autisticgirl/ [wimp.com]

      I think it's simply evolution. While these people would never last in the hunter/gatherer world, in modern times we care for them a
  • Is it coincidence that earlier this month there was a press release from IMEC regarding the issues of massively scaling up computational power ("exascaling")?
    Press blurb can be found here [www2.imec.be].
    Killer application would be "space weather prediction".

  • by thomasinx (643997) on Wednesday June 30, 2010 @02:52AM (#32741190)
    The problems with "coherent memory/identical time/everything can route to everywhere" aren't only seen when you get up to a million cores. I've done plenty of work with MPI and pthreads, and depending on how it's organized, a significant portion of these methods start showing inefficiencies when you get into just a few hundred cores.

    Since there are already plenty of clusters containing thousands upon thousands of individual processors (which don't use coherent memory, etc.), the step to scale up to a million would likely follow the same logical development. There should already be one or two decent CS papers on the topic, since it's basically a problem that's been around since Beowulf clusters were popularized (or even before then).
  • by ctrl-alt-canc (977108) on Wednesday June 30, 2010 @03:03AM (#32741248)

    ...it seemed to me that Amdahl's law [ameslab.gov] was still alive and kicking.

    • Given your statement, why would you link to a document entitled Reevaluating Amdahl's Law [ameslab.gov]? Did you even read what you linked to? Here's an excerpt:

      Our work to date shows that it is not an insurmountable task to extract very high efficiency from a massively-parallel ensemble, for the reasons presented here. We feel that it is important for the computing research community to overcome the "mental block" against massive parallelism imposed by a misuse of Amdahl's speedup formula; speedup should be measured by scaling the problem to the number of processors, not fixing problem size. We expect to extend our success to a broader range of applications and even larger values for N.
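      The "scale the problem, not just the processors" idea in that excerpt is usually written as Gustafson's scaled speedup, s + (1 - s) * N. A minimal Python sketch, with an invented function name; note that s here is the serial fraction of the time measured on the parallel machine, which is not the same quantity as the serial fraction in Amdahl's fixed-size formula.

```python
def gustafson_speedup(serial_fraction: float, cores: int) -> float:
    """Scaled speedup when the problem grows with the core count.

    serial_fraction is the serial share of run time as measured on the
    parallel system (an assumption of this formulation).
    """
    return serial_fraction + (1.0 - serial_fraction) * cores

for cores in (1_000, 1_000_000):
    print(cores, gustafson_speedup(0.001, cores))
```

      Under that assumption a 0.1% serial share costs almost nothing even at a million cores, because the parallel part of the workload has grown along with the machine. Whether your problem can actually be grown that way is the real question.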

      • by jvonk (315830)
        Gustafson's Law giveth, while Amdahl's Law taketh away.

        Interesting... [temple.edu]
      • I didn't link that document by chance...

        In our industrial field we daily compute several finite-difference simulations on a medium level scale of resolution, and each CPU shares with the others a piece of the simulation playground. The bottleneck is in the inter-CPU communications: whenever we graph the performance of our codes, we always see Amdahl's law say "hi!" to us.

        Of course application examples exist where parallelism is predominant, and in this case the second part of the document I linked is verifi

    • by Burnhard (1031106)
      One of the criticisms of Amdahl's Law is that it makes pessimistic assumptions about the amount of program code that must be serial. These assumptions are wholly dependent on the problem domain under consideration of course. To use Google as a case in point, the amount of serial code is closer to 0% and so Amdahl's Law doesn't really apply here. So, to state the obvious, this kind of processing works well if the problem to be solved is naturally parallel in nature (each element can be effectively process
  • by Animats (122034) on Wednesday June 30, 2010 @03:03AM (#32741252) Homepage

    This is very similar to the Inmos Transputer [wikipedia.org], a mid-1980s system. It's the same idea: many processors, no shared memory, message passing over fast serial links. The Transputer suffered from a slow development cycle; by the time it shipped, each new part was behind mainstream CPUs.

    This new thing has more potential, though. There's enough memory per CPU to get something done. Each Cell processor, with only 256K per CPU, didn't have enough memory to do much on its own. 20 CPUs sharing 1GB gives 50MB per CPU, which has more promise. Each machine is big enough that this can be viewed as a cluster, something that's reasonably well understood. Cell CPUs are too tiny for that; they tend to be used as DSPs processing streaming data.

    As usual, the problem will be to figure out how to program it. The original article talks about "neurons" too much. That hasn't historically been a really useful concept in computing.

  • Well, when you start thinking about 10^6 or more cores, it's pretty obvious that they cannot all be connected to each other and cannot all share memory. At that point you're in the realm of neural networking, and are looking at many, many serial (and parallel) tasks running in parallel.

    When you look at the brain, it has evolved so that different areas have different purposes and techniques for processing data. There are some very highly specialized systems in there for very specific problems.

  • I saw Steve Furber talk at Retro Reunited in Huddersfield last year, where he spoke about the past (Acorn, the development of the ARM), the present, and the future in terms of what they were doing with SpiNNaker. Very interesting talk. (I also saw Sophie Wilson, another one of the original ARM developers, at Bletchley Park a couple of weekends ago, another fascinating talk. She now works for Broadcom designing processors for telecommunications.)

  • by Anonymous Coward

    The brain isn't like computers at all. The brain is compartmentalized. There are dozens of separate pieces each with its specialty. It's wired to other pieces in specific ways. There is no "Total Information Awareness"(tm) bullshit going on (what 1 million cores would give you). The problem with TIA is that there is too much crap to wade through. Too big a haystack to find the needle you need. What they found when analyzing Berger-Liaw speech recognition systems against other systems is that the Berger

  • But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly.

    Wouldn't they have just proved the point then? In their case at least... which would mean that they weren't silly after all... so their comment would be silly... which would prove their point... [stack overflow]

  • > What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion?

    Run really awesome screensavers!

  • by CBravo (35450) on Wednesday June 30, 2010 @03:32AM (#32741372)

    My opinion is that you should not require software to be parallelized from the start. You parallelize it during runtime or at compile time.

    This makes sense because parallelization does not add anything in functionality (the outcome should not change). My point is: program functionality and configure/compile parallelization afterwards (possibly by power-users). There could be a unique selling point for open source: parallel performance because you can recompile.

    • by dargaud (518470)

      My opinion is that you should not require software to be parallelized from the start. You parallelize it during runtime or at compile time.

      It certainly is the easiest for the lazy programmer, but certainly not the best way. Using a language that's inherently parallel like Erlang is certainly (one of) the best ways: you don't need to fight with mutexes, semaphores, p-loops as the semantics of the language takes care of it for you. Neat but you need to rethink many processes.

      • by CBravo (35450)

        Rewriting all software in the world using Erlang will not happen. It is not about a rewrite of the world imho, it is about the possible path, starting where we are now, to a new situation.

    • by selven (1556643)

      So the compiler is supposed to guess which parts of the program can be parallelized and which can't be? What if a programmer, not thinking about parallelization at all, accidentally makes his entire program impossible to break up (eg. by using the same variable for loops throughout the entire code)? To do this, the compiler would have to understand the program's intent, and I think we'll get to 1 million processors long before we get that kind of AI.
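      To make the worry concrete, here is a Python sketch of the two kinds of loop a tool would have to tell apart. The function names are invented, and the thread pool only stands in for "hand the iterations to different cores".

```python
from concurrent.futures import ThreadPoolExecutor

def squares(xs):
    """Independent iterations: each output depends only on its own input,
    so the work can be handed out in any order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda x: x * x, xs))

def running_total(xs):
    """Loop-carried dependency: iteration i needs the result of i-1,
    so as written the iterations cannot simply be handed out."""
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

print(squares([1, 2, 3, 4]))
print(running_total([1, 2, 3, 4]))
```

      A running total can in fact be computed in parallel with a scan algorithm, but only by restructuring the loop, which means recognizing what the loop is for rather than how it was written. That is more or less the "understand the program's intent" problem.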

      • by CBravo (35450)

        I am aware of the problems of automatic parallelization with current methods (and I did not state 'automatic'). If there was a simple/simplistic solution I would like to post it here (but I don't know one).

        Your statements have so many assumptions (and not all are true) that you make it hard for yourself to find a solution. Imho one has to think simple to find solutions to hard problems.

  • The Internet (Score:5, Interesting)

    by pmontra (738736) on Wednesday June 30, 2010 @03:37AM (#32741400) Homepage
    The Internet is at least in the 1 billion cores range. The way to use many of them for a parallel computation has been demonstrated by Seti@home, Folding@home and even by botnets. They might not be the most efficient implementations when you have full control of the cores but they show the way to go when the availability of the cores and the communication between them is unreliable, when they have different times and different clocks and when they might be preempted to do different tasks.
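    A toy Python sketch of that style of computation: work units are handed to volunteers that may vanish, and anything that doesn't come back is simply handed out again. This is a single-process simulation with invented names and an invented failure rate; it is not how Seti@home or Folding@home are actually implemented.

```python
import random
from collections import deque

def volunteer_sum(work_units, failure_rate=0.3, seed=0):
    """Sum all payloads despite unreliable workers.

    work_units: list of (unit_id, list_of_numbers) pairs.
    Returns (total, attempts), where attempts counts every hand-out.
    """
    rng = random.Random(seed)       # seeded so the simulation is repeatable
    pending = deque(work_units)
    results = {}
    attempts = 0
    while pending:
        unit_id, payload = pending.popleft()
        attempts += 1
        if rng.random() < failure_rate:
            # Volunteer went away or was preempted: re-queue the unit.
            pending.append((unit_id, payload))
            continue
        results[unit_id] = sum(payload)
    return sum(results.values()), attempts

units = [(i, list(range(i * 10, i * 10 + 10))) for i in range(5)]
print(volunteer_sum(units))
```

    The answer comes out the same however many volunteers drop out; only the number of attempts grows. The price is that every work unit has to be independent and safe to run more than once.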
  • With 100+ cores we should start considering leaving von Neumann behind.

    Separated memory/processing/instructions/registers would stop making sense. We would have to follow another model. I don't know which.

  • by master_p (608214) on Wednesday June 30, 2010 @04:04AM (#32741534)

    The brain does not do arithmetic, it only does pattern matching. That's what most people don't get and that's the obstacle to understanding and realizing AI.

    If you ask how humans can then do math in their brain, the answer is simple: they can't, but a pattern matching system can be trained to do math by learning all the relevant patterns.

    If you further ask how humans can do logical inference in their brain, the answer is again simple: they can't, and that's the reason people believe in illogical things. Their answers are the result of pattern matching, just like Google returning the wrong results.

    • by Ihlosi (895663)
      The brain does not do arithmetic, it only does pattern matching.

      Down at the neuron level, the brain does arithmetic on pulse-frequency-modulated signals.

  • The current model of coherent memory/identical time/everything can route to everywhere just can't scale to machines of this size

    Why not? Obviously, you can't have a million processors accessing the same variable in memory, but with a layered system of caches, you could keep most processors working in their own local copy. As soon as a processor writes to memory that's also used by another process, extra hardware will keep the memory coherent. This architecture is basically a superset of a message passing a

  • ... then our processors also have millions already.

    A neuron is a fairly simple processing element, after all. Complexity comes from the sheer number of connections with other neurons that a single neuron can have.

  • In Soviet Russia, a Beowulf cluster of these imagines YOU!!!!!

  • Our brains have billions of neurons, not billions of cores. These are completely different beasts when it comes to architecture.
  • The Human brain is rather spiffy, but it's far from perfect. It has fantastic performance, but it can frequently screw up massively. I don't think mankind would be too pleased if their most powerful computers got depressed (like Marvin) - we'd probably expect to be able to use them, and to trust their output. If they work like the human brain, we can't always do both.
  • by Skapare (16644)

    When are they going to get to a point where all of the RAM is on the same die with the processor core(s) that need access to that RAM? By shortening the path to the RAM from going off-chip to staying on-chip, the opportunity for increased speed and lower power consumption arises. And this can also be constructed more compactly, allowing more such complete processors within the same space. Then with more processors, at some point we no longer even need virtual memory for at least the bulk of the processor

  • Related stories
    Intel Says to Prepare For "Thousands of Cores" [slashdot.org]

    There, hope that helps.
