Scaling To a Million Cores and Beyond

Posted by kdawson on Wednesday June 30, 2010 @02:13AM from the can't-get-there-from-here dept.

mattaw writes "In my blog post I describe a system designed to test a route to the potential future of computing. What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion? None of our current programming models or computer architecture models apply to machines of this complexity (and with their corresponding component failure rate and other scaling issues). The current model of coherent memory, identical time, and everything being able to route to everywhere just can't scale to machines of this size. So the scientists at the University of Manchester (including Steve Furber, one of the ARM founders) and the University of Southampton turned to the brain for a new model. Our brains just don't work like any computers we currently make. Our brains have a lot more than 1 million processing elements (more like 100 billion), all of which don't have any precise idea of time (vague ordering of events maybe) nor a shared memory; and not everything routes to everything else. But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly. In effect, modern computing bears as much relation to biological computing as the ordered world of sudoku does to the statistical chaos of quantum mechanics."

This discussion has been archived. No new comments can be posted.

Reminds me of Hillis (Score:2)

by Arancaytar ( 966377 ) writes:

With his Connection Machine.
Don't remember how many cores that one had...
- Re: (Score:3, Informative)
  
  by headhot ( 137860 ) writes:
  
  The CM-1 had 65,536 one-bit processors. The CM-5 was in Jurassic Park, and at some phone companies.
- Re: (Score:2, Insightful)
  
  by teazen ( 876487 ) writes:
  
  Exactly! New is the new old. A million processors? Pah! Old hat. Lots of interesting research has been done on parallel processing in the past. Read the Connection Machine book [google.com.my]. It's a great read.
  
  Feynman was also involved with the machine at a certain point. There's a great writeup [longnow.org] on him and it for a quick introduction: '.. It was a complicated device; by comparison, the processors themselves were simple. Connecting a separate communication wire between each pair of processors was impractical si
- Re: (Score:2)
  
  by Z00L00K ( 682162 ) writes:
  
  And I would probably select Erlang [erlang.org] as programming language for a massive amount of cores.
  - Re:Reminds me of Hillis (Score:5, Interesting)
    
    by pieterh ( 196118 ) writes: on Wednesday June 30, 2010 @05:19AM (#32741880) Homepage
    
    You don't even need Erlang; you can use a lightweight message-passing library like ZeroMQ [zeromq.org] that lets you build fast concurrent applications in 20 or so languages. It looks like sockets but implements Actors that connect in various patterns (pubsub, request-reply, butterfly), and works with Ruby, Python, C, C++, Java, Ada, CLisp, Go, Haskell, Perl, and even Erlang. You can even mix components in any language.
    You get concurrent apps with no shared state, no shared clock, and components that can come and go at any time, and communicate only by sending each other messages.
    In hardware terms it lets you run one thread per core, at full efficiency, with no wait states. In software terms it lets you build at any scale, even to the scale of the human brain, which is basically a message-passing concurrent architecture.
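The shared-nothing, message-only style described above can be sketched without any library. The following is a minimal illustration using plain POSIX pipes and `fork()`, not ZeroMQ; the function name `request_square` and the "square a number" job are made up for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Request-reply between two processes that share no state.
   The parent sends `value` as a message; a worker process replies
   with value * value. Returns the reply, or -1 on any error.
   (Hypothetical helper for illustration only.) */
long request_square(long value) {
    int req[2], rep[2];                 /* request pipe, reply pipe */
    if (pipe(req) != 0) return -1;
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return -1; }

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Worker: sees only what arrives on the request pipe. */
        close(req[1]);
        close(rep[0]);
        long x;
        if (read(req[0], &x, sizeof x) != (ssize_t)sizeof x) _exit(1);
        long y = x * x;
        if (write(rep[1], &y, sizeof y) != (ssize_t)sizeof y) _exit(1);
        _exit(0);
    }

    /* Client: send one message, wait for one reply. */
    close(req[0]);
    close(rep[1]);
    long reply = -1;
    if (write(req[1], &value, sizeof value) == (ssize_t)sizeof value) {
        if (read(rep[0], &reply, sizeof reply) != (ssize_t)sizeof reply)
            reply = -1;
    }
    close(req[1]);
    close(rep[0]);
    waitpid(pid, NULL, 0);              /* reap the worker */
    return reply;
}
```

Calling `request_square(12)` from a `main` should return 144. A real library adds the parts this sketch leaves out: addressing, queuing, reconnection, and patterns beyond one-shot request-reply.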
    
- - Re: (Score:2)
    
    by skids ( 119237 ) writes:
    
    What sucks life out of *my* CPU is waiting for cache lines while running gigantic hairballs of spaghetti "business logic" code and its associated over-OO'd overhead.
1 billion cores (Score:3, Informative)

by should_be_linear ( 779431 ) writes: on Wednesday June 30, 2010 @02:20AM (#32741042)

That's about 30 fork()s, and there you go.

- Re: (Score:3, Insightful)
  
  by BhaKi ( 1316335 ) writes:
  
  Wrong. You need 999999999 forks.
  - Re:1 billion cores (Score:4, Informative)
    
    by TheSpoom ( 715771 ) writes: <slashdot&uberm00,net> on Wednesday June 30, 2010 @03:17AM (#32741318) Homepage Journal
    
    I'm pretty sure the poster meant to do something like this:
    fork();
    fork();
    fork(); // etc.
    which would make the number of processes increase exponentially every time the forked processes forked again. Not 1, 2, 3, but 1, 2, 4, 8, 16... and 2^30 gets you above 1 billion.
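The doubling argument above can be checked with a small sketch. `forks_needed` only does the arithmetic, while `count_processes_after_forks` really forks and counts the resulting processes through a pipe. Both function names are invented for this example, and the second one should only be run with a small `n`, since it creates 2^n processes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Smallest number of successive fork() calls after which at least
   `target` processes exist, given that each call doubles the count. */
int forks_needed(unsigned long long target) {
    int n = 0;
    unsigned long long procs = 1;
    while (procs < target && n < 63) {
        procs *= 2;
        n++;
    }
    return n;
}

/* Run n forks for real and count the processes that result.
   Every process writes one byte into a shared pipe; the original
   process counts the bytes. Returns the count, or -1 on error. */
int count_processes_after_forks(int n) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    pid_t root = getpid();

    for (int i = 0; i < n; i++) {
        if (fork() < 0) break;      /* on failure the count comes up short */
    }

    char token = 'x';
    ssize_t written = write(fds[1], &token, 1);
    close(fds[1]);

    if (getpid() != root) {
        /* Descendant: reap own children, then exit without returning. */
        close(fds[0]);
        while (wait(NULL) > 0) { }
        _exit(written == 1 ? 0 : 1);
    }

    /* Original process: read until every write end has been closed. */
    int count = 0;
    char buf[64];
    ssize_t got;
    while ((got = read(fds[0], buf, sizeof buf)) > 0) count += (int)got;
    close(fds[0]);
    while (wait(NULL) > 0) { }
    return count;
}
```

Under these assumptions `forks_needed(1000000000ULL)` gives 30, matching the "about 30 forks" estimate, and `count_processes_after_forks(4)` should report 16 processes.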
    
    - Re: (Score:2)
      
      by Laxori666 ( 748529 ) writes:
      
      I think GP meant you should build the neuron elements out of forks. Maybe they have a unique potential for computing power?
      - Re: (Score:2)
        
        by VoidCrow ( 836595 ) writes:
        
        Behold, the Fork of Truth!
multi core design (Score:5, Insightful)

by girlintraining ( 1395911 ) writes: on Wednesday June 30, 2010 @02:22AM (#32741054)

Simply put, there are some computational problems that work well with parallelization. And there are some that, no matter how you try to approach them, bring you back to a serial model. You could have a billion-core machine running at 1 GHz get stomped by a single-core machine running at 1.7 GHz for certain computational processes. We have yet to find a way, computationally or mathematically, to make intrinsically serial problems into parallel ones. If we did, it would probably open up a whole new field of mathematics.

- Re:multi core design (Score:5, Interesting)
  
  by jd ( 1658 ) writes: <imipak AT yahoo DOT com> on Wednesday June 30, 2010 @02:30AM (#32741090) Homepage Journal
  
  You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second. On the other hand, there are MANY tasks that are inherently parallel but which are serialized because either the programmers aren't up to the task, the OS isn't up to the task or the CPU isn't up to the task.
  (I don't know if kernel threads under Linux will be divided between CPUs in an SMP system; they certainly can't migrate across motherboards in any MOSIX-type project. That limits how parallel the bottlenecks in the program can ever become. And it's one of the best OSes out there.)
  
  - Re: (Score:3, Interesting)
    
    by Your.Master ( 1088569 ) writes:
    
    Actually, you can to some extent serialize a parallel task, with sufficiently many cores.
    For instance, you could just guess at all the intermediate results halfway through a long sequence of operations, and execute from there, but discard the information if it's wrong. With lots of cores and a good chokepoint, you might be able to gain a 2x speedup a significant percent of the time (for a lower average speedup). 2x, that is, from billions of cores.
    Kind of like branch prediction, or a dynamically genera
    - Re: (Score:2)
      
      by Your.Master ( 1088569 ) writes:
      
      Ugh, I meant "parallelize a serial task".
    - Re: (Score:2)
      
      by gknoy ( 899301 ) writes:
      
      you could just guess at all the intermediate results halfway through a long sequence of operations, and execute from there, but discard the information if it's wrong.
      How would you know that the calculations were wrong?
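One possible reading of the proposal, sketched below, is that a guess is validated by comparing it with the real midpoint once the first half of the chain has finished. Everything here is a made-up illustration: `step` is a toy serial recurrence, the loop over guesses stands in for work that would run on separate cores, and `MAX_GUESSES` is an arbitrary cap.

```c
#include <stddef.h>

#define MAX_GUESSES 16

/* Toy serial step: each value depends on the previous one. */
unsigned step(unsigned x) { return (x * 5u + 3u) % 16u; }

/* Plain serial evaluation of `steps` applications of step(). */
unsigned run(unsigned x, int steps) {
    for (int i = 0; i < steps; i++) x = step(x);
    return x;
}

/* Speculative evaluation of run(start, 2 * half).
   Sets *hit to 1 if a speculated result could be used, else 0. */
unsigned speculative_run(unsigned start, int half,
                         const unsigned *guesses, size_t n, int *hit) {
    unsigned tail[MAX_GUESSES];
    if (n > MAX_GUESSES) n = MAX_GUESSES;

    /* Speculation: one core per guessed midpoint runs the second half.
       (Done in a loop here; on real hardware these would be concurrent.) */
    for (size_t i = 0; i < n; i++) tail[i] = run(guesses[i], half);

    /* Meanwhile one core computes the real first half. */
    unsigned mid = run(start, half);

    /* Validation: a guess was right only if it equals the real midpoint. */
    for (size_t i = 0; i < n; i++) {
        if (guesses[i] == mid) { *hit = 1; return tail[i]; }
    }

    /* Every guess was wrong: discard the speculation and finish serially. */
    *hit = 0;
    return run(mid, half);
}
```

With all 16 possible states supplied as guesses, the call always hits and the wall-clock time on enough cores would be roughly that of one half, which is the 2x bound mentioned above. With fewer guesses the result is still correct, just not always faster.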
  - Re: (Score:2)
    
    by CarpetShark ( 865376 ) writes:
    
    have 60 people dig one posthole in one second
    Where there's a will, there's a way.
  - Re: (Score:2)
    
    by roman_mir ( 125474 ) writes:
    
    You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second.
    - the appropriate analogy is 9 women giving birth to one baby 1 month after conception.
    Then again, this is slashdot, but it's good for learning new stuffs [slashdot.org]
    - Re: (Score:2)
      
      by Ihlosi ( 895663 ) writes:
      
      - the appropriate analogy is 9 women giving birth to one baby 1 month after conception. Then again, this is slashdot, but it's good for learning new stuffs
      The woman-month is even more mythical than the man-month in this case.
      *SCNR*
  - Re:multi core design (Score:5, Funny)
    
    by WillDraven ( 760005 ) writes: on Wednesday June 30, 2010 @10:38AM (#32744362) Homepage
    
    You cannot parallelize a serial task, any more than you can have 60 people dig one posthole in one second.
    We do it all the time around here:
    1 to operate the pile driver
    2 holding up stop/slow signs
    3 riding in the "follow me" vehicle
    4 standing around supervising
    5 cops writing tickets in the surrounding 8 mile work zone
    10 administrators to approve the project
    15 residents jumping out of bed at 6am thinking it's a bomb going off
    20 people sitting in their cars honking their horns for motivational support
    Of course the whole procedure and traffic carnage can last for months or years, but the actual post being rammed in only takes a second. ;-)
    
  - - Re:multi core design (Score:4, Informative)
      
      by jd ( 1658 ) writes: <imipak AT yahoo DOT com> on Wednesday June 30, 2010 @04:10AM (#32741578) Homepage Journal
      
      You've got to be careful when talking about threads. There are four basic models: SISD, SIMD, MISD and MIMD. Of those, only SISD is serial, but if you've two independent SISD tasks, you can run them in parallel. Most modern supercomputers are built on the premise that SIMD is good enough. Not sure where MISD is used, MIMD fell out of favour when vector processors became too expensive but may be revived on more modest CPUs with modern interconnects like Infiniband.
      
      - Re:multi core design (Score:5, Interesting)
        
        by William Robinson ( 875390 ) writes: on Wednesday June 30, 2010 @04:45AM (#32741760)
        
        Not sure where MISD is used
        Back in 1987, when I was part of a team designing a parallel processing machine in which 4 neighboring CPUs shared common memory (apart from their own local memory, a kind of systolic array), we were designing a machine suitable for simulating aerodynamics or forecasting weather using diffusion equations. We believed that it worked on the MISD model, where different algorithms running on different CPUs used the same data for analysis, using bus arbitration logic.
        
      - Re: (Score:2)
        
        by skids ( 119237 ) writes:
        
        Speculative execution is a current manifestation of MISD, FWIW. However with the push towards power savings to preserve battery life, it probably won't be as popular going forward.
- Re: (Score:2)
  
  by anarche ( 1525323 ) writes:
  
  that sounds like a mathematics phd in the making to me...
- Re: (Score:3, Interesting)
  
  by Anonymous Coward writes:
  
  If one is willing to use the transistors of a billion-core machine to speed up a single problem anyway, some transistors could be better used to accelerate the specific problem in the form of custom circuits. There could be a layered structure in the system similar to the one the brain uses. The lowest, fastest and task-customized levels could be built using hard-wired logic, the layer above it using slowly reconfigurable circuits with very fast switching speeds, the layers above easily reconfigurable circuits wit
  - Re: (Score:2)
    
    by chichilalescu ( 1647065 ) writes:
    
    mod parent up. i'm not sure whether it's funny, insightful or plain stupid, but it needs to be seen by others.
    as a sidenote, our brain is perfectly capable of performing complex numerical computations. the problem is that we don't have a way of consciously accessing those functions. similar problems will arise for any system designed on "levels".
- Re: (Score:3, Interesting)
  
  by smallfries ( 601545 ) writes:
  
  While turning intrinsically serial problems into a parallel form would certainly open up a new field, it is doubtful that it would be "a whole new field of mathematics". Or did you just like the sound of your hyperbole?
  On a slightly different note; every time there is any article about parallel architecture on slashdot someone raises the problem of inherently serial tasks. Can you name any? Or more to the point can you name an inherently serial task on a realistically sized data-set that can't be tackled by
  - Re: (Score:2)
    
    by imgod2u ( 812837 ) writes:
    
    Adobe Flash?
- E programming language (Score:2)
  
  by elucido ( 870205 ) * writes:
  
  As far as I know the E programming language is designed for this problem.
- - Re: (Score:2)
    
    by hedwards ( 940851 ) writes:
    
    Actually, you'd parallelize that one by just stealing a baby that's already been started. I think that's the most sensible way, or you know you could always rent or adopt if you really need one that fast.
    - Re: "steal" (Score:4, Funny)
      
      by TaoPhoenix ( 980487 ) writes: <TaoPhoenix@yahoo.com> on Wednesday June 30, 2010 @09:58AM (#32743808) Journal
      
      No no - you had the golden chance and missed it!
      You *license* the baby!
      
Bluudy Blogs (Score:4, Informative)

by jd ( 1658 ) writes: <imipak AT yahoo DOT com> on Wednesday June 30, 2010 @02:25AM (#32741064) Homepage Journal
- Hardware design tools [man.ac.uk]
- Transactional Memory [man.ac.uk]
- TERAFLUX: Exploiting Dataflow Parallelism in Teradevice Computing [teraflux.eu]
- SpiNNaker - A Universal Spiking Neural Network Architecture [man.ac.uk]
- The Balsa Asynchronous Synthesis System [man.ac.uk]
I've left out links to some projects, by request, but everything can be found on their homepage anyway. Anyways, it is this combination that is important, NOT one component alone.
Distributed systems (Score:2)

by MichaelSmith ( 789609 ) writes:

don't have any precise idea of time (vague ordering of events maybe) nor a shared memory; and not everything routes to everything else
Sounds like a large scale distributed system. Maybe somebody should ask google about this.
- Re: (Score:2)
  
  by CarpetShark ( 865376 ) writes:
  
  Boink has half a million computers, many of which probably have more than 2 cores.
  - Re: (Score:2)
    
    by CarpetShark ( 865376 ) writes:
    
    Boinc, even.
    - Re: (Score:2)
      
      by MichaelSmith ( 789609 ) writes:
      
      Yeah though boinc is pretty simple. Just a flat array of machines which do stuff. Google is closer to the spaghetti like structure of the brain.
Distributed Computing (Score:2)

by sfcat ( 872532 ) writes:

What you are describing are the problems around distributed systems. What would I do with a billion cores? Run tens of millions of instances of VMWare (x8 or 16 each) and write distributed code that runs on millions of machines. No shared memory, communication channels which are slow compared with computation? Basically, that's the line between distributed systems and non-distributed systems. Not that most distributed systems problems are solved, but this is the model that we would be investigating assu
- Re: (Score:2)
  
  by CarpetShark ( 865376 ) writes:
  
  What would I do with a billion cores? Run tens of millions of instances of VMWare (x8 or 16 each) and write distributed code that runs on millions of machines.
  Not without very serious disk, memory, and network subsystems. CPU cores are not the only bottleneck in a VM setup.
multi-tasking (Score:2)

by vivek7006 ( 585218 ) writes:

"Our brains have a lot more than 1 million processing elements and not everything routes to everything else"
That's why I can watch pr0n and code at the same time!
Problems with this blog. (Score:4, Informative)

by Anonymous Coward writes: on Wednesday June 30, 2010 @02:43AM (#32741140)

The problem posed by the author is somewhat of a straw man argument: "The trouble is once you go to more than a few thousand cores the shared memory - shared time concept falls to bits."
Multiple processors in a single multicore chip aren't required even today to be in lockstep in time (it is actually very difficult to do this). Yes, locally within each core and its private caches they do maintain a synchronous clock, but cores can run in their own clock domains. So I don't buy the argument about scaling with "shared time".
Secondly, the author states that the "future" of computing should automatically be massively parallel. Clearly they are forgetting about Amdahl's Law (http://en.wikipedia.org/wiki/Amdahl's_law). If your application is 99.9% parallelizable, the MOST speedup I can expect to achieve is 1000X, forget about millions. High sequential performance (ala out-of-order execution, etc.) will not be going away anytime in the near future simply because they are best equipped to deal with serial regions of an application.
Finally, I was under the impression that they were talking about fitting "millions" of cores onto a single die, until I read to the end of the post and saw that they are connecting multiple boards via multi-gigabit links. Each chip on a board has about 20 or so cores with private caches or local store. They talk to other cores on other boards through off-chip links... So isn't this just a plain old message-passing computer?! What's the novelty here? Am I missing something?
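The Amdahl's Law figure quoted in this comment (99.9% parallelizable, at most 1000X) follows from the usual formula, speedup = 1 / ((1 - p) + p / n). A minimal sketch, with function names chosen only for this example:

```c
#include <stdio.h>

/* Amdahl's Law: speedup on n cores when a fraction p of the work
   (0 <= p <= 1) can be parallelized and the rest stays serial. */
double amdahl_speedup(double p, double n) {
    return 1.0 / ((1.0 - p) + p / n);
}

/* Upper bound as n grows without limit: 1 / (1 - p). */
double amdahl_limit(double p) {
    return 1.0 / (1.0 - p);
}

/* Print the speedup for a few core counts, for illustration. */
void print_amdahl_table(double p) {
    const double cores[] = { 1e3, 1e6, 1e9 };
    for (int i = 0; i < 3; i++) {
        printf("%.0f cores: %.1fx\n", cores[i], amdahl_speedup(p, cores[i]));
    }
    printf("limit: %.1fx\n", amdahl_limit(p));
}
```

For p = 0.999 the speedup is already about half the limit at a thousand cores, and going from a million cores to a billion changes it by less than one part in a thousand.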

- Re: (Score:3, Insightful)
  
  by jd ( 1658 ) writes:
  
  The main problem is that it's horribly hard to pass that many messages around without the overheads of the network exceeding the benefit from the parallelization. If they have found a way to reduce this problem, I'd call that a major novelty.
- Re: (Score:2, Interesting)
  
  by GPSguy ( 62002 ) writes:
  
  In weather forecasting, we find ourselves starting, stopping and waiting. Some of the tiles on which we compute will be trivially simple to complete, while others will not run to something approaching a numerically complete solution for some larger number of iterations. Therefore, we have to wait for the slowest computation/solution before we advect results to the surrounding tiles and begin the process all over again.
  The nature of parallel problems isn't so simple that you generalize about how synchroniza
Human brain != computer (Score:2, Insightful)

by i-like-burritos ( 1532531 ) writes:

The things we use computers for are different from the things we use humans for.
Computers are consistent and predictable. The human brain is not.
We have billions of human brains cheaply available, so let's use those when we want a human brain. And let's use computers when we want computers.
- Re: (Score:2)
  
  by iammani ( 1392285 ) writes:
  
  I would say Brain = Computer; only the Input Output ports/devices are different. Say your task is to find a picture of Megan Fox among a bundle of photos, you could use a brain (along with necessary I/O devices of course) to perform this task. You need to input these photos (along with a few marked snaps of Fox, if the brain does not already know her), through the eyes attached to it and you could ask the brain to output the result using the mouth attached to it.
- Re: (Score:2)
  
  by dargaud ( 518470 ) writes:
  
  The things we use computers for are different from the things we use humans for.
  Yeah, zombies can't eat computers...
- Re: (Score:2)
  
  by MemoryDragon ( 544441 ) writes:
  
  I would not call computers entirely predictable, there are too many influences which can derail them, for instance an accidental bit flip, hardware design issues etc...
- Re: (Score:2)
  
  by wvmarle ( 1070040 ) writes:
  
  I think this discussion is mainly relevant considering we are reaching end points hardware-wise. There are only so many transistors one can put in a processor. There are only so many GHz silicon can manage. So we're reaching the end point of the processor as a single unit - the logical step is not to grow the one processor, but to take more of those processors and try to let them work together. The human brain may be an interesting model for that.
  And of course in the end Intel et. al have to think of t
- Re: (Score:2, Interesting)
  
  by Xest ( 935314 ) writes:
  
  I was going to post about this, but you've already covered part of it, so I'll reply instead.
  I take issue with this statement from the summary:
  "But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly."
  No, they don't look silly, they look smarter and more knowledgeable about the topic than the idiot who made this comment. The human brain has multiple flaws, and whilst it's excellent for some things, it's terrible for others. The human brain relies on emergence, wh
  - Re:Human brain != computer (Score:5, Insightful)
    
    by Wescotte ( 732385 ) writes: on Wednesday June 30, 2010 @06:15AM (#32742162)
    
    but hopeless for calculating with a reasonable degree of accuracy the actual distance to that object- the margin of error for most people is on average going to be quite large.
    I disagree. How can we learn to throw a basketball into a tiny hoop from far away without having very accurate estimates? Think of any sport and just how many good estimates are made VERY quickly and pretty damn accurately. How can a painter look at any scene and recreate (to scale) what they see on canvas? I'd say our brains are pretty damn good at calculating with very high accuracy.
    
    Just because I can't say the hoop is exactly 32.74578453 feet from me doesn't mean I don't know how far it is away. If I can throw the ball into the hoop then I have accurately calculated/predicted the distance.
    Look at how sometimes people are mid-conversation, talking about something they know in depth and suddenly they forget what they were going to say- this is because processing in the brain has gone completely off track.
    I'm having a hard time coming up with a good analogy but I suspect these situations are similar to interrupts in computers. Something more important requires the brain's resources at that time. It's not like the information is forgotten; it's simply not accessible at that moment in time. The information is never "lost", it's just unavailable for a time. If it was lost you wouldn't have the "oh yeah" moments when you remember it or look it up again. You recognize it because you already knew it.
    
    While I agree the brain isn't as effective at large scale number crunching I do believe it's something the brain can be trained to do. There are plenty of people out there who can do insanely complex arithmetic in their heads. I suspect the reason we all don't have such skills is because we don't need them.
    but there's a lot that current computers can do that the brain can't- serious large scale number crunching for example
    There is no real reason in the survival of the fittest terms for us to be able to accomplish such tasks. So those resources in the brain were put to use on other tasks like accurately processing visual and audio data. I can hear or spot a predator very quickly and accurately in all types of environments and lighting conditions. If we use a computer to perform these tasks we realize just how much computation is required. There is no reason these resources couldn't be allocated to general number crunching. It's just evolution says they are better used for other tasks.
    
    - Re: (Score:3, Interesting)
      
      by Xest ( 935314 ) writes:
      
      "I disagree. How can we learn to throw a basketball into a tiny hoop from far away without having very accurate estimates?"
      That was precisely my point, they're still estimates. The human brain can judge based on experience how to throw the ball to get it through a hoop, but can it calculate the distance well enough, and consistently enough, to calculate the angle from the feet of the thrower up to the net to perform some action such as a precise manufacturing task?
      These are two very different things, and are
      - Re: (Score:2)
        
        by Wescotte ( 732385 ) writes:
        
        it's quite easy in computers, because we have explicit memory addresses
        
        Yes, it's true you can find something quickly if you have the memory address. However, you still need to know the address of what you want. A dictionary is a great way to verify the spelling of a word but you still need to have a good idea of how the word is spelled to find it.
        
        A computer could simply check every address via brute force to find what it needs but that isn't generally what happens. I assume our brain works somewhat similar to a database. You make a query and get results. Those results can b
    - - Re: (Score:2)
        
        by Wescotte ( 732385 ) writes:
        
        I think there is numerical analysis involved; it's just that we don't directly have access to it. Our conscious mind doesn't have root access and thus is limited to some sort of API, and there are private data structures I simply cannot access via my conscious mind.
        
        True, in the case of the basketball we only get limited feedback from our throw function. It returns make/miss overthrow/underthrow type results. However, even with our initial throw we have some basis for how far we can throw something. We might not th
    - - Re: (Score:2)
        
        by Wescotte ( 732385 ) writes:
        
        I agree it's pattern matching but I also consider that to be a mathematical process.
        
        A machine/computer can only be as accurate as it is because we isolate it. We limit the number of things that can go wrong. A brain doesn't have that luxury. Practice seems to be learning to isolate a task from unknown and unrelated external forces/input.
        
        I see no reason why a brain can't be trained to perform the exact same task as a computer/machine being as accurate and consistent. It's just we don't do it. We find s
- - Re: (Score:2)
    
    by chichilalescu ( 1647065 ) writes:
    
    Yes, because only humans were able to do them. But a computer can count much faster and much more accurately than a human, BECAUSE it is consistent and predictable.
    In practice the relationship is:
    human brains are very good at generating algorithms to solve specific problems.
    computers are very good at applying the algorithm.
    - - Re: (Score:2)
        
        by chichilalescu ( 1647065 ) writes:
        
        the best we came up with (as far as I know) is to have an algorithm that mimics the human brain :)
        so it would only be useful for robots going in places where humans can't follow. because, at least for now, we lack the technology of efficiently simulating the human brain. These guys are trying to do just that (I think): rethink the hardware to simulate brains more efficiently.
- - Re: (Score:2)
    
    by itsdapead ( 734413 ) writes:
    
    this is because we do not understand the algorithms/software running inside the brain.
    I thought that enough was known to suggest that the brain works more like a neural net than a digital computer.
    i also think that the brain is exactly analogous to a digital computer.
    Or is it digitous to an analogue computer?
    Seriously, though, Neural Net != Digital (Turing/Von Neumann) computer. No real concept of an algorithm, no real concept of "computability" (only ever delivers a good guess, not an analytic solution, doesn't even require the problem to be formulated).
    ...and while a computer can simulate a neural net, vice-versa is a bit more tricky...
Damaged Brains (Score:4, Insightful)

by b4upoo ( 166390 ) writes: on Wednesday June 30, 2010 @02:48AM (#32741168)

Some folks with severely damaged brains seem to make better human computers than people with healthy brains. Rain Man leaps to mind as well as other savants. It seems that when some parts of the brain are impaired the energy of thought is diverted to narrower functions. Perhaps we need to think of delivering more energy to fewer cores to make machines that do tasks that normal humans are not so good at doing.

- Re: (Score:2)
  
  by imakemusic ( 1164993 ) writes:
  
  Isn't that the status quo? I currently have a relatively low-power computer that can calculate Pi to an impressive amount of decimal places in less than a second - considerably better than I can! However it still can't hold a decent conversation.
- Re: (Score:2)
  
  by Wescotte ( 732385 ) writes:
  
  I think I've decided to no longer consider their brains severely damaged. Their brains seem to be wired in a manner that puts communication (and other things) at a lower priority. The link below is a video demonstrating how a person originally classified as severely mentally disabled was simply unable to communicate in the manner we are accustomed to.
  http://www.wimp.com/autisticgirl/ [wimp.com]
  
  I think it's simply evolution. While these people would never last in the hunter/gatherer world, in modern times we care for them a
Link with IMEC ExaScience lab? (Score:2, Interesting)

by Hedon ( 192607 ) writes:

Is it coincidence that earlier this month there was a press release from IMEC regarding the issues of massively scaling up computational power ("exascaling")?
Press blurb can be found here [www2.imec.be].
Killer application would be "space weather prediction".
Distributed Computing (Score:3, Informative)

by thomasinx ( 643997 ) writes: on Wednesday June 30, 2010 @02:52AM (#32741190)

The problems with "coherent memory/identical time/everything can route to everywhere" aren't only seen when you get up to a million cores. I've done plenty of work with MPI and pthreads, and depending on how it's organized, a significant portion of these methods start showing inefficiencies when you get into just a few hundred cores.

Since there are already plenty of clusters containing thousands upon thousands of individual processors (which don't use coherent memory, etc.), the step to scale up to a million would likely follow the same logical development. There should already be one or two decent CS papers on the topic, since it's basically a problem that's been around since Beowulf clusters were popularized (or even before then).

Last time I ran a parallel program... (Score:3, Interesting)

by ctrl-alt-canc ( 977108 ) writes: on Wednesday June 30, 2010 @03:03AM (#32741248)

...it seemed to me that Amdahl's law [ameslab.gov] was still alive and kicking.

- Re:Last time I ran a parallel program... (Score:5, Insightful)
  
  by palegray.net ( 1195047 ) writes: <philip DOT paradis AT palegray DOT net> on Wednesday June 30, 2010 @03:26AM (#32741352) Homepage Journal
  
  Given your statement, why would you link to a document entitled Reevaluating Amdahl's Law [ameslab.gov]? Did you even read what you linked to? Here's an excerpt:
  Our work to date shows that it is not an insurmountable task to extract very high efficiency from a massively-parallel ensemble, for the reasons presented here. We feel that it is important for the computing research community to overcome the "mental block" against massive parallelism imposed by a misuse of Amdahl's speedup formula; speedup should be measured by scaling the problem to the number of processors, not fixing problem size. We expect to extend our success to a broader range of applications and even larger values for N.
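The scaled-speedup idea in that excerpt is usually written as Gustafson's Law, S = n + (1 - n) * s, where s is the serial fraction of the run time as measured on the n-processor system. That formula is the standard textbook form, not something quoted from the linked document, and the function name below is only for this sketch.

```c
/* Gustafson's scaled speedup: the problem grows with the machine,
   so the serial fraction s (0 <= s <= 1) is measured on the
   n-processor run rather than on a fixed-size problem. */
double gustafson_speedup(double s, double n) {
    return n + (1.0 - n) * s;
}
```

With s = 0.001 and a million processors this gives roughly 999,000x, against a ceiling of about 1000x when the same 0.1% serial share is plugged into Amdahl's fixed-size formula. The two numbers answer different questions, which is the point of the excerpt.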
  
  - Re: (Score:2)
    
    by jvonk ( 315830 ) writes:
    
    Gustafson's Law giveth, while Amdahl's Law taketh away.
    
    Interesting... [temple.edu]
  - Re: (Score:2)
    
    by ctrl-alt-canc ( 977108 ) writes:
    
    I didn't link that document by chance...
    In our industrial field we run several finite-difference simulations daily at a medium resolution, and each CPU shares a piece of the simulation playground with the others. The bottleneck is in the inter-CPU communications: whenever we graph the performance of our codes, we always see Amdahl's law say "hi!" to us.
    Of course application examples exist where parallelism is predominant, and in this case the second part of the document I linked is verifi
- Re: (Score:2)
  
  by Burnhard ( 1031106 ) writes:
  
  One of the criticisms of Amdahl's Law is that it makes pessimistic assumptions about the amount of program code that must be serial. These assumptions are wholly dependent on the problem domain under consideration of course. To use Google as a case in point, the amount of serial code is closer to 0% and so Amdahl's Law doesn't really apply here. So, to state the obvious, this kind of processing works well if the problem to be solved is naturally parallel in nature (each element can be effectively process
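The two laws argued over in this thread can be compared numerically. The sketch below is illustrative only: the function names and the 5% serial fraction are assumptions, not figures from any commenter, and the serial fraction means slightly different things in the two formulas (measured on the single-processor run for Amdahl, on the parallel run for Gustafson).

```python
def amdahl_speedup(serial_fraction: float, n: int) -> float:
    """Fixed problem size: speedup = 1 / (s + (1 - s) / n)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n)


def gustafson_speedup(serial_fraction: float, n: int) -> float:
    """Problem size grows with n: scaled speedup = n - s * (n - 1)."""
    return n - serial_fraction * (n - 1)


if __name__ == "__main__":
    s = 0.05  # assumed serial fraction, purely illustrative
    for n in (10, 1_000, 1_000_000):
        # Print core count, fixed-size speedup, scaled speedup.
        print(n, round(amdahl_speedup(s, n), 2), round(gustafson_speedup(s, n), 2))
```

With a 5% serial fraction, the fixed-size speedup saturates just under 20 however many cores are added, while the scaled speedup keeps growing roughly linearly with the core count. Which of the two applies depends on whether the workload really grows with the machine.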
Transputer, The Next Generation (Score:3, Insightful)

by Animats ( 122034 ) writes: on Wednesday June 30, 2010 @03:03AM (#32741252) Homepage

This is very similar to the Inmos Transputer [wikipedia.org], a mid-1980s system. It's the same idea: many processors, no shared memory, message passing over fast serial links. The Transputer suffered from a slow development cycle; by the time it shipped, each new part was behind mainstream CPUs.
This new thing has more potential, though. There's enough memory per CPU to get something done. Each Cell processor, with only 256K per CPU, didn't have enough memory to do much on its own. 20 CPUs sharing 1GB gives 50MB per CPU, which has more promise. Each machine is big enough that this can be viewed as a cluster, something that's reasonably well understood. Cell CPUs are too tiny for that; they tend to be used as DSPs processing streaming data.
As usual, the problem will be to figure out how to program it. The original article talks about "neurons" too much. That hasn't historically been a really useful concept in computing.
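As a rough picture of the "no shared memory, message passing over links" style described above, here is a minimal sketch. It is not Transputer or SpiNNaker code: the names `node` and `run_pipeline` are made up for illustration, and Python threads with queues stand in for processors and serial links.

```python
import queue
import threading


def node(inbox, outbox, fn):
    """One 'processor': reads messages from its inbox, writes results to its outbox."""
    while True:
        msg = inbox.get()
        if msg is None:          # None is the shutdown message; pass it along
            outbox.put(None)
            return
        outbox.put(fn(msg))


def run_pipeline(values, stages):
    """Chain the stages with queues; nodes communicate only through messages."""
    links = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=node, args=(links[i], links[i + 1], fn))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for v in values:             # values themselves must not be None
        links[0].put(v)
    links[0].put(None)
    results = list(iter(links[-1].get, None))  # read until the shutdown message
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 2]))
```

Each stage only ever touches the message it was handed, which is the property that lets this style spread across machines without a coherent shared memory.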

The Brain. (Score:2)

by headhot ( 137860 ) writes:

Well, when you start thinking about 10^6 or more cores, it's pretty obvious that they cannot all be connected to each other and cannot all share memory. At that point you're in the realm of neural networking, and are looking at many, many serial (and parallel) tasks running in parallel.
When you look at the brain, it has evolved so that different areas have different purposes and techniques for processing data. There are some very highly specialized systems in there for very specific problems.
Steve Furber (Score:2)

by Alioth ( 221270 ) writes:

I saw Steve Furber talk at Retro Reunited in Huddersfield last year, where he spoke about the past (Acorn, the development of the ARM), the present, and the future in terms of what they were doing with SpiNNaker. Very interesting talk. (I also saw Sophie Wilson, another one of the original ARM developers, at Bletchley Park a couple of weekends ago, another fascinating talk. She now works for Broadcom designing processors for telecommunications.)
Yep, it's not like computers (Score:2, Interesting)

by Anonymous Coward writes:

The brain isn't like computers at all. The brain is compartmentalized. There are dozens of separate pieces, each with its specialty. It's wired to other pieces in specific ways. There is no "Total Information Awareness"(tm) bullshit going on (what 1 million cores would give you). The problem with TIA is that there is too much crap to wade through. Too big a haystack to find the needle you need. What they found when analyzing Berger-Liaw speech recognition systems against other systems is that the Berger
.... ends up looking pretty silly (Score:2)

by Chrisq ( 894406 ) writes:

But anyone who argues the brain isn't a pretty spiffy processing system ends up looking pretty silly.
Wouldn't they have just proved the point then? In their case at least... which would mean that they weren't silly after all... so their comment would be silly... which would prove their point... [stack overflow]
Obviously... (Score:2)

by muckracer ( 1204794 ) writes:

> What do we do when we have computers with 1 million cores? What about a billion? How about 100 billion?
Run really awesome screensavers!
Where to start? software (Score:3, Interesting)

by CBravo ( 35450 ) writes: on Wednesday June 30, 2010 @03:32AM (#32741372)

My opinion is that you should not require software to be parallelized from the start. You parallelize it during runtime or at compile time.
This makes sense because parallelization does not add anything in functionality (the outcome should not change). My point is: program the functionality first, then configure or compile in the parallelization afterwards (possibly done by power users). There could be a unique selling point for open source here: parallel performance, because you can recompile.

- Re: (Score:2)
  
  by dargaud ( 518470 ) writes:
  
  My opinion is that you should not require software to be parallelized from the start. You parallelize it during runtime or at compile time.
  It certainly is the easiest for the lazy programmer, but certainly not the best way. Using a language that's inherently parallel like Erlang is certainly (one of) the best ways: you don't need to fight with mutexes, semaphores, p-loops as the semantics of the language takes care of it for you. Neat but you need to rethink many processes.
  - Re: (Score:2)
    
    by CBravo ( 35450 ) writes:
    
    Rewriting all software in the world using Erlang will not happen. It is not about a rewrite of the world imho, it is about the possible path, starting where we are now, to a new situation.
- Re: (Score:2)
  
  by selven ( 1556643 ) writes:
  
  So the compiler is supposed to guess which parts of the program can be parallelized and which can't be? What if a programmer, not thinking about parallelization at all, accidentally makes his entire program impossible to break up (e.g. by using the same variable for loops throughout the entire code)? To do this, the compiler would have to understand the program's intent, and I think we'll get to 1 million processors long before we get that kind of AI.
  - Re: (Score:2)
    
    by CBravo ( 35450 ) writes:
    
    I am aware of the problems of automatic parallelization with current methods (and I did not state 'automatic'). If there was a simple/simplistic solution I would like to post it here (but I don't know one).
    Your statements have so many assumptions (and not all are true) that you make it hard for yourself to find a solution. Imho one has to think simple to find solutions to hard problems.
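The disagreement above comes down to which loops a tool can safely split up. A minimal sketch of the distinction, with made-up function names and a thread pool standing in for "many cores" (in CPython, threads show the structure but will not speed up CPU-bound work):

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    return x * x


def independent_loop(values):
    """Each iteration reads only its own input, so iterations can run in any order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(square, values))  # map returns results in input order


def running_total(values):
    """Loop-carried dependency: iteration i needs the total left by iteration i - 1."""
    totals = []
    total = 0
    for v in values:
        total += v
        totals.append(total)
    return totals


if __name__ == "__main__":
    print(independent_loop([1, 2, 3, 4]))
    print(running_total([1, 2, 3, 4]))
```

The first loop can be farmed out mechanically. The second can also be parallelized, but only by switching to a different algorithm (a parallel prefix scan), which is the kind of restructuring that needs knowledge of what the code is meant to compute.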
The Internet (Score:5, Interesting)

by pmontra ( 738736 ) writes: on Wednesday June 30, 2010 @03:37AM (#32741400) Homepage

The Internet is at least in the 1 billion cores range. The way to use many of them for a parallel computation has been demonstrated by SETI@home, Folding@home and even by botnets. They might not be the most efficient implementations when you have full control of the cores, but they show the way to go when the availability of the cores and the communication between them is unreliable, when they have different times and different clocks, and when they might be preempted to do different tasks.
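The core idea of computing on unreliable, preemptible nodes can be sketched as a work queue that reissues any unit whose worker never reports back. This is a toy simulation under stated assumptions, not how SETI@home or Folding@home are actually implemented: `run_unreliable`, the 30% failure rate and the seeded random generator are all hypothetical.

```python
import random
from collections import deque


def run_unreliable(work_units, compute, failure_rate=0.3, seed=0):
    """Hand out work units one at a time; requeue a unit when its worker 'vanishes'."""
    rng = random.Random(seed)      # seeded so the simulation is repeatable
    pending = deque(work_units)
    results = {}
    attempts = 0
    while pending:
        unit = pending.popleft()
        attempts += 1
        if rng.random() < failure_rate:
            pending.append(unit)   # no answer came back: reissue later
            continue
        results[unit] = compute(unit)
    return results, attempts


if __name__ == "__main__":
    results, attempts = run_unreliable(range(10), lambda x: x * x)
    print(len(results), "units done in", attempts, "attempts")
```

Because each unit is independent and carries everything needed to compute it, no worker needs a shared clock or shared memory, and losing a worker costs only a retry.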

Bye bye von Neumann (Score:2)

by kikito ( 971480 ) writes:

With 100+ cores we should start considering leaving von Neumann behind.
Separated memory/processing/instructions/registers would stop making sense. We would have to follow another model. I don't know which.
- Re: (Score:2)
  
  by TheTurtlesMoves ( 1442727 ) writes:
  
  This is the way for a lot of microcontrollers and DSPs. It's not all it's cracked up to be. Pros and cons and all that.
- Re: (Score:2)
  
  by Skapare ( 16644 ) writes:
  
  OK, so, does that mean I should include RAM in an SoC or not?
The brain isn't a spiffy processing system. (Score:5, Insightful)

by master_p ( 608214 ) writes: on Wednesday June 30, 2010 @04:04AM (#32741534)

The brain does not do arithmetic, it only does pattern matching. That's what most people don't get and that's the obstacle to understanding and realizing AI.
If you ask how humans can then do math in their brain, the answer is simple: they can't, but a pattern matching system can be trained to do math by learning all the relevant patterns.
If you further ask how humans can do logical inference in their brain, the answer is again simple: they can't, and that's the reason people believe in illogical things. Their answers are the result of pattern matching, just like Google returning the wrong results.

- Re: (Score:2)
  
  by Ihlosi ( 895663 ) writes:
  
  The brain does not do arithmetic, it only does pattern matching.
  Down at the neuron level, the brain does arithmetic on pulse-frequency-modulated signals.
What's wrong with the coherent memory model ? (Score:2)

by Arlet ( 29997 ) writes:

The current model of coherent memory/identical time/everything can route to everywhere just can't scale to machines of this size
Why not? Obviously, you can't have a million processors accessing the same variable in memory, but with a layered system of caches, you could keep most processors working in their own local copy. As soon as a processor writes to memory that's also used by another process, extra hardware will keep the memory coherent. This architecture is basically a superset of a message passing a
If our brains have 100 billion processing elements (Score:2)

by Ihlosi ( 895663 ) writes:

... then our processors also have millions already.
A neuron is a fairly simple processing element, after all. Complexity comes from the sheer number of connections with other neurons that a single neuron can have.
Obligatory (Score:2)

by Hognoxious ( 631665 ) writes:

In Soviet Russia, a Beowulf cluster of these imagines YOU!!!!!
Not a brain (Score:2)

by Yvanhoe ( 564877 ) writes:

Our brains have billions of neurons, not billions of cores. These are completely different beasts when it comes to architecture.
The Human Brain (Score:2)

by dave420 ( 699308 ) writes:

The human brain is rather spiffy, but it's far from perfect. It has fantastic performance, but it can frequently screw up massively. I don't think mankind would be too pleased if their most powerful computers got depressed (like Marvin) - we'd probably expect to be able to use them, and to trust their output. If they work like the human brain, we can't always do both.
SoC (Score:2)

by Skapare ( 16644 ) writes:

When are they going to get to a point where all of the RAM is on the same die with the processor core(s) that need access to that RAM? By shortening the path to the RAM from going off-chip to staying on-chip, the opportunity for increased speed and lower power consumption arises. And this can also be constructed more compactly, allowing more such complete processors within the same space. Then with more processors, at some point we no longer even need virtual memory for at least the bulk of the processor
Fuck you lazy editors (Score:2)

by drinkypoo ( 153816 ) writes:

Related stories
Intel Says to Prepare For "Thousands of Cores" [slashdot.org]
There, hope that helps.
- Re:Better be running OSS (Score:4, Interesting)
  
  by jd ( 1658 ) writes: <imipak AT yahoo DOT com> on Wednesday June 30, 2010 @02:38AM (#32741126) Homepage Journal
  
  I don't know about this specific project, but Manchester is strongly Open Source. The Manchester Computer Centre developed one of the first Linux distributions (and - at the time - one of the best). The Advanced Processor Technologies group has open-sourced software for developing asynchronous microelectronics and FPGA design software.
  Manchester University is highly regarded for pioneering work (they were working on parallel systems in 1971, and developed the first stored-program computer in 1948) and they have never been ashamed to share what they know and do. (Disclaimer: I studied at and worked at UMIST, which was bought by Manchester, and my late father was a senior lecturer/reader of Chemistry at Manchester. I also maintain Freshmeat pages for the BALSA projects at APT.)
  
- Re:Better be running OSS (Score:5, Funny)
  
  by capo_dei_capi ( 1794030 ) writes: on Wednesday June 30, 2010 @02:48AM (#32741166)
  
  This 1-million core machine better be running open source software and not proprietary software.
  Yeah, especially if their software is licensed on a per-core basis.
  
  - What about the E programming language? (Score:2)
    
    by elucido ( 870205 ) * writes:
    
    Why wouldn't E scale up to a billion cores?
- Re:Dangerous idea (Score:5, Insightful)
  
  by ProfessionalCookie ( 673314 ) writes: on Wednesday June 30, 2010 @03:33AM (#32741382) Journal
  
  From a science perspective I'm pretty sure that either computers are already "sentient" or (IMHO, more likely) that we don't really understand what sentience is. At all.
  
- Re: (Score:2)
  
  by Hognoxious ( 631665 ) writes:
  
  Butlerian Jihad here we come!
- Re: (Score:2, Interesting)
  
  by perryizgr8 ( 1370173 ) writes:
  
  On the contrary, I can't wait for sentient machines. At last we will be free of shoddy human programming. No vendor lock-ins and other such stuff. Just tell your computer to write a specific program you desire for a specific purpose, and he will write it for you.
- Re: (Score:2)
  
  by Wescotte ( 732385 ) writes:
  
  What's really going to blow your mind is that analog doesn't really exist ;)
  - Re: (Score:2)
    
    by necro81 ( 917438 ) writes:
    
    Talk to any chip designer and they'll tell you that digital doesn't really exist. Want to understand how to get a MOSFET to work properly in silicon? You'll need to model it as a peculiar analog device.
    - Re: (Score:2)
      
      by John Hasler ( 414242 ) writes:
      
      > You'll need to model it as a peculiar analog device.
      No, as a quantum mechanical device.
