Slashdot Log In
Is Parallelism the New New Thing?
Posted by
kdawson
on Friday March 28, @10:23AM
from the still-working-on-the-old-new-thing dept.
from the still-working-on-the-old-new-thing dept.
astwon sends us to a blog post by parallel computing pioneer Bill McColl speculating that, with the cooling of Web 2.0, parallelism may be a hot new area for entrepreneurs and investors. (Take with requisite salt grains as he is the founder of a Silicon Valley company in this area.) McColl suggests a few other upcoming "new things," such as Saas as an appliance and massive memory systems. Worth a read.
Related Stories
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Full
Abbreviated
Hidden
Loading ... Please wait.

About time (Score:5, Funny)
Re:About time (Score:4, Insightful)
When I was in grad school back in the 1970s, people thought parallelism would be the next big thing, and it had some interesting technical challenges, so I got into it as much as was possible back then. Then I got out into the Real World [TM], where such ideas just got blank looks and "Let's move on" replies.
Some what later, in the 1980s, I worked on projects at several companies who thought that parallelism was the next big thing. That time around, I got my hands on a number of machines with hundreds of processors and gigabytes (Wow!) of memory, so I could actually try out some of the ideas from the previous decade. The main things that I learned was that 1) many of the concepts were viable, and 2) debugging in an environment where nothing is reproducible is hard. And I moved on, mostly to networking projects where you could actually do loosely-coupled multiprocessing (though management still gave you blank looks if you started talking in such terms).
Now we're getting personal computers with more than one processor. It's been a bit of a wait, but we even have management saying it's the "new^N thing". And debugging parallel code is still hard.
I'll predict that 1) We'll see a lot of commercial parallelized apps now, and 2) those apps will never be debugged, giving us flakiness that outshines the worst of what Microsoft sold back in the 1980s and 1990s. We'll still have the rush to market; developers will still be held to nonsensical schedules; debugging will still be treated as an "after-market" service; and we developers will still be looking for debugging tools that work for more than toy examples (and work with a customer's app that has been running for months when a show-stopper bug pops up).
There's a reason that, despite the existence of multi-process machines for several decades, we still have very little truly parallel code that works. Debugging the stuff is hard, mostly because bugs can rarely be reproduced.
(Of course, this won't prevent the flood of magical snake-oil tools that promise to solve the problem. There's a lot of money to be made there.
Re:ACtually (Score:4, Funny)
Paul Otellini ?
I didn't know you posted on slashdot !
So what's up man ? Can I buy you a beer ?
Re:About time (Score:5, Informative)
Windows is not the only closed-source proprietary operating system out there. AIX and Solaris have supported parallel functions for a number of years, and various IBM mainframe operating systems have had those functions since the '70's. There are architectures which had it in the '60's.
Proprietary closed-source operating systems had these functions FIRST before Linux was a twinkle in Linus Torvalds's shorts.
Re:About time (Score:5, Funny)
Do not mock the shorts of Torvalds, for they are mighty indeed!
Re:Multithreading Is to Blame (Score:4, Insightful)
Let me know when I can buy (Score:3, Insightful)
TRIPS (Score:4, Interesting)
Re: (Score:3, Interesting)
Re:TRIPS (Score:4, Interesting)
I guess my point is that I think we'll actually create the basic, expandable model fairly quickly. Would you agree that today's supercomputing, which utilizes parallelization on a scale far beyond desktop computing, has successfully harnessed parallelization? I hope you would. If so, then the next step is miniaturization of what supercomputing is already doing. That step is just now taking place. It's not something that will happen overnight, but I do think that after we've fully integrated parallelization into everyday computing we'll be back to the same old game again: that of looking for ever better ways to increase FLOPS through transistor/switch speed.
My basic thoughts on this are that it is, in theory, easier to model the perfect parallelization of a program, and the optimum number of cores for a specific type of computer, than it is to model the fastest possible clock speed of a CPU. Because of this we'll probably see diminishing returns in advancement of parallelization at an accelerated rate compared to CPU design and clock speed.
1% of programmers (Score:5, Insightful)
This seems far, far too low. Admittedly I work in a place that does "parallel programming," but it still seems awfully low.
Re: (Score:3, Interesting)
The entire hierarchy system in the IT fields has to deal with the painfully obvious fact that less than 1% of programmers know what t
Re:1% of programmers (Score:5, Insightful)
I think your experience is wildly skewed toward the high end of programming skill. The percentage of working programmers who can't iterate over an array is probably in the 15-20% range, even without getting into whether "web programmers" are included in that statistic. I'd be astonished if the number with parallel experience is significantly above 1%.
Re:1% of programmers (Score:5, Insightful)
If you ask how many can "regularly achieve significant performance through use of multiple threads" then 0.1% is far too high. If you mean "can exchange data between a userland thread and an ISR in compliance with the needs of reliable parallel execution" then its a safe bet that less than 0.1% are mentally up to the challenge. /. readers are not typical of the programming cummiity. These days people who can drag-and-drop call themselves programmers. Poeple who can spell "l337" are one!
evolution, not revolution (Score:5, Insightful)
Decade after decade, people keep trying to sell silver bullets for parallel computing: the perfect language, the perfect network, the perfect os, etc. Nothing ever wins big. Instead, there is a diversity of solutions for a diversity of problems, and progress is slow but steady.
Re:evolution, not revolution (Score:4, Insightful)
The next problem is that parallelism is not, as a rule, CPU-bound but network-bound. All the libraries in the world won't work when the network clogs and chokes.
The third problem is that coders are taught serial methods. Parallel thinking is very different from serial thinking. You run into problems that do not exist in the serial world, even on a timeslicing system like Linux. True parallelism, like true clockless computing, is a nightmare to do well. You can't just shove another library in and hope things'll work.
The fourth problem is the level of connectedness. Globus is a great toolkit for some things, but you wouldn't use it for programming a vector computer or - most likely - even a Beowulf cluster. It's a gridding solution and a damn good one, but grids are all it will do well. On the flip-side, solutions like bproc and MOSIX are superb mechanisms for optimally using a fairly tight cluster, but you'd never sanely use them on a grid. The latencies would make the very features that make those solutions useful in a cluster useless on a WAN-based grid.
I'm not sure I'm keen on Java on any parallel solution other than gridding. It's too slow, its threading model is still in its infancy, and the sandboxing makes RDMA an absolute nightmare to do safely. In fact, the very definition of sandboxing is that external entities can't go around poking bits of data into memory, which is the entire essence of RDMA - CPUless networking.
Regardless, there are some things that C++ and Java simply cannot do well that other, parallel-specific languages like Pi-Occam can do with extreme ease and safety. It is possible to prove an Occam program is safe. You cannot do likewise with a C++ or Java program.
Parallelism isn't just about more threads on one CPU. In a totally generalized parallel scenario, there may be any number of threads - a few tens of thousands would not be unusual - running on systems that may be SMP, multi-core, multi-threaded, vectored, clockless, or any combination of the above, where those systems may be on a tightly-coupled or loosely-coupled cluster, and where the cluster may be homogenous or heterogeneous, SSI or multi-imaged, where memory may be local, NUMA or distributed, and where these systems/clusters may be gridded over wide-area networks that may or may not be reliable or operational at any given time, and where threads, processes and entire operating systems may migrate from system to system without user intervention or awareness on the part of the application.
The number of true parallel experts in the world probably number less than a dozen. No, I'm not one of them. I'm good, I understand the problem-space better than the average coder, but I've talked to some of the experts out there and they're as far beyond any traditional programmer as a traditional programmer is beyond the chipmunk. A network engineer might consider themselves OK if they can set up OSPF optimally across a traditional star network of star networks. Any traditional routing protocol over a mesh without getting flaps and maintaining a reasonable level of fault tolerence would be considered tough. A butterfly network, a torroidal network or a hypercube would leave said network engineer a gibbering wreck. Modern supercomputers do not take up buildings. Modern supercomputers take up a few rooms. The interconnects take up entire buildings. And the air conditioning on top systems can be measured in football stadia.
OpenMOSIX is largely dead, because it was impossible to reconcile those who wanted load-balancing with those who wanted HPC. It's not that they can't be reconciled in theory, it's that the mindsets are too different to cram into one brain.
If one solution could solve parallelism, the Transputer would be the only processor in use today and Intel would be a smoking pile of twisted rubble. Or we'd be using Parallel Fortran or Parallel C today, and object-oriented programming (which doesn't parallelize easily) would never have evolved. SC|08 would have just one major vendor, the same as all the other technological trade shows. Even the Linux-oriented trade shows only have one or two core vendors, the rest are hangers-on. The supercomputer trade shows look more like the 1970s homebrew computer clubs - almost zero uniformity. The biggest joke was when Microsoft turned up one year and showed Excell on Windows Cluster Edition. There were a few pointy-hair types watching the presentations, but the geeks were mostly harassing the stalls that had more interesting stuff. Sandia had a neat gizmo for doing hardware collective operations, improving performance by easily an order of magnitude. There were source-to-source compilers that would rework code to be substantially faster on a given compiler. And there were some very impressive compilers out there! None for Java, that I recall. C, C++, Fortran, but no Java. The day someone writes a full BLAS/LAPACK implementation in Java that can compare with existing top-line low-level solutions is the day Java will be a screaming success in HPC. Like it or not, Parallel is HPC. It may not be called that, but that's what it is.
Parallel computing pioneer likes Parallelism (Score:4, Insightful)
Shock! And Awe!
"Next hot thing" my hiney (Score:5, Insightful)
Because really, anybody believes Web-Two-Oh was anything but the regular web's natural evolution with a fancy name tacked on?
Re: (Score:3, Insightful)
Web 2.0 was a single name for an amorphous collection of technologies and philosophies. It was even worse than AJAX.
Parallelism is a pretty simple, well-defined problem, and an old one. That doesn't mean it can't turn into
More so now, but depends ... (Score:4, Insightful)
One thing that should also be noted, is that in certain cases you will need to accept increased memory usage, since you want to avoid tasks locking on resources that they don't really need to synchronise until the end of the work unit. In this case it may be cheaper to duplicate resources, do the work and then resynchronise at the end. Like everything it depends on the size and duration of the work unit.
Even if your application is not doing enough to warrant running its tasks in parallel, the operating system could benefit, so that applications don't suffer on sharing resources that don't need to be shared.
Didn't we have this debate last week? (Score:5, Informative)
And the conclusion?
It's been around for years numbnuts, in commercial and server applications, middle tiers, databases and a million and one other things worked on by serious software developers (i.e. not web programming dweebs).
Parallelism has been around for ages and has been used commercially for a couple of decades. Get over it.
Please no (Score:3, Funny)
Coming soon: professional object-oriented XML-based AJAX-powered scalable five-nines high-availability multi-tier enterprise turnkey business solutions that convert visitors into customers, optimize cash flows, discover business logic and opportunities, and create synergy between their stupidity and their bank accounts - parallelized.
"the bastards say 'welcome'" (Score:5, Informative)
Here are a couple of lessons learned from that Ada experience:
1. Sometimes you want synchronization, and sometimes you want avoidance. Ada83 Tasking/Rendezvous provided synchronization, but was hard to use for avoidance. Ada95 added protected objects to handle avoidance.
2. In Ada83, aliasing by default was forbidden, which made it a lot easier for the compiler to reason about things like cache consistency. Ada95 added more pragmas, etc, to provide additional control on aliasing and atomic operations.
3. A lot of the early experience with concurrency and parallelism in Ada learned (usually the hard way) that there's a 'sweet spot' in the number of concurrent actions. Too many, and the machine bogs down in scheduling and synchronization. Too few, and you don't keep all of the processors busy. One of the interesting things that Karl Nyberg worked on in his Sun T1000 contest review was the tuning necessary to keep as many cores as possible running. (http://www.grebyn.com/t1000/ [grebyn.com] ) (Disclosure: I don't work for Grebyn, but I do have an account on grebyn.com as a legacy of the old days when they were in the ISP business in the '80s, and Karl is an old friend of very long standing....)
All this reminds me of a story from Tracy Kidder's Soul of a New Machine http://en.wikipedia.org/wiki/The_Soul_of_a_New_Machine [wikipedia.org]. There was an article in the trade press pointing to an IBM minicomputer, with the title "IBM legitimizes minicomputers". Data General proposed (or ran, I forget which) an ad that built on that article, saying "The bastards say, 'welcome' ".
dave
"...with the cooling of Web 2.0,..." (Score:5, Insightful)
What am I missing? (Score:4, Insightful)