Auto-Parallelizing Compiler From Codeplay 147

Max Romantschuk writes "Parallelization of code can be a very tricky thing. We've all heard of the challenges with Cell, and with dual and quad core processors this is becoming an ever more important issue to deal with. The Inquirer writes about a new auto-parallelizing compiler called Sieve from Codeplay: 'What Sieve is is a C++ compiler that will take a section of code and parallelize it for you with a minimum hassle. All you really need to do is take the code you want to run across multiple CPUs and put beginning and end tags on the parts you want to run in parallel.' There is more info on Sieve available on Codeplay's site."
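
To make the "beginning and end tags" idea concrete, the fragment below is a purely hypothetical illustration of marking a region for the compiler, using made-up SIEVE_BEGIN / SIEVE_END markers rather than Codeplay's actual syntax (which is described in their own documentation).

    // Hypothetical sketch only: SIEVE_BEGIN / SIEVE_END stand in for whatever
    // begin/end tags Sieve really uses; they are not Codeplay's syntax.
    #include <cstddef>

    void scale(float *data, std::size_t n, float factor)
    {
        // SIEVE_BEGIN -- the compiler is free to split these iterations
        //                across cores, since each element is written independently
        for (std::size_t i = 0; i < n; ++i)
            data[i] *= factor;
        // SIEVE_END
    }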
  • Interesting, but.. (Score:5, Insightful)

    by DigitAl56K ( 805623 ) on Friday March 09, 2007 @11:49PM (#18297198)
    The compiler will put out code for x86, Ageia PhysX and Cell/PS3. There were three tests talked about today, CRC, Julia Ray Tracing and Matrix Multiply. All were run on 8 cores (2S Xeon 5300 CPUs) and showed 739, 789 and 660% speedups respectively.

    That's great - but do the algorithms involved here naturally lend themselves to the parallelization techniques the compiler uses? Are there algorithms that are very poor choices for parallelization? For example, can you effectively parallelize a sort? Wouldn't each thread have to avoid exchanging data elements any other thread was working on, and therefore cause massive synchronization issues? A solution might be to divide the data set by the number of threads and then, after each set was sorted, merge them in order - but that requires more code tweaking than the summary implies. So I wonder how different this is from OpenMP?
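
    For reference, the divide-then-merge approach described above can be written by hand with plain threads. This is a minimal sketch of that idea using C++11 std::thread; the function name and chunking strategy are illustrative, not anything Sieve itself generates.

        // Hand-rolled divide-and-merge sort: each thread sorts a disjoint chunk,
        // then the sorted chunks are merged serially. No element is shared between
        // threads, so no synchronization is needed during the sort phase.
        #include <algorithm>
        #include <thread>
        #include <vector>

        void parallel_sort(std::vector<int>& v, unsigned nthreads)
        {
            if (nthreads < 2 || v.size() < 2) { std::sort(v.begin(), v.end()); return; }

            const std::size_t chunk = v.size() / nthreads;
            std::vector<std::thread> workers;

            // Phase 1: sort each chunk in its own thread.
            for (unsigned t = 0; t < nthreads; ++t) {
                auto first = v.begin() + t * chunk;
                auto last  = (t == nthreads - 1) ? v.end() : first + chunk;
                workers.emplace_back([first, last] { std::sort(first, last); });
            }
            for (auto& w : workers) w.join();

            // Phase 2: merge the sorted chunks back together, serially.
            for (unsigned t = 1; t < nthreads; ++t) {
                auto mid = v.begin() + t * chunk;
                auto end = (t == nthreads - 1) ? v.end() : mid + chunk;
                std::inplace_merge(v.begin(), mid, end);
            }
        }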
  • snake oil (Score:5, Insightful)

    by oohshiny ( 998054 ) on Friday March 09, 2007 @11:53PM (#18297216)
    I think anybody who is claiming to get decent automatic parallelization out of C/C++ is selling snake oil. Even if a strict reading of the C/C++ standard ends up letting you do something useful, in my experience, real C/C++ programmers make so many assumptions that you can't parallelize their programs without breaking them.
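
    One concrete illustration of that kind of assumption (my own example, not from Codeplay's material): two pointers that may alias make a loop order-dependent, and nothing in the declaration tells the compiler so.

        #include <cstddef>

        // If dst and src overlap (e.g. dst == src + 1), each iteration reads a
        // value the previous iteration may have written, so the loop cannot
        // safely be split across threads. The signature alone gives an
        // automatic parallelizer no way to know whether they overlap.
        void smooth(float *dst, const float *src, std::size_t n)
        {
            for (std::size_t i = 0; i + 1 < n; ++i)
                dst[i] = 0.5f * (src[i] + src[i + 1]);
        }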
  • Re:snake oil (Score:3, Insightful)

    by mastershake_phd ( 1050150 ) on Saturday March 10, 2007 @12:39AM (#18297448) Homepage
    "All you really need to do is take the code you want to run across multiple CPUs and put beginning and end tags on the parts you want to run in parallel" The compiler isn't going to know if you're doing something stupid or not. In other words: use at your own risk. The old adage of "garbage in, garbage out" still applies.

    But how are you supposed to know exactly how something is going to run under this? Even with a good understanding of what you're trying to do and (hopefully) of what exactly the compiler is doing, you still might get some weird results in certain situations. It might work, but it's still going to take a lot of trial and error, or at least a lot of verification.
  • by Anonymous Coward on Saturday March 10, 2007 @12:48AM (#18297490)
    Let's see if I can teach any old dogs some new trix.

    Here is a quote from the SmartVariables white-paper:

    "The GPL open-source SmartVariables technology works well as a replacement for both MPI and PVM based systems, simplifying such applications. Systems built with SmartVariables don't need to worry about explicit message passing. New tasks can be invoked by using Web-Service modules. Programs always work directly with named-data, in parallel. Tasks are easily sub-divided and farmed out to additional web-services, as needed - without worry of breaking the natural parallelism. If two or more tasks ever access data of the same name and location, then that data is automatically shared between them - without need for additional parallel programming constructs. Instead of using configuration files with lists of available machines, a shared SmartVariables List object (with a commonly accepted name, like "machines@localhost") could easily hold the available host names, which can then be used for dynamic task allocation. The end-result is that SmartVariables-based parallel systems need only reference and work with distributed data, and don't need to manage it. Automatic sharing means there is no need to worry about explicit connection, infrastructure, or message-passing code. Instead, applications only need agree on the names used for their data. Names and object locations are easily managed by using a SmartVariables based Directory-Service as an additional layer of object indirection."

    The rest of this paper is here: http://www.smartvariables.com/doc/DistributedProgramming.pdf [smartvariables.com]

    A single code-base works on Apple / Linux / Windows.
    Complete code and docs at http://smartvariables.com/ [smartvariables.com]
  • by mrnick ( 108356 ) on Saturday March 10, 2007 @12:50AM (#18297496) Homepage
    I read the article, the information on the company's web site, and even the white papers written on the compiler. Although I did see one reference to "Multiple computers across a network (e.g. a "grid")", there was no other mention of it.

    When I think of parallelizing software, after getting over my humorous mental image of a virus that paralyzes users, what comes to mind is clustering. When I think of clustering, my train of thought goes to Beowulf and MPI, or its predecessor PVM. Yet I can find no information that supports the concept of clustering in any manner.

    Again, I did see a reference to "Multiple computers across a network (e.g. a "grid")", but according to Wikipedia, grid computing is defined as follows: "A grid uses the resources of many separate computers connected by a network (usually the Internet) to solve large-scale computation problems. Most use idle time on many thousands of computers throughout the world."

    Well, that sounds like the distributed SETI project and the like, which would seem even more ambitious than a compiler that would help write MPI code for Beowulf clusters.

    From all the examples, this looks like a good compiler for writing code that will run more efficiently on multi-core and multi-processor systems, but it would not help you write parallel code for clustering.

    Though this brings up a concept that many people forget, even people I would consider rather intelligent on the subject of clustering: if you have an 8-computer cluster where each node has a dual-core Intel CPU, and you write parallel code for it using MPI, you are only benefiting from 8 cores in parallel. Many people who write parallel code forget about multi-threading. To benefit from all 16 cores in the cluster I just described, the code would have to be written both multi-threaded and parallel.

    One of the main professors involved in a clustering project at my university told me that in their test environment they were using 8 Dell systems with dual-core Intel CPUs, so in total they had the power of 16 cores. Since he has his Ph.D. and all, I didn't feel the need to correct him and explain that unless his code was both parallel and multi-threaded he was only getting the benefit of 8 cores. I knew he was not multi-threading, because they were not even writing the code in MPI; rather, they were using Python and batching processes to the cluster. To my knowledge Python cannot be used to write multi-threaded applications, and even if it can, I know they were not (from looking at their code).

    Sometimes it's the simplest things that confuse the brightest of us....

    Nick Powers
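
    To make the point above concrete, here is a rough sketch of the hybrid style being described: MPI splits the work across the cluster's nodes, and OpenMP uses both cores within each node. The program and numbers are placeholders of mine, not the university project's code.

        // Hybrid MPI + OpenMP sketch: one MPI process per node, one OpenMP
        // thread per core. Build with something like: mpicxx -fopenmp hybrid.cpp
        #include <mpi.h>
        #include <cstdio>
        #include <vector>

        int main(int argc, char **argv)
        {
            MPI_Init(&argc, &argv);
            int rank = 0, nodes = 1;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &nodes);

            const int total = 1 << 20;            // total work items
            const int per_node = total / nodes;   // MPI level: split across nodes
            std::vector<double> part(per_node);

            double local_sum = 0.0;
            // Thread level: split this node's share across its cores
            // (both of them, on the dual-core machines described above).
            #pragma omp parallel for reduction(+:local_sum)
            for (int i = 0; i < per_node; ++i) {
                part[i] = static_cast<double>(rank * per_node + i);
                local_sum += part[i];
            }

            double global_sum = 0.0;
            MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0)
                std::printf("sum of %d items = %f\n", total, global_sum);

            MPI_Finalize();
            return 0;
        }

    Launched with mpirun across 8 dual-core nodes, this runs 8 processes and uses 16 cores; the same code with the pragma removed (or built without OpenMP) would use only 8.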
  • Why C++ (Score:1, Insightful)

    by impeachgod ( 982062 ) on Saturday March 10, 2007 @04:32AM (#18298254)
    Why use C++? Aren't there languages that support parallelization better, like the functional ones? Or perhaps develop your own language tuned for parallelism.
  • by julesh ( 229690 ) on Saturday March 10, 2007 @05:47AM (#18298456)
    The operating system on a multiple-core machine can split up the processes but one process can only run on one core unless it has been written in a multi-threaded fashion.

    In parallel processing generally, each machine is running one part of a program, thus one program, and unless that program is multi-threaded as well as parallel, it can only use one core per node of a cluster.

    Though, someone who writes multi-threaded parallel applications should be held in high esteem! I don't know any such coders.


    Have you considered that if you run two copies of the process on each node, it will use both cores?
  • by Anonymous Coward on Saturday March 10, 2007 @07:42AM (#18298814)
    How can a compiler be called automatic if it needs you to mark the parallel sections? It just simplifies the use of threads; you still have to find the parallelism and write your code to be parallel. It is like OpenMP...
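
    For comparison, this is indeed all OpenMP asks of you: the programmer finds the loop that is safe to parallelize and marks it with a pragma, and the compiler and runtime handle the threads. A standard OpenMP-style example (nothing Sieve-specific):

        #include <vector>

        // The pragma is the programmer's assertion that the iterations are
        // independent; compiled with -fopenmp (or equivalent), the loop is
        // split across threads, otherwise the pragma is simply ignored.
        void saxpy(std::vector<float>& y, const std::vector<float>& x, float a)
        {
            #pragma omp parallel for
            for (long i = 0; i < static_cast<long>(y.size()); ++i)
                y[i] += a * x[i];
        }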
  • Re:Why C++ (Score:1, Insightful)

    by Anonymous Coward on Saturday March 10, 2007 @07:57AM (#18298856)

    Why use C++? Aren't there languages that support parallelization better, like the functional ones? Or perhaps develop your own language tuned for parallelism.

    Because C++ has power, because C++ is alive, because C++ has speed, and... more importantly...

    Because there are a lot of today's applications built on C++: working applications that work today, and that won't need a new, unstable version for years because they were rewritten from scratch in a new language.

    Every three to five years we get a very cool new framework that will be obsolete three to five years later. No one will rewrite an existing application from scratch every three years just to have the new shiny concept, if a viable and simple enough alternative is offered in the application's existing language.

    What C++ applications need is not a new language, because today's C++ applications will never adopt one (who will pay the months or years of development cost just to switch languages?). What C++ applications need is evolution of the language.

    Think about how C++ evolved from C and remains compatible, which greatly eased the slow, steady evolution of a living application from C to C++. The next C++ (C++++, or perhaps ++C?) will need to be compatible enough with C++ and C to allow mixing C++ and C code within a single library or executable.

    Not that we need a ++C. C++ is alive enough today. OK, looking at Boost or templates, some would call it cancer metastasis. But are these truly warts, or is this the fear of new concepts introduced in an otherwise familiar language? Perhaps C++ doesn't need a new language. Perhaps what C++ needs is C++ language evolution and C++ developers' cognitive evolution.
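
    For what it's worth, mixing C and C++ within one library or executable already works today through extern "C" linkage; a minimal sketch (the header and function names here are hypothetical):

        /* mathutil.h -- hypothetical header usable from both C and C++ */
        #ifdef __cplusplus
        extern "C" {
        #endif
        double mean(const double *xs, unsigned long n);   /* implemented in C */
        #ifdef __cplusplus
        }
        #endif

        // main.cpp -- C++ translation unit calling the C implementation.
        #include <iostream>
        #include "mathutil.h"

        int main()
        {
            const double xs[] = {1.0, 2.0, 3.0};
            std::cout << mean(xs, 3) << '\n';    // links against the C object file
            return 0;
        }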

  • by mi ( 197448 ) <slashdot-2017q4@virtual-estates.net> on Saturday March 10, 2007 @09:28AM (#18299258) Homepage Journal

    Won't that require some runtime support, like mpirun in MPI (that takes care of rsh/ssh-ing to each node and starting the processes)?

    Well, yes, of course. You also need the actual hardware too :-)

    This is beyond the scope of the discussion, really; all clusters require a fair amount of work to set up and maintain. But we are talking about coding for them here...

  • by Mr. Hankey ( 95668 ) on Sunday March 11, 2007 @08:06PM (#18311224) Homepage
    I do work in an HPC environment, and we have a number of clusters available which are utilized through batch processing as well as software interfaces such as OpenMPI.

    In defense of the batch system: assuming we define "parallel" as "at the same time" and not as "inter-node communication", one should be fine with the multiple-processes approach. There are a number of commercial and open source schedulers which will do the work for you, including node weighting for systems with different numbers of processors and/or processor speeds. As long as the nodes are not doing anything else, the OS schedulers do the right thing.

    Given that it's simpler to write, debug and verify single-threaded code, pushing multiple instances of an application to a cluster's nodes can reduce development time. Not requiring direct communication between the nodes also allows executing the same application on multiple clusters, as seen in grids, and indeed even on multiple architectures. It also keeps network overhead low, which would otherwise push you toward InfiniBand or Myrinet interfaces (which are costly).

"Only the hypocrite is really rotten to the core." -- Hannah Arendt.

Working...