
Distributed Compilation, a Programmer's Delight

cyberpead writes in with a developerWorks article on open source tools that can speed up your build by distributing it across multiple machines on a local area network.

  • by dsginter ( 104154 ) on Friday November 14, 2008 @12:37PM (#25761401)
    • If you get many slow machines and a slow network, it'll actually take longer to compile - and you'll still be able to happily say that.
      • distcc scales pretty close to linearly. You'd have to have not-very-many slow machines and a slow network.
        • Can you use the internet? Imagine using a p2p network... you compile the files and give the .obj back to the linker. This could take ages.
          • by Thiez ( 1281866 ) on Friday November 14, 2008 @02:30PM (#25763183)

            That would allow for people to inject malware, wouldn't it?

            To compile:

            void printhello() {
              printf("Hello world!\n");
            }

            evil bastard changes to:

            void printhello() {
              printf("Hello world\n");
            }

            Since the most practical way to spot the evil binary would be to compile the code yourself and compare, that sort of defeats the purpose of having someone else compile it. I guess you could have many random people compile the same piece of source-code and then compare all produced code, but that makes the whole thing rather complicated.
            Also, the p2p thing would only be useful for open source, as I doubt it would be smart for people trying to produce some closed source product to send their source to a p2p network that may or may not store everything.
            And this is all assuming the delays introduced by sending all this stuff over the internet are not so large that compiling locally is faster or almost as fast.

            It's probably best to compile your stuff on your lan, on machines that are close, and that can be trusted.

        • by heson ( 915298 )
          I like Icecream, dunno why it feels better than distcc, maybe the bugs are fewer.
    • So if I build 4 machines with quad cores and configure distcc, does that mean I'll finally be able to build OpenOffice from source on Gentoo?
      • Blah blah blah Gentoo blah blah compiling forever blah blah punchline.

        When I first used Gentoo several years ago it was with a 950 MHz Athlon CPU and it wasn't too bad.

        Now, with quad-core CPUs becoming the norm even in desktop machines, the compiling thing is even less of an issue.

        I was able to run software on Gentoo that I could never get to run well together on any other distribution. You can almost always get the latest and greatest versions of everything with Gentoo. With Kernels taking almost no t
        • Re: (Score:3, Informative)

          That was my problem. Broken ebuilds. Conflicting requirement lists that the updater script wasn't any good at working out. Gentoo made me run back to Slackware for a while, and eventually to Ubuntu (about 2-3 years ago, to see what the buzz was about).
        • Obviously you've never tried to compile OpenOffice. ;) (Even on a powerful system it takes days.)
          • Re: (Score:2, Insightful)

            by ORBAT ( 1050226 )
            If by powerful system you mean steam-powered Analytical Engine, yes, it'll take days.

            The longest OpenOffice compile I've ever done was something around 5 hours, and that was with the system doing other stuff on the side. distcc et al. reduce the compile time to around 2h.
          • Well, not days, hours. Historically, Gentoo has provided some binary packages for software that takes an undue amount of time to compile and won't affect the rest of the system if compiled with generic options.

            If I recall correctly, OpenOffice was one such package.

            Gentoo isn't that masochistic.
      • I compiled Open Office 3 last Monday on a 2.0 GHz Core 2 Duo with 1 GB of RAM and no swap in 2 hours and 45 minutes. That's my best time yet.
    • Tried to find it, but couldn't. It goes something like this:

      Panel 1: PHB, walking by Dilbert's cube: Dilbert, why aren't you working?

      Panel 2: Dilbert: My programs are compiling.

      Panel 3: PHB, sitting back at his desk by himself, thought bubble: I wonder if my programs compile.

    • by jerep ( 794296 )
      Compile times? I'm using D you insensitive clod.
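
Thiez's idea above, having several untrusted peers compile the same source and comparing what comes back, can be sketched in C. This is only an illustration under strong assumptions: `build_result` and `majority_result` are hypothetical names, not part of distcc or any other tool, and byte-for-byte comparison is only meaningful if every peer runs an identical toolchain that produces deterministic output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One object file as returned by a (hypothetical) remote builder. */
struct build_result {
    const unsigned char *bytes; /* object code */
    size_t len;                 /* length in bytes */
};

/* Two results agree only if they are byte-for-byte identical. */
static int same_result(const struct build_result *a,
                       const struct build_result *b)
{
    return a->len == b->len &&
           (a->len == 0 || memcmp(a->bytes, b->bytes, a->len) == 0);
}

/*
 * Return the index of a result that a strict majority of builders
 * agree on, or -1 if no strict majority exists.
 */
int majority_result(const struct build_result *results, size_t n)
{
    assert(results != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        size_t votes = 0;
        for (size_t j = 0; j < n; j++)
            if (same_result(&results[i], &results[j]))
                votes++;
        if (votes * 2 > n)
            return (int)i;
    }
    return -1;
}
```

Every file has to be compiled at least three times for a vote to mean anything, so the scheme spends most of the CPU time it was supposed to save, which is the complication the comment points at.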
  • by TheThiefMaster ( 992038 ) on Friday November 14, 2008 @12:39PM (#25761431)

    Due to a strange quirk in the way compilers are designed, it's (MUCH) faster to build a dozen files that include every file in your project than to build thousands of files.

    Once build times are down to 5 - 15 minutes you don't need distributed compiling. The link step is typically the most expensive anyway, so distributed compiling doesn't get you much.

    • Preprocessing in C (Score:5, Informative)

      by Frans Faase ( 648933 ) on Friday November 14, 2008 @01:36PM (#25762335) Homepage

      I guess you are referring to the preprocessing step of C and C++ compilers, which was really a lame hack, I think. If you have a lot of include files, preprocessing produces large intermediate files that contain a lot of overlapping code, which has to be compiled over and over again.

      Preprocessing should have been removed a long time ago, but because of nasty backwards-compatibility issues it never was. Other languages, such as Java and D, solve this problem in a much better way, just as Turbo Pascal did with its TPU files in the late 1980s.

    • Due to a strange quirk in the way compilers are designed, it's (MUCH) faster to build a dozen files that include every file in your project than to build thousands of files.

      True of Visual C++, not true of any other compiler I'm familiar with.

    • by jgrahn ( 181062 )

      Due to a strange quirk in the way compilers are designed, it's (MUCH) faster to build a dozen files that include every file in your project than to build thousands of files. Once build times are down to 5 - 15 minutes you don't need distributed compiling.

      But your code will be harder to understand. You're giving up a lot of tools, like static globals in C and anonymous namespaces in C++.

      Every time I have encountered painfully long compile times, the cause has been sloppiness. Usually, the direct cause is t
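
TheThiefMaster's point above, that an expensive link step caps what distributed compiling can gain, is essentially Amdahl's law. A minimal sketch follows. `build_speedup` is a hypothetical helper, the serial fraction is an assumed input rather than a measurement, and network and scheduling overhead are ignored.

```c
#include <assert.h>

/*
 * Estimated speedup from spreading the compile step over `machines`
 * hosts when `serial_fraction` of the build (for example, linking)
 * cannot be distributed. This is Amdahl's law.
 */
double build_speedup(double serial_fraction, unsigned machines)
{
    assert(serial_fraction >= 0.0 && serial_fraction <= 1.0);
    assert(machines >= 1);
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / machines);
}
```

If half the build time is spent linking, `build_speedup(0.5, 4)` gives 1.6: four machines buy well under a 2x improvement. With no serial part at all, `build_speedup(0.0, 4)` gives the ideal 4.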

  • Imagine a Beowulf cluster for those.
  • Is this new? (Score:3, Insightful)

    by daveewart ( 66895 ) on Friday November 14, 2008 @12:51PM (#25761615)

    Article summary: use 'make -j', 'distcc' and 'ccache' or some combination of these. These utilities are well known and widely used already, no?

    • Yea. I used to use distcc a lot about five years ago. It doesn't help with all compile functions but it can help if you're compiling something big like X11 or KDE.
    • Yeah, I was wondering the same thing. distcc and ccache have been a staple of Gentoo users since forever.

  • Minor error (Score:5, Informative)

    by pipatron ( 966506 ) on Friday November 14, 2008 @12:56PM (#25761697) Homepage
    There's a minor error in the article, which claims that your servers need access to the source. distcc was designed to not need this.
    • Re: (Score:3, Insightful)

      There's a minor error in the article, which claims that your servers need access to the source. distcc was designed to not need this.

      That implies you read the article, but that can't be the case.

      • Re: (Score:3, Informative)

        by cbreaker ( 561297 )
        I read it too, and it's true - they DO say all of the machines need access to the source, which they do not.

        Maybe there are some special cases, but I've never had to have a shared source repository in order to use distcc.

        They also say the machines need to be exactly the same configuration, and they do elaborate on that a little bit, but it's not strictly true. Depending on the source you're compiling, you might only need to just have the same major version of GCC.
  • by adonoman ( 624929 ) on Friday November 14, 2008 @12:57PM (#25761737)

    Slashdot readership plummets to an all-time low as programmers actually have to work.

  • by TheRealMindChild ( 743925 ) on Friday November 14, 2008 @01:04PM (#25761829) Homepage Journal
    Sky rockets in flight... distcc delight......
    distcc deliiiiiiiight.
  • He's using TCSH! That's BAD FOR YOU!

    Ok, enough offtopic. This is actually pretty cool, considering our development environment is clusters and clusters of IBM P-Series LPARS, and our codebase is (A) disgustingly huge, and (B) actually pretty amenable to parallelized make.

    FINALLY, I can justify to my boss that browsing /. is research! (Now if I could just make a good case for 4chan...)

  • by hachete ( 473378 ) on Friday November 14, 2008 @01:54PM (#25762629) Homepage Journal

    The reason for a lot of build machines in the rack may not be horsepower. You may need x different machine versions, or a certain build only builds on a certain machine because of licence restrictions, or you may only have one Windows box with the Japanese character set installed because it causes so many problems that multiplying the problems just isn't worth it, and so on and so forth. Building across n number of the same machine version just isn't worth the work IMO. Just get a bigger machine and save on the machine maintenance.

    So the real benefit of distcc might be parallel compilation; I see a big future for this, particularly with multi-core chipsets becoming commonplace. Once upon a time, I would not countenance a dual-chip machine in the rack because of the indeterminate mayhem it would sometimes cause to a random piece of code deep in the bowels. Those problems are well gone.

    Umm. I wonder how this plays out with VMware? A distributed compiler smart enough to use the (correct) local compiler across a varied build set would be worth having ...

  • Dang, no info on creating uniform toolchains for each distcc arch. IncrediBuild at work is really good about that, though it has the distinct advantage of being able to just shoot a single executable over the wire if the remote end needs it.
  • Icecream. (Score:2, Interesting)

    by Sir_Lewk ( 967686 )
    If you are interested in distributed compiling, you may want to check out icecream.

    It's similar to distcc, but with some notable benefits.

  • If you want to distribute compilations, you must not use GCC's -march=native option: each compiler instance will emit objects targeted at the machine it happens to run on, and your compilation hosts may not all be identical.
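
The warning above about the 'native' option can be made visible from inside a C file. GCC and Clang predefine macros such as `__SSE2__` and `__AVX2__` according to the -march/-m flags in effect, so a sketch like the one below reports what the compiler on a given host was allowed to assume. The function names are hypothetical, and the list of macros checked is a small x86-only sample, not an exhaustive one.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Report the highest x86 SIMD level (from a short, non-exhaustive list)
 * that the compiler was told it may use when this file was built.
 */
const char *compiled_simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE4_2__)
    return "SSE4.2";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline (none of the checked macros defined)";
#endif
}

/* Print the level chosen at compile time. */
void print_simd_level(void)
{
    const char *level = compiled_simd_level();
    assert(level != NULL);
    printf("compiled for: %s\n", level);
}
```

Compiling this file with -march=native on two different build hosts can produce two different answers, and an object built for the newer host may then fail on the older one. Passing one explicit -march value to every host avoids the mismatch.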

  • It's called an "interpreter".

    (Cue flamewar in 3...2...1...)