
IBM Releases Open Source Machine Learning Compiler

sheepweevil writes "IBM just released Milepost GCC, 'the world's first open source machine learning compiler.' The compiler analyses the software and determines which code optimizations will be most effective during compilation using machine learning techniques. Experiments carried out with the compiler achieved an average 18% performance improvement. The compiler is expected to significantly reduce time-to-market of new software, because lengthy manual optimization can now be carried out by the compiler. A new code tuning website has been launched to coincide with the compiler release. The website features collaborative performance tuning and sharing of interesting optimization cases."
This discussion has been archived. No new comments can be posted.

  • Re:Automation... (Score:3, Informative)

    by TinBromide ( 921574 ) on Friday July 03, 2009 @01:55AM (#28568723)

    I fail to see how automation leads to lower IQ scores. Care to elaborate? How does stepping up the pace, and eliminating tedious, mundane jobs, lead to a lesser society? I call FUD.

    wooooosh.
    See Idiocracy. Go out and watch it. I'll wait.

    Saw it? Good, now you should get the joke.

  • by EvanED ( 569694 ) <{evaned} {at} {gmail.com}> on Friday July 03, 2009 @02:02AM (#28568755)

    What things can a compiler do to your code to 'optimize' it for you?

    Check out the Wikipedia article [wikipedia.org] on optimization for some examples.

    In brief, some of the more common ones are things like substituting known values for expressions (e.g. x = 3; y = x + 2; can be changed to x = 3; y = 5;), moving code that doesn't do anything when run repeatedly outside a loop, and architecture-specific optimizations like code scheduling and register allocation. (E.g. with no -O parameters, or -O0, for something like "y = x; z = x;" GCC will generate code that loads "x" from memory twice, once for each statement. With optimization, it will load it once and store it in a register for both instructions.)

    If the compiler tries to do this, wouldn't it likely screw your code up?

    There are cases where optimizations will screw something up. One example: it's considered good security practice to zero out memory that held sensitive information (e.g. passwords or cryptographic keys) to limit the lifetime of that data. So you might see something like "password = input(); check(password); zero_memory(password); delete password;". But the compiler sees that zero_memory writes into password and that those values are never read again. Why write something that's never needed? So it removes the zero_memory call as useless code that can't affect anything, and your program no longer clears the sensitive memory.

    This was actually a bug in a couple crypto libraries for a while.

  • by walshy007 ( 906710 ) on Friday July 03, 2009 @02:18AM (#28568829)

    In regards to learning assembly: if you run Linux, the best book I can recommend is Programming from the Ground Up [gnu.org]. It's licensed under the GNU Free Documentation License and, in my honest opinion, is likely the single best book for anyone who has no idea but wants to start. I already had some clue, so I skipped the first two thirds of the book, but read it later for shits and giggles and found it a very easy book to grasp.

    To this day if I forget minor details about things I pick that back up and re-read it a bit :)

  • by EvanED ( 569694 ) <{evaned} {at} {gmail.com}> on Friday July 03, 2009 @02:27AM (#28568863)

    Replace a mod (e.g. x % 32) with a bitwise-and (e.g. x & 31) when the divisor is a power of two.

    Another very similar one, and one that comes up more commonly, is replacing a multiplication or division by a constant with a series of additions, subtractions, and bitshifts.

    For instance, "x/4" is the same as "x>>2" (for unsigned or non-negative values; signed division rounds toward zero while the shift rounds down), but the division at one point in time (and still with some compilers and no optimization) would produce slower code. Some people still make this optimization by hand, but I'd say it's almost certainly a bad idea nowadays, at least in the absence of evidence that your compiler isn't optimizing it and that it would matter.

    (You can combine operations too. x*7 is the same as (x<<3)-x, x*20 is the same as (x<<4) + (x<<2), etc.)

  • by BlackSabbath ( 118110 ) on Friday July 03, 2009 @02:44AM (#28568939)
    Before you get flamed to death by some idiot, you've got to realise that compilers translate a higher-level language into a lower-level one, typically into machine instructions (or, in the case of Java and .NET, virtual machine instructions), turning source code into executable form. Interpreters, on the other hand, execute each statement of the language directly (effectively forming a virtual machine for that language).

    Naive compiler translations can be functionally correct but sub-optimal with respect to runtime performance, memory/disk footprint, etc. Compiler optimisation is the effort to make this translation as optimal as possible with respect to some variable(s), e.g. performance or size.

    What you are thinking of sounds like source-code optimization. There are various interpretations of this, but to my mind it means a combination of optimal algorithm selection and optimal algorithm implementation. Note that complex algorithms can be decomposed into smaller common algorithms; e.g. a sort routine may be part of some higher-level algorithm, and the sort routine may be optimised independently of the higher-level routine.

    Check out: http://en.wikipedia.org/wiki/Compiler_optimization
  • by symbolset ( 646467 ) on Friday July 03, 2009 @03:14AM (#28569047) Journal

    What things can a compiler do to your code to 'optimize' it for you?

    The correct answer to this question is... it depends. No matter how advanced your compiler is it can't select the correct algorithm for you. If you're ordering your lists with a bubble sort instead of some kind of btree, there's nothing the compiler can do to help you except deliver the best O(n^2) sort it can. A truly artistic programmer can transcend all of the optimizations this compiler might achieve, by several orders of magnitude.

    But if you're the kind of code geek that Microsoft hires, yeah, you might get a version of Windows that boots to a usable desktop in under five minutes.

  • Re:Automation... (Score:3, Informative)

    by Fluffeh ( 1273756 ) on Friday July 03, 2009 @03:18AM (#28569061)
    I would argue that either you get it without needing to watch the movie (simply by being surrounded by people who, in their day-to-day existence, blindly follow instructions "because they have to"), or you won't get it whether you watch the movie or not.

    As for the GP, I believe he is mixing two things incorrectly:

    fail to see how automation leads to lower IQ scores
    and
    lead to a lesser society

    Lower IQ scores don't immediately mean a lesser society, but if you take the thinking out of a process and let a machine or program do all of it, your mind will inevitably get lazy and your work will suffer over time.

  • by Anonymous Coward on Friday July 03, 2009 @03:25AM (#28569077)

    There are cases where optimizations will screw something up. [...] So it would remove the zero_memory call as it's useless code that can't affect anything. And your program no longer clears the sensitive memory.

    And it was the crypto library's fault, not the compiler's fault. Most languages worthy of doing crypto programming in have a facility to say, roughly, "don't optimize this". An example: in C, the keyword "volatile" instructs the compiler that the object may be changed at any time, and thus all reads/writes must take place and must do so atomically [unfortunately, the C spec doesn't specify "in order" for volatile fields, but I digress].

  • by Anonymous Coward on Friday July 03, 2009 @03:46AM (#28569181)

    The C spec doesn't require atomicity of volatiles, but it does require in order. So you've got it the wrong way round!

  • by ZerothAngel ( 219206 ) on Friday July 03, 2009 @03:56AM (#28569227)

    Anybody out there know a good emulator for teaching assembly programming?

    SPIM [wisc.edu] is a possibility. It was used in a few courses (operating systems, compilers) at UCB some years ago. (Don't know if it's still used.)

  • by Skuto ( 171945 ) on Friday July 03, 2009 @05:53AM (#28569757) Homepage

    >The compiler is expected to significantly reduce time-to-market of new software,
    >because lengthy manual optimization can now be carried out by the compiler.

    The time to *make a new compiler* for a certain processor is reduced, and the process of figuring out which optimizations should be in the compiler for that architecture is automated.

    This is for the kind of research where they attempt to put many specialized processors on a single chip instead of one general monolithic processor. In that case you need many compilers, and tuning them is important. It's the time spent optimizing THOSE that is lowered, not the time spent writing the software being compiled.

    I see no real relevance to the "normal" desktop situation on that website.

  • by smallfries ( 601545 ) on Friday July 03, 2009 @05:58AM (#28569775) Homepage

    No, that's not true. A shift instruction has a one-cycle latency and 1/2-cycle throughput on the Core2/Core2-Duo. An add instruction also has a one-cycle latency, with 1/3-cycle throughput on the Core2-Duo.

    The integer multiplier on the Core2-Duo has a 4-cycle throughput and an 8-cycle latency. So in a "simple" case like x*9 = (x<<3)+x, the optimisation would take 2 cycles, and the straight mul would take 8. In more complex cases the individual shifts will pipeline for more of a benefit. Only in cases where (t/3)+ceil(lg(t)) >= 8 will the optimisation be as slow as the multiplier for an expansion of t terms (logs obviously in base 2, as the additions will form a tree). On x86, lack of registers will kill this optimisation before the cost of the instructions exceeds the multiplier cost.

    And yes, I've also benchmarked the code to test it on a Core2-Duo, and my results match Intel's published figures and Agner Fog's data tables so I suggest you recheck your benchmark.

    Getting back to the article, the Milepost work isn't really suitable for this type of optimisation anyway. They try to optimise at compile time by tuning the optimisation flags. For this type of low-level tuning of code, the approaches in Program Interpolation, Sketching or PetaBricks would be more appropriate.

  • Re:Oh really? (Score:5, Informative)

    by jcupitt65 ( 68879 ) on Friday July 03, 2009 @06:21AM (#28569871)

    Read the article, that's not what this does. This is a project to automatically generate optimising compilers for custom architectures. The summary is a little unclear :-(

    It reduces time to market because you don't have to spend ages making an optimising compiler for your custom chip.

  • Re:Oh really? (Score:3, Informative)

    by kwikrick ( 755625 ) on Friday July 03, 2009 @08:01AM (#28570259) Homepage Journal

    No, this 'learning' compiler only learns how to optimally translate C++ statements to machine level operations. It cannot choose high level algorithms for you. And the reason that such a learning compiler is useful is not to help lazy application programmers, but because developing new, optimised compilers for the many different processors and platforms out there (think computers, mobile phones, embedded systems, etc) is time consuming.

