Time to Get Good At Functional Programming?
prone2tech writes "From an article at Dr. Dobb's: Chipmakers have essentially said that the job of enforcing Moore's Law is now a software problem. They will concentrate on putting more and more cores on a die, and it's up to the software industry to recraft software to take advantage of the parallel-processing capabilities of the new chips. As is argued in this article, this means becoming proficient in parallel functional programming. The bad news? Getting good at functional programming is hard, harder than moving from iterative Pascal or Basic or C coding to object-oriented development. It's an exaggeration but a useful one: 'When you move to FP, all your algorithms break.'"
Re:Why would the Algorithm break? (Score:4, Informative)
This is why functional programs are better suited to concurrency, and why your sequential algorithms will fail to work.
Re:Why would the Algorithm break? (Score:2, Informative)
In practice, algorithms are generally programming language independent *within the set of imperative languages*. They are almost always written as a series of steps. And that's why they don't apply directly to functional languages.
I just checked Knuth's definition, and he lists: finiteness, definiteness, input, output and effectiveness as the requirements for an algorithm. All of these translate with more or less effort into the functional programming landscape.
As an example, I have seen the quicksort algorithm written something like:
sort(l) =
if l == [] then []
else with (lowers, equals, highers) = partition(l) in
sort(lowers) + equals + sort(highers)
(The original was in Haskell, which, as you can see, I am no longer proficient in.)
So I think I agree that "all your algorithms will break" is exaggeration.
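Written out runnably (in Python rather than Haskell, purely as an illustrative sketch; `fsort` is a made-up name), the partition-based algorithm above translates almost line for line:

```python
def fsort(xs):
    # Functional-style quicksort: no in-place mutation; each call
    # builds new lists and returns a new sorted list.
    if not xs:
        return []
    pivot = xs[0]
    lowers = [x for x in xs if x < pivot]
    equals = [x for x in xs if x == pivot]
    highers = [x for x in xs if x > pivot]
    return fsort(lowers) + equals + fsort(highers)

print(fsort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Same steps, same termination argument, same output: the "algorithm" survives the paradigm shift intact.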
Re:it's always a good time to try functional (Score:5, Informative)
Python for example seems good for fp
Last time I heard this, I checked, and the python developers were refusing to commit tail-recursion optimisation patches because it 'made debugging too hard'. Since most functional algorithms are tail-recursive, you will blow your stack very quickly without this. It's even in GCC, meaning that C is better suited to functional programming than Python.
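A quick sketch of the problem (hypothetical `sum_rec`, relying on CPython's default recursion limit):

```python
import sys

def sum_rec(xs, acc=0):
    # Written in tail-recursive form, but CPython allocates a stack
    # frame per call anyway: no tail-call optimisation.
    if not xs:
        return acc
    return sum_rec(xs[1:], acc + xs[0])

print(sum_rec(list(range(100))))  # 4950

try:
    # Anything deeper than the recursion limit blows the stack.
    sum_rec(list(range(sys.getrecursionlimit() + 1000)))
except RecursionError:
    print("stack overflow")
```

A tail-call-optimising compiler would turn `sum_rec` into a loop; CPython, by design, does not.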
Suggested reading. (Score:5, Informative)
I've recently gotten into FP. I started with Erlang and then branched into ML and Haskell. In case you're interested, here are the best books I've encountered for each language:
Programming Erlang [amazon.com]
Programming Haskell [amazon.com]
ML for the Working Programmer [amazon.com]
Also, I'd definitely recommend starting with Erlang, because the Programming Erlang book made for a very easy introduction to functional programming.
Re:Why would the Algorithm break? (Score:3, Informative)
You mean to say that in a functional programming language, variables aren't? A paradox, a paradox, a most delightful paradox!
Re:Broken Algorithm BS (Score:5, Informative)
That's where FP comes into play, as it allows developers to write heavily parallelized code that is also safe and fault-tolerant.
Re:Broken Algorithm BS (Score:2, Informative)
No, it makes sense.
If you want to keep seeing Moore's-law increases in CPU performance, the improvements are no longer going to come from throwing more transistors at single-threaded programs. We hardware engineers are simply out of smart ideas to eke exponentially better performance out of a single instruction stream.
Instead, we're going to produce chips with more parallelism -- more cores and more threads. But it's going to take a lot of software work (languages, transactional memory, libraries, et cetera) for these improvements to translate to functionally faster computers.
Hardware is going parallel, like it or not. Software has to change as well.
Re:it's always a good time to try functional (Score:3, Informative)
s/stack/wad
dang newbies
Re:Why would the Algorithm break? (Score:5, Informative)
Functional variables are like mathematical variables - they're variable in that you may not have discovered their value yet, but once you discover their value it stays the same for the current instance of the problem. For the next instance of the problem (i.e. the next call to the function), you're talking about a different set of variables that potentially have different values.
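You can see the per-instance behaviour even in Python closures (a hypothetical `make_adder`, just for illustration):

```python
def make_adder(n):
    # Each call to make_adder "discovers" its own value of n; that
    # binding then stays fixed for this instance of the problem.
    def add(x):
        return x + n
    return add

add2 = make_adder(2)   # one instance: n is 2, forever
add5 = make_adder(5)   # a different instance: n is 5
print(add2(10), add5(10))  # 12 15
```

Neither closure's `n` ever changes after it is bound; a new call simply creates a new `n`.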
Re:Broken Algorithm BS (Score:5, Informative)
It's true that Moore actually said the transistor count doubles every 18 months. However, for a long time, an immediate corollary of Moore's Law was that software doubled in speed every 18 months, which is essentially why Moore's Law was important. I think what the author is trying to say is that in order for this corollary to remain true, people must learn to write parallel software. It is much easier for a compiler to get an FP program running in parallel than a sequential program (SP). Hence, those who can write in FP languages will be better suited to write the software of tomorrow than those who can only write in SP languages.
Re:Broken Algorithm BS (Score:3, Informative)
Easy. Pure functional programming doesn't permit side-effects. Algorithms that perform I/O at various points in the algorithm can't easily be expressed in languages like that.
Also, although some popular functional languages like ML and Erlang have hacks to get around this, purely functional programming doesn't let a function modify global state. Without those hacks in the language, algorithms that require in-place modification of arrays (such as some sorting algorithms) can't be expressed at all in those languages. (You can modify the algorithms not to do in-place modifications of arrays, but then that's not the original algorithm any more.)
Re:Amdahl's Law (Score:2, Informative)
Yes.
But furthermore, saying that FP "solves" the multithreading problem is intellectually dishonest. FP proponents like to pretend that because FP programs are intrinsically parallelizable, you don't have to worry about it: the magic compiler or runtime will do it for you. Wrong. The difficulty of multithreaded programming is not how to use threads, it's how to do it efficiently. Sure, if the problem is sorting 1 billion ints, you'll multithread from the top of the recursion, and that'll be quite efficient, but anyone can do the same trivially in any language. The real problems start when you have to schedule multiple heterogeneous tasks, and FP has no intrinsic answer to that.
As for the legendary reliability that Erlang fans love to point out, I'm glad we got an example at last. Ericsson AXD-301. How? "No shared state and a sophisticated error-recovery model" he says. Yeah, right! It's a f*cking switch! How about custom hardware built like a tank, with predictable failure modes, and specs (protocols) that haven't changed in 10 years (at least). Give me a break, it's not exactly difficult to get plenty of nines in those conditions!
Re:Sounds like BYTE magazine in 1985 (Score:2, Informative)
Functional programming isn't faster in a multi-core environment per se. What makes it better for multi-core programming is that with functional programming, it is easier for a non-human entity (read: the compiler) to infer your intent from your code. Once the compiler, and more so the runtime, know your intent, they can do the heavy lifting to make your code work in parallel, which is easier than the alternative (managing and spawning threads on your own, the efficiency of which depends on the hardware, so you'd always have to test how many cores are present, how many are available and under what load, etc. No fun).
So by moving some stuff over to functional programming, we can then work on compilers and frameworks/runtimes that will take said code and do what's best on the given platform to make it go fast, which, while possible, is much, much more difficult with procedural code.
An example off hand is Microsoft's Parallel LINQ. LINQ is just fancy syntactic sugar over functional programming, and it lets you turn any LINQ code into parallel code by adding basically one method, adjusting itself to current system load and number of cores. mylist.Select( o => DoSomething(o)) simply becomes mylist.AsParallel().Select( o => DoSomething(o)), and poof, you're now calling DoSomething on every element of mylist, in parallel, spreading the load as well as possible.
Now this is a simple example, and there IS equivalent syntactic sugar to turn a normal for loop into parallel code... but with functional programming, you can push it further.
That's all there is to it.
Re:it's always a good time to try functional (Score:5, Informative)
I would say Haskell, but I think that's the language everyone should learn, so I'm biased. The typeclass system provides for some of the functionality of object oriented programming.
If Haskell scares you, Ocaml is another good choice. It's a multi-paradigm language with an emphasis on functional programming, but it also allows you to use mutable state wherever you like (whether this is a good thing or not is a matter of some debate). It even has some traditional object-oriented programming features, but they tend not to get used much in practice.
If you care about performance, they both have decent native-code compilers. My impression is that Ocaml is a bit faster for single-core tasks, but Haskell's parallel programming features are much better.
Re:example (Score:2, Informative)
it's not totally obvious to me from the code that it's *not* tail-recursive; although the recursive call isn't the very last line of code in the function, it is the very last thing in its clause...?
The key is in the recursive expression:
head :: insert elt tail
This is not tail recursive because we can't reuse the current stack frame for the recursion - we have a local variable, head, which we need to keep around until *after* the recursive call returns.
Tail recursion is quite easy to spot, because the expression will have the recursive call ("insert") at top level. In other words, it would have to look something like this:
insert (head::acc) elt tail
where acc is some kind of accumulator.
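Transliterated into Python (a sketch; `insert` and `insert_tail` are illustrative names), the difference looks like this:

```python
def insert(elt, xs):
    # NOT tail recursive: after the recursive call returns we still
    # have to prepend xs[0], so this frame must stay alive.
    if not xs or elt <= xs[0]:
        return [elt] + xs
    return [xs[0]] + insert(elt, xs[1:])

def insert_tail(acc, elt, xs):
    # Tail recursive: the recursive call IS the whole result; the
    # prefix already visited is carried along in the accumulator acc.
    if not xs or elt <= xs[0]:
        return acc + [elt] + xs
    return insert_tail(acc + [xs[0]], elt, xs[1:])

print(insert(3, [1, 2, 4, 5]))           # [1, 2, 3, 4, 5]
print(insert_tail([], 3, [1, 2, 4, 5]))  # [1, 2, 3, 4, 5]
```

Same result either way; only the second form lets a compiler reuse the current stack frame.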
Re:Sounds like BYTE magazine in 1985 (Score:5, Informative)
My response back then was to get excited about FP. My response now is: Where is the proof?
Whether functional programming is the best paradigm to use for parallel computing is undecided. But it does have a couple of advantages over imperative programming.
First, imperative programming specifies the order of evaluation, whilst functional programming does not. In Haskell, for instance, an expression can essentially be evaluated in any order. In Java, evaluation is strictly sequential; you have to evaluate line 1 before line 2.
Second, imperative languages like Java favour mutable data, whilst functional languages like Haskell favour immutable data structures. Mutability is the bane of parallel programming, because you have to have all sorts of locks and constraints to keep your data consistent between threads. Programming languages that do not allow mutable data don't have this problem.
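A small Python sketch of the second point (a frozen dataclass standing in for an immutable record):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Point:
    x: int
    y: int

p = Point(1, 2)
# "Mutation" produces a new value; p itself can never change, so any
# thread holding a reference to p can read it without taking a lock.
q = replace(p, x=10)
print(p, q)  # Point(x=1, y=2) Point(x=10, y=2)
```

In Haskell every value works this way by default, which is exactly why threads can share data freely.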
why FP is helpful for parallel tasks (Score:3, Informative)
Many functional languages do not allow the program to manipulate shared state, which removes that whole class of problems; information sharing is limited to letting the run time system fork threads and copy function arguments and return values back and forth.
However, if you really do need shared mutable state (and there are plenty of algorithms that do), in Haskell there's software transactional memory [realworldhaskell.org], which provides a restricted API for manipulating shared mutable state with the STM monad. Since the code within an STM transaction is not allowed to cause side effects or access any shared mutable state outside of STM, the runtime system is able to stop and replay transactions whenever it detects conflicts. This is one area where FP actually can make things easier, by enabling the same sorts of operations you might do in an imperative multithreaded program, but doing it in a much safer way.
And, if you don't need shared mutable state, sometimes parallelism can be achieved just by replacing "map" with "parMap" in the right places and recompiling with the appropriate options. It doesn't get much easier than that.
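The analogous move can be sketched in Python (threads here just for illustration; in Haskell, parMap parallelizes pure code for real):

```python
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

xs = list(range(10))

seq = list(map(square, xs))          # sequential map
with ThreadPoolExecutor() as pool:   # "parallel map": same shape,
    par = list(pool.map(square, xs)) # different evaluation strategy

print(seq == par)  # True
```

Because `square` has no side effects, the two versions are guaranteed to agree; that guarantee is what pure FP gives you for free.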
Re:Thermodynamic computing (Score:3, Informative)
Cutting edge, yes. But not warp drive.
http://www.cise.ufl.edu/research/revcomp/ [ufl.edu]
Re:Broken Algorithm BS (Score:5, Informative)
Moore's law states that the number of transistors on a chip will double every two years. By definition it's a hardware problem.
Obviously, utilizing those transistors is a software problem, but Moore's law doesn't say anything about that.
The article sucks. The author seems to know FP about as well as he knows Moore's law.
Re:it's always a good time to try functional (Score:3, Informative)
Ruby is nowhere near a functional language. It's as imperative as anything else.
Re:parallel algorithms, mutable data, and STM (Score:4, Informative)
I agree. I don't understand them very well, but fortunately one can make use of them without a complete understanding. For many programs, it suffices to understand that if you want to do IO, you need to do it within the IO monad (often in the IO action "main" which is equivalent to "int main() {...}" in C).
This doesn't mean you have to write whole file parsers in monadic fashion, which is what I thought until I actually had occasion to write a file parser in Haskell, and found that I could write the parser as a pure function that accepts a string and returns the parsed data structure. Then, in main, I just read the file in with readFile (which takes a filename and returns the file's contents as a string) and then pass that string to my pure file parser.
This would seem horribly inefficient for large files; what if there isn't enough memory to store the whole file as a string? But here, laziness comes to the rescue. The actual file IO doesn't happen until the file parser starts reading in the string. My pure function churns along, oblivious to the fact that it's causing deferred IO in the background, simply from forcing the evaluation of the string's value.
This is, perhaps, not a very great insight into the nature of monads, but I thought I would share my experience, and if not enlighten then at least show that one can write normal programs in a fairly straightforward manner without being fully enlightened.
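The same shape can be mimicked in Python with generators (a toy `parse_ints`, invented for illustration): the "pure" parser never touches a file, yet it ends up driving the IO lazily.

```python
import io

def parse_ints(lines):
    # A parser that knows nothing about files: it consumes any
    # iterable of lines and yields parsed values on demand.
    for line in lines:
        yield int(line)

source = io.StringIO("10\n20\n30\n")  # stands in for an open file
# Each line is only pulled from the source when the consumer forces
# the next value, much as Haskell's lazy string defers the real IO.
print(sum(parse_ints(source)))  # 60
```

As in the Haskell case, the file never needs to fit in memory all at once.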
Re:it's always a good time to try functional (Score:1, Informative)
You're talking about F#. Simon Peyton-Jones (the Haskell developer, PhD, functional language researcher) works for Microsoft now. I don't know specifically what he works on, but he is in the same working group as F#'s designer.
Re:Functional Programming Is a Red Herring (Score:5, Informative)
You seem to have some serious misunderstandings here.
Uh, no. By removing side effects functional programming removes the need to copy anything. If I'm trying to evaluate f(X) + g(X) for some complicated X, f, and g by evaluating f(X) and g(X) in parallel and adding the results, I don't need two copies of X because I know that neither f nor g will modify it. That's the whole point.
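A sketch of that argument in Python (`f`, `g`, and `X` are stand-ins): both tasks share one immutable X, with no copy and no lock.

```python
from concurrent.futures import ThreadPoolExecutor

def f(xs):
    return sum(xs)

def g(xs):
    return max(xs)

X = tuple(range(1000))  # immutable, so it can be shared freely

with ThreadPoolExecutor() as pool:
    a = pool.submit(f, X)  # evaluate f(X) ...
    b = pool.submit(g, X)  # ... and g(X) concurrently
    result = a.result() + b.result()

print(result)  # 499500 + 999 = 500499
```

Since neither f nor g can modify X, running them in either order, or at the same time, gives the same answer.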
It only seems counter intuitive if you've swallowed the procedural programming paradigm and adopted it as your own to the point where you've forgotten how counter intuitive "X = X + 1" seemed at first.
And saying it's non-deterministic is just nuts. Sure, you could add non-deterministic semantics to any language, but there's nothing inherently non-deterministic about functional programming. In fact, I think you'd typically have to work a lot harder to make a functional language non-deterministic.
FP has nothing to do with threads, apart from the fact that functional programs could be executed by a large number of threads in parallel (or independent cores, or...?) without changing the outcome. And what exactly is the mess we're in? I can't think of another industry that has succeeded so spectacularly in such a short time.
And so on...did I just feed a troll?
--MarkusQ
Re:Multi Threaded programming (Score:5, Informative)
Huh? Quicksort is pretty easy to parallelize [wikipedia.org], due to its divide and conquer nature: it splits its list into sublists and recurses on those sublists.
Re:Python FP -- only slightly like real fp (Score:3, Informative)
Python has some features inspired by FP languages including Haskell, but is not anything like real functional programming. Haskell is far more powerful and serious, but also a heck of a lot more difficult to use. Python has a "practicality beats purity" mantra; you could think of Haskell as "purity strikes back".
Stuff Haskell gets you:
Serious native-code compiler whose output can beat compiled C code for some programs (and does darn well on average, see the Alioth shootout)
Ability to use multi-core parallelism, with a library module that treats shared memory as a transactional database, allowing use of shared data between threads while getting rid of all the lock management headaches of languages like Java. This can work because Haskell's functional purity guarantees that the threads won't clobber each other except under circumstances precisely controlled by the library.
Data parallelism allowing computations of list comprehensions to automatically be done in parallel on multiple CPUs
Rigorous type system (nothing like the broken ones in C or Java that you might be used to) lets you express complex invariants in your datatypes so that errors can be caught by the compiler. This greatly decreases the amount of runtime debugging required.
I could go on, but you get the idea.
Good tutorial: http://learnyouahaskell.com/ [learnyouahaskell.com]
More detailed book (full text online): http://book.realworldhaskell.org/ [realworldhaskell.org]
Haskell has a very steep learning curve compared with other languages (or "unlearning curve", as some put it, since you have to forget everything you knew), but learning it (a still ongoing process for me) is one of the most interesting and mind-expanding things I've ever done as a programmer.
The algorithms really do break (Score:5, Informative)
Let's say you have a few thousand (name, address) pairs and you want to be able to quickly look up a name to get the corresponding address, to add new names, etc. In imperative programming you'd probably use one of the mainstay data structures of CS 101, the good old hash table. To add a new name, you hash it and go and poke that address in the table to record the entry.
Well remember that stuff about values in functional programming being immutable? Right, no hash tables in functional programming. You'd instead use something like an AVL tree or red-black tree, that let you create a completely new structure that shares most of its content with the old one, except that the new one has this extra node. Of course FP language libraries come with modules for making those structures, and in practice you can use them at the API level sort of like how you used to use hash tables, but they are completely different underneath, and if you want to program them yourself you are going to have to learn a lot of very basic techniques from scratch all over again. Chris Okasaki's book "Purely Functional Data Structures" is a good place to learn about this stuff in detail.
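As a tiny illustration of the structure-sharing idea (a toy persistent association list in Python, not an efficient tree; `extend`/`lookup` are invented names):

```python
# An environment is either None (empty) or ((key, value), rest).
def extend(env, key, value):
    # "Adding" an entry allocates one new pair; the entire old
    # environment is shared, never copied or modified.
    return ((key, value), env)

def lookup(env, key):
    while env is not None:
        (k, v), env = env
        if k == key:
            return v
    return None

e0 = None
e1 = extend(e0, "alice", "12 Elm St")
e2 = extend(e1, "bob", "9 Oak Ave")
print(lookup(e2, "bob"))  # 9 Oak Ave
print(lookup(e1, "bob"))  # None: e1 was never changed by the insert
```

Balanced trees apply the same trick with O(log n) lookup instead of O(n), which is what the library modules give you.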
Even more basic: the good old "for" loop, which updates an index variable each time through. Whoops! You can't update the index in a functional language, so there's no "for" loop. You instead use recursion, or a "higher order function" (function that operates on other functions). So instead of
for (i = 0; i < n; i++) ys[i] = f(xs[i])
You'd write something like
ys = map f xs
("map" takes a function f and a list of values xs, applies the function to each item in the list, and gives you back a new list). There is also a "list comprehension" syntax that you might know from Python:
ys = [f(x) | x <- xs]
but for complicated functions you end up having to use higher order functions and recursion explicitly. You really have to think a lot harder to program 20 lines of Haskell than 20 lines of C. But those 20 lines can do an order of magnitude more.
(Aside:) In case you were wondering, yes, you can implement traditional hash tables and other mutable structures in functional languages, and there are times when it's necessary, but it's comparatively a pain in the ass and you give up some of the advantages that had you programming functionally in the first place. Here is an article about someone's experiences switching from a mutable structure to a functional structure in a large program, and the headaches the functional structure solved:
http://www.cs.tufts.edu/~nr/pubs/zipcfg-abstract.html [tufts.edu]
Re:Convince your boss. (Score:3, Informative)
It makes perfect sense if you consider that when he says "power" he wasn't referring to power consumption, but processing speed.
In other words: heat output increases faster than processing power does.
Re:Convince your boss. (Score:3, Informative)
Realistically, most code that will scale to 16 cores should scale to 64 cores. The real problem is code for 1/2 cores not scaling to 8+ cores. Once 8+ cores are common, this problem will quietly vanish as no-one clings to the 1/2 core solutions anymore. You can already see this to a large extent in server applications.
Re:An brief introduction to functional programming (Score:4, Informative)
Notably, it's impossible to do a quicksort in a functional language
Some Haskell guy seems to disagree [haskell.org].
Re:Convince your boss. (Score:4, Informative)
For a lower-level example, almost any kind of query over some data set is an inherently parallelizable operation. This includes virtually all usages of list comprehensions, map and filter in Python, for example.
Re:Broken Algorithm BS (Score:3, Informative)
However, tasks that can be completed with an iterative polynomial-time algorithm often end up exponential when recursive. Of course, a bit of tail recursion and you can deal with that, but some things aren't easily tail recursed.
That is not so. Recursion is more general than iteration, and if you make use of that extra expressive power, you can get higher-complexity algorithms. But then these algorithms would be just as complex if implemented with iteration and an explicit stack, or whatever data structure is needed.
The canonical example where iteration is linear and naive recursion is exponential is the computation of the Fibonacci series. But the simple iterative algorithm is by no means obvious either; and, on the other hand, a linear-time recursive algorithm is not significantly harder to come up with.
Problems that are not easily tail-recursive are also not easily solved with simple iteration.
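Fibonacci makes the point concrete (a sketch; `fib_naive` and `fib_linear` are illustrative names):

```python
def fib_naive(n):
    # Exponential time: fib_naive(n-2) is recomputed inside
    # fib_naive(n-1), and so on all the way down the call tree.
    if n < 2:
        return n
    return fib_naive(n - 1) + fib_naive(n - 2)

def fib_linear(n, a=0, b=1):
    # Linear time and tail recursive: carry the last two values
    # along as accumulators, exactly like the iterative loop does.
    if n == 0:
        return a
    return fib_linear(n - 1, b, a + b)

print(fib_naive(10), fib_linear(10))  # 55 55
```

The accumulator version is the recursive twin of the iterative algorithm: same work per step, same linear cost.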
Re:No, look at the scope (Score:2, Informative)
How much does it really matter? How much of the Java code out there needs to be a great deal faster (on, say, a percentage basis)? Of the Java code that does need to be faster, how much of it is not embarrassingly parallel? For people who are doing actual computation for a living, the percentage is probably pretty high, so they will benefit from improved tools, but for most code that is aimed at a user, simply enforcing better coding practices will probably improve responsiveness more (and users are a lot more concerned with perceived speed than actual speed).
I would guess the highly parallel code will end up being treated a lot like assembler, where most people ignore it, some people use it where they know they will benefit from it, and a remaining small group of people obsess over using it for everything.
Re:Convince your boss. (Score:3, Informative)
First, I'm not sure that I would use accounting procedures as my example of a "real world" problem, but whatever.
There are two more direct reasons why payroll computations aren't the sequential problem you're looking for. First, they're not CPU bound. No-one's saying that it would be so much easier to do payroll calculations if their CPU were just twice as fast. Second, most companies doing payroll calculations have more than one employee. Sure, the procedure you described is serial - but there's no reason you can't do it for every employee simultaneously in parallel and still get correct answers.
A similar thing is true for many other problems: One instance is serial and computationally trivial, in order to be computationally expensive you must have many instances, in which case your whole problem is naturally very parallel.