Scaling Large Projects With Erlang

Posted by Soulskill on Sunday July 06, @09:28AM
from the right-tool-for-the-right-job dept.
Delchanat points out a blog entry which notes, "The two biggest computing-providers of today, Amazon and Google, are building their concurrent offerings on top of really concurrent programming languages and systems. Not only because they want to, but because they need to. If you want to build computing into a utility, you need large real-time systems running as efficiently as possible. You need your technology to be able to scale in a similar way as other, comparable utilities or large real-time systems are scaling — utilities like telephony and electricity. Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires. Amazon SimpleDB is built upon Erlang. IMDB (owned by Amazon) is switching from Perl to Erlang. Google Gears is using Erlang-style concurrency, and the list goes on."

  • by Enlightenment (1073994) on Sunday July 06, @09:39AM (#24074403)
    They were right! [youtube.com]
  • Sufficiently? (Score:5, Interesting)

    by Anonymous Coward on Sunday July 06, @09:39AM (#24074405)
    Perhaps the systems would be better running efficiently rather than sufficiently?
  • Huh? (Score:5, Insightful)

    by The Breeze (140484) on Sunday July 06, @09:49AM (#24074463) Homepage

    "The two biggest computing providers of today"?

    What the hell does that mean?

    Also, is it just me or does the article intro sound like it was written by someone who has taken way too many marketing classes?

  • "running as sufficiently as possible"?

    Sometimes as a nation we must ask ourselves, is our children learning?

  • Scala (Score:5, Informative)

    by fils (88044) on Sunday July 06, @10:09AM (#24074545) Homepage

    People may also want to check out Scala at:
    http://www.scala-lang.org/ [scala-lang.org]

    It also uses the Erlang-style concurrency approach and runs on the JVM, with class compatibility with other JVM languages such as Java and Groovy.

    • Re:Scala (Score:4, Informative)

      by bonefry (979930) on Sunday July 06, @10:44AM (#24074725)

      There is a significant difference between Scala and Erlang.

      Erlang uses green threads. And green threads have advantages and disadvantages over native threads.

      For instance, Erlang is bad at IO, but on the other hand it can spawn millions of threads, something that the JVM has a hard time doing because native threads are limited by the kernel.
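
The cost difference between user-space ("green") tasks and kernel threads can be illustrated outside Erlang. The following is a minimal sketch in Python using `asyncio`, not Erlang itself; the names `worker`, `spawn_many` and `run` are made up for illustration. Unlike Erlang processes, asyncio tasks are cooperatively scheduled on a single OS thread, so this only shows that spawning many lightweight tasks is cheap, not that they run on multiple cores.

```python
import asyncio


async def worker(worker_id: int) -> int:
    # Yield to the scheduler once, as a stand-in for waiting on a message.
    await asyncio.sleep(0)
    return worker_id


async def spawn_many(count: int) -> int:
    # Each task is a small heap object scheduled in user space,
    # not a kernel thread, so creating many of them is cheap.
    tasks = [asyncio.create_task(worker(i)) for i in range(count)]
    results = await asyncio.gather(*tasks)
    return sum(results)


def run(count: int = 10_000) -> int:
    # Hypothetical entry point: spawn `count` tasks and sum their ids.
    return asyncio.run(spawn_many(count))
```

Creating the same number of native threads would typically hit operating-system limits or memory pressure far sooner, which is the trade-off the comment above describes.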

  • Why Erlang Matters (Score:5, Insightful)

    by mpapet (761907) on Sunday July 06, @10:12AM (#24074555) Homepage

    1. Multicore ready.
    Erlang will use them. Write your application in Erlang and it's done for you.

    2. Scales well.
    As an example, http://yaws.hyber.org/ [hyber.org] scales very nicely when loads increase. Your basic LAMP/LYMP setup runs much better on vanilla hardware.

    3. Designed for telecom
    The architects designed the language to run in a telecom environment so things like upgrades can be done while the application is running.

    Yaws in particular needs your help. Failover clustering inside the yaws server would be wonderful. Right now, it uses CGI to process other languages. It does it flawlessly, but a more direct solution might be a nice project.

    • 1. Invariable variables.
      This appears to have been done for no reason other than the designer's preference. In fact, it's not strictly true -- variables can be unbound, and later bound. They just can't be re-bound once bound.

      2. Weird syntax.
      Why, exactly, are there three different kinds of (required) line endings? It seems as though the syntax is designed to be as different from C as possible, while maintaining at least as many quirks. More so, even -- when constructing normal, trivial programs, you're going to hit most language features head-on and at their worst. Where's my 'print "hello\n"' that works most other places?

      I don't believe the important features of Erlang are mutually exclusive with the sane syntax of, say, Ruby or Python.

      3. Not Unicode-ready.
      Strings are defined as ASCII -- maybe Latin-1. But there's no direct Unicode support in the language -- if you're lucky, there are functions you can pipe it through.

      There are other things I haven't mentioned, mostly implementation-specific -- things like the fact that function-reloading cannot be done when you natively-compile (with hipe) for extra speed. My plan is to take the features I actually like from Erlang and implement them elsewhere, in a language I can actually stomach for its real tasks.

      • 1) Actually, there are quite a few good reasons for this, largely around the complete elimination of mutexing and locks. Just because you don't understand the purpose doesn't mean there wasn't one.

        2) Oooooh, a language is faulty because it has a syntax with which you are not familiar. Immediately kill all non-Java clones!

        3) They're just lists of numbers; they're neither ASCII nor Latin1. There is unicode parsing in the XMERL module.

        Please wait until you know a language before criticizing it.

        • As I understand it, you should look at variables in functional programming languages like Erlang more like those in a mathematical formula; such programs can be proven correct a lot easier, and since variables are effectively immutable

          All of this is based on the premise that Erlang is a functional language. It's not purely-functional, and I just don't see the point of doing it half-assedly. Erlang is effectively an imperative language dressed up like a functional language.

          And they're not immutable -- they can be unbound. As I understand it, this unboundedness is detected at runtime, not at compile time. If it were detected at compile time, you'd have a valid point.

          it facilitates forking the line of execution in a way that would not be possible without all kinds of semaphores and other concurrency stuff

          Except that's not how Erlang does concurrency. It does concurrency with explicit "processes" (green threads) and message-passing.

          Now, it does make these very easy, and you can get it to distribute processes among a few real OS threads (one per core) -- so it's still very cool. But you're thinking of languages like Haskell, which can be automagically threaded. Erlang is manually threaded, it's just much easier to think in threads (or "processes") -- they're effectively a language feature.
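
The "explicit processes plus message passing" model described above can be approximated in other languages. Here is a minimal sketch in Python, assuming one thread stands in for an Erlang process and a `queue.Queue` stands in for its mailbox; `counter_process`, `run_counter` and `STOP` are hypothetical names. Python threads are OS threads rather than Erlang's lightweight processes, so this shows only the communication style.

```python
import queue
import threading

# Sentinel message telling the worker to report its state and exit.
STOP = object()


def counter_process(mailbox: queue.Queue, replies: queue.Queue) -> None:
    # State is local to this thread; other threads never touch it directly.
    total = 0
    while True:
        message = mailbox.get()
        if message is STOP:
            replies.put(total)
            return
        total += message


def run_counter(values):
    mailbox = queue.Queue()
    replies = queue.Queue()
    worker = threading.Thread(target=counter_process, args=(mailbox, replies))
    worker.start()
    for value in values:
        mailbox.put(value)  # send a message instead of sharing a variable
    mailbox.put(STOP)
    result = replies.get()  # wait for the reply message
    worker.join()
    return result
```

Because the running total lives only inside the worker, no lock is needed around it; the queues are the only shared objects.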

  • by radarsat1 (786772) on Sunday July 06, @10:17AM (#24074587) Homepage

    I think the summary (and article) are somewhat poorly written, but that doesn't overshadow the fact that functional languages are becoming more and more interesting these days with concurrency becoming so important.

    I'd like to learn one, but there are several out there. What I'd like to see is a good in-depth comparison of different concurrent functional languages: why would I choose Haskell, or Erlang, or OCaml, for example? Are they all interpreted? (Does one exist that compiles?) Which ones support concurrency? What language features do they boast, and what are the advantages and disadvantages of these features? Do they have a complete set of libraries?

    Anyone know of an article like this? I've been searching for a while. Every article on functional languages I've found seems to concentrate on a particular one, but I can't find something helping me decide which one is most worth learning.

      OCaml compiles down to native code, which is about 10-20% slower than C. Faster than C in a few (narrow) cases.

      Haskell is also compiled to native code, but difficulties with the execution model mean it's pretty slow for any practical use.

      Erlang is interpreted - the execution model is similar to Perl or Python - which means it's slow on single cores, but of course the whole point of Erlang is to run in highly concurrent, distributed machines. There is a project [google.com] to use OCaml for the performance-critical, single threaded parts, and Erlang for coordinating the parallelism.

      Of course, this is probably missing the point. Unless you're doing intensive numerical work, you probably don't need the performance. The real advantage of these languages is how your code will be much smaller, easier to understand, safer, and faster to write.

      Rich.

  • Gibberish (Score:5, Insightful)

    by SpinyNorman (33776) on Sunday July 06, @10:18AM (#24074597)

    If you want to build computing into a utility, you need large real-time systems running as sufficiently as possible.

    But if you want to build sprockets into a weasel you need small batch-mode systems running as necessarily as possible.

    If the poster had anything interesting to say (I'd guess not, but who knows!), it was totally obscured by his lack of grasp of the English language.

  • Too late (Score:4, Funny)

    by Fnord666 (889225) on Sunday July 06, @10:26AM (#24074631) Journal

    Anyhow, this post was not intended to be a rant about old-school technology solutions vs. current and future technology problems.

    Given that this statement appears almost halfway through the blog post, I would say that it was already too late for that.

  • Stupid article (Score:5, Informative)

    by IamTheRealMike (537420) * on Sunday July 06, @10:39AM (#24074697) Homepage

    Wow, it's not often I strongly criticise articles around here, but that was total garbage.

    For the smart ones that didn't RTFA, here's a quick summary:

    • I like Erlang.
    • Big companies like Google and Amazon make things fast by using concurrency.
    • Erlang supports (one type of) concurrency.
    • Thus Google and Amazon are [probably] using Erlang.
    • Thus everyone should learn Erlang.

    For the record, I work for Google and we don't use Erlang anywhere in the codebase. Google Gears restricts you to message passing between threads because JavaScript interpreters are not thread-safe, so it's the only way that can work. Visual Basic threading works the same way for similar reasons. It's not because eliminating shared state is somehow noble and pure, regardless of what the article would have you believe, and in fact systems like BigTable use both shared-state concurrency and message passing based concurrency.

    The article says this:

    Architects (but also university-professors for that matter) still think they can build current and future industrial-grade and internet-grade systems with the same technologies as they did 10-15 years ago.

    But in fact the Google search engine, which is one of the larger "industrial-grade, internet-grade" systems I know of, is written entirely in C++. A language which is much the same as it was 10-15 years ago. Thus the central point of his argument seems flawed to me.

    Seeing as the article is merely an advert for Erlang, I'll engage in some advocacy myself. If you have an interest in programming languages, feel free to check out Erlang, but be aware that such languages are taking options away from you, not giving you more. A multi-paradigm language like version two of D [digitalmars.com] is a better way to go imho - it supports primitives needed to write in a functional style like transitive invariance, as well as a simple lambda syntax, easy closures and first class support for laziness.

    However, it also compiles down to self-contained native code in an intuitive way, or at least, a way that's intuitive to the 99.9% of programmers used to imperative languages, unlike Erlang or Haskell. It provides garbage collection but doesn't force you to use it, unlike Java. It doesn't rely on a VM or JIT, unlike C#. It provides some measure of C and C++ interoperability, unlike most other languages. And it has lots of time-saving and safety-enhancing features done in a clean way too.

      • Re:Stupid article (Score:5, Interesting)

        by IamTheRealMike (537420) * on Sunday July 06, @01:38PM (#24075795) Homepage

        Yes, D is very young and has problems. But then again, what language didn't? It's easy to forget but Python was first released in 1991. It took many years before it became mainstream (and some would say it's still not there yet).

        The post-mortem is an interesting document, but I disagree with the author's conclusions. The compilers are buggy; well, C++ had exactly the same problem for a long time but was still a huge success. In particular, the trend seems to be basing new compilers on LLVM, which has a pretty robust optimization core. Frontend bugs are by comparison pretty trivial and easy to fix. Another few years and I think this problem will be licked - and besides, lots of C++ code has workarounds for compiler issues. Same thing for class libraries.

        You're right about C-level FFIs. However D provides a simple C++ FFI which as far as I know is unique. Such a thing would be very useful for a company like Google which has a lot of C++ code, as it'd simplify binding considerably (I don't mean to imply anything about the future direction of the codebase, by the way).

        The argument about parallelism is a more interesting one. But I disagree with that too :) D provides exactly what is needed for automatic sharding of work across cores (or machines). Specifically the combination of transitive invariance, reflection and purity enforcement is a very powerful one.

        Essentially, if you can write your code to consist of non-trivial trees of pure functions, then it's perfectly safe to parallelise something like this:

        foreach (item; list) {
            fooResults[item] = someTransform(item);
            barResults[item] = anotherTransform(item);
        }

        If someTransform and anotherTransform are both pure, by implication their parameters are transitively invariant, and thus the loop iterations can be run in parallel (because the compiler knows "item" can't be changed). What's more, the two calls within each iteration can be invoked simultaneously as well.

        Once the compiler knows these things, making this code run in parallel is simply another compiler optimization. That's the whole theory behind how functional languages can be super easy to parallelize. But in fact the key concepts can be applied to imperative languages as well, with the advantage that you can still have temporary mutable state within the function scopes - you just can't modify the heap, or anything reachable through your arguments.

        D has keywords that let the compiler know and enforce function purity.

        Now as it happens I doubt that any D compiler today implements this optimisation - it's sophisticated and transitive invariance is newly introduced in D2. But all the pieces of the puzzle are there. This also lets the compiler do calculations on data structures available at compile time.
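
The idea that pure functions can be fanned out safely can be sketched by hand, without compiler support. The following is a minimal illustration in Python rather than D, with `some_transform`, `another_transform` and `transform_all` as made-up stand-ins for the loop above. Python does not enforce purity, so the programmer is asserting here what a D compiler could check; and with CPython's global interpreter lock, a thread pool shows the structure rather than a CPU-bound speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def some_transform(item: int) -> int:
    # Pure: the result depends only on the argument, with no side effects.
    return item * item


def another_transform(item: int) -> int:
    # Also pure, and independent of some_transform.
    return item + 1


def transform_all(items):
    items = list(items)
    with ThreadPoolExecutor() as pool:
        # Both maps are submitted before any result is consumed, so the
        # two transforms and the individual items may run in any order.
        foo_iter = pool.map(some_transform, items)
        bar_iter = pool.map(another_transform, items)
        foo_results = dict(zip(items, foo_iter))
        bar_results = dict(zip(items, bar_iter))
    return foo_results, bar_results
```

Since neither function reads or writes shared state, the results are the same regardless of scheduling order, which is exactly the property that makes the transformation safe to automate.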

  • by speedtux (1307149) on Sunday July 06, @10:44AM (#24074723)

    Erlang is a language that has all the right properties and mechanisms in place to do what utility computing requires.

    Well, except that it's darned inconvenient to actually write the applications in it.

    Google Gears is using Erlang-style concurrency, and the list goes on."

    Yup, and it makes more sense to add "Erlang-style concurrency" to existing languages than to throw out everything and switch to Erlang.

  • by Animats (122034) on Sunday July 06, @11:47AM (#24075027) Homepage

    The enthusiasm for "cloud computing" may evaporate when Xmas rolls around.

    I went to a talk at Stanford by the architect of Amazon's web services. It came out in questioning that the real motivation behind Amazon's low-priced web services is that their load in the Xmas shopping season is about 4x the load for the rest of the year. Their infrastructure is sized for the November-December peak, so for ten months of the year they have vast excess capacity. That's why Amazon's web services are so cheap.

    Don't expect good response time during the shopping season. Although this Xmas might be OK, due to the recession.

    • "you need large real-time systems running as sufficiently as possible."

      Should that not be efficiently as possible?

      You obviously haven't looked very closely at any of the "market leader" software lately.

      Software from the Big Guys is more and more designed to sell (think forced upgrades) bigger, faster systems. You don't do this by making your software efficient.

      The logic behind many software updates these days is "Will this release require sufficient resources that customers will be persuaded to upgrade to new hardware?"

    • Re:Deceptive (Score:5, Insightful)

      by IamTheRealMike (537420) * on Sunday July 06, @10:23AM (#24074623) Homepage

      Actually, Gears doesn't use Erlang either. What he means is that Gears threading doesn't allow for shared state (is it really threading then?). Instead threads communicate back to the browser by message passing.

      It's remarkably deceptive indeed to even imply that Gears and Erlang are connected. Message passing based concurrency isn't exactly new or limited to Erlang, and can be implemented in any language.

      I'm not sure what the point of this piece is. I've looked at Erlang and didn't see much of anything to get me excited. It's a functional language, which, like most of them, has unnecessarily weird syntax and forces immutable state. I don't really see what this buys you over a language like D 2 (or hell, even C++), in which you can write in a functional message-passing style if you like, but then still use imperative shared state whenever useful, convenient or performant.