Facebook's Cross-Language Network Library

Posted by kdawson
from the talk-is-lightweight dept.
koreth writes "Facebook has released Thrift, a toolkit for making remote method calls. It generates interoperable network code in C++, Java, PHP, Python, and Ruby. Its protocol is much more lightweight (and probably much higher-performance) than SOAP or CORBA. Facebook uses it internally for high-traffic services like search. The license is extremely permissive."

  • by Anonymous Coward
    Does this mean I can Poke You with my Python?
  • Perl? (Score:5, Interesting)

    by strredwolf (532) on Tuesday April 03, 2007 @08:06AM (#18585809) Homepage Journal
    Any ports to Perl? Anyone?
  • by faqmaster (172770) <jones.tm @ g mail.com> on Tuesday April 03, 2007 @08:08AM (#18585829) Homepage Journal
    I like my women like I like my licenses: extremely permissive.
  • Ohhh, goody (Score:2, Insightful)

    by Anonymous Coward
    Just what the world needed, Yet Another RPC Framework. I guess on the bright side, it can't possibly suck any harder than CORBA.
    • Re: (Score:1, Informative)

      by Gr8Apes (679165)
      I wish I could say "Who still uses CORBA?", unfortunately, the answer would include me. :( Give me Sun's RMI implementation any day of the week, or even C++ married to JMS. CORBA gives black holes a run for the money.
      • Re: (Score:1, Interesting)

        by Anonymous Coward
        http://www.zeroc.com/ice.html [zeroc.com] is supposed to be Corba well done. Have you tried it?
        • by morgan_greywolf (835522) * on Tuesday April 03, 2007 @09:07AM (#18586405) Homepage Journal

          http://www.zeroc.com/ice.html [zeroc.com] is supposed to be Corba well done. Have you tried it?


          No, thanks. I prefer my CORBA medium-rare.
        • Re: (Score:1, Interesting)

          by Anonymous Coward
          The ICE website says "We have conducted extensive performance comparisons with TAO, which is widely considered to be among the fastest CORBA implementations available."
          I wonder how much better it is. I ran extensive benchmarks comparing TAO releases to VisiBroker Java, and VisiBroker/J consistently beat TAO by wide margins every time.

          VisiBroker sure was a very neat CORBA ORB from Borland. Maybe I should go back and do some tests comparing VisiBroker and ICE.
          • by Gr8Apes (679165)
            Now you're just being evil and dredging up old memories.

            CORBA is hopelessly broken. It's like using XML for coding. It's last century's technology for a problem that has since been elegantly solved several times over with many fewer pain points. It's like saying COBOL is a good web development language with a straight face and meaning it.

            • >>CORBA is hopelessly broken.

              I tend to agree with the sentiment, but it is not so much broken as one of those technologies that was clearly created by committees filled with their own agendas. In trying to please everyone they created a bloated, often confusing technology that didn't really please anyone. CORBA's biggest usage is in a space most people would have never predicted - embedded. But it is usually a much tighter subset of the CORBA spec.

              I looked at ICE a couple of years ago and it does

              • by Gr8Apes (679165)

                CORBA's biggest usage is in a space most people would have never predicted - embedded. But it is usually a much tighter subset of the CORBA spec.

                >>It's last century's technology for a problem that has since been elegantly solved several times over with many fewer pain points.

                Maybe. My sense is that it is a problem with too many variables for a one-size-fits-all solution which is why so many people continue to fashion custom solutions.

                CORBA in essence is messaging, nothing more or less. There are less complicated solutions out there that are significantly easier to use than CORBA, and some of those are considerably more robust as well, having loose integration.

                • >>CORBA in essence is messaging, nothing more or less.

                  That is simply not true. Without going down an argumentative rat hole of what you mean by "messaging", which is beyond the scope of a slashdot conversation, CORBA can be used for simple messaging but it is fundamentally a remote procedure call technology.

                  Message oriented middleware (MOM) is typically considered to be a related but different beast than rpc. Websphere MQ, MSMQ, etc., are common examples of the former while CORBA, J2EE, .net

                  • Two, the distributed space is too big and has too many variables for any one (or even all of the currently existing) technologies to satisfy, which is why people continue to create new ones.

                    The complexity and variability of the distributed system problem domain is one reason that the CORBA specs are so huge and far reaching (another cause is design-by-consortium). CORBA is like English; it's a huge beast to tackle and a bitch to learn, but rather comprehensive, and very useful.

                    Our shop uses CORBA with (C

                    • Re: (Score:3, Interesting)

                      by sonofagunn (659927)
                      CORBA is the best solution for a lot of applications. Web services just don't perform as well and don't handle more complicated interfaces as elegantly (inheritance, one-way calls, callbacks, etc).

                      Web services are nice for simple remote calls, but in a complex system where all sorts of RPCs are flying around the place, CORBA is a better solution.

                      Other solutions aren't as interoperable between different languages/environments. CORBA still has its place. ICE sounds even better, but I haven't tried it. Gi
                    • by Gr8Apes (679165)
                      I don't know what you've been programming, but "Web services" perform just fine, provided you design and program them optimally for what you're doing. They can also handle inheritance and "callbacks" just fine, you just may have to use a particular subset or framework to make that easier to deal with.

                      Now, will they necessarily integrate with some particular legacy system as easily? That might be where CORBA rules, but that falls into the category of "predating better technology solutions". That case does no
                    • I don't know what you've been programming, but "Web services" perform just fine, provided you design and program them optimally for what you're doing.

                      I used to think that. And loudly evangelize that. Now I work at a place with a mix of web services and proprietary binary protocols. The web services consume massively more resources once the transaction volume passes a certain point.

                      Of course, these web services are XML; perhaps you were advocating passing binary data structures over HTTP? (just kidding)

                      As

                    • by Gr8Apes (679165)
                      I think 35K concurrent users at peak across 8 physical multi-core/multi-proc machines with multiple instances of appservers just might have given me a bit of insight into how web services work. (Each transaction averages about 800KB total, but that's largely due to brain-dead design - something worked with but honestly wasn't responsible for)

                      I won't argue that binary can be faster. It's also a PITA to code and maintain in a changing heterogeneous environment.
      • Re:Ohhh, goody (Score:4, Informative)

        by LizardKing (5245) on Tuesday April 03, 2007 @10:51AM (#18587921)

        I've worked with CORBA at my last three jobs, and I've been pretty happy with it. I've used OmniORB, Orbacus, JacORB and MICO - all of which work very well, although the licensing cost of Orbacus puts it out of reach for most of the things I work on. I do have to wrap a lot of the C++ stuff in helper classes though, as the mapping for that language is far too baroque. One of the consultants at IONA has produced an open source CORBA utilities library [ciaranmchale.com] that is far more extensive than my one.

      • We use it a lot in financial apps to move lots of data around at high speed. Other apps in the company use SOAP and XML-based stuff and suffer from poor performance.

        The Marshalling is quick and efficient.
        • by Gr8Apes (679165)
          Considering what I've seen XML based items do, I seriously doubt you've exceeded the capability of SOAP or XML. I'd much more readily believe that it's your implementation/architecture that has issues. (I happen to have worked on a large scale app that messaged an average 800KB messages for each of 35K concurrent users. Another app transferred the data of tens of thousands of transactions from a SAP instance to a secondary DB for metric and reporting purposes using XML in near real time (less than 1s delay
          • I happen to have worked on a large scale app that messaged an average 800KB messages for each of 35K concurrent users

            OK, that's 800 x 35 = 28G. 28G per what?

            Another app transferred the data of tens of thousands of transactions...

            OK, roughly 50k txns/?? second? minute? month? To understand your figures, I need the denominator.

            In any event, you can hit an arbitrary number of txns/sec with an inefficient protocol given enough servers. To bring the numbers into focus, we need txns/sec, avg txn size,

            • by Gr8Apes (679165)
              The window was no larger than 5 minutes. The average traffic rate was about 634 messages/s which is about 500MB/s in data flow. (Before you point out that's a lot of data, XML compresses quite well)

              The second case topped out in testing at about 130K transactions/min. The desired goal was 200K, but the initial receiving system wouldn't scale high enough.

              And yes, you'd think you could hit an arbitrary number of txns/sec given enough servers. Truth is, it's highly dependent upon your architecture, even more th
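
The figures traded in this sub-thread can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 KB = 1000 bytes) and takes the 800KB, 35K and 634 messages/s numbers at face value from the posts above; the variable names are only for illustration.

```python
# Decimal units are an assumption; the posts do not say which they mean.
KB = 1_000
MB = 1_000 * KB
GB = 1_000 * MB

message_size = 800 * KB        # average message size quoted above
concurrent_users = 35_000      # peak concurrent users quoted above
messages_per_second = 634      # average traffic rate quoted above

# One message per concurrent user: the "800 x 35 = 28G" figure.
total_bytes = message_size * concurrent_users

# Sustained data flow at the quoted message rate.
bytes_per_second = message_size * messages_per_second

print(f"one message per user: {total_bytes / GB:.1f} GB")
print(f"at 634 messages/s:    {bytes_per_second / MB:.1f} MB/s")
```

Both quoted numbers hold up under these assumptions: 28 GB in total, and a little over 500 MB/s before any compression.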
    • Re: (Score:3, Informative)

      by Cthefuture (665326)
      No kidding.

      XML is a standard for heavyweight text type communication.

      ASN.1 BER encoding is a standard for lightweight binary communication (similar to this Thrift crap except ASN.1 is an ISO standard and used everywhere).

      Any RPC method worth its salt would use one of those.
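
As a rough illustration of why a binary tag-length-value encoding such as BER counts as lightweight, the sketch below encodes an integer the BER way (tag 0x02, a length, then big-endian two's-complement content) and compares its size with an XML rendering. The function name `ber_encode_integer` and the XML element name `value` are made up for this example, and it handles only the INTEGER type, not ASN.1 in general.

```python
def ber_encode_integer(value: int) -> bytes:
    """Encode an int as a BER INTEGER: tag 0x02, length, content octets."""
    # Bits needed for minimal two's complement, including the sign bit.
    magnitude = value if value >= 0 else ~value
    n_bytes = (magnitude.bit_length() + 1 + 7) // 8
    content = value.to_bytes(n_bytes, "big", signed=True)

    if n_bytes < 128:
        # Short form: a single length octet.
        length = bytes([n_bytes])
    else:
        # Long form: 0x80 | count of length octets, then the length itself.
        len_octets = n_bytes.to_bytes((n_bytes.bit_length() + 7) // 8, "big")
        length = bytes([0x80 | len(len_octets)]) + len_octets
    return b"\x02" + length + content


encoded = ber_encode_integer(300)
as_xml = "<value>300</value>".encode("ascii")  # hypothetical element name
print(encoded.hex(), len(encoded), len(as_xml))
```

The size gap shown here is only for a single small integer; real messages, compression, and parser cost will change the picture.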
      • by salec (791463)
        Can ASN.1 describe formats which have sub-octet fields, or fields that traverse octet boundaries? I see they mention bit strings, but these bit strings seem to be "octetized" (only one bit string per octet, the rest is padded). E.g. how does ASN.1 describe a simple HDLC header?
        • ASN.1 is not meant to encode existing data formats. It is its own encoding format.

          It is probably the most widely used binary encoding though, everything from the Internet protocols you are using right now to stuff in your cell phone.
  • I think this raises a potential privacy concern. Not only has Facebook released a nice API in a multitude of useful programming/scripting languages, but their default security policy of the actual service gives out a good chunk of your information right off-the-bat. For the uninformed Facebook user, this spells trouble. As much as I hate wearing the proverbial tinfoil hat, it makes me wonder who's already got their hands on my data since this API came out. How many apps have already been written to simply c
    • Re: (Score:3, Insightful)

      I feel no pain for uninformed users. I'm sorry, but if you put something on the internet and don't know about how it'll be displayed or shown or shared or whatever (accessed?) then you deserve whatever you get.
      • I am not sure whether being uninformed in this case is due to willful ignorance or Facebook changing the rules on people yet again.
    • by mwvdlee (775178) on Tuesday April 03, 2007 @08:45AM (#18586167) Homepage
      RTFA

      They're not giving away any API to their data.

      What they've released is nothing more than a platform-independent RPC protocol.

      And a weird one at that. Instead of relying on common, generic data-format such as XML, they seem to be relying on a custom compiler for their own definition language. I'm sure the underlying data-format is usable without the compiler, but then there could be better methods for writing/reading it.
      • by Kyle_Katarn-(ISF) (982133) on Tuesday April 03, 2007 @09:09AM (#18586429)
        He's not saying THIS is an API, but that they have released one. Which is true; I've dabbled with it a bit myself.
      • XML is slow. If you want fast, lightweight RPC, which is generally what I want unless I'm working over the internet (which is slow anyway), then you're going to have to write something proprietary.
      • by The Pim (140414)

        Instead of relying on common, generic data-format such as XML

        XML was not designed as a "generic data-format"; it was designed as a "better SGML", that is, a document format. In fact, it is not a good data format, as can be seen by the contortions involved in adding a type system (essential to a general purpose data format). Which still doesn't work [bell-labs.com], by the way.

        Besides which, designing your own data format, while requiring some care, is not exactly a Herculean labor. If they would just add product an

    • Re: (Score:1, Informative)

      by Anonymous Coward

      I think this raises a potential privacy concern. Not only has Facebook released a nice API in a multitude of useful programming/scripting languages,
      Dude, this *isn't* an API to Facebook's database. It's a stand-alone remote procedure call mechanism (think SOAP or CORBA - you know, like they said in the summary) that happens to be developed by and used by Facebook.

    • Sorry, try again. They didn't release their internal API, they released a framework for RPC calls. Completely different, which you might have noticed if you had actually read the article. And if you don't want your information shared, don't put it somewhere beyond your physical control, i.e. on Facebook.
    • Re: (Score:2, Informative)

      by UltraAyla (828879)
      Their API requires that a user authorize a site to collect their information. If, after a warning for each site, a user still authorizes it, then that's their own problem
  • by nekokoneko (904809) on Tuesday April 03, 2007 @08:49AM (#18586209)
    Post benchmarks to prove a statement or don't state it at all. Don't use weasel words [wikipedia.org] to try to convey a point of view without solid evidence. BTW, it seems this statement was made by either the submitter or the editor, since I couldn't find anything mentioning it on TFA.
    • by acidrain (35064)

      Sigh. Take all the programmers working on a project which hasn't been carefully profiled, round them up in a meeting room and ask them why their project is slow. You will get five different reasons, and odds are none of them are correct.

      You can tell someone is an expert at optimization when they refuse to make any kind of guess.

      Finally, comparing the value of different implementations on the basis of elegance is a worthwhile hint about their potential, but comparing them *after* they have both been car

  • Indexes (Score:3, Interesting)

    by mwvdlee (775178) on Tuesday April 03, 2007 @08:51AM (#18586235) Homepage
    Why does the format include those "1:", "2:", etc. indexes in the structure definitions and method argument lists?
    Couldn't it do this automatically, or can you mix them up in some way?
    • by Falkkin (97268)
      See section 5 of their whitepaper for a full analysis -- but basically this allows for versioning and interface-definition changes. They can roll out changes to the interfaces in an incremental manner because the servers can robustly deal with new clients that send unexpected fields and old clients that neglect to send expected fields.
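
To make the versioning point concrete, here is a toy sketch of a field-ID-tagged encoding. It is not Thrift's actual wire format; the layout (one byte of field ID, one byte of length, then UTF-8 bytes) and the names `encode_fields`, `decode_fields`, and `KNOWN_FIELDS` are assumptions made up for illustration. The point is that a reader keyed on numeric IDs can skip fields it does not recognise and fall back to defaults for fields that are missing.

```python
# Toy tagged encoding: [field_id: 1 byte][length: 1 byte][UTF-8 payload] ...
# Hypothetical schema known to this (older) reader: id -> (name, default).
KNOWN_FIELDS = {1: ("name", ""), 2: ("email", "")}


def encode_fields(fields: dict[int, str]) -> bytes:
    out = bytearray()
    for field_id, text in fields.items():
        payload = text.encode("utf-8")
        if not (0 < field_id < 256 and len(payload) < 256):
            raise ValueError("toy format only supports small ids and payloads")
        out += bytes([field_id, len(payload)]) + payload
    return bytes(out)


def decode_fields(data: bytes) -> dict[str, str]:
    # Start from defaults so fields an old writer never sent still exist.
    result = {name: default for name, default in KNOWN_FIELDS.values()}
    pos = 0
    while pos < len(data):
        field_id, length = data[pos], data[pos + 1]
        payload = data[pos + 2 : pos + 2 + length]
        pos += 2 + length
        if field_id in KNOWN_FIELDS:
            result[KNOWN_FIELDS[field_id][0]] = payload.decode("utf-8")
        # Unknown ids are skipped: the length says how far to jump.
    return result


# A newer client sends an extra field 3 that this reader has never heard of.
from_new_client = encode_fields({1: "alice", 3: "extra"})
# An older client omits field 2 entirely.
from_old_client = encode_fields({1: "bob"})

print(decode_fields(from_new_client))
print(decode_fields(from_old_client))
```

If fields were identified by position instead of by explicit ID, inserting or removing one would shift every field after it, which is why the numbers are written out rather than assigned automatically.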
  • by garo5 (895321) on Tuesday April 03, 2007 @08:53AM (#18586267) Homepage

    According to the tutorial this api relies on code generation, which I personally don't like.

    Does anybody know any good C++ RPC library which uses templates and which does not need code generating with any external tool nor executable?

    C++ templates allow metaprogramming, so it should be possible to develop such tools, but I don't know of any. Does anybody know of one?

    - Garo

    • by Anonymous Coward
      Try RCF http://www.codeproject.com/threads/RMI_For_Cpp.asp [codeproject.com]

      Or roll your own with boost::asio - http://tinyurl.com/2zpbfd [tinyurl.com], though I think a boost library is already in progress
    • by LizardKing (5245) on Tuesday April 03, 2007 @10:38AM (#18587705)

      Does anybody know any good C++ RPC library which uses templates and which does not need code generating with any external tool nor executable?

      Yup, sockets. Every RPC-ish system I'm aware of (Sun RPC/XDR, CORBA, SOAP, RMI, ASN.1) needs a code generator that produces the stubs which make it easier than using raw sockets. The code that's produced by these stub compilers can be pretty small and well optimised (apart from SOAP), plus you shouldn't need to edit it by hand. Some compilers, such as a decent one for CORBA's IDL, can also produce the boilerplate code that you then fill in with your implementation of the RPC calls. While I usually dislike generated code, when it comes to RPC systems I'm quite glad they do a decent job of hiding complexity from me.
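
To show what a stub compiler saves you from writing, the sketch below hand-codes the two halves that a generator would normally emit for a single call, `add(a, b)`: a client stub that marshals the arguments onto a socket and a server skeleton that unmarshals them and sends back the result. The wire layout (a one-byte method number followed by two 32-bit big-endian integers) and all function names are invented for this example and do not correspond to any of the systems named above.

```python
import socket
import struct

METHOD_ADD = 1                      # hypothetical method number
REQUEST = struct.Struct("!Bii")     # method id, two signed 32-bit args
RESPONSE = struct.Struct("!i")      # signed 32-bit result


def recv_exact(sock: socket.socket, size: int) -> bytes:
    """Read exactly `size` bytes; a stream socket may return fewer per recv."""
    buf = b""
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def client_send_add(sock: socket.socket, a: int, b: int) -> None:
    """Client stub, first half: marshal the call and send it."""
    sock.sendall(REQUEST.pack(METHOD_ADD, a, b))


def client_recv_result(sock: socket.socket) -> int:
    """Client stub, second half: read and unmarshal the reply."""
    (result,) = RESPONSE.unpack(recv_exact(sock, RESPONSE.size))
    return result


def server_handle_one(sock: socket.socket) -> None:
    """Server skeleton: unmarshal one request, dispatch, marshal the reply."""
    method, a, b = REQUEST.unpack(recv_exact(sock, REQUEST.size))
    if method != METHOD_ADD:
        raise ValueError(f"unknown method {method}")
    sock.sendall(RESPONSE.pack(a + b))  # the "implementation" is just a + b


# A connected socket pair stands in for a real network connection.
client, server = socket.socketpair()
with client, server:
    client_send_add(client, 2, 40)
    server_handle_one(server)
    answer = client_recv_result(client)
print(answer)
```

The client stub is split into a send half and a receive half only so the example can run in a single thread; a real stub would block on the reply inside one call. Multiply this by every method, argument type, and target language and the appeal of generating it becomes clear.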

    • I believe Hessian [caucho.com] has a C++ port. I'm not sure if this is what you want, though.
    • by be-fan (61476)
      The sheer pain of doing non-trivial code generation with C++ templates makes it not worth it. Even something relatively simple like Boost.Lambda, which doesn't generate all that much code when you think of it, is nearly unusable because of how much it slows down compilation and how thoroughly it messes up any error messages for errors made in the vicinity of a template call. It would be a massive PITA to use an API that generated a non-trivial amount of marshaling code using the template mechanism.
    • Does anybody know any good C++ RPC library which uses templates and which does not need code generating with any external tool nor executable?

      Yes, CORBA. You can do DII (Dynamic Invocation Interface) on the client side, and DSI (Dynamic Skeleton Interface) on the server side. You are never required to use generated code with CORBA. OTOH, the amount of code that you will have to write using DII/DSI is large (not as large as the generated code would be, but large), and usually a PITA. BTW, you can mix a

  • The license (Score:5, Informative)

    by Anonymous Conrad (600139) on Tuesday April 03, 2007 @08:53AM (#18586273)
    Is basically the MIT license [opensource.org] with a few tweaks to the first paragraph (e.g. person -> person or organisation), the second paragraph expanded to cover some of the ideas in the middle section of the BSD licence [opensource.org] and the third paragraph verbatim (or practically verbatim). Note that it appears equivalent to the MIT license in that there's no non-endorsement clause as you'd find in BSD or Apache 1.1 [opensource.org].
  • ... is yet another RPC solution.
    • by dkf (304284)
      It's not another one. It's just REST with a custom security layer on top (and not even done so entirely transparently; ick!) Moreover, you still need either an XML parser or a JSON security hole, err, parser. To cap it off, facebook don't use it with HTTPS so who knows what mischief some man-in-the-middle could cause?
  • by kabdib (81955) on Tuesday April 03, 2007 @11:14AM (#18588349) Homepage
    Same ideas that have been around for 20+ years, but I have to admit it's a fairly nice implementation of a close-to-the-wire protocol.

    They could have gone more flexible and abstract; structs are *bad* for you, and they're missing a fair amount of opportunity to make things dynamic, e.g., growable arrays, hashes, sets, arbitrary nested structures, and even things like canonicalized timestamps, which are a quite important (but often neglected) platform-dependent type (see how often time gets mangled when you go multi-platform...).

    As for efficiency, it wouldn't be hard to be better than SOAP. I have some horror stories...
    • by FueledByRamen (581784) <sabretooth@gmail.com> on Tuesday April 03, 2007 @04:24PM (#18593611)

      growable arrays
      On the transport layer? That doesn't make any sense. The endpoint implementations insert into language-standard growable arrays (PHP native indexed arrays, C++ std::vector, et al), as they should.

      hashes
      Easily represented as maps.

      sets
      They have those. Templated type 'set' in the Thrift interface file (just like std::set).

      arbitrary nested structures
      And those. map<string, map<string, set<string> > > is a perfectly valid construct in the Thrift file, and will emit (as you might expect) the same thing using STL data structures in the C++ endpoint, or nested associative arrays in the PHP endpoint. The same thing applies to non-templated structure nesting as well, and to templating around user defined structures.

      canonicalized timestamps
      There's no good reason to make a separate timestamp class; an int64 is plenty big enough to hold microseconds (or nanoseconds, even) since the epoch.
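
The claim about range is easy to check with a little arithmetic. The sketch below works out how many years a signed 64-bit counter covers at microsecond and nanosecond resolution, and round-trips a timestamp through an 8-byte field with `struct`. The choice of big-endian byte order and of the Unix epoch are assumptions for illustration, not something taken from the Thrift documentation.

```python
import struct
from datetime import datetime, timedelta, timezone

INT64_MAX = 2**63 - 1
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years_at_us = INT64_MAX / 1_000_000 / SECONDS_PER_YEAR      # microseconds
years_at_ns = INT64_MAX / 1_000_000_000 / SECONDS_PER_YEAR  # nanoseconds
print(f"microseconds: ~{years_at_us:,.0f} years")
print(f"nanoseconds:  ~{years_at_ns:,.0f} years")

# Round-trip a timestamp as microseconds since the Unix epoch in 8 bytes.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
moment = datetime(2007, 4, 3, 16, 24, tzinfo=timezone.utc)
micros = (moment - epoch) // timedelta(microseconds=1)

wire = struct.pack("!q", micros)               # signed 64-bit, big-endian
(decoded_micros,) = struct.unpack("!q", wire)
decoded = epoch + timedelta(microseconds=decoded_micros)
print(len(wire), decoded == moment)
```

Microseconds leave hundreds of thousands of years of headroom, while nanoseconds cover roughly 292 years either side of the epoch. What a bare integer does not carry is the unit or the epoch, so both ends still have to agree on those out of band.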
  • .....however I still think Zuckerberg needs to DIAF. While we are at it, toss Tom in there for good measure
  • by Smack (977) on Tuesday April 03, 2007 @12:51PM (#18589833) Homepage
    So they wrote something in-house, for their own reasons. Open-source advocates say "release everything... it'll be useful to someone". So they did release it, and then they get slammed for not using the existing standards and because people don't like their methodologies.

    Bravo.
    • by alienmole (15522)
      I don't think open source advocates would claim "release everything, and no-one will criticize it". Criticism is crucial to open source (and everything else), since that's how the good stuff is separated from the bad. Someone who doesn't want to ever receive any criticism should simply avoid doing anything and interacting with other people.

      One pragmatic argument for releasing your code is then you'll find out how good and useful it really is, compared to the competition, beyond just what your own little t
  • what is it? What does it do?
