Microsoft Goes In For Hadoop

Posted by timothy
from the well-it-is-a-nice-name dept.
Frankie70 writes that after more than three years, Microsoft has "finally learned to stop worrying and love Hadoop." Frankie70 excerpts from the linked Wired article: "Any aversion to Hadoop disappeared on Wednesday, when the company announced that it will integrate the platform with future versions of its relational database, SQL Server, and its platform cloud, Windows Azure, an online service for hosting and readily scaling applications. The company is now working to port the Hadoop platform to Windows."

  • by binarylarry (1338699) on Thursday October 13, 2011 @11:06AM (#37701890)

    So what they mean is, they're going to do a search and replace to make it compile as a C# application.

    • by allenw (33234)

      Actually, there is an ever increasing amount of JNI (read: C) code in Hadoop that is in the critical path for security and performance features. Most of that code is not very portable. So either MS is going to pay for some major overhauling of that code, completely new code/branch to replicate that functionality or MS Hadoop is going to be severely lacking in features/performance.

      • If that C code is well-written, it shouldn't be hard to port it over - the porting would have to be done at the Java/C boundary, and .NET actually has it much simpler thanks to P/Invoke.

        • by allenw (33234)

          It isn't. There is an incredible overuse of glibc/Linux-isms to the point that even porting it to another UNIX is difficult.

          • Well then, perhaps those guys will clean it up architecturally while they're porting it, and submit the changes upstream.

    • by Ed Avis (5917)
      Since it's written in Java, they can just run it on the .NET virtual machine using IKVM [ikvm.net].
  • I think MS getting involved with Open Source is great, but....

    We've seen the way that they work before, embrace and extend... This hasn't worked out that well for them before, but you have to ask if there is an alterior motive in there...

    • I am not sure what an alterior motive is, but I am quite sure that MS has an ulterior motive for this. The only question is whether or not that ulterior motive is detrimental to the Open Source community.
      • by Forbman (794277)

        Sell more Windows Server & SQL Server Enterprise/Data Center licenses?

        • by cayenne8 (626475)

          Sell more Windows Server & SQL Server Enterprise/Data Center licenses?

          What major data center (working with large volumes of critical data) in its right mind would ever even consider using MS SQL Server as its database? Who'd consider running their critical server for any database on a windows box?!?!

          Not in any major production player I've ever seen or worked at....

          • by Anonymous Coward

            NASDAQ uses SQL Server: http://www.computerworld.com/s/article/106050/Microsoft_unwraps_flagship_database_SQL_Server_2005

            DirectEdge, the 4th largest stock exchange, uses SQL Server:
            http://blogs.technet.com/b/dataplatforminsider/archive/2011/06/03/fourth-largest-us-stock-exchange-direct-edge-looks-to-sql-server-parallel-data-warehouse-for-big-data-needs.aspx

      • by jbolden (176878)

        I think they are telling the truth about their goals:

        1) Get Hadoop to work on Windows servers
        2) Create a Windows server management interface for Hadoop
        3) Create SQL Server extension to manage Hadoop.

        And the motive is:
        a) Sell server licenses
        b) Sell SQL Server licenses

      • by jimicus (737525)

        You would be amazed how many people go nuts over the latest F/OSS platform du jour... and then complain that it runs first and foremost under Linux.

        Even if they're never going to go anywhere near the underlying OS anyway, still that gets brought up.

        Windows Server licensing is quite lucrative for Microsoft. So if they can now announce "Hadoop: Now certified for Windows (TM) Server" they can sell more licenses for Windows Server.

      • I think it's to integrate a map/reduce structure into SQL server... I haven't RTFA, but that is about it... I wouldn't necessarily expect them to use Hadoop directly, but to support Hadoop's interfaces. My $0.02 on this. I know a lot of people are using MongoDB, and other document centric datastores lately, and MS is moving to compete in their tool space. More power to them, doesn't mean it'll be my first choice.
      • by Thing 1 (178996)

        I am not sure what an alterior motive is, but I am quite sure that MS has an ulterior motive for this.

        My ex-girlfriend was an ulterior decorator.

    • by bernywork (57298)

      > alterior

      bad spelling, ulterior. Sorry, my bad.

    • by jbolden (176878)

      They are quite publicly indicating their intention is to embrace and extend:

      1) Get Hadoop to work on Windows servers
      2) Create a Windows server management interface for Hadoop
      3) Create SQL Server extension to manage Hadoop.

      So we don't have to speculate, that's what they say they are doing. That being the case all that stuff might be useful for Hadoop.

    • This is a smart move by MS. Microsoft is not working internally on any sort of NoSQL server, so they support an existing project that complements their own product. The very obvious goal is to integrate Hadoop with SQL Server management tools. The upshot is that Hadoop gets a leg up on their competitors (Yahoo! PNUTS, Google BigTable) and Microsoft sells more SQL server licenses. Seems to me to be a win/win.
    • In the beginning I thought it was ironic. But it's very generous of them to provide a free meeting room to our open-source computer study group. And MSFT people attend, but don't speak often.
    • by allenw (33234)

      This isn't about Microsoft getting involved with open source. This is about Microsoft not getting left out. Beyond the countless startups, Apache Hadoop already has major players like Amazon, Dell, EMC, HP, IBM, NetApp, Oracle, VMware, ... trying to make a dent in the community in some form or another. Hell, I have a SuperMicro catalog on my desk emblazoned with the Apache Hadoop logo all over it. Like Oracle, they are coming in very late to the party and now need to play catch-up. Buying off Hortonwo

  • First they ignore you.
    Then they laugh at you.
    Then they port you on their platform.
    Then you win.

    Original: http://bit.ly/o3V3cA [bit.ly] [Google Books]
    • First they ignore you.
      Then they laugh at you.
      Then they port you on their platform.
      Then you win.

      First they ignore you.
      Then they laugh at you.
      Then they port you on their platform.
      Then they add some convenient feature that they only make available on their platform
      Then they win

      FTFY

      • by knuthin (2255242)
        That was another depressing possibility.

        Think I will go into a corner and cry for a minute now. :/
  • Heh (Score:5, Funny)

    by Hatta (162192) on Thursday October 13, 2011 @11:27AM (#37702156) Journal

    Someone should trick Timothy into reposting this article. Then he'd be duped into posting a dupe about hadoop.

    • by Gilmoure (18428)

      *golf clap*

    • And then I'd file a false complaint against it being a copyright violation in France.

      It would be a Hadopi dupe dupe Hadoop.

      But I digress...

  • by mrflash818 (226638) on Thursday October 13, 2011 @11:33AM (#37702236) Homepage Journal

    "Those that do not learn from history are doomed to repeat it."

    "Embrace, extend and extinguish,"[1] also known as "Embrace, extend and exterminate,"[2] is a phrase that the U.S. Department of Justice found[3] was used internally by Microsoft[4] to describe its strategy for entering product categories involving widely used standards, extending those standards with proprietary capabilities, and then using those differences to disadvantage its competitors.

    http://en.wikipedia.org/wiki/Embrace,_extend_and_extinguish [wikipedia.org]

    • Microsoft allowed the pair to continue their contributions to the open source project, and Powerset, which was rolled into Redmond’s Bing search engine, continued to run atop Hadoop.

      This made Bing one of the first “shipping” Microsoft products to actually include open source code. But somewhere along the way, Microsoft moved the engine onto a proprietary platform...

      "Microsoft allowed the pair" -- Here

      "which was rolled into Redmond's Bing search engine" -- It

      "But somewhere along the way, Mi

  • by tomzyk (158497) on Thursday October 13, 2011 @11:34AM (#37702248) Journal

    its relational database, SQL Server, and its platform cloud, Windows Azure, an online service for hosting and readily scaling applications

    That's wonderful that the summary mentions what "SQL Server" and "Azure" are... but why no mention of wtf "Hadoop" is?
    Why do I need to RTFA just to find out what we're talking about here?

    Hadoop — an open source platform for crunching epic amounts of data across an army of dirt-cheap servers

    • by slim (1652)

      The Wired article tells you what Hadoop is, because it's written by journalists, and aimed at a broad readership.

      The /. summary tells you what SQL Server and Azure are, because the descriptions arbitrarily happen to occur in the Wired paragraph that's been quoted.

      The /. summary does not tell you what Hadoop is, because (unlike Wired readers), /. readers are expected to have some basic knowledge of the software world. If you think Hadoop is obscure, maybe this isn't the site for you?

      • by Anonymous Coward

        I'm with the OP. Quoting an article is fine, even with descriptions one should know... but if an article is primarily about something that isn't common, then a nice little blurb about it would save AT LEAST TWO PEOPLE a wiki call. Hadoop isn't as ubiquitous as you believe. Coming from a heavy consulting background in coding, I've never seen this.

        Oddly, by your logic, /. readers don't know what SQL Server and Azure are. I'll put dollars to doughnuts that more people USE SQL Server than KNOW what Hadoop i

        • by slim (1652)

          Oddly, by your logic, /. readers don't know what SQL Server and Azure are.

          Nope. As I said, the definitions for those arbitrarily happen to be in the quote chosen for the summary. But those definitions aren't the pertinent part of the quote.

      • The /. summary tells you what SQL Server and Azure are, because the descriptions arbitrarily happen to occur in the Wired paragraph that's been quoted.

        So what you're saying is .... the person who wrote the summary doesn't know how to write a good summary or pick a good quote. :) (no, I'm not new here...)

    • Me too! I had to double click on Hadoop, then right click and choose 'Search Google for "Hadoop"' from the context menu. Then I had to switch tabs and scan for the wikipedia site, click on that, and then skim through the wiki for a few seconds just to figure out whether it was worth it to read anything beyond the summary.

      Sadly, this is happening more and more with Slashdot.
  • "the company plans to eventually release its work back to the open source community."

    That is a bit too vague... because 100 years from now is "eventually"

    • by jbolden (176878)

      If you look at their to-do list, they could release it constantly and it wouldn't matter. What they are doing is essentially creating extensions for their commercial products that work with Hadoop. I think they have every intention of trying to get the small parts that need to be in Hadoop back into the main tree.

  • by Utopia (149375)

    Microsoft has an equivalent to Hadoop known as Dryad.
    They should have open-sourced Dryad a long time ago.

    I wonder what is going to happen to Dryad with this focus on Hadoop.

    • Re:Dryad (Score:4, Informative)

      by mandelbr0t (1015855) on Thursday October 13, 2011 @01:07PM (#37703500) Journal

      Dryad is not quite Hadoop. From their whitepaper:

      We can map the whole relational algebra on top of Dryad, however Dryad is not a database engine: it does not include a query planner or optimizer; the system has no concept of data schemas or indices; and Dryad does not support transactions or logs

      I can see how Hadoop would supplement their own research in this field.

  • that's what i thought it said...... MS going all street fighter
  • The two cool parts of this announcement:

    1) They are contributing the bits needed to make it work on Windows back to open source (Hortonworks is helping make sure that goes smoothly)
    2) They are making JavaScript a first-tier language for writing map/reduce jobs, and contributing THAT work back to the community.

    That is awesome.

