
Did Microsoft Borrow GPL Code For a Windows 7 Utility?

Posted by Soulskill
from the free-as-in-gimme dept.
Goatbert writes "Rafael Rivera over at WithinWindows.com has found evidence that Microsoft has potentially stolen code from an open source/GPL'd project (ImageMaster) for a utility made available on the Microsoft Store to allow download customers to copy the Windows 7 setup files to a DVD or USB Flash Drive. If Rivera's evidence holds up, this could be some serious egg in the face for Microsoft at a time when they're getting mostly good press from the tech media."
This discussion has been archived. No new comments can be posted.

  • by BadAnalogyGuy (945258) <BadAnalogyGuy@gmail.com> on Saturday November 07, 2009 @12:30PM (#30014698)

    The code in question seems to be called into scrutiny because the two areas of code bear the same name (ReadBytes) and operate similarly.

    The longer you work in the development of software, the less magical it all becomes. The first time you plugged some code into a terminal and it worked, it seemed like an amazing amount of wizardry and behind-the-scenes stuff that you could never fully fathom. Compilers, binary code, arcane source languages, electronic signals. It's amazing to a neophyte just how much stuff is going on.

    But the longer you plug away at it, the more you realize that it's just code. Nothing special is really going on. You're mostly moving data from one area of memory to another. It's almost a form of Nirvana once you reach this point.

    So when someone comes along and says "OMG YOUR READBYTES METHOD IS JUST LIKE THIS ONE IN SOME GPL CODE!!!!11", it kind of pegs that person as someone who doesn't really have much experience with real programming. Sure, they may use a lot of tools, and know how to recompile their kernel, but they really don't have a firm grasp of what they are doing and why.

    • by caffeinemessiah (918089) on Saturday November 07, 2009 @12:38PM (#30014746) Journal

      The code in question seems to be called into scrutiny because the two areas of code bear the same name (ReadBytes) and operate similarly.

      (bold mine)

      Actually, if the function is just something called "ReadBytes(char *buf)" or similar, then that's a bit strange. If it was truly Microsoft-written, it would be:
      WINAPI DWORD ReadBytesW(LPCSTRWRAAXA szCharBufW_x, struct READBYTESINFO *srbinfArgs).

      • by jdkane (588293) on Saturday November 07, 2009 @12:45PM (#30014784)

        Except that a truly Microsoft-written ReadBytes method on the .NET Framework can be that simple, for example one int parameter http://msdn.microsoft.com/en-us/library/system.io.binaryreader.readbytes.aspx [microsoft.com]
        So I wouldn't even jump to conclusions based on the signature of the method in question as to who it might have come from.

        • by caffeinemessiah (918089) on Saturday November 07, 2009 @12:49PM (#30014806) Journal

          Except that a truly Microsoft-written ReadBytes method on the .NET Framework can be that simple, for example one int parameter http://msdn.microsoft.com/en-us/library/system.io.binaryreader.readbytes.aspx [microsoft.com]

          There's a difference between calling a method, where the object has internal state, and a C Win32 API function call, i.e., sans objects. I absolutely guarantee that you won't see many pretty signatures in the Win32 API. I'd bet that 99% of the Win32 API function SIGNATURES won't make it through a standards-compliant compiler without Windows.h. Anyway, my comment was supposed to be funny, but on second thought, it might actually deserve that informative mod.

          Don't even get me started on the dual-version ANSI and Unicode functions, although given the mess that the Win32 API is, it's probably an elegant solution.

          • Re: (Score:3, Informative)

            by VGPowerlord (621254)

            Except that a truly Microsoft-written ReadBytes method on the .NET Framework can be that simple, for example one int parameter http://msdn.microsoft.com/en-us/library/system.io.binaryreader.readbytes.aspx [microsoft.com]

            There's a difference between calling a method, where the object has internal state, and a C Win32 API function call, i.e., sans objects. I absolutely guarantee that you won't see many pretty signatures in the Win32 API. I'd bet that 99% of the Win32 API function SIGNATURES won't make it th

          • by petermgreen (876956) <plugwashNO@SPAMp10link.net> on Saturday November 07, 2009 @10:20PM (#30018624) Homepage

            Don't even get me started on the dual-version ANSI and Unicode functions,

            Transitioning from a system where strings were assumed to be in the local legacy encoding to a Unicode-based system (a transition all operating systems relevant today have had to go through) is a difficult problem with essentially no good solution.

            The approach Unix-like systems went for is to use UTF-8 and treat it as if it were just another legacy encoding. The problem with this approach is that systems configured for Unicode and systems configured for a legacy encoding use different encodings, which tends to break stuff.

            The approach Windows went for is to introduce duplicate APIs for Unicode. This has the advantage that nothing that worked before breaks, but it requires all apps that want Unicode support to be updated.

            Can you think of any better solutions?
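
            To make the "tends to break stuff" part concrete, here is a minimal Python sketch. It assumes Latin-1 as a stand-in for "the local legacy encoding", and the sample string and the helper name `try_decode_utf8` are made up for illustration. The same text yields different bytes under the two configurations, so a reader that assumes the wrong one gets either garbage or an error.

```python
# Illustration only: Latin-1 stands in for "the local legacy encoding".
TEXT = "caf\u00e9"  # "cafe" with an acute accent on the e

utf8_bytes = TEXT.encode("utf-8")      # bytes a UTF-8 configured system writes
legacy_bytes = TEXT.encode("latin-1")  # bytes a Latin-1 configured system writes

# A Latin-1 reader silently misreads UTF-8 data: every byte is a valid
# Latin-1 character, so the result is mojibake rather than an error.
misread = utf8_bytes.decode("latin-1")


def try_decode_utf8(data: bytes):
    """Return the decoded text, or None if data is not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    print(utf8_bytes)                     # b'caf\xc3\xa9'
    print(legacy_bytes)                   # b'caf\xe9'
    print(ascii(misread))                 # 'caf\xc3\xa9'
    print(try_decode_utf8(legacy_bytes))  # None
```

            The opposite direction fails loudly instead of silently: the legacy bytes are not valid UTF-8, so a strict UTF-8 reader rejects them.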

    • by ozmanjusri (601766) <[aussie_bob] [at] [hotmail.com]> on Saturday November 07, 2009 @01:21PM (#30015042) Journal
      The code in question seems to be called into scrutiny because the two areas of code bear the same name (ReadBytes) and operate similarly.

      The ReadBytes code was just one example

      If you read TFA (yeah, I know...) you'll see the author has updated that original example with others [withinwindows.com].

      It looks like Microsoft's defence will be that the EULA says "You may not reverse engineer, decompile or disassemble the software". They'll probably charge the guy with a DMCA violation...

      • by X0563511 (793323)

        Which is great, because they themselves offer debug symbols and checked builds.

        One hand giveth, the other taketh away.

      • by kjart (941720) on Saturday November 07, 2009 @03:46PM (#30016300)

        If you read TFA (yeah, I know...) you'll see the author has updated that original example with others [withinwindows.com].

        OP clearly did read TFA since he was criticizing the specifics provided. I'm not sure why you're taking a shot at that since the update was clearly made after the comment was posted.

        It looks like Microsoft's defence will be that the EULA says "You may not reverse engineer, decompile or disassemble the software". They'll probably charge the guy with a DMCA violation...

        Why does it look like that exactly? Are you getting this from anywhere or just pulling it out of your ass?

      • by woolio (927141)

        It looks like Microsoft's defence will be that the EULA says "You may not reverse engineer, decompile or disassemble the software". They'll probably charge the guy with a DMCA violation...

        Legally speaking, what does it mean to disassemble a program? Is it to convert its machine representation into a more readable format? Every processor in every computer does this, it just disassembles to a language that is not composed of English words and numbers.

        If someone owns Visual Studio and another program o

    • Re: (Score:3, Insightful)

      by eggnoglatte (1047660)

      Especially in this case: the example code is part of a parser. Guess what, when you write a parser for any kind of file format without using a formal grammar tool, the most natural way to do it is read the individual components in the order the file format presents them. So the structure of the code is very much determined by the file format, and you have to expect a lot of similarities between different implementations, even if they were done completely independent of each other. Constants like the value 4

      • Re: (Score:3, Insightful)

        by The MAZZTer (911996)
        Why don't you check it yourself and find out instead of calling someone a liar with no evidence to back your claim? I mean really, if what you're saying is true, it should be trivial to produce the mismatched code samples, right?
        • by eggnoglatte (1047660) on Saturday November 07, 2009 @06:22PM (#30017240)

          a) I am not the one making wild claims about somebody; the author of TFA is. I mean I get it - we don't like MS here on /., blah, blah. Still, if he makes claims, the burden of proof is on him, not on everybody else to disprove him.

          b) I don't run windows, so getting everything set up with a .NET architecture would in fact be quite a pain in the ass.

    • Re: (Score:3, Insightful)

      by ceoyoyo (59147)

      His new example is pretty weak too. It's another function to read some sort of header, and, surprise, the code operates in a similar way. Well, it pretty much has to... it's reading the same kind of header.

      So far it's all pretty poor evidence.

    • by noidentity (188756) on Saturday November 07, 2009 @03:40PM (#30016272)
      I've just written my first program, and I licensed it under the GPL. Guess what? A bunch of people have already ripped me off! So I can understand this guy's situation. Here's the source, BTW:

      #include <stdio.h>

      int main()
      {
          printf( "Hello, world!\n" );
          return 0;
      }

  • Knee jerk (Score:5, Insightful)

    by Romancer (19668) <romancer&deathsdoor,com> on Saturday November 07, 2009 @12:31PM (#30014706) Journal

    So the evidence is a ReadBytes snippet?

    I'll wait till there's evidence before even commenting about the ramifications of something like this. This is just wild speculation at this point.

    • Re: (Score:3, Funny)

      by Blakey Rat (99501)

      But this is Slashdot!

      Without wild speculation there wouldn't hardly be any stories at all! And of course you have to get the 2-minutes hate for Microsoft going early.

    • Re:Knee jerk (Score:5, Insightful)

      by Timothy Brownawell (627747) <tbrownaw@prjek.net> on Saturday November 07, 2009 @01:17PM (#30015012) Homepage Journal

      Moderated 'Flamebait.' 0 points left.

      Seriously, whoever decided that we just get one dropdown and no 'confirm' button needs to be taken out back and shot. And I'd just used my other points on some actual trolls upthread, too. :(

    • by Dr. Evil (3501) on Saturday November 07, 2009 @01:19PM (#30015026)

      Oh no. Evidence is not required in this case. This failure to comply with the GPL means that Microsoft is governed by Copyright law in this matter.

      Their Internet service provider must be notified so that their Internet connection can be terminated.

    • Re:Knee jerk (Score:4, Interesting)

      by megabunny (710331) on Saturday November 07, 2009 @01:46PM (#30015308)

      The new example is much clearer. Basic structure follows well. All the magic numbers in the code that I looked at matched too, and there are quite a few. Looks like it was massaged at least a bit, probably just to fit in with the local code environment, not to obscure it.

      But ...
      The article points out only two weaknesses in this code borrowing. MS did not feed back any (unknown at this point) enhancements to the source. And they did not offer the source under the right license.

      It is a real but very minor issue. If it wasn't MS it would not even be interesting.

      MB

  • by Tankko (911999) on Saturday November 07, 2009 @12:55PM (#30014856)

    Come on people, you can't have it both ways. If you can't "steal" music, you can't "steal" code. MS "stealing" this code didn't deprive the Open Source community of the use of the code (unlike, say, stealing my car), or at least that's the argument /.ers use whenever the word is used in conjunction with music and movies. Eat your own dog food.

    • by liquiddark (719647) on Saturday November 07, 2009 @01:12PM (#30014978)

      I don't think everyone here believes you can't steal music, first off. I believe you can steal music, books, printed art, all kinds of artwork. I come from a fairly serious artist background, and I know folks personally who have been scraping by for years on the meagre earnings of an average artist. It's not a fun life.

      I believe large record syndicates are creeptastic and digital media is equation-changing, but that doesn't mean there's no evil in stealing non-physical works. Artists, unless they happen to be the pretty-close-to-literally one in a million shot, make almost nothing and they make a huge difference in how livable a society is. That's not changed by the fact that they can deliver media via digital channels; only people's expectations of the cost involved are changed. The number of consumers shrinks, but so does their expected price point. It's one of the reasons why there are still a lot of physical-media artists (the others include that nobody's come up with good, cheap three-dimensional sound, graphics, or texture delivery systems, that physical media still work in some contexts, and that art is largely a physical act).

      And if you can steal art, you can certainly steal code. Of course, in this case it's probably going to have no repercussions because you'd have to educate people on the struggle of open source in terms that wouldn't make a lawyer cry before you could really even get into it, but those of us who've self-selected have at least a notion of the violation and its meaning. And, happily, the irony - if MS really is using open source in its first "better" product in a long time, that's a fun little fact to know.

      • I don't think everyone here believes you can't steal music, first off.

        Speak for yourself. I do believe you can't steal music.

        You could steal the original copies. You could steal a famous painting. But "stealing" music? For instance, what IS music? It's nothing but a mathematical concept involving harmonics and sound.

        What are words? You can't "steal" what I said. This isn't like the little mermaid where you could steal someone's voice and leave him/her mute.

        Non-physical works CANNOT be stolen. Unless you're talking about a PHYSICAL COPY, you cannot steal it by definition. Copying a work? That's completely different. But if it's a non-destructive process, you're not stealing it. You're just COPYING it.

        If you want to use an appropriate term for what Microsoft supposedly did with this GPL code, it's called plagiarism [wikipedia.org]. Sure, it's called "stealing" nowadays, but using this word is oversimplifying.

    • Glad to see that you haven't even read what people are saying but rather making broad assumptions based on a very slim amount of "evidence."

      If Microsoft did steal the code, then they should be punished. However, there really is no good evidence that they did indeed steal. Just because things are similar doesn't mean that one was stolen from another.

    • No, as the headline states, they borrowed it. And they promised to return it when they are done.

    • by khallow (566160)

      If you can't "steal" music, you can't "steal" code.

      A quick question here. Would I be dragged into court and fined thousands of dollars because my child or a buddy stole open source licensed code using my machine? In other words, it's unlikely that people will have their lives turned upside down by stealing code (assuming generously that they're doing something where that is even possible). The war on "stolen" music is something that can affect the typical slashdotter even if they had no part in the theft. I suppose you could say that this is a shining examp

    • I'm sure there are many people who hold both to be true. However, there are also people on Slashdot that realise that the two positions are to some extent, exclusive of one another.

    • by Junior J. Junior III (192702) on Saturday November 07, 2009 @01:24PM (#30015074) Homepage

      This is the correct argument, but you have it backward. If it's OK for MS to "steal" (by the definition that MS accepts for the word) then MS should allow people to "steal" Windows, and stop complaining about, trying to stop, and prosecuting software piracy. They should amend their EULA to allow users to decompile, reverse engineer, and modify their binaries.

      Besides, it's not as though GPL code is anti-copyright.

  • I, for one, welcome our newest open source project to the community - Windows 7.

  • by Ironsides (739422) on Saturday November 07, 2009 @01:10PM (#30014954) Homepage Journal
    Seriously, what he shows to be evidence looks like code that was written straight from reading the ISO disk image specification. Next up, school math class accused of mass cheating for solving math problems in similar ways.
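
    For what it's worth, here is a minimal Python sketch of what "straight from the specification" looks like for an ISO 9660 primary volume descriptor. It is not code from ImageMaster, 7-Zip, or the Microsoft tool; the function names (`parse_primary_volume_descriptor`, `make_sample_pvd`) and the sample volume label are made up for illustration. The field offsets and the "CD001" magic come from the format itself, which is why independent implementations tend to read the same bytes in the same order.

```python
import struct

SECTOR_SIZE = 2048  # ISO 9660 logical sector size in bytes
PVD_SECTOR = 16     # volume descriptors start at sector 16 of the image


def parse_primary_volume_descriptor(sector: bytes) -> dict:
    """Parse the fixed-offset fields of an ISO 9660 primary volume descriptor."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("short sector")
    if sector[1:6] != b"CD001":   # standard identifier
        raise ValueError("not an ISO 9660 volume descriptor")
    if sector[0] != 1:            # type code 1 = primary volume descriptor
        raise ValueError("not a primary volume descriptor")
    # Numeric fields are stored twice: little-endian first, then big-endian.
    space_le = struct.unpack_from("<I", sector, 80)[0]
    space_be = struct.unpack_from(">I", sector, 84)[0]
    block_le = struct.unpack_from("<H", sector, 128)[0]
    block_be = struct.unpack_from(">H", sector, 130)[0]
    if space_le != space_be or block_le != block_be:
        raise ValueError("both-endian fields disagree")
    return {
        "version": sector[6],
        "volume_id": sector[40:72].decode("ascii").rstrip(),
        "volume_space_size": space_le,    # size in logical blocks
        "logical_block_size": block_le,   # bytes per logical block
    }


def make_sample_pvd(volume_id: str, blocks: int, block_size: int = 2048) -> bytes:
    """Build a fake descriptor sector so the parser can be tried without a disc image."""
    sector = bytearray(SECTOR_SIZE)
    sector[0] = 1
    sector[1:6] = b"CD001"
    sector[6] = 1
    sector[40:72] = volume_id.encode("ascii")[:32].ljust(32)  # space padded
    struct.pack_into("<I", sector, 80, blocks)
    struct.pack_into(">I", sector, 84, blocks)
    struct.pack_into("<H", sector, 128, block_size)
    struct.pack_into(">H", sector, 130, block_size)
    return bytes(sector)


if __name__ == "__main__":
    print(parse_primary_volume_descriptor(make_sample_pvd("SAMPLE_DVD", 1500)))
```

    On a real image you would seek to `PVD_SECTOR * SECTOR_SIZE` and read one sector first. Anyone writing this from the spec ends up with the same offsets (40, 80, 128) and the same magic string, whatever they name the function.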
  • by Locke2005 (849178) on Saturday November 07, 2009 @01:11PM (#30014968)
    I've written subroutines called "ReadByte" several times, so obviously both the Microsoft code and the GPL code are in violation of my company's copyright! (BTW, if the ReadBytes routine doesn't have a buffer size parameter and return the actual number of bytes read, it is bad code.)
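
    A minimal sketch of the kind of routine meant in that parenthetical, in Python rather than C or C#. The name `read_bytes` is hypothetical and this is not the code from either project; the caller supplies a fixed-size buffer, and the return value is the number of bytes actually read, which may be less than the buffer size at end of file.

```python
import io


def read_bytes(stream, buf: bytearray) -> int:
    """Fill buf from stream; return how many bytes were actually read."""
    view = memoryview(buf)
    total = 0
    while total < len(buf):
        n = stream.readinto(view[total:])
        if not n:  # 0 means end of stream; None means no data available right now
            break
        total += n
    return total


if __name__ == "__main__":
    stream = io.BytesIO(b"abcdef")
    buf = bytearray(4)
    print(read_bytes(stream, buf), bytes(buf))  # 4 b'abcd'
    print(read_bytes(stream, buf))              # 2 (only b'ef' was left)
```

    The caller has to check the returned count: after the second call only the first two bytes of the buffer are fresh, and the rest is left over from the previous read.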
  • Even if true, surely Microsoft would just need to perform minor corrective action (replace the code promptly and discipline or fire those responsible for inserting the stolen code). The software isn't a significant part of the system. Nor does it seem to be a difficult bit of code. So you can't really claim that Microsoft is making boatloads off of or even just saving money by stealing the code. And I think MS probably could make a good argument for saying that either they had a rogue developer or someone m
  • by Manfre (631065) on Saturday November 07, 2009 @01:18PM (#30015018) Homepage Journal

    With the amount of "evidence" in the article, the same accusation could be made against the GPL project. Perhaps the author of that project illegally gained access to Microsoft code and used it as a starting point for ImageMaster.

  • by Anonymous Coward on Saturday November 07, 2009 @02:38PM (#30015770)

    Element109 wrote on: http://social.msdn.microsoft.com/Forums/en-US/windowsopticalplatform/thread/421f3137-c9aa-45fb-8c5a-ec5dd6860036

    The iso and udf parsing portions were ported from the 7-zip project. The credits.txt file contains all the sources used in creating my project.

    7z
    by Igor Pavlov
    7-Zip is a file archiver with a high compression ratio.
    http://www.7-zip.org

    There are links to his source on his homepage. 7-zip is hosted on the SourceForge website.

    If you check out my initial upload there is a file in the reader directory that is a very early stage of the initial udf port. I had excluded it from the VS environment and forgot about it. It is the file I deleted in the latest changeset.

  • GPL Quiz (Score:3, Insightful)

    by giminy (94188) on Saturday November 07, 2009 @04:34PM (#30016564) Homepage Journal

    It's that time again. Before anyone comments on GPL lifting, please take the GPL quiz:

    The GPL Quiz [gnu.org]

    Anyone who gets a perfect score may comment in this thread, all others please keep uninformed conclusions out.

  • by BitZtream (692029) on Saturday November 07, 2009 @04:48PM (#30016642)

    Look, when you take two functions that do essentially the same thing and compile them to optimized code, there is a good chance, if the compiler is doing its job, that the compiled byte code looks a lot alike. This code HAS to act the same; it's reading the same data format. It's no surprise that when you decompile different versions they look alike. I would be concerned if they didn't.

    Go ahead and decompile it, but you aren't seeing the original source; you're seeing a decompiler's version of the optimized code.

    I could probably write that function 100 different ways in one day and get the exact same thing out after compiling it to optimized byte code and then decompiling it. It's a rather specific process at that point for dealing with a standard. You almost HAVE to do things in that function that way in order for your code to actually work. There are a few changes that could be made, some branches could be done in different orders, but once you throw the optimizer at it, those branches are likely going to be reordered the same way to reuse registers and such rather than wasting extra ones.

    The author of the article is a newbie at best. It's fairly clear that he doesn't actually understand what has happened in this process and has provided no evidence other than 'the end result looks the same!'. It could have gone both ways; neither project was the first to write a UDF reader. My guess would be the first C# UDF code was actually a port of some C code to do it anyway.

    Finally, if you read the comments section of the article, the ImageMaster credits.txt contains a link to MS source. While I haven't bothered to download the linked SDK, it's a safe bet that the reason the code looks the same is because it probably is: ImageMaster PROBABLY pulled that function from an MS example. It happens ALL THE TIME.

    There is no MS conspiracy, just some douche bag blogger wanting to get posted on the front page of slashdot to increase his ad revenue.

    The proper thing to do is to remove this story from the front page to deny that traffic to him.
