
Microsoft Exploit Predictions Right 40% of Time

Posted by timothy
from the statistics-94pct-nonsense dept.
CWmike writes "Microsoft today called its first month of predicting whether hackers will create exploit code for its bugs a success — even though the company got its forecast right just 40% of the time for October. 'I think we did really well,' said Mike Reavey, group manager at the Microsoft Security Response Center (MSRC), when asked for a postmortem evaluation of the first cycle of the team's Exploitability Index. 'Four of the [nine] issues where we said consistent exploit code was likely did have exploit code appear over the first two weeks. And another key was that in no case did we rate something too low.' Microsoft's Exploitability Index was introduced last month."
This discussion has been archived. No new comments can be posted.
  • Congratulations? (Score:3, Insightful)

    by Smidge204 (605297) on Friday November 14, 2008 @07:00AM (#25759027) Journal

    That's great, guys, but don't you think being proud that you were right about your code being exploited is... backwards? That's like being proud you correctly predicted you would get stabbed while walking through a ghetto wearing gang colors.

    Then again, this is Microsoft. They probably throw an office party every time something compiles without errors.
    =Smidge=

    • Re: (Score:3, Interesting)

      by David Gerard (12369)

      Indeed. I swear, I called it: it's easier to predict the holes when you release them yourself [today.com].

      After what was expected to be an unusually quiet Patch Tuesday, Microsoft has released eight patches for applications with an insufficient number of security holes. "Our market is the enterprise," said Microsoft security marketer Jonathan Ness. "Information technology professionals know that Windows is the greatest IT job creation scheme in history. Without Patch Tuesday, there's no reason for the experienced IT

    • Re: (Score:2, Interesting)

      That's great, guys, but don't you think being proud that you were right about your code being exploited is... backwards?

      Well, they're not proud of making exploitable code (if they were, there would have been a giant endless party at Microsoft for the last 20 years), they're proud of predicting when/how fast their code will be exploited.

      That's like being proud you correctly predicted you would get stabbed while walking through a ghetto wearing gang colors.

      No, it's like correctly predicting that you'll get st

      • Re: (Score:2, Insightful)

        by TheCycoONE (913189)

        No, it's like correctly predicting that you'll get stabbed 17 minutes after entering the ghetto, by 6 gang members dressed in red.

        Not at all. It's much more like guessing that you will be stabbed 6.8 minutes after entering a ghetto by 8-9 gang members dressed in red, then actually being stabbed after 17 minutes by 6 gang members wearing pink.

    • I can think of a few ways they can get that number up. Of course, none of them would be good for the consumer. But when has Microsoft put the consumer above having numbers that it can tout?
    • by Sockatume (732728)
      If you're sailing in a yacht made of cake with sails of tissue paper, with pegs for both legs and hooks for both hands, it's useful to know where the leaks in your boat are.
    • by iammani (1392285) on Friday November 14, 2008 @07:47AM (#25759271)

      The Slashdot crowd *loves* MSFT bashing, doesn't it?

      Ok, let's see... Some company (say Canonical or MSFT) builds a huge piece of software and releases it. A third party finds a bug and reports it to them. Now, wouldn't it be good to predict the severity of the bug, so that the more exploitable ones can be fixed first? That's exactly what they are doing, and they got the severity right 40% of the time, with no false negatives (not a single severe issue was classified as low priority).

      So, now, do you think this is bad or wrong or something?

      • by MrMr (219533) on Friday November 14, 2008 @08:11AM (#25759409)
        They build enough security holes in their applications to do meaningful statistics on the monthly number of exploits in the wild.
        So, now, do you think that that is not a reason for criticism on their internal software testing?
        • Re: (Score:3, Insightful)

          by mobby_6kl (668092)

          No, the criticism of either their coding practices or QA has nothing to do with a new and fairly efficient way to prioritize bug fixes. They already have the software with all the holes built in. Now they should deal with what they have in the best way possible, don't you agree?

          • by MrMr (219533)
            You mean that Microsoft is too small to maintain their own code?
            Who cares about prioritization? If they have a monthly batch of fresh flaws and they don't fix at least as many within a month, they are fighting a losing battle anyway.
        • by RichiH (749257)
          I don't like MS either, but you are doing a wonderful job of sidestepping what the GP said, which is pretty much exactly what he complained about in the first place.
      • by argStyopa (232550)

        Yes, this IS bad or wrong or something.

        Wouldn't it make MORE sense to perhaps spend the human/technical resources FIXING the most exploitable bugs rather than standing around with a beer in hand saying 'yep, that's going to explode for sure'.

        *BOOM*

        'See? I told you so.'

        • Re: (Score:2, Informative)

          by iammani (1392285)

          Wouldn't it make MORE sense to perhaps spend the human/technical resources FIXING the most exploitable bugs rather than standing around with a beer in hand saying 'yep, that's going to explode for sure'.

          Yes, it indeed would, and that's exactly what they have done; the story is about the review of that practice at the end of the month (read: a review of what became an exploit and what got fixed at the right time).

        • by LordKronos (470910) on Friday November 14, 2008 @09:04AM (#25759755) Homepage

          Sure, if you have unlimited resources and can devote an infinite number of people to fixing everything, that would be great. However, if you have finite resources available and have to devote them to fixing up certain areas, how do you know where to devote your attention? If you can come up with a methodology for predicting such a thing, put it to the test, and get decent accuracy in your predictions, then wouldn't that be useful for confirming for you how you should devote your limited resources?

          There is nothing unique in what they are doing. I mean, look at the auto industry, for example. They don't just randomly assign engineers to try and make random things safer. They do studies, try to figure out what are the most dangerous aspects of a vehicle, and then assign engineers to work on those specific things.

          Fortunately for the auto industry, it's a little easier to do your predictions pre-release, since the "attack vectors" are more limited and well known (there are typically only so many ways you can get into an accident, so it's easier to model a majority of those cases). This allows them to be proactive in fixing flaws. Unfortunately, the attacks vectors in software are a bit more numerous, and you often have to take a more reactive approach. What Microsoft is doing here is trying to model things to see how reasonable it would be to devote resources in certain ways to be proactive.

          So again, in what way is this bad?

          • by Jaktar (975138)
            If I had mod points you'd have them, but next time don't go so in-depth with the auto industry. You may have gone over some people's heads by not explicitly calling them 'car makers'.
      • by sjames (1099) on Friday November 14, 2008 @09:24AM (#25759921) Homepage

        Based on their success rate, they should flip a coin instead, then they'll be at 50%. That's what everyone's laughing at.

        • Re: (Score:2, Informative)

          by gazbo (517111)
          Statistics. You fail it hard.
        • Re:Congratulations? (Score:4, Informative)

          by PJ1216 (1063738) * on Friday November 14, 2008 @10:30AM (#25760585)
          If you actually want a correct coin analogy, it's that every time they called heads (heads = bug will be exploited), it showed up heads 40% of the time. Every time they called tails (tails = bug won't be exploited), it showed up tails 100% of the time. Now, since there were 18 coin flips (bugs), they were right 13 times (4/9 were correctly called as heads, 9/9 were correctly called as tails). That's 13/18, roughly a 72% success rate.

          I don't understand how the article got the math completely wrong or how people aren't seeing the extremely obvious flaw in the math.
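The arithmetic above is easy to check with a short sketch (the 9/4 and 9/9 counts come from the thread; the "exploit"/"safe" labels are just illustrative):

```python
# Per-call accuracy using the thread's numbers: 9 bugs called
# "will be exploited" (4 actually were), 9 called "won't be"
# (none were). Each call is (label, was_the_call_correct).
calls = ([("exploit", True)] * 4     # called exploitable, exploited
         + [("exploit", False)] * 5  # called exploitable, not exploited
         + [("safe", True)] * 9)     # called safe, stayed safe

correct = sum(1 for _, right in calls if right)
print(f"{correct}/{len(calls)} = {correct / len(calls):.0%}")  # 13/18 = 72%
```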
        • Re: (Score:3, Informative)

          Actually, they'd have to flip a coin for every bug – and their current statistic, "40% of the bugs we identified as exploitable were exploited", would probably look great compared to the percentage they'd get by flipping a coin.

          Basically, you're looking at this wrong. Microsoft correctly predicted 40% of the exploitable bugs, but they also correctly predicted the non-exploitable ones which wouldn't be exploited.

          Suppose (and I don't have actual numbers, so I'll make up hypothetical ones) Microsoft find

      • And a third party finds a bug and reports it to them. Now would be good to predict the severity of the bug, so that the more exploitable ones can be fixed first?

        You mean, unlike the press, which tends to overblow every report of a vulnerability in Linux and/or Firefox, even though in reality the "vulnerability" only works in a few very rare cases requiring a complex mix of conditions, plus a very gullible and cooperative user who will go through a long process in order to reach the point ...

        Thats exactly what they are doing, and they are able to get the severity 40% of the time right, with no false negatives (that not a single severe one has been classified as a low priority one).
        So, now, do you think this is bad or wrong or something?

        It wouldn't be bad, if it weren't for Microsoft's software quality being so bad that simply calling every bug critical would be a 100% sensitive, 99% specific test.
        Today's news almost

      • There's a term for this, it's called "quality control". It used to be performed *before* distributing a product to market. The term for evaluating quality after distribution is called "damage control", and this software is akin to a nurse performing triage on patients the hospital injured.
      • So, now, do you think this is bad?

        I think it's pretty bad that without thinking, just by flipping coins, you can do better than them: http://tech.slashdot.org/comments.pl?sid=1029297&cid=25763435 [slashdot.org]

      • by Foofoobar (318279)
        Hmmm.... my vote is for 'or something'.

        Obviously you cannot separate your passionate love for Steve Ballmer's trouser snake from the need for good engineering. Should 40% be acceptable for detecting parachute defects? How about life raft defects? Is this a severe comparison? Well, when healthcare systems and life support systems and financial systems and even an INTERNATIONAL SPACE STATION run on their software, you'd think 99.9% would be the only goal they would be happy with and anything short of that wou
    • by sorak (246725)

      In fairness to MSFT, it could have some useful applications in prioritizing. History has shown us that a software company obviously can't fix every bug, so a more efficient way of knowing how many person-hours to sling in which direction may prove useful.

    • That's great, guys, but don't you think being proud that you were right about your code being exploited is... backwards?

      I think it's even worse that they're proud that they're right about their code being exploited when they did worse than chance. They were more wrong than right, and they claim they did well.

      FTFA (edited but staying true to the point):

      Of the nine October vulnerabilities marked "Consistent exploit code likely," four did end up with exploit code available. None of the nine tagged "Inconsistent exploit code likely" had seen actual attack code. Microsoft correctly called the four bugs last month tagged with "Functioning exploit code unlikely."

      So they got eight right, out of twenty-two: 8/22. If we give them one third credit for the maybe-exploits not having full-blown exploits, they're still only at 11 out of 22.

      Here's my security advice: if you want to know whether a bug will be exploited, flip a coin.

      I'm better
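The split score in the comment above can be reproduced in a few lines (a sketch using the 9/9/4 counts quoted from the article; the one-third-credit rule is the commenter's, not Microsoft's):

```python
# Commenter's scoring of the 22 October bugs: 9 rated "consistent
# exploit code likely" (4 correct), 9 rated "inconsistent" (scored
# as wrong here), 4 rated "unlikely" (all correct).
consistent_hits, inconsistent, unlikely_hits = 4, 9, 4
total = 22

strict = (consistent_hits + unlikely_hits) / total
partial = (consistent_hits + inconsistent / 3 + unlikely_hits) / total
print(f"strict: {strict:.0%}, with one-third credit: {partial:.0%}")
# strict: 36%, with one-third credit: 50%
```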

  • That's not too bad (Score:5, Insightful)

    by 91degrees (207121) on Friday November 14, 2008 @07:00AM (#25759031) Journal
    A little heavy on the false positives, but no false negatives, so it allowed more efficient targeting of the risk areas. Also good enough to provide useful feedback.
    • Call us when they rated something TOO HIGH, or OVERESTIMATED the number of exploits, not the other way around.

      (boggle)

  • And another key was that in no case did we rate something too low

    Well, that's like saying, after you block all your email from getting through, "We rated all the spam accordingly and let none of it through."

    How about, we just guess, a rough fucking guess, that any "remote code execution" or "run with elevated privileges" exploit or hell ANY GOD DAMN FUCKING BUG YOU FIND, needs fixing, right Microsoft?

    • Re: (Score:3, Insightful)

      by c_forq (924234)
      Wow, have some anger issues there? This isn't about not fixing bugs, this is about prioritizing bug fixes. Anything this large is going to have massive amounts of bugs (I can't count the times I've updated packages in Ubuntu, and the OS-X bug fixes come by the hundreds per .x release). Microsoft, just like Apple and Canonical, has limited resources to fix said bugs (and actually Apple and Canonical get some free work done for them, due to use of open source packages).
    • Re: (Score:3, Insightful)

      or hell ANY GOD DAMN FUCKING BUG YOU FIND, needs fixing, right Microsoft?

      Any goddamn bug doesn't need fixing asap the same way. Software always has bugs, even really good software, so it's a matter of prioritizing which bugs are show-stoppers, which are less problematic and which are minor.

      The problem with Microsoft is their habit of releasing bananaware: they ship green software that matures at the customers, at the expense of the customer of course who essentially pays to become a beta-tester for Microsof

      • Re: (Score:3, Interesting)

        by Khuffie (818093)
        In other terms, when other reputable software shops iron out most bugs in-house before releasing their products, Microsoft just removes show-stoppers and let its customers report all the other bugs.

        You mean, like Apple's Leopard release? Or Apple's iPhone 3G release? Or Apple's mobileme release?

        I fail to see how Microsoft has a reputation of releasing 'bananaware' whereas Apple doesn't. I don't recall hearing about major, crippling bugs when Office 2007 came out (one of their biggest apps), and rega
        • by azrider (918631)

          Vista was actually a solid enough release and most of the issues were due to bad drivers that manufacturers didn't bother updating a year beforehand when they had betas and release candidates.

          No, the problem was that MS changed the underlying layers between the betas, RC's and the RTM. Since that was happening, the manufacturers held off until they had a stable platform to shoot at.

          • by Khuffie (818093)
            No, the problem was that MS changed the underlying layers between the betas, RC's and the RTM. Since that was happening, the manufacturers held off until they had a stable platform to shoot at.

            Really? Care to cite proof of them changing the underlying layers between the RC and RTM? Or explain how some manufacturers were able to get drivers working for Vista properly before RTM?
            • by Blakey Rat (99501)

              Forget it, this is one of those Slashdot bullshit claims you see around here all the time which never has any kind of supporting evidence. One of the other ones is how DRM in Vista "slows down your computer."

              It's trouble enough to get vendors to support their products *at purchase*, and yet people have problems believing that they don't support products for new OSes that come out after the purchase? I have no idea where people get that idea.

              I bought a USB wifi card that didn't support Vista out-of-the-box..

      • by Blakey Rat (99501)

        The problem with Microsoft is their habit of releasing bananaware: they ship green software that matures at the customers, at the expense of the customer of course who essentially pays to become a beta-tester for Microsoft.

        I find it funny that that's a "problem" with Microsoft, and a stated goal of open source software. (Release early, release often.) Frankly, I'd be ecstatic if some open source projects followed Microsoft's example and removed the show-stopping bugs before release.

  • Any engineer who says that "40% is pretty good predicting" is incapable of writing good software, or managing a project, or, even, applying the scientific method.

    Hint: 40% is worse than guessing.

    • by gbjbaanb (229885)

      Dear MS. I have a foolproof way of enhancing and improving upon your algorithms to determine the exploitability index.

      if it comes up heads, it's exploitable. Tails, it's gonna be ok.

      I estimate you will increase your predictive capabilities by a whole 10% using this method.

    • by Anonymous Coward on Friday November 14, 2008 @07:08AM (#25759077)

      No, it means that they were able to cut the field of their immediate focus nearly in half while not missing any issues. For such a complex system without any precise mathematical model, that's pretty good.

      In this case, flipping a coin is statistically likely to let an unaddressed issue through, and that's a big no-no for applications like this.

    • by rugatero (1292060) on Friday November 14, 2008 @07:20AM (#25759131)

      Hint: 40% is worse than guessing.

      No - from TFA:

      The index, launched last month, rates each vulnerability using a three-step system.

      Random guesses would be expected to yield 33% success.
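That 33% baseline is just uniform guessing over the three ratings; a quick simulation (a sketch, not from the article; rating names are paraphrased from the thread) bears it out:

```python
import random

# Guess one of the three Exploitability Index ratings uniformly at
# random and score it against an equally arbitrary "true" rating;
# the match rate converges to 1/3.
random.seed(0)
ratings = ["consistent likely", "inconsistent likely", "unlikely"]
trials = 100_000
hits = sum(random.choice(ratings) == random.choice(ratings)
           for _ in range(trials))
print(f"{hits / trials:.1%}")  # roughly 33.3%
```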

    • Re: (Score:3, Informative)

      by mdmkolbe (944892)

      40% is worse than guessing only if you have only two choices (e.g. heads or tails). If you have more choices it is a bit better than guessing.

      MS was predicting not just whether exploits would appear but the kinds of exploits that will appear. Depending on how specific (e.g. there will be a buffer overrun in module XYZ) or general (e.g. there will be an exploit in Windows *somewhere*) they were about the kinds of exploits, 40% could be either pretty good (i.e. they were insightful) or pretty bad (i.e. t

      • Granted, they're doing better than guessing... but in reality, I only care that they get it right on the risks that count. They could be 1 for 10, if the harm that the single exploit would cause was more than the sum of the other 9, and be doing decent.

        For instance, if they patched the priv. escalation to SYSTEM that has a broad surface area (think, say, remote IIS exploit) over 9 exploits that require physical access and can only get guest access. If someone else has physical access to your box, it's n
    • by abigsmurf (919188) on Friday November 14, 2008 @07:23AM (#25759145)
      No it isn't. Unless of course you assume that for every bug hackers flip a coin and go "heads, I'll write an exploit for this".

      40% accuracy in prediction with no false negatives? There are plenty of disaster agencies around the world that would be incredibly pleased with that kind of accuracy.

      • the new bar (Score:2, Interesting)

        by mevets (322601)

        Microsoft Security Research Centre is a success as a disaster agency? A bit harsh, but I suppose so...

    • by Raynor (925006)

      Actually 40% is quite good considering, as others have mentioned, that 33% would be the random chance.

      it is also worth noting that they have 40% prediction of KNOWN threats.

      I would bet there are about as many undiscovered exploits re: these updates, which could drive up or down the percentage.

      If I can predict the stock market by +7% over random guessing, that is pretty damn good predicting.

    • Hint: 40% is worse than guessing.

      I'm assuming you meant "worse than flipping a coin". But this was not a heads/tails judgment; it was "for this given defect, is it Highly Likely, Somewhat Likely, or Not Very Likely that it will be exploited?"

    • "This month, we're going to predict whether evil hackers will exploit bugs in our code. What do you predict?"

      Steve Ballmer: "No."
      James Allchin: "Yes."
      Mike Reavey: "Yes."
      Jim Gries: "No, I fixed all the bugs."
      Sarah Ford: "I dunno. I'd say no; I'm confident in Microsoft."
      Val Mazur: "No."
      Rui Chen: "Well, the possibility is there, but they'll never prove that they did, so it's the same as no."
      Kathleen Dollard: "Of course I will! er --I mean, THEY will. Yes."
      Michel Fournier: "How am I supposed to know? How m

    • My understanding is that they (implicitly) assessed thousands of potential exploits. Of these, thousands minus 20 were classified as safe and 20 as dangerous. All guesses from the "safe" category were correct, and 8 out of 20 from the "dangerous" category were correct. If all those thousands-minus-20 assessments were taken into account, their statistics would look much better. Even more: it would be fishy if all 20 potential exploits had actually occurred.

    • Any engineer who says that "40% is pretty good predicting" is incapable of writing good software

      It's ZERO false negatives, and some false positives. FFS, elevating 2.25 issues to "immediate priority, someone will exploit this soon" status for every one real issue seems damn good to me.
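In classifier terms that trade-off is precision versus recall; here is a minimal sketch using the October numbers quoted in the thread (9 predicted exploitable, 4 actually exploited, none missed):

```python
# "Never rated too low" means zero false negatives (recall = 100%);
# 4 correct out of 9 "exploitable" calls means precision of ~44%.
true_pos, false_pos, false_neg = 4, 5, 0

precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)
print(f"precision {precision:.0%}, recall {recall:.0%}")
# precision 44%, recall 100%
```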

  • Interestingly what they are saying here is that they think that

    a) Hackers are smarter than they actually are
    b) Microsoft code is easier to exploit than it actually is

    So the perception is that Microsoft is better than their prediction, but the implication of that is that Microsoft think they are rubbish.

    Maybe all these years of "Microsoft sucks" posts on Slashdot have actually come from the MS security team.

    • by Raynor (925006)

      No. What they say is:

      You should fix this bug first, since we believe it is the most likely to be exploited.

      You can save these for later, since we don't believe it will be immediately exploited.

      There is, however, something to be said for hackers referring to this list to find "unlikely" bugs to exploit.

    • So Microsoft thought their code was exploitable and said so, and it was, and instead of doing something about it they just congratulated themselves on predicting it!

      Now here's an odd idea: rather than predicting whether something is exploitable and then publishing the prediction, why not just not write code that is easily exploitable...!

      and note the 40% covers only the exploits they know about... so even that is suspect...

  • by 140Mandak262Jamuna (970587) on Friday November 14, 2008 @07:17AM (#25759115) Journal
    Nov 14, Redmond, Washington. Today Head of Vistaland Security of Microsoft, Mr Ima F Anboi announced that Microsoft has raised the Exploitability Threat Level from Light Purple to Sunset Yellow. He urged the users to continue their normal activities and not take precipitous actions.

    Microsoft Exploitability Threat Level Indicator is a series of color codes starting from Dazzling Arctic White to Heart of Dick Cheney. Though exact number of these colors is considered a secret, from the past announcements we deduce there are at least 22 million of them.

    For PRNewswire, copy edited by Anurag Chakraborty in Bangalore and supervised by Robert Zimmermann in Pittsburgh.

  • there are so many to choose from...

  • ...is the same as being wrong 60% of the time.

    Doesn't look so impressive when you look at it this way.

    • by Chrisq (894406)
      I was going to say the same thing. Still, it didn't do George Bush any harm.
    • by Icarium (1109647)

      Without knowing the baseline they're working on, this could range from extremely impressive to completely useless.

      Ok. So 4 out of the 9 bugs they expected to see exploit code for actually had exploits materialise. How many bugs had exploits coded that were not among their 9 candidates? What is the total number of bugs taken into consideration?

      If you were playing "battleship" on a 3x3 board with 4 "ships", taking 9 guesses to hit all 4 would be pretty dismal. Change that into a 30x30 board and suddenly 9 gues

    • Re: (Score:3, Insightful)

      by dubl-u (51156) *

      Doesn't look so impressive when you look at it this way.

      Depends on the payoff.

      It's not good if you're betting even money on coin tosses. But if you're a venture capitalist, it's great. The general rule for tech VCs is that 7 bets out of 10 will fail, 2 will do ok, and 1 will be a big success. If that 1 success is buying 10% of Google in the very early days, your 70% failure rate is still pretty awesome, because you're still up billions of dollars.

  • Microsoft is now bragging about the fact that they predicted 40% of their bugs would be turned into exploits?
    I realize that Windows is a complex hunk of crap...errr...operating system, but wouldn't they be better served trying to find and correct these issues rather than just releasing them into the wild and keeping their fingers crossed?
    Their attitude is sort of like pointing the gun at your foot and firing five times, and bragging that you only hit two of your toes.
    This is why, every day when I arri
  • So let me get this straight. Microsoft's determination of whether or not there would be an exploit was correct less frequently than if they had just randomly chosen yes or no, and Microsoft calls that good performance?

    With such low standards of good performance, it is no wonder that the software coming out of Redmond lately has been so horribly poor.

  • More fail from MS (Score:2, Insightful)

    by foldingstock (945985)
    They can predict exploits in their own software. Well paint me yellow and call me a phone directory!

    How can a PR team for one of the largest corporations in the US seriously release a statement like this? What kind of company fails so badly that they can only predict 40% of exploits in their own [proprietary] software?

    If a major car (or car part) manufacturer "accurately" predicted that 40% of their automobiles would explode and burn their owners alive due to a fuel system defect....would people still
    • What kind of company fails so badly that they can only predict 40% of exploits in their own [proprietary] software?

      They had zero false positives. So, put it this way, looking at the source, they came up with some number of exploits that had to be fixed. Other people came up with only 40% of that number.

      It's a good thing, both that MS fixed them all, and that outside people seeking exploits are only 40% efficient.

  • They tried to predict if a hole will be exploited or not. Those are two outcomes. If you were to guess you would end up with a 50% chance of guessing right.

    And they were only 40% right and 60% wrong?

  • The ones I worry about are the 12-year-old bugs that have had exploit code for 8 years already and are only now getting fixed - maybe.
  • Does anybody else think it's really funny that Microsoft's predicting abilities are better than their patching abilities?

  • Thanks, Microsoft! (Score:3, Interesting)

    by scribblej (195445) on Friday November 14, 2008 @02:04PM (#25763733)

    No one seems to be looking at this from the opposite angle.

    If I'm writing malware that's going to need to exploit Windows, this gives me an easy chart of which exploit I should pick -- the ones with the lowest patch priority, of course.

  • Warning. If you love Microsoft, don't read. Your delicate sensibilities might be hurt 40% of the time.

    Engineering sector: 99.999. Microsoft: 40.000. Sounds about right.
