
When Google Got Flu Wrong

ananyo writes "When influenza hit early and hard in the United States this year, it quietly claimed an unacknowledged victim: one of the cutting-edge techniques being used to monitor the outbreak. A comparison with traditional surveillance data showed that Google Flu Trends, which estimates prevalence from flu-related Internet searches, had drastically overestimated peak flu levels. The glitch is no more than a temporary setback for a promising strategy, experts say, and Google is sure to refine its algorithms. But with flu-tracking techniques based on mining of web data and on social media taking off, Nature looks at how these potentially cheaper, faster methods measure up against traditional epidemiological surveillance networks." Crowdsourcing is often useful, but it seems to have limits.

  • Modern epidemics and pandemics are almost ALWAYS overestimated by those predicting them. In part, this is because those predicting them often have a vested interest in making them sound scarier than they actually are. So you get a lot of this "The sky is falling! Weessa all gonna die! Give me more research money!" screaming from epidemiologists and those in related fields.

    • In part, this is because those predicting them often have a vested interest in making them sound scarier than they actually are.

      Financial incentive? In science?

      Well, yes. Scientists are people too, and they want the same thing most of us want: to put together enough of a money pile to leave the rat race and go do what we want for a change, without having to make it profitable and thus bending it to the lowest common denominator (LCD).

      Michael Crichton's State of Fear reveals this tendency in our media and

      • Michael Crichton's State of Fear reveals this tendency in our media and science.

        Really? I was under the impression that it was a novel, not a document. Are you perhaps confusing "claim" and "reveal"?

    • by DrXym ( 126579 ) on Thursday February 14, 2013 @11:49AM (#42895993)
      This is borderline conspiracy think. Scientists of all stripes want their predictions to be testable, with minimal error bars and as accurate as possible.
      • by Bigby ( 659157 )

        The scientists aren't the ones working as the middle man between their work and the media.

      • by Anonymous Coward

        People mass-failing the iterated researcher's dilemma (similar to an iterated prisoner's dilemma, but related to funding rather than sentencing) does not require a conspiracy. It just requires that enough people know nothing of game theory and have a poor grasp of cost/benefit analysis.

        Dismissing observations of common human behavior as some sort of conspiracy is simply obstructive to any process of understanding.

    • by eepok ( 545733 ) on Thursday February 14, 2013 @12:06PM (#42896123) Homepage

      Actually, that's kinda the goal. When it comes to the expenditure of time and money, if you don't come in with a Chicken Little, people are just going to ignore you. With the Chicken Little, you get people to fall in line and the effects of major epidemics or problems are mitigated.

      Slashdot-friendly example: Today, people will say that the Y2K issue was completely blown out of proportion. Airplanes didn't fall out of the sky, bank accounts were there on Jan 1, 2000, and everything was just fine. Of course, that ignores the teams of coders working in even-then-archaic coding languages to adapt old software to work beyond their expected lifespan. Who knows what Y2K would have been had we just done nothing, but we're all better off with the purse-string-holders getting concerned.

      • by crazyjj ( 2598719 ) * on Thursday February 14, 2013 @12:14PM (#42896207)

        It's only a problem when it causes people to panic (like yelling "fire" in a crowded theater, then defending yourself with "Well, it got them to think about fire safety, didn't it?"). If it just causes Cleatus Dipshit to wash his hands more and cover his goddamn mouth when he sneezes, I'm okay with it. If it causes people to sell their houses and empty their bank accounts to buy underground bunkers and canned goods, then we have a problem.

        Of course, there is also the issue of fraud when it comes to public grant money. I don't like the idea of scientists who are knowingly exaggerating their findings taking grant money away from those who aren't.

      • Who knows what Y2K would have been had we just done nothing

        Well, presumably...

        the teams of coders working in even-then-archaic coding languages to adapt old software to work beyond their expected lifespan ...would know.

        But where are their stories?

        I'm asking out of curiosity - not necessarily because I'm sceptical. Wikipedia does have some stories of Y2K-bug related issues (one even fatal, although I think more than just Y2K-bug failed there), but there doesn't seem to be a reference to people stating they

        • curses. Well it's a good thing I didn't have to fix any Y2K bugs - I can't even manage a simple </blockquote> ;)

        • by samkass ( 174571 )

          Is there anyone who was working in software in 1999 who WASN'T spending a lot of time considering Y2K issues? We had to upgrade most of the software stack from our servers at the time and put in the approved two-digit rounding code to the UI date parsing. Not exactly heroic, but I'm not aware of a single piece of server software that required no modifications for Y2K. Everyone was involved in a thousand tiny ways.
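
A common Y2K-era fix for two-digit years in date parsing was "windowing": years below a pivot are read as 20xx, the rest as 19xx. Whether that is what the "approved two-digit rounding code" above refers to is an assumption, and the pivot value and function name here are hypothetical. A minimal sketch:

```python
PIVOT = 70  # hypothetical cutoff: 00-69 -> 20xx, 70-99 -> 19xx


def expand_two_digit_year(yy, pivot=PIVOT):
    """Map a two-digit year onto a four-digit year using a fixed window."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    # Years below the pivot are assumed to belong to the 2000s.
    return 2000 + yy if yy < pivot else 1900 + yy


print(expand_two_digit_year(99))  # 1999
print(expand_two_digit_year(0))   # 2000
```

The fixed window only postpones the ambiguity: with this pivot, a real 1969 date would be misread as 2069.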

          My guess is that the reason there's not a lot of blogs and personal stories is that it was m

        • by sjames ( 1099 )

          There's not a lot of stories because it was all pretty boring stuff. A lot of setting the clock ahead and redo the QC tests, punch out a few bugs that crop up and test again, just like any code. Where's the stories of coders getting Turbo Tax ready for next year's new rules? It's just not that exciting and most of it happened in industries that typically say nothing about their development efforts in the first place.

          There were stories at the time of mid to upper management people being brought in as develop

    • Actually, I think it's the media that has a vested interest in hyping the story. They interview five people, and the one that says "WE'RE ALL GONNA DIE!" is the one that gets quoted. They get paid for how many eyeballs see the page, not for how accurate their reporting is.
    • by Sique ( 173459 ) on Thursday February 14, 2013 @12:30PM (#42896417) Homepage
      No. You only hear in the media about epidemic and pandemic estimates of the upper range. The prediction "we'll have 30,000 deaths in 2013 due to the normal flu" wouldn't make any headlines, because every year, about 30,000 die after getting sick with the flu. But most predictions of epidemics and pandemics are exactly like this -- it's just the expected behaviour. There is a big difference between the average estimates coming from the scientists and the single highest estimates reported in the media. And of course, "everything is normal" is no news, thus it doesn't get reported that often. Information is the inverse of probability, and reports about highly improbable events have higher information content than reports about average events. Highly improbable events happen and contradict our expectations, and thus it is important to report them. Normal events happen, but we were expecting them anyway, thus there is no point in reporting them. Your "ALWAYS" is probably more due to confirmation bias on your side than anything else.
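
"Information is the inverse of probability" can be made precise with the standard definition of self-information from information theory, assuming that is the notion meant here. For an event $x$ with probability $p(x)$:

```latex
I(x) = \log_2 \frac{1}{p(x)} = -\log_2 p(x) \quad \text{(bits)}
```

So an event with probability $1/2$ carries 1 bit, while one with probability $1/1024$ carries 10 bits. The rarer the event, the more a report of it tells you.
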
    • by ewrong ( 1053160 )

      I was working at the NHS (National Health Service in the UK) a few years back when a 'flu pandemic' was being predicted, bird flu I think. Anyway, as a developer there I was pulled into a meeting to discuss plans to create some sort of emergency website with contact details if things went really bad.

      Whilst we waited for various people to get on the phone etcetera the guy next to me, a very senior doctor in the service, started moaning about why he was there. To paraphrase and the figure I use is one I just

    • I think you need to RTFA, because this has nothing to do with what you're talking about (i'm not entirely sure what point you're making, because what you appear to be saying is complete nonsense, and i'd hold you to a higher standard than that).

      google has most likely fallen prey to "man flu" syndrome, where a sniffle and a headache is confused with actually having the flu, which can kill.

  • the joke was "I opened the window and influenza"

  • by concealment ( 2447304 ) on Thursday February 14, 2013 @11:34AM (#42895863) Homepage Journal

    Computer modeling is a powerful technology that should not be underestimated.

    However, it should also not be overestimated.

    When the "real world" has millions of convergent factors responsible for an event, computer models can sometimes capture a few thousand. Based on those, a simulation is created that suggests a certain outcome. But it may be using less than 1% of the necessary data.

    This is like making architectural models out of child's blocks and then being surprised when the building falls down after it is eventually made. There are issues of scale in addition to data that can reveal periodistic or epicyclic patterns that cannot be modeled in a linear method.

  • drastically overestimated peak flu levels

    So do most "professionals" that study it.
    H1N1 was so incredibly over-hyped, for example.

    • by Anonymous Coward

      The professionals will provide the usual range of predictions, creating a more-or-less gaussian distribution around the actual result, and then the media will self-select the ones on the highest part of the curve because that's what keeps people watching the news.
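
The selection effect described above can be sketched in a few lines. This is only an illustration: the numbers are made up, and the assumption that the media quotes the single highest prediction is a simplification of the comment.

```python
import random


def simulate_headline_bias(actual, spread, n_experts, seed=0):
    """Draw expert predictions around the actual value and pick the scariest."""
    rng = random.Random(seed)
    # Hypothetical experts: unbiased, normally distributed around the truth.
    predictions = [rng.gauss(actual, spread) for _ in range(n_experts)]
    mean_prediction = sum(predictions) / n_experts
    headline = max(predictions)  # assumed media rule: quote the highest number
    return mean_prediction, headline


mean_pred, headline = simulate_headline_bias(actual=30000, spread=5000, n_experts=50)
print(round(mean_pred), round(headline))
```

The average of the predictions stays close to the actual value, while the quoted maximum sits well above it, even though no individual expert is biased.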

  • by doconnor ( 134648 ) on Thursday February 14, 2013 @11:38AM (#42895901) Homepage

    They should subtract out a factor based on how much the flu is being talked about in the media.
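
A rough sketch of that idea, not Google's actual method: treat some fraction of search volume as driven by news coverage and subtract it. The series, the `media_weight` parameter, and the function name are all hypothetical.

```python
def media_adjusted_signal(search_volume, media_mentions, media_weight):
    """Subtract a media-driven component from raw flu search volume.

    All three inputs are hypothetical: weekly search counts, weekly counts of
    flu news stories, and an assumed number of extra searches per story.
    """
    if len(search_volume) != len(media_mentions):
        raise ValueError("series must be the same length")
    # Clamp at zero so heavy coverage can never produce a negative estimate.
    return [max(0.0, s - media_weight * m)
            for s, m in zip(search_volume, media_mentions)]


searches = [100.0, 120.0, 300.0, 250.0]  # made-up weekly search counts
stories = [5, 6, 60, 40]                 # made-up weekly news-story counts
adjusted = media_adjusted_signal(searches, stories, media_weight=3.0)
print(adjusted)  # [85.0, 102.0, 120.0, 130.0]
```

In practice the weight would have to be estimated from past seasons, and a linear correction is only the simplest possible choice.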

  • by Sarten-X ( 1102295 ) on Thursday February 14, 2013 @11:38AM (#42895905) Homepage

    In short, a system that learns from abnormal circumstances will no longer work as well under normal circumstances. This year's flu outbreak didn't follow previous models, so Google's application of those models was inaccurate... but we'll blame Google for it anyway, and cast shame upon them for being so terribly wrong.

    Of course, the article is much better, delving into other systems that also predict and monitor flu outbreaks, and why they were or were not correct. TFA is really about the difference between traditional reporting sources (as from doctors' offices) and newer data-mining approaches (harvesting from searches and Twitter).

    Screw you, Slashdot.

  • by h4rr4r ( 612664 ) on Thursday February 14, 2013 @11:44AM (#42895941)

    This is probably because people will update their social media sites with claims of having the flu. If they actually had the flu odds are they would not have the strength to even do that.

    The real flu is pretty terrible and people often think they have it when they have a minor cold.

    • by Bigby ( 659157 )

      I've had two different colds in the last month...which is very very odd. One of them was quite powerful. Many people would call it the flu. Some out of ignorance and others to make their situation sound worse than it really is...for pity. Others will say that to solidify to their bosses that they aren't going to work.

    • by wren337 ( 182018 )

      I had the flu this season, I was laid out in bed for 4 days. Didn't eat anything, drank a little orange juice. Bundled up in a wool hat under a pile of blankets, drenched in cold sweat. I haven't been that sick since I was a kid.

    • by antdude ( 79039 )

      I still compute when very sick. Just not very good! :P

  • "Good morning, gentlemen. What does the overnight Google search analysis show?"

    "Well, there is the continued flu outbreak on the east coast, with the biggest concentration in Boston. There seems to be a ringworm outbreak in pets in the southwest, and our numbers show, and I caution you this is probably a 60% overestimate, the apparent nationwide removal of 3.8 million brains due to unspecified causes."

  • It's only a matter of time before a real flu epidemic rages through the world. The trick with flu is the balance between its virulence and its morbidity. Flus that come by that are virulent AND overly morbid will burn out. People will die too fast to spread the disease. This is why there has been no worldwide outbreak of Ebola... it kills so fast, it can't spread. A mild flu (low morbidity) can spread far and wide, because it doesn't kill the majority of its hosts, thus allowing them to pass the dis
  • Google announces they're tracking the flu (hey everyone, come see a map that will tell you how bad the flu is in your area!), Larry Page announces he's offering free flu shots to all kids in the Bay Area, and Google announces it's launching a flu shot locator. Of course searches for "flu" and "influenza" are going to increase. That will throw off the accuracy of your model. What they're really measuring is this: "people who are thinking about the flu and proactively reaching out to learn more."
    • You don't get free flu shots in the US? I'd be curious to see a cost/benefit analysis - but then I suppose when hospital rooms cost the patient money there's little motivation for the government to try to keep you out of them.

  • ...when people regularly died from it?

    • ...when people regularly died from it?

      In my State the death rate from influenza is about 1.3 per hundred thousand. Which just happens to be the same as our homicide rate.

      The thing I wonder about is if the CDC is accurately estimating the number of people who Google and decide, "yup, I've got the flu, I've got no money for a doctor's visit, no insurance, and certainly no money for anti-virals" and those cases never make it into any surveillance systems. Are they accounting for the real unemployment rate wh

  • Of course the sensational news story of this past winter was the rampant outbreak of "flu" which suddenly has become one of the biggest health scares the world has ever seen.

    Google needs a sensational hyperbole filter on their Internet scrapes, something to blow past the rampant proliferation of "news" that is not based on fact or reality but is reported only to drive web hits or broadcasts, which has become commonplace these days. Some reporter goes to the ER of a hospital, sees a room packed with sniffling, coughing

  • by idontgno ( 624372 ) on Thursday February 14, 2013 @01:18PM (#42897057) Journal
    but apparently they have a whiz-bang hypochondria pandemic detector.
    • by evelo ( 1786080 )
      Personally I find the mania detector quite useful, and hope it can be used to expose other mass-illusions such as hybrid cars being positive for the environment and guns in school being a good thing. I know I won't need a flu shot, but I want to know how many crazy people I should prepare to disbelieve and avoid any given day.
  • I don't get flu shots either. Maybe there is a connection.

  • Google Flu has never been used to officially declare a flu outbreak. It's a neat tool, and it has been successfully used in retrospective studies, but until it actually helps us prepare for a flu outbreak in ways above and beyond what traditional surveillance already does, it will continue to just be a neat tool and not a useful one. The same goes for the Twitter flu prediction models. These tools are cool, but unless people actually do things differently to prepare for an outbreak based on their predict
