Google Bug Communications Software IT

How Google Broke Itself and Fixed Itself, Automatically

lemur3 writes "On January 24th Google had problems with a few of its services. Users of Gmail and various other Google services were affected just as the Google Reliability Team was about to take part in an Ask Me Anything on Reddit. Everything seemed to be resolved and back up within an hour. The Official Google Blog had a short note about what happened from Ben Treynor, a VP of Engineering. According to the blog post, a bug caused a system that generates configurations to send a bad one to various 'live services.' An internal monitoring system noticed the problem a short time later and triggered the rollout of a new configuration to the services. Ben had this to say of it on the Google Blog, 'Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 a.m. and began sending it; errors subsided rapidly starting at this time. By 11:30 a.m. the correct configuration was live everywhere and almost all users' service was restored.'"
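For readers who want the general idea in code, here is a minimal sketch of a push-then-check loop that falls back to a "last known good" configuration. It is not Google's system: the blog post describes the generator producing a new, correct configuration rather than a rollback, and the names `ConfigPusher` and `is_healthy` are hypothetical, invented for illustration only.

```python
from typing import Callable, Dict, Optional

Config = Dict[str, str]


class ConfigPusher:
    """Hypothetical pusher that reverts to the last healthy config."""

    def __init__(self, is_healthy: Callable[[Config], bool]) -> None:
        # is_healthy stands in for whatever monitoring a real system has.
        self.is_healthy = is_healthy
        self.live: Optional[Config] = None
        self.last_known_good: Optional[Config] = None

    def push(self, candidate: Config) -> bool:
        """Make candidate live; revert if the health check fails."""
        self.live = candidate
        if self.is_healthy(candidate):
            # Remember this config so a later bad push can fall back to it.
            self.last_known_good = candidate
            return True
        # Bad config: restore the previous good one (None if there is none).
        self.live = self.last_known_good
        return False
```

A caller would construct it with a health check, for example `ConfigPusher(lambda c: "backend" in c)`, and call `push` for each generated configuration. A real deployment would stage the rollout and monitor over time instead of checking once, which is where most of the difficulty lies.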

  • by Anonymous Coward on Saturday January 25, 2014 @02:31PM (#46067283)

    On recovering by using the "last known good" configuration. What wizardry!

    I expect we'll be seeing the Google patent application on that shortly </sarcasm>

    Give Google a little credit (but not too much please). If they were Apple they'd have already patented it.

  • by stjobe ( 78285 ) on Saturday January 25, 2014 @02:35PM (#46067301) Homepage

    "The Google Funding Bill is passed. The system goes on-line August 4th, 2014. Human decisions are removed from configuration management. Google begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug."

  • by 93 Escort Wagon ( 326346 ) on Saturday January 25, 2014 @02:42PM (#46067337)

    Give Google a little credit (but not too much please). If they were Apple they'd have already patented it.

    Whereas Google would just look for a small company holding a relevant patent, then buy it.

  • by Immerman ( 2627577 ) on Saturday January 25, 2014 @03:03PM (#46067447)

    Google perceives this as an attack by humanity, and routes all search queries to goat.se in self-defense.

  • by Anonymous Coward on Saturday January 25, 2014 @04:02PM (#46067819)
    Yeah that totally must be it. Me, the guys who write configuration management tools who'll tell you how hard it is (and sell you consultancy to try to make it slightly less hard) and the guys who write monitoring tools who'll tell you how hard it is (and sell you consultancy to try to make it slightly less hard). All those guys from companies like Facebook and Google who give talks at conferences about how difficult it is. We all suck at it and don't know what we're talking about. If only we'd listened to Slashdot, all our troubles would be but a dream.
  • by phantomfive ( 622387 ) on Saturday January 25, 2014 @05:15PM (#46068323) Journal
    "Our system is high-availability, it can return 404s all day for decades without going down"
  • by Anonymous Coward on Saturday January 25, 2014 @06:22PM (#46068785)

    Careful. Only the advice of Anonymous Cowards is trustworthy. All the other people on Slashdot are not to be trusted. After all, they are not even able to find out how to post anonymously! ;-)
