Google Blames Gmail Troubles On Maintenance Goof

Slatterz writes "Google has apologised for the two-and-a-half-hour Gmail outage on Tuesday morning, and admitted that it was caused by data center maintenance. 'Lots of people around the world who rely on Gmail were disrupted during their waking and working hours, and we are very sorry. We did everything we could to restore access as soon as possible, and the issue is now resolved,' said Gmail site reliability manager Acacio Cruz in a blog post. Google had been testing new code designed to keep data geographically closer to its owner, which brought about disruption when maintenance in one data center caused another facility to be overloaded. This had a cascade effect, according to Google, and it took the company an hour to get it back under control."
  • by Jon_Hanson ( 779123 ) <jon@the-hansons-az.net> on Wednesday February 25, 2009 @06:48PM (#26989071)
    Maybe it's related to this, but I noticed this past weekend that the Jabber server running on my Linux machine can no longer get presence information for people on GMail/GTalk. From the logs I can see my server attempting to make a connection, but nothing happens, and after 20 seconds my server gives up for the time being. I haven't changed anything on my side, but I'm unsure who to contact about issues like these.
  • by Achromatic1978 ( 916097 ) <robert@@@chromablue...net> on Wednesday February 25, 2009 @07:06PM (#26989421)
    Not sure, but it's not just you. I use a couple of different clients for work communication (the dozen people in my company telecommute) - Gtalk, Trillian, Adium, and fring on my cellphone. My boss got snarky one day recently because he said my status said available, but I was unresponsive all day. After investigating, I saw the same thing. If I set my status to away on any of the non-official clients, it wouldn't propagate out. So it's a two-way issue, not just one.
  • by dave562 ( 969951 ) on Wednesday February 25, 2009 @07:06PM (#26989435) Journal
    The first thing that came to mind when reading the article was, "They were 'testing' code in a fscking production environment?!" Then I realized that Gmail is still a beta app. I think these things are to be expected from beta software. What I'm curious about is whether or not corporate users who are paying for Gmail were affected as well. If so, then Google had better get their ducks in a row, and fast. It's one thing to play around with your servers when people aren't paying you for uptime. It's another thing entirely to test code on a production network.
  • by BikeHelmet ( 1437881 ) on Wednesday February 25, 2009 @07:36PM (#26989943) Journal

    It wasn't 2.5 hours for me - it was more like 14-15 hours.

    It stopped working at night, around 9PM (this is when Gmail Notifier failed to log in, and, curious, I tried to log in manually). It still wasn't working at 2AM. I went to sleep, woke up, and it was still broken. It finally came back online some time after lunch.

    This would be quite irritating if I were a business. As it was, I did have some important emails to send off, but waiting a day didn't kill me.

  • by Anonymous Coward on Wednesday February 25, 2009 @09:40PM (#26991591)

    How did you calculate 4 nines for Gmail? 4 9's is about 52 minutes of downtime per year, while this outage was over 2 hours.

    And this isn't their first outage. The last one I remember was April of 2008.

    Is it even possible to measure 6 9's of availability for an Internet service? 6 9's is just over 30 seconds of downtime per year -- less than 3 seconds per month -- under 100 msec/day. Can you honestly say that you never experience 100 msec of additional latency once a day? Maybe once a month they have a hard disk timeout that makes a query take 3 seconds instead of a fraction of a second. Can you tell if this is a problem with that service, or somewhere on the general Internet? Even 5 nines is less than one second per day.

    And measuring availability of a search engine is tricky anyway, since the search engine database is constantly changing. If you are connected to a Google server that just happens to be having an indexing problem and is 12 hours out of date, how would you even know that is the case? Your search may fail to find the recently updated website that you're looking for, but you don't know it's because there's an indexing problem. So you may say that the search engine is up, but in reality it's not able to give you the result you're looking for.
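
The "nines" arithmetic in the comment above can be checked with a short script. This is only a sketch: it assumes a 365-day year and treats availability as a simple fraction of wall-clock time, and the function names are made up for illustration, not taken from any monitoring tool.

```python
# Downtime budget implied by N nines of availability.
# Assumption: a non-leap year of 365 days.
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def allowed_downtime_seconds(nines, period_seconds=SECONDS_PER_YEAR):
    """Seconds of downtime permitted in a period at N nines (e.g. 4 -> 99.99%)."""
    unavailability = 10 ** (-nines)  # 4 nines -> 0.0001
    return period_seconds * unavailability


def availability_after_outage(outage_seconds, period_seconds=SECONDS_PER_YEAR):
    """Availability for a period in which the only downtime was one outage."""
    return 1 - outage_seconds / period_seconds


if __name__ == "__main__":
    for n in (4, 5, 6):
        per_year = allowed_downtime_seconds(n)
        per_day = allowed_downtime_seconds(n, SECONDS_PER_DAY)
        print(f"{n} nines: {per_year:.1f} s/year, {per_day * 1000:.1f} ms/day")

    # The 2.5-hour outage from the story, taken as the year's only downtime.
    print(f"after a 2.5 h outage: {availability_after_outage(2.5 * 3600):.6f}")
```

By this measure, a single 2.5-hour outage already uses up more than a year's budget at 4 nines, which is the point the comment makes.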
