Google The Internet IT Technology

Confirmed Gmail / Google App Outage (189 comments)

mbone writes "Earlier today there was a confirmed Google outage which got a lot of attention from network operators. From a post to NANOG after everything calmed down: 'Google ack'd a maintenance on their core network did not go as planned-Forced traffic to one peer link that was unable to handle all the traffic. Maintenance has been rolled back. Issue has been restored.' This is exactly what makes me nervous about cloud computing and data storage. It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?" Several readers also point to CNET's coverage of the outage.

Update: 05/14 19:25 GMT by T: CWmike adds this: "Steven J. Vaughan-Nichols writes that what may be happening is a massive DDoS attack. Based on the size of the attack that would be needed to interfere with Google, I believe that it's quite likely to be the result of an attack from the controllers of the Windows worm, Conficker. Another theory that has been put about, that the problem was due to AT&T NOC routing problems, does not appear to hold water, writes Steven."

Update: 05/14 21:01 GMT by T: Google's put up a low-detail explanation on their blog that says "An error in one of our systems caused us to direct some of our web traffic through Asia, which created a traffic jam. As a result, about 14% of our users experienced slow services or even interruptions."
This discussion has been archived. No new comments can be posted.

  • by MozeeToby ( 1163751 ) on Thursday May 14, 2009 @02:58PM (#27954587)

    I've noticed some inconsistencies on my company's finance.google page. It seems to be giving two different values for gains and losses for the day; the one on the graph is correct but the one at the heading is not. It also lists our company as one of the related companies, something that it has never done before.

    I've got to wonder just what the hell happened here. Major and unusual issues across nearly all of Google's services? This isn't going to be good for Google's brand image.

  • by recharged95 ( 782975 ) on Thursday May 14, 2009 @03:01PM (#27954659) Journal
    In the end, who the F* cares if a cloud service goes down?

    If a life is not lost, there are no worries with cloud computing (hence, cloud computing should be used for non-life-critical services; gmail is a perfect example).

    Of course, VCs may have lost revenue, Capitalists may sweat from lost stock trades, teenagers may lose that one twitter about how cool Miley is to them, some adult may not get that date tonight from craigslist, you may miss that one Hulu commercial, some K-12 kid may not be able to send out his homework, some college kid can't access his pirate bay music lists, or the USPoTC may miss that extra minute to promote his stimulus bill.

    In the end, I hope cloud services show us that we are not slaves to time. The human race has advanced enough to know that already. And really, if "the cloud" is down for an hour, maybe you should go outside and enjoy the wonders of nature and peace for once, or talk to someone physically. It raises the question: "can it wait?"

  • by roc97007 ( 608802 ) on Thursday May 14, 2009 @03:16PM (#27954979) Journal

    If we're talking about the same outage that caused google advertisements to hang forever this morning, it caused access to many unrelated websites to hang, including slashdot itself. This seems like a really bad single-point-of-failure issue. If a site can't display ads, shouldn't it come up anyway?

    It's bad enough that I have to wait tens of seconds for Captcha content to pop up long after a login page has loaded.

    This is starting to get annoying. If this is "cloud computing", I'd rather stay on earth.
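
The point above, that a page should still come up when an ad server hangs, can be sketched roughly as follows. This is only an illustration of putting a deadline on an optional third-party fetch, not how Slashdot or Google actually serve ads; `fetch_with_deadline`, `render_page`, and the 0.1 second deadline are all hypothetical names and values chosen for the example.

```python
import concurrent.futures


def fetch_with_deadline(fetch, deadline_s, fallback=""):
    """Run fetch() in a worker thread; return fallback if it fails or is too slow."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fetch)
    try:
        return future.result(timeout=deadline_s)
    except Exception:
        # Covers both a missed deadline and an error raised inside fetch().
        return fallback
    finally:
        # Do not block the page on the straggling worker thread.
        pool.shutdown(wait=False)


def render_page(article, fetch_ad, deadline_s=0.1):
    """Render the article, appending the ad only if it arrived in time."""
    ad = fetch_with_deadline(fetch_ad, deadline_s)
    return article + ("\n" + ad if ad else "")
```

With this shape the optional content is the thing that gets dropped under failure, while the article itself is never held hostage to a third party.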

  • by teknopurge ( 199509 ) on Thursday May 14, 2009 @03:24PM (#27955129) Homepage

    You mistakenly act as if this is the first time google has had an outage in 5 years. Try again. [google.com] Some more too [google.com].

    Over the years there have been countless issues with google, from gmail being down to apps not working, though they tend to affect not everyone, but subsets of users.

    Some of the google issues have to do with mailboxes getting lost and reassigned, etc. If it doesn't happen to you, it doesn't count as an issue, according to your logic.

  • by GPLDAN ( 732269 ) on Thursday May 14, 2009 @03:25PM (#27955137)
    When done correctly, the "cloud" is the internet itself. Google has network design issues: some of their key services only have a couple of ingresses into Tier-1 providers:

    http://en.wikipedia.org/wiki/Tier_1_carrier [wikipedia.org]

    I don't work for them, I don't hold their stock, and I am not (currently) a customer, so I have no skin in their game, but Internap, as a BUSINESS MODEL, becomes more important.

    If you are a major company that comes to rely HEAVILY on Cloud Services, you want to ensure that you have on-ramps into several Tier-1 providers ALL AT ONCE, without having to contract individually with 4 or 5 of them yourself. I predict more companies will mimic this model of aggregation, essentially handling the business of BGP optimization for customers, and handing customers 2 redundant pipes and saying "hey, don't worry if San Fran has an earthquake and these peering points blow up, we'll get you out via this Tier-1 backbone over to your cloud computing provider's service within seconds. Let us handle that."

    This matters especially with ISPs that get into pissing matches, like when Cogent and Telia got into it and cut each other off. If you had Cogent as your only ISP, you were screwed if you wanted to get to a bunch of Swedish sites, because Cogent's CEO was trying to play chicken over some tariff rates. The cloud computing model will no longer tolerate that; it's not just some website, it's a BUSINESS function.

    That's my take, at least.

  • by againjj ( 1132651 ) on Thursday May 14, 2009 @04:36PM (#27956513)

    For good or for ill, the Internet has become rather important for the functioning of society, and it is only getting more so as time goes by. Compare it to any other piece of infrastructure.

    Recently here in the Bay Area, we lost part of the MacArthur Maze (the interchange of 580, 880, and 80 on the Oakland side of the Bay Bridge). You can trivialize it by saying that the toll plaza may have lost revenue, the bus line may sweat from loss of fares, some adult may not get that date tonight to the SF restaurant, you may miss that one baseball game, some K-12 kid may not be able to get to the zoo, etc., or you can recognize that the Bay Bridge is one DAMN IMPORTANT piece of infrastructure that makes waves if it is down.

    There is a lot that relies on cloud services, much more than you may realize. That is why there are binding QoS contracts. When something goes down, it costs money and time. While you can route around the damage, or maybe take a vacation for the day, that does not mean that failures are unimportant. When you say, "If a life is not lost, there are no worries with cloud computing", you trivialize any loss other than life. The recent housing downturn didn't cost lives, but it did cost jobs, homes, and retirement incomes, to name a few. Sorry, when a major Internet service goes down, someone had better "the F* care".

  • Re:Big Deal (Score:2, Interesting)

    by jdenver ( 1554803 ) on Thursday May 14, 2009 @07:06PM (#27958975)
    Well good thing we still had access to Twitter! ;-)

    I was following the #googlefail [twitter.com] channel I found from the InformationWeek story and found a link to some cool response time graphs [sitesteady.com] from the outage.

    There's also a really great Wired article with graphs [wired.com] from a Tier-1 provider showing the incredible drop in network traffic (by about 15 Gbps) during the outage.

  • Re:Big Deal (Score:1, Interesting)

    by Anonymous Coward on Thursday May 14, 2009 @08:54PM (#27960057)

    Anywhere from an average of 15-30 minutes a month of downtime.

    For select users. It doesn't count as downtime if it doesn't affect you.
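
To put the "15-30 minutes a month" figure quoted above in perspective, here is a rough conversion from downtime to availability. The 30-day month is an assumption made for the arithmetic, and `availability_percent` is a hypothetical helper; nothing in the thread states how the downtime was measured.

```python
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month (43,200 minutes)


def availability_percent(downtime_minutes, period_minutes=MINUTES_PER_MONTH):
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_minutes / period_minutes)


for minutes in (15, 30):
    print(f"{minutes} min/month down -> {availability_percent(minutes):.3f}% up")
```

Under that assumption, 15 to 30 minutes a month works out to roughly 99.97% down to 99.93% availability, which says nothing about whether the affected minutes landed on you.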
