Google The Internet IT Technology

Confirmed Gmail / Google App Outage

Posted by timothy
from the were-you-there-when-it-happened dept.
mbone writes "Earlier today there was a confirmed Google outage which got a lot of attention from network operators. From a post to NANOG after everything calmed down: 'Google ack'd a maintenance on their core network did not go as planned-Forced traffic to one peer link that was unable to handle all the traffic. Maintenance has been rolled back. Issue has been restored.' This is exactly what makes me nervous about cloud computing and data storage. It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?" Several readers also point to CNET's coverage of the outage. Update: 05/14 19:25 GMT by T : CWmike adds this: "Steven J. Vaughan-Nichols writes that what may be happening is a massive DDoS attack. Based on the size of the attack that would be needed to interfere with Google, I believe that it's quite likely to be the result of an attack from the controllers of the Windows worm, Conficker. Another theory that has been put about — that the problem was due to AT&T NOC routing problems — does not appear to hold water, writes Steven." Update: 05/14 21:01 GMT by T : Google's put up a low-detail explanation on their blog that says "An error in one of our systems caused us to direct some of our web traffic through Asia, which created a traffic jam. As a result, about 14% of our users experienced slow services or even interruptions."
This discussion has been archived. No new comments can be posted.

  • by JWSmythe (446288) <jwsmytheNO@SPAMjwsmythe.com> on Thursday May 14, 2009 @02:42PM (#27954283) Homepage Journal

        In comments from Google Admins, they said "oops." :)

  • Google Voice Issues (Score:5, Informative)

    by 0100010001010011 (652467) on Thursday May 14, 2009 @02:43PM (#27954317)
    My Google voice account went all sorts of haywire.

    1) Text messages sent from the web got duplicated. One person got near 10 duplicates in quick succession. I also got duplicate messages back.
    2) My number doesn't work. If you call it you get a "Currently unavailable"
    3) A few calls that came in before the outage aren't showing up in the Received/Missed calling list.
    • by MozeeToby (1163751) on Thursday May 14, 2009 @02:58PM (#27954587)

      I've noticed some inconsistencies on my company's finance.google page. It seems to be giving two different values for gains and losses for the day: the one on the graph is correct but the one in the heading is not. It also lists our company as one of the related companies, something that it has never done before.

      I've got to wonder just what the hell happened here. Major and unusual issues across nearly all of Google's services? This isn't going to be good for Google's brand image.

    • by N!NJA (1437175)
      strange. my Firefox 3.0.10 somehow got affected by this outage. it just refused to open! it loaded about 30MB of data into RAM but went nowhere from there. the browser window never appeared. and i tried to re-launch it several times, but to no avail! very odd.... anyone else had problems with it? Opera -- although not able to open Google.com -- opened fine!

      Is Firefox tied to Google like E.T. was tied to Elliot?
    • Re: (Score:2, Flamebait)

      by Jeian (409916)

      Fixed-width fonts are a useful thing, in specific situations.

      Entire comments should not be written in them.

  • by WarwickRyan (780794) on Thursday May 14, 2009 @02:43PM (#27954319)

    ...and take a stroll to the great big place known as "outside".

  • by Anonymous Coward on Thursday May 14, 2009 @02:44PM (#27954325)

    call me....

    • Re: (Score:3, Funny)

      by PhxBlue (562201)

      call me....

      I can't ... all my contacts' phone numbers are stored in GMail!

  • by llZENll (545605) on Thursday May 14, 2009 @02:44PM (#27954327)

    And yet somehow miraculously we are all still alive. The sky is not falling!

  • by Nick Ives (317) on Thursday May 14, 2009 @02:44PM (#27954347)

    When it's just your mail server down, everyone else gets annoyed at you because you're not {gett,receiv}ing mail they're {sending, expecting from} you. When the cloud is down, everyone can just chill and be thankful that they're not going to log on to find a whole stream of new emails.

    This sucks for docs though but using a completely cloud based doc solution is a bit mental. Even if you're mobile it's best to have a local copy to save on battery life.

    • by rho (6063) on Thursday May 14, 2009 @03:38PM (#27955367) Homepage Journal

      It also sucks for the Web in general.

      Google was so fucked that a lot of pages that had Google ads, or Google Analytics were slow to load or not loading at all.

      • by Nick Ives (317) on Thursday May 14, 2009 @03:53PM (#27955635)

        Browsers should be smarter about that. Maybe if they remembered that certain hosts are down and so stop trying to load scripts from them? They could periodically retry unreachable script-hosts in the background and then ask the user if they wanted to reload all relevant tabs.

        The problem with remotely hosted scripts isn't just limited to Google or cloud apps, it's a more general issue and browsers should be able to handle it with grace.
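
        For what it's worth, the bookkeeping for that could be as small as the sketch below. This is not how any real browser works: the class, the 60-second retry window and the `fetch` callback are all made up for illustration, and it only assumes that a dead host shows up as a `TimeoutError`.

```python
import time


class UnreachableHostCache:
    """Remembers script hosts that recently timed out (hypothetical helper)."""

    def __init__(self, retry_after=60.0, clock=time.monotonic):
        self.retry_after = retry_after  # seconds to wait before retrying a host
        self.clock = clock              # injectable so the logic can be tested
        self._failed_at = {}            # host -> time of the last failure

    def should_try(self, host):
        """True if the host is not marked down, or its retry window has passed."""
        failed = self._failed_at.get(host)
        if failed is None:
            return True
        return self.clock() - failed >= self.retry_after

    def record_failure(self, host):
        self._failed_at[host] = self.clock()

    def record_success(self, host):
        self._failed_at.pop(host, None)


def load_scripts(scripts, fetch, cache):
    """Fetch each (host, url) pair unless the host is marked down.

    Returns the URLs that were skipped or timed out, so the caller could
    offer to reload them later.
    """
    skipped = []
    for host, url in scripts:
        if not cache.should_try(host):
            skipped.append(url)      # known-bad host: do not even try
            continue
        try:
            fetch(url)
            cache.record_success(host)
        except TimeoutError:
            cache.record_failure(host)
            skipped.append(url)
    return skipped
```

        The first page load still pays for one timeout per dead host; every later page skips that host until the retry window expires.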

        • by rho (6063)

          Browsers should be smarter about that. Maybe if they remembered that certain hosts are down and so stop trying to load scripts from them? They could periodically retry unreachable script-hosts in the background and then ask the user if they wanted to reload all relevant tabs.

          What does the unicorn burger taste like in this mystical land in which you live?

          • by Nick Ives (317)

            What does the unicorn burger taste like in this mystical land in which you live?

            HEATHEN!

            I love unicorns [youtube.com] and I won't let you eat them!

            I was requesting a feature btw, browsers should just be smarter about javascript hosts being down. If you load a webpage that references a script from a different domain and then that script times out whilst trying to load, it wouldn't be hard to just have a record of unreachable scripts.

            Every time you try to load a remote script just check against the unreachable scripts and see if it's OK to try asking for that script again. This would be great for when

        • by mgblst (80109)

          Yeah, congrats, you have solved one little problem, and created about 1000 new problems.

      • by RLiegh (247921)

        Google was so fucked that a lot of pages that had Google ads, or Google Analytics were slow to load or not loading at all.

        Which is different from business as usual how, exactly? There's a reason googleanalytics has a place in my adblock file (and I'm far from alone in that).

      • by Krneki (1192201) on Thursday May 14, 2009 @05:02PM (#27956981)
        Noscript is the solution to this.

        There is a reason we don't like all the nasty stuff loading in the background.
      • This is why everyone moving everything and anything to Google is a bad thing. Sometimes I feel like the last person on the planet that doesn't use Google services to run damn near everything.

    • e-mail is supposed to be reliable because of its distributed nature. It is not supposed to be on a single "cloud"; distributed machines should be handling it. It is just like XMPP vs. old-fashioned MSN/AIM etc. junk.

      Let me show what I see with the "cloud" (which is one of the worst abused terms) right now:
      (wget)
      s3.amazonaws.com[72.21.207.242]
      Saving to: `423.dmg'

      10% [===> ] 4109203
      • by Nick Ives (317)

        e-mail is supposed to be reliable because of its distributed nature.

        You seem to be missing the point of what I'm saying. When your email server is down, you can't send or receive mail. This leads to lots of irate phone calls about why you haven't replied to or sent some email. When everyone's email is down, you get the occasional call about how it sucks that email is down because so-and-so wanted you to do <trivial task>.

        Even better, more complicated things that involve moving attachments have to be postponed which leaves you to catch up with your real work! Plus nobo

  • by ColdWetDog (752185) on Thursday May 14, 2009 @02:44PM (#27954349) Homepage

    It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once

    If everybody goes down, nothing happens and you just go outside (beyond the doors, out into the bright white light) and enjoy your day until 'they' fix it.

    What's not to like?

  • by LingNoi (1066278) on Thursday May 14, 2009 @02:44PM (#27954357)

    This is exactly what makes me nervous about cloud computing and data storage. It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?

    If it bothers you then use a mail client to download your mail from Google. As someone that has been using my gmail account all week I didn't even notice a problem, the whole thing seems overblown.

    • Re: (Score:2, Insightful)

      by Botia (855350)
      The problem wasn't just mail. Any site that used Google for web statistics, mapping, or other services that Google offers was affected. For example, certain online banking systems use Google Analytics. These were affected.
      • by Ash Vince (602485)

        The problem wasn't just mail. Any site that used Google for web statistics, mapping, or other services that Google offers was affected. For example, certain online banking systems use Google Analytics. These were affected.

        Strange, I am responsible for several sites that use Google analytics and I had a nice quiet day. They are fairly intensively monitored so if this had affected us I would have heard horrible alarms going off and clients ringing us.

        Maybe the issue was more localised than people are making it sound.

        • I didn't notice any outage in Google services either. But I only use it for mail, maps, and searching. I checked my email at least a dozen times today with no problems.
        • by Ash Vince (602485) on Thursday May 14, 2009 @03:30PM (#27955237) Journal

          I know it is poor form to reply to your own posts, but I have just read the full article above and discovered that those of us in the UK seemed to be OK. The West coast was apparently not affected either.

          Maybe someone told Google I was on holiday tomorrow and needed a nice quiet day to clear my desk :)

          • by berashith (222128)

            I was wondering... my work proxy pops out somewhere in France or Switzerland and I never saw any problems.

    • by jadin (65295)

      Not only that but if your personal mail server goes down, there's just you fixing it. When google's does, how many hundreds if not thousands of people are scrambling on red alert to fix it?

    • If it bothers you then use a mail client to download your mail from Google. As someone that has been using my gmail account all week I didn't even notice a problem, the whole thing seems overblown.

      I've had a lot more Google downtime caused by power outages or ISP service interruptions than by Google itself being down. So, yeah, I agree with you, very overblown. Doesn't matter how dependent we are on the cloud, we still cannot take the internet for granted.

  • Mail Servers (Score:5, Insightful)

    by Aladrin (926209) on Thursday May 14, 2009 @02:47PM (#27954419)

    Having run my own mail server, and used mail servers run by companies I work for, I'll -gladly- take GMail's track record for reliability. Even with no 'guarantee', it's been a hell of a lot better than anything else I've experienced.

    And what's -really- the difference between a server going down locally that affects you and a server going down globally that affects you? Nothing.

    • Re:Mail Servers (Score:5, Informative)

      by ACMENEWSLLC (940904) on Thursday May 14, 2009 @03:00PM (#27954627) Homepage

      >>And what's -really- the difference between a server going down locally that affects you and a server going down globally that affects you? Nothing.

      Actually, I disagree. There is a difference. If it's local and I own it, I have to fix it. If it's outsourced and Google owns it, I sit back and let Google fix it. Which is nice.

      ThePlanet.com had a bad switch install a few days ago which brought down part of our cloud. Our website was down, as was our access to Google, because DNS gave us an IP down there for Google. If you look at the last year, the cloud solution has had better uptime than what I was providing, counting planned maintenance, patching, updates and all.

      It was nice to leave at 5pm, knowing ThePlanet would fix the switch and get us back up. And they did. It's a lot easier to gripe about the cloud being down and sit back, than to manage and fix your own local servers switches and such. When you get to managing hundreds of servers, it becomes time to know what to outsource.

    • by jimicus (737525)

      I run my own mail server and you know what? I honestly have not the remotest idea why some people find it so difficult.

        (Famous last words!)

    • And what's -really- the difference between a server going down locally that affects you and a server going down globally that affects you? Nothing.

      The difference is that you don't have a global meltdown of every web-based service that is dependent on Google.

  • by geekmux (1040042) on Thursday May 14, 2009 @02:52PM (#27954499)

    Take a good look, kids. Google was down and Twitter was up. This only happens once in every 3,271 days. You probably won't see it again, at least in Twitter's lifetime...

  • by cwgmpls (853876) on Thursday May 14, 2009 @02:57PM (#27954581) Journal

    Anyone who has ever used or administered a mail server has experienced a mail server going down. This is not news.

    What is news is that Google Mail has been up for so long until now. And current accounts seem to indicate the outage lasted about one hour.

    One hour of down time after five years of steady service is good enough for me. It is better than any other mail server I have ever used.
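
    To put a number on that, here is a back-of-the-envelope calculation. It takes the one-hour and five-year figures above at face value, which is an assumption rather than a measured outage history.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year length, including leap years


def availability(downtime_hours, period_years):
    """Fraction of the period during which the service was up."""
    total_hours = period_years * HOURS_PER_YEAR
    return 1.0 - downtime_hours / total_hours


# Figures from the comment above: about one hour down in five years.
uptime = availability(downtime_hours=1, period_years=5)
print(f"{uptime * 100:.4f}% available")
```

    On those figures it comes out a little under 99.998%, better than "four nines".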

    • by JWSmythe (446288)

      Well, I've administered a lot of mail servers over the years, and even when I've announced an outage, it's pretty much guaranteed that I'll get a phone call within 30 seconds of taking the machine down.

      I've noticed Gmail having problems quite often lately. Mostly the inbox can't load, times out, whatever. Not that I'm complaining though. It's free, and I can keep a copy back here for when they go under. :) I just don't look forward to copying my mail back up to my own server. It

    • Not to mention that it didn't seem to affect all of their users. Bravo, Google! Bravo, again for restoring services so quickly.
    • by teknopurge (199509) on Thursday May 14, 2009 @03:24PM (#27955129) Homepage

      You mistakenly act as if this is the first time google has had an outage in 5 years. Try again. [google.com] Some more too [google.com].

      Over the years there have been countless issues with google, from gmail being down to apps not working, though they tend not to affect everyone, just subsets of users.

      Some of the google issues have to do with mailboxes getting lost and reassigned, etc. If it doesn't happen to you, it doesn't count as an issue, according to your logic.

      • by sharkey (16670) on Thursday May 14, 2009 @03:38PM (#27955379)
        It'll get better once it's out of Beta.
      • This is what happens when all your engineers are too smart... they build things for their level of skill, and then when something goes wrong there's nobody even smarter to call in to fix it.

        In this case, google builds this fantastically complicated yet simple global filesystem and series of interdependent services that make up their search and apps. Then something goes wrong like say their enormous bandwidth temporarily exceeded by a site backup (or whatever) and dominoes start falling all over each others

    • by drinkypoo (153816)

      I was not affected at all by this outage (I have been using gmail all day, no lie) but that could be because I am using offline gmail... but I was sending and more importantly receiving mail. I guess it could have happened and been over before 6:08 pacific... no, it looks like it happened later. I have replied to mails from all around that time.

      I have seen gmail outages before, so I don't really know why this is allegedly news. None of them lasted long though. Maybe they were just rewriting my email or somethi

    • If my email server goes down for an hour, I probably won't even notice. But if the adservers are down, causing the whole internet to run slower than Vista on 512MB, I not only notice but get very annoyed.

  • by recharged95 (782975) on Thursday May 14, 2009 @03:01PM (#27954659) Journal
    In the end, who the F* cares if a cloud service goes down?

    If a life is not lost, there are no worries with cloud computing (hence, cloud computing should be used for non-life critical services, gmail is a perfect example).

    Of course, VCs may have lost revenue, Capitalists may sweat from lost stock trades, teenagers may lose that one twitter about how cool Miley is to them, some adult may not get that date tonight from craigslist, you may miss that one Hulu commercial, some K-12 kid may not be able to send out his homework, some college kid can't access his pirate bay music lists, or the USPoTC may miss that extra minute to promote his stimulus bill.

    In the end, I hope cloud services show us that we are not slaves to time. The human race has advanced enough to know that already. And really, if "the cloud" is down for an hour, maybe you should go outside and enjoy the wonders of nature and peace for once, or talk to someone physically. It raises the question: "can it wait?"

    • by againjj (1132651) on Thursday May 14, 2009 @04:36PM (#27956513)

      For good or for ill, the Internet has become rather important for the functioning of society, and it is only getting more so as time goes by. Compare it to any other piece of infrastructure.

      Recently here in the bay area, we lost part of the MacArthur Maze (the interchange of 580, 880, and 80 on the Oakland side of the Bay Bridge). You can trivialize it by saying that the toll plaza may have lost revenue, the bus line may sweat from loss of fares, some adult may not get that date tonight to the SF restaurant, you may miss that one baseball game, some K-12 kid may not be able to get to the zoo, etc., or you can recognize that the bay bridge is one DAMN IMPORTANT piece of infrastructure that makes waves if it is down.

      There is a lot that relies on cloud services, many more than you may realize. That is why there are binding QoS contracts. When something goes down, it costs money and time. While you can route around the damage, or maybe take a vacation for the day, that does not mean that failures are unimportant. When you say, "If a life is not lost, there are no worries with cloud computing", you trivialize any loss other than life. The recent housing downturn didn't cost lives, but it did cost jobs, homes, and retirement incomes, to name a few. Sorry, when a major Internet service goes down, someone had better "the F* care".

  • "It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?"

    That's much better for you. Instead of having to explain to everybody that the dog ate your homework or whatever, you can sit back and let them explain it to you...

  • Ah, we all get our power from the "electrical cloud". We all need private generators. Ah! Ah!

  • by roc97007 (608802) on Thursday May 14, 2009 @03:16PM (#27954979) Journal

    If we're talking about the same outage that caused google advertisements to hang forever this morning, it caused access to many unrelated websites to hang, including slashdot itself. This seems like a really bad single-point-of-failure issue. If a site can't display ads, shouldn't it come up anyway?

    It's bad enough that I have to wait tens of seconds for Captcha content to pop up long after a login page has loaded.

    This is starting to get annoying. If this is "cloud computing", I'd rather stay on earth.

    • Of course since I have Google Analytics and adsense in my hosts file, those websites never gave me any problems this morning. I started this in 1998 when I was on dialup because it sped up the loading of many websites as doubleclick and others simply bogged down.

    • by GoRK (10018)

      You're telling me; I start getting reports from users all around the office that sites are failing to respond -- looks like a bigtime BGP barf, but then I realize it's all google ads and google analytics hanging pages all over creation. I couldn't think of a good way to mitigate this other than to blackhole Google's Georgia datacenter, and I figured by the time I did that, Google would have it fixed. Imagine my surprise when they didn't after a few hours. I guess there's a first time for everything.

  • by GPLDAN (732269) on Thursday May 14, 2009 @03:25PM (#27955137)
    When done correctly, the "cloud" is the internet itself. Google has network design issues, some of their key services only have a couple of ingresses into Tier-1 providers:

    http://en.wikipedia.org/wiki/Tier_1_carrier [wikipedia.org]

    I don't work for them, I don't hold their stock, and I am not (currently) a customer, so I have no skin in their game, but Internap as a BUSINESS MODEL becomes more important.

    If you are a major company that comes to rely HEAVILY on Cloud Services, you want to ensure that you have on-ramps into several Tier-1 providers ALL AT ONCE, without having to contract individually with 4 or 5 of them yourself. I predict more companies will mimic this model of aggregation, essentially handling the business of BGP optimization for customers, and handing customers 2 redundant pipes and saying "hey, don't worry if San Fran has an earthquake and these peering points blow up, we'll get you out via this Tier-1 backbone over to your cloud computing provider's service within seconds. Let us handle that."

    Especially with ISPs that get into pissing matches, like when Cogent and Telia got into it, and cut each other off. If you had Cogent as your only ISP, you were screwed if you wanted to get to a bunch of Swedish sites, because Cogent's CEO was trying to play chicken over some tariff rates. The cloud computing model will no longer tolerate that, it's not just some website, it's a BUSINESS function.

    That's my take at least.
  • Many sites rely on Google in ways that aren't immediately evident - for instance, during the outage, Google Analytics connections were lagged, which meant that all of our sites that incorporate Analytics were ALSO lagged.

    What's amazing is the extent to which an outage on a single entity can bring down ALL of the other entities that surround it -- not just those who rely more visibly, e.g., Google Docs., on their services.

    Yikes!

    --Dave

    • by saiha (665337)

      It's because of very poor design. It's the same reason that slow loading adverts slow down a site. One small aspect of a site should not affect its performance the way it does today.
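
      One way to keep a small extra from holding up the whole page is to give it a time budget and carry on without it when the budget runs out. The sketch below shows the idea in Python rather than in a real page; `render_page`, `load_optional` and the half-second budget are hypothetical names and values chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def render_page(core_content, load_optional, budget_seconds=0.5):
    """Return the page, including the optional extra only if it arrives in time."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(load_optional)
    try:
        extra = future.result(timeout=budget_seconds)
    except Exception:
        # Timed out or failed: drop the extra, the core content still renders.
        extra = ""
    finally:
        pool.shutdown(wait=False)  # do not block the page on the slow worker
    return core_content + extra
```

      The slow request still finishes (or fails) in the background; the page just stops waiting for it.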

    • by javaxjb (931766)

      I thought maybe something had corrupted my Firefox session at the time... I suspected Google was having problems, so I went to E*Trade which was failing to load just about anything except text. I don't see any references to Google or ga.js on E*Trade's pages. But, E*Trade does rely heavily on Akamai servers. If it was a DOS attack, it may have affected Akamai, too (either as the subject of a separate direct attack or an indirect victim of traffic generated by Google's problems).

      Both Google and E*Trade rec

  • It's bad enough when I screw up a config and it takes down my mail, but what about when it happens to the entire globe at once?

    I was reading this comment and it occurred to me that the latter is actually preferred. With the first option, your systems are messed up, but everyone else wants you to continue to conduct business. With the latter situation, your systems are down and so are the people who would normally be trying to reach you.

    • by mbone (558574)

      In this case, none of my systems were down, and I wouldn't have known that there was a problem if I hadn't heard from outside, as my connection to Google goes through Cogent, and that seems to have been unaffected.

  • This speculation from the ComputerWorld blog doesn't belong in the post. Even the blog author says it's conjecture. It's especially ridiculous since the NANOG post in the second link already explained that the problem was a routing error at Google.
  • Am I missing something? What is "ack'da maintenance?"

    Sounds like someone's watched the new Star Trek a few times too many...

  • Companies expecting to do mission critical work over the Net need dedicated lines, dedicated machines, and somebody from THEIR company overseeing the system.

    Relying on other people is a sure route to disaster. It's hard enough relying on your OWN people.

    The Net is NOT fault-tolerant - unless YOU make it so.

  • Cloud Cloud cloud cloud? Cloud cloud "cloud cloud cloud?"

    Cloud cloud cloud's cloud cloud cloud Cloud Computing cloud cloud cloud cloud cloud...

  • by robbrit (1408421) on Friday May 15, 2009 @01:03AM (#27961903) Homepage
    Somebody must have typed "google" into Google. It's the only possible explanation.
