
Why Browsers Blamed DNS For Facebook Outage 96

Julie188 writes "That was probably the only time 'DNS' will ever be a trending term on Twitter. The cause was Facebook's 2.5-hour outage on Thursday, during which users trying to access the site were incorrectly told that a DNS error was to blame. In truth, experts who've read Facebook's explanation say the site went down because Facebook gave itself a distributed denial-of-service attack when a system admin misconfigured a database. So why was DNS blamed? The 27-year-old communications protocol has been known to cause other, somewhat similar outages."
This discussion has been archived. No new comments can be posted.

  • Duh (Score:5, Insightful)

    by vlm ( 69642 ) on Sunday September 26, 2010 @12:00PM (#33703620)

    "So why was DNS blamed?"

    From http://www.facebook.com/note.php?note_id=431441338919&id=9445547199&ref=mf&_fb_noscript=1 [facebook.com]

    "The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site."

    I'm, uh, taking a wild guess that simply shutting off port 80 is not going to allow for a controllable ramp up... they could redirect to another site, Orkut or MySpace would have been mildly humorous. I am mildly surprised they don't have a simple emergency box with a simple static "undergoing repair" page, but, whatever ...

    So, other than zapping the A records and waiting, what are they supposed to do? Bonus points if they were doing DNS-based load balancing and simply unplugged their (DNS-based) load balancer.

    I have no dog in the fight, having deleted my Facebook account months ago. It is kind of funny that a page of technobabble is described as "technical details" as if folks like us/me would find it to be a complete description rather than pretty vague. Then again we're dealing with FarmVille addicts and you can't reason with addicts.
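
To make the "zapping the A records and waiting" idea concrete, here is a toy model of a caching resolver, assuming (as the comment speculates, not as Facebook confirmed) that the records were pulled on purpose. `ToyZone`, `ToyCachingResolver`, the name `www.example.test`, the 192.0.2.x addresses and the 300-second TTL are all made up for illustration; real resolvers also do negative caching, which this sketch ignores.

```python
from typing import Dict, List, Optional, Tuple


class ToyZone:
    """Hypothetical authoritative data: name -> (A record addresses, TTL in seconds)."""

    def __init__(self) -> None:
        self.records: Dict[str, Tuple[List[str], int]] = {}

    def query(self, name: str) -> Tuple[List[str], int]:
        if name not in self.records:
            # Stand-in for the "no such record" answer a browser reports as a DNS error.
            raise LookupError(f"no A records for {name}")
        return self.records[name]


class ToyCachingResolver:
    """Hypothetical resolver that reuses an answer until its TTL expires."""

    def __init__(self, zone: ToyZone) -> None:
        self.zone = zone
        # name -> (addresses, absolute expiry time in seconds)
        self.cache: Dict[str, Tuple[List[str], float]] = {}

    def resolve(self, name: str, now: float) -> List[str]:
        cached: Optional[Tuple[List[str], float]] = self.cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh: the zone is not consulted at all
        addresses, ttl = self.zone.query(name)  # may raise LookupError
        self.cache[name] = (addresses, now + ttl)
        return addresses


if __name__ == "__main__":
    zone = ToyZone()
    zone.records["www.example.test"] = (["192.0.2.10", "192.0.2.11"], 300)
    resolver = ToyCachingResolver(zone)

    print(resolver.resolve("www.example.test", now=0))    # answer fetched and cached
    del zone.records["www.example.test"]                  # "zap the A records"
    print(resolver.resolve("www.example.test", now=100))  # cached answer still served
    try:
        resolver.resolve("www.example.test", now=301)     # TTL expired, lookup fails
    except LookupError as err:
        print("lookup failed:", err)
```

In this model the "waiting" is the TTL: clients holding a cached answer keep hitting the old addresses until it expires, and after that the failure surfaces as a name-lookup error rather than as an HTTP error.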

  • by j_col ( 1895476 ) on Sunday September 26, 2010 @12:30PM (#33703792)
    I found the genuine panic from many Facebook users to this outage very amusing.
  • Re:Ageism (Score:4, Insightful)

    by morgan_greywolf ( 835522 ) on Sunday September 26, 2010 @01:03PM (#33703974) Homepage Journal

    Really? DNS is broken? So typing, say, http://slashdot.org/ [slashdot.org] doesn't work for you?

    No. DNS has a few security issues, but they're mostly minor. The fact that DNS works for millions of people every day without issue at least 99% of the time proves that DNS is a successful design, even if it could use some security updating.

  • Re:Ageism (Score:5, Insightful)

    by kasperd ( 592156 ) on Sunday September 26, 2010 @01:22PM (#33704098) Homepage Journal
    Some people think technology should be replaced just because it is old. But really, it should be replaced if it doesn't suit our needs and there is a different technology that does suit them.

    It is better to replace a 1-year-old technology that does not suit our needs than to replace a 50-year-old one that does. Usually when replacing, you want to replace with something newer. But in some cases it may turn out to be better to replace a new and misdesigned technology with an older and proven one.

    That said, there are improvements to both IP and DNS which should be rolled out because they fix real problems. The rollouts are not happening as fast as they ought to, mainly because it is problematic to roll out a change to the entire Internet, especially when not everybody involved is cooperating.

    But I don't think that really has anything to do with this outage.
  • Re:Ageism (Score:3, Insightful)

    by dlgeek ( 1065796 ) on Sunday September 26, 2010 @03:01PM (#33704676)
    And is definitely showing its age. There's been a big cry for years from those working at the really high end of networking that we need to replace (really just extend) TCP because it doesn't work well with high bandwidth-delay-product links. This is because the max window size and ramp-up algorithm (slow start) don't allow you to saturate the pipe quickly enough or even at all. There are several proposed extensions floating around to fix the problem but none of them have widespread adoption.

    This actually is the case with a lot of our old networking protocols - yes, they were incredibly well designed at the time, but many are showing that they need to be upgraded to reflect modern technology. Back to our original case, the original DNS protocol does have a lot of problems that have surfaced lately (think about the transaction ID prediction and cache poisoning stuff from a couple years back) which spurred the roll-out of DNSSEC. IPv4 is hitting its limits, but we're having trouble rolling out IPv6. How much easier would fighting spam be if SMTP had a strong authentication system for sent messages? Even HTTP, which has undergone several revisions, is again showing limitations, hence Google rolling out SPDY which allows predictive pushes, stream parallelism, etc.

    I don't think anyone seeks to criticize the designers of these protocols, and the protocols have excelled and scaled far, far beyond anyone's wildest expectations. That being said, they have been showing cracks lately as technology has grown, and nothing looks like it did back when they were written. However, we have hit a point where the difficulty in upgrading or replacing them is actually starting to hold us back.
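
The bandwidth-delay-product point in the comment above can be put into numbers with a rough sketch. The link figures (1 Gbit/s, 100 ms round trip), the 1460-byte segment size and the 3-segment initial window are assumptions picked for illustration, and 65,535 bytes is the largest window the 16-bit TCP header field can advertise when window scaling is not in use. The slow-start model is idealized: the window doubles every round trip with no loss.

```python
def bandwidth_delay_product(bandwidth_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to keep the link completely full."""
    return bandwidth_bps * rtt_s / 8


def window_limited_throughput(window_bytes: int, rtt_s: float) -> float:
    """Upper bound in bits/s when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


def slow_start_rtts(target_window_bytes: float,
                    mss_bytes: int = 1460,
                    initial_segments: int = 3) -> int:
    """Round trips for an idealized slow start (window doubles each RTT,
    no loss) to grow from the initial window to the target window."""
    cwnd = mss_bytes * initial_segments
    rtts = 0
    while cwnd < target_window_bytes:
        cwnd *= 2
        rtts += 1
    return rtts


if __name__ == "__main__":
    bandwidth = 1e9   # assumed link speed: 1 Gbit/s
    rtt = 0.1         # assumed round-trip time: 100 ms

    bdp = bandwidth_delay_product(bandwidth, rtt)
    capped = window_limited_throughput(65535, rtt)

    print(f"bytes in flight needed to fill the link: {bdp:,.0f}")
    print(f"throughput with a 65,535-byte window: {capped / 1e6:.2f} Mbit/s")
    print(f"round trips of slow start to reach the BDP: {slow_start_rtts(bdp)}")
```

Under these assumptions the link needs about 12.5 MB in flight, while an unscaled window caps a single connection at roughly 5 Mbit/s, which is the "or even at all" case. Even with a large enough window, slow start needs about a dozen round trips (over a second at 100 ms) before the pipe is full.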
  • by kiwimate ( 458274 ) on Sunday September 26, 2010 @03:30PM (#33704860) Journal

    So is Slashdot.

    I don't know that finger pointing is necessarily healthy - that tends to suggest CYA and childish blame games. But on a technical, IT-focused web site, one might suppose that a lessons-learned exercise on the root cause of the failure of a massive website would be of interest and hopefully even an educational experience.
