How Verizon and a BGP Optimizer Knocked Large Parts of the Internet Offline Today
Cloudflare issued a blog post explaining how Verizon sent a large chunk of the internet offline this morning after it wrongly accepted a network misconfiguration from a small ISP in Pennsylvania. The outages affected Cloudflare, Facebook, Amazon, and others. The Register reports: For nearly three hours, network traffic that was supposed to go to some of the biggest online names was instead accidentally rerouted through a steel giant based in Pittsburgh. More than 20,000 prefixes -- roughly two per cent of the internet -- were wrongly announced by regional U.S. ISP DQE Communications: this announcement informed the sprawling internet's backbone equipment to thread netizens' traffic through one of DQE's clients, steel giant Allegheny Technologies, a rerouting that was then, mindbogglingly, accepted and passed on to the world by Verizon, a trusted major authority on the internet's highways and byways. And so, systems around the planet automatically updated, and connections destined for Facebook, Cloudflare, and others, ended up going to Allegheny, which black holed the traffic.
Internet engineers suspect that a piece of automated networking software -- a BGP optimizer called Noction -- used by DQE was to blame for the problem. But even though these kinds of misconfigurations happen every day, there is significant frustration and even disbelief that a U.S. telco as large as Verizon would pass on this amount of incorrect routing information. The sudden, wrong, change should have been caught by filters and never accepted. [...] One key industry group called Mutually Agreed Norms for Routing Security (MANRS) has four main recommendations: two technical and two cultural for fixing the problem. The two technical approaches are filtering and anti-spoofing, which basically check announcements from other network operators to see if they are legitimate and remove any that aren't; and the cultural fixes are coordination and global validation -- which encourage operators to talk more to one another and work together to flag and remove any suspicious looking BGP changes. Verizon is not a member of MANRS.
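For readers wondering what "checking announcements to see if they are legitimate" can look like in practice, here is a minimal sketch of route origin validation in the style of RFC 6811. It assumes the receiving network has a table of authorizations (prefix, maximum length, origin AS). The `Roa` type, the `validate` function, and the sample data are hypothetical illustrations using documentation-reserved prefixes and AS numbers; this is not how Verizon, DQE, or any particular router implements filtering.

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    """Hypothetical authorization record: who may originate what."""
    prefix: ipaddress.IPv4Network  # covering prefix the holder registered
    max_length: int                # longest prefix length that may be announced
    origin_asn: int                # AS allowed to originate the prefix


def validate(route_prefix: str, origin_asn: int, roas: list) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        # A record is relevant only if it covers the announced prefix.
        if route.version == roa.prefix.version and route.subnet_of(roa.prefix):
            covered = True
            # It matches only if the origin AS agrees and the announcement
            # is not more specific than the holder allowed.
            if roa.origin_asn == origin_asn and route.prefixlen <= roa.max_length:
                return "valid"
    # Covered but never matched means invalid; no coverage means unknown.
    return "invalid" if covered else "not-found"


# Sample data (assumption): 192.0.2.0/24 may only be announced as a /24 by AS64496.
ROAS = [Roa(ipaddress.ip_network("192.0.2.0/24"), 24, 64496)]

print(validate("192.0.2.0/24", 64496, ROAS))     # legitimate announcement
print(validate("192.0.2.0/25", 64496, ROAS))     # more specific than allowed
print(validate("198.51.100.0/24", 64496, ROAS))  # no record covers this prefix
```

The second call is the interesting one for this kind of incident: an announcement that is more specific than the maximum length the holder registered is classified as invalid even though the origin AS is correct, so a network that drops invalid routes would not pass it on.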
Ironic (Score:4, Funny)
"Can you hear me n&*&*#$&*##$# #@@@
NO CARRIER
Re: (Score:2)
Not ironic. It's what you should expect when you let one company have so much control.
Re: (Score:2)
one that is violently imposed.
That's not true! You can completely change your government every two years. Takes very little effort.
Re: (Score:1)
So what? Majority rules... "Live it, or live with it"
Re: Ironic (Score:1)
This was just NSA's time to sniff some traffic. Last month was China's turn to make some BGP misconfigurations.
Re: (Score:2)
Re: (Score:2)
But he works for Sprint now. :P
Re: (Score:1)
Re: (Score:2)
Why wouldn't Verizon accept it?
Remember, Verizon is losing 60% of their IT staff due to their fucked up layoff/sell remaining employees to Infosys. Most of them this week if I recall. There's no one left to give a damn.
Google too? (Score:1)
Re:Google too? (Score:5, Informative)
Re: (Score:3)
Unless the route you take to get to the DNS resolver you're using was being blackholed. Might have been better off with your ISP's servers for once
Really? (Score:5, Insightful)
But even though these kinds of misconfigurations happen every day, there is significant frustration and even disbelief that a U.S. telco as large as Verizon would pass on this amount of incorrect routing information.
There is only "disbelief" because people fail to give Verizon the credit it deserves for being as shitty as it actually is.
Re: (Score:1)
Re: (Score:3)
This didn't break the internet. It broke 2% of the internet. Large in total number of people affected, but still only a small chunk of the internet. And people are mocking those responsible for such a huge fuck up, because it was avoidable.
In other words, this is the internet working exactly as intended.
Re:Really? (Score:4, Funny)
Why is this not trivial to fix? (Score:4, Insightful)
What kind of idiot firmware would accept a BGP routing change without verifying that the updated route works?
A basic sanity check, such as sending pings to verify that a route works should eliminate this sort of problem.
Re: (Score:1)
It's not that simple: even if you had a simple way to "verify that the updated route works" ignoring the problem of incoming and outgoing routes taking different paths, it doesn't solve the fact that you would have to push out the BGP update to the whole internet to properly test it and only then would you truly know if you broke something. This is not something that can be done at the firmware level.
I mean from the perspective of the system that made the /21 announcements, from its perspective it probably
Re: (Score:2)
Standard practice is to have a prefix list filter on all of an ISP's BGP peers. For example: Company A peers with AT&T, and Company A has a /23 assigned to them via ARIN. AT&T has a filter accepting only that /23. If Company A advertises some other network block they don't own, it's simply rejected. The process of creating these filters differs from ISP to ISP. Some do it manually: you give your ISP a list of the networks you will be advertising, along with an LOA (letter of authorization), they can c
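A rough sketch of the per-peer prefix list described above, assuming a strict exact-match policy. The function names and the sample /23 (taken from the reserved benchmarking range 198.18.0.0/15) are made up for illustration; real routers express this in their own configuration language and often allow a bounded range of prefix lengths rather than exact matches only.

```python
import ipaddress


def make_prefix_filter(allowed_prefixes):
    """Build an accept() check from the prefixes a customer documented."""
    allowed = {ipaddress.ip_network(p) for p in allowed_prefixes}

    def accept(announced_prefix: str) -> bool:
        # Exact match only: anything not on the list is rejected,
        # including more-specific pieces of an allowed block.
        return ipaddress.ip_network(announced_prefix) in allowed

    return accept


def split_announcements(accept, announcements):
    """Partition a batch of announcements into accepted and rejected."""
    accepted = [p for p in announcements if accept(p)]
    rejected = [p for p in announcements if not accept(p)]
    return accepted, rejected


# Hypothetical customer session: "Company A" is authorized for a single /23.
company_a = make_prefix_filter(["198.18.0.0/23"])

accepted, rejected = split_announcements(
    company_a,
    ["198.18.0.0/23", "198.18.0.0/24", "192.0.2.0/24"],
)
print("accepted:", accepted)
print("rejected:", rejected)
```

With a filter like this on the customer session, both the unrelated block and the more-specific /24 carved out of the customer's own space are dropped at the edge and are never propagated to other peers.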
Re: (Score:2)
Re: (Score:2)
What address do you ping? Every one owned by Amazon and Facebook? What arbitrary number of active addresses needs to be present for it to be valid? So you require all computers to be available before you can set up internet routing infrastructure?
I'm keen to hear your other suggestions, are they all equally unworkable?
Where are the conspiracy theories? (Score:2)
So where is everyone blaming the US government for this?
Different case (Score:2)
When traffic was incorrectly routed through China a few weeks ago the theories accusing the Chinese government were everywhere. It was "impossible" that any ISP would accidentally accept incorrect routing information.
Difference - in this case, it was a small company publishing bad routing to start with, then a larger unrelated company (Verizon) accepting the routes.
In the case of the Chinese incident, it was the Chinese company publishing and accepting the routes from start to finish - and in this U.S. case
Re: (Score:2)