The Internet Businesses Networking

Blackout Shows Net's Fragility 287

It doesn't come easy wrote to mention a ZDNet article discussing a recent outage between Level 3 Communications and Cogent Communications. A business feud inadvertently highlighted the fragility of the Internet's skeleton. From the article: "In theory, this kind of blackout is precisely the kind of problem the Internet was designed to withstand. The complicated, interlocking nature of networks means that data traffic is supposed to be able to find an alternate route to its destination, even if a critical link is broken. In practice, obscure contract disputes between the big network companies can make all these redundancies moot. At issue is a type of network connection called 'peering.' Most of the biggest network companies, such as AT&T, Sprint and MCI, as well as companies including Cogent and Level 3, strike 'peering agreements' in which they agree to establish direct connections between their networks."
This discussion has been archived. No new comments can be posted.

  • by varmittang ( 849469 ) on Friday October 07, 2005 @10:31AM (#13739294)
    I think since Wednesday.
  • by anandsr ( 148302 ) on Friday October 07, 2005 @10:34AM (#13739321) Homepage
    Internet cannot route when your providers do not want you to communicate.
    Nothing can protect you in this case.
    If on the other hand there was a natural calamity and everyone was trying to get you access
    then you would get it. Like it happened during Katrina.
    This is not a natural calamity.

    The best option is to ditch your provider if they are not a monopoly and, if they are, to lobby your government to create multiple providers.
  • by lostlogic ( 831646 ) on Friday October 07, 2005 @10:36AM (#13739350) Homepage
    You would only notice if you are on one of these two networks. I am personally on UUNet at home and MCI at work, and my server is on SpringLink (via Schlund, who I am not familiar with). As a result, all of my traffic is completely unaffected. Customers on a single-homed connection through Cogent, or through L3 cannot see other single homed customers on the other network. The rest of us don't know the difference. The dumb thing that this article points out is that both Cogent and L3 are refusing to route packets destined for each other through the rest of the internet (probably for fear of fucking up other peering agreements by dumping too much traffic on their other peers). I believe there was a comment in the previous thread about this issue saying that traffic in one direction could be routed, but that even return packets were being null-routed at some point, preventing any type of connection from being established.
  • by elfguygmail.com ( 910009 ) on Friday October 07, 2005 @10:39AM (#13739375) Homepage
    It's very true, and anyone can see how a few big companies basically make the net work in North America. Simply do traceroutes to various big web sites, and you'll notice the packets always go across the same networks. The biggest one seems to be alter.net (MCI), with others including Level3, above.net, AT&T and UUnet. Basically you remove any of these and the North American part of the Internet would be in chaos. The problem is that most ISPs do the same thing. They pick a primary provider, and get a backup one. But they all pick the same few primary companies, and their backup links are much smaller pipes.
  • by Anonymous Coward on Friday October 07, 2005 @10:41AM (#13739396)
    If the traffic is about even both ways, the peering agreements are made with no cash exchange. If it's uneven, the network not bearing its fair share of traffic usually has to pony up some cash as part of the "peer" agreement. If things don't turn out as expected, the network carrying an unfair burden will usually back out or renegotiate the peer agreement. You usually don't find out what the actual network traffic is until you start peering.
  • by brunes69 ( 86786 ) <`gro.daetsriek' `ta' `todhsals'> on Friday October 07, 2005 @10:53AM (#13739500)
    Take, for instance, the connections running between Europe and America. I bet most of them run in almost exactly the same place on the sea bed because it's the cheapest / shortest path to take. A fairly localized geological disaster (at least in geological terms) could cut all the cables at once; or at least enough to make a difference.

    This isn't a good example, because in this case most traffic would automatically be re-routed to go through Asia and the trans-Pacific cables. And if those went down it would go over South America Oceana.

    It would get much slower, sure, but would not cause an outage.

    There is no *technical* reason this peering relationship breaking down should be causing an outage either. If they both also peered with some third party that could service them both, like MCI or something, then the traffic would still get through. The companies are just being bull-headed.

  • by Daniel Boisvert ( 143499 ) on Friday October 07, 2005 @10:55AM (#13739524)
    NANOG has been on fire with posts about this issue over the past few days. The following two from Leo Bicknell do a good job of explaining why this sort of thing would happen, why nobody in particular is The Bad Guy[tm], and why this issue has no relevance to the issue of internet resilience in the case of natural or manmade disaster:

    http://www.merit.edu/mail.archives/nanog/msg12302.html [merit.edu]
    http://www.merit.edu/mail.archives/nanog/msg12350.html [merit.edu]
  • by Cally ( 10873 ) on Friday October 07, 2005 @10:57AM (#13739550) Homepage
    Check the NANOG archive over the last few days for far, far more than you ever wanted to know about "The Art of Peering: The Peering Playbook"... or read the book yourself [xchangepoint.net].
  • by gskouby ( 61416 ) on Friday October 07, 2005 @10:58AM (#13739561)
    About 4 months ago I got a call from a sales critter at Cogent saying "We will knock 50% off of the price you are paying for your L3 connectivity if you drop them and come be our customer." I was kind of surprised at the boldness of this proposition because they were specifically targeting current L3 customers. I was even more surprised to find out from others that this sales pitch from Cogent was company wide. Of course this pissed off L3 and that was the start of this pissing contest.
  • by 99BottlesOfBeerInMyF ( 813746 ) on Friday October 07, 2005 @11:02AM (#13739585)

    Am I missing something here?

    I only read about this very briefly, but my understanding is it went beyond that. Just cutting the peering connection is fine and proper and packets then are rerouted through other peers, possibly costing more money, possibly not. Then the internet goes on as before and everyone is happy and the peers involved can negotiate a new link if they want and figure it will save them money by avoiding other routes where they have to pay for traffic.

    My understanding is that in this case they not only cut the link, but they advertised routes to their other peers for traffic from the first peer, which they then maliciously, and probably in breach of those other contracts, filtered out, resulting in failed traffic routing. Basically they intentionally lied (to the routers) and said sure we'll route those and then did not.

    I don't think this highlights the fragility of the internet, so much as the fact that end users usually rely upon a single peer (ISP) and if they can't trust them to not intentionally break traffic they had better find a new ISP.

  • by Feyr ( 449684 ) on Friday October 07, 2005 @11:11AM (#13739675) Journal
    that's wrong, no one is filtering them. not anymore than they normally would to maintain their network.

    what we are seeing here is a pissing contest between two "tier1". so there literally is no other route the packets can take to reach each other's network (contractually speaking, not technically). each of these networks has peering contracts with other companies, not transit. a peer is only used to reach the other's network, a transit lets you reach networks beyond the network you are transiting through.
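
The peer-versus-transit distinction above can be sketched as a toy reachability check. This is only an illustration under simplifying assumptions: the topology, the `OtherTier1` network and the customer names are hypothetical, and the rule "climb to providers, cross at most one peering link, then descend to customers" is a common textbook simplification of export policy, not a description of either company's actual configuration.

```python
from collections import deque

def make_peers(pairs):
    """Build a symmetric peering map from (a, b) pairs."""
    peers = {}
    for a, b in pairs:
        peers.setdefault(a, set()).add(b)
        peers.setdefault(b, set()).add(a)
    return peers

def reachable(src, dst, customers, peers):
    """True if dst can be reached from src when peers only carry
    traffic for their own customers (no transit across two peers)."""
    # Invert the customer map so we can look up who a network pays.
    providers = {}
    for provider, custs in customers.items():
        for cust in custs:
            providers.setdefault(cust, set()).add(provider)

    start = (src, "up")  # "up" = still climbing toward providers
    seen = {start}
    queue = deque([start])
    while queue:
        net, phase = queue.popleft()
        if net == dst:
            return True
        # Handing traffic down to a paying customer is always allowed.
        steps = [(c, "down") for c in customers.get(net, ())]
        if phase == "up":
            # Climb to a provider, or cross exactly one peering link.
            steps += [(p, "up") for p in providers.get(net, ())]
            steps += [(p, "down") for p in peers.get(net, ())]
        for step in steps:
            if step not in seen:
                seen.add(step)
                queue.append(step)
    return False

# Hypothetical toy topology: three transit-free networks, one customer each.
CUSTOMERS = {
    "Level3": {"l3_customer"},
    "Cogent": {"cogent_customer"},
    "OtherTier1": {"other_customer"},
}
FULL_MESH = make_peers([("Level3", "Cogent"), ("Level3", "OtherTier1"),
                        ("Cogent", "OtherTier1")])
DEPEERED = make_peers([("Level3", "OtherTier1"), ("Cogent", "OtherTier1")])

print(reachable("l3_customer", "cogent_customer", CUSTOMERS, FULL_MESH))
print(reachable("l3_customer", "cogent_customer", CUSTOMERS, DEPEERED))
```

In this model the third network still peers with both sides, yet it never carries traffic between them, because neither side is its customer. Adding either side to the third network's customer set (that is, buying transit) restores the path.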

  • Re:A New Approach (Score:5, Informative)

    by BeBoxer ( 14448 ) on Friday October 07, 2005 @11:11AM (#13739678)
    I know there's been talk of wireless mesh networks where everybody is both an end point and a router. This would work in populated areas but I'm not sure how well it would work for "long haul" connections which is what the issue is here.

    If by "work in populated areas" you mean "slow the network to a crawl" then yes, it would work. Mesh networking is cool stuff, but you aren't going to build a backbone out of it. Wireless is really fast compared to your DSL line or cable modem. But it isn't even in the same ballpark as what you can do on fiber. Backbone links are running at 10Gbps or even 40Gbps. Full duplex, so that is 20Gbps or 80Gbps of "marketing bandwidth". Compared to what, 22Mbps or 54Mbps half-duplex for your wireless? You aren't going to build a comparable backbone out of wireless links running at roughly 1/1000th of the speed. Physics pretty much guarantees that fiber links will always be faster than wireless.
  • by Mulligan ( 29951 ) on Friday October 07, 2005 @11:31AM (#13739870)

    At the fringes there are really two types of internet service offered: upstream and downstream. Most consumers (individuals) need a lot of downstream and very little upstream. They typically are sold asymmetric service that is heavily biased in this direction. My cable connection, for example, gives me ~5Mbps down and 768kbps up. On the flip side are the content providers who typically need a lot of upstream bandwidth and less downstream bandwidth. ISPs have found that these customers are willing/able to pay quite a bit more for their internet connections. Therefore, the law of supply and demand has increased the cost of connections with higher upstream capacity.

    Several levels up the ISP hierarchy, however, there are mostly only symmetric lines (T3, OCx, ...) providing equal upstream and downstream bandwidth. In order to maximize the use of this bandwidth, many providers try to balance the number of content providers with content consumers in order to use the upstream and downstream capacity equally. In theory, this usage should be well balanced by the time it reaches the Tier 1 providers [keynote.com].

    The problem we are having right now is caused by Cogent not subscribing to that business model. They have found that the cost to support content consumers is much higher than the cost to support providers. (If for no other reason than there are far more of them.) So, their business model skews heavily towards the provider customers, reducing their operational costs. This, in turn, means that they are able to offer lower costs to those content providers -- in many cases undercutting the other big service providers such as Level 3.

    This, of course, makes the other providers unhappy because it cuts into their high-yield business. So, occasionally, one of them demands compensation for "transit" instead of providing free peering. They do this because they feel (rightly IMO) that Cogent is able to make more money on these high paying content providers by using an asset owned by the other service providers -- the online customer/consumer base. Basically, Level 3 is telling Cogent that because Cogent is making money by using that virtual asset owned by Level 3, Cogent owes Level 3 some sort of compensation. It is worth noting that several other Tier 1 providers already take this approach with Cogent and Cogent is forced to pay for "transit" service to those providers' customers.

    As long as all the Tier 1 providers cooperate, the system works reasonably well. However, in this case, Cogent is trying to take advantage of that informal cooperation to make some extra money. So, they are being capitalists. In this case, capitalism is at odds with cooperation and the system is not working well.

    Many people are calling for government regulation to prevent this sort of situation. I expect this to cause some major problems. The issue could be resolved if all the Tier 1 providers would realize that there is a different market value for ingress and egress traffic. In a free market, I expect that the ingress traffic (corresponding to upstream traffic of content providers from the lower levels) would have substantially more value than the egress traffic (downstream traffic to consumers). The egress traffic might even have negative value (meaning that a service provider would charge to take care of it). In the case that two peers balance their traffic well (the ideal cooperative solution) no money needs to change hands. In the other cases (like this one) the ISP with excess egress usage should probably be charging the one with excess ingress.

    Unfortunately, there is no fluidity to the system between the true market (the upstream and downstream bandwidth consumers) and the core market (the Tier 1 providers). If there were, Level 3 could justify their demand for more money based on the value of the traffic they were accepting from lower down the food chain.
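
The balance argument above comes down to simple arithmetic, sketched here. The function name and the 2:1 threshold are assumptions made for illustration; the thread does not say what ratio any real peering contract uses, and actual agreements are negotiated case by case.

```python
def peering_settlement(a_to_b_gbps, b_to_a_gbps, max_ratio=2.0):
    """Return (ratio, payer) for two networks 'a' and 'b'.

    ratio is heavier direction / lighter direction. payer is None when the
    exchange is balanced enough for settlement-free peering, otherwise the
    network sending more traffic than it receives, following the argument
    in the comment above. max_ratio is a hypothetical threshold.
    """
    if a_to_b_gbps <= 0 or b_to_a_gbps <= 0:
        raise ValueError("traffic volumes must be positive")
    heavier = max(a_to_b_gbps, b_to_a_gbps)
    lighter = min(a_to_b_gbps, b_to_a_gbps)
    ratio = heavier / lighter
    if ratio <= max_ratio:
        return ratio, None
    payer = "a" if a_to_b_gbps > b_to_a_gbps else "b"
    return ratio, payer

# A content-heavy network 'a' pushing 30 units for every 10 it receives.
print(peering_settlement(30.0, 10.0))
```

With balanced traffic the function reports no payer; with the lopsided numbers in the usage line it flags network `a`, the heavy sender.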

  • by Spazmania ( 174582 ) on Friday October 07, 2005 @11:39AM (#13739953) Homepage
    If either Level3 or Cogent was buying a "default" service from a third party, their customers wouldn't have a problem. The moment the peering connection was cut the lower-priority BGP routes from the third party would have taken over and their traffic would have gone through the third-party link.

    The reason these two jokers are having this problem is that they made a business decision to only move traffic with reciprocal peering and then failed to keep that peering alive. That's because they're both cheap-ass bastards; peering costs a heck of a lot less than buying transit.

    Go buy from someone else who isn't a cheap-ass. Someone who buys transit for anything they can't peer. You won't have a problem.

    The only lesson here is that most time honored of lessons: you get what you pay for.
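
The failover described above, where a lower-priority route from a third party takes over once the peering session goes away, can be sketched roughly as follows. This is a deliberately simplified model: it compares only a local-preference number (higher wins, as with the BGP LOCAL_PREF attribute) and ignores every other step of real BGP path selection. The neighbor names and preference values are hypothetical, and the prefix is a documentation range.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    local_pref: int  # higher is preferred

class RoutingTable:
    def __init__(self):
        self._routes = {}  # prefix -> set of candidate routes

    def announce(self, route):
        self._routes.setdefault(route.prefix, set()).add(route)

    def withdraw_neighbor(self, next_hop):
        """Drop every route learned from a neighbor, e.g. when a session is cut."""
        for prefix in self._routes:
            self._routes[prefix] = {
                r for r in self._routes[prefix] if r.next_hop != next_hop
            }

    def best(self, prefix):
        """Return the preferred route for an exact prefix, or None."""
        candidates = self._routes.get(prefix)
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.local_pref)

table = RoutingTable()
# Preferred: the free peering link. Backup: paid transit from a third party.
table.announce(Route("192.0.2.0/24", "peer_link", local_pref=200))
table.announce(Route("192.0.2.0/24", "paid_transit", local_pref=100))

print(table.best("192.0.2.0/24").next_hop)  # peering link while it is up
table.withdraw_neighbor("peer_link")
print(table.best("192.0.2.0/24").next_hop)  # falls back to paid transit
```

If the `paid_transit` route is never announced in the first place, `best` returns `None` after the withdrawal, which is the situation the comment attributes to networks that rely on peering alone.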
  • by cr0sh ( 43134 ) on Friday October 07, 2005 @11:45AM (#13740007) Homepage
    I am late to this thread, and it has probably been said anyhow, but I want to reiterate:

    The problem isn't solely with the business arrangements between the "big providers" - oh, certainly, that does have impact, but the internet would be as robust as ever, if every participant on it could be a peer.

    This is how the network was meant to be, a mesh comprised of stupid interconnects and smart nodes. Every node on the internet, from the largest colo to the smallest wireless handheld, should have the ability to be a true peer on the internet. In practice, this isn't really possible, but imagine a mesh network with a distributed p2p DNS system which many people could run if they wanted to - if only a fraction were running it, and were distributed enough, such outages might not occur (the traffic could continue to be routed, albeit at a slower pace).

    Everyone should be able to be a peer on the network, everyone should be able to get at least one static IP, everyone should be able to run their own server(s) if they want to. Right now, the only way you can do it is by paying huge amounts of $$$ so you can get a garden hose instead of a straw. I am not saying access to the internet should be or could be free, but peering should be a natural right of being a part of the internet, not something you have to pay extra (a LOT extra) for.

  • Roadrunner affected (Score:2, Informative)

    by Jeff85 ( 710722 ) on Friday October 07, 2005 @11:45AM (#13740008) Homepage
    I had a friend on Roadrunner who complained he couldn't connect to many sites. I think he happened to know that they used Level 3. Is there a way to determine what backbone your ISP or a particular site uses?
  • by frost22 ( 115958 ) on Friday October 07, 2005 @12:27PM (#13740393) Homepage
    *Sigh*. Why do you spew nonsense if you actually have not even found out what a clue looks like, not to mention ever acquired one?

    So you claim there are no Internet Exchange Points [wikipedia.org] ?

    pray tell, what is this thing [mae.net] ? Or that one, not to mention the middle one [mae.net].
    Oh, and what do you think those Guys do [switchanddata.com] for a living ?

    Nobody expects you to be a fucking genius or know everything. But why are some folks constantly touting stupid nonsense instead of keeping their mouths shut and learning something ?
  • by Alioth ( 221270 ) <no@spam> on Friday October 07, 2005 @12:28PM (#13740409) Journal
    Cogent COULD route around the damage - if they wanted to, but they don't.

    If the peering point had been taken out by a bomb, the re-routing would have been performed in fairly short order. However, this is not the case here.

    Level3 think that Cogent is taking the piss and is not a real peer. Level3 want Cogent to buy transit to reach Level3, either directly from them (or from someone else) because at the moment the peering is very lopsided, and costing Level3 a bucketload of money and giving Cogent a boatload of free bandwidth.

    Cogent on the other hand doesn't want to pay for transit to Level3.

    Right now, Cogent could route all their traffic for Level3 over transit they pay for. They don't want to do that because it won't force Level3 back into the peering agreement. So what they do is leave the link severed and do not re-route so that Level3 customers cannot get to sites hosted by Cogent. This means Level3 customers will grumble at Level3. Additionally, they offer a year's free transit to single homed Level3 customers just to raise the brinkmanship with Level3 a notch higher. Basically it's war between L3 and Cogent.

    If Cogent re-routes their traffic, they are defeated and L3 will never re-peer. What Cogent are hoping is that enough angry customers on the L3 end will whine at L3 so L3 will be forced to re-peer.

    For the rest of us in the peanut gallery (i.e. those of us who aren't single homed customers of Cogent or Level3) we can just watch the fun and games and throw peanut shells at the squabbling combatants because we don't see any black hole at all.
  • by Secrity ( 742221 ) on Friday October 07, 2005 @12:42PM (#13740527)
    Is there a way to determine what backbone your ISP or a particular site uses?

    The Unix traceroute command can be used to do this:

    $ traceroute slashdot.org
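
Once you have the traceroute output, the hop hostnames usually hint at whose network each hop belongs to. Here is a rough sketch that pulls the trailing domain out of each hop. It assumes the common Linux-style layout of `hop  hostname (ip)  times`, takes the last two labels of the hostname as the "carrier" (a naive guess that fails for names like `example.co.uk`), and uses made-up hostnames under the reserved `.example` TLD.

```python
import re

# Matches lines such as " 2  core1.carrier-a.example (198.51.100.1)  9.8 ms"
HOP_RE = re.compile(r"^\s*\d+\s+([A-Za-z0-9.-]+)\s+\(([\d.]+)\)")

def carriers_on_path(traceroute_output):
    """Return the distinct trailing domains seen along the path, in order."""
    seen = []
    for line in traceroute_output.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header line, or a hop that timed out ("* * *")
        labels = match.group(1).split(".")
        if len(labels) < 2 or labels[-1].isdigit():
            continue  # no reverse DNS: the hop is shown as a bare IP
        domain = ".".join(labels[-2:]).lower()
        if domain not in seen:
            seen.append(domain)
    return seen

# Hypothetical sample output, not a real trace.
SAMPLE = """traceroute to www.example (192.0.2.10), 30 hops max, 60 byte packets
 1  gw.home.example (192.0.2.1)  1.2 ms  1.1 ms  1.0 ms
 2  core1.carrier-a.example (198.51.100.1)  9.8 ms  9.7 ms  9.9 ms
 3  * * *
 4  198.51.100.9 (198.51.100.9)  20.1 ms  20.0 ms  19.8 ms
 5  core7.carrier-a.example (198.51.100.17)  21.0 ms  21.2 ms  20.9 ms
 6  edge2.carrier-b.example (203.0.113.5)  30.4 ms  30.1 ms  30.3 ms
"""

print(carriers_on_path(SAMPLE))
```

Hops without reverse DNS are skipped, so treat the result as a hint rather than a definitive answer about which backbone is in use.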
  • by LurkerXXX ( 667952 ) on Friday October 07, 2005 @01:12PM (#13740768)
    I have friends working for other Level3 clients and peers. The packets are getting to Level3. Then they disappear. The routes ARE advertised to them. That's the beauty of good internet routing, it heals around wounds.

    FYI, smaller ISPs pay larger ISPs for bandwidth all the time. The larger ISPs have huge costs. Switches costing hundreds of thousands of dollars, filled with a bunch of cards in it that each cost hundreds of thousands. Lots of them. Lots of fiber and other costs. It gets real easy to have billions invested just in hardware. They offset those costs in part by selling bandwidth to smaller ISPs. That's the way the net works.

    Try telling some small ISP that they should stop paying their upstream provider. That the upstream provider should give them bandwidth free so that the larger ISPs customers can access websites hosted by the smaller ISP. They will tell you you are living in a dream world. That's not the way the net works.

  • by beebware ( 149208 ) on Friday October 07, 2005 @02:44PM (#13741565) Homepage
    Oh, for the record, the BBC has a brilliant network in the UK and the US - http://support.bbc.co.uk/support/network/ [bbc.co.uk] . I believe, although I haven't even attempted to confirm it, that they have peering agreements with most of the major UK ISPs.
  • by drakaan ( 688386 ) on Friday October 07, 2005 @02:45PM (#13741572) Homepage Journal
    That doesn't make any sense. It's not as if there's no other route at all between the two networks. Routing protocols and ICMP unreachables exist to allow traffic to route around trouble like this. Unless the link was deliberately broken and packets unceremoniously dropped, the source for a given connection attempt would see its packet routed in what appeared to be an excessive manner, but it'd still get from point A to point B.

    If Cogent users can get to Qwest and L3 users can get to Qwest, but Cogent users can't talk to L3 users, then Cogent and L3 are doing something intentionally bad and screwing everyone on the internet.

  • by Anonymous Coward on Friday October 07, 2005 @04:34PM (#13742494)
    the problem has been solved. I can ping level3 from cogent and i have one connection. I don't know yet who flinched first......
  • Fixed now? (Score:4, Informative)

    by dereference ( 875531 ) on Friday October 07, 2005 @05:17PM (#13742788)
    The availability grid for the past 4 hours shows ~40% and the grid for the past 1 hour shows 100%. As noted by "Cally" below, I honestly have no idea how exactly this grid has been generated (hence my original disclaimer) but this certainly seems to indicate, from a practical standpoint, that the L3/Cogent issue has been very recently resolved. Indeed, from my (single-homed) L3 server I can now traceroute directly to a (single-homed) Cogent host.
  • by billstewart ( 78916 ) on Friday October 07, 2005 @08:57PM (#13744079) Journal
    There are two basic ways that networks connect to each other - peering and transit. In a transit arrangement, one network (typically the big one) agrees to deliver any traffic the other network hands it, in return for a bunch of money, and it typically either advertises a default route (telling a small customer that they can send it all their packets) or a bunch of detailed routes and a default (telling a dual-homed medium-large customer how good its connections are to lots of places, but that customer might use another carrier for destinations that are closer with that carrier.) If you're an end customer, or a small ISP buying service from a big ISP, that's usually what you buy.

    Peering arrangements are different. Two networks that have a lot of traffic for each other will set up direct connections, split the direct costs of the connections, and not charge for accepting packets from the other carrier. But they'll only advertise the routes for their *own* customers. If two small ISPs peer with each other, typically they're each also buying transit service from big ISPs, but it's cheaper for them to dedicate a connection or put bits on a public peering point like MAE-West than to both pay their upstream ISPs.

    The biggest ISPs in the US are called "Tier 1" ISPs, and they all peer with each other rather than buying transit, though they might buy transit for international connections, if they can't get the other side to buy transit from them. It seems flaky, but it makes business sense, or at least it did for a while. In some sense, being big enough that all the other Tier 1s will peer with you is what defines Tier 1, and aside from technical issues, it's a marketing thing - "See, we're one of the big players!" Peering and Transit don't mix very well - you either connect to a given carrier by peering, or by transit, or else you spend a long time hammering out custom arrangements about exactly which routes you'll accept and tweaking routing tables.

    Cogent is a Wannabe-Tier-1. Their main business model is to put fiber into big multi-tenant office buildings and sell everybody 100-meg Ethernet for about the price other carriers charge for one or two T1s. If I were a customer, I wouldn't expect there to be enough upstream to really get that much bandwidth all the time, but I'd expect to get more than a T1 all the time, and a lot more than a T1 almost all the time. Level 3 has apparently decided they're not getting enough value out of the relationship (i.e. not sending Cogent enough packets to make it worth their while) to keep peering, and wants Cogent to either pay them for service or get transit from somebody else. They gave them about 50 days to make other arrangements, but Cogent decided to play chicken with them.
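
The difference between a transit feed (a default route, possibly plus detailed routes) and a peering feed (only the peer's own customer routes) shows up directly in the forwarding lookup. Below is a minimal longest-prefix-match sketch using Python's standard `ipaddress` module. The carrier names and the table contents are hypothetical, and the prefixes are documentation ranges.

```python
import ipaddress

def pick_route(dest, table):
    """Return the next hop whose prefix most specifically covers dest, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, via) for net, via in table if addr in net]
    if not matches:
        return None  # no covering route: the packet is dropped
    # Longest prefix wins, so a detailed route beats the default route.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# Dual-homed customer: default via carrier_a, a detailed route via carrier_b.
TRANSIT_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "carrier_a"),
    (ipaddress.ip_network("198.51.100.0/24"), "carrier_b"),
]

# Peering only: the peer advertises just its own customers, no default.
PEERING_ONLY_TABLE = [
    (ipaddress.ip_network("198.51.100.0/24"), "peer"),
]

print(pick_route("198.51.100.7", TRANSIT_TABLE))      # detailed route wins
print(pick_route("203.0.113.9", TRANSIT_TABLE))       # falls to the default
print(pick_route("203.0.113.9", PEERING_ONLY_TABLE))  # nothing covers it
```

The last lookup is the interesting one: with only peering routes and no default from a transit provider, any destination that no peer advertises has nowhere to go.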
