Bufferbloat: Dark Buffers In the Internet
Expanding on earlier work from Jim Gettys of Bell Labs with a new article in the ACM Queue, CowboyRobot writes that Gettys "makes the case that the Internet is in danger of collapse due to 'bufferbloat,' 'the existence of excessively large and frequently full buffers inside the network.' Part of the blame is due to overbuffering; in an effort to protect ourselves we make things worse. But the problem runs deeper than that. Gettys' solution is AQM (active queue management) which is not deployed as widely as it should be. 'We are flying on an Internet airplane in which we are constantly swapping the wings, the engines, and the fuselage, with most of the cockpit instruments removed but only a few new instruments reinstalled. It crashed before; will it crash again?'"
Re:Is this a problem? (Score:4, Insightful)
What he suggests amounts to actively choosing between those two conditions: if your average demand falls below your link speed, a larger buffer will help smooth the load over time.
That's a pretty simplified way of putting it, but basically correct. Major equipment vendors have been slow to adopt more advanced queuing strategies (Stochastic Fair Queuing integrated with some of the more advanced flavors of early discard). Fortunately, we've budgeted for a shaper and are piloting one for purchase soon, and this time around we have a chance to get something both well supported and cutting edge.
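For anyone wondering what "early discard" looks like in practice, here is a minimal sketch of the classic Random Early Detection (RED) drop curve. It is only an illustration of the general idea, not what any particular vendor or shaper implements. The function names and the threshold values in the example are made up for this post.

```python
def update_average(avg_queue, current_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length.

    RED reacts to the smoothed queue length rather than the instantaneous
    one, so short bursts do not trigger drops. The weight is illustrative.
    """
    return (1.0 - weight) * avg_queue + weight * current_queue


def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Classic RED curve.

    Below min_th nothing is dropped, between min_th and max_th the drop
    probability rises linearly up to max_p, and at or above max_th every
    arriving packet is dropped.
    """
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


if __name__ == "__main__":
    # Hypothetical thresholds, measured in packets.
    for avg in (5, 20, 30):
        print(avg, red_drop_probability(avg, min_th=10, max_th=30, max_p=0.1))
```

The point of the curve is that senders start seeing a few losses while the queue is still short, so they back off before the buffer is full.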
Personally I pine for ATM's ABR CoS with its fast end-to-end congestion notification, but as history has shown us, the inevitable fate of the tech world is for the inferior to be gradually, painfully, and kludgingly adapted to become the same thing as the technologies it displaced through lowballing. In this case, the inferior thing is IP/Ethernet.
Re:Cringely again... (Score:5, Insightful)
Buffer and cache are not the same thing. Packets are written to a buffer once and read from it once. Caches are useless if, on average, blocks aren't read from them more than they are written to them. So treating them as analogous is highly misleading.
The deal with throughput is that you can only win by storing packets if there is going to be room to send them without delay. If you buffer every packet that's sent, it does get delivered, but by the time it gets to its destination, it's too late. You can adjust the TCP algorithms to behave somewhat less badly in this situation, but what you can't do is get genuine flow control with big buffers, because the endpoints have no way to determine the throughput of the network.
The only way the endpoints can determine the throughput of the network is if packets get dropped when there's congestion. When packets don't get dropped, what you see is that whenever there is more traffic to send over the link than the link can hold, it just winds up in a buffer. Latency rises. Eventually all the senders give up. Then the buffers start to drain, and packets get delivered. Then the acks start coming back. Now the endpoints think they are on a high latency link, so they crank back up again and fill the buffers again.
So what you see is a network that works great as long as the total load presented to the network is less than the aggregate capacity of the network. As soon as the demand for bandwidth exceeds the supply, every single stream starts to stall. If you've stayed at a hotel recently, you've seen this: a dozen people try to watch video streams over a fairly wimpy connection, and then you can't do _anything_ over the connection, because the buffer fills up.
If you didn't have that giant buffer, all the endpoints would be able to tell that the link was congested, and would slow down. If the total available bandwidth wasn't enough, the video streams would basically fail, but you could still get mail and surf the web. But with bufferbloat, not only can't you watch video streams, you also can't surf the web or get email or ssh to your server.
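To make the difference concrete, here is a toy drop-tail FIFO model. It is a rough sketch under big simplifying assumptions: fixed-size packets, discrete ticks, and senders that never back off. The function name and all the numbers are invented for illustration.

```python
def simulate_fifo(arrivals, service_per_tick, buffer_limit):
    """Simulate a drop-tail FIFO queue.

    arrivals: packets offered in each tick.
    service_per_tick: packets the link can send per tick.
    buffer_limit: maximum packets the queue may hold.
    Returns (max_delay_ticks, dropped_packets).
    """
    queue = 0
    dropped = 0
    max_delay = 0.0
    for arriving in arrivals:
        # Accept what fits, drop the rest (drop-tail).
        accepted = min(arriving, buffer_limit - queue)
        dropped += arriving - accepted
        queue += accepted
        # Time the last accepted packet waits before it has been sent.
        max_delay = max(max_delay, queue / service_per_tick)
        queue = max(0, queue - service_per_tick)
    return max_delay, dropped


if __name__ == "__main__":
    overload = [15] * 100  # sustained demand 50% above link capacity
    print("small buffer:", simulate_fifo(overload, 10, 20))
    print("huge buffer: ", simulate_fifo(overload, 10, 2000))
```

With the small buffer, drops start by the third tick and delay stays capped at two ticks, which is the signal the endpoints need. With the huge buffer nothing is dropped during the run, so nothing tells the senders to slow down, and the delay just keeps climbing.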
You can see this by pinging a server somewhere out on the internet. When the link isn't congested, you'll see reasonable round trip times, typically 100ms. Then when it gets congested, you'll see packets dropped, and you'll see the RTT rise to as much as a minute. Then as all the senders notice that their packets aren't being delivered, they back off and suddenly the RTT starts to drop again, and you start to hope the network's been fixed. But it's fool's gold: as soon as the senders notice, they bomb the buffer again, and the RTT goes back up. Rinse and repeat until you give up.
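The arithmetic behind those RTT spikes is simple: a full buffer adds its size divided by the link rate to every packet's delay. A quick sketch follows; the buffer sizes and link speeds are hypothetical numbers, not measurements from any real device.

```python
def full_buffer_delay_seconds(buffer_bytes, link_bits_per_second):
    """Queueing delay added by a full buffer draining at the link rate."""
    return buffer_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    # Hypothetical 1 MB buffer in front of a 1 Mbit/s uplink.
    print(full_buffer_delay_seconds(1_000_000, 1_000_000))
    # The same buffer in front of a 100 Mbit/s link.
    print(full_buffer_delay_seconds(1_000_000, 100_000_000))
```

The same buffer that is harmless on a fast link adds seconds of delay on a slow one, which is why a buffer sized for the best case hurts so much on a congested hotspot.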
You probably don't see this very often on your home link, because you probably aren't saturating it. But it happens a lot at Wi-Fi hotspots in particular, and also sometimes on 3G networks. It's quite disheartening, particularly when you're paying for the connection. You also see it on big ISPs like Comcast when you try to reach content providers that aren't willing to pay the ransom to Comcast to get on their uncongested link.
A lot more to interoperate with (Score:5, Insightful)
Re:SPAM (Score:4, Insightful)
And just maybe some of us are interested in how research has progressed since the last article...