Bufferbloat: Dark Buffers In the Internet

CowboyRobot writes that Jim Gettys of Bell Labs, expanding on his earlier work in a new ACM Queue article, "makes the case that the Internet is in danger of collapse due to 'bufferbloat,' 'the existence of excessively large and frequently full buffers inside the network.' Part of the blame is due to overbuffering; in an effort to protect ourselves, we make things worse. But the problem runs deeper than that. Gettys' solution is AQM (active queue management), which is not deployed as widely as it should be. 'We are flying on an Internet airplane in which we are constantly swapping the wings, the engines, and the fuselage, with most of the cockpit instruments removed but only a few new instruments reinstalled. It crashed before; will it crash again?'"
  • Cringely again... (Score:5, Informative)

    by beetle496 ( 677137 ) on Friday December 02, 2011 @10:34PM (#38246890) Homepage

    Cringely has been writing about this all year. He cites Jim Gettys too. See: http://www.cringely.com/tag/bufferbloat/ [cringely.com]

  • by skids ( 119237 ) on Friday December 02, 2011 @10:39PM (#38246926) Homepage

    Seems so, but isn't. For TCP traffic, a shallow buffer that drops traffic will result in more goodput than a deep buffer. Which is the point.

  • by CyprusBlue113 ( 1294000 ) on Friday December 02, 2011 @10:43PM (#38246938)

    That is actually the exact problem. You do not want buffers larger than the flight time of your circuit. You absolutely want the buffers to fill and drop packets otherwise.

  • by pla ( 258480 ) on Friday December 02, 2011 @11:00PM (#38247022) Journal
    Seems so, but isn't. For TCP traffic, a shallow buffer that drops traffic will result in more goodput than a deep buffer. Which is the point.

    Yes and no...

    If you don't (or only rarely) fill your buffer, a smaller buffer introduces less latency than a large one, while still allowing you to maximize throughput. If, however, you usually have your buffer full, you increase latency for literally no benefit, since you've already maximized throughput simply through resource demand.

    The former will occur when your average load falls below your actual bandwidth, and allows you to get the most out of your link. The latter occurs when you consistently exceed your bandwidth, in which situation you may as well not even have a buffer, because it only increases latency without increasing throughput. That describes TFA's real point.

    What he suggests amounts to actively choosing between those two conditions - if your average demand falls below your link speed, a larger buffer will help smooth the load over time. If, however, your average demand exceeds your link speed, throw away the buffer, because it doesn't help.

    But as per the GP's point - if you have an always-full buffer, you literally gain nothing but latency.
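
    A toy FIFO simulation makes that concrete. This is a minimal sketch with made-up rates (one packet served per tick, two offered), not anything from TFA: once demand exceeds the link speed, both a small and a large buffer saturate the link, and only the queueing delay differs.

        def run(buffer_cap, ticks=5000, arrivals_per_tick=2):
            queue = []                # FIFO of packet arrival times
            sent, total_delay = 0, 0
            for t in range(ticks):
                for _ in range(arrivals_per_tick):
                    if len(queue) < buffer_cap:
                        queue.append(t)               # enqueue; otherwise tail-drop
                if queue:
                    total_delay += t - queue.pop(0)   # serve one packet per tick
                    sent += 1
            return sent / ticks, total_delay / max(sent, 1)

        for cap in (16, 1024):
            utilization, delay = run(cap)
            print(f"buffer={cap:5d}  utilization={utilization:.2f}  mean delay={delay:.1f} ticks")

    Both runs print utilization 1.00; only the mean delay changes, and it tracks the buffer size.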
  • by CyprusBlue113 ( 1294000 ) on Friday December 02, 2011 @11:08PM (#38247046)

    The problem with buffers is that almost all of the time they are configured by size in bits. They need to be sized to the bit flight time of the circuit, which is the delay times the throughput in bits per second (the bandwidth-delay product). A mismatch between those values is a problem in *either* direction, and especially so when an oversized buffer pushes delay past the retransmit threshold.

    Buffers should be dynamically sized based on the flight time of data on the specific link, and ideally kept updated. WRED is also highly recommended.

    What really exacerbates the issue is devices whose buffers must be the same size for all links on X (be it card, slot, or chassis).
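
    The sizing rule is just arithmetic. A back-of-the-envelope sketch, with hypothetical link figures chosen only for illustration:

        def bdp_bytes(throughput_bps, rtt_ms):
            """Bandwidth-delay product: bits in flight = rate * delay."""
            return throughput_bps * (rtt_ms / 1000.0) / 8

        # e.g. a 10 Mbit/s link with a 40 ms round-trip time
        print(bdp_bytes(10_000_000, 40))   # 50000.0 bytes, i.e. ~49 KB of buffer

    Anything much beyond that figure is queueing that the circuit cannot drain within one round trip.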

  • by icebike ( 68054 ) on Friday December 02, 2011 @11:22PM (#38247084)

    Seems so, but isn't. For TCP traffic, a shallow buffer that drops traffic will result in more goodput than a deep buffer. Which is the point.

    Exactly.

    Explicit Congestion Notification along with ONLY a minimal amount of client-side buffering is really all you need.
    Deep buffers just make it worse for everyone.

    Oh, and just as a car analogy is inappropriate for describing TCP traffic, the airplane analogy is worse.

  • Re:Alarmism? (Score:5, Informative)

    by Anonymous Coward on Friday December 02, 2011 @11:25PM (#38247100)

    Except it is alarmist. The current situation isn't optimal, but being suboptimal and having a critical issue are two different things. The crux of the problem is basically: "Long delays from bufferbloat are frequently attributed incorrectly to network congestion, and this misinterpretation of the problem leads to the wrong solutions being proposed." That means administrators *might* mistake large-buffer slowdowns for other causes of network congestion. Ideally it should be dealt with better, but it's hardly a collapse of the network.

    A network buffer acts as just that: a buffer to smooth out traffic spikes, at the cost of latency. If a buffer is large AND consistently full, the link is already fully utilized, so the large buffer isn't needed; it just adds latency on top of waiting for the link to clear, for no benefit (the extra latency *may* confuse administrators, which is basically the "danger"). On the other hand, if the link is underutilized the majority of the time, a large buffer is beneficial for absorbing spike traffic. The majority of networks are the latter and hence designed as such. Two solutions: get faster links, or deal with it more intelligently.

  • Re:Cringely again... (Score:4, Informative)

    by Idbar ( 1034346 ) on Saturday December 03, 2011 @12:34AM (#38247376)
    Basic, and worth reading, is Raj Jain's 1992 paper [wustl.edu].
  • by Twinbee ( 767046 ) on Saturday December 03, 2011 @02:11AM (#38247844)
    I thought this animation by Richard Scheffenegger was a good way to show what's happening: http://www.skytopia.com/project/articles/lag/nam00000.avi [skytopia.com] Here's a description of the video:

    The bad Bufferbloat setup is on the left (yellow dots), and the 'good' setup (i.e. how things used to be configured about 10-20 years ago when RAM was more expensive!) is on the right (cyan/blue dots).

    Both sides start off okay, but notice how the left side 'queues' (tall yellow dot columns) keep on growing over time, while the right side blue columns stop short because of the small buffer size. As they stop short, some data 'packets' must be dropped, and this gets reported back to the upload site that it's shoving data to the user too fast. As a result, the upload site temporarily slows the sending of data, and thus the system self-corrects.

    Meanwhile, on the left side, these packets of data never get dropped, so the giant bloated yellow buffers get filled more and more, but the computer at the upload site doesn't realise the carnage of these giant queues further down the line, and instead thinks "All is okay, let's keep sending data fast!".

    Finally, when a smaller piece of data needs to be sent to the user (see 2:30+, signified by red dots on the left and dark blue dots on the right), the left side shows the red dots (which could be, say, a small email) wading through giant queues to reach their destination, really slowly. Furthermore, these tiny bits of data often need special 'emergency' treatment, as they hold up other, larger data associated with them. On the good right side, the dark blue dots face no such giant queues.

  • by skids ( 119237 ) on Saturday December 03, 2011 @02:20AM (#38247888) Homepage

    That analogy doesn't quite do the trick. TCP windowing is a bit more sophisticated than that. You can think of it maybe as a commander sending couriers out to support a mobile squad through hostile territory. If too many of them never make it to the squad, or back, he sends them less frequently so they can sneak through more discreetly. If the couriers make it through, then he sends them faster, because the more ammo he can get through the better. But he also has to decide how many men to put on courier duty. If the couriers take too long, the squad has obviously moved further away from the base, and if he waited for the next one to return, he wouldn't be sending enough ammo. If the couriers return quickly, he can make do with fewer couriers.

    Big buffers are like a flimsy rope bridge in the couriers' path that takes a long time to cross. Couriers have to wait on one side because only one can cross at a time, and the large group waiting at the side of the cliff is more likely to get attacked. Until they do get attacked, however, the commander starts to think the squad has moved very far away, so he puts more couriers on duty. And since he thinks the squad is far away, he is not expecting them to return for a while, so it takes him longer to realize that they are starting to go missing entirely.

    One of the best solutions to this problem turns out to be for some of the couriers to randomly go AWOL, and for more of them to go AWOL the bigger the crowd at the rope bridge gets. This basic concept is called Random Early Discard, and a lot of ways have been invented for deciding who goes AWOL and why. If some of the couriers go AWOL, the commander thinks they are being attacked, so he slows down and also takes some troops off courier duty.
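
    Stripped of the analogy, the mechanism is simple. Here is a minimal sketch of the Random Early Discard idea; the thresholds and drop probability are illustrative placeholders, not tuned or standardized values:

        import random

        class RedQueue:
            """Drop probability ramps from 0 to max_p as the smoothed
            queue depth climbs from min_th to max_th."""

            def __init__(self, min_th=20, max_th=80, max_p=0.1, weight=0.02):
                self.queue = []
                self.avg = 0.0          # EWMA of queue depth
                self.min_th, self.max_th = min_th, max_th
                self.max_p, self.weight = max_p, weight

            def enqueue(self, pkt):
                # keep a smoothed estimate of how full the queue is
                self.avg = (1 - self.weight) * self.avg + self.weight * len(self.queue)
                if self.avg < self.min_th:
                    self.queue.append(pkt)      # light load: always accept
                    return True
                if self.avg >= self.max_th:
                    return False                # heavy load: always drop
                # in between: drop with probability proportional to queue depth
                p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
                if random.random() < p:
                    return False                # the random early "AWOL" drop
                self.queue.append(pkt)
                return True

    The early, randomized drops are what signal TCP senders to back off before the queue (and the latency) gets anywhere near its maximum.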

  • by pla ( 258480 ) on Saturday December 03, 2011 @09:20AM (#38249248) Journal
    You know, like, measuring things? Where does the problem happen? Under what circumstances?

    You mean, like figure-2 or even better, figure-5, in TFA? Where the (most common) 2^n buffer sizes stand out so obviously in the data that you'd need to try not to notice the trend?

    Of course, this situation doesn't actually require much "real" data to prove. If each 1500-byte packet takes 10ms to transmit, and you have a full 256KB buffer - which will unavoidably happen any time you try to sustain a transmit faster than your link can handle - you will have 1.7 seconds of latency in a FIFO queue.
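
    Working those same numbers through (nothing here beyond the figures already given):

        buffer_bytes = 256 * 1024    # full 256 KB FIFO
        packet_bytes = 1500
        ms_per_packet = 10           # i.e. roughly a 1.2 Mbit/s link

        packets_queued = buffer_bytes / packet_bytes      # ~174.8 packets
        print(packets_queued * ms_per_packet / 1000)      # ~1.75 seconds of latency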


    tldr

    Don't worry, we could tell.
