ARPANET Co-Founder Calls for Flow Management
An anonymous reader writes "Lawrence Roberts, co-founder of ARPANET and inventor of packet switching, today published an article in which he claims to solve the congestion control problem on the Internet. Roberts says, contrary to popular belief, the problem with congestion is the networks, not Transmission Control Protocol (TCP). Rather than overhaul TCP, he says, we need to deploy flow management, and selectively discard no more than one packet per TCP cycle. Flow management is the only alternative to peering into everyone's network, he says, and it's the only way to fairly distribute Internet capacity."
RMS on the same subject. (Score:5, Interesting)
He seems to agree [stallman.org]. This surprised me but it seems that equipment can do this fairly.
Re: (Score:2)
The right link [stallman.org]. Sorry about that.
Re:RMS on the same subject. (Score:4, Insightful)
Re:RMS on the same subject. (Score:4, Insightful)
Re:RMS on the same subject. (Score:5, Insightful)
Re:RMS on the same subject. (Score:5, Funny)
Re:RMS on the same subject. (Score:5, Funny)
Re: (Score:2)
Re: (Score:3, Funny)
Re: (Score:2)
Accessible, knowledgeable and fair (Score:5, Interesting)
Everyone's got their favorite experts, and they are often a shortcut to lots of research you don't have time for. He's an independent expert who cares more about your rights than other things, happens to be an expert in OS design who's been working since the early 70s, and knows something about networking as well. Finally, he likes to answer email.
Re: (Score:2)
This is a thinly veiled infomercial or PR fluff piece.
And where can I buy this flow management? (Score:5, Insightful)
Re:And where can I buy this flow management? (Score:5, Informative)
Should be able to get it anywhere. (Score:3, Informative)
I have been told that the ability to do this has been around since the 1970s. Don't all equipment makers have some version?
Re:And where can I buy this flow management? (Score:4, Informative)
Re: (Score:3, Insightful)
Re: (Score:2, Informative)
Anyway, in 99% of cases you could achieve the same thing using Linux with SFQ queueing.
He's even got the *look.* (Score:2)
inventor of packet switching (Score:5, Informative)
Re: (Score:3, Informative)
If you ask Larry Roberts he would say that the honor belongs to Kleinrock.
Personally, I don't think you can say that there is a sole inventor.
That's all fine... (Score:5, Interesting)
Will this method really offset the retransmits it triggers? Only if not everyone does it, unless I'm missing something.
What might work better is scaled drops: if a router and its immediate peers are nearing capacity, they start to drop a packet per cycle, automatically causing the routers at their perimeter to route around the problem, easing up on their traffic.
It still seems like a system where an untrusted party could take advantage, however, dropping packets in this manner from non-preferred sources or to non-preferred destinations.
Re: (Score:2)
Re:That's all fine... (Score:4, Interesting)
Re: (Score:3, Informative)
Routing does not change based on traffic on that short a timescale; it changes if a link goes down, a policy agreement changes, an engineer changes some link allocation, etc. Doing traffic-sensitive routing is hard because of oscillations: in your example, would the perimeter nodes switch back to the now congestion-free router?
Actually, many large scale networks *can* change routing based on congestion.
The mechanism used is MPLS, using RSVP TE.
Essentially, traffic is classified based on chosen parameters (protocol, port, etc) and placed into logical tunnels, and each can reach the same destination via a different path. Every so often (depending on administrator configuration, often 15 minutes), the router looks at utilisation of each tunnel on each interface, and can signal a different path for various tunnels in case of congest
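Roughly, in (made-up) Python, that periodic re-optimization loop might look something like the sketch below; the names, interval, and threshold are invented for illustration, not taken from any vendor's implementation.

    INTERVAL_MINUTES = 15
    CONGESTION_THRESHOLD = 0.80  # fraction of link capacity considered "congested"

    def reoptimize(tunnels, paths):
        """tunnels: list of dicts with 'path' and 'utilisation' (fraction of link).
           paths:   dict mapping path name -> current total utilisation (0..1)."""
        # Consider the heaviest tunnels first.
        for t in sorted(tunnels, key=lambda t: t['utilisation'], reverse=True):
            current = t['path']
            if paths[current] <= CONGESTION_THRESHOLD:
                continue  # this tunnel's current path is not congested
            # Find the least-loaded alternative path that can absorb the tunnel.
            best = min(paths, key=lambda p: paths[p])
            if best != current and paths[best] + t['utilisation'] <= CONGESTION_THRESHOLD:
                paths[current] -= t['utilisation']
                paths[best] += t['utilisation']
                t['path'] = best  # a real router would re-signal the tunnel via RSVP-TE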
Re: (Score:2)
Re:That's all fine... (Score:5, Interesting)
If the total traffic is above a certain threshold, but below a critical value, then a significant number of packets will be retransmitted. This causes the load to increase the next cycle around, causing further packet loss and further retransmits. There will be a period - even as fresh network demand falls - in which observed network demand actually rises, due to the accumulation of errors.
There will then be a third critical value, close to but still below the rated throughput of the switch or router. Provided no errors occur, the traffic will flow smoothly and packet loss should not occur. This isn't entirely unlike superheating - particularly on collapse. Only a handful of retransmits would be required - and they could occur anywhere in the system for which this is merely one hop of many - to cause the traffic to suddenly exceed maximum throughput. Since the retransmitted packets will add to the existing flows, and since the increase in traffic will increase superlinearly, that node is effectively dead. If there's a way to redirect the traffic for dead nodes, there is then a high risk of cascading errors, where the failure will ripple out through the network, taking out router/switch after router/switch.
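Here's a toy model of that runaway behaviour, just to make the feedback loop concrete; the capacity, demand, and retransmit-overhead numbers are all invented.

    capacity = 100.0        # what the node can forward per cycle
    fresh_demand = 102.0    # new traffic offered per cycle, just over capacity
    retransmits = 0.0

    for cycle in range(10):
        offered = fresh_demand + retransmits
        dropped = max(0.0, offered - capacity)
        retransmits = dropped * 1.5                  # drops come back with overhead
        fresh_demand = max(80.0, fresh_demand - 1.0) # fresh demand is actually falling
        print(f"cycle {cycle}: offered={offered:.1f} dropped={dropped:.1f}")

Even though fresh demand falls every cycle, the offered load keeps climbing, which is the cascade described above.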
Does flow management work? Linux has a range of RED and BLUE implementations. Hold a contest at your local LUG or LAN gamers' meet to see who can set it up best. Flow management also includes ECN. Have you switched that on yet? There are MTUs and window sizes to consider - the defaults work fine most of the time, but do you understand those controls and when they should be used?
None of this stuff needs to be end-to-end unless it's endpoint-active (and only a handful of such protocols exist). It can all be done usefully anywhere in the network. I'll leave it as an exercise to the readership to identify any three specific methods and the specific places in the network where they'd be useful. Clues: Two, possibly all three, are described in detail in the Linux kernel help files. All of them have been covered by Slashdot. At least one is covered by the TCP/IP Drinking Game.
Re: (Score:2, Informative)
RED is tricky to set up, but neither Blue nor PI require much tuning, if any. (I'm running Blue on all of my 2.6 Linux routers, and RED on all the 2.4 ones.)
Yes, I have. On all of my hosts and routers. It's a big win for interactive connections, but doesn't matter that much for bulk throughput.
Re: (Score:2)
Unless you're running some fancy link technology, you don't get to tune your MTU. If, like most of us, you're running Ethernet and WiFi only, you're stuck with 1500 bytes. As for window sizes, they're pretty much tuned automatically nowadays, at least if you're running a recent Linux or Windows Vista.
You'll find the Web100 patch for Linux provides much better a
Re: (Score:2, Interesting)
toss one packet?! (Score:2, Insightful)
Re: (Score:2, Informative)
Re: (Score:2)
Re:toss one packet?! (Score:5, Interesting)
Yes, and when the retransmission occurs, the router may be able to handle your packet. The router won't be overloaded forever, after all.
The bigger part of the equation is that with TCP the more packets are dropped the slower you transmit packets. With this solution the heaviest transmissions would have more packets dropped and therefore be slowed down the most.
I admit, I'd have to check the details of the protocol to see if this is open to abuse by those with a modified TCP stack. The problem is that the packets are dropped in a predictable manner and a modified TCP stack could be designed to 'filter' the noise and yet still degrade when other packets are lost and provide a reliable connection.
Re: (Score:3, Informative)
IMHO hacks like this don't help enough to be worth the trouble of installing, and i
Re: (Score:2)
20 million P2P users with hacked stacks in this scenario would probably result in poorer performance and greater congestion than we have now.
Re: (Score:2)
Creates incentive to remove retransmit delay (Score:3, Interesting)
I think this proposal is a bit reckless and naive at the same time. Not a good combination. Add to that he is trying to set a precedent for data degradation when none is needed.
If networks want to reduce traffic in a civil manner, they will price their service similar to the way hosting providers do: offer a flat rate up to a set cap measured in GB/month, with overages priced at a different rate. People would then pay for their excesses, allowing the ISP to spend more on adding c
Re: (Score:2)
I am not sure what world you live in, but I don't know many people who are happy with the pricing schemes of cell phones.
All the same problems would be shared with the scheme you propose. First, you would be charged for incoming bandwidth. Second, the rates are never lower than unlimited service; people pay the higher rates because cell phones are more convenient. Third, you have to constantly track your usage and would have to refrain from using your conn
Re: (Score:2)
Re: (Score:3, Insightful)
You're making the assumption that ISPs/cell phone companies base their prices on their ongoing cost. That's not entirely correct. The price of a service is often a function of supply and demand. And an ISP/cell phone company will often manipulate the perception of that supply and dem
Re:toss one packet?! (Score:4, Informative)
Let's say you have 500 hosts sharing a "fat pipe." During peak times, the combined throughput used by TCP applications causes all available bandwidth on the link to be consumed. The result is that, at the instant all available bandwidth is consumed, packets get dropped suddenly and indiscriminately. This means that 500 hosts all lose a slew of packets.
Per TCP specifications, when packets aren't acknowledged, all 500 hosts back off for a moment, and then retransmit at approximately the same time, causing another sudden burst in bandwidth usage, and more dropped packets.
This problem compounds until all hosts are simply bursting packets, dropping packets, backing off, and repeating. The solution to this was a technique called "RED" (Random Early Detection).
What this does is essentially detect when bandwidth is almost completely utilized, and then starts selectively and "fairly" dropping packets from the TCP streams. This causes the hosts to gradually back off, until bandwidth consumption is back in check. The result is that the whole "synchronization" issue is avoided, and the link is better utilized, as throughput is constant and reliable.
There is a variation called WRED or "Weighted Random Early Detection", in which certain types of packets get cut before others. This would allow the router to avoid dropping VoIP traffic, while implementing RED on non-realtime streams instead.
You can read more about this technique here: http://www.cisco.com/en/US/docs/ios/12_0/qos/configuration/guide/qcconavd.html [cisco.com]
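For anyone who wants the mechanics spelled out, here's a rough Python sketch of the RED idea (the thresholds and probabilities are illustrative, not taken from any particular vendor's defaults):

    import random

    MIN_TH, MAX_TH, MAX_P = 20, 80, 0.10   # queue depths in packets, max drop probability

    def red_should_drop(avg_queue_len):
        """Decide whether RED drops this arriving packet."""
        if avg_queue_len < MIN_TH:
            return False                   # queue short: never drop
        if avg_queue_len >= MAX_TH:
            return True                    # queue long: always drop
        # In between, the drop probability ramps linearly from 0 up to MAX_P,
        # so senders back off gradually instead of all at once.
        p = MAX_P * (avg_queue_len - MIN_TH) / (MAX_TH - MIN_TH)
        return random.random() < p

WRED is the same ramp with per-class thresholds, e.g. giving VoIP higher thresholds so bulk TCP gets cut first.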
Re: (Score:2)
I admit, I'd have to check the details of the protocol to see if this is open to abuse by those with a modified TCP stack. The problem is that the packets are dropped in a predictable manner and a modified TCP stack could be designed to 'filter' the noise and yet still degrade when other packets are lost and provide a reliable connection.
You'd have to modify the stack at both ends of the connection, as each end expects defined TCP behaviour. If you are a downloader, and a packet you're downloading gets lost, the other end will need to retransmit, and it _will_ slow down if it has to retransmit, so there's no way around this.
Of course, if you have access to both ends, then there's no reason at all for you to use a defined protocol (TCP); you could just blast data between them using any mechanism and get around congestion control mechanisms.
Re: (Score:2, Informative)
So it quickly drops down to below the available bandwidth then slowly grows the speed up to it.
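That's the standard AIMD (additive increase, multiplicative decrease) behaviour; as a minimal sketch, with made-up numbers:

    cwnd, capacity = 10.0, 64.0   # congestion window and path capacity, in segments

    for rtt in range(100):
        loss = cwnd > capacity                   # stand-in for "a packet was dropped"
        cwnd = cwnd / 2 if loss else cwnd + 1    # halve on loss, else grow by one per RTT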
Re: (Score:2)
In short, I'd drop any ISP that did this.
Why not now? (Score:3, Interesting)
Seems like there would have to be a good reason, otherwise this would just make more sense, right?
Re: (Score:2)
Re:Why not now? (Score:5, Informative)
You also have problems tracking flows; routes change, so while a router may be tracking an active flow, the flow may choose another path. The router has no way of knowing this, so it has to keep track of the flow until it times out (and the timeout would have to be more than just a few seconds).
There are flow-based router architectures, but they are not generally used for ISP core/edge routers because there are too many ways they can break.
Re: (Score:3, Informative)
Re: (Score:3, Informative)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2, Interesting)
Only if you want to be fair.
In practice, however, you only want to be approximately fair: to ensure that the grandmother can get her e-mail through even though there's a bunch of people on her network running Bittorrent. So in practice it is enough to keep track of just enough flow history to make sure that you're fair enough often enough, and no m
Re:Why not now? (Score:4, Funny)
Why, did you think this plan had something to do with providing better service to end-users? When does that ever happen?
-Graham
Re: (Score:2)
Just compare it with how water flows through a maze that has multiple ways through it. The trick is to avoid bottlenecks.
Another issue is that even though the path that seems to have the best performance from the view of the end-routers may actually be the worst s
Re: (Score:2)
To do anything based on flows, routers would have to keep track of all the active flows, which amounts to all open TCP connections going through that router. For an active router, there would be millions of active flows at any one time, so the overhead would be huge. This would be like a NAT or stateful firewall device that could do line-rate forwarding at gigabit, 10G, or 100G port speeds.
Bollocks.
Any modern router worth its salt, _especially_ ones used in large networks, uses flow-based mechanisms for routing.
Let's say a router has three possible, equal-cost paths to a destination network. Which path will it take? In the old days, it would pick one of those paths and stick with it. But that results in 2/3 of the available capacity going unused.
In the case of many destination routes, it might seem to be even, but if one of the destination networks has a much higher traffic flow than the others, yo
Re: (Score:2)
Re: (Score:2)
Re:Why not now? (Score:4, Insightful)
Can someone explain why this hasn't already been implemented?
It has been implemented and abandoned already because it doesn't scale. Serious routers today use the concept of interface adjacency: for a given inbound packet there are only a few possible destinations, namely each of the interfaces on the router.
When a route is installed into the FIB, you can recursively follow that route until you find the egress interface and the layer 2 address of the next hop - those will typically never change! So long as the router always keeps this adjacency information up to date, individual packets never need to have a route lookup performed - the destination prefix is checked in the adjacency table, the layer 2 header is rewritten, and the packet is queued for egress on the appropriate interface.
This allows for substantially higher throughput (in packets per second) than other methods because the adjacency table can be cleverly stored in content-addressable memory that provides constant time answers. A prefix will be installed in a content-addressable memory circuit as a lookup key. The value associated with that key is a pointer into the adjacency table that holds the interface and layer 2 information for that prefix.
By reconsidering the routing problem, and by using some smart circuits, the route lookup for a single packet has been reduced from O(k) to O(1), where k is the length of the longest prefix. For IPv4, that's up to 32 bits - so that means you do a single fetch and lookup instead of 32 or so comparisons for each packet. At a million packets per second, that's a huge difference.
Traditional flow-based routing requires creating in-memory structures for each flow, collectively called the flow cache. Each flow's first packet requires a full route lookup, which builds the structures for that flow. Then, subsequent packets in that flow can be matched against the cache and switched directly to the egress interface. This operation is much closer to that of a contemporary firewall. The good thing about this method is that it gives you a lot of visibility into the traffic. The bad side is that it requires a very large amount of memory for all of these structures. When that memory is exhausted, you can't route any more flows!
This comparison is a bit apples to oranges - the adjacency table described above is pretty much state-of-the-art for off the shelf gear, while the flow cache architecture is highly dated. But without some substantial advances in the ways flows are created, tracked, and expired, no flow router is going to reach the number of packets per second that are required for very large installations in the Internet.
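To make the contrast concrete, here's a rough Python sketch of the two data relationships (real routers do the prefix match in TCAM hardware; the addresses and interface names here are made up):

    import ipaddress

    # "FIB": prefix -> index into the adjacency table, resolved ahead of time.
    fib = {
        ipaddress.ip_network("10.0.0.0/8"):  0,
        ipaddress.ip_network("10.1.0.0/16"): 1,
        ipaddress.ip_network("0.0.0.0/0"):   2,
    }
    # Adjacency table: egress interface plus the rewritten layer-2 next-hop address.
    adjacency = [("ge-0/0/1", "00:11:22:33:44:01"),
                 ("ge-0/0/2", "00:11:22:33:44:02"),
                 ("ge-0/0/0", "00:11:22:33:44:ff")]

    def forward(dst_ip):
        """Longest-prefix match, then one adjacency fetch; no per-flow state kept."""
        dst = ipaddress.ip_address(dst_ip)
        best = max((p for p in fib if dst in p), key=lambda p: p.prefixlen)
        return adjacency[fib[best]]

    # A flow-cache architecture instead keys on the whole 5-tuple, so its state
    # grows with the number of open connections rather than the number of routes:
    flow_cache = {}   # (src, dst, proto, sport, dport) -> adjacency entry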
Re: (Score:3, Interesting)
Re: (Score:2)
Re: (Score:2)
There are a few problems to do good routing:
Re: (Score:2)
Back to subject, I also
Flow control??? (Score:4, Funny)
Reduce hop count. (Score:2, Interesting)
Beyond flow fairness, user fairness... (Score:3, Insightful)
This solves the P2P problem, and has a bunch of other advantages.
Note, also, you only need to do this at the edges, as the core is pretty overprovisioned currently.
Re:Beyond flow fairness, user fairness... (Score:4, Interesting)
Those who engage in low bandwidth activities are not entitled to more bandwidth, nor are those engaging in high bandwidth activities entitled to less. Both are entitled to equal bandwidth and have the right to utilize or not utilize it accordingly.
Sadly, no... (Score:2)
Your neighbor and you share a common bottleneck. You're web surfing; he's got 6 torrents downloading. He is going to have at least 24 active flows running full bore, and you will have 1 or 2 (which are bursty, even). Thanks to how TCP works, without traffic shaping, you will receive 1 packet for every 24 he gets.
User fairness needs to be implemented in the network to keep his traffic from walking
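The arithmetic behind that example, as a tiny sketch (the link capacity and flow counts are made up):

    link_capacity = 1000.0                 # packets per second at the shared bottleneck
    users = {"you": 2, "neighbor": 24}     # active TCP flows per user

    # Per-flow fairness (plain TCP): capacity splits by flow count.
    per_flow = {u: link_capacity * n / sum(users.values()) for u, n in users.items()}
    # -> you get ~77 pps, the neighbor ~923 pps

    # Per-user fairness: each subscriber gets an equal share, however many flows they open.
    per_user = {u: link_capacity / len(users) for u in users}
    # -> 500 pps each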
Re: (Score:3, Insightful)
'You're web surfing; he's got 6 torrents downloading. He is going to have at least 24 active flows running full bore, and you will have 1 or 2 (which are bursty, even). Thanks to how TCP works, without traffic shaping, you will receive 1 packet for every 24 he gets.'
As things stand now, yes. But under the scheme he is suggesting my flows would be slo
Re: (Score:2)
Re: (Score:2)
I have an eight megabit connection. But not really, the connection is actually much faster than that, my bandwidth is capped at eight megabit. I have no problem with that. After all, I pay for unlimited use of an eight megabit connection. The problems come when you can't actually deliver that eight megabits on an unlimited basis despite me paying for it.
Now, if y
per-user fairness and Nat (Score:2)
This sounds like a pretty good idea until you start thinking about NAT'ed networks. Is it really fair to treat an entire office or dorm (or even a small country) the same as a single user who happens to have a unique IP from their ISP? And what about the transition to IPV6, when presumably IP addresses are no longer going to be scarce?
Re: (Score:2)
The company I currently work for already does this (we provide "premium" aka $$$ internet services at various hotels and airports around the world).
We do the traffic shaping at end points. At the end points we know who the users are, so we can give them different treatment.
Possible scenario: the normal users get their fair share. The VIP users get their fair
Re: (Score:2)
Re: (Score:2)
That's just going to encourage people to waste IP addresses. They're scarce already, we don't need organizations to start configuring their nats to use multiple IPs just to get more throughput.
The status quo now is that actual users are not evenly distributed about the address space, and dividing bandwidth by IP doesn't make any more sense than the current situation, in which bandwidth is divided amongst competing sockets.
Re: (Score:2)
Already companies _are_ getting better service than "normal users". So your concerns are unfounded.
Roberts' proposal (and his company's product), however, appears to be more focused on the core rather than the edge (if it's the edge, I don't even see why it's such a great benefit).
The real problem with P2P using up too much ISP bandwidth is Copyright Law and the **AA.
If ISPs could cache and seed all torrents (on demand or prefe
Re: (Score:2)
I agree with you there. ISPs at the edge have enough information to treat each user (ie paying customer) fairly. That doesn't help if the core routers are congested, though.
If ISPs start treating P2P different from regular traffic, they
Privacy issues? (Score:3, Interesting)
It seems to me that by moving knowledge of flows into the routers, you make it easier to tap into these flows from a centralized place - i.e., the router.
Not that tapping connections can't be done now by spying on packets, of course, but it would make it much cheaper to implement. High-overhead packet matching, reassembly, and interpretation is replaced by a simple table lookup in the router.
Donning my tinfoil hat, I can foresee a time when all routers 'must' implement this as a backdoor...
Re: (Score:2)
Weird solution (Score:4, Insightful)
I couldn't help but laugh a bit at his solution. He talks about "flow management" being put into the core of the network to solve TCP's unfairness problem, but at the end of the article he says:
So, in other words, his solution to the "P2P problem" is just a fancy version of a token bucket.
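For reference, the plain (non-fancy) version is only a few lines; this is a generic sketch, not the article's mechanism, and the rate and burst parameters are arbitrary:

    import time

    class TokenBucket:
        def __init__(self, rate_bytes_per_sec, burst_bytes):
            self.rate, self.capacity = rate_bytes_per_sec, burst_bytes
            self.tokens, self.last = burst_bytes, time.monotonic()

        def allow(self, packet_bytes):
            now = time.monotonic()
            # Refill tokens for the time elapsed, capped at the burst size.
            self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bytes:
                self.tokens -= packet_bytes
                return True        # within the rate: forward the packet
            return False           # over the rate: drop (police) or queue (shape)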
Re: (Score:2)
Another point is that TFA seems to imply that fixing TCP, whatever that means, will somehow solve the network congestion problems. However, the only real way to fix congestion is to grow capacity, which seems to have worked thus far.
Re: (Score:2, Funny)
To use an obligatory car analogy, think of congestion on the roadways. What you're talking about is widening the highways. But that just encourages more packets to be placed onto the network. We need to stop letting people use the Internet for what they want, when they want (freedom is bad). What we really need is the equivalent of, you guessed it, public transportation. Here's how it would work: As we all
Re: (Score:2)
Re:Weird solution (Score:4, Funny)
Sure, anyone can get on, but YOUR packets have to ride in back.
Re: (Score:2)
Hell, the idea of asking everyone to change their stack practically invites a "Your solution advocates a" response. You'd either have to lock out people running the 'old' stack or... wait for it... throttle them server side.
Linux already has per-flow fairness (Score:2, Informative)
Solutions for flow management... (Score:2, Funny)
http://www.kotex.com/ [kotex.com]
research on root causes of congestion? (Score:2, Insightful)
BGP and the intra-domain routing protocols assume there is at most one correct route from a given source address to a given destination address. That assumption could give rise to unnecessary congestion. For example, suppose the source wants to use bandwidth of 100 units and the destination is capable of keeping up. But between them there are two routers, in parallel, each of which can supply only 50 units. If there's exactly one
routing via multiple paths (Score:2)
One reason why that might be problematic in practice, is that, iirc, TCP doesn't like getting packets out of order, and tends to respond to out-of-order packets similarly to dropped packets. If you have packets taking multiple paths, they are very likely to arrive out of order.
One could mitigate this, I suppose, by making sure all packets that are part of the same flow take the same path.
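That's roughly what ECMP hashing does today; a minimal sketch (the field names and path list are just placeholders):

    import hashlib

    def pick_path(src, dst, proto, sport, dport, paths):
        """Hash the 5-tuple so every packet of one connection takes the same path
           and can't be reordered across paths."""
        key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
        digest = hashlib.sha1(key).digest()
        return paths[int.from_bytes(digest[:4], "big") % len(paths)]

    # pick_path("192.0.2.1", "198.51.100.7", "tcp", 51515, 443, ["link1", "link2", "link3"])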
Re: (Score:2)
BGP and the intra-domain routing protocols assume there is at most one correct route from a given source address to a given destination address.
Interestingly, the original ARPANET didn't make that assumption and would load-share across links. The ARPANET was a denser mesh than the Internet today, but with much lower bandwidth links, only 56Kb/s.
Incidentally, the original MILNET, which was a purely military network using ARPANET nodes, was about six-connected; that is, each node had connections to about
No, TCP does not work by losing packets (Score:2)
TCP is mostly controlled by round trip time measurement and window size. Response to packet loss is a backup mechanism. If packet loss were the primary control mechanism, TCP would never work.
It's much better to throttle back before packet loss occurs, since any lost packet has to be resent and uses up resources from the sender to the drop point. Since the main bandwidth bottleneck is at the last mile to the consumer, the drop point tends to be close to the destination.
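A delay-based sender does roughly the following (a Vegas-style sketch; the alpha/beta targets are illustrative): estimate how many of its packets are sitting in queues from the gap between the base RTT and the current RTT, and adjust the window before any loss happens.

    ALPHA, BETA = 2, 4   # target backlog, in packets

    def adjust_window(cwnd, base_rtt, current_rtt):
        expected = cwnd / base_rtt                 # throughput if queues were empty
        actual = cwnd / current_rtt                # throughput actually being achieved
        backlog = (expected - actual) * base_rtt   # our packets queued in the network
        if backlog < ALPHA:
            return cwnd + 1        # queues nearly empty: grow
        if backlog > BETA:
            return cwnd - 1        # backlog building: back off before loss occurs
        return cwnd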
Don't trust the "Clean Slate I [stanford.edu]
Re: (Score:2)
These protocols work on the transmit side (trying to do the same thing on the receive side is a lot harder). They use round trip times to figure out whether excessive packet backlog is occurring on intermediate routers, and reduce the TCP window size accordingly in an attempt to reduce that backl
depends a lot on business models (Score:2)
Depends where the packet is in the flow, too. (Score:4, Interesting)
When the packet is close to an end point it is possible to use far more sophisticated queueing algorithms to make the flow do precisely what you want it to do. It's important for me because my outgoing bandwidth is pegged 24x7. Packet loss is not acceptable that close to the end point, so I don't use RED or any early drop mechanism (and frankly they don't work that close to the end point anyway... they do not prevent bulk traffic from seriously interfering with interactive traffic), and it is equally unacceptable to allow a hundred packets to build up on the router where the pipe constricts down to T1/DSL speeds (which completely destroys interactive responsiveness).
For my egress point I've found that running a fair share scheduler works wonderfully. My little Cisco had that feature, and it works particularly well in newer IOSes. With the DSL line I couldn't get things working smoothly with PF/ALTQ until I sat down and wrote an ALTQ module to implement the same sort of thing.
Fair share scheduling basically associates the packets with 'connections' (in this case using PF's state table) and is thus able to identify those TCP connections with large backlogs and act on them appropriately. Being near the end point I don't have to drop any of the packets, but neither do I have to push out 50 tcp packets for a single connection and starve everything else that is going on. Fair share scheduling on its own isn't perfect, but when combined with PF/ALTQ and some prioritization rules to assign minimum bandwidths the result is quite good.
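A deficit-round-robin sketch of that "fair share" idea, in (hypothetical) Python rather than ALTQ, looks something like this: per-connection queues, each granted a quantum of bytes per round, so one connection with a huge backlog can't starve the rest.

    from collections import deque

    QUANTUM = 1500   # bytes of credit added to each connection per round

    def drr_round(queues, deficits):
        """queues: dict conn_id -> deque of (packet, size_bytes);
           deficits: dict conn_id -> accumulated byte credit."""
        sent = []
        for conn, q in queues.items():
            if not q:
                continue
            deficits[conn] += QUANTUM
            # Send packets from this connection only while it has credit left.
            while q and q[0][1] <= deficits[conn]:
                pkt, size = q.popleft()
                deficits[conn] -= size
                sent.append(pkt)
        return sent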
Another feature that couples very nicely with queueing in the egress router is turning on (for FreeBSD or DragonFly) the net.inet.tcp.inflight_enable sysctl. This feature is designed to specifically reduce packet backlogs in routers (particularly at any nearby bandwidth constriction point). While it can result in some unfair bandwidth allocation it can also be tuned to not be quite so conservative and simply give the egress router a lot more runway in its packet queues to better manage multiple flows.
The combination of the two is astoundingly good. Routers do much better when their packet queues aren't overstressed in the first place, only dropping packets in truly exceptional situations and not as a matter of course.
The real problem lies in what to do at the CENTER of the network, when your TCP packet has gone over 5 hops and has another 5 to go. Has anyone tried tracking the hundreds of thousands (or more) active streams that run through those routers? RED seems to be the only real solution at that point, but I really think dropping packets in general is something to be avoided at all costs, and I keep hoping something better will be developed for the center of the network.
-Matt
PLEASE PLEASE PLEASE STOP POSTING HIS RANTS (Score:2)
Close, but there are other ways (Score:3, Insightful)
If WRED [wikipedia.org] didn't exist on every production-grade router made in the last 10+ years then there would certainly be a need for this technology. However, I'm not really sure how much benefit the "multi-flow fairness" concept would provide vs. just configuring WRED to discard only payload packets & not TCP control traffic. The tradeoff is the added complexity of the congestion avoidance mechanism having to be flow-aware, which increases cost, time to market, heat & power consumption, etc.
Such a technique combined with microflow policing [ciscopress.com] would come closer to what he describes. In fact one could probably refer to the congestion avoidance technique described in the article as "adaptive microflow policing".
A pretty standard config used with OpenBSD's PF firewall [openbsd.org] is to prioritize ACKs in both directions so that a line congested in one direction is still useful in the other.
BTW, TCP has already been re-engineered; it's called SCTP [wikipedia.org]. If you've got a custom high-bandwidth point-to-point application where you have complete control over both ends (mostly research stuff at this point), check it out.
A different approach to bandwidth management that is being developed by the major router vendors is the application-aware network. Imagine if the router was smart enough to read a field in an XML stream that indicates that this particular flow requires 64kbps or it should be dropped, it should have 256kbps to work well, and giving it more than 1mbps is not useful and you start to get the idea. That's just the tip of the iceberg.
Anyway, congestion control is useful & necessary, but "quality of service is no substitute for quantity of service"...
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
- Jesper