New Router Manages Flows, Not Packets
Posted by
ScuttleMonkey
on Fri Jul 10, 2009 02:30 PM
from the let-the-injections-begin dept.
An anonymous reader writes "A new router, designed by one of the creators of ARPANET, manages flows of packets instead of only managing individual packets. The router recognizes packets that are following the first and sends them along faster than if it had to route them as individuals. When overloaded, the router can make better choices of which packets to drop. 'Indeed, during most of my career as a network engineer, I never guessed that the queuing and discarding of packets in routers would create serious problems. More recently, though, as my Anagran colleagues and I scrutinized routers during peak workloads, we spotted two serious problems. First, routers discard packets somewhat randomly, causing some transmissions to stall. Second, the packets that are queued because of momentary overloads experience substantial and nonuniform delays, significantly reducing throughput (TCP throughput is inversely proportional to delay). These two effects hinder traffic for all applications, and some transmissions can take 10 times as long as others to complete.'"
Related Stories
Submission: New Router Manages Flows, Not Packets by Anonymous Coward
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
Well duh (Score:5, Funny)
Damn right, they manage flows. It keeps the tubes from clogging.
Duuuurrrrrr.
Re: (Score:3, Funny)
I don't know if I trust this guy with my interweb tubes, though. Did you notice the mess of cables [ieee.org] behind him?
If we can't trust him to keep his wiring closet organized, how can we trust him to clean the tubes?
Been tried, and they saw it was *not* good (Score:5, Interesting)
All older cisco equipment worked this way. This was nice, and worked very well for the first router(s) closest to the end customer. However for routers meant to route for large numbers of users this turned out to be a disaster.
Just to give you an idea, this was EOS (end of support) before I turned 10 [cisco.com] (look for "netflow routing")
There are a number of very problematic properties :
-> trivial to ddos (just generate too many flows to fit in memory, or generally increase the per-packet lookup time)
-> not p2p compatible (p2p will cause flow based routers to perform at a snail's pace, because they open so many connections)
-> possible triple penalty for every new flow (first a failed flow lookup, followed by a failed route lookup, going to default route)
-> very hard to have a good qos policy this way. A pipe has a fixed bandwidth, and you almost always oversubscribe. Therefore useful policies are very hard to formulate per-flow.
-> if you divide bandwidth per-flow over tcp then a large overload will "synchronize" everything. So let's explain what happens if 3 users are happily surfing about and another user starts bittorrent. Bandwidth gets divided over all the flows, and *every* connection closes, due to timeouts.
There are a number of advantages
-> easy, very extensive QOS is trivial to implement
-> stateful firewalling is almost laughably easy to implement, and very advanced firewalling can be done (e.g. easy to block ssh but not https, just filter on the string "openssh" anywhere in the connection. Added bonus : hilarity ensues if you email someone the text "openssh", and his pop3 connection keeps getting closed)
Here's the deal : a router has to look up each packet in a table of about 300.000 entries in per-packet switching (excepting MPLS P routers). My PC is, at this moment, opening 331 flows to various destinations, each sending an average of 5 packets (probably a lot of DNS requests are dragging this number down), but you have to keep in mind that a flow-based router has to look up first in the "flow table" AND in the route table (which still has 300.000 entries).
As soon as a flow-based router services more than 1000 machines (in either direction, ie. 100 clients communicating with 900 internet hosts = 1000 machines serviced), its performance will fail to keep up with a packet-based router. That's not a lot. If a single client torrents or p2p's you will hit this limit easily, resulting in slower performance. At 2000 machines, packet-based switching is twice as efficient.
So : flow-based routing ... for your wireless access point ... perhaps. For anything more serious than that ? No way in hell.
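A rough way to see the double-lookup and table-exhaustion points above is a toy flow cache sitting in front of a longest-prefix route table. This is only a sketch: the class name, the interface names, the tiny table size and the LRU eviction policy are all assumptions made for illustration, not a description of how any Cisco or Anagran box works.

```python
import ipaddress
from collections import OrderedDict


class FlowRouter:
    """Toy flow-based forwarder: flow cache first, route table on a miss."""

    def __init__(self, routes, max_flows=4):
        # routes maps prefixes to hypothetical next-hop names,
        # e.g. {"10.0.0.0/8": "lan", "0.0.0.0/0": "uplink"}.
        self.routes = [(ipaddress.ip_network(p), hop) for p, hop in routes.items()]
        # Longest prefix first, so the first match is the most specific one.
        self.routes.sort(key=lambda r: r[0].prefixlen, reverse=True)
        self.flows = OrderedDict()  # 5-tuple -> next hop, kept in LRU order
        self.max_flows = max_flows
        self.hits = 0
        self.misses = 0

    def route_lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        for net, hop in self.routes:
            if addr in net:
                return hop
        return None  # no matching route

    def forward(self, src, sport, dst, dport, proto):
        key = (src, sport, dst, dport, proto)
        if key in self.flows:
            self.hits += 1
            self.flows.move_to_end(key)  # mark as most recently used
            return self.flows[key]
        # A new flow pays for the failed flow lookup AND the route lookup.
        self.misses += 1
        hop = self.route_lookup(dst)
        self.flows[key] = hop
        if len(self.flows) > self.max_flows:
            self.flows.popitem(last=False)  # evict the least recently used flow
        return hop
```

With `max_flows=2`, forwarding one flow twice gives one miss and one hit; pushing three unrelated flows through afterwards evicts the original entry, so its next packet misses again. That is the "too many flows to fit in memory" failure mode in miniature.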
This does not solve the problem (Score:5, Insightful)
TCP's congestion control algorithm, which causes congestion and then backs off is the real culprit here, and this router does nothing to fix that. The way to fix that is to dump TCP's congestion control and replace it with real flow control in the network layer. That requires lots of memory on intermediaries, because you need all the hosts along the data path to cooperate with each other to communicate about flow control, and that means keeping state. At which point, we're not talking about datagram networks anymore. And that means dumping the other desirable thing about datagram networks: fault tolerance. Packets are path-independent.
Anyway: getting back to TCP's congestion control: his article even says that "During congestion, it adjusts each flow rate at its input instead." Wait, what? "If an incoming flow has a rate deemed too high, the equipment discards a single packet to signal the transmission to slow down." That's how it works right now! The only difference that I can see is that he's being a little smarter about which packets to discard, unlike RED, which is what he's comparing this to. If so, that's an improvement, but it doesn't solve the problem. It will still take awhile for TCP to notice the problem, because the host has to wait for a missed ACK. TCP can only "see" the other host-- it does not know (or care) about flow control along the path. Solving the problem requires flow control along that path, i.e., in the network layer, but IP lacks such a mechanism.
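For what it's worth, the "discard a single packet per flow" behaviour quoted from the article could look something like the sketch below. The article gives no algorithm, so everything here is assumed: the `FlowPolicer` name, the fixed byte budget per measurement window and the one-drop-per-window rule are invented purely to show the idea.

```python
class FlowPolicer:
    """Toy per-flow policer: at most one drop per flow per measurement window."""

    def __init__(self, limit_bytes_per_window):
        self.limit = limit_bytes_per_window
        self.bytes_seen = {}    # flow id -> bytes counted in the current window
        self.signalled = set()  # flows that already lost a packet this window

    def accept(self, flow, size):
        """Return True to forward the packet, False to drop it."""
        total = self.bytes_seen.get(flow, 0) + size
        self.bytes_seen[flow] = total
        if total > self.limit and flow not in self.signalled:
            self.signalled.add(flow)
            return False  # a single loss is the only signal the sender gets
        return True

    def new_window(self):
        """Start a fresh measurement window for every flow."""
        self.bytes_seen.clear()
        self.signalled.clear()
```

Note that the sender still only learns about the drop the usual way, through missing ACKs, which is the point made above: the choice of victim gets smarter, but the signalling path does not get any shorter.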
Re: (Score:2)
Flow control can be greatly improved by adding NACKs to protocol. I.e. a router will (try to) send a NACK packet after it drops your packet.
This NACK might get lost, sure, so a timeout mechanism is still required. But in general NACKs give much better flow control. Another variant is heartbeat ACKs (used in SCTP), they allow a range of other optimizations.
It's possible to do better than TCP. Though of course, circuit-switched networks are still superior in flow control.
Re:This does not solve the problem (Score:4, Interesting)
> TCP's congestion control algorithm, which causes congestion and then backs off is the real culprit here
In a dumb network with intelligence on the edges, you can:
1) cause congestion and then back off (TCP)
2) hammer away at whatever rate you think you need (UDP)
3) use a pre-set limit (which might be too high as well so no one does that on public networks)
State-ful packet switching is literally impossible, fixed-path routing not desirable for the reason you stated above and I would not want anyone to inspect my traffic _by design_, anyway.
TCP may not be perfect, but I fail to see an alternative.
Re:This does not solve the problem (Score:4, Interesting)
TCP's congestion control backs off exponentially because it has to. There is a stability property that if the network is undergoing increased congestion (this is how TCP learns the available throughput and utilizes it) and the senders do not back off exponentially then their backing off will not be fast enough to relieve congestion and therefore stabilize the system. If this router is selectively stalling individual flows I do not believe that will be fast enough to deal with growing congestion from many greedy clients.
Basically, eventually the buffer space of the router will become exhausted and it will be forced to drop packets non-selectively hence initiating TCP backoffs from randomly selected flows, resulting in current behavior. So, of course in that gray area between the first dropped flow and when we need to revert back to normal behavior we may see improved network performance for some flows but they will just take advantage of this by opening up their TCP windows more until the inevitable collapse comes.
The end result will be delaying backing off many TCP flows (which will speed them up creating more congestion) at the expense of completely trashing a few flows (which will stall anyways for packet reordering), and so the resulting system will be less stable.
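The probe-then-back-off behaviour under discussion is usually described as additive increase, multiplicative decrease (AIMD). Below is a minimal sketch of one sender's window over successive round trips. It assumes a fixed capacity, treats any round whose window exceeds that capacity as a loss, and ignores slow start and timeouts, so it is a caricature of TCP rather than a model of any real stack.

```python
def aimd(rounds, capacity, cwnd=1.0, increase=1.0, decrease=0.5):
    """Return the congestion window after each round trip for one AIMD sender.

    A round that starts with the window above `capacity` is treated as a loss.
    """
    history = []
    for _ in range(rounds):
        if cwnd > capacity:
            cwnd = max(1.0, cwnd * decrease)  # multiplicative decrease on loss
        else:
            cwnd += increase                  # additive increase otherwise
        history.append(cwnd)
    return history


print(aimd(8, capacity=4))
# prints [2.0, 3.0, 4.0, 5.0, 2.5, 3.5, 4.5, 2.25]
```

The sawtooth is the "cause congestion, then back off" cycle: the sender only finds the ceiling by overshooting it.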
Re: (Score:3, Interesting)
TCP's congestion control backs off exponentially because it has to.
Sure, but it's looking at the problem from the wrong end. IP has no feedback mechanism to allow for flow control (i.e., to prevent the sender from overrunning the receiver), so TCP has congestion control instead to stop it from happening when it does. Since TCP has no way of knowing what the available bandwidth is, it goes looking for it by causing the problem and then backing off. And since packet-switched traffic is "bursty", it resumes increasing the rate until it hits the ceiling again (because maybe
Re:This does not solve the problem (Score:4, Informative)
This has already been addressed in the IP specs: ECN [wikipedia.org]
One of the big problems with getting ECN adopted has been that Windows hasn't supported it. Vista does and I haven't seen anything specific but I'm reasonably certain that Windows 7 does as well. Mac OS X 10.5 supports it as well. Linux has supported it for quite a while. It's usually disabled by default, so that may be an issue in getting it widely supported. But the issue isn't that we don't know how to do it better. It's just overcoming the inertia.
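For anyone who hasn't looked at ECN: it uses the two low-order bits of the old IP TOS byte (the upper six bits carry the DSCP), and a congested router sets "congestion experienced" on an ECN-capable packet instead of dropping it. The codepoints below are the ones defined in RFC 3168; the helper names are made up for this sketch.

```python
# ECN field = two least significant bits of the IPv4 TOS / IPv6 traffic class byte.
ECN_CODEPOINTS = {
    0b00: "Not-ECT",  # sender is not ECN-capable
    0b10: "ECT(0)",   # ECN-capable transport
    0b01: "ECT(1)",   # ECN-capable transport
    0b11: "CE",       # congestion experienced
}


def ecn_of(tos_byte):
    """Name the ECN codepoint carried in a TOS byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def mark_congestion(tos_byte):
    """Set CE on an ECN-capable packet; leave a Not-ECT packet unchanged.

    A router that cannot mark a packet has to fall back to dropping it.
    """
    if tos_byte & 0b11 == 0b00:
        return tos_byte
    return tos_byte | 0b11
```

For example, `mark_congestion(0x02)` turns an ECT(0) packet into CE (`0x03`), while a Not-ECT packet such as `0xB8` (DSCP EF, no ECN) passes through untouched.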
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
TCP's congestion control algorithm, which causes congestion and then backs off is the real culprit here, and this router does nothing to fix that. The way to fix that is to dump TCP's congestion control and replace it with real flow control in the network layer.
Just remove the excess forwarding buffers; there's no point buffering more than what's required for the internal forwarding jitter, which should really be no more than a few datagrams at most. TCP is based on a model where congestion = loss, not congestion = pileup. Other UDP based protocols - DNS, etc, all have their own retransmission mechanisms, also based on the same model of congestion = loss. What happens when routers have ridiculous quantities of buffer - several seconds' worth - is that entire
Re: (Score:3, Interesting)
Anagran has a paper on just this topic; they claim to do better than WRED because they track the rate of every TCP connection.
http://www.packet.cc/files/IFD2c.pdf [packet.cc]
so... (Score:2, Funny)
a router tampon?
This isn't new (Score:2)
Ah yes, Larry Roberts. He seems to poke his head up every once in a while. From Caspian Networks, and now Anagran. He certainly likes to push flow routing, although it's been shown not to scale in practice.
Re: (Score:3, Insightful)
One question I was hoping would be answered is what this flow routing buys you that something like SCTP wouldn't?
Re:This isn't new (Score:5, Informative)
I'm hesitant to say he's full of shit without hearing a bit more of the debate around his ideas.
There really isn't a debate around his ideas, at least not any more.
The hitch is management overhead. Managing a flow requires remembering the flow. That means data structures and stateful processing. It's expensive and no one has demonstrated hardware accelerators that do a good job of it. On the other hand, devices like a TCAM can accelerate stateless packet switching a couple orders of magnitude past what's possible with a generic PC.
At low data rates where DRAM latency is not an issue (presently around the 500 Mbps range), flows can work and accomplish much of what he claims. At higher data rates (like the 10-100 Gbps links on the backbone) we simply can't build hardware capable of managing flows for any kind of reasonable price.
Beyond that, Larry has really missed the boat. The next routing challenge isn't raw bits per second. That's pretty much in hand. Rather the next challenge is the number of routes in the system. If you want two ISPs for reliability (instead of one), you currently have to announce a route into the backbone that is processed by every single router in the backbone even if it never sees your packets. That currently costs about $8k per route per year, the cost is falling a lot more slowly than the route count is climbing and the lack of filtering and accounting systems mean that each one of those $8k's is an overhead cost to the backbone networks rather than a cost directly recoverable from the user who announced the route.
Flow based routing doesn't help us solve that challenge in the least. If anything, it makes it worse.
If you're interested in routing theory and research, I recommend the Internet Research Task Force Routing Research Group (IRTF RRG). They're chartered by the IETF to perform basic research into Internet routing architectures and anyone interested can participate.
Re: (Score:3, Informative)
Definitely not new.
"The router recognizes packets that are following the first and sends them along faster than if it had to route them as individuals."
Where have I heard this before...oh hay...
http://en.wikipedia.org/wiki/Cisco_Express_Forwarding [wikipedia.org]
This sounds like a cracker's dream (Score:2, Insightful)
It manages flow of traffic, recognizing when one packet belongs with the others. This sounds wonderful, at least for people trying to inject packets.
I hope these things recognize the evil bit [faqs.org].
Puffery by a startup (Score:5, Informative)
The main players in the routing industry have been working on flow-aware routing for years.
(I'm on the hardware side of our company, so I'm not sure how many, or which, of the features built on the flow-based architecture are already in the field. But I'm willing to bet a significant chunk of change that the full bore will be deployed on more than one name-brand company's product line and be the dominant paradigm in routing long before these guys can convince the telecoms and ISPs to adopt their product. No matter how many big names they have on staff - or how good their box is. Breaking into networking is HARD.)
Re: (Score:2)
Ya I'm failing to see what is special here. Now the article was kind of light on the details, so maybe there's more to it, but to me it sounds like what the Cisco 6000s and such already do. When you start a flow the first packet hits the router and it decides where it is going, if it is allowed and all that jazz. After that, the subsequent packets are switched which makes it much faster. Routing is essentially done on flows, not packets.
Maybe this is somehow way more amazing, but it doesn't look like it.
Re: (Score:2)
Especially trying to break into a market by telling everyone about your awesome super cool new way of doing things ... that everyone else has been doing for 10 years already.
Didn't Ipsilon try this a long time back? (Score:2)
Little help? (Score:2)
Some thoughts (Score:4, Interesting)
First, routers discard packets somewhat randomly, causing some transmissions to stall.
While it is true that whether or not a particular packet will be discarded is the result of a probabilistic process, it is unfair to call it "random". Based on a model of the queue within the router and estimation of the input parameters the probability of a packet being discarded can be calculated. In fact, that's how they design routers. You pick a bunch of different situations and decide how often you can afford to drop packets, then design a queueing system to meet those requirements. Queueing theory is a well-established field (the de-facto standard textbook was written in 1970!) and networking is one of the biggest applications.
Second, the packets that are queued because of momentary overloads experience substantial and nonuniform delays
You wouldn't expect uniform delays. A queueing system with a uniform distribution on expected number of customers in the queue is a very strange system indeed. Those sorts of systems are usually related to renewal processes and don't often show up in networking applications. That's actually a good thing, because systems with uniform distributions on just about anything are much more difficult to solve or approximate than most other systems.
"Substantial" is the key word here. Effectively the concept of managing "flows" just means that the router is caching destinations based on fields like source port, source IP address, etc. By using the cache rather than recomputing the destination the latencies can be reduced, thus reducing the number of times you need to use the queue. In queueing theory terms you are decreasing mean service time to increase total service rate. Note however that this can backfire: if you increase the variance in the service time distribution too much (some delays will be much higher when you eventually do need to use the queue) you will actually decrease performance. Of course assumedly they've done all of this work. In essence "flow management" seems to be the replacement of a FIFO queue with a priority queue in a queueing system, with priority based on caching.
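The mean-versus-variance trade-off can be put in numbers with the Pollaczek-Khinchine formula for an M/G/1 queue, which gives the mean time a packet waits in the queue as W_q = lambda * E[S^2] / (2 * (1 - rho)), with rho = lambda * E[S]. The sketch below assumes Poisson arrivals, a single FIFO server and independent service times; the example figures are invented and say nothing about any real router.

```python
def mg1_mean_wait(arrival_rate, mean_service, var_service):
    """Mean time spent waiting in queue for an M/G/1 system
    (Pollaczek-Khinchine formula)."""
    rho = arrival_rate * mean_service  # utilisation
    if rho >= 1:
        raise ValueError("unstable queue: utilisation >= 1")
    second_moment = var_service + mean_service ** 2  # E[S^2]
    return arrival_rate * second_moment / (2 * (1 - rho))


# Baseline: exponential service with mean 1.0 (variance 1.0).
baseline = mg1_mean_wait(0.5, 1.0, 1.0)
# "Cached" case: lower mean service time, but occasional slow misses
# push the variance up to 4.0.
cached = mg1_mean_wait(0.5, 0.8, 4.0)
print(baseline, cached)
```

Here the cached case has a 20% lower mean service time, yet its mean queueing delay comes out nearly twice the baseline, because E[S^2] grows with the variance.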
Personally, I'm not sure how much of a benefit this can provide. Does it work with NAT? How often do you drop packets based on incorrect routing as compared to those you would have dropped if you had put them in the queue? If this was a truly novel queueing theory application I would have expected to see it in an IEEE journal, not Spectrum.
And of course, any time someone opens with "The Internet is broken" you have to be a little skeptical. Routing is a well-studied and complex subject; saying that you've replaced "packets" with "flows" ain't gunna cut it in my book.
So, they've reimplemented CEF (Score:4, Interesting)
Yippee.
Cisco (and probably several others) have done this by default for many many moons now. By way of practical demonstration, notice that equal weight routes load balance per flow, not per packet. What it allows is subsequent routing decisions to be offloaded from a route processor down to the asics on the card level. And don't try to turn CEF off on a layer 3 switch - even a lightly loaded one - unless you want your throughput to resemble 56k.
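Per-flow load balancing over equal-cost routes generally comes down to hashing the fields that identify the flow and using the result to pick a path, so that every packet of a flow follows the same next hop and arrives in order. The sketch below uses CRC32 as a stand-in hash and made-up interface names; real forwarding hardware uses its own hash functions and field selection.

```python
import zlib


def pick_next_hop(five_tuple, next_hops):
    """Hash the flow identity so every packet of a flow takes the same path."""
    key = "|".join(map(str, five_tuple)).encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


hops = ["ge-0/0/0", "ge-0/0/1"]  # hypothetical equal-cost next hops
flow = ("192.0.2.10", 40000, "198.51.100.7", 443, "tcp")
# Repeated packets of the same flow always map to the same next hop.
assert pick_next_hop(flow, hops) == pick_next_hop(flow, hops)
```

Per-packet balancing would instead rotate through `hops` for each packet, which spreads load more evenly but can reorder packets within a flow.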
It looks like horrible technology (Score:2, Funny)
Among the innovations:
no ram for buffering flows to cope with any temporary overcommitments. Instead it does this:
"Even more significant, the FR-1000 does away entirely with the queuing chips. During congestion, it adjusts each flow rate at its input instead. If an incoming flow has a rate deemed too high, the equipment discards a single packet to signal the transmission to slow down."
Um, discarding a random packet in the middle of my session will indeed slow the flow down, much in the same way as if you s
This Design is Flawed (Score:3, Informative)
Don't Cross The Streams (Score:4, Insightful)
It would be bad.
I'm fuzzy on the whole good/bad thing. What do you mean, "bad"?
Try to imagine all the packets on your network stopping instantaneously and every router on the Internet exploding at the speed of light.
Total TCP reversal!!
Right, that's bad. Important safety tip. Thanks, Egon.
Wrong (Score:3, Informative)
"TCP throughput is inversely proportional to delay"
Absolutely wrong, 2Mb/s at 1ms delay gives the same throughput as 2Mb/s at 10ms delay
As long as the window is large enough
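Both statements can be squared with the usual bound: ignoring loss, a TCP connection can have at most one window of data in flight per round trip, so its throughput is capped by the smaller of the link rate and window / RTT. A quick sketch, with the function name and the 65535-byte window (the largest possible without window scaling) chosen for illustration:

```python
def tcp_throughput_bps(link_bps, window_bytes, rtt_s):
    """Upper bound on one TCP connection's throughput, ignoring loss:
    at most one window per round trip, and never more than the link rate."""
    return min(link_bps, window_bytes * 8 / rtt_s)


for rtt in (0.001, 0.010, 0.500):
    print(rtt, tcp_throughput_bps(2e6, 65535, rtt))
```

On a 2 Mb/s link with a 65535-byte window, 1 ms and 10 ms of delay both give the full 2 Mb/s, because the link is the bottleneck. At 500 ms the window becomes the limit and throughput falls to about 1.05 Mb/s, and from there on it really is inversely proportional to delay.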
Re: (Score:2)
http://fasterdata.es.net/ [es.net]
Re: (Score:2)
I see what you did there... (Score:2, Interesting)
He has re-invented the layer 3 switch... now with less jitter and latency because:
The FR-1000 does away entirely with the queuing chips. During congestion, it adjusts each flow rate at its input instead. If an incoming flow has a rate deemed too high, the equipment discards a single packet to signal the transmission to slow down. And rather than just delaying or dropping packets as in regular routers, in the FR-1000 the output provides feedback to the input. If there's bandwidth available, the equi
Look what it *doesn't* have. (Score:3, Insightful)
He's doing it *without* custom ASICs and without TCAM. TCAM is very expensive. I'm not sure this is faster than CEF or the like, but it may very well be cheaper.
Caspian Networks Reloaded (Score:2)
Fun watching people say this "doesn't work" -- back when I was at Caspian, the real world runs were working quite well at gigabit speed and if memory serves, they had a 10 gigabit line card (this was 2006). The cost was they had to design asics to do this and they were trying to get the same performance out of commodity hardware. It looks like this is the case - which means it's dropped the cost of the equipment significantly.
Where it does improvement over current routing and qos is that it does it on the
Re: (Score:2)
You won't get many answers, since that would require somebody else to be:
a) foolish enough to read it
b) even more foolish to admit it
Re:Net neutrality anyone? (Score:5, Informative)
That an ISP may prioritize services like VOIP over http or bittorrent is not what net neutrality is about and quite frankly is something that a good network engineer would look into and would probably implement.
Re: (Score:2, Insightful)
That an ISP may prioritize services like VOIP over http or bittorrent is not what net neutrality is about and quite frankly is something that a good network engineer would look into and would probably implement.
QoS isn't a bad thing, but the user should be in control of it, not the ISP; Who's to say that encrypted packet doesn't need a low-latency link more than the unencrypted VoIP connection? The ISP doesn't know -- it has to guess based on protocol data that may or may not be accurate. But that's a lot more work to implement and so most ISPs won't do it...
Re:Net neutrality anyone? (Score:5, Funny)
QoS isn't a bad thing, but the user should be in control of it
Exactly! That way MY packets (not some of them, ALL OF THEM) need to be prioritized.
Kind of reminds me of the good old days when I had access to print queue priorities. No-one ever understood why my printouts always came out first...I maintained I was just lucky.
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
The problem is that I'd be afraid of other people prioritizing all traffic rather than just some. So now I'm gonna prioritize all of my traffic. So now everything is in a gold queue and nothing gets prioritized. It is somewhat of a prisoner's dilemma. Contrast this with a network with end to end control where you can trust DSCP or COS values along the way. A possible solution is maybe to allow end users to
Re: (Score:2)
So we have a router that does stateful packet inspection and prioritizes traffic based on internal rules. Aren't we supposed to be against this?
I dunno. If the router is designed to look at packet flow rather than the contents of said packets or their source and destination, then you can still have net neutrality.
Re:Net neutrality anyone? (Score:5, Insightful)
Exactly how is this different from what we currently have?
Consider a conventional router receiving two packets that are part of the same video. The router looks at the first packet's destination address and consults a routing table. It then holds the packet in a queue until it can be dispatched. When the router receives the second packet, it repeats those same steps, not "remembering" that it has just processed an earlier piece of the same video.
Uh, no. This is called process switching. It hasn't been used in anything but the most low-end routers for quite some time. CEF (Cisco Express Forwarding) and MPLS [wikipedia.org] (Multiprotocol Label Switching) use flow control. They perform a lookup on the first packet, cache the information in a forwarding table and all further packets which are part of the same flow are switched, not routed, at effectively wire speeds. MPLS adds a label to the packet which identifies the flow, so it isn't even necessary to check the packet for the five components which define the flow. Just look at the label and send it on its way.
QOS (Quality Of Service) has multiple modes of operation and multiple queue types which address the issues of which packets to drop. It may or may not include deep packet inspection to attempt to determine the type of packet.
Perhaps they've come up with some new innovations that aren't obvious in the write-up because it's written at a relatively high level, but there's nothing here that isn't already implemented and that I don't already work with on a daily basis in production networks.
Re: (Score:2)
My cable modem connects to a Cisco 7200, which most certainly supports CEF and has for at least 10 years, which was when I first started playing with 7200s.
How much closer to the edge do you want?
It's been a few years since I was a router flunky so if I get the exact model wrong don't castrate me, but as I recall the Cisco 12k came out screaming about how it did this for many gb/s of data without even breathing heavy. I realize that's not the highest of high end by any means, and that model is years old, but
Re:Net neutrality anyone? (Score:4, Insightful)
No, it doesn't break net neutrality in and of itself, any more than a traffic light or a roundabout breaks road neutrality. The idea of routing flows, rather than packets, permits more packets to get through for the same bandwidth.
So long as all flows are treated fairly, this will actually BOOST network neutrality as network companies will have less justification to throttle back protocols which take disproportionate bandwidth - as they will no longer do so. Users will also have less cause to complain, as the effective bandwidth will move closer to the theoretical bandwidth.
The only concern is if corporations and ISPs use this sort of router to discriminate against flows (ie: ensure unfair usage) rather than to improve the quality of the service (ie: ensure fair usage).
The belief by ISPs that you cannot have high throughput unless you block legitimate users is nothing more than FUD. It has no basis in reality. It is possible, by moving away from best-effort and towards fair-effort, to get higher throughput for everyone.
Congested networks can be modeled as turbulent flow in a river. Blocking streams is like damming up some of the tributary streams. It causes a lot of grief and isn't really that effective.
On the other hand, smoothing out the turbulence will improve the throughput without having to dam up anything. QoS services are intended as smoothing mechanisms, not dams. For the most part, at least.
Most "net neutrality" advocates would be advised to focus only on the efforts to build gigantic dams, rather than to be unkind or unfair on those merely smoothing the way, with no bias or discrimination intended.
Re:Pretty girls make things go faster (Score:5, Funny)
Re: (Score:2)
Re: (Score:2)
If you have dozens or hundreds of long-duration, active flows (BitTorrent) and your neighbor has a few intermittent, short-duration flows (Firefox), it's pretty obvious who to throttle. The port numbers in use are irrelevant in this case.
Re: (Score:2)
Well, by that logic, packet-based routers don't work either. Any script kiddie can create new 'packets' at random, flood them at the router, and the router'll fall over and die.