MIT May Have Just Solved All Your Data Center Network Lag Issues
alphadogg (971356) writes A group of MIT researchers say they've invented a new technology that should all but eliminate queue length in data center networking. The technology will be fully described in a paper presented at the annual conference of the ACM Special Interest Group on Data Communication. According to MIT, the paper will detail a system — dubbed Fastpass — that uses a centralized arbiter to analyze network traffic holistically and make routing decisions based on that analysis, in contrast to the more decentralized protocols common today. Experimentation done in Facebook data centers shows that a Fastpass arbiter with just eight cores can be used to manage a network transmitting 2.2 terabits of data per second, according to the researchers.
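For readers who want a concrete picture of the centralized-arbiter idea, here is a minimal sketch: hosts ask a single arbiter for timeslots, and a transfer is granted a slot only when both its source and destination are free in that slot. This is purely illustrative; the actual Fastpass allocator described in the paper is far more sophisticated (and also picks paths, not just timeslots).

```python
# Toy sketch of the centralized-arbiter idea: grant a (src, dst) pair a
# timeslot only if neither endpoint is already committed in that slot.
# Greedy and single-threaded -- not the algorithm from the paper.
from collections import defaultdict

class ToyArbiter:
    def __init__(self):
        self.busy_src = defaultdict(set)   # timeslot -> sources already sending
        self.busy_dst = defaultdict(set)   # timeslot -> destinations already receiving

    def request(self, src, dst, n_slots, start_slot=0):
        """Return the list of timeslots granted for a src->dst transfer."""
        granted, t = [], start_slot
        while len(granted) < n_slots:
            if src not in self.busy_src[t] and dst not in self.busy_dst[t]:
                self.busy_src[t].add(src)
                self.busy_dst[t].add(dst)
                granted.append(t)
            t += 1   # endpoint waits for a later slot; queuing moves to the edge
        return granted

arbiter = ToyArbiter()
print(arbiter.request("hostA", "hostB", n_slots=3))   # [0, 1, 2]
print(arbiter.request("hostC", "hostB", n_slots=2))   # [3, 4] -- hostB is busy until slot 3
```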
Re: (Score:2)
You're a friend and a cosmonaut.
Re: (Score:2)
Nearly any network tech should be faster than Ethernet in certain circumstances. Ethernet is generally good though, and appears to be quite good at scaling.
I remember the good old days and the joys of beaconing 8)
Re: (Score:3)
Nearly any network tech should be faster than Ethernet in certain circumstances. Ethernet is generally good though, and appears to be quite good at scaling.
The key word there is scaling.
It looks like this is meant to make the network more efficient within a data center that handles a high volume of traffic, including high traffic spikes, by receiving a network time slot request from the end point (i.e. software running on a UNIX server) and sending a response that schedules packets to arrive just-in-time along a specific path to avoid queuing.
However, there is a less complicated way of achieving the same goal: Scalability - Increase your switch and server u
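As a rough illustration of the request/response exchange described above, the sketch below holds packets at the sending host until an arbiter grants a timeslot and a path. The ArbiterClient interface and send_nic callable are invented for the example; they are not the real Fastpass API.

```python
# Sketch of edge pacing: packets wait at the host until their granted
# timeslot arrives, so queuing happens here rather than in the switches.
import heapq

class PacedSender:
    def __init__(self, arbiter_client, send_nic):
        self.arbiter = arbiter_client   # assumed to expose request_slot(dst) -> (timeslot, path)
        self.nic = send_nic             # callable(packet, path) that puts bytes on the wire
        self.pending = []               # min-heap of (timeslot, seq, packet, path)
        self.seq = 0

    def enqueue(self, packet, dst):
        timeslot, path = self.arbiter.request_slot(dst)
        heapq.heappush(self.pending, (timeslot, self.seq, packet, path))
        self.seq += 1

    def tick(self, now):
        """Release every packet whose granted timeslot has arrived."""
        while self.pending and self.pending[0][0] <= now:
            _, _, packet, path = heapq.heappop(self.pending)
            self.nic(packet, path)
```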
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
Re: (Score:2)
No, I don't think so. RMS worked at MIT for over a decade.
Re:They re-invented static scheduling (Score:5, Informative)
Nah. They put MPLS logic (deterministic routing through a known domain) into an algorithm that optimizes time slots, too.
All the hosts are known, along with their time costs and how much crap they jam into the wires. It's pretty simple to typify what's going on, and where the packet parking lots are. If you have sufficient paths and bandwidth in and among the hosts, you resolve the bottlenecks.
This only works, however, if and when the domain of hosts has sufficient aggregate resources in terms of path availability among the hosts. Otherwise, it's the classic crossbar problem looking for a spot marked "oops, my algorithm falls apart when all paths are occupied."
Certainly it's nice to optimize, and there's plenty of room for algorithms that know how to sieve the traffic. But traffic is random, and pathways are limited. Defying the laws of physics will be difficult unless you control congestion in aggregate at the application level, where you can make the traffic predictable. Only then, or if you have a crossbar matrix, will there be no congestion. For any questions on this, look to the Van Jacobson algorithms and what the telcos had to figure out eons ago.
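To make the crossbar point concrete, here is a toy per-timeslot path picker: a path is granted only if every link on it is free, and once all candidate paths are claimed the remaining traffic has to wait for the next slot. The two-switch topology is invented for the example.

```python
# A path is a list of hops; a link is a (hop, next_hop) pair. Once every
# candidate path has a busy link, scheduling can only defer traffic, not
# conjure capacity -- the "all paths occupied" case from the comment above.
def pick_path(paths, busy_links):
    """Return the first candidate path whose links are all free, else None."""
    for path in paths:
        links = list(zip(path, path[1:]))
        if not any(link in busy_links for link in links):
            busy_links.update(links)     # claim the links for this timeslot
            return path
    return None                          # every path occupied: the packet queues

busy = set()
candidates = [["A", "S1", "B"], ["A", "S2", "B"]]   # two equal-cost paths A -> B
print(pick_path(candidates, busy))   # ['A', 'S1', 'B']
print(pick_path(candidates, busy))   # ['A', 'S2', 'B']
print(pick_path(candidates, busy))   # None -- both paths are taken this slot
```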
scalability? (Score:1, Insightful)
Re: (Score:3, Insightful)
FTA: “This paper is not intended to show that you can build this in the world’s largest data centers today,” said Balakrishnan. “But the question as to whether a more scalable centralized system can be built, we think the answer is yes.”
Yawn (Score:1)
Good grief: they appear to have invented a scheduler of some sort. I read the rather thin Network World article, and it reveals little.
Nothing to see here - move on!
Re: (Score:3, Informative)
A link to the paper is in the first article link. Direct link here [mit.edu]. They also have a Git repo to clone, if you're interested.
Hooray (Score:1)
Now I can see pictures of other people's food and children so much more quickly...can't wait..>.>
Re: (Score:1)
You forgot about the pr0n and cats. I will say, faster pics of cats probably has some merit.
Re: (Score:2)
Optimise. Only friend people who eat their children.
rfc1925.11 proves true, yet again (Score:2, Interesting)
Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.
Case in point: ATM To the Desktop.
In a modern datacenter "2.2 terabits" is not impressive. 300 10-gigabit ports (or about 50 servers) is 3 terabits. And there is no reason to believe you can just add more cores and continue to scale the bitrate linearly. Furthermore... how will Fastpass perform during attempted DoS attacks or other stormy conditions where there are sm
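For reference, the arithmetic behind those figures, assuming every port runs at line rate (which real workloads rarely sustain):

```python
# Back-of-the-envelope comparison of the hypothetical 300-port setup above
# against the 2.2 Tb/s figure quoted for the 8-core arbiter.
ports = 300
port_speed_gbps = 10
servers = 50

aggregate_tbps = ports * port_speed_gbps / 1000    # 3.0 Tb/s of raw port capacity
ports_per_server = ports / servers                 # 6 ports per server
fastpass_demo_tbps = 2.2                           # figure from the summary

print(aggregate_tbps, ports_per_server, round(aggregate_tbps / fastpass_demo_tbps, 2))
# 3.0  6.0  1.36 -- about 1.4x the traffic the 8-core arbiter was shown managing
```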
Re:rfc1925.11 proves true, yet again (Score:5, Interesting)
Your 300 x 10GB ports on 50 Servers is ... not efficient. Additionally, you're not likely saturating your 60GB off a single server, and you're running those six 10GB connections per server to try to eliminate other issues you have, without understanding them. Your speed issues are elsewhere (likely SAN or Database .. or both), and not in the 50 servers. In fact, you might be exasperating the problem.
BTW, our data center core is running twin 40GB connections for 80GB of total network capacity, but we're not really seeing anything using 10GB off a single node yet, except the SAN. Our Metro Area Network links are being upgraded to 10GB as we speak. "The network is slow" is not really a valid excuse here.
Re:rfc1925.11 proves true, yet again (Score:5, Funny)
In fact, you might be exasperating the problem.
I hate it when my problems get angry, it usually just exacerbates things.
Re: (Score:2)
I hate it when my problems get angry, it usually just exacerbates things.
I hear most problems can be kept reasonably happy by properly acknowledging their existence and discussing potential resolutions.
Problems tend to be more likely to get frustrated when you ignore them, and anger comes mostly when you attribute their accomplishments to other problems.
Re: (Score:1)
Your 300 x 10GB ports on 50 Servers is ... not efficient. Additionally, you're not likely saturating your 60GB off a single server, and you're running those six 10GB connections per server to try to eliminate other issues you have, without understanding them.
You haven't worked with large scale virtualization much, have you?
Re: (Score:2)
You haven't worked with large scale virtualization much, have you?
In all fairness... I am not at full-scale virtualization yet either, and my experience is with pods of 15 production servers with 64 CPU cores + ~500 GB of RAM each and 4 10-gig ports per physical server, half for redundancy, and bandwidth utilization is controlled to remain less than 50%. I would consider the need for more 10-gig ports or a move to 40-gig ports, if density were increased by a factor of 3: which is probable in a few y
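The rough numbers implied by that pod description, as I read it (my arithmetic, not the poster's):

```python
# Per-server and per-pod bandwidth ceilings under the stated constraints:
# 4 x 10-gig ports, half reserved for redundancy, utilization held under 50%.
servers_per_pod = 15
ports_per_server = 4
port_gbps = 10

raw_per_server = ports_per_server * port_gbps        # 40 Gb/s of raw NIC capacity
usable_per_server = raw_per_server / 2               # 20 Gb/s after redundancy
cap_per_server = usable_per_server * 0.5             # <10 Gb/s at <50% utilization
cap_per_pod = cap_per_server * servers_per_pod       # <150 Gb/s across the pod

print(raw_per_server, usable_per_server, cap_per_server, cap_per_pod)
```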
Re: (Score:3)
Your 300 x 10GB ports on 50 Servers is ... not efficient. Additionally, you're not likely saturating your 60GB off a single server,
It's not so hard to get 50 gigabits off a heavily consolidated server under normal conditions; throw some storage intensive workloads at it, perhaps some MongoDB instances and a whole variety of highly-demanded odds and ends, .....
If you ever saturate any of the links on the server then it's kind of an error: in critical application network design, a core link within your n
Re: (Score:2)
While it is possible to fill your data pathways up, aggregate data is not the same as edge-server data. In the case described above, s/he is running 300 x 10GB on 50 servers. Okay, let's assume those are 50 blades, maxed out on RAM and whatnot. The only way to fill that bandwidth is to do RAM-to-RAM copying, and then you'll start running into issues along the pipelines in the actual physical server.
To be honest, I've seen this, but only when migrating VMs off a host for host maintenance, or during a boot storm on our VDI.
Re: (Score:2)
To be honest, I've seen this, but only when migrating VMs off a host for host maintenance, or during a boot storm on our VDI.
Maintenance mode migrations are pretty common; especially when rolling out security updates. Ever place two hosts in maintenance mode simultaneously and have a few backup jobs kick off during the process?
Re:rfc1925.11 proves true, yet again (Score:5, Informative)
This is about zero in-plane queuing, not zero queuing. There is still a queue on each host; the advantage of this approach is obvious to anyone with knowledge of network theory (i.e. not you). Once a packet enters an Ethernet forwarding domain, there is very little you can do to re-order or cancel it. If you instead only send from a host when there is an uncongested path through the forwarding domain, you can reorder packets before they are sent, which lets you, for example, insert high-priority packets at the front of the queue and bucket low-priority traffic until there is a lull in the network.
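A small sketch of that reorder-at-the-edge idea, with invented priority levels and an invented "network is quiet" test, just to show the mechanism:

```python
# Because packets wait at the edge rather than inside the forwarding domain,
# a high-priority packet can jump ahead and bulk traffic can be held back
# until there is a lull.
import heapq, itertools

HIGH, LOW = 0, 1
_order = itertools.count()
host_queue = []                          # min-heap of (priority, arrival_order, packet)

def enqueue(packet, priority):
    heapq.heappush(host_queue, (priority, next(_order), packet))

def drain(network_is_quiet):
    sent = []
    while host_queue:
        priority, _, packet = host_queue[0]
        if priority == LOW and not network_is_quiet:
            break                        # bucket low-priority traffic until a lull
        heapq.heappop(host_queue)
        sent.append(packet)
    return sent

enqueue("bulk-backup", LOW)
enqueue("rpc-reply", HIGH)
print(drain(network_is_quiet=False))     # ['rpc-reply'] -- jumps ahead of the bulk copy
print(drain(network_is_quiet=True))      # ['bulk-backup']
```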
Bandwidth is always limited at the high end. Technology and cost always limit the peak throughput of a fully cross-connected forwarding domain. That's why the entire Internet isn't a 2-billion-way crossbar switch.
Furthermore, you can't install 6x 10-gigabit ports in a typical server; they just don't have that much PCIe bandwidth. You might also want to look at how much a 300-port non-blocking 10GigE switch really costs, then multiply that up by 1000x to see how much it would cost Facebook to build a 300k-node DC with them, and start to appreciate why they are looking at software approaches to optimise the bandwidth and latency of their networks with cost-effective resources, given that their network loads, like everyone else's, never look like the theoretical worst case of every node transmitting continuously to random other nodes.
Real network loads have shapes, and if you are able to understand those shapes, you can make considerable cost savings. It's called engineering, specifically traffic engineering.
-puddingpimp
Re: (Score:2)
You can get consumer hardware with 40 PCIe 3.0 lanes that run right into the CPU; wouldn't that be enough PCIe bandwidth?
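A quick sanity check on that question, using nominal PCIe 3.0 figures (roughly 985 MB/s of usable bandwidth per lane after 128b/130b encoding):

```python
# 40 CPU-attached PCIe 3.0 lanes versus six 10GbE ports.
lanes = 40
per_lane_gbps = 985 * 8 / 1000            # ~7.9 Gb/s usable per PCIe 3.0 lane
total_pcie_gbps = lanes * per_lane_gbps   # ~315 Gb/s of PCIe bandwidth
nic_gbps = 6 * 10                         # 60 Gb/s of NIC capacity

print(round(total_pcie_gbps), nic_gbps)
# ~315 vs 60 -- the lanes themselves are not the bottleneck, although a single
# x8 slot (~63 Gb/s) would sit right at the 60 Gb/s limit.
```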
The wonders of central planning... (Score:1)
Central planning works rather poorly for humans [economicshelp.org]. Maybe it will work better for computers, but I remain skeptical.
Oh, and the term "holistically" does not help either.
Re: (Score:2)
Ok (Score:1)
Ok, but the most important question is: did they implement it in Javascript, Go, Rust, Ruby or some other hipster, flavor-of-the-month-language?
Re: (Score:2)
Does your shop have a relatively narrow development scope? Over the course of my career, I've found that single language shops are either fairly tightly tied to a small set of problem domains, or they're full of people who see every problem as a nail so to speak. The latter condition is an unfortunate state of inflexibility that tends to extend into other areas, including higher level systems work and network architecture. I'm not saying your organization suffers from that affliction, but I would like to un
Re: (Score:2)
How shall we define importance? In terms of scope, are we talking about kernel space, userland code that humans directly interact with, systems/infrastructure code, data processing systems, or something else entirely?
Great! Another single point-of-failure... (Score:3, Insightful)
This is a really bad idea. No need to elaborate further.
OpenDayLight (Score:1)
How different is this to http://www.opendaylight.org/ [opendaylight.org]?
Re: (Score:2)
They may slow down the world if this gets hyped to the point that it sells.
The problem is TANSTAAFL. This is Yet Another Implementation that seeks to reduce the average latency, without regard to the fact that what really hurts is the worst-case latency bottleneck. This, like many other approaches before it, will worsen the worst case in order to buy the average case lunch.
You either have to come up with a solution that reduces the worst case, which is what really hurts, or make Pareto improvements, i.
Nginx? (Score:2)
Re: (Score:2)
Actually, it is fast because it does nothing, and that's also the point.
You don't use Apache to serve boat loads of static files, you use Nginx.
You don't use Nginx to serve ASP.NET or Java EE apps.
This sounds familiar... (Score:3)
I for one (Score:2)
Re: (Score:2)
Careful, the next one might have "smart" quotes!
Re: I for one (Score:2)
"Editors"
Good news for ... (Score:1)
... my Candy Crush Saga.
Net Neutrality (Score:2)
And big network service providers will implement it to the detriment of their revenue (think Comcast and Netflix). Riiiiiight.
So, if it allows less restricted dataflow... (Score:3, Funny)
...are they trying to say that "Arbiter macht frei"?
Re: (Score:1)
Please read carefully, this article is a carefully (Score:1)
This paper shows no tangible benefit other than a slight decrease in TCP retransmits, and the authors never test whether that translates into any real benefit.
Crucially, this system is not "zero queue". They simply move queuing to the edge of the network and into the arbiter. Notice that there is no evaluation of the total round-trip delay in the system. The dirty secret is that it's no better, especially as the load increases, since the amount of work that the arbiter must do grows exponentially with both th
Some back of the envelope calculations (Score:2)
It will only take one extra management processor (8 cores) to manage two switches... Get back to me when you can drive 100TBit/sec with one core
PS - is there extra compute needed on the management plane of the edge switches here? I don't think so, but it is hard to tell.
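The back-of-the-envelope version of that complaint, assuming (generously) that arbiter throughput scales linearly with core count, which the thread itself doubts:

```python
# Extrapolating the 8-core / 2.2 Tb/s figure to a 100 Tb/s fabric.
cores = 8
managed_tbps = 2.2
per_core_tbps = managed_tbps / cores          # ~0.275 Tb/s per arbiter core

target_tbps = 100
cores_needed = target_tbps / per_core_tbps    # ~364 cores, if scaling were linear

print(round(per_core_tbps, 3), round(cores_needed))
```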
Re: (Score:2)