Is There a Place for a $500 Ethernet Card? 423
prostoalex writes "ComputerWorld magazine runs a story on Level 5 Networks, which emerged from stealth startup status with its own brand of network cards and software called EtherFabric. The company claims the cards reduce the load on servers' CPUs and improve communication between servers. And it's not vaporware: 'The EtherFabric software shipping Monday runs on the Linux kernel 2.4 and 2.6, with support for Windows and Unix coming in the first half of next year. High volume pricing is $295 for a two-port, 1GB-per-port EtherFabric network interface card and software, while low volume quantities start from $495.'"
yes, there is (Score:4, Informative)
Re:A look into the past (Score:4, Informative)
Of course, the A2 is perfectly capable of running its own TCP/IP stack - Uther doesn't do any of that, IIRC, nor does the LANceGS (although, it seems that the LANce can only do pings on the
Wheel of Reincarnation (Score:1, Informative)
Re:A look into the past (Score:3, Informative)
Load on our servers from network processing easily increased by 20% when we moved to an all Ethernet/IP setup. $500 for a "smart" NIC, hell yeah! As much as my boss may chide me about it, I still lament the loss of ATM in our network.
-PONA-
Re:Sure there's a place for them (Score:3, Informative)
Re:A look into the past (Score:3, Informative)
The thing that has changed is the frequency at which frames arrive. Unless you can use jumbo frames (and even then, if the payloads are small), GigE is delivering the same sized frames as Fast Ethernet, just 10x faster. This tends to create a hell of a lot more interrupts for the processor to handle (a condition made worse by the deeper pipelines in processors like the P4). If you can offload the processing of the frames a bit, just enough to give the processor a chance to get something done, you could dramatically improve performance.
That being said, changes to the protocol (such as jumbo frames) can also have a positive effect in a lot of circumstances, and have the advantage of being cheaper to implement.
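The arithmetic behind the interrupt argument is easy to sketch. A rough back-of-envelope in Python, assuming minimum Ethernet framing overhead (8-byte preamble, 12-byte inter-frame gap, 18 bytes of header/FCS per frame; exact figures vary by setup):

```python
# Back-of-envelope frame rates: Fast Ethernet vs. GigE, small vs. jumbo frames.
OVERHEAD = 8 + 12 + 18  # bytes on the wire per frame beyond the payload

def frames_per_second(link_bps, payload_bytes):
    wire_bytes = payload_bytes + OVERHEAD
    return link_bps / (wire_bytes * 8)

fast_small = frames_per_second(100e6, 46)   # Fast Ethernet, minimum payload
gige_small = frames_per_second(1e9, 46)     # GigE, same small frames
gige_jumbo = frames_per_second(1e9, 9000)   # GigE with jumbo frames

print(f"Fast Ethernet, 46B payload: {fast_small:,.0f} frames/s")
print(f"GigE, 46B payload:          {gige_small:,.0f} frames/s")
print(f"GigE, 9000B jumbo payload:  {gige_jumbo:,.0f} frames/s")
```

At minimum frame size, GigE can deliver roughly 1.49 million frames per second - exactly 10x Fast Ethernet's ~149,000 - while jumbo frames cut the rate (and thus the potential interrupt load) by two orders of magnitude.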
Re:A look into the past (Score:3, Informative)
Is There a Place for a $500 Ethernet Card? (Score:3, Informative)
Sure [amazon.com] there [amazon.com] is [amazon.com].
Re:What good is such a fast Ethernet card... (Score:2, Informative)
It's a farm of servers that looks at incoming requests and renders the pages based on the host header name. The same boxes might be serving up transactional content for a couple dozen businesses off of a common code base, with all of them having wildly different look/feel and behavior. Much of what differentiates one merchant's presentation from another is data driven, to say nothing of a page that must pre-calculate municipal tax rates for each of maybe a hundred different types of items on an order, take into account the shopper's account status, order history, affiliate referrals... to say nothing of real-time inventory availability checked on every page load, multi-language and currency behavior, intrusion and fraud detection, item kitting, etc. Grabbing a hundred scraps of data from the underlying database (including session management and all sorts of other housekeeping, including traffic-log writes) is actually pretty minimal when you consider what it's all doing.
Add on top of that the layers that have to monitor all of that activity for the company that's running all of this for those merchants (and reporting to them on traffic, visitor search behavior in real time, and so on) and you'll see what I mean. So, sure, slightly faster 1Gb Ethernet cards can definitely help out. Would a few slightly better designed stored procedures help? Maybe a tiny bit. But really complex online selling through a managed service with lots of users... there's a certain amount of complexity that can't be designed away.
High-Performance Computing (Score:2, Informative)
That being said, http://www.ammasso.com/ [ammasso.com] makes an Ethernet card (priced around $495, I believe) that utilizes both TCP offload and RDMA. The latency of the cards is around 10us. This is great for people who need a high-performance cluster but can't afford the InfiniBand interconnect.
Comment removed (Score:3, Informative)
Re:A look into the past (Score:5, Informative)
Peripherals like that built into the motherboard are generally on a PCI bus segment anyway. You can see by looking at the device manager in Windows or by using lspci in Linux. In both cases you will see a vendor ID, bus number, and slot number.
Re:A look into the past (Score:4, Informative)
Collisions are not a problem on switched networks. Even on older shared media and hub based networks, collisions were not the evil thing that they were portrayed as. Ethernet is not Aloha. See Measured Capacity of an Ethernet: Myths and Reality [compaq.com] by David R. Boggs, Jeffrey C. Mogul, Christopher A. Kent. It debunks much of the misinformation that is still prevalent.
Re:A look into the past (Score:4, Informative)
The thing driving smart Ethernet cards is stuff like RDMA and SCSI over IP. Part of the thinking behind RDMA is that the card exerts the same load on the host as local DMA (i.e. almost none). For SCSI over IP, the thinking is that doing SCSI is already enough load for the host CPU, so let it treat the network interface as "send it and forget it."
As for avoiding the kernel context switch, I haven't looked at how this card is implemented, but with the right smarts on the card, and a replacement socket library, they could enable each process to talk directly to the card - bypassing the kernel once stuff is initialized - kind of the way an X server can talk directly to the frame-buffer without involving the kernel.
Search for "Dealing with high network loads" (Score:3, Informative)
and have a read of why the interrupt problem isn't a problem anymore, at least on Linux. Note the date too - October 2001.
LWN.net [lwn.net]
NAPI has been implemented in the kernel.org kernels for a number of years now.
Re:A look into the past (Score:4, Informative)
Re:A look into the past (Score:5, Informative)
With newer Linux-kernels this is quite simply not the case.
To avoid the torrent of interrupts from a fast NIC, the Linux kernel detects when the card gets packets so often that essentially there's "always" one or more packets waiting in the card's buffers.
It responds to this condition by disabling interrupts for the card in question and switching to polling it regularly.
Normally polling is inefficient, because it amounts to asking over and over again "got anything for me now?", where in most situations the answer is "no" 99.99% of the time, which makes it a waste of resources to ask in the first place.
This changes when the answer is "yes" basically all the time. There's no need for the network card to tell the CPU over and over and over "I got a packet for you"; instead the CPU collects packets regularly.
It's sorta like having your lawyer receive legal letters for you.
If you get very few, it'd be a waste for you to drop by him every day and ask if he's gotten any for you (polling); most days you'd be making the trip for naught. In this situation interrupts (i.e. having your lawyer call you and inform you on the rare occasions when a letter *does* arrive) make more sense.
But if your letter-flow increases to the point where there are normally 3-5 letters every day and it's rare that no letter arrives, then it no longer makes sense for the lawyer to call you every day to tell you you got letters (interrupts); you already assumed that. In this scenario it makes more sense for you to come by regularly and pick up letters without being prompted to do so (polling).
Thus, the flow of interrupts from a Gb NIC being flooded with small 100-byte packets (say, a loaded DNS server) is not 1 million interrupts every second -- it is zero interrupts.
(Although what you write is correct for kernels older than 2.6.10 and for less clever OSes.)
Re:Um, no. (Score:2, Informative)
If you encrypt data it makes it more random
Encrypted data should look completely random, not "more" random, or else you can use the patterns in the stream for malicious activity. That's why, if you want to compress and encrypt, you always compress and then encrypt. Compressing an encrypted stream is impossible if your encryption is worth its salt.
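A quick way to see this: redundant plaintext compresses dramatically, while cipher-like random bytes do not. In the sketch below, os.urandom() merely stands in for the output of a decent cipher; it is not real encryption:

```python
import os
import zlib

# Highly redundant plaintext: compresses very well.
plaintext = b"the quick brown fox jumps over the lazy dog " * 256
compressed = zlib.compress(plaintext)

# Random bytes standing in for ciphertext: zlib finds nothing to squeeze,
# and actually adds a few bytes of framing overhead.
pseudo_ciphertext = os.urandom(len(plaintext))
compressed_ct = zlib.compress(pseudo_ciphertext)

print(len(plaintext), len(compressed), len(compressed_ct))
```

This is exactly why the pipeline order is compress-then-encrypt: run the steps the other way around and the compressor sees only "ciphertext" and accomplishes nothing.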
I worked on TCP offload card at Adaptec (Score:5, Informative)
http://www.adaptec.com/worldwide/product/prodfami
It was a complete TCP stack in hardware (with the exception of startup/teardown, which still was intentionally done in software, for purposes of security/accounting).
Once the TCP connection was established, the packets were completely handled in hardware, and the resulting TCP payload data was DMA'ed directly to the application's memory when a read request was made. Same thing in the other direction, for a write request. Very fast!
I'm not sure of the exact numbers, but we reduced CPU utilization to around 10%-20% of what it was under a non-accelerated card, and were able to saturate the wire in both directions using only a 1.0GHz CPU. This is something that was difficult to do, given the common rule of thumb that you need 1MHz of CPU speed to handle every 1Mbit of data on the wire.
To make a long story short, it didn't sell, and I (among many others) was laid off.
The reason was mostly about price/performance: who would pay that much for just a gigabit ethernet card? The money that was spent on a TOE-accelerated network card would be better spent on a faster CPU in general, or a more specialized interconnect such as InfiniBand.
When 10Gb Ethernet becomes a reality, we will once again need TOE-accelerated network cards (since there are no 10GHz CPUs today, as we seem to have hit a wall at around 4GHz). I'd keep my eye on Chelsio [chelsio.com]: of the Ethernet TOE vendors still standing, they seem to have a good product.
BTW, did you know that 10Gb Ethernet is basically "InfiniBand lite"? Take InfiniBand, drop the special upper-layer protocols so that it's just raw packets on the wire, treat that with the same semantics as Ethernet, and you have 10GbE. I can predict that Ethernet and InfiniBand will conceptually merge, sometime in the future. Maybe Ethernet will become a subset of InfiniBand, like SATA is a subset of SAS....
Re:IPSEC (Score:3, Informative)
Re:A look into the past (Score:3, Informative)
How it works. (Score:3, Informative)
This card, and the software which drives it, differ from traditional Ethernet accelerator cards and from alternative network protocols (like Myrinet and iWARP) in several ways.
Alternative protocols not only require using a different software API but also require custom hardware at both communication endpoints.
Traditional hardware TCP/IP accelerators run the bottom half of the stack in custom silicon. This does tend to reduce host CPU load, but such accelerators suffer from a number of problems. Since host CPU speeds have tended to increase regularly, they often helped for only a brief period of time. They also tended to help most for large packets but helped little or not at all for small packets.
This technology claims to help large and small packets equally well, and also claims to reduce packet latency across the board. It does so by running the bulk of the TCP/IP stack in user space rather than via system calls. The hardware handles Ethernet Rx and Tx processing but does not implement the higher-level IP protocol processing. Instead, once connections are established, the Ethernet frames coming from the hardware are fed to the application process to which they belong. From then on, no further context switches between the kernel and the process are required. The top end of the hardware driver and all of the subsequent IP layers are executed in the context of the user-space process. They are linked to the app via shared libraries.
Basically, instead of linking the IP calls against code which requires frequent switching between user and kernel space, the entire upper half of the stack is run by the application sending and receiving the packets. This offers uniform benefits in packet latency across all packet sizes, and offers improvement in throughput as well.
I assume that all that is required is to link against a different set of shared libraries to gain these benefits (and of course to have the custom hardware on at least one side of the comm. link). This looks very good in principle.
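A minimal sketch of that idea (entirely hypothetical names and a fake 4-byte length header; nothing here is Level 5's actual API): the "kernel" is involved only at connection setup, when it binds a flow to a per-process queue, and after that all per-frame protocol work runs in library code inside the application:

```python
from collections import deque
import struct

class UserSpaceStack:
    """Stand-in for the setup path: allocates per-process frame queues."""
    def __init__(self):
        self.queues = {}  # flow id -> per-process frame queue

    def setup(self, flow_id):
        """One-time 'system call': bind a flow to a queue the app owns."""
        self.queues[flow_id] = deque()
        return self.queues[flow_id]

    def deliver(self, flow_id, frame):
        """Hardware-side demux: drop the raw frame into the owning queue."""
        self.queues[flow_id].append(frame)

def recv(queue):
    """Library-side protocol processing, no kernel involvement:
    strip our fake 4-byte length header and return the payload."""
    frame = queue.popleft()
    (length,) = struct.unpack("!I", frame[:4])
    return frame[4:4 + length]

stack = UserSpaceStack()
q = stack.setup(flow_id=7)                         # kernel touched once
stack.deliver(7, struct.pack("!I", 5) + b"hello")  # frame arrives via demux
print(recv(q))                                     # -> b'hello'
```

The point of the structure: once setup() has run, deliver() and recv() never cross the (pretend) kernel boundary again, which is the context-switch saving the parent comment describes.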
The following page provides an overview of the technology and compares it to each of the competing mechanisms.
http://www.level5networks.com/sol_approaches.htm [level5networks.com]