Remus Project Brings Transparent High Availability To Xen

An anonymous reader writes "The Remus project has just been incorporated into the Xen hypervisor. Developed at the University of British Columbia, Remus provides a thin layer that continuously replicates a running virtual machine onto a second physical host. Remus requires no modifications to the OS or applications within the protected VM: on failure, Remus activates the replica on the second host, and the VM simply picks up where the original system died. Open TCP connections remain intact, and applications continue to run unaware of the failure. It's pretty fun to yank the plug out on your web server and see everything continue to tick along. This sort of HA has traditionally required either really expensive hardware, or very complex and invasive modifications to applications and OSes."
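
For readers wondering how open TCP connections can survive a failover at all, the general shape of the approach is checkpoint-based: the primary runs the VM speculatively, ships the dirty state to the backup every few tens of milliseconds, and buffers outbound packets until the backup acknowledges the checkpoint. The Python below is a minimal, purely illustrative sketch of that loop; every class, method, and interval in it is a hypothetical stand-in, not the actual Xen/Remus code.

    # Conceptual sketch of a Remus-style epoch: run speculatively, checkpoint,
    # and only release network output once the backup has acknowledged the
    # checkpoint. All classes and method names are hypothetical.

    import time

    CHECKPOINT_INTERVAL = 0.025   # seconds between epochs (tens of ms in practice)

    class PrimaryHost:
        def __init__(self, vm, backup, nic):
            self.vm = vm          # hypothetical handle to the protected VM
            self.backup = backup  # hypothetical channel to the backup host
            self.nic = nic        # hypothetical NIC wrapper that can buffer output

        def run_epoch(self):
            """One replication epoch."""
            self.nic.start_buffering()            # hold all outbound packets
            self.vm.resume()
            time.sleep(CHECKPOINT_INTERVAL)       # VM executes speculatively

            self.vm.pause()
            delta = self.vm.capture_dirty_pages_and_device_state()
            self.vm.resume()                      # keep running while we transmit

            self.backup.send_checkpoint(delta)
            self.backup.wait_for_ack()            # backup now holds a consistent copy
            self.nic.release_buffered_packets()   # output becomes visible only now

        def protect(self):
            while True:
                self.run_epoch()

    class BackupHost:
        def on_checkpoint(self, delta):
            self.apply(delta)                     # fold changes into the standby replica

        def on_primary_failure(self):
            # Resume from the last acknowledged checkpoint. Clients never saw any
            # packet produced by unreplicated state, so open TCP connections survive.
            self.activate_replica()

The key property is the output commit: nothing a client ever sees was produced by state that isn't already safe on the backup, which is why the replica can pick up mid-connection.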

Comments:
  • by Lurching ( 1242238 ) on Wednesday November 11, 2009 @06:50PM (#30066950)
    They may have a patent too!!
  • Himalaya (Score:3, Interesting)

    by mwvdlee ( 775178 ) on Wednesday November 11, 2009 @06:57PM (#30067042) Homepage

    How does this compare to a "big iron" solution like Tandem/Himalaya/NonStop/whatever-it's-called-nowadays?

  • Re:Himalaya (Score:5, Interesting)

    by teknopurge ( 199509 ) on Wednesday November 11, 2009 @07:22PM (#30067282) Homepage
    VM replication like this still has an I/O bottleneck. This isn't magic: unless you move to InfiniBand you're not going to touch something like a Stratus or NonStop machine (see the rough bandwidth arithmetic below). By the time you add in the cost of the high-performance interconnects, you're on par with the real-time boxes. There's a lot of convergence going on where people rebuild the mainframe, ass-backward, out of client/server gear. Makes little sense to me other than as a gimmick.

    By the time you get all the components that provide the processing and I/O throughput of those high-end boxes, the x86/64 commodity hardware cost advantage has evaporated.
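
    To put a rough number on that I/O bottleneck, here's a back-of-the-envelope Python calculation; the dirty-page rate is an assumed figure chosen purely for illustration:

        # Back-of-the-envelope replication bandwidth. The dirty-page rate is an
        # assumed figure for a busy VM, purely for illustration.
        PAGE_SIZE = 4096                  # bytes per x86 page
        dirty_pages_per_epoch = 10_000    # assumption: pages dirtied per checkpoint
        epoch_seconds = 0.050             # 50 ms between checkpoints

        bytes_per_second = dirty_pages_per_epoch * PAGE_SIZE / epoch_seconds
        gbit_per_second = bytes_per_second * 8 / 1e9
        print(f"~{gbit_per_second:.1f} Gbit/s of replication traffic")   # ~6.6 Gbit/s
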
  • by melted ( 227442 ) on Wednesday November 11, 2009 @07:33PM (#30067398) Homepage

    I'm pretty sure that if I just yank the cable, not everything will be replicated. :-)

  • Re:state transfer (Score:4, Interesting)

    by Vancorps ( 746090 ) on Wednesday November 11, 2009 @07:38PM (#30067442)

    If your primary and secondary systems are physically located next to each other, they aren't really in the highly-available category. Furthermore, with storage replication and regular snapshotting you can have your virtual infrastructure at your DR site on the cheap, while gaining enterprise availability and, most importantly, business continuity.

    I'll agree with being skeptical about transparency; then again, how many people already have this? I went with XenServer and Citrix Essentials for it; I already have this fail-over, and I can tell you that it works. I physically pulled a blade out of the chassis and, sure enough, by the time I got back to my desk the servers were functioning again, having dropped a whole packet. Further tweaking of the underlying network infrastructure resulted in keeping that packet, with just a momentary rise in latency.

    Enterprise availability is fast coming to the little guys.

  • by TheRaven64 ( 641858 ) on Wednesday November 11, 2009 @08:00PM (#30067622) Journal
    I know that a company called Marathon Technologies owns a few patents in this area. A few of their developers were at the XenSummit in 2007 where the project was originally presented.
  • by BitZtream ( 692029 ) on Wednesday November 11, 2009 @09:12PM (#30068200)

    No it won't.

    VMware claims the same crap and it's simply not true.

    In your example, you have a 50ms window between checkpoints that can be lost. The only way to ensure nothing is lost is to ensure that every change, every instruction, every microcode operation executed by the CPU on machine A is duplicated on B before A continues to the next one. You simply can't do that without specialized hardware, since you don't even have access to the microcode as it's executed on standard hardware.

    50ms on my hardware/software can mean thousands of transactions lost. That can wreak havoc on certain network protocols and cause database operations to fail completely as you replay portions of transactions that the database has already seen.

    I can come up with situations all day long where this isn't as seamless as you make it out to be. Sure, xclock transitions to the other machine in what appears to be a perfect, lossless handoff, and so does Solitaire on a Windows box, but that's not exactly useful.

    Remus has plenty of uses, but it has plenty of pitfalls too, and regardless of the claims it does require consideration when you're designing systems, unless you're willing to introduce latency that, to me, would be completely unacceptable and would force applications to be aware of it. Hell, that's 6.25MB of data that can cross a gigabit pipe between checkpoints (the arithmetic is spelled out below). That can kill performance.

    I know what you're saying, I know what you mean, and I just don't think you realize how much that latency can affect certain classes of applications.
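
    For reference, the arithmetic behind that 6.25MB figure, as a trivial Python snippet:

        # How much data a 1 Gbit/s link can move during one 50 ms checkpoint window.
        link_bits_per_second = 1_000_000_000   # 1 Gbit/s
        window_seconds = 0.050                 # 50 ms between checkpoints

        megabytes_in_window = link_bits_per_second * window_seconds / 8 / 1_000_000
        print(f"{megabytes_in_window:.2f} MB per checkpoint window")   # 6.25 MB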

  • by dido ( 9125 ) <dido&imperium,ph> on Wednesday November 11, 2009 @09:51PM (#30068462)

    This is something that the much simpler Linux-HA environment deals with by using something they call STONITH, which stands for Shoot The Other Node In The Head. STONITH peripherals are devices that can completely shut down a server physically, e.g. a power strip that can be controlled via a serial port. If you wind up with a partitioned cluster, which they more colorfully call a 'split brain' condition, where each node thinks the other one is dead, each of them uses the STONITH device to make sure, if it is able. One of them will activate the STONITH device before the other; the one that wins keeps on running, while the one that loses really does kick the bucket if it wasn't fully dead already. I imagine that Remus must have similar mechanisms to guard against split-brain conditions as well (a rough sketch of the fencing race is below).

    I've had several Linux-HA clusters go split brain on me, and I can tell you it's never pretty. The best case is that they both merely try to grab the same IP address and end up with an address conflict; in the worst case, they both try to mount and write to the same Fibre Channel disk at the same time and bollix the file system. If a Remus-based cluster splits its brain, I imagine you'll get mayhem just as awful unless you have a STONITH-like system to prevent it from happening.
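
    Here's a minimal, purely illustrative Python sketch of that fencing race; the PowerSwitch controller and every other name in it are hypothetical stand-ins, not actual Linux-HA or Remus interfaces:

        # Minimal sketch of a STONITH-style fencing race between two cluster
        # nodes. PowerSwitch and everything else here is a hypothetical stand-in
        # for a real fencing device such as a serially controlled power strip.

        import time

        class PowerSwitch:
            """Hypothetical controller for the outlet feeding the peer node."""
            def cut_power(self, outlet):
                print(f"outlet {outlet} powered off")

        def take_over_resources():
            print("claiming the service IP and mounting shared storage")

        def monitor_peer(heartbeat_ok, fence, peer_outlet, timeout=5.0, poll=0.5):
            """Both nodes run this loop against each other.

            Whichever node fences first survives; the loser is physically powered
            off, so the two can never both grab the shared IP or write to the
            shared disk at the same time.
            """
            silent_for = 0.0
            while True:
                if heartbeat_ok():
                    silent_for = 0.0
                else:
                    silent_for += poll
                    if silent_for >= timeout:
                        fence.cut_power(peer_outlet)   # STONITH: peer is now off
                        take_over_resources()          # safe to claim resources
                        return
                time.sleep(poll)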

  • Re:state transfer (Score:3, Interesting)

    by shmlco ( 594907 ) on Thursday November 12, 2009 @01:32AM (#30069682) Homepage

    "If your primary and secondary systems are physically located next to each other then they aren't in the category of highly available."

    High availability covers more than just distributed data centers. Load balancing, fail-over, clustering, mirroring, redundant switches, routers, and other hardware: these are all high-availability measures aimed at removing single points of failure.

  • Re:Himalaya (Score:3, Interesting)

    by anon mouse-cow-aard ( 443646 ) on Thursday November 12, 2009 @07:56AM (#30071094) Journal
    We had a 700-kline app written in a Tandem-specific application language. The smallest server we could get from HP was $400K. We rewrote the app in Python to use pairs of servers replicating via DRBD over Ethernet, with a load balancer in front (a rough sketch of the topology is below). DRBD is slow, but with the new app I could just add pairs of nodes. We already had such a configuration for another application, and we combined the two, so the hardware cost was just adding two nodes to that cluster, at about $4K per server node. $400K -> $8K. I think it would take a heck of a lot of extra hardware to make up for the pricing of that gear.
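
    For the curious, a toy Python sketch of that pairs-plus-load-balancer topology; the host names, port, and health check are invented for illustration, and the real replication is DRBD's job underneath, not this code's:

        # Toy sketch of the "replicated pairs behind a load balancer" topology.
        # Host names, port and health check are made up; the actual block-level
        # replication happens in DRBD, not in this code.

        import socket

        PAIRS = [
            ("app1a.example.org", "app1b.example.org"),   # each pair shares a DRBD volume
            ("app2a.example.org", "app2b.example.org"),
        ]
        SERVICE_PORT = 8080

        def alive(host, port=SERVICE_PORT, timeout=1.0):
            """Crude TCP health check a front-end balancer might run."""
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    return True
            except OSError:
                return False

        def backends():
            """Pick one live member of each pair; capacity scales by adding pairs."""
            chosen = []
            for primary, secondary in PAIRS:
                if alive(primary):
                    chosen.append(primary)
                elif alive(secondary):
                    chosen.append(secondary)   # the DRBD replica serves the pair's data
            return chosen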

"What man has done, man can aspire to do." -- Jerry Pournelle, about space flight

Working...