Best Solution For HA and Network Load Balancing?
supaneko writes "I am working with a non-profit that will eventually host a massive online self-help archive and community (using FTP and HTTP services). We are expecting 1,000+ unique visitors / day. I know that having only one server to serve this number of people is not a great idea, so I began to look into clusters. After a bit of reading I determined that I am looking for high availability, in case of hardware fault, and network load balancing, which will allow the load to be shared among the two to six servers that we hope to purchase. What I have not been able to determine is the 'perfect' solution that would offer efficiency, ease of use, simple maintenance, enjoyable performance, and a notably better experience when compared to other setups. Reading about Windows 2003 Clustering makes the whole process sound easy, while Linux and FreeBSD just seem overly complicated. But is this truly the case? What have you all done for clustering solutions that worked out well? What key features should I be aware of for a successful cluster setup (hubs, wiring, hardware, software, same servers across the board, etc.)?"
F5 (Score:1, Interesting)
Re:You will be OK (Score:2, Interesting)
Re:Is It Mission Critical? (Score:5, Interesting)
It's not overboard. And even with a hosting provider you're still exposed to hardware failures. To get what you want, you can:
- buy 2 cheap servers with lots of RAM
- set them up as Xen hosts
- create 2 virtuals for the load balancers
- set up LVS (heartbeat + ldirectord) on each virtual
- create 4 webserver virtuals, 2 on each Xen host
- configure your load balancers to distribute load over all webserver virtuals
And you're done. Oh, make sure to disable tcp_checksum_offloading on your webservers, else LVS won't work that well (read: not at all).
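The steps above might look something like this ldirectord stanza (a minimal sketch; the virtual IP 192.0.2.10 and the real-server addresses are placeholders for your four webserver virtuals, and healthcheck.html is an assumed health-check page):

```text
# /etc/ha.d/ldirectord.cf -- minimal sketch, addresses are placeholders
checktimeout=10
checkinterval=5
autoreload=yes

virtual=192.0.2.10:80
        real=192.0.2.21:80 gate
        real=192.0.2.22:80 gate
        real=192.0.2.23:80 gate
        real=192.0.2.24:80 gate
        service=http
        request="healthcheck.html"
        receive="OK"
        scheduler=rr
        protocol=tcp
```

Here "gate" selects direct routing; heartbeat then fails the virtual IP over between the two load-balancer virtuals. On the webserver side, checksum offloading can typically be toggled with something like `ethtool -K eth0 tx off rx off` (exact interface name depends on your setup).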
we run a nonprofit with 100m+ visitors a day (Score:5, Interesting)
STOP. You have no idea what you're doing. (Score:5, Interesting)
I'm sorry, but I have to say that. Don't be offended, please - sooner or later you will look at your submission and laugh really hard, but for now you need to realise that you said something very, very silly. A few people already politely pointed out that 1000 visitors a day is nothing - but seriously, it's such a great magnitude of nothingness that, if you make such a gross misinterpretation of your expected traffic, you need to reconsider whether you really are the right person for the job *right now* and maybe gain some more experience before trying to spend other people's money on a ton of hardware that will just sit there, idle, and consume huge amounts of electricity (also paid for with other people's money).
I'm serving a 6k/day website (scripting, database, some custom daemons etc.) from a Celeron 1.5GHz with 1GB RAM, and it's still doing almost nothing. If you really have to have some load balancing, get two of those for $100 each.
Re:1000+ a day isn't very much (Score:5, Interesting)
HA isn't there just for load issues. It's there to guarantee availability. 1,000 users might be peanuts, but I've got a site that only gets a couple hundred visitors a day. That site has clustered load balancers which talk to redundant app servers, which talk to redundant web servers (connected via redundant switches). It's really important that the site be there for those couple of hundred visitors.
The number of visitors isn't as important as the importance of the visitors.
Re:1000+ a day is trivial have you thought of amaz (Score:4, Interesting)
Outsource web hosting. But, which provider? (Score:3, Interesting)
I've looked at A2 Hosting [a2hosting.com]. I've never used them, and don't know anyone connected with them, but they seem like they know what they are doing.
I wouldn't recommend my present web host.
Does anyone else have recommendations about web hosting?
Requirements for 1000 unique visitors/day (Score:3, Interesting)
The requirements to handle 1000 unique visitors/day will depend on what exactly you are serving. I ran a website that got well over 1000 uniques per day on a Pentium MMX 200 MHz with 64 megs of RAM and a 1.2 gigabyte hard drive. This was significantly overkill for the site. However, that was entirely static content. Oh, except it handled email, spam filtering, and a database for a POS system for a retail establishment with two stores.
If you are serving mostly dynamic content, you'll want more processing power and more RAM. Almost certainly, you'll be fine with a bottom-end computer, but you probably want something manufactured in the last five years or so. This will obviously depend on what your dynamic content actually is, though; more complexity will require more processing power.
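The back-of-envelope arithmetic behind "1000 uniques/day is nothing" is easy to check (a sketch; the pages-per-visit and requests-per-page figures are illustrative assumptions, not measurements):

```python
# Rough request-rate estimate for 1,000 unique visitors/day.
visitors_per_day = 1000
pages_per_visit = 10          # assumed
requests_per_page = 20        # assumed: HTML plus images/CSS/JS

requests_per_day = visitors_per_day * pages_per_visit * requests_per_page
avg_req_per_sec = requests_per_day / 86_400

# Traffic is bursty, so size for a peak well above the average.
peak_req_per_sec = avg_req_per_sec * 10

print(f"average: {avg_req_per_sec:.2f} req/s, assumed peak: {peak_req_per_sec:.1f} req/s")
```

Even with the generous assumptions above, that works out to a couple of requests per second on average - a rate that very modest hardware can serve for static or lightly dynamic content.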
If you cannot afford any outages, you may want to look at redundant hardware, failover systems, etc. etc., but you first need to determine how much an outage will cost you. What if you have a 5 minute outage? An outage lasting an hour? Eight hours? A day? In any case, before you look at redundant hardware, you'll need a service level agreement from your ISP.
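To make those outage questions concrete, each "nine" of availability in an SLA translates into a yearly downtime budget you can compare against your cost estimates (a quick worked calculation):

```python
# Allowed downtime per year for common availability targets.
minutes_per_year = 365 * 24 * 60

for availability in (0.99, 0.999, 0.9999):
    downtime_min = minutes_per_year * (1 - availability)
    print(f"{availability:.2%} uptime -> {downtime_min:,.0f} minutes/year of downtime")
```

If an hour of downtime costs your non-profit essentially nothing, paying for redundant hardware to move from 99% to 99.99% is money spent in the wrong place.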
And of course, if you are looking at something to stream 1 gigabyte of traffic to each of these thousand uniques, that's a whole different matter. Now you may want to look at content delivery networks, and possibly multiple servers just to handle the outbound network traffic.
No matter what your requirements, though, you need to look at a good backup solution.
Re:1000+ a day isn't very much (Score:4, Interesting)
Sounds like you do HA pretty much how I do it, only with different pieces (I live in Sun + Oracle land).
One thing I try to get HA admins to stop thinking about is the "primary" and "backup" servers. I find that a sysadmin mindset which considers both (or all) servers equal helps make stuff more reliable in the long run. So if server "bob" bursts into flames, server "joe" takes over. Bob is then serviced and left alone until Joe bursts into flames (or we schedule some downtime and run re-certification tests).
As you may have guessed, the biggest key to getting admins to treat servers as equals instead of primary/backup is to name them appropriately. Bob & Joe is better in this case than haMaster and haSlave, or serverA and serverB, etc.
Same with IPMP and stuff - don't define a preference, whatever is, is.
Unless you have asymmetric hardware, which I really, really, really discourage.