Online Website Backup Options? 173
pdcull writes "I can't be the only person on the planet with this problem: I have a couple of websites, with around 2 GB of space in use at my hosting provider, plus a few MySQL databases. I need to keep up-to-date backups, as my host provides only a minimal backup function. However, with a Net connection that only gets to 150 Kbps on a good day, there is no way I can guarantee a decent backup on my home PC using FTP. So my question is: does anybody provide an online service where I can feed them a URL, an FTP password, and some money, and they will post me DVDs with my websites on them? If such services do exist (the closest I found was a site that promised to send CDs, with a special deal that had expired in June!), has anybody had experience with them that they could share? Any recommendations of services to use or to avoid?"
Why not use an online solution? (Score:5, Informative)
Rather than "posting DVDs" I'd go for something like Amazon's S3 and just dump the backup to them. Here is a list of S3 Backup solutions [zawodny.com] that would do the job.
I've personally moved away from hard media as much as possible, because the issue on restore is normally the speed of getting the data back onto the server, and that's where online solutions really win: they have the peering arrangements to get you the bandwidth.
Why FTP? Use rsync. (Score:5, Informative)
It seems like the only problem with your home computer is FTP. Why not use rsync, which does things much more intelligently - and with checksumming, guarantees correct data?
The first time would be slow, but after that, things would go MUCH faster. Shoot, if you set up SSH keys, you can automate the entire process.
bqinternet (Score:2, Informative)
We use http://www.bqinternet.com/
Cheap, good, easy.
Re:yeah, use rsync. (Score:5, Informative)
Then what you need is rdiff-backup; it works like rsync except it keeps older copies stored as diffs.
As for FTP, why the hell does anyone still use it? It's insecure, works badly with NAT (which is all too common), and really offers nothing you don't get from other protocols.
Gmail backup (Score:4, Informative)
You may have to use extra tools to break your archive into separate chunks that fit Gmail's maximum attachment size, but I've used Gmail to back up a relatively small (~20 MB) website. The trick is to make one complete backup, then make incremental backups using rdiff-backup. I have this done daily with a cron job, sending the bzip2'ed diff to a Gmail account. Every month, it makes a complete backup again.
And a separate Gmail account holds the backup of the MySQL database.
This may be harder to do with a 2 GB website, I guess, since Gmail currently provides about 6 GB of space, which will probably last you about 2 months. Of course you could use multiple Gmail accounts, or automate deletion of older archives...
But seriously, 2 GB isn't too hard to do from your own PC if you only handle diffs. The first download would take a while, but incremental backups shouldn't take too long unless your site changes drastically all the time.
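A rough sketch of the full-plus-incremental scheme described above, using only tar and find so it's self-contained (rdiff-backup would do the same job more efficiently with binary diffs). The mail step and address are hypothetical and shown only as a comment:

```shell
#!/bin/sh
# Full backup monthly, incremental daily: only files newer than a stamp
# file go into the incremental archive.
set -e
SITE=/tmp/site; WORK=/tmp/siteback
mkdir -p "$SITE" "$WORK"
echo "home" > "$SITE/index.html"
# Monthly full backup, plus a timestamp marking what has been backed up:
tar cjf "$WORK/full.tar.bz2" -C "$SITE" .
touch "$WORK/stamp"
sleep 1                                   # ensure later files have newer mtimes
echo "new post" > "$SITE/post1.html"      # the site changes...
# Daily incremental: archive only files newer than the last stamp.
( cd "$SITE" && find . -type f -newer "$WORK/stamp" -print0 \
  | tar cjf "$WORK/incr.tar.bz2" --null -T - )
touch "$WORK/stamp"
# Then mail the archive to the backup account, e.g. (address hypothetical):
#   mailx -a "$WORK/incr.tar.bz2" -s "site incr $(date +%F)" backup@example.com
```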
rsync - it's in the tag (Score:5, Informative)
Backuppc.sourceforge.net (Score:2, Informative)
I sure hope you're not UK based... (Score:5, Informative)
Re:yeah, use rsync. (Score:5, Informative)
Then what you need is rdiff-backup; it works like rsync except it keeps older copies stored as diffs.
Another option is to use the --link-dest option to rsync. You give rsync a list of the older backups (with --link-dest), and the new backup is made using hard links to the old files where they're identical.
I haven't looked at rdiff-backup; it probably provides similar functionality.
Part of my backup script (written for zsh):
setopt nullglob
older=($backups/*(/om))   # directories only, ordered newest-first (zsh glob qualifiers)
unsetopt nullglob
# ${^older[1,20]} makes zsh repeat the --link-dest= prefix for each of the
# 20 most recent backups (rsync accepts at most 20 --link-dest directories).
rsync --verbose -8 --archive --recursive --link-dest=${^older[1,20]} \
user@server:/ $backups/$date/
Re:yeah, use rsync. (Score:5, Informative)
There are also the --backup and --backup-dir options (you'll need both). They keep a copy of the files that have been deleted or changed, and if you use a script to keep those copies in separate directories, you'll have a pretty good history of all the changes.
Re:yeah, use rsync. (Score:3, Informative)
Also, rsync has a --bwlimit option to limit the bandwidth it uses.
Re:Why FTP? Use rsync. (Score:4, Informative)
I use rsync on a few dozen systems here, some of which are over 1TB in size. Rsync works very well for me. Keep in mind that if you are rsyncing an open file such as a database, the rsync'd copy may be in an inconsistent state if changes are not fully committed as rsync passes through the file. There are a few options here for your database. First one that comes to mind is to close or commit and suspend/lock it, make a copy of it, and then unsuspend it. Then just let it back up the whole thing, and if you need to restore, overwrite the DB with the copy that was made after restoring. The time the DB is offline for the local copy will be much less than the time it takes rsync to pass through the DB, and will always leave you with a coherent DB backup.
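A sketch of that lock-copy-unlock pattern. A plain file stands in for the database so the example is self-contained; for MySQL you would instead dump with something like `mysqldump --single-transaction dbname > dump.sql` and let rsync pick up the dump rather than the live table files:

```shell
#!/bin/sh
# Take a quick local snapshot under a lock, then let rsync back up the
# snapshot at its leisure. The DB is only "offline" for the fast local copy.
set -e
DB=/tmp/fake-db.dat        # stand-in for the database file
SNAP=/tmp/db-snapshot
mkdir -p "$SNAP"
echo "table data" > "$DB"
# Hold a lock on the DB file while copying, so writers wait instead of
# racing the copy and leaving it inconsistent.
flock "$DB" cp "$DB" "$SNAP/db.dat"
# rsync -a "$SNAP/" user@backuphost:/backups/db/   # hypothetical remote
```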
If your connection is slow and you are backing up large files (both of which sound true for you?), be sure to use the --partial option so interrupted transfers can resume.
One of my connections is particularly slow and unreliable (it's a user desktop over a slow link). For that one I have made special arrangements to cron once an hour instead of once a day. It attempts the backup, which is often interrupted by the user sleeping or shutting down the machine, and it keeps trying every hour the machine is on until a backup completes successfully. Then it resets the 24-hour counter and won't attempt again for another day. That way I get backups as close to every 24 hours as possible, without backing up more often than that.
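The "try hourly, succeed daily" logic above can be sketched with a timestamp file; cron runs this every hour, but the backup only fires when the last success is more than 24 hours old. Paths are hypothetical, and the actual rsync line is shown only as a comment:

```shell
#!/bin/sh
STAMP=/tmp/last-backup-ok

should_run() {
  # Run if there is no stamp yet, or the stamp is older than 1440 minutes.
  [ ! -f "$STAMP" ] || [ -n "$(find "$STAMP" -mmin +1440 2>/dev/null)" ]
}

if should_run; then
  # rsync -a --partial user@slowhost:/home/user/ /backups/slowhost/ || exit 1
  touch "$STAMP"   # only reached when the backup command succeeded
fi
```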
Another poster mentioned incrementals, which isn't something I need here. Besides rsync-based incremental schemes, you could use a common off-the-shelf tool like Retrospect that does incrementals but wouldn't normally work against your server: instead of running it over the Internet, run it against the local backup you are rsyncing to. That way you can still go back in time a bit, without having to shoehorn incremental rsync through your network limits.
Re:Why not use an online solution? (Score:3, Informative)
S3 is a pretty good option. I've been using the JungleDisk client along with rsync to manage offsite home backups. S3 is pretty cheap and the clients are fairly flexible.
I haven't played with any remote clients, but your hosting provider can probably hook you up with one of the many clients mentioned in the parent. The price of S3 is hard to beat; I spend about $6 per month on ~20 GB worth of backups.
rsync.net (Score:1, Informative)
rsync.net --- online backup provider with sshfs/sftp/webdavs/scponly support
Why not use Suso? (Score:4, Informative)
Sorry for the self-plug, but this just seems silly. Your web host should be backing up your website and offer you restorations. I guess this isn't a standard feature any more, but it is at Suso [suso.com]. We back up your site and databases every day, and can restore them for you for free.
Re:Why not use an online solution? (Score:4, Informative)
JungleDisk's built-in backup can also keep older versions of files, which is great in case a file gets corrupted and you only discover that after a few days. It's dirt cheap too, $20 for a lifetime license on an unlimited number of machines.
For this to work, you need to be able to run the jungledisk daemon though, which is not an option with some shared hosting plans. Also, to mount the S3 bucket as a disk, you obviously need root access. But if you do, JungleDisk is hard to beat IMHO.
Re:Why not use an online solution? (Score:3, Informative)
. . . to mount the S3 bucket as a disk, you obviously need root access. But if you do, JungleDisk is hard to beat IMHO.
Not really. If the server kernel has FUSE [wikimedia.org] enabled and the user-space tools are installed, any user who is a member of the relevant group can mount a "jungledisked" S3 bucket in his own userspace, without the need for root access.
Re:Actually, his only problem is.... (Score:3, Informative)
Nevertheless, as others have mentioned, if your data is fairly static, then the initial backup might be painful, but then backing up only changes shouldn't be too difficult.
I've never really understood some of the problems that come up, mainly because I'm not a website developer (only as a personal thing). If you develop your site locally and then upload it, all your pages, code, and images should already be on your own computer.
If you get a lot of dynamic content (people uploading media or writing things in a blog, say), then I can see the problem.
In your case, unless you can convince your provider to run a utility to mount Amazon's S3 and then give you shell level access instead of just FTP, then I don't see how that will work for you (many web sites use S3, but I don't think it's possible in your position).
So that leaves option 1: get a better provider. But even that doesn't help, does it, because now you have to move all your stuff to a new provider.
You're really stuck. Maybe you can pay for temporary shell level access to a server somewhere that does have a lot of speed. I don't know. Looks like you will probably have to suck it up and do some really long, slow backups, regardless of your solution.
Re:Shared hosting (Score:4, Informative)
> OK, I keep hearing "use rsync" or other software. What
> about those of us who use shared web hosting, and
> don't get a unix shell, but only a control panel?
As long as you've got scriptability on the client, you should be able to cobble something together. For example, in OS X you can mount an FTP volume in the Finder (Go -> Connect to Server -> ftp://name:password@ftp.example.com) and then just copy the files off the mounted volume.
(Interestingly, it shows up as user@ftp.example.com in the Finder but the user name isn't shown in /Volumes/.)
AFAIK, pretty much any modern OS (even Windows since 98) can mount FTP servers as volumes. OS X mounts them read-only, which I always thought was lame, but that's another rant.
> Or who have a shell, but uncaring or incompetent
> admins who won't or can't install rsync?
If you've got shell (ssh) access, you can use rsync. (Not over telnet, natch. If that's all you've got, look at the workaround above.) Rsync works over ssh with nothing more than a single command on the client.
Use SSH keys to make life perfect.
Or, google for 'site mirroring tool'. Many have an option to only download newly-changed files.
To get your databases, make a page (a small server-side script that dumps the database to a file) and download that every so often.
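The download side might look like the sketch below. The URL and credentials would be specific to your site; here a `file://` URL stands in for the real dump page so the example is self-contained:

```shell
#!/bin/sh
# Fetch the dump page on a schedule (e.g. from cron) and keep a dated,
# compressed copy. Real version, roughly (URL/credentials hypothetical):
#   curl -s -u user:pass https://example.com/dbdump.php | bzip2 > dump-$(date +%F).sql.bz2
set -e
echo "-- SQL dump --" > /tmp/dump-page.sql     # pretend server output
curl -s "file:///tmp/dump-page.sql" | bzip2 > "/tmp/dump-$(date +%F).sql.bz2"
```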
For the original poster, who was complaining about downloading many gigs over a slow link: just rsync over and over until it's done. If it drops a connection, the next attempt will start at the last good file.
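That retry-until-done loop is a one-liner; rsync exits nonzero on a dropped connection, and --partial keeps partly transferred files so each attempt resumes where the last left off. The runnable part below simulates the flaky command so the loop itself can be demonstrated:

```shell
#!/bin/sh
# The real command would be something like (host/paths hypothetical):
#   until rsync -a --partial user@host:/var/www/ /backups/site/; do sleep 60; done
# Simulated here with a command that fails twice before succeeding:
rm -f /tmp/retry-count
flaky() {
  n=$(cat /tmp/retry-count 2>/dev/null || echo 0)
  n=$((n + 1)); echo "$n" > /tmp/retry-count
  [ "$n" -ge 3 ]        # "succeeds" on the third attempt
}
until flaky; do :; done   # keeps retrying until the command exits 0
```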
And if you've got a control panel, look for a button labeled 'Backup'! My host uses cPanel and there's a magic button.
Final option: how did the data get onto the www server in the first place? Isn't there already a "backup" on your local machine in the form of the original copies of all the files you've uploaded? If you haven't been backing up in the first place, well, yeah, making up for that might be a little painful. (Note: if your site hosts lots of user-uploaded content, ignore any perceived snarkiness. :-) )
Re:Why not use an online solution? (Score:3, Informative)
It's important, when using this method, that your second server be in a separate datacenter.
Duplicating your data is only half of a backup plan. The other half is making sure that at least one of those duplicates is in a physically separate location.
There are many things that can conceivably take down entire datacenters -- theft, bankruptcy, utility outages, floods, fire, earthquakes...
While these things are somewhat unlikely, they *do* happen, and you don't want to lose years of work if they do.