BitTorrent For Enterprise File Distribution?
HotTuna writes "I'm responsible for a closed, private network of retail stores connected to our corporate office (and to each other) with IPsec over DSL, and no access to the public internet. We have about 4GB of disaster recovery files that need to be replicated at each site, and updated monthly. The challenge is that all the enterprise file replication tools out there seem to be client/server and not peer-to-peer. This crushes our bandwidth at the corporate office and leaves hundreds of 7Mb DSL connections (at the stores) virtually idle. I am dreaming of a tool which can 'seed' different parts of a file to different peers, and then have those peers exchange those parts, rapidly replicating the file across the entire network. Sounds like BitTorrent you say? Sure, except I would need to 'push' the files out, and not rely on users to click a torrent file at each site. I could imagine a homebrew tracker, with uTorrent and an RSS feed at each site, but that sounds a little too patchwork to fly by the CIO. What do you think? Is BitTorrent an appropriate protocol for file distribution in the business sector? If not, why not? If so, how would you implement it?"
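The "homebrew tracker plus RSS" idea the submitter floats is less patchwork than it sounds: many torrent clients (uTorrent included) can auto-fetch .torrent files from an RSS feed, which gives you the "push" without anyone clicking anything. Here is a minimal sketch of the feed-generation side in Python; the feed URL and torrent names are made-up placeholders, not anything from the thread:

```python
import xml.etree.ElementTree as ET

def build_torrent_feed(base_url, torrent_names):
    """Build a minimal RSS 2.0 feed of .torrent enclosures for clients
    that auto-download from a watched feed. base_url and the torrent
    file names are illustrative, not real infrastructure."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "DR file releases"
    ET.SubElement(channel, "link").text = base_url
    for name in torrent_names:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = name
        # An RSS enclosure pointing at the .torrent file itself.
        ET.SubElement(item, "enclosure", {
            "url": f"{base_url}/{name}",
            "type": "application/x-bittorrent",
        })
    return ET.tostring(rss, encoding="unicode")

feed = build_torrent_feed("http://tracker.corp.example/feeds",
                          ["dr-2008-06.torrent"])
```

Each site's client polls the feed on a schedule and grabs anything new, so the corporate office only has to publish one small XML file plus the torrents.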
Sneakernet (Score:5, Insightful)
Works great (Score:5, Insightful)
BitTorrent is an excellent intranet content-distribution tool; we used it for years to push software and content releases to 600+ Solaris servers inside Microsoft (WebTV).
-j
Sure, why not? (Score:5, Insightful)
Sure! BitTorrent, remember, is only a protocol; it's just become demonized because of the types of files being shared with it. But if you're sharing perfectly legitimate data, what's wrong with using a protocol that's already been extensively tested and developed?
Just because it's been used to pirate everything under the sun doesn't make it inappropriate in other arenas.
Re:Sneakernet (Score:5, Insightful)
The bandwidth of a DVD in the postal service isn't great but it's reasonable and quite cost effective.
From the summary: "I would need to 'push' the files out, and not rely on users to click a torrent file at each site." I imagine that the following is also true: "I would need to 'push' the files out, and not rely on users to insert a disc and run setup.exe at each site."
Re:Snail-mail USB sticks (Score:5, Insightful)
Why would they want to pay for those USB sticks (and any shipping fees that might be involved) when they have a perfectly good network already in place to send the data in a secure manner? There are too many variables involved in using USB sticks as a means of transferring backup data. Sticks could get damaged, lost, stolen, etc., not to mention that the server at each store would need to allow USB access, which could potentially open them up to other security risks. Just imagine if someone at a store decided to plug in their own USB stick and swipe a few files. Nice idea, but there are too many risks involved with a physical transfer of data.
Re:CIO's want pre-built software (Score:2, Insightful)
Better yet, tack on:
6. Give the script that handles this a name, build deployment tools, and release them under GPL.
Chained client/server (Score:4, Insightful)
Have you thought about building up a distribution tree for your sites?
Group all of your stores by geographic location: state, region, country, etc. Pick one or two stores in each group; they are the only ones that interact with the parent group.
E.g. Corporate will distribute the files to two locations in each country. Then two stores from each region will see that the country store has the files and download them. Repeat down the chain until all stores have the files.
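That grouping step can be sketched in a few lines of Python. The region names, store IDs, and fan-out of two below are illustrative assumptions, not anything prescribed in the comment:

```python
def build_tree(stores_by_region, fanout=2):
    """Build a one-level distribution plan: corporate seeds `fanout`
    stores per region, and each seeded store then serves the rest of
    its region. Region names and store IDs are made-up examples."""
    plan = {}
    for region, stores in stores_by_region.items():
        seeds = stores[:fanout]   # pull directly from corporate
        rest = stores[fanout:]    # pull from the regional seeds
        plan[region] = {"from_corporate": seeds, "from_peers": rest}
    return plan

plan = build_tree({
    "east": ["store-101", "store-102", "store-103", "store-104"],
    "west": ["store-201", "store-202", "store-203"],
})
```

With this shape, corporate's upload is bounded by `fanout * regions` copies instead of one copy per store; deeper trees repeat the same idea down the chain.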
Re:Sure, why not? (Score:3, Insightful)
Pirates still prefer FTP; it seems all of the big warez groups are still pushing files around using FTP...
Re:Snail-mail USB sticks (Score:3, Insightful)
Because, depending upon the actual files, that might be overkill. For recovery files there are probably a lot of similar or identical files in each batch. Something like Jigdo, rsync, or distributing diffs might be a lot more efficient.
With those, the main concern is having an appropriate client to automatically handle the updating on that end.
Most of those options would also be capable of checking the integrity of previous updates and could be run more frequently just to verify that the data is uncorrupted. I think BitTorrent has similar capabilities.
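The "only send what changed, and re-verify data at rest" idea boils down to a per-chunk checksum manifest, much like a torrent's piece hashes. A rough Python sketch; the 1 MiB chunk size and SHA-256 are arbitrary choices for illustration:

```python
import hashlib

CHUNK = 1 << 20  # 1 MiB pieces, roughly like a torrent's piece size

def manifest(data, chunk=CHUNK):
    """SHA-256 digest of each fixed-size chunk of a file's bytes."""
    return [hashlib.sha256(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)]

def changed_chunks(old, new, chunk=CHUNK):
    """Indices of chunks that differ between two versions. Only these
    need to be re-sent; comparing a stored file against its manifest
    also re-verifies that earlier updates are uncorrupted."""
    a, b = manifest(old, chunk), manifest(new, chunk)
    return [i for i in range(max(len(a), len(b)))
            if i >= len(a) or i >= len(b) or a[i] != b[i]]
```

If the monthly DR set only changes a little, the delta to ship is just the chunks whose hashes moved, not the whole 4 GB.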
Re:In a word, Yes (Score:5, Insightful)
For Blizzard, updates to World of Warcraft are very much a "business critical function".
Re:Sneakernet (Score:4, Insightful)
Also, burning (and packaging and mailing...) a bunch of DVDs isn't necessarily cheap/quick/easy, so it breaks down pretty quickly as the number of stores increases.
Re:Cisco already makes a product to do this - WAAS (Score:3, Insightful)
Presumably, "on steroids" means "with a fancy GUI".
rsync does this too, and rsync can push or pull.
Besides, there are plenty of rsync GUIs, too.
However, BitTorrent is almost certainly the best solution for this purpose -- the real question is coherency. You always know that eventually you'll have a complete and perfect copy at each location -- but how do you know WHEN that copy is complete so you can work on it? If this is strictly a backup system, then it's not an issue, but it's probably not a good idea to be using files while they're still being written.
Some scripting -- around rsync or BTdownload -- would fix this: copy the files to a working location once the update is complete, then work from there while the next update runs against the temp dir.
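The "copy to a working location when the update is complete" step is the classic write-to-temp-then-rename pattern, which keeps readers from ever seeing a half-written file. A sketch in Python; the checksum check and file names are illustrative assumptions, not a specific tool's behavior:

```python
import hashlib
import os
import tempfile

def publish_atomically(payload, expected_sha256, final_path):
    """Verify the downloaded payload, write it to a temp file in the
    same directory, then os.replace() it into place so consumers only
    ever see a complete copy. Paths here are made-up examples."""
    if hashlib.sha256(payload).hexdigest() != expected_sha256:
        raise ValueError("checksum mismatch; refusing to publish")
    directory = os.path.dirname(final_path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        # Atomic on POSIX when tmp and final_path share a filesystem.
        os.replace(tmp, final_path)
    except BaseException:
        os.unlink(tmp)
        raise
```

The temp file must live on the same filesystem as the destination, since the rename is only atomic within one filesystem.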
The question remains.. (Score:3, Insightful)
How are they connected to each other? If the same bottleneck router is used to reach each other, then it is a moot point. People often forget about the underlying network topology and abstract away that important detail. The stores can reach each other's IPs, but that doesn't mean their traffic avoids the same weak link in the chain.
Re:Sneakernet (Score:3, Insightful)
Surely "pushing the files" to a remote site is the same as "posting the files" via a different transport mechanism. When people say they need to remotely push the files, it's not that the users can't or won't be able to handle them if they're not already set up; it's that they'll forget, or just be too lazy to click the button to retrieve them. A DVD in the post is difficult to miss.
However, a DVD in the post may not arrive, or may arrive corrupt.
Re:If the CIO expects "official" support... (Score:4, Insightful)
Actually, there is a Tivoli product that does more or less exactly what the OP asks for: IBM Tivoli Provisioning Manager for Dynamic Content Delivery [ibm.com]
Re:Sneakernet (Score:2, Insightful)
I dunno, but step three is profit.
Re:WTF? (Score:3, Insightful)
I don't like the DVD option. If it were a matter of sending out to "the other site," that'd be one thing. But if you need to burn hundreds of DVDs for all the locations, it suddenly becomes practically a full-time job that could be replaced with a shell script and the WAN. I mean, 300 stores, assuming 15 minutes per DVD (including everything -- verify the data, put it in the envelope, print the envelope label, take it to the mail room, etc.) makes for almost 80 hours (about two work weeks!) of work. If your data needs grow to where you need two DVDs, or you add more remote locations, then it literally becomes a matter of a full month of work to get each month's backups out.
My inclination would be to not bother with RSS and just sftp the torrent to each remote location as a push. But that's a minor matter of which technology you happen to be more familiar with. (If he can implement the RSS plan faster than it takes him to look up sftp command-line switches, then more power to him -- I'm certainly the other way around.) But somebody posted some information about dsync, which seems even better than that: BitTorrent-style peer sharing and rsync-style efficient replication, all as one tool. It minimizes the needed upload from the central site from (4 GB * number of stores) every month to just (1 * changed data). I truly can't imagine DVDs being better.
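The upload arithmetic in that last point is worth making concrete. A back-of-the-envelope sketch in Python; the 25% changed-data fraction is an invented example figure, not a number from the thread:

```python
def central_upload_gb(stores, file_gb, changed_fraction, peer_assisted):
    """Rough monthly upload required from the corporate office.
    Naive client/server pushes the whole file to every store; a
    peer-assisted delta scheme seeds roughly one copy of only the
    changed data. Back-of-the-envelope, not a measurement."""
    if peer_assisted:
        return file_gb * changed_fraction  # ~one copy of the delta
    return file_gb * stores                # full copy per store

naive = central_upload_gb(300, 4, 0.25, peer_assisted=False)  # 1200 GB
peer = central_upload_gb(300, 4, 0.25, peer_assisted=True)    # 1 GB
```

Even if every byte changed every month, the peer-assisted case still only costs the central office about one file's worth of upload.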
netapp (Score:3, Insightful)
What you really want is a solution like NetApp file servers. They distribute the files at the file-system block level, updating only the blocks that have changed. You install a filer at your central office, then have multiple mirrors of it at the various field offices. All the PCs get their boot image off the network file server (the local one). With one update, you can upgrade every PC in the entire company.