
A Primer on Data Backup for Small- to Medium-Sized Companies (Video)

This is a conversation with Jeff Whitehead and Lou Montulli, respectively Vice President of Technical Operations/CTO and Chief Scientist for Zetta.net, a company that specializes in online backup and disaster recovery services. While this interview was arranged without his help, in the interest of full disclosure we'd like to tell you that Zetta's CEO is Ali Jenab, who used to be CEO of Slashdot's parent company. This discussion isn't about Ali or Zetta.net, though. It's about data backup, and which methods are best and most cost-effective for companies ranging from home-based businesses up to enterprise operations with thousands of employees. Among other things, we discussed the importance of multiple-site storage for important data, a factor that was drilled into us yesterday by an article titled "Another Iron Mountain Fire Points Up Shortcomings of Physical Storage" by long-time tech journalist Sharon Fisher. And never forget: You don't know how effective your backup and data storage arrangements are until you try to retrieve your data, and if you don't try to retrieve data until you need it, and things don't work, you are in big trouble.

Robin Miller: I am Robin Miller for Slashdot. Looking at Lou and Jeff, the titling tells you more about them. They work for a company called Zetta. We’re talking about what you do when you back up, how you should back up, and the difference between archiving and smart backups for things you need right away. So let’s start with Lou. Lou, just give us some idea of the different sizes of business you might think about as far as backup goes?

Lou Montulli: So we generally look at it in three different business size segments. We’ve got the very small SOHO businesses, which might be a dry cleaner, anywhere up to several tens of people, who don’t have that much data. Then you’ve got some medium-size folks who might have between 2 and 50 terabytes of data and range anywhere from as small as 10 people up to a few hundred people. And then you have the large enterprise folks, who generally have hundreds of terabytes and range from hundreds of people up to tens of thousands of people.

And the needs of each of these companies are usually defined by how much data they have, because when you have more data, the problems you have dealing with the data size and how to get it off-site are different. For very small companies, there are lots of tools available, anywhere from just buying a USB drive and taking it home with you up to very small tape systems. The medium-size business is what we tend to address, which is the 2 to 50 terabyte range, and we feel that’s a perfect size to employ Internet-based backup.

It’s small enough that it can travel over the wire efficiently, and it’s big enough that it’s really important data. Not that any data isn’t important, but it’s big enough that the problems of backing it up are actually reasonably substantial. So you want a real company that understands enterprise IT helping you do it. The other segment, which we don’t address, is the very large enterprise, which tends to deal with multiple datacenters and have massive robotic tape libraries and/or massive on-disk backup and other highly complex systems.

Robin Miller: Okay. Jeff, say you’ve just realized with your small but growing business that you have to do some data backup, or else. I live in Florida, and we haven’t had a hurricane hit us for a while, but one could at any time. So what should I do with my small but growing business as far as data backup?

Jeff Whitehead: Basically what you are describing is a geographic risk that’s specific to Florida, so what you would like to do is make sure that your data is off-site, so that if a disaster occurs in one location, it’s very unlikely to also happen in another location. Fires can happen anywhere, but it’s very unlikely that two or perhaps three locations, depending on how many copies of your data you make based on its sensitivity, burn down all on the same day. That’s just not going to happen.

Robin Miller: What about the difference between data you need now and archive data? Lou, what about the difference, do we store them differently?

Lou Montulli: Well, that’s a great question. I would say that all data is important and you never want to lose any of your data. Obviously if you put it in the archive, there is a reason why you’re keeping it. So it’s not really a case where you’d say, I want to increase my risk on archive data. Generally what you want to do is say, I am willing to take a penalty in terms of access speed in order to get a better price on my archived data.

So you can look at different types of media or different types of lower-performance spinning disk to gain a price advantage on archived data. But I definitely don’t recommend that people ever take a chance in terms of data integrity on any of their data. That should never be something you sacrifice.

Jeff Whitehead: I think there really are two different kinds of archive data. One is where you have the data and could possibly reconstruct it, say off of a tape drive, or off of disk for computers that are sort of spread out, or data that’s been crypt down and you could re-crypt that in some way. The other is archived data where you transfer it off someplace because that was the only copy of the data, and it’s got to be really protected and stay there forever.

Robin Miller: Okay, yeah, I was thinking actually in terms of [inaudible 4:54]. I haven’t had an active business for some years, but my wife sold art, and we still have credit card receipts, which we’re supposed to hold for seven years, and we have bank safe deposits. That’s all you need, I think, for a small business. What’s the next stage, the electronic stage, beyond that?

Lou Montulli: Well, a lot of people are scanning them now and either putting them into some sort of database or just putting them on a file system, then off-siting them somewhere. With the type of data you’re talking about, it’s actually really important to be able to get to that data and find it and search it, because it becomes a needle-in-the-haystack problem. But the most important thing is that you have at least one or more copies of it somewhere and are able to get to it when you need to.

Robin Miller: Okay. So how do we search for it?

Lou Montulli: Well, searching is a complicated thing. It’s a good question for Google. It very much depends on the media type, right? Obviously it’s very difficult to search photos, but it’s really easy to search within a text document. So I think it very much depends on your particular application type, and you make a decision based on the type of data.

Robin Miller: So we get up into terabytes, yottabytes, and zettabytes, so how do we search that?

Jeff Whitehead: It’s tricky. I think that in many cases, it’s sort of an old tradition of “the data I’m looking for was on Server 23 and it was on the C drive in My Documents,” and people kind of have a recollection of that. If you’re a very small business, it’s often fairly simple: you know that your credit card receipts were in a given folder, and you go and look in that place. If you’ve got a very large data set that all looks the same, then you really need some sort of specialized application that will give you an index and some way of searching those things. A great example for photos is the Picasas of the world or the Windows thumbnails, so you have a way of looking through those things.
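As a rough illustration of the "index and some way of searching" idea, here is a minimal sketch that indexes file names under a folder and looks them up by substring. It is not any particular product's approach; the function names `build_index` and `search` are hypothetical, and a real archive search would likely index file contents and metadata as well.

```python
import os


def build_index(root):
    """Walk `root` and map each lowercased file name to the paths where it occurs."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.setdefault(name.lower(), []).append(os.path.join(dirpath, name))
    return index


def search(index, term):
    """Return every indexed path whose file name contains `term` (case-insensitive)."""
    term = term.lower()
    hits = []
    for name, paths in index.items():
        if term in name:
            hits.extend(paths)
    return sorted(hits)
```

Building the index once and querying it many times is the point: a search then scans the in-memory index instead of walking the whole storage tree again.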

Robin Miller: What else should I know that I don’t know?

Jeff Whitehead: Not all backups are of the same quality. I like to tell people a backup isn’t a backup until you have restored it. I personally have run into tapes that I thought were good and then turned out to be corrupted, or tried to restore a system that relied on a particular type of RAID card being in the system [inaudible 7:29] after you restored it. So really you need to think about what could happen to your data. If it’s a Word document, there’s not a whole lot of risk there. You can get versions of Word that go way back and open that up. If it’s an application, there are all the pieces of the application stack you need to protect so you can restore and reinstall.
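One way to act on "a backup isn’t a backup until you have restored it" is to restore into a scratch directory and compare checksums against the live data. The following is a minimal sketch under that assumption; `verify_restore` and its helpers are hypothetical names, and it only checks file contents, not permissions, application state, or hardware dependencies such as the RAID card mentioned above.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path relative to `root` to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def verify_restore(source_root, restored_root):
    """Compare a restored tree with its source.

    Returns (missing, corrupted): relative paths absent from the restore,
    and relative paths present in both trees but with different contents.
    """
    src = tree_digests(source_root)
    dst = tree_digests(restored_root)
    missing = sorted(set(src) - set(dst))
    corrupted = sorted(p for p in src if p in dst and src[p] != dst[p])
    return missing, corrupted
```

Run on a schedule against a test restore, an empty pair of lists means the restored files match the source; anything else is worth chasing before the data is actually needed.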

Lou Montulli: I had a few items as well. I think especially in the world of Internet-based backups, there are specific problems related to going over the Internet that are not there if you’re backing up over the LAN within your enterprise. And they get harder and harder as the data sizes get bigger. One of them is just the reliability and security of the Internet, so make sure that you choose a vendor that is well versed in security and always uses encryption, so that the untrusted WAN connections are always encrypted.

The other one is the ability to get large amounts of data over the Internet. It’s still a very difficult process, especially when dealing with terabytes and petabytes. It’s a particular question that we have focused a lot of our time on: how to make high-bandwidth connections really efficient for very large file transfers. And then one other item is not transferring all your data all the time. In LAN-based backups, it’s common to take full copies of your server every day or at least every week and maybe do incrementals in between.

If you’re doing full copies of your entire dataset over the Internet, you’d quickly find that you need massive amounts of bandwidth. So having an “incremental forever” technology is really, really important. And then the reverse of that: if you are “incremental forever,” how long will your restores take? So we always recommend “incremental forever” with reverse incremental technology, so you also have full backups available for quick restores.
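To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch. The 10 TB dataset (inside the 2 to 50 terabyte range mentioned earlier), the 100 Mbit/s link, and the 1% daily change rate are illustrative assumptions, not figures from the interview, and `transfer_hours` is a hypothetical helper that ignores protocol overhead, compression, and deduplication.

```python
TB = 10**12  # decimal terabyte, in bytes


def transfer_hours(data_bytes, link_bits_per_second, efficiency=1.0):
    """Hours needed to move `data_bytes` over a link, assuming steady throughput."""
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / 3600


dataset = 10 * TB          # assumed dataset size
link = 100e6               # assumed 100 Mbit/s uplink
daily_change = 0.01        # assumed 1% of the data changes per day

full_copy = transfer_hours(dataset, link)
incremental = transfer_hours(dataset * daily_change, link)

print(f"full copy:   {full_copy:.1f} hours")    # about 222.2 hours, over 9 days
print(f"incremental: {incremental:.1f} hours")  # about 2.2 hours
```

Under these assumptions a full copy cannot finish inside a day, let alone a nightly window, while the daily incremental fits comfortably, which is the case being made for "incremental forever" over the Internet.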

Part of any good data integrity strategy and disaster coverage strategy is, one, getting your data off-site, out of your enterprise, and two, making sure it’s far enough away from where you are located that any particular regional disaster is not going to affect all of your copies of the data.

Additionally, if you really want full data protection such that you can sleep at night and never have to worry about data loss, we recommend that customers not just make one off-site copy, but make multiple off-site copies, either across the country or in very different regional zones. And of course each of those copies ought to have fantastic data protection, such that you’re not worried about standard failures like single disk failures or network failures causing the entire copy to get corrupted, because if you go down to a single copy, then you are again not sleeping great at night.

So the entire chain ought to be: you are getting your data backed up at least every day, if not more often than that; it’s off-site; it’s in multiple locations; and it’s with a data provider that’s providing absolutely top-tier data protection and data integrity, along with all the other security and ease-of-use concerns that go along with it. We could talk about an entirely other subject, which is that backups have been notoriously difficult to keep running on a regular basis.

Robin Miller: Let’s

Lou Montulli: Jeff, you want to take that one, or you want me to?

Jeff Whitehead: Sure. Backup reliability is another case where different solutions have wildly different performance characteristics. For a lot of IT administrators, backups are a job they really just don’t like. They get up in the morning, they look at their backup status, and there are 70 exceptions they’ve got to run down, and meanwhile the pager starts going off and the printer won’t print and the CEO needs his new laptop. So backups tend to be sort of at the bottom of the pile, because they’re important, but they’re not pressing in the same way as a customer darkening your doorway and needing some help or a solution right now.

So really it is important, and it’s tricky to get this data ahead of time, because everyone says, “Yes, our backups are great, never have issues.” But it’s really important to talk to the user community of an existing product and find out what daily life driving this backup solution is like. And it’s got to be like-for-like in terms of hardware and software and the whole solution, because what may work for someone that’s got a very high-end data center and high-performance storage or network may not work for someone with a Windows Small Business Server, which has an entirely different set of characteristics.

Robin Miller: Okay, let me give you a word, Lou, with a question mark after it: cloud?

Lou Montulli: Cloud, I love the cloud. The cloud has the potential to make everyone’s job easier and make things cheaper and better. Now, that’s potential; not every cloud provider actually delivers. The potential there is that you can have a complete end-to-end service that actually makes your life as an IT administrator easier, because there are either single-vendor or multiple-vendor integrated solutions that solve one thing very well. They have a support staff behind them, ideally they work virtually all the time without any problem, and when you do have a problem you have one number to call, and they can solve it because they’re an end-to-end service. So it’s like having the world’s best expert hired onto your team just managing your particular system. And that’s what you should look for in a cloud vendor: the absolute best in that particular space, specialized to do what you want it to do. And always try it before you buy it.

Robin Miller: What about backing up your cloud, or is it inherently backed up?

Lou Montulli: That depends on your cloud vendor. Many vendors provide their own backup solution, whether it be [inaudible 13:32] specialized within their own type of cloud, or they layer on another kind of backup product. Now, some customers do choose not to trust a single vendor and add another layer, backing it up to another cloud or bringing it back into their enterprise. That’s kind of a popular thing to do: to say, hey, I’m going to trust this cloud vendor to do this one thing for me, but I also always want to have a copy within my enterprise in case anything happens to that vendor, or if I just want to move my data out of [inaudible 14:04] different vendors. So having it within your own enterprise can be useful.

Jeff Whitehead: I think there is a distinction. With the application-as-a-service vendors, like the Salesforces, Office 365s, and Google Apps of the world, people tend not to want to back those up. With infrastructure as a service, like Amazon EC2, Microsoft Azure, and the Rackspace offerings, I think you do need to back those up. I’d also go out on a limb and say that while the application-as-a-service providers are a new and better thing and have made IT easier and better for everyone, the infrastructure-as-a-service guys are a little bit newer, and I think they’re not quite to where straight hosting was or is today. So it’s really hosting with dynamic characteristics bolted onto it, but you still have to do all the things you do with traditional hosting, which includes [inaudible 15:00] taking your own backups, hopefully through different providers.

Robin Miller: I have learned personally, I won’t say the hard way, that even when you are using a “cloud service provider,” you should have some backups. There is one, I will not use their name, it starts with G, ‘Giggle’ or something, and last week I was conducting an interview just like this on their hangout service and it stopped. Since it’s their hangout service, that meant that some of my information for writing a story on Google Drive was not accessible to me. I’m just one little freelance writer in Florida cursing them. How many millions of people were shut off? So yes, I had a copy of the story I was working on, on a USB hard drive. So should we not, even with cloud, have a backup of our stuff on Salesforce or whatever?

Jeff Whitehead: Well, I think that’s a good example, and again it depends on the application. With a Word document or an article you’re writing, you can open it in some kind of editor. If you did have a backup of Salesforce, I’d sort of question what you would do with it. Do you have a way of standing up your own Salesforce stack? I think most people don’t.

Robin Miller: So basically, if I use an application service provider, I’m placing full faith and trust in them?

Jeff Whitehead: That’s true.

Robin Miller: That’s a good thing to talk about another time. Right now I think that’s enough food for thought. Do you have anything you’d like to throw in here, Lou?

Lou Montulli: Yes, I do. One more topic would be: to appliance or not to appliance?

Robin Miller: Okay.

Lou Montulli: We find some customers have it in their mind that they want an appliance, and a lot of customers come to the door saying, “I don’t want any appliance,” and it’s an interesting question. Often it comes down to pure functionality. If you had the choice and you could do the same thing with or without an appliance, I think most of us would choose not to buy the appliance, because if the functionality is the same, why would you want to manage yet another server in your infrastructure, and why would you pay for that hardware if you don’t have to? There are a few cases where an appliance makes life easier, but it’s great if you can have a service infrastructure provider who can do it all entirely in software. It just makes the process of upgrading easier, it makes the process of handling multiple-office deployments a lot easier, and it removes one more device from your data centers.

Robin Miller: You guys, could you provide an appliance if I wanted?

Lou Montulli: We could provide an appliance-like experience, but we don’t sell any appliances. We have a full software stack, and if you want to run something that looks like an appliance, like a backup appliance, and export your data to that appliance, then we work quite well in that environment. But we neither require nor sell any specific hardware that is labeled as an appliance.

Robin Miller: Could we not in fact get a generic server and then get a plaque that says “appliance” and put it on the front?

Lou Montulli: Exactly, which is exactly what a lot of folks do: they buy a generic server, throw their software on it, mark it up 5x, and sell it to you. We consider ourselves more of a pure software play in our service infrastructure, and we find the convenience there is tremendous. We can bring customers on the same day; we don’t have to wait for an appliance to be delivered and installed.
