Most Digital Content Not Stable
brunes69 writes "The CBC is running an article profiling the problems with archiving digital data in New Brunswick's provincial archives. Quote from the story: 'I've had audio tape come into the archives, for example, that had been submerged in water in floods and the tape was so swollen it went off the reel, and yet we were able to recover that. We were able to take that off and dry it out and play it back. If a CD had one-tenth of one per cent of the damage on one of those reels, it wouldn't play, period. The whole thing would be corrupted'. Given the difficulties with preserving digital data, is it really the medium we should be using for archival purposes?"
Re:Multiple identical copies? (Score:5, Informative)
wring recovery method (Score:5, Informative)
Just because it's harder to recover the data doesn't mean it's impossible.
Of course, anyone using CDs or DVDs for large data backup must have a lot of interns to do the disc swapping.
have people already forgotten? (Score:5, Informative)
We can take this seriously. (Score:4, Informative)
Emphasizing the “I” in RAID [8k.com].
Re:Like what? (Score:2, Informative)
DLT
reel-to-reel
Mini8mm
SAN
CD/DVD
etc...
Depends on how deep your pockets go and your calculation for the value of the data if lost. You are doing the math on loss of data, riggghhhhttt?
Re:wring recovery method (Score:5, Informative)
To recover data from a CD, you can simply photograph it at high enough resolution. Even with huge scratches, even with parts of the disc physically missing, you can recover the data exactly as it was encoded. How? Reed-Solomon code [wikipedia.org].
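For anyone wondering how that can possibly work, here is a minimal sketch of the idea behind Reed-Solomon erasure recovery. It is not what a CD player actually runs: real discs use a cross-interleaved code over GF(2^8), while this toy works over the prime field GF(257) with plain Lagrange interpolation, so parity symbols can be 256 and do not fit in a byte. The function names (interpolate, encode, recover) and the field size are assumptions made for illustration only.

```python
P = 257  # prime modulus; an assumption chosen for simplicity, CDs use GF(2^8)

def interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n_parity):
    """Append n_parity extra evaluations of the polynomial through the data."""
    k = len(data)
    points = list(enumerate(data))
    parity = [interpolate(points, x) for x in range(k, k + n_parity)]
    return list(data) + parity

def recover(codeword, k):
    """Rebuild the k data symbols; erased symbols are marked as None."""
    known = [(x, y) for x, y in enumerate(codeword) if y is not None]
    if len(known) < k:
        raise ValueError("too many erasures to recover")
    basis = known[:k]  # any k surviving symbols pin down the polynomial
    return [interpolate(basis, x) for x in range(k)]

if __name__ == "__main__":
    data = list(b"archive!")
    codeword = encode(data, 4)          # 8 data symbols + 4 parity symbols
    damaged = list(codeword)
    for pos in (0, 3, 5, 9):            # "scratch" four symbols off the disc
        damaged[pos] = None
    print(bytes(recover(damaged, len(data))))
```

With 4 parity symbols, any 4 missing symbols can be rebuilt as long as you know which ones are missing. Lose a fifth and recover() gives up, which is the catch: the code only survives damage up to the redundancy you paid for.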
Quoth Wikipedia:
Apples and Oranges (Score:3, Informative)
Sad to say, tape dies too.
What is more interesting is the use of compression (and rights management, though if your originals are encrypted you deserve to get screwed - physical security comes first). With analog and simple stream encoding of time domain data (such as audio recordings) much data can be recovered using an external benchmark for the time code. Compress that data and lose your parity and you're totally hosed.
I've never been a proponent of compressed or encoded backups. Sure they save space and add a layer of "security", but that comes at the cost of flexibility should damage occur.
Of course, as has certainly already been mentioned - with digital data, you have the luxury of making multiple perfect copies as well as the ability to perform automated checks of that data, mostly without any user interaction.
Otherwise, stone tablets have the best track record so far, though the storage density is a bit on the light (or should I say heavy?) side.
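The "multiple perfect copies plus automated checks" part above could look something like the sketch below. It assumes you recorded a SHA-256 digest when the archive was first written and that each copy fits in memory as bytes; the names digest, audit and repair are made up for this example and are not from any particular backup tool.

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA-256 fingerprint recorded when the archive copy was first made."""
    return hashlib.sha256(data).hexdigest()

def audit(copies, expected):
    """Return (indexes of intact copies, indexes of damaged copies)."""
    intact, damaged = [], []
    for i, blob in enumerate(copies):
        (intact if digest(blob) == expected else damaged).append(i)
    return intact, damaged

def repair(copies, expected):
    """Overwrite every damaged copy with the first copy that still verifies."""
    intact, damaged = audit(copies, expected)
    if not intact:
        raise ValueError("no intact copy left to repair from")
    good = copies[intact[0]]
    return [good if i in damaged else blob for i, blob in enumerate(copies)]

if __name__ == "__main__":
    original = b"provincial archive record"
    expected = digest(original)
    # Hypothetical scenario: the second of three copies has rotted.
    copies = [original, b"provincial archive recorx", original]
    print(audit(copies, expected))
    print(repair(copies, expected) == [original] * 3)
```

Run something like this on a schedule and a bad copy gets noticed and replaced while a good one still exists, which is the one thing you can't do with a single swollen reel.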
Re:That's nothing, think of DRM (Score:3, Informative)
I think that was the point behind "depending on how you define American" -- the GP was referring to the urbanized cultures of Mexico, Central and South America that had writing systems that they were forced to give up along with the rest of their culture.
Re:They could try harder (Score:3, Informative)
Sorry to spoil the fun (VXA tape format) (Score:3, Informative)
I know I'm offtopic, injecting facts into this debate, but I thought it might be interesting to bring up the VXA tape format. It allegedly survives all kinds of abuse like freezing, see Freezing Test [exabyte.com].
I have never tried these drives, and would love to hear from someone independent who has.
Re:1% = Total Loss? (Score:4, Informative)
Re:It's the messanger, not the message (Score:3, Informative)
It depends on the content as well. Content that is inherently analog tends to be more 'robust' in analog form. For instance, in the military they say [wired.com], "A computer with a bullet in it is just a paperweight. A map with a bullet in it is still a map."
Re:That's nothing, think of DRM (Score:3, Informative)
Best summed up by Chief Seattle, in 1854: "This we know: the earth does not belong to man, man belongs to the earth. All things are connected like the blood that unites us all. Man did not weave the web of life, he is merely a strand in it. Whatever he does to the web, he does to himself."
There was life before CDs. (Score:3, Informative)
Erm
There are terabytes (quite literally tons) of data sitting around on everything from old 7- and 9-track 1/2" open reel tape, to old 8" and 5-1/4" floppies, and other formats that are basically dead. [I'm not familiar with anything older than that, but I'm sure there are some real greybeards around that could enlighten you as to what came after punchcards but before the vac-column tape drives.] The only saving grace of those formats is that if you can find a reader, there's a chance it might either still work, or could be made to work, if you could find a compatible computer to interface it to (because the machines themselves were built pretty well; they were still viewed as industrial equipment of a sort, rather than consumer electronics). But the expense of doing that would be enormous -- the people who know how to maintain, and increasingly to operate, those things are retiring and becoming hard to find.
And analog formats aren't exactly immune, either. Where I used to work, we had several boxes of old video recordings on 2" quad [wikipedia.org] that we were storing for preservation purposes, but couldn't afford to have transferred to another medium (despite the obvious: that the longer you wait, the more expensive it's going to get if you ever do really want it). That format was used for over 20 years; there's got to be thousands of hours of it sitting around.
Even if you define 'mainstream' to be something that an average person could afford, CDs certainly weren't first; lots of people had PCs with various types of digital storage.
But to only focus on 'mainstream' formats misses the point entirely. Stuff that's been distributed out to millions of people isn't what's at risk of disappearing; it's the original source material (think NASA's Apollo videos), or information that's naturally stored in big 'silos' (think public records) that's really at risk, and those have been stored in a plethora of formats, digital and analog, over the past 50-75 years, which are difficult to work with today.