Lockheed Chosen For Electronic Records Archives
TrentL writes "How will we be able to read 1990's email messages in the year 2090? Will GIF files still be accessible in 2105? The US National Archives - tasked with preserving records "for the life of the republic" - has chosen Lockheed Martin to solve exactly this problem. Lockheed was awarded the $308M Electronic Records Archives contract after a year-long design competition. Full Disclosure: I worked on Lockheed's demo team."
Why not? (Score:3, Interesting)
Re:Why not? (Score:5, Insightful)
So now, fast-forward a hundred years. Just 15 years ago, I was working with CDs that would last 100 years (50 bucks a pop). Now, people seem to assume that the current discs will last that long. They will not. The old discs were made out of thin gold sheets in plastic. They are now some plastic in plastic. These CDs/DVDs will last less than 10 years (and probably closer to 5). In addition, tape drives and hard disks are storing a million times more data than what was on tape in the 60s. That is, the storage density is WAY up. So now, when a small pock shows up, it will affect millions of times more data, making recovery very difficult.
Re:Why not? (Score:5, Informative)
Even that's pretty generous IMHO. In my experience, recent blank CDs (and DVDs) are lucky to last 18 months, and many of mine are delaminating or corroding after only 12. I've now gotten into the habit of burning two copies of everything I "archive", and re-burning them every 12 months. Thus far I've had errors, but never errors in the same place on each copy.
Contrast this to the good old "Kodak Gold" CDs I was burning onto back in 1996, almost all of which are still readable with 0% errors...
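For what it's worth, the two-copies habit described above can be made systematic. Here is a minimal Python sketch, assuming you recorded a checksum for every block when the discs were burned. The 2048-byte block size and the function names are my own assumptions for illustration, not part of any burning tool.

```python
import hashlib

BLOCK = 2048  # assumed block size; chosen to match a CD-ROM data sector


def block_digests(data, block=BLOCK):
    """Return one SHA-256 hex digest per block of the original data."""
    return [hashlib.sha256(data[i:i + block]).hexdigest()
            for i in range(0, len(data), block)]


def merge_copies(copy_a, copy_b, digests, block=BLOCK):
    """Rebuild the original from two damaged copies.

    For each block, take whichever copy still matches the recorded
    digest. This only works while the damage on the two copies does
    not fall in the same block.
    """
    out = bytearray()
    for n, want in enumerate(digests):
        for copy in (copy_a, copy_b):
            chunk = copy[n * block:(n + 1) * block]
            if hashlib.sha256(chunk).hexdigest() == want:
                out += chunk
                break
        else:
            raise ValueError(f"block {n} is damaged on both copies")
    return bytes(out)
```

The digests have to be stored somewhere safer than the discs themselves, which is its own problem, but at least the repair step stops depending on luck.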
Re: (Score:2)
Re:Why not? (Score:5, Informative)
Saving data for THOUSANDS of years (Score:3, Informative)
The Rosetta Project's customized "Rosetta Disk" adds another clever innovation: naked-eye-readable words around the ed
Re:Why not? (Score:2, Insightful)
All these people whinging about how CDs won't last - I'm pretty confid
Re:Why not? (Score:4, Interesting)
The amount of data we are talking about is HUGE. There is no way humans could manually upgrade the data. It would be a technical and policy nightmare. As for preserving emails, the email messages of the executive branch contain much historical significance.
Re:Why not? (Score:3, Funny)
Blah, if that has so much historical significance, you just need to post it to the newsgroups, and it will be preserved for as long as the internet exists.
Re:Why not? (Score:3, Insightful)
While making things last as long as possible is a good thing, you can't plan for 100 years down the road. You have NO idea what will truly happen.
There needs to be a system to move data from older mediums to newer mediums every few years as they become available. Multiple copies, with verification. Checking. Double checking.
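A minimal sketch of the "multiple copies, with verification" step in Python. The function names and the choice of SHA-256 are assumptions for illustration; a real archive would also need to keep the digests in a catalogue and re-check them on a schedule.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files never sit in memory whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate(src, dst_dir):
    """Copy src onto the new medium, then read the copy back and compare.

    Returns the digest so it can be recorded for the next migration.
    """
    src = Path(src)
    dst = Path(dst_dir) / src.name
    before = sha256_of(src)
    shutil.copyfile(src, dst)
    after = sha256_of(dst)
    if before != after:
        dst.unlink()  # do not leave a bad copy lying around
        raise IOError(f"verification failed for {src.name}")
    return after
```

Usage would be something like `migrate("old_medium/mail.mbox", "new_medium/")`, run once per file, with the returned digest written down next to the file name.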
Re:Why not? (Score:4, Insightful)
This has been discussed before. The sheer volume of data that would have to be copied every few years is HUGE. How long would it take you to transfer a stadium full of CDs onto DVDs? How much would that cost?
There's good reason people are looking for digital technologies that are as inherently stable, long-lasting, and reliable as writing on paper.
GIF? (Score:5, Funny)
We're just lucky that Walt didn't dream up LZW compression while he was working on Steamboat Willie, or we'd have patents lasting for the author's life plus 90 years!
-paul
Re:GIF? (Score:5, Insightful)
Re:GIF? (Score:3, Interesting)
I agree, Walt [rotten.com] was much more evil than corporate Disney. Credit where it's due.
Re:GIF? (Score:3, Funny)
Indeed. The esteemed authority Dr. Hibbert agrees: "Well, only one in two million people has what we call the 'evil gene'. Hitler had it, Walt Disney had it, and Freddy Quimby has it." You just can't argue with the Simpsons.
Walt named names... (Score:3, Informative)
Re:GIF? (Score:2, Interesting)
http://en.wikipedia.org/wiki/GIF#Unisys_and_LZW_p
Chick and Egg problem (Score:5, Interesting)
You see; whatever format you used for anything has to be documented and you can't use paper because it won't last as long
Worse still, you need some computer science grads to write up exactly the format, down to how long a char is and the bit/byte order. It is an extremely difficult task even if you don't take into consideration finding a storage medium that will last that long.
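To illustrate why the byte order has to be written down, here is a small Python sketch. The value and the 4-byte unsigned field are made up for the example; the point is only that the same four bytes decode to different numbers depending on an assumption that is not stored in the bytes themselves.

```python
import struct

value = 1990

# The same number written as a 4-byte unsigned integer in both byte orders.
big_endian = struct.pack(">I", value)
little_endian = struct.pack("<I", value)

# A future reader who guesses the wrong byte order gets a valid-looking
# but wrong number, with no error raised.
misread = struct.unpack("<I", big_endian)[0]
correct = struct.unpack(">I", big_endian)[0]

print(big_endian.hex(), little_endian.hex())
print(correct, misread)
```

Nothing in the file tells you which reading is right, which is why the format description has to travel with the data.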
Oh that answer is obvious. (Score:5, Insightful)
Latin, videlicet.
But seriously the problem in records is not going to be collecting the data, but turning it into knowledge. Meaning that humans in the future are likely to seriously misinterpret or be unaware of the intended meanings and social and political contexts of the preserved data.
This is not a technology problem.
They ought to make sure that real professional historians are there.
Software is easy. Re:Chick and Egg problem (Score:4, Interesting)
Getting code to run on later architectures is not usually very difficult. I am fairly comfortable with the proposition of porting any code to any future architecture -- the "emulator scene" testifies to the viability of this strategy. The biggest problem to be solved is reading storage media for which no hardware exists.
For example, how do I get to my college research stored on AmigaDOS floppies? Tragically, the easiest solution is to try to get my Amiga running again, and then move the data over a serial cable with kermit. I'm awfully glad I have kermit on that computer, because I don't think I'd be able to find any 2400 baud Amiga BBSes around to download it.
Re:Chick and Egg problem (Score:2)
Re:Chick and Egg problem (Score:2)
1) Carve one copy of the ASCII table in metal (choosing metal as less brittle than stone).
2) Store the description of the file format in English ASCII, and give it a unique identifier, also in English ASCII.
3) Store the files as whatever binary you want, alongside a pointer to #2. The pointer should be English ASCII as well.
If you're real paranoid, you store a copy of Webster's English dictionary on metal (only a few thousand metal pages to print, should be pretty lo
Re:Chick and Egg problem (Score:2)
This has a fundamental chicken and egg problem: So you store the information, you also need to store the format of that information. So then how do you read "format of the information" document? What format is *that* in?
It looks like you'd enjoy reading Gödel, Escher, Bach by D. Hofstadter. The whole book's about dealing with the philosophical implications of this problem.
Chinese whispers (Score:3, Interesting)
I took it off of tape, via the BBC and a serial lead, I have all my chickens and all my eggs. So long as you move to a newer form of media before the old one perishes then you're going to be OK.
I think it's a Chinese whisper problem not a chicken and egg problem, what happens when inaccuracies are introduces
e.g. Someone writes a file in an odd charset, nobody notices that the charset is different from ASCII when they conv
Comment removed (Score:5, Funny)
Re:Truth be told. (Score:2)
Re:Truth be told. (Score:2)
Photogenic memory. (Score:2)
Re:Truth be told. (Score:2)
maybe you meant photographic memory...
Re:Truth be told. (Score:2)
He's good buddies with Jethro Tull.
I want it too (Score:5, Insightful)
from the US government? (Score:2, Interesting)
"The system's "initial operating capability" should be available during Fiscal Year 2007. Weinstein noted that "the system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information.""..HAHAHAHAHA! Anything even *remotely* important or interesting, paid for by tax payers or not, sorry, "terrorism, security", yada
Re:I want it too (Score:3, Funny)
It'll probably be some $50 million database system that runs on Microsoft Windows 2003 Server and requires Oracle along with a mish-mash of Visual Basic .NET applications to accept data input and display it. I mean hell, we'll still be run
Re:I want it too (Score:2)
I worked on Lockheed's demo and sat through their entire presentation to the NARA staff. What you describe could not be further from the truth. The design team was well aware of the value of open stan
Re:I want it too (Score:2)
IDE Raid.. (Score:3, Interesting)
Re:IDE Raid.. (Score:3, Insightful)
Either way tapes still hold some value in offsite storage.
Re:IDE Raid.. (Score:3, Insightful)
For ordinary backup requirements, where data only has to be retrievable for a few months or years, disks can be useful. Under these conditions, the mean time between failures of the backup drives is at least as good as that of the production drives.
Archival backup, however, depends on an extremely low rate of failure over a very long time. The ideal backup medium is not only stable but can also be read using simple me
mbox & gif formats are so complex (Score:2)
Personally, I'd be more worried about proprietary formats.
Offsite storage and What type of Media? (Score:2)
Seriously though, what medium would work the best for this? At this point I think that hard drives cost just as much as Ultrium tapes, for just as much storage. Seeing as tapes die so quickly, you may as well back them up onto true hard drives, then just let them sit for a few years. After ten to twenty years, carry it forward to the next big storage medium.
Re:Offsite storage and What type of Media? (Score:2)
That way, if earth really does get in the way of a new highway, well, it will not hurt the mice.
Re:Offsite storage and What type of Media? (Score:2)
But, yeah. Every few years I move everything that was previously on the mirrored array in my server to whateve
Real Video (Score:5, Insightful)
Re:Real Video (Score:3, Funny)
Will we want to archive what we can? (Score:2, Insightful)
I remember reading an article about the archival of scientific research; many researchers involved in the discovery of DNA's structure didn't keep their (hand-written) notes, but they were later recovered by ot
The US National Archives (Score:5, Funny)
Task completed......
I hope they archive... (Score:3, Funny)
Oh God, no... (Score:2)
It sometimes amazes me that LMCO manages to build planes that actually fly. But then I have to remember that the people designing the planes apparently aren't the ones designing their software.
If they build aircraft the way they build software [fas.org], their planes would make these [redbullflugtagusa.com] look like this [af.mil].
Momentous Task Indeed (Score:2, Interesting)
Just look back at how much technology has changed in the past 10 years. We had 5.25" floppy drives in use back then, 3.5" floppies were used as well, and CD burners were just starting to become available at the speedy rates of 1-2x, not to mention hard drives were so small compared
Re:Momentous Task Indeed (Score:2)
Keeping data alive forever will require effort forever. It would be poor form indeed to toss your data into a safe, just assuming that you will be able to read it X hundreds of years down the line.
There is no eternal data format or medium, but the data itself can be preserved eternally. People don't seem to "get" this.
Priorities. (Score:2, Funny)
Google? (Score:5, Interesting)
Re:Google? (Score:2)
Re:Google? (Score:3, Informative)
Lockheed partners include BearingPoint Inc., McLean VA; Fenestra Technologies Corp., Germantown, MD; FileTek Inc., Rockville, MD; History Associates Inc., Rockville, MD; EDS Corp., Plano, TX; Image Fortress Corp., Westford, MA; Metier Ltd., Washington, DC; Science Applicati
What is a formsfor ? (Score:3, Funny)
While looking through the documentation http://www.archives.gov/era/about/documentation.h
I found a link to the project requirements : http://www.archives.gov/era/about/requirements.cs
Which contains the following line
I know, one typo in one line in several hundred, but why that line ?
Protect against the 1984 "memory hole"? (Score:3, Insightful)
Re:Protect against the 1984 "memory hole"? (Score:5, Insightful)
In the year 2525! (Score:2)
Mebbe if we kept email as plain text we wouldn't have to ask this question. I'm fond of mbox myself. However, any sufficiently documented format will only leave us with storage media issues.
Re:In the year 2525! (Score:2)
We don't have a choice. Not only do we have to keep track of the text of the message, but we also have to archive the attachments and the arrangement (i.e. if this email is just one message in a dozen-email exchange, that has to be preserved).
Cue Stealth (TM) jokes here (Score:4, Funny)
Hhmm...
I can make this work... (Score:2)
2) I'll bank the cash and cream off some interest to keep me in luxury - let's say I can get 5% interest PA, that's $15.4M a year so I'll take, say, $5M/year for myself and my efforts.
3) I'll spend up to $10M/year maintaining a secure storage facility and purchasing 3 units of every storage device that comes to market together with a range of media, host systems and documentation on acid-free archive paper.
4) There will be an annual charge for subscribers to the facility.
5) Prof
IFF-ILBM (Score:4, Informative)
A fair number of artists and video producers used Amigas. One of Amiga's advantages was that practically all the graphics programs used ILBM format, which meant you could easily feed the output from one application into another, and then into another. It was good, and it wasn't all that many years ago.
Just try finding a program on Mac OS X or Windows today that can read IFF-ILBM files! Go on, try it! Photoshop, for one, doesn't have a clue about them. The best you can hope is to find some obscure freeware IFF-to-PNG converter that someone has hacked together.
Another example: It's getting harder to find apps that play "tracker module" music, and the programs that are available tend to be awkward and unreliable. Everything went to MP3, and mod music was quickly forgotten.
So if the idea of today's commonplace formats becoming unknown in the future sounds far-fetched at all... it's not.
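To be fair to the format itself, IFF is a simple chunked container, so keeping it readable is mostly a matter of somebody bothering. Here is a rough Python sketch of walking the chunks of an ILBM file and pulling the image size out of the BMHD header. It follows the EA IFF 85 layout as I understand it (4-byte chunk ID, 4-byte big-endian length, data padded to an even length); the function names are hypothetical, and it makes no attempt to decode the pixel data.

```python
import struct


def iff_chunks(data):
    """Yield (chunk_id, payload) for each chunk inside an IFF FORM."""
    if len(data) < 12 or data[:4] != b"FORM":
        raise ValueError("not an IFF FORM file")
    (form_len,) = struct.unpack(">I", data[4:8])
    pos = 12  # skip "FORM", the length, and the 4-byte form type
    end = min(8 + form_len, len(data))
    while pos + 8 <= end:
        chunk_id = data[pos:pos + 4]
        (size,) = struct.unpack(">I", data[pos + 4:pos + 8])
        yield chunk_id, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # odd-sized chunks carry one pad byte


def ilbm_size(data):
    """Return (width, height) from the BMHD chunk of an ILBM file."""
    if data[8:12] != b"ILBM":
        raise ValueError("FORM type is not ILBM")
    for chunk_id, payload in iff_chunks(data):
        if chunk_id == b"BMHD":
            width, height = struct.unpack(">HH", payload[:4])
            return width, height
    raise ValueError("no BMHD chunk found")
```

Something like `ilbm_size(open("picture.iff", "rb").read())` would be the typical call. Decoding the BODY chunk (bitplanes, optional run-length compression) is more work, but it is the kind of work a written spec makes possible.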
Re:IFF-ILBM (Score:2, Informative)
This is where the true support for these formats will remain - open source. If you want support, you have the freedom to write it yourself.
Of course, if memory serves me right non-free Paint Shop Pro still has IFF support as well. Hmmm. This [activewin.com] page seems to say so. I seem to remember Photoshop having IFF support, but that was 3.0 or 4.0ish many moons ago onna MacOS classic box.
Re:IFF-ILBM (Score:2)
I never used an Amiga so I don't know about IFF-ILBM, but a Google search turns up lots of hits including a GIMP plugin. Some tracker formats are definitely supported by Winamp and I'm pretty sure the format specs are out there.
And there are emulators out there for almost any old system you can think of (though I admit getting the ROMs can be tricky).
Re:IFF-ILBM (Score:2, Informative)
How about Clay? (Score:5, Funny)
Step 1) Generate more documents in open formats (Score:5, Insightful)
Obviously, open standards are not a panacea. There are countless standards created by the military that never really spread farther than that, and therefore the support for them is limited (and thus companies that do support them can charge a pretty penny for it). But with open standards, at least it is much easier to write an implementation if you need to. Compare this to MS Word, which is a pain to reverse engineer now; just imagine having to do so in the distant future, when it is not as widespread. And of course, for the very long term, nothing is more certain (and more inconvenient) than printing everything out and storing it in a warehouse, which is what is done now. But the longer that can be postponed, the more money can be saved.
As an added bonus, just imagine the competition that would spring up in the word processor market, if the DoD mandated that all new word processor documents generated internally or by contractors be in OASIS format, starting 5-10 years from now. Microsoft would have to support it (and well) or throw away a huge number of Office sales. The DoD would no longer be locked into a single vendor, saving them money upfront in addition to the money they saved on document retention.
Until then, the best plan is likely to convert as much as possible to a few standards like PDF, which is what I expect will happen here.
Re:Step 1) Generate more documents in open formats (Score:2)
A lot of people in these comments keep saying, "I can solve this problem for $10k! Convert everyone to Open Office!" That's all well and good, but people, we are already DECADES behind on this problem. Whether you like it or not, there's a boatload of Word95/Excel/BMP/etc files out there (and worse).
Drm (Score:2)
Great - Lockheed Martin. Now there is an idea (Score:2)
Lockheed Martin data entry... (Score:3, Interesting)
The bizarre thing is that I found out a few of the individuals would "pad" their PPM (Parcels Per Minute) by typing in zipcodes they were familiar with instead of reading what was on the display, just to enter a dozen or so really quickly. It didn't happen often, but it helped them keep up the pace and "clear" the system queue more quickly, thus gaining them and their workmates an early break. However, I've no idea what damage may have occurred because of their lax attitudes, and I really don't want to know now.
Which brings me to my point (I think): how can we be certain the data they're entering is one-hundred percent accurate, regardless of whether the medium lasts a century?
Is Lockheed really doing anything? (Score:2)
Already here (Score:2)
I can fix this. (Score:3, Interesting)
The HUGE concern is the undocumented, encrypted or (worse) DRM'ed files. Reading those in 100 years may be exceedingly difficult.
We can read documents written in hieroglyphs around 2000 years ago. The only problem is with languages for which no translations *ever* existed.
Survival and longevity of antique media are a much bigger problem.
Hi Trent (Score:2)
dspace.org (Score:2)
$308M would sure go far if donated to this open source federation!
An Arms Dealer to Guard the Memory Hole! (Score:3, Interesting)
From the article: "The system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information." From the sound of that I'm betting it's some wonky and ridiculous XML format infected with a sadly pathetic little DRM imp.
The fact is that I can read anything if I have a copy of the software that originally viewed/created it and the machine (or an emulation of the machine) on which the software ran. Adding one more format to the mix just means we have to emulate one more machine and keep track of one more piece of software and all the doubtlessly expensive effort which will be spent in conversion is wasted.
It's great to see the National Archives working on this but I would rather see the tax money farmed out in challenge grants to organizations like the Long Now [longnow.org] that have a chance in Hell of delivering something useful than pouring money into yet another defense company to ensure that whatever technology we use to store records can be properly sanitized and locked away according to the whims of government and "changing policy."
The biggest issue facing us right now is that most of the music, words and images created by our civilization are illegal to preserve. Ridiculous copyright extensions have ensured that the huge mass of data for which no rights owner can be found will simply rot instead of being digitized and stored.
A software emulator [iconbar.com] can ensure that historic file formats are readable in the future, but Big Media would rather squeeze our history to death before letting go of the rights.
This is like 1000 fires at the Library in Alexandria. Future generations will curse us for every scrap of information we allow to rot while we squabble.
Rosetta Project (Score:3, Informative)
sure, it may not be terribly convenient, but it's certainly going to be readable 100 to 1000 years from now (by which point we should have adequate OCR to complete the task of reading the disc automatically)
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
Nonetheless, I believe the "Why?" question should be asked before AND after the fact, continuously. I don't believe we should just roll over when the majority says it's OK to tax-and-spend. I've met many people through slashdot who have come to agree with the non-authoritarian positions I've stated in the past, and I love the debates I usually get before I get modded down.
That's what
Pfah. (Score:2)
Happy to help!
--grendel drago
Re:Pfah. (Score:2)
While Rand had some interesting ideas, so did Clinton and Bush at times.
I don't believe in the use of force by anyone against anyone else except in the direct defense of one's property or person. To me, taxation is force, and I just can't understand why laws are necessary at the federal level.
I'm not advocating shutting down ALL government (yet), just the federal level. It is my firm belief that the State-Countries that will be left will be MUCH more competitive
Re:Unconstitutional, unnecessary, and unacceptable (Score:5, Insightful)
I'm curious, did you have any criticism for the $300M "bridge to nowhere" in Alaska when it was reported in the new budget this year? And where are you on the $200B+ we're spending in Iraq?
The Bridge has a name you insensitive clod (Score:2)
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
http://www.lewrockwell.com/orig/stinnett1.html [lewrockwell.com]
http://www.lewrockwell.com/pilger/pilger17.html [lewrockwell.com]
http://www.lewrockwell.com/rogers/rogers40.html [lewrockwell.com]
And some of those items took decades to make it. If they're going to keep certain government information in the Archives, make it available immediately.
I'm against the $300M bridge to nowhere -- I believe in privatized roads funded by local businesse
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
But since you see the value in the National Archives, why don't you see the value in digitizing it for archive and access by the people who own it?
Personally, I expect that Lockheed will bungle the job. They're not experts in the archive business, no matter how much of their own they do, or how much they "demonstrated" to the government. They're in the "blow things up" business.
Re
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
As a multi-business owner, I can tell you that I'd rather be paying for the roads that come to my store from the highway, and I'd rather co-op it with other businesses by finding the BEST builder and maintainer of the roads. Today, we all pay hidden gas and other taxes for use of the roads, but the costs are crazy (I should know, I've worked for a highway contractor).
If we could should the average taxpay
Re:Unconstitutional, unnecessary, and unacceptable (Score:5, Insightful)
Wow, you can access the Constitution? I mean it was written in 1787. That's a long time ago. Good thing somebody thought to save it!
We're saving lots of data, because 1) lots of it is important and 2) we have very little perspective on it yet. In 200 years we might very well have a very different idea of what was important today.
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
Why can't private companies do it? $300M in every pork barrel program adds up quickly, and government just inflates more currency into creation whenever things get tight.
Our economy is heading to the gutter fast once the housing bubble bursts, an
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
I'm pretty sure LockheedMartin IS a private company.
Oh, did you mean you want them to front the cost, too? And then who pays them for their services of preserving the National Archives. I expect that would be the government.
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
Re:Unconstitutional, unnecessary, and unacceptable (Score:2)
"To make all Laws which shall be necessary and proper for carrying into Execution the foregoing Powers and all other Powers vested by this Constitution in the Government of the United States, or in any Department or Officer thereof."
You need to keep official records of things like: court rulings, legislation, federal expenditures, etc.
Resize? (Score:2)
Re:What about Resolution? (Score:2)
Funny, but displays are not going by Moore's law.
Old high res Sun monitors from a decade (or more) ago still work just fine at 1600x1200. That was super high res back then, when most were lucky to get 800x600.
Now, Apple has a 2500x1600 display that is pretty sweet, and IBM has the highest density monitor at something like 4000x3000.
Now if I could just get a 2100x1080 projector for something reasonable so I can watch 1080i on a wall... Sony has one, but it costs thousands...
Re:What about Resolution? (Score:2)
Well... think of 8mm film. It's grainy, low quality and generally a poor medium. But you can still watch it. The best example of this is the stock suspension bridge footage [camerashoptacoma.com]. Granted this was shot on 16mm, but given the fact that this is the only copy available it's still seen today, and this was shot 65 years ago. It's in black and
so rescale it (Score:2)
just as old archive film looks pretty shitty by modern TV standards but you can still see what's going on just fine.
Re:not fair (Score:2)
Re:Microfiche (Score:2)
Re:Microfiche (Score:2)
that means that the data is stored in a less accessible format for long life. That does not mean that you cannot keep the data in a "liquid" state as well for easy consumption, but in case of a catastrophic data loss, the archive could be used to rebuild the database.
Re:Tax dollars at work: 300M to solve a 100k probl (Score:3, Insightful)
How in the fucks sake do you expect this to last 100+ years? Don't use lossy compression? How is that a solution?
Take Windows Bitmap image format. It's not lossy. That doesn't mean that we won't forget how to display the damn thing...
Raid 5? What problem do you think you're solving? Kee