
 



United States Technology

Lockheed Chosen For Electronic Records Archives 282

TrentL writes "How will we be able to read 1990's email messages in the year 2090? Will GIF files still be accessible in 2105? The US National Archives - tasked with preserving records "for the life of the republic" - has chosen Lockheed Martin to solve exactly this problem. Lockheed was awarded the $308M Electronic Records Archives contract after a year-long design competition. Full Disclosure: I worked on Lockheed's demo team."
This discussion has been archived. No new comments can be posted.

  • Why not? (Score:3, Interesting)

    by Poromenos1 ( 830658 ) on Monday September 12, 2005 @05:33PM (#13541846) Homepage
    Analog media couldn't be restored because the machines that read it broke (couldn't they make new ones?) but as long as the specs exist, I don't see why they won't be able to read the digital data (assuming we still use two bits in the future).
    • Re:Why not? (Score:5, Insightful)

      by WindBourne ( 631190 ) on Monday September 12, 2005 @05:45PM (#13541964) Journal
      First off, you do not seem to know (or do not remember) that NASA is losing all sorts of data. They have 2 problems. Just 40 years ago, they were storing data on Tape Drives. The tapes are decaying so the data is disappearing. In addition, the formats are disappearing. Back then, all the specs were written down, and yet, the formats are hard to find in mountains of data.

      So now, fast-forward a hundred years. Just 15 years ago, I was working with CDs that would last 100 years (50 bucks a pop). Now, people seem to assume that current discs will last that long. They will not. The old discs were made of thin gold sheets in plastic. They are now some plastic in plastic. These CDs/DVDs will last less than 10 years (and probably closer to 5). In addition, the tape drives and hard disks are storing a million times more data than what was on tape in the 60s. That is, the storage density is WAY up. So now, as a small pox shows up, it will affect millions of times more data, making recovery very difficult.
      • Re:Why not? (Score:5, Informative)

        by drsmithy ( 35869 ) <drsmithy@gm[ ].com ['ail' in gap]> on Monday September 12, 2005 @07:02PM (#13542574)
        They are now some plastic in plastic. These CDs/DVDs will last less than 10 years (and probably closer to 5).

        Even that's pretty generous IMHO. In my experience, recent blank CDs (and DVDs) are lucky to make it to 18 months, and many of mine are delaminating or corroding after only 12. I've now gotten into the habit of burning two copies of everything I "archive", and re-burning them every 12 months. Thus far I've had errors, but never errors in the same place on each copy.

        Contrast this to the good old "Kodak Gold" CDs I was burning onto back in 1996, almost all of which are still readable with 0% errors...
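The two-copy habit described above only helps if you can tell which copy of a given region is the good one. A minimal sketch of that recovery step, assuming a list of per-block SHA-256 digests was recorded when the discs were burned (the block size, the function names, and the digest list are illustrative assumptions, not anything the poster describes):

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical and tiny, so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Record one SHA-256 digest per block at burn time."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def merge_copies(copy_a, copy_b, digests, block_size=BLOCK_SIZE):
    """Rebuild the original by taking each block from whichever copy still verifies."""
    recovered = bytearray()
    for index, expected in enumerate(digests):
        start = index * block_size
        candidates = (
            copy_a[start:start + block_size],
            copy_b[start:start + block_size],
        )
        for candidate in candidates:
            if hashlib.sha256(candidate).hexdigest() == expected:
                recovered += candidate
                break
        else:
            # Neither copy verifies: the errors landed in the same place.
            raise ValueError(f"block {index} is damaged in both copies")
    return bytes(recovered)
```

If both copies are damaged in the same block the function raises rather than guessing, which is the case the poster says has not happened yet.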

      • Comment removed based on user account deletion
    • Re:Why not? (Score:5, Informative)

      by tabkey12 ( 851759 ) on Monday September 12, 2005 @05:46PM (#13541976) Homepage
      The specs could easily be lost over a long period of time, and it's very hard to reverse engineer algorithms from scratch (given that in 100 years, newer and more optimal algorithms than LZW will be used). It's predicted that the only image format that will still be around in 100 years is ppm [wikipedia.org], simply because it only takes about half an hour to implement from scratch!
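To give a feel for why ppm is considered so easy to reimplement, here is a rough sketch of a writer and reader for the plain-text "P3" variant. The function names are made up for illustration, and it assumes a maximum channel value of 255 on output:

```python
def write_ppm(width, height, pixels):
    """Serialize row-major (r, g, b) tuples as a plain-text P3 PPM image."""
    lines = ["P3", f"{width} {height}", "255"]
    for row in range(height):
        row_pixels = pixels[row * width:(row + 1) * width]
        lines.append(" ".join(f"{r} {g} {b}" for r, g, b in row_pixels))
    return "\n".join(lines) + "\n"


def read_ppm(text):
    """Parse a plain-text P3 PPM image back into (width, height, maxval, pixels)."""
    tokens = []
    for line in text.splitlines():
        # Everything after '#' on a line is a comment.
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P3":
        raise ValueError("not a plain PPM (P3) file")
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(token) for token in tokens[4:]]
    if len(values) != width * height * 3:
        raise ValueError("pixel data does not match the declared size")
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return width, height, maxval, pixels
```

There is no compression and no binary framing, so a future reader only needs to understand decimal numbers and whitespace.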
      • Check out the micro-etched data disks [rosettaproject.org] used by the Rosetta Project. Their goal is to create a long-lasting archive of the basic elements of 1,000 different languages. The storage medium they're using involves etching readable words on to metal disks. The words are not readable by the naked eye, but all you need to read them is a decent optical microscope -- no special hardware or software.

        The Rosetta Project's customized "Rosetta Disk" adds another clever innovation: naked-eye-readable words around the ed
    • Re:Why not? (Score:2, Insightful)

      by RiotXIX ( 230569 )
      Yeah I've got to agree with this. If pages of the bible made it this long on paper (and commodore emulator geeks are going to be around forever by the looks of it), I have my doubts that machines are going to be having trouble interpreting code for reading ascii or utf8. Please...if the data's that important then the people who own it should upgrade it to the latest format (if the old is 'suddenly' about to become totally obsolete).
      All these people whinging about how cd's won't last - I'm pretty confid
      • Re:Why not? (Score:4, Interesting)

        by TrentL ( 761772 ) on Monday September 12, 2005 @06:26PM (#13542282) Homepage
        All these people whinging about how cd's won't last - I'm pretty confident that if I bother to hold on to the cdroms in my drawer, provided they're kept in their cases/good condition they'll be just as playable (on the same hardware) in 100 years. Frankly I hope (probably all) of the stuff in my e-mail isn't around in 100 years.

        The amount of data we are talking about is HUGE. There is no way humans could manually upgrade the data. It would be a technical and policy nightmare. As for preserving emails, the email messages of the executive branch contain much historical significance.
        • Re:Why not? (Score:3, Funny)

          by 2Bits ( 167227 )
          As for preserving emails, the email messages of the executive branch contain much historical significance.

          Blah, if that has so much historical significance, you just need to post it to the newsgroups, and it will be preserved for as long as internet exists.
    • Re:Why not? (Score:3, Insightful)

      by wo1verin3 ( 473094 )
      People are severely over complicating this problem.

      While making things last as long as possible is a good thing, you can't plan for 100 years down the road. You have NO idea what will truly happen.

      There needs to be a system to move data from older mediums to newer mediums every few years as they become available. Multiple copies, with verification. Checking. Double checking.
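The "multiple copies, with verification" step above could look roughly like this. It is only a sketch: the function names are hypothetical, SHA-256 is an assumed choice of checksum, and a real migration would also have to record the digests somewhere durable.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, destinations):
    """Copy source to every destination and re-read each copy to verify it."""
    expected = sha256_of(source)
    for destination in destinations:
        destination = Path(destination)
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, destination)
        # "Checking. Double checking.": verify what actually landed on the new medium.
        if sha256_of(destination) != expected:
            raise IOError(f"verification failed for {destination}")
    return expected
```

The returned digest is what you would carry forward to the next migration, so each generation of media can be checked against the one before it.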

      • Re:Why not? (Score:4, Insightful)

        by evilviper ( 135110 ) on Tuesday September 13, 2005 @04:11AM (#13545295) Journal
        There needs to be a system to move data from older mediums to newer mediums every few years as they become available. Multiple copies, with verification. Checking. Double checking.

        This has been discussed before. The sheer volume of data that would have to be copied every few years is HUGE. How long would it take you to transfer a stadium full of CDs onto DVDs? How much would that cost?

        There's good reason people are looking for digital technologies that are as inherently stable, long-lasting, and reliable as writing on paper.
  • GIF? (Score:5, Funny)

    by crimethinker ( 721591 ) on Monday September 12, 2005 @05:34PM (#13541860)
    Of course we'll be able to read GIF files! By then, all the stupid patents should have expired (pending action by the House of Misrepresentatives, of course).

    We're just lucky that Walt didn't dream up LZW compression while he was working on Steamboat Mickey, or we'd have patents lasting for the author's life plus 90 years!

    -paul

  • by Manip ( 656104 ) on Monday September 12, 2005 @05:35PM (#13541868)
    This has a fundamental chicken and egg problem: So you store the information, you also need to store the format of that information. So then how do you read "format of the information" document? What format is *that* in?

    You see; whatever format you used for anything has to be documented and you can't use paper because it won't last as long ... Do you carve it into stone?

    Worse still you need some computer science grads to write up exactly the format down to how long a char is and the bit/byte order. It is an extremely difficult task even if you don't take into consideration finding a storage medium that will last that long. :-(
    • by mbkennel ( 97636 ) on Monday September 12, 2005 @05:50PM (#13542001)
      This has a fundamental chicken and egg problem: So you store the information, you also need to store the format of that information. So then how do you read "format of the information" document? What format is *that* in?

      Latin, videlicet.

      But seriously the problem in records is not going to be collecting the data, but turning it into knowledge. Meaning that humans in the future are likely to seriously misinterpret or be unaware of the intended meanings and social and political contexts of the preserved data.

      This is not a technology problem.

      They ought to make sure that real professional historians are there.
    • by hypnagogue ( 700024 ) on Monday September 12, 2005 @05:52PM (#13542013)
      This is not nearly as difficult as you make it seem: implement the parser in a standardized language. The formal specification of the standardized language can then be included with the source of the parser.

      Getting code to run on later architectures is not usually very difficult. I am fairly comfortable with the proposition of porting any code to any future architecture -- the "emulator scene" testifies to the viability of this strategy. The biggest problem to be solved is reading storage media for which no hardware exists.

      For example, how do I get to my college research stored on AmigaDos floppies? Tragically, the easiest solution is to try to get my Amiga running again, and then move the data over a serial cable with kermit. I'm awfully glad I have kermit on that computer, because I don't think I'd be able to find any 2400 baud Amiga BBSes around to download it.
    • Do you carve it into stone?
      No, but long-lifespan microfiche could work (AFAIK). I suppose data density is lower compared to digital, but you could undoubtedly improve it with that kind of money.
    • Let me see if I can save them some work:

      1) Carve one copy of the ascii table in metal (choosing metal as less brittle than stone).

      2) Store the description of the file format in english ascii, and give it a unique identifier, also in english-ascii.

      3) Store the files as whatever binary you want, alongside a pointer to #2. The pointer should be english-ascii as well.

      If you're real paranoid, you store a copy of webster's english dictionary on metal (only a few thousand metal pages to print, should be pretty lo

    • This has a fundamental chicken and egg problem: So you store the information, you also need to store the format of that information. So then how do you read "format of the information" document? What format is *that* in?


      It looks like you'd enjoy reading Godel, Escher, Bach by D. Hofstadter. The whole book's about dealing with the philosophical implications of this problem.

    • Chinese whispers (Score:3, Interesting)

      I have code on a modern HDD that I typed into a BBC computer 15+ years ago from a magazine.

      I took it off of tape, via the BBC and a serial lead, I have all my chickens and all my eggs. So long as you move to a newer form of media before the old one perishes then you're going to be OK.

      I think it's a Chinese whisper problem not a chicken and egg problem, what happens when inaccuracies are introduces

      e.g. Someone writes a file in an odd charset, nobody notices that the charset is different from ASCII when they conv
  • by account_deleted ( 4530225 ) on Monday September 12, 2005 @05:35PM (#13541869)
    Comment removed based on user account deletion
  • I want it too (Score:5, Insightful)

    by spblat ( 26399 ) on Monday September 12, 2005 @05:36PM (#13541872) Homepage Journal
    It's not just the government that needs this. Since we're funding this effort with our taxpayer dollars, I'm hopeful that some of the results from this work will lead to the availability of tools us normal folks can use to make sure our precious data can be preserved and passed down from one generation to the next.
    • by zogger ( 617870 )
      good luck! What's theirs is theirs! What's yours is theirs! from the PR:

      "The system's "initial operating capability" should be available during Fiscal Year 2007. Weinstein noted that "the system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information.""..HAHAHAHAHA! Anything even *remotely* important or interesting, paid for by tax payers or not, sorry, "terrorism, security", yada

    • Since we're funding this effort with our taxpayer dollars, I'm hopeful that some of the results from this work will lead to the availability of tools us normal folks can use to make sure our precious data can be preserved and passed down from one generation to the next.

      It'll probably be some $50 million database system that runs on Microsoft Windows 2003 Server and requires Oracle along with a mish-mash of Visual Basic .NET applications to accept data input and display it. I mean hell, we'll still be run

      • It'll probably be some $50 million database system that runs on Microsoft Windows 2003 Server and requires Oracle along with a mish-mash of Visual Basic .NET applications to accept data input and display it. I mean hell, we'll still be running Windows in 2090 so it only makes sense to stick with standards.

        I worked on Lockheed's demo and sat through their entire presentation to the NARA staff. What you describe could not be further from the truth. The design team was well aware of the value of open stan
    • The easiest way to ensure data will never go away is to publish it on a web site or post it in a newsgroup. Ever tried to delete a message that found its way into Google's results?
  • IDE Raid.. (Score:3, Interesting)

    by markass530 ( 870112 ) <<moc.liamg> <ta> <035ssakram>> on Monday September 12, 2005 @05:36PM (#13541880) Homepage
    Not sure where I read it, but there was an article I read about using good old cheap IDE Raid as a tape replacement. Some guy did it on a large scale for a university, and at a relatively low cost. Considering the low cost per GB, and easy scalability, why not?
    • Re:IDE Raid.. (Score:3, Insightful)

      by ipjohnson ( 580042 )
      The IDE-raid doesn't solve all the problems. One of the big ups for tape backup is that you can take it off site. So you either have to copy the raid and move it off site or setup a second set of systems off site.

      Either way tapes still hold some value in offsite storage.
    • Re:IDE Raid.. (Score:3, Insightful)

      At current prices and density, disks often work out to be cheaper per byte than tape.

      For ordinary backup requirements, where data only has to be retrievable for a few months or years, disks can be useful. Under these conditions, the mean time between failures of the backup drives is at least as good as that of the production drives.

      Archival backup, however, depends on an extremely low rate of failure over a very long time. The ideal backup medium is not only stable but can also be read using simple me

  • and so poorly documented. I can see why we'd spend hundreds of millions making sure we preserve the formats.

    Personally, I'd be more worried about proprietary formats.
  • Hmmm... Offsite storage for the United States... Satellite Launch?

    Seriously though, what medium would work the best for this? At this point I think that hard drives cost just as much as Ultrium tapes, for just as much storage. Seeing as tapes die so quickly, you may as well back them up onto true hard drives, then just let them sit for a few years. After ten to twenty years, carry it forward to the next big storage medium.
    • As strange as it sounds, it would be good to send to Mars, once we start sending man there. It would be nice to have a collection of all that knowledge on a different planet.

      That way, if earth really does get in the way of a new highway, well, it will not hurt the mice.
    • Every five years, maybe. Hard drives aren't particularly stable long term, any more than tape. I had a 5 Mb. (yes, megabyte) Corvus hard disk that I bought back in 1983 or so ... I tried hooking it up a couple years ago and it was, well, blank. Worked fine after I reformatted it, but it had just erased itself sitting in the closet. And the problem is only getting worse as data density increases.

      But, yeah. Every few years I move everything that was previously on the mirrored array in my server to whateve
  • Real Video (Score:5, Insightful)

    by wildzer0 ( 889523 ) on Monday September 12, 2005 @05:41PM (#13541923)
    For a start, they should stop using stupid proprietary formats like Real Video (the Press Conference Video on their website is only available for Real Player).
  • This is a topic that's been raised quite a lot recently. Firstly, would we even want to read emails from 1990 in the future? Unlike, say, Byron's letters, that give us lucid insights and useful historical detail, most of modern day e-mail- and IM-based communication is mostly functional and lifeless.

    I remember reading an article about the archival of scientific research; many researchers involved in the discovery of DNA's structure didn't keep their (hand-written) notes, but they were later recovered by ot

  • by isotope23 ( 210590 ) on Monday September 12, 2005 @05:42PM (#13541935) Homepage Journal
    tasked with preserving records "for the life of the republic"

    Task completed......

  • by electricsalmon ( 835864 ) on Monday September 12, 2005 @05:44PM (#13541955)
    ...all the 1990's pr0n! We need to keep that in a repository for the benefit of mankind for generations to come!
  • Is this how they plan to bury the records forever?

    It sometimes amazes me that LMCO manages to build planes that actually fly. But then I have to remember that the people designing the planes apparently aren't the ones designing their software.

    If they build aircraft the way they build software [fas.org], their planes would make these [redbullflugtagusa.com] look like this [af.mil].

  • Lockheed Martin is going to have fun with this one... preserving records for that length of time will be a considerable task... and hopefully they will figure out a way that will successfully archive records forever...

    Just look back at how much technology has changed in the past 10 years. We had 5.25" Floppy drives used back in those times, and 3.5" floppies were used as well, and CD burners were just starting to come available at the speedy rates of 1-2x, not to mention hard drives were so small compared

  • Priorities. (Score:2, Funny)

    by 9mm Censor ( 705379 ) *
    goatse and tubgirl shall be archived, in all their digital glory for the ages to see.
  • Google? (Score:5, Interesting)

    by dustinbarbour ( 721795 ) on Monday September 12, 2005 @05:55PM (#13542046) Homepage
    Did Google compete for this contract? They're the ones with the largest infrastructure for such a project and the brains to give us a really slick interface to it all. Not to mention that they could probably have faster response times than archive.org which totally fuckin' blows.
    • your sig link is out of date
    • Re:Google? (Score:3, Informative)

      by TrentL ( 761772 )
      The two companies that were "down selected" to compete in the Analysis & Design phase were Lockheed Martin and Harris Corporation. I don't recall what companies participated in the initial competition...I doubt Google was involved.

      Lockheed partners include BearingPoint Inc., McLean VA; Fenestra Technologies Corp., Germantown, MD; FileTek Inc., Rockville, MD; History Associates Inc., Rockville, MD; EDS Corp., Plano, TX; Image Fortress Corp., Westford, MA; Metier Ltd., Washington, DC; Science Applicati
  • by zrq ( 794138 ) on Monday September 12, 2005 @05:58PM (#13542065) Journal
    I was curious to see if the plans included making any software developed for the project OpenSource.
    While looking through the documentation http://www.archives.gov/era/about/documentation.html/ [archives.gov]
    I found a link to the project requirements : http://www.archives.gov/era/about/requirements.csv/ [archives.gov]
    Which contains the following line :
    ERA2.6.3,The system shall check online formsfor correctness,,

    I know, one typo in one line in several hundred, but why that line ?
  • by Anonymous Coward on Monday September 12, 2005 @05:58PM (#13542070)
    Technically I don't see any problem with storing 100PB of data in the next decade, and keeping it safe from natural disasters. But how about unnatural disasters, such as an evil administration changing the entire archive to reflect better on itself or protect itself from criminal prosecution? Copies of the archive packages need to be suitably dispersed in multiple jurisdictions or even shot into space in order to make this kind of data destruction infeasible.
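Dispersed copies only protect against tampering if someone compares them. A small sketch of that comparison, assuming each site can hand over its copy of a record (the site names and the majority-vote rule are illustrative assumptions, not part of any announced design):

```python
import hashlib
from collections import Counter


def find_outliers(copies):
    """Return the sites whose copy disagrees with the majority digest.

    copies maps a site name to the raw bytes that site holds for one record.
    """
    digests = {
        site: hashlib.sha256(blob).hexdigest()
        for site, blob in copies.items()
    }
    majority_digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes <= len(copies) // 2:
        # Without a strict majority there is no way to say which version is right.
        raise ValueError("no majority among the dispersed copies")
    return sorted(site for site, digest in digests.items() if digest != majority_digest)
```

The more independent jurisdictions hold a copy, the more of them an attacker would have to alter to shift the majority.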
  • "How will we be able to read 1990's email messages in the year 2090?

    Mebbe if we kept email as plain text we wouldn't have to ask this question. I'm fond of mbox myself. However, any sufficiently documented format will only leave us with storage media issues.
    • Mebbe if we kept email as plain text we wouldn't have to ask this question.

      We don't have a choice. Not only do we have to keep track of the text of the message, but we also have to archive the attachments and the arrangement (i.e. if this email is just one message in a dozen-email exchange, that has to be preserved).
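Even with plain-text mail, the attachments and the reply structure are recoverable from the standard headers. A rough sketch using Python's standard `email` package; the helper names, the example.org message IDs, and the idea of using `In-Reply-To` as the only threading signal are simplifying assumptions:

```python
from email import message_from_string
from email.message import EmailMessage


def build_message(msg_id, subject, body, in_reply_to=None, attachment=None):
    """Create a sample message as raw text (stand-in for one mbox entry)."""
    msg = EmailMessage()
    msg["Message-ID"] = msg_id
    msg["Subject"] = subject
    if in_reply_to:
        msg["In-Reply-To"] = in_reply_to
    msg.set_content(body)
    if attachment:
        name, payload = attachment
        msg.add_attachment(
            payload,
            maintype="application",
            subtype="octet-stream",
            filename=name,
        )
    return msg.as_string()


def summarize(raw_messages):
    """Map each Message-ID to its parent message and its attachment names."""
    summary = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        names = [part.get_filename() for part in msg.walk() if part.get_filename()]
        summary[msg["Message-ID"]] = {
            "parent": msg["In-Reply-To"],
            "attachments": names,
        }
    return summary
```

This only shows that the arrangement can be read back out of the messages; whether the attachment itself is still interpretable is the separate file-format problem the rest of the thread is arguing about.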
  • "The Electronic Records Archives. By the same man who gave us the Stealth(TM) aircraft".

    Hhmm...
  • 1) Give me the $308M.

    2) I'll bank the cash and cream off some interest to keep me in luxury - let's say I can get 5% interest PA, that's $15.4M a year so I'll take, say, $5M/year for myself and my efforts.

    3) I'll Spend up to $10M/year maintaining a secure storage facility and purchasing 3 units of every storage device that comes to market together with a range of media, host systems and documentation on acid-free archive paper.

    4) There will be an annual charge for subscribers to the facility.

    5) Prof
  • IFF-ILBM (Score:4, Informative)

    by Zobeid ( 314469 ) on Monday September 12, 2005 @06:09PM (#13542150)
    This example of format obsolescence just popped into my head. Back when Commodore-Amiga was a going concern, the IFF-ILBM graphics format was pretty widely used. It was a nearly universal standard on Amiga.

    A fair number of artists and video producers used Amigas. One of Amiga's advantages was that practically all the graphics programs used ILBM format, which meant you could easily feed the output from one application into another, and then into another. It was good, and it wasn't all that many years ago.

    Just try finding a program on Mac OS X or Windows today that can read IFF-ILBM files! Go on, try it! Photoshop, for one, doesn't have a clue about them. The best you can hope is to find some obscure freeware IFF-to-PNG converter that someone has hacked together.

    Another example: It's getting harder to find apps that play "tracker module" music, and the programs that are available tend to be awkward and unreliable. Everything went to MP3, and mod music was quickly forgotten.

    So if the idea of today's commonplace formats becoming unknown in the future sounds far-fetched at all. . . It's not.
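For what it's worth, the IFF container itself is simple enough that the outer structure can be walked in a few lines, which is part of why documented formats tend to stay recoverable. A sketch, assuming a single top-level FORM and doing no decoding of the image body (the function name is made up for illustration):

```python
import struct


def iff_chunks(data):
    """Return (form_type, [(chunk_id, chunk_body), ...]) for an IFF FORM file."""
    if len(data) < 12 or data[:4] != b"FORM":
        raise ValueError("not an IFF FORM file")
    (form_size,) = struct.unpack(">I", data[4:8])  # sizes are big-endian
    form_type = data[8:12].decode("ascii")
    chunks = []
    offset = 12
    end = min(8 + form_size, len(data))
    while offset + 8 <= end:
        chunk_id = data[offset:offset + 4].decode("ascii")
        (size,) = struct.unpack(">I", data[offset + 4:offset + 8])
        body = data[offset + 8:offset + 8 + size]
        chunks.append((chunk_id, body))
        # Chunk bodies are padded to an even length.
        offset += 8 + size + (size & 1)
    return form_type, chunks
```

For an ILBM file the form type is "ILBM", and the BMHD chunk begins with the image width and height as big-endian 16-bit values. Turning the BODY chunk into pixels (bitplanes, optional run-length compression) is the part that takes real work.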
    • Re:IFF-ILBM (Score:2, Informative)

      by rubypossum ( 693765 ) *
      Hey, someone else has had this problem too. Fortunately, free software to the rescue: this [gimp.org] plugin works nicely [freshmeat.net].

      This is where the true support for these formats will remain - open source. If you want support, you have the freedom to write it yourself.

      Of course, if memory serves me right non-free Paint Shop Pro still has IFF support as well. Hmmm. This [activewin.com] page seems to say so. I seem to remember Photoshop having IFF support, but that was 3.0 or 4.0ish many moons ago onna MacOS classic box.
    • there is a big difference between becoming obscure (e.g. you need to get hold of specialist utilities to handle them) and becoming totally unknown.

      i never used amiga so i don't know about IFF-ILBM but a google search turns up lots of hits including a gimp plugin. Some tracker formats are definitely supported by winamp and i'm pretty sure the format specs are out there.

      and there are emulators out there for almost any old system you can think of (though i admit getting the roms can be tricky).
  • by Narmer_the_King ( 902532 ) on Monday September 12, 2005 @06:09PM (#13542154)
    YES! Finally a job after all those years studying Akkadian! Clay tablets are some of the most durable media I know. At least they have a proven record. Vast numbers of documents illustrating the fascinating world of accounting, esp. Sumerian sheep and goat transactions, are available thanks to the scribal choice of clay (combined with hot arid conditions). Will Lockheed HR soon be seeking 8-10 years of prior "Cuneiform/Pictographic" scribal experience? I can also read omens in the entrails of an ox. That can come in handy.
  • by pavon ( 30274 ) on Monday September 12, 2005 @06:11PM (#13542170)
    I have been saying for years that the DoD should make an initiative to move towards open standards for this exact reason. The document retention requirements they have are incredible, and yet nearly all the documents generated are saved in proprietary formats. Now that the OASIS (OpenDocument) format is solidifying and there is more than one implementation of it, they wouldn't even have to define a standard for word processing or spreadsheets.

    Obviously, open standards are not a panacea. There are countless standards created by the military that never really spread farther than that, and therefore the support for them is limited (and thus companies that do support them can charge a pretty penny for it). But with open standards, at least it is much easier to write an implementation if you need to. Compare this to MS Word, which is a pain to reverse engineer now; just imagine having to do so in the distant future, when it is not as widespread. And of course, for the very long term, nothing is more certain (and more inconvenient) than printing everything out and storing it in a warehouse, which is what is done now. But the longer that can be postponed, the more money can be saved.

    As an added bonus, just imagine the competition that would spring up in the word processor market, if the DoD mandated that all new word processor documents generated internally or by contractors be in OASIS format, starting 5-10 years from now. Microsoft would have to support it (and well) or throw away a huge number of Office sales. The DoD would no longer be locked into a single vendor, saving them money upfront in addition to the money they saved on document retention.

    Until then, the best plan is likely to convert as much as possible to a few standards like PDF, which is what I expect will happen here.
    • I'm not going to go into Lockheed's plan in detail, but you may be happy to know that my teammates were well aware of the OASIS standards and their value.

      A lot of people in these comments keep saying, "I can solve this problem for $10k! Convert everyone to Open Office!" That's all well and good, but people, we are already DECADES behind on this problem. Whether you like it or not, there's a boatload of Word95/Excel/BMP/etc files out there (and worse).
  • Of course they need a national digital archive. Since the death of the general purpose computer has already been decided, non DRM'd content will at some time in the future no longer be accessible. We must protect the corporate entities by making sure that only government has access to non DRM'd content.
  • Great. Now we can be sure that the only way to read any of these documents will be with IE 4.0 since they will prohibit even the sound of the word open source [lmco.com] (sorry PDF. See clause 11) and given that they are a defense contractor you can bet they will lock in to some proprietary SW version that is 4 versions older than what is current.
  • by zenneth ( 767572 ) on Monday September 12, 2005 @06:30PM (#13542325)
    I worked with them for a while, as a data entry person back in the early 90's. Basically, we were responsible for keying in a parcel's 5-9 digit Zip code after it had been scanned into the system. By scanned, I mean the front of the package or envelope showing the send-to and return addresses was presented on a monochrome display, which allowed the person operating the terminal to enter the zip codes for the parcels. Then you'd hit a key and move to the next one, and so on and so on.

    The bizarre thing is that I found out a few of the individuals would "pad" their PPM (Parcels Per Minute) by typing in zipcodes they were familiar with instead of reading what was on the display, just to enter a dozen or so really quickly. It didn't happen often, but it helped them keep up the pace and "clear" the system queue more quickly, thus gaining them and their workmates an early break. However, I've no idea what damage may have occurred by their lax attitudes, and I really don't want to know now.

    Which brings me to my point (I think): how can we be certain the data they're entering is one-hundred percent accurate, regardless of whether the medium lasts a century?
  • Or are they just subcontracting it out, as usual.
  • In Germany, we already do that: Zentrale Bergungsort Bundesrepublik Deutschland [bva.bund.de] (German, but pictures). Here is a short description [showcaves.com] in english. All the documents are kept on microfilms, but I don't know what they do with audio/video material.
  • I can fix this. (Score:3, Interesting)

    by sbaker ( 47485 ) * on Monday September 12, 2005 @07:02PM (#13542571) Homepage
    Hmmm - I'd better email myself the GIF spec - maybe along with some source code to read it with - and a C compiler to compile the source with. Ah WTF - I'll just email myself the Linux sources. ...but seriously...there won't be any problem with reading GIF if anyone actually wants to - the file format is documented all over the place and in 100 years, if there are still GIF files on some kind of readable media - then the odds are very good that those documents will be easy to find. Programming a GIF reader (or a reader for almost any documented file format) is easy - presuming you are sufficiently motivated. A historian who is interested in 100 year old documents shouldn't have any problems getting them converted to whatever format is needed.
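As a taste of how far the published GIF layout gets you, here is a sketch that reads just the fixed-size header and logical screen descriptor. It does not decode the LZW-compressed image data, which is where most of the effort in a full reader goes, and the function name and returned dictionary keys are illustrative:

```python
import struct


def read_gif_header(data):
    """Parse the GIF signature and logical screen descriptor (first 13 bytes)."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Width and height are little-endian 16-bit values, followed by three single bytes.
    width, height, packed, background, aspect = struct.unpack("<HHBBB", data[6:13])
    has_global_table = bool(packed & 0x80)
    table_entries = 2 ** ((packed & 0x07) + 1) if has_global_table else 0
    return {
        "version": data[3:6].decode("ascii"),
        "width": width,
        "height": height,
        "global_color_table_entries": table_entries,
        "background_index": background,
        "pixel_aspect": aspect,
    }
```

Everything here comes straight from the fixed layout at the start of the file, which is the kind of detail that is easy to rediscover as long as a copy of the spec survives somewhere.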

    The HUGE concern is the undocumented, encrypted or (worse) DRM'ed files. Reading those in 100 years may be exceedingly difficult.

    We can read documents written in hieroglyphs around 2000 years ago. The only problem is with languages for which no translations *ever* existed.

    Survival and longevity of antique media are a much bigger problem.
  • I went browsing through some of your older comments and I came across this one [slashdot.org] which contains this broken link [archives.gov]! Get it? archives.gov -> broken link? Ha ha ... heh.
  • HP and MIT have been working on this same issue with the DSpace Project [dspace.org].

    $308M would sure go far if donated to this open source federation!
  • by monk ( 1958 ) on Monday September 12, 2005 @07:15PM (#13542663) Homepage
    The articles were light (to the point of vacuum) on details about the approach proposed by the company.

    From the article: "The system's architecture makes it flexible enough to accommodate evolving policy change," including the importance of "providing public access while protecting privacy and sensitive information." From the sound of that I'm betting it's some wonky and ridiculous XML format infected with a sadly pathetic little DRM imp.
    The fact is that I can read anything if I have a copy of the software that originally viewed/created it and the machine (or an emulation of the machine) on which the software ran. Adding one more format to the mix just means we have to emulate one more machine and keep track of one more piece of software and all the doubtlessly expensive effort which will be spent in conversion is wasted.

    It's great to see the National Archives working on this but I would rather see the tax money farmed out in challenge grants to organizations like the Long Now [longnow.org] that have a chance in Hell of delivering something useful than pouring money into yet another defense company to ensure that whatever technology we use to store records can be properly sanitized and locked away according to the whims of government and "changing policy."
    The biggest issue facing us right now is that most of the music, words and images created by our civilization are illegal to preserve. Ridiculous copyright extensions have ensured that the huge mass of data for which no rights owner can be found will simply rot instead of being digitized and stored.

    A software emulator [iconbar.com] can ensure that historic file formats are readable in the future, but Big Media would rather squeeze our history to death before letting go of the rights.

    This is like 1000 fires at the Library in Alexandria. Future generations will curse us for every scrap of information we allow to rot while we squabble.
  • Rosetta Project (Score:3, Informative)

    by moosesocks ( 264553 ) on Monday September 12, 2005 @10:00PM (#13543680) Homepage
    Many /.-ers would be interested in the Rosetta Project which aims to preserve many world languages using an extremely failsafe medium [rosettaproject.org]. Definitely a cool read -- check it out.

    sure, it may not be terribly convenient, but it's certainly going to be readable 100 to 1000 years from now (by which point we should have adequate OCR to complete the task of reading the disc automatically)
