Fixing Broken Links With the Internet Archive

Posted by Soulskill
from the maintain-URIs-or-T.B-L.-will-beat-you-up dept.
eggboard writes "The Internet Archive has copies of Web pages corresponding to 378 billion URLs. It's working on several efforts, some of them quite recent, to help prevent or recover from link rot, where links go bad over time. Through an API for developers, WordPress integration, a Chrome plug-in, and a JavaScript lookup, the Archive hopes to help people find at least the most recent copy of a missing or deleted page. More ambitiously, they instantly cache any link added to Wikipedia, and want to become integrated into browsers as a fallback rather than showing a 404 page."
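
For readers curious what the developer API mentioned above might look like in use, here is a minimal sketch. It assumes the Wayback Machine's public availability endpoint at https://archive.org/wayback/available and its JSON shape (archived_snapshots, then closest); the summary itself does not name the endpoint, and the function names below are hypothetical. The sketch only builds the query URL and parses a response body, so fetching is left to whatever HTTP client you prefer.

```python
import json
from urllib.parse import urlencode

# Assumption: the public Wayback availability endpoint (not named in the summary).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query_url(page_url, timestamp=None):
    """Build the lookup URL for a possibly dead page (hypothetical helper).

    timestamp is an optional YYYYMMDD[hhmmss] string asking for the snapshot
    closest to that date instead of the most recent one.
    """
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return the archived copy's URL from a JSON response body, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None
```

A 404 handler could call availability_query_url() for the missing address, fetch the result, and offer the link from closest_snapshot() when it is not None.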
  • by SunTzuWarmaster (930093) on Friday January 24, 2014 @06:26PM (#46062021) Homepage

    So let's say that my company has three lines of products on three different webpages. We decide to discontinue two of the lines of products for being unprofitable, and remove the pages. Google search results still show the pages, and archive.org still shows them to users. These products are still shown to my potential customers, who experience frustration when they attempt to get them.

    Alternately, I create a temporary webpage for displaying some demo content to a potential client. It is a demo page, riddled with bugs, holes, and other areas that need improvement. Archive.org still shows this page as part of search results? What will potential clients think of my company, given that it put up a buggy/terrible page?

    Alternately, let's just say that I rename a longstanding webpage (technology.slashdot.org to tech.slashdot.org) and delete the old URL. Should archive.org redirect to false content?

    Or, let's say that my restaurant decides to take down its 2013menu.html page, and doesn't wish customers to be able to compare its new and old menu side by side to see where prices inflated.

    Error messages have a purpose. While the most common case is that the page/server went offline, there are many cases where a page URL changes as a result of regular website updates, and you don't want users to obtain old content.

    Sometimes things are deleted for a reason.

  • by IonOtter (629215) on Friday January 24, 2014 @07:59PM (#46062857) Homepage

    There was a fascinating website dedicated to high-energy weapons and experiments, called svbxlabs.com

    It was run by a young man who'd been born in the US to Ukrainian immigrants, which is actually important to keep in mind. He was brilliant, at least in my eyes, putting together the most incredible devices. HERF cannons, railguns, Tesla coils; you name it. He was the first to explain what the OptiCom Traffic Light Changer [fleetsafety.com] was, and how it worked.

    In short, he was doing a lot of work on things a LOT of people would much rather he didn't. Things were zipping along nicely, and his college professor was very excited to see what he came up with next.

    Then 9/11 happened. Within four months, the site was gone. And Slava Person vanished from the Internet not long after that. Other people took up the mantle of his work, such as powerlabs.org, but it's not as good as Mr. Slava's work had been.

    But if you put svbxlabs.com into the Wayback Machine at archive.org (WBM/A.O), you can find most of what he did. One problem with WBM/A.O, though, is that you can't always just click on the links inside an archived page. Sometimes you have to copy them and enter them into the WBM window; otherwise your browser tries to go to the direct link, which no longer exists.

    I've also used it to find all kinds of fan fiction, role-playing games, artwork and more.

    I approve of this.
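
The copy-and-paste workaround described in the comment above can be roughly automated. This is a sketch only, assuming the commonly seen Wayback URL layout of https://web.archive.org/web/<timestamp>/<original URL>; the function name is hypothetical, and whether a given page was actually captured is not something this code can tell you.

```python
# Assumption: Wayback snapshot URLs follow the pattern
# https://web.archive.org/web/<timestamp>/<original URL>.
WAYBACK_PREFIX = "https://web.archive.org/web/"


def wayback_url(original_url, timestamp=""):
    """Turn a dead link into a Wayback Machine address (hypothetical helper).

    timestamp is an optional YYYY[MMDDhhmmss] string; when it is empty the
    address is built without one.
    """
    # Links copied out of old pages often lack a scheme; assume plain http.
    if "://" not in original_url:
        original_url = "http://" + original_url
    if timestamp:
        return WAYBACK_PREFIX + timestamp + "/" + original_url
    return WAYBACK_PREFIX + original_url


print(wayback_url("svbxlabs.com", "2001"))
```

Pasting the printed address into a browser is the same step as typing the copied link into the WBM window by hand.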
