The Internet Data Storage The Media

Treating the Web As an Archive

An anonymous reader sends a link to a blog post by David Eaves discussing how the ease of finding information on the web affects how we analyze history. "... nothing is different per se — the same old research methods will be used — but what if it is 10 times easier to do, 100 times faster and contains a million times the quantity of information? With the archives of newspapers, blogs and other websites readily available to be searched, the types of research once reserved for only the most diligent and patient might be more broadly accessible." As an example, he points to an almost 10-year-old article detailing the events surrounding the repeal of the Glass-Steagall Act, which some believe was a significant contributing factor to the current financial crisis.
  • You can't believe everything you read on the internet.
    • by Coopjust ( 872796 ) on Saturday May 02, 2009 @10:32AM (#27798375)
      True, but looking back at verifiable events can give us some real insight.

      Try looking at the Slashdot archives on September 11th, 2001. I was in middle school when the attacks happened, and I wasn't a Slashdot reader. Even more than the articles, the comments are very interesting. Panic. First hand accounts. Anger (We're going to bomb them into oblivion, we'll have Osama in a week, etc.)

      While you can't trust old information on the internet, it does have a wide variety of verifiable information that is more accessible digitally than it ever has been before.
    • by Z00L00K ( 682162 ) on Saturday May 02, 2009 @10:53AM (#27798495) Homepage Journal

      No, you can't believe everything, but if you check the sources you can classify it as being acceptably reliable or not.

      The web contains a great deal of information but you still need a search engine to deal with it - like Google. Unfortunately - or luckily - Google does filter out some pages with insecure and/or inappropriate content. This is of course negative for some researchers but positive for most people on the net.

      And it's never wrong to double-check the information provided. It may be correct, but there may be opposite views too.

      • Don't believe anything ever.

        My philosophy is to make your best guess in the face of uncertainty while still recognizing that uncertainty means that you will be wrong at least some of the time (maybe even all the time).

        I've found that there is no surer path to error than in believing that you've figured something out.

    • Nor should you believe everything you read on paper.
    • by pohl ( 872 )

      You can't believe everything you read on the internet.

      I don't believe you.

    • by aliquis ( 678370 )

      You can't believe everything anywhere, which makes that argument rather lame. Would you trust whatever newspaper? What about the bible? ..

    • Re: (Score:3, Insightful)

      by houstonbofh ( 602064 )
      Too bad it takes so long to read the article... (Not that long, just a few minutes of shock afterwards) Then you would see his premise. The web as an archive is only a small part. A bigger part is how journalism is changing to quick payoff, not truly investigative. (Verifiably true) And that those who do not study history are doomed to repeat it (also true) and we did. And that this is not the machinations of one party but both. (Also provable)

      Perhaps you should change that to "Don't believe someth
      • Yessiree.

        Because Today's +1Informative mods STAY even after tomorrow's "retraction article" which says "disregard yesterday, it was all made up by the source."

        Blending content here AttemptingAWin, Rotten Tomatoes has a recap of Marvel movies. Look at the entry for Ghost Rider.

        "...like Daredevil, Ghost Rider went down as a critical dud whose respectable performance at the box office was overshadowed by the beating it took from writers"

        Really now!? So the public liked the movies, while certain pundits playin

    • You can't believe everything that is in the history books either. History is written by those who have the power and the money to distort history.

  • Huh (Score:5, Informative)

    by paazin ( 719486 ) on Saturday May 02, 2009 @10:40AM (#27798415)
    From the article (Nov 1999):

    The decision to repeal the Glass-Steagall Act of 1933 provoked dire warnings from a handful of dissenters that the deregulation of Wall Street would someday wreak havoc on the nation's financial system.

    Yep and no one foresaw this financial crisis, indeed.

    • Re:Huh (Score:5, Interesting)

      by Knave75 ( 894961 ) on Saturday May 02, 2009 @10:49AM (#27798473)
      Also from the article

      ''I think we will look back in 10 years' time and say we should not have done this but we did because we forgot the lessons of the past, and that that which is true in the 1930's is true in 2010,'' said Senator Byron L. Dorgan, Democrat of North Dakota.

      That is almost spooky. We need this guy to be running the country.

      • Well if half of the aisle disagrees on every piece of legislation at least half of them are going to be right. Big surprise.
      • Re: (Score:1, Insightful)

        by SlowGenius ( 231663 )

        "Spooky" would imply that there was some mystery to it. To anyone who was paying attention, it only required enough common sense to know that foxes shouldn't be allowed to guard henhouses. There wasn't any mystery.

        Just like there wasn't much of a mystery about the lack of WMDs in Iraq before the war to anybody who was paying any attention at all to how the Carlyle Group and Halliburton's subsidiary Kellogg Brown & Root were going to make hundreds of billions of dollars of profit from Cheney's unabashed

        • Offtopic, eh?

          And there I was, thinking the topic had something to do with how the ease of finding information on the web affects how we analyze history.

          Me suspects the moderator merely has a different take on politics than I do. So much for freedom of speech.

    • Re: (Score:3, Interesting)

      by houstonbofh ( 602064 )

      From the article (Nov 1999): said Senator Charles E. Schumer, Democrat of New York. ''There are many reasons for this bill, but first and foremost is to ensure that U.S. financial firms remain competitive.''

      But I thought the Republicans were to blame for this economy... That is what the media said. You mean we can't trust our politicians and the media?

      • Shoulda watched Fox News then. ^^

      • Re: (Score:3, Interesting)

        by nine-times ( 778537 )

        Who held the majority in Congress in 1999?

        There's plenty of blame to go around.

      • Re:Huh (Score:5, Informative)

        by Trepidity ( 597 ) <delirium-slashdot@@@hackish...org> on Saturday May 02, 2009 @11:55AM (#27798897)

        There's bipartisan blame for that bill, but it was primarily pushed by Republicans.

        The act itself was named Gramm-Leach-Bliley after its three Republican drafters and promoters. The first version of the act passed both Houses with mainly Republican support, especially in the Senate. In the House, it passed 343-86, with a 205-16 tally for the Republicans, and 138-70 for the Democrats (counting Sanders as a D for the moment). In the Senate, it passed 54-44, with a 53-0 tally for the Republicans, and a 1-44 tally for the Democrats. Schumer actually voted against that version of the bill (Fritz Hollings was the lone Democrat in favor).

        After reconciliation between the House and Senate versions failed, a new version was drafted that gave some concessions to Democrats, mainly in the form of strengthened anti-redlining provisions and strengthened medical and financial privacy regulations. The sweetened bill passed by large margins, though still with the Democrats (now reduced to only a smaller core) being the primary opposition. In the House, 57 still voted no, including 52 Democrats and only 5 Republicans. In the Senate, there were 8 nays, comprised of 7 Democrats and 1 Republican. Clinton (a Democrat) signed the bill.

      • Re:Huh (Score:5, Insightful)

        by Vellmont ( 569020 ) on Saturday May 02, 2009 @11:58AM (#27798913) Homepage


        But I thought the Republicans were to blame for this economy

        The repeal of Glass-Steagall was one of many pieces of de-regulation that led to this mess. The loudest voices I hear championing the call for less regulation, smaller government, etc. come from the Republican party. If you read the article the vast majority of the opposition came from the Democrats (with only 1 Republican Senator voting no). It was pretty weak opposition to be sure.

        So sure, Democrats can share a lot of the blame here. But don't ignore the fact that Republicans are largely the ones pushing for de-regulation (many still want even MORE).

        Anyway, I think the thing to take home from all this is not one party over another, but rather one set of ideas as being wrong. I always hear the main argument against regulation being "unintended consequences", like it's some kind of magical argument to wave over everything. What people seem to forget is that ANYTHING can have "unintended consequences", including doing nothing.

    • From the article (Nov 1999):

      The decision to repeal the Glass-Steagall Act of 1933 provoked dire warnings from a handful of dissenters that the deregulation of Wall Street would someday wreak havoc on the nation's financial system.

      Yep and no one foresaw this financial crisis, indeed.

      Which neatly illustrates the problem with this (fast, broad, and lacking in scholarship) type of historical 'research'... It makes it trivial to find someone whose opinion supports your position, even if the prediction at the tim

      • Re: (Score:3, Funny)

        by jmorris42 ( 1458 ) *

        > It makes it trivial to find someone whose opinion supports your position...

        Exactly. Anybody with a functioning brain could figure this one out but like so many myths of the left it goes unchallenged. And when challenged the challenger is either thrown down the memory hole (if small, as in watch this post go -1) or shouted down violently in the hope they learn to never question authority[1] again. Or just shouted down to drown them out. And even if the rebuttal is total and absolute the politically

  • by wjh31 ( 1372867 ) on Saturday May 02, 2009 @10:41AM (#27798421) Homepage
    Yes, there's a lot of information on the internet, and it's easy and fast to find it. But it's also easy and fast to find a great deal of crap on the internet that isn't actually of any use to you. Filtering the wheat from the chaff can often take as much or more effort as finding the information in the first place. How many times have you had to re-word your search phrase, try several search results, and use ctrl-f to actually select the useful information from all the extra crap?
    • so it takes 5 mins rather than 5 seconds - that's still better than spending days or weeks scouring filing cabinets and book shelves for fewer useful results
      • by wjh31 ( 1372867 )
        There is no denying that that is true. However, an algorithm that helped you could take it from those 5 mins to (nearly) 5 seconds. The human mind is pretty damn good at filtering out information we don't need; think about how much sensory stimulation we receive all the time, and how we manage to completely ignore most of it, and just focus on the sound/sight/etc that we are actually interested in. If we were able to figure out the algorithm in our mind, or anything close to it, I'm sure it would provid
    • Sturgeon's Law applies -- 90% of the 'information' on the Internet is crap.
    • Filtering the wheat from the chaff can often take as much or more effort as finding the information in the first place.

      Assuming of course that you are skilled in identifying that strain of wheat in the first place. If it's a topic you are familiar with, this is fairly straightforward. If it's a topic you aren't familiar with... it's much, much more difficult.

    • Re: (Score:3, Insightful)

      by argent ( 18001 )

      How many times have you had to re-word your search phrase, try several search results, and use ctrl-f to actually select the useful information from all the extra crap.

      Yes, that's way more trouble than driving down to a university library and spending the afternoon grovelling through microfiche to get a comparable amount of information.

      • by ResidntGeek ( 772730 ) on Saturday May 02, 2009 @02:48PM (#27799935) Journal
        Man, have you ever _been_ to a university library? The sheer amount of information available in those places can have a comparable effect to the Total Perspective Vortex, if you stop and think about it too much. It's also the most beautiful thing on earth.

        So, you should go sometime. Wander around a good university library (I recommend Perkins Library at Duke, if you're anywhere near there), just marveling at the sheer amount of information available - open a few books and skim them, go to the official documents section and look at random UN subcommittee reports from 1978, check out the journal archives and read organic chemistry papers from 1932... then go home and try to still feel powerful and informed while you wrestle with Google and the Wayback Machine trying to get a newspaper article from 2007 that isn't on the website anymore.
        • by argent ( 18001 )

          Yes, I've been to college, for real, dude. And not just for the coed babes and frat parties.

          Yes, there's way more information off the web than on it, my point was about the quality of the search tools. I'm totally in agreement about the ephemeral nature of the net. We need an "interweb of congress" with the fundage to archive everything and the will not to remove anything, or we're going to end up in the future of Stross's "Glasshouse" where there's virtually nothing known from the '90s to the 2100s because

  • I anticipate that treating the web as an archive will ultimately lead to a frustrating dungeon of page-not-found errors, expired domain names and pay-to-access newspaper web sites.

    • I anticipate that treating the web as an archive will ultimately lead to a frustrating dungeon of page-not-found errors, expired domain names and pay-to-access newspaper web sites.

      Ultimately?

  • by line-bundle ( 235965 ) on Saturday May 02, 2009 @11:01AM (#27798537) Homepage Journal

    The web fails in so many ways.

    1. It's too easy to rewrite history. Because articles are (generally) on one website they can be changed. This is unlike a newspaper archive where it would be costly to destroy all copies of the paper.

    2. The web is biased. If aliens connect to the internet they would think all the human race ever does is porn and bashing MS (maybe not exactly that, but you get the idea)

    3. The web becomes unreadable faster than paper archives. Protocol changes and what-not.

    4. The web is too easy to control. A private company can censor the web via lawsuits.
    .
    .
    .
    I'm tired

    • 1) That's why some sources aren't trustworthy, and also why the Web Archive project is so important.
      2) Different sites have different bias. There are sites that are very pro-MS too.
      3) Which is why standard transfer protocols and document specifications are so important. HTML has been readable since 1991.
      4) The internet routes around censorship. Ever hear of the Streisand effect?
    • Because articles are (generally) on one website they can be changed. This is unlike a newspaper archive where it would be costly to destroy all copies of the paper.

      It's costly to destroy all copies of a newspaper. It's almost impossible to destroy all copies of an online file - because making a digital copy costs nothing. Just try getting rid of embarrassing personal information off the internet. It's not easy. Look at all the trouble facebook or myspace users have to go through. Once you put something on the internet, it's very hard to take it off. Even if the government or the media were to rewrite history, the real news would be shared on peer to peer networks, r

      • Re: (Score:3, Insightful)

        by grumbel ( 592662 )

        The problem is that this only works if the information is valuable enough for people to actually copy it. With by far the most information on the net, that just never happens. And even if you copied it, you don't have any way to check that its authentic and not a manipulated copy and neither do you have a way to keep your copy online for the general public, as DMCA takedown notices will make sure that the information can only be found in obscure places.

        I think the biggest problem with the Internet is the la

    • Re: (Score:3, Interesting)

      I would double your point #3 as very important. Look what happened to tables in HTML. Once they were central to web design, now they are gradually being deprecated. As despicable as the BLINK tag is/was, it's a classic example. In the future, BLINK won't even work, and then a website won't be understood in all of its "glory". Now BLINK is ugly and stupid, but TABLES are not. When will CSS be deprecated, then what?

      Your point #3 is endemic to all digital data, and it is why I think our culture, unlike many b

      • Deprecation doesn't usually lead to removal of the feature as far as web standards go. Tables in HTML were deprecated because there was a better way- CSS and other attributes intended to be used for page/text alignment, tables were supposed to be for tabular data only. All major browsers still render tables- why remove the rendering when so many pages use it, and it isn't a security issue?

        The blink tag is a poor example at best, as it was never a standard to begin with. Which is why specified standards a
      • Even if you don't have a program that can render html or css, researchers will always be able to read the plain text. In the year 12009, first year computer science students will still be able to write a program that turns html into plain text. As long as the tables and CSS are only used for formatting, it doesn't really matter. Of course, occasionally there are times when tags actually matter:
        Schrodinger's cat is <blink> not </blink> dead.
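A minimal sketch of the kind of program the comment above describes, assuming Python and only its standard-library `html.parser` module. The names `TextExtractor` and `html_to_text` are illustrative, not from the thread. It keeps the text nodes and throws the markup away, which is exactly why the blink joke does not survive the conversion.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes and silently discards every tag."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text between tags.
        self.chunks.append(data)


def html_to_text(html):
    """Return the text content of an HTML string with markup removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join chunks with a space so adjacent table cells stay separate,
    # then collapse the whitespace left behind by the removed tags.
    return " ".join(" ".join(parser.chunks).split())


print(html_to_text("Schrodinger's cat is <blink> not </blink> dead."))
```

Joining chunks with a space is a deliberate simplification: it keeps table cells apart but would also split a word that has an inline tag in the middle of it.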
        • I see your point, but I'd suggest that HTML is only readable as long as HTML is readable. Sure: we might have a "web" in the future, but it won't work the same way. It used to be that betamax was the BEST and everyone used it. VHS (for a variety of good reasons) became the standard even though it was inferior.

          So, we could go from HTML to something else, say Internet Crap Markup Language (ICML) and ICML might suck ass compared to HTML, or it might be superior - either way, HTML is toast, and all those HTML

          • There is an insane amount of data currently in HTML, a standard that's been going strong and has just been continually added to over 18 years.

            A) Why change from HTML?
            B) Why wouldn't we be able to convert to ICML?

            HTML is primarily a text format anyways, so unless we gave up text based formats altogether, HTML would still be readable- some markup might be lost, but the majority of the information is there.
  • by Ralph Spoilsport ( 673134 ) on Saturday May 02, 2009 @11:14AM (#27798641) Journal
    Back when the Bush Junta decided to invade Iraq, the article on Time Magazine's website by George HW Bush as to why deposing Saddam would be a Really Bad Idea disappeared. As far as I know it still isn't there.

    I think Archive.org is a good online archive, but its actual mission is impossible: it would automatically require a doubling of the size of the interweb thingie.

    So, combine that with the Memory Hole problem, and you have a precarious situation: not a good formula for notions of an archive, where consistency, completeness, and reliability are paramount.

    RS

    • by maxume ( 22995 )

      Yes well, as long as most people treat it as a new avenue for research, rather than the final attainment of perfection, we should be able to continue to muddle on.

  • If ALL publications were archived online to allow for searching through the web it would make comparative research so much easier. The trustworthiness of sources will be exactly the same as it is now; it's just that at present you have to physically go to a library archive & scan through paper or microfiche copies of the pages, which is time consuming & also has the potential that a researcher will miss information. Even if the immense amount of information (& the huge storage upgrade needed to h
    • If ALL publications were archived online to allow for searching through the web it would make comparative research so much easier.

      This exists. It's called LexisNexis [wikipedia.org].

      • Perhaps I should have been more specific and said "Published for free" Sadly LexisNexis is also chargeable & you need more than one subscription for everything.
        Paper archives are normally free to search.
        Additionally the news site states it only dates back as far as 1986, many newspapers have even less historical archive than that included, and some do not appear on their list of publications at all. It goes a little way to making available the past hundred years or so of print journalism, but it's re
  • The problem is that the vast majority of "information" on the Web is (a) hopelessly commercial, (b) non-objective chatter, or (c) actual misinformation. Have you tried to use Google lately to solve some specific technical problem or better understand a specific issue? Unless you are very skilled with search query syntax, the majority of hits on the first several pages are likely to be useless, irrelevant, or worse misleading.

    Take for instance this search:

    (specs OR specifications)

    Now, you'd hope th

    • Take for instance this search:

      (specs OR specifications)

      Now, you'd hope that would consistently get you only pages that detail the specifications of that system, right? It doesn't: what it will get you, primarily, is page after page of hits from commercial interests - parts suppliers - who I suspect have figured out that they can subvert this search simply by ensuring that the word "specifications" appears somewhere in all their item pages.

      Searching for solutions to technical problems, OTOH, is li

      • by macraig ( 621737 )

        BTW, that was supposed to read "some computer model (specs OR specifications)", but I goofed it. I've tried that search, and normally I have to refine it by adding site exclusions to the query; adding term exclusions doesn't work, because words like "battery" appear in not only the parts-suppliers hits but also the actual specifications themselves.

        It's a tricky process that the average person, I suspect, never really learns to master. The result is, of course, that the Web owns them rather than the revers

  • Generic web searches mainly suck for finding anything comprehensive in printed/journalistic/academic/legal content. They are okay for 1st year/freshman initial searches and fact finding, but for anything serious, there's a reason why deep databases like LexisNexis [wikipedia.org] and WestLaw [wikipedia.org] exist.

  • It's great when high-quality information is available, but it can be very damaging when old information that is either untrue or taken out of context resurfaces.

    Imagine if your best high school buddy blogged about your getting drunk in your sophomore year. You asked him to delete it and he did. Because you were minors, the police records of the event were also sealed.

    Your buddy's blog got caught up in an archive and when you were 23 and running for local alderman it came up two days before the election. Y

  • Robert Rubin was calling for reform of Glass-Steagall at least as early as 1995. Clinton People wanted the repeal as well and didn't just meander into it.

    Monday, May 1 1995

    "Rubin calls for modernization through reform of Glass-Steagall Act."

    "Robert E. Rubin, secretary of the Treasury, recommended that Congress pass legislation to reform or repeal the Glass-Steagall Act of 1933 to modernize the country's financial system. In testimony before the House Committee on Banking and Financial Services, Rubin said Cl
