
Archive Team Is Busy Saving Geocities (267 comments)

Posted by kdawson
from the one-might-well-ask-why dept.
jamie found this note from Jason Scott, who organizes the Archive Team. They are busy downloading as much of Geocities as they can before it vanishes from the Net after Yahoo pulled the plug. (Note: that textfiles.com link is a good candidate for Readability.) "..after 48 hours of work, Archive Team has saved over 200,000 Geocities sites. We're now pulling in new sites at the rate of something like 5 a second. Is that fast enough? We'll see, won't we. ... A side-effect of the whole process is I now know way, way, way too much [sic] about Geocities than I ever expected to. We've had to dissect every aspect of how the site functions to understand how to mirror things, from its history through how it does crazy javascript ads. Some of it is stupid and some is hilarious... We think we have most every site from 1999 and before on Geocities that was left. ... It is more important to me to grab the data than to figure out how to serve it later. People who have been talking about copyright and stuff seem to think I'm going to sell it or take credit or some crap. I don't see how the final collection won't end up online, but how is elusive — maybe a torrent of a bunch of zip files, or as a curated collection, or as a bunch of hard drives. However it is, I'll make sure people can get it, somehow."
This discussion has been archived. No new comments can be posted.

  • by ipb (569735) on Monday April 27, 2009 @06:54PM (#27739141) Homepage
    Don't forget to surround it all by a blink tag
    • Re:Don't forget (Score:4, Informative)

      by Sfing_ter (99478) on Monday April 27, 2009 @07:39PM (#27739631) Homepage Journal

      firefox still supports the blink :D

      • Eye candy, or eye cancer? You be the judge.

        I remember reading a sentence of a paragraph once that was trapped in the blink vortex and said fuck this. *copy* *paste into notepad*

      • Re:Don't forget (Score:5, Informative)

        by SnowZero (92219) on Monday April 27, 2009 @09:54PM (#27740801)

        Until I found about:config, browser.blink_allowed.

      • by dargaud (518470)
        I've never understood why the same people who complain about the blink tag praise OSX 'design', including the dock's bouncing icons. Both irritate me to no end and the first thing I ever did with Mac OSX was spend 10 minutes on the web to figure how to turn that concentration killing monstrosity off.
  • by Glass Goldfish (1492293) on Monday April 27, 2009 @06:57PM (#27739181)

    With Google losing half a billion a year, how long until they pull the plug on Youtube? I guess it could turn a profit, but when? My guess is the next downturn will cause shareholder pressure to force their hand.

    • by symbolset (646467) on Monday April 27, 2009 @07:03PM (#27739235) Journal
      They'll be broke in only 40 years.
      • Re: (Score:3, Interesting)

        by Anonymous Coward

        They'll be broke in only 40 years.

        I wonder if you were thinking the same thing I was when you said this.

        There is a part in Citizen Kane where his editor is telling Kane as a publisher 'you're losing hundreds of thousands of dollars a month' or words to that effect and Kane says 'you're right, at that rate I'll have to close the doors in 20 years' or thereabouts.

        I am too lazy to log in or google the exact quote.

      • Re:At that rate... (Score:5, Insightful)

        by PopeRatzo (965947) * on Monday April 27, 2009 @08:16PM (#27739945) Homepage Journal

        They'll be broke in only 40 years.

        Because of course, we know they'll never adapt, they'll never innovate, right?

        I mean, it's only Google. It's not like there's any smart people involved. What have they ever done?

        Sometimes, I tire of intellectual midgets.

        • Re: (Score:2, Informative)

          by symbolset (646467)

          The technology environment is not likely to change more in the next forty years than it has in the last forty.

          :-)

          • by beav007 (746004)
            I think you're wrong. With the ease of information transfer and academic research that the internet allows now, but didn't for large portions of the previous 40 years, there is huge potential for growth in knowledge and inventions, compared to 40 years ago.
          • Technology growth is exponential, it certainly will change more in the next 40 years.

            Google technological singularity.
            • Re:At that rate... (Score:5, Insightful)

              by symbolset (646467) on Monday April 27, 2009 @10:48PM (#27741183) Journal

              You know, 40 years ago businesses with rare exception didn't have computers. There was no Internet. It took a professional typist about 10 minutes to bang out a professional letter. There were no cellular phones - hell, touch-tone wouldn't even be invented for fifteen years.

              I've got more transistors in my house than existed then in all the world. I've got more storage in my desktop computer (3TB) than existed in the world at that time. I can communicate in ways that at that time were absurd speculative fiction, and would have seemed absurdly undesirable. For example, an annoying computer sends an email reminder every night at midnight to my cellular phone and I can't convince its administrator to make it stop. I could turn my cell phone into a streaming web beacon that updates my position on a world-visible map in real time and I don't actually know if it's doing that without my permission. I can stream my live first person perspective to everyone in the world bored enough to watch it. And now it takes a team of 3 most of a day to craft and deliver a professional email.

              You're right. By then we may have lost the ability to communicate in the written form entirely, and lost the option to opt out. That would definitely be "more change".

              • please mod this guy up
              • Re: (Score:3, Insightful)

                by Dun Malg (230075)

                It took a professional typist about 10 minutes to bang out a professional letter.

                Why is this an example of advancement? Technology hasn't changed that. What's changed is that the "typist" can now send it to a recipient halfway around the world instantly, or print 100 copies in minutes. The typist still has to bang out the letter on a keyboard, same as always.

        • Sure they can adapt. But what is to stop them from following the same path that they have a proven track record of following, and keep themselves perpetually behind the eight ball? Just like the tech weenies who just have to buy the latest gadget. So far they seem to be good at overextending themselves. Not all companies have leaders who can rein themselves in. I seem to recall some server company being bought by Oracle recently.
    • Those numbers were on crack just so you know. (The cost to run youtube #s)
    • Re: (Score:2, Insightful)

      by EonBlueTooL (974478)
      Last time someone brought this up, Moore's law was mentioned.

      As storage capacity and throughput expand and become cheaper, Google can start to make a profit.

      I still, however, think that Google is stupid for not doing what Hulu does.
    • by ZorinLynx (31751)

      This is why I use the various tools available out there to locally save ANY YouTube video I particularly like.

      It's a very important rule to follow when you're on the net: If you like it, save it. It won't be there forever.

      • This is especially true with Youtube. Content is removed by the minute for various, sometimes superfluous reasons.

    • by SharpFang (651121) on Tuesday April 28, 2009 @03:18AM (#27742681) Homepage Journal

      You're missing one important point:

      How much would Google be losing to competition if they didn't have Youtube?

      It's a war out there, and Youtube is an outpost - costly to keep, but if you don't keep it, the enemy will gain not only it but a lot of field.

  • I lost the password to my Geocities page 10 years ago. Think you might be able to find it?
  • Yes, future generations must know about the horrors visited upon us by the millions of tubgirl and lolcats clones which populated Geocities. Those who forget history are doomed to repeat it.
    • by Ilgaz (86384) on Monday April 27, 2009 @07:14PM (#27739361) Homepage

      I think some Yahoo suits are thinking exactly as you joked, but here is a message for them: it is history they will be rm -rf'ing, and you will look like a company that can't even afford to host idle web pages for historical purposes, one in such bad shape that it has no future.

      They will be deleting (or even considering deleting) the web pages of people who have passed away and have no chance to reply to their lame mails or "click here" things. They did the very same thing with Yahoo Briefcase: 10 MB of highly compressible data at most, for God's sake.

    • by jlarocco (851450) on Monday April 27, 2009 @07:15PM (#27739383) Homepage

      There was a time, I'd put it somewhere between 1996 and 1998, when Geocities wasn't half bad. Few people were really "up" on the technology, so they'd use Geocities to host real, actual pages that didn't suck. Granted it didn't last very long, and practically overnight everybody was using real hosting options for anything serious. But for a little while, seeing a search engine return a link to Geocities wasn't automatically a bad thing.

      Then again, maybe there just wasn't much to compare to back then. Or maybe it just seemed neat because I was only 14.

      • by QuantumG (50515) * <qg@biodome.org> on Monday April 27, 2009 @07:26PM (#27739509) Homepage Journal

        Or maybe it just seemed neat because I was only 14.

        Thanks for making me feel like an old man.

      • Re: (Score:2, Insightful)

        by linzeal (197905)
        I was 18 and it wasn't half bad as you say. There might be a lot of important information there to archive and we should help them if we can.
        • Re: (Score:3, Interesting)

          by PopeRatzo (965947) *

          There might be a lot of important information there to archive and we should help them if we can.

          Can you give us an example?

          I'm not doubting that there's something culturally crucial that's on a Geocities page somewhere that's never been moved elsewhere, but I'd like an example before I get too exercised.

          • Re: (Score:2, Funny)

            by GillyGuthrie (1515855)
            I found an excellent page describing dives from the top of the castle in Super Mario 64 on geocities once.
      • Re: (Score:3, Interesting)

        by darkstar949 (697933)
        Agreed, in fact there is still some good content up on Geocities that I just recently discovered. A case in point would be a fairly inclusive reference to the Cokin Filter System [geocities.com]. I'm not sure if it is still being updated, but it would be a loss if it is the only site like it on the internet.
    • by Eudial (590661) on Monday April 27, 2009 @07:20PM (#27739445)

      Uh. We already have repeated it. Myspace is basically last couple of years' geocities.

      Now there's the web 2.0 boom which is the geocities of the future. Except, instead of small personal sites with blinking gif animations, you have big sites with horrible AJAX interfaces that completely break page navigation. Yes, this applies to big websites like slashdot and freshmeat as well.

      What the hell? What was wrong with the old slashcode? The difference for the end user is that now you have to click 10 times to do what you could do in one click in the web 1.0 version.

      The lesson to be learned is that you shouldn't fix what isn't broken.

      Now I'll get back to my rocking chair. I've got kids to keep off the lawn.

      • by PopeRatzo (965947) * on Monday April 27, 2009 @08:24PM (#27740017) Homepage Journal

        you shouldn't fix what isn't broken.

        That would eliminate a whole lot of what we call "progress" in technology and culture.

        Sometimes, you don't realize something is "broken" until somebody comes along and "fixes" it.

        Know what? I like people who fix what isn't broken.

        • by Eudial (590661) on Monday April 27, 2009 @08:58PM (#27740347)

          you shouldn't fix what isn't broken.

          That would eliminate a whole lot of what we call "progress" in technology and culture.

          Sometimes, you don't realize something is "broken" until somebody comes along and "fixes" it.

          Know what? I like people who fix what isn't broken.

          Though aimlessly adopting any new technology that comes along isn't progress.

          I'm appending a list of browser features mutilated by web 2.0:

          • The back, reload and forward buttons
          • Navigation with the cursor keys.
          • Bookmarking
          • Searching in pages

          When every webpage has its own conventions for what happens when you press a key, you haven't moved forward, you've moved into chaos. Nowadays, what happens when you press a key or click on an element is an entirely arbitrary matter in the hands of the website designer, and completely different from site to site.

          Navigating webpages used to be difficult enough when all links were immediately available. Now, adding to the pain, you have to search for page elements that are only loaded if you perform some arcane voodoo ritual that the designer decided was how the page elements should work.

          It's not that web 2.0 pages have a new interface that's different from the old, it's that every single web 2.0 page has its own conventions.

          • by ghmh (73679)

            I'm appending a list of browser features mutilated by web 2.0:

            • The back, reload and forward buttons
            • Navigation with the cursor keys.
            • Bookmarking
            • Searching in pages

            Flash mutilated those long before this so called '2.0'

          • I'm appending a list of browser features mutilated by web 2.0:

            * The back, reload and forward buttons
            * Navigation with the cursor keys.
            * Bookmarking
            * Searching in pages

            The back, reload and forward buttons are doable even in web 2.0 by applying hash codes and history stacks to the navigation. It is not easy but doable!

            Navigation with the cursor keys, same here doable!

            Bookmarking, as well doable by adding deep linking via hash codes!

            Searching in pages: pleaaze... that has nothing to do with dhtml based pages!
            You can search within pages as long as you are document centric and dont have a rich client application running!

            The problems I see currently is that all of this stuff i

            • Re: (Score:3, Interesting)

              by Sancho (17056) *

              Searching in pages: pleaaze... that has nothing to do with dhtml based pages!
              You can search within pages as long as you are document centric and dont have a rich client application running!

              I will give an example, most of the stuff mentioned can be done via applying a hash value which represents some kind of application state (hash because it is alterable from the script without causing page refreshes)

              I think you're both coming to the discussion with a different set of assumptions. You're absolutely right that for a web application, many of his gripes don't make sense. Realistically, though, many companies use DHTML for content which is static.

              http://digg.com/ [digg.com] is a perfect example. Disable Javascript and go to the comments on one of their stories. Now turn on Javascript. There's actual content which is inaccessible unless you have Javascript turned on. Slashdot has a similar system, except it grace
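The hash-based approach described in the comments above (application state kept in the URL fragment so that back, forward, and bookmarks keep working) can be sketched roughly as follows. This is a minimal TypeScript sketch, not code from any site mentioned in the thread; `ViewState`, `HashWindow`, `encodeState`, `decodeState`, `bindHashNavigation`, and the `story`/`comment` fields are all hypothetical names chosen for illustration.

```typescript
// Hypothetical application state; the field names are illustrative only.
interface ViewState {
  story: string;
  comment: string;
}

// The small slice of the browser `window` object this sketch relies on.
interface HashWindow {
  location: { hash: string };
  addEventListener(type: "hashchange", listener: () => void): void;
}

// Serialize state into a fragment such as "#story=42&comment=7".
function encodeState(state: ViewState): string {
  const params = new URLSearchParams();
  params.set("story", state.story);
  params.set("comment", state.comment);
  return "#" + params.toString();
}

// Parse a fragment back into state, falling back to empty strings.
function decodeState(hash: string): ViewState {
  const params = new URLSearchParams(hash.replace(/^#/, ""));
  return {
    story: params.get("story") ?? "",
    comment: params.get("comment") ?? "",
  };
}

// Render once from the current fragment, then re-render whenever the
// fragment changes (which includes the back and forward buttons).
function bindHashNavigation(
  win: HashWindow,
  render: (state: ViewState) => void
): void {
  render(decodeState(win.location.hash));
  win.addEventListener("hashchange", () => {
    render(decodeState(win.location.hash));
  });
}
```

In a browser one would call `bindHashNavigation(window, render)` and switch views by assigning `window.location.hash = encodeState(...)`. Each assignment adds a history entry without reloading the page, which is why the back button and bookmarks keep working under this scheme.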

      • by stephanruby (542433) on Monday April 27, 2009 @10:15PM (#27740935)

        Uh. We already have repeated it. Myspace is basically last couple of years' geocities.

        Except for the fact that the girls are younger and sluttier, a definitive improvement.

      • The new Slashdot interface is better than the old, all in all. The preferences popup/overlay is stupid and the moderating interface needs to go back to having a confirm moderation button but the dynamic display of remaining mod points is nice and the inline, dynamic commenting is brilliant. The ajax-driven thread expand/collapse is also good.

      • by AdamHaun (43173)

        Uh. We already have repeated it. Myspace is basically last couple of years' geocities.

        I have a theory that all new internet formats (blogs, social networking pages, etc.) ultimately evolve into attempts to recreate Geocities. Geocities is the archetypal version of what happens when everyone has a web presence.

      • Re: (Score:3, Insightful)

        by Mex (191941)

        I humbly disagree that Myspace is anywhere near as useful as Geocities could be*. Or at least entertaining.

        You could spend hours on interesting geocities sites devoted to a very particular subject. Anyone remember the website "Spatula City"? I think it was hosted on geocities for a time.

        Then you had the websites that were kind of like mini-wikipedias for tv shows, Star Trek, the simpsons, and so on.

        There was the odd personal webpage that was actually interesting (I remember "Tales from a loser" or something

  • by brasselv (1471265) on Monday April 27, 2009 @07:05PM (#27739267)

    Isn't anybody going to move a finger, while a significant part of our collective history disappears forever?

    I really don't think anyone should be allowed to simply pull the plug, no matter what TOS say.

    If I buy the Colosseum and then decide to blow it up "because it's mine", I bet I'd be stopped by someone, rightly so.

    As a historian of year 2075, I'd really want to have access to Geocities if I am researching the '90s.

    It happened at least once before. In the 50's and early 60's, video storage technology was expensive, and most video documentation was not considered to be of any 'historical value'. As a result, most of it was just erased and we have lost forever an incredible source of information on that period.

    Is there a productive way to scream? A petition of some kind? An attorney to be addressed?

    • by QuantumG (50515) * <qg@biodome.org> on Monday April 27, 2009 @07:08PM (#27739293) Homepage Journal

      If you buy a movie theater that shows dirty porn films and has jerk-off booths in the back, people will be demanding you blow it up for years, and when you do, they'll throw a party.

    • Re: (Score:3, Insightful)

      by floodo1 (246910)
      While I wouldn't liken Geocities to the Colosseum, I too believe that these guys should be commended for keeping such an interesting archive. The beauty of the internet is that it's all digital so it's as if (to continue your Colosseum example) someone came in and copied the entire Colosseum before you blew it up.

      That said, everyone that originally had sites on Geocities should have already been responsible for the content they left there. If it was actually important then they should already have moved
      • That said, everyone that originally had sites on Geocities should have already been responsible for the content they left there.

        How about the people that have composed historically significant geocities content but the people themselves are dead? That's the deal with history. The important content can't be maintained by its creators for the long term.

        Seth

      • Re: (Score:3, Interesting)

        by mike2R (721965)
        Maybe not the Colosseum itself, but maybe the contemporary graffiti scrawled on it. See (although these are from Pompeii). [pompeiana.org]

        It's actually quite an apt comparison, and shows how little we have changed as a species :) eg:

        I.4.5 (House of the Citharist; below a drawing of a man with a large nose); 2375: Amplicatus, I know that Icarus is buggering you. Salvius wrote this.

    • Re: (Score:3, Informative)

      by djdavetrouble (442175)

      Isn't anybody going to move a finger, while a significant part of our collective history disappears forever?

      Yes, the Archive guys are lifting their finger 5 times every second and archiving them.
      Don't make me say that RTF thing.

    • Re: (Score:3, Interesting)

      by merreborn (853723)

      Is there a productive way to scream? A petition of some kind? An attorney to be addressed?

      Petitioning Yahoo to continue hosting an antiquated service that is likely bleeding money isn't likely to be productive, obviously.

      But it would be awfully nice of them to .tar everything up and .torrent it. There are thousands of us who'd be more than happy to do our part to keep those bits from disappearing into the ether.

    • As a historian of year 2075, I'd really want to have access to Geocities if I am researching the '90s.

      I'm unclear; are you a historian for the future, or one from the future? Either way, care to share with us whether Myspace finally gets shut down like this too?

    • by Haoie (1277294)

      On the upside, at least Yahoo gave warning.

      Although there's no exact date for closure, is there yet?

    • by jcnnghm (538570)

      Yeah, if you give something to someone, you should have to keep giving it to them forever. How else will we all feel entitled.

    • by Wuhao (471511)

      If the service does indeed belong to history, then let's see history pay its bills.

    • If I buy the Colosseum and then decide to blow it up "because it's mine",

      Funny that you mentioned it, exactly what you described happened with the Greek Acropolis in Athens a few hundred years ago. The Turks used it for weapons storage and it blew up!
      Not that the Greeks back then even bothered; Athens by that time was nothing more than a village with a handful of people!

  • Shame on Yahoo (Score:5, Insightful)

    by Xero (19560) on Monday April 27, 2009 @07:12PM (#27739327)

    This is just ridiculous the amount of work they have to go through to half ass archive geocities. Why can't yahoo just hand over a stack of hard drives to archive.org or someone?

    • by Ilgaz (86384)

      It seems the new management has no clue how the Internet works. It sounds funny as I write it, but it seems like the truth. The large storage companies don't have a clue about sponsoring things. E.g. instead of putting a gigantic SAN ad on a "Windows 7 rocks" story at CNET, hand them some quality storage, right, IBM?

      I better start archiving my Yahoo mail which is up since 1998.

    • by mgblst (80109)

      You can be certain they won't be reusing them. I guess it would involve too many privacy concerns, and too much effort for Yahoo. They are probably a little bitter, since they spent so much money on geocities only a few years ago. Of course, they are also the reason that it lost popularity.

  • by egcagrac0 (1410377) on Monday April 27, 2009 @07:15PM (#27739379)

    I want to make sure that any geocities site I may have been affiliated with back in my formative years is not seen by anyone who might recognize me now.

    Who do I make the check out to, and how many significant places will be required?

    • Re: (Score:3, Interesting)

      by N3Roaster (888781)

      It might already be gone. I, too, once had a page on GeoCities, so I decided to look into it. Searching for it, Google couldn't find it (but it seems Google Books likes to interpret the old long s as an f). Fine tuning my search pulled up one hit: a Usenet post with a link to the page in the .sig. So, I take this, and I go to the wayback machine. Put in the URL, and I get two versions, both from the year 2000 (well after I had stopped updating the site). Clicking the links, both were unavailable. The conten

  • by TheModelEskimo (968202) on Monday April 27, 2009 @07:20PM (#27739443)
    There was an awesome amount of amateur research on Geocities. Some of my favorite reference sites are therefore just about toast (most of them containing first-hand military history).

    And just because someone asked, I saved all ~300 of my Youtube favorites to my HDD last weekend, when I realized how much I rely on them for my own hobby research projects, teaching classes, etc. Most of it was stuff that will never be on DVD. Some of it is stuff that the owners have *already* deleted in the last week, due to perfectionism or whatever.

    I was a Boy Scout, and relying on some free service without thinking of contingencies just doesn't make sense.
    • Re: (Score:3, Insightful)

      by AlHunt (982887)

      >I was a Boy Scout, and relying on some free service without thinking of contingencies just doesn't make sense.

      Sounds kind of like the argument against Web Apps ...

  • Isn't this already taken care of by things like google cache or the internet wayback machine?

    • Google Cache only covers some content, and only until it expires from Google's search results.

      archive.org would probably be up for mirroring it, but it's unclear that they have all of it.

    • Re: (Score:3, Informative)

      by Randle_Revar (229304)

      >internet wayback machine

      who do you think archive.org is?

      And google cache is strictly short term.

  • Ironically enough, I had moved past the article in question to read the article about Jason's bandwidth being overwhelmed by myspace layout providers referencing an image on textfiles.com; I clicked on the next article and... down due to either "maintenance or capacity problems". 8^/

  • The first web page I ever created (never finished though) was on Geocities, what's left is here http://www.geocities.com/brad1138/ [geocities.com] but it is disappearing fast. Had pictures of kids and family etc.... I always wondered how long it would last, over 10 years isn't bad I guess.
    • Makes me think of The Langoliers. The data of yesterday getting sucked up one byte at a time, obliterated. Meanwhile, Our Heroes try to rescue all the content they can find... before the Langoliers get to it.
  • by Jubilex (28229) on Monday April 27, 2009 @08:11PM (#27739899)

    ...to rhyme with 'atrocities' ?

  • by TinBromide (921574) on Monday April 27, 2009 @08:38PM (#27740135)
    I posted earlier about how Geocities was the early web 2.0 in practice, where anybody could post anything and contribute to the community. I'm sure that there is a wealth of information on geocities about obscure topics that *Might* come in handy if you were to let your true inner geek reign supreme. For example, I have BIOS ROMs of early Macs that I found on Geocities sites that couldn't be found anywhere else, and I'm sure that if they were posted nowadays, they would be subject to lawsuits or take-down notices by Apple.

    I think that our generation will leave less of a mark than that which came before it because nobody is writing on paper. Geocities is the closest thing that we have to shoe-boxes full of letters and diaries for the period spanning the late 90's (In the form of websites about star trek and software and pointless articles posted by ambitious young proto-webdesigners). In the future, there will be a similar scramble to preserve facebook and myspace to preserve correspondence for future generations.
  • by British (51765) <british1500@gmail.com> on Monday April 27, 2009 @08:49PM (#27740243) Homepage Journal

    Angelfire was fun to snoop around on, since the image subdirectories were open for the browsing. Sometimes you found images not meant for the public.

  • by jonwil (467024) on Monday April 27, 2009 @09:07PM (#27740433)

    Here is just one example of content on Geocities that has value.
    http://www.geocities.com/SiliconValley/8682/ [geocities.com]
    These old documents are still of value to people modding the old games.

  • Back in 1996-97 I made an extremely amateurish geocities site with some unfinished programming tutorials, the most popular of which was on qbasic. I sort of stopped working on it after a while, lost my password, and couldn't get yahoo to authenticate me years later when I wanted to remove my ridiculous site. The bio page is especially embarrassing, and the programming material that is there is of no use today. Honestly I'm too lazy to expend any more energy in my effort to shut down my site, so naturally I
  • by MikeURL (890801) on Monday April 27, 2009 @10:03PM (#27740855) Journal
    We are way WAY too hyper obsessed with archiving data. How many of those 200,000 web sites are of genuine value? Of that tiny number how many reproduce information that can be found elsewhere? If you are left with more than 5 websites that contain valuable info that can't be found elsewhere I'd be shocked.
    • by kimvette (919543)

      This site has been of enormous value to me and friends who are also soy intolerant and/or allergic to soy:

      http://www.geocities.com/hotsprings/4620/decoder.htm [geocities.com]

      • Re: (Score:3, Funny)

        by gadabyte (1228808)

        with that site gone, how will people ever know that soy beverages, soy cheese, soy flour, soy meal, soy oil, soy sauce, soy protein, and soybeans ALL CONTAIN SOY PRODUCTS?

        i know it's not all of them, but seriously - damn near half of the products on that page have SOY in the name. i can only deduce that geocities hates natural selection.

    • Value to whom and for what? This stuff is of historical importance just as much as the diaries of people in the previous decades are to historians. Why not preserve it?

