The Internet News

All of Gopherspace Available For Download (200 comments)

An anonymous reader writes "Cory Doctorow tells us that '[i]n 2007, John Goerzen scraped every gopher site he could find (gopher was a menu-driven text-only precursor to the Web; I got my first online gig programming gopher sites). He saved 780,000 documents, totalling 40GB. Today, most of this is offline, so he's making the entire archive available as a .torrent file; the compressed data is only 15GB. Wanna host the entire history of a medium? Here's your chance!' Get yourself a piece of pre-Internet history (torrent)." Update: 04/30 00:16 GMT by T: As several readers have pointed out below, our anonymous friend probably meant to say "pre-Web," rather than "pre-Internet."

  • Oh gopher from su.se (Score:1, Informative)

    by Anonymous Coward on Thursday April 29, 2010 @06:21PM (#32037888)

    Porn, lots of porn. Also, not understanding why emacs wouldn't run on a mac.

  • by migla ( 1099771 ) on Thursday April 29, 2010 @06:25PM (#32037954)

    "Do you have any facts or figures underpinning your statements ?"

    That would indeed be interesting, but GP makes a reasonable assumption, akin to "There were more horse carriages out and about before the automobile." No?

  • Re:Shame on Slashdot (Score:5, Informative)

    by commodore64_love ( 1445365 ) on Thursday April 29, 2010 @06:25PM (#32037960) Journal

    Beat me to it. The summary should read "Get yourself a piece of pre-world wide web history," since gopher came AFTER the birth of the internet (1981) but before the widespread usage of the web (circa 1993).

  • Re:Wrong (Score:3, Informative)

    by gyrogeerloose ( 849181 ) on Thursday April 29, 2010 @06:28PM (#32037972) Journal

    To a lot of people, WWW=Internet. Us old greybeards who remember when the Internet was telnet, FTP, e-mail and Usenet know better.

  • by Hatta ( 162192 ) on Thursday April 29, 2010 @06:35PM (#32038060) Journal

    That's more the fault of the clients than the protocol. There's no reason you can't serve hypertext documents over gopher, and no reason a gopher client couldn't display graphics.

  • Re:Wrong (Score:3, Informative)

    by uglyduckling ( 103926 ) on Thursday April 29, 2010 @06:44PM (#32038158) Homepage
    IIRC Usenet wasn't a network so much as local repositories which synced. Your local Usenet server would sync up with other peer servers on a schedule, I suppose a bit like a massive distributed email system. Some Usenet sites weren't strictly Internet connected, but many used the Internet as the means to communicate with peer servers.
  • by cwgmpls ( 853876 ) on Thursday April 29, 2010 @06:46PM (#32038176) Journal
    In gopher, everything is either a link or text. There is no way to embed a link into a body of text -- what is now called "hypertext".
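
To make the "link or text" point concrete: per RFC 1436, a Gopher menu is a list of lines, each starting with a one-character item type and followed by tab-separated display text, selector, host, and port. The Python sketch below parses such a menu. `MenuItem`, `parse_menu`, and the `gopher.example.org` sample data are hypothetical names made up for illustration; nothing here comes from the archive itself.

```python
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str  # "0" = text file, "1" = directory (another menu)
    display: str    # text shown to the user
    selector: str   # opaque string sent back to the server to fetch the item
    host: str
    port: int


def parse_menu(raw: str) -> List[MenuItem]:
    """Parse a Gopher menu; every entry is a whole-line link, never inline."""
    items = []
    for line in raw.splitlines():
        if line == ".":          # a lone dot terminates the menu
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:      # skip malformed lines in this sketch
            continue
        display, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], display, selector, host, int(port)))
    return items


# Hypothetical sample menu (example.org is a reserved documentation domain).
SAMPLE = (
    "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
    "1Software archive\t/software\tgopher.example.org\t70\r\n"
    ".\r\n"
)

if __name__ == "__main__":
    for item in parse_menu(SAMPLE):
        print(item)
```

Each menu line is one item, so a link can only ever occupy a whole line; a type "0" document is plain text with no link syntax inside it.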
  • Re:Wrong (Score:2, Informative)

    by elfprince13 ( 1521333 ) on Thursday April 29, 2010 @06:52PM (#32038256) Homepage
    UUCP was the original method used for Usenet transfer, and was distinct from the Internet, but it was hooked up to the Internet at various locations to make contact with servers outside the local UUCP network. This was an era when email (transferred via UUCP) could take longer than snail mail to make it to its intended user (and the addresses were more like a full trip-map than just an address).
  • Re:Wrong (Score:4, Informative)

    by Rene S. Hollan ( 1943 ) on Thursday April 29, 2010 @06:56PM (#32038294)
    Usenet carried posts and articles in newsgroups. Synchronization took place via abstracted mechanisms, most commonly uucp over serial modem links.

    So, yes, Usenet preceded the Internet in the sense that it did not rely on IP, though both generally evolved around the same time.

    But, there was a rather vibrant pre-WWW internet where the protocols of choice were smtp (mail), ftp (file transfer), and gopher and archie for repositories of places to find stuff. News could be carried via nntp (net news transfer protocol).

    What some may not know was that sendmail could work over transiently connected points as well, rather like usenet. Anyone still remember bang path notation? One would address mail using the sequence of hosts required to get it from one's own to the destination, using names understood by each successive host in the sequence. One of the reasons sendmail configuration files were so horrendous was to permit relaying between networks using different host naming conventions.
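
For readers who never saw one, a bang path lists the relay hosts in order, separated by "!", with the recipient's user name last. The sketch below is only an illustration in Python; the function names and the `hostA!hostB!hostC!alice` address are invented, and real mailers handled many more cases (mixed "@" and "!" forms, for example).

```python
from typing import List, Tuple


def parse_bang_path(address: str) -> Tuple[List[str], str]:
    """Split a bang path into its ordered relay hosts and the final user."""
    *hops, user = address.split("!")
    return hops, user


def next_hop(address: str) -> Tuple[str, str]:
    """Return the first relay and the remaining address it would be handed."""
    first, _, rest = address.partition("!")
    return first, rest


if __name__ == "__main__":
    path = "hostA!hostB!hostC!alice"   # hypothetical address
    print(parse_bang_path(path))
    print(next_hop(path))
```

Each relay only needs to know the name of the next host in the list, which is why the sender had to spell out the whole route.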

  • by Anonymous Coward on Thursday April 29, 2010 @07:06PM (#32038382)

    They teach us the difference and why it no longer matters;P

    Tell that to people using non-WWW email clients, pushing SOAP data, sshing into their servers, using Skype, video chat, P2P software, etc.

    While the WWW is becoming ubiquitous, with Google and Bing as major hubs, there's a lot of stuff (including everything going via UDP) happening on the Internet that has little or nothing to do with WWW (or even http[s] for the most part).

  • by Anonymous Coward on Thursday April 29, 2010 @07:07PM (#32038394)

    Gopher can contain binary files as well. If the archive is truly complete, then it contains more than just text.

    I recall finding a ROM site on gopher about 2 years ago, so if this archive is complete you'll get a complete set of Atari 2600 and Coleco ROMs free with your torrent download. (I think it had a few NES too, but it was mostly the pre-NES consoles)

  • by Obfuscant ( 592200 ) on Thursday April 29, 2010 @07:13PM (#32038436)
    Do you have any facts or figures underpinning your statements ?

    Yes.

    In 1997 we had a 100Gb disk array holding the research data from our lab, all of which was available via gopher (and ftp, and the web). We moved to a 200Gb array shortly after, and then a 400Gb after that. And then 3Tb, around 2008.

    Sometime around 2007 or 2008 the SunOS system that ran the gopher server died permanently and was replaced by a virtual linux server without gopher. Even without that server, I found not long ago that I was still creating .cap files -- which were gopher, as I recall, but maybe archie.

    Quantitatively, online currently I have more than 15Gb of data for just 1997, all of which was gophered at the time. In 1998, another 18Gb.

    So, I would say, had the gopher scraping been done in 1997 instead of 2007, the result would have been a lot more data. In fact, a few months earlier in 2007 and it might have BEEN a lot bigger.

  • Re:Wrong (Score:3, Informative)

    by commodore64_love ( 1445365 ) on Thursday April 29, 2010 @07:22PM (#32038524) Journal

    Ahhh the good old days.

    You post a question on rec.arts.tv like, "When does the new season of TNG start?", wait for the midnight syncing between your local BBS and the rest of the nation, and then you come back tomorrow morning to learn the answer. If you're lucky. Sometimes you had to wait 2 days for a reply.

  • Re:Gopher (Score:5, Informative)

    by Anonymous Coward on Thursday April 29, 2010 @07:28PM (#32038562)

    So does this mean we're getting 6 more weeks of winter or not?

    No, just another ten years of November.

    I believe you mean September. [wikipedia.org]

  • Gopher lives! (Score:5, Informative)

    by John Hasler ( 414242 ) on Thursday April 29, 2010 @07:34PM (#32038610) Homepage

    ...gopher was a menu-driven text-only precursor to the Web...

    What do you mean, "was"? Gopher still works fine. There are dozens of servers out there. See quux.org [quux.org] or just install your Linux distribution's gopher package and fire it up.
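
Part of why Gopher still works is that the protocol is tiny: per RFC 1436, a client opens a TCP connection (port 70 by default), sends a selector string followed by CRLF, and reads until the server closes the connection. The Python sketch below assumes nothing beyond that; `build_request` and `gopher_fetch` are hypothetical helper names, and whether any particular server is still reachable is not something this example can promise.

```python
import socket


def build_request(selector: str = "") -> bytes:
    """A Gopher request is just the selector plus CRLF; empty means root menu."""
    return selector.encode("ascii") + b"\r\n"


def gopher_fetch(host: str, selector: str = "", port: int = 70,
                 timeout: float = 10.0) -> bytes:
    """Send one request and return everything the server sends back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(selector))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `gopher_fetch("quux.org")` would request the root menu of the server mentioned above, assuming it is still online and accepting connections on port 70.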

  • Re:Shame on Slashdot (Score:3, Informative)

    by whoop ( 194 ) on Thursday April 29, 2010 @08:37PM (#32039178) Homepage

    I've been around a while, and I can't think of any time a Slashdot editor fact-checked, spell-checked, or proofread a submission. Look at it, they put the entire thing into a quote. That way they can just say they're quoting the submitter and that's what he said.

    They might add the "UserXXX writes," part themselves, but a couple characters of perl could probably do that part just as well.

  • Re:Shame on Slashdot (Score:3, Informative)

    by Grant_Watson ( 312705 ) on Thursday April 29, 2010 @08:42PM (#32039230)

    This definition is probably looser than most, but here's a quick and dirty view:

    The Web is a huge collection of interlinked documents addressable by URLs and served with HTTP. The Internet is the world-wide TCP/IP network over which the Web and many other services operate.

  • by nxtw ( 866177 ) on Thursday April 29, 2010 @08:53PM (#32039312)

    There's no markup for hypertext in HTTP either.

    The original pre-RFC HTTP states that a response is an HTML message [w3.org].

  • Re:Shame on Slashdot (Score:4, Informative)

    by pizza_milkshake ( 580452 ) on Thursday April 29, 2010 @09:30PM (#32039586)
    Here's my explanation in graphic form: http://parseerror.com/images/explain/internet-vs-web.jpg [parseerror.com]
  • Re:Shame on Slashdot (Score:3, Informative)

    by suso ( 153703 ) * on Thursday April 29, 2010 @09:30PM (#32039588) Journal

    Um, the generally accepted start of the Internet is the activity surrounding the launch of ARPANET in the late 1960s. ARPA's name still lives on in reverse DNS entries. Some people say it started in 1967, some say 1969; either way, it was much earlier than 1981, and there are a lot more protocols in what we call "the Internet" than just TCP/IP, although of course not all of it is routed globally. Check your /etc/protocols file sometime; the first line says Internet (IP) protocols.
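
The ARPA remnant in reverse DNS is easy to see for yourself: reverse lookups use names under `in-addr.arpa` (IPv4) and `ip6.arpa` (IPv6). A small Python sketch using the standard `ipaddress` module follows; `reverse_name` is a hypothetical helper name, and the addresses are reserved documentation examples rather than real hosts.

```python
import ipaddress


def reverse_name(address: str) -> str:
    """Return the reverse-DNS (PTR) name for an IPv4 or IPv6 address."""
    return ipaddress.ip_address(address).reverse_pointer


if __name__ == "__main__":
    print(reverse_name("192.0.2.1"))      # 1.2.0.192.in-addr.arpa
    print(reverse_name("2001:db8::1"))    # a long nibble list ending in .ip6.arpa
```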

  • Re:Shame on Slashdot (Score:3, Informative)

    by nacturation ( 646836 ) * <nacturation AT gmail DOT com> on Thursday April 29, 2010 @10:26PM (#32039984) Journal

    History of the Internet from 1957 to present:

    http://vimeo.com/2696386?pg=embed&sec=2696386 [vimeo.com]

    Quite educational, even if you think you know all about it.

  • Re:Shame on Slashdot (Score:3, Informative)

    by nacturation ( 646836 ) * <nacturation AT gmail DOT com> on Thursday April 29, 2010 @10:56PM (#32040144) Journal

    Having just watched it again, it may not fully answer your question. With what you learned from the video in mind, the OSI model [wikipedia.org] describes the layers the video talked about. There are seven layers altogether, with the lowermost layer being the physical hardware everything runs on, followed by the network connecting the hardware, then how data is passed over the network, and so on until you get to the application layer. You've heard of TCP/IP? That's TCP (layer 4) running on top of an IP (layer 3) network. ICMP is a different protocol, carried directly over IP, which is what things like 'ping' (ICMP echo) and 'traceroute' rely on. You've heard of UDP? That's another layer 4 protocol different from TCP.
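
One way to see that TCP, UDP, and ICMP are separate protocols all riding on IP: each has its own number in the IP header's protocol field, and Python's `socket` module exposes those numbers as constants. This is only a sketch; the `PROTOCOLS` table and `describe` helper are names invented for this example.

```python
import socket

# Protocol numbers carried in the IP header to say what the payload is.
PROTOCOLS = {
    "ICMP": socket.IPPROTO_ICMP,
    "TCP": socket.IPPROTO_TCP,
    "UDP": socket.IPPROTO_UDP,
}


def describe() -> list:
    """Return one human-readable line per protocol, sorted by number."""
    ordered = sorted(PROTOCOLS.items(), key=lambda pair: pair[1])
    return [f"{name}: IP protocol number {number}" for name, number in ordered]


if __name__ == "__main__":
    for line in describe():
        print(line)
```

These are the same numbers listed in the /etc/protocols file on Unix-like systems.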

    What runs on the application layer is things you're already familiar with. SMTP (email), telnet, FTP, DNS, NTP (network time protocol), and so on including HTTP. HTTP is effectively the web -- it's what a world wide web browser ("web browser", or now just "browser" for short) uses as its primary protocol and why you see URLs starting with http: . So HTTP or "the web" is an application that runs on top of everything below it. You still need the physical hardware, the network connecting the hardware, the various transmission protocols and so on to deliver the data used by the web. Similarly, SMTP or commonly just "email" is an application that runs on top of everything below it.

    Think of the acronyms if that will help you understand it better. SMTP is Simple Mail Transfer Protocol, a protocol for transferring simple mail. HTTP is HyperText Transfer Protocol, a protocol for transferring hypertext. FTP is File Transfer Protocol, a protocol for transferring files. NNTP is Network News Transfer Protocol, a protocol for transferring network news, what you've likely heard of as simply Usenet or "newsgroups". You get the idea.

    That's the simplistic view of things. In reality, HTTP has been extended to transfer more than just hypertext. Through the use of MIME types (image/gif, image/jpeg, text/html, text/xml, image/binary, and so on) you can transfer arbitrary things that browsers and other applications can understand.
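
As a rough illustration of the MIME type idea, a server typically picks the type to announce from the file's extension. The Python sketch below uses the standard `mimetypes` module; `content_type_for` is a hypothetical helper, and falling back to `application/octet-stream` for unknown files is a common convention assumed here, not something stated above.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess the MIME type a server might send for this file name."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions fall back to a generic "arbitrary bytes" type.
    return guessed or "application/octet-stream"


if __name__ == "__main__":
    for name in ("logo.gif", "photo.jpeg", "index.html", "mystery.zzqx"):
        print(name, "->", content_type_for(name))
```

The browser then uses that declared type, not the file name, to decide how to handle the response.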

    Hopefully that makes a bit more sense.

  • To be somewhat more accurate, it's not "now" called hypertext: it was called hypertext before gopher even existed. Gopher was first released in 1991, while Ted Nelson coined "hypertext" in 1965, and there were dozens of implementations before the WWW (the most popular outside academia was probably Apple's HyperCard, released in 1987).

  • Re:Shame on Slashdot (Score:3, Informative)

    by jeremyp ( 130771 ) on Friday April 30, 2010 @08:00AM (#32042746) Homepage Journal

    Actually, the Internet is the world-wide IP network. TCP is just one of many protocols that are used to transmit information across it.
