The Internet Archive Has Saved Over 10,000,000,000,000,000 Bytes of the Web
An anonymous reader writes "Last night, the Internet Archive threw a party; hundreds of Internet Archive supporters, volunteers, and staff celebrated that the site had passed the 10,000,000,000,000,000 byte mark for archiving the Internet. As the non-profit digital library, known for its Wayback Machine service, points out, the organization has thus now saved 10 petabytes of cultural material."
The announcement coincided with the release of an 80-terabyte dataset for researchers and, for the first time, the complete literature of a people: the Balinese.
Re:Yes, but... (Score:4, Interesting)
All of which is rather useless... (Score:5, Interesting)
...since the TOS specifically prohibits copying data from the site:
"Our terms of use specify that users of the Wayback Machine are not to copy data from the collection. If there are special circumstances that you think the Archive should consider, please contact info at archive dot org. "
Warrick hasn't been taking new requests for months (and I'm sure it's more of a research tool than an actual service for the public), and the site effectively blocks attempts to back up data using wget. It makes me wonder who (or what) this archive really serves, because it's most certainly not the general public.
Download Link? (Score:5, Interesting)
How nice of them to do the archiving and release such a large dataset.
Where can I download the file?
What the hell (Score:5, Interesting)
Re:Relevance of byte count (Score:5, Interesting)
No, go ahead and mod me down. Every time i post, I look at my user ID and think "GOD FUCKING DAMNIT IF I HAD WAITED LIKE TEN MINUTES I WOULD HAVE HAD A PALINDROME AUAUUUUUUGGGHHH"
i deserve all the downmods i get, accidental or otherwise.
Private archive (Score:4, Interesting)
It's great that archive.org is doing this. This is such an important part of history that I thought I would do a mini-version for the pages I visit, just to be able to refer back to stuff. I've been using the Firefox add-on called Shelve to save every page I visit on my home computer for about 2 months now (at most one version per day). The total is 5.8 GB. It's not useful for browsing, though; I'd love it if it were better integrated with Firefox so that I could choose among all versions of each page. There's sometimes excellent information on university pages or cheap hosting that could be 10 years old, and you never really know how long it's going to stay up.
Anyway, this may give some perspective too; 2 months of daily snapshots of slashdot, other news, some tech stuff and a little Facebook takes just 5.8 GB.
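For anyone who wants to put numbers on that perspective, here is a rough back-of-the-envelope sketch in Python. It assumes decimal units (1 GB = 10**9 bytes, 1 PB = 10**15 bytes), which neither the story nor the post actually specifies, and it assumes the 5.8 GB grows at a steady rate. The function names are made up for illustration.

```python
# Rough size comparison. Decimal (SI) units are an assumption; the posts
# do not say whether "GB" means 10**9 or 2**30 bytes.
GIGABYTE = 10**9
PETABYTE = 10**15

ARCHIVE_BYTES = 10_000_000_000_000_000  # byte count quoted in the story
PRIVATE_BYTES = 5.8 * GIGABYTE          # two months of daily snapshots
MONTHS_OBSERVED = 2


def archive_petabytes() -> float:
    """Size of the Internet Archive's collection in decimal petabytes."""
    return ARCHIVE_BYTES / PETABYTE


def private_gb_per_year() -> float:
    """Yearly size of the private archive, assuming steady growth."""
    return PRIVATE_BYTES / MONTHS_OBSERVED * 12 / GIGABYTE


def size_ratio() -> float:
    """How many times larger the Internet Archive is than the private one."""
    return ARCHIVE_BYTES / PRIVATE_BYTES


if __name__ == "__main__":
    print(f"Internet Archive: {archive_petabytes():.0f} PB")
    print(f"Private archive: about {private_gb_per_year():.1f} GB per year")
    print(f"Ratio: roughly {size_ratio():,.0f} to 1")
```

Under those assumptions the byte count works out to 10 PB, the private archive would grow by roughly 35 GB a year, and the Internet Archive is on the order of 1.7 million times larger.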