Google Businesses The Internet

Google URL Index Hits 1 Trillion

mytrip points out news that Google's index of unique URLs has reached a milestone: one trillion. Google's blog provides some more information, noting, "The first Google index in 1998 already had 26 million pages, and by 2000 the Google index reached the one billion mark. Over the last eight years, we've seen a lot of big numbers about how much content is really out there. To keep up with this volume of information, our systems have come a long way since the first set of web data Google processed to answer queries. Back then, we did everything in batches: one workstation could compute the PageRank graph on 26 million pages in a couple of hours, and that set of pages would be used as Google's index for a fixed period of time. Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day."
  • by Anonymous Coward on Saturday July 26, 2008 @12:19AM (#24345571)

    Trillion can mean 1E+12 or 1E+18 depending on which country you are in.

  • Re:How long till.. (Score:2, Insightful)

    by Anonymous Coward on Saturday July 26, 2008 @12:19AM (#24345575)

    Won't happen since the universe's max integer is significantly smaller than a googol or a 'google'.

  • Re:Amazing (Score:5, Insightful)

    by timmarhy ( 659436 ) on Saturday July 26, 2008 @12:20AM (#24345581)
    i wish they would work on weeding out the crap. anything you google now is infested with cheesy search sites that list other websites and try plaster you with ads. they contribute zero to the web.
  • by Anonymous Coward on Saturday July 26, 2008 @12:56AM (#24345765)

    you are fucking kidding right?

  • by Animats ( 122034 ) on Saturday July 26, 2008 @01:25AM (#24345867) Homepage

    But how many of those trillion pages have unique, useful content? E-mail is over 95% spam, and the web is getting there.

    There were about 153 million registered domains at the beginning of the year. The ones from the spam-friendly registrars [knujon.com] are mostly junk. Tim Berners-Lee said in 2006 that web junk was becoming a major problem, and it's become worse since then.

    If you throw out all the anonymous but commercial domains (we call them "bottom-feeders"), as we do with SiteTruth [sitetruth.com], the Web looks a lot better. Search engines are getting stricter about this. You don't see that many "landing pages" in Google any more. Bad news [fool.com] for companies like Marchex [yahoo.com], the publicly traded web spammer that cranks out all those junk "What you need, when you need it" sites.

    "The mass trials are going well. There will be fewer Russians, but better ones." - Greta Garbo in Ninotchka.

  • Re:Some numbers (Score:3, Insightful)

    by miraboo ( 1164359 ) on Saturday July 26, 2008 @01:35AM (#24345901)

    My hobby:

    Getting the fewest possible google results above 0 with a quoted string.

    "interspecies gangbang": 6
    "hot topic meets disney world": 2
    "died in a blogging accident": 15,300
    "can boys make babies": 4
    "why does it hurt when I read": 1

    My Hobby

    Attributing my sources: http://xkcd.com/369/ [xkcd.com]

  • Re:How long till.. (Score:5, Insightful)

    by rho ( 6063 ) on Saturday July 26, 2008 @01:38AM (#24345917) Journal

    I'm more interested in when Google starts returning relevant results to my queries.

    I can't believe that I'm the only one that finds Google's quality of service somewhat below par. I guess they're better than randomly stabbing in the dark, and there certainly isn't any alternative that's obviously better, but Google sure isn't everything they think they are.

    I know--stop trying to compete with Wikipedia and cut out Experts-Exchange.com from your search results since their pages don't actually return the information you think they do.

  • by blind biker ( 1066130 ) on Saturday July 26, 2008 @03:57AM (#24346391) Journal

    I think google.com's search engine achieved its peak usefulness about 5 years ago. Now, for the most part when I google for a certain electronic component I get some crappy webstore front (and by crappy I mean I can't actually order the component but must "contact by phone" first) or if I search for an electronic device, be it pro or just home electronics, I get those "Read reviews and compare prices"-sites. Which I hate with a passion. WTF google, you have the world's most talented programmers, can't you weed out this crap from your search? At least so it doesn't come up as top hits?

  • Re:Amazing (Score:3, Insightful)

    by Anonymous Coward on Saturday July 26, 2008 @04:53AM (#24346553)

    i wish they would work on weeding out the crap.

    There are a *lot* of people at Google working on that problem. Please understand that it is really difficult to keep up with new attacks when your site is #1, because many people out there are aiming directly for it. No matter how many work on ranking and relevance inside the company, there will always be 10x-100x that number of people outside who are working on the shady side of SEO, spamming, etc. It's a never-ending battle, much like spam email. We're trying.

    anything you google now is infested with cheesy search sites that list other websites and try plaster you with ads. they contribute zero to the web.

    We're working on that both from the search side (ranking) and the ads side (not letting those sites run using Google ads).

    If you want to help, you have many options:

    1. Join Google. If you get in, and say you want to work improving search results or stopping spammy ads sites, you'll have no trouble joining an appropriate group.
    2. If you've got a better approach, start a company. If it is a better approach, sooner or later you'll get noticed, and probably bought by a search company. Good ideas are worth a lot in a big industry.
    3. Read the available research in the area, do your own experiments, and contribute to the pool of knowledge. Could easily lead to #1 or #2.
    4. If a company you recognize engages in shady practices, tell them you'll take your business elsewhere unless they clean up. If you're a blogger, remind them of that fact. If the company you work for starts to do shady things, point out that you don't think it's OK.
  • by Anonymous Coward on Saturday July 26, 2008 @05:22AM (#24346637)

    Not that I advocate or even like Microsoft's Live Search, but you are an idiot.

    Your first "sample" returns the getfirefox.net site as the very first search result. Similarly your second "sample" returns linux.org as the very first search result. Guess what your third "sample" returns as the very first result? If you guessed anything but openoffice.org, you lose.

    What you are confusing with search results are actually ads, just like every other search engine has. I don't even think you really are confusing the ads with the results, this is just your thinly disguised means to MS bash.

  • Re:Amazing (Score:3, Insightful)

    by cheater512 ( 783349 ) <nick@nickstallman.net> on Saturday July 26, 2008 @06:44AM (#24346893) Homepage

    If you design your queries well enough, then you don't see any of the crap.
