The Internet Businesses Google

Yahoo Passes Google in Total Items Searched

tonyquan writes "Yahoo announced today that its search engine passed Google's for overall capacity, with 20 billion documents and images indexed versus 11.3 billion for Google. Observers had previously pegged Yahoo's index at just 8 billion items. The growth is due to a recent expansion effort. More info can be found on the Yahoo! Search blog and at CNet."
This discussion has been archived. No new comments can be posted.

  • by Anonymous Coward on Monday August 08, 2005 @09:49PM (#13275388)
    We recently launched a mobile search engine [mwtj.com]. The domain was registered, pages created, etc, so I'm observing it go from zero page rank, to having a page rank and getting crawled. Yahoo's bot definitely crawls more frequently, and Googlebot doesn't seem to crawl any links unless they are linked to from external pages. I assume that as the pagerank increases, Googlebot will get more aggressive, but from what I can see in the logs it's clear that Googlebot takes a "wait and see" approach to crawling.

    That's not a bad thing. There are a lot of useless pages out there, and having twice as many pages in the index certainly does not mean twice as many useful pages.

    I am glad to see the search engine wars are on and competitive.
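
The crawl-frequency comparison described above can be reproduced from a web server's access log. The following is a minimal sketch, assuming the log includes the User-Agent field and that the two crawlers can be recognized by the substrings "Googlebot" and "Slurp"; the function name and the sample log lines are hypothetical and made up for illustration.

```python
from collections import Counter

# Substrings used to recognize each crawler in a log line (assumed markers,
# matched case-insensitively against the User-Agent field).
CRAWLERS = {
    "Googlebot": "googlebot",
    "Yahoo! Slurp": "slurp",
}


def count_crawler_hits(log_lines):
    """Return a Counter mapping crawler name to number of requests seen."""
    hits = Counter()
    for line in log_lines:
        lowered = line.lower()
        for name, marker in CRAWLERS.items():
            if marker in lowered:
                hits[name] += 1
                break  # a line belongs to at most one crawler
    return hits


# Made-up sample lines, not taken from any real log.
sample_log = [
    '192.0.2.1 - - [08/Aug/2005:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '192.0.2.2 - - [08/Aug/2005:10:05:00 +0000] "GET /blog HTTP/1.1" 200 734 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '192.0.2.3 - - [08/Aug/2005:11:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.4 - - [08/Aug/2005:11:30:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

hits = count_crawler_hits(sample_log)
for name in CRAWLERS:
    print(f"{name}: {hits[name]} requests")
```

Running this over a few weeks of real logs, grouped by day, would show whether one crawler really visits more often than the other.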

  • by Anonymous Coward on Monday August 08, 2005 @09:49PM (#13275391)
    Reading these comments here all I can say is you guys are so brainwashed by the Google hype machine.

    First, Google is NOT an innovator. Why not? Everything they do is a slight improvement on existing services:

    - Search: Sure, it's the best search around, but it is simply an improvement over existing search services. And by now Yahoo's search is comparable. Soon there will be many equivalent search engines.

    - Maps: Looks pretty, but it's just an incremental improvement over existing services. Trivial for Yahoo or anyone else to catch up.

    - GMail: Nothing to see here except very good marketing. Who ever uses 1 GB of email? Nobody.

    A lot of Google's services actually suck if you think about it. Froogle? Google Images? Those are a joke. And thanks for breaking Google Groups to make it unusable.

    If you think Google is the greatest thing since sliced bread, take a deep breath and realize that it's just a company that is very good at marketing, and making lots of money.

    Google is an advertising company, they are not a technology company. They are not true innovators like, say, Apple or Oracle. Just look at the reasons I outlined above to understand why. A true innovator ushers in a new age. Like Apple with the iPod and digital music. Or Oracle with database systems. Google hasn't ushered in a new age of anything.

    Stop the hype.
  • Re:Great... (Score:4, Informative)

    by donutello ( 88309 ) on Monday August 08, 2005 @09:54PM (#13275417) Homepage
    Google keeps and holds its users because searches *work*.

    You must not have used Google recently. It's been about 2 years since Google stopped returning useful results. Now, most of the results are crap. Unfortunately, there isn't a better search engine out there.
  • Hey Yahoo (Score:3, Informative)

    by Spackler ( 223562 ) on Monday August 08, 2005 @09:55PM (#13275422) Journal
    It's not the size of the boat...
    it's the motion of the ocean.


  • by HeroreV ( 869368 ) on Monday August 08, 2005 @10:02PM (#13275463) Homepage
    I agree about Froogle. Usually over 90% of all items can't be ordered by price even though the engine was clearly able to determine what the price was. How is it being froogle if you can't easily figure out which is the cheapest?
  • by Eric Giguere ( 42863 ) on Monday August 08, 2005 @10:02PM (#13275465) Homepage Journal
    The Yahoo! crawler (Slurp) is definitely more aggressive than the Googlebot. It comes knocking on my door several times a day, especially the blog pages. Google is more conservative and keeps things in a sandbox, too.
  • Re:fantastic (Score:4, Informative)

    by fembots ( 753724 ) on Monday August 08, 2005 @10:23PM (#13275567) Homepage
    While 9 billion additional pages are pretty useless to an individual, they can, however, mean each topic will have an additional 30 pages, or a search on Ferrari images gives another 25 pictures.
  • Re:fantastic (Score:5, Informative)

    by b0r1s ( 170449 ) on Monday August 08, 2005 @10:27PM (#13275585) Homepage
    Google's index should be growing faster in the coming months. With more and more webmasters implementing Google's sitemap helpers, a lot of unlinked/dynamic pages should start showing up very, very soon.
  • by rpdillon ( 715137 ) * on Monday August 08, 2005 @10:54PM (#13275724) Homepage
    Try MyWay. They're ad-free too. I've been using them for years. They don't have some of the newer stuff, but they have a notepad, calendar, address book, news, email, search, games, etc.
  • by Alomex ( 148003 ) on Monday August 08, 2005 @10:56PM (#13275737) Homepage
    Indeed. It is interesting to note that the new MSN engine, crawling like mad, had a hard time matching Google's count, much less surpassing it. Out of the blue comes laggard Yahoo with a much larger count. Pardon me, but I'm somewhat skeptical.

    For instance, using the old technique of searching for typos, I just tried Yahoo and Google. Yahoo reports five matches versus Google's five. However, of the five Yahoo matches, three are spurious!

    Some other searches with their actual count:

    Yahoo 1, Google 1.
    Yahoo 0, Google 1.
    Yahoo 1, Google 5.
    Yahoo 26, Google 36.

    This reminds me of an old Altavista crawl, where they discovered that nearly 10% of their pages were non-standard 401 pages.

  • Re:Great... (Score:5, Informative)

    by mph ( 7675 ) <mph@freebsd.org> on Monday August 08, 2005 @11:43PM (#13275945)
    Adding "review" usually results in storefronts that say "Be the first to review this product!".
  • by Sancho ( 17056 ) on Tuesday August 09, 2005 @12:54AM (#13276175) Homepage
    Multiple search engines are probably the way to go, honestly, but here's some counter-anecdotal evidence.

    Search for:
    super mario world hacks

    on each of Yahoo and Google, and check the first hit. Google takes it hands down, with an entire page devoted to SMW hacks, vs. Yahoo's page on SNES hacks.

    I routinely try other search engines, and while another one occasionally trumps Google, the big G tends to come out on top overall.
  • by iceanfire ( 900753 ) on Tuesday August 09, 2005 @02:24AM (#13276502)
    "Especially since they have the minimalist interface which doesn't suck." Last time I checked, Google used that interface... in fact, Yahoo copied it from them.
  • by vicaya ( 838430 ) on Tuesday August 09, 2005 @04:22AM (#13276810)

    For popular search terms (queries with millions of hits) index size doesn't matter much. Yahoo, google, ask, msn etc all produce pretty similar results (that tend to favor established sites/pages.) For rare terms or combinations, which contribute to the Long Tail [wikipedia.org] of web search, index size is very important. Both Yahoo and Google report estimated (often inflated) hits for popular terms and exact numbers for rare terms, which still include dups. You need to go to the last result page to find out the exact non-dup number, which sometimes can shrink the de-dup'ed hits by a factor of 10. Let's see how the new yahoo fares against google with a few queries I picked randomly:

    • "Acid Brass" stockport - yahoo:20 google:24
    • "anetan district" - yahoo:17 google:15
    • "chunder blunder" - yahoo:25 google:27
    • "information theoretical death" - yahoo:45 google:46
    • kliningan juru - yahoo:27 google:47
    • "phylogenetic organisms" - yahoo:5 google:10
    • zibelthiurdos thrace - yahoo:9 google:4

    Yahoo used to consistently underperform google on rare terms; it seems they have indeed caught up. But it has NOT really exceeded google in terms of useful size (Yahoo has more dups.) Still, it's a worthy engineering effort. Congrats!
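
The seven rare-term counts listed above can be tallied to see how close the two engines are. This is a small sketch that only re-adds the numbers from the comment; the `tally` helper and its field names are made up for illustration.

```python
# De-duplicated hit counts copied from the list above: (query, yahoo, google).
results = [
    ('"Acid Brass" stockport', 20, 24),
    ('"anetan district"', 17, 15),
    ('"chunder blunder"', 25, 27),
    ('"information theoretical death"', 45, 46),
    ('kliningan juru', 27, 47),
    ('"phylogenetic organisms"', 5, 10),
    ('zibelthiurdos thrace', 9, 4),
]


def tally(rows):
    """Summarize per-engine wins and total hits over all queries."""
    return {
        "yahoo_wins": sum(1 for _, y, g in rows if y > g),
        "google_wins": sum(1 for _, y, g in rows if g > y),
        "yahoo_total": sum(y for _, y, _ in rows),
        "google_total": sum(g for _, _, g in rows),
    }


summary = tally(results)
print(summary)
```

On this sample Google returns more de-duplicated hits for five of the seven queries and Yahoo for two, which matches the conclusion that Yahoo has caught up without clearly exceeding Google. Seven queries is a very small sample, so the tally is suggestive at best.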

  • by danila ( 69889 ) on Tuesday August 09, 2005 @05:08AM (#13276925) Homepage
    This blog post [raelity.org] (and especially the comments) discusses pinging yahoo.com, the switch to pinging google, and what else other people ping.

    Incidentally, this is the 2nd result when searching for "ping yahoo" on Yahoo! and only the 9th result when searching on Google (the first 8 are much less relevant).

    This is a typical example [webmasterworld.com] of a real-life "ping yahoo.com to check if you're online" suggestion.

    P.S. And personally I do ping yahoo.com. They are the Internet and compared to them Google is insignificant. :)
  • Re:fantastic (Score:1, Informative)

    by Anonymous Coward on Tuesday August 09, 2005 @06:50AM (#13277142)
    Here are some Google-fu tips:

    If you want to get rid of those pages with all words known to mankind on them, use "-".
    Example:
    Google: +Hawking => 2,450,000 results.
    Google: +Hawking -sex -mp3 -money => 1,210,000 results.

    If you want to get (partially) rid of those dynamically generated pages use double keywords and/or quotation marks.
    Examples:
    Google: +Hawking => 2,450,000 results
    Google: +Hawking +Hawking => 2,190,000 results

    Google: +Stephen +Hawking => 893,000 results
    Google: +"Stephen Hawking" => 774,000 results.

    Combine everything to nicely start to cut away some noise.
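
The tips above can be combined mechanically. The sketch below builds a query string in the same "+word", "+"phrase"" and "-word" notation used in the examples and computes how much each refinement shrank the reported result count. The helper names `build_query` and `reduction` are hypothetical, and the result counts are the ones quoted in the comment, not fresh measurements.

```python
def build_query(required=(), phrases=(), excluded=()):
    """Join required words, quoted phrases and excluded words into one query."""
    parts = [f"+{word}" for word in required]
    parts += [f'+"{phrase}"' for phrase in phrases]
    parts += [f"-{word}" for word in excluded]
    return " ".join(parts)


def reduction(before, after):
    """Percentage by which the result count dropped, rounded to one decimal."""
    return round(100 * (before - after) / before, 1)


# Queries from the examples above, rebuilt with the helper.
noisy_filtered = build_query(required=["Hawking"], excluded=["sex", "mp3", "money"])
exact_phrase = build_query(phrases=["Stephen Hawking"])
print(noisy_filtered)
print(exact_phrase)

# Result counts as quoted in the comment.
drop_from_exclusions = reduction(2_450_000, 1_210_000)
drop_from_quoting = reduction(893_000, 774_000)
print(drop_from_exclusions, drop_from_quoting)
```

With the quoted counts, excluding the three noise words removes roughly half of the results, while quoting the full name removes about an eighth.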
  • by Darkman, Walkin Dude ( 707389 ) on Tuesday August 09, 2005 @07:06AM (#13277182) Homepage

    I can only agree here. A couple of interesting points: yahoo will index your website whether or not any site in the world is pointing a link to it, and yahoo actually pays attention to the meta tags at the top. Now while I'll be the first one to observe that meta tags have been abused horribly, in a lot of cases they do in fact represent the content of the site well. It's no more of a risk than any of the other criteria used to index websites, really. The quality of google's search and image search has declined quite a bit in the last few months; the question is whether or not they recognise that.

  • by TeknoHog ( 164938 ) on Tuesday August 09, 2005 @08:29AM (#13277470) Homepage Journal
    AFAIK, Tuva is part of Russia, not Mongolia.

    Besides, it's hardly the middle of nowhere, as it has become famous for its traditional throat singing. One of the people who made it famous was Richard Feynman; I first learned of Tuva as I was searching for stuff on Feynman. It shouldn't be news to any fan of Feynman that he was into obscure music.

    If you're looking for less well known parts of the world, you might have a look at the other 'autonomous republics' within Russia, such as Komi or Mari.

  • I am not surprised. (Score:4, Informative)

    by mrjb ( 547783 ) on Tuesday August 09, 2005 @10:09AM (#13278139)
    Google refuses to index pages that aren't linked to by at least a gazillion other sites, submitted or not.
    My site [rollingears.com], for example, has been up and running for nearly two months, submitted a few times and actually linked to by a few pages that are indexed by Google, but it still doesn't appear *at all* in Google's index, not even far down at the bottom.

    Even if you enter site:www.....com in the search bar directly, it just says it doesn't know it. At least Yahoo has got it in there, never mind high ranked or not.
