In Some Places, Local Search Beating Google 216
babooo404 points out Newsweek coverage of Google focusing on areas in which the search giant may be vulnerable. In some countries outside the US, local competition is handing Google its head. In South Korea a company called Naver dominates. And in Russia, portal site Yandex leads in both search and advertising. In the Cyrillic language market Google is a distant third in search, and Yandex is trouncing Google in the advertising arena by 70% to 2%.
Gotta Love It (Score:4, Insightful)
Re:Gotta Love It (Score:5, Insightful)
In Soviet Russia the currency transfer trounce you (Score:2, Insightful)
As a result, dealing with an external broker for services was too painful to contemplate. This restriction formed a protectionist barrier around any service dealing with relatively small financial transactions, and companies like Google were locked out of the market in favour of the local brokers.
AFAIK they have a freely convertible currency now, which changes the rules of the game back in Google's favour from here on.
Re:Newsflash! (Score:3, Insightful)
Re:Too western? (Score:3, Insightful)
It's not really comparable to Google. They're apples and oranges IMHO.
Re:Gotta Love It (Score:5, Insightful)
Re:OTOH (Score:2, Insightful)
Re:As a Korean (Score:2, Insightful)
Re:Newsflash! (Score:1, Insightful)
Please NO! (Score:3, Insightful)
All this content is off-limits to robots (via robots.txt), which means Google can't even index it. Thus, no matter how good Google's search algorithm is, it will be almost impossible to match Naver's quality.
This could be the beginning of a slippery slope. Suppose Google responded by ignoring robots.txt files in Korea while protecting Orkut, Blogger and its own sites with robots.txt files that it does not obey itself. Up until now there has been an unwritten rule: something protected by robots.txt won't be indexed by any public search engine. The possible side-effect of breaking this rule is that robots.txt files get ignored generally, which can be a real pain for small-scale interactive sites.

Re:Gotta Love It (Score:1, Insightful)
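To make the convention concrete: a well-behaved crawler parses a site's robots.txt and skips anything it disallows. Here's a minimal sketch using Python's standard-library parser; the robots.txt content and URLs are made up for illustration.

```python
# Toy demonstration of the robots.txt convention: a polite crawler
# checks can_fetch() before downloading a page.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking a Q&A section from all robots.
ROBOTS_TXT = """\
User-agent: *
Disallow: /kin/
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# The disallowed section is skipped; everything else is fair game.
print(rp.can_fetch("GoogleBot", "http://example.com/kin/question?id=1"))  # False
print(rp.can_fetch("GoogleBot", "http://example.com/news/today"))         # True
```

The "unwritten rule" is exactly this: robots.txt has no enforcement mechanism, so the whole system only works as long as every crawler voluntarily honors the check.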
Sadly, that may well be true.
Re:Gotta Love It (Score:4, Insightful)
That's not the complicated part (Score:5, Insightful)
Just simple lists of keywords associated with a link won't do. We already had that kind of search engine long before Google, and there's a reason Google handed them their arse.
And then there are the people gaming the system for a quick profit, even if it means ruining a valuable resource for everyone else. There was a near-epidemic of link spam on every possible forum and blog, for example, just to raise the Google rank of a couple of pages.
Most of Google's uphill battle so far has been tweaking the algorithm to defend against such "attacks".
(And now that I mention it, it dawns on me that maybe that's why smaller national engines can do better locally. With everyone trying to game Google and the larger English-reading world generally, it could be that no one bothered polluting the smaller national searches.)
So just being able to swap links around won't do much.
The second and third problems I see with your idea are:
1. timing. When I search for something, I'd rather not depend on the right people being online at that exact time. I also want the answer in half a second. Google does that with in-RAM indexes. I wouldn't bet a fortune on someone doing that equally fast via several hops over the net, P2P style.
2. reliability. P2P traffic has been poisoned repeatedly by interested parties, like, say, the RIAA and MPAA. And it's entirely trivial to do so. So what's to keep other interested parties from poisoning P2P search with falsely tagged links?
Even on Google, it's not entirely rare for someone to buy AdWords keywords on a competitor's trademarks or the like. E.g., if you have a company called, say, "Houndwire", I could buy that keyword for an ad for my company. Now everyone who searches for your company will have my ad served to them. Then I keep my fingers crossed that, if I'm in roughly the same market, some people will just go ahead and buy from me. There have even been laws proposed against that kind of impersonation.
Now for adwords it's one thing, but the same could just as well be applied to poisoning a P2P search. Which could ruin its usefulness pretty fast.
Re:Gotta Love It (Score:5, Insightful)
While it's fun/popular to make fun of the US and English speakers, few other language groups will praise someone for their broken sentences as they make their first attempts. Most people are pretty touchy when their tongue is mispronounced. Perhaps that is fair, but I wouldn't say it's English speakers looking down on others due to their language (perhaps other things, but not language).
And no, most Americans do not have a second language. But why would they? It's not like a small European nation where you can travel or see people from other countries on a regular basis. There are many parts of the US where you will go years without meeting a foreign visitor. You could argue that people should travel to see the world, but when you have a single nation as large and varied as much of Europe, what's the need? You have enough to do just getting to know your own country. Wait a few years and most Americans will at least be bilingual; the schools have really ramped up the amount of Spanish taught.
Re:Gotta Love It (Score:5, Insightful)
Umm, no. Japanese will often compliment you on your attempts to communicate in their language. However, they are just being polite, and actually you really suck at it.
I think this is a general rule for most languages. Paradoxically, people will stop commenting on how 'good' your language skills are only when you are fluent and they don't notice your shortcomings. If someone politely comments that you speak very well in a particular language, most likely you still have some way to go.
Re:In Soviet Russia the currency transfer trounce (Score:5, Insightful)
For example, if I'm searching information about, say, the name of Putin's dog I can use the following search query:
"Imja sobaki Putina" - (the name of Putin's dog) and Yandex can find documents with the words
"Imena sobak Putina" - (the names of Putin's dogs - note the plural) or documents with the words
"Imen sobak Putina" - ([about] the names of Putin's dogs)
"Imena sobakam Putina" - another grammar case.
Russian morphology is MUCH MUCH more complex than English's. Yandex started working on morphological search in 1996, so it's not surprising that it's still much better than Google at it.
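The trick behind matching all those grammatical cases can be sketched as lemmatization: map every inflected form to its dictionary form before indexing and querying. The hand-made lemma table below covers only the words from the example above; a real morphological analyzer (like the one Yandex presumably uses) derives these mappings from the grammar rather than listing them.

```python
# Toy lemmatizer: inflected form -> lemma. Hand-built for this example
# only; transliterations follow the parent post.
LEMMAS = {
    "imja": "imja", "imena": "imja", "imen": "imja",          # name(s)
    "sobaki": "sobaka", "sobak": "sobaka", "sobakam": "sobaka",  # dog(s)
    "putina": "putin",
}

def normalize(text):
    """Lowercase, split, and reduce each word to its lemma."""
    return [LEMMAS.get(w, w) for w in text.lower().split()]

query = normalize("Imja sobaki Putina")    # singular query
doc   = normalize("Imena sobakam Putina")  # plural, different case

# Both reduce to the same lemma sequence, so the query matches the doc.
print(query == doc)  # True
```

Once everything is compared lemma-to-lemma, the four differently-inflected phrases above all collapse to the same key, which is exactly the behavior the parent describes in Yandex.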
Re:In Soviet Russia the currency transfer trounce (Score:4, Insightful)
It is a matter of approach to morphology actually.
IIRC Google's approach to morphology as a whole is to throw brute-force statistical analysis at it. They use statistical models and loads of data for translation. This works wonders with a language like English, which has more exceptions than grammar rules while having fairly rigid sentence ordering and a relatively limited common vocabulary.
Russian is very difficult to handle with this approach. Because it underwent a forced language reform at the turn of the 20th century, Russian grammar can be expressed in fewer than 10 pages of strict rules with around 30-40 exceptions. This grammar used to be drilled in with a vengeance in Russian schools, so it has not changed a bit since it was formulated 100 years ago.
While the rules are strict (and relatively easy), the meaning of many key grammar elements is position-dependent. To add insult to injury, Russian has one of the largest working day-to-day vocabularies, and there are probably more ways to say the same thing than in any other language (I mean proper Russian, not "Na huja zhe tebe eto nado blad'", roughly: "why the fuck would you need that").
So it's no wonder an analytical model is more successful than a statistical one. Thanks for pointing it out.
Re:Gotta Love It (Score:3, Insightful)
Indeed, my main *complaint* about Google is that it likes to let its search results be influenced by the language of the searcher, even when that is explicitly not wanted, and it doesn't seem to be possible to turn that off.
You can "Search the web" (default), "Search pages in German", and "Search pages from Germany", which is fine and dandy. What's less fine is that the results you get if you "search the web" are *VERY* different if you happen to be logged in to Google (say, because you use Gmail) compared to what you get if you ain't. And the results you get are *MUCH* worse.
My guess is they're trying to bias the results so that pages of presumed interest to Germans are ranked higher, which is freaking ANNOYING if you are like me and search for terms that really are not local.
Example: Change your interface language in Google to German, then "search the web" for "ubuntu". The top 4 links are to ubuntuusers.de and de.wikipedia.org/wiki/Ubuntu; the Ubuntu homepage is down at #5.
Now, I'd *expect* that if I had said "search German pages" or "search pages from Germany", but I explicitly did NOT. I wanted the most relevant pages for the word "ubuntu" regardless of language and domain; if I wanted something different, I'd have said so, thankyouverymuch.
It's exquisitely braindead to FORCE the user to prioritize pages from the same country, or in the same language, as the user's chosen interface language, without mentioning it with a single word. The option does NOT say "prefer German pages"; it says "Show the Google user interface in German". The two *aren't* the same and shouldn't be treated as such.
As far as I've been able to discover, it is IMPOSSIBLE to convince Google that yes, I'd like the user interface to be Norwegian (or German), but NO, I do -NOT- want those domains or languages given extra emphasis when I search unless I say so (for which there are options!).
It's bad enough to make Google localisation useless for me. I have it set to English; it's the only way to make it deliver sensible results.
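One partial workaround for the behavior described above is to pin the language parameters in the query URL yourself, so interface language and result language are set independently. `hl` (interface language) and `lr` (result-language restriction) are commonly documented Google search URL parameters, but treat the exact names and their current behavior as an assumption; this sketch just shows how you'd keep the two concerns separate.

```python
# Build a Google search URL with interface language and result-language
# restriction decoupled. Parameter names `hl`/`lr` are assumptions based
# on commonly documented Google query parameters.
from urllib.parse import urlencode

def google_url(query, ui_lang="de", result_lang=None):
    params = {"q": query, "hl": ui_lang}  # hl: interface language only
    if result_lang:
        # Restrict result language ONLY when explicitly requested.
        params["lr"] = "lang_" + result_lang
    return "https://www.google.com/search?" + urlencode(params)

# German interface, results NOT restricted by language:
print(google_url("ubuntu"))
# German interface AND German-only results, when that's actually wanted:
print(google_url("ubuntu", result_lang="de"))
```

This is exactly the separation the parent is asking Google's own preferences page to respect: the interface-language setting should never imply the result-language restriction.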