Google Opens Up (Some) Search Algorithms
overmars writes "After years of closely guarding the formula for its search algorithms, Google is opening up a little.
The search engine company has kept its search formula a closely guarded secret for two reasons: to fend off competition and to prevent abuse, said Udi Manber, Google's vice president of engineering, search quality, in a post on the corporate blog. Manber said the blog post is the first part of a renewed effort at the company 'to open up a bit more than we have in the past.'
Manber said the most famous part of Google's ranking algorithm is PageRank, an algorithm developed by Google cofounders Larry Page and Sergey Brin. While PageRank is still in use, it is a 'part of a much larger system,' he said.
'Other parts include language models (the ability to handle phrases, synonyms, diacritics, spelling mistakes, and so on), query models (it's not just the language, it's how people use it today), time models (some queries are best answered with a 30-minutes old page, and some are better answered with a page that stood the test of time), and personalized models (not all people want the same thing),' he said."
What exactly is open? (Score:5, Insightful)
What, exactly, has Google opened up? As far as I can see from TFA, everything is explained on a very general level, with no detail whatsoever. I can't see Google's competition gaining any significant benefit from this.
Re:Dont do it Google! (Score:3, Insightful)
In reality, I'm sure Google's leadership has done some heavy analysis on exactly how much openness benefits them.
Re:Dont do it Google! (Score:5, Insightful)
I took one course in Information Retrieval, and I could come up with most of these things with an evening or two of brainstorming, at least on a general level like this. Ideas like PageRank gave Google the edge in the early days, but now their advantage lies in other areas. They have a stunning amount of capital tied up in hardware, giving them amazing speed and amazing amounts of data. They have code optimized to handle those amounts of data in reasonable time. They have the experience to take simple probability models like the ones described in the article and make them work with those amounts of data.
This is why it's impossible to beat Google at search and other data-based markets. It's not one simple patented idea anymore. If it were just that, Google would've disappeared years ago. The only way to beat the points described above is to have the capital to buy the hardware and the knowledge to match Google. Microsoft can do that, but Google has one other thing that Microsoft doesn't. They understand their developers. They understand that if you give these kinds of scientists/developers an interesting problem, a fantastic dataset and the freedom to attack it in their own way, you barely even have to pay them anymore. The interest will take over and completely fuel the project. They will work overtime and come in on the weekends without being asked.
That will bring an energy to a project and a company that you can never get through any tactic that Microsoft is likely to employ. I admit I don't precisely know what Microsoft is like on the inside, but I simply cannot conceive of them as a company that understands the joy of programming, or the joy of science (which is a huge part of information retrieval). In any case, one blog post with some sketchy details isn't going to tell Microsoft anything they don't know already.