Google Outlines the Role of Its Human Evaluators 62
An anonymous reader writes "For many years, Google, on its Explanation of Our Search Results page, claimed that 'a site's ranking in Google's search results is automatically determined by computer algorithms using thousands of factors to calculate a page's relevance to a given query.' Then in May of 2007, that statement changed: 'A site's ranking in Google's search results relies heavily on computer algorithms using thousands of factors to calculate a page's relevance to a given query.' What happened? Google's core search team explain."
Summary, missing from TFS (Score:5, Informative)
Because the summary wasn't kind enough to give you the answer to the question, here it is.
Human evaluators (mostly college students) are trained in the art of validating search engine results. They examine the results of their searches and determine which are the most relevant. For example, searching for the Olympics should yield information about the 2008 Olympics (or whichever is current) rather than the 1996 Olympics. Multiple reviewers frequently work on the same query results, so Google can see how consistently the reviewers rate websites.
The big upshot of this is that it helps weed out websites that are gaming the system to become the #1 Google hit so they can show you ads. So a large part of what the raters are doing is tracking spam websites, not legitimate ones.
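The consistency check described above can be illustrated with a toy inter-rater agreement measure: compare the scores given by raters who judged the same query/result pair. This is only a sketch; the rating scale, data layout, and metric here are assumptions, not Google's actual (undisclosed) pipeline:

```python
from itertools import combinations

def pairwise_agreement(ratings):
    """ratings: dict mapping (query, url) -> {rater_id: score}.
    Returns the fraction of rater pairs that gave the same score
    to the same result -- a crude consistency measure."""
    agree = total = 0
    for scores in ratings.values():
        # Compare every pair of raters who judged this result.
        for a, b in combinations(scores.values(), 2):
            total += 1
            if a == b:
                agree += 1
    return agree / total if total else 0.0

# Hypothetical ratings on a 1-3 relevance scale.
ratings = {
    ("olympics", "beijing2008.example"): {"r1": 3, "r2": 3, "r3": 3},
    ("olympics", "atlanta1996.example"): {"r1": 1, "r2": 1, "r3": 2},
}
print(pairwise_agreement(ratings))  # 4 of 6 pairs agree -> ~0.667
```

A low agreement score for a particular rater, relative to the pool, would be one way to spot reviewers who are rating inconsistently.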
Re:Google is PEOPLE (Score:4, Informative)
In reality, this is why search engines like Wolfram Alpha, which lack Google's broad research and industry knowledge, don't stand much of a chance unless Google drops the ball.
Yeah - but before Google was people, Yahoo was people. Google gets an advantage based on what they're doing. But it doesn't make them invulnerable. Look at the tech industry for the past several decades to see this theme played out again and again.
some personal observations on the program (Score:3, Informative)
- As with anything modern in IT, people sign Non-Disclosure Agreements (NDAs), so not a lot can be said from within the circle without breaking their terms. Having read the interview, I see the chief has pretty much kept it this way too, sticking to terms that are already publicly disclosed.
- Google operates through third-party outsourcers, and pretty much all non-essential communication goes through them rather than Google directly; that's why the guy can't tell you exact numbers about his posse. The big numbers are probably correct, but I'm not sure about now. There was a very big wave of cut-offs and discontinued access for raters about a year ago; a lot of people got the boot and I'm not sure why. My bet is just a sweep of the axe: some were gone for good reason, others quite randomly.
- The raters have a few spaces and forums to discuss their work, open to the public and with minimal chance of an NDA break.
- The raters have mods, too, but I haven't seen activity on that front for a while.
- The specifics of most cases have led me to conclude that for each surveyed example, at least 6 or 7 people work on it and give opinions before a final decision is drawn, so there is your internal balance and weeding out of bad judgement. Let me say it again: you cannot single-handedly change Google's opinion about a particular site and particular search term.
- About natural language processing: this is the scary part. You cannot imagine how good these guys are, especially their algorithms. From time to time they let us sneak a peek, and we had a look at some betas (or alphas) of correct grammar processing and translation MONTHS ahead of their official announcement to the world. You could tell it was machine-made translation, but it was good, scary good. And I'm NOT talking English only, no, no.
- The pay: it gets delayed about six weeks after month's end but is regular, though usually not enough to live on, mainly due to the lack of work. The first year it was good, very good, but in 2008 it started getting less and less, which is a shame, since it's a nice way to browse the net and actually get paid for it!
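The claim above, that 6 or 7 raters weigh in before a final decision, suggests some form of vote aggregation. A minimal sketch of one plausible scheme (majority vote with ties escalated); the labels and escalation rule are assumptions for illustration, not anything Google has described:

```python
from collections import Counter

def final_decision(opinions):
    """Majority vote over individual rater opinions.
    Ties return None, i.e. the case needs escalation to
    more raters. Labels ('relevant', 'spam') are illustrative."""
    counts = Counter(opinions)
    top = counts.most_common(2)
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None  # no clear majority
    return top[0][0]

print(final_decision(["relevant"] * 5 + ["spam"] * 2))  # relevant
print(final_decision(["relevant", "spam"]))             # None
```

With 6-7 independent opinions per example, a single rater's bad (or malicious) judgement gets outvoted, which is the "internal balance" the parent describes.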
Re:Summary, missing from TFS (Score:2, Informative)
The big upshot of this is that it helps weed out websites that are gaming the system to become the #1 Google hit so they can show you ads. So a large part of what the raters are doing is tracking spam websites, not legitimate ones.
Actually, this calls for further explanation, because manual tweaking of results raises bias and legal concerns. As the guy from Google said,
We don't use any of the data we gather in that way. I mean, it is conceivable you could. But the evaluation site ratings that we gather never directly affect the search results that we return. We never go back and say, 'Oh, we learned from a rater that this result isn't as good as that one, so let's put them in a different order.' Doing something like that would skew the whole evaluation by-and-large. So we never touch it.
Mankind's knowledge stands on the shoulders of Google, so they can't just hire, say, a thousand students and use this evaluation as a significant weighting factor. It's rather an evaluation of the algorithms for the sake of further improvement, which is done fully by algorithms.
Search results that don't include search terms (Score:2, Informative)
Re:Fuzzy logic is killing Google (Score:4, Informative)
Plus signs should still be treated as true literals. Quotation marks don't indicate literality -- they indicate that you really, really care about things like word order and so on within the quotes. It used to be true that quotation marks implied a plus on everything inside them, but that wasn't an intentional feature. The advanced search check box was, AFAIK, just equivalent to sticking everything in quotes.
If you're still seeing fuzzification with a plus sign, something may be a bit screwy, and you should file a bug with a specific broken query. (Of course, if you run the query +wombats and see the word "wombat" highlighted in the snippet, that isn't the same thing -- +wombats was treated literally, so this document really truly matched the word "wombats," it might just also have matched the word "wombat" and the snippet highlighter decided that it made sense, for this particular query, to highlight the term. A bug would be if you found a truly irrelevant document coming up.)
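The distinction the parent draws, that a `+`-prefixed term must match literally while a bare term may also match stemmed variants, can be sketched like this. The stemming table is a toy stand-in; Google's actual query processing is not public:

```python
def matches(query_term, doc_words, variants):
    """variants: toy map from a term to its stemmed forms.
    A '+'-prefixed term must appear literally in the document;
    a bare term may also match any listed variant."""
    if query_term.startswith("+"):
        return query_term[1:] in doc_words  # literal only
    return query_term in doc_words or any(
        v in doc_words for v in variants.get(query_term, ())
    )

variants = {"wombats": ["wombat"]}  # hypothetical stemming table
doc = {"the", "wombat", "sleeps"}
print(matches("wombats", doc, variants))   # True  (variant match)
print(matches("+wombats", doc, variants))  # False (literal required)
```

Under this model, a document matching `+wombats` really does contain "wombats"; the snippet highlighter marking "wombat" as well is a separate display decision, exactly as the parent notes.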
Re:Google is PEOPLE (Score:3, Informative)
Of course they're both search engines, but most people (in my experience, and I work in a library) know the difference and have no trouble differentiating the two.
Google = search for websites. Wolfram = search for data.