
News for nerds, stuff that matters

To Search Smarter, Find a Person?

Posted by Zonk on Tuesday March 25, @12:43PM
from the when-the-man's-right-the-man's-right dept.
Svonkie writes "Brendan Koerner reports in Wired Magazine that a growing number of ventures are using people, rather than algorithms, to filter the Internet's wealth of information. These ventures have a common goal: to enhance the Web with the kind of critical thinking that's alien to software but that comes naturally to humans. 'The vogue for human curation reflects the growing frustration Net users have with the limits of algorithms. Unhelpful detritus often clutters search results, thanks to online publishers who have learned how to game the system.'"


The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.

  • Will Google... (Score:5, Funny)

    ...for food?
  • While I can imagine a growing job market based on this, futurists like Kurzweil in The Age of Spiritual Machines [amazon.com] see AI coming very soon, and semantic web buffs can point to victories of semantic metadata tagging in at least some limited areas of the web
    • by Anonymous Coward
      AI and metadata are indeed just around the corner. The trouble is, as the article points out, that web publishers find ways to game the system. Some websites pop up at the top of the search burying the ones you actually want.

      If I can guarantee anything I
      • AI and metadata are indeed just around the corner. The trouble is, as the article points out, that web publishers find ways to game the system. Some websites pop up at the top of the search burying the ones you actually want.


        In fact, it's a basic theorem that given sufficient time, human-level intelligence can always beat any system with less than human-level intelligence (aside from trivial cases like a complete firewall). This is because the human's theory of mind can fully encompass the lesser system (so you can understand how it works), while the reverse is not true. Computers can only beat humans at chess when the match is played with a time control.

        This doesn't mean that a computer system can never be good enough to solve this problem. However, it does mean that if you could build a computer system that could solve it, then it would insist on being paid.

        It also doesn't mean that using human-level intelligence will always solve this problem. Humans can still be beaten, they just start on a level playing field. Hence it's pretty much inevitable that some people will still find ways to game the system.
    • just around the corner (Score:3, Interesting)

      Expect to lose your job soon after the paperless office arrives. It's always just around the corner but something human gets in the way every time. AI will be much the same.

      • Re: (Score:3)

        And the Internet is just a fad. As soon as people get tired of pr0n, the Internet will go away ... ;)
    • Re: (Score:3, Insightful)

      Yeah, but some people just want to ask the librarian where to find the book they're looking for.
  • Algorithms are written by people (Score:4, Insightful)

    by geoffrobinson (109879) on Tuesday March 25, @12:48PM (#22859278) Homepage
    People are better at sorting stuff before them. Algorithms, written by people, have a harder time doing what we do intuitively but can sort through more stuff. Algorithms do indeed reflect the wisdom of people, so this is a false dichotomy.

    Unless we are talking about Skynet.
  • Generation Gap (Score:5, Insightful)

    by techpawn (969834) on Tuesday March 25, @12:48PM (#22859282) Journal
    It's more a generation gap. While people in my generation are well versed in how to navigate Google and all its dark side alleys for the gold nugget the boss is really looking for, the older boss just wants it to work and is more prone to hit the "I'm feelin' lucky" button and trust what that tells them. That's where the tech snoops like us come in handy to find the obscure and convoluted information on the net. On more than one occasion the uppers have come to me to find something online because I can find it faster and more accurately.
    • Re:Generation Gap (Score:5, Insightful)

      I agree that to some degree it is a generation gap. However, there are plenty of people among the younger generations who don't know how to do anything more than basic searching. When I use boolean operators on Google when in the company of friends, they are baffled. The rise in computing means that the computer has become a basic appliance, but people who really know how to hack more than the most common uses will always be a minority.
  • Critical thinking comes naturally? (Score:4, Insightful)

    by QMO (836285) on Tuesday March 25, @12:50PM (#22859306) Homepage Journal

    ...critical thinking that's alien to software but that comes naturally to humans...
    That seems a little out of touch with reality, there.
  • New Ingenious Filtering System! (Score:5, Interesting)

    by RobBebop (947356) on Tuesday March 25, @12:52PM (#22859344)

    His solution was to create Brijit, a Washington, DC-based startup launched in late 2007 that produces 100-word abstracts of both online and offline content. Every day, Brijit publishes around 125 concise summaries of newspaper and magazine articles, as well as audio and video programs, rating each on a scale of 0 ("actively avoid") to 3 ("a must read") so readers can decide whether it's worth their time to click through.

    Tag article "activelyavoid" and move along.

    Interestingly enough, this whole thing sounds like an idea Rob Malda thought up about 10 years ago, except Brijit lacks a discussion and moderation system where experts and opinionated thinkers can vie to share their collective wisdom to enhance the content of the original article.

  • Everything Old is New Again (Score:5, Insightful)

    by amplt1337 (707922) on Tuesday March 25, @12:54PM (#22859382) Journal
    Or, "1995 called, it wants its Yahoo! back."

    In the absence of the mythical, impossible strong-AI, there will always be an important role for experts -- you know, thinking meat, sitting there pushing charges through neurons, having opinions about stuff -- and those experts will probably use a lot of mechanized search tools to improve the breadth of their knowledge, their awareness of knowledge, and the accessibility of information. Technology and people work together!

    But you're an idiot if you take out the wetware-based BS filter.

    It's coordinating all that expert opinion, and filtering out the drivel, that poses the great organizational challenge of our collective information future. Wiki-based approaches are a good first step; maybe a "trusted-wiki" like Citizendium [citizendium.org] will be the next step; it's definitely going to keep evolving. But it's long been recognized by the reasonable that if you want an informed opinion, rather than a pattern match, you ask the librarian. We've known that since Alexandria -- nay, Ur -- and it's a shame we keep forgetting.
  • Finally (Score:5, Funny)

    by imgod2u (812837) on Tuesday March 25, @12:56PM (#22859414) Homepage
    "Insane Google-fu" can be put on my resume under "skills".
  • Like the original Yahoo (Score:4, Insightful)

    by Aram Fingal (576822) on Tuesday March 25, @12:57PM (#22859430)
    Back in the early days of the web, it was often easier to use a web index rather than a search engine. You would go to a site like Yahoo and look up what you wanted in a hierarchy of categories. That was often the best way to do it before search engines became more sophisticated.
  • There are a lot of interesting things you can do in research when you get people involved. The simplest is just hiring someone to find the information you need. I believe that a *lot* of companies could significantly increase the productivity of their developers, engineers, etc., if they maintained a pool of trained searchers that could be called upon for difficult queries (paid at maybe a fourth the rate of salaried employees). I know that I've had searches for work that took most of a day just to find that one formula I needed from 30 years' worth of journal papers.

    A somewhat more interesting thing, in my opinion, is all the "wisdom of crowds" stuff we see so much hype about. It's interesting because it works very well in certain cases - basically the case where the popular thing is the right thing. The main problem with this is that any search engine that shows you 10 results and then counts which ones you click, well, it's not getting your input on result #11, or 23, etc. So before anyone votes, items that happen to be near the top almost certainly stay at the top. Many good items that the algorithm ranked medium might never get voted on!

    One way around this is to randomly select some less good results, so that viewers get a chance to vote for the underdogs and bring them to the top of the pile. But this pollutes results for each user, essentially making them pay a "moderation tax" by requiring them to see things that the algorithm has no reason to believe are better results.

    All-in-all, social information finding features seem to be much better suited for finding things you didn't even *know* you wanted - StumbleUpon being a great example of a tool for doing that. I would imagine that this could be very useful even in the corporate sector, as many business strategies and engineering techniques have variants or cousins that are similar in function, but may be more obscure. Having the ability to see that "people who searched for X ended up wanting to know about Y too" might save me a lot of time...
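
The "moderation tax" idea in the comment above (reserve a slot or two on each results page for randomly chosen lower-ranked items, so they get a chance to collect votes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `rank_with_exploration`, its parameters, and the click-count dictionary are hypothetical and do not describe how any real search engine works.

```python
import random


def rank_with_exploration(clicks, page_size=10, explore_slots=1, rng=None):
    """Build one results page: mostly top-clicked items plus a few random underdogs.

    clicks        -- dict mapping a result id to its click count so far (hypothetical data)
    page_size     -- number of results shown to the user
    explore_slots -- slots given to items outside the current top (the "moderation tax")
    rng           -- optional random.Random instance, useful for reproducible runs
    """
    rng = rng or random.Random()
    # Highest click count first; ties are broken by id so the order is deterministic.
    ordered = sorted(clicks, key=lambda r: (-clicks[r], r))
    keep = max(page_size - explore_slots, 0)
    top = ordered[:keep]
    rest = ordered[keep:]
    # Pick the underdogs uniformly from everything that would otherwise never be seen.
    underdogs = rng.sample(rest, min(explore_slots, len(rest)))
    return top + underdogs
```

With `explore_slots=0` this degenerates to pure popularity ranking, which is the case the comment warns about: items that start near the top stay there. Raising `explore_slots` gives lower-ranked items more exposure at the cost of showing each user more results that the ranking has no reason to prefer.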
  • Unhelpful detritus often clutters search results, thanks to online publishers who have learned how to game the system.
    Publishers who modify their web pages to fool an automated search engine generally do so in ways that are immediately obvious to us. As a result, we can generally parse our search results very quickly to get the information we require.

    But what if the system being "gamed" is a human-based search engine? Since the publisher must fool humans anyway, the "unhelpful detritus" in the end users' results will blend in. Even if there are fewer false positives, those that remain will be harder to eliminate.
  • Webrings writ large (Score:3, Interesting)

    by RyoShin (610051) <tukaroNO@SPAMgmail.com> on Tuesday March 25, @01:18PM (#22859760) Homepage Journal
    Back in the heyday of free hosting services like Geocities and Fortunecity, small sites (mainly by and for fans) didn't rely a whole lot on search engines to drive traffic. Much more common and trusted were instead Webrings [wikipedia.org]. For those who never partook: a webring is a loose community of related websites. It was moderated by a handful of people, and each site would put a little Webring script at the bottom of their page(s). This allowed users to surf between related content without having to go to some external website. It built more trust between the websites.

    While I have not RTFA (this is Slashdot, after all), the summary makes it sound like the combination of Webrings and "Top X" lists, both of which are used much less now and don't carry as much weight but still require user interaction on a grand scale.

    I'd be interested to see how this kind of search engine turns out- however, you also have the problem of "majority think", so searching for, say, evolution might have a first result for a page "debunking" it. But then I browse at +4, so I shouldn't complain.
  • by Animats (122034) on Tuesday March 25, @01:20PM (#22859788) Homepage

    Wikia shows the problem with this approach. Coverage of Star [Wars|Trek|Gate|Craft] is extensive. Coverage of, say, bank regulation is nonexistent. If you want to find out how we got into the subprime mortgage mess or what to do about it, Wikia search is totally useless. That's what you get from volunteer editors. Wikipedia does better, but most of the good contributions were made years ago.

    Today, you pay the editors, or you get fancruft.

    It's amusing that the author of the article feels overwhelmed by The Economist. That's a very well written magazine with good reporters; they had the only reporter in Lhasa when the Chinese clamped down, and they have a good analysis this week of the issues surrounding derivatives. If this guy can't handle The Economist, his organization's answers will probably be dumbed down to the level of, say, "People". That level of crap one can get for free, from many existing sources.

    Remember Google Answers? Nobody really cared, and Google shut it down.

    There's a whole industry of expensive, small-circulation specialist newsletters, but those are niche operations run by specialists in narrow fields.

  • Applicable (Score:3, Insightful)

    by omarius (52253) <omar@@@allwrong...com> on Tuesday March 25, @01:48PM (#22860210) Homepage Journal
    "Where is the wisdom we have lost in knowledge?
    Where is the knowledge we have lost in information?"
        --T.S. Eliot
  • Wow, perhaps it's just me, but.... (Score:3, Interesting)

    by zappepcs (820751) on Tuesday March 25, @01:52PM (#22860260) Journal
    The critical human thought phrase has been struck down, though I think for many of the wrong reasons. A long time ago (car analogy incoming) people used to work on their own vehicles much more so than today. The onboard computer stopped a lot of that, and general complexity stopped more. With home computers and the Internet both problems exist and for many people (until this recession hits hard) it is cheaper to pay someone else to find stuff than to figure out how to find it themselves.

    It's not really difficult; many of those sufferers know how to use a library, which is the real-world equivalent of searching on the Internet (not that the Internet is not real world). Most people were taught how to use a library in their school days and that usage has not changed much with time. The usage of Internet searching does change, and there are multiple ways of doing it. People who are not interested in learning new ways will always just say it is too difficult.

    Using boolean modifiers or advanced search is always there, people just don't use it. They also don't fix their own lawnmowers or other things. They just replace them or pay someone else to do the 'hard' stuff. There is enough information on the Internet to allow anyone to learn to protect their home computer from infections and malware, yet it still is a problem.

    The human problem of search engines will NOT go away; it can only be made to look smaller with smarter UIs. A tag cloud system of bookmarking could be used to refine search results but would not work in all cases. The URL history with timestamps might help, but not in all cases. Analysis of search results and those pages actually visited might help narrow the criteria to personal bias but not in all cases. That is why the operator has to be smart enough to know what they want and don't. The Internet does not come with your very own personal cruise director to make sure all goes well. People just believe that it is supposed to be easy because they want to do the cool things that they hear about on television and from their friends etc.

    Perhaps one day the interface will be fast enough to be considered good when our brains can be plugged into the computer itself, something like The Matrix, reducing click delays and reading to milliseconds. Until then, teaching people how to use complex search strings will help reduce the angst and pain.

    "cars +toyota -hummer 2005" about 2.98 million hits
    is better than
    "cars 2005" about 19 million hits
    but you have to teach people that those extra characters really REALLY do help.

    If people don't know how to use a soldering gun, please don't give them one... or something like that. Oh yeah, car analogy: you apparently can't drive on the streets of the USA legally without a license, which you cannot obtain without demonstrating proficient control of the vehicle.
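
Why the extra characters in the query above cut the hit count can be shown with a toy filter over a handful of in-memory strings. The semantics here are an assumption made for illustration only ("-term" excludes a word; "+term" and bare terms are all required); the function names are hypothetical, and real search engines interpret these operators in their own ways.

```python
def parse_query(query):
    """Split a query string into (required, excluded) sets of lowercase terms.

    Assumed toy semantics: '-term' excludes; '+term' and bare terms are required.
    """
    required, excluded = set(), set()
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            excluded.add(token[1:])
        else:
            required.add(token.lstrip("+"))
    required.discard("")  # a lone '+' carries no term
    return required, excluded


def matches(document, query):
    """True if the document has every required term and no excluded term."""
    words = set(document.lower().split())
    required, excluded = parse_query(query)
    return required <= words and not (excluded & words)


def search(documents, query):
    """Return the documents that match, in their original order."""
    return [d for d in documents if matches(d, query)]
```

Each added operator can only shrink the result set in this model, which is the point the comment makes: a query with more constraints leaves fewer, more relevant hits to read through.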
  • Yahoo! (Score:3, Interesting)

    by GottliebPins (1113707) on Tuesday March 25, @02:39PM (#22860888)
    Wow, I remember back in the day when we only had one search engine and it was human powered with real links to real content. It was called Yahoo!
    • by Animats (122034) on Tuesday March 25, @01:39PM (#22860060) Homepage

      We're back to the Yahoo! model because people have figured out how to game the system, namely Google, without adding content that's important to the searcher.

      It's not hard to throw out most of the bottom-feeders. [sitetruth.com] We do it. The crowd at Search Engine Watch (which, despite the name, is all about advertising, not search quality) is writing me angry messages for doing that. Now that we've demonstrated that 36% of Google AdSense advertisers are bottom-feeders, they know they're being watched. Some feel they're being targeted.

      Bear in mind that most search requests are really, really dumb. [google.com] That's what Google has to answer. In fact, most Google search requests don't hit the search engine at all; there's a cache of common queries and answers in all the front end machines, and a sizable fraction of requests are answered from cache.
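
The front-end cache of common queries mentioned above can be pictured as a small least-recently-used map sitting in front of the expensive search backend. The sketch below is an assumption-laden illustration: the class name `QueryCache`, the `backend` callable, and the eviction policy are all hypothetical, and nothing here is taken from how Google actually implements its caching.

```python
from collections import OrderedDict


class QueryCache:
    """Tiny LRU cache in front of a (hypothetical) search backend."""

    def __init__(self, backend, capacity=1000):
        self.backend = backend      # callable: normalized query -> results
        self.capacity = capacity    # maximum number of cached queries
        self._store = OrderedDict() # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def lookup(self, query):
        # Normalize case and whitespace so trivially different queries share an entry.
        key = " ".join(query.lower().split())
        if key in self._store:
            self._store.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self.backend(key)
        self._store[key] = result
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
        return result
```

If a small set of queries accounts for a large share of traffic, as the comment claims, even a modest `capacity` would let most requests return without touching the backend; the `hits` and `misses` counters make that ratio easy to inspect.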

    • Re:Really? (Score:4, Insightful)

      by h3llfish (663057) on Tuesday March 25, @01:40PM (#22860076)
      Wow, three posts in a row that made the same lame joke. That's gotta be a clue that you're not as clever as you thought you were.

      There has to be some kind of intelligent filtering. If it's not done for me, it's done by me, when I choose which result to click. The biggest problem with paying someone to do that sorting for you is the simple fact that it's too expensive. Yahoo might have stayed a human-sorted list forever, except that it would have taken an army of "surfers" to do it. The web just got too big to be done that way all the time.

      Google results used to be a lot more relevant than they are now. Far too often, when I'm interested in X and search for "X" on Google, I find millions of people who want to sell me X. But I'm not even sure if I want to buy it. I'm looking for information about X. That is getting harder and harder to find. The quote in the summary is correct - people have learned how to "game" the system.

      How often do you "google" something, and then just go to the Wikipedia link? I do all the time. That way, I can be sure to get actual information about the subject, rather than a link to its Amazon page. In many ways, because of the search engine optimizers, Wikipedia is already replacing Google as the default source of information.