Google Software The Internet

Extracting Meaning From Millions of Pages 138

freakshowsam writes "Technology Review has an article on a software engine, developed by researchers at the University of Washington, that pulls together facts by combing through more than 500 million Web pages. TextRunner extracts information from billions of lines of text by analyzing basic relationships between words. 'The significance of TextRunner is that it is scalable because it is unsupervised,' says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. The prototype still has a fairly simple interface and is not meant for public search so much as to demonstrate the automated extraction of information from 500 million Web pages, says Oren Etzioni, a University of Washington computer scientist leading the project." Try the query "Who has Microsoft acquired?"
This discussion has been archived. No new comments can be posted.

  • Zero results (Score:3, Interesting)

    by John Hasler ( 414242 ) on Friday June 12, 2009 @09:12AM (#28306945) Homepage

    I tried half a dozen queries of the sort I often use Google for (example: "What is the velocity of sound in hydraulic fluid?"). No answers.

  • Concise (Score:2, Interesting)

    by moogsynth ( 1264404 ) on Friday June 12, 2009 @09:12AM (#28306951)
    Try "Who paid SCO?" Concise, to the point. Nice.
  • by morgan_greywolf ( 835522 ) on Friday June 12, 2009 @09:13AM (#28306963) Homepage Journal

    Actually, just like any other search, it just shows ALL of the likely results and you are still responsible for determining for yourself which of the statements is true. It says "CIA killed JFK" but the first result it returns is "Lee Harvey Oswald killed JFK". It also seems to pare down the results somewhat, because I know I've seen conspiracies also suggesting that the KGB killed JFK, or that the Mafia killed JFK. I'm guessing that more people think the CIA killed JFK than the KGB or the Mafia.
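
    The ranking guessed at above (more popular claims surface, rare ones get pared away) can be sketched as a simple support count over extracted (subject, relation, object) statements. This is only an illustration under stated assumptions: the sample statements, the `rank_subjects` helper, and the `min_support` cutoff are all hypothetical, and nothing here is claimed to be how TextRunner actually scores or filters results.

```python
from collections import Counter

# Hypothetical, made-up sample of extracted (subject, relation, object)
# statements. Each tuple stands for one page asserting the claim.
ASSERTIONS = [
    ("Lee Harvey Oswald", "killed", "JFK"),
    ("Lee Harvey Oswald", "killed", "JFK"),
    ("Lee Harvey Oswald", "killed", "JFK"),
    ("CIA", "killed", "JFK"),
    ("CIA", "killed", "JFK"),
    ("Mafia", "killed", "JFK"),
]


def rank_subjects(assertions, relation, obj, min_support=2):
    """Rank candidate subjects for '<who> <relation> <obj>?' by how often
    the claim was asserted, dropping claims seen fewer than min_support times.

    The count measures popularity of a statement, not whether it is true.
    """
    counts = Counter(
        subj for subj, rel, o in assertions if rel == relation and o == obj
    )
    return [(subj, n) for subj, n in counts.most_common() if n >= min_support]


ranked = rank_subjects(ASSERTIONS, "killed", "JFK")
for subject, support in ranked:
    print(f"{subject} killed JFK (asserted {support} times)")
```

    With this toy data the single "Mafia" assertion falls below the cutoff and disappears from the list, which is one way a less common claim could go missing from the results. The reader still has to judge which surviving statement is true.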

  • by Anonymous Coward on Friday June 12, 2009 @09:28AM (#28307127)

    Allowing a search engine to visit a site and allowing somebody to pass your web page content around are two completely different things.

  • by somersault ( 912633 ) on Friday June 12, 2009 @09:49AM (#28307375) Homepage Journal

    I suppose the major problem with this is that it cannot tell the difference between truth and lies or urban legends

    Most humans can't either, so how do you expect a search engine to?

    There will be a lot of false positives and negatives that will be hard to identify as such unless it directly works with something like snopes.com , which kind of defeats the purpose because it means someone has had to research every question anyway.

    If a project like this simply scoured the whole 'net, you wouldn't really be able to verify anything beyond people's opinions or beliefs, which may or may not be 'true'.

    I think something like this would work really well for factual results if it were only allowed to draw conclusions from verified sources, say something like Wikipedia articles that have been verified by experts in the appropriate field (I've not been following this type of thing recently, but perhaps that is what Wolfram Alpha does already). It could perhaps be useful to have it search the general internet for supplementary results for some questions though, especially those of a philosophical nature where it may be impossible to establish definite answers ("is there a god" and the like).

  • by Anonymous Coward on Friday June 12, 2009 @09:53AM (#28307429)

    I would go with...

    • ...meters per second and kilometers per hour
    • ...feet per second and miles per hour
    • ...feet per second and meters per second
    • ...miles per hour and kilometers per hour

    But meters per second and miles per hour? WHY?!

  • by rm999 ( 775449 ) on Friday June 12, 2009 @01:43PM (#28310891)

    I think you're missing the point. This is an AI project - it's research. Presumably, the questions you are typing in haven't been processed by a complicated nest of if-thens written by someone who knows English; instead, statistical models of language and meaning were extracted from the internet. Some people claim this is the equivalent of "teaching" a computer.

    The first approach (hand-written if-then rules), which is what most search engines use, leads to impressive search results but is limited by the logic people can code up. This AI, on the other hand, may be a primitive example of the way Google will work 15 years from now.
