freakshowsam writes "Technology Review has an article on a software engine, developed by researchers at the University of Washington, that pulls together facts by combing through more than 500 million Web pages. The tool extracts information from billions of lines of text by analyzing basic relationships between words. 'The significance of TextRunner is that it is scalable because it is unsupervised,' says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. The prototype still has a fairly simple interface and is not meant for public search so much as to demonstrate the automated extraction of information from 500 million Web pages, says Oren Etzioni, a University of Washington computer scientist leading the project."
Extracting Meaning from Millions of Pages