Technology Review: Extracting Meaning from Millions of Pages
A software engine that pulls together facts by combing through more than 500 million Web pages has been developed by researchers at the University of Washington. The tool extracts information from billions of lines of text by analyzing basic relationships between words.
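As a rough illustration of what extracting facts from relationships between words can look like, the toy sketch below pulls (entity, relation, entity) triples out of plain sentences with a single regular expression and counts how often each triple recurs. The pattern, the function names `extract_triples` and `count_triples`, and the sample sentences are all assumptions made for this example; it is not TextRunner's actual method, which the article does not describe in detail.

```python
import re
from collections import Counter

# Hypothetical toy pattern (an assumption for illustration, not TextRunner's extractor):
# "<Capitalized phrase> <one to three lowercase words> <Capitalized phrase>"
TRIPLE = re.compile(
    r"\b([A-Z]\w*(?: [A-Z]\w*)*)"    # arg1: run of capitalized words
    r" ((?:[a-z]+ ){0,2}[a-z]+)"     # relation: one to three lowercase words
    r" ([A-Z]\w*(?: [A-Z]\w*)*)\b"   # arg2: run of capitalized words
)


def extract_triples(text):
    """Return (arg1, relation, arg2) tuples found in simple sentences."""
    triples = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in TRIPLE.finditer(sentence):
            triples.append(match.groups())
    return triples


def count_triples(pages):
    """Count how often each triple appears across a collection of page texts."""
    counts = Counter()
    for page in pages:
        counts.update(extract_triples(page))
    return counts


if __name__ == "__main__":
    sample_pages = [
        "Seattle is located in Washington. Einstein was born in Ulm.",
        "Seattle is located in Washington.",
    ]
    for triple, n in count_triples(sample_pages).most_common():
        print(n, triple)
```

Counting repeated triples is one simple way to favor facts that many pages agree on. A real system would need far more robust handling of sentence structure than a capitalization-based pattern.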
Some experts say that this kind of "automated information extraction" will likely form the basis of a far more intelligent next generation of Web search, in which nuggets of information are first gleaned and then combined.
The University of Washington project scales up TextRunner, an existing technology developed there, in both the number of pages and the range of topics it can analyze.
"The significance of TextRunner is that it is scalable because it is unsupervised," says Peter Norvig, director of research at Google, which donated the database of Web pages that TextRunner analyzes. "It can discover and learn millions of relations, not just one at a time. With TextRunner, there is no human in the loop: it just finds relations on its own."