Fetch Technologies, Inc.

Fetch Technologies, Inc. Patent applications
Patent application number | Title | Published
20110282877 | METHOD AND SYSTEM FOR AUTOMATICALLY EXTRACTING DATA FROM WEB SITES | 11-17-2011

Abstract: In accordance with an embodiment, data may be automatically extracted from semi-structured web sites. Unsupervised learning may be used to analyze web sites and discover their structure. One method utilizes a set of heterogeneous “experts,” each expert being capable of identifying certain types of generic structure. Each expert represents its discoveries as “hints.” Based on these hints, the system may cluster the pages and text segments and identify semi-structured data that can be extracted. To identify a good clustering, a probabilistic model of the hint-generation process may be used.
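
The expert-and-hint idea in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent application: the page records, the two experts (`url_expert`, `title_expert`), and the `cluster` function are hypothetical names. In particular, the sketch replaces the probabilistic model of hint generation with a much simpler voting rule, in which two pages are merged only when enough experts hint that they belong together.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical page records (page_id, url, title); illustrative data only.
PAGES = [
    ("p1", "/item/101", "Widget A - Shop"),
    ("p2", "/item/102", "Widget B - Shop"),
    ("p3", "/list/1", "Results page 1"),
    ("p4", "/list/2", "Results page 2"),
    ("p5", "/about", "About us"),
]


def url_expert(pages):
    """Hint: pages whose URLs are identical once digits are masked."""
    groups = defaultdict(set)
    for page_id, url, _title in pages:
        groups[re.sub(r"\d+", "#", url)].add(page_id)
    return [group for group in groups.values() if len(group) > 1]


def title_expert(pages):
    """Hint: pages whose titles start with the same word."""
    groups = defaultdict(set)
    for page_id, _url, title in pages:
        groups[title.split()[0]].add(page_id)
    return [group for group in groups.values() if len(group) > 1]


def cluster(pages, experts, min_votes=2):
    """Merge pages that at least `min_votes` experts hint belong together.

    This is a simple voting stand-in (an assumption of this sketch),
    not the probabilistic model mentioned in the abstract.
    """
    # Count, for every pair of pages, how many hints contain both.
    votes = defaultdict(int)
    for expert in experts:
        for hint in expert(pages):
            for a, b in combinations(sorted(hint), 2):
                votes[(a, b)] += 1

    # Union-find over page ids.
    parent = {page_id: page_id for page_id, _url, _title in pages}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), count in votes.items():
        if count >= min_votes:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for page_id in parent:
        clusters[find(page_id)].add(page_id)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    print(cluster(PAGES, [url_expert, title_expert]))
```

With the sample data, both experts agree on the item pages and on the list pages, so those pairs are merged, while the "about" page stays in a cluster of its own. Raising `min_votes` above the number of experts leaves every page unmerged.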