Patent application number | Description | Published |
20090171944 | Set Similarity selection queries at interactive speeds - The similarity between a query set comprising query set tokens and a database set comprising database set tokens is determined by a similarity score. The database sets belong to a data collection set, which contains all database sets from which information may be retrieved. If the similarity score is greater than or equal to a user-defined threshold, the database set has information relevant to the query set. The similarity score is calculated with an inverse document frequency (IDF) similarity measure that is independent of term frequency. The document frequency is based at least in part on the number of database sets in the data collection set and the number of database sets which contain at least one query set token. The length of the query set and the length of the database set are normalized. | 07-02-2009 |
20090319518 | METHOD AND SYSTEM FOR INFORMATION DISCOVERY AND TEXT ANALYSIS - A method for searching text sources including temporally-ordered data objects, such as a blog, is provided including the steps of: (i) providing access to text sources, each text source including temporally-ordered data objects; (ii) obtaining or generating a search query based on terms and time intervals; (iii) obtaining or generating time data associated with the data objects; (iv) identifying data objects based on the search query; and (v) generating popularity curves based on the frequency of data objects corresponding to one or more of the search terms in the one or more time intervals. A system and computer program for text source searching is also provided. | 12-24-2009 |
20100125559 | SELECTIVITY ESTIMATION OF SET SIMILARITY SELECTION QUERIES - The invention relates to a system and/or methodology for selectivity estimation of set similarity queries. More specifically, the invention relates to a selectivity estimation technique employing hashed sampling. The invention provides for samples constructed a priori that can efficiently and quickly provide accurate estimates for arbitrary queries, and that can be updated efficiently as well. | 05-20-2010 |
20100318519 | Incremental Maintenance of Inverted Indexes for Approximate String Matching - In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy. | 12-16-2010 |
20110047185 | META-DATA INDEXING FOR XPATH LOCATION STEPS - In accordance with a method of encoding meta-data associated with tree-structured data, a first set of elements of a plurality of elements in the tree-structured data is associated explicitly with explicit meta-data levels, and a second set of elements of the plurality of elements is associated by inheritance with explicit meta-data levels of closest ancestor elements of the first set of elements. The plurality of elements is packed into a plurality of leaf nodes of an index structure. The plurality of leaf nodes is merged into a plurality of non-leaf nodes until a root non-leaf node is generated. The plurality of non-leaf nodes of the index structure is associated with indicators representing ranges of the explicit meta-data levels in the packed first set of elements, such that explicit meta-data level ranges of descendant non-leaf nodes are subsets of explicit meta-data level ranges of ancestor non-leaf nodes. | 02-24-2011 |
20110145398 | System and Method for Monitoring Visits to a Target Site - Methods and systems for monitoring visits to a target site are provided. A list of one or more origin sites is embedded in the target site. A determination is made whether any entry in the list of origin sites has been previously visited. | 06-16-2011 |
20120323870 | Incremental Maintenance of Inverted Indexes for Approximate String Matching - In embodiments of the disclosed technology, indexes, such as inverted indexes, are updated only as necessary to guarantee answer precision within predefined thresholds which are determined with little cost in comparison to the updates of the indexes themselves. With the present technology, a batch of daily updates can be processed in a matter of minutes, rather than a few hours for rebuilding an index, and a query may be answered with assurances that the results are accurate or within a threshold of accuracy. | 12-20-2012 |
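
As an illustration of the IDF-based set similarity selection described for application 20090171944 above, the following is a minimal sketch, not the claimed method. The exact formula is an assumption: each token is weighted by `log(1 + N / df)`, where `N` is the number of database sets and `df` is the number of database sets containing the token, and the score is normalized by the weighted lengths of both sets. The function names (`build_idf`, `norm_length`, `idf_similarity`, `select`) are hypothetical.

```python
import math
from collections import Counter


def build_idf(collection):
    """Compute an IDF weight per token from a list of token sets.

    Assumed weighting: log(1 + N / df). Term frequency plays no role,
    because each database set is a set and counts a token at most once.
    """
    n = len(collection)
    df = Counter(tok for s in collection for tok in s)
    return {tok: math.log(1 + n / count) for tok, count in df.items()}


def norm_length(tokens, idf):
    """Weighted length of a set; tokens unseen in the collection weigh 0."""
    return math.sqrt(sum(idf.get(t, 0.0) ** 2 for t in tokens))


def idf_similarity(query, db_set, idf):
    """Length-normalized IDF similarity between two token sets, in [0, 1]."""
    lq = norm_length(query, idf)
    ls = norm_length(db_set, idf)
    if lq == 0 or ls == 0:
        return 0.0
    shared = query & db_set
    return sum(idf.get(t, 0.0) ** 2 for t in shared) / (lq * ls)


def select(query, collection, idf, threshold):
    """Return the database sets whose score meets the user-defined threshold."""
    return [s for s in collection if idf_similarity(query, s, idf) >= threshold]


# Hypothetical toy collection of tokenized strings.
collection = [
    frozenset({"main", "st"}),
    frozenset({"main", "ave"}),
    frozenset({"park", "ave"}),
]
idf = build_idf(collection)
matches = select(frozenset({"main", "st"}), collection, idf, threshold=0.9)
```

This sketch scans every database set linearly. It only shows how the score and the threshold interact, and makes no attempt at the interactive-speed evaluation the application title refers to.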