Patent application number | Description | Published |
20080215314 | METHOD FOR ADAPTING A K-MEANS TEXT CLUSTERING TO EMERGING DATA - A method and structure for clustering documents in datasets which include clustering first documents in a first dataset to produce first document classes, creating centroid seeds based on the first document classes, and clustering second documents in a second dataset using the centroid seeds, wherein the first dataset and the second dataset are related. The clustering of the first documents in the first dataset forms a first dictionary of most common words in the first dataset and generates a first vector space model by counting, for each word in the first dictionary, a number of the first documents in which the word occurs, and clusters the first documents in the first dataset based on the first vector space model, and further generates a second vector space model by counting, for each word in the first dictionary, a number of the second documents in which the word occurs. Creation of the centroid seeds includes classifying the second vector space model using the first document classes to produce a classified second vector space model and determining a mean of vectors in each class in the classified second vector space model, the means constituting the centroid seeds. | 09-04-2008 |
20080235220 | METHODOLOGIES AND ANALYTICS TOOLS FOR IDENTIFYING WHITE SPACE OPPORTUNITIES IN A GIVEN INDUSTRY - A method for analyzing predefined subject matter in a patent database, for use with a set of target patents, each target patent related to the predefined subject matter, the method comprising: creating a feature space based on frequently occurring terms found in the set of target patents; creating a partition taxonomy based on a clustered configuration of the feature space; editing the partition taxonomy using domain expertise to produce an edited partition taxonomy; creating a classification taxonomy based on structured features present in the edited partition taxonomy; creating a contingency table by comparing the edited partition taxonomy and the classification taxonomy to provide entries in the contingency table; and identifying all significant relationships in the contingency table to help determine the presence of any white space. | 09-25-2008 |
20080243889 | INFORMATION MINING USING DOMAIN SPECIFIC CONCEPTUAL STRUCTURES - A method and analytics tools for information mining incorporating domain specific knowledge and conceptual structures are disclosed, the method including: providing a first set of documents related to a first topic of interest; using a first taxonomy to categorize the first set of documents into a set of categories; providing a second set of documents related to a second topic of interest; categorizing the second set of documents according to the set of categories of the first set of documents; using an element of domain knowledge to re-categorize the first set of documents; and examining a category to identify a document of interest. | 10-02-2008 |
20080301105 | METHODOLOGIES AND ANALYTICS TOOLS FOR LOCATING EXPERTS WITH SPECIFIC SETS OF EXPERTISE - A method and analytics tools for locating experts with specific sets of expertise are disclosed, the method including providing a collection of documents P | 12-04-2008 |
20080301138 | Method for Analyzing Patent Claims - A patent evaluation method analyzes key words in the claims and how many patents use those words, to measure the impact of a given patent. For a group of patents in a particular field (e.g., as defined by a patent classification code), the key words can be indexed against the patents having claims in which those key words appear, and in particular with respect to that patent having the earliest reference date (e.g., a publication date such as the date on which the patent issued or any corresponding patent application was published). Output may be presented in the form of a table, which aids in quickly understanding a patent's value compared to other patents in its group. Various visualization and user interaction tools may be employed. | 12-04-2008 |
20090198570 | METHODOLOGIES AND ANALYTICS TOOLS FOR IDENTIFYING POTENTIAL LICENSEE MARKETS - A method is disclosed for use with at least one initial document describing a technical concept suitable for licensing, the method comprising: retrieving a set of intellectual property documents from a data warehouse; partitioning the set of intellectual property documents into a plurality of document categories; classifying the set of intellectual property documents by an industry parameter; constructing a contingency table that includes a listing of industry classifications for each of the document categories; and identifying documents within a particular one of the document categories that have different industry classifications so as to identify at least one potential new licensee industry of the technical concept described in the initial document. | 08-06-2009 |
20090292660 | USING RULE INDUCTION TO IDENTIFY EMERGING TRENDS IN UNSTRUCTURED TEXT STREAMS - A method for identifying emerging concepts in unstructured text streams comprises: selecting a subset V of documents from a set U of documents; generating at least one Boolean combination of terms that partitions the set U into a plurality of categories that represent a generalized, statistically based model of the selected subset V wherein the categories are disjoint inasmuch as each document of U is included in only one category of the partition; and generating a descriptive label for each of the disjoint categories from the Boolean combination of terms for that category. | 11-26-2009 |
20100145940 | SYSTEMS AND METHODS FOR ANALYZING ELECTRONIC TEXT - Systems and methods for systematically analyzing an electronic text are described. In one embodiment, the method includes receiving the electronic text from a plurality of sources. The method also includes determining at least one term of interest to be identified in the electronic text. The method further includes identifying a plurality of locations within the electronic text including the at least one term of interest. The method also includes, for each location within the plurality of locations, creating a snippet from a text segment around the at least one term of interest at the location within the electronic text. The method further includes creating multiple taxonomies for the at least one term of interest from the snippets, wherein the taxonomies include at least one category. The method also includes determining co-occurrences between the multiple taxonomies to determine associations between categories of different taxonomies of the multiple taxonomies. | 06-10-2010 |
20110252025 | SYSTEM AND METHOD FOR TOPIC INITIATOR DETECTION ON THE WORLD WIDE WEB - The exemplary embodiments of the present invention provide a system, method and computer program products for determining a particular document that initiated a topic of interest in a collection of documents, where each of the documents has contents and a time it was created. The method includes ranking the documents in the collection based on the respective times that the documents were created, ranking the documents based on how similar their respective contents are to the topic of interest and ranking the documents based on originality of their respective contents. The method further includes producing a composite ranking of the documents based on the time, contents, and originality rankings, and then determining the particular document that initiated the topic of interest from the composite ranking. | 10-13-2011 |
20110252030 | SYSTEMS, METHODS AND COMPUTER PROGRAM PRODUCTS FOR A SNIPPET BASED PROXIMAL SEARCH - The exemplary embodiments of the present invention provide a system, method and computer program products for a snippet based proximal search. A method comprises ranking the documents based on text that is similar to a text snippet. The ranking includes automatically generating proximity queries that include the text snippet, submitting the proximity queries, and collecting the document results. The method comprises selecting a plurality of highest ranked documents to form a subset of documents, extracting snippets from each document in the subset, and creating a vector space model for a set defined by a union of the extracted snippets and the text snippet. The method comprises ranking the extracted snippets according to their vector distances from the input text snippet, and ranking the documents within the subset of documents based on the ranking of the extracted snippets to determine a best matching document in view of the text snippet. | 10-13-2011 |
20150081654 | Techniques for Entity-Level Technology Recommendation - Methods, systems, and articles of manufacture for entity-level technology recommendation are provided herein. A method includes searching a first query against a first corpus of documents to determine a set of documents matching an entity of interest identified in the first query, generating a list of technologies that (i) appear within the content of the set of documents and (ii) are associated to the entity of interest, searching a second query against a second corpus of documents to determine a set of documents representing a technology recommendation for the entity of interest, wherein said second query is based on one or more selected technologies from the list of technologies, and outputting the set of documents representing a technology recommendation to a user and/or a display. | 03-19-2015 |
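
The centroid-seeding idea summarized for application 20080215314 above can be illustrated with a small sketch. This is not the patented implementation. It is a minimal pure-Python reading of the abstract, and it makes several assumptions: whitespace tokenization, per-document term counts as the vector space model, squared Euclidean distance, nearest-centroid classification, and hypothetical helper names (`build_dictionary`, `vectorize`, `kmeans`, `centroid_seeds`) and toy documents invented for illustration.

```python
from collections import Counter

def build_dictionary(docs, size):
    """Pick the `size` words occurring in the most documents (ties broken alphabetically)."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc.lower().split()))
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:size]]

def vectorize(docs, dictionary):
    """Assumed vector space model: term counts per document over the dictionary."""
    vectors = []
    for doc in docs:
        counts = Counter(doc.lower().split())
        vectors.append([float(counts[word]) for word in dictionary])
    return vectors

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(vec, centroids):
    """Index of the closest centroid (lowest index wins ties)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(vec, centroids[i]))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(vectors, seeds, iterations=10):
    """Plain k-means started from the given seed centroids."""
    centroids = [list(s) for s in seeds]
    labels = []
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old centroid if a class is empty
                centroids[k] = mean(members)
    return labels, centroids

def centroid_seeds(second_vectors, first_centroids):
    """Classify second-dataset vectors with the first classes, then average each class."""
    labels = [nearest(v, first_centroids) for v in second_vectors]
    seeds = []
    for k, fallback in enumerate(first_centroids):
        members = [v for v, lab in zip(second_vectors, labels) if lab == k]
        seeds.append(mean(members) if members else list(fallback))
    return seeds

# Toy data, invented for illustration only.
first_docs = ["printer jam paper", "paper jam printer tray",
              "password reset login", "login password locked"]
second_docs = ["printer paper stuck", "reset password email",
               "login email password", "printer tray jam"]

dictionary = build_dictionary(first_docs, 6)       # dictionary comes from the first dataset only
first_vectors = vectorize(first_docs, dictionary)
second_vectors = vectorize(second_docs, dictionary)  # second dataset reuses the first dictionary

# Cluster the first dataset (initial seeds chosen by hand for this sketch).
first_labels, first_centroids = kmeans(first_vectors, [first_vectors[0], first_vectors[2]])

# Derive seeds from the classified second dataset, then cluster it.
seeds = centroid_seeds(second_vectors, first_centroids)
second_labels, _ = kmeans(second_vectors, seeds)
print(first_labels, second_labels)
```

The point of the sketch is the ordering of steps: the second dataset is vectorized against the first dataset's dictionary, so the first classes can be applied to it directly, and the resulting per-class means give k-means a starting point that reflects the earlier clustering.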