Patent application number | Description | Published |
20090106223 | ENTERPRISE RELEVANCY RANKING USING A NEURAL NETWORK - A neural network is used to process a set of ranking features in order to determine the relevancy ranking for a set of documents or other items. The neural network calculates a predicted relevancy score for each document and the documents can then be ordered by that score. Alternate embodiments apply a set of data transformations to the ranking features before they are input to the neural network. Training can be used to adapt both the neural network and certain of the data transformations to target environments. | 04-23-2009 |
20100153315 | BOOSTING ALGORITHM FOR RANKING MODEL ADAPTATION - Model adaptation may be performed to take a general model trained with a set of training data (possibly large), and adapt the model using a set of domain-specific training data (possibly small). The parameters, structure, or configuration of a model trained in one domain (called the background domain) may be adapted to a different domain (called the adaptation domain), for which there may be a limited amount of training data. The adaptation may be performed using the Boosting Algorithm to select an optimal basis function that optimizes a measure of error of the model as it is being iteratively refined, i.e., adapted. | 06-17-2010 |
20120158710 | MULTI-TIERED INFORMATION RETRIEVAL TRAINING - Methods and systems for multi-tiered information retrieval training are disclosed. A method includes identifying results in a ranked ordering of results that can be swapped without changing a score determined using a first ranking quality measure, determining a first vector and at least one other vector for each identified swappable result in the ranked ordering of results based on the first ranking quality measure and at least one other ranking quality measure respectively, and adding the first vector and the at least one other vector for each identified swappable result in the ranked ordering of results to obtain a function of the first vector and the at least one other vector. Access is provided to the function of the first vector and the at least one other vector for use in the multi-tiered information retrieval training. | 06-21-2012 |
20120330647 | HIERARCHICAL MODELS FOR LANGUAGE MODELING - The described implementations relate to natural language processing, and more particularly to training a language prior model using a model structure. The language prior model can be trained using parameterized representations of lexical structures such as training sentences, as well as parameterized representations of lexical units such as words or n-grams. During training, the parameterized representations of the lexical structures and the lexical units can be adjusted using the model structure. When the language prior model is trained, the parameterized representations of the lexical structures can reflect how the lexical units were used in the lexical structures. | 12-27-2012 |
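The scoring-and-ordering flow described in the first entry above (20090106223) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the `log1p` feature transformation, the single `tanh` hidden layer, the hand-picked weights, and the function names `transform`, `score`, and `rank`. In practice the weights would come from training.

```python
import math

def transform(features):
    # Assumed data transformation: log(1 + x) to compress large raw feature values.
    return [math.log1p(x) for x in features]

def score(features, w_hidden, b_hidden, w_out, b_out):
    # Hypothetical network: one tanh hidden layer followed by a linear output node.
    x = transform(features)
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out

def rank(docs, params):
    # Order document ids by predicted relevancy score, highest first.
    return sorted(docs, key=lambda d: score(docs[d], *params), reverse=True)

# Illustrative (untrained) parameters: w_hidden, b_hidden, w_out, b_out.
params = ([[1.0, 0.5], [0.2, 0.3]], [0.0, 0.0], [1.0, 1.0], 0.0)
docs = {"a": [0.0, 0.0], "b": [10.0, 5.0], "c": [1.0, 1.0]}
ordering = rank(docs, params)
```

With these all-positive weights the score grows with every feature, so the document with the largest feature values is ranked first.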
Patent application number | Description | Published |
20080275902 | Web page analysis using multiple graphs - A collection of web pages is modeled as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks. Web pages can also be represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the link or edge between each pair of nodes is weighted by a corresponding similarity between those two nodes. A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs. The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result. The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content. | 11-06-2008 |
20080281817 | Accounting for behavioral variability in web search - The concept of variability pertains to whether users exhibit consistent search interaction patterns, for example, in terms of interaction flow or information targeted. Methods are provided for analyzing variability, and then adapting search-related functionality (e.g., processes and/or interfaces) to account for variability characteristics, for example, to account for predictable search interaction behavior. | 11-13-2008 |
20090106229 | Linear combination of rankers - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The system further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component. | 04-23-2009 |
20090106232 | BOOSTING A RANKER FOR IMPROVED RANKING ACCURACY - A system described herein includes a trainer component that receives an estimated gradient of cost that corresponds to a first ranker component with respect to at least one training point and at least one query. The trainer component builds a second ranker component based at least in part upon the received estimated gradient. The system further includes a combiner component that linearly combines the first ranker component and the second ranker component. | 04-23-2009 |
20090112781 | PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein. | 04-30-2009 |
20100281024 | LINEAR COMBINATION OF RANKERS - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The system further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component. | 11-04-2010 |
20100318540 | IDENTIFICATION OF SAMPLE DATA ITEMS FOR RE-JUDGING - Described is a technology for identifying sample data items (e.g., documents corresponding to query-URL pairs) having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores (information associated with ranked sample data items that indicates a relative direction and how “strongly” to move each data item for lowering a ranking cost) are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained. | 12-16-2010 |
20110238648 | PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein. | 09-29-2011 |
20110282816 | LINK SPAM DETECTION USING SMOOTH CLASSIFICATION FUNCTION - A spam detection system is disclosed. The system includes a classifier training component that receives a first set of training pages labeled as normal pages and a second set of training pages labeled as spam pages. The training component trains a web page classifier based on both the first set of training pages and the second set of training pages. A spam detector then receives unlabeled web pages and uses the web page classifier to classify the unlabeled web pages as spam pages or normal pages. | 11-17-2011 |
20120271811 | PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein. | 10-25-2012 |
20130282632 | LINK SPAM DETECTION USING SMOOTH CLASSIFICATION FUNCTION - A spam detection system is disclosed. The system includes a classifier training component that receives a first set of training pages labeled as normal pages and a second set of training pages labeled as spam pages. The training component trains a web page classifier based on both the first set of training pages and the second set of training pages. A spam detector then receives unlabeled web pages and uses the web page classifier to classify the unlabeled web pages as spam pages or normal pages. | 10-24-2013 |
20140156260 | GENERATING SENTENCE COMPLETION QUESTIONS - The subject disclosure is directed towards automated processes for generating sentence completion questions based at least in part on a language model. Using the language model, a sentence is located, and alternates for a focus word (or words) in the sentence are automatically provided. Also described are automated filtering of candidate sentences to locate the sentence, filtering the alternates based upon elimination criteria, scoring sentences with the correct word and as modified with the alternates, and ranking the alternates. Manual selection may be used along with the automated processes. | 06-05-2014 |
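The "Linear combination of rankers" entries above (20090106229, 20100281024) describe choosing a parameter α from two rankers' scores on the same training points. The following is a minimal sketch of one way such a value could be picked, not the method claimed in the applications. It assumes, beyond what the abstracts state, that relevance labels are available for the training points, that quality is measured by pairwise accuracy, and that α is found by a simple grid search; the names `combine`, `pairwise_accuracy`, and `choose_alpha` are hypothetical.

```python
from itertools import combinations

def combine(s1, s2, alpha):
    # Linear combination of the two rankers' scores: (1 - alpha) * s1 + alpha * s2.
    return [(1 - alpha) * a + alpha * b for a, b in zip(s1, s2)]

def pairwise_accuracy(scores, labels):
    # Fraction of differently labeled pairs that the scores order correctly.
    correct = total = 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] == labels[j]:
            continue
        total += 1
        hi, lo = (i, j) if labels[i] > labels[j] else (j, i)
        if scores[hi] > scores[lo]:
            correct += 1
    return correct / total if total else 0.0

def choose_alpha(s1, s2, labels, steps=101):
    # Assumed stand-in for the determiner component: grid search over [0, 1].
    best_alpha, best_acc = 0.0, -1.0
    for k in range(steps):
        alpha = k / (steps - 1)
        acc = pairwise_accuracy(combine(s1, s2, alpha), labels)
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc
    return best_alpha, best_acc

# Toy training points: each ranker alone misorders one pair.
labels = [2, 1, 0, 0]
s1 = [0.9, 0.2, 0.5, 0.1]
s2 = [0.3, 0.8, 0.1, 0.2]
alpha, acc = choose_alpha(s1, s2, labels)
```

On this toy data each ranker alone gets four of the five labeled pairs right, while an intermediate α orders all five correctly, which is the motivation for combining the two rankers at all.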