Burges, WA

Chris Burges, Bellevue, WA US

Patent application number	Description	Published
20080270376	WEB SPAM PAGE CLASSIFICATION USING QUERY-DEPENDENT DATA - A web spam page classifier is described that identifies web spam pages based on features of a search query and web page pair. The features can be extracted from training instances and a training algorithm can be employed to develop the classifier. Pages identified as web spam pages can be demoted and/or removed from a relevancy ranked list.	10-30-2008
20080313147	Multi-level search - A computer-implementable method and system for performing a multi-level search. The method includes performing a primary search that involves executing a query submitted by a user, and returning primary search results (a list of documents, for example). The method further includes automatically performing a secondary search. The secondary search involves identifying at least one third-party source of information based on the query, and automatically assessing a semantic interpretation of the query. The secondary search utilizes the identified at least one third-party source of information and the semantic interpretation of the query to derive secondary search results, which are displayed along with the primary search results.	12-18-2008

Chris J. Burges, Bellevue, WA US

Patent application number	Description	Published
20110161260	USER-DRIVEN INDEX SELECTION - Techniques for index building are described. Clickcounts of respective training URLs may indicate a number of times that corresponding training URLs were clicked in search engine results. A machine learning algorithm implemented on a computer computes a trained model that is then stored. The clickcounts and respective URLs are passed to the machine learning algorithm to train the model to predict probabilities based on feature vectors of URLs. An index of web pages is built for a set of URLs that identify the web pages. Feature vectors for the URLs are computed. Probabilities of the web pages of the URLs being searched in the future by users may be computed by processing the feature vectors with the trained model. The probabilities may be used to determine which of the URLs to include in the index.	06-30-2011

Chris J.c. Burges, Bellevue, WA US

Patent application number	Description	Published
20090106223	ENTERPRISE RELEVANCY RANKING USING A NEURAL NETWORK - A neural network is used to process a set of ranking features in order to determine the relevancy ranking for a set of documents or other items. The neural network calculates a predicted relevancy score for each document and the documents can then be ordered by that score. Alternate embodiments apply a set of data transformations to the ranking features before they are input to the neural network. Training can be used to adapt both the neural network and certain of the data transformations to target environments.	04-23-2009
20100153315	BOOSTING ALGORITHM FOR RANKING MODEL ADAPTATION - Model adaptation may be performed to take a general model trained with a set of training data (possibly large), and adapt the model using a set of domain-specific training data (possibly small). The parameters, structure, or configuration of a model trained in one domain (called the background domain) may be adapted to a different domain (called the adaptation domain), for which there may be a limited amount of training data. The adaptation may be performed using the Boosting Algorithm to select an optimal basis function that optimizes a measure of error of the model as it is being iteratively refined, i.e., adapted.	06-17-2010
20120158710	MULTI-TIERED INFORMATION RETRIEVAL TRAINING - Methods and systems for multi-tiered information retrieval training are disclosed. A method includes identifying results in a ranked ordering of results that can be swapped without changing a score determined using a first ranking quality measure, determining a first vector and at least one other vector for each identified swappable result in the ranked ordering of results based on the first ranking quality measure and at least one other ranking quality measure respectively, and adding the first vector and the at least one other vector for each identified swappable result in the ranked ordering of results to obtain a function of the first vector and the at least one other vector. Access is provided to the function of the first vector and the at least one other vector for use in the multi-tiered information retrieval training.	06-21-2012
20120330647	HIERARCHICAL MODELS FOR LANGUAGE MODELING - The described implementations relate to natural language processing, and more particularly to training a language prior model using a model structure. The language prior model can be trained using parameterized representations of lexical structures such as training sentences, as well as parameterized representations of lexical units such as words or n-grams. During training, the parameterized representations of the lexical structures and the lexical units can be adjusted using the model structure. When the language prior model is trained, the parameterized representations of the lexical structures can reflect how the lexical units were used in the lexical structures.	12-27-2012

Chrisopher J.c. Burges, Bellevue, WA US

Patent application number	Description	Published
20080275833	Link spam detection using smooth classification function - A collection of web pages is considered as a directed graph in which the pages themselves are nodes and the hyperlinks between the pages are directed edges in the graph. A trusted entity identifies training examples for spam pages and normal pages. A random walk is conducted through the directed graph that includes the collection of web pages, and the stationary probabilities and transition probabilities among the nodes in the directed graph are obtained. A classifier training component estimates a classification function that changes slowly on densely connected subgraphs within the directed graph. The classification function assigns a value to each of the nodes in the directed graph and identifies them as spam or normal pages based upon whether the value meets a given function threshold value.	11-06-2008

Christopher J. Burges, Bellevue, WA US

Patent application number	Description	Published
20150134329	CONTENT IDENTIFICATION SYSTEM - The content of a media program is recognized by analyzing its audio content to extract therefrom prescribed features, which are compared to a database of features associated with identified content. The identity of the content within the database that has features that most closely match the features of the media program being played is supplied as the identity of the program being played. The features are extracted from a frequency domain version of the media program by a) filtering the coefficients to reduce their number, e.g., using triangular filters; b) grouping a number of consecutive outputs of triangular filters into segments; and c) selecting those segments that meet prescribed criteria, such as those segments that have the largest minimum segment energy with prescribed constraints that prevent the segments from being too close to each other. The triangular filters may be log-spaced and their output may be normalized.	05-14-2015

Christopher J.c. Burges, Bellevue, WA US

Patent application number	Description	Published
20080275902	Web page analysis using multiple graphs - A collection of web pages is modeled as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks. Web pages can also be represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the link or edge between each pair of nodes is weighted by a corresponding similarity between those two nodes. A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs. The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result. The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content.	11-06-2008
20080281817	Accounting for behavioral variability in web search - The concept of variability pertains to whether users exhibit consistent search interaction patterns, for example, in terms of interaction flow or information targeted. Methods are provided for analyzing variability, and then adapting search-related functionality (e.g., processes and/or interfaces) to account for variability characteristics, for example, to account for predictable search interaction behavior.	11-13-2008
20090106229	Linear combination of rankers - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The apparatus further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component.	04-23-2009
20090106232	BOOSTING A RANKER FOR IMPROVED RANKING ACCURACY - A system described herein includes a trainer component that receives an estimated gradient of cost that corresponds to a first ranker component with respect to at least one training point and at least one query. The trainer component builds a second ranker component based at least in part upon the received estimated gradient. The system further includes a combiner component that linearly combines the first ranker component and the second ranker component.	04-23-2009
20090112781	PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein.	04-30-2009
20100281024	LINEAR COMBINATION OF RANKERS - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The apparatus further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component.	11-04-2010
20100318540	IDENTIFICATION OF SAMPLE DATA ITEMS FOR RE-JUDGING - Described is a technology for identifying sample data items (e.g., documents corresponding to query-URL pairs) having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores (information associated with ranked sample data items that indicates a relative direction and how “strongly” to move each data item for lowering a ranking cost) are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained.	12-16-2010
20110238648	PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein.	09-29-2011
20110282816	LINK SPAM DETECTION USING SMOOTH CLASSIFICATION FUNCTION - A spam detection system is disclosed. The system includes a classifier training component that receives a first set of training pages labeled as normal pages and a second set of training pages labeled as spam pages. The training component trains a web page classifier based on both the first set of training pages and the second set of training pages. A spam detector then receives unlabeled web pages and uses the web page classifier to classify the unlabeled web pages as spam pages or normal pages.	11-17-2011
20120271811	PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser are then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein.	10-25-2012
20130282632	LINK SPAM DETECTION USING SMOOTH CLASSIFICATION FUNCTION - A spam detection system is disclosed. The system includes a classifier training component that receives a first set of training pages labeled as normal pages and a second set of training pages labeled as spam pages. The training component trains a web page classifier based on both the first set of training pages and the second set of training pages. A spam detector then receives unlabeled web pages and uses the web page classifier to classify the unlabeled web pages as spam pages or normal pages.	10-24-2013
20140156260	GENERATING SENTENCE COMPLETION QUESTIONS - The subject disclosure is directed towards automated processes for generating sentence completion questions based at least in part on a language model. Using the language model, a sentence is located, and alternates for a focus word (or words) in the sentence are automatically provided. Also described are automated filtering of candidate sentences to locate the sentence, filtering the alternates based upon elimination criteria, scoring sentences with the correct word and as modified with the alternates, and ranking the alternates. Manual selection may be used along with the automated processes.	06-05-2014

Christopher J. C. Burges, Bellevue, WA US

Patent application number	Description	Published
20100185649	SUBSTANTIALLY SIMILAR QUERIES - A system described herein includes an analyzer component that analyzes queries submitted by users and corresponding URLs selected by the users, wherein the queries include a first query and a second query, and wherein the analyzer component determines that the first query and the second query are substantially similar queries. The system additionally includes a correlator component that, responsive to the analyzer component determining that the first query and the second query are substantially similar, generates correlation data that indicates that the first and second queries are substantially similar.	07-22-2010
20110295847	CONCEPT INTERFACE FOR SEARCH ENGINES - Concepts are presented related to a search engine query. Users can subsequently navigate search results and/or reformulate a query at a conceptual level. In one instance, users can specify weight with respect to one or more concepts to capture interest or lack of interest with respect to search intent. Based on one or more weights, a search query can be modified and results presented to a user along with associated concepts to enable continued interaction. Additionally or alternatively, organization and/or presentation of search results as well as advertisements can be influenced by user-specified weights or other interactions with concepts.	12-01-2011

Christopher John Champness Burges, Bellevue, WA US

Patent application number	Description	Published
20100262612	RE-RANKING TOP SEARCH RESULTS - The claimed subject matter provides a system and/or a method that facilitates generating sorted search results for a query. An interface component can receive a query in a first language. A first ranker can be trained from a portion of data related to a second language. A second ranker can correspond to the first language, wherein the second ranker is untrained due to a limited amount of data related to the first language. A sorting component can invoke the first ranker to generate and order a pre-defined number of search results for the received query and subsequently apply the second ranker to the pre-defined number of search results to generate a re-ordered number of search results in the first language for the received query.	10-14-2010
20100317444	USING A HUMAN COMPUTATION GAME TO IMPROVE SEARCH ENGINE PERFORMANCE - Human computation games are provided wherein a player is shown a page, such as a web page. The player is then asked to provide one or more terms that are intended to cause a search engine to return the page in response to performing a query using the terms. The terms provided by the player during game play are then collected, stored, and utilized to improve the performance of the search engine.	12-16-2010