Burges, WA
Chris Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20080270376 | WEB SPAM PAGE CLASSIFICATION USING QUERY-DEPENDENT DATA - A web spam page classifier is described that identifies web spam pages based on features of a search query and web page pair. The features can be extracted from training instances and a training algorithm can be employed to develop the classifier. Pages identified as web spam pages can be demoted and/or removed from a relevancy ranked list. | 10-30-2008 |
| 20080313147 | Multi-level search - A computer-implementable method and system for performing a multi-level search. The method includes performing a primary search that involves executing a query submitted by a user, and returning primary search results (a list of documents, for example). The method further includes automatically performing a secondary search. The secondary search involves identifying at least one third-party source of information based on the query, and automatically assessing a semantic interpretation of the query. The secondary search utilizes the identified at least one third-party source of information and the semantic interpretation of the query to derive secondary search results, which are displayed along with the primary search results. | 12-18-2008 |
Chris J. Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20110161260 | USER-DRIVEN INDEX SELECTION - Techniques for index building are described. Clickcounts of respective training URLs may indicate a number of times that corresponding training URLs were clicked in search engine results. A machine learning algorithm implemented on a computer computes a trained model that is then stored. The clickcounts and respective URLs are passed to the machine learning algorithm to train the model to predict probabilities based on feature vectors of URLs. An index of web pages is built for a set of URLs that identify the web pages. Feature vectors for the URLs are computed. Probabilities of the web pages of the URLs being searched in the future by users may be computed by processing the feature vectors with the trained model. The probabilities may be used to determine which of the URLs to include in the index. | 06-30-2011 |
Chris J.c. Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20090106223 | ENTERPRISE RELEVANCY RANKING USING A NEURAL NETWORK - A neural network is used to process a set of ranking features in order to determine the relevancy ranking for a set of documents or other items. The neural network calculates a predicted relevancy score for each document and the documents can then be ordered by that score. Alternate embodiments apply a set of data transformations to the ranking features before they are input to the neural network. Training can be used to adapt both the neural network and certain of the data transformations to target environments. | 04-23-2009 |
| 20100153315 | BOOSTING ALGORITHM FOR RANKING MODEL ADAPTATION - Model adaptation may be performed to take a general model trained with a set of training data (possibly large), and adapt the model using a set of domain-specific training data (possibly small). The parameters, structure, or configuration of a model trained in one domain (called the background domain) may be adapted to a different domain (called the adaptation domain), for which there may be a limited amount of training data. The adaption may be performed using the Boosting Algorithm to select an optimal basis function that optimizes a measure of error of the model as it is being iteratively refined, i.e., adapted. | 06-17-2010 |
Chrisopher J.c. Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20080275833 | Link spam detection using smooth classification function - A collection of web pages is considered as a directed graph in which the pages themselves are nodes and the hyperlinks between the pages are directed edges in the graph. A trusted entity identifies training examples for spam pages and normal pages. A random walk is conducted through the directed graph that includes the collection of web pages and the stationary probabilities, and transitional probabilities, among the nodes in the directed graph are obtained. A classifier training component estimates a classification function that changes slowly on densely connected subgraphs within the directed graph. The classification function assigns a value to each of the nodes in the directed graph and identifies them as spam or normal pages based upon whether the value meets a given function threshold value. | 11-06-2008 |
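
The random walk described in the abstract above can be illustrated with a short sketch that estimates stationary probabilities on a small directed graph by power iteration. The function name `stationary_distribution`, the `damping` teleportation factor, and the uniform handling of pages without outgoing links are assumptions made for this illustration; the abstract does not specify them, and this is not the classifier training step itself.

```python
def stationary_distribution(links, damping=0.85, iters=100):
    """Estimate stationary probabilities of a random walk on a directed graph.

    links maps each page to the list of pages it links to.
    damping (assumed, not from the abstract) is the probability of following
    a hyperlink; otherwise the walk jumps to a uniformly random page.
    """
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    prob = {u: 1.0 / n for u in nodes}
    for _ in range(iters):
        # Every page receives the teleportation mass first.
        nxt = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            out = links.get(u, [])
            if out:
                # Split this page's mass evenly over its outgoing links.
                share = damping * prob[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # Page without outgoing links: spread its mass over all pages.
                share = damping * prob[u] / n
                for v in nodes:
                    nxt[v] += share
        prob = nxt
    return prob


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": ["a"]}
    print(stationary_distribution(graph))
```

On a symmetric three-page cycle such as the one in the usage lines, every page ends up with the same probability, which is a convenient sanity check before trying larger graphs.
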
Christopher J.c. Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20080275902 | Web page analysis using multiple graphs - A collection of web pages is modeled as a directed graph, in which the nodes of the graph are the web pages and directed edges are hyperlinks. Web pages can also be represented by content, or by other features, to obtain a similarity graph over the web pages, where nodes again denote the web pages and the links or edges between each pair of nodes is weighted by a corresponding similarity between those two nodes. A random walk is defined for each graph, and a mixture of the random walks is obtained for the set of graphs. The collection of web pages is then analyzed based on the mixture to obtain a web page analysis result. The web page analysis result can be, for example, clustering of the web pages to discover web communities, classifying or categorizing the web pages, or spam detection indicating whether a given web page is spam or content. | 11-06-2008 |
| 20080281817 | Accounting for behavioral variability in web search - The concept of variability pertains to whether users exhibit consistent search interaction patterns, for example, in terms of interaction flow or information targeted. Methods are provided for analyzing variability, and then adapting search-related functionality (e.g., processes and/or interfaces) to account for variability characteristics, for example, to account for predictable search interaction behavior. | 11-13-2008 |
| 20090106229 | Linear combination of rankers - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The apparatus further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component. | 04-23-2009 |
| 20090106232 | BOOSTING A RANKER FOR IMPROVED RANKING ACCURACY - A system described herein includes a trainer component that receives an estimated gradient of cost that corresponds to a first ranker component with respect to at least one training point and at least one query. The trainer component builds a second ranker component based at least in part upon the received estimated gradient. The system further includes a combiner component that linearly combines the first ranker component and the second ranker component. | 04-23-2009 |
| 20090112781 | PREDICTING AND USING SEARCH ENGINE SWITCHING BEHAVIOR - Aspects of the subject matter described herein relate to predicting and using search engine switching behavior. In aspects, switching components receive a representation of user interactions with at least one browser. The switching components derive information from the representation that is useful in predicting whether a user will switch search engines. The derived information and information about a user's current interaction with a browser is then used by a switch predictor to predict whether the user will switch search engines. This prediction may be used in a variety of ways, examples of which are given herein. | 04-30-2009 |
| 20100281024 | LINEAR COMBINATION OF RANKERS - Described herein is a system that includes a receiver component that receives first scores for training points and second scores for the training points, wherein the first scores are individually assigned to the training points by a first ranker component and the second scores are individually assigned to the training points by a second ranker component. The apparatus further includes a determiner component in communication with the receiver component that automatically outputs a value for a parameter α based at least in part upon the first scores and the second scores, wherein α is used to linearly combine the first ranker component and the second ranker component. | 11-04-2010 |
| 20100318540 | IDENTIFICATION OF SAMPLE DATA ITEMS FOR RE-JUDGING - Described is a technology for identifying sample data items (e.g., documents corresponding to query-URL pairs) having the greatest likelihood of being mislabeled when previously judged, and selecting those data items for re-judging. In one aspect, lambda gradient scores (information associated with ranked sample data items that indicates a relative direction and how “strongly” to move each data item for lowering a ranking cost) are summed for pairs of sample data items to compute re-judgment scores for each of those sample data items. The re-judgment scores indicate a relative likelihood of mislabeling. Once the selected sample data items are re-judged, a new training set is available, whereby a new ranker may be trained. | 12-16-2010 |
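
The "Linear combination of rankers" entries in the table above describe choosing a parameter α from two rankers' scores on the same training points. A minimal sketch of that idea follows, assuming a convex combination `(1 - alpha) * first + alpha * second`, a pairwise-accuracy quality measure, and a plain grid search over α. All three are assumptions made for illustration; the abstracts do not state the form of the combination or how α is found, and the names `pairwise_accuracy` and `best_alpha` are hypothetical.

```python
def pairwise_accuracy(scores, labels):
    """Fraction of pairs with different labels that the scores order correctly."""
    correct = 0
    total = 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                total += 1
                if scores[i] > scores[j]:
                    correct += 1
    return correct / total if total else 0.0


def best_alpha(first, second, labels, steps=100):
    """Grid-search alpha in [0, 1] for the combined score
    (1 - alpha) * first + alpha * second.

    Returns (alpha, accuracy). Ties keep the smallest alpha tried.
    """
    best_acc = -1.0
    best_a = 0.0
    for k in range(steps + 1):
        alpha = k / steps
        combined = [(1 - alpha) * a + alpha * b for a, b in zip(first, second)]
        acc = pairwise_accuracy(combined, labels)
        if acc > best_acc:
            best_acc = acc
            best_a = alpha
    return best_a, best_acc


if __name__ == "__main__":
    first_scores = [0.9, 0.1, 0.5]   # scores from a first ranker (made up)
    second_scores = [0.2, 0.8, 0.9]  # scores from a second ranker (made up)
    relevance = [2, 0, 1]            # hypothetical relevance labels
    print(best_alpha(first_scores, second_scores, relevance))
```

A grid search is only the simplest way to pick α; it evaluates the quality measure at fixed steps and can miss the best value between them.
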
Christopher J. C. Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20100185649 | SUBSTANTIALLY SIMILAR QUERIES - A system described herein includes an analyzer component that analyzes queries submitted by users and corresponding URLs selected by the users, wherein the queries include a first query and a second query, and wherein the analyzer component determines that the first query and the second query are substantially similar queries. The system additionally includes a correlator component that, responsive to the analyzer component determining that the first query and the second query are substantially similar, generates correlation data that indicates that the first and second queries are substantially similar. | 07-22-2010 |
Christopher John Champness Burges, Bellevue, WA US
| Patent application number | Description | Published |
|---|---|---|
| 20100262612 | RE-RANKING TOP SEARCH RESULTS - The claimed subject matter provides a system and/or a method that facilitates generating sorted search results for a query. An interface component can receive a query in a first language. A first ranker can be trained from a portion of data related to a second language. A second ranker can correspond to the first language, wherein the second ranker is untrained due to a limited amount of data related to the first language. A sorting component can invoke the first ranker to generate and order a pre-defined number of search results for the received query and subsequently invoke the second ranker to the pre-defined number of search results to generate a re-ordered number of search results in the first language for the received query. | 10-14-2010 |
| 20100317444 | USING A HUMAN COMPUTATION GAME TO IMPROVE SEARCH ENGINE PERFORMANCE - Human computation games are provided wherein a player is shown a page, such as a web page. The player is then asked to provide one or more terms that are intended to cause a search engine to return the page in response to performing a query using the terms. The terms provided by the player during game play are then collected, stored, and utilized to improve the performance of the search engine. | 12-16-2010 |
