Kenthapadi
Krishnaram Kenthapadi, Mountain View, CA US
Patent application number | Description | Published |
---|---|---|
20100318539 | LABELING DATA SAMPLES USING OBJECTIVE QUESTIONS - Described is a technology for obtaining labeled sample data. Labeling guidelines are converted into binary yes/no questions regarding data samples. The questions and data samples are provided to judges who then answer the questions for each sample. The answers are input to a label assignment algorithm that associates a label with each sample based upon the answers. If the guidelines are modified and previous answers to the binary questions are maintained, at least some of the previous answers may be used in re-labeling the samples in view of the modification. | 12-16-2010 |
20100318546 | SYNOPSIS OF A SEARCH LOG THAT RESPECTS USER PRIVACY - Described is a technique for releasing output data representing a search log, in which the data is suitable for most data mining/analysis applications but is safe to publish because it preserves user privacy. The search log is processed such that a query is only included if a sufficient count of that query is present; noise may be added. User contributions that are considered may be limited to a maximum number of queries. The output may indicate how often each query appeared (possibly plus noise). Other output may comprise a query-action graph, a query-inaction graph and/or a query-reformulation graph, with nodes representing queries and nodes representing actions, inactions or reformulations (e.g., clicked URLs, skipped URLs, or selected related queries), and edges between nodes representing action, skip or selection counts (possibly plus noise). The output may correspond to the top results/related queries returned from a search. | 12-16-2010 |
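
The query-count release described in application 20100318546 above can be sketched roughly as follows. This is a minimal illustration, not the method claimed in the application: the function names, the parameters `max_per_user`, `threshold` and `noise_scale`, the choice of Laplace noise, and the decision to compare the noisy count against the threshold are all assumptions made for the example.

```python
import random
from collections import Counter, defaultdict


def laplace_noise(scale, rng):
    # The difference of two exponential draws with mean `scale`
    # follows a Laplace distribution centered at zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish_query_counts(log, max_per_user, threshold, noise_scale, seed=None):
    """Release per-query counts from `log`, a list of (user, query) pairs.

    Hypothetical sketch: each user contributes at most `max_per_user`
    queries, optional noise is added to every count, and a query is
    released only if its (noisy) count reaches `threshold`.
    """
    rng = random.Random(seed)
    used = defaultdict(int)   # queries already counted per user
    counts = Counter()
    for user, query in log:
        if used[user] >= max_per_user:
            continue          # cap each user's contribution
        used[user] += 1
        counts[query] += 1

    released = {}
    for query, count in counts.items():
        noisy = count + laplace_noise(noise_scale, rng) if noise_scale > 0 else count
        if noisy >= threshold:
            released[query] = noisy
    return released
```

With `noise_scale` set to zero the function reduces to plain thresholding, which makes the contribution cap easy to check by hand; a positive `noise_scale` perturbs both the inclusion decision and the published value.
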
Krishnaram Kenthapadi, Sunnyvale, CA US
Patent application number | Description | Published |
---|---|---|
20140196151 | PRESERVING GEOMETRIC PROPERTIES OF DATASETS WHILE PROTECTING PRIVACY - The privacy of a dataset is protected. A private dataset is received that includes multiple rows of multidimensional data. Each row may correspond to a user, and each dimension may be an attribute of the user. A projection matrix is applied to each row to generate a lower dimensional sketch of the row. Noise is added to each of the lower dimensional sketches. The sketches with the added noise may be published together with the projection matrix. The sketches preserve geometric relationships of the original dataset including clustering, distances, and nearest neighbor, and therefore may be useful for data mining purposes while still protecting the privacy of the users. | 07-10-2014 |
20140324982 | TOPIC IDENTIFIERS ASSOCIATED WITH GROUP CHATS - Text messages over some period of time are collected. Topic identifiers, such as hashtags, are extracted from the text messages. The text messages associated with each topic identifier are processed to identify which topic identifiers are associated with group chats based on information associated with the text messages such as the times when the text messages were generated and whether the text messages identify user accounts. The topic identifiers that are determined to be associated with the group chats are incorporated into applications that allow users to search for group chats, and to view text messages from past group chats. | 10-30-2014 |
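
The projection-plus-noise step described in application 20140196151 above can be sketched as follows. This is only an illustration under stated assumptions: the Gaussian projection matrix scaled by `1/sqrt(k)`, the Gaussian noise, and the names `make_projection`, `sketch_rows` and `noise_std` are choices made for the example and are not taken from the application.

```python
import math
import random


def make_projection(d, k, rng):
    # Assumed form: a k x d matrix of Gaussian entries scaled by 1/sqrt(k),
    # so that distances are roughly preserved in expectation.
    return [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(d)]
            for _ in range(k)]


def sketch_rows(rows, projection, noise_std, rng):
    """Project each d-dimensional row to k dimensions, then add noise."""
    sketches = []
    for row in rows:
        projected = [sum(p * x for p, x in zip(proj_row, row))
                     for proj_row in projection]
        # Independent noise is added to every coordinate of the sketch.
        sketches.append([v + rng.gauss(0.0, noise_std) for v in projected])
    return sketches
```

A caller would build one projection with `make_projection`, apply `sketch_rows` to the private dataset, and publish the noisy sketches together with the projection matrix, as the abstract describes.
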
Krishnaram G. Kenthapadi, Mountain View, CA US
Patent application number | Description | Published |
---|---|---|
20090073888 | DETERMINING QUALITY OF COMMUNICATION - A method, computer-readable medium, and system for providing a quality measurement based on communications within a communication application. Communication attributes that include information associated with a user's communications are obtained. In embodiments, such communication attributes may pertain to communication duration and communication frequency. Upon obtaining communication attributes, a quality measurement may be determined based on the communication attributes. Such a quality measurement provides an indication of the quality of the user's communications. In embodiments, the quality measurement may be stored, communicated to a user, or implemented within a communication application. | 03-19-2009 |
Krishnaram N. G. Kenthapadi, Mountain View, CA US
Patent application number | Description | Published |
---|---|---|
20090306996 | RATING COMPUTATION ON SOCIAL NETWORKS - A social network may be used to determine a rating of a user with no prior history. The ratings for unrated nodes may be inferred from the existing ratings of users associated with the unrated node in either or both the underlying social network or other social networks. Additionally, in some implementations, the effect of the rating of a rated node on an unrated node diminishes as the strength of their relationship decreases. In some cases, a social network may be modeled as an electrical network: ratings may be modeled as voltages on the nodes of the social network, relationships in the social network may be modeled as connections in the electrical network, and in some cases the strength of relationships may be modeled as conductance of the connections. Ratings for nodes may be determined using Kirchhoff's Law and in some cases by solving a set of linear equations or by propagating positive and negative ratings using a random walk with absorbing states. | 12-10-2009 |
20090313286 | GENERATING TRAINING DATA FROM CLICK LOGS - Data from a click log may be used to generate training data for a search engine. The pages clicked as well as the pages skipped by a user may be used to assess the relevance of a page to a query. Labels for training data may be generated based on data from the click log. The labels may pertain to the relevance of a page to a query. | 12-17-2009 |
20110145227 | DETERMINING PREFERENCES FROM USER QUERIES - A query may be received at a computing device through a network. One or more attribute values that are preferences for a subset of the one or more terms of the query may be identified by the computing device. One or more products or services having associated attributes that have values that match a subset of the identified attribute values may be identified by the computing device, and a subset of the identified products or services may be presented by the computing device through the network. Implementations may also identify latent preferences, that is, preferences that are found for a query even where such a preference is not explicitly part of a term or token of the query. | 06-16-2011 |
20110314012 | DETERMINING QUERY INTENT - A tree structure has a node associated with each category of a hierarchy of item categories. Child nodes of the tree are associated with sub-categories of the categories associated with parent nodes. Training data including received queries and indicators of a selected item category for each received query is combined with the tree structure by associating each query with the node corresponding to the selected category of the query. When a query is received, a classifier is applied to the nodes to generate a probability that the query is intended to match an item of the category associated with the node. The classifier is applied until the probability is below a threshold. One or more categories associated with the nodes that are closest to the intent of the received query are selected and indicators of items of those categories that match the received query are output. | 12-22-2011 |
20120226661 | INDEXING FOR LIMITED SEARCH SERVER AVAILABILITY - Documents are replicated among servers comprising a search engine based on the value of each document by approximating its value as one of the top search results for one or more exemplary queries. Documents are allocated among servers comprising a search engine by calculating a relevance value for each document and then distributing the documents evenly to the servers. A subset of servers is selected from among a plurality of servers comprising a search engine using term-based, server-specific histograms reflecting the number of instances of the term in each document allocated to each server; servers are then selected to service a query based on the documents on those servers. | 09-06-2012 |
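
The electrical-network view in application 20090306996 above can be illustrated with a small sketch. Under Kirchhoff's current law, the voltage at a node with no external source equals the conductance-weighted average of its neighbors' voltages, so unrated nodes can be filled in by repeatedly averaging. The function name `infer_ratings`, the edge-list input format, the fixed iteration count and the iterative solver are assumptions made for this example, not details from the application.

```python
def infer_ratings(edges, known, iterations=200):
    """Infer ratings for unrated nodes of a weighted social graph.

    edges: list of (u, v, conductance) with positive conductance,
           standing in for relationship strength (assumed format).
    known: dict mapping rated nodes to their fixed ratings.
    """
    neighbors = {}
    for u, v, g in edges:
        neighbors.setdefault(u, []).append((v, g))
        neighbors.setdefault(v, []).append((u, g))

    # Rated nodes keep their rating; unrated nodes start at 0.0.
    ratings = {node: known.get(node, 0.0) for node in neighbors}

    for _ in range(iterations):
        for node, adj in neighbors.items():
            if node in known:
                continue
            total = sum(g for _, g in adj)
            # Conductance-weighted average of the neighbors' ratings.
            ratings[node] = sum(g * ratings[n] for n, g in adj) / total
    return ratings
```

For example, an unrated node tied with conductance 3 to a node rated 1.0 and with conductance 1 to a node rated 0.0 settles at 0.75, so the stronger relationship dominates the inferred rating.
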