
Ye-Yi Wang, Redmond US

Ye-Yi Wang, Redmond, WA US

Patent application number | Description | Published
20080281806 | SEARCHING A DATABASE OF LISTINGS - A database having listings rather than long documents is searched using a term frequency-inverse document frequency (Tf/Idf) algorithm. | 11-13-2008
20080281827 | USING STRUCTURED DATABASE FOR WEBPAGE INFORMATION EXTRACTION - A structured database is used for webpage information extraction, and in particular, to obtain training data from the webpage for training a statistical model. The structured database has a plurality of entries, wherein each entry comprises a plurality of fields. One of the fields comprises a URL (uniform resource locator), while another field comprises information at least similar to other information to be located in a webpage associated with the URL. For at least some of the entries in the structured database, a webpage associated with the URL is retrieved. The webpage is analyzed and if information is found in the webpage similar to the information in the structured database, the webpage is identified as being suitable to be considered as a training sample. | 11-13-2008
20090019027 | Disambiguating residential listing search results - A directory assistance system includes a directory database and a search engine. The search engine is configured to search the directory database for a first set of residential listings based on at least one first search term. A second search term is received that is related to a cohabitant of the listing to be found. At least one search result is selected that satisfies the second search term. | 01-15-2009
20090037175 | Confidence measure generation for speech related searching - A voice search system has a speech recognizer, a search component, and a dialog manager. A confidence measure generator receives speech recognition features from the speech recognizer, search features from the search component, and dialog features from the dialog manager, and calculates an overall confidence measure for voice search results based upon the features received. The invention can be extended to include the generation of additional features, based on those received from the individual components of the voice search system. | 02-05-2009
20090150308 | MAXIMUM ENTROPY MODEL PARAMETERIZATION - Described is a technology by which a maximum entropy model used for classification is trained with a significantly lesser amount of training data than is normally used in training other maximum entropy models, yet provides similar accuracy to the others. The maximum entropy model is initially parameterized with parameter values determined from weights obtained by training a vector space model or an n-gram model. The weights may be scaled into the initial parameter values by determining a scaling factor. Gaussian mean values may also be determined, and used for regularization in training the maximum entropy model. Scaling may also be applied to the Gaussian mean values. After initial parameterization, training comprises using training data to iteratively adjust the initial parameters into adjusted parameters until convergence is determined. | 06-11-2009
20090276380 | COMPUTER-AIDED NATURAL LANGUAGE ANNOTATION - The present invention uses a natural language understanding system that is currently being trained to assist in annotating training data for training that natural language understanding system. Unannotated training data is provided to the system and the system proposes annotations to the training data. The user is offered an opportunity to confirm or correct the proposed annotations, and the system is trained with the corrected or verified annotations. | 11-05-2009
20090327260 | CONSTRUCTING A CLASSIFIER FOR CLASSIFYING QUERIES - To construct a classifier, a data structure correlating queries to items identified by the queries is received, where the data structure contains initial labeled queries that have been labeled with respect to predetermined classes, and unlabeled queries that have not been labeled with respect to the predetermined classes. The data structure is used to label at least some of the unlabeled queries with respect to the predetermined classes. Queries in the data structure that have been labeled with respect to the predetermined classes are used as training data to train the classifier. | 12-31-2009
20100145694 | REPLYING TO TEXT MESSAGES VIA AUTOMATED VOICE SEARCH TECHNIQUES - An automated “Voice Search Message Service” provides a voice-based user interface for generating text messages from an arbitrary speech input. Specifically, the Voice Search Message Service provides a voice-search information retrieval process that evaluates user speech inputs to select one or more probabilistic matches from a database of pre-defined or user-defined text messages. These probabilistic matches are also optionally sorted in terms of relevancy. A single text message from the probabilistic matches is then selected and automatically transmitted to one or more intended recipients. Optionally, one or more of the probabilistic matches are presented to the user for confirmation or selection prior to transmission. Correction or recovery of speech recognition errors is avoided since the probabilistic matches are intended to paraphrase the user speech input rather than exactly reproduce that speech, though exact matches are possible. Consequently, potential distractions to the user are significantly reduced relative to conventional speech recognition techniques. | 06-10-2010
20100169317 | Product or Service Review Summarization Using Attributes - Described is a technology in which product or service reviews are automatically processed to form a summary for each single product or service. Snippets from the reviews are extracted and classified into sentiment classes (e.g., as positive or negative) based on their wording. Attributes are assigned to the reviews, e.g., based on term frequency concepts, as nouns, which may be paired with adjectives and/or verbs. The summary of the reviews belonging to a single product or service is generated based on the automatically computed attributes and the classification of review snippets into attribute and sentiment classes. For example, the summary may indicate how many reviews were positive (the sentiment class), along with text corresponding to the most similar snippet based on its similarity to the attributes (the attribute class). | 07-01-2010
20100268725 | ACQUISITION OF SEMANTIC CLASS LEXICONS FOR QUERY TAGGING - A user's search experience may be enhanced by providing additional content based upon an understanding of the user's intent. Query tagging, the assigning of semantic labels to terms within a query, is one technique that may be utilized to determine the context of a user's search query. Accordingly, as provided herein, a query tagging model may be updated using one or more stratified lexicons. A list data structure (e.g., lists of phrases obtained from web pages) and seed distribution data (e.g., pre-labeled probability data) may be used by a graph learning technique to obtain an expanded set of phrases and their respective probabilities of corresponding with particular lexicons (e.g., semantic class lexicons). The expanded set of phrases may be used to group phrases into stratified lexicons. The stratified lexicons may be used as features for updating and/or executing the query tagging model. | 10-21-2010
