Patent application number | Description | Published |
20080235740 | System and method for storing advertising data - A computerized method is disclosed for presenting advertising data extracted from a video data stream, the method including storing a plurality of advertising data items extracted from the video data stream at an end user device; and displaying a plurality of sorted advertising indicator data items at the end user device, wherein each of the advertising indicator data items indicates one of the plurality of stored advertising data items. A system is disclosed for performing the method. A data structure is disclosed providing a functional and structural interrelationship between a processor in the system and data in the data structure. | 09-25-2008 |
20080300871 | METHOD AND APPARATUS FOR IDENTIFYING ACOUSTIC BACKGROUND ENVIRONMENTS TO ENHANCE AUTOMATIC SPEECH RECOGNITION - Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model. | 12-04-2008 |
20080300877 | SYSTEM AND METHOD FOR TRACKING FRAUDULENT ELECTRONIC TRANSACTIONS USING VOICEPRINTS - Disclosed are systems, methods, and computer readable media for comparing customer voice prints with a database of known fraudulent voice signatures and continually updating the database to decrease the risk of identity theft. The method embodiment comprises comparing a received voice signal against a database of known fraudulent voice signatures, denying the caller's transaction if the voice signal substantially matches the database of known fraudulent voice signatures, adding the caller's voice signal to the database of known fraudulent voice signatures if the voice signal does not substantially match a separate speaker verification database and received additional information is not verified. | 12-04-2008 |
20080319741 | SYSTEM AND METHOD FOR IMPROVING ROBUSTNESS OF SPEECH RECOGNITION USING VOCAL TRACT LENGTH NORMALIZATION CODEBOOKS - Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook. | 12-25-2008 |
20090076795 | System And Method Of Generating Responses To Text-Based Messages - In accordance with one aspect of the present invention, an automated method of and system for generating a response to a text-based natural language message is disclosed. The method includes identifying a sentence in the text-based natural language message. Also, identifying an input clause in the sentence. Further, comparing the input clause to a previously received clause, where the previously received clause is correlated with a previously generated response message. Additionally, generating an output response message based on the previously generated response message. The system includes means for performing the method steps. | 03-19-2009 |
20090112600 | SYSTEM AND METHOD FOR INCREASING ACCURACY OF SEARCHES BASED ON COMMUNITIES OF INTEREST - Disclosed are systems, methods and computer-readable media for using a local communication network to generate a speech model. The method includes retrieving for an individual a list of numbers in a calling history, identifying a local neighborhood associated with each number in the calling history, truncating the local neighborhood associated with each number based on at least one parameter, retrieving a local communication network associated with each number in the calling history and each phone number in the local neighborhood, and creating a language model for the individual based on the retrieved local communication network. The generated language model may be used for improved automatic speech recognition for audible searches as well as other modules in a spoken dialog system. | 04-30-2009 |
20090192838 | SYSTEM AND METHOD FOR OPTIMIZING RESPONSE HANDLING TIME AND CUSTOMER SATISFACTION SCORES - A system and method are disclosed for using and updating a database of template responses for a live agent in response to user communications. The method includes computing an average string distance between each response from a live agent and the template used to generate the response, modifying the computed average string distance based on a customer satisfaction score associated with each response, and selecting a response that minimizes the computed average string distance and maximizes customer satisfaction. Upon receiving a further communication on a certain issue, the system presents a prototype response that has been added to the template database to the live agent for use in generating a response to the further communication that reduces handling time and increases customer satisfaction. | 07-30-2009 |
20090234853 | Finding the website of a business using the business name - A system and method are provided for augmenting information on business directory databases. Using the business name contained in a business directory database and Web data mining technology, the website of a business is found and validated, prior to enriching the database entries. | 09-17-2009 |
20090254498 | System and method for identifying critical emails - Disclosed is a method and system for identifying critical emails. To identify critical emails, a critical email classifier is trained from training data comprising labeled emails. The classifier extracts N-grams from the training data and identifies N-gram features from the extracted N-grams. The classifier also extracts salient features from the training data. The classifier is trained based on the identified N-gram features and the salient features so that the classifier can classify unlabeled emails as critical emails or non-critical emails. | 10-08-2009 |
20090282114 | System and method for generating suggested responses to an email - Disclosed is a method and system for responding to a client email. A new client email is received and analyzed, and a response email is determined from the analyzing of the client email and from analysis of stored email-response pairs. | 11-12-2009 |
20100027767 | TRANSPARENT VOICE REGISTRATION AND VERIFICATION METHOD AND SYSTEM - Transparent voice registration of a party is provided in order to provide voice verification for communications with a service center. Verbal communication spoken by a party during interaction between the party and an agent of the service center is captured. A voice model associated with the captured communication is created and stored in order to provide voice verification during a subsequent call to the service center. When a requester contacts the service center, a comparison of the voice of the requester and a voice model of the person that the requester claims to be is performed, in order to verify the identity of the requester. Additionally, a voice model associated with a party is automatically updated after a subsequent communication between the party and the service center. | 02-04-2010 |
20100070360 | SYSTEM AND METHOD FOR CREATING A SPEECH SEARCH PLATFORM FOR COUPONS - Disclosed herein are systems, methods, and computer readable-media for creating a speech search platform for coupons. The method includes receiving coupons from vendors, generating indexing information about the received coupons for use with speech searches, integrating the received coupons and respective indexing information into a database accessible through a Representational State Transfer (REST) Application Programming Interface (API) as part of a speech search platform for coupons, receiving from a user a natural language query through the speech search platform for coupons, identifying coupons in the database which match the natural language query based on location and a user profile, and transmitting the identified coupons to the user. The method can further include modifying the REST API to include coupon-specific parameters. Identified coupons can be transmitted to the consumer by notifying a coupon issuer that the user is entitled to a discount. | 03-18-2010 |
20100070378 | SYSTEM AND METHOD FOR AN ENHANCED SHOPPING EXPERIENCE - Disclosed herein are systems, methods, and computer readable-media for creating a virtual shopping area. The method includes receiving a query from a user and an automated input specific to the user from a computing device, generating a list of merchants based on the query and the automated input, generating a virtual shopping area from the list of merchants and based on one or more constraints, and displaying the virtual shopping area on the computing device. One optional step is presenting to the user an interface to purchase query-related items from merchants in the virtual shopping area. The method optionally includes receiving an indication of intent to purchase an item from the user, displaying an image of the item to the user, and dynamically updating the displayed image of the item as the user specifies item-specific details. The list of merchants can be restricted to merchants geographically close to the user. | 03-18-2010 |
20100125457 | SYSTEM AND METHOD FOR DISCRIMINATIVE PRONUNCIATION MODELING FOR VOICE SEARCH - Disclosed herein are systems, computer-implemented methods, and computer-readable media for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by (1) identifying word and phone alignments and corresponding likelihood scores, and (2) discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information (MMI), maximum likelihood (MLE) training, minimum classification error (MCE) training, or other functions known to those of skill in the art. Speech utterances can be names. The speech utterances can be received as part of a multimodal search or input. The step of discriminatively adapting pronunciation weights can further include stochastically modeling pronunciations. | 05-20-2010 |
20100145704 | SYSTEM AND METHOD FOR INCREASING RECOGNITION RATES OF IN-VOCABULARY WORDS BY IMPROVING PRONUNCIATION MODELING - Disclosed herein are systems, methods, and computer readable-media for generating a lexicon for use with speech recognition. The method includes receiving symbolic input as labeled speech data, overgenerating potential pronunciations based on the symbolic input, identifying best potential pronunciations in a speech recognition context, and storing the identified best potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio. | 06-10-2010 |
20100217580 | On-Demand Language Translation for Television Programs - A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal. The video signal including the translated information in the target language may be sent to a display device. | 08-26-2010 |
20100324893 | SYSTEM AND METHOD FOR IMPROVING ROBUSTNESS OF SPEECH RECOGNITION USING VOCAL TRACT LENGTH NORMALIZATION CODEBOOKS - Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook. | 12-23-2010 |
20110022379 | On-Demand Language Translation for Television Programs - In an embodiment, a method of providing an on demand translation service is provided. A subscriber may be charged a reduced fee or no fee for use of the on demand translation service in exchange for displaying commercial messages to the subscriber, the commercial messages being selected based on subscriber information. A multimedia signal including information in a source language may be received. The information may be obtained as text in the source language from the multimedia signal. The text may be translated from the source language to a target language. Translated information, based on the translated text, may be transmitted to a processing device for presentation to the subscriber. The received multimedia signal may be sent to a multimedia device for viewing. | 01-27-2011 |
20110137648 | SYSTEM AND METHOD FOR IMPROVED AUTOMATIC SPEECH RECOGNITION PERFORMANCE - Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer. A scheduling algorithm can tailor a particular combination of speech recognition resources and release the supplemental speech recognizer based on increased demand. | 06-09-2011 |
20110137653 | SYSTEM AND METHOD FOR RESTRICTING LARGE LANGUAGE MODELS - Disclosed herein are systems, methods, and computer-readable storage media for performing speech recognition based on a masked language model. A system configured to practice the method receives a masked language model including a plurality of words, wherein a bit mask identifies whether each of the plurality of words is allowed or disallowed with regard to an adaptation subset, receives input speech, generates a speech recognition lattice based on the received input speech using the masked language model, removes from the generated lattice words identified as disallowed by the bit mask for the adaptation subset, and recognizes the received speech based on the lattice. Alternatively during the generation step, the system can only add words indicated as allowed by the bit mask. The bit mask can be separate from or incorporated as part of the masked language model. The system can dynamically update the adaptation subset and bit mask. | 06-09-2011 |
20110184973 | METHOD AND APPARATUS FOR DETECTING AND EXTRACTING INFORMATION FROM DYNAMICALLY GENERATED WEB PAGES - A method and apparatus for automatically detecting and extracting information from dynamically generated web pages are disclosed. For example, the present method stores user provided information that is entered into a form interface of a web page for a first query. Responsive to the first query, a first response web page is received and stored. The present method then automatically generates a second query to acquire a second response web page that is responsive to the second query. Finally, the present method compares the first response web page and the second response web page. In one embodiment, the present invention extracts information that is dissimilar between the first response web page and the second response web page. This extracted information is deemed to be the pertinent information requested by the user. | 07-28-2011 |
20110246184 | SYSTEM AND METHOD FOR INCREASING ACCURACY OF SEARCHES BASED ON COMMUNICATION NETWORK - Disclosed are systems, methods and computer-readable media for using a local communication network to generate a speech model. The method includes retrieving for an individual a list of numbers in a calling history, identifying a local neighborhood associated with each number in the calling history, truncating the local neighborhood associated with each number based on at least one parameter, retrieving a local communication network associated with each number in the calling history and each phone number in the local neighborhood, and creating a language model for the individual based on the retrieved local communication network. The generated language model may be used for improved automatic speech recognition for audible searches as well as other modules in a spoken dialog system. | 10-06-2011 |
20110258531 | Method and Apparatus for Building Sales Tools by Mining Data from Websites - A website mining tool is disclosed that extracts information from, for example, a company's website and presents the extracted information in a graphical user interface (GUI). In one embodiment, web pages from a website are stored in, for example, computer memory and a structure of the web pages is identified. A plurality of blocks of information is then extracted as a function of this structure and a category is assigned to each block of information. The elements in the blocks of information are then displayed, for example to a salesperson, as a function of these categories. In another embodiment, Document Object Modeling parsing is used to identify the structure of the web pages. In yet another embodiment, a support vector machine is used to categorize each block of information. | 10-20-2011 |
20110321098 | System and Method for Automatic Identification of Key Phrases during a Multimedia Broadcast - An Internet Protocol television system includes a user profile agent, a keyword detection agent, and an information search agent. The user profile agent is in communication with a multimedia device, and generates a user profile based on information received from the multimedia device. The keyword detection agent is in communication with the user profile agent, and searches text associated with a multimedia video stream transmitted to the multimedia device for keywords associated with the user profile. The information search agent is in communication with the keyword detection agent, and connects to an information source associated with the keywords detected by the keyword detection agent, and provides additional information associated with the keywords to the multimedia device. | 12-29-2011 |
20120022950 | Systems and Methods for Targeted Advertising in Voicemail to Text Systems - Systems and methods are provided for a voice message to text system supporting targeted advertisements. Voice messages received from users are converted to raw text messages that are normalized to insert proper punctuation and extract entity information. The normalized text and entity information are processed to extract concepts, such as critical phrases, from the normalized text. Extracted concepts are then matched to advertisements on an advertisement database having user selection criteria. Advertisements having selection criteria matching the extracted concepts are transmitted to the users, and the advertisers that placed the advertisements are charged fees for the advertisements. User profile information and user context information can additionally be used to select advertisements for transmission to users. | 01-26-2012 |
20120065963 | System And Method Of Generating Responses To Text-Based Messages - In accordance with one aspect of the present invention, an automated method of and system for generating a response to a text-based natural language message is disclosed. The method includes identifying a first selected input clause in a sentence in the text-based natural language message. Also, assigning a semantic tag to the first selected input clause and matching the semantic tag to a historical input tag. The historical input tag is associated with a first previously generated response clause. Further, generating an output response message based on the historical response clause, the output response message derived from the historical input tag and a second previously generated response clause. The system includes means for performing the method steps. | 03-15-2012 |
20120078617 | System and Method for Increasing Recognition Rates of In-Vocabulary Words By Improving Pronunciation Modeling - The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes receiving symbolic input as labeled speech data, overgenerating potential pronunciations based on the symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio. | 03-29-2012 |
20120084081 | SYSTEM AND METHOD FOR PERFORMING SPEECH ANALYTICS - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing trend analysis of speech. A system practicing the method receives a speech trend analysis request having candidate feature constraints, an objective function with respect to a speech trend to be analyzed, and a set of speech record constraints. The system selects a subset of speech records from the group of speech records based on the set of speech record constraints to yield selected speech records, identifies features in the selected speech records based on the set of candidate feature constraints to yield identified features, and assigns a weight to each of the identified features based on the objective function. Then the system ranks the identified features by their respective weights to yield ranked identified features, and outputs at least one of the ranked identified features associated with a speech-based trend in response to the speech trend analysis request. | 04-05-2012 |
20120084086 | SYSTEM AND METHOD FOR OPEN SPEECH RECOGNITION - Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination. | 04-05-2012 |
20120101817 | SYSTEM AND METHOD FOR GENERATING MODELS FOR USE IN AUTOMATIC SPEECH RECOGNITION - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a model for use with automatic speech recognition. These principles can be implemented as part of a streamlined tool for automatic training and tuning of speech, or other, models with a fast turnaround and with limited human involvement. A system configured to practice the method receives, as part of a request to generate a model, input data and a seed model. The system receives a cost function indicating accuracy and at least one of speed and memory usage. The system processes the input data based on the seed model and based on parameters that optimize the cost function to yield an updated model, and outputs the updated model. | 04-26-2012 |
20120203547 | SYSTEM AND METHOD FOR IMPROVING ROBUSTNESS OF SPEECH RECOGNITION USING VOCAL TRACT LENGTH NORMALIZATION CODEBOOKS - Disclosed are systems, methods, and computer readable media for performing speech recognition. The method embodiment comprises (1) selecting a codebook from a plurality of codebooks with a minimal acoustic distance to a received speech sample, the plurality of codebooks generated by a process of (a) computing a vocal tract length for each of a plurality of speakers, (b) for each of the plurality of speakers, clustering speech vectors, and (c) creating a codebook for each speaker, the codebook containing entries for the respective speaker's vocal tract length, speech vectors, and an optional vector weight for each speech vector, (2) applying the respective vocal tract length associated with the selected codebook to normalize the received speech sample for use in speech recognition, and (3) recognizing the received speech sample based on the respective vocal tract length associated with the selected codebook. | 08-09-2012 |
20120221337 | METHOD AND APPARATUS FOR PREDICTING WORD ACCURACY IN AUTOMATIC SPEECH RECOGNITION SYSTEMS - The invention comprises a method and apparatus for predicting word accuracy. Specifically, the method comprises obtaining an utterance in speech data where the utterance comprises an actual word string, processing the utterance for generating an interpretation of the actual word string, processing the utterance to identify at least one utterance frame, and predicting a word accuracy associated with the interpretation according to at least one stationary signal-to-noise ratio and at least one non-stationary signal to noise ratio, wherein the at least one stationary signal-to-noise ratio and the at least one non-stationary signal to noise ratio are determined according to a frame energy associated with each of the at least one utterance frame. | 08-30-2012 |
20120253799 | SYSTEM AND METHOD FOR RAPID CUSTOMIZATION OF SPEECH RECOGNITION MODELS - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating domain-specific speech recognition models for a domain of interest by combining and tuning existing speech recognition models when a speech recognizer does not have access to a speech recognition model for that domain of interest and when available domain-specific data is below a minimum desired threshold to create a new domain-specific speech recognition model. A system configured to practice the method identifies a speech recognition domain and combines a set of speech recognition models, each speech recognition model of the set of speech recognition models being from a respective speech recognition domain. The system receives an amount of data specific to the speech recognition domain, wherein the amount of data is less than a minimum threshold to create a new domain-specific model, and tunes the combined speech recognition model for the speech recognition domain based on the data. | 10-04-2012 |
20120271898 | SYSTEM AND METHOD FOR OPTIMIZING RESPONSE HANDLING TIME AND CUSTOMER SATISFACTION SCORES - A system and method are disclosed for using and updating a database of template responses for a live agent in response to user communications. The method includes computing an average string distance between each response from a live agent and the template used to generate the response, modifying the computed average string distance based on a customer satisfaction score associated with each response, and selecting a response that minimizes the computed average string distance and maximizes customer satisfaction. Upon receiving a further communication on a certain issue, the system presents a prototype response that has been added to the template database to the live agent for use in generating a response to the further communication that reduces handling time and increases customer satisfaction. | 10-25-2012 |
20120278361 | USING WEB-MINING TO ENRICH DIRECTORY SERVICE DATABASES AND SOLICITING SERVICE SUBSCRIPTIONS - A system and method are provided for augmenting information on business directory databases and communicating with businesses. Using the enriched business directory database and Web mining technology, customized email messages are sent inviting businesses to enter their enriched business information into the directory or even subscribe to other paid services provided by the directory service. | 11-01-2012 |
20120290298 | SYSTEM AND METHOD FOR OPTIMIZING SPEECH RECOGNITION AND NATURAL LANGUAGE PARAMETERS WITH USER FEEDBACK - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for assigning saliency weights to words of an ASR model. The saliency values assigned to words within an ASR model are based on human perception judgments of previous transcripts. These saliency values are applied as weights to modify an ASR model such that the results of the weighted ASR model in converting a spoken document to a transcript provide a more accurate and useful transcription to the user. | 11-15-2012 |
20120290435 | SYSTEM AND METHOD FOR AN ENHANCED SHOPPING EXPERIENCE - Disclosed herein are systems, methods, and computer-readable media for creating a virtual shopping area. The method includes receiving a query from a user and an automated input specific to the user from a computing device, generating a list of merchants based on the query and the automated input, generating a virtual shopping area from the list of merchants and based on one or more constraints, and displaying the virtual shopping area on the computing device. One optional step is presenting to the user an interface to purchase query-related items from merchants in the virtual shopping area. The method optionally includes receiving an indication of intent to purchase an item from the user, displaying an image of the item to the user, and dynamically updating the displayed image of the item as the user specifies item-specific details. The list of merchants can be restricted to merchants geographically close to the user. | 11-15-2012 |
20120304239 | Multimodal Portable Communication Interface for Accessing Video Content - A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input. | 11-29-2012 |
20130035932 | SYSTEM AND METHOD OF GENERATING RESPONSES TO TEXT-BASED MESSAGES - A system to generate a response to a text-based natural language message includes a user interface, processing device, and a computer-readable storage medium storing executable instructions to generate the response to the text-based natural language message. The instructions and a method for generating the response include identifying a sentence in the text-based natural language message, identifying an input clause in the sentence, and parsing the input clause, thereby defining a relationship between words in the input clause. The instructions and method also include assigning a semantic tag to the parsed input clause, comparing the input clause to a previously received clause, the previously received clause being correlated with a previously generated response clause, and generating an output response message derived from the previously generated response clause. | 02-07-2013 |
20130035939 | System and Method for Discriminative Pronunciation Modeling for Voice Search - Disclosed herein is a method for speech recognition. The method includes receiving speech utterances, assigning a pronunciation weight to each unit of speech in the speech utterances, each respective pronunciation weight being normalized at a unit of speech level to sum to 1, for each received speech utterance, optimizing the pronunciation weight by identifying word and phone alignments and corresponding likelihood scores, and discriminatively adapting the pronunciation weight to minimize classification errors, and recognizing additional received speech utterances using the optimized pronunciation weights. A unit of speech can be a sentence, a word, a context-dependent phone, a context-independent phone, or a syllable. The method can further include discriminatively adapting pronunciation weights based on an objective function. The objective function can be maximum mutual information, maximum likelihood training, minimum classification error training, or other functions known to those of skill in the art. | 02-07-2013 |
20130090925 | SYSTEM AND METHOD FOR SUPPLEMENTAL SPEECH RECOGNITION BY IDENTIFIED IDLE RESOURCES - Disclosed herein are systems, methods, and computer-readable storage media for improving automatic speech recognition performance. A system practicing the method identifies idle speech recognition resources and establishes a supplemental speech recognizer on the idle resources based on overall speech recognition demand. The supplemental speech recognizer can differ from a main speech recognizer, and, along with the main speech recognizer, can be associated with a particular speaker. The system performs speech recognition on speech received from the particular speaker in parallel with the main speech recognizer and the supplemental speech recognizer and combines results from the main and supplemental speech recognizer. The system recognizes the received speech based on the combined results. The system can use beam adjustment in place of or in combination with a supplemental speech recognizer. A scheduling algorithm can tailor a particular combination of speech recognition resources and release the supplemental speech recognizer based on increased demand. | 04-11-2013 |
20130097264 | SYSTEM AND METHOD FOR OPTIMIZING RESPONSE HANDLING TIME AND CUSTOMER SATISFACTION SCORES - A system and method are disclosed for using and updating a database of template responses for a live agent in response to user communications. The method includes computing an average string distance between each response from a live agent and a template used to generate the response, modifying the computed average string distance based on a customer satisfaction score associated with each response, and selecting a response that minimizes the computed average string distance and maximizes customer satisfaction. Upon receiving a further communication on a certain issue, the system presents a prototype response that has been added to the template database to the live agent for use in generating a response to the further communication that reduces handling time and increases customer satisfaction. | 04-18-2013 |
20130156166 | TRANSPARENT VOICE REGISTRATION AND VERIFICATION METHOD AND SYSTEM - A method includes receiving a communication from a party at a voice response system and capturing verbal communication spoken by the party. Then a processor creates a voice model associated with the party, the voice model being created by processing the captured verbal communication spoken by the party. The creation of the voice model is imperceptible to the party. The voice model is then stored to provide voice verification of the party during a subsequent communication. | 06-20-2013 |
20130159828 | Method and Apparatus for Building Sales Tools by Mining Data from Websites - A website mining tool is disclosed that extracts information from, for example, a company's website and presents the extracted information in a graphical user interface (GUI). In one embodiment, web pages from a website are stored in, for example, computer memory and a structure of the web pages is identified. A plurality of blocks of information is then extracted as a function of this structure and a category is assigned to each block of information. The elements in the blocks of information are then displayed, for example to a salesperson, as a function of these categories. In another embodiment, Document Object Model (DOM) parsing is used to identify the structure of the web pages. In yet another embodiment, a support vector machine is used to categorize each block of information. | 06-20-2013 |
20130177893 | Method and Apparatus for Responding to an Inquiry - Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry. | 07-11-2013 |
20130212468 | Alert Driven Interactive Interface to a Website Mining System - Disclosed is a web server that includes a headlines module for automatically generating headlines based on data retrieved from a network (e.g., World Wide Web). The web server also includes an interactive agent for generating responses to inquiries relating to the headlines based on the data. | 08-15-2013 |
20130305301 | Multimodal Portable Communication Interface for Accessing Video Content - A portable communication device has a touch screen display that receives tactile input and a microphone that receives audio input. The portable communication device initiates a query for media based at least in part on tactile input and audio input. The touch screen display is a multi-touch screen. The portable communication device sends an initiated query and receives a text response indicative of a speech to text conversion of the query. The portable communication device then displays video in response to tactile input and audio input. | 11-14-2013 |
20130311170 | Methods and Systems for Natural Language Understanding Using Human Knowledge and Collected Data - Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data. | 11-21-2013 |
20130346086 | Method and System for Providing an Automated Web Transcription Service - A system, method, and computer-readable medium that provide an automated web transcription service are disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page. | 12-26-2013 |
20140053171 | On-Demand Language Translation for Television Programs - A method, a system and a machine-readable medium are provided for an on demand translation service. A translation module including at least one language pair module for translating a source language to a target language may be made available for use by a subscriber. The subscriber may be charged a fee for use of the requested on demand translation service or may be provided use of the on demand translation service for free in exchange for displaying commercial messages to the subscriber. A video signal may be received including information in the source language, which may be obtained as text from the video signal and may be translated from the source language to the target language by use of the translation module. Translated information, based on the translated text, may be added into the received video signal. | 02-20-2014 |
20140205985 | Method and Apparatus for Responding to an Inquiry - Disclosed is a method and apparatus for responding to an inquiry from a client via a network. The method and apparatus receive the inquiry from a client via a network. Based on the inquiry, question-answer pairs retrieved from the network are analyzed to determine a response to the inquiry. The QA pairs are not predefined. As a result, the QA pairs have to be analyzed in order to determine whether they are responsive to a particular inquiry. Questions of the QA pairs may be repetitive and, without more, will not be useful in determining whether their corresponding answer responds to an inquiry. | 07-24-2014 |
20140303972 | Method and Apparatus for Identifying Acoustic Background Environments Based on Time and Speed to Enhance Automatic Speech Recognition - Disclosed are systems, methods, and computer readable media for identifying an acoustic environment of a caller. The method embodiment comprises analyzing acoustic features of a received audio signal from a caller, receiving meta-data information, classifying a background environment of the caller based on the analyzed acoustic features and the meta-data, selecting an acoustic model matched to the classified background environment from a plurality of acoustic models, and performing speech recognition on the received audio signal using the selected acoustic model. | 10-09-2014 |
20140316780 | METHOD AND SYSTEM FOR PROVIDING AN AUTOMATED WEB TRANSCRIPTION SERVICE - A system, method, and computer-readable medium that provide an automated web transcription service are disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page. | 10-23-2014 |
20140330555 | Methods and Systems for Natural Language Understanding Using Human Knowledge and Collected Data - Disclosed herein are systems and methods to incorporate human knowledge when developing and using statistical models for natural language understanding. The disclosed systems and methods embrace a data-driven approach to natural language understanding which progresses seamlessly along the continuum of availability of annotated collected data, from when there is no available annotated collected data to when there is any amount of annotated collected data. | 11-06-2014 |
20140350915 | On-Demand Language Translation for Television Programs - In an embodiment, a method of providing an on demand translation service is provided. A subscriber may be charged a reduced fee or no fee for use of the on demand translation service in exchange for displaying commercial messages to the subscriber, the commercial messages being selected based on subscriber information. A multimedia signal including information in a source language may be received. The information may be obtained as text in the source language from the multimedia signal. The text may be translated from the source language to a target language. Translated information, based on the translated text, may be transmitted to a processing device for presentation to the subscriber. The received multimedia signal may be sent to a multimedia device for viewing. | 11-27-2014 |
20140358537 | System and Method for Combining Speech Recognition Outputs From a Plurality of Domain-Specific Speech Recognizers Via Machine Learning - Disclosed herein are systems, methods and non-transitory computer-readable media for performing speech recognition across different applications or environments without model customization or prior knowledge of the domain of the received speech. The disclosure includes recognizing received speech with a collection of domain-specific speech recognizers, determining a speech recognition confidence for each of the speech recognition outputs, selecting speech recognition candidates based on a respective speech recognition confidence for each speech recognition output, and combining selected speech recognition candidates to generate text based on the combination. | 12-04-2014 |
20150055764 | TRANSPARENT VOICE REGISTRATION AND VERIFICATION METHOD AND SYSTEM - A method includes receiving a communication from a party at a voice response system and capturing speech spoken by the party during the communication. Then a processor creates a voice model of the party, the voice model being created by processing the speech, without notifying the party. The voice model is then stored to provide voice verification during a subsequent communication. | 02-26-2015 |
20150073797 | System and Method for Increasing Recognition Rates of In-Vocabulary Words By Improving Pronunciation Modeling - The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes overgenerating potential pronunciations based on symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio. | 03-12-2015 |
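
The overgeneration step described in entry 20150073797 (conversion rules for short letter sequences, expanded into a list of phoneme lists) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the application: the `RULES` table, the phoneme symbols, the greedy longest-match segmentation, and the function names `segment` and `overgenerate`. The later steps the abstract mentions (identifying pronunciations in a speech recognition context and iteratively retraining the rules) are not modeled.

```python
from itertools import product

# Hypothetical conversion rules: each short letter sequence maps to one or
# more candidate phoneme strings (space-separated; "" means silent).
RULES = {
    "ph": ["f", "p h"],
    "o": ["ow", "aa"],
    "n": ["n"],
    "e": ["", "iy"],
}


def segment(word, rules):
    """Split a word into rule keys using greedy longest-match (an assumption)."""
    max_len = max(len(key) for key in rules)
    parts = []
    i = 0
    while i < len(word):
        for size in range(min(max_len, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                parts.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no rule covers {word[i]!r}")
    return parts


def overgenerate(word, rules=RULES):
    """Return every pronunciation variant the rules allow, as phoneme tuples."""
    parts = segment(word.lower(), rules)
    variants = set()
    # Take one candidate per segment, in every possible combination.
    for combo in product(*(rules[part] for part in parts)):
        phones = tuple(ph for piece in combo for ph in piece.split())
        variants.add(phones)
    return sorted(variants)


if __name__ == "__main__":
    for variant in overgenerate("phone"):
        print(" ".join(variant))
```

The result is deliberately larger than the set of plausible pronunciations; in the approach the abstract describes, a later recognition-based step would select among the candidates before they are stored in the lexicon.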