Class / Patent application number | Description | Number of patent applications / Date published |
704239000 | Similarity | 46 |
20090112586 | SYSTEM AND METHOD OF EVALUATING USER SIMULATIONS IN A SPOKEN DIALOG SYSTEM WITH A DIVERGENCE METRIC - Systems, methods and computer-readable media for using a divergence metric to evaluate user simulations in a spoken dialog system. The method employs user simulations of a spoken dialog system and includes aggregating a first set of one or more scores from a real user dialog, aggregating a second set of one or more scores from a simulated user dialog associated with a user model, and determining a similarity of the distributions associated with the first set and the second set, wherein the similarity is determined using a divergence metric that does not require any assumptions regarding the shape of the distributions. It is preferable to use the Cramér-von Mises divergence. | 04-30-2009 |
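The divergence the abstract above prefers can be computed directly from empirical CDFs, with no parametric assumption about either score distribution. Below is a minimal illustrative sketch (not the patented method; function names are invented) of a two-sample Cramér-von Mises divergence:

```python
from bisect import bisect_right

def ecdf(sorted_sample, x):
    """Empirical CDF of a sorted sample, evaluated at x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)

def cramer_von_mises(scores_real, scores_sim):
    """Two-sample Cramér-von Mises divergence: the mean squared
    difference between the two empirical CDFs over the pooled sample.
    No assumption is made about the shape of either distribution."""
    a, b = sorted(scores_real), sorted(scores_sim)
    pooled = a + b
    return sum((ecdf(a, x) - ecdf(b, x)) ** 2 for x in pooled) / len(pooled)
```

A user model whose simulated-dialog scores yield a small divergence from the real-user scores would, under this scheme, be judged the better simulation.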
20090240498 | SIMILARITY MEASURES FOR SHORT SEGMENTS OF TEXT - Systems and methods to perform similarity measures over short text segments. Illustratively, a short text segment similarity environment comprises a short text engine operative to process data representative of short segments of text, and an instruction set comprising at least one instruction to instruct the short text engine to process short text segment inputs according to a selected short text similarity identification paradigm. Illustratively, the short text engine can receive as input two or more short text segments along with a request to identify similarities among them. Responsive to the request and data input, the short text engine executes a selected similarity identification technique in accordance with the short text similarity identification paradigm to process the received data and identify similarities between the short text segment inputs. | 09-24-2009 |
20090271195 | SPEECH RECOGNITION APPARATUS, SPEECH RECOGNITION METHOD, AND SPEECH RECOGNITION PROGRAM - A speech recognition apparatus capable of attaining high recognition accuracy within practical processing time on a computing machine of standard performance, by appropriately adapting a language model to a speech about a certain topic, irrespective of the degree of detail and diversity of the topic and irrespective of the confidence score of an initial speech recognition result. The speech recognition apparatus includes hierarchical language model storage means for storing a plurality of language models structured hierarchically, text-model similarity calculation means for calculating a similarity between a tentative recognition result for an input speech and each of the language models, recognition result confidence score calculation means for calculating a confidence score of the recognition result, topic estimation means for selecting at least one of the language models based on the similarity, the confidence score, and the depth of the hierarchy to which each of the language models belongs, and topic adaptation means for mixing the language models selected by the topic estimation means to create one language model. | 10-29-2009 |
20100121639 | Speech Processing - The described implementations relate to speech spelling by a user. One method identifies one or more symbols that may match a user utterance and displays an individual symbol for confirmation by the user. | 05-13-2010 |
20110035219 | AUTOMATIC SPOKEN LANGUAGE IDENTIFICATION BASED ON PHONEME SEQUENCE PATTERNS - A language identification system that includes a universal phoneme decoder (UPD) is described. The UPD contains a universal phoneme set that 1) represents all phonemes occurring in the set of two or more spoken languages, and 2) captures phoneme correspondences across languages, such that a set of unique phoneme patterns and probabilities is calculated in order to identify the most likely phoneme occurring at each point in the audio files, for each of the two or more potential languages on which the UPD was trained. Each statistical language model (SLM) uses the set of unique phoneme patterns created for its language to distinguish between the spoken human languages in the set. The run-time language identifier module identifies the particular human language being spoken by utilizing the linguistic probabilities supplied by the one or more SLMs that are based on the set of unique phoneme patterns created for each language. | 02-10-2011 |
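The scoring stage the abstract above describes can be illustrated with a toy phoneme-bigram SLM per language: decode the audio into phonemes, score the sequence under each language's model, and pick the best. This is a hedged sketch only (add-one smoothing and all names are invented here, not taken from the patent):

```python
from collections import Counter
import math

def train_slm(phoneme_seqs):
    """Count phoneme unigrams and bigrams for one language's SLM."""
    uni, bi = Counter(), Counter()
    for seq in phoneme_seqs:
        uni.update(seq)
        bi.update(zip(seq, seq[1:]))
    return uni, bi

def slm_score(seq, slm, vocab_size=50):
    """Add-one-smoothed bigram log-probability of a phoneme sequence."""
    uni, bi = slm
    return sum(math.log((bi[(a, b)] + 1) / (uni[a] + vocab_size))
               for a, b in zip(seq, seq[1:]))

def identify_language(seq, slms):
    """Pick the language whose SLM assigns the decoded phonemes
    the highest probability."""
    return max(slms, key=lambda lang: slm_score(seq, slms[lang]))
```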
20110071828 | SYSTEM AND METHOD OF SPEECH DISCRIMINABILITY ASSESSMENT, AND COMPUTER PROGRAM THEREOF - A speech discriminability assessment system includes: a biological signal measurement section for measuring an electroencephalogram signal of a user; a presented-speech sound control section for determining a speech sound to be presented to the user by referring to a speech sound database retaining a plurality of monosyllabic sound data; an audio presentation section for presenting an audio associated with the determined speech sound to the user; a character presentation section for presenting a character associated with the determined speech sound to the user, subsequent to the presentation of the audio by the audio presentation section; an unexpectedness detection section for detecting presence or absence of an unexpectedness signal from the measured electroencephalogram signal of the user, the unexpectedness signal representing a positive component at 600 ms±100 ms after a time point when the character was presented to the user; and a speech sound discriminability determination section for determining a speech sound discriminability based on a result of detection by the unexpectedness detection section. | 03-24-2011 |
20110202341 | GRAMMAR WEIGHTING VOICE RECOGNITION INFORMATION - A device receives a voice recognition statistic from a voice recognition application and applies a grammar improvement rule based on the voice recognition statistic. The device also automatically adjusts a weight of the voice recognition statistic based on the grammar improvement rule, and outputs the weight adjusted voice recognition statistic for use in the voice recognition application. | 08-18-2011 |
20120191453 | SYSTEM AND METHODS FOR MATCHING AN UTTERANCE TO A TEMPLATE HIERARCHY - A system and methods for matching at least one word of an utterance against a set of template hierarchies to select the best matching template or set of templates corresponding to the utterance. The system and methods determine at least one exact, inexact, or partial match between the at least one word of the utterance and at least one term within the template hierarchy to select and populate a template or set of templates corresponding to the utterance. The populated template or set of templates may then be used to generate a narrative template or a report template. | 07-26-2012 |
20120232899 | SYSTEM AND METHOD FOR IDENTIFICATION OF A SPEAKER BY PHONOGRAMS OF SPONTANEOUS ORAL SPEECH AND BY USING FORMANT EQUALIZATION - A system and method for identification of a speaker by phonograms of oral speech is disclosed. Similarity between a first phonogram of the speaker and a second, or sample, phonogram is evaluated by matching formant frequencies in referential utterances of a speech signal, where the utterances for comparison are selected from the first phonogram and the second phonogram. Referential utterances of speech signals are selected from the first phonogram and the second phonogram, where the referential utterances include formant paths of at least three formant frequencies. The selected referential utterances including at least two identical formant frequencies are compared therebetween. Similarity of the compared referential utterances from matching other formant frequencies is evaluated, where similarity of the phonograms is determined from evaluation of similarity of all the compared referential utterances. | 09-13-2012 |
20120232900 | SPEAKER RECOGNITION FROM TELEPHONE CALLS - The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to the at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for the at least one target speaker to obtain at least one comparison result; and determining whether one of the at least one unknown speakers is identical with the at least one target speaker based on the at least one comparison result. | 09-13-2012 |
20120239398 | SPEAKER VERIFICATION METHODS AND APPARATUS - In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print is provided. The method comprises acts of performing a first verification stage comprising comparing a first voice signal from the speaker uttering at least one first challenge utterance with at least a portion of the voice print, and performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user. The second verification stage comprises adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, and comparing a second voice signal from the speaker uttering at least one second challenge utterance with at least a portion of the adapted voice print. | 09-20-2012 |
20120259637 | METHOD AND APPARATUS FOR RECEIVING AUDIO - An electronic apparatus and method for retrieving a song, and a storage medium. The electronic apparatus includes: a storage unit which stores a plurality of songs; a user input unit which receives a hummed query which is inputted for retrieving a song; and a song retrieving unit which retrieves a song based on the hummed query from among the plurality of stored songs when the hummed query is received. The song retrieving unit extracts a pitch and a duration of the hummed query, converts each of the extracted pitch and duration into multi-level symbols, calculates a string edit distance between the hummed query and one of the plurality of songs based on the symbols, and determines a similarity between the hummed query and a song based on edit operations which are performed within the calculated string edit distance. | 10-11-2012 |
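The query-by-humming pipeline above (quantize pitch into multi-level symbols, then compare symbol strings by edit distance) can be sketched as follows. This is an illustrative simplification, not the patented implementation; the three-level U/D/S alphabet is an assumption:

```python
def pitch_symbols(pitches):
    """Quantize a pitch contour into coarse multi-level symbols:
    U (up), D (down), S (same) relative to the previous note."""
    return ''.join('U' if cur > prev else 'D' if cur < prev else 'S'
                   for prev, cur in zip(pitches, pitches[1:]))

def edit_distance(query, song):
    """Classic dynamic-programming string edit distance (Levenshtein):
    minimum number of insertions, deletions, and substitutions."""
    m, n = len(query), len(song)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if query[i - 1] == song[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]
```

A retrieval loop would then rank the stored songs by the distance between the hummed query's symbol string and each song's symbol string.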
20120330662 | INPUT SUPPORTING SYSTEM, METHOD AND PROGRAM - An input supporting system. | 12-27-2012 |
20130006630 | STATE DETECTING APPARATUS, COMMUNICATION APPARATUS, AND STORAGE MEDIUM STORING STATE DETECTING PROGRAM - A state detecting apparatus includes: a processor to execute acquiring utterance data related to uttered speech, computing a plurality of statistical quantities for feature parameters regarding features of the utterance data, creating, on the basis of the plurality of statistical quantities regarding the utterance data and another plurality of statistical quantities regarding reference utterance data based on other uttered speech, pseudo-utterance data having at least one statistical quantity equal to a statistical quantity in the other plurality of statistical quantities, computing a plurality of statistical quantities for synthetic utterance data synthesized on the basis of the pseudo-utterance data and the utterance data, and determining, on the basis of a comparison between statistical quantities of the synthetic utterance data and statistical quantities of the reference utterance data, whether the speaker who produced the uttered speech is in a first state or a second state; and a memory. | 01-03-2013 |
20130046539 | Automatic Speech and Concept Recognition - A method, an apparatus and an article of manufacture for automatic speech recognition. The method includes obtaining at least one language model word and at least one rule-based grammar word, determining an acoustic similarity of at least one pair of language model word and rule-based grammar word, and increasing a transition cost to the at least one language model word based on the acoustic similarity of the at least one language model word with the at least one rule-based grammar word to generate a modified language model for automatic speech recognition. | 02-21-2013 |
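The cost adjustment described above can be sketched as follows. Note the hedge: true acoustic similarity would be computed over phone sequences or acoustic models; here `difflib`'s string ratio is a crude text stand-in, and the penalty and threshold values are invented for illustration:

```python
from difflib import SequenceMatcher

def adjust_transition_costs(lm_costs, grammar_words, penalty=2.0, threshold=0.75):
    """Raise the transition cost of each language-model word in proportion
    to its similarity to the closest rule-based grammar word, so that the
    grammar word is preferred when the two are easily confused.
    SequenceMatcher.ratio() stands in for a real acoustic similarity."""
    adjusted = dict(lm_costs)
    for word, cost in lm_costs.items():
        sim = max((SequenceMatcher(None, word, g).ratio() for g in grammar_words),
                  default=0.0)
        if sim >= threshold:
            adjusted[word] = cost + penalty * sim
    return adjusted
```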
20130054242 | REDUCING FALSE POSITIVES IN SPEECH RECOGNITION SYSTEMS - Embodiments of the present invention improve methods of performing speech recognition. In one embodiment, the present invention includes a method comprising receiving a spoken utterance, processing the spoken utterance in a speech recognizer to generate a recognition result, determining consistencies of one or more parameters of component sounds of the spoken utterance, wherein the parameters are selected from the group consisting of duration, energy, and pitch, and wherein each component sound of the spoken utterance has a corresponding value of said parameter, and validating the recognition result based on the consistency of at least one of said parameters. | 02-28-2013 |
20130289988 | POST PROCESSING OF NATURAL LANGUAGE ASR - A post-processing speech system includes a natural language-based speech recognition system that compares a spoken utterance to a natural language vocabulary that includes words used to generate a natural language speech recognition result. A master conversation module engine compares the natural language speech recognition result to domain specific words and phrases. A voting engine selects a word or a phrase from the domain specific words and phrases that is transmitted to an application control system. The application control system transmits one or more control signals that are used to control an internal or an external device or an internal or an external process. | 10-31-2013 |
20130304469 | INFORMATION PROCESSING METHOD AND APPARATUS, COMPUTER PROGRAM AND RECORDING MEDIUM - Among multiple documents presented to a user, a high-interest document and a low-interest document are specified, a word group in the high-interest document is compared with a word group in the low-interest document, and a string of word groups with associated weight values is generated as a user feature vector. A word group included in each of multiple data items targeted for assigning priorities is extracted, and a data feature vector specific to each data item is generated based on the extracted word groups. A degree of similarity between the data feature vector of each of the multiple data items and the user feature vector is obtained, and according to the degree of similarity, priorities are assigned to the multiple data items to be presented to the user. It is therefore possible to extract user feature information that more effectively reflects the user's interests and tastes. | 11-14-2013 |
20130325469 | METHOD FOR PROVIDING VOICE RECOGNITION FUNCTION AND ELECTRONIC DEVICE THEREOF - A method for providing a voice recognition function and an electronic device thereof are provided. The method provides a voice recognition function in an electronic device and includes outputting, when a voice instruction is input, a list of prediction instructions that are candidate instructions similar to the input voice instruction, updating the list of prediction instructions when a correction instruction correcting the output candidate instructions is input, and performing, if the correction instruction matches an instruction of high similarity in the updated list of prediction instructions, a voice recognition function corresponding to the voice instruction. | 12-05-2013 |
20130325470 | SYSTEM AND METHOD FOR IDENTIFICATION OF A SPEAKER BY PHONOGRAMS OF SPONTANEOUS ORAL SPEECH AND BY USING FORMANT EQUALIZATION - A system and method for identification of a speaker by phonograms of oral speech is disclosed. Similarity between a first phonogram of the speaker and a second, or sample, phonogram is evaluated by matching formant frequencies in referential utterances of a speech signal, where the utterances for comparison are selected from the first phonogram and the second phonogram. Referential utterances of speech signals are selected from the first phonogram and the second phonogram, where the referential utterances include formant paths of at least three formant frequencies. The selected referential utterances including at least two identical formant frequencies are compared therebetween. Similarity of the compared referential utterances from matching other formant frequencies is evaluated, where similarity of the phonograms is determined from evaluation of similarity of all the compared referential utterances. | 12-05-2013 |
20140012575 | DETECTING POTENTIAL SIGNIFICANT ERRORS IN SPEECH RECOGNITION RESULTS - In some embodiments, the recognition results produced by a speech processing system (which may include two or more recognition results, including a top recognition result and one or more alternative recognition results) based on an analysis of a speech input, are evaluated for indications of potential significant errors. In some embodiments, the recognition results may be evaluated to determine whether a meaning of any of the alternative recognition results differs from a meaning of the top recognition result in a manner that is significant for a domain, such as the medical domain. In some embodiments, words and/or phrases that may be confused by an ASR system may be determined and associated in sets of words and/or phrases. Words and/or phrases that may be determined include those that change a meaning of a phrase or sentence when included in the phrase/sentence. | 01-09-2014 |
20140172427 | System And Method For Event Summarization Using Observer Social Media Messages - A method for processing messages pertaining to an event includes receiving a plurality of messages pertaining to the event from electronic communication devices associated with a plurality of observers of the event, generating a first message stream that includes only a portion of the plurality of messages corresponding to a first participant in the event, identifying a first sub-event in the first message stream with reference to a time distribution of messages and content distribution of messages in the first message stream, generating a sub-event summary with reference to a portion of the plurality of messages in the first message stream that are associated with the first sub-event, and transmitting the sub-event summary to a plurality of electronic communication devices associated with a plurality of users who are not observers of the event. | 06-19-2014 |
20140249817 | Identification using Audio Signatures and Additional Characteristics - Techniques for using both speaker-identification information and other characteristics associated with received voice commands to determine how and whether to respond to the received voice commands. A user may interact with a device through speech by providing voice commands. After beginning an interaction with the user, the device may detect subsequent speech, which may originate from the user, from another user, or from another source. The device may then use speaker-identification information and other characteristics associated with the speech to attempt to determine whether or not the user interacting with the device uttered the speech. The device may then interpret the speech as a valid voice command and may perform a corresponding operation in response to determining that the user did indeed utter the speech. If the device determines that the user did not utter the speech, however, then the device may refrain from taking action on the speech. | 09-04-2014 |
20140257809 | SPARSE MAXIMUM A POSTERIORI (MAP) ADAPTION - Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements. | 09-11-2014 |
20140337024 | METHOD AND SYSTEM FOR SPEECH COMMAND DETECTION, AND INFORMATION PROCESSING SYSTEM - A method for speech command detection comprises extracting speech features from a speech signal inputted into a system, converting the speech features into a word sequence, obtaining time durations of speech segments corresponding to the respective non-command words and an acoustic score of each of the command word candidates, calculating rhythm features of the speech signal based on the time durations, and recognizing speech corresponding to the at least one command word candidate as a speech command directed to the system or a speech not directed to the system, based on the acoustic score and the rhythm features. The word sequence comprises at least two successive non-command words and at least one command word candidate. The rhythm features describe a similarity of time durations of speech segments corresponding to the respective non-command words, and/or a similarity of energy variations of the speech segments corresponding to the respective non-command words. | 11-13-2014 |
20140337025 | CLASSIFICATION METHOD AND DEVICE FOR AUDIO FILES - The present disclosure discloses a classification method and system for audio files. The classification method includes: constructing a pitch sequence for the audio files to be classified; calculating eigenvectors of the audio files according to the pitch sequence of the audio files; and classifying the audio files according to the eigenvectors of the audio files. The present disclosure can achieve automatic classification of the audio files, reduce the cost of the classification, and improve the efficiency, flexibility, and intelligence of the classification. | 11-13-2014 |
20150073791 | APPARATUS AND METHOD FOR ANALYSIS OF LANGUAGE MODEL CHANGES - An apparatus, a method, and a machine-readable medium are provided for characterizing differences between two language models. A group of utterances from each of a group of time domains are examined. One of a significant word change or a significant word class change within the plurality of utterances is determined. A first cluster of utterances including a word or a word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances. A second cluster of utterances not including the word or the word class corresponding to the one of the significant word change or the significant word class change is generated from the utterances. | 03-12-2015 |
20150081296 | METHOD AND APPARATUS FOR ADJUSTING DETECTION THRESHOLD FOR ACTIVATING VOICE ASSISTANT FUNCTION - A method for activating a voice assistant function in a mobile device is disclosed. The method includes receiving an input sound stream by a sound sensor and determining a context of the mobile device. The method may determine the context based on the input sound stream. For determining the context, the method may also obtain data indicative of the context of the mobile device from at least one of an acceleration sensor, a location sensor, an illumination sensor, a proximity sensor, a clock unit, and a calendar unit in the mobile device. In this method, a threshold for activating the voice assistant function is adjusted based on the context. The method detects a target keyword from the input sound stream based on the adjusted threshold. If the target keyword is detected, the method activates the voice assistant function. | 03-19-2015 |
20150127342 | SPEAKER IDENTIFICATION - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing speaker identification. In some implementations, an utterance vector that is derived from an utterance is obtained. Hash values are determined for the utterance vector according to multiple different hash functions. A set of speaker vectors from a plurality of hash tables is determined using the hash values, where each speaker vector was derived from one or more utterances of a respective speaker. The speaker vectors in the set are compared with the utterance vector. A speaker vector is selected based on comparing the speaker vectors in the set with the utterance vector. | 05-07-2015 |
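The hashing-based retrieval above can be sketched with random-hyperplane (sign-bit) hash functions, multiple hash tables, and a final cosine comparison against the retrieved candidates. This is an illustrative sketch, not the claimed method; dimensions, bit counts, and names are all invented:

```python
import random

def make_hash_fn(dim, bits, seed):
    """Random-hyperplane hash: one sign bit per hyperplane."""
    rnd = random.Random(seed)
    planes = [[rnd.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]
    return lambda vec: tuple(
        sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sum(a * a for a in u) ** 0.5 * sum(b * b for b in v) ** 0.5)

def build_tables(speaker_vecs, hash_fns):
    """One hash table per hash function, mapping bucket -> speaker names."""
    tables = []
    for h in hash_fns:
        table = {}
        for name, vec in speaker_vecs.items():
            table.setdefault(h(vec), []).append(name)
        tables.append(table)
    return tables

def identify(utt_vec, tables, hash_fns, speaker_vecs):
    """Union the candidate speakers from every table, then pick the
    candidate whose vector is closest to the utterance vector."""
    candidates = set()
    for table, h in zip(tables, hash_fns):
        candidates.update(table.get(h(utt_vec), ()))
    return max(candidates, key=lambda s: cosine(speaker_vecs[s], utt_vec),
               default=None)
```

The point of the hash tables is to score only the handful of speakers that share a bucket with the utterance, rather than every enrolled speaker.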
20150302868 | CONVERSATION QUALITY ANALYSIS - Embodiments disclosed herein provide systems, methods, and computer readable media for analyzing a conversation between a plurality of participants. In a particular embodiment, a method provides determining a first speaker from the plurality of participants and determining a second speaker from the plurality of participants. The method further provides determining a first plurality of turns comprising portions of the conversation when the first speaker is speaking and determining a second plurality of turns comprising portions of the conversation when the second speaker is speaking. The method further provides determining a characterization for quality of the conversation based on gaps between turns of the first plurality of turns and turns of the second plurality of turns. | 10-22-2015 |
20150356974 | SPEAKER IDENTIFICATION DEVICE, SPEAKER IDENTIFICATION METHOD, AND RECORDING MEDIUM - A speaker identification device includes: a primary speaker identification unit that computes, for each pre-stored registered speaker, a score that indicates the similarity between input speech and speech of the registered speakers; a similar speaker selection unit that selects a plurality of the registered speakers as similar speakers in descending order of their scores; a learning unit that creates a classifier for each similar speaker by sorting the speech of a certain similar speaker among the similar speakers as a positive instance and the speech of the other similar speakers as negative instances; and a secondary speaker identification unit that computes, for each classifier, a score of the classifier with respect to the input speech, and outputs an identification result. | 12-10-2015 |
20160012823 | SYSTEM AND METHODS FOR PERSONAL IDENTIFICATION NUMBER AUTHENTICATION AND VERIFICATION | 01-14-2016 |
20160019915 | REAL-TIME EMOTION RECOGNITION FROM AUDIO SIGNALS - Systems, methods, and computer-readable storage media are provided for recognizing emotion in audio signals in real-time. An audio signal is detected and a rapid audio fingerprint is computed on a user's computing device. One or more features are extracted from the audio fingerprint and compared with features associated with defined emotions to determine relative degrees of similarity. Confidence scores are computed for the defined emotions based on the relative degrees of similarity, and it is determined whether a confidence score for one or more particular emotions exceeds a threshold confidence score. If it is determined that a threshold confidence score for one or more particular emotions is exceeded, the particular emotion or emotions are associated with the audio signal. As desired, various actions may then be initiated based upon the emotion or emotions associated with the audio signal. | 01-21-2016 |
20160042739 | FAST SPEAKER RECOGNITION SCORING USING I-VECTOR POSTERIORS AND PROBABILISTIC LINEAR DISCRIMINANT ANALYSIS - A method for performing speaker recognition comprises: estimating respective uncertainties of acoustic coverage of respective speech utterance(s) by first and second speakers, the acoustic coverage representing respective sounds used by the speakers when speaking; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient memory usage by discarding dependencies between uncertainties of different sounds for the speakers; representing the respective uncertainties of acoustic coverage in a manner that allows for efficient computation by representing an inverse of the respective uncertainties of acoustic coverage and then discarding the dependencies between the uncertainties of different sounds for the speakers; and computing a score between the speech utterance(s) by the speakers in a manner that leverages the respective uncertainties of the acoustic coverage during the comparison, the score being indicative of a likelihood that the speakers are the same speaker. | 02-11-2016 |
20160086609 | SYSTEMS AND METHODS FOR AUDIO COMMAND RECOGNITION - The present application discloses a method, an electronic system and a non-transitory computer readable storage medium for recognizing audio commands in an electronic device. The electronic device obtains audio data based on an audio signal provided by a user and extracts characteristic audio fingerprint features from the audio data. The electronic device further determines whether the corresponding audio signal is generated by an authorized user by comparing the characteristic audio fingerprint features with an audio fingerprint model for the authorized user and with a universal background model that represents user-independent audio fingerprint features, respectively. When the corresponding audio signal is generated by the authorized user of the electronic device, an audio command is extracted from the audio data, and an operation is performed according to the audio command. | 03-24-2016 |
20160093296 | SYSTEM AND METHOD FOR MACHINE-MEDIATED HUMAN-HUMAN CONVERSATION - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing speech. A system configured to practice the method monitors user utterances to generate a conversation context. Then the system receives a current user utterance independent of non-natural language input intended to trigger speech processing. The system compares the current user utterance to the conversation context to generate a context similarity score, and if the context similarity score is above a threshold, incorporates the current user utterance into the conversation context. If the context similarity score is below the threshold, the system discards the current user utterance. The system can compare the current user utterance to the conversation context based on an n-gram distribution, a perplexity score, and a perplexity threshold. Alternately, the system can use a task model to compare the current user utterance to the conversation context. | 03-31-2016 |
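The perplexity-based comparison above can be sketched with an add-one-smoothed bigram model built from the running conversation context: an utterance with low perplexity under the context model is incorporated, and a high-perplexity utterance is discarded as side speech. Vocabulary size and threshold values here are invented for illustration:

```python
from collections import Counter
import math

def perplexity(utterance, context, vocab_size=100):
    """Perplexity of `utterance` under an add-one-smoothed bigram model
    built from the conversation context (lower = more similar)."""
    uni = Counter(context)
    bi = Counter(zip(context, context[1:]))
    pairs = list(zip(utterance, utterance[1:]))
    log_prob = sum(math.log((bi[(a, b)] + 1) / (uni[a] + vocab_size))
                   for a, b in pairs)
    return math.exp(-log_prob / len(pairs))

def incorporate(utterance, context, threshold=50.0):
    """Keep the utterance if it looks like part of the conversation;
    otherwise discard it."""
    if perplexity(utterance, context) < threshold:
        context.extend(utterance)
        return True
    return False
```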
20160111089 | VEHICLE AND CONTROL METHOD THEREOF - A vehicle that recognizes received voice based on a language set in an external apparatus includes: a communication unit configured to receive text data stored in an external apparatus; a data converter configured to convert the received text data into voice data; a speech input unit configured to receive a speech from a user; a speech recognizer configured to recognize the received speech based on a language set in the external apparatus; and a controller configured to search for voice data corresponding to the recognized speech in the converted voice data, to generate a control command including the voice data found by the controller based on the recognized speech, and to transmit the control command to the external apparatus through the communication unit. | 04-21-2016 |
20160111112 | SPEAKER CHANGE DETECTION DEVICE AND SPEAKER CHANGE DETECTION METHOD - A speaker change detection device sets first and second analysis periods before and after each of time points in a voice signal, generates, for each of the time points, a first speaker model from a distribution of features in frames in the first analysis period, and a second speaker model from a distribution of features in frames in the second analysis period, calculates, for each of the time points, a matching score representing the likelihood of similarity of features between a group of speakers in the first analysis period and a group of speakers in the second analysis period by applying the features extracted from the second analysis period to the first speaker model and applying the features extracted from the first analysis period to the second speaker model, and detects a speaker change point on the basis of the matching scores at the plurality of time points. | 04-21-2016 |
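The cross-scoring idea above — model the window before each time point and the window after it, then score each window's features under the *other* window's model — can be illustrated with scalar features and single-Gaussian window models (a simplification of the patent's speaker models; window size and threshold are assumed values):

```python
import math
import statistics

def gaussian_loglik(xs, mu, sigma):
    """Average per-frame log-likelihood under a Gaussian."""
    sigma = max(sigma, 1e-3)  # guard against zero variance
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in xs) / len(xs)

def matching_score(before, after):
    """Cross-likelihood: score each window's features under a model
    of the opposite window. High when both windows share a speaker."""
    mu_b, sd_b = statistics.mean(before), statistics.pstdev(before)
    mu_a, sd_a = statistics.mean(after), statistics.pstdev(after)
    return 0.5 * (gaussian_loglik(after, mu_b, sd_b)
                  + gaussian_loglik(before, mu_a, sd_a))

def detect_change_points(features, window=5, threshold=-5.0):
    """Flag time points where the matching score falls below threshold."""
    return [t for t in range(window, len(features) - window)
            if matching_score(features[t - window:t],
                              features[t:t + window]) < threshold]
```

A flat feature sequence yields no change points, while an abrupt shift in the feature statistics is flagged at the boundary.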
20160118039 | SOUND SAMPLE VERIFICATION FOR GENERATING SOUND DETECTION MODEL - A method for verifying at least one sound sample to be used in generating a sound detection model in an electronic device includes receiving a first sound sample; extracting a first acoustic feature from the first sound sample; receiving a second sound sample; extracting a second acoustic feature from the second sound sample; and determining whether the second acoustic feature is similar to the first acoustic feature. | 04-28-2016 |
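The extract-and-compare step in this abstract can be sketched with a toy feature extractor and a distance threshold (per-band mean amplitude stands in for a real acoustic feature such as MFCCs; the tolerance value is an assumption):

```python
import math

def extract_feature(samples, n_bands=4):
    """Toy acoustic feature: mean absolute amplitude per band."""
    size = max(1, len(samples) // n_bands)
    return [sum(abs(s) for s in samples[i * size:(i + 1) * size]) / size
            for i in range(n_bands)]

def is_similar(feat_a, feat_b, tol=0.5):
    """Accept the second sample if its feature vector lies within
    Euclidean distance `tol` of the first sample's."""
    return math.dist(feat_a, feat_b) <= tol
```

Two near-identical samples pass the check; a sample with markedly different amplitude statistics fails it, so it would be rejected for model generation.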
20160125873 | METHOD AND SYSTEM FOR RECOGNIZING SPEECH USING WILDCARDS IN AN EXPECTED RESPONSE - A speech recognition system used in a workflow receives and analyzes speech input to recognize and accept a user's response to a task. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve recognition accuracy. For example, if the hypothesis of a user's response matches the expected response then there is a high probability that the user's response was recognized correctly. An expected response may include expected words and wildcard words. Wildcard words represent any recognized word in a user's response. By including wildcard words in the expected response, the speech recognition system may make modifications based on a wide range of user responses. | 05-05-2016 |
20160125873 | METHOD AND SYSTEM FOR RECOGNIZING SPEECH USING WILDCARDS IN AN EXPECTED RESPONSE - A speech recognition system used in a workflow receives and analyzes speech input to recognize and accept a user's response to a task. Under certain conditions, a user's response might be expected. In these situations, the expected response may modify the behavior of the speech recognition system to improve recognition accuracy. For example, if the hypothesis of a user's response matches the expected response, then there is a high probability that the user's response was recognized correctly. An expected response may include expected words and wildcard words. Wildcard words represent any recognized word in a user's response. By including wildcard words in the expected response, the speech recognition system may make modifications based on a wide range of user responses. | 05-05-2016 |
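The wildcard-matching behavior described above can be sketched as a word-level comparison in which `*` accepts any single recognized word (the function name and the single-word semantics of the wildcard are illustrative assumptions):

```python
def matches_expected(hypothesis, expected):
    """Check a recognizer hypothesis against an expected response in
    which '*' stands for any single recognized word."""
    hyp, exp = hypothesis.lower().split(), expected.lower().split()
    if len(hyp) != len(exp):
        return False
    return all(e == "*" or e == h for h, e in zip(hyp, exp))
```

With expected response "pick * items", the hypothesis "pick three items" matches regardless of the quantity word, while "pick three boxes" does not.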
20160125885 | Sensory Enhancement Systems and Methods in Personal Electronic Devices - Disclosed are personal electronic devices (PEDs) having a sensory enhancement (SE) system for monitoring environmental conditions and detecting environmental events, for example but not limited to, changes in acoustic, thermal, optical, electromagnetic, chemical, dynamic, wireless, atmospheric, or biometric conditions. The detection of such events can be used to invoke a notification, an alert, a corrective action, or some other action directed, depending upon the implementation, to the PED user or another party. | 05-05-2016 |
20160180841 | SYSTEM AND METHOD FOR HANDLING MISSING SPEECH DATA | 06-23-2016 |
20160188563 | Nutrient Content Identification Method and Apparatus - Techniques are provided for calculating nutrient content information. A server that hosts a fitness management application receives text information that describes food recipe information. The server parses the text information to identify relevant food information. The relevant food information includes a first text portion that corresponds to food ingredient information and a second text portion that corresponds to food quantity information. The server matches the food ingredient information in the first text portion with a known food ingredient in a database of food ingredient information. The server converts the food quantity information in the second text portion to a known food quantity type. The server calculates nutrient content information of the food ingredient information using nutritional information of the known food ingredient and the known food quantity type. | 06-30-2016 |
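The parse-match-convert-calculate pipeline above can be sketched for a single recipe line. The lookup tables are illustrative assumptions (a real system would query a nutrient database); the conversion factors used are common approximations (flour at roughly 3.64 kcal per gram, a cup of flour at roughly 120 g):

```python
import re

# Illustrative lookup tables, not the patent's database.
NUTRIENTS_PER_GRAM = {"flour": 3.64, "sugar": 3.87}  # kcal per gram
UNIT_GRAMS = {"g": 1.0, "kg": 1000.0, "cup": 120.0}  # approx. gram equivalents

def parse_ingredient(line):
    """Split a recipe line into (quantity, unit, ingredient) parts."""
    m = re.match(r"\s*([\d.]+)\s*(\w+)\s+(.+)", line)
    qty, unit, name = m.groups()
    return float(qty), unit.lower(), name.strip().lower()

def calories(line):
    """Convert the quantity to grams, then apply per-gram calories
    for the matched known ingredient."""
    qty, unit, name = parse_ingredient(line)
    return qty * UNIT_GRAMS[unit] * NUTRIENTS_PER_GRAM[name]
```

For example, "2 cup flour" parses to quantity 2, unit cup, ingredient flour, and converts to about 873.6 kcal.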
20170236520 | Generating Models for Text-Dependent Speaker Verification | 08-17-2017 |
20190147857 | METHOD AND APPARATUS FOR AUTOMATIC SPEECH RECOGNITION | 05-16-2019 |
20190147861 | METHOD AND APPARATUS FOR CONTROLLING PAGE | 05-16-2019 |