Patent application number | Description | Published |
20110257974 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location. | 10-20-2011 |
20110288868 | DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. | 11-24-2011 |
20110289516 | Registering an Event - A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence. | 11-24-2011 |
20110295590 | ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models model that are adapted for the geographic location. | 12-01-2011 |
20110307253 | Speech and Noise Models for Speech Recognition - An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal. | 12-15-2011 |
20120022675 | PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input. | 01-26-2012 |
20120022860 | Speech and Noise Models for Speech Recognition - An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal. | 01-26-2012 |
20120022869 | ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models model that are adapted for the geographic location. | 01-26-2012 |
20120022870 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location. | 01-26-2012 |
20120022874 | DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. | 01-26-2012 |
20120083910 | PROGRESSIVE ENCODING OF AUDIO - The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity. | 04-05-2012 |
20120084089 | PROGRESSIVE ENCODING OF AUDIO - The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity. | 04-05-2012 |
20120191448 | SPEECH RECOGNITION USING DOCK CONTEXT - Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data. | 07-26-2012 |
20120191449 | SPEECH RECOGNITION USING DOCK CONTEXT - Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data. | 07-26-2012 |
20120259631 | Speech and Noise Models for Speech Recognition - An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal. | 10-11-2012 |
20120278076 | DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact. | 11-01-2012 |
20120296643 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance. | 11-22-2012 |
20120296655 | PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes obtaining sensor data from one or more sensors of a mobile device while the mobile device is operating in an inactive state, determining that a user of the mobile device is interacting with the mobile device based on the sensor data, invoking voice input functionality of the mobile device in response to determining that the user of the mobile device is interacting with the mobile device, detecting a voice input, and activating the mobile device in response to detecting the voice input. | 11-22-2012 |
20130238325 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance. | 09-12-2013 |
20130297313 | ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models model that are adapted for the geographic location. | 11-07-2013 |