Patent application number | Description | Published |
20080221863 | SEARCH-BASED WORD SEGMENTATION METHOD AND DEVICE FOR LANGUAGE WITHOUT WORD BOUNDARY TAG - The present invention discloses a search-based segmentation method and device for a language without a word boundary tag. The inventive method includes the steps of: a. providing at least one search engine with a segment of a text including at least one segment; b. searching for the segment through the at least one search engine, and returning search results; and c. selecting a word segmentation approach for the segment in accordance with at least part of the returned search results. The invention solves the problems of word segmentation for a language without a word boundary tag, and thus combats the limitations of the prior art in terms of flexibility, dependence upon coverage of dictionaries, available training data corpuses, processing of a new word, etc. | 09-11-2008 |
20080288258 | METHOD AND APPARATUS FOR SPEECH ANALYSIS AND SYNTHESIS - The present invention provides a speech analysis method comprising steps of obtaining a speech signal and a corresponding DEGG/EGG signal; regarding the speech signal as the output of a vocal tract filter in a source-filter model taking the DEGG/EGG signal as the input; and estimating the features of the vocal tract filter from the speech signal as the output and the DEGG/EGG signal as the input, wherein the features of the vocal tract filter are expressed by the state vectors of the vocal tract filter at selected time points, and the step of estimating is performed using Kalman filtering. | 11-20-2008 |
20090037179 | Method and Apparatus for Automatically Converting Voice - The invention proposes a method and apparatus for significantly improving the quality of voice morphing and guaranteeing the similarity of converted voice. The invention sets several standard speakers in a TTS database, and selects the voices of different standard speakers for speech synthesis according to different roles, wherein the voice of the selected standard speaker is similar to the original role to a certain extent. Then the invention further performs voice morphing on the standard voice similar to the original voice to a certain extent, in order to accurately mimic the voice of the original speaker, so as to make the converted voice closer to the original voice features while guaranteeing the similarity. | 02-05-2009 |
20090089063 | VOICE CONVERSION METHOD AND SYSTEM - A method, system and computer program product for voice conversion. The method includes performing speech analysis on the speech of a source speaker to achieve speech information; performing spectral conversion based on said speech information, to at least achieve a first spectrum similar to the speech of a target speaker; performing unit selection on the speech of said target speaker at least using said first spectrum as a target; replacing at least part of said first spectrum with the spectrum of the selected target speaker's speech unit; and performing speech reconstruction at least based on the replaced spectrum. | 04-02-2009 |
20090274299 | OPEN ARCHITECTURE BASED DOMAIN DEPENDENT REAL TIME MULTI-LINGUAL COMMUNICATION SERVICE - A system and method for real-time network communications provides a session identifier as a public key for group communication between clients, and provides a channel identifier representing a private key for each of a plurality of clients. The channel identifier includes client-specific attributes, which function to indicate grouping criteria for the group communication. A dynamic communication link is created over a network between a client and a service based upon the public and private key combination such that group communication is enabled based upon the attributes of the private key and the public key. Communications are translated using a translation service which employs the attributes associated with the private key and the public key combination to provide response information in a designated language to enable multi-lingual real-time communications. | 11-05-2009 |
20090299746 | METHOD AND SYSTEM FOR SPEECH SYNTHESIS - A method for performing speech synthesis to a textual content at a client. The method includes the steps of: performing speech synthesis to the textual content based on a current acoustical unit set S | 12-03-2009 |
20100114556 | SPEECH TRANSLATION METHOD AND APPARATUS - A method and apparatus for speech translation. The method includes: receiving a source speech; extracting non-text information in the source speech; translating the source speech into a target speech; and adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. The apparatus includes: a receiving module for receiving source speech; an extracting module for extracting non-text information in the source speech; a translation module for translating the source speech into a target speech; and an adjusting module for adjusting the translated target speech according to the extracted non-text information so that the target speech preserves the non-text information in the source speech. | 05-06-2010 |
20110054901 | METHOD AND APPARATUS FOR ALIGNING TEXTS - A method and apparatus for aligning texts. The method includes acquiring a target text and a reference text and aligning the target text and the reference text at word level based on phoneme similarity. The method can be applied to automatically archiving a multimedia resource and a method of automatically searching a multimedia resource. | 03-03-2011 |
20110270605 | ASSESSING SPEECH PROSODY - A method, system and computer readable storage medium for assessing speech prosody. The method includes the steps of: receiving input speech data; acquiring a prosody constraint; assessing prosody of the input speech data according to the prosody constraint; and providing an assessment result, where at least one of the steps is carried out using a computer device. | 11-03-2011 |
20130054244 | METHOD AND SYSTEM FOR ACHIEVING EMOTIONAL TEXT TO SPEECH - A method and system for achieving emotional text to speech. The method includes: receiving text data; generating an emotion tag for the text data by a rhythm piece; and achieving TTS to the text data corresponding to the emotion tag, where the emotion tags are expressed as a set of emotion vectors; where each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories. A system for the same includes: a text data receiving module; an emotion tag generating module; and a TTS module for achieving TTS, wherein the emotion tag is expressed as a set of emotion vectors; and wherein each emotion vector includes a plurality of emotion scores given based on a plurality of emotion categories. | 02-28-2013 |
20130268270 | Forced/Predictable Adaptation for Speech Recognition - A method is described for use with automatic speech recognition using discriminative criteria for speaker adaptation. An adaptation evaluation is performed of speech recognition performance data for speech recognition system users. Adaptation candidate users are identified based on the adaptation evaluation for whom an adaptation process is likely to improve system performance. | 10-10-2013 |
20140019121 | DATA PROCESSING METHOD, PRESENTATION METHOD, AND CORRESPONDING APPARATUSES - A data processing method includes obtaining text information corresponding to a presented content, the presented content comprising a plurality of areas; performing text analysis on the text information to obtain a first keyword sequence, the first keyword sequence including area keywords associated with at least one area of the plurality of areas; obtaining speech information related to the presented content, the speech information at least comprising a current speech segment; and using a first model network to perform analysis on the current speech segment to determine the area corresponding to the current speech segment, wherein the first model network comprises the first keyword sequence. | 01-16-2014 |
20140019133 | DATA PROCESSING METHOD, PRESENTATION METHOD, AND CORRESPONDING APPARATUSES - A data processing method includes obtaining text information corresponding to a presented content, the presented content comprising a plurality of areas; performing text analysis on the text information to obtain a first keyword sequence, the first keyword sequence including area keywords associated with at least one area of the plurality of areas; obtaining speech information related to the presented content, the speech information at least comprising a current speech segment; and using a first model network to perform analysis on the current speech segment to determine the area corresponding to the current speech segment, wherein the first model network comprises the first keyword sequence. | 01-16-2014 |
20140095160 | CORRECTING TEXT WITH VOICE PROCESSING - The present invention relates to voice processing and provides a method and system for correcting a text. The method comprises: determining a target text unit to be corrected in a text; receiving a reference voice segment input by the user for the target text unit; determining a reference text unit whose pronunciation is similar to a word in the target text unit based on the reference voice segment; and correcting the word in the target text unit in the text by the reference text unit. The present invention enables the user to easily correct errors in the text vocally. | 04-03-2014 |
20140129220 | SPEAKER AND CALL CHARACTERISTIC SENSITIVE OPEN VOICE SEARCH - Techniques disclosed herein include systems and methods for open-domain voice-enabled searching that is speaker sensitive. Techniques include using speech information, speaker information, and information associated with a spoken query to enhance open voice search results. This includes integrating a textual index with a voice index to support the entire search cycle. Given a voice query, the system can execute two matching processes simultaneously. This can include a text matching process based on the output of speech recognition, as well as a voice matching process based on characteristics of a caller or user voicing a query. Characteristics of the caller can include output of voice feature extraction and metadata about the call. The system clusters callers according to these characteristics. The system can use specific voice and text clusters to modify speech recognition results, as well as modifying search results. | 05-08-2014 |
20140136198 | CORRECTING TEXT WITH VOICE PROCESSING - The present invention relates to voice processing and provides a method and system for correcting a text. The method comprises: determining a target text unit to be corrected in a text; receiving a reference voice segment input by the user for the target text unit; determining a reference text unit whose pronunciation is similar to a word in the target text unit based on the reference voice segment; and correcting the word in the target text unit in the text by the reference text unit. The present invention enables the user to easily correct errors in the text vocally. | 05-15-2014 |
20140298186 | ADJUSTING INFORMATION PROMPTING IN INPUT METHOD - A computer-implemented method and apparatus for adjusting information prompting in an input method. The method includes: obtaining prompt information displayed in response to entering a word in an input box by a user; and adjusting the sequence of subsequent prompt words in a prompt box of the input method according to the prompt information. The method for adjusting information prompting in an input method according to the embodiments of the present invention can adjust the sequence of prompt words in the prompt box of the input method in real time based on prompt information in the prompt box to facilitate user selection. | 10-02-2014 |
20140359739 | VOICE BASED BIOMETRIC AUTHENTICATION METHOD AND APPARATUS - Voice based biometric authentication method, apparatus (system), and computer program product. Provided is a voice verification solution with a high accuracy rate that can prevent cheating via recording. The method includes: transmitting to the user a question prompt requiring the user to speak out a voice segment and an answer to a dynamic question, the voice segment having a corresponding text dependent speaker verification model enrolled before the authentication; segmenting, in response to receiving the voice answer, the voice segment part and the dynamic question answer part out from the voice answer; and verifying boundary smoothness between the voice segment and the answer to the dynamic question within the voice answer. With this method, whether a voice answer relates to cheating via recording is determined according to the degree of smoothness at a detected boundary. The apparatus and computer program product carry out the steps of the above-mentioned method. | 12-04-2014 |