Patent application number | Description | Published |
20080255827 | Voice Conversion Training and Data Collection - It may be desirable to provide a way to collect high quality speech training data without undue burden to the user. Speech training data may be collected during normal usage of a device. In this way, the collection of speech training data may be effectively transparent to the user, without the need for a distinct collection mode from the user's point of view. For example, where the device is or includes a phone (such as a cellular phone), when the user makes or receives a phone call to/from another party, speech training data may be automatically collected from one or both of the parties during the phone call. | 10-16-2008 |
20080262838 | Method, apparatus and computer program product for providing voice conversion using temporal dynamic features - An apparatus for providing voice conversion using temporal dynamic features includes a feature extractor and a transformation element. The feature extractor may be configured to extract dynamic feature vectors from source speech. The transformation element may be in communication with the feature extractor and configured to apply a first conversion function to a signal including the extracted dynamic feature vectors to produce converted dynamic feature vectors. The first conversion function may have been trained using at least dynamic feature data associated with training source speech and training target speech. The transformation element may be further configured to produce converted speech based on an output of applying the first conversion function. | 10-23-2008 |
20090094031 | Method, Apparatus and Computer Program Product for Providing Text Independent Voice Conversion - An apparatus for providing text independent voice conversion may include a first voice conversion model and a second voice conversion model. The first voice conversion model may be trained with respect to conversion of training source speech to synthetic speech corresponding to the training source speech. The second voice conversion model may be trained with respect to conversion to training target speech from synthetic speech corresponding to the training target speech. An output of the first voice conversion model may be communicated to the second voice conversion model to process source speech input into the first voice conversion model into target speech corresponding to the source speech as the output of the second voice conversion model. | 04-09-2009 |
20090094264 | Method, Apparatus and Computer Program Product for Providing Improved Data Compression - An apparatus for providing improved data compression may include an encoder comprising a quantizer for encoding input data and a side model. The quantizer may be trained with respect to high priority data among the input data and may be configured to partially encode the input data by encoding the high priority data. The side model may be trained jointly with the training of the quantizer and is configured to model low priority data among the input data. | 04-09-2009 |
20100080409 | Dual-mode loudspeaker - An apparatus uses a transducer to produce vibration in the ultrasonic frequency range and in the audible frequency range. A membrane or cantilever structure is coupled to the transducer to produce acoustic waves. When the vibration is in the audible frequency range, the membrane structure works like a conventional loudspeaker. When the vibration is in the ultrasonic frequency range, the ultrasonic signal is modulated by an audio signal to create better directivity. The acoustic waves in the ultrasonic frequency range can reproduce directional audible sound due to the nonlinear interaction of ultrasonic waves in air. | 04-01-2010 |
20100158287 | Multi-directivity sound device - A sound producing apparatus with multi-directivity has two or more speakers to produce music or other types of audio sounds. Each of the speakers can produce audio sounds in a certain direction and at least one speaker is steerable so that its sound propagation direction relative to the others can be changed. Furthermore, one or more speakers can produce audio sounds in an adjustable angular range. One or more of the speakers can be dual-mode speakers, each of which can be operated as a wide-angle speaker or a narrow-angle speaker. With multi-directivity, it is possible to play one type of music in one direction and another type of music in another direction to suit the interests of the audience. The apparatus can be a mobile phone, an audio player, a digital or analog recorder/player, or the like. | 06-24-2010 |
20140249815 | METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING TEXT INDEPENDENT VOICE CONVERSION - An apparatus for providing text independent voice conversion may include a first voice conversion model and a second voice conversion model. The first voice conversion model may be trained with respect to conversion of training source speech to synthetic speech corresponding to the training source speech. The second voice conversion model may be trained with respect to conversion to training target speech from synthetic speech corresponding to the training target speech. An output of the first voice conversion model may be communicated to the second voice conversion model to process source speech input into the first voice conversion model into target speech corresponding to the source speech as the output of the second voice conversion model. | 09-04-2014 |
20150055883 | Method, Apparatus and Computer Program Product for providing Improved Data Compression - An apparatus for providing improved data compression may include an encoder comprising a quantizer for encoding input data and a side model. The quantizer may be trained with respect to high priority data among the input data and may be configured to partially encode the input data by encoding the high priority data. The side model may be trained jointly with the training of the quantizer and is configured to model low priority data among the input data. | 02-26-2015 |
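The text independent voice conversion abstracts above (20090094031 and 20140249815) describe two models chained through synthetic speech: the first maps source speech to synthetic speech, and the second maps synthetic speech to target speech. The following is a minimal sketch of that chaining only. The feature representation (a list of floats), the `cascade` helper, and the two toy models are illustrative assumptions and are not taken from the applications.

```python
from typing import Callable, List

# Assumption: speech is represented as a flat list of feature values.
Features = List[float]
Model = Callable[[Features], Features]


def cascade(first_model: Model, second_model: Model) -> Model:
    """Chain two conversion models so the first model's output feeds the second."""

    def convert(source_features: Features) -> Features:
        # Stage 1: source speech -> synthetic-speech domain.
        synthetic_features = first_model(source_features)
        # Stage 2: synthetic-speech domain -> target speech.
        return second_model(synthetic_features)

    return convert


# Toy stand-ins for trained models (hypothetical, for illustration only).
def source_to_synthetic(features: Features) -> Features:
    return [0.5 * value for value in features]


def synthetic_to_target(features: Features) -> Features:
    return [value + 1.0 for value in features]


convert_source_to_target = cascade(source_to_synthetic, synthetic_to_target)
converted = convert_source_to_target([2.0, 4.0])
```

Because both models meet in the synthetic-speech domain, each could in principle be trained without parallel recordings from the source and target speakers, which is what the abstracts mean by text independent conversion.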
Patent application number | Description | Published |
20090171657 | Hybrid Approach in Voice Conversion - A hybrid approach is described for combining frequency warping and Gaussian Mixture Modeling (GMM) to achieve better speaker identity and speech quality. To train the voice conversion GMM model, line spectral frequency and other features are extracted from a set of source sounds to generate a source feature vector and from a set of target sounds to generate a target feature vector. The GMM model is estimated based on the aligned source feature vector and the target feature vector. A mixture specific warping function is generated for each set of mixture mean pairs of the GMM model, and a warping function is generated based on a weighting of each of the mixture specific warping functions. The warping function can be used to convert sounds received from a source speaker to approximate speech of a target speaker. | 07-02-2009 |
20090172571 | LIST BASED NAVIGATION FOR DATA ITEMS - A user interface provides for contextual navigation in locating desired content based on similarity/dissimilarity criteria. Each item that is identified in a device is provided with at least one multi-dimensional descriptor. A content of each item can be stored remotely from the device. A search criterion is selected that relates to the descriptor and a selected active item. A search is conducted to identify all other items identified in the device that have a relationship with the search criterion. The results are presented to the user and can be ranked according to a selected relationship order. | 07-02-2009 |
20090299747 | METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING IMPROVED SPEECH SYNTHESIS - An apparatus for providing improved speech synthesis may include a processor and a memory storing executable instructions. In response to execution of the instructions by the processor, the apparatus may perform at least selecting a real glottal pulse from among one or more stored real glottal pulses based at least in part on a property associated with the real glottal pulse, utilizing the real glottal pulse selected as a basis for generation of an excitation signal, and modifying the excitation signal based on spectral parameters generated by a model to provide synthetic speech. | 12-03-2009 |
20090327979 | USER INTERFACE FOR A PERIPHERAL DEVICE - A system and method that includes detecting an activation of an input application control key of a peripheral device, the peripheral device having a single application control input key, to open an initial application, the initial application becoming an active application, providing an identifier of the open application, and detecting a speech prompt to activate at least one function related to the active application. | 12-31-2009 |
20110154193 | Method and Apparatus for Text Input - In accordance with an example embodiment of the present invention, there is provided a method comprising receiving a first text input at a first point in time, providing a first completion candidate for the first text input, receiving a second text input at a second point in time, determining a time difference between the second point in time and the first point in time and providing a second completion candidate for the second text input based on at least the first completion candidate and the time difference. | 06-23-2011 |
20120109654 | METHODS AND APPARATUSES FOR FACILITATING SPEECH SYNTHESIS - Methods and apparatuses are provided for facilitating speech synthesis. A method may include generating a plurality of input models representing an input by using a statistical model synthesizer to statistically model the input. The method may further include determining a speech unit sequence representing at least a portion of the input by using the input models to influence selection of one or more pre-recorded speech units having parameter representations. The method may additionally include identifying one or more bad units in the unit sequence. The method may also include replacing the identified one or more bad units with one or more parameters generated by the statistical model synthesizer. Corresponding apparatuses are also provided. | 05-03-2012 |
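The Hybrid Approach in Voice Conversion abstract (20090171657) states that a single warping function is generated from a weighting of the mixture specific warping functions. One way to read that step is as a normalized weighted sum, sketched below. The normalization, the `combine_warps` name, and the two toy linear warps are assumptions made for illustration; the abstract does not say how the weights are obtained or how each mixture specific function is shaped.

```python
from typing import Callable, Sequence

# Assumption: a warping function maps a source frequency (Hz) to a warped frequency (Hz).
WarpFn = Callable[[float], float]


def combine_warps(warps: Sequence[WarpFn], weights: Sequence[float]) -> WarpFn:
    """Blend per-mixture warping functions into one function by weighted sum."""
    if not warps or len(warps) != len(weights):
        raise ValueError("need one weight per warping function")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")
    # Normalize so the weights sum to 1 (an assumption, e.g. mixture posteriors).
    normalized = [weight / total for weight in weights]

    def warp(freq_hz: float) -> float:
        return sum(w * fn(freq_hz) for w, fn in zip(normalized, warps))

    return warp


# Toy mixture specific warps (hypothetical): one stretches, one compresses.
def stretch(freq_hz: float) -> float:
    return 1.1 * freq_hz


def compress(freq_hz: float) -> float:
    return 0.9 * freq_hz


combined_warp = combine_warps([stretch, compress], [0.75, 0.25])
warped_1khz = combined_warp(1000.0)
```

With these toy inputs the blended function sits closer to the more heavily weighted `stretch` warp, which illustrates how a frame dominated by one mixture would be warped mostly by that mixture's function.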