20130060564 PSYCHOACOUSTIC TIME ALIGNMENT - A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the set of disturbances to provide the measure of quality. 03-07-2013
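The frame selection described in this abstract can be illustrated with a small dynamic-programming sketch. Everything specific here is an assumption made for illustration and is not taken from the application: the disturbance is a plain mean squared difference rather than a psychoacoustic measure, the selected output frames are constrained to be non-decreasing in time, and the function names `frame_signal`, `disturbance_matrix`, and `min_sum_alignment` are hypothetical.

```python
import numpy as np


def frame_signal(x, frame_len):
    """Split a 1-D signal into non-overlapping frames, dropping any remainder."""
    x = np.asarray(x, dtype=float)
    n = len(x) // frame_len
    return x[: n * frame_len].reshape(n, frame_len)


def disturbance_matrix(inp_frames, out_frames):
    """D[i, j] = mean squared difference between input frame i and output frame j.

    Assumption: a stand-in for whatever perceptual disturbance is actually used.
    """
    diff = inp_frames[:, None, :] - out_frames[None, :, :]
    return np.mean(diff ** 2, axis=2)


def min_sum_alignment(D):
    """Pick one output frame per input frame so that the summed disturbance is minimal.

    Assumption: the chosen output indices must be non-decreasing in time.
    Returns (list of chosen output indices, minimal total disturbance).
    """
    n_in, n_out = D.shape
    cost = np.empty_like(D)
    back = np.zeros((n_in, n_out), dtype=int)
    cost[0] = D[0]
    for i in range(1, n_in):
        prev = cost[i - 1]
        best = 0  # index of the cheapest predecessor among output frames 0..j
        for j in range(n_out):
            if prev[j] < prev[best]:
                best = j
            back[i, j] = best
            cost[i, j] = D[i, j] + prev[best]
    j = int(np.argmin(cost[-1]))
    total = float(cost[-1, j])
    path = [j]
    for i in range(n_in - 1, 0, -1):
        j = int(back[i, j])
        path.append(j)
    path.reverse()
    return path, total
```

Without the ordering constraint the minimum sum is simply the row-wise minimum of `D`; the constraint is what makes the selection behave like a time alignment. How the selected disturbances are mapped to a quality score is not stated in the abstract, so the sketch stops at the minimal total.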
20110218797 ENCODER FOR AUDIO SIGNAL INCLUDING GENERIC AUDIO AND SPEECH FRAMES - A method for encoding audio frames by producing a first frame of coded audio samples by coding a first audio frame in a sequence of frames, producing at least a portion of a second frame of coded audio samples by coding at least a portion of a second audio frame in the sequence of frames, and producing parameters for generating audio gap filler samples, wherein the parameters are representative of either a weighted segment of the first frame of coded audio samples or a weighted segment of the portion of the second frame of coded audio samples. 09-08-2011
20120239384 VOICE PROCESSING DEVICE AND METHOD, AND PROGRAM - A voice processing device includes a voice pitch converting unit that performs a voice pitch converting process with respect to an input voice signal and converts voice pitch of the input voice signal, an error detecting unit that detects an error between the number of samples of an output voice signal, which is expected, and the number of samples of the output voice signal, which is actually output, and a time length control unit that controls adjustment of the time length in such a manner that the time length of the output voice signal is corrected by the amount of the error. 09-20-2012
20090265164 Method for Encoding and Decoding Object-Based Audio Signal and Apparatus Thereof - The present invention relates to a method for encoding and decoding an object-based audio signal and an apparatus thereof. The audio decoding method includes extracting, from an audio signal, a first audio signal in which one or more music objects are grouped and encoded, a second audio signal in which at least two vocal objects are grouped step by step and encoded, and a residual signal corresponding to the second audio signal, and generating a third audio signal by employing at least one of the first and second audio signals and the residual signal. A multi-channel audio signal is then generated by employing the third audio signal. Accordingly, a variety of play modes can be provided efficiently. 10-22-2009
20090055169 VOICE ENCODING DEVICE, AND VOICE ENCODING METHOD - A voice encoding device capable of generating a modulated proper monaural signal enriched in clearness and understandability, when the monaural signal is to be generated from a stereophonic signal. In this device, a weighting unit ( 02-26-2009
20080262834 Sound Separating Device, Sound Separating Method, Sound Separating Program, and Computer-Readable Recording Medium - A sound separating apparatus includes a converting unit that respectively converts signals of two channels into frequency domains by a time unit, the signals representing sounds from sound sources. The apparatus also includes a localization-information calculating unit that calculates localization information regarding the frequency domains and a cluster analyzing unit that classifies the localization information into clusters and respectively calculates central values of the clusters. The apparatus further includes a separating unit that inversely converts, into a time domain, a value that is based on the central value and the localization information, and separates a sound from a given sound source included in the sound sources. 10-23-2008
20090083029 RETRIEVING APPARATUS, RETRIEVING METHOD, AND COMPUTER PROGRAM PRODUCT - A word coinciding with a key word input by speech and a word related to the word are set as retrieval candidate words based on a word dictionary in which words representing formal names and aliases of the formal names are registered in association with a family attribute indicating a familiar relation among the words. Content related to any one of retrieval words selected out of the retrieval candidate words and a word related to the retrieval word is retrieved. 03-26-2009
20120173229 SYSTEMS AND METHODS FOR PRESENTING END TO END CALLS AND ASSOCIATED INFORMATION - Systems and methods that, among other things, analyze and monitor the performance of a call center, including performance of the interactive voice response (IVR) systems, call center agents, and other components of the call center. The systems and methods record characteristics of the call such as the audio data, and analyze that record to identify the events and the actions that take place during the call. A call center administrator may also identify a set of metrics, such as the number of dropped calls that occur during a day, that may be monitored by the systems described herein. The data collected about these events and the resulting metrics may be stored in a database and provided to a call center administrator through a user interface that allows the administrator to browse through the collected metric and recorded call data and directly review relevant portions of a call. 07-05-2012
20090048825 System and method for providing internet based phone conferences using multiple codecs - A method of communicating digitized speech from a transmitting forum participant comprises the step of receiving a data structure that includes said digitized speech. The data structure is analyzed to determine whether the digitized speech is redundantly represented in a plurality of forms in the data structure. A portion of the data structure is forwarded to a receiving forum participant, thereby communicating the digitized speech from the transmitting forum participant. In this method, when the digitized speech is redundantly represented in the data structure in a plurality of forms, the forwarding step includes a step of selecting one or more forms, based on a function, from the plurality of forms in the data structure. Furthermore, the portion of the data structure that is forwarded to the receiving forum participant includes data in the data structure that corresponds to each of the selected one or more forms. 02-19-2009
20100280821 TEXT EDITING - An apparatus is provided with an ambiguous keystroke disambiguation and/or word autocompletion text editor application that uses a common language dictionary. The apparatus is also provided with one or more lexica that contain a vocabulary relating to a specific subject matter. The ambiguous keystroke disambiguation and/or word autocompletion text editor application uses one or more of the lexica in combination with a language dictionary for ambiguous keystroke disambiguation and/or word autocompletion text editing. The user can determine which of the lexica are to be used by the ambiguous keystroke disambiguation and/or word autocompletion text editor application. The user can also determine the priority with which the lexica are used. The lexica can be downloaded to the apparatus from the Internet or transferred from any other device that the apparatus is connected to, and the lexica that are stored on the apparatus can be edited by a user. 11-04-2010
20080281584 FAST ACOUSTIC CANCELLATION - A speech enhancement system improves the perceptual quality of an aural signal. A receiver detects and receives an unvoiced signal, a fully voiced signal, or a mixed voice remote signal. A coherence processor identifies the similarities or differences between a local signal and the remote signal. A cancellation processor or controller dampens reflected signals that may be part of the local signal. 11-13-2008
20120046940 METHOD FOR PROCESSING MULTICHANNEL ACOUSTIC SIGNAL, SYSTEM THEREOF, AND PROGRAM - A method for processing multichannel acoustic signals, whereby input signals of a plurality of channels including the voices of a plurality of speaking persons are processed. The method is characterized by comprising: calculating the first feature quantity of the input signals of the multichannels for each channel; calculating similarity of the first feature quantity of each channel between the channels; selecting channels having high similarity; separating signals using the input signals of the selected channels; inputting the input signals of the channels having low similarity and the signals after the signal separation; and detecting a voice section of each speaking person or each channel. 02-23-2012
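The channel selection step in this abstract can be sketched as follows. The abstract does not say what the first feature quantity or the similarity measure is, so this sketch assumes frame-wise log energy as the feature and Pearson correlation between channels as the similarity. The names `frame_log_energy` and `split_by_similarity`, the frame length, and the threshold are hypothetical.

```python
import numpy as np


def frame_log_energy(x, frame_len):
    """Per-channel, per-frame log energy; x has shape (channels, samples)."""
    n = x.shape[1] // frame_len
    frames = x[:, : n * frame_len].reshape(x.shape[0], n, frame_len)
    return np.log10(np.mean(frames ** 2, axis=2) + 1e-12)


def split_by_similarity(x, frame_len=160, threshold=0.8):
    """Split channel indices into (high, low) similarity groups.

    Assumption: a channel counts as "high similarity" when its feature
    sequence correlates with that of at least one other channel at or
    above `threshold`. Requires at least two channels.
    """
    feats = frame_log_energy(np.asarray(x, dtype=float), frame_len)
    sim = np.corrcoef(feats)          # channel-by-channel correlation matrix
    np.fill_diagonal(sim, -np.inf)    # ignore each channel's self-similarity
    best = sim.max(axis=1)
    high = [c for c in range(len(best)) if best[c] >= threshold]
    low = [c for c in range(len(best)) if c not in high]
    return high, low
```

Under these assumptions, the `high` group would be passed to signal separation (for example, to remove crosstalk between neighbouring microphones), while the `low` group would go directly to voice section detection together with the separated signals.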
20110071820 Voice-quality evaluating system, communication system, test management apparatus, and test communication apparatus - A voice-quality evaluating system, in a secure network that allows a voice packet to pass, transmits and receives communication information for voice quality testing between a test management apparatus and a test communication apparatus connected to the network and between the test communication apparatuses, for the voice quality testing between the test communication apparatuses arranged on the network. The voice-quality evaluating system embeds the communication information in a payload of the voice packet, and transmits and receives the communication-information-embedded voice packet. 03-24-2011
20120323566 AUTOMATED DEMOGRAPHIC ANALYSIS BY ANALYZING VOICE ACTIVITY - Methods, systems, and media for determining a response to be generated in an environment are provided. The methods, systems, and media monitor the environment for a voice activity of an individual. The voice activity of the individual is detected and analyzed. A content descriptor of the voice activity is determined based on the voice activity of the individual. A demographic descriptor of the individual is determined based on the voice activity of the individual. The content descriptor, the demographic descriptor, and known information are correlated to determine the response to be generated in the environment. 12-20-2012
20120095753 NOISE POWER ESTIMATION SYSTEM, NOISE POWER ESTIMATING METHOD, SPEECH RECOGNITION SYSTEM AND SPEECH RECOGNIZING METHOD - A noise power estimation system for estimating noise power of each frequency spectral component includes a cumulative histogram generating section for generating a cumulative histogram for each frequency spectral component of a time series signal, in which the horizontal axis indicates an index of power level and the vertical axis indicates cumulative frequency and which is weighted by an exponential moving average; and a noise power estimation section for determining an estimated value of noise power for each frequency spectral component of the time series signal based on the cumulative histogram. 04-19-2012
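A minimal sketch of a cumulative-histogram noise estimator of this kind is shown below. The abstract does not state how the estimate is read off the cumulative histogram, so this sketch assumes the noise level is the lowest power level at which the cumulative histogram reaches a fixed fraction of its total. The class name `HistogramNoiseEstimator`, the dB grid, and the parameters `alpha` and `fraction` are hypothetical.

```python
import numpy as np


class HistogramNoiseEstimator:
    """Per-frequency-bin noise power tracking from an exponentially weighted histogram."""

    def __init__(self, n_bins, l_min=-100.0, l_step=1.0, n_levels=200,
                 alpha=0.98, fraction=0.5):
        self.n_bins = n_bins
        self.l_min = l_min        # lowest power level on the grid, in dB
        self.l_step = l_step      # width of one power-level index, in dB
        self.n_levels = n_levels
        self.alpha = alpha        # forgetting factor of the moving average
        self.fraction = fraction  # assumed point on the cumulative histogram
        self.hist = np.zeros((n_bins, n_levels))

    def update(self, power):
        """Feed one frame of per-bin power values; return per-bin noise power."""
        power = np.asarray(power, dtype=float)
        level_db = 10.0 * np.log10(np.maximum(power, 1e-20))
        idx = np.floor((level_db - self.l_min) / self.l_step).astype(int)
        idx = np.clip(idx, 0, self.n_levels - 1)
        # Exponential moving average: decay old counts, add the new observation.
        self.hist *= self.alpha
        self.hist[np.arange(self.n_bins), idx] += 1.0 - self.alpha
        # Cumulative histogram over power-level index, one row per frequency bin.
        cum = np.cumsum(self.hist, axis=1)
        target = self.fraction * cum[:, -1]
        # First level index at which the cumulative count reaches the target.
        est_idx = np.argmax(cum >= target[:, None], axis=1)
        est_db = self.l_min + est_idx * self.l_step
        return 10.0 ** (est_db / 10.0)
```

Because the estimate is taken from the lower part of the distribution, short bursts of high power, such as speech, shift it only slightly, while the forgetting factor lets it follow slow changes in the background level.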
20080228470 Signal separating device, signal separating method, and computer program - A signal separating device that receives input signals formed by mixing plural signals and separates them into individual signals includes a signal converting unit that converts the input signals into signals in the time-frequency domain and generates observation spectrograms, and a signal separating unit that generates separated results from the observation spectrograms generated by the signal converting unit. The signal separating unit interprets the observation spectrograms as observation signals subjected to convolutive mixtures in the time-frequency domain and generates separated results by executing processing for solving convolutive mixtures in the time-frequency domain. 09-18-2008

