Voice recognition

Subclass of:

704 - Data processing: speech signal processing, linguistics, language translation, and audio compression/decompression

704200000 - SPEECH SIGNAL PROCESSING

704231000 - Recognition

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Class / Patent application number | Description | Number of patent applications / Date published
704249000 Subportions 35
704250000 Specialized models 27
704247000 Preliminary matching 7
704248000 Endpoint detection 7
Entries
Document | Title | Date
20110208524USER PROFILING FOR VOICE INPUT PROCESSING - This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device.08-25-2011
20090319270CAPTCHA Using Challenges Optimized for Distinguishing Between Humans and Machines - An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system.12-24-2009
20110196677Analysis of the Temporal Evolution of Emotions in an Audio Interaction in a Service Delivery Environment - According to one illustrative embodiment, a method is provided for analyzing an audio interaction. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided.08-11-2011
20130211836GLOBAL SPEECH USER INTERFACE - A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.08-15-2013
20100049515VEHICLE-MOUNTED VOICE RECOGNITION APPARATUS - A vehicle-mounted voice recognition apparatus 02-25-2010
20100076763VOICE RECOGNITION SEARCH APPARATUS AND VOICE RECOGNITION SEARCH METHOD - A voice recognition search apparatus includes: a dictionary create unit creating a first voice recognition dictionary from a search subject data; a voice acquisition unit acquiring first and second voices; a voice recognition unit creating first and second text data by recognizing the first and second voices using the first and second voice recognition dictionaries; a first search unit searching the search subject data by the first text data; and a second search unit searching a search result of the first search unit by the second text data.03-25-2010
20130080168AUDIO ANALYSIS APPARATUS - An audio analysis apparatus includes the following components. A strap has an end portion connected to a main body and is used to hang the main body from a user's neck. A first audio acquisition device is at the end portion or in the main body. Second and third audio acquisition devices are at positions separate from the end portion by substantially the same predetermined distances, on the respective sides of the strap extending from the user's neck. An analysis unit discriminates whether an acquired sound is an uttered voice of the user or another person by comparing audio signals acquired by the first and second or third audio acquisition devices and detects an orientation of the user's face by comparing the audio signals acquired by the second and third audio acquisition devices. A transmission unit transmits the analysis result to an external apparatus.03-28-2013
20130080167Background Speech Recognition Assistant Using Speaker Verification - In one embodiment, a method includes receiving an acoustic input signal at a speech recognizer. A user is identified that is speaking based on the acoustic input signal. The method then determines speaker-specific information previously stored for the user and a set of responses based on the recognized acoustic input signal and the speaker-specific information for the user. It is determined if the response should be output and the response is outputted if it is determined the response should be output.03-28-2013
20090030689Mobile voice recognition data collection and processing - Voice recognition methods, systems and interfaces are used to collect data and produce databases that are then searched and used to produce reports or electronic filings. The databases are developed using a hierarchically designed command structure and a hierarchy of relational databases for the entry and recognition of voice commands. The invention uses an Adaptive Grammar that allows a very high probability for accurate recognition and a rapid recognition response to be achieved. The invention allows for multiple users and multiple mobile computers to maximize voice recognition capabilities.01-29-2009
20130041666VOICE RECOGNITION APPARATUS, VOICE RECOGNITION SERVER, VOICE RECOGNITION SYSTEM AND VOICE RECOGNITION METHOD - A voice recognition apparatus, a voice recognition server, a voice recognition system, and a voice recognition method, in which a general-purpose voice recognition engine may accurately recognize a limited number of words used in a specific area.02-14-2013
20130041665Electronic Device and Method of Controlling the Same - There are disclosed an electronic device and a method of controlling the electronic device. The electronic device according to an aspect of the present invention includes a display unit, a voice input unit, and a control unit configured to output a plurality of contents through the electronic device, receive a voice command through the voice input unit for performing a command, determine which of the plurality of contents correspond to the received voice command, and perform the command on one or more of the plurality of contents that correspond to the received voice command. According to the present invention, multi-tasking performed in an electronic device can be efficiently controlled through a voice command.02-14-2013
20100106504INTELLIGENT MECHANISM TO AUTOMATICALLY DISCOVER AND NOTIFY A POTENTIAL PARTICIPANT OF A TELECONFERENCE - A computer-implemented method, computer program product, and data processing system for notifying an identified person of a teleconference. Data corresponding to an audio record of the teleconference is received. Pattern recognition is performed on the data. Responsive to recognizing in the data a pattern corresponding to an identification of the identified person, a device associated with the identified person is contacted.04-29-2010
20100106503SPEAKER VERIFICATION METHODS AND APPARATUS - In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print that models speech of a user whose identity the speaker is asserting is provided. The method comprises acts of performing a first verification stage comprising acts of obtaining a first voice signal from the speaker uttering at least one first challenge utterance; and comparing at least one characteristic feature of the first voice signal with at least a portion of the voice print to assess whether the at least one characteristic feature of the first voice signal is similar enough to the at least a portion of the voice print to conclude that the first voice signal was obtained from an utterance by the user. The method further comprises performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user, the second verification stage comprising acts of adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, obtaining a second voice signal from the speaker uttering at least one second challenge utterance, and comparing at least one characteristic feature of the second voice signal with at least a portion of the adapted voice print to assess whether the at least one characteristic feature of the second voice signal is similar enough to the at least a portion of the adapted voice print to conclude that the second voice signal was obtained from an utterance by the user.04-29-2010
20100106502SPEAKER VERIFICATION METHODS AND APPARATUS - In one aspect, a method for determining validity of an identity asserted by a speaker using a voice print associated with a user whose identity the speaker is asserting, the voice print obtained from characteristic features of at least one first voice signal obtained from the user uttering at least one enrollment utterance including at least one enrollment word is provided. The method comprises acts of obtaining a second voice signal of the speaker uttering at least one challenge utterance, wherein the at least one challenge utterance includes at least one word that was not in the at least one enrollment utterance, obtaining at least one characteristic feature from the second voice signal, comparing the at least one characteristic feature with at least a portion of the voice print to determine a similarity between the at least one characteristic feature and the at least a portion of the voice print, and determining whether the speaker is the user based, at least in part, on the similarity between the at least one characteristic feature and the at least a portion of the voice print.04-29-2010
20090043579TARGET SPECIFIC DATA FILTER TO SPEED PROCESSING - A method is presented which reduces data flow and thereby increases processing capacity while preserving a high level of accuracy in a distributed speech processing environment for speaker detection. The method and system of the present invention includes filtering out data based on a target speaker specific subset of labels using data filters. The method preserves accuracy and passes only a fraction of the data by optimizing target specific performance measures. Therefore, a high level of speaker recognition accuracy is maintained while utilizing existing processing capabilities.02-12-2009
20090112591SYSTEM AND METHOD OF WORD LATTICE AUGMENTATION USING A PRE/POST VOCALIC CONSONANT DISTINCTION - Disclosed are systems and methods for recognizing speech in a spoken dialogue system. The method includes (1) receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant, (2) generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result, (3) distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech, (4) calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score, (5) determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score, and (6) refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch.04-30-2009
20090112590SYSTEM AND METHOD FOR IMPROVING INTERACTION WITH A USER THROUGH A DYNAMICALLY ALTERABLE SPOKEN DIALOG SYSTEM - Disclosed are systems and methods for dynamically interacting with a user through a spoken dialogue system. A method includes the steps of (1) receiving a user utterance, (2) analyzing the user utterance for a threshold determination of dialect, (3) generating a response that reflects an incremental implementation of the dialect, (4) further varying the perceived implementation of the dialect in subsequent responses by a process of: (a) receiving a subsequent user utterance, (b) determining a modified level of confidence in the dialect based at least in part from the subsequent utterance, (c) generating a subsequent response that implements an incremental variation according to the modified level of confidence.04-30-2009
20090112589ELECTRONIC APPARATUS AND SYSTEM WITH MULTI-PARTY COMMUNICATION ENHANCER AND METHOD - A multi-party communication enhancer includes an audio data input adapted to receive voice data associated with a plurality of communication participants. A participant identifier included in the multi-party communication enhancer is adapted to distinguish the voice of a number of communication participants as represented within the received voice data. A cue generator, also included in the multi-party communication enhancer, is operable to generate a cue for each distinguished voice, with the generated cue being outputted in association with the corresponding distinguished voice.04-30-2009
20120191454Method and Apparatus for Obtaining Statistical Data from a Conversation - A system is described to monitor various parameters of a conversation, for example distinguishing voices in a conversation and reporting who in the group is violating the proper etiquette rules of conversation. These results would indicate any disruptive individuals in a conversation. So they are identified, monitored, trained to prevent further disturbances, and their etiquette is improved to prevent further disturbances. Some of the functions the system can perform include: report the identity of the voices, report how long one has spoken, report how often one interrupts, report how often one raises their voice, count the occurrences of obscenities and determine length of silences. The system, in addition, can provide meaning of words, send email, identify fast talkers, train to reduce the volume of a voice, provide a period of time to a voice, beep after someone uses a profanity, request a voice to speak up, provide grammatical corrections, provide text copies of conversation, and eliminate background noises.07-26-2012
20130060569VOICE AUTHENTICATION SYSTEM AND METHOD USING A REMOVABLE VOICE ID CARD - A voice authentication system using a removable voice ID card comprises: at server side, a voiceprint database for storing the voiceprints of all authorized users; a voiceprint updating means for updating the voiceprints in said voiceprint database; and a voiceprint digest generator for generating a voiceprint digest according to a request from a client; at client side, a voice ID card for storing the voiceprint of an authorized user; a validation means for validating the voiceprint in the voice ID card on the basis of the voiceprint digest from the server; an audio device for performing voice interaction with a user; and a voice authentication means for determining whether the voiceprint from said voice ID card is of the same speaker as the voice from said audio device.03-07-2013
20090271196CLASSIFYING PORTIONS OF A SIGNAL REPRESENTING SPEECH - Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a frame of the signal representing speech. The frame can be classified as unvoiced or voiced based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. In response to classifying the frame as voiced, the frame can be processed.10-29-2009
20090083033Phonetic Searching - An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.03-26-2009
20090030690SPEECH ANALYSIS APPARATUS, SPEECH ANALYSIS METHOD AND COMPUTER PROGRAM - A speech analysis apparatus analyzing prosodic characteristics of speech information and outputting a prosodic discrimination result includes an input unit inputting speech information, an acoustic analysis unit calculating relative pitch variation and a discrimination unit performing speech discrimination processing, in which the acoustic analysis unit calculates a current template relative pitch difference, determining whether a difference absolute value between the current template relative pitch difference and a previous template relative pitch difference is equal to or less than a predetermined threshold or not, when the value is not less than the threshold, calculating an adjacent relative pitch difference, and when the adjacent relative pitch difference is equal to or less than a previously set margin value, executing correction processing of adding or subtracting an octave of the current template relative pitch difference to calculate the relative pitch variation by applying the relative pitch difference as the relative pitch difference of the current analysis frame.01-29-2009
20110022389APPARATUS AND METHOD FOR IMPROVING PERFORMANCE OF VOICE RECOGNITION IN A PORTABLE TERMINAL - An apparatus and method for improving the performance of voice recognition in a portable terminal are provided. The apparatus includes a voice recognition management unit, and a controller. After recognizing a user's voice and extracting at least one voice parameter, the voice recognition management unit determines if the extracted at least one voice parameter meets a criterion for determining one of success and failure of voice recognition. The controller analyzes a result of the determination by the voice recognition management unit and outputs a result of the analysis.01-27-2011
20110022388METHOD AND SYSTEM FOR SPEECH RECOGNITION USING SOCIAL NETWORKS - In an example embodiment, there is disclosed an apparatus comprising an audio interface configured to receive an audio signal, a data interface configured to communicate with at least one social graph, and logic coupled to the audio interface and the data interface. The logic is configured to identify a calling party. The logic is further configured to acquire data representative of a called party from the audio signal. The logic is configured to initiate a search of the at least one social graph for the data representative of the called party to identify the called party responsive to acquiring the data representative of the called party.01-27-2011
20090326942METHODS OF IDENTIFICATION USING VOICE SOUND ANALYSIS - Methods of using individually distinctive patterns of voice characteristics to identify a speaker include computing the reassigned spectrogram of each of at least two voice samples, pruning each reassigned spectrogram to remove noise and other computational artifacts, and comparing (either visually or with the aid of a processor) the strongest points to determine whether the voice samples belong to the same speaker.12-31-2009
20130166298VOICE ANALYZER - A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body to make the apparatus body hung from a neck of a wearer, a first voice acquisition unit that acquires a voice of a speaker and is disposed in either a left or right strap when viewed from the wearer, a second voice acquisition unit that acquires the voice of the speaker and is disposed in the opposite strap in which the first voice acquisition unit is disposed, and an arrangement recognition unit that recognizes arrangements of the first and second voice acquisition units, when viewed from the wearer, by comparing a voice signal of the voice acquired by the first voice acquisition unit with sound pressure of a heart sound of the wearer acquired by the second voice acquisition unit.06-27-2013
20130166299VOICE ANALYZER - A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body and is used to hang the apparatus body from a neck of a user, a first voice acquisition unit provided in the strap or the apparatus body, a second voice acquisition unit provided at a position where a distance of a sound wave propagation path from a mouth of the user is smaller than a distance of a sound wave propagation path from the mouth of the user to the first voice acquisition unit, and an identification unit that identifies a sound, in which first sound pressure acquired by the first voice acquisition unit is larger by a predetermined value or more than second sound pressure acquired by the second voice acquisition unit, on the basis of a result of comparison between the first sound pressure and the second sound pressure.06-27-2013
20130166300ELECTRONIC DEVICE, DISPLAYING METHOD, AND PROGRAM COMPUTER-READABLE STORAGE MEDIUM - An electronic device includes a voice recognition analyzing module, a manipulation identification module, and a manipulating module. The voice recognition analyzing module is configured to recognize and analyze a voice of a user. The manipulation identification module is configured to, using the analyzed voice, identify an object on a screen and identify a requested manipulation associated with the object. The manipulating module is configured to perform the requested manipulation.06-27-2013
20090006094OPTIMIZATION OF DETECTION SYSTEMS USING A DETECTION ERROR TRADEOFF ANALYSIS CRITERION - In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest.01-01-2009
20090276217VOIP CALLER AUTHENTICATION BY VOICE SIGNATURE CONTINUITY - There are provided methods and systems for authenticating a user. A method includes receiving a voice signature certificate corresponding to a setup portion of a Voice over Internet Protocol (VoIP) call. The VoIP call further has a voice conversation portion. The voice signature certificate includes a voice signature segment. The method further includes reproducing the voice signature segment to enable verification of voice continuity from the setup portion to the voice conversation portion. The verification is performed by comparing the voice signature segment to a user's voice during the voice conversation portion.11-05-2009
20110282666Utterance state detection device and utterance state detection method - An utterance state detection device includes a user voice stream data input unit that gets user voice stream data of a user, a frequency element extraction unit that extracts high frequency elements by frequency-analyzing the user voice stream data, a fluctuation degree calculation unit that calculates a fluctuation degree of the high frequency elements thus extracted every unit time, a statistic calculation unit that calculates a statistic every certain interval based on a plurality of the fluctuation degrees in a certain period of time, and an utterance state detection unit that detects an utterance state of a specified user based on the statistic obtained from user voice stream data of the specified user.11-17-2011
20110282665METHOD FOR MEASURING ENVIRONMENTAL PARAMETERS FOR MULTI-MODAL FUSION - Provided is a method for measuring environmental parameters for multi-modal fusion. The method for measuring environmental parameters for multi-modal fusion, includes: preparing at least one enrolled modality; receiving at least one input modality; calculating image related environmental parameters of input images in at least one input modality based on illumination of enrolled image in at least one enrolled modality; and comparing the image related environmental parameters with a predetermined reference value and discarding the input image or outputting it as a recognition data according to the comparison result.11-17-2011
20090171660METHOD AND APPARATUS FOR VERIFICATION OF SPEAKER AUTHENTIFICATION AND SYSTEM FOR SPEAKER AUTHENTICATION - A method for verification of speaker authentication comprises inputting a test utterance containing a password that is spoken by a speaker, extracting an acoustic feature vector sequence from the inputted test utterance, obtaining a matching path between the extracted acoustic feature vector sequence and a speaker template enrolled by an enrolled speaker, calculating a matching score of the obtained matching path upon considering spectral change of the test utterance and/or spectral change of the speaker template, and comparing the matching score with a predefined discriminating threshold to determine whether the inputted test utterance is an utterance containing a password spoken by the enrolled speaker.07-02-2009
20110301954METHOD FOR ADJUSTING A VOICE RECOGNITION SYSTEM COMPRISING A SPEAKER AND A MICROPHONE, AND VOICE RECOGNITION SYSTEM - A method for adjusting a voice recognition system and a voice recognition system is disclosed, wherein the voice recognition system comprises a speaker and a microphone, and wherein the method comprises the steps of: 12-08-2011
20110295603SPEECH RECOGNITION ACCURACY IMPROVEMENT THROUGH SPEAKER CATEGORIES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a computer-based method includes receiving a speech corpus at a speech management server system that includes multiple speech recognition engines tuned to different speaker types; using the speech recognition engines to associate the received speech corpus with a selected one of multiple different speaker types; and sending a speaker category identification code that corresponds to the associated speaker type from the speech management server system over a network. The speaker category identification code can be used by any one of speech-interactive applications coupled to the network to select an appropriate one of multiple application-accessible speech recognition engines tuned to the different speaker types in response to an indication that a user accessing the application is associated with a particular one of the speaker category identification codes.12-01-2011
20090150151Audio processing apparatus, audio processing system, and audio processing program - Disclosed herein is an audio processing apparatus for processing a plurality of pieces of audio data of sounds picked up by a plurality of microphones. The apparatus includes: a speaker identification section configured to identify a speaker based on the audio data; a simultaneous speech section identification section configured to, when at least first and second speakers have been identified, identify speech sections during which the first and second speakers have made speeches, and identify a section during which the first and second speakers have made the speeches at the same time as a simultaneous speech section; and an arranging section configured to separate audio data of the first speaker and audio data of the second speaker from the simultaneous speech section, and allow the audio data of the first speaker and the audio data of the second speaker to be outputted at mutually different timings.06-11-2009
20090150150SYSTEM AND METHOD FOR CONTROLLING ACCESS TO A HANDHELD DEVICE BY VALIDATING VOICE SOUNDS - A method for controlling access to a handheld device (06-11-2009
20090150149Identifying far-end sound - Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo.06-11-2009
20080312923Active Speaker Identification - Procedures for identifying clients in an audio event are described. In an example, a media server may order clients providing audio based on the input level. An identifier may be associated with the client for identifying the client providing input within the event. The ordered clients may be included in a list which may be inserted into a packet header carrying the audio content.12-18-2008
20120035929MESSAGING SYSTEM - The messaging system (02-09-2012
20090094029Managing Audio in a Multi-Source Audio Environment - Methods, systems, and computer-readable media provide for the management of an audio environment with multiple audio sources. According to various embodiments described herein, real-time audio from multiple sources is received. A speaker is identified for each of the audio sources. Upon detecting a change from a first audio source to a second audio source, an identification of the speaker associated with the second audio source is provided. According to various embodiments, a recording of the real-time audio may be made and descriptors inserted to identify each speaker as the audio source changes. Real-time feedback from the speakers regarding characteristics of the audio may be received and corresponding adjustments to the audio made.04-09-2009
20090094030INDEXING METHOD FOR QUICK SEARCH OF VOICE RECOGNITION RESULTS - A method, system and computer program product for receiving a spoken request to obtain indexed results from a database. Like result types are assigned to categories, and within each category is a plurality of result entries. The result indices are hexadecimal encoded, and each hexadecimal encoding is preceded by an initial character representing the result category. A speech recognition system is engaged, which processes the spoken request. When an item is requested, the respective category is implicitly known by the index returned, and the index provides direct access within a database to the corresponding result based on the phonetics of the request.04-09-2009
20080208580Method and Dialog System for User Authentication - The invention relates to a method of authenticating a user (N). In a dialog between the user (N) to be authenticated and a dialog system (08-28-2008
20090276218Robot and Server with Optimized Message Decoding - A method for optimizing message transmission and decoding comprises: reading data from a memory of an originating device, the data comprising information regarding the originating device; encoding the data by converting the data to a subset of words having a ranked recognition accuracy higher than the remainder of words; transmitting the encoded data from the originating device to a receiving system audibly as words via a telephone connection; utilizing a voice recognition software to recognize the words; decoding the words back to the data; and taking a predetermined action based on the data.11-05-2009
20110173002IN-VEHICLE DEVICE AND METHOD FOR MODIFYING DISPLAY MODE OF ICON INDICATED ON THE SAME - A storage unit stores a correspondence between a voice command and a display mode modification operation. When a control unit determines that a vehicle is traveling according to a traveling state of the vehicle obtained by a traveling state acquisition unit, when a voice recognition unit recognizes a voice, which is uttered by a user and received by a voice input unit, and when the control unit determines that the recognized voice corresponds to a voice command stored in the storage unit, the control unit performs a display mode change operation corresponding to the voice command and modifies a display mode of an icon indicated on an indication screen of an indication unit.07-14-2011
20110173001SMS MESSAGING WITH VOICE SYNTHESIS AND RECOGNITION - When a subscriber's phone is sent an SMS message from any other Public Switched Telephone Network user, a voice call to the subscriber's phone is placed, and upon answering, the SMS message is translated into speech. A jargon translator is employed to convert SMS language into corresponding words. Once the message has been played, the subscriber receiving it may verbally request the opportunity to send a reply to the message by audibly speaking a response. The response is matched against an internal phrasebook to accurately transcribe the message. Transcription performance is improved by allowing each subscriber to provide a personal phrasebook which is combined with the internal one. However, if the spoken message is complex or not recognized, the message can be automatically relayed to a human agent for manual transcription.07-14-2011
20090287489SPEECH PROCESSING FOR PLURALITY OF USERS - A mobile communication device configured to communicate over a wireless network has an audio processing circuit that is adaptable based on a pattern of the speaker's voice to provide improved audio quality and intelligibility. The audio processing circuit is configured to receive a voice signal from an individual speaker, to determine a pattern associated with the speaker's voice, and to adjust a filter based on the determined pattern.11-19-2009
20120296649Digital Signatures for Communications Using Text-Independent Speaker Verification - A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature.11-22-2012
20120296650SPEECH RECOGNITION SYSTEM FOR PROVIDING VOICE RECOGNITION SERVICES USING A CONVERSATIONAL LANGUAGE - Embodiments of the present invention provide a method, system and article of manufacture for adjusting a language model within a voice recognition system, based on text received from an external application. The external application may supply text representing the words of one participant to a text-based conversation. In such a case, changes may be made to a language model by analyzing the external text received from the external application.11-22-2012
20110270611OIL LEVEL INSPECTION SYSTEM FOR RAILROAD CAR TRUCK - A system for inspecting an oil level in each part of a railroad car truck includes: an imaging unit that obtains an image of an oil level gauge; an oil level inspection unit that inspects whether or not the oil level in each part of the railroad car truck is within a predetermined range based on the image of the oil level gauge obtained by the imaging unit; a voice input unit adapted for an inspector to input, via voice, an inspection result; a voice processing unit that determines whether or not the inspection result inputted via the voice input unit is good based on the inputted inspection result, and converts a determination result into displayable data; a display unit that displays an oil level inspection result and the determination result; and a storage unit that stores, as data, the oil level inspection result and the determination result.11-03-2011
20080275703Method and apparatus for identity verification - The present disclosure relates to identity verification devices and methods. A system is provided that utilizes a system of tonal and rhythmic visualization methods to accurately identify the true owner of a credit or other personal card based on their voice.11-06-2008
20080281594Autoscriber - The Autoscriber invention pertains to a system of inserting a printed SMPTE timecode into a textual representation of the spoken portion of a media recording. The system includes a user-supplied computer with Autoscriber (voice recognition) software, a printer and a user-supplied media recorder/player with SMPTE timecode reader and RS-422 data output.11-13-2008
20080312924SYSTEM AND METHOD FOR TRACKING PERSONS OF INTEREST VIA VOICEPRINT - Disclosed are systems, methods, and computer readable media for tracking a person of interest. The method embodiment comprises identifying a person of interest, capturing a voiceprint of the person of interest, comparing a received voiceprint of a caller with the voiceprint of the person of interest, and tracking the caller if the voiceprint of the caller is a substantial match to the voiceprint of the person of interest.12-18-2008
20080312922Method and System for Packetised Content Streaming Optimisation - A method of determining the speech content of a packet carrying speech encoded data missing from a speech segment communicated in a packetised data stream using at least one VOIP link between a server platform and a client platform, the method comprising at the client platform: receiving a plurality of packets carrying speech encoded data forming said packetised data stream; processing each received packet to determine a unique message segment identifier associated with a speech segment of the received packet; processing each received packet to determine if it contains another unique message segment identifier associated with a previously received packet carrying encoded speech data; determining if the unique message segment identifier for the received packet exists in storage means provided on the client platform, and if not, storing the received packet in association with its unique message segment identifier; processing each received packet to determine a sequence identifier; checking if the sequence identifier is contiguous in sequence with a previously received packet stored locally on said client platform, and if not, determining the speech content of one or more missing packets in the sequence sent by the server platform to the client platform by retrieving a packet from said storage means having the same unique message segment identifier as the missing packet.12-18-2008
20100280828Communication Device Language Filter - Techniques are described that generally relate to systems, methods, and devices designed to selectively filter offensive communications in accordance with a user's intentions. Example methods may be designed to filter (such as by deleting, blocking, replacing, and/or modifying) various offensive words, phrases, and/or sounds that have been identified as having offensive meanings.11-04-2010
20100145695APPARATUS FOR CONTEXT AWARENESS AND METHOD USING THE SAME - An apparatus for context awareness includes: a voice-based recognition unit that recognizes a user's emotional state on the basis of a voice signal; a motion-based recognition unit that recognizes the user's emotional state on the basis of a motion; a position recognition unit that recognizes a location where the user is positioned; and a mergence-recognition unit that recognizes a user's context by analyzing the recognition results of the voice-based recognition unit, the motion-based recognition unit, and the position recognition unit. Accordingly, it is possible to rapidly and accurately recognize accidents or dangerous contexts caused to a user.06-10-2010
20080235016SYSTEM AND METHOD FOR DETECTION AND ANALYSIS OF SPEECH - Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child.09-25-2008
20090119106BUILDING WHITELISTS COMPRISING VOICEPRINTS NOT ASSOCIATED WITH FRAUD AND SCREENING CALLS USING A COMBINATION OF A WHITELIST AND BLACKLIST - According to one aspect of the invention there is provided a method, comprising collecting voiceprints of callers; identifying which of the collected voiceprints are associated with fraud; and generating a whitelist comprising voiceprints corresponding to the collected voiceprints not identified as associated with fraud.05-07-2009
20090125307System and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks - A system and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks according to the pre-stored speech sounds and characteristics of devices, by which each user can use speaker-dependent speech recognition engines in different devices without the need of repeating the same procedure of recording speech to train speech recognition engines for newly utilized devices.05-14-2009
20090089056ELECTRONIC APPARATUS AND DISPLAY PROCESS METHOD - According to one embodiment, an electronic apparatus includes a sound characteristic output module configured to analyze audio data in video content data, thereby outputting sound characteristic information indicative of sound characteristics of the audio data. A talk section detection process module detects talk sections in which talks are made by persons, which are included in the video content data, on the basis of the sound characteristic information, and classifies the detected talk sections into a plurality of groups which are associated with different speakers. A display process module displays, on a time bar which is representative of a sequence of the video content data, a plurality of bar areas indicative of positions of the detected talk sections in the sequence of the video content data, in different display modes in association with the groups.04-02-2009
20090164215DEVICE WITH VOICE-ASSISTED SYSTEM - A device with a voice-assisted system is provided by using a voice command to adjust operations. The voice-assisted system includes a voice recognition engine and a control device. The voice recognition engine receives a voice command and outputs a voice signal based on the voice command to the control unit. The control unit based on the voice signal adjusts the operations. A user is only required to input the voice command. The voice recognition engine performs a series of actions to adjust the operations. Therefore, the voice-assisted system can enhance convenience of adjusting the operations of the device and reduce operation complexity for the user.06-25-2009
20090187405Arrangements for Using Voice Biometrics in Internet Based Activities - In one embodiment, a method for identifying a user of a virtual universe utilizing audio biometrics is disclosed. The method can include prompting a client application with a request for an utterance, processing the reply to the request and creating a voice profile of the user/speaker. The voice profile can be associated with an avatar and when an utterance is received, the avatar can be identified and in some embodiments authenticated. Other embodiments are also disclosed.07-23-2009
20130218561System and Method for Enhancing Voice-Enabled Search Based on Automated Demographic Identification - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information.08-22-2013
20090326944VOICE RECOGNITION APPARATUS AND METHOD - An input voice is detected after starting a voice input waiting state; the detected voice is recognized; an elapsed time from the start of the voice input waiting state is counted; an informative sound which urges a user to input the voice is outputted when the elapsed time reaches a preset output set time; and the output of the informative sound is stopped when the elapsed time at the time of inputting the voice is shorter than the output set time.12-31-2009
20110224986VOICE AUTHENTICATION SYSTEMS AND METHODS - A method for configuring a voice authentication system employing at least one authentication engine comprises utilising the at least one authentication engine to systematically compare a plurality of impostor voice samples against a voice sample of a legitimate person to derive respective authentication scores. The resultant authentication scores are analysed to determine a measure of confidence for the voice authentication system.09-15-2011
20090055178System and method of controlling personalized settings in a vehicle - A system is provided for controlling personalized settings in a vehicle. The system includes a microphone for receiving spoken commands from a person in the vehicle, a location recognizer for identifying location of the speaker, and an identity recognizer for identifying the identity of the speaker. The system also includes a speech recognizer for recognizing the received spoken commands. The system further includes a controller for processing the identified location, identity and commands of the speaker. The controller controls one or more feature settings based on the identified location, identified identity and recognized spoken commands of the speaker. The system also optimizes the grammar comparison for speech recognition and the beamforming microphone array used in the vehicle.02-26-2009
20090055179Method, medium and apparatus for providing mobile voice web service - Provided are a method and apparatus for providing a mobile voice web service in a mobile terminal. The method includes analyzing a web history of a user from web search logs of the user and generating a voice access list based on the analysis results, and performing voice recognition by dynamically generating a voice recognition syntax according to the generated voice access list. Accordingly, by limiting syntax required for voice recognition by generating a syntax suitable for a web context of the user, efficient voice recognition, which can be performed in a terminal not a server, can be implemented.02-26-2009
20090198495VOICE SITUATION DATA CREATING DEVICE, VOICE SITUATION VISUALIZING DEVICE, VOICE SITUATION DATA EDITING DEVICE, VOICE DATA REPRODUCING DEVICE, AND VOICE COMMUNICATION SYSTEM - A voice situation data creating device for providing the user with data with a good convenience for the user when the user uses voice data collected from sound sources and recorded with time. A direction/talker identifying section (08-06-2009
20080262840Method Of Verifying Accuracy Of A Speech - A method for verifying the accuracy of speech is provided. The speech is pre-loaded into a dialog system. A medium is provided to verify the accuracy of the pre-loaded speech in the dialog system by comparing the test content with a predetermined speech script.10-23-2008
20080262839Processing Control Device, Method Thereof, Program Thereof, and Recording Medium Containing the Program - A processor 10-23-2008
20080294435System and Method for Remote Speech Recognition - A system and method for remote speech recognition includes one or more customer premise equipment, a speech engine, and a communication engine. The customer premise equipment interfaces with a host from which the customer premise equipment is remotely located. The speech engine, remotely located from the host, recognizes a plurality of speech spoken by a user of the customer premise equipment and translates the speech into the language of the host. The speech engine further converts the recognized speech into one or more text data packets where the text data packets include the recognized speech as data instead of voice. The communication engine encrypts the text data packets and transmits the text data packets to the host. Transmitting data instead of voice to the host reduces the computational demands on the host. Additionally, the communication engine receives a plurality of information from the host.11-27-2008
20090204400System and method for processing a spoken request from a user - A system and method are described for processing a spoken request from a user. In one embodiment, a method is disclosed for attempting to recognize a spoken request from a user with a speech recognition engine above a predetermined level of accuracy. If the spoken request is not recognized above the predetermined level of accuracy, the spoken request is provided to a level one agent. If the level one agent does not recognize the request, a voice connection is established between the user and a level two agent. In another embodiment, a method is disclosed for determining whether a silent response system recognizes a spoken request from a user above a predetermined level of accuracy. A response is provided to the user if the silent response system recognizes the spoken request. Otherwise, a voice connection is established between the user and a call center.08-13-2009
20090210227VOICE RECOGNITION APPARATUS AND METHOD FOR PERFORMING VOICE RECOGNITION - A voice recognition apparatus includes: a voice recognition module that performs a voice recognition for an audio signal during a voice period; a distance measurement module that measures a current distance between the user and a voice input module; a calculation module that calculates a recommended distance range, in which an S/N ratio is estimated to exceed a first threshold, based on the voice characteristic; and a display module that displays the recommended distance range and the current distance.08-20-2009
20090222265Voice Recognition Apparatus - A voice recognition apparatus 09-03-2009
20090248412Association apparatus, association method, and recording medium - There is provided an association apparatus for associating a plurality of voice data converted from voices produced by speakers, comprising: a word/phrase similarity deriving section which derives an appearance ratio of a common word/phrase that is common among the voice data based on a result of speech recognition processing on the voice data, as a word/phrase similarity; a speaker similarity deriving section which derives a result of comparing characteristics of voices extracted from the voice data, as a speaker similarity; an association degree deriving section which derives a possibility of the plurality of the voice data, which are associated with one another, based on the derived word/phrase similarity and the speaker similarity, as an association degree; and an association section which associates the plurality of the voice data with one another, the derived association degree of which is equal to or more than a preset threshold.10-01-2009
20090248414Personal name assignment apparatus and method - An apparatus includes unit acquiring speaker information including a first duration of a speaker and a name specified by name specifying information used to indicate a name, and acquiring the first duration as a first period, unit acquiring a second period including an utterance, unit extracting, if the second period is included in the first period, a first amount that characterizes a speaker, and associating the first amount with a name corresponding to the first period, unit creating speaker models from amounts, unit acquiring, from the content information, a third duration as a duration to be recognized, unit extracting, if the second period is included in the third period, a second amount that characterizes a speaker, unit calculating degrees of similarity between amounts of speaker models and the second amount, and unit recognizing a name of a speaker model which satisfies a set condition of the degrees as a performer.10-01-2009
20090248413DEVICES AND SYSTEMS FOR REMOTE CONTROL - Remote controllers and systems thereof are disclosed. The remote controller remotely operates a receiving host, in which the receiving host provides voice input and voice recognition functions. The remote controller comprises a first input unit and a second input unit for generating a voice input request and a voice recognition request. The generated voice input and voice recognition requests are then sent to the receiving host, thereby forcing the receiving host to perform the voice input and voice recognition functions.10-01-2009
20090259467Voice Recognition Apparatus - A voice recognition apparatus 10-15-2009
20090254343IDENTIFYING AUDIO CONTENT USING DISTORTED TARGET PATTERNS - Embodiments of a system for identifying audio content are described. During operation, the system receives a data stream from an electronic device via a communication network. Then, the system distorts a set of target patterns which are used to identify the audio content based on characteristics of the electronic device and/or the communication network. Next, the system identifies the audio content in the data stream based on the set of distorted target patterns.10-08-2009
20090259469METHOD AND APPARATUS FOR SPEECH RECOGNITION - A method and apparatus for performing speech recognition receives an audio signal, generates a sequence of frames of the audio signal, transforms each frame of the audio signal into a set of narrow band feature vectors using a narrow passband, couples the narrow band feature vectors to a speech model, and determines whether the audio signal is a wide band signal. When the audio signal is determined to be a wide band signal, a pass band parameter of each of one or more passbands that are outside the narrow passband is generated for each frame and the one or more band energy parameters are coupled to the speech model.10-15-2009
20090259470Bio-Phonetic Multi-Phrase Speaker Identity Verification - Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score.10-15-2009
20090259468SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION - Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrates sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received.10-15-2009
20100185444METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING COMPOUND MODELS FOR SPEECH RECOGNITION ADAPTATION - An apparatus for providing compound models for speech recognition adaptation includes a processor. The processor may be configured to receive a speech signal corresponding to a particular speaker, select a cluster model including both a speaker independent portion and a speaker dependent portion based at least in part on a characteristic of speech of the particular speaker, and process the speech signal using the selected cluster model. A corresponding method and computer program product are also provided.07-22-2010
20100179813VOICE RECOGNITION SYSTEM AND METHODS - The present invention relates to a method of providing voice recognition. The method comprises the steps of receiving packetised voice data of a person to be identified over a packet-switched network, comparing the voice data with stored voice data of a user and, based on the comparison, providing an indication of the likelihood that the person to be identified is the user, wherein the step of receiving the voice data comprises waiting for sufficient voice data to be received.07-15-2010
20080215324Indexing apparatus, indexing method, and computer program product - Acoustic models to provide features to a speech signal are created based on speech features included in regions where similarities of acoustic models created based on speech features in a certain time length are equal to or greater than a predetermined value. Feature vectors acquired by using the acoustic models of the regions and the speech features to provide features to speech signals of second segments are grouped by speaker.09-04-2008
20100250252Conference support device, conference support method, and computer-readable medium storing conference support program - A conference support device includes an image receiving portion that receives captured images from conference terminals, a voice receiving portion that receives, from one of the conference terminals, a voice that is generated by a first participant, a first storage portion that stores the captured images and the voice, a voice recognition portion that recognizes the voice, a text data creation portion that creates text data that express the words that are included in the voice, an addressee specification portion that specifies a second participant, whom the voice is addressing, an image creation portion that creates a display image that is configured from the captured images and in which the text data are associated with the first participant and a specified image is associated with at least one of the first participant and the second participant, and a transmission portion that transmits the display image to the conference terminals.09-30-2010
20100217594PERSONAL AUTHENTICATION SYSTEM - In a system in which a business organization authenticates a user by speaker recognition, there is provided the system that obviates a necessity for the user to register a speaker model by uttering voice for each business partner. A user device 08-26-2010
20100145696Method, system and apparatus for improved voice recognition - An improved voice recognition system in which a Voice Keyword Table is generated and downloaded from a set-up device to a voice recognition device. The VKT includes visual form data, spoken form data, phonetic format data, and an entry corresponding to a keyword, and TTS-generated voice prompts and voice models corresponding to the phonetic format data. A voice recognition system on the voice recognition device is updated by the set-up device. Furthermore, voice models in the voice recognition device are modified by the set-up device.06-10-2010
20120173239METHOD FOR VERIFYING THE IDENTITY OF A SPEAKER, SYSTEM THEREFORE AND COMPUTER READABLE MEDIUM - The invention refers to a method of verifying the identity of a speaker based on the speaker's voice comprising the steps of: receiving (07-05-2012
20120143608AUDIO SIGNAL SOURCE VERIFICATION SYSTEM - An audio signal source verification system is presented that, in certain embodiments, receives a first template for an audio signal and compares it to templates from different sound sources to determine a correlation between them. A question and response format may be used to eliminate false verifications and to increase the probability that an audio signal is from the purported source of the signal. Moreover mobile devices may be operated to provide audio signals generated by users of those phones and the audio signals and templates derived from those signals may be compared to known templates to determine a confidence level or other indication may be used to indicate the mobile device user is who they purport to be. Moreover comparisons can be made using templates of different richness to achieve confidence levels and confidence levels may be represented based on the results of the comparisons.06-07-2012
20120245941Device Access Using Voice Authentication - A device can be configured to receive speech input from a user. The speech input can include a command for accessing a restricted feature of the device. The speech input can be compared to a voiceprint (e.g., text-independent voiceprint) of the user's voice to authenticate the user to the device. Responsive to successful authentication of the user to the device, the user is allowed access to the restricted feature without the user having to perform additional authentication steps or speaking the command again. If the user is not successfully authenticated to the device, additional authentication steps can be requested by the device (e.g., request a password).09-27-2012
20090187404BIOMETRIC CONTROL SYSTEM AND METHOD - A large-scale attendance, productivity, activity and availability biometric control method using the telephone network, for individual client users at their work places, with speaker verification technology based on limited enrolling data and short verification sentences, and the equipment associated to this method. Through the biometric control method by means of the individuals' identity (ID) verification using voice recognition, the individual is biometrically identified, which allows for the registering of permanence, entrance or exit times, the type of performed activity, and attendance at the work place, and which keeps records on the performed activity type. The method considers receiving and making intelligent calls to achieve an active control on individuals' activity.07-23-2009
20090112592Remote controller with speech recognition - A receiver remote controller has a storage device storing electronic program guide (EPG) data that relates content to television channels containing said content. The remote controller is contained in a remote controller housing with the housing containing: a data interface that receives the EPG data provided by an EPG data source for storage in the storage device; a speech interface that receives speech input from a user and produces speech signals therefrom; a natural language speech processor engine that receives the speech signals and translates the speech signals to a query of the EPG database; and a processor that receives results of the query from the natural language speech processor, and either conveys the results of the query to a user utilizing a user interface or sends navigation commands to the receiver. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract.04-30-2009
20090106024PORTABLE ELECTRONIC DEVICE AND A NOISE SUPPRESSING METHOD THEREOF - A portable electronic device comprises a main body (04-23-2009
20090043578Enhancing the Response of Biometric Access Systems - Disclosed are arrangements that provide security for items to which access is restricted by providing a single layer of security requiring a biometric signature (02-12-2009
20100305946SPEAKER VERIFICATION-BASED FRAUD SYSTEM FOR COMBINED AUTOMATED RISK SCORE WITH AGENT REVIEW AND ASSOCIATED USER INTERFACE - Disclosed is a method for screening an audio for fraud detection, the method comprising: providing a User Interface (UI) control capable of: a) receiving an audio; b) comparing the audio with a list of fraud audios; c) assigning a risk score to the audio based on the comparison with a potentially matching fraud audio of the list of fraud audios; and d) displaying an audio interface on a display screen, wherein the audio interface is capable of playing the audio along with the potentially matching fraud audio, and wherein the display screen further displays metadata for each of the audio and the potentially matching fraud audio thereon, wherein the metadata includes location and incident data of each of the audio and the potentially matching fraud audio.12-02-2010
20090037173USING SPEAKER IDENTIFICATION AND VERIFICATION SPEECH PROCESSING TECHNOLOGIES TO ACTIVATE A PAYMENT CARD - The present invention discloses a payment card that uses speaker identification and verification (SIV) speech processing techniques for activation purposes. For example, the invention can initially identify a payment card in a deactivated state, which is an internal state of the payment card. Speech input can then be received. Speech characteristics of the speech input can be determined and compared against a voice print of an authorized card user. The payment card can be selectively activated based on comparison results. That is, when the voice print and the speech characteristics match, the payment card can be activated. Otherwise, the card will remain deactivated. An activated payment card is one that has undergone an internal state change from the deactivated state. For example, when activated a credit card number can appear in a display and a magnetic strip can contain payment information, neither of which are present in the deactivated state.02-05-2009
20130144620METHOD, SYSTEM AND PROGRAM FOR VERIFYING THE AUTHENTICITY OF A WEBSITE USING A RELIABLE TELECOMMUNICATION CHANNEL AND PRE-LOGIN MESSAGE - Various embodiments of the present invention for validating the authenticity of a website are provided. An example of a method according to the present invention comprises providing a website having an artifact, receiving a communication from a user, at a service provider, for validating the website associated with a service provider, inquiring from the user a description of the artifact, comparing the artifact on the website with the description of the artifact from the user, and generating an indication to the user based upon the comparing. The communication is over a first communication channel and the website is accessed over a second communication channel. The first communication channel is different than the second. The artifact can be displayed after a user session is identified.06-06-2013
20130144621Systems and Methods for Assessment of Non-Native Spontaneous Speech - Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory.06-06-2013
20110035221Monitoring An Audience Participation Distribution - Apparatus for monitoring an audience participation distribution at an event comprising a speech activity module operable to generate speech data representing speech detected at the event, a speaker identification module operable to determine, using the speech data, a first speaker who has contributed to the detected speech, and a processing unit operable to generate speaker data representing a value for the time that the first speaker has contributed to the detected speech and to output distribution data based on the speaker data representing a measure of the participation for the first speaker at the event.02-10-2011
20110125498METHOD AND APPARATUS FOR HANDLING A TELEPHONE CALL - One embodiment of the invention provides a computer-implemented method of handling a telephone call. The method comprises monitoring a conversation between an agent and a customer on a telephone line as part of the telephone call to extract the audio signal therefrom. Real-time voice analytics are performed on the extracted audio signal while the telephone call is in progress. The results from the voice analytics are then passed to a computer-telephony integration system responsible for the call for use by the computer-telephony integration system for determining future handling of the call.05-26-2011
20100036664SUBTITLE GENERATION AND RETRIEVAL COMBINING DOCUMENT PROCESSING WITH VOICE PROCESSING - An apparatus for retrieving a character string includes: storage for storing first text data obtained by recognizing a voice in a presentation, second text data extracted from document data used in the presentation, and associated information of the first text data and the second text data. The apparatus also includes a retrieval unit for retrieving, by use of the associated information, the character string from text data composed from the first text data and the second text data.02-11-2010
20100070277VOICE RECOGNITION DEVICE, VOICE RECOGNITION METHOD, AND VOICE RECOGNITION PROGRAM - A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit.03-18-2010
20110178801SYSTEM AND METHOD FOR ACCESS TO MULTIMEDIA STRUCTURES - A system for access to multimedia structures has telephone sets capable of connecting to a telephone network, a storage device capable of storing a plurality of multimedia structures representing messages and/or data and/or commands, and a network access server that can be associated with the telephone sets and is capable of selectively instantiating the multimedia structures via an interconnection network. There is also a voice-recognition and speech-synthesis system that can be associated with the network access server and that comprises modules for reading files in XML format and for processing the files so as to obtain files in a format that can be synthesized by a speech synthesizer.07-21-2011
20110071832IMAGE DISPLAY DEVICE, METHOD, AND PROGRAM - It is an object of the present invention to make an act of viewing an image interactive and further enriched. A microphone 03-24-2011
20110071831Method and System for Localizing and Authenticating a Person - The present invention refers to a method for localizing a person comprising the steps carried out in a computing system (03-24-2011
20120303369Energy-Efficient Unobtrusive Identification of a Speaker - Functionality is described herein for recognizing speakers in an energy-efficient manner. The functionality employs a heterogeneous architecture that comprises at least a first processing unit and a second processing unit. The first processing unit handles a first set of audio processing tasks (associated with the detection of speech) while the second processing unit handles a second set of audio processing tasks (associated with the identification of speakers), where the first set of tasks consumes less power than the second set of tasks. The functionality also provides unobtrusive techniques for collecting audio segments for training purposes. The functionality also encompasses new applications which may be invoked in response to the recognition of speakers.11-29-2012
20090319271System and Method for Generating Challenge Items for CAPTCHAs - Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text to speech (TTS) systems.12-24-2009
20090240499LARGE VOCABULARY QUICK LEARNING SPEECH RECOGNITION SYSTEM - A speech recognition system comprising: an analog to digital converter, a time to frequency transformer, a noise filter; a context preprocessor, an acoustic word classifier, an initial acoustic model generator, a textual search module, and a trainer. The system recognizes speech initially prior to training, due to the context preprocessor classifying words of identical sound by the context of a leading and trailing neighboring group of words and by the acoustic model generator creating an initial acoustic model derived from an acoustic word statistical analysis ‘average’. Applications of the system include voice activated computer games, command and control systems and text dictation.09-24-2009
20110257975VOICE OVER IP BASED BIOMETRIC AUTHENTICATION - A receiver receives from a remote system a voice biometric sample from a party attempting to obtain a service from the apparatus using the remote system. A processor selectively determines when to request authentication of the party by a remote voice biometric system. A transmitter transmits a request to the party to provide the voice biometric sample responsive to the processor determining to request authentication of the party. The apparatus provides the service contingent upon authentication of the party by the remote voice biometric system.10-20-2011
20110257974GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.10-20-2011
20090006093SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS - A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker.01-01-2009
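The score combination described in the entry above can be illustrated with a minimal sketch. The abstract does not say how code vectors are compared or how the per-classifier scores are combined, so the Euclidean nearest-neighbor distance, the plain average, the threshold test, and all function names here are assumptions made for illustration only.

```python
import math


def nearest_distance(vector, codebook):
    """Distance from `vector` to the closest code vector in `codebook`."""
    return min(math.dist(vector, code) for code in codebook)


def combined_score(sample_by_classifier, codebooks_by_classifier):
    """Average the per-classifier distances (assumed rule; lower is closer)."""
    distances = [
        nearest_distance(sample_by_classifier[name], codebooks_by_classifier[name])
        for name in codebooks_by_classifier
    ]
    return sum(distances) / len(distances)


def is_recognized(sample_by_classifier, codebooks_by_classifier, threshold):
    """Hypothetical recognition criterion: combined distance within threshold."""
    return combined_score(sample_by_classifier, codebooks_by_classifier) <= threshold


# One claimed speaker, two classifiers that each use a different feature set.
sample = {"a": (0.0, 0.0), "b": (1.0, 1.0)}
codebooks = {"a": [(3.0, 4.0), (0.0, 1.0)], "b": [(1.0, 1.0)]}
print(is_recognized(sample, codebooks, threshold=0.6))
```

Averaging is only one possible combination rule; a weighted sum or a vote across classifiers would fit the same description.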
20110131043VOICE RECOGNITION SYSTEM, VOICE RECOGNITION METHOD, AND PROGRAM FOR VOICE RECOGNITION - The present invention enables the recognition process at high speed even when a lot of garbage is included in the grammar. The first voice recognition processing unit generates a recognition hypothesis graph which indicates a structure of hypothesis that is derived according to a first grammar together with a score associated with respective connections of a recognition unit by executing a voice recognition process based on the first grammar to a voice feature amount of input voice, and the second voice recognition processing unit outputs the recognition result from a total score of a hypothesis which is derived according to a second grammar after executing a voice recognition process according to the second grammar that is specified to accept a section other than keywords in input voice as the garbage section to a voice feature amount of input voice, and the second voice recognition processing unit acquires the structure and the score of the garbage section from the recognition hypothesis graph.06-02-2011
20080255842Personalized Voice Activity Detection - A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of on more users to be used to identify the voices of the users, accepting an audio signal as it is created as a sequence of segments, analyzing each segment of the accepted audio signal to determine if it contains voice activity (10-16-2008
20080255841VOICE SEARCH DEVICE - A text data search using a voice is conventionally a full-text search using a word as an index word for a part recognized as a word in an input voice. Therefore, if any of the parts recognized as the words is falsely recognized, a search precision is lowered. In the present invention, referring to a language model generated by a language model generating part from text data to be subjected to a search which is divided by a learning data dividing part into a linguistic part and an acoustic model obtained by modeling voice features, a voice recognition part performs voice recognition for the input voice to output a phonemic representation. A matching unit converting part divides the phonemic representation into the same units as those of a text search dictionary, which is obtained by dividing the text data to be subjected to the search into the units smaller than those of the language model. A text search part uses the result of division to make a search on the text search dictionary.10-16-2008
20080255840Video Nametags - Video nametags allow automatic identification of people speaking in a video. A video nametag is associated with a person who is participating in a video, such as a video conference scenario or recorded meeting. The video nametag includes one or more sensors that detect when the person is speaking. The video nametag transmits information to a video conferencing system that provides an indicator on a display of the video that identifies the speaker. The system may also automatically format the display of the video to concentrate on the person when the person is speaking. The video nametag can also capture the wearer's audio and transmit it wirelessly to be used for the conference audio send signal.10-16-2008
20110022390SPEECH DEVICE, SPEECH CONTROL PROGRAM, AND SPEECH CONTROL METHOD - In order to speak numerals in a manner readily comprehensible to a user, a speech device includes a voice synthesis portion 01-27-2011
20120150540METHOD AND APPARATUS FOR DETECTING UNSOLICITED MULTIMEDIA COMMUNICATIONS - A service for searching for unsolicited communications is provided. For example, the service may inspect e-mail messages, instant messaging messages, facsimile transmissions, voice communications, and video telephony, and analyze these communications to determine whether an intended communication is unsolicited. In connection with voice and video telephony, a voice sample may be obtained from the caller and voice recognition may be performed on the sample to determine an identity of the person or the voice. The voice sample may also be used to determine the type of voice, i.e., whether the voice is live, machine generated, or prerecorded. Where the call is a video telephony call, image recognition may be used to inspect an image of the person. The information obtained from voice recognition, voice type recognition, and image recognition may be used to detect whether the message is from a known source of unsolicited communications.06-14-2012
20100324898VOICE RECOGNITION WITH DYNAMIC FILTER BANK ADJUSTMENT BASED ON SPEAKER CATEGORIZATION - Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency f12-23-2010
20100121641EXTERNAL VOICE IDENTIFICATION SYSTEM AND IDENTIFICATION PROCESS THEREOF - An external voice identification system and an identification process thereof are disclosed. The external voice identification system of a multimedia electronic device is activated by identifying and analyzing an input voice message. The multimedia electronic device can be an iPod player having a storage module that stores a plurality of voice files and a transmission interface for electrically connecting to a voice identification system. The voice identification system is electrically connected to the transmission interface and has a built-in identification module, and an identification unit can identify and analyze the voice signals. An adapting interface is connected to a voice input unit to receive the external voice signal, so that the identification module identifies and analyzes the external voice signals to activate the multimedia electronic device for playing the voice signal (songs), and to select, adjust and switch the playing content according to the inputted external voice signal.05-13-2010
20120310647PATTERN PROCESSING SYSTEM SPECIFIC TO A USER GROUP - Methods and apparatus for identifying a user group in connection with user group-based speech recognition. An exemplary method comprises receiving, from a user, a user group identifier that identifies a user group to which the user was previously assigned based on training data. The user group comprises a plurality of individuals including the user. The method further comprises using the user group identifier, identifying a pattern processing data set corresponding to the user group, and receiving speech input from the user to be recognized using the pattern processing data set.12-06-2012
20110071830COMBINED LIP READING AND VOICE RECOGNITION MULTIMODAL INTERFACE SYSTEM - The present invention provides a combined lip reading and voice recognition multimodal interface system, which can issue a navigation operation instruction only by voice and lip movements, thus allowing a driver to look ahead during a navigation operation and reducing vehicle accidents related to navigation operations during driving. The combined lip reading and voice recognition multimodal interface system in accordance with the present invention includes: an audio voice input unit; a voice recognition unit; a voice recognition instruction and estimated probability output unit; a lip video image input unit; a lip reading unit; a lip reading recognition instruction output unit; and a voice recognition and lip reading recognition result combining unit that outputs the voice recognition instruction03-24-2011
20100082342Method of Retaining a Media Stream without Its Private Audio Content - A method is disclosed that enables the handling of audio streams for segments in the audio that might contain private information, in a way that is more straightforward than in some techniques in the prior art. The data-processing system of the illustrative embodiment receives a media stream that comprises an audio stream, possibly in addition to other types of media such as video. The audio stream comprises audio content, some of which can be private in nature. Once it receives the data, the data-processing system then analyzes the audio stream for private audio content by using one or more techniques that involve looking for private information as well as non-private information. As a result of the analysis, the data-processing system omits the private audio content from the resulting stream that contains the processed audio.04-01-2010
20100017208INTEGRATED CIRCUIT FOR PROCESSING VOICE - An improved integrated circuit for processing voice (speech) is provided. This is a voice LSI. The voice LSI reduces a voice output level to 0V if a speech segment is silent. This voice LSI can reduce white noise.01-21-2010
20080270131METHOD, PREPROCESSOR, SPEECH RECOGNITION SYSTEM, AND PROGRAM PRODUCT FOR EXTRACTING TARGET SPEECH BY REMOVING NOISE - The present invention relates to a method, preprocessor, speech recognition system, and program product for extracting a target speech by removing noise. In an embodiment of the invention target speech is extracted from two input speeches, which are obtained through at least two speech input devices installed in different places in a space, applies a spectrum subtraction process by using a noise power spectrum (Uω) estimated by one or both of the two speech input devices (Xω(T)) and an arbitrary subtraction constant (α) to obtain a resultant subtracted power spectrum (Yω(T)). The invention further applies a gain control based on the two speech input devices to the resultant subtracted power spectrum to obtain a gain-controlled power spectrum (Dω(T)). The invention further applies a flooring process to said resultant gain-controlled power spectrum on the basis of arbitrary Flooring factor (β) to obtain a power spectrum for speech recognition (Zω(T)).10-30-2008
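The subtraction and flooring steps named in the entry above can be illustrated with a minimal sketch. It assumes the common textbook form of spectral subtraction (subtract α times the noise estimate per frequency bin, then floor the result at β times the noise estimate); the abstract does not give the exact formulas, the two-microphone gain control step is omitted, and the function name and default values are hypothetical.

```python
def spectral_subtract(x_power, noise_power, alpha=1.0, beta=0.01):
    """Per-bin spectral subtraction followed by flooring.

    x_power:     observed power spectrum, one value per frequency bin
    noise_power: estimated noise power spectrum for the same bins
    alpha:       subtraction constant (assumed default)
    beta:        flooring factor (assumed default)
    """
    if len(x_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    result = []
    for x, u in zip(x_power, noise_power):
        y = x - alpha * u      # subtracted power spectrum
        z = max(y, beta * u)   # flooring keeps the bin from going negative
        result.append(z)
    return result


# The second bin is dominated by noise, so it is clamped to beta * noise.
print(spectral_subtract([4.0, 1.0], [1.0, 2.0], alpha=1.0, beta=0.1))
```

Flooring matters because a power spectrum cannot be negative, and bins where the noise estimate exceeds the observation would otherwise produce invalid values for the recognizer.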
20110307256SYSTEMS AND METHODS FOR PROVIDING NETWORK-BASED VOICE AUTHENTICATION - A system enables voice authentication via a network. The system may include an intelligent voice response engine operatively coupled to the network for receiving transaction or access requests from a plurality of telecommunications devices over the network. A speech recognition and verification services engine may be operatively coupled to the network and a database may be operatively coupled to the speech recognition and verification services engine for storing user voice print profiles. The speech recognition and verification services engine may receive a speaker verification call from the intelligent voice response engine and perform speaker verification on the received speaker verification call based on the stored user voice print profiles. The speech recognition and verification services engine may generate a verification score based upon results of the speaker verification.12-15-2011
20120065972WIRELESS VOICE RECOGNITION CONTROL SYSTEM FOR CONTROLLING A WELDER POWER SUPPLY BY VOICE COMMANDS - A wireless voice recognition control system for controlling the operation of an electric welder power supply by operator voice commands is disclosed. The system includes a remote module carried by the welder and a host module interfaced with the electric welder power supply. The remote module compares voice commands by the welder to preprogrammed voice command templates and operates to generate and broadcast a wireless signal when a spoken voice command matches a voice command template. The host module operates to receive the wireless signal and is configured to operate the electric welder power supply accordingly. In other embodiments, the host module and remote module operate to provide an audible feedback or acknowledgement to the welder. Other embodiments are also disclosed.03-15-2012
20120065973METHOD AND APPARATUS FOR PERFORMING MICROPHONE BEAMFORMING - A method and apparatus for performing microphone beamforming. The method includes recognizing a speech of a speaker, searching for a previously stored image associated with the speaker, searching for the speaker through a camera based on the image, recognizing a position of the speaker, and performing microphone beamforming according to the position of the speaker.03-15-2012
20120004914AUDIO HUMAN VERIFICATION - A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge.01-05-2012
20120004913METHOD AND APPARATUS FOR CONTROLLING OPERATION OF PORTABLE TERMINAL USING MICROPHONE - A method for controlling an operation of a portable terminal using a microphone includes detecting an operation mode of the portable terminal and driving an audio recognition mode according to the detected operation mode to activate the microphone, converting a signal, inputted through the microphone, into digital data and detecting audio characteristics from the digital data to extract audio analysis data for recognition of a type of the input signal, and determining whether there is UI setting information corresponding to the extracted audio analysis data type and performing a relevant function of the UI setting information.01-05-2012
20120209608MOBILE COMMUNICATION TERMINAL APPARATUS AND METHOD FOR EXECUTING APPLICATION THROUGH VOICE RECOGNITION - A mobile communication terminal apparatus and method are capable of recognizing an input voice of a user and executing an application related to the recognized voice. The apparatus includes a voice input unit to receive a first input voice; a voice recognition unit to acquire first voice instruction information based on the first input voice; a voice control table acquiring unit to acquire a first voice control table comprising the first voice instruction information and first icon position information; and an application execution unit to execute a first application based on the first icon position information included in the first voice control table. The method for registering voice instruction information includes acquiring voice instruction information for a selected application; acquiring execution information of the selected application; generating a voice control table comprising the execution information, and the voice instruction information; and storing the voice control table.08-16-2012
20100017209RANDOM VOICEPRINT CERTIFICATION SYSTEM, RANDOM VOICEPRINT CIPHER LOCK AND CREATING METHOD THEREFOR - The present invention provides a random voiceprint certification system comprising a training system, a random cipher generator, and a testing system, which is employed to perform training or testing operations on the input raw voice data. In training voice, the training system obtains appointment voiceprint feature model parameter groups from the input raw voice data. From the appointment voiceprint feature model parameter groups, several voiceprint characteristic units are obtained and at least one reference voiceprint password, which the testing system uses to carry out the voice testing operation, is built. In processing testing voice, the random cipher generator randomly generates at least one reference voiceprint password from the voiceprint characteristic units of the appointment voiceprint feature model parameter groups to build the random voiceprint cipher lock. The present invention randomly generates one or several reference voiceprint passwords. The random voiceprint certification system is built completely to form the random voiceprint cipher lock. Illegal intrusion is thereby made difficult.01-21-2010
20120010886Language Identification - A language identification system suitable for use with voice data transmitted through either a telephonic or computer network system is presented. Embodiments that automatically select the language to be used based upon the content of the audio data stream are presented. In one embodiment the content of the data stream is supplemented with the context of the audio stream. In another embodiment the language determination is supplemented with preferences set in the communication devices and in yet another embodiment, global position data for each user of the system is used to supplement the automated language determination.01-12-2012
20110166859VOICE RECOGNITION DEVICE - A voice recognition unit is constructed in such a way as to create a voice label string for an inputted voice uttered by a user inputted for each language on the basis of a feature vector time series of the inputted voice uttered by the user and data about a sound standard model, and register the voice label string into a voice label memory 07-07-2011
20100235169SPEECH DIFFERENTIATION - Method for differentiation between voices including 1) analyzing perceptually relevant signal properties of the voices, e.g. average pitch and pitch variance, 2) determining sets of parameters representing the signal properties of the voices, and finally 3) extracting voice modification parameters representing modified signal properties of at least some of the voices. Hereby it is possible to increase a mutual parameter distance between the voices, and thereby the perceptual difference between the voices, when the voices have been modified according to the voice modification parameters. Preferably most of or all voices are modified in order to limit the amount of modification of one parameter. Preferred signal property measures are: pitch, pitch variance over time, glottal pulse shape, formant frequencies, signal amplitude, energy differences between voiced and un-voiced speech segments, characteristics related to overall spectrum contour of speech, characteristics related to dynamic variation of one or more measures in long speech segment. The method allows an automatic voice differentiation with a natural sound since it is based on a modification of signal properties determined for each of the voices.09-16-2010
20120022870GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.01-26-2012
20110093267AGE DETERMINATION USING SPEECH - A device may be configured to provide a query to a user. Voice data may be received from the user responsive to the query. Voice recognition may be performed on the voice data to identify a query answer. A confidence score associated with the query answer may be calculated, wherein the confidence score represents the likelihood that the query answer has been accurately identified. A likely age range associated with the user may be determined based on the confidence score. The device to calculate the confidence score may be tuned to increase a likelihood of recognition of voice data for a particular age range of callers.04-21-2011
20110093266VOICE PATTERN TAGGED CONTACTS - A method and system for associating a voice pattern with a contact record and/or for identifying a speaker using a mobile device. A mobile device may include a voice identification application for extracting a voice pattern from audio data and associating the voice pattern with a contact record that includes identification information such as, for example, a name of a person. The device may also be used to identify a speaker. The device captures audio data of a speaker; the voice identification application extracts a voice pattern from the audio data and compares the voice pattern to voice patterns associated with contact records stored in a contact directory. The voice identification application identifies a contact record having a voice pattern matching the voice pattern from the audio data and drives the device to display identification information from the contact record having a matching voice pattern.04-21-2011
20120065974JOINT FACTOR ANALYSIS SCORING FOR SPEECH PROCESSING SYSTEMS - Method, system, and computer program product are provided for Joint Factor Analysis (JFA) scoring in speech processing systems. The method includes: carrying out an enrolment session offline to enrol a speaker model in a speech processing system using JFA, including: extracting speaker factors from the enrolment session; estimating first components of channel factors from the enrolment session. The method further includes: carrying out a test session including: calculating second components of channel factors strongly dependent on the test session; and generating a score based on speaker factors, channel factors, and test session Gaussian mixture model sufficient statistics to provide a log-likelihood ratio for a test session.03-15-2012
20120072218SYSTEM AND METHOD FOR TRACKING PERSONS OF INTEREST VIA VOICEPRINT - Disclosed are systems, methods, and computer readable media for tracking a person of interest. The method embodiment comprises identifying a person of interest, capturing a voiceprint of the person of interest, comparing a received voiceprint of a caller with the voiceprint of the person of interest, and tracking the caller if the voiceprint of the caller is a substantial match to the voiceprint of the person of interest.03-22-2012
20110106536SYSTEMS AND METHODS FOR SIMULATING DIALOG BETWEEN A USER AND MEDIA EQUIPMENT DEVICE - Systems and methods for simulating dialog between a user and a media equipment device are provided. Videos of a user selected actor may be retrieved. An opener video of the selected actor may be displayed and based on a verbal response received from the user, a clip of a media asset associated with the selected actor may be retrieved. User reactions to the displayed clip may be monitored and subsequent videos of the actor and clips may be provided based on the user reactions. Clips of a media asset that matches preferences of the user may be retrieved. A clip associated with a mid level rank may be displayed. When the user reacts positively to the clip, a clip associated with a low class level rank may be retrieved next; otherwise, a high class level rank clip may be retrieved next.05-05-2011
20120084087METHOD, DEVICE, AND SYSTEM FOR SPEAKER RECOGNITION - A method, device, and system for speaker recognition are provided. The method includes: receiving a Speaker Verification instruction sent from a Media Gateway Controller (MGC) (04-05-2012
20120221334SECURITY SYSTEM AND METHOD - A security system and method includes setting operation steps having a preset sequence, a trigger signal and a testing parameter for each of the operation steps, and a range of each of the testing parameters. The method further confirms a current operation step when a testing device starts a test process. If an output signal received from a sensing device is not identical to the trigger signal of the current operation step, a voice content file of the current operation step is sent to the voice output device. If a value of the output parameter read from the sensing device is not within the range of the testing parameter, a voice prompt file of the testing parameter is sent to the voice output device. After sending the voice content file or the voice prompt file, an abnormality processing command of the current operation step is sent to the testing device to stop the test process.08-30-2012
20120215536Methods and Voice Activity Detectors for Speech Encoders - Voice activity detectors and related methods are provided. Methods include receiving a frame of the input signal; determining a first SNR of the received frame; comparing the determined first SNR with an adaptive threshold; and detecting whether the received frame comprises voice based on the comparison. The adaptive threshold is at least based on total noise energy of a noise level, an estimate of a second SNR and on energy variation between different frames.08-23-2012
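The frame-by-frame decision described in this abstract (compute a frame SNR, compare it with an adaptive threshold) can be sketched as follows. The abstract only names the three inputs to the threshold; the specific formula, the weights, the smoothing constant, and the class name `AdaptiveVad` below are assumptions made for illustration, not the detector described in the application.

```python
import math

EPS = 1e-12  # guards against log10(0)


def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def ratio_db(numerator, denominator):
    return 10.0 * math.log10((numerator + EPS) / (denominator + EPS))


class AdaptiveVad:
    def __init__(self, base_db=3.0, noise_energy=1e-4, smoothing=0.9):
        self.base_db = base_db
        self.noise_energy = noise_energy      # running noise-energy estimate
        self.long_term_snr_db = 0.0           # the "second SNR" estimate
        self.prev_energy = noise_energy       # for frame-to-frame variation
        self.smoothing = smoothing

    def threshold_db(self, energy):
        # Hypothetical formula: the threshold rises with the noise floor,
        # with the long-term SNR, and with energy variation between frames.
        noise_db = 10.0 * math.log10(self.noise_energy + EPS)
        variation_db = abs(ratio_db(energy, self.prev_energy))
        return (self.base_db
                + 0.02 * max(noise_db + 60.0, 0.0)
                + 0.10 * max(self.long_term_snr_db, 0.0)
                + 0.05 * variation_db)

    def process(self, frame):
        """Return True if the frame is judged to contain voice."""
        energy = frame_energy(frame)
        snr_db = ratio_db(energy, self.noise_energy)   # the "first SNR"
        is_voice = snr_db > self.threshold_db(energy)
        s = self.smoothing
        if not is_voice:
            # Only non-voice frames update the noise estimate.
            self.noise_energy = s * self.noise_energy + (1 - s) * energy
        self.long_term_snr_db = s * self.long_term_snr_db + (1 - s) * snr_db
        self.prev_energy = energy
        return is_voice


vad = AdaptiveVad()
quiet_frame = [0.01] * 160
loud_frame = [0.5] * 160
decisions = [vad.process(f) for f in (quiet_frame, quiet_frame, loud_frame)]
```

Updating the noise estimate only on non-voice frames is a common design choice that keeps speech from inflating the noise floor; it is part of this sketch, not something the abstract states.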
20100204992METHOD FOR INDENTIFYING AN ACOUSIC EVENT IN AN AUDIO SIGNAL - A process for recognizing an acoustic event in an audio signal has two stages. The first stage involves possible candidates being selected, and the second stage involves each of the possible candidates being allocated a confidence value.08-12-2010
20100204991Ultrasonic Doppler Sensor for Speaker Recognition - A method and system recognizes an unknown speaker by directing an ultrasonic signal at a face of the unknown speaker. A Doppler signal of the ultrasonic signal is acquired after reflection by the face, and Doppler features are extracted from the reflected Doppler signal. The Doppler features are classified using Doppler models storing the Doppler features and identities of known speakers to recognize and identify the unknown speaker.08-12-2010
20120173238Remote Control Audio Link - One embodiment may take the form of a voice control system. The system may include a first apparatus with a processing unit configured to execute a voice recognition module and one or more executable commands, and a receiver coupled to the processing unit and configured to receive a first audio file from a remote control device. The first audio file may include at least one voice command. The first apparatus may further include a communication component coupled to the processing unit and configured to receive programming content, and one or more storage media storing the voice recognition module. The voice recognition module may be configured to convert voice commands into text.07-05-2012
20120316876Display Device, Method for Thereof and Voice Recognition System - A display system, a display device, a control method for the display device, and a voice recognition system are disclosed. A display device according to one embodiment of the present invention can carry out voice recognition upon a voice received from at least one speaker through at least one voice input device; and display the voice recognition result on the display unit. Accordingly, effective voice recognition is made possible for TV environments where various constraints exist differently from mobile terminal environments.12-13-2012
20120253808Voice Recognition Device and Voice Recognition Method - According to an embodiment, a voice recognition device includes a voice inputting unit, a voice recognition processing unit, a vibration movement pattern model holding unit, and a vibration movement unit. The voice recognition processing unit performs voice recognition processing using a digital signal output from the voice inputting unit to output a voice recognition result and outputs voice reliability of the received voice signal. The vibration movement pattern model holding unit stores models prepared according to a number of patterns of the voice reliability output from the voice recognition processing unit and holds vibration movements corresponding to the models. The vibration movement unit detects whether or not the voice reliability output from the voice recognition processing unit matches any one of the models in the vibration movement pattern model holding unit and performs vibration movement predetermined for a matched model.10-04-2012
20100268537Speaker verification system - A text-independent speaker verification system utilizes mel frequency cepstral coefficients analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, an adaptive threshold and an adaptive decision verdict and is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices than used by comparable systems of the prior art.10-21-2010
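The pattern-matching step named here (template modeling with vector quantization, followed by a threshold decision) can be illustrated with a small sketch. It assumes the feature vectors, such as mel frequency cepstral coefficients, have already been extracted, and it takes the threshold as a plain parameter rather than adapting it; the function names `vq_distortion` and `verify` are hypothetical.

```python
import math


def vq_distortion(features, codebook):
    """Average distance from each feature vector to its nearest codeword."""
    total = 0.0
    for vector in features:
        total += min(math.dist(vector, codeword) for codeword in codebook)
    return total / len(features)


def verify(features, codebook, threshold):
    """Accept the claimed speaker when the distortion is low enough."""
    return vq_distortion(features, codebook) <= threshold


# Toy two-dimensional "features"; real systems use longer vectors.
enrolled_codebook = [(0.0, 0.0), (1.0, 1.0)]
genuine = [(0.1, 0.0), (0.9, 1.0)]
impostor = [(3.0, 3.0), (4.0, 4.0)]

accepted = verify(genuine, enrolled_codebook, threshold=0.2)
rejected = not verify(impostor, enrolled_codebook, threshold=0.2)
```

A codebook compresses the enrolled speaker's feature space into a few representative vectors, which is one reason this approach fits devices with small data storage.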
20090018831Speech Recognition Apparatus and Speech Recognition Method - Voices are prevented from being recognized with poor accuracy when a speaker is not close to a sound pickup device. A speech recognition apparatus (01-15-2009
20120232903KITCHEN AND/OR DOMESTIC APPLIANCE - The invention relates to a kitchen and/or domestic appliance comprising input means, which are connected to a voice-recognition system, for acoustic operator commands. The invention is characterised in that means for executing command-dependent actions are provided and that the voice-recognition system is used to identify and check the authorisation of a user.09-13-2012
20080312925System and Method for Implementing Voice Print-Based Priority Call Routing - A system, method, and computer-usable medium for routing a call. A server receives a call from a client. A routing engine captures a voice print from the call. In response to the routing engine capturing the voice print from the call, the routing engine compares the voice print to a database that includes a collection of voice prints. In response to the routing engine matching the voice print to at least one voice print among the collection of voice prints, an interactive voice response (IVR) module routes the call to an appropriate call queue based on the matching of the voice print. The appropriate queue routes the call from the appropriate call queue to a call center corresponding to the appropriate call queue.12-18-2008
20080300877SYSTEM AND METHOD FOR TRACKING FRAUDULENT ELECTRONIC TRANSACTIONS USING VOICEPRINTS - Disclosed are systems, methods, and computer readable media for comparing customer voice prints with a database of known fraudulent voice signatures and continually updating the database to decrease the risk of identity theft. The method embodiment comprises comparing a received voice signal against a database of known fraudulent voice signatures, denying the caller's transaction if the voice signal substantially matches the database of known fraudulent voice signatures, adding the caller's voice signal to the database of known fraudulent voice signatures if the voice signal does not substantially match a separate speaker verification database and received additional information is not verified.12-04-2008
20110004474Audience Measurement System Utilizing Voice Recognition Technology - A method, a system, and a computer program product for determining a total count of audience members within a sensory receiving environment during the presentation of a program. A voice recognition unit is enabled when a signal for a program/subject/event, such as a broadcast program, is received. The voice recognition unit receives one or more sounds in the sensory receiving environment and analyzes the characteristics of the sounds. When one or more unique human voices are identified during the program, a count of the number of unique human voices is determined. The count of unique human voices is transmitted to a server, whereby the count of unique human voices is equal to a count of audience members. The total count of audience members is calculated for all sensory receiving environments associated with the program. An audience analysis graphical user interface is generated to display the total count of audience members.01-06-2011
20110131044TARGET VOICE EXTRACTION METHOD, APPARATUS AND PROGRAM PRODUCT - An apparatus, program product and method are provided for separating a target voice from a plurality of other voices having different directions of arrival. The method comprises the steps of disposing a first and a second voice input device at a predetermined distance from one another and upon receipt of voice signals at said devices calculating discrete Fourier transforms for the signals and calculating a CSP (cross-power spectrum phase) coefficient by superpositioning multiple frequency-bin components based on correlation of the two spectra signals received and then calculating a weighted CSP coefficient from said two discrete Fourier-transformed speech signals. A target voice is separated when received by said devices from other voice signals in a spectrum by using the calculated weighted CSP coefficient.06-02-2011
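A cross-power spectrum phase (CSP) coefficient of the kind mentioned in this abstract can be sketched with a naive discrete Fourier transform. This is a generic textbook-style illustration, not the application's method: the per-bin `weights` argument stands in for whatever weighting the application uses, and the helper names `dft`, `weighted_csp`, and `estimate_delay` are assumptions.

```python
import cmath
import random


def dft(signal):
    """Naive O(n^2) discrete Fourier transform; adequate for short frames."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def weighted_csp(x1, x2, weights=None):
    """CSP coefficient per lag; weights (hypothetical) scale each frequency bin."""
    n = len(x1)
    if len(x2) != n:
        raise ValueError("both frames must have the same length")
    if weights is None:
        weights = [1.0] * n
    spec1, spec2 = dft(x1), dft(x2)
    phase = []
    for a, b, w in zip(spec1, spec2, weights):
        cross = a * b.conjugate()
        magnitude = abs(cross)
        # Keep only the phase of the cross spectrum (unit magnitude).
        phase.append(w * cross / magnitude if magnitude > 1e-12 else 0j)
    # Inverse DFT of the phase-only cross spectrum, one value per lag.
    return [
        sum(phase[k] * cmath.exp(2j * cmath.pi * k * lag / n) for k in range(n)).real / n
        for lag in range(n)
    ]


def estimate_delay(csp):
    """Lag (in samples) of the CSP peak; lags above n/2 wrap to negative."""
    n = len(csp)
    lag = max(range(n), key=lambda i: csp[i])
    return lag if lag <= n // 2 else lag - n


rng = random.Random(0)
mic2 = [rng.uniform(-1.0, 1.0) for _ in range(32)]
mic1 = [mic2[(t - 3) % 32] for t in range(32)]  # mic1 hears the sound 3 samples later
delay = estimate_delay(weighted_csp(mic1, mic2))
```

Because the two input devices are a fixed distance apart, the delay at the CSP peak corresponds to a direction of arrival, which is what allows a target voice to be told apart from voices arriving from other directions.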
20110035220AUTOMATED COMMUNICATION INTEGRATOR - An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command.02-10-2011
20090326943Guidance information display device, guidance information display method and recording medium - A guidance information display device includes: a voice input unit; a display unit for displaying guidance information; an operation unit for accepting an operation; and a processor capable of executing the following processes of: a voice recognition process operation of performing voice recognition based on inputted voice; a calculation operation of calculating an evaluation value for a recognition result of voice recognition by the voice recognition process operation; a display operation of reading out guidance information corresponding to the recognition result from a storage unit, which stores the guidance information, and displaying the guidance information at a display unit; and a decision operation of deciding a display mode of the guidance information at the display unit based on a variable value, which varies with an operation from the operation unit for the guidance information displayed by the display operation, and the evaluation value calculated by the calculation operation.12-31-2009
20120323575SPEAKER ASSOCIATION WITH A VISUAL REPRESENTATION OF SPOKEN CONTENT - Speaker content generated in an audio conference is visually represented in accordance with a method. Speaker content from a plurality of audio conference participants is monitored using a computer with a tangible non-transitory processor and memory. The speaker content from each of the plurality of audio conference participants is monitored. A visual representation of speaker content for each of the plurality of audio conference participants is generated based on the analysis of the speaker content from each of the plurality of audio conference participants. The visual representation of speaker content is displayed.12-20-2012
20120323574SPEECH TO TEXT MEDICAL FORMS - Event audio data that is based on verbal utterances associated with a medical event associated with a patient is received. A list of a plurality of candidate text strings that match interpretations of the event audio data is obtained, based on information included in a medical speech repository, information included in a speech accent repository, and a matching function. A selection of at least one of the candidate text strings included in the list is obtained. A population of at least one field of an electronic medical form is initiated, based on the obtained selection.12-20-2012
20120271632Speaker Identification - Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters that space linearly at higher frequencies and logarithmically at lower frequencies, respectively, features that model the speaker's vocal tract transfer function, and features that indicate a vibration rate of vocal folds of the speaker of the sample data.10-25-2012
20110213615VOICE AUTHENTICATION SYSTEM AND METHODS - A method for configuring a voice authentication system comprises ascertaining a measure of confidence associated with a voice sample enrolled with the authentication system. The measure of confidence is derived through simulated impostor testing carried out on the enrolled sample.09-01-2011
20110276331VOICE RECOGNITION SYSTEM - A voice recognition system includes: a voice input unit 11-10-2011
20110276330Methods and Devices for Appending an Address List and Determining a Communication Profile - Disclosed are methods and electronic communication devices, such as an in-car speaker device, that can receive via a downloading process, a communication address list from another device to the memory of the electronic communication device and can append a predetermined communication address to the communication address list. The predetermined communication address, which can be to an automated voice recognition based service, can be annunciated first. Also disclosed are methods and electronic communication devices for determining that a communication is with an automated voice recognition based service and then switching from a first call profile to a second call profile. Such a second profile can include different features such as a change of the frequency response of the audio signal of the electronic communication device, and/or reduction or elimination of the echo control, and/or a change in the noise control of the digital signal process.11-10-2011
20120095764METHODS FOR CREATING AND SEARCHING A DATABASE OF SPEAKERS - A method of performing a search of a database of speakers, includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.04-19-2012
20120095763DIGITAL METHOD AND ARRANGEMENT FOR AUTHENTICATING A PERSON - Digital method for authentication of a person by comparing a current voice profile with a previously stored initial voice profile, wherein to determine the relevant voice profile the person speaks at least one speech sample into the system, this speech sample is conveyed to a voice-profile calculation unit and thereby, on the basis of a prespecified voice-profile algorithm, the voice profile is calculated, such that the overall size of the speech sample and/or parameters of its evaluation to determine the relevant voice profile are established dynamically and automatically as the sample is spoken, in response to the result of an evaluation of a first partial speech sample.04-19-2012
20110313765Conversational Subjective Quality Test Tool - A method for assessing quality of conversational speech between nodes of a communication network (12-22-2011
20120330663IDENTITY AUTHENTICATION SYSTEM AND METHOD - An identity authentication method is applied on a system. The system is connected to an external storage device storing a first voice model. The system includes an information server and a terminal. The information server includes a database. The information server executes the following steps. First, receiving the first voice model transmitted by the terminal. Second, determining whether the first voice model matches one second voice model, and transmitting the verification result to the terminal. The terminal executes the following steps. First, generating a prompt to prompt the user to input voice signals. Second, receiving the input voice signals. Third, extracting voice features from the input voice signals. Fourth, determining whether the extracted voice features match the first voice model. Fifth, determining that the verification result is successful when they match, and determining that the identity authentication is successful only when both verification results are successful. A related system is also provided.12-27-2012
20110320200SPEAKER RECOGNITION IN A MULTI-SPEAKER ENVIRONMENT AND COMPARISON OF SEVERAL VOICE PRINTS TO MANY - One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer.12-29-2011
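The one-to-many comparison step can be sketched as scoring one customer voice print against every known voice print and keeping those above a threshold. The sketch assumes voice prints are fixed-length numeric vectors and uses cosine similarity as the score; both choices, along with the names `cosine_similarity` and `find_matches` and the threshold value, are illustrative assumptions rather than details from the application.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_matches(customer_print, known_prints, threshold=0.95):
    """Compare one voice print against many; return (name, score) pairs
    at or above the threshold, best match first."""
    scored = [
        (name, cosine_similarity(customer_print, reference))
        for name, reference in known_prints.items()
    ]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


known = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [0.9, 0.1, 0.0],
}
customer = [1.0, 0.05, 0.0]
matches = find_matches(customer, known)
```

A match here only means the two voice prints are likely from the same person; a downstream system would decide what to do with that, for example whether to authorize a transaction.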
20120101822BIOMETRIC SPEAKER IDENTIFICATION - A biometric speaker-identification apparatus is disclosed that generates ordered speaker-identity candidates for a probe based on prototypes. Probe match scores are clustered, and templates that correspond to clusters having top M probe match scores are compared with the prototypes to obtain template-prototype match scores. The probe is also compared with the prototypes, and those templates corresponding to template-prototype match scores that are nearest to probe-prototype match scores are selected as speaker-identity candidates. The speaker-identity candidates are ordered based on their similarity to the probe.04-26-2012
20130013309System and Method for Low Overhead Voice Authentication - A system and method are provided to authenticate a voice in a frequency domain. A voice in the time domain is transformed to a signal in the frequency domain. The first harmonic is set to a predetermined frequency and the other harmonic components are equalized. Similarly, the amplitude of the first harmonic is set to a predetermined amplitude, and the harmonic components are also equalized. The voice signal is then filtered. The amplitudes of each of the harmonic components are then digitized into bits to form at least part of a voice ID. In another system and method, a voice is authenticated in a time domain. The initial rise time, initial fall time, second rise time, second fall time and final oscillation time are digitized into bits to form at least part of a voice ID. The voice IDs are used to authenticate a user's voice.01-10-2013
20130024197ELECTRONIC DEVICE AND METHOD FOR CONTROLLING THE SAME - An electronic device and a method for controlling an electronic device are disclosed. The electronic device includes: a display unit; a voice input unit; and a controller displaying a plurality of contents on the display unit, receiving a voice command for controlling any one of the plurality of contents through the voice input unit, and controlling content corresponding to the received voice command. Multitasking performed by the electronic device can be effectively controlled through a voice command.01-24-2013
20130024196SYSTEMS AND METHODS FOR USING A MOBILE DEVICE TO DELIVER SPEECH WITH SPEAKER IDENTIFICATION - Systems, methods, and apparatus for using at least one mobile device to receive a representation of at least one audio signal. In some embodiments, the at least one audio signal comprises speech of at least one of a plurality of first participants of a meeting, the plurality of first participants participating in the meeting from a first location, and the at least one audio signal may be audibly rendered to at least one second participant of the meeting at a second location different from the first location. In some embodiments, the at least one mobile device may further receive an indication of an identity of a leading speaker of the speech in the at least one audio signal, the leading speaker being identified from among the plurality of first participants, and may render the identity of the leading speaker to the at least one second participant.01-24-2013
20080235017VOICE INTERACTION DEVICE, VOICE INTERACTION METHOD, AND VOICE INTERACTION PROGRAM - The present invention provides a voice interaction device capable of performing an interaction meeting any demand from a user at proper time in flexible response to a circumferential condition of the user, a voice interaction method and a voice interaction program thereof. The voice interaction device controls the interaction with the user in response to an input voice from the user, including an available time calculation unit (09-25-2008
20080221888SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR ADDING VOICE ACTIVATION AND VOICE CONTROL TO A MEDIA PLAYER - A media player system, method and computer program product are provided. In use, an utterance is received. A command for a media player is then generated based on the utterance. Such command is utilized for providing wireless control of the media player.09-11-2008
20080221887SYSTEMS AND METHODS FOR DYNAMIC RE-CONFIGURABLE SPEECH RECOGNITION - Speech recognition models are dynamically re-configurable based on user information, background information such as background noise and transducer information such as transducer response characteristics to provide users with alternate input modes to keyboard text entry. The techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants as well as environments such as office, home or vehicle while maintaining the accuracy of the speech recognition.09-11-2008
20080221886METHOD AND SYSTEM FOR THE ENTRY OF FLIGHT DATA FOR AN AIRCRAFT, TRANSMITTED BETWEEN A CREW ON BOARD THE AIRCRAFT AND GROUND STAFF - A system of assistance in the entry of flight data for an aircraft transmitted between a crew on board the aircraft and a ground staff including, a radiofrequency communications link to transmit flight data between the crew and the ground staff. At least one means of sending and one means of receiving data on board the aircraft, wherein the system includes a voice recognition means capable of detecting a piece of data of a predefined type emitted, during the communications call, by the crew or the ground staff and a means of analysis and transcription of this piece of data in digital or alphanumeric form.09-11-2008
20080221885Speech Control Apparatus and Method - A speech control apparatus and a method thereof are provided. The speech control apparatus logs the user in an application software according to a speech signal of a user. The speech control apparatus is connected to a password bank comprising a plurality of accounts and passwords. The speech control apparatus comprises a speech process module, a start module, a first receiving module, an identity recognition module, a selection module, and a login module. The speech process module determines a meaning of the speech signal. The start module starts the application software according to the meaning of the speech signal. The first receiving module receives the biometrics feature of the user. The identity recognition module identifies the user as authorized according to the biometrics feature. The selection module selects a login set of account and password from the password bank according to the speech signal and the biometrics feature. The login module logs the user into the application software according to the login set of account and password.09-11-2008
20130185072Communication System and Method Between an On-Vehicle Voice Recognition System and an Off-Vehicle Voice Recognition System - A vehicle based system and method for receiving voice inputs and determining whether to perform a voice recognition analysis using in-vehicle resources or resources external to the vehicle.07-18-2013
20130179167METHODS AND APPARATUS FOR FORMANT-BASED VOICE SYNTHESIS - In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method.07-11-2013
20120253810COMPUTER PROGRAM, METHOD, AND SYSTEM FOR VOICE AUTHENTICATION OF A USER TO ACCESS A SECURE RESOURCE - Authenticating a purported user attempting to access a secure resource includes enrolling a user's voice sample by requiring the user to orally speak preselected enrollment utterances, generating prompts and respective predetermined correct responses where each question has only one correct response, presenting a prompt to the user in real time, and analyzing the user's real time live response to determine if the live response matches the predetermined correct response and if voice characteristics of the user's live voice sample match characteristics of the enrolled voice sample.10-04-2012
20120253809Voice Verification System - A voice verification module 10-04-2012
20080215323Method and System for Grouping Voice Messages - A method for grouping voice messages includes extracting a voice signature from a voice message and tagging the voice message with an identification associated with the voice signature. The method also includes grouping the voice message based on the identification.09-04-2008
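The three steps in this abstract (extract a signature, tag the message with an identification, group by that identification) can be sketched as below. Signature extraction is assumed to have happened already, so each message arrives with a numeric signature; the distance rule, the `speaker-N` labels, and the function name `group_messages` are hypothetical.

```python
import math
from collections import defaultdict


def group_messages(messages, max_distance=1.0):
    """messages: list of (message_id, signature) pairs.
    Returns a dict mapping an identification to the message ids tagged with it."""
    speakers = []               # (identification, reference signature)
    groups = defaultdict(list)
    for message_id, signature in messages:
        identification = None
        # Reuse an existing identification if the signature is close enough.
        for known_id, reference in speakers:
            if math.dist(signature, reference) <= max_distance:
                identification = known_id
                break
        # Otherwise create a new identification for this voice signature.
        if identification is None:
            identification = f"speaker-{len(speakers) + 1}"
            speakers.append((identification, signature))
        groups[identification].append(message_id)
    return dict(groups)


inbox = [
    ("m1", (1.0, 2.0)),
    ("m2", (5.0, 5.0)),
    ("m3", (1.2, 2.1)),
    ("m4", (5.1, 4.8)),
]
grouped = group_messages(inbox)
```

Messages from an unseen voice simply start a new group, so no enrollment step is needed in this sketch.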
20130096917METHODS AND DEVICES FOR FACILITATING COMMUNICATIONS - Methods and electronic devices for facilitating communications are described. In one aspect, a method for facilitating communications is described. The method includes: monitoring audio based communications; performing an audio analysis on the monitored audio based communications to identify a contact associated with the monitored communications; and providing information associated with the identified contact on an electronic device. In another aspect, an electronic device is described. The electronic device includes a processor and a memory coupled to the processor. The memory stores processor readable instructions for causing the processor to: monitor audio based communications; perform an audio analysis on the monitored audio based communications to identify a contact associated with the monitored communications; and provide information associated with the identified contact on an electronic device.04-18-2013
20110313766 IDENTIFICATION OF PEOPLE USING MULTIPLE TYPES OF INPUT - Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers. 12-22-2011
20100286983 OPERATION CONTROL APPARATUS AND METHOD IN MULTI-VOICE RECOGNITION SYSTEM - An operation control apparatus and method of controlling a plurality of operationally connected voice recognition-enabled systems, each having reciprocal control operational states corresponding to an enabled/disabled state. 11-11-2010
20130204620 ESTABLISHING A MULTIMODAL PERSONALITY FOR A MULTIMODAL APPLICATION IN DEPENDENCE UPON ATTRIBUTES OF USER INTERACTION - Establishing a multimodal personality for a multimodal application, including evaluating, by the multimodal application, attributes of a user's interaction with the multimodal application; selecting, by the multimodal application, a vocal demeanor in dependence upon the values of the attributes of the user's interaction with the multimodal application; and incorporating, by the multimodal application, the vocal demeanor into the multimodal application. 08-08-2013
20120284027 METHOD AND SYSTEM FOR SHARING PORTABLE VOICE PROFILES - An embodiment of the present invention provides a speech recognition engine that utilizes portable voice profiles for converting recorded speech to text. Each portable voice profile includes speaker-dependent data, and is configured to be accessible to a plurality of speech recognition engines through a common interface. A voice profile manager receives the portable voice profiles from other users who have agreed to share their voice profiles. The speech recognition engine includes speaker identification logic to dynamically select a particular portable voice profile, in real-time, from a group of portable voice profiles. The speaker-dependent data included with the portable voice profile enhances the accuracy with which speech recognition engines recognize spoken words in recorded speech from a speaker associated with a portable voice profile. 11-08-2012
20120284026 SPEAKER VERIFICATION SYSTEM - In an aspect, in general, a method for computer assisted speaker authentication in a voice communication session includes establishing a voice communication session between a first speaker and an agent, accepting a first voice signal from the first speaker, determining a voice characteristic measure of the first voice signal, including characterizing a similarity of the first voice signal to each of one or more stored characterizations of voice signals previously acquired from one or more known speakers, and providing an interface to the agent during the voice communication session between the agent and the first speaker, including presenting an indicator based on the determined voice characteristic measure to the agent. 11-08-2012
20110288866 VOICE PRINT IDENTIFICATION - Voice print identification may be provided. A plurality of speakers may be recorded and associated with identity indicators. Voice prints for each speaker may be created. If the voice print for at least one speaker corresponds to a known user according to the identity indicators, a database entry associating the user with the voice print may be created. Additional information associated with the user may also be displayed. 11-24-2011
20130191127 VOICE ANALYZER, VOICE ANALYSIS SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING A PROGRAM - A voice analyzer includes a plate-shaped body, a plurality of first voice acquisition units that are placed on both surfaces of the plate-shaped body and that acquire a voice of a speaker, a sound pressure comparison unit that compares sound pressure of a voice acquired by the first voice acquisition unit placed on one surface of the plate-shaped body with sound pressure of a voice acquired by the first voice acquisition unit placed on the other surface and determines a larger sound pressure, and a voice signal selection unit that selects information regarding a voice signal which is associated with the larger sound pressure and is determined by the sound pressure comparison unit. 07-25-2013
20120016673 SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS - A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker. 01-19-2012
20130197912 SPECIFIC CALL DETECTING DEVICE AND SPECIFIC CALL DETECTING METHOD - A specific call detecting device includes: an utterance period detecting unit which detects at least a first utterance period in which a first speaker speaks in a call between the first speaker and a second speaker; an utterance ratio calculating unit which calculates an utterance ratio of the first speaker in the call; a voice recognition execution determining unit which determines, on the basis of the utterance ratio of the first speaker, whether at least one of a first voice of the first speaker and a second voice of the second speaker becomes a target of voice recognition; a voice recognizing unit which detects a keyword related to a specific call from the voice determined as a target of voice recognition among the first and second voices; and a determining unit which determines whether the call is the specific call on the basis of the detected keyword. 08-01-2013
