Entries |
Document | Title | Date |
20080208580 | Method and Dialog System for User Authentication - The invention relates to a method of authenticating a user (N). In a dialog between the user (N) to be authenticated and a dialog system ( | 08-28-2008 |
20080215323 | Method and System for Grouping Voice Messages - A method for grouping voice messages includes extracting a voice signature from a voice message and tagging the voice message with an identification associated with the voice signature. The method also includes grouping the voice message based on the identification. | 09-04-2008 |
20080215324 | Indexing apparatus, indexing method, and computer program product - Acoustic models to provide features to a speech signal are created based on speech features included in regions where similarities of acoustic models created based on speech features in a certain time length are equal to or greater than a predetermined value. Feature vectors acquired by using the acoustic models of the regions and the speech features to provide features to speech signals of second segments are grouped by speaker. | 09-04-2008 |
20080221885 | Speech Control Apparatus and Method - A speech control apparatus and a method thereof are provided. The speech control apparatus logs a user into application software according to a speech signal of the user. The speech control apparatus is connected to a password bank comprising a plurality of accounts and passwords. The speech control apparatus comprises a speech process module, a start module, a first receive module, an identity recognition module, a selection module, and a login module. The speech process module determines a meaning of the speech signal. The start module starts the application software according to the meaning of the speech signal. The first receive module receives a biometric feature of the user. The identity recognition module identifies the user as authorized according to the biometric feature. The selection module selects a login set of account and password from the password bank according to the speech signal and the biometric feature. The login module logs the user into the application software according to the login set of account and password. | 09-11-2008 |
20080221886 | METHOD AND SYSTEM FOR THE ENTRY OF FLIGHT DATA FOR AN AIRCRAFT, TRANSMITTED BETWEEN A CREW ON BOARD THE AIRCRAFT AND GROUND STAFF - A system for assisting in the entry of flight data for an aircraft, transmitted between a crew on board the aircraft and ground staff, includes a radiofrequency communications link to transmit flight data between the crew and the ground staff, and at least one means of sending and one means of receiving data on board the aircraft. The system further includes a voice recognition means capable of detecting a piece of data of a predefined type emitted, during the communications call, by the crew or the ground staff, and a means of analysis and transcription of this piece of data in digital or alphanumeric form. | 09-11-2008 |
20080221887 | SYSTEMS AND METHODS FOR DYNAMIC RE-CONFIGURABLE SPEECH RECOGNITION - Speech recognition models are dynamically re-configurable based on user information, background information such as background noise, and transducer information such as transducer response characteristics, to provide users with alternate input modes to keyboard text entry. The techniques of dynamic re-configurable speech recognition provide for deployment of speech recognition on small devices such as mobile phones and personal digital assistants, as well as in environments such as office, home or vehicle, while maintaining the accuracy of the speech recognition. | 09-11-2008 |
20080221888 | SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR ADDING VOICE ACTIVATION AND VOICE CONTROL TO A MEDIA PLAYER - A media player system, method and computer program product are provided. In use, an utterance is received. A command for a media player is then generated based on the utterance. Such command is utilized for providing wireless control of the media player. | 09-11-2008 |
20080235016 | SYSTEM AND METHOD FOR DETECTION AND ANALYSIS OF SPEECH - Certain aspects and embodiments of the present invention are directed to systems and methods for monitoring and analyzing the language environment and the development of a key child. A key child's language environment and language development can be monitored without placing artificial limitations on the key child's activities or requiring a third party observer. The language environment can be analyzed to identify words, vocalizations, or other noises directed to or spoken by the key child, independent of content. The analysis can include the number of responses between the child and another, such as an adult and the number of words spoken by the child and/or another, independent of content of the speech. One or more metrics can be determined based on the analysis and provided to assist in improving the language environment and/or tracking language development of the key child. | 09-25-2008 |
20080235017 | VOICE INTERACTION DEVICE, VOICE INTERACTION METHOD, AND VOICE INTERACTION PROGRAM - The present invention provides a voice interaction device capable of performing an interaction meeting any demand from a user at proper time in flexible response to a circumferential condition of the user, a voice interaction method and a voice interaction program thereof. The voice interaction device controls the interaction with the user in response to an input voice from the user, including an available time calculation unit ( | 09-25-2008 |
20080255840 | Video Nametags - Video nametags allow automatic identification of people speaking in a video. A video nametag is associated with a person who is participating in a video, such as a video conference scenario or recorded meeting. The video nametag includes one or more sensors that detect when the person is speaking. The video nametag transmits information to a video conferencing system that provides an indicator on a display of the video that identifies the speaker. The system may also automatically format the display of the video to concentrate on the person when the person is speaking. The video nametag can also capture the wearer's audio and transmit it wirelessly to be used for the conference audio send signal. | 10-16-2008 |
20080255841 | VOICE SEARCH DEVICE - A text data search using a voice is conventionally a full-text search using a word as an index word for a part recognized as a word in an input voice. Therefore, if any of the parts recognized as the words is falsely recognized, a search precision is lowered. In the present invention, referring to a language model generated by a language model generating part from text data to be subjected to a search which is divided by a learning data dividing part into a linguistic part and an acoustic model obtained by modeling voice features, a voice recognition part performs voice recognition for the input voice to output a phonemic representation. A matching unit converting part divides the phonemic representation into the same units as those of a text search dictionary, which is obtained by dividing the text data to be subjected to the search into the units smaller than those of the language model. A text search part uses the result of division to make a search on the text search dictionary. | 10-16-2008 |
20080255842 | Personalized Voice Activity Detection - A method of transferring a real-time audio signal transmission, including: registering voice patterns (or other characteristics) of one or more users to be used to identify the voices of the users, accepting an audio signal as it is created as a sequence of segments, analyzing each segment of the accepted audio signal to determine if it contains voice activity ( | 10-16-2008 |
20080262839 | Processing Control Device, Method Thereof, Program Thereof, and Recording Medium Containing the Program - A processor | 10-23-2008 |
20080262840 | Method Of Verifying Accuracy Of A Speech - A method for verifying the accuracy of speech is provided. The speech is pre-loaded into a dialog system. A medium is provided to verify the accuracy of the pre-loaded speech in the dialog system by comparing the test content with a predetermined speech script. | 10-23-2008 |
20080270131 | METHOD, PREPROCESSOR, SPEECH RECOGNITION SYSTEM, AND PROGRAM PRODUCT FOR EXTRACTING TARGET SPEECH BY REMOVING NOISE - The present invention relates to a method, preprocessor, speech recognition system, and program product for extracting a target speech by removing noise. In an embodiment, the invention extracts target speech from two input speeches, which are obtained through at least two speech input devices installed in different places in a space, and applies a spectrum subtraction process by using a noise power spectrum (Uω) estimated by one or both of the two speech input devices (Xω(T)) and an arbitrary subtraction constant (α) to obtain a resultant subtracted power spectrum (Yω(T)). The invention further applies a gain control based on the two speech input devices to the resultant subtracted power spectrum to obtain a gain-controlled power spectrum (Dω(T)). The invention further applies a flooring process to said resultant gain-controlled power spectrum on the basis of an arbitrary flooring factor (β) to obtain a power spectrum for speech recognition (Zω(T)). | 10-30-2008 |
20080275703 | Method and apparatus for identity verification - The present disclosure relates to identity verification devices and methods. A system is provided that utilizes a system of tonal and rhythmic visualization methods to accurately identify the true owner of a credit or other personal card based on their voice. | 11-06-2008 |
20080281594 | Autoscriber - The Autoscriber invention pertains to a system of inserting a printed SMPTE timecode into a textual representation of the spoken portion of a media recording. The system includes a user-supplied computer with Autoscriber (voice recognition) software, a printer and a user-supplied media recorder/player with SMPTE timecode reader and RS-422 data output. | 11-13-2008 |
20080294435 | System and Method for Remote Speech Recognition - A system and method for remote speech recognition includes one or more customer premise equipment, a speech engine, and a communication engine. The customer premise equipment interfaces with a host from which the customer premise equipment is remotely located. The speech engine, remotely located from the host, recognizes a plurality of speech spoken by a user of the customer premise equipment and translates the speech into the language of the host. The speech engine further converts the recognized speech into one or more text data packets where the text data packets include the recognized speech as data instead of voice. The communication engine encrypts the text data packets and transmits the text data packets to the host. Transmitting data instead of voice to the host reduces the computational demands on the host. Additionally, the communication engine receives a plurality of information from the host. | 11-27-2008 |
20080300877 | SYSTEM AND METHOD FOR TRACKING FRAUDULENT ELECTRONIC TRANSACTIONS USING VOICEPRINTS - Disclosed are systems, methods, and computer readable media for comparing customer voice prints with a database of known fraudulent voice signatures and continually updating the database to decrease the risk of identity theft. The method embodiment comprises comparing a received voice signal against a database of known fraudulent voice signatures, denying the caller's transaction if the voice signal substantially matches the database of known fraudulent voice signatures, adding the caller's voice signal to the database of known fraudulent voice signatures if the voice signal does not substantially match a separate speaker verification database and received additional information is not verified. | 12-04-2008 |
20080312922 | Method and System for Packetised Content Streaming Optimisation - A method of determining the speech content of a packet carrying speech encoded data missing from a speech segment communicated in a packetised data stream using at least one VOIP link between a server platform and a client platform, the method comprising at the client platform: receiving a plurality of packets carrying speech encoded data forming said packetised data stream; processing each received packet to determine a unique message segment identifier associated with a speech segment of the received packet; processing each received packet to determine if it contains another unique message segment identifier associated with a previously received packet carrying encoded speech data; determining if the unique message segment identifier for the received packet exists in storage means provided on the client platform, and if not, storing the received packet in association with its unique message segment identifier; processing each received packet to determine a sequence identifier; checking if the sequence identifier is contiguous in sequence with a previously received packet stored locally on said client platform, and if not, determining the speech content of one or more missing packets in the sequence sent by the server platform to the client platform by retrieving a packet from said storage means having the same unique message segment identifier as the missing packet. | 12-18-2008 |
20080312923 | Active Speaker Identification - Procedures for identifying clients in an audio event are described. In an example, a media server may order clients providing audio based on the input level. An identifier may be associated with the client for identifying the client providing input within the event. The ordered clients may be included in a list which may be inserted into a packet header carrying the audio content. | 12-18-2008 |
20080312924 | SYSTEM AND METHOD FOR TRACKING PERSONS OF INTEREST VIA VOICEPRINT - Disclosed are systems, methods, and computer readable media for tracking a person of interest. The method embodiment comprises identifying a person of interest, capturing a voiceprint of the person of interest, comparing a received voiceprint of a caller with the voiceprint of the person of interest, and tracking the caller if the voiceprint of the caller is a substantial match to the voiceprint of the person of interest. | 12-18-2008 |
20080312925 | System and Method for Implementing Voice Print-Based Priority Call Routing - A system, method, and computer-usable medium for routing a call. A server receives a call from a client. A routing engine captures a voice print from the call. In response to the routing engine capturing the voice print from the call, the routing engine compares the voice print to a database that includes a collection of voice prints. In response to the routing engine matching the voice print to at least one voice print among the collection of voice prints, an interactive voice response (IVR) module routes the call to an appropriate call queue based on the matching of the voice print. The call is then routed from the appropriate call queue to a call center corresponding to that call queue. | 12-18-2008 |
20090006093 | SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS - A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker. | 01-01-2009 |
20090006094 | OPTIMIZATION OF DETECTION SYSTEMS USING A DETECTION ERROR TRADEOFF ANALYSIS CRITERION - In detection systems, such as speaker verification systems, for a given operating point range, with an associated detection “cost”, the detection cost is preferably reduced by essentially trading off the system error in the area of interest with areas essentially “outside” that interest. Among the advantages achieved thereby are higher optimization gain and better generalization. From a measurable Detection Error Tradeoff (DET) curve of the given detection system, a criterion is preferably derived, such that its minimization provably leads to detection cost reduction in the area of interest. The criterion allows for selective access to the slope and offset of the DET curve (a line in case of normally distributed detection scores, a curve approximated by mixture of Gaussians in case of other distributions). By modifying the slope of the DET curve, the behavior of the detection system is changed favorably with respect to the given area of interest. | 01-01-2009 |
20090018831 | Speech Recognition Apparatus and Speech Recognition Method - Voices are prevented from being recognized with poor accuracy when a speaker is not close to a sound pickup device. A speech recognition apparatus ( | 01-15-2009 |
20090030689 | Mobile voice recognition data collection and processing - Voice recognition methods, systems and interfaces are used to collect data and produce databases that are then searched and used to produce reports or electronic filings. The databases are developed using a hierarchically designed command structure and a hierarchy of relational databases for the entry and recognition of voice commands. The invention uses an Adaptive Grammar that allows a very high probability for accurate recognition and a rapid recognition response to be achieved. The invention allows for multiple users and multiple mobile computers to maximize voice recognition capabilities. | 01-29-2009 |
20090030690 | SPEECH ANALYSIS APPARATUS, SPEECH ANALYSIS METHOD AND COMPUTER PROGRAM - A speech analysis apparatus analyzing prosodic characteristics of speech information and outputting a prosodic discrimination result includes an input unit inputting speech information, an acoustic analysis unit calculating relative pitch variation and a discrimination unit performing speech discrimination processing, in which the acoustic analysis unit calculates a current template relative pitch difference, determining whether a difference absolute value between the current template relative pitch difference and a previous template relative pitch difference is equal to or less than a predetermined threshold or not, when the value is not less than the threshold, calculating an adjacent relative pitch difference, and when the adjacent relative pitch difference is equal to or less than a previously set margin value, executing correction processing of adding or subtracting an octave of the current template relative pitch difference to calculate the relative pitch variation by applying the relative pitch difference as the relative pitch difference of the current analysis frame. | 01-29-2009 |
20090037173 | USING SPEAKER IDENTIFICATION AND VERIFICATION SPEECH PROCESSING TECHNOLOGIES TO ACTIVATE A PAYMENT CARD - The present invention discloses a payment card that uses speaker identification and verification (SIV) speech processing techniques for activation purposes. For example, the invention can initially identify a payment card in a deactivated state, which is an internal state of the payment card. Speech input can then be received. Speech characteristics of the speech input can be determined and compared against a voice print of an authorized card user. The payment card can be selectively activated based on comparison results. That is, when the voice print and the speech characteristics match, the payment card can be activated. Otherwise, the card will remain deactivated. An activated payment card is one that has undergone an internal state change from the deactivated state. For example, when activated, a credit card number can appear in a display and a magnetic strip can contain payment information, neither of which is present in the deactivated state. | 02-05-2009 |
20090043578 | Enhancing the Response of Biometric Access Systems - Disclosed are arrangements that provide security for items to which access is restricted by providing a single layer of security requiring a biometric signature ( | 02-12-2009 |
20090043579 | TARGET SPECIFIC DATA FILTER TO SPEED PROCESSING - A method is presented which reduces data flow and thereby increases processing capacity while preserving a high level of accuracy in a distributed speech processing environment for speaker detection. The method and system of the present invention includes filtering out data based on a target speaker specific subset of labels using data filters. The method preserves accuracy and passes only a fraction of the data by optimizing target specific performance measures. Therefore, a high level of speaker recognition accuracy is maintained while utilizing existing processing capabilities. | 02-12-2009 |
20090055178 | System and method of controlling personalized settings in a vehicle - A system is provided for controlling personalized settings in a vehicle. The system includes a microphone for receiving spoken commands from a person in the vehicle, a location recognizer for identifying location of the speaker, and an identity recognizer for identifying the identity of the speaker. The system also includes a speech recognizer for recognizing the received spoken commands. The system further includes a controller for processing the identified location, identity and commands of the speaker. The controller controls one or more feature settings based on the identified location, identified identity and recognized spoken commands of the speaker. The system also optimizes the grammar comparison for speech recognition and the beamforming microphone array used in the vehicle. | 02-26-2009 |
20090055179 | Method, medium and apparatus for providing mobile voice web service - Provided are a method and apparatus for providing a mobile voice web service in a mobile terminal. The method includes analyzing a web history of a user from web search logs of the user and generating a voice access list based on the analysis results, and performing voice recognition by dynamically generating a voice recognition syntax according to the generated voice access list. Accordingly, by limiting the syntax required for voice recognition to a syntax suited to the web context of the user, efficient voice recognition can be implemented that is performed in a terminal rather than a server. | 02-26-2009 |
20090083033 | Phonetic Searching - An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time. | 03-26-2009 |
20090089056 | ELECTRONIC APPARATUS AND DISPLAY PROCESS METHOD - According to one embodiment, an electronic apparatus includes a sound characteristic output module configured to analyze audio data in video content data, thereby outputting sound characteristic information indicative of sound characteristics of the audio data. A talk section detection process module detects talk sections in which talks are made by persons, which are included in the video content data, on the basis of the sound characteristic information, and classifies the detected talk sections into a plurality of groups which are associated with different speakers. A display process module displays, on a time bar which is representative of a sequence of the video content data, a plurality of bar areas indicative of positions of the detected talk sections in the sequence of the video content data, in different display modes in association with the groups. | 04-02-2009 |
20090094029 | Managing Audio in a Multi-Source Audio Environment - Methods, systems, and computer-readable media provide for the management of an audio environment with multiple audio sources. According to various embodiments described herein, real-time audio from multiple sources is received. A speaker is identified for each of the audio sources. Upon detecting a change from a first audio source to a second audio source, an identification of the speaker associated with the second audio source is provided. According to various embodiments, a recording of the real-time audio may be made and descriptors inserted to identify each speaker as the audio source changes. Real-time feedback from the speakers regarding characteristics of the audio may be received and corresponding adjustments to the audio made. | 04-09-2009 |
20090094030 | INDEXING METHOD FOR QUICK SEARCH OF VOICE RECOGNITION RESULTS - A method, system and computer program product for receiving a spoken request to obtain indexed results from a database. Like result types are assigned to categories, and within each category is a plurality of result entries. The result indices are hexadecimal encoded, and each hexadecimal encoding is preceded by an initial character representing the result category. A speech recognition system is engaged, which processes the spoken request. When an item is requested, the respective category is implicitly known by the index returned, and the index provides direct access within a database to the corresponding result based on the phonetics of the request. | 04-09-2009 |
20090106024 | PORTABLE ELECTRONIC DEVICE AND A NOISE SUPPRESSING METHOD THEREOF - A portable electronic device comprises a main body ( | 04-23-2009 |
20090112589 | ELECTRONIC APPARATUS AND SYSTEM WITH MULTI-PARTY COMMUNICATION ENHANCER AND METHOD - A multi-party communication enhancer includes an audio data input adapted to receive voice data associated with a plurality of communication participants. A participant identifier included in the multi-party communication enhancer is adapted to distinguish the voice of a number of communication participants as represented within the received voice data. A cue generator, also included in the multi-party communication enhancer, is operable to generate a cue for each distinguished voice, with the generated cue being outputted in association with the corresponding distinguished voice. | 04-30-2009 |
20090112590 | SYSTEM AND METHOD FOR IMPROVING INTERACTION WITH A USER THROUGH A DYNAMICALLY ALTERABLE SPOKEN DIALOG SYSTEM - Disclosed are systems and methods for dynamically interacting with a user through a spoken dialogue system. A method includes the steps of (1) receiving a user utterance, (2) analyzing the user utterance for a threshold determination of dialect, (3) generating a response that reflects an incremental implementation of the dialect, (4) further varying the perceived implementation of the dialect in subsequent responses by a process of: (a) receiving a subsequent user utterance, (b) determining a modified level of confidence in the dialect based at least in part from the subsequent utterance, (c) generating a subsequent response that implements an incremental variation according to the modified level of confidence. | 04-30-2009 |
20090112591 | SYSTEM AND METHOD OF WORD LATTICE AUGMENTATION USING A PRE/POST VOCALIC CONSONANT DISTINCTION - Disclosed are systems and methods for recognizing speech in a spoken dialogue system. The method includes (1) receiving an input speech having at least one pre-vocalic consonant or at least one post-vocalic consonant, (2) generating at least one output lattice that calculates a first score by comparing the input speech to a training model to provide a result, (3) distinguishing between the at least one pre-vocalic consonant and the at least one post-vocalic consonant in the input speech, (4) calculating a second score by measuring a similarity between the at least one pre-vocalic consonant or the at least one post-vocalic consonant in the input speech and the first score, (5) determining at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch by using the second score, and (6) refining the results of an automated speech recognition (ASR) system by using the at least one category for at least one pre-vocalic match or mismatch or at least one post-vocalic match or mismatch. | 04-30-2009 |
20090112592 | Remote controller with speech recognition - A receiver remote controller has a storage device storing electronic program guide (EPG) data that relates content to television channels containing said content. The remote controller is contained in a remote controller housing with the housing containing: a data interface that receives the EPG data provided by an EPG data source for storage in the storage device; a speech interface that receives speech input from a user and produces speech signals therefrom; a natural language speech processor engine that receives the speech signals and translates the speech signals to a query of the EPG database; and a processor that receives results of the query from the natural language speech processor, and either conveys the results of the query to a user utilizing a user interface or sends navigation commands to the receiver. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract. | 04-30-2009 |
20090119106 | BUILDING WHITELISTS COMPRISING VOICEPRINTS NOT ASSOCIATED WITH FRAUD AND SCREENING CALLS USING A COMBINATION OF A WHITELIST AND BLACKLIST - According to one aspect of the invention there is provided a method, comprising collecting voiceprints of callers; identifying which of the collected voiceprints are associated with fraud; and generating a whitelist comprising voiceprints corresponding to the collected voiceprints not identified as associated with fraud. | 05-07-2009 |
20090125307 | System and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks - A system and a method for providing each user at multiple devices with speaker-dependent speech recognition engines via networks according to the pre-stored speech sounds and characteristics of devices, by which each user can use speaker-dependent speech recognition engines in different devices without the need of repeating the same procedure of recording speech to train speech recognition engines for newly utilized devices. | 05-14-2009 |
20090150149 | Identifying far-end sound - Frames containing audio data may be received, the audio data having been derived from a microphone array, at least some of the frames containing residual acoustic echo after having acoustic echo partially removed therefrom. Probability distribution functions are determined from the frames of audio data. A probability distribution function comprises likelihoods that respective directions are directions of sources of sounds. An active speaker may be identified in frames of video data based on the video data and based on audio information derived from the audio data, where use of the audio information as a basis for identifying the active speaker is controlled by determining whether the probability distribution functions indicate that corresponding audio data includes residual acoustic echo. | 06-11-2009 |
20090150150 | SYSTEM AND METHOD FOR CONTROLLING ACCESS TO A HANDHELD DEVICE BY VALIDATING VOICE SOUNDS - A method for controlling access to a handheld device ( | 06-11-2009 |
20090150151 | Audio processing apparatus, audio processing system, and audio processing program - Disclosed herein is an audio processing apparatus for processing a plurality of pieces of audio data of sounds picked up by a plurality of microphones. The apparatus includes: a speaker identification section configured to identify a speaker based on the audio data; a simultaneous speech section identification section configured to, when at least first and second speakers have been identified, identify speech sections during which the first and second speakers have made speeches, and identify a section during which the first and second speakers have made the speeches at the same time as a simultaneous speech section; and an arranging section configured to separate audio data of the first speaker and audio data of the second speaker from the simultaneous speech section, and allow the audio data of the first speaker and the audio data of the second speaker to be outputted at mutually different timings. | 06-11-2009 |
20090164215 | DEVICE WITH VOICE-ASSISTED SYSTEM - A device with a voice-assisted system is provided by using a voice command to adjust operations. The voice-assisted system includes a voice recognition engine and a control unit. The voice recognition engine receives a voice command and outputs a voice signal based on the voice command to the control unit. The control unit adjusts the operations based on the voice signal. A user is only required to input the voice command. The voice recognition engine performs a series of actions to adjust the operations. Therefore, the voice-assisted system can enhance convenience of adjusting the operations of the device and reduce operation complexity for the user. | 06-25-2009 |
20090171660 | METHOD AND APPARATUS FOR VERIFICATION OF SPEAKER AUTHENTIFICATION AND SYSTEM FOR SPEAKER AUTHENTICATION - A method for verification of speaker authentication comprises inputting a test utterance containing a password that is spoken by a speaker, extracting an acoustic feature vector sequence from the inputted test utterance, obtaining a matching path between the extracted acoustic feature vector sequence and a speaker template enrolled by an enrolled speaker, calculating a matching score of the obtained matching path upon considering spectral change of the test utterance and/or spectral change of the speaker template, and comparing the matching score with a predefined discriminating threshold to determine whether the inputted test utterance is an utterance containing a password spoken by the enrolled speaker. | 07-02-2009 |
20090187404 | BIOMETRIC CONTROL SYSTEM AND METHOD - A large-scale attendance, productivity, activity and availability biometric control method using the telephone network, for individual client users at their work places, with speaker verification technology based on limited enrolling data and short verification sentences, and the equipment associated with this method. Through the biometric control method by means of the individuals' identity (ID) verification using voice recognition, the individual is biometrically identified, which allows for the registering of permanence, entrance or exit times, the type of performed activity, and attendance at the work place, and which keeps records on the performed activity type. The method considers receiving and making intelligent calls to achieve active control of individuals' activity. | 07-23-2009 |
20090187405 | Arrangements for Using Voice Biometrics in Internet Based Activities - In one embodiment, a method for identifying a user of a virtual universe utilizing audio biometrics is disclosed. The method can include prompting a client application with a request for an utterance, processing the reply to the request and creating a voice profile of the user/speaker. The voice profile can be associated with an avatar and when an utterance is received, the avatar can be identified and in some embodiments authenticated. Other embodiments are also disclosed. | 07-23-2009 |
20090198495 | VOICE SITUATION DATA CREATING DEVICE, VOICE SITUATION VISUALIZING DEVICE, VOICE SITUATION DATA EDITING DEVICE, VOICE DATA REPRODUCING DEVICE, AND VOICE COMMUNICATION SYSTEM - A voice situation data creating device for providing the user with data with a good convenience for the user when the user uses voice data collected from sound sources and recorded with time. A direction/talker identifying section ( | 08-06-2009 |
20090204400 | System and method for processing a spoken request from a user - A system and method are described for processing a spoken request from a user. In one embodiment, a method is disclosed for attempting to recognize a spoken request from a user with a speech recognition engine above a predetermined level of accuracy. If the spoken request is not recognized above the predetermined level of accuracy, the spoken request is provided to a level one agent. If the level one agent does not recognize the request, a voice connection is established between the user and a level two agent. In another embodiment, a method is disclosed for determining whether a silent response system recognizes a spoken request from a user above a predetermined level of accuracy. A response is provided to the user if the silent response system recognizes the spoken request. Otherwise, a voice connection is established between the user and a call center. | 08-13-2009 |
20090210227 | VOICE RECOGNITION APPARATUS AND METHOD FOR PERFORMING VOICE RECOGNITION - A voice recognition apparatus includes: a voice recognition module that performs a voice recognition for an audio signal during a voice period; a distance measurement module that measures a current distance between the user and a voice input module; a calculation module that calculates, based on the voice characteristic, a recommended distance range in which an S/N ratio is estimated to exceed a first threshold; and a display module that displays the recommended distance range and the current distance. | 08-20-2009 |
20090222265 | Voice Recognition Apparatus - A voice recognition apparatus | 09-03-2009 |
20090240499 | LARGE VOCABULARY QUICK LEARNING SPEECH RECOGNITION SYSTEM - A speech recognition system comprising: an analog to digital converter, a time to frequency transformer, a noise filter; a context preprocessor, an acoustic word classifier, an initial acoustic model generator, a textual search module, and a trainer. The system recognizes speech initially prior to training, due to the context preprocessor classifying words of identical sound by the context of a leading and trailing neighboring group of words and by the acoustic model generator creating an initial acoustic model derived from an acoustic word statistical analysis ‘average’. Applications of the system include voice activated computer games, command and control systems and text dictation. | 09-24-2009 |
20090248412 | Association apparatus, association method, and recording medium - There is provided an association apparatus for associating a plurality of voice data converted from voices produced by speakers, comprising: a word/phrase similarity deriving section which derives an appearance ratio of a common word/phrase that is common among the voice data based on a result of speech recognition processing on the voice data, as a word/phrase similarity; a speaker similarity deriving section which derives a result of comparing characteristics of voices extracted from the voice data, as a speaker similarity; an association degree deriving section which derives a possibility of the plurality of the voice data, which are associated with one another, based on the derived word/phrase similarity and the speaker similarity, as an association degree; and an association section which associates the plurality of the voice data with one another, the derived association degree of which is equal to or more than a preset threshold. | 10-01-2009 |
20090248413 | DEVICES AND SYSTEMS FOR REMOTE CONTROL - Remote controllers and systems thereof are disclosed. The remote controller remotely operates a receiving host, in which the receiving host provides voice input and voice recognition functions. The remote controller comprises a first input unit and a second input unit for generating a voice input request and a voice recognition request. The generated voice input and voice recognition requests are then sent to the receiving host, thereby forcing the receiving host to perform the voice input and voice recognition functions. | 10-01-2009 |
20090248414 | Personal name assignment apparatus and method - An apparatus includes unit acquiring speaker information including a first duration of a speaker and a name specified by name specifying information used to indicate a name, and acquiring the first duration as a first period, unit acquiring a second period including an utterance, unit extracting, if the second period is included in the first period, a first amount that characterizes a speaker, and associating the first amount with a name corresponding to the first period, unit creating speaker models from amounts, unit acquiring, from the content information, a third duration as a duration to be recognized, unit extracting, if the second period is included in the third period, a second amount that characterizes a speaker, unit calculating degrees of similarity between amounts of speaker models and the second amount, and unit recognizing a name of a speaker model which satisfies a set condition of the degrees as a performer. | 10-01-2009 |
20090254343 | IDENTIFYING AUDIO CONTENT USING DISTORTED TARGET PATTERNS - Embodiments of a system for identifying audio content are described. During operation, the system receives a data stream from an electronic device via a communication network. Then, the system distorts a set of target patterns which are used to identify the audio content based on characteristics of the electronic device and/or the communication network. Next, the system identifies the audio content in the data stream based on the set of distorted target patterns. | 10-08-2009 |
20090259467 | Voice Recognition Apparatus - A voice recognition apparatus | 10-15-2009 |
20090259468 | SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION - Disclosed herein are systems, methods, and tangible computer readable-media for detecting synthetic speaker verification. The method comprises receiving a plurality of speech samples of the same word or phrase for verification, comparing each of the plurality of speech samples to each other, denying verification if the plurality of speech samples demonstrate little variance over time or are the same, and verifying the plurality of speech samples if the plurality of speech samples demonstrate sufficient variance over time. One embodiment further adds that each of the plurality of speech samples is collected at different times or in different contexts. In other embodiments, variance is based on a pre-determined threshold or the threshold for variance is adjusted based on a need for authentication certainty. In another embodiment, if the initial comparison is inconclusive, additional speech samples are received. | 10-15-2009 |
20090259469 | METHOD AND APPARATUS FOR SPEECH RECOGNITION - A method and apparatus for performing speech recognition receives an audio signal, generates a sequence of frames of the audio signal, transforms each frame of the audio signal into a set of narrow band feature vectors using a narrow passband, couples the narrow band feature vectors to a speech model, and determines whether the audio signal is a wide band signal. When the audio signal is determined to be a wide band signal, a pass band parameter of each of one or more passbands that are outside the narrow passband is generated for each frame and the one or more band energy parameters are coupled to the speech model. | 10-15-2009 |
20090259470 | Bio-Phonetic Multi-Phrase Speaker Identity Verification - Systems and methods for bio-phonetic multi-phrase speaker identity verification are disclosed. Generally, a speaker identity verification engine generates a dynamic phrase including at least one dynamically-generated word. The speaker identity verification engine prompts a user to speak the dynamic phrase and receives a dynamic phrase utterance. The speaker identity verification engine extracts at least one voice characteristic from the dynamic phrase utterance and compares the at least one voice characteristic with a voice profile to generate a score. The speaker identity verification engine then determines whether to accept a speaker identity claim based on the score. | 10-15-2009 |
20090271196 | CLASSIFYING PORTIONS OF A SIGNAL REPRESENTING SPEECH - Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a frame of the signal representing speech. The frame can be classified as unvoiced or voiced based on occurrence of one or more events within the frame. For example, the one or more events can comprise one or more glottal pulses. In response to classifying the frame as voiced, the frame can be processed. | 10-29-2009 |
20090276217 | VOIP CALLER AUTHENTICATION BY VOICE SIGNATURE CONTINUITY - There are provided methods and systems for authenticating a user. A method includes receiving a voice signature certificate corresponding to a setup portion of a Voice over Internet Protocol (VoIP) call. The VoIP call further has a voice conversation portion. The voice signature certificate includes a voice signature segment. The method further includes reproducing the voice signature segment to enable verification of voice continuity from the setup portion to the voice conversation portion. The verification is performed by comparing the voice signature segment to a user's voice during the voice conversation portion. | 11-05-2009 |
20090276218 | Robot and Server with Optimized Message Decoding - A method for optimizing message transmission and decoding comprises: reading data from a memory of an originating device, the data comprising information regarding the originating device; encoding the data by converting the data to a subset of words having a ranked recognition accuracy higher than the remainder of words; transmitting the encoded data from the originating device to a receiving system audibly as words via a telephone connection; utilizing a voice recognition software to recognize the words; decoding the words back to the data; and taking a predetermined action based on the data. | 11-05-2009 |
20090287489 | SPEECH PROCESSING FOR PLURALITY OF USERS - A mobile communication device configured to communicate over a wireless network has an audio processing circuit that is adaptable based on a pattern of the speaker's voice to provide improved audio quality and intelligibility. The audio processing circuit is configured to receive a voice signal from an individual speaker, to determine a pattern associated with the speaker's voice, and to adjust a filter based on the determined pattern. | 11-19-2009 |
20090319270 | CAPTCHA Using Challenges Optimized for Distinguishing Between Humans and Machines - An audible based electronic challenge system is used to control access to a computing resource by using a test to identify an origin of a voice. The test is based on analyzing a spoken utterance using optimized challenge items selected for their discrimination capability to determine if it was articulated by an unauthorized human or a text to speech (TTS) system. | 12-24-2009 |
20090319271 | System and Method for Generating Challenge Items for CAPTCHAs - Challenge items for an audible based electronic challenge system are generated using a variety of techniques to identify optimal candidates. The challenge items are intended for use in a computing system that discriminates between humans and text to speech (TTS) systems. | 12-24-2009 |
20090326942 | METHODS OF IDENTIFICATION USING VOICE SOUND ANALYSIS - Methods of using individually distinctive patterns of voice characteristics to identify a speaker include computing the reassigned spectrogram of each of at least two voice samples, pruning each reassigned spectrogram to remove noise and other computational artifacts, and comparing (either visually or with the aid of a processor) the strongest points to determine whether the voice samples belong to the same speaker. | 12-31-2009 |
20090326943 | Guidance information display device, guidance information display method and recording medium - A guidance information display device includes: a voice input unit; a display unit for displaying guidance information; an operation unit for accepting an operation; and a processor capable of executing the following processes of: a voice recognition process operation of performing voice recognition based on inputted voice; a calculation operation of calculating an evaluation value for a recognition result of voice recognition by the voice recognition process operation; a display operation of reading out guidance information corresponding to the recognition result from a storage unit, which stores the guidance information, and displaying the guidance information at a display unit; and a decision operation of deciding a display mode of the guidance information at the display unit based on a variable value, which varies with an operation from the operation unit for the guidance information displayed by the display operation, and the evaluation value calculated by the calculation operation. | 12-31-2009 |
20090326944 | VOICE RECOGNITION APPARATUS AND METHOD - An input voice is detected after starting a voice input waiting state; the detected voice is recognized; an elapsed time from the start of the voice input waiting state is counted; an informative sound which urges a user to input the voice is outputted when the elapsed time reaches a preset output set time; and the output of the informative sound is stopped when the elapsed time at the time of inputting the voice is shorter than the output set time. | 12-31-2009 |
20100017208 | INTEGRATED CIRCUIT FOR PROCESSING VOICE - An improved integrated circuit for processing voice (speech) is provided. This is a voice LSI. The voice LSI reduces a voice output level to 0V if a speech segment is silent. This voice LSI can reduce a white noise. | 01-21-2010 |
20100017209 | RANDOM VOICEPRINT CERTIFICATION SYSTEM, RANDOM VOICEPRINT CIPHER LOCK AND CREATING METHOD THEREFOR - The present invention provides a random voiceprint certification system comprising a training system, a random cipher generator, and a testing system, which is employed to process training or testing operation for the input raw voice data. In training voice, the training system obtains appointment voiceprint feature model parameter groups from the input raw voice data. From the appointment voiceprint feature model parameter groups, several voiceprint characteristic units are obtained and at least one reference voiceprint password, which is for the testing system to carry out the voice testing operation, is built. In processing testing voice, the random cipher generator generates randomly at least one reference voiceprint password from the voiceprint characteristic units of the appointment voiceprint feature model parameter groups to build the random voiceprint cipher lock. The present invention generates randomly one or several reference voiceprint passwords. The random voiceprint certification system is built completely to form the random voiceprint cipher lock. Therefore, illegal intrusion can be made difficult. | 01-21-2010 |
20100036664 | SUBTITLE GENERATION AND RETRIEVAL COMBINING DOCUMENT PROCESSING WITH VOICE PROCESSING - An apparatus for retrieving a character string includes: storage for storing first text data obtained by recognizing a voice in a presentation, second text data extracted from document data used in the presentation, and associated information of the first text data and the second text data. The apparatus also includes a retrieval unit for retrieving, by use of the associated information, the character string from text data composed from the first text data and the second text data. | 02-11-2010 |
20100049515 | VEHICLE-MOUNTED VOICE RECOGNITION APPARATUS - A vehicle-mounted voice recognition apparatus | 02-25-2010 |
20100070277 | VOICE RECOGNITION DEVICE, VOICE RECOGNITION METHOD, AND VOICE RECOGNITION PROGRAM - A voice recognition device that recognizes a voice of an input voice signal, comprises a voice model storage unit that stores in advance a predetermined voice model having a plurality of detail levels, the plurality of detail levels being information indicating a feature property of a voice for the voice model; a detail level selection unit that selects a detail level, closest to a feature property of an input voice signal, from the detail levels of the voice model stored in the voice model storage unit; and a parameter setting unit that sets parameters for recognizing the voice of an input voice according to the detail level selected by the detail level selection unit. | 03-18-2010 |
20100076763 | VOICE RECOGNITION SEARCH APPARATUS AND VOICE RECOGNITION SEARCH METHOD - A voice recognition search apparatus includes: a dictionary create unit creating a first voice recognition dictionary from a search subject data; a voice acquisition unit acquiring first and second voices; a voice recognition unit creating first and second text data by recognizing the first and second voices using the first and second voice recognition dictionaries; a first search unit searching the search subject data by the first text data; and a second search unit searching a search result of the first search unit by the second text data. | 03-25-2010 |
20100082342 | Method of Retaining a Media Stream without Its Private Audio Content - A method is disclosed that enables the handling of audio streams for segments in the audio that might contain private information, in a way that is more straightforward than in some techniques in the prior art. The data-processing system of the illustrative embodiment receives a media stream that comprises an audio stream, possibly in addition to other types of media such as video. The audio stream comprises audio content, some of which can be private in nature. Once it receives the data, the data-processing system then analyzes the audio stream for private audio content by using one or more techniques that involve looking for private information as well as non-private information. As a result of the analysis, the data-processing system omits the private audio content from the resulting stream that contains the processed audio. | 04-01-2010 |
20100106502 | SPEAKER VERIFICATION METHODS AND APPARATUS - In one aspect, a method for determining validity of an identity asserted by a speaker using a voice print associated with a user whose identity the speaker is asserting, the voice print obtained from characteristic features of at least one first voice signal obtained from the user uttering at least one enrollment utterance including at least one enrollment word is provided. The method comprises acts of obtaining a second voice signal of the speaker uttering at least one challenge utterance, wherein the at least one challenge utterance includes at least one word that was not in the at least one enrollment utterance, obtaining at least one characteristic feature from the second voice signal, comparing the at least one characteristic feature with at least a portion of the voice print to determine a similarity between the at least one characteristic feature and the at least a portion of the voice print, and determining whether the speaker is the user based, at least in part, on the similarity between the at least one characteristic feature and the at least a portion of the voice print. | 04-29-2010 |
20100106503 | SPEAKER VERIFICATION METHODS AND APPARATUS - In one aspect, a method for determining a validity of an identity asserted by a speaker using a voice print that models speech of a user whose identity the speaker is asserting is provided. The method comprises acts of performing a first verification stage comprising acts of obtaining a first voice signal from the speaker uttering at least one first challenge utterance; and comparing at least one characteristic feature of the first voice signal with at least a portion of the voice print to assess whether the at least one characteristic feature of the first voice signal is similar enough to the at least a portion of the voice print to conclude that the first voice signal was obtained from an utterance by the user. The method further comprises performing a second verification stage if it is concluded in the first verification stage that the first voice signal was obtained from an utterance by the user, the second verification stage comprising acts of adapting at least one parameter of the voice print based, at least in part, on the first voice signal to obtain an adapted voice print, obtaining a second voice signal from the speaker uttering at least one second challenge utterance, and comparing at least one characteristic feature of the second voice signal with at least a portion of the adapted voice print to assess whether the at least one characteristic feature of the second voice signal is similar enough to the at least a portion of the adapted voice print to conclude that the second voice signal was obtained from an utterance by the user. | 04-29-2010 |
20100106504 | INTELLIGENT MECHANISM TO AUTOMATICALLY DISCOVER AND NOTIFY A POTENTIAL PARTICIPANT OF A TELECONFERENCE - A computer-implemented method, computer program product, and data processing system for notifying an identified person of a teleconference. Data corresponding to an audio record of the teleconference is received. Pattern recognition is performed on the data. Responsive to recognizing in the data a pattern corresponding to an identification of the identified person, a device associated with the identified person is contacted. | 04-29-2010 |
20100121641 | EXTERNAL VOICE IDENTIFICATION SYSTEM AND IDENTIFICATION PROCESS THEREOF - An external voice identification system and an identification process thereof are disclosed. The external voice identification system of a multimedia electronic device is activated by identifying and analyzing an input voice message, and the multimedia electronic device can be an iPod player having a storage module stored with a plurality of voice files and a transmission interface to electrically connect to a voice identification system. The voice identification system is electrically connected to the transmission interface and has a built-in identification module, and an identification unit can identify and analyze the voice signals. An adapting interface is connected to a voice input unit to receive the external voice signal, so that the identification module identifies and analyzes the external voice signals to further activate the multimedia electronic device for playing the voice signal (songs), and to select, adjust and switch the playing content by the inputted external voice signal. | 05-13-2010 |
20100145695 | APPARATUS FOR CONTEXT AWARENESS AND METHOD USING THE SAME - An apparatus for context awareness includes: a voice-based recognition unit that recognizes a user's emotional state on the basis of a voice signal; a motion-based recognition unit that recognizes the user's emotional state on the basis of a motion; a position recognition unit that recognizes a location where the user is positioned; and a mergence-recognition unit that recognizes a user's context by analyzing the recognition results of the voice-based recognition unit, the motion-based recognition unit, and the position recognition unit. Accordingly, it is possible to rapidly and accurately recognize accidents or dangerous contexts affecting a user. | 06-10-2010 |
20100145696 | Method, system and apparatus for improved voice recognition - An improved voice recognition system in which a Voice Keyword Table is generated and downloaded from a set-up device to a voice recognition device. The VKT includes visual form data, spoken form data, phonetic format data, and an entry corresponding to a keyword, and TTS-generated voice prompts and voice models corresponding to the phonetic format data. A voice recognition system on the voice recognition device is updated by the set-up device. Furthermore, voice models in the voice recognition device are modified by the set-up device. | 06-10-2010 |
20100179813 | VOICE RECOGNITION SYSTEM AND METHODS - The present invention relates to a method of providing voice recognition. The method comprises the steps of receiving packetised voice data of a person to be identified over a packet-switched network, comparing the voice data with stored voice data of a user and, based on the comparison, providing an indication of the likelihood that the person to be identified is the user, wherein the step of receiving the voice data comprises waiting for sufficient voice data to be received. | 07-15-2010 |
20100185444 | METHOD, APPARATUS AND COMPUTER PROGRAM PRODUCT FOR PROVIDING COMPOUND MODELS FOR SPEECH RECOGNITION ADAPTATION - An apparatus for providing compound models for speech recognition adaptation includes a processor. The processor may be configured to receive a speech signal corresponding to a particular speaker, select a cluster model including both a speaker independent portion and a speaker dependent portion based at least in part on a characteristic of speech of the particular speaker, and process the speech signal using the selected cluster model. A corresponding method and computer program product are also provided. | 07-22-2010 |
20100204991 | Ultrasonic Doppler Sensor for Speaker Recognition - A method and system recognizes an unknown speaker by directing an ultrasonic signal at a face of the unknown speaker. A Doppler signal of the ultrasonic signal is acquired after reflection by the face, and Doppler features are extracted from the reflected Doppler signal. The Doppler features are classified using Doppler models storing the Doppler features and identities of known speakers to recognize and identify the unknown speaker. | 08-12-2010 |
20100204992 | METHOD FOR INDENTIFYING AN ACOUSIC EVENT IN AN AUDIO SIGNAL - A process for recognizing an acoustic event in an audio signal has two stages. The first stage involves possible candidates being selected, and the second stage involves each of the possible candidates being allocated a confidence value. | 08-12-2010 |
20100217594 | PERSONAL AUTHENTICATION SYSTEM - In a system in which a business organization authenticates a user by speaker recognition, there is provided the system that obviates a necessity for the user to register a speaker model by uttering voice for each business partner. A user device | 08-26-2010 |
20100235169 | SPEECH DIFFERENTIATION - Method for differentiation between voices including 1) analyzing perceptually relevant signal properties of the voices, e.g. average pitch and pitch variance, 2) determining sets of parameters representing the signal properties of the voices, and finally 3) extracting voice modification parameters representing modified signal properties of at least some of the voices. Hereby it is possible to increase a mutual parameter distance between the voices, and thereby the perceptual difference between the voices, when the voices have been modified according to the voice modification parameters. Preferably most of or all voices are modified in order to limit the amount of modification of one parameter. Preferred signal property measures are: pitch, pitch variance over time, glottal pulse shape, formant frequencies, signal amplitude, energy differences between voiced and un-voiced speech segments, characteristics related to overall spectrum contour of speech, and characteristics related to dynamic variation of one or more measures in long speech segments. The method allows an automatic voice differentiation with a natural sound since it is based on a modification of signal properties determined for each of the voices. | 09-16-2010 |
20100250252 | Conference support device, conference support method, and computer-readable medium storing conference support program - A conference support device includes an image receiving portion that receives captured images from conference terminals, a voice receiving portion that receives, from one of the conference terminals, a voice that is generated by a first participant, a first storage portion that stores the captured images and the voice, a voice recognition portion that recognizes the voice, a text data creation portion that creates text data that express the words that are included in the voice, an addressee specification portion that specifies a second participant, whom the voice is addressing, an image creation portion that creates a display image that is configured from the captured images and in which the text data are associated with the first participant and a specified image is associated with at least one of the first participant and the second participant, and a transmission portion that transmits the display image to the conference terminals. | 09-30-2010 |
20100268537 | Speaker verification system - A text-independent speaker verification system utilizes mel frequency cepstral coefficients analysis in the feature extraction blocks, template modeling with vector quantization in the pattern matching blocks, an adaptive threshold and an adaptive decision verdict and is implemented in a stand-alone device using less powerful microprocessors and smaller data storage devices than used by comparable systems of the prior art. | 10-21-2010 |
20100280828 | Communication Device Language Filter - Techniques are described that generally relate to systems, methods, and devices designed to selectively filter offensive communications in accordance with a user's intentions. Example methods may be designed to filter (such as by deleting, blocking, replacing, and/or modifying) various offensive words, phrases, and/or sounds that have been identified as having offensive meanings. | 11-04-2010 |
20100286983 | OPERATION CONTROL APPARATUS AND METHOD IN MULTI-VOICE RECOGNITION SYSTEM - An operation control apparatus and method of controlling a plurality of operationally connected voice recognition-enabled systems, each having reciprocal control operational states corresponding to an enabled/disabled state. | 11-11-2010 |
20100305946 | SPEAKER VERIFICATION-BASED FRAUD SYSTEM FOR COMBINED AUTOMATED RISK SCORE WITH AGENT REVIEW AND ASSOCIATED USER INTERFACE - Disclosed is a method for screening an audio for fraud detection, the method comprising: providing a User Interface (UI) control capable of: a) receiving an audio; b) comparing the audio with a list of fraud audios; c) assigning a risk score to the audio based on the comparison with a potentially matching fraud audio of the list of fraud audios; and d) displaying an audio interface on a display screen, wherein the audio interface is capable of playing the audio along with the potentially matching fraud audio, and wherein the display screen further displays metadata for each of the audio and the potentially matching fraud audio thereon, wherein the metadata includes location and incident data of each of the audio and the potentially matching fraud audio. | 12-02-2010 |
20100324898 | VOICE RECOGNITION WITH DYNAMIC FILTER BANK ADJUSTMENT BASED ON SPEAKER CATEGORIZATION - Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. The speaker is categorized as a male, female, or child and the categorization is used as a basis for dynamically adjusting a maximum frequency f | 12-23-2010 |
20110004474 | Audience Measurement System Utilizing Voice Recognition Technology - A method, a system, and a computer program product for determining a total count of audience members within a sensory receiving environment during the presentation of a program. A voice recognition unit is enabled when a signal for a program/subject/event, such as a broadcast program, is received. The voice recognition unit receives one or more sounds in the sensory receiving environment and analyzes the characteristics of the sounds. When one or more unique human voices are identified during the program, a count of the number of unique human voices is determined. The count of unique human voices is transmitted to a server, whereby the count of unique human voices is equal to a count of audience members. The total count of audience members is calculated for all sensory receiving environments associated with the program. An audience analysis graphical user interface is generated to display the total count of audience members. | 01-06-2011 |
20110022388 | METHOD AND SYSTEM FOR SPEECH RECOGNITION USING SOCIAL NETWORKS - In an example embodiment, there is disclosed an apparatus comprising an audio interface configured to receive an audio signal, a data interface configured to communicate with at least one social graph, and logic coupled to the audio interface and the data interface. The logic is configured to identify a calling party. The logic is further configured to acquire data representative of a called party from the audio signal. The logic is configured to initiate a search of the at least one social graph for the data representative of the called party to identify the called party responsive to acquiring the data representative of the called party. | 01-27-2011 |
20110022389 | APPARATUS AND METHOD FOR IMPROVING PERFORMANCE OF VOICE RECOGNITION IN A PORTABLE TERMINAL - An apparatus and method for improving the performance of voice recognition in a portable terminal are provided. The apparatus includes a voice recognition management unit, and a controller. After recognizing a user's voice and extracting at least one voice parameter, the voice recognition management unit determines if the extracted at least one voice parameter meets a criterion for determining one of success and failure of voice recognition. The controller analyzes a result of the determination by the voice recognition management unit and outputs a result of the analysis. | 01-27-2011 |
20110022390 | SPEECH DEVICE, SPEECH CONTROL PROGRAM, AND SPEECH CONTROL METHOD - In order to speak numerals in a manner readily comprehensible to a user, a speech device includes a voice synthesis portion | 01-27-2011 |
20110035220 | AUTOMATED COMMUNICATION INTEGRATOR - An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from the remote source into at least one of a plurality of applications based on the identified voice command. | 02-10-2011 |
20110035221 | Monitoring An Audience Participation Distribution - Apparatus for monitoring an audience participation distribution at an event comprising a speech activity module operable to generate speech data representing speech detected at the event, a speaker identification module operable to determine, using the speech data, a first speaker who has contributed to the detected speech, and a processing unit operable to generate speaker data representing a value for the time that the first speaker has contributed to the detected speech and to output distribution data based on the speaker data representing a measure of the participation for the first speaker at the event. | 02-10-2011 |
20110071830 | COMBINED LIP READING AND VOICE RECOGNITION MULTIMODAL INTERFACE SYSTEM - The present invention provides a combined lip reading and voice recognition multimodal interface system, which can issue a navigation operation instruction only by voice and lip movements, thus allowing a driver to look ahead during a navigation operation and reducing vehicle accidents related to navigation operations during driving. The combined lip reading and voice recognition multimodal interface system in accordance with the present invention includes: an audio voice input unit; a voice recognition unit; a voice recognition instruction and estimated probability output unit; a lip video image input unit; a lip reading unit; a lip reading recognition instruction output unit; and a voice recognition and lip reading recognition result combining unit that outputs the voice recognition instruction | 03-24-2011 |
20110071831 | Method and System for Localizing and Authenticating a Person - The present invention refers to a method for localizing a person comprising the steps carried out in a computing system ( | 03-24-2011 |
20110071832 | IMAGE DISPLAY DEVICE, METHOD, AND PROGRAM - It is an object of the present invention to make an act of viewing an image interactive and further enriched. A microphone | 03-24-2011 |
20110093266 | VOICE PATTERN TAGGED CONTACTS - A method and system for associating a voice pattern with a contact record and/or for identifying a speaker using a mobile device. A mobile device may include a voice identification application for extracting a voice pattern from audio data and associating the voice pattern with a contact record that includes identification information such as, for example, a name of a person. The device may also be used to identify a speaker. The device captures audio data of a speaker; the voice identification application extracts a voice pattern from the audio data and compares the voice pattern to voice patterns associated with contact records stored in a contact directory. The voice identification application identifies a contact record having a voice pattern matching the voice pattern from the audio data and drives the device to display identification information from the contact record having a matching voice pattern. | 04-21-2011 |
20110093267 | AGE DETERMINATION USING SPEECH - A device may be configured to provide a query to a user. Voice data may be received from the user responsive to the query. Voice recognition may be performed on the voice data to identify a query answer. A confidence score associated with the query answer may be calculated, wherein the confidence score represents the likelihood that the query answer has been accurately identified. A likely age range associated with the user may be determined based on the confidence score. The device to calculate the confidence score may be tuned to increase a likelihood of recognition of voice data for a particular age range of callers. | 04-21-2011 |
20110106536 | SYSTEMS AND METHODS FOR SIMULATING DIALOG BETWEEN A USER AND MEDIA EQUIPMENT DEVICE - Systems and methods for simulating dialog between a user and a media equipment device are provided. Videos of a user-selected actor may be retrieved. An opener video of the selected actor may be displayed and, based on a verbal response received from the user, a clip of a media asset associated with the selected actor may be retrieved. User reactions to the displayed clip may be monitored and subsequent videos of the actor and clips may be provided based on the user reactions. Clips of a media asset that matches preferences of the user may be retrieved. A clip associated with a mid level rank may be displayed. When the user reacts positively to the clip, a clip associated with a low class level rank may be retrieved next; otherwise, a high class level rank clip may be retrieved next. | 05-05-2011 |
20110125498 | METHOD AND APPARATUS FOR HANDLING A TELEPHONE CALL - One embodiment of the invention provides a computer-implemented method of handling a telephone call. The method comprises monitoring a conversation between an agent and a customer on a telephone line as part of the telephone call to extract the audio signal therefrom. Real-time voice analytics are performed on the extracted audio signal while the telephone call is in progress. The results from the voice analytics are then passed to a computer-telephony integration system responsible for the call for use by the computer-telephony integration system for determining future handling of the call. | 05-26-2011 |
20110131043 | VOICE RECOGNITION SYSTEM, VOICE RECOGNITION METHOD, AND PROGRAM FOR VOICE RECOGNITION - The present invention enables the recognition process at high speed even when a lot of garbage is included in the grammar. The first voice recognition processing unit generates a recognition hypothesis graph which indicates a structure of hypothesis that is derived according to a first grammar together with a score associated with respective connections of a recognition unit by executing a voice recognition process based on the first grammar to a voice feature amount of input voice, and the second voice recognition processing unit outputs the recognition result from a total score of a hypothesis which is derived according to a second grammar after executing a voice recognition process according to the second grammar that is specified to accept a section other than keywords in input voice as the garbage section to a voice feature amount of input voice, and the second voice recognition processing unit acquires the structure and the score of the garbage section from the recognition hypothesis graph. | 06-02-2011 |
20110131044 | TARGET VOICE EXTRACTION METHOD, APPARATUS AND PROGRAM PRODUCT - An apparatus, program product and method are provided for separating a target voice from a plurality of other voices having different directions of arrival. The method comprises the steps of disposing a first and a second voice input device at a predetermined distance from one another and upon receipt of voice signals at said devices calculating discrete Fourier transforms for the signals and calculating a CSP (cross-power spectrum phase) coefficient by superpositioning multiple frequency-bin components based on correlation of the two spectra signals received and then calculating a weighted CSP coefficient from said two discrete Fourier-transformed speech signals. A target voice is separated when received by said devices from other voice signals in a spectrum by using the calculated weighted CSP coefficient. | 06-02-2011 |
20110166859 | VOICE RECOGNITION DEVICE - A voice recognition unit is constructed in such a way as to create a voice label string for an inputted voice uttered by a user inputted for each language on the basis of a feature vector time series of the inputted voice uttered by the user and data about a sound standard model, and register the voice label string into a voice label memory | 07-07-2011 |
20110173001 | SMS MESSAGING WITH VOICE SYNTHESIS AND RECOGNITION - When a subscriber's phone is sent an SMS message from any other Public Switched Telephone Network user, a voice call to the subscriber's phone is placed, and upon answering, the SMS message is translated into speech. A jargon translator is employed to convert SMS language into corresponding words. Once the message has been played, the subscriber receiving it may verbally request the opportunity to send a reply to the message by audibly speaking a response. The response is matched against an internal phrasebook to accurately transcribe the message. Transcription performance is improved by allowing each subscriber to provide a personal phrasebook which is combined with the internal one. However, if the spoken message is complex or not recognized, the message can be automatically relayed to a human agent for manual transcription. | 07-14-2011 |
20110173002 | IN-VEHICLE DEVICE AND METHOD FOR MODIFYING DISPLAY MODE OF ICON INDICATED ON THE SAME - A storage unit stores a correspondence between a voice command and a display mode modification operation. When a control unit determines that a vehicle is traveling according to a traveling state of the vehicle obtained by a traveling state acquisition unit, when a voice recognition unit recognizes a voice, which is uttered by a user and received by a voice input unit, and when the control unit determines that the recognized voice corresponds to a voice command stored in the storage unit, the control unit performs a display mode change operation corresponding to the voice command and modifies a display mode of an icon indicated on an indication screen of an indication unit. | 07-14-2011 |
20110178801 | SYSTEM AND METHOD FOR ACCESS TO MULTIMEDIA STRUCTURES - A system for access to multimedia structures has telephone sets capable of connecting to a telephone network, a storage device capable of storing a plurality of multimedia structures representing messages and/or data and/or commands, and a network access server that can be associated with the telephone sets and is capable of selectively instantiating the multimedia structures via an interconnection network. There is also a voice-recognition and speech-synthesis system that can be associated with the network access server and that comprises modules for reading files in XML format and for processing the files so as to obtain files in a format that can be synthesized by a speech synthesizer. | 07-21-2011 |
20110196677 | Analysis of the Temporal Evolution of Emotions in an Audio Interaction in a Service Delivery Environment - According to one illustrative embodiment, a method is provided for analyzing an audio interaction. At least one change in an emotion of a speaker in an audio interaction and at least one aspect of the audio interaction are identified. The at least one change in an emotion is analyzed in conjunction with the at least one aspect to determine a relationship between the at least one change in an emotion and the at least one aspect, and a result of the analysis is provided. | 08-11-2011 |
20110208524 | USER PROFILING FOR VOICE INPUT PROCESSING - This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device. | 08-25-2011 |
20110213615 | VOICE AUTHENTICATION SYSTEM AND METHODS - A method for configuring a voice authentication system comprises ascertaining a measure of confidence associated with a voice sample enrolled with the authentication system. The measure of confidence is derived through simulated impostor testing carried out on the enrolled sample. | 09-01-2011 |
20110224986 | VOICE AUTHENTICATION SYSTEMS AND METHODS - A method for configuring a voice authentication system employing at least one authentication engine comprises utilising the at least one authentication engine to systematically compare a plurality of impostor voice samples against a voice sample of a legitimate person to derive respective authentication scores. The resultant authentication scores are analysed to determine a measure of confidence for the voice authentication system. | 09-15-2011 |
20110257974 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location. | 10-20-2011 |
20110257975 | VOICE OVER IP BASED BIOMETRIC AUTHENTICATION - A receiver receives from a remote system a voice biometric sample from a party attempting to obtain a service from the apparatus using the remote system. A processor selectively determines when to request authentication of the party by a remote voice biometric system. A transmitter transmits a request to the party to provide the voice biometric sample responsive to the processor determining to request authentication of the party. The apparatus provides the service contingent upon authentication of the party by the remote voice biometric system. | 10-20-2011 |
20110270611 | OIL LEVEL INSPECTION SYSTEM FOR RAILROAD CAR TRUCK - A system for inspecting an oil level in each part of a railroad car truck includes: an imaging unit that obtains an image of an oil level gauge; an oil level inspection unit that inspects whether or not the oil level in each part of the railroad car truck is within a predetermined range based on the image of the oil level gauge obtained by the imaging unit; a voice input unit adapted for an inspector to input, via voice, an inspection result; a voice processing unit that determines whether or not the inspection result inputted via the voice input unit is good based on the inputted inspection result, and converts a determination result into displayable data; a display unit that displays an oil level inspection result and the determination result; and a storage unit that stores, as data, the oil level inspection result and the determination result. | 11-03-2011 |
20110276330 | Methods and Devices for Appending an Address List and Determining a Communication Profile - Disclosed are methods and electronic communication devices, such as an in-car speaker device, that can receive via a downloading process, a communication address list from another device to the memory of the electronic communication device and can append a predetermined communication address to the communication address list. The predetermined communication address, which can be to an automated voice recognition based service, can be annunciated first. Also disclosed are methods and electronic communication devices for determining that a communication is with an automated voice recognition based service and then switching from a first call profile to a second call profile. Such a second profile can include different features such as a change of the frequency response of the audio signal of the electronic communication device, and/or reduction or elimination of the echo control, and/or a change in the noise control of the digital signal process. | 11-10-2011 |
20110276331 | VOICE RECOGNITION SYSTEM - A voice recognition system includes: a voice input unit | 11-10-2011 |
20110282665 | METHOD FOR MEASURING ENVIRONMENTAL PARAMETERS FOR MULTI-MODAL FUSION - Provided is a method for measuring environmental parameters for multi-modal fusion. The method includes: preparing at least one enrolled modality; receiving at least one input modality; calculating image-related environmental parameters of input images in the at least one input modality based on illumination of an enrolled image in the at least one enrolled modality; and comparing the image-related environmental parameters with a predetermined reference value and discarding the input image or outputting it as recognition data according to the comparison result. | 11-17-2011 |
20110282666 | Utterance state detection device and utterance state detection method - An utterance state detection device includes a user voice stream data input unit that gets user voice stream data of a user, a frequency element extraction unit that extracts high frequency elements by frequency-analyzing the user voice stream data, a fluctuation degree calculation unit that calculates a fluctuation degree of the high frequency elements thus extracted every unit time, a statistic calculation unit that calculates a statistic every certain interval based on a plurality of the fluctuation degrees in a certain period of time, and an utterance state detection unit that detects an utterance state of a specified user based on the statistic obtained from user voice stream data of the specified user. | 11-17-2011 |
20110288866 | VOICE PRINT IDENTIFICATION - Voice print identification may be provided. A plurality of speakers may be recorded and associated with identity indicators. Voice prints for each speaker may be created. If the voice print for at least one speaker corresponds to a known user according to the identity indicators, a database entry associating the user with the voice print may be created. Additional information associated with the user may also be displayed. | 11-24-2011 |
20110295603 | SPEECH RECOGNITION ACCURACY IMPROVEMENT THROUGH SPEAKER CATEGORIES - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition. In one aspect, a computer-based method includes receiving a speech corpus at a speech management server system that includes multiple speech recognition engines tuned to different speaker types; using the speech recognition engines to associate the received speech corpus with a selected one of multiple different speaker types; and sending a speaker category identification code that corresponds to the associated speaker type from the speech management server system over a network. The speaker category identification code can be used by any one of speech-interactive applications coupled to the network to select an appropriate one of multiple application-accessible speech recognition engines tuned to the different speaker types in response to an indication that a user accessing the application is associated with a particular one of the speaker category identification codes. | 12-01-2011 |
20110301954 | METHOD FOR ADJUSTING A VOICE RECOGNITION SYSTEM COMPRISING A SPEAKER AND A MICROPHONE, AND VOICE RECOGNITION SYSTEM - A method for adjusting a voice recognition system and a voice recognition system is disclosed, wherein the voice recognition system comprises a speaker and a microphone, and wherein the method comprises the steps of: | 12-08-2011 |
20110307256 | SYSTEMS AND METHODS FOR PROVIDING NETWORK-BASED VOICE AUTHENTICATION - A system enables voice authentication via a network. The system may include an intelligent voice response engine operatively coupled to the network for receiving transaction or access requests from a plurality of telecommunications devices over the network. A speech recognition and verification services engine may be operatively coupled to the network and a database may be operatively coupled to the speech recognition and verification services engine for storing user voice print profiles. The speech recognition and verification services engine may receive a speaker verification call from the intelligent voice response engine and perform speaker verification on the received speaker verification call based on the stored user voice print profiles. The speech recognition and verification services engine may generate a verification score based upon results of the speaker verification. | 12-15-2011 |
20110313765 | Conversational Subjective Quality Test Tool - A method for assessing quality of conversational speech between nodes of a communication network ( | 12-22-2011 |
20110313766 | IDENTIFICATION OF PEOPLE USING MULTIPLE TYPES OF INPUT - Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers. | 12-22-2011 |
20110320200 | SPEAKER RECOGNITION IN A MULTI-SPEAKER ENVIRONMENT AND COMPARISON OF SEVERAL VOICE PRINTS TO MANY - One-to-many comparisons of callers' voice prints with known voice prints to identify any matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract at least a portion of the customer's voice to create a customer voice print, and it formats the segmented voice print for network transmission to a server. The server compares the customer's voice print with multiple known voice prints to determine any matches, meaning that the customer's voice print and one of the known voice prints are likely from the same person. The identification of any matches can be used for a variety of purposes, such as determining whether to authorize a transaction requested by the customer. | 12-29-2011 |
20120004913 | METHOD AND APPARATUS FOR CONTROLLING OPERATION OF PORTABLE TERMINAL USING MICROPHONE - A method for controlling an operation of a portable terminal using a microphone includes detecting an operation mode of the portable terminal and driving an audio recognition mode according to the detected operation mode to activate the microphone, converting a signal, inputted through the microphone, into digital data and detecting audio characteristics from the digital data to extract audio analysis data for recognition of a type of the input signal, and determining whether there is UI setting information corresponding to the extracted audio analysis data type and performing a relevant function of the UI setting information. | 01-05-2012 |
20120004914 | AUDIO HUMAN VERIFICATION - A system generates an audio challenge that includes a first voice and one or more second voices, the first voice being audibly distinguishable, by a human, from the one or more second voices. The first voice conveys first information and the second voice conveys second information. The system provides the audio challenge to a user and verifies that the user is human based on whether the user can identify the first information in the audio challenge. | 01-05-2012 |
20120010886 | Language Identification - A language identification system suitable for use with voice data transmitted through either telephonic or computer network systems is presented. Embodiments that automatically select the language to be used based upon the content of the audio data stream are presented. In one embodiment, the content of the data stream is supplemented with the context of the audio stream. In another embodiment, the language determination is supplemented with preferences set in the communication devices, and in yet another embodiment, global position data for each user of the system is used to supplement the automated language determination. | 01-12-2012 |
20120016673 | SPEAKER RECOGNITION VIA VOICE SAMPLE BASED ON MULTIPLE NEAREST NEIGHBOR CLASSIFIERS - A speaker recognition system generates a codebook store with codebooks representing voice samples of speakers, referred to as trainers. The speaker recognition system may use multiple classifiers and generate a codebook store for each classifier. Each classifier uses a different set of features of a voice sample as its features. A classifier inputs a voice sample of a person and tries to authenticate or identify the person. A classifier generates a sequence of feature vectors for the input voice sample and then a code vector for that sequence. The classifier uses its codebook store to recognize the person. The speaker recognition system then combines the scores of the classifiers to generate an overall score. If the score satisfies a recognition criterion, then the speaker recognition system indicates that the voice sample is from that speaker. | 01-19-2012 |
20120022870 | GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location. | 01-26-2012 |
20120035929 | MESSAGING SYSTEM - The messaging system ( | 02-09-2012 |
20120065972 | WIRELESS VOICE RECOGNITION CONTROL SYSTEM FOR CONTROLLING A WELDER POWER SUPPLY BY VOICE COMMANDS - A wireless voice recognition control system for controlling the operation of an electric welder power supply by operator voice commands is disclosed. The system includes a remote module carried by the welder and a host module interfaced with the electric welder power supply. The remote module compares voice commands by the welder to preprogrammed voice command templates and operates to generate and broadcast a wireless signal when a spoken voice command matches a voice command template. The host module operates to receive the wireless signal and is configured to operate the electric welder power supply accordingly. In other embodiments, the host module and remote module operate to provide an audible feedback or acknowledgement to the welder. Other embodiments are also disclosed. | 03-15-2012 |
20120065973 | METHOD AND APPARATUS FOR PERFORMING MICROPHONE BEAMFORMING - A method and apparatus for performing microphone beamforming. The method includes recognizing a speech of a speaker, searching for a previously stored image associated with the speaker, searching for the speaker through a camera based on the image, recognizing a position of the speaker, and performing microphone beamforming according to the position of the speaker. | 03-15-2012 |
20120065974 | JOINT FACTOR ANALYSIS SCORING FOR SPEECH PROCESSING SYSTEMS - Method, system, and computer program product are provided for Joint Factor Analysis (JFA) scoring in speech processing systems. The method includes: carrying out an enrolment session offline to enrol a speaker model in a speech processing system using JFA, including: extracting speaker factors from the enrolment session; estimating first components of channel factors from the enrolment session. The method further includes: carrying out a test session including: calculating second components of channel factors strongly dependent on the test session; and generating a score based on speaker factors, channel factors, and test session Gaussian mixture model sufficient statistics to provide a log-likelihood ratio for a test session. | 03-15-2012 |
20120072218 | SYSTEM AND METHOD FOR TRACKING PERSONS OF INTEREST VIA VOICEPRINT - Disclosed are systems, methods, and computer readable media for tracking a person of interest. The method embodiment comprises identifying a person of interest, capturing a voiceprint of the person of interest, comparing a received voiceprint of a caller with the voiceprint of the person of interest, and tracking the caller if the voiceprint of the caller is a substantial match to the voiceprint of the person of interest. | 03-22-2012 |
20120084087 | METHOD, DEVICE, AND SYSTEM FOR SPEAKER RECOGNITION - A method, device, and system for speaker recognition are provided. The method includes: receiving a Speaker Verification instruction sent from a Media Gateway Controller (MGC) ( | 04-05-2012 |
20120095763 | DIGITAL METHOD AND ARRANGEMENT FOR AUTHENTICATING A PERSON - Digital method for authentication of a person by comparing a current voice profile with a previously stored initial voice profile, wherein to determine the relevant voice profile the person speaks at least one speech sample into the system, this speech sample is conveyed to a voice-profile calculation unit and thereby, on the basis of a prespecified voice-profile algorithm, the voice profile is calculated, such that the overall size of the speech sample and/or parameters of its evaluation to determine the relevant voice profile are established dynamically and automatically as the sample is spoken, in response to the result of an evaluation of a first partial speech sample. | 04-19-2012 |
20120095764 | METHODS FOR CREATING AND SEARCHING A DATABASE OF SPEAKERS - A method of performing a search of a database of speakers, includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker. | 04-19-2012 |
20120101822 | BIOMETRIC SPEAKER IDENTIFICATION - A biometric speaker-identification apparatus is disclosed that generates ordered speaker-identity candidates for a probe based on prototypes. Probe match scores are clustered, and templates that correspond to clusters having top M probe match scores are compared with the prototypes to obtain template-prototype match scores. The probe is also compared with the prototypes, and those templates corresponding to template-prototype match scores that are nearest to probe-prototype match scores are selected as speaker-identity candidates. The speaker-identity candidates are ordered based on their similarity to the probe. | 04-26-2012 |
20120143608 | AUDIO SIGNAL SOURCE VERIFICATION SYSTEM - An audio signal source verification system is presented that, in certain embodiments, receives a first template for an audio signal and compares it to templates from different sound sources to determine a correlation between them. A question and response format may be used to eliminate false verifications and to increase the probability that an audio signal is from the purported source of the signal. Moreover, mobile devices may be operated to provide audio signals generated by users of those phones, and the audio signals and templates derived from those signals may be compared to known templates to determine a confidence level, or another indication may be used to indicate that the mobile device user is who they purport to be. Moreover, comparisons can be made using templates of different richness to achieve confidence levels, and confidence levels may be represented based on the results of the comparisons. | 06-07-2012 |
20120150540 | METHOD AND APPARATUS FOR DETECTING UNSOLICITED MULTIMEDIA COMMUNICATIONS - A service for searching for unsolicited communications is provided. For example, the service may inspect e-mail messages, instant messaging messages, facsimile transmissions, voice communications, and video telephony, and analyze these communications to determine whether an intended communication is unsolicited. In connection with voice and video telephony, a voice sample may be obtained from the caller and voice recognition may be performed on the sample to determine an identity of the person or the voice. The voice sample may also be used to determine the type of voice, i.e., whether the voice is live, machine generated, or prerecorded. Where the call is a video telephony call, image recognition may be used to inspect an image of the person. The information obtained from voice recognition, voice type recognition, and image recognition may be used to detect whether the message is from a known source of unsolicited communications. | 06-14-2012 |
20120173238 | Remote Control Audio Link - One embodiment may take the form of a voice control system. The system may include a first apparatus with a processing unit configured to execute a voice recognition module and one or more executable commands, and a receiver coupled to the processing unit and configured to receive a first audio file from a remote control device. The first audio file may include at least one voice command. The first apparatus may further include a communication component coupled to the processing unit and configured to receive programming content, and one or more storage media storing the voice recognition module. The voice recognition module may be configured to convert voice commands into text. | 07-05-2012 |
20120173239 | METHOD FOR VERIFYING THE IDENTITY OF A SPEAKER, SYSTEM THEREFORE AND COMPUTER READABLE MEDIUM - The invention refers to a method of verifying the identity of a speaker based on the speaker's voice comprising the steps of: receiving ( | 07-05-2012 |
20120191454 | Method and Apparatus for Obtaining Statistical Data from a Conversation - A system is described to monitor various parameters of a conversation, for example distinguishing voices in a conversation and reporting who in the group is violating the proper etiquette rules of conversation. These results would indicate any disruptive individuals in a conversation. So they are identified, monitored, trained to prevent further disturbances, and their etiquette is improved to prevent further disturbances. Some of the functions the system can perform include: report the identity of the voices, report how long one has spoken, report how often one interrupts, report how often one raises their voice, count the occurrences of obscenities and determine length of silences. The system, in addition, can provide meaning of words, send email, identify fast talkers, train to reduce the volume of a voice, provide a period of time to a voice, beep after someone uses a profanity, request a voice to speak up, provide grammatical corrections, provide text copies of conversation, and eliminate background noises. | 07-26-2012 |
20120209608 | MOBILE COMMUNICATION TERMINAL APPARATUS AND METHOD FOR EXECUTING APPLICATION THROUGH VOICE RECOGNITION - A mobile communication terminal apparatus and method are capable of recognizing an input voice of a user and executing an application related to the recognized voice. The apparatus includes a voice input unit to receive a first input voice; a voice recognition unit to acquire first voice instruction information based on the first input voice; a voice control table acquiring unit to acquire a first voice control table comprising the first voice instruction information and first icon position information; and an application execution unit to execute a first application based on the first icon position information included in the first voice control table. The method for registering voice instruction information includes acquiring voice instruction information for a selected application; acquiring execution information of the selected application; generating a voice control table comprising the execution information, and the voice instruction information; and storing the voice control table. | 08-16-2012 |
20120215536 | Methods and Voice Activity Detectors for Speech Encoders - Voice activity detectors and related methods are provided. Methods include receiving a frame of the input signal; determining a first SNR of the received frame; comparing the determined first SNR with an adaptive threshold; and detecting whether the received frame comprises voice based on the comparison. The adaptive threshold is at least based on total noise energy of a noise level, an estimate of a second SNR and on energy variation between different frames. | 08-23-2012 |
20120221334 | SECURITY SYSTEM AND METHOD - A security system and method includes setting operation steps having a preset sequence, a trigger signal and a testing parameter for each of the operation steps, and a range of each of the testing parameters. The method further confirms a current operation step when a testing device starts a test process. If an output signal received from a sensing device is not identical to the trigger signal of the current operation step, a voice content file of the current operation step is sent to the voice output device. If a value of the output parameter read from the sensing device is not within the range of the testing parameter, a voice prompt file of the testing parameter is sent to the voice output device. After sending the voice content file or the voice prompt file, an abnormality processing command of the current operation step is sent to the testing device to stop the test process. | 08-30-2012 |
20120232903 | KITCHEN AND/OR DOMESTIC APPLIANCE - The invention relates to a kitchen and/or domestic appliance comprising input means, which are connected to a voice-recognition system, for acoustic operator commands. The invention is characterised in that means for executing command-dependent actions are provided and that the voice-recognition system is used to identify and check the authorisation of a user. | 09-13-2012 |
20120245941 | Device Access Using Voice Authentication - A device can be configured to receive speech input from a user. The speech input can include a command for accessing a restricted feature of the device. The speech input can be compared to a voiceprint (e.g., text-independent voiceprint) of the user's voice to authenticate the user to the device. Responsive to successful authentication of the user to the device, the user is allowed access to the restricted feature without the user having to perform additional authentication steps or speaking the command again. If the user is not successfully authenticated to the device, additional authentication steps can be requested by the device (e.g., request a password). | 09-27-2012 |
20120253808 | Voice Recognition Device and Voice Recognition Method - According to an embodiment, a voice recognition device includes a voice inputting unit, a voice recognition processing unit, a vibration movement pattern model holding unit, and a vibration movement unit. The voice recognition processing unit performs voice recognition processing using a digital signal output from the voice inputting unit to output a voice recognition result and outputs voice reliability of the received voice signal. The vibration movement pattern model holding unit stores models prepared according to a number of patterns of the voice reliability output from the voice recognition processing unit and holds vibration movements corresponding to the models. The vibration movement unit detects whether or not the voice reliability output from the voice recognition processing unit matches any one of the models in the vibration movement pattern model holding unit and performs vibration movement predetermined for a matched model. | 10-04-2012 |
20120253809 | Voice Verification System - A voice verification module | 10-04-2012 |
20120253810 | COMPUTER PROGRAM, METHOD, AND SYSTEM FOR VOICE AUTHENTICATION OF A USER TO ACCESS A SECURE RESOURCE - Authenticating a purported user attempting to access a secure resource includes enrolling a user's voice sample by requiring the user to orally speak preselected enrollment utterances, generating prompts and respective predetermined correct responses where each question has only one correct response, presenting a prompt to the user in real time, and analyzing the user's real time live response to determine if the live response matches the predetermined correct response and if voice characteristics of the user's live voice sample match characteristics of the enrolled voice sample. | 10-04-2012 |
20120271632 | Speaker Identification - Speaker identification techniques are described. In one or more implementations, sample data is received at a computing device of one or more user utterances captured using a microphone. The sample data is processed by the computing device to identify a speaker of the one or more user utterances. The processing involves use of a feature set that includes features obtained using a filterbank having filters that space linearly at higher frequencies and logarithmically at lower frequencies, respectively, features that model the speaker's vocal tract transfer function, and features that indicate a vibration rate of vocal folds of the speaker of the sample data. | 10-25-2012 |
20120284026 | SPEAKER VERIFICATION SYSTEM - In an aspect, in general, a method for computer assisted speaker authentication in a voice communication session includes establishing a voice communication session between a first speaker and an agent, accepting a first voice signal from the first speaker, determining a voice characteristic measure of the first voice signal, including characterizing a similarity of the first voice signal to each of one or more stored characterizations of voice signals previously acquired from one or more known speakers, and providing an interface to the agent during the voice communication session between the agent and the first speaker, including presenting an indicator based on the determined voice characteristic measure to the agent. | 11-08-2012 |
20120284027 | METHOD AND SYSTEM FOR SHARING PORTABLE VOICE PROFILES - An embodiment of the present invention provides a speech recognition engine that utilizes portable voice profiles for converting recorded speech to text. Each portable voice profile includes speaker-dependent data, and is configured to be accessible to a plurality of speech recognition engines through a common interface. A voice profile manager receives the portable voice profiles from other users who have agreed to share their voice profiles. The speech recognition engine includes speaker identification logic to dynamically select a particular portable voice profile, in real-time, from a group of portable voice profiles. The speaker-dependent data included with the portable voice profile enhances the accuracy with which speech recognition engines recognize spoken words in recorded speech from a speaker associated with a portable voice profile. | 11-08-2012 |
20120296649 | Digital Signatures for Communications Using Text-Independent Speaker Verification - A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature. | 11-22-2012 |
20120296650 | SPEECH RECOGNITION SYSTEM FOR PROVIDING VOICE RECOGNITION SERVICES USING A CONVERSATIONAL LANGUAGE - Embodiments of the present invention provide a method, system and article of manufacture for adjusting a language model within a voice recognition system, based on text received from an external application. The external application may supply text representing the words of one participant to a text-based conversation. In such a case, changes may be made to a language model by analyzing the external text received from the external application. | 11-22-2012 |
20120303369 | Energy-Efficient Unobtrusive Identification of a Speaker - Functionality is described herein for recognizing speakers in an energy-efficient manner. The functionality employs a heterogeneous architecture that comprises at least a first processing unit and a second processing unit. The first processing unit handles a first set of audio processing tasks (associated with the detection of speech) while the second processing unit handles a second set of audio processing tasks (associated with the identification of speakers), where the first set of tasks consumes less power than the second set of tasks. The functionality also provides unobtrusive techniques for collecting audio segments for training purposes. The functionality also encompasses new applications which may be invoked in response to the recognition of speakers. | 11-29-2012 |
20120310647 | PATTERN PROCESSING SYSTEM SPECIFIC TO A USER GROUP - Methods and apparatus for identifying a user group in connection with user group-based speech recognition. An exemplary method comprises receiving, from a user, a user group identifier that identifies a user group to which the user was previously assigned based on training data. The user group comprises a plurality of individuals including the user. The method further comprises using the user group identifier, identifying a pattern processing data set corresponding to the user group, and receiving speech input from the user to be recognized using the pattern processing data set. | 12-06-2012 |
20120316876 | Display Device, Method for Thereof and Voice Recognition System - A display system, a display device, a control method for the display device, and a voice recognition system are disclosed. A display device according to one embodiment of the present invention can carry out voice recognition upon a voice received from at least one speaker through at least one voice input device; and display the voice recognition result on the display unit. Accordingly, effective voice recognition is made possible for TV environments where various constraints exist differently from mobile terminal environments. | 12-13-2012 |
20120323574 | SPEECH TO TEXT MEDICAL FORMS - Event audio data that is based on verbal utterances associated with a medical event associated with a patient is received. A list of a plurality of candidate text strings that match interpretations of the event audio data is obtained, based on information included in a medical speech repository, information included in a speech accent repository, and a matching function. A selection of at least one of the candidate text strings included in the list is obtained. A population of at least one field of an electronic medical form is initiated, based on the obtained selection. | 12-20-2012 |
20120323575 | SPEAKER ASSOCIATION WITH A VISUAL REPRESENTATION OF SPOKEN CONTENT - Speaker content generated in an audio conference is visually represented in accordance with a method. Speaker content from a plurality of audio conference participants is monitored using a computer with a tangible non-transitory processor and memory. The speaker content from each of the plurality of audio conference participants is monitored. A visual representation of speaker content for each of the plurality of audio conference participants is generated based on the analysis of the speaker content from each of the plurality of audio conference participants. The visual representation of speaker content is displayed. | 12-20-2012 |
20120330663 | IDENTITY AUTHENTICATION SYSTEM AND METHOD - An identity authentication method is applied on a system. The system is connected to an external storage device storing a first voice model. The system includes an information server and a terminal. The information server includes a database. The information server executes the following steps. First, receiving the first voice model transmitted by the terminal. Second, determining whether the first voice model matches one second voice model, and transmitting the verification result to the terminal. The terminal executes the following steps. First, generating a prompt to prompt the user to input voice signals. Second, receiving the input voice signals. Third, extracting voice features from the input voice signals. Fourth, determining whether the extracted voice features match the first voice model. Fifth, determining that the verification result is successful when they match, and determining that the identity authentication is successful only when both verification results are successful. A related system is also provided. | 12-27-2012 |
20130013309 | System and Method for Low Overhead Voice Authentication - A system and method are provided to authenticate a voice in a frequency domain. A voice in the time domain is transformed to a signal in the frequency domain. The first harmonic is set to a predetermined frequency and the other harmonic components are equalized. Similarly, the amplitude of the first harmonic is set to a predetermined amplitude, and the harmonic components are also equalized. The voice signal is then filtered. The amplitudes of each of the harmonic components are then digitized into bits to form at least part of a voice ID. In another system and method, a voice is authenticated in a time domain. The initial rise time, initial fall time, second rise time, second fall time and final oscillation time are digitized into bits to form at least part of a voice ID. The voice IDs are used to authenticate a user's voice. | 01-10-2013 |
20130024196 | SYSTEMS AND METHODS FOR USING A MOBILE DEVICE TO DELIVER SPEECH WITH SPEAKER IDENTIFICATION - Systems, methods, and apparatus for using at least one mobile device to receive a representation of at least one audio signal. In some embodiments, the at least one audio signal comprises speech of at least one of a plurality of first participants of a meeting, the plurality of first participants participating in the meeting from a first location, and the at least one audio signal may be audibly rendered to at least one second participant of the meeting at a second location different from the first location. In some embodiments, the at least one mobile device may further receive an indication of an identity of a leading speaker of the speech in the at least one audio signal, the leading speaker being identified from among the plurality of first participants, and may render the identity of the leading speaker to the at least one second participant. | 01-24-2013 |
20130024197 | ELECTRONIC DEVICE AND METHOD FOR CONTROLLING THE SAME - An electronic device and a method for controlling an electronic device are disclosed. The electronic device includes: a display unit; a voice input unit; and a controller displaying a plurality of contents on the display unit, receiving a voice command for controlling any one of the plurality of contents through the voice input unit, and controlling content corresponding to the received voice command. Multitasking performed by the electronic device can be effectively controlled through a voice command. | 01-24-2013 |
20130041665 | Electronic Device and Method of Controlling the Same - There are disclosed an electronic device and a method of controlling the electronic device. The electronic device according to an aspect of the present invention includes a display unit, a voice input unit, and a control unit configured to output a plurality of contents through the electronic device, receive a voice command through the voice input unit for performing a command, determine which of the plurality of contents correspond to the received voice command, and perform the command on one or more of the plurality of contents that correspond to the received voice command. According to the present invention, multi-tasking performed in an electronic device can be efficiently controlled through a voice command. | 02-14-2013 |
20130041666 | VOICE RECOGNITION APPARATUS, VOICE RECOGNITION SERVER, VOICE RECOGNITION SYSTEM AND VOICE RECOGNITION METHOD - A voice recognition apparatus, a voice recognition server, a voice recognition system, and a voice recognition method, in which a general-purpose voice recognition engine may accurately recognize a limited number of words used in a specific area. | 02-14-2013 |
20130054243 | ELECTRONIC DEVICE AND CONTROL METHOD - Provided is an electronic device and control method, wherein a simple interface upon utilizing voice recognition can be attained. A cellular phone ( | 02-28-2013 |
20130060569 | VOICE AUTHENTICATION SYSTEM AND METHOD USING A REMOVABLE VOICE ID CARD - A voice authentication system using a removable voice ID card comprises: at server side, a voiceprint database for storing the voiceprints of all authorized users; a voiceprint updating means for updating the voiceprints in said voiceprint database; and a voiceprint digest generator for generating a voiceprint digest according to a request from a client; at client side, a voice ID card for storing the voiceprint of an authorized user; a validation means for validating the voiceprint in the voice ID card on the basis of the voiceprint digest from the server; an audio device for performing voice interaction with a user; and a voice authentication means for determining whether the voiceprint from said voice ID card is of the same speaker as the voice from said audio device. | 03-07-2013 |
20130080167 | Background Speech Recognition Assistant Using Speaker Verification - In one embodiment, a method includes receiving an acoustic input signal at a speech recognizer. A user is identified that is speaking based on the acoustic input signal. The method then determines speaker-specific information previously stored for the user and a set of responses based on the recognized acoustic input signal and the speaker-specific information for the user. It is determined if the response should be output and the response is outputted if it is determined the response should be output. | 03-28-2013 |
20130080168 | AUDIO ANALYSIS APPARATUS - An audio analysis apparatus includes the following components. A strap has an end portion connected to a main body and is used to hang the main body from a user's neck. A first audio acquisition device is at the end portion or in the main body. Second and third audio acquisition devices are at positions separate from the end portion by substantially the same predetermined distances, on the respective sides of the strap extending from the user's neck. An analysis unit discriminates whether an acquired sound is an uttered voice of the user or another person by comparing audio signals acquired by the first and second or third audio acquisition devices and detects an orientation of the user's face by comparing the audio signals acquired by the second and third audio acquisition devices. A transmission unit transmits the analysis result to an external apparatus. | 03-28-2013 |
20130096917 | METHODS AND DEVICES FOR FACILITATING COMMUNICATIONS - Methods and electronic devices for facilitating communications are described. In one aspect, a method for facilitating communications is described. The method includes: monitoring audio based communications; performing an audio analysis on the monitored audio based communications to identify a contact associated with the monitored communications; and providing information associated with the identified contact on an electronic device. In another aspect, an electronic device is described. The electronic device includes a processor and a memory coupled to the processor. The memory stores processor readable instructions for causing the processor to: monitor audio based communications; perform an audio analysis on the monitored audio based communications to identify a contact associated with the monitored communications; and provide information associated with the identified contact on an electronic device. | 04-18-2013 |
20130144620 | METHOD, SYSTEM AND PROGRAM FOR VERIFYING THE AUTHENTICITY OF A WEBSITE USING A RELIABLE TELECOMMUNICATION CHANNEL AND PRE-LOGIN MESSAGE - Various embodiments of the present invention for validating the authenticity of a website are provided. An example of a method according to the present invention comprises providing a website having an artifact, receiving a communication from a user, at a service provider, for validating the website associated with a service provider, inquiring from the user a description of the artifact, comparing the artifact on the website with the description of the artifact from the user, and generating an indication to the user based upon the comparing. The communication is over a first communication channel and the website is accessed over a second communication channel. The first communication channel is different than the second. The artifact can be displayed after a user session is identified. | 06-06-2013 |
20130144621 | Systems and Methods for Assessment of Non-Native Spontaneous Speech - Computer-implemented systems and methods are provided for assessing non-native spontaneous speech pronunciation. Speech recognition on digitized speech is performed using a non-native acoustic model trained with non-native speech to generate word hypotheses for the digitized speech. Time alignment is performed between the digitized speech and the word hypotheses using a reference acoustic model trained with native-quality speech. Statistics are calculated regarding individual words and phonemes in the word hypotheses based on the alignment. A plurality of features for use in assessing pronunciation of the speech are calculated based on the statistics, an assessment score is calculated based on one or more of the calculated features, and the assessment score is stored in a computer-readable memory. | 06-06-2013 |
20130166298 | VOICE ANALYZER - A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body to make the apparatus body hung from a neck of a wearer, a first voice acquisition unit that acquires a voice of a speaker and is disposed in either a left or right strap when viewed from the wearer, a second voice acquisition unit that acquires the voice of the speaker and is disposed in the opposite strap in which the first voice acquisition unit is disposed, and an arrangement recognition unit that recognizes arrangements of the first and second voice acquisition units, when viewed from the wearer, by comparing a voice signal of the voice acquired by the first voice acquisition unit with sound pressure of a heart sound of the wearer acquired by the second voice acquisition unit. | 06-27-2013 |
20130166299 | VOICE ANALYZER - A voice analyzer includes an apparatus body, a strap that is connected to the apparatus body and is used to hang the apparatus body from a neck of a user, a first voice acquisition unit provided in the strap or the apparatus body, a second voice acquisition unit provided at a position where a distance of a sound wave propagation path from a mouth of the user is smaller than a distance of a sound wave propagation path from the mouth of the user to the first voice acquisition unit, and an identification unit that identifies a sound, in which first sound pressure acquired by the first voice acquisition unit is larger by a predetermined value or more than second sound pressure acquired by the second voice acquisition unit, on the basis of a result of comparison between the first sound pressure and the second sound pressure. | 06-27-2013 |
20130166300 | ELECTRONIC DEVICE, DISPLAYING METHOD, AND PROGRAM COMPUTER-READABLE STORAGE MEDIUM - An electronic device includes a voice recognition analyzing module, a manipulation identification module, and a manipulating module. The voice recognition analyzing module is configured to recognize and analyze a voice of a user. The manipulation identification module is configured to, using the analyzed voice, identify an object on a screen and identify a requested manipulation associated with the object. The manipulating module is configured to perform the requested manipulation. | 06-27-2013 |
20130179167 | METHODS AND APPARATUS FOR FORMANT-BASED VOICE SYNTHESIS - In one aspect, a method of processing a voice signal to extract information to facilitate training a speech synthesis model is provided. The method comprises acts of detecting a plurality of candidate features in the voice signal, performing at least one comparison between one or more combinations of the plurality of candidate features and the voice signal, and selecting a set of features from the plurality of candidate features based, at least in part, on the at least one comparison. In another aspect, the method is performed by executing a program encoded on a computer readable medium. In another aspect, a speech synthesis model is provided by, at least in part, performing the method. | 07-11-2013 |
20130185072 | Communication System and Method Between an On-Vehicle Voice Recognition System and an Off-Vehicle Voice Recognition System - A vehicle based system and method for receiving voice inputs and determining whether to perform a voice recognition analysis using in-vehicle resources or resources external to the vehicle. | 07-18-2013 |
20130191127 | VOICE ANALYZER, VOICE ANALYSIS SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING A PROGRAM - A voice analyzer includes a plate-shaped body, a plurality of first voice acquisition units that are placed on both surfaces of the plate-shaped body and that acquire a voice of a speaker, a sound pressure comparison unit that compares sound pressure of a voice acquired by the first voice acquisition unit placed on one surface of the plate-shaped body with sound pressure of a voice acquired by the first voice acquisition unit placed on the other surface and determines a larger sound pressure, and a voice signal selection unit that selects information regarding a voice signal which is associated with the larger sound pressure and is determined by the sound pressure comparison unit. | 07-25-2013 |
20130197912 | SPECIFIC CALL DETECTING DEVICE AND SPECIFIC CALL DETECTING METHOD - A specific call detecting device includes: an utterance period detecting unit which detects at least a first utterance period in which the first speaker speaks in a call between a first speaker and a second speaker; an utterance ratio calculating unit which calculates utterance ratio of the first speaker in the call; a voice recognition execution determining unit which determines whether at least one of the first voice of the first speaker and second voice of the second speaker becomes a target of voice recognition or not on the basis of the utterance ratio of the first speaker; a voice recognizing unit which detects a keyword related to a specific call from the voice determined as a target of voice recognition among the first and second voices; and a determining unit which determines whether the call is the specific call or not on the basis of the detected keyword. | 08-01-2013 |
20130204620 | ESTABLISHING A MULTIMODAL PERSONALITY FOR A MULTIMODAL APPLICATION IN DEPENDENCE UPON ATTRIBUTES OF USER INTERACTION - Establishing a multimodal personality for a multimodal application, including evaluating, by the multimodal application, attributes of a user's interaction with the multimodal application; selecting, by the multimodal application, a vocal demeanor in dependence upon the values of the attributes of the user's interaction with the multimodal application; and incorporating, by the multimodal application, the vocal demeanor into the multimodal application. | 08-08-2013 |
20130211836 | GLOBAL SPEECH USER INTERFACE - A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications. | 08-15-2013 |
20130218561 | System and Method for Enhancing Voice-Enabled Search Based on Automated Demographic Identification - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for approximating responses to a user speech query in voice-enabled search based on metadata that include demographic features of the speaker. A system practicing the method recognizes received speech from a speaker to generate recognized speech, identifies metadata about the speaker from the received speech, and feeds the recognized speech and the metadata to a question-answering engine. Identifying the metadata about the speaker is based on voice characteristics of the received speech. The demographic features can include age, gender, socio-economic group, nationality, and/or region. The metadata identified about the speaker from the received speech can be combined with or override self-reported speaker demographic information. | 08-22-2013 |
20130226581 | COMMUNICATION DEVICE AND METHOD - A communication method includes: capturing analog sound signals output by the audio output unit, and analyzing the captured analog sound signals to obtain corresponding digital audio information; comparing the obtained digital audio information with digital feature information stored in a storage unit to determine whether the obtained digital audio information includes the stored digital feature information; and playing reply information stored in the storage unit if the obtained digital audio information includes the stored digital feature information. | 08-29-2013 |
20130231933 | Addressee Identification of Speech in Small Groups of Children and Adults - A method and system for addressee identification of speech includes defining several time intervals and utilizing one or more function evaluations to classify each of the several participants as addressing speech to an automated character or not addressing speech to the automated character during each of the several time intervals. A first function evaluation includes computing values for a predetermined set of features for each of the participants during a particular time interval and assigning a first addressing status to each of the several participants in the particular time interval, based on the values of each of the predetermined sets of features determined during the particular time interval. A second function evaluation may assign a second addressing status to each of the several participants in the particular time interval utilizing results of the first function evaluation for the particular time interval and for one or more additional contiguous time intervals. | 09-05-2013 |
20130246066 | METHOD AND APPARATUS FOR PROVIDING SERVICES USING VOICE RECOGNITION IN POS SYSTEM - A method of providing services using voice recognition in a POS system includes loading an execution command set for each service subject provided by the POS system for each group; registering item-based voice pattern information on the execution command for each group; detecting operation mode of the POS system and activating a microphone by driving voice recognition mode in response to the detected operation mode; converting the received signal of the activated microphone into digital data, detecting properties of a sound wave from the digital data, and extracting sound wave analysis data from the detecting properties; checking whether the sound wave analysis data has been registered and assigning a service use right to the received signal according to a result of the check; and performing voice recognition conversion on the received signal, searching for an execution command having a maximum likelihood for the resulting data, and performing services corresponding to the retrieved execution command. | 09-19-2013 |
20130253932 | CONVERSATION SUPPORTING DEVICE, CONVERSATION SUPPORTING METHOD AND CONVERSATION SUPPORTING PROGRAM - A conversation supporting device of an embodiment of the present disclosure has an information storage unit, a recognition resource constructing unit, and a voice recognition unit. Here, the information storage unit stores the information disclosed by a speaker. The recognition resource constructing unit uses the disclosed information to construct the recognition resource including a voice model and a language model for recognition of voice data. The voice recognition unit uses the recognition resource to recognize the voice data. | 09-26-2013 |
20130253933 | VOICE RECOGNITION DEVICE AND NAVIGATION DEVICE - A voice recognition device | 09-26-2013 |
20130262115 | ALERT MODE MANAGEMENT METHOD AND COMMUNICATION DEVICE HAVING ALERT MODE MANAGEMENT FUNCTION - A computerized alert mode management method of a communication device, the communication device includes a sound capture unit. Vocal sounds of the environment around the communication device are extracted at regular intervals using the sound capture unit. Voice characteristic information of the captured vocal sounds is extracted using a speech recognition method and/or a voice recognition method. The communication device is controlled to work at one of a plurality of predetermined alert modes according to the extracted voice characteristic information. | 10-03-2013 |
20130289991 | Application of Voice Tags in a Social Media Context - According to a present invention embodiment, a system utilizes a voice tag to automatically tag one or more entities within a social media environment, and comprises a computer system including at least one processor. The system analyzes the voice tag to identify one or more entities, where the voice tag includes voice signals providing information pertaining to one or more entities. One or more characteristics of each identified entity are determined based on the information within the voice tag. One or more entities appropriate for tagging within the social media environment are determined based on the characteristics and user settings within the social media environment of the identified entities, and automatically tagged. Embodiments of the present invention further include a method and computer program product for utilizing a voice tag to automatically tag one or more entities within a social media environment in substantially the same manner described above. | 10-31-2013 |
20130332165 | METHOD AND SYSTEMS HAVING IMPROVED SPEECH RECOGNITION - A method for improving speech recognition by a speech recognition system includes obtaining a voice sample from a speaker; storing the voice sample of the speaker as a voice model in a voice model database; identifying an area from which sound matching the voice model for the speaker is coming; and providing one or more audio signals corresponding to sound received from the identified area to the speech recognition system for processing. | 12-12-2013 |
20130332166 | PROCESSING APPARATUS, PROCESSING SYSTEM, AND OUTPUT METHOD - A processing apparatus includes: a voice recognition unit that recognizes a voice of a user; a condition recognition unit that recognizes a current condition of a user; a search result acquisition unit that acquires a search result searched on the basis of the voice recognized by the voice recognition unit; an output manner determination unit that determines a manner of outputting the search result on the basis of the current condition recognized by the condition recognition unit; and an output control unit that causes an output unit to output the search result acquired by the search result acquisition unit in the manner determined by the output manner determination unit. | 12-12-2013 |
20140006025 | PROVIDING AUDIO-ACTIVATED RESOURCE ACCESS FOR USER DEVICES BASED ON SPEAKER VOICEPRINT | 01-02-2014 |
20140006026 | CONTEXTUAL AUDIO DUCKING WITH SITUATION AWARE DEVICES | 01-02-2014 |
20140006027 | MOBILE TERMINAL AND METHOD FOR RECOGNIZING VOICE THEREOF | 01-02-2014 |
20140019130 | GLOBAL SPEECH USER INTERFACE - A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications. | 01-16-2014 |
20140039891 | AUTOMATIC SEPARATION OF AUDIO DATA - Systems and methods for audio editing are provided. In one implementation, a computer-implemented method is provided. The method includes receiving digital audio data including a plurality of distinct vocal components. Each distinct vocal component is automatically identified using one or more attributes that uniquely identify each distinct vocal component. The audio data is separated into two or more individual tracks where each individual track comprises audio data corresponding to one distinct vocal component. The separated individual tracks are then made available for further processing. | 02-06-2014 |
20140039892 | USING THE ABILITY TO SPEAK AS A HUMAN INTERACTIVE PROOF - In one embodiment, a human interactive proof portal | 02-06-2014 |
20140046664 | Secure Device Pairing Using Voice Input - Methods and apparatuses for secure device pairing are disclosed. In one example, a user voice is received simultaneously at a first device and a second device to pair the devices. | 02-13-2014 |
20140074471 | SYSTEM AND METHOD FOR IMPROVING SPEAKER SEGMENTATION AND RECOGNITION ACCURACY IN A MEDIA PROCESSING ENVIRONMENT - A method is provided and includes estimating an approximate list of potential speakers in a file from one or more applications. The file (e.g., an audio file, video file, or any suitable combination thereof) includes a recording of a plurality of speakers. The method also includes segmenting the file according to the approximate list of potential speakers such that each segment corresponds to at least one speaker; and recognizing particular speakers in the file based on the approximate list of potential speakers. | 03-13-2014 |
20140074472 | VOICE CONTROL SYSTEM WITH PORTABLE VOICE CONTROL DEVICE - A voice control system is adapted for controlling an electrical appliance, and includes a host and a portable voice control device. The portable voice control device is capable of wireless communication with the host, and includes an audio pick-up unit for receiving a voice input. One of the host and the portable voice control device includes a voice recognition control module that is configured to recognize a control command from the voice input. The host controls operation of the electrical appliance according to the control command, and transmits an appliance status message to the portable voice control device. The portable voice control device further includes an output unit for outputting the appliance status message. | 03-13-2014 |
20140074473 | NAVIGATION APPARATUS - A navigation apparatus capable of providing a user not only with guidance, but also with all of the guidance, operational procedure, operation screen and recognition vocabulary, that is, with an operational transition that is defined by the guidance, operational procedure, operation screen and recognition vocabulary, while altering the operational transition in accordance with the recognition vocabulary comprehension level of the user. Thus, it can increase the possibility for a user with a low recognition vocabulary comprehension level to achieve a task, or for a user with a high recognition vocabulary comprehension level to improve the comfortableness of the operation, thereby being able to provide all the users with the optimum operational transition. | 03-13-2014 |
20140081637 | Turn-Taking Patterns for Conversation Identification - A method for identifying a conversation between a plurality of participants that includes monitoring voice streams in proximity to client devices and assigning a tag to identify the participants speaking in the voice streams in proximity to the client devices. The method also includes forming a fingerprint, based on the assigned tags, for the voice streams in proximity to the client devices. The method also includes identifying which participants are participating in a conversation based on the fingerprints for the voice streams and providing an interface to the client devices including graphical representations depicting the participants in the conversation. | 03-20-2014 |
20140088965 | ASSOCIATING AND LOCATING MOBILE STATIONS BASED ON SPEECH SIGNATURES - Methods and systems populate a speech signature database with unique speech signatures that are associated with one or more speaker identities and are further associated with one or more mobile stations and/or telephone numbers. Real-time voice signals are compared to the speech signatures in the speech signature database. When a match is found, the mobile station from which the voice signal originated is located in real-time. Further, the associations in the speech signature database are leveraged to find other relevant mobile stations or users and to generate additional associations and to also locate associated users and mobile stations. | 03-27-2014 |
20140088966 | VOICE ANALYZER, VOICE ANALYSIS SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM STORING PROGRAM - A voice analyzer includes a voice information acquiring unit that acquires information about voices acquired by a first voice acquiring unit which acquires the voice and is worn by a first wearer and a second voice acquiring unit which acquires the voice and is worn by a second wearer from each of the wearers, and a distance calculation unit that calculates a distance between the first wearer and the second wearer on the basis of (a) speaker identification information, which is information for determining whether the voice acquired by the first voice acquiring unit and the voice which is the same as that acquired by the first voice acquiring unit and is acquired by the second voice acquiring unit are spoken by the wearers or other persons, and (b) a phase difference between sound waves with plural frequencies included in the voices. | 03-27-2014 |
20140095161 | SYSTEM AND METHOD FOR CHANNEL EQUALIZATION USING CHARACTERISTICS OF AN UNKNOWN SIGNAL - Disclosed herein are systems and methods for identifying the source of a signal via channel equalization using characteristics of the signal. A system receives a signal, then measures a frequency response of the signal by performing a spectral analysis over the entire signal. The system computes the average amplitude over a subset of time samples from the spectral analysis for each represented frequency and compares the set of averaged amplitudes to a stored set of averaged amplitudes to produce equalization coefficients. Applying the equalization coefficients to the frequency response yields an equalized frequency response, which is compared to a stored frequency response using a classifier to determine a match. Alternately, the system applies the equalization coefficients to the stored frequency response yielding an equalized stored frequency response. The method can recognize speakers, vehicles, electromagnetic signals, sonar signals, optical signals, videos, etc. | 04-03-2014 |
20140100849 | VOICE PRINT IDENTIFICATION FOR IDENTIFYING SPEAKERS - Voice print identification for identifying speakers is provided. A plurality of speakers are recorded and associated with identity indicators. Voice prints for each speaker are associated with the plurality of recorded speakers. If the voice print for at least one speaker corresponds to a known user according to the identity indicators, a database entry associating the user with the voice print may be created. Additional information associated with the user may also be displayed. | 04-10-2014 |
20140108011 | SOUND ANALYSIS APPARATUS, SOUND ANALYSIS SYSTEM, AND NON-TRANSITORY COMPUTER READABLE MEDIUM - A sound analysis apparatus includes a sound information obtaining section that obtains information relating to a sound acquired by a sound acquiring section that acquires the sound and distinguishes a spoken voice of a wearer from a spoken voice of another person, a phase difference deriving section that derives a relationship between a frequency and a phase difference with respect to the sound acquired by the plural sound acquiring sections, a dispersion deriving section that derives a dispersion that is the level of irregularity of the derived phase difference, and a distance deriving section that derives a distance between the wearer and the other person using a first dispersion derived in a case where the sound is distinguished as the spoken voice of the other person and a second dispersion derived in a case where the sound is distinguished as the spoken voice of the wearer. | 04-17-2014 |
20140108012 | USER SPEECH INTERFACES FOR INTERACTIVE MEDIA GUIDANCE APPLICATIONS - A user speech interface for interactive media guidance applications, such as television program guides, guides for audio services, guides for video-on-demand (VOD) services, guides for personal video recorders (PVRs), or other suitable guidance applications is provided. Voice commands may be received from a user and guidance activities may be performed in response to the voice commands. | 04-17-2014 |
20140122074 | METHOD AND SYSTEM OF USER-BASED JAMMING OF MEDIA CONTENT BY AGE CATEGORY - In one exemplary embodiment, a computer-implemented method includes the step of determining an age group of a first user. Media content available to the first user is identified. It is determined whether the user has permission to listen to the media content. The media content is jammed with a sound wave at a frequency that can be heard by the user when the user does not have permission to listen to the media content. Optionally, a voice age-recognition algorithm can be used to determine the age group of the first user. An age group of a second user can be determined. The first user and the second user may be proximate to a media player providing the ambient sound stream. | 05-01-2014 |
20140122075 | VOICE RECOGNITION APPARATUS AND VOICE RECOGNITION METHOD THEREOF - A voice recognition apparatus is provided. The voice recognition apparatus comprises: a voice receiver which receives a user's voice signal; a first voice recognition engine which receives the voice signal and performs a voice recognition process; a communication unit which receives the voice signal and transmits the voice signal to an external second voice recognition engine; and a controller which transmits the voice signal received through the voice receiver to at least one of the first voice recognition engine and the communication unit. | 05-01-2014 |
20140122076 | Voice Command System for Stitchers - A voice command system for a stitcher includes a tablet device in operative communication with the stitcher; the tablet device further comprising a display screen; a memory; a microprocessor; a communication module; and a microphone; and a speech recognition algorithm operatively communicating with said tablet device. An associated method includes the steps of digitizing a user's spoken command; transmitting the digitized spoken command to the speech recognition algorithm; producing a list of words possibly comprising the spoken command; parsing the list of possible words to identify the spoken command; and initiating execution of the spoken command. | 05-01-2014 |
20140129223 | METHOD AND APPARATUS FOR VOICE RECOGNITION - A method and apparatus for voice recognition are disclosed. The apparatus includes: a voice receiver which receives a user's voice signal; a first voice recognition engine which receives the voice signal and recognizes voice based on the voice signal; a communicator which receives and transmits the voice signal to an external second voice recognition engine; and a controller which transmits the voice signal from the voice receiver to the first voice recognition engine, and in response to the first voice recognition engine being capable of recognizing voice from the voice signal, the controller outputs the voice recognition results of the first voice recognition engine, and in response to the first voice recognition engine being incapable of recognizing voice from the voice signal, the controller controls transmission of the voice signal to the second voice recognition engine through the communicator. | 05-08-2014 |
20140136203 | DEVICE AND SYSTEM HAVING SMART DIRECTIONAL CONFERENCING - Some implementations provide a method for identifying a speaker. The method determines position and orientation of a second device based on data from a first device that is for capturing the position and orientation of the second device. The second device includes several microphones for capturing sound. The second device has movable position and movable orientation. The method assigns an object as a representation of a known user. The object has a moveable position. The method receives a position of the object. The position of the object corresponds to a position of the known user. The method processes the captured sound to identify a sound originating from the direction of the object. The direction of the object is relative to the position and the orientation of the second device. The method identifies the sound originating from the direction of the object as belonging to the known user. | 05-15-2014 |
20140156276 | CONVERSATION SYSTEM AND A METHOD FOR RECOGNIZING SPEECH - A dialogue system which correctly identifies an utterance directed to a dialogue system by using various pieces of information including information other than a voice recognition result without requiring a special signal is provided. | 06-05-2014 |
20140163984 | Method Of Voice Recognition And Electronic Apparatus - A method of voice recognition and an electronic apparatus are described with the method of voice recognition being applied in an electronic apparatus. The method includes taking i=1 and detecting corresponding i-th voice sub-information at a moment Ti when the electronic apparatus detects that a user starts to talk at a moment T0, wherein the i-th voice sub-information is corresponding voice information from the moment T0 to the moment Ti, the i-th voice sub-information is partial voice information of voice information with integral semantic corresponding to a moment Tj after the moment T0 to the moment Ti, and i is an integer greater than or equal to 1; and analyzing the i-th voice sub-information to obtain M results of analysis, M being an integer greater than or equal to 1. | 06-12-2014 |
20140163985 | Multi-Stage Speaker Adaptation - A first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors that correspond to a first unit of input speech. The first set of feature vectors may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors, which correspond to a second unit of input speech, may be modified based on the first gender-specific speaker adaptation technique. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors, which correspond to a third unit of speech, may be modified based on the first speaker-dependent speaker adaptation technique. | 06-12-2014 |
20140180689 | APPARATUS FOR SPEECH RECOGNITION USING MULTIPLE ACOUSTIC MODEL AND METHOD THEREOF - Disclosed are an apparatus for recognizing voice using multiple acoustic models according to the present invention and a method thereof. An apparatus for recognizing voice using multiple acoustic models includes a voice data database (DB) configured to store voice data collected in various noise environments; a model generating means configured to perform classification for each speaker and environment based on the collected voice data, and to generate an acoustic model of a binary tree structure as the classification result; and a voice recognizing means configured to extract feature data of voice data when the voice data is received from a user, to select multiple models from the generated acoustic model based on the extracted feature data, to parallel recognize the voice data based on the selected multiple models, and to output a word string corresponding to the voice data as the recognition result. | 06-26-2014 |
20140188471 | USER PROFILING FOR VOICE INPUT PROCESSING - This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device. | 07-03-2014 |
20140195233 | Distributed Speech Recognition System - Embodiments of the present invention include an apparatus, method, and system for speech recognition of a voice command. The method can include receiving data representing a voice command, generating a list of targets based on the state information of each target within the system, and selecting a target from the list of targets, based on the voice command. | 07-10-2014 |
20140195234 | Voice Recognition Grammar Selection Based on Content - The subject matter of this specification can be embodied in, among other things, a method that includes receiving geographical information derived from a non-verbal user action associated with a first computing device. The non-verbal user action implies an interest of a user in a geographic location. The method also includes identifying a grammar associated with the geographic location using the derived geographical information and outputting a grammar indicator for use in selecting the identified grammar for voice recognition processing of vocal input from the user. | 07-10-2014 |
20140195235 | REMOTE CONTROL APPARATUS AND METHOD FOR CONTROLLING POWER - A remote controller and a power control method are disclosed. The remote controller includes a voice recognizer configured to recognize a voice utterance; a user interface configured to receive a user command; and a controller configured to, when a user command is input through the user interface to enter a voice recognition mode, convert a stand-by mode into an active mode, and convert the active mode into the stand-by mode depending on whether the utterance is recognized within a preset critical time. Accordingly, the remote controller, enabling an operation mode of a voice recognition module which recognizes a user voice utterance to be maintained as an active mode, can reduce power unnecessarily consumed. | 07-10-2014 |
20140207460 | VOICE IDENTIFICATION METHOD AND APPARATUS - Embodiments of the present invention provide a voice identification method, including: obtaining voice data; obtaining a first confidence value according to the voice data; obtaining a noise scenario according to the voice data; obtaining a second confidence value corresponding to the noise scenario according to the first confidence value; and if the second confidence value is greater than or equal to a pre-stored confidence threshold, processing the voice data. An apparatus is also provided. The method and apparatus that flexibly adjust the confidence value according to the noise scenario greatly improve a voice identification rate under a noise environment. | 07-24-2014 |
20140214423 | Technology For Combating Mobile Phone Criminal Activity - Technology for crime control includes receiving a voucher identifier for a mobile phone credit voucher purchased under duress by a victim and generating a request for a legal order directing a telecommunication service provider to obtain certain information about use of the voucher. Approval for the legal order is received and the legal order and the voucher identifier are transmitted by a law enforcement agency computer system via a network to a computer system of the telecommunication service provider. A phone number associated with a mobile phone to which a credit associated with the voucher identifier was applied and a recording of a telephone call to or from the phone number are received via the network from the telecommunication service provider computer system and the law enforcement agency computer system performs an automated analysis of the call by a voice recognition process. | 07-31-2014 |
20140214424 | VEHICLE BASED DETERMINATION OF OCCUPANT AUDIO AND VISUAL INPUT - Systems, apparatus, articles, and methods are described including operations to receive audio data and visual data from one or more occupants of a vehicle. A determination may be made regarding which of the one or more occupants of the vehicle to associate with the received audio data based at least in part on the received visual data. | 07-31-2014 |
20140222427 | Command Prefix For Voice Commands - Methods, systems, and products describe hands-free operation of a communications device. A user defines a command prefix that is recognized as preceding one of several voice commands. When the user speaks the command prefix, a processor identifies the spoken command prefix and treats a next spoken word as one of the voice commands. The voice command may then be executed for control. The command prefix thus helps avoid control confusion when the voice command is inadvertently spoken. | 08-07-2014 |
20140244257 | Method and Apparatus for Automated Speaker Parameters Adaptation in a Deployed Speaker Verification System - Typical speaker verification systems usually employ speakers' audio data collected during an enrollment phase when users enroll with the system and provide respective voice samples. Due to technical, business, or other constraints, the enrollment data may not be large enough or rich enough to encompass different inter-speaker and intra-speaker variations. According to at least one embodiment, a method and apparatus employing classifier adaptation based on field data in a deployed voice-based interactive system comprise: collecting representations of voice characteristics, in association with corresponding speakers, the representations being generated by the deployed voice-based interactive system; updating parameters of the classifier, used in speaker recognition, based on the representations collected; and employing the classifier, with the corresponding parameters updated, in performing speaker recognition. | 08-28-2014 |
20140249819 | VERIFYING A USER USING SPEAKER VERIFICATION AND A MULTIMODAL WEB-BASED INTERFACE - A method of verifying a user identity using a Web-based multimodal interface can include sending, to a remote computing device, a multimodal markup language document that, when rendered by the remote computing device, queries a user for a user identifier and causes audio of the user's voice to be sent to a multimodal, Web-based application. The user identifier and the audio can be received at about a same time from the client device. The audio can be compared with a voice print associated with the user identifier. The user at the remote computing device can be selectively granted access to the system according to a result obtained from the comparing step. | 09-04-2014 |
20140257812 | Background Speech Recognition Assistant Using Speaker Verification - In one embodiment, a method includes receiving an acoustic input signal at a speech recognizer. A user is identified that is speaking based on the acoustic input signal. The method then determines speaker-specific information previously stored for the user and a set of responses based on the recognized acoustic input signal and the speaker-specific information for the user. It is determined if the response should be output and the response is outputted if it is determined the response should be output. | 09-11-2014 |
20140278415 | Voice Recognition Configuration Selector and Method of Operation Therefor - A method includes obtaining a speech sample from a pre-processing front-end of a first device, identifying at least one condition, and selecting a voice recognition speech model from a database of speech models, the selected voice recognition speech model trained under the at least one condition. The method may include performing voice recognition on the speech sample using the selected speech model. A device includes a microphone signal pre-processing front end and operating-environment logic, operatively coupled to the pre-processing front end. The operating-environment logic is operative to identify at least one condition. A voice recognition configuration selector is operatively coupled to the operating-environment logic, and is operative to receive information related to the at least one condition from the operating-environment logic and to provide voice recognition logic with an identifier for a voice recognition speech model trained under the at least one condition. | 09-18-2014 |
20140278416 | Method and Apparatus Including Parallell Processes for Voice Recognition - A method and apparatus for voice recognition performed in a voice recognition block comprising a plurality of voice recognition stages. The method includes receiving a first plurality of voice inputs, corresponding to a first phrase, into a first voice recognition stage of the plurality of voice recognition stages, wherein multiple ones of the voice recognition stages include a plurality of voice recognition modules and multiple ones of the voice recognition stages perform a different type of voice recognition processing, wherein the first voice recognition stage processes the first plurality of voice inputs to generate a first plurality of outputs for receipt by a subsequent voice recognition stage. The method further includes receiving, by each subsequent voice recognition stage, a plurality of outputs from a preceding voice recognition stage, wherein a plurality of final outputs is generated by a final voice recognition stage from which to approximate the first phrase. | 09-18-2014 |
20140278417 | SPEAKER-IDENTIFICATION-ASSISTED SPEECH PROCESSING SYSTEMS AND METHODS - Methods, systems, and apparatuses are described for performing speaker-identification-assisted speech processing. In accordance with certain embodiments, a communication device includes speaker identification (SID) logic that is configured to identify a user of the communication device and/or the identity of a far-end speaker participating in a voice call with a user of the communication device. Knowledge of the identity of the user and/or far-end speaker is then used to improve the performance of one or more speech processing algorithms implemented on the communication device. | 09-18-2014 |
20140278418 | SPEAKER-IDENTIFICATION-ASSISTED DOWNLINK SPEECH PROCESSING SYSTEMS AND METHODS - Methods, systems, and apparatuses are described for performing speaker-identification-assisted speech processing in a downlink path of a communication device. In accordance with certain embodiments, a communication device includes speaker identification (SID) logic that is configured to identify the identity of a far-end speaker participating in a voice call with a user of the communication device. Knowledge of the identity of the far-end speaker is then used to improve the performance of one or more downlink speech processing algorithms implemented on the communication device. | 09-18-2014 |
20140288930 | VOICE RECOGNITION DEVICE AND VOICE RECOGNITION METHOD - The voice recognition device according to the present disclosure includes a communication interface that communicates with an external device, a first microphone that collects sound to produce audio data, and a controller that analyzes the audio data produced by the first microphone, determines contents of a designation corresponding to an analysis result, and then controls its own device based on a determination result, and yet controls its own device to urge a user to use the external device when the contents of the designation corresponding to the analysis result cannot be determined. | 09-25-2014 |
20140288931 | METHOD AND APPARATUS FOR SMART VOICE RECOGNITION - A display device with a voice recognition capability may be used to allow a user to speak voice commands for controlling certain features of the display device. As a means for increasing operational efficiency, the display device may utilize a plurality of voice recognition units where each voice recognition unit may be assigned a specific task. | 09-25-2014 |
20140297280 | SPEAKER IDENTIFICATION - In an aspect, in general, a system includes a first input for receiving a first data representing an interaction among a plurality of parties, the first data identifying a plurality of parts of the interaction and identifying a plurality of segments associated with each part of the plurality of parts, a second input for receiving a second data associating each of one or more labels with one or more corresponding query phrases, a searching module for searching the first data to identify putative instances of the query phrases, and a classifier for labeling the parts of the interaction associated with the identified putative instances of the query phrases with the labels corresponding to the identified query phrases. | 10-02-2014 |
20140309996 | VOICE CONTROL METHOD AND MOBILE TERMINAL APPARATUS - A voice control method and a mobile terminal apparatus are provided. The mobile terminal apparatus includes a voice receiving module, a voice outputting module, a voice wake-up module and a language recognition module. When the voice wake-up module determines that a first voice signal matches identification information, the voice receiving module is turned on. When the voice receiving module receives a second voice signal after the first voice signal, the language recognition module parses the second voice signal and obtains a voice recognition result. When the voice recognition result includes an executing request, the language recognition module executes a responding operation, and the voice receiving module is turned off from receiving a third voice signal. When the voice recognition result does not include the executing request, the language recognition module executes a speech conversation mode. | 10-16-2014 |
20140324431 | System, Method, and Apparatus for Location-Based Context Driven Voice Recognition - Systems, methods, and devices for location-based context driven voice recognition are disclosed. A mobile or stationary computing device can include position locating functionality for determining the particular physical location of the computing device. Once the physical location of the computing device is determined, a context related to that particular physical location can be determined. The context related to the particular physical location can include information regarding objects or experiences a user might encounter while in that particular physical location. The context can then be used to determine a delimited or constrained voice recognition vocabulary subset based on the range of experiences a user might encounter within a particular context. The voice recognition vocabulary subset can then be referenced or used by a voice recognizer to increase the speed, accuracy, and effectiveness in receiving, recognizing, and acting in response to voice commands received from the user while in that particular physical location. | 10-30-2014 |
20140330566 | PROVIDING SOCIAL-GRAPH CONTENT BASED ON A VOICE PRINT - During a communication technique, an individual is identified based on a signal that includes vocal sounds of the individual and a voice print of the individual. For example, the voice print may include features characteristic of the individual's voice. Alternatively or additionally, the identification may be based on context information associated with a conversation that includes the individual and/or based on pronunciation of the individual's name. After the individual is identified, content in a social graph, which is associated with the individual, may be accessed and provided. This content may include business information, such as: contact information, education information, a job title, an organization associated with the individual, and/or connections of the individual to other individuals in the social graph. | 11-06-2014 |
20140343943 | Systems, Computer Medium and Computer-Implemented Methods for Authenticating Users Using Voice Streams - Provided are embodiments of systems, computer medium and computer-implemented methods for authenticating users using voice biometrics. Methods include receiving a request to access a resource via a user device; receiving a credentials set from a user (the credentials set including candidate credentials and a candidate voice stream); determining whether the candidate credentials are valid based on a comparison of the candidate credentials to existing user credentials; in response to determining that the candidate credentials are valid, determining whether the candidate voice stream is valid based on a comparison of the candidate voice stream to a voice biometric associated with the candidate credentials; and, in response to determining that the candidate voice stream is valid, generating an authentication signal configured to enable access to the resource via the user device. | 11-20-2014 |
20140350932 | VOICE PRINT IDENTIFICATION PORTAL - Systems and methods providing for secure voice print authentication over a network are disclosed herein. During an enrollment stage, a client's voice is recorded and characteristics of the recording are used to create and store a voice print. When an enrolled client seeks access to secure information over a network, a sample voice recording is created. The sample voice recording is compared to at least one voice print. If a match is found, the client is authenticated and granted access to secure information. | 11-27-2014 |
20140358542 | CANDIDATE SELECTION APPARATUS AND CANDIDATE SELECTION METHOD UTILIZING VOICE RECOGNITION - A candidate selection apparatus utilizing voice recognition includes an association unit that associates target candidates with candidate numbers so that numerals of the target candidates coincide with numerals of the candidate numbers when the target candidates to be displayed in list form are character strings representing the numerals of the candidate numbers, and a display control unit that displays the target candidates and the candidate numbers in list form in accordance with the associations made between the target candidates and the candidate numbers. | 12-04-2014 |
20140372119 | Compounded Text Segmentation - In general, the subject matter described in this specification can be embodied in methods, systems, and program products for performing compounded text segmentation. Compounded text that is extracted from one or more search queries submitted to a search engine is received. The compounded text includes a plurality of individual words that are joined together without intervening spaces. An electronic dictionary including words is accessed. A data structure representing possible segmentations of the compounded text is generated based on whether words in the possible segmentations occur in the electronic dictionary. A data store comprising data associated with a same field of usage as the compounded text is accessed to determine a frequency of occurrence for possible segmentations of the data structure. A segmentation of the compounded text that is most probable based on the data is determined. A language model is trained using the determined segmentation of the compounded text. | 12-18-2014 |
20140379338 | CONDITIONAL MULTIPASS AUTOMATIC SPEECH RECOGNITION - In a conditional multipass automatic speech recognition system, one or more intent templates may be received from an application. A spoken utterance is received and audio frames are generated from the utterance. The audio frames are compared to a first grammar. Recognized speech results are generated and unrecognized audio frames or low confidence frames are collected. One of one or more intent templates and one or more corresponding intent parameters may be determined based on the recognized speech results. The unrecognized audio frames may be conditionally compared to a second grammar in instances when additional information is requested, relative to the determined intent template or the corresponding intent parameters. | 12-25-2014 |
20140379339 | UTILIZING VOICE BIOMETRICS - Methods, systems, computer-readable media, and apparatuses for selecting authentication questions based on a voice biometric confidence score are presented. In some embodiments, a computing device may receive a voice sample. Subsequently, the computing device may determine a voice biometric confidence score based on the voice sample. The computing device then may select one or more authentication questions based on the voice biometric confidence score. | 12-25-2014 |
20140379340 | UTILIZING VOICE BIOMETRICS - Methods, systems, computer-readable media, and apparatuses for utilizing voice biometrics to prevent unauthorized access are presented. In some embodiments, a computing device may receive a voice sample. Subsequently, the computing device may determine a voice biometric confidence score based on the voice sample. The computing device then may evaluate the voice biometric confidence score in combination with one or more other factors to identify an attempt to access an account without authorization. | 12-25-2014 |
20140379341 | MOBILE TERMINAL AND METHOD FOR DETECTING A GESTURE TO CONTROL FUNCTIONS - The present disclosure relates to a portable terminal, and more particularly, to a portable terminal and a method of detecting a gesture and controlling a function. A method of controlling a function of a portable terminal includes: detecting a gesture; activating a voice recognition module in response to the detected gesture; and analyzing a voice input into the activated voice recognition module, and executing a function corresponding to the input voice. | 12-25-2014 |
20140379342 | VOICE FILTER SYSTEM - Embodiments of the invention are directed to systems and methods for voice filtering. In some embodiments, an original voice segment from a user may be received. The received original voice segment may be modified using a first predetermined algorithm. The modified voice segment may be sent to an authentication server. At the authentication server, the modified voice segment may be reconstructed into the original voice segment using a second predetermined algorithm. The user may be authenticated for a transaction based at least in part on the reconstructed original voice segment. | 12-25-2014 |
20140379343 | METHOD, DEVICE, AND SYSTEM FOR AUDIO DATA PROCESSING - A method and apparatus that filters audio data received from a speaking person that includes a specific filter for that speaker. The audio characteristics of the speaker's voice may be collected and the specific filter may be formed to reduce noise while also enhancing voice quality. For instance, if a speaker's voice does not contain specific frequencies, then a filter may cancel the noise at such frequencies to ease noise cancellation and reduce processing sound spectrum for cleaning that is not needed. Additionally, the strength frequencies of a speaker's voice may be identified from the collected audio characteristics and those spectrums can be filtered with finer granularity to provide a speaker specific filter that enhances the voice quality of the speaker's voice data that is transmitted or output by a communication device. The audio data may also be output based upon a user's predefined hearing spectrum. | 12-25-2014 |
20140379344 | VOICE RECOGNITION FOR PERFORMING AUTHENTICATION AND COMPLETING TRANSACTIONS IN A SYSTEMS INTERFACE TO LEGACY SYSTEMS - Method and apparatus for a user to access a systems interface to back-end legacy systems using voice inputs. Generally, a user such as a technician accesses a systems interface to legacy systems via a front-end voice server. The user dials-in to the voice server using a portable access device. Preferably, the portable access device is a cellular phone. Preferably, the voice recognition server performs voice authentication, speech recognition, and speech synthesis. The voice server authenticates the user based on a voice exemplar provided by the user. Using speech synthesis, the voice server provides a menu of operations from which the user can select. By speaking into the access device, the user selects an operation and provides any additional data needed for the operation. Using speech recognition, the voice server prepares a user request based on the spoken user input. The user request is forwarded to the systems interface to the legacy systems. Preferably, the systems interface includes a protocol server for providing a protocol interface and a transaction server for receiving user requests and generating legacy transactions based on the user requests. The systems interface retrieves information from the legacy systems based on the user request and forwards this information to the voice server. The voice server formats the information and outputs the information to the access device. Preferably, the outputted information may be synthesized speech and/or text presented on a display of the access device. | 12-25-2014 |
20150012276 | VOICE RECORDING FOR ASSOCIATION WITH A DOT PATTERN FOR RETRIEVAL AND PLAYBACK - A link table is generated, voice information is associated by dot patterns, and then, voice information associated with the dot pattern is reproduced from a speaker when the dot pattern is read by means of a scanner. In this manner, the dot pattern is printed on a surface of a material such as a picture book or a card, making it possible to play back voice information corresponding to a pattern or a story of a picture book and to play back voice information corresponding to a character described on the card. In addition, by means of a link table, new voice information can be associated with, dissociated from, or changed to, a new dot pattern. | 01-08-2015 |
20150019221 | SPEECH RECOGNITION SYSTEM AND METHOD - A speech recognition system includes a server, a data transmission interface and a speech recognition device. The speech recognition device builds a connection with the server through the data transmission interface. The speech recognition device includes a microphone, an output unit and a processing unit. The processing unit transmits received user information to the server through the data transmission interface to obtain a corresponding personal dictionary file. The personal dictionary file is generated according to history of speech recognition result and related data, which is used by others recently. The processing unit receives a voice signal to be recognized through the microphone and converts it into a digital characteristic file according to a voiceprint file of the user. The processing unit searches the personal dictionary file according to the digital characteristic file to obtain a speech recognition result for outputting through the output unit. | 01-15-2015 |
20150019222 | METHOD FOR USING VOICEPRINT IDENTIFICATION TO OPERATE VOICE RECOGNITION AND ELECTRONIC DEVICE THEREOF - A method for using voiceprint identification to operate voice recognition and an electronic device thereof are provided. The method includes the following steps: receiving a specific voice fragment; cutting the received specific voice fragment into a plurality of specific sub-voice clips; performing a voiceprint identification flow on each of the specific sub-voice clips; determining whether each of the specific sub-voice clips is an appropriate sub-voice clip according to a result of the voiceprint identification flow; and capturing the appropriate sub-voice clips and performing voice recognition on them. | 01-15-2015 |
20150019223 | METHOD AND DEVICE FOR PRESENTING CONTENT - A method is provided for triggering an action on a second device. It comprises the steps of obtaining audio of a multimedia content presented on a first device; comparing the obtained audio with reference audio data in a database; if the obtained audio matches reference audio in the database, determining an action corresponding to the matched reference audio; and triggering the action in the second device. | 01-15-2015 |
20150019224 | VOICE SYNTHESIS DEVICE - A voice synthesis device according to the present invention regularly recognizes the contents of an utterance made by a passenger or the like, and specifies a word before abbreviation corresponding to an abbreviation included in a facility name or the like which is included in the utterance contents by using the facility name or the like. Therefore, the voice synthesis device can read the abbreviation out loud while preventing the passenger from being forced to perform a burdensome operation of, for example, registering the word before abbreviation corresponding to the abbreviation and using a reading method familiar to and appropriate for the passenger. | 01-15-2015 |
20150025888 | SPEAKER RECOGNITION AND VOICE TAGGING FOR IMPROVED SERVICE - A method of enabling speaker identification, the method comprising receiving an identifier, the identifier having a limited number of potential speakers associated with it, processing speech data received from a speaker, and when the speaker is recognized, tagging a speaker and displaying a speaker identity. The method further comprises, when the speaker is not recognized, prompting an associate to identify the speaker. | 01-22-2015 |
20150025889 | BIOMETRIC AUDIO SECURITY - A biometric audio security system comprises providing an input voice audio source. The input audio is enhanced in two or more harmonic and dynamic ranges by re-synthesizing the audio into a full range PCM wave. A hardware key with a set of audio frequency spikes (identifiers) with varying amplitude and frequency values is provided. The enhanced voice audio input and the key are summed using additive resynthesis. The voice and the spike set are compared against the user's identification signature to verify the user's identity. The set of audio spikes is user specific. The spikes are stored on the protected key device as a template, which would plug into the system. The template is determined by the owner/manufacturer of the system. The spikes are created and identified using the additive synthesis technique with a predetermined number of partials (harmonics). The identifiers include both positive and negative values. The amplitude and frequency values are spaced in very fine intervals. The enhancing of the voice audio input includes parallel processing of the input audio as follows: A module that is a low pass filter with dynamic offset; | 01-22-2015 |
20150039312 | CONTROLLING SPEECH DIALOG USING AN ADDITIONAL SENSOR - Methods and systems are provided for managing speech dialog of a speech system. In one embodiment, a method includes: receiving information determined from a non-speech related sensor; using the information in a turn-taking function to confirm at least one of if and when a user is speaking; and generating a command to at least one of a speech recognition module and a speech generation module based on the confirmation. | 02-05-2015 |
20150039313 | Speech-Based Speaker Recognition Systems and Methods - The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. | 02-05-2015 |
20150046161 | DEVICE IMPLEMENTED LEARNING VALIDATION - An aspect provides a method, including: collecting, at one or more device sensors, one or more inputs selected from the group of inputs consisting of audio inputs from a learning environment and visual inputs from a learning environment; processing, using one or more processors, the one or more inputs to detect an unauthorized behavior pattern; mapping, using the one or more processors, the unauthorized behavior pattern to a predetermined action; and executing the predetermined action. Other aspects are described and claimed. | 02-12-2015 |
20150058016 | Methods and Systems for a Voice ID Verification Database and Service in Social Networking and Commercial Business Transactions - A method and system for voice identification and validation is provided. Other embodiments are disclosed. The system registers one or more users on a social media platform with login information during a social media session, acquires a voice sample at any time of the social media session or a continuation of the social media session, associates the login information and the voice sample in a profile for each of the one or more users, stores the profile as a voice print in a voice print identifier database, and identifies at least one talker from an interfacing of the social media platform with the voice print identifier database. Other embodiments are provided. | 02-26-2015 |
20150066508 | DATA PRE-PROCESSING AND PROCESSING FOR VOICE RECOGNITION - An apparatus, system, and computer readable media for data pre-processing and processing for voice recognition are described herein. The apparatus includes logic to pre-process multi-channel audio data and logic to resolve a source location. The apparatus also includes logic to perform wide range adaptive beam forming, and logic to perform full voice recognition. | 03-05-2015 |
20150066509 | ELECTRONIC DEVICE AND METHOD FOR ENCRYPTING AND DECRYPTING DOCUMENT BASED ON VOICEPRINT TECHOLOGY - In a method for encrypting and decrypting a document based on a voiceprint recognition technology on an electronic device, an encryption key is generated and stored in a storage device of the electronic device. A voiceprint is then verified to determine whether the voiceprint is identical to a predefined voiceprint. If the voiceprint is identical to the predefined voiceprint, the encryption key is obtained from the storage device to encrypt a document. When the encrypted document is decrypted, a decryption key is generated to decrypt the encrypted document. | 03-05-2015 |
20150073799 | VOICE VERIFYING SYSTEM AND VOICE VERIFYING METHOD - A voice verifying system, which comprises: a microphone, which is always turned on to output at least one voice signal; a speech determining device, for determining if the voice signal is valid or not according to a reference value, wherein the speech determining device passes the voice signal if the voice signal is valid; and a verifying module, for verifying a speech signal generated from the voice signal and for outputting a device activating signal to activate a target device if the speech signal matches a predetermined rule; and a reference value generating device, for generating the reference value according to speech signal information from the verifying module. | 03-12-2015 |
20150073800 | DIGITAL SIGNATURES FOR COMMUNICATIONS USING TEXT-INDEPENDENT SPEAKER VERIFICATION - A speaker-verification digital signature system is disclosed that provides greater confidence in communications having digital signatures because a signing party may be prompted to speak a text-phrase that may be different for each digital signature, thus making it difficult for anyone other than the legitimate signing party to provide a valid signature. | 03-12-2015 |
20150073801 | APPARATUS AND METHOD FOR SELECTING A CONTROL OBJECT BY VOICE RECOGNITION - There are provided an apparatus and a method for selecting a control object through voice recognition. The apparatus for selecting a control object through voice recognition according to the present invention includes one or more processing devices, in which the one or more processing devices are configured to obtain input information on the basis of a voice of a user, to match the input information to at least one first identification information obtained based on a control object and second identification information corresponding to the first identification information, to obtain matched identification information matched to the input information within the first identification information and the second identification information, and to select a control object corresponding to the matched identification information. | 03-12-2015 |
20150081299 | METHOD AND SYSTEM FOR ASSISTING PATIENTS - A system for use in assisting a user in a social interaction with another person is provided, the system being configured to determine whether the user recognizes the person and, if it is determined that the user does not recognize the person, to provide information to the user about the person. A corresponding method and computer program product for performing the method are also provided. | 03-19-2015 |
20150081300 | SPEECH RECOGNITION SYSTEM AND METHOD USING INCREMENTAL DEVICE-BASED ACOUSTIC MODEL ADAPTATION - An embodiment of the present invention relates to a speech recognition system and method using incremental device-based acoustic model adaptation. The speech recognition system comprises a model selection module selecting an acoustic model of a multi-model tree by verifying and categorizing a device key transmitted from a user device; a model management module generating and incrementally adapting the multi-model tree by categorizing voice data based on a user device; and a speech recognition module performing speech recognition by receiving the acoustic model selected from the model selection module and transmitting data of which reliability exceeds a predetermined threshold value to the model management module. | 03-19-2015 |
20150088512 | CONTEXT-BASED AUDIO FILTER SELECTION - For context-based audio filter selection, a type module determines a recipient type for a recipient process of an audio signal. The recipient type includes a human destination recipient type and a speech recognition recipient type. A filter module selects an audio filter in response to the recipient type. | 03-26-2015 |
20150088513 | SOUND PROCESSING SYSTEM AND RELATED METHOD - A sound processing system is provided and is executed by a processor. The processor acquires a video/audio file from video/audio files. The processor controls a video/audio processing chip to build a voiceprint feature model of each section for use in speaker recognition, and to identify the speaker of each section based on comparison of the built voiceprint feature model of the acquired video/audio file and the voiceprint feature models of speakers stored in a storage unit. The processor generates a tag file recording relationships between the plurality of sections of the acquired video/audio file and the speakers according to the identification result. A sound processing method is also provided. | 03-26-2015 |
20150095028 | Customer Identification Through Voice Biometrics - Systems and methods for determining an identity of an individual are provided. Audio may be received that includes a key phrase spoken by the individual, and the key phrase may include an identifier spoken by the individual. A key phrase voice print and key phrase text corresponding to the audio may be obtained. The key phrase text may include text corresponding to the identifier spoken by the individual. Voice prints may be retrieved based on the text corresponding to the identifier, and the voice prints may be provided to a voice biometric engine for comparison to the key phrase voice print. The individual may be authenticated based on a comparison of the key phrase voice print to the voice prints. The identifier may include a first name and a last name of the individual. | 04-02-2015 |
20150106097 | METHOD AND DEVICE FOR PROVIDING DISTRIBUTED TELEPRESENCE SERVICE - There is provided a method of determining a main speaker that is performed by a first terminal participating in a distributed telepresence service. The method of determining a main speaker according to an embodiment of the invention includes obtaining first feature information for determining a main speaker from an audio input signal, obtaining second feature information for determining a main speaker of a second terminal from the second terminal participating in the distributed telepresence service, and determining a main speaker terminal for providing a video and an audio of a main speaker who is participating in a telepresence and is speaking based on the first feature information for determining a main speaker and the second feature information for determining a main speaker. | 04-16-2015 |
20150106098 | VOICE INPUT DEVICE, VOICE INPUT METHOD AND PROGRAM - A voice input device provided with an input section for inputting a voice of a user, a recognition section for recognizing the voice of the user inputted by the input section, a generation section for generating characters or a command based on a recognition result of the recognition section, a detection section for detecting a device's own posture, and an instruction section for instructing the generation section to generate the command when a detection result of the detection section represents a specific posture as compared to instructing the generation section to generate the characters when the detection result of the detection section represents a posture other than the specific posture. Accordingly, character input and command input during dictation is correctly distinguished, or more specifically unexpected character input during dictation is avoided. | 04-16-2015 |
20150106099 | IMAGE PROCESSING APPARATUS AND CONTROL METHOD THEREOF - An image processing apparatus includes: a voice input receiver configured to receive a voice input of a user; a signal processor configured to recognize and process the voice input received through the voice input receiver; a buffer configured to store the voice input; and a controller configured to determine whether a voice recognition function of the signal processor is activated and control the signal processor to recognize the voice input stored in the buffer in response to the voice recognition function being determined to be activated, wherein the controller is further configured to store the received voice input in the buffer in response to the received voice input being input through the voice input receiver while the voice recognition function is not activated, so that the received voice input is recognized by the signal processor when the voice recognition function is activated. | 04-16-2015 |
20150120299 | VAD Detection Apparatus and Method of Operating the Same - At a processing device, a first signal from a first microphone and a second signal from a second microphone are received. The first signal indicates whether a voice signal has been determined at the first microphone, and the second signal indicates whether a voice signal has been determined at the second microphone. When the first signal indicates potential voice activity or the second signal indicates potential voice activity, the processing device is activated to receive data and the data is examined for a trigger word. When the trigger word is found, a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone. When no trigger word is found, the processing device is reset to deactivate data input and allow the first microphone and the second microphone to enter or maintain an event detection mode of operation. | 04-30-2015 |
20150127344 | ELECTRONIC APPARATUS AND VOICE RECOGNITION METHOD FOR THE SAME - Disclosed are an electronic apparatus and a voice recognition method for the same. The voice recognition method for the electronic apparatus includes: receiving an input voice of a user; determining characteristics of the user; and recognizing the input voice based on the determined characteristics of the user. | 05-07-2015 |
20150134333 | VOICE RECOGNITION SYSTEM, VOICE RECOGNITION SERVER AND CONTROL METHOD OF DISPLAY APPARATUS - Apparatuses and methods related to a voice recognition system, a voice recognition server and a control method of a display apparatus are provided. More particularly, apparatuses and methods relate to a voice recognition system which performs a voice recognition function by using at least one of a current usage status with respect to the display apparatus and a function that is currently performed by the display apparatus. A voice recognition system includes: a voice receiver which receives a voice command and a controller which determines at least one from among a current usage status with respect to a display apparatus and a function currently performed by the display apparatus, determines an operation corresponding to the received voice command by using at least one from among the determined current usage status and the function currently performed by the display apparatus, and performs the determined operation. | 05-14-2015 |
20150142437 | INFORMATION PROCESSING SYSTEM, INFORMATION PROCESSING METHOD, COMMUNICATION TERMINAL, INFORMATION PROCESSING APPARATUS, AND CONTROL METHOD AND CONTROL PROGRAM THEREOF - An apparatus of this invention is directed to an information processing apparatus that determines a search range of one of the pieces of instruction information based on the other of a plurality of different pieces of instruction information, and effectively narrows down manipulation instruction candidates corresponding to a user even if the manipulation instruction candidates are extended. The information processing apparatus includes an instruction information receiver that receives instruction voice information indicating the instruction voice of the user acquired from the voice of the user, and instruction operation information indicating the instruction operation of the user acquired from the operation of the user, a search range determining unit that determines a search range for recognizing the instruction operation information according to the instruction voice information, or determines a search range for recognizing the instruction voice information according to the instruction operation information, and a user instruction recognizer that recognizes an instruction of the user based on a search result obtained by searching for one of the instruction voice information and the instruction operation information within the search range determined by the search range determining unit. | 05-21-2015 |
20150142438 | VOICE RECOGNITION METHOD, VOICE CONTROLLING METHOD, INFORMATION PROCESSING METHOD, AND ELECTRONIC APPARATUS - The present disclosure provides a voice recognition method for use in an electronic apparatus comprising a voice input module. The method comprises: receiving voice data by the voice input module; performing a first pattern voice recognition on the received voice data, including identifying whether the voice data comprises a first voice recognition information; performing a second pattern voice recognition on the voice data if the voice data comprises the first voice recognition information; and performing or refusing an operation corresponding to the first voice recognition information according to a result of the second pattern voice recognition. The present disclosure also provides a voice controlling method, an information processing method, and an electronic apparatus. | 05-21-2015 |
20150142439 | SYSTEM AND METHOD OF SPEAKER RECOGNITION - An authentication and authorization apparatus combines a unique identifier for a communications device with pre-stored voice recognition information. Incoming audio associated with the unique identifier is processed to authenticate the speaker. In response to successful authentication, a requested function or action embedded in the audio can be recognized and, if authorized, implemented by a displaced system. | 05-21-2015 |
20150149173 | Controlling Voice Composition in a Conference - Various embodiments enable a system, such as an audio conferencing system, to remove voices from an audio conference in which the removed voices are not desired. In at least some embodiments, an audio signal associated with the audio conference is analyzed and components which represent the individual voices within the audio conference are identified. Once the audio signal is processed in this manner to identify the individual voice components, a control element can be applied to filter out one or more of the individual components that correspond to undesired voices. | 05-28-2015 |
20150149174 | DIFFERENTIAL ACOUSTIC MODEL REPRESENTATION AND LINEAR TRANSFORM-BASED ADAPTATION FOR EFFICIENT USER PROFILE UPDATE TECHNIQUES IN AUTOMATIC SPEECH RECOGNITION - A computer-implemented method is described for speaker adaptation in automatic speech recognition. Speech recognition data from a particular speaker is used for adaptation of an initial speech recognition acoustic model to produce a speaker adapted acoustic model. A speaker dependent differential acoustic model is determined that represents differences between the initial speech recognition acoustic model and the speaker adapted acoustic model. In addition, an approach is also disclosed to estimate speaker-specific feature or model transforms over multiple sessions. This is achieved by updating the previously estimated transform using only adaptation statistics of the current session. | 05-28-2015 |
20150149175 | VOICE RECOGNITION TERMINAL, SERVER, METHOD OF CONTROLLING SERVER, VOICE RECOGNITION SYSTEM, NON-TRANSITORY STORAGE MEDIUM STORING PROGRAM FOR CONTROLLING VOICE RECOGNITION TERMINAL, AND NON-TRANSITORY STORAGE MEDIUM STORING PROGRAM FOR CONTROLLING SERVER - A voice recognition terminal is provided to be able to communicate with a server capable of voice recognition for recognizing voice, and includes a voice input acceptance portion accepting voice input from a user, a voice recognition portion carrying out voice recognition of the voice input accepted, a response processing execution portion performing processing for responding to the user based on a result of voice recognition of the voice input accepted, and a communication portion transmitting the voice input accepted by the voice input acceptance portion to the server and receiving a result of voice recognition in the server. The response processing execution portion performs the processing for responding to the user based on the result of voice recognition determined as more suitable, of the result of voice recognition by the voice recognition portion and the result of voice recognition received from the server. | 05-28-2015 |
20150302852 | METHOD AND DEVICE FOR IMPLEMENTING VOICE INPUT - A network device for implementing voice input comprises an input-obtaining module for obtaining voice input information, a sequence-determining module for determining an input character sequence corresponding to the voice input information based on a voice recognition model, an accuracy-determining module for determining appearance-probability information corresponding to word segments in the input character sequence so as to obtain accuracy information of the word segments, and a transmitting module for transmitting, to a user device, the input character sequence and the accuracy information of the word segments corresponding to the voice input information. | 10-22-2015 |
20150302869 | CONVERSATION, PRESENCE AND CONTEXT DETECTION FOR HOLOGRAM SUPPRESSION - Various embodiments relating to detecting at least one of conversation, the presence and the identity of others during presentation of digital content on a computing device. When another person is detected, one or more actions may be taken with respect to the digital content. For example, the digital content may be minimized, moved, resized or otherwise modified. | 10-22-2015 |
20150302870 | Multisensory Speech Detection - A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters. | 10-22-2015 |
20150310851 | Method and Apparatus for Extra-Vehicular Voice Recognition Training Including Vehicular Updating - A system includes a processor configured to communicate with an application running on an extra-vehicular device. The processor is also configured to determine if new voice-recognition improvement data exists on the device. Further, the processor is configured to download any new improvement data determined to exist on the device. Additionally, the processor is configured to store the downloaded data in a voice-recognition profile associated with a driver and utilize the downloaded data in voice-recognition when the driver attempts voice input. | 10-29-2015 |
20150310877 | CONVERSATION ANALYSIS DEVICE AND CONVERSATION ANALYSIS METHOD - This conversation analysis device comprises: a change detection unit that detects, for each of a plurality of conversation participants, each of a plurality of prescribed change patterns for emotional states, on the basis of data corresponding to voices in a target conversation; an identification unit that identifies, from among the plurality of prescribed change patterns detected by the change detection unit, a beginning combination and an ending combination, which are prescribed combinations of the prescribed change patterns that satisfy prescribed position conditions between the plurality of conversation participants; and an interval determination unit that determines specific emotional intervals, which have a start time and an end time and represent specific emotions of the conversation participants of the target conversation, by determining a start time and an end time on the basis of each time position in the target conversation pertaining to the starting combination and ending combination identified by the identification unit. | 10-29-2015 |
20150310878 | METHOD AND APPARATUS FOR DETERMINING EMOTION INFORMATION FROM USER VOICE - A method of determining emotion information from a voice is provided. The method includes receiving a voice frame obtained by converting a sound generated by a user into an electrical signal, detecting phonation information and articulation information, the phonation information being related to phonation of the user and the articulation information being related to articulation of the user, from the voice frame, and determining user emotion information corresponding to the phonation information and the articulation information. | 10-29-2015 |
20150317641 | EVALUATION OF VOICE COMMUNICATIONS - One-to-many comparisons of callers' words and/or voice prints with known words and/or voice prints to identify any substantial matches between them. When a customer communicates with a particular entity, such as a customer service center, the system makes a recording of the real-time call including both the customer's and agent's voices. The system segments the recording to extract different words, such as words of anger. The system may also segment at least a portion of the customer's voice to create a tone profile, and it formats the segmented words and tone profiles for network transmission to a server. The server compares the customer's words and/or tone profiles with multiple known words and/or tone profiles stored on a database to determine any substantial matches. The identification of any matches may be used for a variety of purposes, such as providing representative feedback or customer follow-up. | 11-05-2015 |
20150325234 | Systems and Methods for Configuring Matching Rules Related to Voice Input Commands - Systems, devices and methods are provided for configuring matching rules related to voice input commands. For example, a first mapping relation between one or more first original terms in a preset term database and one or more first identification terms is established; the first mapping relation is stored in a first mapping relation table; one or more first voice input commands are configured for the first identification terms or one or more first statements including the first identification terms; and a second mapping relation between the first identification terms or the first statements and the first voice input commands is stored into a second mapping relation table. | 11-12-2015 |
20150325239 | DEVICES AND SYSTEMS FOR REMOTE CONTROL - Remote controllers and systems thereof are disclosed. The remote controller remotely operates a receiving host, in which the receiving host provides voice input and speech recognition functions. The remote controller comprises a first input unit and a second input unit for generating a voice input request and a speech recognition request. The generated voice input and speech recognition requests are then sent to the receiving host, thereby forcing the receiving host to perform the voice input and speech recognition functions. | 11-12-2015 |
20150325242 | SOUND TRANSMISSION-BASED VERIFICATION METHOD - A sound transmission-based verification method comprises: a client receiving a data packet set generated by a server according to request information, converting the data packet set into audio data, and playing the audio data; a dynamic password apparatus collecting the audio data played by the client, decoding the audio data to obtain data information, and when the information is integral, generating and outputting display information; after the client receives a dynamic password, the client sending the dynamic password to the server; and the server generating, according to the request information, a verifying dynamic password to verify whether the dynamic password is valid, and if the dynamic password is valid, performing an operation according to the request information. | 11-12-2015 |
20150332034 | Spatial Audio Apparatus - An apparatus comprising: an input configured to receive at least one of: at least two audio signals from at least two microphones; and a network setup message; an analyser configured to authenticate at least one user from the input; a determiner configured to determine the position of the at least one user from the input; and an actuator configured to perform an action based on the authentication of the at least one user and/or the position of the at least one user. | 11-19-2015 |
20150332674 | VOICE ANALYZER AND VOICE ANALYSIS SYSTEM - A voice analyzer includes a first voice acquisition unit provided in a place where a distance of a sound wave propagation path from a mouth of a user is a first distance, plural second voice acquisition units provided in places where distances of sound wave propagation paths from the mouth of the user are smaller than the first distance, and an identification unit that identifies whether the voices acquired by the first and second voice acquisition units are voices of the user or voices of others excluding the user on the basis of a result of comparison between first sound pressure of a voice signal of the voice acquired by the first voice acquisition unit and second sound pressure calculated from sound pressure of a voice signal of the voice acquired by each of the plural second voice acquisition units. | 11-19-2015 |
20150340025 | TERMINAL, UNLOCKING METHOD, AND PROGRAM - A terminal comprises: a speech receiving unit that receives speech in a locked state; a voiceprint authentication unit that performs voiceprint authentication based on the speech received in the locked state and determines whether or not a user is legitimate; a speech recognition unit that performs speech recognition of the speech received in the locked state; and an execution unit that executes an application using a result of the speech recognition. | 11-26-2015 |
20150340028 | VOICE RECOGNITION SYSTEM FOR REPLACING SPECIFIC DOMAIN, MOBILE DEVICE AND METHOD THEREOF - The present invention relates to a voice recognition system for replacing a specific domain, a mobile device, and a method thereof, and more particularly, to a technology that divides a search space for voice recognition into a general domain search space and a specific domain search space. | 11-26-2015 |
20150340039 | SPEAKER RECOGNITION FROM TELEPHONE CALLS - A method for speaker recognition comprising: obtaining speaker information for a target speaker; obtaining speech samples from telephone calls from an unknown speaker; classifying the speech samples according to the unknown speaker, thereby providing speaker-dependent classes of speech samples; extracting speaker information of each of the speaker-dependent classes of speech samples; combining the extracted speaker information; comparing the combined extracted speaker information with the stored speaker information for the target speaker to obtain a comparison result; and determining whether the unknown speaker is identical with the target speaker based on the comparison result. | 11-26-2015 |
20150340040 | VOICE COMMAND RECOGNITION APPARATUS AND METHOD - A voice command recognition apparatus and method thereof are described. The voice command recognition apparatus includes audio sensors placed at different locations; a context determiner configured to determine user context based on a voice received at the audio sensors, wherein the context comprises a vocalization from a user. A command recognizer in the voice command recognition apparatus is configured to activate to recognize a voice command or remain inactive according to the recognized context. | 11-26-2015 |
20150340041 | MOBILE TERMINAL AND CONTROL METHOD THEREOF - A mobile terminal is provided. The mobile terminal includes a voice receiving module configured to receive the voice of a user through a first application and to generate first voice data for the voice received through the first application, a control module configured to transmit the first voice data and user information corresponding to the first voice data to a service server and to request the service server to register the first voice data and the user information, and a communication module configured to transmit, to the service server, a request for the user information corresponding to the voice of the user received through a second application when the voice of the user is received through the second application. | 11-26-2015 |
20150348544 | SYSTEM AND METHOD FOR PROCESSING MULTI-MODAL DEVICE INTERACTIONS IN A NATURAL LANGUAGE VOICE SERVICES ENVIRONMENT - A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction. | 12-03-2015 |
20150356972 | VOICE RECOGNITION DEVICE AND VOICE RECOGNITION METHOD - The voice recognition device according to the present disclosure includes a communication interface that communicates with an external device, a first microphone that collects sound to produce audio data, and a controller that analyzes the audio data produced by the first microphone, determines contents of a designation corresponding to an analysis result, and then controls its own device based on a determination result, and yet controls its own device to urge a user to use the external device when the contents of the designation corresponding to the analysis result cannot be determined. | 12-10-2015 |
20150356973 | INVOKING ACTION RESPONSIVE TO CO-PRESENCE DETERMINATION - Methods, apparatus and computer-readable media (transitory and non-transitory) are disclosed for receiving audio information based on sensing of one or more audible sounds; identifying one or more voice profiles, wherein each of the voice profiles is associated with an individual and indicates one or more voice characteristics of the associated individual; determining at least a given voice profile of the one or more voice profiles matches the audio information; determining co-presence of the user with at least the individual associated with the given voice profile based on determining the given voice profile matches the audio information; identifying an action that includes a trigger based on co-presence of the user and the individual associated with the given voice profile; and invoking the action based on the determined co-presence of the user with at least the individual associated with the given voice profile. | 12-10-2015 |
20150364130 | CONVERSATION STRUCTURE ANALYSIS - Embodiments disclosed herein provide systems, methods, and computer readable media for analyzing a conversation between a plurality of participants. In a particular embodiment, a method provides determining a first speaker from the plurality of participants and determining a second speaker from the plurality of participants. The method further provides determining a first plurality of turns comprising portions of the conversation when the first speaker is speaking and determining a second plurality of turns comprising portions of the conversation when the second speaker is speaking. The method also provides determining per-turn statistics for turns of the first and second pluralities of turns and identifying phases of the conversation based on the per-turn statistics. | 12-17-2015 |
20150379986 | VOICE RECOGNITION - A voice recognition method and system. A voice command can be received from a speech recognition unit associated with a device by a language switching module configured in association with the device. The voice command is recognized and processed into particular content to identify, via a language database, a language associated with the device. The language can then be changed based on a detected language so that the language can be altered without reference to an instruction manual. In one scenario, a user walks toward the machine or device associated with the user device and speaks the desired/known language. The device “listens” to the voice, detects the language, and changes the user interface accordingly. | 12-31-2015 |
20150379987 | MULTI-PASS VEHICLE VOICE RECOGNITION SYSTEMS AND METHODS - A voice recognition system for a vehicle includes a microphone for receiving speech from a user. The system further includes a memory having a partial set of commands or names for voice recognition. The memory further includes a larger set of commands or names for voice recognition. The system further includes processing electronics in communication with the microphone and the memory. The processing electronics are configured to process the received speech to obtain speech data. The processing electronics are further configured to use the obtained speech data to conduct at least two voice recognition passes. In a first pass, the speech data is compared to the partial set. In a second pass, the speech data is compared to the larger set. | 12-31-2015 |
20150380012 | SPEECH REHABILITATION ASSISTANCE APPARATUS AND METHOD FOR CONTROLLING THE SAME - A speech rehabilitation assistance apparatus is disclosed, which can execute effective speech rehabilitation of, for example, a dysarthric speaker. The speech rehabilitation assistance apparatus can include a specification section specifying a target phoneme type and specifying at least one of a word head, a word middle, and a word end as a position of the specified phoneme type, a presentation section presenting a word selected from words having the specified phoneme type in the specified position, a voice recognition section recognizing a voice uttered when a trainee reads out the presented word, and a provision section providing an evaluation value concerning the voice uttered by the trainee based on history of a recognition result by the voice recognition section. | 12-31-2015 |
20160005403 | Methods and Systems for Voice Conversion - A device may receive data indicative of a plurality of speech sounds associated with first voice characteristics of a first voice. The device may receive an input indicative of speech associated with second voice characteristics of a second voice. The device may map at least one portion of the speech of the second voice to one or more speech sounds of the plurality of speech sounds of the first voice. The device may compare the first voice characteristics with the second voice characteristics based on the map. The comparison may include vocal tract characteristics, nasal cavity characteristics, and voicing characteristics. The device may determine a given representation configured to associate the first voice characteristics with the second voice characteristics. The device may provide an output indicative of pronunciations of the one or more speech sounds of the first voice according to the second voice characteristics based on the given representation. | 01-07-2016 |
20160012822 | Communication System and Method Between an On-Vehicle Voice Recognition System and an Off-Vehicle Voice Recognition System | 01-14-2016 |
20160012824 | SYSTEM AND METHOD FOR DETECTING SYNTHETIC SPEAKER VERIFICATION | 01-14-2016 |
20160019887 | METHOD AND DEVICE FOR CONTEXT-BASED VOICE RECOGNITION - A method and a device of voice recognition are provided. The method involves receiving a voice signal, identifying a first voice recognition model in which context information associated with a situation at reception of the voice signal is not reflected and a second voice recognition model in which the context information is reflected, determining a weighted value of the first voice recognition model and a weighted value of the second voice recognition model, and recognizing a word in the voice signal by applying the determined weighted values to the first voice recognition model and the second voice recognition model. | 01-21-2016 |
20160019897 | SPEAKER RECOGNITION FROM TELEPHONE CALLS - The present invention relates to a method for speaker recognition, comprising the steps of obtaining and storing speaker information for at least one target speaker; obtaining a plurality of speech samples from a plurality of telephone calls from at least one unknown speaker; classifying the speech samples according to at least one unknown speaker thereby providing speaker-dependent classes of speech samples; extracting speaker information for the speech samples of each of the speaker-dependent classes of speech samples; combining the extracted speaker information for each of the speaker-dependent classes of speech samples; comparing the combined extracted speaker information for each of the speaker-dependent classes of speech samples with the stored speaker information for at least one target speaker to obtain at least one comparison result; and determining whether at least one unknown speaker is identical with at least one target speaker based on at least one comparison result. | 01-21-2016 |
20160035346 | SYSTEM AND METHOD FOR PERSONALIZATION IN SPEECH RECOGNITION - Systems, methods, and computer-readable storage devices are for identifying a user profile for speech recognition. The user profile is selected from one of several user profiles which are all associated with a speaker, and can be selected based on the identity of the speaker, the location of the speaker, the device the speaker is using, or other relevant parameters. Such parameters can be hierarchical, having multiple layers, and can also be dependent or independent from one another. Using the parameters identified, the user profile is selected and used to recognize speech. | 02-04-2016 |
20160064001 | VAD Detection Apparatus and Method of Operating the Same - At a processing device, a first signal from a first microphone and a second signal from a second microphone are received. The first signal indicates whether a voice signal has been determined at the first microphone, and the second signal indicates whether a voice signal has been determined at the second microphone. When the first signal indicates potential voice activity or the second signal indicates potential voice activity, the processing device is activated to receive data and the data is examined for a trigger word. When the trigger word is found, a signal is sent to an application processor to further process information from one or more of the first microphone and the second microphone. When no trigger word is found, the processing device is reset to deactivate data input and to allow the first microphone and the second microphone to enter or maintain an event detection mode of operation. | 03-03-2016 |
20160064002 | METHOD AND APPARATUS FOR VOICE RECORDING AND PLAYBACK - Methods and apparatuses are provided for controlling an electronic device that includes a plurality of microphones configured to receive voice input, a storage unit configured to store a sound recording file, and a display unit configured to visually display speaker areas of individual speakers when recording a sound or playing a sound recording file. The electronic device also includes a control unit configured to provide a user interface relating a speaker direction to a speaker by identifying the speaker direction while recording the sound or performing playback of the sound recording file, and to update at least one of speaker information, direction information of a speaker, and distance information of the speaker through the user interface. | 03-03-2016 |
20160071521 | USER PROFILING FOR VOICE INPUT PROCESSING - This is directed to processing voice inputs received by an electronic device. In particular, this is directed to receiving a voice input and identifying the user providing the voice input. The voice input can be processed using a subset of words from a library used to identify the words or phrases of the voice input. The particular subset can be selected such that voice inputs provided by the user are more likely to include words from the subset. The subset of the library can be selected using any suitable approach, including for example based on the user's interests and words that relate to those interests. For example, the subset can include one or more words related to media items selected by the user for storage on the electronic device, names of the user's contacts, applications or processes used by the user, or any other words relating to the user's interactions with the device. | 03-10-2016 |
20160078865 | Information Processing Method And Electronic Device - An information processing method and an electronic device are provided. The method includes an electronic device obtaining input information through a second collection manner when the electronic device is in a voice collection state for obtaining voice information through a first collection manner, and determining a logic boundary position in relation to a first voice information in accordance with the input information, wherein the first voice information is obtained by the electronic device through the first collection manner, which is different from the second collection manner. An electronic device corresponding thereto is also disclosed. | 03-17-2016 |
20160086607 | Method and Apparatus for Performing Speaker Recognition - Embodiments of the present invention perform speaker identification and verification by first prompting a user to speak a phrase that includes a common phrase component and a personal identifier. Then, the embodiments decompose the spoken phrase to locate the personal identifier. Finally, the embodiments identify and verify the user based on the results of the decomposing. | 03-24-2016 |
20160086608 | ELECTRONIC DEVICE, METHOD AND STORAGE MEDIUM - According to one embodiment, an electronic device includes a display controller and circuitry. The display controller displays a first object indicative of a first speaker, a first object indicative of a second speaker different from the first speaker, a second object indicative of a first speech period identified as a speech of the first speaker, and a second object indicative of a second speech period identified as a speech of the second speaker. The circuitry integrates the first speech period and the second speech period into a speech period of a same speaker when a first operation of associating the first object indicative of the first speaker with the first object indicative of the second speaker is operated. | 03-24-2016 |
20160093299 | FILE CLASSIFYING SYSTEM AND METHOD - A file classifying system and a file classifying method are disclosed herein, where the system includes a storing device storing at least one recognizing audio signal, a receiving device, and a processor. The receiving device receives an audio file or a video file. The processor compares a related audio signal and the at least one recognizing audio signal so as to generate a result of process, where the related audio signal is correlated to the audio file or the video file, and then automatically classifies the audio file or video file into a category. | 03-31-2016 |
20160098998 | VOICE SEARCHING METADATA THROUGH MEDIA CONTENT - Systems and methods for voice searching media content based on metadata or subtitles are provided. Metadata associated with media content can be pre-processed at a media server. Upon receiving a vocal command representative of a search for an aspect of the media content, the media server performs a search for one or more portions of the media content relevant to the aspect of the media content being searched for. The media server performs the search by matching the aspect of the media content being searched for with the pre-processed metadata. | 04-07-2016 |
20160133274 | PREDICTIVE VIDEO ANALYTICS SYSTEM AND METHODS - The methods and systems described herein predict user behavior based on analysis of a user video communication. The methods include receiving a user video communication, extracting video facial analysis data from the video communication, extracting voice analysis data from the video communication, associating the video facial analysis data with the voice analysis data to determine an emotional state of a user, collecting biographical profile information specific to the user, applying a linguistic-based psychological behavioral model to the spoken words to determine personality type of the user, and inputting the collected biographical profile information, emotional state, and personality type into a predictive model to determine a likelihood of an outcome of the video communication. | 05-12-2016 |
20160180852 | SPEAKER IDENTIFICATION USING SPATIAL INFORMATION | 06-23-2016 |
20160189728 | Voice Signal Processing Method and Apparatus - A voice signal processing method and apparatus, which are used to process a voice signal collected by a microphone of a terminal in order to meet requirements of the terminal in different application modes for the voice signal generated after the processing. The method includes collecting at least two voice signals, determining a current application mode of a terminal, determining, according to the current application mode from the voice signals, voice signals corresponding to the current application mode, and performing, in a preset voice signal processing manner that matches the current application mode, beamforming processing on the corresponding voice signals. | 06-30-2016 |
20160203820 | VOICE MODE ASSET RETRIEVAL | 07-14-2016 |
20180025730 | CIRCUIT AND METHOD FOR SPEECH RECOGNITION | 01-25-2018 |
20190147886 | ENROLLMENT IN SPEAKER RECOGNITION SYSTEM | 05-16-2019 |
20190147887 | AUDIO PROCESSING | 05-16-2019 |
20190147889 | USER IDENTIFICATION METHOD AND APPARATUS BASED ON ACOUSTIC FEATURES | 05-16-2019 |
20190147890 | AUDIO PERIPHERAL DEVICE | 05-16-2019 |