

Speech controlled system

Subclass of:

704 - Data processing: speech signal processing, linguistics, language translation, and audio compression/decompression

704200000 - SPEECH SIGNAL PROCESSING

704270000 - Application

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Entries
Document - Title - Date
20130030814 - SYSTEMS AND METHODS FOR IMPROVING QUALITY OF USER GENERATED AUDIO CONTENT IN VOICE APPLICATIONS - Methods and arrangements for improving quality of content in voice applications. A specification is provided for acceptable content for a voice application, and user generated audio content for the voice application is inputted. At least one test is applied to the user generated audio content, and it is thereupon determined whether the user generated audio content meets the provided specification. (01-31-2013)
20110178804 - VOICE RECOGNITION DEVICE - A voice recognition device includes a voice input unit (07-21-2011)
20090006099 - Depicting a speech user interface via graphical elements - Depiction of a speech user interface via graphical elements is provided. One or more bits of a graphical user interface bitmask are re-designated as speech bits. When a software application processes the re-designated speech bits, a window manager responsible for generating and rendering a graphical user interface for the application passes information to a secondary window manager responsible for generating and rendering a speech user interface. The secondary speech window manager may load a text-to-speech engine, a speech recognizer engine, a lexicon or library of recognizable words or phrases and a set of “grammars” (recognizable words and phrasing) for building a speech user interface that will receive, recognize and act on spoken input to the associated software application. (01-01-2009)
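The bitmask mechanism this abstract describes can be sketched in a few lines of Python. The flag names and bit positions below are illustrative assumptions, not the patented layout:

```python
# Hypothetical GUI bitmask with two bits re-designated as "speech bits".
WINDOW_VISIBLE = 0x01        # ordinary GUI bits
WINDOW_RESIZABLE = 0x02
SPEECH_ENABLED = 0x10        # re-designated speech bit (assumed position)
SPEECH_GRAMMAR_LOADED = 0x20

def wants_speech_ui(bitmask: int) -> bool:
    # The primary window manager would test this flag before handing the
    # window off to the secondary speech window manager.
    return bool(bitmask & SPEECH_ENABLED)

mask = WINDOW_VISIBLE | SPEECH_ENABLED
print(wants_speech_ui(mask))             # True
print(wants_speech_ui(WINDOW_VISIBLE))   # False
```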
20130030816 - OFFLINE DELIVERY OF CONTENT AVAILABLE IN VOICE APPLICATIONS - Methods and arrangements for facilitating offline delivery of content available in voice applications. User access to a voice application is permitted, and the user is accorded a capability to select content in the voice application for offline delivery. The selected content is stored in a holding arrangement, and the selected content is availed for delivery to the user. (01-31-2013)
20130030815 - MULTIMODAL INTERFACE - Provided is a multimodal graphical user interface. The multimodal graphical user interface includes a menu with at least one menu item, wherein the at least one menu item is displayed as a command name along with a unique hand shape, and wherein the at least one menu item is configured to receive a combination of cursor and selection gesture input. (01-31-2013)
20130211843 - ENGAGEMENT-DEPENDENT GESTURE RECOGNITION - Methods, apparatuses, systems, and computer-readable media for performing engagement-dependent gesture recognition are presented. According to one or more aspects, a computing device may detect an engagement of a plurality of engagements, and each engagement of the plurality of engagements may define a gesture interpretation context of a plurality of gesture interpretation contexts. Subsequently, the computing device may detect a gesture. Then, the computing device may execute at least one command based on the detected gesture and the gesture interpretation context defined by the detected engagement. In some arrangements, the engagement may be an engagement pose, such as a hand pose, while in other arrangements, the detected engagement may be an audio engagement, such as a particular word or phrase spoken by a user. (08-15-2013)
20130211842 - Method For Quick Scroll Search Using Speech Recognition - A method for a computing device to search for data entails receiving first user input that initiates a quick scrolling action and activates a speech recognition subsystem, receiving second user input by recognizing voice input using the speech recognition subsystem to determine a search query, and searching for data that corresponds to the search query. The quick scrolling action and activation of the speech recognition subsystem may be triggered, for example, by a swiping gesture on an optical jog pad, on a touch screen, or on a touch-sensitive mouse, or by a contactless three-dimensional gesture. (08-15-2013)
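The two-step flow above can be sketched roughly as follows, with a lambda standing in for the speech recognition subsystem; the gesture name and recognizer interface are assumptions for illustration:

```python
def quick_scroll_search(items, gesture, recognize):
    # A swipe both starts the quick scroll and activates recognition;
    # any other input leaves the list unfiltered.
    if gesture != "swipe":
        return items
    query = recognize().lower()      # second input: the spoken search query
    return [item for item in items if query in item.lower()]

contacts = ["Alice", "Bob", "Alicia"]
print(quick_scroll_search(contacts, "swipe", lambda: "Ali"))  # ['Alice', 'Alicia']
```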
20090192801 - SYSTEM AND METHOD FOR CONTROLLING AN ELECTRONIC DEVICE WITH VOICE COMMANDS USING A MOBILE PHONE - A method for controlling an electronic device with voice commands using a mobile phone (07-30-2009)
20110196683 - System, Method And Computer Program Product For Adding Voice Activation And Voice Control To A Media Player - A media player system, method and computer program product are provided. In use, an utterance is received. A command for a media player is then generated based on the utterance. Such command is utilized for providing wireless control of the media player. (08-11-2011)
20130211844 - Solar Powered Portable Control Panel - A solar powered portable control panel is disclosed herein for wirelessly controlling one or more lights or other devices. An embodiment of the control panel includes a solar panel, a regulator connected to the solar panel, a power storage device connected to the regulator, a wireless transceiver, a controller connected to the power storage device, and a user interface connected to the controller. The user interface is adapted to accept control input and provide it to the controller. The controller is adapted to transmit commands on the wireless transceiver. (08-15-2013)
20120245946 - Reusable Multimodal Application - A method and system are disclosed herein for accepting multimodal inputs and deriving synchronized and processed information. A reusable multimodal application is provided on the mobile device. A user transmits a multimodal command to the multimodal platform via the mobile network. The one or more modes of communication that are inputted are transmitted to the multimodal platform(s) via the mobile network(s) and thereafter synchronized and processed at the multimodal platform. The synchronized and processed information is transmitted to the multimodal application. If required, the user verifies and appropriately modifies the synchronized and processed information. The verified and modified information is transferred from the multimodal application to the visual application. The final result(s) are derived by inputting the verified and modified results into the visual application. (09-27-2012)
20120245945 - IN-VEHICLE APPARATUS AND INFORMATION DISPLAY SYSTEM - An in-vehicle apparatus receives image data representative of a screen image from a portable terminal with a touch panel. The apparatus extracts text code data from the image data, and identifies a text-code display area in the screen image. The apparatus determines a command text based on a user-uttered voice command. The apparatus identifies a text-code display area as a subject operation area in the screen image of the portable terminal, based on the command text, the text code data extracted from the image data, and information on the text-code display area corresponding to the text code data. An area of the screen image of the touch panel corresponding to the text-code display area is identified as the subject operation area, and a signal indicative of the identified subject operation area is transmitted to the portable terminal. (09-27-2012)
20130080177 - SPEECH RECOGNITION REPAIR USING CONTEXTUAL INFORMATION - A speech control system that can recognize a spoken command and associated words (such as “call mom at home”) and can cause a selected application (such as a telephone dialer) to execute the command to cause a data processing system, such as a smartphone, to perform an operation based on the command (such as look up mom's phone number at home and dial it to establish a telephone call). The speech control system can use a set of interpreters to repair recognized text from a speech recognition system, and results from the set can be merged into a final repaired transcription which is provided to the selected application. (03-28-2013)
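A minimal sketch of the repair-and-merge idea above: each interpreter proposes scored fixes, and the best-scoring fix per span is merged into the final transcription. The interpreter interface here is an assumption for illustration, not the patented API:

```python
def repair_transcription(raw, interpreters):
    # Each interpreter returns (span, replacement, score) proposals;
    # keep only the best-scoring proposal for each span.
    best = {}
    for interp in interpreters:
        for span, replacement, score in interp(raw):
            if span not in best or score > best[span][1]:
                best[span] = (replacement, score)
    # Merge the winning proposals into the final repaired transcription.
    text = raw
    for span, (replacement, _score) in best.items():
        text = text.replace(span, replacement)
    return text

# Hypothetical contact-name interpreter.
contact_fixer = lambda t: [("mum", "mom", 0.9)] if "mum" in t else []
print(repair_transcription("call mum at home", [contact_fixer]))  # call mom at home
```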
20130080179 - USING A PHYSICAL PHENOMENON DETECTOR TO CONTROL OPERATION OF A SPEECH RECOGNITION ENGINE - A device may include a physical phenomenon detector. The physical phenomenon detector may detect a physical phenomenon related to the device. In response to detecting the physical phenomenon, the device may record audio data that includes speech. The speech may be transcribed with a speech recognition engine. The speech recognition engine may be included in the device, or may be included with a remote computing device with which the device may communicate. (03-28-2013)
20130080178 - USER INTERFACE METHOD AND DEVICE - A user interface method and corresponding device, where the user interface method includes waiting for detection of an event, which is a function of the user interface device, performing the event detection in the user interface device and notifying a user that the event has been detected, activating a voice input unit configured to allow the user to input his or her voice therethrough, receiving a voice command from the user with respect to the event through the voice input unit, and processing a function according to the received voice command from the user. (03-28-2013)
20130085761 - Voice Control For Asynchronous Notifications - A computing device may receive an incoming communication and, in response, generate a notification that indicates that the incoming communication can be accessed using a particular application on the communication device. The computing device may further provide an audio signal indicative of the notification and automatically activate a listening mode. The computing device may receive a voice input during the listening mode, and an input text may be obtained based on speech recognition performed upon the voice input. A command may be detected in the input text. In response to the command, the computing device may generate an output text that is based on at least the notification and provide a voice output that is generated from the output text via speech synthesis. The voice output identifies at least the particular application. (04-04-2013)
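The announce-listen-respond loop above can be sketched with stand-in recognize/speak callables; the command word and phrasing are illustrative assumptions:

```python
def handle_notification(app_name, recognize, speak):
    # Announce the notification (audio signal) and enter listening mode.
    speak(f"New message in {app_name}")
    heard = recognize()                  # voice input during listening mode
    if "read" in heard.lower():          # command detected in the input text
        # Output text is based on the notification and names the application.
        speak(f"Reading the notification from {app_name}")
        return True
    return False

spoken = []
handled = handle_notification("Mail", lambda: "read it", spoken.append)
print(handled, spoken[-1])  # True Reading the notification from Mail
```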
20090171667 - SYSTEMS AND METHODS FOR LANGUAGE ASSISTED PATIENT INTAKE - A method for assisting in the communication of a medical care provider and a patient is disclosed. The method may include displaying a first display section, the first display section including a plurality of anatomical features, each anatomical feature associated with an indicia indicating the location of the anatomical feature, the anatomical feature also associated with a first name provided in a first language and a second name provided in a second language. The method may also include displaying a second display section, the second display section including a plurality of questions relating to patient intake, where each question is provided in the first language and the second language. (07-02-2009)
20130035942 - ELECTRONIC APPARATUS AND METHOD FOR PROVIDING USER INTERFACE THEREOF - An electronic apparatus and a method for providing a user interface (UI) thereof are provided. Specifically, an electronic apparatus which displays an executable icon of an application which is controllable through voice recognition distinctively from an executable icon of an application which is uncontrollable through voice recognition in a voice task mode, and a method for providing a UI thereof are provided. Some of the disclosed exemplary embodiments provide an electronic apparatus which is capable of recognizing a user voice command and a user motion gesture, and displays an executable icon of an application which is controllable through voice recognition and a name of the executable icon distinctively from an executable icon of an application which is uncontrollable through voice recognition and a name of the executable icon in a voice task mode, and a method for providing a UI thereof. (02-07-2013)
20130035941 - METHOD FOR CONTROLLING ELECTRONIC APPARATUS BASED ON VOICE RECOGNITION AND MOTION RECOGNITION, AND ELECTRONIC APPARATUS APPLYING THE SAME - An electronic apparatus and a method for controlling the same are provided. The method recognizes one of a user voice and a user motion through one of a voice recognition module and a motion recognition module, and, if a user voice is recognized through the voice recognition module, performs a voice task corresponding to the recognized user voice, and, if a user motion is recognized through the motion recognition module, performs a motion task corresponding to the recognized user motion. (02-07-2013)
20130041671 - Event Driven Motion Systems - A motion system for allowing a person to cause a desired motion operation to be performed, comprising a network, a motion machine, a speech to text converter, a message protocol generator, an instant message receiver, and a motion services system. The motion machine is capable of performing motion operations. The speech to text converter generates a digital representation of a spoken motion message spoken by the person. The message protocol generator generates a digital motion command based on the digital representation of the spoken motion message and causes the digital motion command to be transmitted over the network. The instant message receiver receives the digital motion command. The motion services system causes the motion machine to perform the desired motion operation based on the digital motion command received by the instant message receiver. (02-14-2013)
20130041670 - SPEECH COMMAND INPUT RECOGNITION SYSTEM FOR INTERACTIVE COMPUTER DISPLAY WITH INTERPRETATION OF ANCILLARY RELEVANT SPEECH QUERY TERMS INTO COMMANDS - In an interactive computer controlled display system with speech command input recognition and visual feedback, including means for predetermining a plurality of speech commands for respectively initiating each of a corresponding plurality of system actions in combination with means for providing for each of the plurality of speech commands an associated set of speech terms, each term having relevance to its associated command. Also included are means responsive to a detected speech term having relevance to one of the speech commands for displaying a relevant command. The system preferably may display basic speech commands simultaneously along with relevant commands. The means for providing the associated set of speech terms may comprise a stored relevance table of universal speech input commands and universal computer operation terms conventionally associated with system actions initiated by the input commands, and means for relating operation terms of the system with terms in the relevance table. (02-14-2013)
20090132256 - Command and control of devices and applications by voice using a communication base system - A first communication path for receiving a communication is established. The communication includes speech, which is processed. A speech pattern is identified as including a voice-command. A portion of the speech pattern is determined as including the voice-command. That portion of the speech pattern is separated from the speech pattern and compared with a second speech pattern. If the two speech patterns match or resemble each other, the portion of the speech pattern is accepted as the voice-command. An operation corresponding to the voice-command is determined and performed. The operation may perform an operation on a remote device, forward the voice-command to a remote device, or notify a user. The operation may create a second communication path that may allow a headset to join in a communication between another headset and a communication device, several headsets to communicate with each other, or a headset to communicate with several communication devices. (05-21-2009)
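The "match or resemble" comparison step above can be sketched with a plain string-similarity ratio standing in for acoustic pattern comparison (an assumption for illustration):

```python
from difflib import SequenceMatcher

def accept_voice_command(candidate, stored_commands, threshold=0.8):
    # Compare the separated portion of the speech pattern against each
    # stored pattern; accept it as the voice-command if it matches or
    # sufficiently resembles one of them.
    for command in stored_commands:
        if SequenceMatcher(None, candidate, command).ratio() >= threshold:
            return command
    return None

print(accept_voice_command("call hom", ["call home", "hang up"]))  # call home
```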
20130046544 - MULTIMODAL TEXT INPUT SYSTEM, SUCH AS FOR USE WITH TOUCH SCREENS ON MOBILE PHONES - A system and method for entering text from a user includes a programmed processor that receives inputs from the user and disambiguates the inputs to present word choices corresponding to the text. In one embodiment, inputs are received in two or more modalities and are analyzed to present the word choices. In another embodiment, a keyboard is divided into zones, each of which represents two or more input characters. A sequence of zones selected by the user is analyzed to present word choices corresponding to the zones selected. (02-21-2013)
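The zone-disambiguation embodiment above can be sketched as matching a zone sequence against a word list; the zone layout and dictionary are illustrative assumptions:

```python
def zone_candidates(zone_sequence, zones, dictionary):
    # A word is a candidate if each of its characters falls in the
    # corresponding zone of the selected sequence.
    def fits(word):
        return len(word) == len(zone_sequence) and all(
            ch in zones[z] for ch, z in zip(word, zone_sequence))
    return [w for w in dictionary if fits(w)]

# Hypothetical three-zone keyboard layout.
zones = {0: "abc", 1: "def", 2: "ghi"}
print(zone_candidates([1, 0, 1], zones, ["dad", "fad", "dig", "bad"]))  # ['dad', 'fad']
```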
20090043587 - SYSTEM AND METHOD FOR IMPROVING RECOGNITION ACCURACY IN SPEECH RECOGNITION APPLICATIONS - A speech recognition system and method are provided to correctly distinguish among multiple interpretations of an utterance. This system is particularly useful when the set of possible interpretations is large, changes dynamically, and/or contains items that are not phonetically distinctive. The speech recognition system extends the capabilities of mobile wireless communication devices that are voice operated after their initial activation. (02-12-2009)
20110004477 - Facility for Processing Verbal Feedback and Updating Digital Video Recorder (DVR) Recording Patterns - A method, a system and a computer program product for using speech/voice recognition technology to update digital video recorder (DVR) program recording patterns, based on program viewer/listener feedback. A speech controlled pattern modification (SCPM) utility utilizes a DVR recording sub-system integrated with speech processing functionality to compare control phrases with phrases uttered by a viewer. If a control phrase matches a phrase uttered by the viewer, the SCPM utility modifies the DVR recording patterns, according to a set of pre-programmed governing rules. For example, the SCPM utility may avoid modifying the recording patterns for programs within a list of “favorite” programs but may modify the recording patterns for programs excluded from the list. The SCPM utility determines priority of the uttered phrases by identifying users and retrieving a preset priority level of the identified users. The priority level is then used to control changes to the recording patterns. (01-06-2011)
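The favorites rule described in this abstract can be sketched as follows; the phrase table and rule shapes are assumptions for illustration:

```python
def apply_uttered_phrase(phrase, control_phrases, recording, favorites):
    # Look up whether the uttered phrase matches a control phrase.
    action = control_phrases.get(phrase.lower())
    if action is None:
        return recording                 # not a control phrase: no change
    program = action["program"]
    if program in favorites:
        return recording                 # "favorite" programs are protected
    updated = dict(recording)
    updated[program] = action["record"]  # modify the recording pattern
    return updated

rules = {"stop recording the news": {"program": "News", "record": False}}
state = apply_uttered_phrase("stop recording the news", rules,
                             {"News": True}, favorites=set())
print(state)  # {'News': False}
```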
20130090930 - Speech Recognition for Context Switching - Various embodiments provide techniques for implementing speech recognition for context switching. In at least some embodiments, the techniques can enable a user to switch between different contexts and/or user interfaces of an application via speech commands. In at least some embodiments, a context menu is provided that lists available contexts for an application that may be navigated to via speech commands. In implementations, the contexts presented in the context menu include a subset of a larger set of contexts that are filtered based on a variety of context filtering criteria. A user can speak one of the contexts presented in the context menu to cause a navigation to a user interface associated with one of the contexts. (04-11-2013)
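The context filtering step above can be sketched as applying every filtering criterion to the full context set; the context names and filter interface are illustrative assumptions:

```python
def speakable_contexts(all_contexts, active_filters):
    # The context menu shows only the subset of contexts that pass
    # every filtering criterion.
    return [c for c in all_contexts
            if all(f(c) for f in active_filters)]

contexts = ["garage", "race", "store", "settings"]
in_game_only = lambda c: c != "settings"
print(speakable_contexts(contexts, [in_game_only]))  # ['garage', 'race', 'store']
```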
20090306991 - METHOD FOR SELECTING PROGRAM AND APPARATUS THEREOF - A program selection method and a display apparatus thereof are provided. The program selection method includes generating a program list including at least one program title, determining whether there is a voice input for a program selection, searching for a desired program title corresponding to the voice input for the program selection among the at least one program title in the program list, and selecting a program corresponding to the desired program title based on the searching for the desired program title. (12-10-2009)
20090306990 - VOICE ACTUATED AND OPERATOR VOICE PROMPTED COORDINATE MEASURING SYSTEM - A vehicle coordinate measuring system and method including a coordinate measuring device operably connected to a computer, and a voice input device for receiving prompts from the computer and enabling an operator to transmit responses to the prompts to the computer, with the computer adapted to translate the responses to digital information. The coordinate measuring device may record and transmit point location data to the computer, and the computer may correlate the point location data from the coordinate measuring device with the digital information from the responses. (12-10-2009)
20120191461 - Method and Apparatus for Voice Controlled Operation of a Media Player - A system and methods for voice controlled operation of a media player are provided. In one embodiment, a method includes detecting user positioning of a microphone power switch to an off position, detecting user positioning of the microphone power switch to an on position within a predetermined period of time and entering a voice recognition mode, by the media player, based on the user positioning of the microphone power switch to the on position within the predetermined period of time. The method may further include detecting one or more output signals of the microphone, detecting a voice command based on the one or more output signals of the microphone, and controlling operation of the media player based on the voice command, wherein the media player outputs a graphical display associated with the voice command. (07-26-2012)
20130073294 - Voice Controlled Wireless Communication Device System - A wireless communication device that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. The audio data is reduced to a digital file in a format that is supported by the device hardware. The digital file is sent via wireless communication to at least one server computer for further processing. The command includes a unique device identifier that identifies the wireless communication device. The server computer determines required additional processing for the command based on the unique device identifier. The server computer constructs an application command based on the processed command, and transmits the application command to the wireless communication device. The application command includes at least one instruction that causes a corresponding application on the wireless communication device to execute the application command. (03-21-2013)
20130073293 - ELECTRONIC DEVICE AND METHOD FOR CONTROLLING THE SAME - An electronic device, a system including the same, and a method for controlling the same are provided. The electronic device may select a specific electronic device to perform a user's voice command in an environment including a plurality of electronic devices capable of voice recognition. The embodiments of the present disclosure allow for interaction between the user and the plurality of electronic devices so that the electronic devices can be efficiently controlled in the N screen environment. (03-21-2013)
20110015932 - METHOD FOR SONG SEARCHING BY VOICE - The present invention relates to a method for song searching by voice, especially a method with which users can complete settings and then start searching, so that the users' spoken search conditions will be acquired for voice recognition, and the recognition results will be compared with the instruction data and song attribute data in the voice recognition database to obtain comparison data. If the comparison data do not correspond with the preset conditions, the next search condition generated from the comparison data will be broadcast by voice, and the users are allowed to speak the next search condition to make comparisons of search conditions in the next process. If the comparison data correspond with the preset conditions, one or more song files will be read according to the comparison data and given a preview. With this method, the users will not touch buttons or knobs by mistake, do not need to spend time searching for song files one by one, and do not need to free one or both hands to press the buttons or knobs. Besides, the users can decide on such matters as search conditions, initial position of previews, whether to play immediately after choices are made, preview period, sequential or shuffle play, etc., thus promoting convenience for users in searching for songs and meeting preferences and needs of different users. (01-20-2011)
20090271203 - Voice-activated remote control service - A method of remotely controlling operation of a controlled device involves receiving a telephone call from an owner via a telephone network; authenticating the telephone call to establish that the owner is authorized to control the controlled device; interpreting a voice command from the owner that issues instructions to the controlled device; identifying the controlled device based upon the authentication and identification by the owner of the controlled device; converting the voice command to one or more data packets capable of interpretation by the controlled device to execute the command; and delivering the one or more data packets to the controlled device via the Internet. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract. (10-29-2009)
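The authenticate-interpret-packetize flow above can be sketched as follows; the JSON framing, caller-ID authentication, and naive command parsing are assumptions for illustration:

```python
import json

def telephone_command_to_packet(caller_id, authorized, utterance, devices):
    # Authenticate: only callers on the authorized list may control devices.
    if caller_id not in authorized:
        raise PermissionError("caller not authorized")
    # Interpret: first word names the controlled device, the rest is the command.
    device, _, command = utterance.partition(" ")
    if device not in devices:
        raise ValueError("unknown controlled device")
    # Convert to a data packet the controlled device can interpret.
    return json.dumps({"device": device, "command": command}).encode()

packet = telephone_command_to_packet("+15551234", {"+15551234"},
                                     "thermostat set 68", {"thermostat"})
print(packet)  # b'{"device": "thermostat", "command": "set 68"}'
```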
20120226503 - INFORMATION PROCESSING APPARATUS AND METHOD - An information processing apparatus comprising an information output unit configured to switch among a plurality of languages at each given time interval while outputting guidance information in the plurality of languages, a response detection unit configured to detect a response to the guidance information when the guidance information is output while the languages are switched, and a processing language determination unit configured to take the language for which the response to the guidance information is detected as a processing language. (09-06-2012)
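The interval-switching behavior above can be sketched as a loop over candidate languages; the response detector here is a stand-in callable (an assumption):

```python
def rotate_until_response(languages, responds_to):
    # Guidance is output in one language per interval; the first
    # language that draws a response becomes the processing language.
    for lang in languages:
        if responds_to(lang):
            return lang
    return None   # no response detected in any language

print(rotate_until_response(["en", "fr", "ja"], lambda l: l == "fr"))  # fr
```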
20090030698 - USING SPEECH RECOGNITION RESULTS BASED ON AN UNSTRUCTURED LANGUAGE MODEL WITH A MUSIC SYSTEM - Speech recorded by an audio capture facility of a music facility is processed by a speech recognition facility to generate results that are provided to the music facility. When information related to a music application running on the music facility is provided to the speech recognition facility, the results generated are based at least in part on the application related information. The speech recognition facility uses an unstructured language model for generating results. The user of the music facility may optionally be allowed to edit the results being provided to the music facility. The speech recognition facility may also adapt speech recognition based on usage of the results. (01-29-2009)
20090030697 - USING CONTEXTUAL INFORMATION FOR DELIVERING RESULTS GENERATED FROM A SPEECH RECOGNITION FACILITY USING AN UNSTRUCTURED LANGUAGE MODEL - A user may control a mobile communication facility through recognized speech provided to the mobile communication facility. Speech is recorded by a user using a capture facility resident on the mobile communication facility. A speech recognition facility generates results of the recorded speech using an unstructured language model based at least in part on information relating to the recording. A context of the mobile communications facility is determined at the time the speech is recorded, and based on the context, the generated results are delivered to a facility for performing an action on the mobile communication facility. (01-29-2009)
20090030696 - USING RESULTS OF UNSTRUCTURED LANGUAGE MODEL BASED SPEECH RECOGNITION TO CONTROL A SYSTEM-LEVEL FUNCTION OF A MOBILE COMMUNICATIONS FACILITY - A user may control a mobile communication facility through recognized speech provided to the mobile communication facility. Speech is recorded by a user using a capture facility resident on the mobile communication facility. A speech recognition facility generates results of the recorded speech using an unstructured language model based at least in part on information relating to the recording. A function of the operating system of the mobile communication facility is controlled based on the results. (01-29-2009)
20090030695 - System And Method For Hazard Mitigation In Voice-Driven Control Applications - A speech recognition and control system including a sound card for receiving speech and converting the speech into digital data, the sound card removably connected to an input of a computer, recognizer software executing on the computer for interpreting at least a portion of the digital data, event detection software executing on the computer for detecting connectivity of the sound card, and command control software executing on the computer for generating a command based on at least one of the digital data and the connectivity of the sound card. (01-29-2009)
20130066636 - Apparatus and method for a wireless extension collar device for use with mobile and fixed end-user wireless devices - A wireless extension device for an end-user wireless device has a collar that is worn around the neck. The collar has two end-members that are positioned on the two collar bone areas next to the neck. The end-members have directional speakers positioned therein that radiate sound in the direction of the two ears of the human wearing the collar around the neck. The end-members also have positioned microphones that pick up voice commands of the human wearing the collar around the neck. The wireless collar extension device is used for hands free communication with the end-user wireless device, without having to plug a prior art BLUETOOTH earpiece into one of the ears. (03-14-2013)
20130066637 - INFORMATION PROCESSOR - Information processor (03-14-2013)
20090012795 - METHOD AND SYSTEM FOR DYNAMIC CONDITIONAL INTERACTION IN A VOICEXML RUN-TIME SIMULATION ENVIRONMENT - A method and system for testing voice applications, such as VoiceXML applications, is provided. The system provides a run-time simulation environment for voice applications that simulates and automates user interaction. A user simulation script is provided in a customized mark-up language. The voice application is processed to derive a nominal output of the voice application. The user simulation script is processed to generate a simulated output for the voice application corresponding to the nominal output. Conditional logic may be applied to the nominal output to generate a simulated input in response thereto. The user simulation script is specified in a customized mark-up language having a set of one or more conditional tags and an internal variable for the nominal output of the voice application. (01-08-2009)
20130166305 - SPEECH RECOGNITION ADJUSTMENT BASED ON MANUAL INTERACTION - A method of operating a speech recognition system on a vehicle having a visual display and manually-operated input device that includes initiating a speech recognition system, controlling menu selections on a visual display using a manually-operated input device, receiving a notification from the manually-operated input device indicating that the user is manipulating the device in conjunction with the menu selections on the visual display, and adjusting operation of the speech recognition system based on input received by the manually-operated input device. (06-27-2013)
20080288260 - Input/Output Apparatus Based on Voice Recognition, and Method Thereof - Provided is an input/output apparatus based on voice recognition, and a method thereof. An object of the apparatus is to improve a user interface by making pointing input and command execution, such as application program control, possible according to a voice command of a user based on a voice recognition technology, without an individual pointing input device such as a mouse or a touch pad. The apparatus includes: a voice recognizer for recognizing a voice command inputted from outside; a pointing controller for calculating a pointing location on a screen which corresponds to a voice recognition result transmitted from the voice recognizer; a displayer for displaying a screen; and a command controller for processing diverse commands related to a current pointing location. (11-20-2008)
20090299752 - Recognition of Voice-Activated Commands - Systems and methods for voice activated commands in a digital home communication terminal are disclosed. One example method includes storing a program audio signal corresponding to a program tuned by the digital home communication terminal. The method also includes storing an incoming audio signal carrying speech and removing from the incoming audio signal a portion of the incoming audio signal that corresponds to the program audio signal, thus producing an improved version of the incoming audio signal. The method also includes selecting one of a plurality of voice-activated commands that corresponds to the improved version of the incoming audio signal, and performing a function corresponding to the selected voice-activated command. (12-03-2009)
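The removal step above can be sketched as subtracting the stored program signal from the incoming signal; a fixed scalar weight stands in for real echo-cancellation filtering (an assumption for illustration):

```python
def remove_program_audio(incoming, program, weight=1.0):
    # Scale the stored program signal and subtract it sample-by-sample,
    # leaving an improved version of the incoming (speech) signal.
    return [x - weight * p for x, p in zip(incoming, program)]

speech = [0.1, -0.2, 0.3]                       # what the user actually said
program = [0.5, 0.5, -0.5]                      # TV program audio
mixed = [s + p for s, p in zip(speech, program)]  # what the microphone hears
cleaned = remove_program_audio(mixed, program)
print([round(c, 6) for c in cleaned])  # [0.1, -0.2, 0.3]
```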
20110282673INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - Provided is an information processing apparatus including: a voice analysis unit which performs an analysis process on a user speech; and a data processing unit which receives the analysis results of the voice analysis unit to determine a process to be performed by the information processing apparatus, wherein, in the case where a factor inhibiting process continuation occurs in a process based on the user speech, the data processing unit generates and outputs feedback information corresponding to the process stage in which the inhibiting factor occurs.11-17-2011
20090326957OPERATION METHOD OF INTERACTIVE REFRIGERATOR SYSTEM - An operation method of an interactive refrigerator system includes displaying information about stored items corresponding to a speech input by a user, generating and outputting a response message for the information about the stored items, checking whether or not the storage periods of the stored items have expired, and outputting expiration information or expected expiration information about the storage periods of the stored items.12-31-2009
20090240502MULTIMEDIA CONTROLLER TABLET - A media controller tablet is disclosed comprising: an ergonomic housing including a pair of complementary curvilinear surfaces tapering from an upper end to a lower end, and a gripping surface in contact with one of the complementary curvilinear surfaces, wherein the gripping surface and the complementary curvilinear surfaces are adapted to engage a human hand with a palm of the human hand in contact with one of the pair of complementary curvilinear surfaces and a plurality of fingers of the human hand wrapped around the gripping surface; a controller embedded in the ergonomic housing and adapted to receive an input and wirelessly control a media device in accordance with the input; and a display integrated with the ergonomic housing and connected with the controller for displaying information about media content available for viewing on the media device.09-24-2009
20080262848Applications Server and Method - A speech applications server is arranged to provide a user driven service in accordance with an application program in response to user commands for selecting service options. The user is prompted by audio prompts to issue the user commands. The application program comprises a state machine operable to determine a state of the application program from one of a predetermined set of states defining a logical procedure through the user selected service options, transitions between states being determined in accordance with logical conditions to be satisfied in order to change between one state of the set and another state of the set. The logical conditions include whether a user has provided one of a set of possible commands. A prompt selection engine is operable to generate the audio prompts for prompting the commands from the user in accordance with predetermined rules. The prompt selected by the prompt selection engine is determined at run-time. Since the state machine and the prompt selection engine are separate entities and the prompts to be selected are determined at run-time, it is possible to effect a change to the prompt selection engine without influencing the operation of the state machine, enabling different customisations to be provided for the same user driven services. In particular, this allows multilingual support, with the possibility of providing rules to adapt the prompt structure to allow for grammatical differences between two languages, thus providing higher quality multiple language support.10-23-2008
20110301958System-Initiated Speech Interaction - Whenever an event occurs on a computing system which will accept a response from a user of the system, the system automatically determines whether or not to enable speech interaction with the system for the event response. Whenever speech interaction is enabled with the system for the event response, the system provides a notification to the user which informs the user of the event and their options for responding thereto, where these options include responding verbally. Whenever the user responds within a prescribed period of time via a voice command (VC), the system attempts to recognize the VC. Whenever the VC is successfully recognized, the system responds appropriately to the VC.12-08-2011
20110301959VOICE ACQUISITION SYSTEM FOR A VEHICLE - A voice acquisition system for a vehicle includes an interior rearview mirror assembly attached at an inner portion of the windshield of a vehicle equipped with the interior rearview mirror assembly. The interior rearview mirror assembly includes at least two microphones for receiving audio signals within a cabin of the vehicle and generating an output indicative of the audio signals. A control is in the vehicle and is responsive to the output from the at least two microphones. The control at least partially distinguishes vocal signals from non-vocal signals present in the output. The at least two microphones provide sound capture for at least one of a hands free cell phone system, an audio recording system and a wireless communication system.12-08-2011
20090210233COGNITIVE OFFLOADING: INTERFACE FOR STORING AND COMPOSING SEARCHES ON AND NAVIGATING UNCONSTRAINED INPUT PATTERNS - One or more commands are configured to cause content to be stored for retrieval. The content to be stored includes one or more entries. The content may include event-triggered content stored for retrieval upon an occurrence of a specified event or other content. The content is retrieved in response to a retrieval command specifying a given pattern by comparing the given pattern with the stored content and, upon finding a match for the given pattern, wherein the match corresponds with the given pattern within a predetermined variance, retrieving additional content stored with the match for the given pattern. The content also may be retrieved by identifying the occurrence of the specified event and retrieving the event-triggered content upon the occurrence of the specified event.08-20-2009
20110288871INFORMATION PRESENTATION SYSTEM - In an in-vehicle navigation apparatus, an in-vehicle BT communications device receives a speech recognition result via a BT communications link from a cellular phone. Based on the received speech recognition result, an in-vehicle control circuit outputs a talk-back sound about the speech recognition result via an in-vehicle sound output device.11-24-2011
20100114580Responding to a Call to Action Contained in an Audio Signal - An audio signal is monitored to detect the presence of a call to action contained therein. Addressing information is automatically extracted from the call to action and stored on a storage medium. An electronic message responding to the call to action may be automatically prepared, or a contact field may be automatically populated for inclusion in a contact list. The audio signal may be digitized or obtained from a broadcast transmission, and the process may be performed by a mobile communication device, a central system, or a combination thereof.05-06-2010
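Automatic extraction of addressing information from a call to action, as described in the abstract above, could be sketched with simple pattern matching over a transcript. This is a hypothetical illustration: the function name, the regular expressions, and the transcript format are assumptions, and a production system would handle far more number and address formats.

```python
import re

# Illustrative patterns for addressing information in a transcribed
# call to action; real-world extraction needs many more formats.
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
URL = re.compile(r"\bwww\.[a-z0-9.-]+\.[a-z]{2,}\b", re.IGNORECASE)

def extract_addressing_info(transcript):
    # Return all phone numbers and web addresses found in the transcript,
    # ready to be stored or used to populate a contact field.
    return {
        "phones": PHONE.findall(transcript),
        "urls": URL.findall(transcript),
    }
```

The extracted fields could then populate a contact entry or pre-address an outgoing message, per the abstract.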
20110125504MOBILE DEVICE AND METHOD AND COMPUTER-READABLE MEDIUM CONTROLLING SAME - A mobile device moves by calculating the distance between a sound source and the mobile device using a sound source direction estimation technique. The mobile device moves by a reference distance in a direction perpendicular to the direction in which it faces the sound source when a call sound is generated by the sound source, outputs voice to instruct the sound source to generate a recall sound, checks the directional angle of the mobile device when the recall sound is generated by the sound source, calculates the distance between the sound source and the mobile device according to the reference distance and the directional angle, and moves to the vicinity of the sound source.05-26-2011
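The distance calculation in the abstract above is geometric. One plausible reading (my assumption, not spelled out in the abstract): the device faces the source, sidesteps by the reference distance perpendicular to that facing, and then measures the angle by which the source appears offset from the original facing after the recall sound; the tangent of that angle relates the sidestep to the source distance.

```python
import math

def estimate_source_distance(reference_distance, directional_angle_deg):
    # Hypothetical geometry: after a perpendicular sidestep of
    # reference_distance, the source appears offset by the measured angle,
    # so tan(angle) = reference_distance / distance_to_source.
    angle = math.radians(directional_angle_deg)
    return reference_distance / math.tan(angle)
```

For instance, a one-meter sidestep producing a 45-degree offset would place the source one meter from the original position.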
20090132255Systems and Methods of Performing Speech Recognition with Barge-In for use in a Bluetooth System - Embodiments of the present invention improve methods of performing speech recognition with barge-in. In one embodiment, the present invention includes a speech recognition method comprising starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user speech input signal, selectively altering the synthesis of recorded speech, and recognizing the user choice.05-21-2009
20090150160SYSTEMS AND METHODS OF PERFORMING SPEECH RECOGNITION USING GESTURES - Embodiments of the present invention improve methods of performing speech recognition using human gestures. In one embodiment, the present invention includes a speech recognition method comprising detecting a gesture, selecting a first recognition set based on the gesture, receiving a speech input signal, and recognizing the speech input signal in the context of the first recognition set.06-11-2009
20120035935APPARATUS AND METHOD FOR RECOGNIZING VOICE COMMAND - An apparatus and method for recognizing a voice command for use in an interactive voice user interface are provided. The apparatus includes a command intention belief generation unit that is configured to recognize a first voice command and that may generate one or more command intention beliefs for the first voice command. The apparatus also includes a command intention belief update unit that is configured to update each of the command intention beliefs based on a system response to the first voice command and a second voice command. The apparatus also includes a command intention belief selection unit that is configured to select one of the updated command intention beliefs for the first voice command. The apparatus also includes an operation signal output unit that is configured to select a final command intention from the selected updated command intention belief and to output an operation signal based on the selected final command intention.02-09-2012
20100082351UNIVERSAL REMOTE CONTROLLER AND CONTROL CODE SETUP METHOD THEREOF - The present invention provides a universal remote controller that transmits and informs the learning of control codes, the setup of the control codes, and the setup of preference channels by voice commands, learns a voice command or key value input according to the voice command transmitted, and registers the learned key value as the control code. A control code setup method includes detecting input of a voice command or a specific key signal requesting a control code setup in a standby mode; starting, when the voice command or specific key signal is detected, a control code setup mode and transmitting a control code setup method step by step by voice information; recognizing a voice command or key signal input by a user according to the voice information transmitted; and starting a standby mode after registering and storing the control codes matching the recognized voice commands or key signals and transmitting voice information on the registering and storing of the control codes.04-01-2010
20090204410VOICE INTERFACE AND SEARCH FOR ELECTRONIC DEVICES INCLUDING BLUETOOTH HEADSETS AND REMOTE SYSTEMS - Systems and methods for improving the interaction between a user and a small electronic device such as a Bluetooth headset are described. The use of a voice user interface in electronic devices may be used. In one embodiment, recognition processing limitations of some devices are overcome by employing speech synthesizers and recognizers in series where one electronic device responds to simple audio commands and sends audio requests to a remote device with more significant recognition analysis capability. Embodiments of the present invention may include systems and methods for utilizing speech recognizers and synthesizers in series to provide simple, reliable, and hands-free interfaces with users.08-13-2009
20110202351Audio system and method for coordinating tasks - A system includes a hands free mobile communication device. Software stored on a machine readable storage device is executed to cause the hands free mobile communication device to communicate audibly with a field operator performing field operations. The operator receives instructions regarding operations to be performed. Oral communications are received from the operator and are processed automatically to provide further instructions in response to the received oral communications.08-18-2011
20090204411IMAGE PROCESSING APPARATUS, VOICE ASSISTANCE METHOD AND RECORDING MEDIUM - An image processing apparatus comprises: a voice input portion; a memory that stores in itself as voice data, voice of a plurality of users for voice assistance, which is inputted by the voice input portion; a selection portion that selects voice data applied for a login user among the voice data stored in the memory, if information should be given by voice; and a voice output portion that outputs voice corresponding to the selected voice data.08-13-2009
20090089064SYSTEM, METHOD AND ARCHITECTURE FOR CONTROL AND MULTI-MODAL SYNCHRONIZATION OF SPEECH BROWSERS - Clients connecting to a VoiceXML browser obtain a control channel. Using this channel, clients may initialize a new VoiceXML session or attach to an existing VoiceXML session. The client after obtaining a session may perform a range of actions including controlling and monitoring actions.04-02-2009
20090177477Voice-Controlled Clinical Information Dashboard - A method provides a display area of a computer system for displaying a set of data. The data includes clinical data for one or more medical patients. The method provides multiple controls for performing multiple functions. The method provides an audio interface for controlling at least one of the controls through audio commands.07-09-2009
20090276224SYSTEM AND METHOD FOR MONITORING DELIVERY OF MEDIA CONTENT BY A MEDIA COMMUNICATION SYSTEM - A system that incorporates teachings of the present disclosure may operate according to, for example, a method involving recording audio feedback from a plurality of subscribers commenting on media content supplied by a media communication system on at least one of a plurality of media channels, detecting one or more trigger words in the recorded audio feedback having an association with a disruption of one or more media services supplied by the media communication system, selecting one or more network elements of the media communication system in at least one transmission path that supplies media services to one or more of the plurality of subscribers that supplied audio feedback matching the one or more trigger words, and directing the selected one or more network elements to record media content on one or more media channels selected from the plurality of media channels. Other embodiments are disclosed.11-05-2009
20090089065ADJUSTING OR SETTING VEHICLE ELEMENTS THROUGH SPEECH CONTROL - A speech processing device includes an automotive device that filters data that is sent and received across an in-vehicle bus. The device selectively acquires vehicle data related to a user's settings or adjustments of an in-vehicle system. An interface acquires the selected vehicle data from one or more in-vehicle sensors in response to a user's articulation of a first code phrase. A memory stores the selected vehicle data with unique identifying data associated with a user. The unique identifying data establishes a connection between the selected vehicle data and the user when a second code phrase is articulated by the user. A data interface provides access to the selected vehicle data and relationship data retained in the memory and enables the processing of the data to customize the in-vehicle system. The data interface is responsive to a user's articulation of a third code phrase to process the selected vehicle data that enables the setting or adjustment of the in-vehicle system.04-02-2009
20080208592Configuring A Speech Engine For A Multimodal Application Based On Location - Methods, apparatus, and products are disclosed for configuring a speech engine for a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application. The multimodal application is operatively coupled to a speech engine. Configuring a speech engine for a multimodal application based on location includes: receiving a location change notification in a location change monitor from a device location manager, the location change notification specifying a current location of the multimodal device; identifying, by the location change monitor, location-based configuration parameters for the speech engine in dependence upon the current location of the multimodal device, the location-based configuration parameters specifying a configuration for the speech engine at the current location; and updating, by the location change monitor, a current configuration for the speech engine according to the identified location-based configuration parameters.08-28-2008
20080208587Document Session Replay for Multimodal Applications - Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including: identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by the multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by the multimodal browser, the response.08-28-2008
20080208595System and method for capturing steps of a procedure - The present invention relates to a system and method for capturing the steps of a procedure in a workplace or other environment to assist with operations, knowledge transfer or regulatory compliance and for other general purposes. The system and method enable a person to capture a procedure while actively carrying out the procedure in its associated environment.08-28-2008
20080208594Effecting Functions On A Multimodal Telephony Device - Methods, apparatus, and computer program products are described for effecting functions on a multimodal telephony device, implemented with the multimodal application operating on a multimodal telephony device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to an automated speech recognition engine. Embodiments include receiving the speech of a telephone call; identifying with the automated speech recognition engine action keywords in the speech of the telephone call; selecting a function of the multimodal telephony device in dependence upon the action keywords; identifying parameters for the function of the multimodal telephony device; and executing the function of the multimodal telephony device using the identified parameters.08-28-2008
20080208591Enabling Global Grammars For A Particular Multimodal Application - Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.08-28-2008
20080208589Presenting Supplemental Content For Digital Media Using A Multimodal Application - Presenting supplemental content for digital media using a multimodal application, implemented with a grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine, includes: rendering, by the multimodal application, a portion of the digital media; receiving, by the multimodal application, a voice utterance from a user; determining, by the multimodal application using the ASR engine, a recognition result in dependence upon the voice utterance and the grammar; identifying, by the multimodal application, supplemental content for the rendered portion of the digital media in dependence upon the recognition result; and rendering, by the multimodal application, the supplemental content.08-28-2008
20080208590Disambiguating A Speech Recognition Grammar In A Multimodal Application - Disambiguating a speech recognition grammar in a multimodal application, the multimodal application including voice activated hyperlinks, the voice activated hyperlinks voice enabled by a speech recognition grammar characterized by ambiguous terminal grammar elements, including maintaining by the multimodal browser a record of visibility of each voice activated hyperlink, the record of visibility including current visibility and past visibility on a display of the multimodal device of each voice activated hyperlink, the record of visibility further including an ordinal indication, for each voice activated hyperlink scrolled off display, of the sequence in which each such voice activated hyperlink was scrolled off display; recognizing by the multimodal browser speech from a user matching an ambiguous terminal element of the speech recognition grammar; selecting by the multimodal browser a voice activated hyperlink for activation, the selecting carried out in dependence upon the recognized speech and the record of visibility.08-28-2008
20090276225METHOD FOR AUTOMATED SENTENCE PLANNING IN A TASK CLASSIFICATION SYSTEM - The invention relates to a method for sentence planning.11-05-2009
20090254351MOBILE TERMINAL AND MENU CONTROL METHOD THEREOF - A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.10-08-2009
20120296655PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes obtaining sensor data from one or more sensors of a mobile device while the mobile device is operating in an inactive state, determining that a user of the mobile device is interacting with the mobile device based on the sensor data, invoking voice input functionality of the mobile device in response to determining that the user of the mobile device is interacting with the mobile device, detecting a voice input, and activating the mobile device in response to detecting the voice input.11-22-2012
20080228495Enabling Dynamic VoiceXML In An X+V Page Of A Multimodal Application - Enabling dynamic VoiceXML in an X+V page of a multimodal application implemented with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a VoiceXML interpreter, including representing by the multimodal browser an XML element of a VoiceXML dialog of the X+V page as an ECMAScript object, the XML element comprising XML content; storing by the multimodal browser the XML content of the XML element in an attribute of the ECMAScript object; and accessing the XML content of the XML element in the attribute of the ECMAScript object from an ECMAScript script in the X+V page.09-18-2008
20080249782Web Service Support For A Multimodal Client Processing A Multimodal Application - Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.10-09-2008
20080275707Voice Based Network Management Method and Agent - A method of providing voice based device management, comprising defining a set of one or more status queries for a device, defining for each of the status queries a respective set of status responses for the device corresponding to the instantaneous status of the device, mapping the status queries to corresponding voice format status queries, and mapping the status responses to corresponding voice format status responses.11-06-2008
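The query/response mappings described in the abstract above could be sketched as two lookup tables: one from spoken query to device attribute, one from attribute-plus-state to a spoken response. Everything here is illustrative (the phrases, attribute names, and function name are assumptions), sketching only the mapping idea, not the patented agent.

```python
# Illustrative mappings from voice-format queries to device status
# attributes, and from (attribute, state) pairs to voice-format responses.
VOICE_QUERIES = {
    "what is your link status": "link_status",
    "what is your cpu load": "cpu_load",
}
VOICE_RESPONSES = {
    ("link_status", "up"): "The link is up.",
    ("link_status", "down"): "The link is down.",
    ("cpu_load", "high"): "C P U load is high.",
    ("cpu_load", "low"): "C P U load is low.",
}

def answer(spoken_query, device_state):
    # Map the spoken query to a status attribute, read the device's
    # instantaneous state, and return the matching voice-format response.
    query = VOICE_QUERIES.get(spoken_query.lower().strip())
    if query is None:
        return "Query not recognized."
    return VOICE_RESPONSES[(query, device_state[query])]
```

A real agent would generate these mappings from the device's management interface rather than hard-code them.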
20080275708NETWORK-BASED VOICE ACTIVATED AUTO-ATTENDANT SERVICE WITH B2B CONNECTORS - A network-based voice activated auto-attendant service is disclosed. In a particular embodiment, a data processor is provided that can construct an enterprise voice directory by executing instructions to encrypt eXtended Markup Language (XML)-based files using an encryption key issued by a voice activated auto-attendant service provider network to form encrypted XML-based files. The instructions are further to store the encrypted XML-based files in a manner that is accessible to the voice activated auto-attendant service provider network, and to create the enterprise voice directory based on the encrypted XML-based files. The enterprise voice directory is configured to provide run-time access to the voice activated auto-attendant service provider network.11-06-2008
20110145000Apparatus, System and Method for Voice Dialogue Activation and/or Conduct - An apparatus, a system and a method for voice dialogue activation and/or conduct. The apparatus for voice dialogue activation and/or conduct has a voice recognition unit, a speaker recognition unit and a decision-maker unit. The decision-maker unit is designed to activate a result action on the basis of results from the voice and speaker recognition units.06-16-2011
20080262849VOICE CONTROL SYSTEM - A voice control system allows a user to control a device through voice commands. The voice control system includes a speech recognition unit that receives a control signal from a mobile device and a speech signal from a user. The speech recognition unit configures speech recognition settings in response to the control signal to improve speech recognition.10-23-2008
20080262847USER POSITIONABLE AUDIO ANCHORS FOR DIRECTIONAL AUDIO PLAYBACK FROM VOICE-ENABLED INTERFACES - The present invention discloses a concept and a use of audio anchors within voice-enabled interfaces. Audio anchors can be user configurable points from which audio playback occurs. In the invention, a user can identify an interface position at which an audio anchor is to be established. The computing device can determine an anchor direction setting, with values that include forward playback and backward playback. Interface items can then be audibly enumerated from the audio anchor in a direction indicated by the anchor direction setting. For example, if a set of interface items are alphabetically ordered items and if an audio anchor is set at a first item beginning with a letter “G” and an anchor direction is set to indicate backward playback, then the interface items beginning with letters “A-F” can be audibly played in reverse alphabetical order. Additionally, a rate of audio playback can be user adjustable.10-23-2008
20080228494Speech-Enabled Web Content Searching Using A Multimodal Browser - Speech-enabled web content searching using a multimodal browser implemented with one or more grammars in an automatic speech recognition (‘ASR’) engine, with the multimodal browser operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal browser operatively coupled to the ASR engine, includes: rendering, by the multimodal browser, web content; searching, by the multimodal browser, the web content for a search phrase, including yielding a matched search result, the search phrase specified by a first voice utterance received from a user and a search grammar; and performing, by the multimodal browser, an action in dependence upon the matched search result, the action specified by a second voice utterance received from the user and an action grammar.09-18-2008
20080228493Determining voice commands with cooperative voice recognition - A method of recognizing voice commands cooperatively includes generating a voice command from a user specifying a target machine and a desired action to be performed by the target machine, and a plurality of machines receiving the voice command, the plurality of machines comprising the target machine and at least one member machine. The method also includes each of the plurality of machines performing a recognition process on the voice command to produce a corresponding recognition result, each member machine sending its corresponding recognition result to the target machine, and the target machine evaluating its own recognition result together with the recognition result from each member machine to determine a most likely final recognition result for the voice command.09-18-2008
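The target machine's evaluation step in the abstract above could take many forms; one minimal sketch (my assumption, not the patented method) is a plurality vote over all machines' recognition results, with the target machine's own result breaking ties.

```python
from collections import Counter

def final_recognition(target_result, member_results):
    # Hypothetical fusion rule: count every machine's hypothesis and take
    # the most frequent one; on a tie, prefer the target machine's result.
    votes = Counter([target_result, *member_results])
    best, count = votes.most_common(1)[0]
    if votes[target_result] == count:
        return target_result
    return best
```

A production system would more likely weight each hypothesis by its confidence score rather than count raw votes.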
20110010180Speech Enabled Media Sharing In A Multimodal Application - Speech enabled media sharing in a multimodal application including parsing, by a multimodal browser, one or more markup documents of a multimodal application; identifying, by the multimodal browser, in the one or more markup documents a web resource for display in the multimodal browser; loading, by the multimodal browser, a web resource sharing grammar that includes keywords for modes of resource sharing and keywords for targets for receipt of web resources; receiving, by the multimodal browser, an utterance matching a keyword for the web resource, a keyword for a mode of resource sharing and a keyword for a target for receipt of the web resource in the web resource sharing grammar thereby identifying the web resource, a mode of resource sharing, and a target for receipt of the web resource; and sending, by the multimodal browser, the web resource to the identified target for the web resource using the identified mode of resource sharing.01-13-2011
20080288259SPEECH RECOGNITION MACRO RUNTIME - The disclosed speech recognition system enables users to define personalized, context-aware voice commands without extensive software development. Command sets may be defined in a user-friendly language and stored in an eXtensible Markup Language (XML) file. Each command object within the command set may include one or more user-configurable actions, one or more configurable rules, and one or more configurable conditions. The command sets may be managed by a command set loader that loads and processes each command set into computer-executable code. The command set loader may enable and disable command sets. A macro processing component may provide a speech recognition grammar to an API of the speech recognition engine based on currently enabled commands. When the speech recognition engine recognizes user speech consistent with the grammar, the macro processing component may initiate the one or more computer-executable actions.11-20-2008
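The command-set flow described above (an XML file of commands, a loader that turns each command set into an executable mapping) can be sketched minimally in Python. The XML schema, element names, phrases, and action names below are hypothetical stand-ins for illustration, not the patent's actual format:

```python
import xml.etree.ElementTree as ET

# Hypothetical command-set file contents; the real schema is not specified here.
COMMAND_SET_XML = """
<commandSet name="media">
  <command name="play_music">
    <rule phrase="play some music"/>
    <action target="start_playback"/>
  </command>
  <command name="stop_music">
    <rule phrase="stop the music"/>
    <action target="stop_playback"/>
  </command>
</commandSet>
"""

def load_command_set(xml_text):
    """Parse an XML command set into a phrase -> action mapping (the 'grammar')."""
    root = ET.fromstring(xml_text)
    grammar = {}
    for cmd in root.findall("command"):
        phrase = cmd.find("rule").get("phrase")
        action = cmd.find("action").get("target")
        grammar[phrase] = action
    return grammar

grammar = load_command_set(COMMAND_SET_XML)
```

A real loader would additionally evaluate each command's configurable conditions before enabling it, and would register the collected phrases as a grammar with the speech recognition engine's API.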
20100138224NON-DISRUPTIVE SIDE CONVERSATION INFORMATION RETRIEVAL - Information is exchanged between a user of a communications device and an application during an ongoing conversation between the user using the communications device and a party, without disrupting the conversation. An application associated with the communications device is accessed via the communications device in response to a command and keyword spoken by the user during the communications session. Information is retrieved from the application according to the keyword spoken by the user. When the information is retrieved from the application, the user is prompted in a manner transparent to the party, after which a response is sent to the user.06-03-2010
20080235032Method and Apparatus for Data Capture Using a Voice Activated Workstation - A method and apparatus for capturing data in a workstation, wherein a large amount of data associated with a sample viewed by a user through an optical device, such as a microscope, is to be entered in a computer-related file. The optical device can be moved to a data-sampling position utilizing voice commands. A pointer can then be moved to an appropriate place in the file to receive the data relating to the data-sampling position. Data can then be entered in the appropriate position utilizing a voice command. The steps of moving the pointer and entering the data can then be repeated until all data is provided with respect to the data-sampling positions.09-25-2008
20080235031Interface apparatus, interface processing method, and interface processing program - An interface apparatus according to an embodiment of the invention includes: an operation detecting section configured to detect a device operation; a status detecting section configured to detect a status change or status continuance of a device or in the vicinity of the device; an operation history accumulating section configured to accumulate an operation detection result and a status detection result in association with each other; an operation history matching section configured to match a status detection result for a newly detected status against accumulated status detection results, and select a device operation that corresponds to the newly detected status; and an utterance section configured to utter as sound a word corresponding to the selected device operation.09-25-2008
20110270615GLOBAL SPEECH USER INTERFACE - A global speech user interface (GSUI) comprises an input system to receive a user's spoken command, a feedback system along with a set of feedback overlays to give the user information on the progress of his spoken requests, a set of visual cues on the television screen to help the user understand what he can say, a help system, and a model for navigation among applications. The interface is extensible to make it easy to add new applications.11-03-2011
20090182562DYNAMIC USER INTERFACE FOR AUTOMATED SPEECH RECOGNITION - Techniques are described for generating a dynamic user interface for a position-determining device that may account for a variety of input modes. In one example, a position-determining device is initiated in a first input mode (e.g., a touch screen mode) and a graphical user interface (GUI) of the device is configured to accept input via the first input mode. The position-determining device then receives an indication to switch to a second input mode (e.g., a speech input mode) and the GUI is configured to receive input via the second input mode. The position-determining device can dynamically transition between GUI configurations based on a plurality of input modes.07-16-2009
20090177476Method, system and mobile device for registering voice data with calendar events - A system, method and apparatus for registering voice data with a calendar event are provided. Voice data is recorded during the calendar event with a mobile device. The voice data is associated with the calendar event using the mobile device.07-09-2009
20090006100Identification and selection of a software application via speech - An audible indication of a user's position within a given speech grammar framework is provided for a speech-enabled software application, and recognition of speech grammars are limited to use only when a software application that has requested a given set of speech grammars is in focus by a user of an associated mobile computing device.01-01-2009
20130218573VOICE COMMAND RECOGNITION METHOD AND RELATED ELECTRONIC DEVICE AND COMPUTER-READABLE MEDIUM - An electronic device for browsing a document is disclosed. The document being browsed includes a plurality of command-associated text strings. First, a text string selector of the electronic device selects a plurality of candidate text strings from the command-associated text strings. Afterward, an acoustic string provider of the electronic device prepares a candidate acoustic string for each of the candidate text strings. Thereafter, a microphone of the electronic device receives a voice command. Next, a speech recognizer of the electronic device searches the candidate acoustic strings for a target acoustic string that matches the voice command, wherein the target acoustic string corresponds to a target text string of the candidate text strings. Finally, a document browser of the electronic device executes a command associated with the target text string.08-22-2013
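The select-prepare-match pipeline in the entry above can be illustrated with a toy version, in which simple lowercase normalization stands in for the acoustic-string preparation (a real implementation would use a pronunciation model). All function names and strings here are invented for illustration:

```python
def to_acoustic(text):
    # Crude stand-in for grapheme-to-phoneme conversion: lowercase the text
    # and treat hyphens as word breaks.
    return text.lower().replace("-", " ").strip()

def find_target_text(candidate_texts, voice_command):
    """Return the candidate text string whose acoustic form matches the
    spoken command, or None when nothing matches."""
    spoken = to_acoustic(voice_command)
    for text in candidate_texts:
        if to_acoustic(text) == spoken:
            return text
    return None
```

Restricting recognition to the document's own command-associated strings, as the entry describes, keeps the search space small and the match decision cheap.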
20080306743SYSTEM AND METHOD OF USING MODULAR SPOKEN-DIALOG COMPONENTS - A system and method are disclosed for switching contexts within a spoken dialog between a user and a spoken dialog system. The spoken dialog system utilizes modular subdialogs that are invoked by at least one flow controller that is a finite state model and that is associated with a dialog manager. The spoken dialog system includes a dialog manager with a flow controller and a reusable subdialog module. The method includes, while the spoken dialog is being controlled by the subdialog module that was invoked by the flow controller, receiving context-changing input associated with speech from a user that changes a dialog context and comparing the context-changing input to at least one context shift. If any of the context shifts are activated by the comparing step, control of the spoken dialog is passed to the flow controller with a context-shift message and destination state.12-11-2008
20080306741ROBOT AND METHOD FOR ESTABLISHING A RELATIONSHIP BETWEEN INPUT COMMANDS AND OUTPUT REACTIONS - The present invention relates to a robot and method for establishing a relationship between input commands and output reactions. When initiating an input configuration program, the robot fetches a predetermined motion output reaction and performs a corresponding motion. At this time, the robot receives a vocal input command from a user to obtain a vocal input profile, and establishes a relationship between the motion output reaction and the vocal input profile. When receiving the vocal input command again, the robot performs the corresponding motion according to the relationship. In addition, the sound assigned to the motion output reaction can be altered according to the user's preferences. Accordingly, the same motion output reaction may have different naming sounds.12-11-2008
20080306740REMOTELY AND INTERACTIVELY CONTROLLING SEMI-AUTOMATIC DEVICES - An apparatus, system, method and computer program product are provided for enabling a user to remotely and interactively control, using voice commands, the processing tasks of multiple pieces of equipment, such as semi-automatic medication storing, dispensing and packaging devices. In particular, an apparatus may be configured to provide a user with a voice prompt associated with a dynamically prioritized task. In response, the apparatus may further be configured to receive a voice command from the user and to transmit an instruction associated with the voice command to one of the multiple pieces of equipment for performance of the prioritized task.12-11-2008
20080306742APPARATUS, METHOD, AND PROGRAM FOR SUPPORTING SPEECH INTERFACE DESIGN - For design of a speech interface accepting speech control options, speech samples are stored on a computer-readable medium. A similarity calculating unit calculates a certain indication of similarity of first and second sets of ones of the speech samples, the first set of speech samples being associated with a first speech control option and the second set of speech samples being associated with a second speech control option. A display unit displays the similarity indication.12-11-2008
20130218572METHOD AND APPARATUS FOR SMART VOICE RECOGNITION - A display device with a voice recognition capability may be used to allow a user to speak voice commands for controlling certain features of the display device. As a means for increasing operational efficiency, the display device may utilize a plurality of voice recognition units where each voice recognition unit may be assigned a specific task.08-22-2013
20090326956VOICE CONTROL SYSTEM AND METHOD FOR OPERATING DIGITAL PHOTO FRAME - A voice control system includes an acoustic sensor, and a digital photo frame. The acoustic sensor is configured to receive a voice signal, and transform the voice signal to an electronic signal. The digital photo frame includes a transforming module, an instruction module, and a comparing module. The transforming module receives the electronic signal sent from the acoustic sensor and transforms the electronic signal to a transformed electronic code. The instruction module defines a plurality of predetermined electronic codes for performing predetermined functions of the digital photo frame. The comparing module compares the transformed electronic code with the predetermined electronic codes. If the transformed electronic code matches one of the predetermined electronic codes, the digital photo frame performs a function of the predetermined functions associated with the matched predetermined electronic code. A method for operating the digital photo frame is also provided.12-31-2009
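The compare-against-a-code-table step in the photo-frame entry above reduces to a dictionary lookup. The codes, words, and function names below are invented stand-ins; the patent does not specify the encoding:

```python
# Hypothetical table of predetermined electronic codes -> photo-frame functions.
PREDEFINED_CODES = {
    "code_next": "show_next_photo",
    "code_prev": "show_previous_photo",
    "code_slide": "start_slideshow",
}

def transform(recognized_word):
    # Stand-in for the transforming module: map a recognized word to a code.
    word_to_code = {
        "next": "code_next",
        "previous": "code_prev",
        "slideshow": "code_slide",
    }
    return word_to_code.get(recognized_word)

def handle_voice(recognized_word):
    """Compare the transformed code against the table; None means the code
    matched no predetermined function and the frame does nothing."""
    code = transform(recognized_word)
    return PREDEFINED_CODES.get(code)
```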
20080319763SYSTEM AND DIALOG MANAGER DEVELOPED USING MODULAR SPOKEN-DIALOG COMPONENTS - A dialog manager and spoken dialog service having a dialog manager generated according to a method comprising selecting a top-level flow controller based on application type, selecting available reusable subdialogs for each application part, developing a subdialog for each application part not having an available subdialog, and testing and deploying the spoken dialog service using the selected top-level flow controller, selected reusable subdialogs and developed subdialogs. The dialog manager is capable of handling context shifts in a spoken dialog with a user. Application dependencies are established in the top-level flow controller, thus enabling the subdialogs to be reusable and capable of managing context shifts and mixed-initiative dialogs.12-25-2008
20080319762USING A WIKI EDITOR TO CREATE SPEECH-ENABLED APPLICATIONS - The present invention discloses a system and a method for creating and editing speech-enabled WIKIs. A WIKI editor can be served to client-side Web browsers so that end-users can utilize WIKI editor functions, which include functions to create and edit speech-enabled WIKI applications. A WIKI server can serve speech-enabled WIKI applications created via the WIKI editor. Each of the speech-enabled WIKI applications can include a link to at least one speech processing engine located in a speech processing system remote from the WIKI server. The speech processing engine can provide a speech processing capability for the speech-enabled WIKI application when served by the WIKI server. In one embodiment, the speech-enabled applications can include an introspection document, an entry collection of documents, and a resource collection of documents in accordance with standards specified by an ATOM PUBLISHING PROTOCOL (APP).12-25-2008
20090024394AUDIO GUIDANCE SYSTEM - A CPU of a speech ECU acquires vehicle position information. If it is determined from the position information and map data stored in a memory that the vehicle has moved between areas where different languages are spoken as dialects or official languages, the CPU determines a language corresponding to the vehicle position information and transmits a request signal to a speech information center to transmit speech information in the language. By receiving the speech information from the speech information center, the CPU updates speech information pre-stored in the memory with the speech information transmitted from the speech information center.01-22-2009
20100292990AUDIO ENABLED CONTROL MENU SYSTEM, METHOD AND DEVICE - An audio enabled control menu system, method and device is provided. Embodiments of the present invention include an encoder including an input device for actuation of the encoder by an operator of the control menu device; memory including a menu structure and a plurality of audio segments stored in the memory; and a microcontroller in operable communication with the encoder and the memory, the microcontroller further configured to receive menu navigation input from the encoder and output one of the plurality of audio segments in response to the menu navigation input, the microcontroller further configured to execute predetermined control actions in response to the menu navigation input. Embodiments of the invention transmit menu options to an operator in an audio format such that the operator can browse and select menu options with one hand and does not need to look at a visually displayed menu.11-18-2010
20090083039ROBOT APPARATUS WITH VOCAL INTERACTIVE FUNCTION AND METHOD THEREFOR - The present invention provides a robot apparatus with a vocal interactive function. The robot apparatus receives a vocal input, and recognizes the vocal input. The robot apparatus stores a plurality of output data, a last output time of each of the output data, and a weighted value of each of the output data. The robot apparatus outputs output data according to the weighted values of all the output data corresponding to the vocal input, and updates the last output time of the output data. The robot apparatus calculates the weighted values of all the output data corresponding to the vocal input according to the last output time. Consequently, the robot apparatus may output different and variable output data when receiving the same vocal input. The present invention also provides a vocal interactive method adapted for the robot apparatus.03-26-2009
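The weighting scheme in the robot entry above (replies that were output longest ago get larger weights, so the same vocal input yields varying responses) can be sketched as follows. The class, the time-difference weighting formula, and the reply strings are assumptions for illustration:

```python
import time

class ResponsePool:
    """Holds the output data for one recognized vocal input, with the last
    output time tracked per response."""

    def __init__(self, responses):
        # Last output time starts at 0.0 (the distant past), so every
        # response is initially equally "fresh".
        self.entries = {r: 0.0 for r in responses}

    def weight(self, response, now):
        # Responses output longer ago get larger weights.
        return now - self.entries[response]

    def reply(self, now=None):
        """Pick the highest-weighted response and update its output time."""
        now = time.time() if now is None else now
        best = max(self.entries, key=lambda r: self.weight(r, now))
        self.entries[best] = now
        return best

pool = ResponsePool(["hello", "hi there", "hey"])
```

Because each reply resets its own timestamp, repeated identical inputs cycle through the pool rather than repeating one response.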
20090222270VOICE COMMAND INTERFACE DEVICE - A device includes a speech input device. A speech recognition processor connected to the speech input device receives speech input. The device includes a computer readable medium coupled to the speech recognition processor. A command table stored on the computer readable medium includes commands corresponding to a control on a manual input interface on a digital music player. The digital music player is separate from the speech input device. The speech recognition processor compares the speech input to the commands in the command table and generates instructions if the speech input matches a command in the command table. A programmable controller is coupled to the speech recognition processor and is configured to receive instructions and to convert the instructions into control signals. The device includes a standard interface connector coupled to the programmable controller. The programmable controller sends the control signals through the standard interface connector.09-03-2009
20090216540Open Architecture For A Voice User Interface - A system and method for processing voice requests from a user for accessing information on a computerized network and delivering information from a script server and an audio server in the network in audio format. A voice user interface subsystem includes: a dialog engine that is operable to interpret requests from users from the user input, communicate the requests to the script server and the audio server, and receive information from the script server and the audio server; a media telephony services (MTS) server, wherein the MTS server is operable to receive user input via a telephony system, and to transfer the user input to the dialog engine; and a broker coupled between the dialog engine and the MTS server. The broker establishes a session between the MTS server and the dialog engine and controls telephony functions with the telephony system.08-27-2009
20090222271Method For Operating A Navigation System - A method for operating a navigation system analyzes several address components to determine the most likely address desired by a user. The navigation device includes a receiving device on which an acoustic address input consisting of several input components can be registered. The input components of the address are analyzed with a speech recognition module, wherein at least one geographical location, which is defined by an address with several address components, is selected from a database for further processing depending on the result of the speech recognition analysis. The method includes analyzing several address component combinations to determine the most likely address inputted by the user.09-03-2009
20090248420MULTI-PARTICIPANT, MIXED-INITIATIVE VOICE INTERACTION SYSTEM - A voice interaction system includes one or more independent, concurrent state charts, which are used to model the behavior of each of a plurality of participants. The model simplifies the notation and provides a clear description of the interactions between multiple participants. These state charts capture the flow of voice prompts, the impact of externally initiated events and voice commands, and the progress of audio through each prompt. This system enables a method to prioritize conflicting and concurrent events by leveraging historical patterns and the progress of in-progress prompts.10-01-2009
20090248418Speech Recognition and Statistics-Based Call Route Determination - A method of call route determination based upon a statistics-based business intelligence engine (BIE) queried by an IVR subsystem with caller parameters descriptive of the caller to determine a next best route for a received call, when the default or best route for the call exceeds a threshold time. A call is received at a contact center from a caller. Content and identity information of the caller is extracted from the received call. The IVR determines a first estimated wait time associated with a default route of the received call. If the first estimated wait time is greater than a threshold time, and thus unacceptable, then the IVR queries the business intelligence engine (BIE) with caller parameters descriptive of the caller to determine a next best route of the received call, with the next best route having a second estimated wait time less than the first estimated wait time of the default route. The caller is then routed to the next best route.10-01-2009
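The routing decision in the entry above reduces to a threshold check followed by a BIE-suggested fallback that must actually beat the default. A minimal sketch, with the threshold value, route names, and function signature all assumed:

```python
THRESHOLD_SECONDS = 300  # assumed acceptable-wait threshold, not from the patent

def choose_route(default_route, wait_times, next_best):
    """wait_times maps a route name to its estimated wait in seconds;
    next_best is the route the business intelligence engine suggested."""
    # The default route is kept whenever its wait is acceptable.
    if wait_times[default_route] <= THRESHOLD_SECONDS:
        return default_route
    # Accept the BIE suggestion only if it actually improves the wait.
    if wait_times[next_best] < wait_times[default_route]:
        return next_best
    return default_route
```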
20120197647COMPUTERIZED INFORMATION PRESENTATION APPARATUS - A computerized information system and computer readable apparatus. In one embodiment, the apparatus is configured for use in a transport apparatus and comprises a computer readable medium having at least one computer program disposed thereon, the at least one program being configured to provide the user with requested information (such as for example directions to a desired business or other entity). At least a portion of the information is obtained via a wireless link with a remote server.08-02-2012
20080312935MEDIA DEVICE WITH SPEECH RECOGNITION AND METHOD FOR USING SAME - A media player utilizing speech recognition software to perform functions of the media player or make file selections that may be played by the media player. The media player may include one or more microphones to receive a voice command from the user. The one or more microphones may be actuated into a state for receiving a voice command and providing the voice command to one or more microprocessors which perform a function based on the voice command.12-18-2008
20080312934USING RESULTS OF UNSTRUCTURED LANGUAGE MODEL BASED SPEECH RECOGNITION TO PERFORM AN ACTION ON A MOBILE COMMUNICATIONS FACILITY - A user may control a mobile communication facility through recognized speech provided to the mobile communication facility. Speech that is recorded by a user using a mobile communication facility resident capture facility is transmitted through a wireless communication facility to a speech recognition facility. The speech recognition facility generates results using an unstructured language model based at least in part on information relating to the recording. The results are transmitted to the mobile communications facility where an action is performed on the mobile communication facility based on the results.12-18-2008
20100185449METHOD AND SYSTEM FOR COMMUNICATING WITH AN INTERACTIVE VOICE RESPONSE (IVR) SYSTEM - Disclosed is a method and system for interacting with an IVR system. In one aspect, a computing device receives a user request to connect to an IVR system to perform an action. A request for information (e.g., a request to select from a plurality of menu options) is obtained from the IVR system. In response to the request, the computing device automatically supplies an answer to the request for information to the IVR system. In one embodiment, the answer is a dual-tone multi-frequency (DTMF) signal. The obtaining and supplying steps are repeated until the action has been performed.07-22-2010
20080281601USER SPEECH INTERFACES FOR INTERACTIVE MEDIA GUIDANCE APPLICATIONS - A user speech interface for interactive media guidance applications, such as television program guides, guides for audio services, guides for video-on-demand (VOD) services, guides for personal video recorders (PVRs), or other suitable guidance applications is provided. Voice commands may be received from a user and guidance activities may be performed in response to the voice commands.11-13-2008
20100191535SYSTEM AND METHOD FOR INTERRUPTING AN INSTRUCTIONAL PROMPT TO SIGNAL UPCOMING INPUT OVER A WIRELESS COMMUNICATION LINK - A voice interactive session includes detection of an input signaling an interrupt to the session. When the interrupt is detected, instructional and/or informational output is interrupted and detection of voice input begins. The voice input is not detected until the output is interrupted. Upon detection of a voice input (or other sound-based input), a determination may be made as to whether the input was valid. If the input was valid, the input is processed; otherwise, instructional and/or informational output may be relayed again and/or the voice input may be redetected.07-29-2010
20080215336METHOD AND SYSTEM FOR ENABLING A DEVICE FUNCTION OF A VEHICLE - The current invention provides a method and system for enabling a device function of a vehicle. A speech input stream is received at a telematics unit. A speech input context is determined for the received speech input stream. The received speech input stream is processed based on the determination and the device function of the vehicle is enabled responsive to the processed speech input stream. A vehicle device in control of the enabled device function of the vehicle is directed based on the processed speech input stream. A computer usable medium with suitable computer program code is employed for enabling a device function of a vehicle.09-04-2008
20100217604SYSTEM AND METHOD FOR PROCESSING MULTI-MODAL DEVICE INTERACTIONS IN A NATURAL LANGUAGE VOICE SERVICES ENVIRONMENT - A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.08-26-2010
20100145710Data-Driven Voice User Interface - A method for developing a voice user interface for a statistical semantic system is described. A set of semantic meanings is defined that reflect semantic classification of a user input dialog. Then, a set of speech dialog prompts is automatically developed from an annotated transcription corpus for directing user inputs to corresponding final semantic meanings. The statistical semantic system may be a call routing application where the semantic meanings are call routing destinations.06-10-2010
20120173244APPARATUS AND METHOD FOR VOICE COMMAND RECOGNITION BASED ON A COMBINATION OF DIALOG MODELS - Provided are a voice command recognition apparatus and method capable of determining the intention of a voice command input through a voice dialog interface by combining a rule-based dialog model and a statistical dialog model. The voice command recognition apparatus includes a command intention determining unit configured to correct an error in recognizing a voice command of a user, and an application processing unit configured to check whether the final command intention determined in the command intention determining unit comprises the input factors for execution of an application.07-05-2012
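One plausible reading of combining the two dialog models in the entry above is to let exact rule matches answer first and fall back to a statistical scorer for everything else. The rules, intents, and keyword scoring below are an invented toy, not the patent's actual models:

```python
# Hypothetical rule-based model: exact utterance -> intent.
RULES = {"turn on the radio": "radio_on"}

def statistical_intent(utterance):
    # Stand-in for a trained statistical model: crude keyword scoring.
    scores = {"radio_on": 0, "radio_off": 0}
    if "radio" in utterance:
        if "off" in utterance or "stop" in utterance:
            scores["radio_off"] += 1
        else:
            scores["radio_on"] += 1
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def recognize(utterance):
    """Rule-based model answers when it can; the statistical model covers
    paraphrases and recognition variants the rules miss."""
    return RULES.get(utterance) or statistical_intent(utterance)
```

The fallback is where the error-correction idea shows up: an utterance the rigid rules reject can still be assigned the most probable intention.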
20130218574Management and Prioritization of Processing Multiple Requests - Systems and methods are described that utilize an interaction manager to manage interactions (also known as requests or dialogues) from one or more applications. The interactions are managed properly even if multiple applications use different grammars. The interaction manager maintains a priority for each of the interactions, such as via an interaction list, where the priority of the interactions corresponds to the order in which the interactions are to be processed. Interactions are normally processed in the order in which they are received. However, the systems and methods described herein may provide a grace period after processing a first interaction and before processing a second interaction. If a third interaction that is chained to the first interaction is received during this grace period, then the third interaction may be processed before the second interaction.08-22-2013
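The grace-period behavior in the entry above can be sketched with a queue: chained follow-ups arriving within the grace period jump ahead of already-queued interactions. The grace period value, class shape, and chaining representation are assumptions:

```python
import collections

GRACE_PERIOD = 2.0  # seconds; an assumed value, not from the patent

class InteractionManager:
    def __init__(self):
        self.queue = collections.deque()
        self.last_done = None
        self.last_done_time = None

    def submit(self, interaction, chained_to=None, now=0.0):
        """Queue an interaction; one chained to the just-processed
        interaction within the grace period goes to the front."""
        within_grace = (self.last_done_time is not None
                        and now - self.last_done_time <= GRACE_PERIOD)
        if chained_to is not None and chained_to == self.last_done and within_grace:
            self.queue.appendleft(interaction)
        else:
            self.queue.append(interaction)

    def process_next(self, now=0.0):
        if not self.queue:
            return None
        item = self.queue.popleft()
        self.last_done, self.last_done_time = item, now
        return item

mgr = InteractionManager()
```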
20130218575AUDIO INPUT APPARATUS, COMMUNICATION APPARATUS AND CONDITION NOTIFICATION METHOD - The usage condition of a simplex communication apparatus is indicated with a light-emitting device attached to the communication apparatus. It is determined whether the communication mode of the simplex communication apparatus is a transmission mode or a standby mode. A sound pick-up state of a sound carried by a speech signal to be transmitted is determined if the communication mode is the transmission mode. The light-emitting device is controlled so that it is turned off, turned on, or repeatedly turned on and off based on the results of the communication-mode determination and the sound pick-up state determination.08-22-2013
20090204409Voice Interface and Search for Electronic Devices including Bluetooth Headsets and Remote Systems - Systems and methods for improving the interaction between a user and a small electronic device such as a Bluetooth headset are described. The use of a voice user interface in electronic devices may be used. In one embodiment, recognition processing limitations of some devices are overcome by employing speech synthesizers and recognizers in series where one electronic device responds to simple audio commands and sends audio requests to a remote device with more significant recognition analysis capability. Embodiments of the present invention may include systems and methods for utilizing speech recognizers and synthesizers in series to provide simple, reliable, and hands-free interfaces with users.08-13-2009
20090319276Voice Enabled Remote Control for a Set-Top Box - A remote control device includes a digital audio storage device, a talk button, and an optical distance measurer. The digital audio storage device is configured to continually record an audio input for a specific amount of time. The talk button is coupled to the digital audio storage device and is configured to initiate a transmission of the audio input to a set-top box device. The optical distance measurer is coupled to the talk button and is configured to automatically measure a distance to a user in response to the talk button being pressed.12-24-2009
20090150159Voice Searching for Media Files - A consumer electronic device has a controller, a speech processing circuit, and a memory to store media files such as audio or video files. The device allows the user to use his or her voice to fast-forward or rewind through the media file to a desired position. Particularly, the device searches one or more selected media file for an audible sound such as a keyword or phrase uttered by the user. If the device locates the audible sound, the device renders the media file having the audible sound starting from that position.06-11-2009
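Locating an uttered keyword inside a media file, as the entry above describes, presupposes some time-aligned index of the audio. Assuming a transcript of (start-time, word) pairs produced by earlier speech recognition (an invented representation), the search step itself is simple:

```python
def find_playback_position(transcript, keyword):
    """transcript: list of (start_seconds, word) pairs; returns the start
    time of the first occurrence of keyword, or None if it is absent."""
    keyword = keyword.lower()
    for start, word in transcript:
        if word.lower() == keyword:
            return start
    return None

# Toy transcript of a media file, times in seconds.
transcript = [(0.0, "welcome"), (1.2, "to"), (1.5, "the"), (1.8, "show")]
```

The device would then resume rendering the media file from the returned offset, which is what makes the voice-driven fast-forward/rewind possible.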
20090112604Automatically Generating Interactive Learning Applications - Systems and methods are described for generating an interactive voice response (IVR) application from a state transition table and set of extensible markup language templates. Embodiments include representing an interactive student-computer dialog as a state transition table. The target interactive voice and video response (IVVR) markup language is encoded as a discrete set of extensible templates. The dialog states are mapped to IVVR markup language by selecting appropriate extensible templates and instantiating parameterized elements of each template with dialog state constituents. Embodiments organize the extended templates coherently and package the extended templates for deployment on an IVVR delivery platform.04-30-2009
20090112605FREE-SPEECH COMMAND CLASSIFICATION FOR CAR NAVIGATION SYSTEM - The present invention provides a system and method for associating freeform speech commands with one or more predefined commands from a set of predefined commands. The set of predefined commands is stored, and alternate forms associated with each predefined command are retrieved from an external data source. The external data source receives the alternate forms associated with each predefined command from multiple sources, so the alternate forms represent paraphrases of the predefined command. A representation including words from the predefined command and the alternate forms of the predefined command, such as a vector representation, is generated for each predefined command. A similarity value between received speech data and each representation of a predefined command is computed, and the speech data is classified as the predefined command whose representation has the highest similarity value to the speech data.04-30-2009
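The vector representation and highest-similarity classification described above can be sketched with bag-of-words vectors and cosine similarity. The example commands and paraphrase strings are invented for illustration:

```python
import math
from collections import Counter

def vectorize(text):
    # Bag-of-words vector: term -> count.
    return Counter(w.lower() for w in text.split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify(speech, commands):
    """commands: predefined command name -> its words plus paraphrases.
    Returns the command whose representation is most similar to the speech."""
    sv = vectorize(speech)
    return max(commands, key=lambda name: cosine(sv, vectorize(commands[name])))

COMMANDS = {
    "find_gas": "find gas station fuel petrol",
    "go_home": "go home take me home",
}
```

Pooling paraphrases from multiple sources into each command's vector is what lets a freeform utterance like "take me home please" land on the right predefined command despite sharing no words with the command's canonical name.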
20090112603CONTROL OF A NON-ACTIVE CHANNEL IN A MULTI-CHANNEL RECEIVER - In one embodiment, a satellite radio receiver is capable of simultaneously processing (i) a first radio channel that is playing on a first speaker and (ii) a second radio channel, different from the first radio channel, that is not playing on the first speaker. The second radio channel can simultaneously be playing on a second speaker, be recorded onto a non-volatile memory, and/or have its processing modified. A user can control the satellite radio receiver using vocal commands, while the first channel is playing on the first speaker. The radio receiver has a microphone connected to a voice-recognition command interpreter that includes an interfering-sound canceller, which reduces sounds interfering with the vocal commands, and a command-recognition module, which recognizes vocal commands and provides a control signal to a multi-channel control processor, which processes and controls the first and second radio channels, received from corresponding decoders connected to a satellite radio receiver antenna.04-30-2009
20100305951Methods And Systems For Resolving The Incompatibility Of Media Items Playable From A Vehicle - A system for monitoring hands-free accessibility of media items for play at a vehicle includes a vehicle entertainment computing system (VECS) configured to receive predetermined rules for voice-activated access of the media items. Violations of the rules are detected based on media item metadata. If a violation is detected, a prompt is outputted. Media items are retrieved and played based on voice-activated requests. One embodiment includes a method for monitoring hands-free accessibility of media items for play at a vehicle. A system for formatting media items for accessibility at a VECS includes a media item incompatibility resolution system (MIIRS) configured to resolve violations of the predetermined rules by receiving additional rules relating to formatting violating media items. The media items are searched and the violations addressed by reformatting the media items for voice-activated access. The media items are outputted to the MIIRS.12-02-2010
20090076827Control of plurality of target systems - A system for controlling or operating a plurality of target systems via spoken commands is provided. The system includes a first plurality of target systems, a second plurality of controllers for controlling or operating target systems via spoken commands, and a speech recognition system that stores interface information specific to a target system or a group of target systems to be controlled or operated. A first controller in the second plurality of controllers includes a microphone for picking up audible signals in the vicinity of the first controller and a device for transmitting the audible signals to a speech recognition system. The speech recognition system is operable to analyze the interface information to recognize spoken commands issued for controlling or operating said target system.03-19-2009
20100332234Dynamically Extending The Speech Prompts Of A Multimodal Application - Dynamically extending the speech prompts of a multimodal application including receiving, by the prompt generation engine, a media file having a metadata container; retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and modifying, by the prompt generation engine, the multimodal application to include the speech prompt.12-30-2010
20100332236VOICE-TRIGGERED OPERATION OF ELECTRONIC DEVICES - A system and method for operating features of telecommunications, audio headsets, speakers, and other communications and electronic devices, such as mobile telephones, personal digital assistants and cameras, using voice-activated, voice-triggered or voice-enabled operation. In accordance with an embodiment, the electronic device is capable of operating in an idle mode, in which the device listens for verbal commands from a user. When the user speaks or otherwise issues a command, the device recognizes the command and responds accordingly, including, depending on the context in which the command is issued, following a series of prompts to guide the user through operating one or more features of the device, such as accessing menus or other features. In accordance with an embodiment, this allows the user to operate the device in a hands-free mode if desired.12-30-2010
20110029315VOICE DIRECTED SYSTEM AND METHOD FOR MESSAGING TO MULTIPLE RECIPIENTS - A method for sending messages in a voice-enabled system and a voice-enabled system to communicate a message are provided. The method comprises generating a message with a message generating device, analyzing the message to determine a voice-enabled device to send the message, and determining whether the voice-enabled device is available to receive the message. The method further comprises sending the message to the voice-enabled device in response to determining that the voice-enabled device is available to receive the message and, in response to determining that the voice-enabled device is not available, escalating the message based on an escalation protocol.02-03-2011
20090210232LAYERED PROMPTING: SELF-CALIBRATING INSTRUCTIONAL PROMPTING FOR VERBAL INTERFACES - A plurality of prompting layers configured to provide varying levels of detailed assistance in prompting a user are maintained. A prompt from a current prompting layer is presented to a user. Input is received from the user. A level of detail in prompting the user is adaptively changed based on user behavior. Upon the user making a hesitant verbal gesture that reaches a threshold duration, a transition is made from the current prompting layer to a more detailed prompting layer. Upon the user interrupting the prompt with a valid input, a transition is made from the current prompting layer to a less detailed prompting layer.08-20-2009
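The layer transitions described in the abstract above (more detail after a hesitation past a threshold, less detail after a valid barge-in) can be sketched as a small state machine. The layer texts and the threshold value below are invented for illustration and are not from the patent.

```python
class LayeredPrompter:
    """Sketch of layered prompting: move to a more detailed layer when the
    user hesitates past a threshold, and to a less detailed layer when the
    user interrupts the prompt with a valid input."""

    def __init__(self, layers, hesitation_threshold=2.0):
        self.layers = layers            # ordered least to most detailed
        self.level = 0                  # index of the current prompting layer
        self.threshold = hesitation_threshold

    def prompt(self):
        return self.layers[self.level]

    def on_hesitation(self, seconds):
        if seconds >= self.threshold and self.level < len(self.layers) - 1:
            self.level += 1             # offer more detailed assistance

    def on_valid_interrupt(self):
        if self.level > 0:
            self.level -= 1             # user is confident; trim the prompt

layers = ["City?",
          "Please say the city name.",
          "Say the name of the city you want, for example 'Boston'."]
p = LayeredPrompter(layers)
p.on_hesitation(3.5)
print(p.prompt())  # → Please say the city name.
```

Self-calibration falls out of the two event handlers: the system drifts toward whichever level of detail the user's behavior indicates.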
20110246204IMAGE DISPLAY DEVICE AND METHOD THEREOF - An image display device includes a display unit, a storage unit, a voice receiving unit and a processing unit. The storage unit stores a plurality of image data, a plurality of voice data and a plurality of image files, wherein each of the image data corresponds to one of the voice data. The voice receiving unit receives a current voice. The processing unit judges whether the current voice is similar to one of the voice data, so as to determine one image data corresponding to the current voice. When the current voice is similar to one of the voice data, the processing unit determines whether each of the image files contains the image data corresponding to the current voice and then displays the image file(s), which contain the image data corresponding to the current voice, on the display unit.10-06-2011
20090313026CONVERSATIONAL COMPUTING VIA CONVERSATIONAL VIRTUAL MACHINE - A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI).12-17-2009
20100063823METHOD AND SYSTEM FOR GENERATING DIALOGUE MANAGERS WITH DIVERSIFIED DIALOGUE ACTS - A method to generate dialogue manager (DM) is provided, in which a plurality DMs with the same purpose but having different dialogue acts is automatically generated according to a DM designed by a designer. An automatic aiding tool facilitates the design of a dialogue flow and the adjustment of DM rules, and also helps a system designer to find out potential problems in the original DM. The method adopts the current DM combined with a user simulation technique and further employs a specially designed scoring function, so as to automatically generate a plurality of new DMs. The new DMs achieve the same dialogue purpose as the original DM, but differ from the original DM in system acts and responses during the dialogue process. The dialogue flow of the dialogue system is enhanced, and meanwhile, the design and improvement of the DM are also accelerated.03-11-2010
20100131280VOICE RECOGNITION SYSTEM FOR MEDICAL DEVICES - A system for transmitting voice commands to a medical device for carrying out those commands by the medical device. The system includes a remote control device that receives the voice commands from the caregiver and recognizes the caregiver as being authorized to give such commands. The recognized commands are then analyzed to determine the particular command, and the signals representing that command are transmitted in digital form by a wireless protocol, such as a ZigBee wireless protocol, to a receiving module incorporated into or in communication with the medical device. The receiving module decodes the wireless protocol, identifies the particular command, and interfaces that command to the patient device, whereby the command effects the operation of the patient device, such as by silencing an alarm on the medical device.05-27-2010
20100057470SYSTEM AND METHOD FOR VOICE-ENABLED MEDIA CONTENT SELECTION ON MOBILE DEVICES - A system for voice-enabled location and execution for playback of media content selections stored on a media content playback device has voice input circuitry for inputting voice-based commands into the playback device; codec circuitry for converting voice input from analog content to digital content for speech recognition and for converting voice-located media content to analog content for playback; and a media content synchronization device for maintaining at least one grammar list of names representing media content selections in a current state according to what is currently stored and available for playback on the playback device.03-04-2010
20090216538Method for Interacting With Users of Speech Recognition Systems - A computer implemented method facilitates a user interaction via a speech-based user interface. The method acquires spoken input from a user in a form of a phrase of one or more words. It further determines, using a plurality of different domains, whether the phrase is a query or a command. If the phrase is a query, the method retrieves and presents relevant items from a plurality of databases. If the phrase is a command, the method performs an operation.08-27-2009
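A toy illustration of the query-versus-command split described above: the patent determines the phrase type using a plurality of different domains, while this sketch substitutes a single hypothetical verb lexicon for those domains.

```python
COMMAND_VERBS = {"play", "stop", "call", "open"}   # hypothetical lexicon

def interpret(phrase):
    """Classify a spoken phrase as a 'command' or a 'query' (toy heuristic:
    a phrase starting with a known action verb is treated as a command)."""
    first = phrase.lower().split()[0]
    return "command" if first in COMMAND_VERBS else "query"

print(interpret("play some jazz"))      # → command
print(interpret("who sang this song"))  # → query
```

In the patented method a query would then trigger retrieval from the databases, and a command would trigger the corresponding operation.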
20090216539IMAGE CAPTURING DEVICE - An image capturing device includes a digital signal processor for processing an image captured by an imaging sensor, a display unit for displaying the image, a storage unit for storing the image and preset voice samples, and a voice processing unit for picking up sound waves and converting the sound waves into text information. Each voice sample represents a category. In a first operation mode, the digital signal processor assigns the image to the category if the text information approximately matches one of the voice samples, or establishes a new category and assigns the images to the new category if the text information does not match any of the voice samples. In a second operation mode, the digital signal processor causes the image in the category corresponding to the text information to be displayed by the display unit in a slideshow fashion or a thumbnail fashion.08-27-2009
20110099017System and method for interactive communication with a media device user such as a television viewer - A personalized television or internet video viewing environment, where the user can respond to messages. Messages are received over the internet and overlaid onto the video program. A light and vibrator on the remote control alert the viewer to respond by speaking into a microphone in the remote control unit. Voice recognition techniques are used to interpret the user's response, and biometric voice analysis can be used to identify the user. Successive interactions can be related and tailored to the particular user.04-28-2011
20100057469METHOD AND SYSTEM FOR ORDERING CONTENT USING A VOICE MENU SYSTEM - A method and system for ordering content includes a voice menu system and a phone device communicating a phone signal to the voice menu system. The voice menu system determines the phone number associated with the phone device through the phone signal and generates a voice prompt for recording a content selection from the voice menu system. The phone device selects a recording content option. The voice menu system generates prompts for determining a content title. The phone device selects a content title by communicating a selection signal to the voice menu system. The voice menu system enables a content recording at a recording device in response to the selection signal.03-04-2010
20100057468BINARY-CACHING FOR XML DOCUMENTS WITH EMBEDDED EXECUTABLE CODE - A method, system and voice browser execute voice applications to perform a voice-based function. A document is retrieved and parsed to create a parse tree. Script code is created from the parse tree, thereby consuming part of the parse tree to create a reduced parse tree. The reduced parse tree is stored in a cache for subsequent execution to perform the voice-based function.03-04-2010
20120303374APPARATUS AND METHOD FOR TRANSMITTING VIDEO DATA FROM MOBILE COMMUNICATION TERMINAL - A mobile terminal includes an input unit receiving an input; a data storage unit storing data; a communication unit communicating signals; and a controller. The controller is configured to receive a selection input of a video data, the selection input being processed to select the video data among a plurality of video data stored in the data storage unit; temporarily store a selected portion of the video data for transmission based on a start position and a stop position specifying the selected portion in the video data; automatically attach the selected portion of the video data for transmission to a message without receiving any further user input when the selected portion of the video data is specified; transmit the message with the selected portion of the video data; and delete the selected portion of the video data from the data storage unit when the transmission of the message is completed.11-29-2012
20120303373ELECTRONIC APPARATUS AND METHOD FOR CONTROLLING THE ELECTRONIC APPARATUS USING VOICE - An electronic apparatus includes a microphone, a processor, a motherboard, and a voice recognition microchip. The voice recognition microchip compares a voice command with a pre-stored voice command. If the voice command is identical with the pre-stored voice command, the processor outputs a control signal to the motherboard. The motherboard controls the electronic apparatus to perform an action corresponding to the control signal.11-29-2012
20110060592IPTV SYSTEM AND SERVICE METHOD USING VOICE INTERFACE - Provided is an IPTV system using voice interface which includes a voice input device, a voice processing device, a query processing and content search device, and a content providing device. The voice processing device performs voice recognition to convert voice into a text. The voice processing device includes a voice preprocessing unit, a sound model database, a language model database, and a decoder. The voice preprocessing unit performs preprocessing which includes improving the quality of sound or removing noise for the received voice, and extracts a feature vector. The decoder converts the feature vector into a text by using a sound model and a language model. Moreover, the voice processing device stores the profile and preference of a user to provide personalized service. Because the result of voice recognition is updated in a sound model database and a user profile database each time service is provided to a user, the performance of voice recognition and the performance of personalized service can continuously be improved.03-10-2011
20110029316SPEECH RECOGNITION SYSTEM AND METHOD - According to the present invention, a method for integrating processes with a multi-faceted human centered interface is provided. The interface implements a hands-free, voice-driven environment to control processes and applications. A natural language model is used to parse voice initiated commands and data, and to route those voice initiated inputs to the required applications or processes. The use of an intelligent context based parser allows the system to intelligently determine what processes are required to complete a task which is initiated using natural language. A single window environment provides an interface which is comfortable to the user by preventing distracting windows from appearing. The single window has a plurality of facets which allow distinct viewing areas. Each facet has an independent process routing its outputs thereto. As other processes are activated, each facet can reshape itself to bring a new process into one of the viewing areas. All activated processes are executed simultaneously to provide true multitasking.02-03-2011
20100280829Photo Management Using Expression-Based Voice Commands - A system and method are provided for photo management using expression-based voice commands. The method interfaces a photo-image discovery device, having no dedicated display, to a display monitor. Expression-based user voice prompts are received and used to access a photo-image in storage at a storage site. The accessed photo-image is then presented on the display monitor. The photo-image in storage at the storage site can be accessed to perform an operation such as: selecting a storage site, selecting a photo-image, transforming a selected photo-image, converting a file format of a selected photo-image, and selecting a delivery option. In one aspect, a menu of photo-image user prompt options is presented on the display monitor, originating from the photo discovery device, and the expression-based user voice prompts are received in response to the presented menu.11-04-2010
20090006101Method to detect and assist user intentions with real time visual feedback based on interaction language constraints and pattern recognition of sensory features - A language model back-off system can be used with a user interface employing one or more language models to constrain navigation of selectable user interface input components. A user input interpretation module receives user input and interprets the user input to determine if a selection is made of one or more user interface input components. If a selection is not made, the user input interpretation module determines whether conditions are met for backing off one or more language models employed to constrain navigation of the user interface input components. If the conditions are met, a language model back-off module backs off the one or more language models.01-01-2009
20120203559ACTIVATING FUNCTIONS IN PROCESSING DEVICES USING START CODES EMBEDDED IN AUDIO - Apparatus, system and method for performing an action such as accessing supplementary data and/or executing software on a device capable of receiving multimedia are disclosed. After multimedia is received, a monitoring code is detected and a signature is extracted in response thereto from an audio portion of the multimedia. The ancillary code includes a plurality of code symbols arranged in a plurality of layers in a predetermined time period, and the signature is extracted from features of the audio of the multimedia. Supplementary data is accessed and/or software is executed using the detected code and/or signature.08-09-2012
20090248419SPEECH RECOGNITION ADJUSTMENT BASED ON MANUAL INTERACTION - A method of operating a speech recognition system on a vehicle having a visual display and manually-operated input device that includes initiating a speech recognition system, controlling menu selections on a visual display using a manually-operated input device, receiving a notification from the manually-operated input device indicating that the user is manipulating the device in conjunction with the menu selections on the visual display, and adjusting operation of the speech recognition system based on input received by the manually-operated input device.10-01-2009
20080243517SPEECH BOOKMARKS IN A VOICE USER INTERFACE USING A SPEECH RECOGNITION ENGINE AND ACOUSTICALLY GENERATED BASEFORMS - A system and method for navigating a dialog hierarchy from a voice user interface (VUI) using speech bookmarks. The method can detect a user spoken command for bookmarking a location within a dialog hierarchy of a voice response system. A user spoken bookmark can be received, which is added to a personalized bookmark grammar that is associated with the user who spoke the bookmark name. A database record can be used to associate the new bookmark name with a location within the dialog hierarchy. During a subsequent interaction between the user and the voice response system, the user can speak the bookmark name, which results in a match being detected between the spoken phrase and the personalized bookmark grammar. The voice response system can then navigate to the location within the dialog hierarchy that is associated with the speech bookmark.10-02-2008
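The bookmark mechanism above amounts to a per-user mapping from spoken bookmark names to dialog-hierarchy locations. A minimal sketch follows; the user names and location paths are hypothetical, and a real system would match spoken phrases through a recognition grammar rather than exact strings.

```python
class BookmarkGrammar:
    """Sketch of per-user speech bookmarks: each user's spoken bookmark
    names form a personalized grammar mapped to dialog-hierarchy locations."""

    def __init__(self):
        self._grammars = {}   # user -> {bookmark name -> dialog location}

    def add(self, user, name, location):
        """Add a spoken bookmark name to the user's personalized grammar."""
        self._grammars.setdefault(user, {})[name] = location

    def navigate(self, user, spoken):
        """Match a spoken phrase against the user's grammar; return the
        associated dialog location, or None when nothing matches."""
        return self._grammars.get(user, {}).get(spoken)

g = BookmarkGrammar()
g.add("alice", "my balance", "/account/balance")
print(g.navigate("alice", "my balance"))  # → /account/balance
```

Keeping the grammars per user mirrors the abstract: one caller's bookmark names never collide with another's.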
20090106029VOICE ACQUISITION SYSTEM FOR A VEHICLE - A voice acquisition system for a vehicle includes an interior rearview mirror assembly. The mirror assembly may include a microphone for receiving audio signals within a cabin of the vehicle and generating an output indicative of these audio signals. The microphone may provide sound capture for a hands free cell phone system, an audio recording system and/or an emergency communication system. The system may include a control that is responsive to the output from the microphone and that distinguishes vocal signals from non-vocal signals present in the output. The microphone may provide sound capture for at least one accessory of the equipped vehicle, and the accessory may be responsive to a vocal signal captured by the microphone. The interior rearview mirror assembly may include at least one accessory, such as an antenna, a video device, a security system status indicator, a tire pressure indicator display and/or a loudspeaker.04-23-2009
20080215337SYSTEM, METHOD AND COMPUTER PROGRAM PRODUCT FOR ADDING VOICE ACTIVATION AND VOICE CONTROL TO A MEDIA PLAYER - A media player system, method and computer program product are provided. In use, an utterance is received. A command for a media player is then generated based on the utterance. Such command is utilized for providing wireless control of the media player.09-04-2008
20120150546APPLICATION STARTING SYSTEM AND METHOD - A computing device and method start applications via voice commands. The computing device records a sound input by a microphone of the computing device and sends the recorded sound input to a sound sensor of the computing device. Furthermore, the computing device reads a voice command by an embedded controller of the computing device from the sound sensor, in response to a determination that the recorded sound input matches a predetermined verbal statement of the voice command. The computing device notifies an operating system of the computing device to start the application corresponding to the voice command.06-14-2012
20080255851Speech-Enabled Content Navigation And Control Of A Distributed Multimodal Browser - Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.10-16-2008
20080208588Invoking Tapered Prompts In A Multimodal Application - Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.08-28-2008
20110054909LOCALIZING THE POSITION OF A SOURCE OF A VOICE SIGNAL - The invention relates to localizing the position of a person speaking by using pictures of a pattern.03-03-2011
20110054908IMAGE PROCESSING SYSTEM, IMAGE PROCESSING APPARATUS AND INFORMATION PROCESSING APPARATUS - An image processing system includes an information processing apparatus and an image processing apparatus connected to each other via a network. The information processing apparatus has an application installed thereon to give a new function to the image processing apparatus. The image processing apparatus transmits to the information processing apparatus, voice data obtained by a microphone of the image processing apparatus and data set via an operation screen customized according to the application. The information processing apparatus determines answer information indicating an action to be taken by the image processing apparatus, based on the received voice data, a dictionary owned by the application and the data set via the operation screen, and then transmits the determined answer information to the image processing apparatus. The image processing apparatus takes an action according to the answer information received therefrom.03-03-2011
20110054907AUDIO INTERFACE UNIT FOR SUPPORTING NETWORK SERVICES - Techniques for providing network services at an audio interface unit include determining, based on spoken sounds of a user of an apparatus received at a microphone of the apparatus, whether to present audio data received from a different apparatus. If it is determined to present the received audio data, then presentation of the received audio data at a speaker of the apparatus is initiated. In some embodiments, an apparatus includes a data communications bus; and logic encoded in one or more tangible media configured to perform the above steps. In some embodiments, the apparatus does not include a visual display and does not include a keypad of multiple buttons.03-03-2011
20130197914VOICE ACTIVATED AUDIO CONTROL SYSTEM AND ASSOCIATED METHOD OF USE - A voice activated system for operating electronic devices in an environment includes a microphone for receiving a verbal command that requests the addition of a new voice command. A first processor, electrically connected to the microphone, receives a customized command input regarding a preexisting user of the voice activated system that should be associated with the new verbal command, input involving a new verbal command, and input involving a system command. The first processor is then able to receive verbal input to recognize a user and a verbal command, determine an associated action and an appropriate command for that action, and generate an associated system command. A second processor, in electronic communication with the first processor and two or more electronic devices in the environment, is capable of receiving the system command and operating the two or more devices.08-01-2013
20100292991METHOD FOR CONTROLLING GAME SYSTEM BY SPEECH AND GAME SYSTEM THEREOF - Embodiments of the present invention provide a method for controlling a game system by speech and a game system thereof. The method includes collecting a speech command and storing the speech command in association with a game command; receiving a speech command from a user during a game; searching for a game command associated with the speech command; and controlling the game system using the game command found. The game system includes a speech collecting module, an associated storage module, a speech command recognizing module and a game controlling module. The present invention can implement control of a game system using speech.11-18-2010
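The collect/store/search/control flow in the abstract above reduces to an associative table from recognized phrases to game commands. A minimal sketch, with invented phrases and game-command names:

```python
class SpeechGameController:
    """Minimal sketch: collected speech commands are stored in association
    with game commands, then looked up when the user speaks during play."""

    def __init__(self):
        self._table = {}   # recognized phrase -> game command

    def collect(self, phrase, game_command):
        """Store a speech command in association with a game command."""
        self._table[phrase] = game_command

    def control(self, recognized_phrase):
        """Search for the game command associated with the speech command;
        return None when no association exists."""
        return self._table.get(recognized_phrase)

c = SpeechGameController()
c.collect("jump", "PLAYER_JUMP")
c.collect("open map", "SHOW_MAP")
print(c.control("open map"))  # → SHOW_MAP
```

The speech-recognition step itself is out of scope here; this sketch starts from the recognizer's text output.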
20110119063REMOTE NOTIFICATION SYSTEM AND METHOD AND INTELLIGENT AGENT THEREFOR - The invention relates to remote access systems and methods using automatic speech recognition to access a computer system. The invention also relates to an intelligent agent resident on the computer system for facilitating remote access to, and receipt of, information on the computer system through speech recognition or text-to-speech read-back. The remote access systems and methods can be used by a user of the computer system while traveling. The user can dial into a server system which is configured to interact with the user by automatic speech recognition and text-to-speech conversion. The server system establishes a connection to an intelligent agent running on the user's remotely located computer system by packet communication over a public network. The intelligent agent sources information on the user's computer system or a network accessible to the computer system, processes the information and transmits it to the server system over the public network. The server system converts the information into speech signals and transmits the speech signals to a telephone operated by the user.05-19-2011
20110119062Voice-recognition/voice-activated vehicle signal system - A control system is operable within a host vehicle to control the operation of signaling apparatus indicative of a driver intent to execute right, left or U-turn actions. The control system includes a voice recognition circuit for activating turn signal devices within the vehicle. In some embodiments, a wireless link facilitates aftermarket applications while in other embodiments original equipment manufacture is accommodated.05-19-2011
20110125503METHODS AND SYSTEMS FOR UTILIZING VOICE COMMANDS ONBOARD AN AIRCRAFT - Methods and systems are provided for utilizing audio commands onboard an aircraft. A method comprises identifying a flight phase for the aircraft, resulting in an identified flight phase, receiving an audio input, resulting in received audio input, filtering the received audio input in a manner that is influenced by the identified flight phase for the aircraft, resulting in filtered audio input, and validating the filtered audio input as a first voice command of a first plurality of possible voice commands.05-26-2011
20110137657KITCHEN AND/OR DOMESTIC APPLIANCE - The invention relates to a kitchen and/or domestic appliance comprising input means, which are connected to a voice-recognition system, for acoustic operator commands. The invention is characterised in that means for executing command-dependent actions are provided and that the voice-recognition system is used to identify and check the authorisation of a user.06-09-2011
20090171669Methods and Apparatus for Implementing Distributed Multi-Modal Applications - Embodiments of a system include a client device.07-02-2009
20090171668Recursive Adaptive Interaction Management System - A management system for guiding an agent in a media-specific dialogue has a conversion engine for instantiating ongoing dialogue as machine-readable text, if the dialogue is in voice media, a context analysis engine for determining facts from the text, a rules engine for asserting rules based on fact input, and a presentation engine for presenting information to the agent to guide the agent in the dialogue. The context analysis engine passes determined facts to the rules engine, which selects and asserts to the presentation engine rules based on the facts, and the presentation engine provides periodically updated guidance to the agent based on the rules asserted.07-02-2009
20100121645OPERATING DEVICE FOR A MOTOR VEHICLE - In a method for the operator control of a motor vehicle having a display for displaying variable information and having a microphone, the viewing direction of an operator of the motor vehicle is ascertained, it is checked whether the viewing direction of the operator is aimed toward the display, and information assigned to an acoustic command is shown on the display when a corresponding acoustic command is given while the viewing direction of the operator is aimed toward the display.05-13-2010
20090299751Robot apparatus and method for registering shortcut command thereof - A robot apparatus including an input unit to receive a voice command from a user, a determination unit to determine whether a voice command is repeated a predetermined number of times, and a control unit to register a shortcut command to shorten a voice command if it is determined a voice command is repeated a predetermined number of times. A shortcut command to shorten a voice command of a user is generated, and thus user convenience is enhanced.12-03-2009
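The shortcut-registration logic this abstract describes (register a shortened command once the full voice command has been repeated a predetermined number of times) can be sketched as follows; the threshold value and the shortening rule (first word of the command) are illustrative assumptions, not taken from the patent:

```python
class ShortcutRegistrar:
    """Registers a shortcut once a full voice command has been
    repeated a predetermined number of times."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = {}     # full command -> repetition count
        self.shortcuts = {}  # full command -> registered shortcut

    def observe(self, command):
        """Count a recognized command; register a shortcut when the
        repetition threshold is reached. Returns the shortcut if one
        is (or was already) registered, else None."""
        if command in self.shortcuts:
            return self.shortcuts[command]
        self.counts[command] = self.counts.get(command, 0) + 1
        if self.counts[command] >= self.threshold:
            # Hypothetical shortening rule: first word of the command.
            self.shortcuts[command] = command.split()[0]
            return self.shortcuts[command]
        return None
```

After the third identical command, `observe` returns the shortcut and keeps returning it on later calls, so the robot can accept either the full phrase or the short form.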
20110191109METHOD OF CONTROLLING A SYSTEM AND SIGNAL PROCESSING SYSTEM - A method of controlling a system which includes the steps of obtaining at least one signal representative of information communicated by a user via an input device in an environment of the user, wherein a signal from a first source is available in a perceptible form in the environment; estimating at least a point in time when a transition between information flowing from the first source and information flowing from the user is expected to occur; and timing the performance of a function by the system in relation to the estimated time.08-04-2011
20100017212TURN-TAKING MODEL - A method is claimed for managing interactive dialog between a machine and a user. In one embodiment, an interaction between the machine and the user is managed in response to a timing position of possible speech onset from the user. In another embodiment, the interaction between the machine and the user is dependent upon the timing of a recognition result, which is relative to a cessation of a verbalization of a desired sequence from the machine. In another embodiment, the interaction between the machine and the user is dependent upon a recognition result and whether the desired sequence was ceased or not ceased.01-21-2010
20090099849Voice input system, interactive-type robot, voice input method, and voice input program - A first voice input system according to the present invention includes a voice input unit.04-16-2009
20110307260MULTI-MODAL GENDER RECOGNITION - Gender recognition is performed using two or more modalities. For example, depth image data and one or more types of data other than depth image data is received. The data pertains to a person. The different types of data are fused together to automatically determine gender of the person. A computing system can subsequently interact with the person based on the determination of gender.12-15-2011
20120041766VOICE-CONTROLLED NAVIGATION DEVICE AND METHOD - In a voice-controlled navigation device and method, a voice command is received, and divided into voice segments Vi (i=1˜n) by comparing with one or more keywords. A voice segment Vi (i=1˜n) is obtained in sequence to be compared with tree nodes in a search tree of place names. A weight value of each tree node is computed according to a comparison, to select one or more tree nodes whose weight values are greater than a predetermined value. Routes formed by all the selected tree nodes are obtained to select a route whose total weight value is the greatest. A navigation to a destination is given by indicating the selected route on an electronic map according to place names represented by the tree nodes of the selected route.02-16-2012
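The weighted tree search in this abstract (compare each voice segment against tree nodes of place names, keep nodes whose weight exceeds a threshold, and pick the route with the greatest total weight) can be sketched as below; the similarity measure (`difflib` string ratio) and the nested-dict tree representation are assumptions for illustration:

```python
from difflib import SequenceMatcher

def match_route(segments, tree, min_weight=0.6):
    """Return the route (list of place names) with the greatest total
    weight. `segments` are the voice segments Vi in order; `tree` maps
    each place name to its subtree of more specific place names."""
    def weight(a, b):
        # Similarity of a voice segment to a tree-node name, in [0, 1].
        return SequenceMatcher(None, a, b).ratio()

    def search(level, i):
        if i == len(segments) or not level:
            return 0.0, []
        best = (0.0, [])
        for name, subtree in level.items():
            w = weight(segments[i], name)
            if w >= min_weight:  # keep only nodes above the threshold
                sub_w, sub_route = search(subtree, i + 1)
                if w + sub_w > best[0]:
                    best = (w + sub_w, [name] + sub_route)
        return best

    return search(tree, 0)[1]
```

Because matching is by weight rather than exact equality, a slightly misrecognized segment can still select the intended branch.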
20110022396METHOD, SYSTEM AND USER INTERFACE FOR AUTOMATICALLY CREATING AN ATMOSPHERE, PARTICULARLY A LIGHTING ATMOSPHERE, BASED ON A KEYWORD INPUT - The invention relates to the automatic creation of an atmosphere, particularly a lighting atmosphere, based on a keyword input such as a keyword typed or spoken by a user. A basic idea of the invention is to enable a user of an atmosphere creation system such as a lighting system to automatically create a specific atmosphere by simply using a keyword which is input to the system. The keyword, for example “eat”, “read”, “relax”, “sunny”, “cool”, “party”, “Christmas”, “beach”, may be spoken or typed by the user and may enable the user to find and explore numerous atmospheres in an interactive and playful way in embodiments of the invention. Finding atmosphere elements related to the keyword may be done in various ways according to embodiments of the invention. The invention allows also a non expert in designing or creating atmosphere scenes to control the creation of a desired atmosphere in an atmosphere creation system.01-27-2011
20120046952REMOTE CONTROL SYSTEM AND METHOD - A remote control system includes a receiving and recognition module, a converting module, and a control interface module. The receiving and recognition module is used for receiving a signal from a user and recognizing the signal as a user command associated with an electronic device. The converting module is used for converting the user command into a control command identifiable by the electronic device. The control interface module is used for sending the control command to the electronic device to control the electronic device.02-23-2012
20100169097AUDIBLE LIST TRAVERSAL - Many embodiments may comprise logic such as hardware and/or code to implement a user interface for traversal of long sorted lists, via audible mapping of the lists, using sensor-based gesture recognition, audio and tactile feedback, and button selection while on the go. In several embodiments, such user interface modalities are physically small in size, enabling a user to be truly mobile by reducing the cognitive load required to operate the device. For some embodiments, the user interface may be divided across multiple worn devices, such as a mobile device, watch, earpiece, and ring. Rotation of the watch may be translated into navigation instructions, allowing the user to traverse the list while receiving audio feedback via the earpiece describing items in the list as well as the navigation state. Many embodiments offer the user a simple user interface to traverse the list without visual feedback.07-01-2010
20120046953ESTABLISHING A MULTIMODAL PERSONALITY FOR A MULTIMODAL APPLICATION - Methods, apparatus, and computer program products are described for establishing a multimodal personality for a multimodal application that include selecting, by the multimodal application, matching vocal and visual demeanors and incorporating, by the multimodal application, the matching vocal and visual demeanors as a multimodal personality into the multimodal application.02-23-2012
20120010890POWER-OPTIMIZED WIRELESS COMMUNICATIONS DEVICE - The present invention is an Always On, Hands-free, Speech Activated, Power-optimized Wireless Communications Device with associated base. The unique value of the device is that a person can use the device at any time.01-12-2012
20080319761SPEECH PROCESSING METHOD BASED UPON A REPRESENTATIONAL STATE TRANSFER (REST) ARCHITECTURE THAT USES WEB 2.0 CONCEPTS FOR SPEECH RESOURCE INTERFACES - The present invention discloses a method of performing speech processing operations based upon Web 2.0 type interfaces with speech engines. The method can include a step of interfacing with a Web 2.0 server from a standard browser. A speech-enabled application served by the Web 2.0 server can be accessed. The browser can render markup of the speech-enabled application. Speech input can be received from a user of the browser. A RESTful protocol, such as the ATOM Publishing Protocol (APP), can be utilized to access a remotely located speech engine. The speech engine can accept GET, PUT, POST, and DELETE commands. The speech processing engine can process the speech input and can provide results to the Web 2.0 server. The Web 2.0 server can perform a programmatic action based upon the provided results, which results in different content being presented in the browser.12-25-2008
20110166863RELEASE OF TRANSACTION DATA - For clearing transaction data selected for processing, a release is generated in a portable data carrier.07-07-2011
20120022876Voice Actions on Computing Devices - A computer-implemented method includes receiving spoken input at a computing device from a user of the computing device, the spoken input including a carrier phrase and a subject to which the carrier phrase is directed, providing at least a portion of the spoken input to a server system in audio form for speech-to-text conversion by the server system, the portion including the subject to which the carrier phrase is directed, receiving from the server system instructions for automatically performing an operation on the computing device, the operation including an action defined by the carrier phrase using parameters defined by the subject, and automatically performing the operation on the computing device.01-26-2012
20120022875SYNCHRONIZING VISUAL AND SPEECH EVENTS IN A MULTIMODAL APPLICATION - Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.01-26-2012
20120022874DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA - Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.01-26-2012
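The affinity score this abstract describes is based on how frequently and how recently the user communicated with each contact. One plausible realization sums an exponentially decayed contribution per past interaction, so frequent and recent contacts score highest; the decay form, the half-life, and the `interactions` field are hypothetical choices, not taken from the patent:

```python
import math
import time

def affinity_score(contact, now=None, half_life_days=30.0):
    """Combine communication frequency with recency: each past
    interaction contributes 1, decayed exponentially with its age."""
    now = time.time() if now is None else now
    score = 0.0
    for ts in contact["interactions"]:  # timestamps, seconds since epoch
        age_days = max(0.0, (now - ts) / 86400.0)
        score += math.exp(-math.log(2) * age_days / half_life_days)
    return score

def most_likely_contact(contacts, now=None):
    """Infer which contact the user most probably intends to reach."""
    return max(contacts, key=lambda c: affinity_score(c, now))
```

Normalizing these scores over the candidate contacts would yield the probabilities the abstract mentions for seeding a communication-initiation grammar.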
20120022873Speech Recognition Language Models - Methods, computer program products and systems are described for forming a speech recognition language model. Multiple query-website relationships are determined by identifying websites that are determined to be relevant to queries using one or more search engines. Clusters are identified in the query-website relationships by connecting common queries and connecting common websites. A speech recognition language model is created for a particular website based on at least one of analyzing at queries in a cluster that includes the website or analyzing webpage content of web pages in the cluster that includes the website.01-26-2012
20120158407VOICE CONTROL SYSTEM FOR AN IMPLANT - A system for the control of an implant.06-21-2012
20110077948METHOD AND SYSTEM FOR CONTAINMENT OF USAGE OF LANGUAGE INTERFACES - Client software is modified by a translator to use a unique variant of the linguistic interface of a service. An interceptor pre-processes subsequent client service requests from the translated unique linguistic interface to the standard linguistic interface implemented by the service. Usage of the linguistic interfaces of the service is thus contained, rendering the service incapable of executing arbitrary input, even if such input is crafted specifically for the service interface.03-31-2011
20110077947CONFERENCE BRIDGE SOFTWARE AGENTS - Systems and methods are provided to generate a software agent that is initiated to continue the business process flow during a conference. Upon initiating a teleconference in response to a selection associated with the business process or a predefined rule associated with the business process that requires a conference, an instance of a software agent is instantiated and associated with the teleconference. The software agent may be a sub-process of the conference bridge that conducts the teleconference or a separate process that interacts with the conference bridge as another party to the teleconference. The software agent is initiated with information about the business process step that requires an action or a decision. During the teleconference, the software agent listens for a command from one of the parties and acts on any command given. The commands can send another event or action back to a business process application to continue or complete the business process. The event or action sent back may be based on the commands or a result of an action on the business process step that initiated the conference.03-31-2011
20120316884Wheelchair System Having Voice Activated Menu Navigation And Auditory Feedback - A personal mobility vehicle, such as a wheelchair system, includes an input audio transducer having an output coupled to a speech recognition system and an output audio transducer having an input coupled to a speech synthesis system. The wheelchair system further includes a control unit having a data processor and a memory. The data processor is coupled to the speech recognition system and to the speech synthesis system and is operable in response to a recognized utterance made by a user to present the user with a menu containing wheelchair system functions. The data processor is further configured in response to at least one further recognized utterance made by the user to select from the menu at least one wheelchair system function, to activate the selected function and to provide audible feedback to the user via the speech synthesis system.12-13-2012
20120166204Navigation System and Radio Receiving System - The invention is also directed to a navigation system having an input device for the input of an input scale value, a display device for displaying road map information according to a selected display scale value, and a processor device, wherein the number of enterable input scale values is larger than the number of selectable display scale values.06-28-2012
20120166203System and Method for Mobile Workflow Processing - A system and method of wirelessly serving a work flow protocol to agents for use with respect to subjects. The agents wear headsets, each with a display and a microphone coupled to a portable controller. The work flow protocol causes presentation of queries through the headsets based on a logical tree structure. Data generated by the speech of the agents is received and stored.06-28-2012
20120221341MOTOR-VEHICLE VOICE-CONTROL SYSTEM AND MICROPHONE-SELECTING METHOD THEREFOR - A voice-control system for motor vehicles has a plurality of spaced microphones emitting respective microphone signals, and an evaluation unit connected to the microphones. This unit serves for assembling correlation pairs from the signals of two of the microphones, calculating a correlation coefficient for each correlation pair, detecting an energy value for each microphone, detecting a respective delay time of a voice signal between a voice signal source and the each of the microphones, and selecting in dependence on current correlation coefficients of the correlation pairs, on the current energy values of the microphones, and on the current delay time of the voice signal to the microphones, that microphone whose signal is optimal as a basis for the operation of the voice-control system.08-30-2012
20120130719REMOTE CONTROL SIGNALING USING AUDIO WATERMARKS - A system for using a watermark embedded in an audio signal to remotely control a device. Various devices such as toys, computers, and appliances, equipped with an appropriate detector, detect the hidden signals, which can trigger an action, or change a state of the device. The watermarks can be used with a “time gate” device, where detection of the watermark opens a time interval within which a user is allowed to perform an action, such as pressing a button, typing in an answer, turning a key in a lock, etc.05-24-2012
20100332235INTELLIGENT HOME AUTOMATION - An intelligent home automation system answers questions of a user speaking “natural language” located in a home. The system is connected to, and may carry out the user's commands to control, any circuit, object, or system in the home. The system can answer questions by accessing the Internet. Using a transducer that “hears” human pulses, the system may be able to identify, announce and keep track of anyone entering or staying in the home or participating in a conversation, including announcing their identity in advance. The system may interrupt a conversation to implement specific commands and resume the conversation after implementation. The system may have extensible memory structures for term, phrase, relation and knowledge, question answering routines and a parser analyzer that uses transformational grammar and a modified three hypothesis analysis. The parser analyzer can be dormant unless spoken to. The system has emergency modes for prioritization of commands.12-30-2010
20120215543Adding Speech Capabilities to Existing Computer Applications with Complex Graphical User Interfaces - At design time of a graphical user interface (GUI), a software component (VUIcontroller) is added to the GUI. At run time of the GUI, the VUIcontroller analyzes the GUI from within a process that executes the GUI. From this analysis, the VUIcontroller automatically generates a voice command set, such as a speech-recognition grammar, that corresponds to controls of the GUI. The generated voice command set is made available to a speech recognition engine, thereby speech-enabling the GUI. Optionally, a GUI designer may add properties to ones of the GUI controls at GUI design time, without necessarily writing a voice command set. These properties, if specified, are then used at GUI run time to control or influence the analysis of the GUI and the automatic generation of the voice command set.08-23-2012
20120136668ELEVATOR CONTROL DEVICE - An elevator control device makes call registration by voice recognition by using a microphone outputting a user's voice as a sound signal and includes: an indicator controller that causes an image which specifies one of objects selectable as a call registration object to be displayed; a storage which stores, in advance, the sound signal of a predetermined voice, which is used for designating the call registration object, as a registered sound signal; and a voice recognizing mechanism that compares the sound signal delivered from the microphone with the registered sound signal, and delivers a control signal if these sound signals coincide with each other. The indicator controller outputs registration information, in which the object specified by the image displayed on the indicator is the call registration object, when receiving the control signal sent from the voice recognizing mechanism.05-31-2012
20120136666AUTOMATED PERSONAL ASSISTANCE SYSTEM - An automated personal assistance system employing artificial intelligence technology that includes speech recognition and synthesis, situational awareness, pattern and behavioral recognition, and the ability to learn from the environment. Embodiments of the system include environmental and occupant sensors and environmental actuators interfaced to an assistance controller having the artificial intelligence technology incorporated therein to control the environment of the system. An embodiment of the invention is implemented as a vehicle which reacts to voice commands for movement and operation of the vehicle and detects objects, obstructions, and distances. This invention provides the ability to monitor for the safety of operation and modify dangerous maneuvers, as well as to learn locations in the environment and to automatically find its way to them. The system may also incorporate communication capability to convey patterns of environmental and occupant parameters to a monitoring center.05-31-2012
20120136667VOICE ASSISTANT SYSTEM - Methods and apparatuses to assist a user in the performance of a plurality of tasks are provided. The invention includes storing at least one care plan for a resident, the care plan defining a plurality of tasks to be performed for providing care to the resident. Capturing speech inputs from the user, and providing speech outputs to the user to provide a speech dialog with the user reflective of the care plan. Information is captured with a contactless communication interface and is used for engaging the care plan.05-31-2012
20110184740Integration of Embedded and Network Speech Recognizers - A method, computer program product, and system are provided for performing a voice command on a client device. The method can include translating, using a first speech recognizer located on the client device, an audio stream of a voice command to a first machine-readable voice command and generating a first query result using the first machine-readable voice command to query a client database. In addition, the audio stream can be transmitted to a remote server device that translates the audio stream to a second machine-readable voice command using a second speech recognizer. Further, the method can include receiving a second query result from the remote server device, where the second query result is generated by the remote server device using the second machine-readable voice command and displaying the first query result and the second query result on the client device.07-28-2011
20100174546Sound recognition apparatus of robot and method for controlling the same - Disclosed is a sound recognition apparatus of a robot and a method for controlling the same. The sound recognition apparatus senses sound and determines if the sound is for communication by comparing the sensed sound with a preset reference condition. If the sound is for conversation, the movement of the robot is controlled. The method includes comparing the sound sensed by the robot with a preset reference condition, thereby determining if the sound is for communication with a user. When a conversation is intended, the recognition rate is increased, and the robot is moved according to the intention of communication.07-08-2010
20120173245NAVIGATION SYSTEM - A navigation system is provided which facilitates discrimination between an icon of a facility associated with a route along which the user is expected to travel and an ordinary icon. To achieve this, it includes a destination estimating unit for acquiring information about a driving history and for estimating a destination from the acquired information; a drawing decision changing unit for drawing a destination candidate estimated by the destination estimating unit in a form different from an icon of a non-destination candidate; and an information display unit for causing the icon drawn by the drawing decision changing unit to be displayed.07-05-2012
20120215545ROBUST VOICE BROWSER SYSTEM AND VOICE ACTIVATED DEVICE CONTROLLER - The present invention relates to a system for acquiring information from sources on a network, such as the Internet. A voice browsing system maintains a database containing a list of information sources, such as web sites, connected to a network. Each of the information sources is assigned a rank number which is listed in the database along with the record for the information source. In response to a speech command received from a user, a network interface system accesses the information source with the highest rank number in order to retrieve information requested by the user.08-23-2012
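The ranked lookup this abstract describes (each information source carries a rank number in the database; the source with the highest rank among those matching the spoken request is accessed) can be sketched as below; the record schema (`url`, `topic`, `rank`) and the substring match are assumptions for illustration:

```python
def pick_source(sources, query):
    """Among database records whose topic matches the user's spoken
    request, return the one with the highest rank number (None if
    nothing matches)."""
    matching = [s for s in sources if query.lower() in s["topic"].lower()]
    if not matching:
        return None
    return max(matching, key=lambda s: s["rank"])
```

A network interface layer would then fetch the returned record's URL to retrieve the requested information.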
20120215544COMPUTERIZED INFORMATION PRESENTATION APPARATUS - A computerized information apparatus useful for providing directions and other information to a user. In one embodiment, the apparatus comprises a processor and network interface and computer readable medium having at least one computer program disposed thereon, the at least one program being configured to receive a speech input from the user regarding an organization or entities, and provide a graphic or visual representation of the organization or entity to aid them in finding the organization or entity. At least a portion of the information is obtained via the network interface from a remote server.08-23-2012
20100049527Method and Device for Voice Control of a Device or of a System in a Motor Vehicle - A method for voice controlling of a device or of a system in a motor vehicle, the device or the system being capable of being operated both by voice inputs and also by non-voice inputs, in particular through the actuation of switches and/or buttons and/or a touch screen, and in which the user of the device or system, in particular the driver of the motor vehicle, is alerted optically and/or acoustically and/or haptically that voice operation of the device or system is possible, dependent on the presence or absence of particular predefined conditions. In addition, the present invention relates to a device or system for supporting the voice controlling of a device or system in a motor vehicle, with which this method is able to be executed.02-25-2010
20100049528SYSTEM AND METHOD FOR CUSTOMIZED PROMPTING - A method for providing an audible prompt to a user within a vehicle. The method includes retrieving one or more data files from a memory device. The data files define certain characteristics of an audio prompt. The method also includes creating the audio prompt from the data files and outputting the audio prompt as an audio signal.02-25-2010
20100049529INTEGRATED SYSTEM AND METHOD FOR MOBILE AUDIO PLAYBACK AND DICTATION - A method and system provides for a single-pass review and feedback of a document. During audio playback of the document to be reviewed, voice-activated recording of feedback and submission of feedback relative to the location in the original document are accomplished. This provides for a fully integrated, single pass review and feedback of documentation to occur.02-25-2010
20080215335COMPUTER, DISPLAY CONTROL DEVICE, POINTER POSITION CONTROL METHOD, AND PROGRAM - To provide a pointer position control method and the like for manipulating a pointer more easily. The user moves the pointer P two-dimensionally and performs click and other operations by using only voice, varying the volume and pitch of the produced voice without uttering any specific command. The user moves the pointer P by varying the volume and switches the travel direction of the pointer P by changing the pitch. Also, by ceasing to vary the volume, the user can automatically enter a fine adjustment mode in which fine adjustments can be made. Furthermore, the user can perform a click by abruptly ceasing to produce voice, and can return to normal speech recognition mode by keeping silent.09-04-2008
20120179473SPEECH INTERACTIVE APPARATUS AND COMPUTER PROGRAM PRODUCT - According to an embodiment, a speech interactive apparatus includes an output unit to output a first response; a receiving unit to receive a start instruction of a speech input as a reply to the first response; a response control unit to stop the output of the first response when the start instruction is received while the first response is being output; and a deciding unit to decide on a first determination period, which is used in determining whether a silent state has occurred, based on whether the start instruction is received while the first response is being output or based on the timing of receiving the start instruction. When the input speech is not input during a period starting from the reception of the start instruction till an elapse of the first determination period, the response control unit instructs the output unit to output the first response again.07-12-2012
20120179472ELECTRONIC DEVICE CONTROLLED BY A MOTION AND CONTROLLING METHOD THEREOF - An electronic device is provided. The electronic device includes a motion recognition unit which recognizes motion of an object and a control unit which, if a push motion in which the object located in front of the electronic device is moved in a direction of the electronic device is sensed by the motion recognition unit, activates a motion recognition mode, tracks the motion of the object, and performs a control operation of the electronic device corresponding to a subsequent motion of the object. The control unit may inactivate the motion recognition mode if an end motion in which the motion of the object is in a direction to contact a body part of a user or an additional object is recognized by the motion recognition unit while the motion recognition mode is activated.07-12-2012
20120253825RELEVANCY RECOGNITION FOR CONTEXTUAL QUESTION ANSWERING - Disclosed are systems, methods and computer-readable media for controlling a computing device to provide contextual responses to user inputs. The method comprises receiving a user input, generating a set of features characterizing an association between the user input and a conversation context based on at least a semantic and syntactic analysis of user inputs and system responses, determining with a data-driven machine learning approach whether the user input begins a new topic or is associated with a previous conversation context and if the received question is associated with the existing topic, then generating a response to the user input using information associated with the user input and any previous user input associated with the existing topic.10-04-2012
20120253824METHODS AND SYSTEM OF VOICE CONTROL - This invention relates to a system with different modes of operation or performance that integrates all the key components for the control of most domestic services, such as telephone, lighting and audio/video system, through audio inputs such as words or phrases by a user.10-04-2012
20120259641METHODS AND APPARATUS FOR INITIATING ACTIONS USING A VOICE-CONTROLLED INTERFACE - Methods and apparatus for initiating an action using a voice-controlled human interface. The interface provides a hands free, voice driven environment to control processes and applications. According to one embodiment, a method comprises electronically receiving first user input, parsing the first user input to determine whether the first user input contains a command activation statement that cues a voice-controlled human interface to enter a command mode in which a second user input comprising a voice signal is processed to identify at least one executable command and, in response to determining that the first user input comprises the command activation statement, identifying the at least one executable command in the second user input.10-11-2012
20120226502TELEVISION APPARATUS AND A REMOTE OPERATION APPARATUS - According to one embodiment, a television apparatus includes a speech input unit, an indication input unit, a speech recognition unit, and a control unit. The speech input unit is configured to input a speech. The indication input unit is configured to input an indication from a user to start speech recognition. The speech recognition unit is configured to recognize the user's speech inputted after the indication is inputted. The control unit is configured to execute an operation command corresponding to a recognition result of the user's speech. If the volume of the television apparatus at the time the indication is inputted is larger than or equal to a threshold, the control unit temporarily sets the volume to a value smaller than the threshold while the speech recognition unit is recognizing.09-06-2012
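The volume rule in this abstract (duck the volume below the threshold for the duration of recognition when it starts at or above the threshold, leave it alone otherwise) amounts to a small state machine; the specific threshold and ducked values here are illustrative:

```python
class TvVolumeDucker:
    """Temporarily lower the TV volume while speech recognition runs,
    restoring the original volume afterwards."""

    def __init__(self, volume, threshold=10, ducked=5):
        assert ducked < threshold
        self.volume = volume
        self.threshold = threshold
        self.ducked = ducked
        self._saved = None  # volume to restore, if we ducked

    def start_recognition(self):
        # Duck only when the current volume would drown out the user.
        if self.volume >= self.threshold:
            self._saved = self.volume
            self.volume = self.ducked

    def end_recognition(self):
        if self._saved is not None:
            self.volume = self._saved
            self._saved = None
```

A volume already below the threshold is left untouched, so quiet viewing is not disturbed by the recognition cycle.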
20080300886SYSTEMS AND METHODS OF A STRUCTURED GRAMMAR FOR A SPEECH RECOGNITION COMMAND SYSTEM - In embodiments of the present invention, a system and method for enabling a user to interact with a computer platform using a voice command may comprise the steps of defining a structured grammar for handling a global voice command, defining a global voice command of the structured grammar wherein the global voice command enables access to an object of the computer platform using a single command, and mapping at least one function of the object to the global voice command, wherein upon receiving voice input from the user of the computer platform the object recognizes the global voice command and controls the function.12-04-2008
20120265538VOICE REMOTE CONTROL - A device may include a display and logic. The logic may be configured to receive, from a user, a selection of a first control action associated with an application stored in the device, provide, via the display, a number of choices associated with the first control action, and receive, from the user, a word or a phrase to use as a voice command corresponding to the first control action, wherein the word or phrase is selected from the choices. The logic may also associate the word or phrase with the first control action, receive voice input from the user, identify the voice input as corresponding to the word or phrase, and perform the first control action based on the identified voice input.10-18-2012
20110046962VOICE TRIGGERING CONTROL DEVICE AND METHOD THEREOF - A voice triggering control device for enabling a data collection host mounted on it comprises a processing unit, a speaker, a control module, a power supply module, and a housing containing the elements disclosed above. The control device directs the processing unit to output a high-frequency audio signal corresponding to an act command. The speaker then broadcasts a high-frequency audio generated from the high-frequency audio signal, and the data collection host, upon receiving and decoding the high-frequency audio, is enabled to perform the act command. Triggering the data collection host to perform a functional action through high-frequency audio in this way solves the contact-fault problem of the prior art.02-24-2011
20120323580EDITING TELECOM WEB APPLICATIONS THROUGH A VOICE INTERFACE - Systems and associated methods for editing telecom web applications through a voice interface are described. Systems and methods provide for editing telecom web applications over a connection, as for example accessed via a standard phone, using speech and/or DTMF inputs. The voice based editing includes exposing an editing interface to a user for a telecom web application that is editable, dynamically generating a voice-based interface for a given user for accomplishing editing tasks, and modifying the telecom web application to reflect the editing commands entered by the user.12-20-2012
20120271641METHOD AND APPARATUS FOR EDUTAINMENT SYSTEM CAPABLE FOR INTERACTION BY INTERLOCKING OTHER DEVICES - An apparatus and method provide interactive edutainment through connection of a smart TV and other devices (e.g., a tablet PC, a smart phone, and a projector). The method includes connecting with a control device, and when at least one main story for interactivity is stored, receiving from a user a selection of the main story to be executed through the control device. The method also includes executing the selected main story, and when a control command is received from the control device, processing the control command.10-25-2012
20120271640Implicit Association and Polymorphism Driven Human Machine Interaction - A voice based user-system interaction may take advantage of implicit association and/or polymorphism to achieve smooth and effective discoursing between the user and the voice enabled system. This user-system interaction may occur at a local control unit, at a remote server, or both.10-25-2012
20120271639PERMITTING AUTOMATED SPEECH COMMAND DISCOVERY VIA MANUAL EVENT TO COMMAND MAPPING - An input from a manually initiated action within a computing system can be received. The system can be associated with a speech component. The input can be associated with a system function. The function can be an operation within the computing system and can be linked to a function identifier. The identifier can be translated to command data. The command data can be associated with a command identifier, a command, and an alternative command. The command data can be a speech command registered within the speech component. The command data can be presented within a speech interface responsive to the translating. The speech interface can be associated with the speech component.10-25-2012
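The mapping described above, from a manual action's function identifier to registered speech command data (a command plus an alternative), can be sketched in a few lines. The registry contents and function names below are invented for illustration.

```python
# Sketch of function-identifier-to-speech-command translation for
# command discovery; registry entries are hypothetical.
COMMAND_REGISTRY = {
    "func.save_document": {"command": "save file", "alternative": "save"},
    "func.open_menu": {"command": "open menu", "alternative": "menu"},
}


def translate(function_id: str):
    """Translate a function identifier to its registered command data."""
    return COMMAND_REGISTRY.get(function_id)


def present_speech_hint(function_id: str) -> str:
    """Present the equivalent speech command after a manual action."""
    data = translate(function_id)
    if data is None:
        return "No speech command registered for this action."
    return f'You can also say "{data["command"]}" (or "{data["alternative"]}").'
```

The point of the pattern is discoverability: every manual click doubles as a teaching moment for the spoken equivalent, so users learn the speech vocabulary without consulting documentation.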
20120271643INFERRING SWITCHING CONDITIONS FOR SWITCHING BETWEEN MODALITIES IN A SPEECH APPLICATION ENVIRONMENT EXTENDED FOR INTERACTIVE TEXT EXCHANGES - The disclosed solution includes a method for dynamically switching modalities based upon inferred conditions in a dialogue session involving a speech application. The method establishes a dialogue session between a user and the speech application. During the dialogue session, the user interacts using an original modality and a second modality. The speech application interacts using a speech modality only. A set of conditions indicative of interaction problems using the original modality can be inferred. Responsive to the inferring step, the original modality can be changed to the second modality. A modality transition to the second modality can be transparent to the speech application and can occur without interrupting the dialogue session. The original modality and the second modality can be different modalities; one including a text exchange modality and another including a speech modality.10-25-2012
20120271642ESTABLISHING A MULTIMODAL ADVERTISING PERSONALITY FOR A SPONSOR OF A MULTIMODAL APPLICATION - Establishing a multimodal advertising personality for a sponsor of a multimodal application, including associating one or more vocal demeanors with a sponsor of a multimodal application and presenting a speech portion of the multimodal application for the sponsor using at least one of the vocal demeanors associated with the sponsor.10-25-2012
20120278084METHOD FOR SELECTING ELEMENTS IN TEXTUAL ELECTRONIC LISTS AND FOR OPERATING COMPUTER-IMPLEMENTED PROGRAMS USING NATURAL LANGUAGE COMMANDS - A method for controlling a program by natural language allows a user to efficiently operate a computer-implemented target program through intuitive natural language commands. A list of natural language commands related to the target program is compiled. Each natural language command is stored as an element in an electronic list. Natural language commands generally consist of short sentences comprising at least a predicate (a verb) and an object (a noun). A user can filter the list of natural language commands by entering the initials of a natural language command. The user enters the first character of the first word to be filtered, followed by the first character of the second word to be filtered, and so forth. Filtering by initials very rapidly reduces the number of choices presented to a user and minimizes the number of keystrokes required to select a particular list element.11-01-2012
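The filtering step in the entry above is a well-defined algorithm: the user types the first character of each word of a command ("ow" for "open window"), and the list is narrowed to commands whose word initials match. A minimal Python sketch, with an invented command list:

```python
# Initials-based filtering of a natural language command list.
def initials(command: str) -> str:
    """Return the first character of each word, e.g. 'open window' -> 'ow'."""
    return "".join(word[0] for word in command.lower().split())


def filter_by_initials(commands: list, typed: str) -> list:
    """Keep commands whose word initials start with the typed characters."""
    typed = typed.lower()
    return [c for c in commands if initials(c).startswith(typed)]


# Hypothetical command list for demonstration.
commands = ["open window", "open file", "close window", "copy file"]
```

Each additional keystroke eliminates every command whose next word starts with a different letter, which is why the abstract can claim a very rapid reduction in choices with a minimal number of keystrokes.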
20110276335METHODS FOR SYNCHRONOUS AND ASYNCHRONOUS VOICE-ENABLED CONTENT SELECTION AND CONTENT SYNCHRONIZATION FOR A MOBILE OR FIXED MULTIMEDIA STATION - A system is provided for enabling voice-enabled selection and execution for playback of media files stored on a media content playback device. The system includes a voice input circuitry and speech recognition module for enabling voice input recognizable on the device as one or more voice commands for task performance; a push-to-talk interface for activating the voice input circuitry and speech recognition module; and a media content synchronization device for maintaining synchronization between stored media content selections and at least one list of grammar sets used for speech recognition by the speech recognition module, the grammar set names identifying one or more media content selections currently stored and available for playback on the media content playback device.11-10-2011
20100169098SYSTEM AND METHOD OF A LIST COMMANDS UTILITY FOR A SPEECH RECOGNITION COMMAND SYSTEM - In embodiments of the present invention, a system and computer-implemented method for enabling a user to interact with a mobile device using a voice command may include the steps of defining a structured grammar for generating a global voice command, defining a global voice command of the structured grammar, wherein the global voice command enables access to an object of the mobile device using a single command, and mapping at least one function of the object to the global voice command, wherein upon receiving voice input from the user of the mobile device, the object recognizes the global voice command and controls the function.07-01-2010
20130013320MULTIMODAL AGGREGATING UNIT - In a voice processing system, a multimodal request is received from a plurality of modality input devices, and the requested application is run to provide a user with the feedback of the multimodal request. In the voice processing system, a multimodal aggregating unit is provided which receives a multimodal input from a plurality of modality input devices, and provides an aggregated result to an application control based on the interpretation of the interaction ergonomics of the multimodal input within the temporal constraints of the multimodal input. Thus, the multimodal input from the user is recognized within a temporal window. Interpretation of the interaction ergonomics of the multimodal input include interpretation of interaction biometrics and interaction mechani-metrics, wherein the interaction input of at least one modality may be used to bring meaning to at least one other input of another modality.01-10-2013
20130013318USER INPUT BACK CHANNEL FOR WIRELESS DISPLAYS - As part of a communication session, a wireless source device can transmit audio and video data to a wireless sink device, and the wireless sink device can transmit user input data received at the wireless sink device back to the wireless source device. In this manner, a user of the wireless sink device can control the wireless source device and control the content that is being transmitted from the wireless source device to the wireless sink device. The input data received at the wireless sink device can be a voice command.01-10-2013
20130013319METHODS AND APPARATUS FOR INITIATING ACTIONS USING A VOICE-CONTROLLED INTERFACE - Methods and apparatus for initiating an action using a voice-controlled human interface. The interface provides a hands free, voice driven environment to control processes and applications. According to one embodiment, a method comprises electronically receiving first user input, parsing the first user input to determine whether the first user input contains a command activation statement that cues a voice-controlled human interface to enter a command mode in which a second user input comprising a voice signal is processed to identify at least one executable command and, in response to determining that the first user input comprises the command activation statement, identifying the at least one executable command in the second user input.01-10-2013
20100161339METHOD AND SYSTEM FOR OPERATING A VEHICULAR ELECTRONIC SYSTEM WITH VOICE COMMAND CAPABILITY - Methods and systems for operating an avionics system with voice command capability are provided. A first voice command is received. A first type of avionics system function is performed in response to the receiving of the first voice command. A second voice command is received. A second type of avionics system function that has a hazard level higher than that of the first type of avionics system function is performed in response to the receiving of the second voice command only after a condition is detected that is indicative of a confirmation of the request to perform the second type of avionics function. The avionics system may also have the capability to test whether or not the voice command feature is functioning properly.06-24-2010
20130018659Systems and Methods for Speech Command Processing - Methods and apparatus related to processing speech input at a wearable computing device are disclosed. Speech input can be received at the wearable computing device. Speech-related text corresponding to the speech input can be generated. A context can be determined based on database(s) and/or a history of accessed documents. An action can be determined based on an evaluation of at least a portion of the speech-related text and the context. The action can be a command or a search request. If the action is a command, then the wearable computing device can generate output for the command. If the action is a search request, then the wearable computing device can: communicate the search request to a search engine, receive search results from the search engine, and generate output based on the search results. The output can be provided using output component(s) of the wearable computing device.01-17-2013
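The routing logic described above, classifying speech-related text as either a command (executed locally) or a search request (sent to a search engine), can be sketched briefly. The command vocabulary and the search-engine stub below are assumptions for illustration only.

```python
# Sketch of command-vs-search routing for speech-related text.
KNOWN_COMMANDS = {"take a picture", "record video", "show directions"}


def determine_action(speech_text: str):
    """Classify the text as ('command', text) or ('search', text)."""
    text = speech_text.lower().strip()
    if text in KNOWN_COMMANDS:
        return ("command", text)
    return ("search", text)


def handle(speech_text: str, search_engine=lambda q: f"results for {q}") -> str:
    """Execute a command, or forward a search request and return its output."""
    kind, text = determine_action(speech_text)
    if kind == "command":
        return f"executing: {text}"
    return search_engine(text)
```

A real system would evaluate the text against a context (document history, databases) rather than a fixed set, but the two-way branch after classification is the structural point of the abstract.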
20110153332Device and Method for Booting Handheld Apparatus by Voice Control - A device for booting a handheld apparatus by voice control includes a base, a power-on device, a trigger switch, and an acoustic sensor. Upon the handheld apparatus being placed at the base to trigger the trigger switch, the trigger switch controls the power-on device to power on the handheld apparatus. After the handheld apparatus is powered on, the acoustic sensor detects a sound of the handheld apparatus and then controls a pressure head of the power-on device to move away. The device and its method for booting a handheld apparatus by voice control offer the advantages of simple, easy operation and high efficiency.06-23-2011
20130024200METHOD FOR SELECTING PROGRAM AND APPARATUS THEREOF - A program selection method and a display apparatus thereof are provided. The program selection method includes generating a program list including at least one program title; determining whether there is a voice input for a program selection; searching for a desired program title corresponding to the voice input for the program selection among the at least one program title in the program list; and selecting a program corresponding to the desired program title based on the searching for the desired program title.01-24-2013
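The search step above, matching a voice input against titles in a generated program list, can be sketched in a few lines. The matching rule here (normalized equality, falling back to substring containment) is an assumption; the abstract does not specify one.

```python
# Sketch of voice-input-to-program-title search over a program list.
def search_program_title(program_list: list, voice_input: str):
    """Return the first title matching the spoken input, or None."""
    spoken = voice_input.lower().strip()
    for title in program_list:
        lowered = title.lower()
        if spoken == lowered or spoken in lowered:
            return title  # matched title selects the program
    return None
```

Exact matches are tried before the looser containment test so a fully spoken title is never shadowed by a partial match elsewhere in the list.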
20080255852APPARATUSES AND METHODS FOR VOICE COMMAND PROCESSING - An apparatus for voice command processing comprising a mobile agent execution platform is provided. The mobile agent execution platform comprises a native platform, at least one agent, a mobile agent execution context, and a mobile agent management unit. The mobile agent execution context provides an application interface, enabling the agent to access resources of the native platform via the application interface. The mobile agent management unit performs initiation, running, suspension, resumption and dispatch of the agent. The agent performs functions regarding voice command processing.10-16-2008
20080255850Providing Expressive User Interaction With A Multimodal Application - Methods, apparatus, and products are disclosed for providing expressive user interaction with a multimodal application, the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a speech engine through a VoiceXML interpreter, including: receiving, by the multimodal browser, user input from a user through a particular mode of user interaction; determining, by the multimodal browser, user output for the user in dependence upon the user input; determining, by the multimodal browser, a style for the user output in dependence upon the user input, the style specifying expressive output characteristics for at least one other mode of user interaction; and rendering, by the multimodal browser, the user output in dependence upon the style.10-16-2008
20080235030Automatic Method For Measuring a Baby's, Particularly a Newborn's, Cry, and Related Apparatus - The present invention concerns an automatic method for measuring a baby's cry, comprising the following step: A. having N samples p(i), for i = 0, 1, …, (N−1), of an acoustic signal p(t) representing the cry, sampled at a sampling frequency f̂ for a period of duration P; the method being characterised in that it assigns a score PainScore to the acoustic signal p(t) by means of a function AF of one or more acoustic parameters selected from the group comprising: a root-mean-square (rms) value p_rms of the acoustic signal p(t) in the period P; a fundamental or pitch frequency F0.09-25-2008
20080235029Speech-Enabled Predictive Text Selection For A Multimodal Application - Methods, apparatus, and products are disclosed for speech-enabled predictive text selection for a multimodal application, the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to an automatic speech recognition (‘ASR’) engine through a VoiceXML interpreter, including: identifying, by the VoiceXML interpreter, a text prediction event, the text prediction event characterized by one or more predictive texts for a text input field of the multimodal application; creating, by the VoiceXML interpreter, a grammar in dependence upon the predictive texts; receiving, by the VoiceXML interpreter, a voice utterance from a user; and determining, by the VoiceXML interpreter using the ASR engine, recognition results in dependence upon the voice utterance and the grammar, the recognition results representing a user selection of a particular predictive text.09-25-2008
20080228496SPEECH-CENTRIC MULTIMODAL USER INTERFACE DESIGN IN MOBILE TECHNOLOGY - A multi-modal human computer interface (HCI) receives a plurality of available information inputs concurrently, or serially, and employs a subset of the inputs to determine or infer user intent with respect to a communication or information goal. Received inputs are respectively parsed, and the parsed inputs are analyzed and optionally synthesized with respect to one or more of each other. In the event sufficient information is not available to determine user intent or goal, feedback can be provided to the user in order to facilitate clarifying, confirming, or augmenting the information inputs.09-18-2008
20080228492Device Control Device, Speech Recognition Device, Agent Device, Data Structure, and Device Control - A language analyzer performs speech recognition on a speech input by a speech input unit, specifies a possible word which is represented by the speech, and the score thereof, and supplies word data representing them to an agent processing unit. The agent processing unit stores process item data which defines a data acquisition process to acquire word data or the like, a discrimination process, and an input/output process, and wires or data defining transition from one process to another and giving a transition constant to the transition, and executes a flow represented generally by the process item data and the wires to thereby control devices belonging to an input/output target device group. To which process in the flow the transition takes place is determined by the weighting factor of each wire, which is determined by the connection relationship between a point where the process has proceeded and the wire, and the score of word data.09-18-2008
20130179172IMAGE REPRODUCING DEVICE, IMAGE REPRODUCING METHOD - An image reproducing device connected to a reproducing unit that reproduces image data includes an extraction unit configured to extract first-condition-satisfying-image data that satisfies a first extraction condition from image data stored in a storage unit; a voice keyword extraction unit configured to extract a keyword that matches a voice input to a voice input unit; and a presentation unit configured to determine, while the first-condition-satisfying-image data is being reproduced by the reproducing unit, a second extraction condition based on a relationship between the first extraction condition applied when extracting the first-condition-satisfying-image data being reproduced and the keyword that has been extracted, and present information pertinent to second-condition-satisfying-image data that satisfies the second extraction condition among the image data stored in the storage unit.07-11-2013
20080221903Hierarchical Methods and Apparatus for Extracting User Intent from Spoken Utterances - Improved techniques are disclosed for permitting a user to employ more human-based grammar (i.e., free form or conversational input) while addressing a target system via a voice system. For example, a technique for determining intent associated with a spoken utterance of a user comprises the following steps/operations. Decoded speech uttered by the user is obtained. An intent is then extracted from the decoded speech uttered by the user. The intent is extracted in an iterative manner such that a first class is determined after a first iteration and a sub-class of the first class is determined after a second iteration. The first class and the sub-class of the first class are hierarchically indicative of the intent of the user, e.g., a target and data that may be associated with the target. The multi-stage intent extraction approach may have more than two iterations. By way of example only, the user intent extracting step may further determine a sub-class of the sub-class of the first class after a third iteration, such that the first class, the sub-class of the first class, and the sub-class of the sub-class of the first class are hierarchically indicative of the intent of the user.09-11-2008
20130179173METHOD AND APPARATUS FOR EXECUTING A USER FUNCTION USING VOICE RECOGNITION - A method and an apparatus for executing a user function using voice recognition. The method includes displaying a user function execution screen; confirming a function to be executed according to voice input; displaying a voice command corresponding to the confirmed function on the user function execution screen; recognizing a voice input by a user while a voice recognition execution request is continuously received; and executing the function associated with the input voice command when the recognized voice input is at least one of the displayed voice commands.07-11-2013
20130179174MACHINE, SYSTEM AND METHOD FOR USER-GUIDED TEACHING AND MODIFYING OF VOICE COMMANDS AND ACTIONS EXECUTED BY A CONVERSATIONAL LEARNING SYSTEM - A machine, system and method for user-guided teaching and modifications of voice commands and actions to be executed by a conversational learning system. The machine includes a system bus for communicating data and control signals received from the conversational learning system to a computer system, a vehicle data and control bus for connecting devices and sensors in the machine, a bridge module for connecting the vehicle data and control bus to the system bus, machine subsystems coupled to the vehicle data and control bus having a respective user interface for receiving a voice command or input signal from a user, a memory coupled to the system bus for storing action command sequences learned for a new voice command and a processing unit coupled to the system bus for automatically executing the action command sequences learned when the new voice command is spoken.07-11-2013
20130138444MODIFICATION OF OPERATIONAL DATA OF AN INTERACTION AND/OR INSTRUCTION DETERMINATION PROCESS - It is inter alia disclosed to perform at least one of operating an interaction process with a user of the medical apparatus and determining, based on a representation of at least one instruction given by the user, at least one instruction operable by the medical apparatus. Therein, the at least one of the operating and the determining at least partially depends on operational data. It is further disclosed to receive modification information for modifying at least a part of the operational data, wherein the modification information is at least partially determined based on an analysis of a representation of at least one instruction given by the user.05-30-2013
20130173270ELECTRONIC APPARATUS AND METHOD OF CONTROLLING ELECTRONIC APPARATUS - An electronic apparatus and a method of controlling the electronic apparatus are provided. The method includes: receiving a voice command; and if the voice command is a first voice start command, changing a mode of the electronic apparatus to a first voice task mode in which the electronic apparatus is controlled according to further voice input, and if the voice command is a second voice start command, changing the mode of the electronic apparatus to a second voice task mode in which the electronic apparatus is controlled according to the further voice input received via an external apparatus which operates with the electronic apparatus. Therefore, efficiency and flexibility are provided in controlling the electronic apparatus by using a microphone of the electronic apparatus or a microphone of the external apparatus.07-04-2013
20130090931MULTIMODAL COMMUNICATION SYSTEM - The present invention, in various embodiments, comprises systems and methods for providing a communication system. In one embodiment, the system is an assistive technology (AT) in a single, highly integrated, multimodal, multifunctional, multipurpose, minimally invasive, unobtrusive, wireless, wearable, easy to use, low cost, and reliable AT that can potentially provide people with severe disabilities with flexible and effective computer access and environmental control in various conditions. In one embodiment, a multimodal Tongue Drive System (mTDS) is disclosed that uses tongue motion as its primary input modality. Secondary input modalities including speech, head motion, and diaphragm control are added to the tongue motion as additional input channels to enhance the system speed, accuracy, robustness, and flexibility, which are expected to address many of the aforementioned issues with traditional ATs that have limited number of input channels/modalities and can only be used in certain conditions by a certain group of users.04-11-2013
20130090932VEHICULAR APPARATUS - A hands-free conversation vehicular apparatus coupling with a communication terminal includes: a communication device; a sound output device; a sound input device inputting a transmission speech sound; a vehicle information acquisition device; a first sound extraction device setting a first direction for a directionality of the sound input device, and extracting a first sound along the first direction; a sound recognition device; a second sound extraction device specifying a second direction for the transmission speech sound recognized by the sound recognition device, and extracting a second sound along the second direction; a sound quality comparison unit comparing a sound quality of the first and second sounds; a changeover device for selecting one of the first and second sounds as the transmission speech sound; and a control device for allowing the changeover device to perform a changeover when a determination condition is fulfilled.04-11-2013
20130096925SYSTEM FOR PROVIDING A SOUND SOURCE INFORMATION MANAGEMENT SERVICE - Disclosed is a system for providing a sound source information management service. The system manages sound source information transmitted from a driver terminal, extracts via voice recognition the sound source information corresponding to voice input data transmitted from the driver terminal, and provides the extracted sound source information to the driver terminal.04-18-2013
20130103404MOBILE VOICE PLATFORM ARCHITECTURE - A mobile voice platform providing a user speech interface to computer-based services uses a device having a processor, communication circuitry, an operating system, and applications that are run using the operating system and that utilize the computer-based services via the communication circuitry. The mobile voice platform includes a non-transient digital storage medium storing first and second program modules. Upon execution by the processor the first program module receives speech recognition results, determines a desired service based on the speech recognition results, and provides at least some of the speech recognition results to the second program module. The second program module, when executed, generates a service request based on the speech recognition results provided from the first program module, provides the service request to one or more of the computer-based services, obtains a service result from the computer-based service(s), and supplies the first program module with a response.04-25-2013
20130103405OPERATING SYSTEM AND METHOD OF OPERATING - An operation determination processing section of a center extracts words included in the utterance of a driver and an operator, and reads an attribute associated with each word from a synonym and related word store in which an attribute is stored so as to be associated with each word. From the synonym and related word store, which also stores domains of a candidate for a task associated with the read attribute or domains of a task to be actually performed, it reads the domain of a candidate for the task or the like associated with the attribute. The domains read for each word are totaled over the words included in the utterance of the driver or the like, and those related to the domain with the highest total score are estimated as the candidate for the task and the task to be actually performed. In this manner, it is possible to estimate the task with high accuracy.04-25-2013
20110313776System and Method for Controlling Devices that are Connected to a Network - A system, method and computer-readable medium for controlling devices connected to a network. The method includes receiving an utterance from a user for remotely controlling a device in a network; converting the received utterance to text using an automatic speech recognition module; accessing a user profile in the network that governs access to a plurality of devices on the network and identifiers which control a conversion of the text to a device specific control language; identifying based on the text a device to be controlled; converting at least a portion of the text to the device control language; and transmitting the device control language to the identified device, wherein the identified device implements a function based on the transmitted device control language.12-22-2011
20110313775Television Remote Control Data Transfer - A computer-implemented method for information sharing between a portable computing device and a television system includes receiving a spoken input from a user of the portable computing device, by the portable computing device, submitting a digital recording of the spoken query from the portable computing device to a remote server system, receiving from the remote server system a textual representation of the spoken query, and automatically transmitting the textual representation from the portable computing device to the television system. The television system is programmed to submit the textual representation as a search query and to present to the user media-related results that are determined to be responsive to the spoken query.12-22-2011
20110313774Methods, Systems, and Products for Measuring Health - Methods, systems, and products measure health data related to a user. A spoken phrase is received and time-stamped. The user is identified from the spoken phrase. A window of time is determined from a semantic content of the spoken phrase. A sensor measurement is received and time-stamped. A difference in time between the time-stamped spoken phrase and the time-stamped sensor measurement is determined and compared to the window of time. When the difference in time is within the window of time, then the sensor measurement is associated with the user.12-22-2011
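The association rule stated above is concrete: a time-stamped sensor measurement is attributed to the identified user only when its time difference from the time-stamped spoken phrase falls within a window derived from the phrase's semantic content. A minimal sketch follows; the semantic rule (a shorter window when the phrase says "now") and the plain-seconds timestamps are assumptions for illustration.

```python
# Sketch of time-window association between a spoken phrase and a
# sensor measurement.
def window_from_phrase(phrase: str) -> float:
    """Derive a window of time (seconds) from semantic content (assumed rule)."""
    return 60.0 if "now" in phrase.lower() else 300.0


def associate(phrase: str, phrase_ts: float, measurement_ts: float) -> bool:
    """Attribute the measurement to the speaker if the timestamps are close
    enough, i.e. the difference fits inside the semantic window."""
    return abs(measurement_ts - phrase_ts) <= window_from_phrase(phrase)
```

Deriving the window from the phrase rather than fixing it lets "weigh me now" demand a tight temporal match while a looser utterance tolerates a delayed reading.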
20130124207VOICE-CONTROLLED CAMERA OPERATIONS - A computing device (e.g., a smart phone, a tablet computer, digital camera, or other device with image capture functionality) causes an image capture device to capture one or more digital images based on audio input (e.g., a voice command) received by the computing device. For example, a user's voice (e.g., a word or phrase) is converted to audio input data by the computing device, which then compares (e.g., using an audio matching algorithm) the audio input data to an expected voice command associated with an image capture application. In another aspect, a computing device activates an image capture application and captures one or more digital images based on a received voice command. In another aspect, a computing device transitions from a low-power state to an active state, activates an image capture application, and causes a camera device to capture digital images based on a received voice command.05-16-2013
20130124210INFORMATION TERMINAL, CONSUMER ELECTRONICS APPARATUS, INFORMATION PROCESSING METHOD AND INFORMATION PROCESSING PROGRAM - An information terminal connectable to a target apparatus includes a determining unit and a control unit. The determining unit determines whether the information terminal is held by a user. When the terminal changes from being held to not being held, the control unit outputs a control signal instructing the target apparatus to accept operations from the user. When the terminal changes from not being held to being held, the control unit performs at least one of: displaying a remote controller for operating the target apparatus on a display screen of the information terminal, or acquiring status information from the target apparatus and displaying it on the display screen.05-16-2013
20130124211SYSTEM AND METHOD FOR ENHANCED COMMUNICATIONS VIA SMALL DATA RATE COMMUNICATION SYSTEMS - A system and method for interacting with an interactive communication system include processing a profile associated with an interactive communication system; generating a user interface based on the processing of the profile to solicit a user response correlating to a response required by the interactive communication system; receiving the user response via the user interface; updating the user interface using the profile based on the user response; and sending a signal to the interactive communication system based on one or more user responses.05-16-2013
20130124208REAL-TIME DISPLAY OF SYSTEM INSTRUCTIONS - A system and method for reviewing inputted voice instructions in a vehicle-based telematics control unit. The system includes a microphone, a speech recognition processor, and an output device. The microphone receives voice instructions from a user. Coupled to the microphone is the speech recognition processor that generates a voice signal by performing speech recognition processing of the received voice instructions. The output device outputs the generated voice signal to the user. The system also includes a user interface for allowing the user to approve the outputted voice signal, and a communication component for wirelessly sending the generated voice signal to a server over a wireless network upon approval by the user.05-16-2013
20130124209INFORMATION PROCESSING APPARATUS, INFORMATION PROCESSING METHOD, AND PROGRAM - An information processing apparatus includes: a plurality of information input units; an event detection unit that generates event information, including estimated position information and estimated identification information of users present in the real space, based on analysis of information from the information input units; and an information integration processing unit that receives the event information and generates target information, including a position and user identification for each user, together with signal information representing a probability value of the event generation source. The information integration processing unit includes an utterance source probability calculation unit, which calculates an utterance source score as an index value representing the utterance source probability of each target by multiplying weights, based on utterance situations, by a plurality of different information items from the event detection unit.05-16-2013
20130132094SYSTEM AND METHOD FOR VOICE ACTUATED CONFIGURATION OF A CONTROLLING DEVICE - A speech recognition engine is provided voice data indicative of at least a brand of a target appliance. The speech recognition engine uses the voice data indicative of at least a brand of the target appliance to identify within a library of codesets at least one codeset that is cross-referenced to the brand of the target appliance. The at least one codeset so identified is then caused to be provisioned to the controlling device for use in commanding functional operations of the target appliance.05-23-2013
20130132096Systems and Techniques for Producing Spoken Voice Prompts - Methods and systems are described in which spoken voice prompts can be produced in a manner such that they will most likely have the desired effect, for example to indicate empathy, or produce a desired follow-up action from a call recipient. The prompts can be produced with specific optimized speech parameters, including duration, gender of speaker, and pitch, so as to encourage participation and promote comprehension among a wide range of patients or listeners. Upon hearing such voice prompts, patients/listeners can know immediately when they are being asked questions that they are expected to answer, when they are being given information, as well as which information is considered sensitive.05-23-2013
20130132095AUDIO PATTERN MATCHING FOR DEVICE ACTIVATION - A system and method are disclosed for activating an electric device from a standby power mode to a full power mode. The system may include one or more microphones for monitoring audio signals in the vicinity of the electric device, and a standby power activation unit including a low-power microprocessor and a non-volatile memory. Audio captured by the one or more microphones is digitized and compared by the microprocessor against predefined activation pattern(s) stored in the non-volatile memory. If a pattern match is detected between the digital audio pattern and a predefined activation pattern, the electric device is activated.05-23-2013
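The matching step described above can be sketched as a simple template comparison (a hypothetical illustration; the patent does not disclose a specific matching algorithm, and a real low-power implementation would likely use compact acoustic features rather than raw samples):

```python
import math

def similarity(a, b):
    """Normalized cross-correlation of two equal-length sample buffers."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def should_activate(captured, patterns, threshold=0.8):
    """Wake the device from standby when the captured audio matches any
    predefined activation pattern stored in non-volatile memory."""
    return any(similarity(captured, p) >= threshold for p in patterns)
```

The threshold value is an assumption; tuning it trades off false wake-ups against missed activations.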
20130144629SYSTEM AND METHOD FOR CONTINUOUS MULTIMODAL SPEECH AND GESTURE INTERACTION - Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing multimodal input. A system configured to practice the method continuously monitors an audio stream associated with a gesture input stream, and detects a speech event in the audio stream. Then the system identifies a temporal window associated with a time of the speech event, and analyzes data from the gesture input stream within the temporal window to identify a gesture event. The system processes the speech event and the gesture event to produce a multimodal command. The gesture in the gesture input stream can be directed to a display, but is remote from the display. The system can analyze the data from the gesture input stream by calculating an average of gesture coordinates within the temporal window.06-06-2013
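The temporal-window fusion described in this abstract can be illustrated with a short sketch (hypothetical names and window size; the patent does not publish code):

```python
def fuse(speech_time, gesture_events, window=1.5):
    """Return the average of gesture coordinates whose timestamps fall
    inside the temporal window centred on the detected speech event;
    None when no gesture event lands in the window."""
    lo, hi = speech_time - window, speech_time + window
    hits = [(t, xy) for t, xy in gesture_events if lo <= t <= hi]
    if not hits:
        return None
    xs = [xy[0] for _, xy in hits]
    ys = [xy[1] for _, xy in hits]
    return (sum(xs) / len(hits), sum(ys) / len(hits))
```

A multimodal command would then be produced by combining the recognized speech with the fused gesture coordinate, as the abstract describes.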
20080208593Altering Behavior Of A Multimodal Application Based On Location - Methods, apparatus, and products are disclosed for altering behavior of a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application, including a voice mode and one or more non-voice modes. The voice mode of user interaction with the multimodal application is supported by a voice interpreter. Altering behavior of a multimodal application based on location includes: receiving a location change notification in the voice interpreter from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device; updating, by the voice interpreter, location-based environment parameters for the voice interpreter in dependence upon the current location of the multimodal device; and interpreting, by the voice interpreter, the multimodal application in dependence upon the location-based environment parameters.08-28-2008
20090125311VEHICULAR VOICE CONTROL SYSTEM - A vehicular voice control system includes a first and a second microphone located on a vehicle external to a vehicle cabin. The microphones receive audio signals from an audio source external to the vehicle and generate microphone output signals. A signal processor processes the microphone output signals, generates a processed signal, and determines a location of the audio source. A speech recognition system receives the processed signal and obtains a recognition result. A controller controls one or more vehicular elements based on the recognition result and the determined location of the audio source.05-14-2009
20120278083VOICE CONTROLLED DEVICE AND METHOD - A voice controlled device includes a storage module, a voice recording module, and a processing module. The storage module stores a number of computerized voice commands. The voice recording module records audio signals of a user. The processing module converts the recorded audio signals to a machine readable command, determines whether the machine readable command matches one stored computerized voice command, and controls the device to execute a function according to the machine readable command if a match is found. The processing module stores the determined machine readable command as a history command. If the determined machine readable command is partially the same as at least two of the stored computerized voice commands, the processing module obtains all the history commands and determines from them which function the voice controlled device is to perform.11-01-2012
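The history-based disambiguation described in this abstract can be sketched as follows (hypothetical names; a minimal illustration, not the patented implementation):

```python
def resolve(command, stored, history):
    """Pick the stored command matching the recognized input; on a
    partial match against several candidates, prefer the candidate
    that appears most often in the command history."""
    if command in stored:
        return command
    candidates = [s for s in stored if command in s]
    if len(candidates) < 2:
        return candidates[0] if candidates else None
    return max(candidates, key=history.count)
```

Here a "partial match" is modelled as a substring relation for simplicity; an actual device might compare recognition lattices or phoneme sequences instead.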
20100318366Touch Anywhere to Speak - The present invention provides a user interface for providing press-to-talk interaction via utilization of a touch-anywhere-to-speak module on a mobile computing device. Upon receiving an indication of a touch anywhere on the screen of a touch screen interface, the touch-anywhere-to-speak module activates the listening mechanism of a speech recognition module to accept audible user input and displays dynamic visual feedback of a measured sound level of the received audible input. The touch-anywhere-to-speak module may also provide a user a convenient and more accurate speech recognition experience by utilizing and applying the data relative to a context of the touch (e.g., relative location on the visual interface) in correlation with the spoken audible input.12-16-2010
20120284031METHOD AND DEVICE FOR OPERATING TECHNICAL EQUIPMENT, IN PARTICULAR A MOTOR VEHICLE - A method and device for operating technical equipment, in particular in a motor vehicle. Speech inputs are fed by a speech input unit and manual inputs are fed by means of a manual input unit as operating instructions to a controller by which a command corresponding to the operating instruction is generated and fed to the corresponding technical equipment, which then executes the operating procedure associated with the operating instruction. A basic structure of the command is established by the speech input unit or the manual input unit, and then the basic structure of the command is supplemented by the manual input unit or the speech input unit.11-08-2012
20130159002VOICE APPLICATION ACCESS - A system may include a mobile computing device configured to receive voice input; identify, in the voice input, a navigate command including a sequence indication; determine, based on a sequence control map, a control of a user interface corresponding to the sequence indication; and activate the control of the user interface corresponding to the sequence indication.06-20-2013
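The sequence-control-map lookup described in this abstract can be sketched in a few lines (the map contents and function name are hypothetical, chosen only for illustration):

```python
SEQUENCE_CONTROL_MAP = {  # sequence indication -> user-interface control
    "1": "play_button",
    "2": "pause_button",
    "3": "settings_menu",
}

def handle_voice(utterance):
    """Parse a 'navigate N' voice command and return the user-interface
    control mapped to the sequence indication N, or None otherwise."""
    words = utterance.lower().split()
    if len(words) == 2 and words[0] == "navigate":
        return SEQUENCE_CONTROL_MAP.get(words[1])
    return None
```

Activating the returned control would then be handled by the device's UI layer, as the abstract describes.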
20130159003METHOD AND APPARATUS FOR PROVIDING CONTENTS ABOUT CONVERSATION - Disclosed are a method and an apparatus for providing contents about conversation, which collect voice information from conversation between a user and another person, search contents on the basis of the collected voice information, and provide contents about the conversation between the user and the person. The method of providing contents about conversation includes: a voice information collecting step of collecting voice information from conversation between a user and another person; a keyword creating control step of creating search keywords by using the collected voice information; and a contents providing control step of searching contents by using the created search keywords, and providing the searched contents.06-20-2013
20130185078METHOD AND SYSTEM FOR USING SOUND RELATED VEHICLE INFORMATION TO ENHANCE SPOKEN DIALOGUE - Sound related vehicle information representing one or more sounds may be received in the processor. The sound related vehicle information may or may not include an audio signal. Spoken dialogue of a spoken dialogue system associated with the vehicle based on the sound related vehicle information may be modified.07-18-2013
20130185079HOME APPLIANCE, HOME APPLIANCE SYSTEM, AND METHOD FOR OPERATING SAME - The present invention relates to a home appliance, to a home appliance system, and to a method for operating same, wherein the home appliance and a mobile terminal are connected to one another to add or update data in the home appliance through the mobile terminal connected thereto, diagnose the state of the home appliance by means of the mobile terminal, and supplement the function of the home appliance by means of the mobile terminal, thus expanding the functions of the home appliance to enable the easy control of the home appliance, and more conveniently controlling the home appliance.07-18-2013
20130185080USER SPEECH INTERFACES FOR INTERACTIVE MEDIA GUIDANCE APPLICATIONS - A user speech interface for interactive media guidance applications, such as television program guides, guides for audio services, guides for video-on-demand (VOD) services, guides for personal video recorders (PVRs), or other suitable guidance applications is provided. Voice commands may be received from a user and guidance activities may be performed in response to the voice commands.07-18-2013
20130185081Maintaining Context Information Between User Interactions with a Voice Assistant - Methods, systems, and computer readable storage medium related to operating an intelligent digital assistant are disclosed. A first task is performed using a first parameter. A text string is obtained from a speech input received from a user. Based at least partially on the text string, a second task different from the first task or a second parameter different from the first parameter is identified. The first task is performed using the second parameter or the second task is performed using the first parameter.07-18-2013
20130191132VEHICLE-TO-VEHICLE COMMUNICATION DEVICE - A vehicle-to-vehicle communication device generates voice information that includes a voice message and added information regarding an output of the voice message. The voice information is transmitted in one direction of a subject vehicle via a transmission unit, and voice information from another vehicle is received via a reception unit. The vehicle-to-vehicle communication device plays the voice message of the voice information received by the reception unit based on the added information of the voice information. In such manner, information regarding a travel situation is appropriately transmitted by the vehicle-to-vehicle communication device.07-25-2013
20120016678Intelligent Automated Assistant - An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact.01-19-2012
20120029922METHOD OF ACCESSING A DIAL-UP SERVICE - A method of accessing a dial-up service is disclosed. An example method of providing access to a service includes receiving a first speech signal from a user to form a first utterance; recognizing the first utterance using speaker independent speech recognition; requesting the user to enter a personal identification number; and when the personal identification number is valid, receiving a second speech signal to form a second utterance and providing access to the service.02-02-2012
20120029921SPEECH RECOGNITION SYSTEM AND METHOD - According to the present invention, a method for integrating processes with a multi-faceted human centered interface is provided. The interface is facilitated to implement a hands free, voice driven environment to control processes and applications. A natural language model is used to parse voice initiated commands and data, and to route those voice initiated inputs to the required applications or processes. The use of an intelligent context based parser allows the system to determine what processes are required to complete a task which is initiated using natural language. A single window environment provides an interface which is comfortable to the user by preventing distracting windows from appearing. The single window has a plurality of facets which allow distinct viewing areas. Each facet has an independent process routing its outputs thereto. As other processes are activated, each facet can reshape itself to bring a new process into one of the viewing areas. All activated processes are executed simultaneously to provide true multitasking.02-02-2012
20130197915SPEECH-BASED USER INTERFACE FOR A MOBILE DEVICE - A method of providing hands-free services using a mobile device having wireless access to computer-based services includes carrying out a completed speech session via a mobile device without any physical interaction with the mobile device, wherein the speech session includes receiving a speech input from a user, and obtaining from a cloud service a service result responsive to the speech input, and providing the service result as a speech response presented to the user.08-01-2013
20130197916TERMINAL DEVICE, SPEECH RECOGNITION PROCESSING METHOD OF TERMINAL DEVICE, AND RELATED PROGRAM - According to one embodiment, a terminal device including a main body includes: a sound input module configured to receive a voice, convert the voice into a digital signal, and output the digital signal; a state detecting module having an acceleration sensor, configured to detect one or both of a movement and a state of the main body and output a detection result; and an executing module, capable of executing plural speech recognition response processes, configured to execute one of the speech recognition response processes on the digital signal according to the detection result detected by the state detecting module.08-01-2013
20130197917METHODS AND SYSTEMS FOR UTILIZING VOICE COMMANDS ONBOARD AN AIRCRAFT - Methods and systems are provided for utilizing audio commands onboard an aircraft. A method comprises identifying a flight phase for the aircraft, resulting in an identified flight phase, receiving an audio input, resulting in received audio input, filtering the received audio input in a manner that is influenced by the identified flight phase for the aircraft, resulting in filtered audio input, and validating the filtered audio input as a first voice command of a first plurality of possible voice commands.08-01-2013
20130204629VOICE INPUT DEVICE AND DISPLAY DEVICE - A voice input device includes a wave guide unit for guiding an incident sound wave, a microphone unit for converting a sound wave guided through the wave guide unit to an electrical sound signal, and a signal processing unit for processing the sound signal obtained by the microphone unit, using an acoustic characteristic given by the wave guide unit to the sound wave, in which the wave guide unit has a structure which gives an acoustic characteristic that differs between direct sound, which is sound that reaches the microphone unit without reflecting off an internal surface of the wave guide unit, and indirect sound, which is sound that is reflected off the internal surface before reaching the microphone unit, and the signal processing unit determines whether or not the direct sound is input based on the difference in the acoustic characteristic between the direct sound and the indirect sound.08-08-2013

Patent applications in class Speech controlled system