
Speech assisted network

Subclass of:

704 - Data processing: speech signal processing, linguistics, language translation, and audio compression/decompression

704200000 - SPEECH SIGNAL PROCESSING

704270000 - Application

Patent class list (only non-empty classes are listed)

Deeper subclasses:

Entries
Document | Title and Abstract | Date
20130211841Multi-Dimensional Interactions and Recall - Methods for initiating actions based on analysis of multi-dimensional interactions are presented. Electronic devices can acquire sensor data representing interactions among multiple entities. Analysis engines can use the interaction data to create or otherwise manage interaction guide queues based on conceptual threads associated with the interactions. Interaction guides within the queue comprise instructions, possibly domain-specific instructions, for devices to participate in the interactions. Contemplated engines manage the queues as a function of attributes, for example priority, derived from the interactions.08-15-2013
20120245944Intelligent Automated Assistant - The intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionality powered by external services with which the system can interact.09-27-2012
20120245943TRANSFORMING A NATURAL LANGUAGE REQUEST FOR MODIFYING A SET OF SUBSCRIPTIONS FOR A PUBLISH/SUBSCRIBE TOPIC STRING - A natural language request for modifying a set of subscriptions for one or more topics in a publish/subscribe topic hierarchy is received at a processing device. The natural language request includes a predetermined natural language element. The natural language request is transformed into a publish/subscribe topic string and the predetermined natural language element is transformed into a publish/subscribe symbol. The symbol represents one or more topics in the topic hierarchy. One or more subscriptions to one or more topics is modified based on the transformed topic string.09-27-2012
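The transformation described in the abstract above can be sketched as a small rule-based mapper. The phrasing rules and the MQTT-style `#` multi-level wildcard below are assumptions for illustration; the patent abstract does not name a concrete wildcard syntax.

```python
def to_topic_string(request: str) -> str:
    """Map a natural-language subscription request to a topic string."""
    text = request.lower().strip()
    prefix = "subscribe to everything under "
    if text.startswith(prefix):
        # The predetermined natural-language element "everything under"
        # becomes the multi-level wildcard '#' (MQTT-style, assumed).
        path = "/".join(text[len(prefix):].split())
        return path + "/#"
    prefix = "subscribe to "
    if text.startswith(prefix):
        return "/".join(text[len(prefix):].split())
    raise ValueError("unrecognized request form")
```

For example, "Subscribe to everything under sports news" would yield the topic string `sports/news/#`, which matches every topic in that subtree.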
20100042414SYSTEM AND METHOD FOR IMPROVING NAME DIALER PERFORMANCE - Disclosed herein are systems, methods, and computer readable-media for improving name dialer performance. The method includes receiving a speech query for a name in a directory of names, retrieving matches to the query, if the matches are uniquely spelled homophones or near-homophones, identifying information that is unique to all retrieved matches, and presenting a spoken disambiguation statement to a user that incorporates the identified unique information. Identifying information can include multiple pieces of unique information if necessary to completely disambiguate the matches. A hierarchy can establish priority of multiple pieces of unique information for use in the spoken disambiguation statement.02-18-2010
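The hierarchy of disambiguating attributes described above can be sketched as a priority-ordered scan for an attribute whose value is unique to every match. The attribute names and their priority order here are hypothetical, not taken from the patent.

```python
def disambiguation_statement(matches):
    """Build a spoken prompt distinguishing near-homophone directory matches.

    `matches` is a list of dicts; trying "department" before "location"
    is an assumed priority hierarchy.
    """
    for attr in ("department", "location"):
        values = [m[attr] for m in matches]
        if len(set(values)) == len(values):   # unique for every match
            options = ", ".join(f"{m['name']} in {m[attr]}" for m in matches)
            return f"Did you mean: {options}?"
    return "Please spell the last name."
```

If no single attribute distinguishes all matches, a real system would combine several pieces of unique information; the fallback here simply asks the caller to spell the name.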
20100106507Ratio of Speech to Non-Speech Audio such as for Elderly or Hearing-Impaired Listeners - The invention relates to audio signal processing and speech enhancement. In accordance with one aspect, the invention combines a high-quality audio program that is a mix of speech and non-speech audio with a lower-quality copy of the speech components contained in the audio program for the purpose of generating a high-quality audio program with an increased ratio of speech to non-speech audio such as may benefit the elderly, hearing impaired or other listeners. Aspects of the invention are particularly useful for television and home theater sound, although they may be applicable to other audio and sound applications. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.04-29-2010
20100094635System for Voice-Based Interaction on Web Pages - SYSTEM FOR VOICE-BASED INTERACTION ON WEB PAGES, of the type that permits the incorporation of voice-handling functions on a Web page, in which from a Terminal (04-15-2010
20120166202SYSTEM AND METHOD FOR FUNNELING USER RESPONSES IN AN INTERNET VOICE PORTAL SYSTEM TO DETERMINE A DESIRED ITEM OR SERVICE - A method of funneling user responses in a voice portal system to determine a desired item or service includes (a) querying a user for an attribute value associated with a first particular attribute of the desired item or service; and (b) determining if the attribute value given by the user satisfies an end state. If the end state is not satisfied, steps (a) and (b) are performed with a new particular attribute.06-28-2012
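The query-and-narrow loop in this abstract can be sketched as filtering a catalog one attribute at a time until a single item remains. The catalog shape and the "unique item" end-state condition are assumptions for illustration.

```python
def funnel(catalog, attributes, ask):
    """Narrow a catalog of items by querying one attribute at a time.

    `ask(attr)` stands in for the voice prompt that elicits the caller's
    value for that attribute; the end state assumed here is a unique match.
    """
    candidates = list(catalog)
    for attr in attributes:
        if len(candidates) <= 1:          # end state reached
            break
        value = ask(attr)
        candidates = [c for c in candidates if c.get(attr) == value]
    return candidates
```

A session asking for color then size would, for instance, narrow three items to one after two answers.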
20130046543INTERACTIVE VOICE RESPONSE (IVR) SYSTEM FOR ERROR REDUCTION AND DOCUMENTATION OF MEDICAL PROCEDURES - Interactive voice response (IVR) systems and methods for delivery of healthcare services (e.g., by one or more medical professionals, such as, for example, in a hospital or clinic). In some embodiments, the present systems can be configured to: prompt one or more users for a plurality of voice inputs with information associated with at least one of a patient and a user; and determine whether each of the plurality of voice inputs is consistent with records related to the patient or the one or more users. In some embodiments, the present systems can be configured to: during performance of a procedure on a patient, prompt one or more users to provide a plurality of voice inputs with information related to progress of the procedure or characteristics of the patient; and/or prompt the user to perform each of a plurality of steps of the procedure.02-21-2013
20090094036SYSTEM AND METHOD OF HANDLING PROBLEMATIC INPUT DURING CONTEXT-SENSITIVE HELP FOR MULTI-MODAL DIALOG SYSTEMS - A method of presenting a multi-modal help dialog move to a user in a multi-modal dialog system is disclosed. The method comprises presenting an audio portion of the multi-modal help dialog move that explains available ways of user inquiry and presenting a corresponding graphical action performed on a user interface associated with the audio portion. The multi-modal help dialog move is context-sensitive and uses current display information and dialog contextual information to present a multi-modal help move that is currently relevant to the user. A user request or a problematic dialog detection module may trigger the multi-modal help move.04-09-2009
20090076824REMOTE CONTROL SERVER PROTOCOL SYSTEM - A remote control server protocol system transports data to a client system. The client system communicates with the server application using a platform-independent communications protocol. The client system sends commands and audio data to the server application. The server application may respond by transmitting audio and other messages to the client system. The messages may be transmitted over a single communications channel.03-19-2009
20130066634Automated Conversation Assistance - Methods, apparatuses, systems, and computer-readable media for providing automated conversation assistance are presented. According to one or more aspects, a computing device may obtain user profile information associated with a user of the computing device, the user profile information including a list of one or more words that have previously been detected in one or more previously captured speeches associated with the user. Subsequently, the computing device may select, based on the user profile information, one or more words from a captured speech for inclusion in a search query. Then, the computing device may generate the search query based on the selected one or more words.03-14-2013
20130066633Providing Audio-Activated Resource Access for User Devices - Methods and computer systems for providing audio-activated resource access for user devices are provided. In at least one embodiment, a computer system may comprise a processor and a memory coupled to the processor. The memory may store instructions to cause the processor to perform operations comprising capturing audio at a user device. The operations may also comprise using a speech-to-text converter to convert speech transmitted over the audio into text and transmitting the text to a server system to determine a corresponding keyword or phrase. The operations may also comprise receiving a resource corresponding to the keyword or phrase.03-14-2013
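The capture-convert-lookup pipeline described above can be sketched with stand-ins for each stage. All three components below (the stub converter, the keyword table, and the resource URLs) are invented for illustration; none of these APIs come from the patent.

```python
from typing import Optional

# Hypothetical server-side keyword-to-resource table.
KEYWORD_RESOURCES = {
    "weather": "https://example.com/weather",
    "news": "https://example.com/news",
}

def speech_to_text(audio: bytes) -> str:
    # Stub: a real system would call a speech-to-text service here.
    return audio.decode("utf-8")

def lookup_resource(text: str) -> Optional[str]:
    # Server-side step: match recognized text against known keywords.
    for keyword, resource in KEYWORD_RESOURCES.items():
        if keyword in text.lower():
            return resource
    return None

def handle_audio(audio: bytes) -> Optional[str]:
    """Capture -> speech-to-text -> keyword lookup -> resource."""
    return lookup_resource(speech_to_text(audio))
```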
20130066635APPARATUS AND METHOD FOR CONTROLLING HOME NETWORK SERVICE IN PORTABLE TERMINAL - An apparatus and a method, which set a remote control command for controlling a home network service in a portable terminal are provided. The apparatus includes a memory for storing configuration types of a remote control command in a set order in a home network service; and a controller for setting the remote control command including the input configuration types of the remote control command and transmitting the remote control command, when the configuration types of the remote control command are input in the set order in the home network service.03-14-2013
20080319757SPEECH PROCESSING SYSTEM BASED UPON A REPRESENTATIONAL STATE TRANSFER (REST) ARCHITECTURE THAT USES WEB 2.0 CONCEPTS FOR SPEECH RESOURCE INTERFACES - A speech processing system can include a client, a speech for Web 2.0 system, and a speech processing system. The client can access a speech-enabled application using at least one Web 2.0 communication protocol. For example, a standard browser of the client can use a standard protocol to communicate with the speech-enabled application executing on the speech for Web 2.0 system. The speech for Web 2.0 system can access a data store within which user specific speech parameters are included, wherein a user of the client is able to configure the specific speech parameters of the data store. Suitable ones of these speech parameters are utilized whenever the user interacts with the Web 2.0 system. The speech processing system can include one or more speech processing engines. The speech processing system can interact with the speech for Web 2.0 system to handle speech processing tasks associated with the speech-enabled application.12-25-2008
20110066439DIMENSION MEASUREMENT SYSTEM - A dimension measurement system is provided. The dimension measurement system includes a speech I/O device fit in an ear canal of a worker, which generates a voice signal from vibration in the air emitted from an eardrum of the worker and propagated inside the ear canal and outputs the voice signal, and an information processing device realizing a speech recognition function that recognizes a measurement value of a dimension of an object from the voice signal output by the speech I/O device and a judgment function that judges whether the measurement value satisfies a reference value for the object.03-17-2011
20110282672DISTRIBUTED VOICE BROWSER - The present invention can include a method of call processing using a distributed voice browser including allocating a plurality of service processors configured to interpret parsed voice markup language data and allocating a plurality of voice markup language parsers configured to retrieve and parse voice markup language data representing a telephony service. The plurality of service processors and the plurality of markup language parsers can be registered with one or more session managers. Accordingly, components of received telephony service requests can be distributed to the voice markup language parsers and the parsed voice markup language data can be distributed to the service processors.11-17-2011
20110282671METHODS FOR PERSONAL EMERGENCY INTERVENTION - A method according to an aspect of the present invention includes receiving a communication from a patient through an interactive voice response (IVR) system; providing a guided voice prompt from the interactive voice response system to the patient; receiving a response of the patient to the guided voice prompt through the interactive voice response system; analyzing the response of the patient to the guided voice prompt; determining, based on the response of the patient, whether a command should be transmitted; and transmitting a command to a device controlled by the patient after a determination that the command should be transmitted. This method can be practiced automatically to allow a medical device for a patient or other subject to be monitored without requiring the patient to manually enter information.11-17-2011
20090138269SYSTEM AND METHOD FOR ENABLING VOICE DRIVEN INTERACTIONS AMONG MULTIPLE IVR'S, CONSTITUTING A VOICE WORKFLOW - A method for enabling voice driven interactions among multiple interactive voice response (IVR) systems begins by receiving a telephone call from a user of a first IVR system to begin a transaction; and, automatically contacting, by the first IVR system, at least one additional IVR system. Specifically, the contacting of the additional IVR system includes assigning tasks to the additional IVR system. The tasks require input from the user and the additional IVR system is secure and separate from the first IVR system. Moreover, the tasks can include a transfer of currency and a transfer of local information.05-28-2009
20100125460TRAINING/COACHING SYSTEM FOR A VOICE-ENABLED WORK ENVIRONMENT - A voice assistant system is disclosed which directs the voice prompts delivered to a first user of a voice assistant to also be communicated wirelessly to the voice assistant of a second user so that the second user can hear the voice prompts as delivered to the first user.05-20-2010
20100088100ELECTRONIC DEVICES WITH VOICE COMMAND AND CONTEXTUAL DATA PROCESSING CAPABILITIES - An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.04-08-2010
20100088101SYSTEM AND METHOD FOR FACILITATING CALL ROUTING USING SPEECH RECOGNITION - A computer-implemented method is described for optimizing prompts for a speech-enabled application. The speech-enabled application is operable to receive communications from a number of users and communicate one or more prompts to each user to elicit a response from the user that indicates the purpose of the user's communication. The method includes determining a number of prompt alternatives (each including one or more prompts) to evaluate and determining an evaluation period for each prompt alternative. The method also includes automatically presenting each prompt alternative to users during the associated evaluation period and automatically recording the results of user responses to each prompt alternative. Furthermore, the method includes automatically analyzing the recorded results for each prompt alternative based on one or more performance criteria and automatically implementing one of the prompt alternatives based on the analysis of the recorded results.04-08-2010
20110202350REMOTE CONTROL OF A WEB BROWSER - A system for remotely and interactively controlling visual and multimedia content displayed on and rendered by a web browser using a telephony device. In particular, the system relates to receiving a voice input (e.g., dual tone multi-frequency DTMF input, spoken input, etc.) from a telephony device (e.g., a landline, a cellular telephone, or other system with telephone functionality, etc.) via a wide-area network to an intermediary computer that is configured to control the rendering of one or more web pages (or other web data) by a standard web browser.08-18-2011
20080208586Enabling Natural Language Understanding In An X+V Page Of A Multimodal Application - Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.08-28-2008
20090276223REMOTE ADMINISTRATION METHOD AND SYSTEM - An administration method and system. The method includes receiving by a computing system, a telephone call from an administrator. The computing system presents an audible menu associated with a plurality of computers to the administrator. The computing system receives from the administrator, an audible selection for a computer from the audible menu. The computing system receives from the administrator, an audible verbal command for performing a maintenance operation on the computer. The computing system executes the maintenance operation on the computer. The computing system receives from the computer, confirmation data indicating that the maintenance operation has been completed. The computing system converts the confirmation data into an audible verbal message. The computing system transmits the audible verbal message to the administrator.11-05-2009
20110144999DIALOGUE SYSTEM AND DIALOGUE METHOD THEREOF - A dialogue system and a method for the same are disclosed. The dialogue system includes a multimodal input unit receiving speech and non-speech information of a user, a domain reasoner, which stores a plurality of pre-stored situations, each of which is formed by a combination of one or more pieces of speech and non-speech information, calculating each adaptability of the pre-stored situations on the basis of a generated situation based on the speech and the non-speech information received from the multimodal input unit, and determining a current domain according to the calculated adaptability, a dialogue manager to select a response corresponding to the current domain, and a multimodal output unit to output the response. The dialogue system performs domain reasoning using a situation including information combinations reflected in the domain reasoning process, current information, and a speech recognition result, and reduces the size of a dialogue search space while increasing domain reasoning accuracy.06-16-2011
20080228490METHOD AND APPARATUS FOR LINKING REPRESENTATION AND REALIZATION DATA - A method and apparatus for creating links between a representation (e.g., text data) and a realization (e.g., corresponding audio data) is provided. According to the invention, the realization is structured by combining a time-stamped version of the representation generated from the realization with structural information from the representation. Thereby, so-called hyperlinks between representation and realization are created. These hyperlinks are used for performing search operations in realization data equivalent to those which are possible in representation data, enabling improved access to the realization (e.g., via audio databases).09-18-2008
20080228489SYSTEM AND METHOD FOR ANALYZING AUTOMATIC SPEECH RECOGNITION PERFORMANCE DATA - In a disclosed method for interpreting automatic speech recognition (ASR) performance data, a data processing system may receive user input that selects a log file to be processed. The log file may contain log records produced by an ASR system as a result of verbal interaction between an individual and the ASR system. In response to receiving the user input, the data processing system may automatically interpret data in the log records and generate interpretation results. The interpretation results may include a duration for a system prompt communicated to the individual by the ASR system, a user response to the system prompt, and a duration for the user response. The user response may include a textual representation of a verbal response from the individual, obtained through ASR. The interpretation results may also include an overall duration for the telephone call.09-18-2008
20080312933INTERFACING AN APPLICATION SERVER TO REMOTE RESOURCES USING ENTERPRISE JAVA BEANS AS INTERFACE COMPONENTS - A method for interfacing an application server with a resource can include the step of associating a plurality of Enterprise Java Beans (EJBs) to a plurality of resources, where a one-to-one correspondence exists between EJB and resource. An application server can receive an application request and can determine a resource for handling the request. An EJB associated with the determined resource can interface the application server to the determined resource. The request can be handled with the determined resource.12-18-2008
20080249781Voice business client - The subject matter herein relates to computer software and client-server based applications and, more particularly, to a voice business client. Some embodiments include one or more device-agnostic application interaction models and one or more device specific transformation services. Some such embodiments provide one or more of systems, methods, and software embodied at least in part in a device specific transformation service to transform channel agnostic application interaction models to and from device or device surrogate specific formats.10-09-2008
20080270142Remote Interactive Information Delivery System - Disclosed herein is a method and system for providing a response to a user's request for information. The user calls into an intelligent information delivery system and requests the information. The information request is recorded as an audio file at the intelligent information delivery system. A structured text form of the audio file is refined into an optimized search query. The optimized search query is input to retrieve search results comprising information of interest from a data server. The search results are processed into an agent readability enhanced and context specific output and displayed to the agent. The agent selects context specific results from the displayed output. The selected context specific results are formatted to an optimized speech deliverable text form. Content of the optimized speech deliverable text form is converted into a voice stream. The voice stream is then communicated to the user.10-30-2008
20080235028Creating A Voice Response Grammar From A Presentation Grammar - Methods, systems, and products are disclosed for creating a voice response grammar in a voice response server including identifying presentation documents for a presentation, each presentation document having a presentation grammar. Typical embodiments include storing each presentation grammar in a voice response grammar on a voice response server. In typical embodiments, identifying presentation documents for a presentation includes creating a data structure representing a presentation and listing at least one presentation document in the data structure representing a presentation. In typical embodiments listing the at least one presentation document includes storing a location of the presentation document in the data structure representing a presentation and storing each presentation grammar includes retrieving a presentation grammar of the presentation document in dependence upon the location of the presentation document.09-25-2008
20100023332SPEECH RECOGNITION INTERFACE FOR VOICE ACTUATION OF LEGACY SYSTEMS - Methods and apparatus are disclosed for a technician to access a systems interface to back-end legacy systems by voice input commands to a speech recognition module. Generally, a user logs a computer into a systems interface which permits access to back-end legacy systems. Preferably, the systems interface includes a first server with middleware for managing the protocol interface. Preferably, the systems interface includes a second server for receiving requests and generating legacy transactions. After the computer is logged-on, a request for voice input is made. A speech recognition module is launched or otherwise activated. The user inputs voice commands that are processed to convert them to commands and text that can be recognized by the client software. The client software formats the requests and forwards them to the systems interface in order to retrieve the requested information.01-28-2010
20090187410SYSTEM AND METHOD OF PROVIDING SPEECH PROCESSING IN USER INTERFACE - Disclosed are systems, methods and computer-readable media for enabling speech processing in a user interface of a device. The method includes receiving an indication of a field in a user interface of a device, the indication also signaling that speech will follow, receiving the speech from the user at the device, the speech being associated with the field, transmitting the speech as a request to a public, common network node that receives and processes speech, processing the transmitted speech and returning text associated with the speech to the device and inserting the text into the field. Upon a second indication from the user, the system processes the text in the field as programmed by the user interface. The present disclosure provides a speech mash up application for a user interface of a mobile or desktop device that does not require expensive speech processing technologies.07-23-2009
20090326954IMAGING APPARATUS, METHOD OF CONTROLLING SAME AND COMPUTER PROGRAM THEREFOR - An imaging apparatus is provided. The apparatus includes a sound collecting unit configured to collect speech in a monitored environment, a shooting unit configured to shoot video in the monitored environment, a detection unit configured to detect a change in a state of the monitored environment based upon a change in data acquired by the sound collecting unit, the shooting unit and a sensor for measuring the state of the monitored environment, a recognition unit configured to recognize the change in state with regard to speech data acquired by the sound collecting unit and video data acquired by the shooting unit, and a control unit configured to start up the recognition unit and select a recognition database, which is used by the recognition unit, based upon result of detection by the detection unit.12-31-2009
20090326953Method of accessing cultural resources or digital contents, such as text, video, audio and web pages by voice recognition with any type of programmable device without the use of the hands or any physical apparatus. - The use of voice as a means of communication with a computer or programmable device (12-31-2009
20080319759INTEGRATING A VOICE BROWSER INTO A WEB 2.0 ENVIRONMENT - The present invention discloses a system and method for integrating a voice browser into a Web 2.0 environment. For example, a system is disclosed which includes at least a Web 2.0 server, a voice browser, and a server-side speech processing system. The Web 2.0 server can serve Web 2.0 content comprising at least one speech-enabled application. The served Web 2.0 content can include voice markup. The voice browser can render the Web 2.0 content received from the Web 2.0 server which includes rendering the voice markup. The server-side speech processing system can handle speech processing operations for the speech-enabled application. Communications with the server-side speech processing system occur via a set of RESTful commands, such as an HTTP GET command, an HTTP POST command, an HTTP PUT command, and an HTTP DELETE command.12-25-2008
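The four RESTful commands named in the abstract above (HTTP GET, POST, PUT, and DELETE) can be sketched as a thin client that builds requests against a speech resource endpoint. The endpoint layout and payload shape are assumptions for illustration, not the patent's actual interface.

```python
class SpeechResourceClient:
    """Builds (but does not send) RESTful requests for speech resources."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")

    def _request(self, method: str, path: str, body=None):
        # A real client would transmit this over HTTP; here we only
        # construct the request to show the verb-to-operation mapping.
        return {"method": method,
                "url": f"{self.base_url}{path}",
                "body": body}

    def get(self, rid):            # retrieve a resource
        return self._request("GET", f"/resources/{rid}")

    def create(self, body):        # create a resource
        return self._request("POST", "/resources", body)

    def update(self, rid, body):   # replace a resource
        return self._request("PUT", f"/resources/{rid}", body)

    def delete(self, rid):         # remove a resource
        return self._request("DELETE", f"/resources/{rid}")
```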
20080319760CREATING AND EDITING WEB 2.0 ENTRIES INCLUDING VOICE ENABLED ONES USING A VOICE ONLY INTERFACE - The present invention discloses a method for creating Web 2.0 entries, such as WIKI entries. In the method, a voice communication channel can be established between a user and an automated response system. User speech input can be received over the voice communication channel. A Web 2.0 entry can be created based upon the speech input. The Web 2.0 entry can be saved in a data store accessible by a Web 2.0 server. The Web 2.0 server can serve the saved Web 2.0 entry to Web 2.0 clients. The Web 2.0 clients can include a graphical and/or a voice interface through which the Web 2.0 entry can be presented to users of the clients. The created Web 2.0 entries (e.g. Web 2.0 application) can be formatted in an ATOM PUBLISHING PROTOCOL compliant manner.12-25-2008
20080319758SPEECH-ENABLED APPLICATION THAT USES WEB 2.0 CONCEPTS TO INTERFACE WITH SPEECH ENGINES - The present invention discloses a speech-enabled application that includes two or more linked markup documents that together form a speech-enabled application served by a Web 2.0 server. The linked markup documents can conform to an ATOM PUBLISHING PROTOCOL (APP) based protocol. Additionally, the linked markup documents can include an entry collection of documents and a resource collection of documents. The resource collection can include at least one speech resource associated with a speech engine disposed in a speech processing system remotely located from the Web 2.0 server. The speech resource can add a speech processing capability to the speech-enabled application. In one embodiment, end-users of the speech-enabled application can be permitted to introspect, customize, replace, add, re-order, and remove at least a portion of the linked markup documents.12-25-2008
20090055191ESTABLISHING CALL-BASED AUDIO SOCKETS WITHIN A COMPONENTIZED VOICE SERVER - A method of interfacing a telephone application server and a speech engine can include the step of establishing one or more audio sockets in a media converting component of the telephone application server. The audio socket can remain available for approximately a duration of a call. A work unit that requires processing by a speech engine can be detected for the call. An identifier for the audio socket and a data for the work unit can be conveyed to a selected speech engine. Work unit results from the selected speech engine can be received by the media converting component via the previously established audio socket.02-26-2009
20120078637METHOD AND APPARATUS FOR PERFORMING AND CONTROLLING SPEECH RECOGNITION AND ENROLLMENT - A method and an apparatus for performing and controlling speech recognition and enrollment are provided. The method for performing speech recognition and enrollment includes: receiving a Speech Enrollment Start Request and a Speech Recognition Request sent from a media gateway controller (MGC); performing speech recognition and enrollment according to the Speech Enrollment Start Request and the Speech Recognition Request, and obtaining a recognition and enrollment result; and feeding back the recognition and enrollment result to the MGC.03-29-2012
20120078636EVIDENCE DIFFUSION AMONG CANDIDATE ANSWERS DURING QUESTION ANSWERING - Diffusing evidence among candidate answers during question answering may identify a relationship between a first candidate answer and a second candidate answer, wherein the candidate answers are generated by a question-answering computer process, the candidate answers have associated supporting evidence, and the candidate answers have associated confidence scores. All or some of the evidence may be transferred from the first candidate answer to the second candidate answer based on the identified relationship. A new confidence score may be computed for the second candidate answer based on the transferred evidence.03-29-2012
20090204407System and method for processing a spoken request from a user - A system and method are described for processing a spoken request from a user. In one embodiment, a method is disclosed for attempting to recognize a spoken request from a user with a speech recognition engine above a predetermined level of accuracy. If the spoken request is not recognized above the predetermined level of accuracy, the spoken request is provided to a level one agent. If the level one agent does not recognize the request, a voice connection is established between the user and a level two agent. In another embodiment, a method is disclosed for determining whether a silent response system recognizes a spoken request from a user above a predetermined level of accuracy. A response is provided to the user if the silent response system recognizes the spoken request. Otherwise, a voice connection is established between the user and a call center.08-13-2009
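The escalation flow in the abstract above (automated recognition, then a level-one agent, then a voice connection to a level-two agent) can be sketched as a small routing function. The threshold value and return labels below are illustrative, not from the patent:

```python
def route_request(confidence, level_one_recognizes, accuracy_threshold=0.85):
    """Route a spoken request through the tiered flow:
    ASR above the accuracy threshold -> automated response;
    otherwise a level-one agent; if that agent also fails,
    a voice connection to a level-two agent."""
    if confidence >= accuracy_threshold:
        return "automated_response"
    if level_one_recognizes:
        return "level_one_agent"
    return "voice_connection_level_two"
```

The design point is that a live voice channel is only opened as a last resort, keeping most traffic on the cheaper automated and silent-agent tiers.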
20100274563METHOD AND MOBILE COMMUNICATION DEVICE FOR GENERATING DUAL-TONE MULTI-FREQUENCY (DTMF) COMMANDS ON A MOBILE COMMUNICATION DEVICE HAVING A TOUCHSCREEN - A method and mobile communication device for generating dual-tone multi-frequency (DTMF) commands on a mobile communication device having a touchscreen are provided. In accordance with one embodiment, there is provided a method for generating dual-tone multi-frequency (DTMF) commands on a mobile communication device having a touchscreen, comprising: detecting an automated attendant during a telephone call; activating speech recognition in respect of incoming voice data during the telephone call in response to detecting an automated attendant; translating spoken prompts in the incoming voice data into respective DTMF commands; displaying a menu having selectable menu options corresponding to the DTMF commands in a graphical user interface on the touchscreen; receiving input via the touchscreen activating a selected one of the menu options; and generating a DTMF command in accordance with the activated menu option.10-28-2010
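Generating a DTMF command for an activated menu option amounts to looking up the key's standard tone pair. The row/column frequencies below are the standard DTMF keypad values (ITU-T Q.23); the menu mapping is an illustrative assumption:

```python
# Standard DTMF keypad frequency pairs: (row Hz, column Hz).
DTMF_FREQS = {
    "1": (697, 1209), "2": (697, 1336), "3": (697, 1477),
    "4": (770, 1209), "5": (770, 1336), "6": (770, 1477),
    "7": (852, 1209), "8": (852, 1336), "9": (852, 1477),
    "*": (941, 1209), "0": (941, 1336), "#": (941, 1477),
}

def menu_to_dtmf(menu_options, selection):
    """Map a menu option (recognized from a spoken prompt such as
    'press 2 for billing') to the DTMF tone pair to synthesize."""
    digit = menu_options[selection]
    return DTMF_FREQS[digit]
```

The device would synthesize the two sine tones simultaneously and inject them into the outgoing call audio.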
20100211396System and Method for Speech Recognition System - A digital speech enabled middleware module is disclosed that facilitates interaction between a large number of client devices and network-based automatic speech recognition (ASR) resources. The module buffers feature vectors associated with speech received from the client devices when the number of client devices is greater than the available ASR resources. When an ASR decoder becomes available, the module transmits the feature vectors to the ASR decoder and a recognition result is returned.08-19-2010
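The middleware described above buffers feature vectors per client while all ASR decoders are busy, then flushes a client's buffer when a decoder frees up. A minimal sketch (the class and method names are illustrative, not from the patent):

```python
from collections import deque

class ASRMiddleware:
    """Buffer per-client feature vectors while no ASR decoder is free;
    flush a client's buffered vectors when a decoder becomes available."""

    def __init__(self):
        self.buffers = {}       # client_id -> deque of feature vectors
        self.waiting = deque()  # clients waiting for a decoder, FIFO

    def receive(self, client_id, feature_vector):
        """Accept one feature vector of speech from a client device."""
        if client_id not in self.buffers:
            self.buffers[client_id] = deque()
            self.waiting.append(client_id)
        self.buffers[client_id].append(feature_vector)

    def decoder_available(self):
        """Called when a decoder frees up: hand the oldest waiting
        client's accumulated vectors to it."""
        if not self.waiting:
            return None
        client_id = self.waiting.popleft()
        vectors = list(self.buffers.pop(client_id))
        return client_id, vectors
```

Buffering features rather than raw audio keeps the queued data small, which is the point of doing front-end feature extraction before the decoder stage.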
20100217603Method, System, and Apparatus for Enabling Adaptive Natural Language Processing - An adaptive processing system includes one or more adaptive processing engines that are adapted to receive one or more requests from one or more communication devices. The one or more adaptive processing engines are adapted to parse the one or more requests and to communicate one or more queries to the one or more communication devices based at least in part on the information of the request. In one embodiment, the parsing of the one or more queries includes analyzing the information from the user, determining one or more next steps to perform based at least in part on the information from the user, and generating one or more queries for additional information. The system also includes an application server adapted to communicate with the one or more adaptive processing engines in response to the one or more requests based at least in part on the information of the one or more requests.08-26-2010
20090076823INTERACTIVE VOICE RESPONSE INTERFACE, SYSTEM, METHODS AND PROGRAM FOR CORRECTIONAL FACILITY COMMISSARY - An interactive voice response interface for a correctional facility commissary that detects violations of facility restrictions to orders for commissary goods at more than one point in time, and allows comprehensive review and editing of pending orders for commissary items.03-19-2009
20090144061Systems and methods for generating verbal feedback messages in head-worn electronic devices - Systems and methods for generating and providing verbal feedback messages to wearers of man-machine interface (MMI)-enabled head-worn electronic devices. An exemplary head-worn electronic device includes an MMI and an acoustic signal generator configured to provide verbal acoustic messages to a wearer of the head-worn electronic device in response to the wearer's interaction with the MMI. The head-worn electronic device may be further configured to monitor device states and generate and provide verbal acoustic messages indicative of changes to the device states to the wearer. The verbal messages are digitally stored and accessed by a microprocessor configured to execute a verbal feedback generation program. Further, the verbal messages may be stored according to multiple different natural languages, thereby allowing a user to select a preferred natural language by which the verbal acoustic messages are fed back to the user.06-04-2009
20090112600SYSTEM AND METHOD FOR INCREASING ACCURACY OF SEARCHES BASED ON COMMUNITIES OF INTEREST - Disclosed are systems, methods and computer-readable media for using a local communication network to generate a speech model. The method includes retrieving for an individual a list of numbers in a calling history, identifying a local neighborhood associated with each number in the calling history, truncating the local neighborhood associated with each number based on at least one parameter, retrieving a local communication network associated with each number in the calling history and each phone number in the local neighborhood, and creating a language model for the individual based on the retrieved local communication network. The generated language model may be used for improved automatic speech recognition for audible searches as well as other modules in a spoken dialog system.04-30-2009
20110040564VOICE ASSISTANT SYSTEM FOR DETERMINING ACTIVITY INFORMATION - A system and method of assisting a care provider in the documentation of self-performance and support information for a resident or person includes a speech dialog with a care provider that uses the generation of speech to play to the care provider and the capture of speech spoken by a care provider. The speech dialog provides assistance to the care provider in providing care for a person according to a care plan for the person. The care plan includes one or more activities requiring a level of performance by the person. For the activity, speech inquiries are provided to the care provider, through the speech dialog, regarding performance of the activity by the person and regarding care provider assistance in the performance of the activity by the person. Speech input is captured from the care provider that is responsive to the speech inquiries. A code is then determined from the speech input and the code indicates the self-performance of the person and support information for a care provider for the activity.02-17-2011
20110082698Devices, Systems and Methods for Improving and Adjusting Communication - Devices, methods and systems for improving and adjusting voice volume and body movements during a performance are disclosed. Device embodiments may be configured with a processor, microphone, one or more movement sensors and at least a display or a speaker. The processor may include instructions configured to receive at least one of sound input from the microphone and movement data from the one or more movement sensors, generate one or more input levels corresponding to at least one of the sound input and movement data, compare the one or more generated input levels to one or more predefined input levels, associate the one or more predefined input levels with at least one of a color, text, graphic or audio file and present at least one of the color, text, graphic or audio file to a user of the device.04-07-2011
20090106028AUTOMATED TUNING OF SPEECH RECOGNITION PARAMETERS - A method, for execution on a server, of providing dynamically loaded speech recognition parameters to a speech recognition engine can be provided. The method can include storing at least one rule for selecting speech recognition parameters, wherein a rule comprises an if-portion including criteria and a then-portion specifying the speech recognition parameters that must be used when the criteria are met. The method can further include receiving notice that a speech recognition session has been initiated between a user and the speech recognition engine. The method can further include selecting a first set of speech recognition parameters responsive to executing the at least one rule and providing to the speech recognition engine the first set of speech recognition parameters for performing speech recognition of the user.04-23-2009
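The if/then rule structure above maps directly onto a first-match rule evaluator. A minimal sketch (the rule encoding, criteria keys, and parameter names are illustrative assumptions):

```python
def select_parameters(rules, session_context, defaults):
    """Evaluate if/then rules in order; the first rule whose if-portion
    criteria all match the session context supplies the then-portion's
    speech recognition parameters. Fall back to defaults otherwise."""
    for rule in rules:
        criteria, params = rule["if"], rule["then"]
        if all(session_context.get(k) == v for k, v in criteria.items()):
            return params
    return defaults

# Example rule set (hypothetical channel types and parameter values):
RULES = [
    {"if": {"channel": "mobile"},   "then": {"timeout_ms": 5000, "beam": 12}},
    {"if": {"channel": "landline"}, "then": {"timeout_ms": 3000, "beam": 8}},
]
```

At session start the server would build `session_context` from the notice it receives, run the rules, and push the winning parameter set to the engine.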
20120303372ENABLING SECURE TRANSACTIONS BETWEEN SPOKEN WEB SITES - Techniques for enabling a secure transaction with a remote site that uses voice interaction are provided. The techniques include authenticating a remote site to enable a secure transaction, wherein authenticating the remote site comprises using a dynamically generated audio signal.11-29-2012
20120203557COMPREHENSIVE MULTIPLE FEATURE TELEMATICS SYSTEM - A comprehensive system and method for telematics including the following features individually or in sub-combinations: vehicle user interfaces, telecommunications, speech recognition, digital commerce and vehicle parking, digital signal processing, wireless transmission of digitized voice input, navigational assistance for motorists, data communication to vehicles, mobile client-server communication, extending coverage and bandwidth of wireless communication services, and noise reduction.08-09-2012
20080243515SYSTEM AND METHOD FOR PROVIDING AN AUTOMATED CALL CENTER INLINE ARCHITECTURE - A system and method for providing an automated call center inline architecture is provided. A plurality of grammar references and prompts are maintained on a script engine. A call is received through a telephony interface. Audio data is collected using the prompts from the script engine, which are transmitted to the telephony interface via a message server. Distributed speech recognition is performed on a speech server. The grammar references are received from the script engine via the message server. Speech results are determined by applying the grammar references to the audio data. A new grammar is formed from the speech results. Speech recognition results are identified by applying the new grammar to the audio data. The speech recognition results are received as a display on an agent console.10-02-2008
20080255847Meeting visualization system - Voices of plural participants during a meeting are obtained, and dialogue situations of the participants that change every second are displayed in real time, so that it is possible to provide a meeting visualization system for triggering more active discussions. Voice data collected from plural voice collecting units associated with plural participants is processed by a voice processing server to extract speech information. The speech information is sequentially input to an aggregation server. A query process is performed on the speech information by a stream data processing unit of the aggregation server, so that activity data such as the accumulated number of speeches by the participants in the meeting is generated. A display processing unit visualizes and displays dialogue situations of the participants by using the sizes of circles and the thicknesses of lines on the basis of the activity data.10-16-2008
20080255848Speech Recognition Method and System and Speech Recognition Server - A speech recognition method, system and server includes receiving speech information from at least one User Equipment; analyzing and recognizing the speech information and searching for a speech feature matching the speech information; and obtaining an instruction in accordance with the speech feature and executing the instruction. With the various embodiments of the disclosure, cost of the User Equipment may be reduced and accuracy of speech recognition may be improved.10-16-2008
20100324910TECHNIQUES TO PROVIDE A STANDARD INTERFACE TO A SPEECH RECOGNITION PLATFORM - Techniques and systems to provide speech recognition services over a network using a standard interface are described. In an embodiment, a technique includes accepting a speech recognition request that includes at least audio input, via an application program interface (API). The speech recognition request may also include additional parameters. The technique further includes performing speech recognition on the audio according to the request and any specified parameters; and returning a speech recognition result as a hypertext protocol (HTTP) response. Other embodiments are described and claimed.12-23-2010
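The abstract above describes exposing a speech recognition platform through a standard HTTP interface: the client POSTs audio plus parameters and receives the result as an HTTP response. A minimal sketch of the client side (the endpoint path, parameter names, and JSON shape are illustrative assumptions, not the patent's actual interface):

```python
import json

def build_recognition_request(audio_bytes, language="en-US",
                              max_alternatives=1):
    """Assemble a speech-recognition API call as method/path/headers/
    params/body, ready to be sent over HTTP."""
    return {
        "method": "POST",
        "path": "/v1/recognize",  # hypothetical endpoint
        "headers": {"Content-Type": "application/octet-stream",
                    "Accept": "application/json"},
        "params": {"language": language,
                   "max_alternatives": max_alternatives},
        "body": audio_bytes,
    }

def parse_recognition_response(http_body):
    """Extract the top hypothesis from a JSON HTTP response body."""
    result = json.loads(http_body)
    return result["hypotheses"][0]["transcript"]
```

Keeping the contract to plain HTTP plus JSON is what makes the platform reachable from any client stack without a proprietary SDK.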
20120310652Adaptive Human Computer Interface (AAHCI) - An Adaptive Human-Computer Interface (AAHCI) allows an electronic system to automatically monitor and learn from normal in-use behavior exhibited by a human user via responses generated by the supported input devices and to adjust output to the supported output devices accordingly. This Auto-Learning process is different than computer-directed training sessions and takes place as the user begins to use the device for the first time and with repeated use over time. The purpose of the AAHCI is to provide a user experience that is tailored to the skills, preferences, deficiencies and other personal attributes of the user automatically via machine-learned processes. This in turn provides an improved user experience that is more productive and cost efficient and that can automatically optimize itself over time with repeated use.12-06-2012
20100223059METHOD AND APPARATUS FOR PLAYING DYNAMIC AUDIO AND VIDEO MENUS - A method and an apparatus for playing dynamic audio and video menus are provided herein to play two or more audio and video menu items dynamically. Specifically, the audio and video data in at least two obtained audio and video menu items are split into audio data and video data, respectively. After the splitting, the obtained video data is integrated into one video stream data and the audio data and the integrated video stream data are played. In this way, the video data of each menu item in the dynamic audio and video menus are played smoothly and the voice prompts can be spliced seamlessly. As such, the effect of the audio dynamic menus is the same as the effect of playing a single audio file, and the user can hear the voice menus smoothly.09-02-2010
20110191108Remote controller with position-actuated voice transmission - A method of operation of a remote controller consistent with certain implementations involves determining a spatial orientation of the remote controller based upon an output signal from a position detector; and setting a voice mode of operation of the remote controller as active or inactive based upon the spatial orientation of the remote controller as determined by the position detector, where the voice mode determines whether or not the remote controller will accept and process voice information from a microphone. This abstract is not to be considered limiting, since other embodiments may deviate from the features described in this abstract.08-04-2011
20090192800Medical Ontology Based Data & Voice Command Processing System - A computerized integrated order entry and clinical documentation and voice recognition system enables voice responsive user entry of orders. The system includes a voice recognition unit for detecting spoken words and converting detected spoken words to data representing commands. A data processor, coupled to the voice recognition unit, processes the data representing commands provided by the voice recognition unit, to provide order and documentation related data and menu options for use by a user, by interpreting the data representing commands using an ordering and documentation application specific ontology and excluding use of other non-ordering or non-documentation application specific ontologies. The ordering application enables initiating an order for medication to be administered to a particular patient, or additional ordered services to be performed. A user interface processor, coupled to the data processor, provides data representing a display image. The display image includes the order related data and menu options provided by the data processor and supports a user in selecting an order for medication to be administered to a particular patient.07-30-2009
20110307259SYSTEM AND METHOD FOR AUDIO CONTENT NAVIGATION - A system and method for communicating one or more audio files through a network. One or more original files of an original web site are converted into one or more audio files. An indication is provided to a user that the one or more original files are available as the one or more audio files in response to the user navigating the one or more original files. The one or more audio files are delivered to a computing device of the user through the network in response to a request to access the one or more audio files.12-15-2011
20120065982DYNAMICALLY GENERATING A VOCAL HELP PROMPT IN A MULTIMODAL APPLICATION - Dynamically generating a vocal help prompt in a multimodal application includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.03-15-2012
20120046950RETRIEVAL AND PRESENTATION OF NETWORK SERVICE RESULTS FOR MOBILE DEVICE USING A MULTIMODAL BROWSER - A method of obtaining information using a mobile device can include receiving a request including speech data from the mobile device, and querying a network service using query information extracted from the speech data, whereby search results are received from the network service. The search results can be formatted for presentation on a display of the mobile device. The search results further can be sent, along with a voice grammar generated from the search results, to the mobile device. The mobile device then can render the search results.02-23-2012
20120209613METHOD AND ARRANGEMENT FOR MANAGING GRAMMAR OPTIONS IN A GRAPHICAL CALLFLOW BUILDER08-16-2012
20120046951NUMERIC WEIGHTING OF ERROR RECOVERY PROMPTS FOR TRANSFER TO A HUMAN AGENT FROM AN AUTOMATED SPEECH RESPONSE SYSTEM - A method for a speech response system to automatically transfer users to human agents. The method can establish an interactive dialog session between a user and an automated speech response system. An error score can be established when the interactive dialog session is initiated. During the interactive dialog session, responses to dialog prompts can be received. Error weights can be assigned to received responses determined to be non-valid responses. Different non-valid responses can be assigned different error weights. For each non-valid response, the assigned error weight can be added to the error score. When a value of the error score exceeds a previously established error threshold, a user can be automatically transferred from the automated speech response system to a human agent.02-23-2012
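The weighted error-score mechanism above is a short accumulation loop. A minimal sketch (the response categories, weight values, and threshold are illustrative assumptions, not from the patent):

```python
# Hypothetical per-category error weights: different kinds of non-valid
# responses contribute differently to the running error score.
ERROR_WEIGHTS = {"no_input": 1.0, "no_match": 2.0, "invalid_option": 3.0}

def should_transfer(responses, threshold=5.0, weights=ERROR_WEIGHTS):
    """Accumulate weights of non-valid responses in session order;
    signal transfer to a human agent once the running error score
    exceeds the threshold. Valid responses add nothing."""
    score = 0.0
    for kind in responses:
        score += weights.get(kind, 0.0)
        if score > threshold:
            return True
    return False
```

Weighting lets the system escalate quickly on serious failures (e.g. repeated invalid options) while tolerating a few benign timeouts.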
20130013317METHOD AND APPARATUS FOR NAVIGATION OF A DIALOGUE SYSTEM - In one embodiment, the present disclosure is a method and apparatus for navigation of a dialogue system. In one embodiment, a method for facilitating navigation of a menu of a dialogue system includes encoding data including information for navigating the menu in a machine-readable data structure and outputting the machine-readable data structure.01-10-2013
20120116777Stateful, Double-Buffered Dynamic Navigation Voice Prompting - A navigation system written in J2ME MIDP for a client device includes a plurality of media players, each comprising its own buffer. A navigation program manages the state of the plurality of media players. Each media player is in either an acquiring-resources state or a playing-and-de-allocating state. The use of a plurality of media players, each with its own buffer, overcomes the prior art, in which a navigation system can cut off a voice prompt because of the time-consuming tasks associated with playing a voice prompt.05-10-2012
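The two-state, double-buffered scheme above can be sketched as a pair of players where one pre-buffers the next prompt while the other plays. This is a simplified Python model of the state machine, not the patent's J2ME implementation; all names are illustrative:

```python
class MediaPlayer:
    """A player that is either acquiring resources (buffering a prompt)
    or playing and then de-allocating its buffer."""

    def __init__(self):
        self.state = "acquiring"
        self.buffer = None

    def load(self, prompt):
        assert self.state == "acquiring"
        self.buffer = prompt

    def play(self):
        self.state = "playing"
        prompt = self.buffer
        self.buffer = None        # de-allocate after playing
        self.state = "acquiring"  # ready to buffer the next prompt
        return prompt

class DoubleBufferedPrompter:
    """While one player plays the current voice prompt, the other
    acquires resources for the next, so prompts are not cut off."""

    def __init__(self):
        self.players = [MediaPlayer(), MediaPlayer()]
        self.active = 0

    def queue_and_play(self, prompt):
        idle = self.players[1 - self.active]
        idle.load(prompt)             # pre-buffer on the idle player
        self.active = 1 - self.active # swap roles
        return self.players[self.active].play()
```

The swap on each prompt is what hides the slow resource-acquisition step behind the playback of the previous prompt.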
20120116776System and method for client voice building - Provided is a system and method for building and managing a customized voice of an end-user, comprising the steps of designing a set of prompts for collection from the user, wherein the prompts are selected both by an analysis tool and by the user to capture voice characteristics unique to the user. The prompts are delivered to the user over a network to allow the user to save a recording on a server of a service provider. This recording is then retrieved, stored on the server, and used to build a voice database with text-to-speech synthesis tools. A graphical interface allows the user to continuously refine the data file to improve the voice and customize parameter and configuration settings, thereby forming a customized voice database which can be deployed or accessed.05-10-2012
20120116775BIOCHEMICAL ANALYZER HAVING MICROPROCESSING APPARATUS WITH EXPANDABLE VOICE CAPACITY - A biochemical analyzer having a microprocessing apparatus with expandable voice capacity is characterized in that a driving module is installed in a data processor and a voice carrier is replaceable. Thereby, increase or decrease of voice files can be easily done by replacing the current voice carrier with an alternative voice carrier storing desired voice files, without the need of replacing the driving module together with the voice carrier, thereby saving costs and reducing processing procedures.05-10-2012
20120022872Automatically Adapting User Interfaces For Hands-Free Interaction - A user interface for a system such as a virtual assistant is automatically adapted for hands-free use. A hands-free context is detected via automatic or manual means, and the system adapts various stages of a complex interactive system to modify the user experience to reflect the particular limitations of such a context. The system of the present invention thus allows for a single implementation of a complex system such as a virtual assistant to dynamically offer user interface elements and alter user interface behavior to allow hands-free use without compromising the user experience of the same system for hands-on use.01-26-2012
20080208585Ordering Recognition Results Produced By An Automatic Speech Recognition Engine For A Multimodal Application - Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.08-28-2008
20120166201INVOKING TAPERED PROMPTS IN A MULTIMODAL APPLICATION - Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.06-28-2012
20120215542METHOD OF PROVIDING DYNAMIC SPEECH PROCESSING SERVICES DURING VARIABLE NETWORK CONNECTIVITY - A client device for providing dynamic speech processing services during variable network connectivity with a network server includes a connection monitor that monitors network connectivity between the client device and the network server. The device further includes a simplified speech processor that processes speech data and is initiated based on an assessment from the connection monitor that the network connectivity is impaired. The device further includes a speech data storage that stores processed speech data from the simplified speech processor and a transmitter that is configured to transmit the stored speech data to the network.08-23-2012
20120253821VEHICULAR DEVICE AND METHOD FOR COMMUNICATING THE SAME WITH INFORMATION CENTER - A communication unit is connected, through an external network, with an information center that can receive analog data. A transmission unit transmits predetermined vehicle information in the form of a voice prompt of analog data when the communication unit is in connection with the information center.10-04-2012
20120253820Mixed-mode interaction - A user of a wireless device, such as a mobile phone, can make purchases or obtain information via a network, such as the Internet, using both voice and non-verbal methods. Users can submit voice queries and receive non-verbal replies, submit non-verbal queries and receive voice replies, or perform similar operations that marry the voice and data capabilities of modern mobile communication devices. The user may provide notification criteria indicating under what conditions a notification should be sent to the user's wireless device. When purchasing opportunities matching the selected notification criteria become available, the user is notified. The user can respond to the notification, and immediately take advantage of the purchasing opportunity if he so desires. Mixed-mode interactions can also be used by sellers to more advantageously control the marketing of distressed, time sensitive, or other merchandise/services.10-04-2012
20120173243Expert Conversation Builder - An expert conversation builder contains a knowledge database that includes a plurality of dialogues having nodes and edges arranged as directed acyclic graphs. Users and authors of the system interface with the knowledge database through a graphical interface to author dialogues and to create expert conversations as threads traversing the nodes of the dialogues.07-05-2012
20100049525METHODS, APPARATUSES, AND SYSTEMS FOR PROVIDING TIMELY USER CUES PERTAINING TO SPEECH RECOGNITION - A method is provided of providing cues from an electronic communication device to a user while capturing an utterance. A plurality of cues associated with the user utterance are provided by the device to the user in at least near real-time. For each of a plurality of portions of the utterance, data representative of the respective portion of the user utterance is communicated from the electronic communication device to a remote electronic device. In response to this communication, data, representative of at least one parameter associated with the respective portion of the user utterance, is received at the electronic communication device. The electronic communication device provides one or more cues to the user based on the at least one parameter. At least one of the cues is provided by the electronic communication device to the user prior to completion of the step of capturing the user utterance.02-25-2010
20080215331ENABLING SPEECH WITHIN A MULTIMODAL PROGRAM USING MARKUP - A method for speech enabling an application can include the step of specifying a speech input within a speech-enabled markup. The speech-enabled markup can also specify an application operation that is to be executed responsive to the detection of the speech input. After the speech input has been defined within the speech-enabled markup, the application can be instantiated. The specified speech input can then be detected and the application operation can be responsively executed in accordance with the specified speech-enabled markup.09-04-2008
20120179471CONFIGURABLE SPEECH RECOGNITION SYSTEM USING MULTIPLE RECOGNIZERS - Techniques for combining the results of multiple recognizers in a distributed speech recognition architecture. Speech data input to a client device is encoded and processed both locally and remotely by different recognizers configured to be proficient at different speech recognition tasks. The client/server architecture is configurable to enable network providers to specify a policy directed to a trade-off between reducing recognition latency perceived by a user and usage of network resources. The results of the local and remote speech recognition engines are combined based, at least in part, on logic stored by one or more components of the client/server architecture.07-12-2012
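Combining local and remote recognizer outputs under a latency-versus-accuracy policy, as the abstract above describes, can be sketched as a small arbitration function. The result fields, policy names, and latency budget below are illustrative assumptions, not the patent's actual logic:

```python
def combine_results(local, remote, policy="prefer_accuracy",
                    latency_budget_ms=300):
    """Arbitrate between local and remote recognizer outputs.

    Each result is a dict with 'text', 'confidence', and 'latency_ms'.
    The policy trades recognition quality against perceived latency
    and network usage."""
    if remote is None:
        return local["text"]  # network unavailable: fall back to local
    if policy == "prefer_latency" and remote["latency_ms"] > latency_budget_ms:
        return local["text"]  # remote answer arrived too late
    # Otherwise take whichever hypothesis scored higher.
    return max(local, remote, key=lambda r: r["confidence"])["text"]
```

A provider-configurable policy like this lets the same client binary behave differently on congested versus idle networks.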
20100010817System and Method for Improving the Performance of Speech Analytics and Word-Spotting Systems - A system and method for improving the performance of speech analytics and word-spotting systems is provided wherein a digitized signal originates from an input client device belonging to a customer, the signal being then passed to a network which passes the signal to both an output client device belonging to a customer service rep and a call recorder. The call recorder compresses the signal using CELP-based technology such as MASC® technology and then sends the compressed signal to a speech analytics engine before being processed with or without a signal processing filter. The speech analytics engine receives the signal and upon also receiving a query, the speech analytics engine operates on the signal in response to the query, thereby outputting one or more desired voice outputs to an application, such as a query application.01-14-2010
20120232906Electronic Devices with Voice Command and Contextual Data Processing Capabilities - An electronic device may capture a voice command from a user. The electronic device may store contextual information about the state of the electronic device when the voice command is received. The electronic device may transmit the voice command and the contextual information to computing equipment such as a desktop computer or a remote server. The computing equipment may perform a speech recognition operation on the voice command and may process the contextual information. The computing equipment may respond to the voice command. The computing equipment may also transmit information to the electronic device that allows the electronic device to respond to the voice command.09-13-2012
20120253822Systems and Methods for Managing Prompts for a Connected Vehicle - A method for providing audio prompts via a service-providing remote center includes receiving a list of requested data from an on-board navigation system of a vehicle, and, for each item in the list of requested data, determining whether an audio prompt is available and delivering an associated audio prompt from the service-providing remote center over a data channel. Also provided is a method for obtaining audio prompts using a minimal number of text-to-speech ports, including determining a plurality of known data items, generating audio prompts for the plurality of known data items with a single text-to-speech engine using batch mode processing, obtaining an associated audio prompt for each of the known data items, and storing each associated audio prompt in a recording database.10-04-2012
20120253823Hybrid Dialog Speech Recognition for In-Vehicle Automated Interaction and In-Vehicle Interfaces Requiring Minimal Driver Processing - A system and method for implementing a server-based speech recognition system for multi-modal automated interaction in a vehicle, in which a vehicle driver receives audio prompts from an on-board human-to-machine interface and responds with speech to complete tasks such as creating and sending text messages, web browsing, navigation, etc. This service-oriented architecture is utilized to call upon specialized speech recognizers in an adaptive fashion. The human-to-machine interface enables completion of a text input task while driving a vehicle in a way that minimizes the frequency of the driver's visual and mechanical interactions with the interface, thereby eliminating unsafe distractions during driving conditions. After the initial prompting, the typing task is followed by a computerized verbalization of the text. Subsequent interface steps can be visual in nature, or involve only sound.10-04-2012
20080300884Using voice commands from a mobile device to remotely access and control a computer - A method of using voice commands from a mobile device to remotely access and control a computer. The method includes receiving audio data from the mobile device at the computer. The audio data is decoded into a command. A software program that the command was provided for is determined. At least one process is executed at the computer in response to the command. Output data is generated at the computer in response to executing at least one process at the computer. The output data is transmitted to the mobile device.12-04-2008
20080235027Supporting Multi-Lingual User Interaction With A Multimodal Application - Methods, apparatus, and products are disclosed for supporting multi-lingual user interaction with a multimodal application, the application including a plurality of VoiceXML dialogs, each dialog characterized by a particular language, supporting multi-lingual user interaction implemented with a plurality of speech engines, each speech engine having a grammar and characterized by a language corresponding to one of the dialogs, with the application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the application operatively coupled to the speech engines through a VoiceXML interpreter, the VoiceXML interpreter: receiving a voice utterance from a user; determining in parallel, using the speech engines, recognition results for each dialog in dependence upon the voice utterance and the grammar for each speech engine; administering the recognition results for the dialogs; and selecting a language for user interaction in dependence upon the administered recognition results.09-25-2008
20080221902MOBILE BROWSER ENVIRONMENT SPEECH PROCESSING FACILITY - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a browser software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the browser software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.09-11-2008
20080221901MOBILE GENERAL SEARCH ENVIRONMENT SPEECH PROCESSING FACILITY - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a search software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the search software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.09-11-2008
20080221900MOBILE LOCAL SEARCH ENVIRONMENT SPEECH PROCESSING FACILITY - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a local search software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the local search software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.09-11-2008
20080221899MOBILE MESSAGING ENVIRONMENT SPEECH PROCESSING FACILITY - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a messaging software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the messaging software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.09-11-2008
20080221898Mobile navigation environment speech processing facility - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a navigation software application resident on a mobile communication facility, where speech may be recorded using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the navigation software application. In embodiments, the user may be allowed to alter the results that are received from the speech recognition facility. In addition, the speech recognition facility may be adapted based on usage.09-11-2008
20080221897MOBILE ENVIRONMENT SPEECH PROCESSING FACILITY - In embodiments of the present invention improved capabilities are described for a mobile environment speech processing facility. The present invention may provide for the entering of text into a software application resident on a mobile communication facility, where recorded speech may be presented by the user using the mobile communications facility's resident capture facility. Transmission of the recording may be provided through a wireless communication facility to a speech recognition facility, and may be accompanied by information related to the software application. Results may be generated utilizing the speech recognition facility that may be independent of structured grammar, and may be based at least in part on the information relating to the software application and the recording. The results may then be transmitted to the mobile communications facility, where they may be loaded into the software application.09-11-2008
20080221896Grammar confusability metric for speech recognition - Architecture for testing an application grammar for the presence of confusable terms. A grammar confusability metric (GCM) is generated for describing a likelihood that a reference term will be confused by the speech recognizer with another term phrase currently allowed by active grammar rules. The GCM is used to flag processing of two phrases in the grammar that have different semantic meaning, but that the speech recognizer could have difficulty distinguishing reliably. A built-in acoustic model is analyzed and feature vectors generated that are close to the acoustic properties of the input term. The feature vectors are then sent for recognition. A statistically random sampling method is applied to explore the acoustic properties of feature vectors of the input term phrase spatially and temporally. The feature vectors are perturbed in the neighborhood of the time domain and the Gaussian mixture model to which the feature vectors belong.09-11-2008
20130138443VOICE-SCREEN ARS SERVICE SYSTEM, METHOD FOR PROVIDING SAME, AND COMPUTER-READABLE RECORDING MEDIUM - A method for providing a voice-screen ARS service on a terminal, according to an embodiment of the present invention, uses an application installed on the terminal to connect to an IVR system of a client company via a voice call and connects a data call to a VARS service server. Menu information including a plurality of menu items related to a client is received through the data call and displayed on a screen, and voice information related to the menu is received through the voice call and output as audio. Accordingly, when a user uses the ARS, voice and on-screen information services are provided simultaneously, thereby reducing the limitations and inaccuracies of voice-only information and increasing user convenience.05-30-2013
20130096924Apparatus and Method for Processing Service Interactions - An interactive voice and data response system that directs input to a voice, text, and web-capable software-based router, which is able to intelligently respond to the input by drawing on a combination of human agents, advanced speech recognition and expert systems, connected to the router via a TCP/IP network. The digitized input is broken down into components so that the customer interaction is managed as a series of small tasks performed by a pool of human agents, rather than one ongoing conversation between the customer and a single agent. The router manages the interactions and keeps pace with a real-time conversation. The system utilizes both speech recognition and human intelligence for purposes of interpreting customer utterances or customer text, where the role of the human agent(s) is to input the intent of caller utterances, and where the computer system—not the human agent—determines which response to provide given the customer's stated intent (as interpreted/captured by the human agents). The system may use more than one human agent, or both human agents and speech recognition software, to simultaneously interpret the same component for error-checking and interpretation accuracy.04-18-2013
20130096923System and Method of Dynamically Modifying a Spoken Dialog System to Reduce Hardware Requirements - A system and method for providing a scalable spoken dialog system are disclosed. The method comprises receiving information which may be internal to the system or external to the system and dynamically modifying at least one module within a spoken dialog system according to the received information. The modules may be one or more of an automatic speech recognition, natural language understanding, dialog management and text-to-speech module or engine. Dynamically modifying the module may improve hardware performance or improve a specific caller's speech processing accuracy, for example. The modification of the modules or hardware may also be based on an application or a task, or based on a current portion of a dialog.04-18-2013
20130103403RESPONDING TO A CALL TO ACTION CONTAINED IN AN AUDIO SIGNAL - An audio signal is monitored to detect the presence of a call to action contained therein. Addressing information is automatically extracted from the call to action and stored on a storage medium. An electronic message responding to the call to action may be automatically prepared, or a contact field may be automatically populated for inclusion in a contact list. The audio signal may be digitized or obtained from a broadcast transmission, and the process may be performed by a mobile communication device, a central system, or a combination thereof.04-25-2013
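The extraction step described in this abstract, pulling addressing information out of call-to-action text, might be sketched as below. The patterns and the `extract_addressing` helper are illustrative assumptions, not the patent's method, and real systems would use far more robust parsing.

```python
# Hypothetical sketch: extract addressing information (a phone number
# or web address) from transcribed call-to-action text.
import re

PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
URL = re.compile(r"\b(?:www\.)?[a-z0-9-]+\.(?:com|org|net)\b", re.I)

def extract_addressing(transcript: str) -> dict:
    """Return any phone number or URL found in the transcript."""
    info = {}
    if m := PHONE.search(transcript):
        info["phone"] = m.group()
    if m := URL.search(transcript):
        info["url"] = m.group()
    return info

info = extract_addressing("Call 555-123-4567 or visit example.com today")
```

The extracted fields could then populate a contact entry or pre-fill an electronic message, as the abstract describes.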
20130132090Voice Data Retrieval System and Program Product Therefor - A voice data retrieval system including an input device for inputting a keyword, a phoneme converting unit for converting the inputted keyword into a phoneme expression, a voice data retrieving unit for retrieving a portion of the voice data at which the keyword is spoken based on the keyword in the phoneme expression, a comparison keyword creating unit for creating a set of comparison keywords that a user may confuse with the keyword when listening, based on a per-user phoneme confusion matrix, and a retrieval result presenting unit for presenting to the user the retrieval result from the voice data retrieving unit and the comparison keywords from the comparison keyword creating unit.05-23-2013
20130144628VOICE INTERFACE TO NFC APPLICATIONS - Technologies for transferring Near Field Communications information on a computing device include storing information corresponding to services in a database on the computing device, receiving a voice input corresponding to a name of a requested service, and retrieving the information corresponding to the requested service from the database. Such technologies may also include loading the retrieved information corresponding to the requested service into a Near Field Communications tag emulated by the computing device and transferring the retrieved information to a portable computing device in response to the Near Field Communications tag being touched by a Near Field Communications reader of the portable computing device. The information corresponding to the requested service stored in the database, retrieved from the database, loaded into the Near Field Communications tag, and/or transferred to the portable computing device may include a Universal Resource Identifier and content-specific keywords corresponding to the requested service.06-06-2013
20130204626Method and Apparatus for Setting Selected Recognition Parameters to Minimize an Application Cost Function - Methods and systems for setting selected automatic speech recognition parameters are described. A data set associated with operation of a speech recognition application is defined and includes: i. recognition states characterizing the semantic progression of a user interaction with the speech recognition application, and ii. recognition outcomes associated with each recognition state. For a selected user interaction with the speech recognition application, an application cost function is defined that characterizes an estimated cost of the user interaction for each recognition outcome. For one or more system performance parameters indirectly related to the user interaction, the parameters are set to values which optimize the cost of the user interaction over the recognition states.08-08-2013
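The parameter-setting idea in this abstract can be illustrated with a toy sweep over a single recognition parameter. This is a minimal sketch under assumptions: the rejection threshold, the outcome data, and the three cost entries are all hypothetical, and the patent covers far more general parameters and states.

```python
# Illustrative sketch (not the patented algorithm): sweep a rejection
# threshold and pick the value minimizing an application cost function
# over estimated recognition outcomes.

def expected_cost(threshold, outcomes, costs):
    """outcomes: (confidence, correct) pairs; costs: per-outcome costs
    for accepting a correct result, accepting a wrong one, or rejecting."""
    total = 0.0
    for confidence, correct in outcomes:
        if confidence < threshold:
            total += costs["reject"]          # re-prompt the user
        elif correct:
            total += costs["accept_correct"]  # task proceeds
        else:
            total += costs["accept_wrong"]    # costly misrecognition
    return total / len(outcomes)

outcomes = [(0.95, True), (0.85, True), (0.60, False), (0.40, False)]
costs = {"accept_correct": 0.0, "accept_wrong": 10.0, "reject": 2.0}
best = min((t / 100 for t in range(0, 101, 5)),
           key=lambda t: expected_cost(t, outcomes, costs))
```

Because a wrong acceptance is priced much higher than a re-prompt here, the sweep settles on a threshold that rejects the two low-confidence outcomes while accepting the two confident, correct ones.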
20130204627Systems and Methods for Off-Board Voice-Automated Vehicle Navigation - A system for selecting music includes a mobile system for processing and transmitting through a wireless link a continuous voice stream spoken by a user of the mobile system, the continuous voice stream including a music request, and a data center for processing the continuous voice stream received through the wireless link into voice music information. The data center can perform automated voice recognition processing on the voice music information to recognize music components of the music request, confirm the recognized music components through interactive speech exchanges with the mobile system user through the wireless link and the mobile system, selectively allow human data center operator intervention to assist in identifying the selected recognized music components having a recognition confidence below a selected threshold value, and download music information pertaining to the music request for transmission to the mobile system derived from the confirmed recognized music components.08-08-2013
20120284030Stateful, Double-Buffered Dynamic Navigation Voice Prompting - A navigation system written in J11-08-2012
20120078635VOICE CONTROL SYSTEM - One embodiment of a voice control system includes a first electronic device communicatively coupled to a server and configured to receive a speech recognition file from the server. The speech recognition file may include a speech recognition algorithm for converting one or more voice commands into text and a database including one or more entries comprising one or more voice commands and one or more executable commands associated with the one or more voice commands.03-29-2012

Patent applications in class Speech assisted network