Inventors list

Assignees list

Classification tree browser

Top 100 Inventors

Top 100 Assignees


Jason Flaks, Redmond US

Jason Flaks, Redmond, WA US

Patent application numberDescriptionPublished
20110178798ADAPTIVE AMBIENT SOUND SUPPRESSION AND SPEECH TRACKING - A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are configured to receive a plurality of digital sound signals, each digital sound signal based on an analog sound signal originating at the microphone array, receive a multi-channel speaker signal, generate a monophonic approximation signal of the multi-channel speaker signal, apply a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal, generate a combined directionally-adaptive sound signal from a combination of each digital sound signal by a combination of time-invariant and adaptive beamforming techniques, and apply one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal.07-21-2011
20110184735SPEECH RECOGNITION ANALYSIS VIA IDENTIFICATION INFORMATION - Embodiments are disclosed that relate to the use of identity information to help avoid the occurrence of false positive speech recognition events in a speech recognition system. One embodiment provides a method comprising receiving speech recognition data comprising a recognized speech segment, acoustic locational data related to a location of origin of the recognized speech segment as determined via signals from the microphone array, and confidence data comprising a recognition confidence value, and also receiving image data comprising visual locational information related to a location of each person in an image. The acoustic locational data is compared to the visual locational data to determine whether the recognized speech segment originated from a person in the field of view of the image sensor, and the confidence data is adjusted depending on this determination.07-28-2011
20120092328FUSING VIRTUAL CONTENT INTO REAL CONTENT - A system that includes a head mounted display device and a processing unit connected to the head mounted display device is used to fuse virtual content into real content. In one embodiment, the processing unit is in communication with a hub computing device. The system creates a volumetric model of a space, segments the model into objects, identifies one or more of the objects including a first object, and displays a virtual image over the first object on a display (of the head mounted display) that allows actual direct viewing of at least a portion of the space through the display.04-19-2012
20120093320SYSTEM AND METHOD FOR HIGH-PRECISION 3-DIMENSIONAL AUDIO FOR AUGMENTED REALITY - Techniques are provided for providing 3D audio, which may be used in augmented reality. A 3D audio signal may be generated based on sensor data collected from the actual room in which the listener is located and the actual position of the listener in the room. The 3D audio signal may include a number of components that are determined based on the collected sensor data and the listener's location. For example, a number of (virtual) sound paths between a virtual sound source and the listener may be determined The sensor data may be used to estimate materials in the room, such that the affect that those materials would have on sound as it travels along the paths can be determined In some embodiments, sensor data may be used to collect physical characteristics of the listener such that a suitable HRTF may be determined from a library of HRTFs.04-19-2012
20120163520SYNCHRONIZING SENSOR DATA ACROSS DEVICES - Techniques are provided for synchronization of sensor signals between devices. One or more of the devices may collect sensor data. The device may create a sensor signal from the sensor data, which it may make available to other devices upon a publisher/subscriber model. The other devices may subscribe to sensor signals they choose. A device could be a provider or a consumer of the sensor signals. A device may have a layer of code between an operating system and software applications that processes the data for the applications. The processing may include such actions as synchronizing the data in a sensor signal to a local time clock, predicting future values for data in a sensor signal, and providing data samples for a sensor signal at a frequency that an application requests, among other actions.06-28-2012
20120165964INTERACTIVE CONTENT CREATION - An audio/visual system (e.g., such as an entertainment console or other computing device) plays a base audio track, such as a portion of a pre-recorded song or notes from one or more instruments. Using a depth camera or other sensor, the system automatically detects that a user (or a portion of the user) enters a first collision volume of a plurality of collision volumes. Each collision volume of the plurality of collision volumes is associated with a different audio stem. In one example, an audio stem is a sound from a subset of instruments playing a song, a portion of a vocal track for a song, or notes from one or more instruments. In response to automatically detecting that the user (or a portion of the user) entered the first collision volume, the appropriate audio stem associated with the first collision volume is added to the base audio track or removed from the base audio track.06-28-2012
20120245933ADAPTIVE AMBIENT SOUND SUPPRESSION AND SPEECH TRACKING - A device for suppressing ambient sounds from speech received by a microphone array is provided. One embodiment of the device comprises a microphone array, a processor, an analog-to-digital converter, and memory comprising instructions stored therein that are executable by the processor. The instructions stored in the memory are configured to receive a plurality of digital sound signals, each digital sound signal based on an analog sound signal originating at the microphone array, receive a multi-channel speaker signal, generate a monophonic approximation signal of the multi-channel speaker signal, apply a linear acoustic echo canceller to suppress a first ambient sound portion of each digital sound signal, generate a combined directionally-adaptive sound signal from a combination of each digital sound signal by a combination of time-invariant and adaptive beamforming techniques, and apply one or more nonlinear noise suppression techniques to suppress a second ambient sound portion of the combined directionally-adaptive sound signal.09-27-2012
20120306850DISTRIBUTED ASYNCHRONOUS LOCALIZATION AND MAPPING FOR AUGMENTED REALITY - A system and method for providing an augmented reality environment in which the environmental mapping process is decoupled from the localization processes performed by one or more mobile devices is described. In some embodiments, an augmented reality system includes a mapping system with independent sensing devices for mapping a particular real-world environment and one or more mobile devices. Each of the one or more mobile devices utilizes a separate asynchronous computing pipeline for localizing the mobile device and rendering virtual objects from a point of view of the mobile device. This distributed approach provides an efficient way for supporting mapping and localization processes for a large number of mobile devices, which are typically constrained by form factor and battery life limitations.12-06-2012