Patent application number | Description | Published |
20080255840 | Video Nametags - Video nametags allow automatic identification of people speaking in a video. A video nametag is associated with a person who is participating in a video, such as a video conference scenario or recorded meeting. The video nametag includes one or more sensors that detect when the person is speaking. The video nametag transmits information to a video conferencing system that provides an indicator on a display of the video that identifies the speaker. The system may also automatically format the display of the video to concentrate on the person when the person is speaking. The video nametag can also capture the wearer's audio and transmit it wirelessly to be used for the conference audio send signal. | 10-16-2008 |
20080313713 | AUDIO START SERVICE FOR AD-HOC MEETINGS - An audio start service method for enabling and scheduling ad hoc distributed meetings. Only a short (in some embodiments less than or equal to about 32 bits) unique device identification is needed to enable distributed meeting devices participating in the meeting to rendezvous at a common rendezvous network address. Once the participants know the unique meeting network address they can take part in the meeting, while others can join or leave the meeting. The data string is each device's unique identification that is encoded into an inaudible watermark and continuously exchanged between devices over the telephone network. A first distributed meeting device requests a network address from a distributed meeting server. This unique meeting network address then is sent to an audio start service that identifies “buddies” of the first device and sends out meeting invitations and the network address to other devices so they can join the meeting. | 12-18-2008 |
20090002476 | MICROPHONE ARRAY FOR A CAMERA SPEAKERPHONE - A camera speakerphone having a microphone array may be used for videoconferencing. Example microphone array designs described herein may be used to perform Sound Source Localization (SSL) and improve audio quality of captured audio. In one example, an omni-directional camera speakerphone includes a base having a speaker and at least one microphone. A neck is coupled to the base and to a head. The head includes an omni-directional camera and at least one microphone. | 01-01-2009 |
20090002477 | CAPTURE DEVICE MOVEMENT COMPENSATION FOR SPEAKER INDEXING - Embodiments of the invention compensate for the movement of a meeting capture device during a live meeting when performing speaker indexing of a recorded meeting. In one example, a first position of a capture device is determined. A second position of the capture device is determined after the capture device has been moved from the first position to the second position. The movement data associated with movement of the capture device from the first position to the second position is determined. The movement data is outputted and used in speaker indexing of the recorded meeting. | 01-01-2009 |
20090002480 | Techniques for detecting a display device - Techniques to detect a display device are described. An apparatus may include a video camera operative to receive video information for an image, and a microphone operative to receive audio information for an image. The apparatus may further include a monitor detection module communicatively coupled to the video camera and the microphone, where the monitor detection module is operative to detect a temporal watermark signal displayed by the monitor within the image, and determine a location for the monitor within the image based on the detection. The apparatus may also include an active speaker detector module communicatively coupled to the monitor detection module, where the active speaker detector module is operative to exclude false positives caused by the monitor. Other embodiments are described and claimed. | 01-01-2009 |
20090003678 | AUTOMATIC GAIN AND EXPOSURE CONTROL USING REGION OF INTEREST DETECTION - A region of interest may be determined using any or all of sound source location, multi-person detection, and active speaker detection. A weighted image mean may be determined using the region of interest and a set of backlight weight regions, or only the set of backlight weight regions if a region of interest could not be found. The image mean is compared to a target value to determine if the image mean is greater than or less than the target value within a predetermined threshold. If the image mean is greater than the predetermined target value plus the predetermined threshold value, the gain and exposure are decreased. If the image mean is less than the predetermined target value minus the predetermined threshold value, the gain and exposure are increased. | 01-01-2009 |
20090210491 | TECHNIQUES TO AUTOMATICALLY IDENTIFY PARTICIPANTS FOR A MULTIMEDIA CONFERENCE EVENT - Techniques to automatically identify participants for a multimedia conference event are described. An apparatus may comprise a content-based annotation component operative to receive a meeting invitee list for a multimedia conference event. The content-based annotation component may receive multiple input media streams from multiple meeting consoles. The content-based annotation component may annotate media frames of each input media stream with identifying information for each participant within each input media stream to form a corresponding annotated media stream. Other embodiments are described and claimed. | 08-20-2009 |
20100195812 | AUDIO TRANSFORMS IN CONNECTION WITH MULTIPARTY COMMUNICATION - The claimed subject matter relates to an architecture that can preprocess audio portions of communications in order to enrich multiparty communication sessions or environments. In particular, the architecture can provide both a public channel for public communications that are received by substantially all connected parties and can further provide a private channel for private communications that are received by a selected subset of all connected parties. Most particularly, the architecture can apply an audio transform to communications that occur during the multiparty communication session based upon a target audience of the communication. By way of illustration, the architecture can apply a whisper transform to private communications, an emotion transform based upon relationships, an ambience or spatial transform based upon physical locations, or a pace transform based upon lack of presence. | 08-05-2010 |
20100315482 | Interest Determination For Auditory Enhancement - Gaze tracking or other interest indications are used during a video conference to determine one or more audio sources that are of interest to one or more participants to the video conference, such as by determining a conversation from among multiple conversations that a subset of participants are participating in or listening to, for enhancing the audio experience of one or more of the participants. | 12-16-2010 |
20110313766 | IDENTIFICATION OF PEOPLE USING MULTIPLE TYPES OF INPUT - Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers. | 12-22-2011 |
20120278077 | IDENTIFICATION OF PEOPLE USING MULTIPLE TYPES OF INPUT - Systems and methods for detecting people or speakers in an automated fashion are disclosed. A pool of features including more than one type of input (like audio input and video input) may be identified and used with a learning algorithm to generate a classifier that identifies people or speakers. The resulting classifier may be evaluated to detect people or speakers. | 11-01-2012 |
20120327179 | AUTOMATIC VIDEO FRAMING - A dynamically adjustable framed view of occupants in a room is captured through an automatic framing system. The system employs a camera system, including a pan/tilt/zoom (PTZ) camera and one or more depth cameras, to automatically locate occupants in a room and adjust the PTZ camera's pan, tilt, and zoom settings to focus in on the occupants and center them in the main video frame. The depth cameras may distinguish between occupants and inanimate objects and adaptively determine the location of the occupants in the room. The PTZ camera may be calibrated with the depth cameras in order to use the location information determined by the depth cameras to automatically center the occupants in the main video frame for a framed view. Additionally, the system may track position changes in the room and may dynamically adjust and update the framed view when changes occur. | 12-27-2012 |
20130027506 | TECHNIQUES FOR DETECTING A DISPLAY DEVICE - Techniques to detect a display device are described. An apparatus may include a video camera operative to receive video information for an image, and a microphone operative to receive audio information for an image. The apparatus may further include a monitor detection module communicatively coupled to the video camera and the microphone, where the monitor detection module is operative to detect a temporal watermark signal displayed by the monitor within the image, and determine a location for the monitor within the image based on the detection. The apparatus may also include an active speaker detector module communicatively coupled to the monitor detection module, where the active speaker detector module is operative to exclude false positives caused by the monitor. Other embodiments are described and claimed. | 01-31-2013 |
20130147975 | CAPTURE DEVICE MOVEMENT COMPENSATION FOR SPEAKER INDEXING - Embodiments of the invention compensate for the movement of a meeting capture device during a live meeting when performing speaker indexing of a recorded meeting. In one example, a first position of a capture device is determined. A second position of the capture device is determined after the capture device has been moved from the first position to the second position. The movement data associated with movement of the capture device from the first position to the second position is determined. The movement data is outputted and used in speaker indexing of the recorded meeting. | 06-13-2013 |
20140185814 | BOUNDARY BINAURAL MICROPHONE ARRAY - A boundary binaural microphone array includes a pair of microphones spaced from one another by a distance between approximately 5 cm and 30 cm. The boundary binaural microphone array has a structural support that locates the microphones no more than approximately 4 cm off of a surface upon which the array is placed. The microphones are separated by a sound barrier that provides an interaural level difference in the amplitudes of the sound signals sensed by the two microphones. | 07-03-2014 |
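
The threshold comparison summarized in the entry for application 20090003678 (automatic gain and exposure control) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the per-region mean and weight inputs, and the -1/0/+1 return convention are hypothetical. It also assumes that an image mean below the target minus the threshold raises gain and exposure, and that the region of interest is simply given its own weight alongside the backlight weight regions.

```python
from typing import Optional, Sequence


def image_mean(
    roi_mean: Optional[float],
    roi_weight: float,
    backlight_means: Sequence[float],
    backlight_weights: Sequence[float],
) -> float:
    """Weighted mean over the backlight regions, plus the ROI when one was found.

    roi_mean is None when no region of interest could be detected, in which
    case only the backlight weight regions contribute (hypothetical inputs).
    """
    values = list(backlight_means)
    weights = list(backlight_weights)
    if len(values) != len(weights):
        raise ValueError("each backlight region needs exactly one weight")
    if roi_mean is not None:
        values.append(roi_mean)
        weights.append(roi_weight)
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def gain_exposure_step(mean: float, target: float, threshold: float) -> int:
    """Return -1 to decrease gain/exposure, +1 to increase, 0 to hold.

    Assumption: the too-dark branch (mean < target - threshold) increases
    gain and exposure; values inside the dead band leave them unchanged.
    """
    if mean > target + threshold:
        return -1
    if mean < target - threshold:
        return 1
    return 0


if __name__ == "__main__":
    # No ROI found: fall back to the backlight regions only.
    m = image_mean(None, 2.0, [100.0, 200.0], [1.0, 1.0])
    print(m, gain_exposure_step(m, target=128.0, threshold=10.0))
```

The dead band of width twice the threshold around the target keeps the controller from oscillating when the image mean is already close to the target.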