DOLBY INTERNATIONAL AB Patent applications |
Patent application number | Title | Published |
20160125888 | CODING OF AUDIO SCENES - Exemplary embodiments provide encoding and decoding methods, and associated encoders and decoders, for encoding and decoding of an audio scene which at least comprises one or more audio objects ( | 05-05-2016 |
20160125887 | EFFICIENT CODING OF AUDIO SCENES COMPRISING AUDIO OBJECTS - There are provided encoding and decoding methods for encoding and decoding of object based audio. An exemplary encoding method includes inter alia calculating M downmix signals by forming combinations of N audio objects, wherein M≦N, and calculating parameters which allow reconstruction of a set of audio objects formed on the basis of the N audio objects from the M downmix signals. The calculation of the M downmix signals is made according to a criterion which is independent of any loudspeaker configuration. | 05-05-2016 |
20160118057 | SELECTIVE BASS POST FILTER - In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information. | 04-28-2016 |
20160111099 | Reconstruction of Audio Scenes from a Downmix - Audio objects are associated with positional metadata. A received downmix signal comprises downmix channels that are linear combinations of one or more audio objects and are associated with respective positional locators. In a first aspect, the downmix signal, the positional metadata and frequency-dependent object gains are received. An audio object is reconstructed by applying the object gain to an upmix of the downmix signal in accordance with coefficients based on the positional metadata and the positional locators. In a second aspect, audio objects have been encoded together with at least one bed channel positioned at a positional locator of a corresponding downmix channel. The decoding system receives the downmix signal and the positional metadata of the audio objects. A bed channel is reconstructed by suppressing the content representing audio objects from the corresponding downmix channel on the basis of the positional locator of the corresponding downmix channel. | 04-21-2016 |
20160111098 | Audio Encoder and Decoder - The present disclosure provides methods, devices and computer program products for encoding and decoding of a vector of parameters in an audio coding system. The disclosure further relates to a method and apparatus for reconstructing an audio object in an audio decoding system. According to the disclosure, a modulo differential approach for coding and encoding a vector of a non-periodic quantity may improve the coding efficiency and provide encoders and decoders with lower memory requirements. Moreover, an efficient method for encoding and decoding a sparse matrix is provided. | 04-21-2016 |
20160111097 | METHODS FOR AUDIO ENCODING AND DECODING, CORRESPONDING COMPUTER-READABLE MEDIA AND CORRESPONDING AUDIO ENCODER AND DECODER - The present disclosure provides methods, devices and computer program products which provide less complex and more flexible control of the introduced decorrelation in an audio coding system. According to the disclosure, this is achieved by calculating and using two weighting factors, one for an approximated audio object and one for a decorrelated audio object, for introduction of decorrelation of audio objects in the audio coding system. | 04-21-2016 |
20160104496 | EFFICIENT CODING OF AUDIO SCENES COMPRISING AUDIO OBJECTS - There are provided encoding and decoding methods for encoding and decoding of object based audio. An exemplary encoding method includes inter alia calculating M downmix signals by forming combinations of N audio objects, wherein M≦N, and calculating parameters which allow reconstruction of a set of audio objects formed on the basis of the N audio objects from the M downmix signals. The calculation of the M downmix signals is made according to a criterion which is independent of any loudspeaker configuration. | 04-14-2016 |
20160099005 | Enhancing Performance of Spectral Band Replication and Related High Frequency Reconstruction Coding - The present invention proposes new methods and an apparatus for enhancement of source coding systems utilising high frequency reconstruction (HFR). It addresses the problem of insufficient noise contents in a reconstructed highband, by Adaptive Noise-floor Addition. It also introduces new methods for enhanced performance by means of limiting unwanted noise, interpolation and smoothing of envelope adjustment amplification factors. The present invention is applicable to both speech coding and natural audio coding systems. | 04-07-2016 |
20160093312 | AUDIO ENCODER AND DECODER WITH MULTIPLE CODING MODES - In one embodiment, an audio decoder for decoding an audio bitstream is disclosed. The decoder includes a first decoding module adapted to operate in a first coding mode and a second decoding module adapted to operate in a second coding mode, the second coding mode being different from the first coding mode. The decoder further includes a pitch filter in either the first coding mode or the second coding mode, the pitch filter adapted to filter a preliminary audio signal generated by the first decoding module or the second decoding module to obtain a filtered signal. The pitch filter is selectively enabled or disabled based on a value of a first parameter encoded in the audio bitstream, the first parameter being distinct from a second parameter encoded in the audio bitstream, the second parameter specifying a current coding mode of the audio decoder. | 03-31-2016 |
20160093310 | Spectral Translation/Folding in the Subband Domain - The present invention relates to a new method and apparatus for improvement of High Frequency Reconstruction (HFR) techniques using frequency translation or folding or a combination thereof. The proposed invention is applicable to audio source coding systems, and offers significantly reduced computational complexity. This is accomplished by means of frequency translation or folding in the subband domain, preferably integrated with spectral envelope adjustment in the same domain. The concept of dissonance guard-band filtering is further presented. The proposed invention offers a low-complexity, intermediate quality HFR method useful in speech and natural audio coding applications. | 03-31-2016 |
20160086616 | PITCH FILTER FOR AUDIO SIGNALS - In some embodiments, a pitch filter for filtering a preliminary audio signal generated from an audio bitstream is disclosed. The pitch filter has an operating mode selected from one of either: (i) an active mode where the preliminary audio signal is filtered using filtering information to obtain a filtered audio signal, and (ii) an inactive mode where the pitch filter is disabled. The preliminary audio signal is generated in an audio encoder or audio decoder having a coding mode selected from at least two distinct coding modes, and the pitch filter is capable of being selectively operated in either the active mode or the inactive mode while operating in the coding mode based on control information. | 03-24-2016 |
20160042744 | ADVANCED QUANTIZER - The present document relates to an audio encoding and decoding system (referred to as an audio codec system). In particular, the present document relates to a transform-based audio codec system which is particularly well suited for voice encoding/decoding. A quantization unit ( | 02-11-2016 |
20160042742 | Audio Encoder and Decoder for Interleaved Waveform Coding - There are provided methods and apparatuses for decoding and encoding of audio signals. In particular, a method for decoding includes receiving a waveform-coded signal having a spectral content corresponding to a subset of the frequency range above a cross-over frequency. The waveform-coded signal is interleaved with a parametric high frequency reconstruction of the audio signal above the cross-over frequency. In this way an improved reconstruction of the high frequency bands of the audio signal is achieved. | 02-11-2016 |
20160035361 | Harmonic Transposition in an Audio Coding Method and System - The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length L | 02-04-2016 |
20160035329 | Efficient Combined Harmonic Transposition - The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal. In particular, a system configured to generate a high frequency component of a signal from a low frequency component of the signal is described. The system may comprise an analysis filter bank ( | 02-04-2016 |
20160029140 | METHODS AND SYSTEMS FOR GENERATING AND INTERACTIVELY RENDERING OBJECT BASED AUDIO - Methods for generating an object based audio program, renderable in a personalizable manner, and including a bed of speaker channels renderable in the absence of selection of other program content (e.g., to provide a default full range audio experience). Other embodiments include steps of delivering, decoding, and/or rendering such a program. Rendering of content of the bed, or of a selected mix of other content of the program, may provide an immersive experience. The program may include multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects), the bed of speaker channels, and other speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method. | 01-28-2016 |
20160029138 | Methods and Systems for Interactive Rendering of Object Based Audio - Methods for generating an object based audio program which is renderable in a personalizable manner, e.g., to provide an immersive perception of audio content of the program. Other embodiments include steps of delivering (e.g., broadcasting), decoding, and/or rendering such a program. Rendering of audio objects indicated by the program may provide an immersive experience. The audio content of the program may be indicative of multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects, and typically also a default set of objects which will be rendered in the absence of a selection by a user) and a bed of speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method. | 01-28-2016 |
20160027447 | SPATIAL COMFORT NOISE - A method, an apparatus, logic (e.g., executable instructions encoded in a non-transitory computer-readable medium to carry out a method), and a non-transitory computer-readable medium configured with such instructions. The method is to generate and spatially render spatial comfort noise at a receiving endpoint of a conference system, such that the comfort noise has target spectral characteristics typical of comfort noise, and at least one spatial property that at least substantially matches at least one target spatial property. One version includes receiving one or more audio signals from other endpoints, combining the received audio signals with the spatial comfort noise signals, and rendering the combination of the received audio signals and the spatial comfort noise signals to a set of output signals for loudspeakers, such that the spatial comfort noise signals are continually in the output signals in addition to output from the received audio signals. | 01-28-2016 |
20160027446 | Stereo Audio Encoder and Decoder - The present disclosure provides methods, devices and computer program products for encoding and decoding a stereo audio signal based on an input signal. According to the disclosure, a hybrid approach of using both parametric stereo coding and a discrete representation of the stereo audio signal is used which may improve the quality of the encoded and decoded audio for certain bitrates. | 01-28-2016 |
20160019899 | Audio Processing - An audio processing system ( | 01-21-2016 |
20160007133 | RENDERING OF AUDIO OBJECTS WITH APPARENT SIZE TO ARBITRARY LOUDSPEAKER LAYOUTS - Multiple virtual source locations may be defined for a volume within which audio objects can move. A set-up process for rendering audio data may involve receiving reproduction speaker location data and pre-computing gain values for each of the virtual sources according to the reproduction speaker location data and each virtual source location. The gain values may be stored and used during “run time,” during which audio reproduction data are rendered for the speakers of the reproduction environment. During run time, for each audio object, contributions from virtual source locations within an area or volume defined by the audio object position data and the audio object size data may be computed. A set of gain values for each output channel of the reproduction environment may be computed based, at least in part, on the computed contributions. Each output channel may correspond to at least one reproduction speaker of the reproduction environment. | 01-07-2016 |
20160006406 | CROSS PRODUCT ENHANCED SUBBAND BLOCK BASED HARMONIC TRANSPOSITION - The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ+rΩ | 01-07-2016 |
20160005407 | Methods for Parametric Multi-Channel Encoding - The present document relates to audio coding systems. In particular, the present document relates to efficient methods and systems for parametric multi-channel audio coding. An audio encoding system ( | 01-07-2016 |
20150380001 | MDCT-BASED COMPLEX PREDICTION STEREO CODING - The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: | 12-31-2015 |
20150372820 | METADATA TRANSCODING - The present document relates to transcoding of metadata, and in particular to a method and system for transcoding metadata with reduced computational complexity. A transcoder configured to transcode an inbound bitstream comprising an inbound content frame and an associated inbound metadata frame into an outbound bitstream comprising an outbound content frame and an associated outbound metadata frame is described. The inbound content frame is indicative of a signal encoded according to a first codec system and the outbound content frame is indicative of the signal encoded according to a second codec system. The transcoder is configured to identify an inbound block of metadata from the inbound metadata frame, the inbound block of metadata associated with an inbound descriptor indicative of one or more properties of metadata comprised within the inbound block of metadata, and to generate the outbound metadata frame from the inbound metadata frame based on the inbound descriptor. | 12-24-2015 |
20150365775 | AUTOMATIC LOUDSPEAKER POLARITY DETECTION - In some embodiments, a method for automatic detection of polarity of speakers, e.g., speakers installed in cinema environments. In some embodiments, the method determines relative polarities of a set of speakers (e.g., loudspeakers and/or drivers of a multi-driver loudspeaker) using a set of microphones, including by measuring impulse responses, including an impulse response for each speaker-microphone pair; clustering the speakers into a set of groups, each group including at least two of the speakers which are similar to each other in at least one respect; and for each group, determining and analyzing cross-correlations of pairs of impulse responses (e.g., pairs of processed versions of impulse responses) of speakers in the group to determine relative polarities of the speakers. Other aspects include systems configured (e.g., programmed) to perform any embodiment of the inventive method, and computer readable media (e.g., discs) which store code for implementing any embodiment of the inventive method. | 12-17-2015 |
20150363160 | System and Method for Optimizing Loudness and Dynamic Range Across Different Playback Devices - Embodiments are directed to a method and system for receiving, in a bitstream, metadata associated with audio data, and analyzing the metadata to determine whether loudness parameters for a first group of audio playback devices are available in the bitstream. Responsive to determining that the loudness parameters are present for the first group, the system uses the parameters and audio data to render audio. Responsive to determining that the loudness parameters are not present for the first group, the system analyzes one or more characteristics of the first group, and determines the parameters based on the one or more characteristics. | 12-17-2015 |
20150356978 | AUDIO CODING WITH GAIN PROFILE EXTRACTION AND TRANSMISSION FOR SPEECH ENHANCEMENT AT THE DECODER - The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application. | 12-10-2015 |
20150333734 | Low Delay Modulated Filter Bank - The document relates to modulated sub-sampled digital filter banks, as well as to methods and systems for the design of such filter banks. In particular, the present document proposes a method and apparatus for the improvement of low delay modulated digital filter banks. The method employs modulation of an asymmetric low-pass prototype filter and a new method for optimizing the coefficients of this filter. Further, a specific design for a 64 channel filter bank using a prototype filter length of 640 coefficients and a system delay of 319 samples is given. The method substantially reduces artifacts due to aliasing emerging from independent modifications of subband signals, for example when using a filter bank as a spectral equalizer. The method is preferably implemented in software, running on a standard PC or a digital signal processor (DSP), but can also be hardcoded on a custom chip. The method offers improvements for various types of digital equalizers, adaptive filters, multiband companders and spectral envelope adjusting filter banks used in high frequency reconstruction (HFR) or parametric stereo systems. | 11-19-2015 |
20150333733 | Low Delay Modulated Filter Bank - The document relates to modulated sub-sampled digital filter banks, as well as to methods and systems for the design of such filter banks. In particular, the present document proposes a method and apparatus for the improvement of low delay modulated digital filter banks. The method employs modulation of an asymmetric low-pass prototype filter and a new method for optimizing the coefficients of this filter. Further, a specific design for a 64 channel filter bank using a prototype filter length of 640 coefficients and a system delay of 319 samples is given. The method substantially reduces artifacts due to aliasing emerging from independent modifications of subband signals, for example when using a filter bank as a spectral equalizer. The method is preferably implemented in software, running on a standard PC or a digital signal processor (DSP), but can also be hardcoded on a custom chip. The method offers improvements for various types of digital equalizers, adaptive filters, multiband companders and spectral envelope adjusting filter banks used in high frequency reconstruction (HFR) or parametric stereo systems. | 11-19-2015 |
20150332703 | Low Delay Modulated Filter Bank - The document relates to modulated sub-sampled digital filter banks, as well as to methods and systems for the design of such filter banks. In particular, the present document proposes a method and apparatus for the improvement of low delay modulated digital filter banks. The method employs modulation of an asymmetric low-pass prototype filter and a new method for optimizing the coefficients of this filter. Further, a specific design for a 64 channel filter bank using a prototype filter length of 640 coefficients and a system delay of 319 samples is given. The method substantially reduces artifacts due to aliasing emerging from independent modifications of subband signals, for example when using a filter bank as a spectral equalizer. The method is preferably implemented in software, running on a standard PC or a digital signal processor (DSP), but can also be hardcoded on a custom chip. The method offers improvements for various types of digital equalizers, adaptive filters, multiband companders and spectral envelope adjusting filter banks used in high frequency reconstruction (HFR) or parametric stereo systems. | 11-19-2015 |
20150317986 | Processing of Audio Signals During High Frequency Reconstruction - The application relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the application relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described. The system comprises means for receiving the plurality of low frequency subband signals; means for receiving a set of target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals, respectively; and means for adjusting the energy of the plurality of high frequency subband signals using the set of target energies. | 11-05-2015 |
20150317985 | Signal Adaptive FIR/IIR Predictors for Minimizing Entropy - The present document relates to coding. In particular, the present document relates to coding using linear prediction in combination with entropy encoding. A method ( | 11-05-2015 |
20150312676 | SYSTEM AND METHOD FOR REDUCING LATENCY IN TRANSPOSER-BASED VIRTUAL BASS SYSTEMS - A latency reduction system in a virtual bass processing system performs harmonic transposition on low frequency components of an audio signal to generate transposed data indicative of harmonics of the audio signal. The system uses a base transposition factor greater than two, and generates the harmonics in response to frequency-domain values determined by forward and inverse transform stages that use asymmetric analysis and synthesis windows. The system combines a virtual bass signal with the delayed wide band audio signal through analysis filter banks having filter coefficient truncated Nyquist filters. The virtual bass signal may lag the delayed wide band audio signal when combining with the audio signal to further reduce the latency caused by the harmonic transposition. The virtual bass input signal may be directly routed from a CQMF analysis filter bank of a preceding Hybrid filter bank stage, in order to avoid the delay associated with a Nyquist filter bank. | 10-29-2015 |
20150269948 | Advanced Stereo Coding Based on a Combination of Adaptively Selectable Left/Right or Mid/Side Stereo Coding and of Parametric Stereo Coding - The application relates to audio encoder and decoder systems. An embodiment of the encoder system comprises a downmix stage for generating a downmix signal and a residual signal based on a stereo signal. In addition, the encoder system comprises a parameter determining stage for determining parametric stereo parameters such as an inter-channel intensity difference and an inter-channel cross-correlation. Preferably, the parametric stereo parameters are time- and frequency-variant. Moreover, the encoder system comprises a transform stage. The transform stage generates a pseudo left/right stereo signal by performing a transform based on the downmix signal and the residual signal. The pseudo stereo signal is processed by a perceptual stereo encoder. For stereo encoding, left/right encoding or mid/side encoding is selectable. Preferably, the selection between left/right stereo encoding and mid/side stereo encoding is time- and frequency-variant. | 09-24-2015 |
20150248889 | LAYERED APPROACH TO SPATIAL AUDIO CODING - The invention provides a layered audio coding format with a monophonic layer and at least one sound field layer. A plurality of audio signals is decomposed, in accordance with decomposition parameters controlling the quantitative properties of an orthogonal energy-compacting transform, into rotated audio signals. Further, a time-variable gain profile specifying constructively how the rotated audio signals may be processed to attenuate undesired audio content is derived. The monophonic layer may comprise one of the rotated signals and the gain profile. The sound field layer may comprise the rotated signals and the decomposition parameters. In one embodiment, the gain profile comprises a cleaning gain profile with the main purpose of eliminating non-speech components and/or noise. The gain profile may also comprise mutually independent broadband gains. Because signals in the audio coding format can be mixed with a limited computational effort, the invention may advantageously be applied in a tele-conferencing application. | 09-03-2015 |
20150237447 | Wireless Audio Distribution System - A system for wirelessly distributing audio, the system including a controller for receiving audio content from a source external to the controller, one or more speakers wirelessly coupled to the controller, a wireless communications interface coupled to or integrated within the controller for wirelessly distributing the audio content to the one or more speakers, a memory for storing a sonic characteristic of at least one of the one or more speakers, and a processor for associating at least some elements of the audio content with the at least one of the one or more speakers based at least in part on the sonic characteristic. | 08-20-2015 |
20150237301 | NEAR-END INDICATION THAT THE END OF SPEECH IS RECEIVED BY THE FAR END IN AN AUDIO OR VIDEO CONFERENCE - Embodiments of a client device and a method for audio or video conferencing are described. An embodiment includes an offset detecting unit, a configuring unit, an estimator and an output unit. The offset detecting unit detects an offset of speech input to the client device. The configuring unit determines a voice latency from the client device to every far end. The estimator estimates a time when a user at the far end perceives the offset based on the voice latency. The output unit outputs a perceivable signal indicating that a user at the far end perceives the offset based on the time estimated for the far end. The perceivable signal is helpful to avoid collision between parties. | 08-20-2015 |
20150221319 | METHODS AND SYSTEMS FOR SELECTING LAYERS OF ENCODED AUDIO SIGNALS FOR TELECONFERENCING - In some embodiments, a method for selecting at least one layer of a spatially layered, encoded audio signal. Typical embodiments are teleconferencing methods in which at least one of a set of nodes (endpoints, each of which is a telephone system, and optionally also a server) is configured to perform audio coding in response to soundfield audio data to generate spatially layered encoded audio including any of a number of different subsets of a set of layers, the set of layers including at least one monophonic layer, at least one soundfield layer, and optionally also at least one metadata layer comprising metadata indicative of at least one processing operation to be performed on the encoded audio. Other aspects are systems configured (e.g., programmed) to perform any embodiment of the method, and computer readable media which store code for implementing any embodiment of the method or steps thereof. | 08-06-2015 |
20150221313 | CODING OF A SOUND FIELD SIGNAL - A method for encoding sound field signals includes allocating coding rate by application of a uniform criterion to all subbands of all signals in a joint process. An allocation criterion may be based on a comparison, in a given subband, between a spectral envelope of the signals to be encoded and a coding noise profile, wherein the noise profile may be a sum of a noise shape and a noise offset, which noise offset is computed on the basis of the coding bit budget. The rate allocation process may be combined with an energy-compacting orthogonal transform, for which there is proposed a parameterization susceptible of efficient coding and having adjustable directivity. In a further aspect, the invention provides a corresponding decoding method. | 08-06-2015 |
20150215410 | GEO-Referencing Media Content - Geo origination data is generated for a geo-tagged media device of a user from measurements performed by sensors. The geo origination data is sent to a server system. At the server system, geo-tagged media content elements are selected based on the geo origination data. Further, based on the selected geo-tagged media content elements and the geo origination data, geo-referenced rendering data to be used for rendering media content from the selected geo-tagged media content elements perceivable to the user of the geo-tagged media device is generated. The geo-referenced rendering data can be streamed to the geo-tagged media device along with media content derived from the geo-tagged media content elements for rendering at the geo-tagged media device. | 07-30-2015 |
20150201178 | Frame Compatible Depth Map Delivery Formats for Stereoscopic and Auto-Stereoscopic Displays - Stereoscopic video data and corresponding depth map data for stereoscopic and auto-stereoscopic displays are coded using a coded base layer and one or more coded enhancement layers. Given a 3D input picture and corresponding input depth map data, a side-by-side and a top-and-bottom picture are generated based on the input picture. Using an encoder, the side-by-side picture is coded to generate a coded base layer. Using the encoder and a texture reference processing unit (RPU), the top-and-bottom picture is encoded to generate a first enhancement layer, wherein the first enhancement layer is coded based on the base layer stream, and using the encoder and a depth-map RPU, depth data for the side-by-side picture are encoded to generate a second enhancement layer, wherein the second enhancement layer is coded based on the base layer. Alternative single, dual, and multi-layer depth map delivery systems are also presented. | 07-16-2015 |
20150154970 | SMOOTH CONFIGURATION SWITCHING FOR MULTICHANNEL AUDIO RENDERING BASED ON A VARIABLE NUMBER OF RECEIVED CHANNELS - A decoding system reconstructs an n-channel audio signal on the basis of an input signal representing the audio signal, in different time frames, either by parametric coding or as n discretely coded channels. Parametric decoding uses a core signal and mixing parameters controlling a spatial synthesis stage, to which a downmix signal is supplied from a downmix stage. The downmix stage realizes a projection on the downmix signal based on an n-channel input signal, either a discretely coded signal or a core signal padded with neutral-valued channels. The padding may take place either on the decoding side (reduced parametric coding) or the encoding side. In an embodiment, an audio decoder ( | 06-04-2015 |
20150149158 | Efficient Combined Harmonic Transposition - The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal. In particular, a system configured to generate a high frequency component of a signal from a low frequency component of the signal is described. The system may comprise an analysis filter bank ( | 05-28-2015 |
20150131800 | Efficient Encoding and Decoding of Multi-Channel Audio Signal with Multiple Substreams - The present document relates to audio encoding/decoding. In particular, the present document relates to a method and system for improving the quality of encoded multi-channel audio signals. An audio encoder configured to encode a multi-channel audio signal according to a total available data-rate is described. The multi-channel audio signal is representable as a basic group ( | 05-14-2015 |
20150124973 | METHOD AND APPARATUS FOR LAYOUT AND FORMAT INDEPENDENT 3D AUDIO REPRODUCTION - A method for encoding audio signals, for later reproduction in arbitrary three-dimensional loudspeaker layouts, based on the generation of an intermediate channel-independent representation, which enables the creation, manipulation and reproduction of sounds with complex apparent size and shape, including multiple disconnected shapes. | 05-07-2015 |
20150104021 | SYSTEM FOR MAINTAINING REVERSIBLE DYNAMIC RANGE CONTROL INFORMATION ASSOCIATED WITH PARAMETRIC AUDIO CODERS - On the basis of a bitstream (P), an n-channel audio signal (X) is reconstructed by deriving an m-channel core signal (Y) and multichannel coding parameters (α) from the bitstream, where 1≦m | 04-16-2015 |
20150095039 | Enhancing Performance of Spectral Band Replication and Related High Frequency Reconstruction Coding - The present invention proposes new methods and an apparatus for enhancement of source coding systems utilising high frequency reconstruction (HFR). It addresses the problem of insufficient noise contents in a reconstructed highband, by Adaptive Noise-floor Addition. It also introduces new methods for enhanced performance by means of limiting unwanted noise, interpolation and smoothing of envelope adjustment amplification factors. The present invention is applicable to both speech coding and natural audio coding systems. | 04-02-2015 |
20150058025 | Oversampling in a Combined Transposer Filterbank - The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank ( | 02-26-2015 |
20150049880 | Low Delay Real-to-Complex Conversion in Overlapping Filter Banks for Partially Complex Processing - An arrangement of overlapping filter banks comprises a synthesis stage and an analysis stage. The synthesis stage receives a first signal segmented into time blocks and outputs, based thereon, an intermediate signal to be received by the analysis stage forming the basis for the computation of a second signal segmented into time frames. In an embodiment, the synthesis stage is operable to release an approximate value of the intermediate signal in a time block located L−1 time blocks ahead of its output block, which approximate value is computed on the basis of any available time blocks of the first signal, so that the approximate value contributes, in the analysis stage, to the second signal. The delay is typically reduced by L−1 blocks. Applications include audio signal processing in general and real-to-complex conversion in particular. | 02-19-2015 |
20150043754 | System and Method for Non-Destructively Normalizing Loudness of Audio Signals within Portable Devices - Many portable playback devices cannot decode and playback encoded audio content having wide bandwidth and wide dynamic range with consistent loudness and intelligibility unless the encoded audio content has been prepared specially for these devices. This problem can be overcome by including with the encoded content some metadata that specifies a suitable dynamic range compression profile by either absolute values or differential values relative to another known compression profile. A playback device may also adaptively apply gain and limiting to the playback audio. Implementations in encoders, in transcoders and in decoders are disclosed. | 02-12-2015 |
20150032461 | Subband Block Based Harmonic Transposition - The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank ( | 01-29-2015 |
20150025895 | Audio Encoder with Parallel Architecture - The present document relates to methods and systems for audio encoding. In particular, the present document relates to methods and systems for fast audio encoding using a parallel system architecture. A frame-based audio encoder ( | 01-22-2015 |
20140365231 | UPSAMPLING USING OVERSAMPLED SBR - An encoder (250) comprises a core encoder (252) for encoding a low frequency component of the audio signal at the signal sampling rate (fs_in) and a spectral band replication (referred to as SBR) encoding unit (153, 254) for determining a plurality of SBR parameters. A plurality of the SBR parameters is determined such that a high frequency component of the audio signal can be approximated based on the low frequency component of the audio signal and the plurality of SBR parameters. A multiplexer (155) is adapted to generate an overall bitstream comprising the core encoded bitstream, the plurality of SBR parameters and an indication of one or more SBR encoder settings applied by the SBR encoder (153, 254); wherein the generated overall bitstream does not indicate that the core encoded bitstream has been determined by encoding the low frequency component at the signal sampling rate (fs_in). | 12-11-2014 |
20140358554 | AUDIO ENCODING METHOD AND SYSTEM FOR GENERATING A UNIFIED BITSTREAM DECODABLE BY DECODERS IMPLEMENTING DIFFERENT DECODING PROTOCOLS - In a class of embodiments, an audio encoding system (typically, a perceptual encoding system) is configured to generate a single (“unified”) bitstream that is compatible with (i.e., decodable by) a first decoder configured to decode audio data encoded in accordance with a first encoding protocol (e.g., the multichannel Dolby Digital Plus, or DD+, protocol) and a second decoder configured to decode audio data encoded in accordance with a second encoding protocol (e.g., the stereo AAC, HE AAC v1, or HE AAC v2 protocol). The unified bitstream can include both encoded data (e.g., bursts of data) decodable by the first decoder (and ignored by the second decoder) and encoded data (e.g., other bursts of data) decodable by the second decoder (and ignored by the first decoder). In effect, the second encoding format is hidden within the unified bitstream when the bitstream is decoded by the first decoder, and the first encoding format is hidden within the unified bitstream when the bitstream is decoded by the second decoder. The format of the unified bitstream generated in accordance with the invention may eliminate the need for transcoding elements throughout an entire media chain and/or ecosystem. Other aspects of the invention are an encoding method performed by any embodiment of the inventive encoder, a decoding method performed by any embodiment of the inventive decoder, and a computer readable medium (e.g., disc) which stores code for implementing any embodiment of the inventive method. | 12-04-2014 |
20140330556 | LOW COMPLEXITY REPETITION DETECTION IN MEDIA DATA - Low complexity detection of a time-wise position of a representative segment in media data is described. A subset of offset values is located in a set of offset values in media data using a first type of one or more types of features, which are extractable from (e.g., derivable from components of) the media data. The subset of offset values comprises values that are selected from the set of offset values based on one or more selection criteria. A set of candidate seed time points is identified based on the subset of offset values using a second type of the one or more types of features. | 11-06-2014 |
20140324441 | METHOD AND SYSTEM FOR ENCODING AUDIO DATA WITH ADAPTIVE LOW FREQUENCY COMPENSATION - A method for determining mantissa bit allocation of audio data values of frequency domain audio data to be encoded. The allocation method includes a step of determining masking values for the audio data values, including by performing adaptive low frequency compensation on the audio data of each frequency band of a set of low frequency bands of the audio data. The adaptive low frequency compensation includes steps of: performing tonality detection on the audio data to generate compensation control data indicative of whether each frequency band in the set of low frequency bands has prominent tonal content; and performing low frequency compensation on the audio data in each frequency band in the set of low frequency bands having prominent tonal content as indicated by the compensation control data, but not performing low frequency compensation on the audio data in any other frequency band in the set of low frequency bands. | 10-30-2014 |
20140304315 | Low Delay Modulated Filter Bank - The document relates to modulated sub-sampled digital filter banks, as well as to methods and systems for the design of such filter banks. In particular, the present document proposes a method and apparatus for the improvement of low delay modulated digital filter banks. The method employs modulation of an asymmetric low-pass prototype filter and a new method for optimizing the coefficients of this filter. Further, a specific design for a 64 channel filter bank using a prototype filter length of 640 coefficients and a system delay of 319 samples is given. The method substantially reduces artifacts due to aliasing emerging from independent modifications of subband signals, for example when using a filter bank as a spectral equalizer. The method is preferably implemented in software, running on a standard PC or a digital signal processor (DSP), but can also be hardcoded on a custom chip. The method offers improvements for various types of digital equalizers, adaptive filters, multiband companders and spectral envelope adjusting filter banks used in high frequency reconstruction (HFR) or parametric stereo systems. | 10-09-2014 |
20140297295 | Cross Product Enhanced Harmonic Transposition - The present invention relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR). A system and a method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank providing a plurality of analysis subband signals of the low frequency component of the signal. It also comprises a non-linear processing unit to generate a synthesis subband signal with a synthesis frequency by modifying the phase of a first and a second of the plurality of analysis subband signals and by combining the phase-modified analysis subband signals. Finally, it comprises a synthesis filter bank for generating the high frequency component of the signal from the synthesis subband signal. | 10-02-2014 |
20140294185 | Adaptive High Fidelity Reproduction System for Object-Based Audio - Object-based audio is adaptively associated with speakers, depending on the speaker configuration that is present. Each speaker receives an audio assignment based on its individual spectral characteristics. As more speakers are added, content is adaptively associated with that new speaker, and taken away from the previous. | 10-02-2014 |
20140236604 | APPARATUS AND METHOD FOR GENERATING A LEVEL PARAMETER AND APPARATUS AND METHOD FOR GENERATING A MULTI-CHANNEL REPRESENTATION - A parameter representation of a multi-channel signal having several original channels includes a parameter set, which, when used together with at least one down-mix channel allows a multi-channel reconstruction. An additional level parameter is calculated such that an energy of the at least one downmix channel weighted by the level parameter is equal to a sum of energies of the original channels. The additional level parameter is transmitted to a multi-channel reconstructor together with the parameter set or together with a down-mix channel. An apparatus for generating a multi-channel representation uses the level parameter to correct the energy of the at least one transmitted down-mix channel before entering the down-mix signal into an upmixer or within the up-mixing process. | 08-21-2014 |
20140235192 | PREDICTION-BASED FM STEREO RADIO NOISE REDUCTION - The present document relates to audio signal processing, in particular to an apparatus and a corresponding method for improving the audio signal of an FM stereo radio receiver. In particular, the present document relates to a method and system for reducing the noise of a received FM stereo radio signal. An apparatus ( | 08-21-2014 |
20140188488 | Reduced Complexity Converter SNR Calculation - An audio encoder configured to encode an audio signal to generate a bitstream having E-AC-3 format, including by determining a first control parameter indicative of an allocation of available mantissa bits for quantized audio content of the signal. The encoder is configured to perform transcoding simulation to determine a second control parameter in a manner based at least in part on statistical analysis of results of E-AC-3 bit allocation processing of audio data assuming a first target data rate, and of AC-3 bit allocation processing of the data assuming a second target data rate, and to include the second control parameter in the bitstream for use by a converter to convert the bitstream into a second bitstream having AC-3 format at the second target data rate. Other aspects are converters configured to perform transcoding on a bitstream using such a second control parameter, and methods performed by any embodiment of the inventive encoder or converter. | 07-03-2014 |
20140161262 | FM STEREO RADIO RECEIVER BY USING PARAMETRIC STEREO - The invention relates to an apparatus ( | 06-12-2014 |
20140135963 | System and Method for High Dynamic Range Audio Distribution - A transcoding tool for transcoding an audio stream to be played at a playback device is provided. The transcoding tool comprises a receiving section adapted to receive at least one bit stream comprising an audio stream and metadata associated with the audio stream. The transcoding tool further comprises a processing section connected to the receiving section and adapted to create a processed audio stream based on the audio stream and metadata, and a transmitting section connected to the processing section and adapted to transmit the created processed audio stream to the playback device. | 05-15-2014 |
20140074462 | METHOD FOR REDUCTION OF ALIASING INTRODUCED BY SPECTRAL ENVELOPE ADJUSTMENT IN REAL-VALUED FILTERBANKS - The present invention proposes a new method for improving the performance of a real-valued filterbank based spectral envelope adjuster. By adaptively locking the gain values for adjacent channels dependent on the sign of the channels, as defined in the application, reduced aliasing is achieved. Furthermore, the grouping of the channels during gain-calculation, gives an improved energy estimate of the real valued subband signals in the filterbank. | 03-13-2014 |
20140072120 | METHOD AND ENCODER FOR PROCESSING A DIGITAL STEREO AUDIO SIGNAL - The invention discloses a method and an encoder for processing a digital audio stereo signal. A digital audio encoder for coding such audio signal comprises a predictive Temporal Noise Shaping (TNS) filter, a Mid-/Side (M/S) coding unit, a control unit for determining a first prediction gain related to the unmodified L/R signal processed by the TNS filter and for determining a second prediction gain related to the M/S-coded L/R signal processed by the TNS filter, wherein the control unit is adapted to disable TNS-filtering—i.e. to bypass the TNS filter—for a current signal frame, if the first and second prediction gains differ by more than a pre-determined mismatch range. Preferably, the first and second prediction gains are determined from signal energy ratios calculated for each channel of the stereo signal including the signal energies of both the TNS-processed (unmodified) L- respectively (unmodified) R-signal and the TNS-processed M/S coded L- respectively M/S coded R-signal divided by the respective signal energies before TNS processing. Furthermore, the control unit is preferably adapted to overrule the disabling of the TNS filter, if the input signal is a near-mono audio signal exhibiting only low energy either in its M- or S-band. In that case, operation of the TNS filter on the stereo audio signal is maintained. | 03-13-2014 |
20140039890 | EFFICIENT CONTENT CLASSIFICATION AND LOUDNESS ESTIMATION - The present document relates to methods and systems for encoding an audio signal. The method comprises determining a spectral representation of the audio signal. The determining a spectral representation step may comprise determining modified discrete cosine transform, MDCT, coefficients, or a Quadrature Mirror Filter, QMF, filter bank representation of the audio signal. The method further comprises encoding the audio signal using the determined spectral representation; and classifying parts of the audio signal to be speech or non-speech based on the determined spectral representation. Finally, a loudness measure for the audio signal based on the speech parts is determined. | 02-06-2014 |
20140037117 | METHOD AND SYSTEM FOR UPMIXING AUDIO TO GENERATE 3D AUDIO - In some embodiments, a method for upmixing input audio comprising N full range channels to generate | 02-06-2014 |
20140019146 | FRAME ELEMENT POSITIONING IN FRAMES OF A BITSTREAM REPRESENTING AUDIO CONTENT - A better compromise between a too high bitstream and decoding overhead on the one hand and flexibility of frame element positioning on the other hand is achieved by arranging that each of the sequence of frames of the bitstream has a sequence of N frame elements and, on the other hand, the bitstream has a configuration block having a field indicating the number of elements N and a type indication syntax portion indicating, for each element position of the sequence of N element positions, an element type out of a plurality of element types with, in the sequences of N frame elements of the frames, each frame element being of the element type indicated, by the type indication portion, for the respective element position at which the respective frame element is positioned within the sequence of N frame elements of the respective frame in the bitstream. | 01-16-2014 |
20140016787 | FRAME ELEMENT LENGTH TRANSMISSION IN AUDIO CODING - Frame elements which shall be made available for skipping may be transmitted more efficiently by arranging that a default payload length information is transmitted separately within a configuration block, with the length information within the frame elements, in turn, being subdivided into a default payload length flag followed, if the default payload length flag is not set, by a payload length value explicitly coding the payload length of the respective frame element. However, if the default payload length flag is set, an explicit transmission of the payload length may be avoided. Rather, any frame element, the default extension payload length flag of which is set, has the default payload length and any frame element, the default extension payload length flag of which is not set, has a payload length corresponding to the payload length value. By this measure, transmission effectiveness is increased. | 01-16-2014 |
20140016785 | AUDIO ENCODER AND DECODER HAVING A FLEXIBLE CONFIGURATION FUNCTIONALITY - An audio decoder for decoding an encoded audio signal, the encoded audio signal including a first channel element and a second channel element in a payload section of a data stream and first decoder configuration data for the first channel element and second decoder configuration data for the second channel element in a configuration section of the data stream, includes: a data stream reader for reading the configuration data for each channel element in the configuration section and for reading the payload data for each channel element in the payload section; a configurable decoder for decoding the plurality of channel elements; and a configuration controller for configuring the configurable decoder so that the configurable decoder is configured in accordance with the first decoder configuration data when decoding the first channel element and in accordance with the second decoder configuration data when decoding the second channel element. | 01-16-2014 |
20130339037 | Spectral Translation/Folding in the Subband Domain - The present invention relates to a new method and apparatus for improvement of High Frequency Reconstruction (HFR) techniques using frequency translation or folding or a combination thereof. The proposed invention is applicable to audio source coding systems, and offers significantly reduced computational complexity. This is accomplished by means of frequency translation or folding in the subband domain, preferably integrated with spectral envelope adjustment in the same domain. The concept of dissonance guard-band filtering is further presented. The proposed invention offers a low-complexity, intermediate quality HFR method useful in speech and natural audio coding applications. | 12-19-2013 |
20130315403 | SPATIAL ADAPTATION IN MULTI-MICROPHONE SOUND CAPTURE - A spatial adaptation system for multiple-microphone sound capture systems and methods thereof are described. A spatial adaptation system includes an inference and weight module configured to receive inputs. The inputs are based on two or more input signals captured by at least two microphones. The inference and weight module determines one or more weight values based on at least one of the inputs. The spatial adaptation system also includes a noise magnitude ratio update module coupled with the inference and weight module. The noise magnitude ratio update module determines an updated noise target based on the one or more weight values from the inference and weight module. | 11-28-2013 |
20130287214 | Scene Change Detection Around a Set of Seed Points in Media Data - Techniques for scene change detection around seed points in media data are provided. Media features of many different types may be extracted from the media data. One or more statistical patterns of media features in a plurality of time-wise intervals around a plurality of seed time points of the media data may be determined using one or more types of features extractable from the media data. At least one of the one or more types of features comprises a type of features that captures structural properties, tonality including harmony and melody, timbre, rhythm, loudness, stereo mix, or a quantity of sound sources as related to the media data. A plurality of beginning scene change points and a plurality of ending scene change points in the media data may be detected, based on the one or more statistical patterns, for the plurality of seed time points in the media data. | 10-31-2013 |
20130282388 | SONG TRANSITION EFFECTS FOR BROWSING - In one aspect, a method of providing directive transitions between audio signals comprises associating a first/second browsing direction (A | 10-24-2013 |
20130282383 | Audio Encoder and Decoder - The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; and a quantization unit for quantizing the transform domain signal. The quantization unit decides, based on input signal characteristics, to encode the transform domain signal with a model-based quantizer or a non-model-based quantizer. Preferably, the decision is based on the frame size applied by the transformation unit. | 10-24-2013 |
20130282382 | Audio Encoder and Decoder - The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; and a quantization unit for quantizing the transform domain signal. The quantization unit decides, based on input signal characteristics, to encode the transform domain signal with a model-based quantizer or a non-model-based quantizer. Preferably, the decision is based on the frame size applied by the transformation unit. | 10-24-2013 |
20130253917 | PSYCHOACOUSTIC FILTER DESIGN FOR RATIONAL RESAMPLERS - The present document relates to the design of anti-aliasing and/or anti-imaging filters for resamplers using rational resampling factors. In particular, the present document relates to a method for designing such filters having a reduced number of filter coefficients or an increased perceptual performance, as well as to the filters designed using such method. A method for designing a filter ( | 09-26-2013 |
20130238345 | PARTIALLY COMPLEX MODULATED FILTER BANK - An apparatus for processing a plurality of real-valued subband signals using a first real-valued subband signal and a second real-valued subband signal to provide at least a complex-valued subband signal comprises a multiband filter for providing an intermediate real-valued subband signal and a calculator for providing the complex-valued subband signal by combining a real-valued subband signal from the plurality of real-valued subband signals and the intermediate subband signal. | 09-12-2013 |
20130236021 | METHOD FOR REPRESENTING MULTI-CHANNEL AUDIO SIGNALS - A multi-channel input signal having at least three original channels is represented by a parameter representation of the multi-channel signal. A first balance parameter, a first coherence parameter, or a first inter-channel time difference between a first channel pair and a second balance parameter, or a second coherence parameter, or a second inter-channel time difference parameter between a second channel pair are calculated. This set of parameters is the parameter representation of the original signals. The first channel pair has two channels, which are different from two channels of a second channel pair. Furthermore, each channel of the two channel pairs is one of the original channels, or a weighted combination of the original channels, and the first channel pair and the second channel pair include information on the three original channels. For multi-channel reconstruction purposes, the parameters are used in addition to down-mixing information to generate a selectable number of output channels in a scalable fashion. | 09-12-2013 |
20130226597 | Methods for Improving High Frequency Reconstruction - The present invention proposes a new method and a new apparatus for enhancement of audio source coding systems utilising high frequency reconstruction (HFR). It utilises a detection mechanism on the encoder side to assess what parts of the spectrum will not be correctly reproduced by the HFR method in the decoder. Information on this is efficiently coded and sent to the decoder, where it is combined with the output of the HFR unit. | 08-29-2013 |
20130218579 | Time Warped Modified Transform Coding of Audio Signals - A representation of an audio signal having a first, a second and a third frame is derived by estimating first warp information for the first and second frames and second warp information for the second and third frames, the warp information describing pitch information of the audio signal. First or second spectral coefficients for first and second frames or second and third frames are derived using first or second warp information and a first or second weighted representation of the first and second frames or second and third frames, the first or second weighted representation derived by applying a first or second window function to the first and second frames or second and third frames, wherein the first or second window function depends on the first or second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients. | 08-22-2013 |
20130182870 | CROSS PRODUCT ENHANCED SUBBAND BLOCK BASED HARMONIC TRANSPOSITION - The invention provides an efficient implementation of cross-product enhanced high-frequency reconstruction (HFR), wherein a new component at frequency QΩ+Ω | 07-18-2013 |
20130170672 | AUDIO STREAM MIXING WITH DIALOG LEVEL NORMALIZATION - A method for mixing of audio signals that allows maintaining of a consistent perceived sound level for the mixed signal by holding the sound level of the dominant signal in the mix constant by adjusting the sound level of the non-dominant signal(s) in relation to the dominant signal. It further includes receiving of a mixing balance input, which denotes the adjustable balance between the main and associated signals. It further includes identification of the dominant signal from the mixing balance input and mixing metadata, from which an appropriate scale factor for the non-dominant signal may also be determined directly from the scaling information, without the need for any analysis or measurement of the audio signals to be mixed. It further includes scaling the non-dominant signal in relation to the dominant signal and combining the scaled non-dominant signal with the dominant signal into a mixed signal. | 07-04-2013 |
20130159004 | Seamless Playback of Successive Multimedia Files - The present document relates to methods and systems for encoding and decoding multimedia files. In particular, the present document relates to methods and systems for encoding and decoding a plurality of audio tracks for seamless playback of the plurality of audio tracks. A method for encoding an audio signal comprising a first and a directly following second audio track for seamless and individual playback of the first and second audio tracks is described. The first and second audio tracks comprise a first and second plurality of audio frames, respectively. The method comprises jointly encoding the audio signal using a frame based audio encoder, thereby yielding a continuous sequence of encoded frames; extracting a first plurality of encoded frames from the continuous sequence of encoded frames; extracting a second plurality of encoded frames from the continuous sequence of encoded frames; appending one or more rear extension frames to an end of the first plurality of encoded frames; and appending one or more front extension frames to the beginning of the second plurality of encoded frames. | 06-20-2013 |
20130142340 | CONCEALMENT OF INTERMITTENT MONO RECEPTION OF FM STEREO RADIO RECEIVERS - The present invention relates to audio signal processing. In particular, it relates to a method and system for reliably concealing intermittent mono reception of FM stereo radio receivers. The system comprises a parametric stereo parameter estimation stage configured to determine a first parametric stereo parameter based on a first frame of the received two-channel audio signal. The system further comprises a concealment detection stage configured to determine an energy of a side signal within the first signal frame; determine a number of following successive signal frames during which the energy of the side signal drops from a value above the high threshold to a value below a low threshold; determine that the two-channel audio signal following the first signal frame is a forced mono signal if the number of successive signal frames is below a frame threshold; and determine the parametric stereo parameter based on the first parametric stereo parameter. | 06-06-2013 |
20130142339 | REDUCTION OF SPURIOUS UNCORRELATION IN FM RADIO NOISE - The document relates to audio signal processing, in particular to a system and a corresponding method for improving an audio signal of an FM stereo radio receiver. In this context, one aspect relates to the estimation of noise in a received side signal and the compensation of such noise in parametric stereo parameters. A system for generating a parametric stereo parameter from a two-channel audio signal is described. The two-channel audio signal is presentable as a mid signal and side signal representative of a corresponding left and right audio signal. The system comprises a noise estimation stage configured to determine an impact factor characteristic for the noise of the side signal; and a parametric stereo parameter estimation stage configured to determine the parametric stereo parameter; wherein the determining is based on the two-channel audio signal and the impact factor. | 06-06-2013 |
20130096912 | SELECTIVE BASS POST FILTER - In one aspect, the invention provides an audio encoding method characterized by a decision being made as to whether the device which will decode the resulting bit stream should apply post filtering including attenuation of interharmonic noise. Hence, the decision whether to use the post filter, which is encoded in the bit stream, is taken separately from the decision as to the most suitable coding mode. In another aspect, there is provided an audio decoding method with a decoding step followed by a post-filtering step, including interharmonic noise attenuation, and being characterized in a step of disabling the post filter in accordance with post filtering information encoded in the bit stream signal. Such a method is well suited for mixed-origin audio signals by virtue of its capability to deactivate the post filter in dependence of the post filtering information only, hence independently of factors such as the current coding mode. | 04-18-2013 |
20130044896 | Virtual Bass Synthesis Using Harmonic Transposition - In some embodiments, a virtual bass generation method including steps of: performing harmonic transposition on low frequency components of an input audio signal (typically, bass frequency components expected to be inaudible during playback of the input audio signal using an expected speaker or speaker set) to generate transposed data indicative of harmonics (which are expected to be audible during playback, using the expected speaker(s), of an enhanced version of the input audio which includes the harmonics); generating an enhancement signal in response to the transposed data; and generating an enhanced audio signal by combining (e.g., mixing) the enhancement signal with the input audio signal. Other aspects are systems (e.g., programmed processors) and devices (e.g., devices having physically-limited bass reproduction capabilities, such as, for example, a notebook, tablet, mobile phone, or other device with small speakers) configured to perform any embodiment of the method. | 02-21-2013 |
20130030819 | AUDIO ENCODER, AUDIO DECODER AND RELATED METHODS FOR PROCESSING MULTI-CHANNEL AUDIO SIGNALS USING COMPLEX PREDICTION - An encoder, based on a combination of two audio channels, obtains a first combination signal as a mid-signal and a residual signal derivable using a predicted side signal derived from the mid signal. The first combination signal and the prediction residual signal are encoded and written into a data stream together with the prediction information. A decoder generates decoded first and second channel signals using the prediction residual signal, the first combination signal and the prediction information. A real-to-imaginary transform may be applied for estimating the imaginary part of the spectrum of the first combination signal. For calculating the prediction signal used in the derivation of the prediction residual signal, the real-valued first combination signal is multiplied by a real portion of the complex prediction information and the estimated imaginary part of the first combination signal is multiplied by an imaginary portion of the complex prediction information. | 01-31-2013 |
20130022214 | Method and System for Touch Gesture Detection in Response to Microphone Output - In some embodiments, a method for processing output of at least one microphone of a device (e.g., a headset) to identify at least one touch gesture exerted by a user on the device, including by distinguishing the gesture from input to the microphone other than a touch gesture intended by the user, and by distinguishing between a tap exerted by the user on the device and at least one dynamic gesture exerted by the user on the device, where the output of the at least one microphone is also indicative of ambient sound (e.g., voice utterances). Other embodiments are systems for detecting ambient sound (e.g., voice utterances) and touch gestures, each including a device including at least one microphone and a processor coupled and configured to process output of each microphone to identify at least one touch gesture exerted by a user on the device. | 01-24-2013 |
20120328124 | Processing of Audio Signals During High Frequency Reconstruction - The application relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the application relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described. The system comprises means for receiving the plurality of low frequency subband signals; means for receiving a set of target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals, respectively; and means for adjusting the energy of the plurality of high frequency subband signals using the set of target energies. | 12-27-2012 |
20120328115 | SYSTEM FOR COMBINING LOUDNESS MEASUREMENTS IN A SINGLE PLAYBACK MODE - The present document relates to processing of multimedia data, notably the encoding, the transmission, the decoding and the rendering of multimedia data, e.g. audio files or bitstreams. In particular, the present document relates to the implementation of loudness control in multimedia players. A method for providing loudness related data to a media player is described. The method comprises the steps of providing a first loudness related value associated with an audio signal; wherein the first loudness related value has been determined according to a first procedure; of converting the first loudness related value into a second loudness related value using a reversible relation; wherein the second loudness related value is associated with a second procedure for determining loudness related values; of storing the second loudness related value in metadata associated with the audio signal; and of providing the metadata to the media player. | 12-27-2012 |
20120278088 | Subband Block Based Harmonic Transposition - The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank ( | 11-01-2012 |
20120275607 | SBR BITSTREAM PARAMETER DOWNMIX - The present document relates to audio decoding and/or audio transcoding. In particular, the present document relates to a scheme for efficiently decoding a number M of audio channels from a bitstream comprising a higher number N of audio channels. In this context a method and system for merging a first and a second source set of spectral band replication (SBR) parameters to a target set of SBR parameters is described. The first and second source set comprise a first and second frequency band partitioning, respectively, which are different from one another. The first source set comprises a first set of energy related values associated with frequency bands of the first frequency band partitioning. The second source set comprises a second set of energy related values associated with frequency bands of the second frequency band partitioning. The target set comprises a target energy related value associated with an elementary frequency band. The method comprises the steps of breaking up the first and the second frequency band partitioning into a joint grid comprising the elementary frequency band; assigning a first value of the first set of energy related values to the elementary frequency band; assigning a second value of the second set of energy related values to the elementary frequency band; and combining the first and second value to yield the target energy related value for the elementary frequency band. | 11-01-2012 |
20120259643 | APPARATUS FOR PROVIDING AN UPMIX SIGNAL REPRESENTATION ON THE BASIS OF THE DOWNMIX SIGNAL REPRESENTATION, APPARATUS FOR PROVIDING A BITSTREAM REPRESENTING A MULTI-CHANNEL AUDIO SIGNAL, METHODS, COMPUTER PROGRAMS AND BITSTREAM REPRESENTING A MULTI-CHANNEL AUDIO SIGNAL USING A LINEAR COMBINATION PARAMETER - An apparatus for providing an upmix signal representation on the basis of a downmix signal representation and an object-related parametric information, which are included in a bitstream representation of an audio content, and in dependence on a user-specified rendering matrix, has a distortion limiter configured to obtain a modified rendering matrix using a linear combination of a user-specified rendering matrix and a target rendering matrix in dependence on a linear combination parameter. The apparatus also has a signal processor configured to obtain the upmix signal representation on the basis of the downmix signal representation and the object-related parametric information using the modified rendering matrix. The apparatus is also configured to evaluate a bitstream element representing the linear combination parameter in order to obtain the linear combination parameter. | 10-11-2012 |
20120250899 | System and Method of Adjusting the Sound of Multiple Audio Objects Directed Toward an Audio Output Device - Embodiments of the present invention include methods and apparatuses for adjusting audio content when multiple audio objects are directed toward a single audio output device. The amplitude, white noise content, and frequencies can be adjusted to enhance overall sound quality or make content of certain audio objects more intelligible. Audio objects are classified by a class category, by which they can be assigned class specific processing. Audio object classes can also have a rank. The rank of an audio object class is used to give priority to or apply specific processing to audio objects in the presence of other audio objects of different classes. | 10-04-2012 |
20120243690 | APPARATUS FOR PROVIDING AN UPMIX SIGNAL REPRESENTATION ON THE BASIS OF A DOWNMIX SIGNAL REPRESENTATION, APPARATUS FOR PROVIDING A BITSTREAM REPRESENTING A MULTI-CHANNEL AUDIO SIGNAL, METHODS, COMPUTER PROGRAM AND BITSTREAM USING A DISTORTION CONTROL SIGNALING - An apparatus for providing an upmix signal representation on the basis of a downmix signal representation and an object-related parametric information, which are included in a bitstream representation of an audio content, and in dependence on a rendering information, has a distortion limiter configured to adjust upmix parameters using a distortion control scheme to avoid or limit audible distortions which are caused by an inappropriate choice of rendering parameters. The distortion limiter is configured to obtain a distortion limitation control parameter, which is included in the bitstream representation of the audio content, and to adjust a distortion control scheme in dependence on the distortion limitation control parameter. | 09-27-2012 |
20120215546 | Complexity Scalable Perceptual Tempo Estimation - The present document relates to methods and systems for estimating the tempo of a media signal, such as audio or combined video/audio signal. In particular, the document relates to the estimation of tempo perceived by human listeners, as well as to methods and systems for tempo estimation at scalable computational complexity. A method and system for extracting tempo information of an audio signal from an encoded bit-stream of the audio signal comprising spectral band replication data is described. The method comprises the steps of determining a payload quantity associated with the amount of spectral band replication data comprised in the encoded bit-stream for a time interval of the audio signal; repeating the determining step for successive time intervals of the encoded bit-stream of the audio signal, thereby determining a sequence of payload quantities; identifying a periodicity in the sequence of payload quantities; and extracting tempo information of the audio signal from the identified periodicity. | 08-23-2012 |
20120213385 | Enhancing Perceptual Performance of SBR and Related HFR Coding Methods by Adaptive Noise-Floor Addition and Noise Substitution Limiting - The present invention proposes new methods and an apparatus for enhancement of source coding systems utilising high frequency reconstruction (HFR). It addresses the problem of insufficient noise contents in a reconstructed highband, by Adaptive Noise-floor Addition. It also introduces new methods for enhanced performance by means of limiting unwanted noise, interpolation and smoothing of envelope adjustment amplification factors. The present invention is applicable to both speech coding and natural audio coding systems. | 08-23-2012 |
20120213378 | Spectral Translation/Folding in the Subband Domain - The present invention relates to a new method and apparatus for improvement of High Frequency Reconstruction (HFR) techniques using frequency translation or folding or a combination thereof. The proposed invention is applicable to audio source coding systems. This is accomplished by means of frequency translation in the frequency domain with spectral envelope adjustment in the same domain. The proposed invention offers a low-complexity HFR method useful in speech and natural audio coding applications. | 08-23-2012 |
20120209615 | Efficient Multichannel Signal Processing by Selective Channel Decoding - An input signal conveying encoded information representing one or more audio channels is decoded by determining the configuration of channels represented by the encoded information, obtaining from the channel configuration a channel selection mask that specifies which of the one or more audio channels are to be decoded, extracting encoded information from the input signal, and decoding the extracted encoded information for those audio channels specified in the channel selection mask. | 08-16-2012 |
20120197650 | METADATA TIME MARKING INFORMATION FOR INDICATING A SECTION OF AN AUDIO OBJECT - The application relates to a method for encoding time marking information within audio data. According to the method, time marking information is encoded as audio metadata within the audio data. The time marking information indicates at least one section of an audio object encoded in the audio data. For example, the time marking information may specify a start position and an end position of the section or only a start position. The at least one section may be a characteristic part of the audio object, which allows instant recognition by listening. The time marking information encoded in the audio data enables instantaneous browsing to a certain section of the audio object. The application further relates to a method for decoding the time marking information encoded in the audio data. | 08-02-2012 |
20120195442 | OVERSAMPLING IN A COMBINED TRANSPOSER FILTER BANK - The present invention relates to coding of audio signals, and in particular to high frequency reconstruction methods including a frequency domain harmonic transposer. A system and method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank ( | 08-02-2012 |
20120128151 | Authentication of Data Streams - The present invention relates to techniques for authentication of data streams. Specifically, the invention relates to the insertion of identifiers into a data stream, such as a Dolby Pulse, AAC or HE AAC bitstream, and the authentication and verification of the data stream based on such identifiers. A method and system for encoding a data stream comprising a plurality of data frames is described. The method comprises the step of generating a cryptographic value of a number N of successive data frames and configuration information, wherein the configuration information comprises information for rendering the data stream. The method then inserts the cryptographic value into the data stream subsequent to the N successive data frames. | 05-24-2012 |
20120065983 | Efficient Combined Harmonic Transposition - The present document relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), and to digital effect processors, e.g. so-called exciters, where generation of harmonic distortion adds brightness to the processed signal. In particular, a system configured to generate a high frequency component of a signal from a low frequency component of the signal is described. The system may comprise an analysis filter bank ( | 03-15-2012 |
20120002818 | Advanced Stereo Coding Based on a Combination of Adaptively Selectable Left/Right or Mid/Side Stereo Coding and of Parametric Stereo Coding - The application relates to audio encoder and decoder systems. An embodiment of the encoder system comprises a downmix stage for generating a downmix signal and a residual signal based on a stereo signal. In addition, the encoder system comprises a parameter determining stage for determining parametric stereo parameters such as an inter-channel intensity difference and an inter-channel cross-correlation. Preferably, the parametric stereo parameters are time- and frequency-variant. Moreover, the encoder system comprises a transform stage. The transform stage generates a pseudo left/right stereo signal by performing a transform based on the downmix signal and the residual signal. The pseudo stereo signal is processed by a perceptual stereo encoder. For stereo encoding, left/right encoding or mid/side encoding is selectable. Preferably, the selection between left/right stereo encoding and mid/side stereo encoding is time- and frequency-variant. | 01-05-2012 |
20110305352 | Cross Product Enhanced Harmonic Transposition - The present invention relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR). A system and a method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank providing a plurality of analysis subband signals of the low frequency component of the signal. It also comprises a non-linear processing unit to generate a synthesis subband signal with a synthesis frequency by modifying the phase of a first and a second of the plurality of analysis subband signals and by combining the phase-modified analysis subband signals. Finally, it comprises a synthesis filter bank for generating the high frequency component of the signal from the synthesis subband signal. | 12-15-2011 |
20110302230 | LOW DELAY MODULATED FILTER BANK - The document relates to modulated sub-sampled digital filter banks, as well as to methods and systems for the design of such filter banks. In particular, the present document proposes a method and apparatus for the improvement of low delay modulated digital filter banks. The method employs modulation of an asymmetric low-pass prototype filter and a new method for optimizing the coefficients of this filter. Further, a specific design for a 64 channel filter bank using a prototype filter length of 640 coefficients and a system delay of 319 samples is given. The method substantially reduces artifacts due to aliasing emerging from independent modifications of subband signals, for example when using a filter bank as a spectral equalizer. The method is preferably implemented in software, running on a standard PC or a digital signal processor (DSP), but can also be hardcoded on a custom chip. The method offers improvements for various types of digital equalizers, adaptive filters, multiband companders and spectral envelope adjusting filterbanks used in high frequency reconstruction (HFR) or parametric stereo systems. | 12-08-2011 |
20110274281 | Method for Determining Inverse Filter from Critically Banded Impulse Response Data - A method for determining an inverse filter for altering the frequency response of a loudspeaker so that with the inverse filter applied in the loudspeaker's signal path the inverse-filtered loudspeaker output has a target frequency response, and optionally also applying the inverse filter in the signal path, and a system configured (e.g., a general or special purpose processor programmed and configured) to determine an inverse filter. In some embodiments, the inverse filter corrects the magnitude of the loudspeaker's output. In other embodiments, the inverse filter corrects both the magnitude and phase of the loudspeaker's output. In some embodiments, the inverse filter is determined in the frequency domain by applying eigenfilter theory or minimizing a mean square error expression by solving a linear equation system. | 11-10-2011 |
20110261966 | Method and Apparatus for Applying Reverb to a Multi-Channel Audio Signal Using Spatial Cue Parameters - A method and system for applying reverb to an M-channel down-mixed audio input signal indicative of X individual audio channels, where X is greater than M. Typically, the method includes steps of: in response to spatial cue parameters indicative of spatial image of the downmixed input signal, generating Y discrete reverb channel signals, where each of the reverb channel signals at a time, t, is a linear combination of at least a subset of values of the individual audio channels at the time, t, and individually applying reverb to each of at least two of the reverb channel signals, thereby generating Y reverbed channel signals. Preferably, the reverb applied to at least one of the channel signals has a different reverb impulse response than does the reverb applied to at least one other one of the channel signals. | 10-27-2011 |
20110208528 | SIGNAL CLIPPING PROTECTION USING PRE-EXISTING AUDIO GAIN METADATA - The application describes a method and an apparatus to prevent clipping of an audio signal when protection against signal clipping by received audio metadata is not guaranteed. The method may be used to prevent clipping for the case of downmixing a multichannel signal to a stereo audio signal. According to the method, it is determined whether first gain values ( | 08-25-2011 |
20110178810 | AUDIO SIGNAL ENCODING OR DECODING - Encoding an audio signal is provided wherein the audio signal includes a first audio channel and a second audio channel, the encoding comprising subband filtering each of the first audio channel and the second audio channel in a complex modulated filterbank to provide a first plurality of subband signals for the first audio channel and a second plurality of subband signals for the second audio channel, downsampling each of the subband signals to provide a first plurality of downsampled subband signals and a second plurality of downsampled subband signals, further subband filtering at least one of the downsampled subband signals in a further filterbank in order to provide a plurality of sub-subband signals, deriving spatial parameters from the sub-subband signals and from those downsampled subband signals that are not further subband filtered, and deriving a single channel audio signal comprising derived subband signals derived from the first plurality of downsampled subband signals and the second plurality of downsampled subband signals. Further, decoding is provided wherein an encoded audio signal comprising an encoded single channel audio signal and a set of spatial parameters is decoded by decoding the encoded single channel audio signal to obtain a plurality of downsampled subband signals, further subband filtering at least one of the downsampled subband signals in a further filterbank in order to provide a plurality of sub-subband signals, and deriving two audio channels from the spatial parameters, the sub-subband signals and those downsampled subband signals that are not further subband filtered. | 07-21-2011 |
20110004479 | HARMONIC TRANSPOSITION - The present invention relates to transposing signals in time and/or frequency and in particular to coding of audio signals. More particularly, the present invention relates to high frequency reconstruction (HFR) methods including a frequency domain harmonic transposer. A method and system for generating a transposed output signal from an input signal using a transposition factor T is described. The system comprises an analysis window of length L | 01-06-2011 |
20100286991 | AUDIO ENCODER AND DECODER - The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; and a quantization unit for quantizing the transform domain signal. The quantization unit decides, based on input signal characteristics, to encode the transform domain signal with a model-based quantizer or a non-model-based quantizer. Preferably, the decision is based on the frame size applied by the transformation unit. | 11-11-2010 |
20100286990 | AUDIO ENCODER AND DECODER - The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; a quantization unit for quantizing a transform domain signal; a long term prediction unit for determining an estimation of the frame of the filtered input signal based on a reconstruction of a previous segment of the filtered input signal; and a transform domain signal combination unit for combining, in the transform domain, the long term prediction estimation and the transformed input signal to generate the transform domain signal. | 11-11-2010 |
20100246832 | METHOD AND APPARATUS FOR GENERATING A BINAURAL AUDIO SIGNAL - An apparatus for generating a binaural audio signal includes a de-multiplexer and decoder which receives audio data comprising an M-channel audio signal which is a downmix of an N-channel audio signal and spatial parameter data for upmixing the M-channel audio signal to the N-channel audio signal. A conversion processor converts spatial parameters of the spatial parameter data into first binaural parameters in response to at least one binaural perceptual transfer function. A matrix processor converts the M-channel audio signal into a first stereo signal in response to the first binaural parameters. A stereo filter generates the binaural audio signal by filtering the first stereo signal. The filter coefficients for the stereo filter are determined in response to the at least one binaural perceptual transfer function by a coefficient processor. The combination of parameter conversion/processing and filtering allows a high quality binaural signal to be generated with low complexity. | 09-30-2010 |