D. Rocchesso: Sound Processing
[Plot: degree of consonance (vertical axis) versus frequency separation in critical bandwidth (horizontal axis)]
Figure 8: Degree of consonance between two sine tones as a function of their
frequency distance, measured as a fraction of the critical bandwidth
(measurements by Plomp and Levelt (1965), also reported in [105]).
thresholding level in order to become audible in the presence of the masker.
This phenomenon is called masking, and it is depicted schematically in figure 9.
Indeed, masking is ill-defined in the immediate proximity of the masker, because
there the presence of beats may let the interference between masker and masked
tone become
Two features of masking can be noticed in figure 9. First, masking is much
more effective towards high frequencies: a masker raises the audibility
threshold more for tones above its frequency than for tones below it (note also
the log scale in frequency). Second, high-intensity maskers spread their effects
even further towards high frequencies. The latter phenomenon is called upward
spread of masking, and it is due to the nonlinear behavior of the outer hair
cells of the cochlea, whose stiffness depends on the excitation they receive [4].
A high-frequency cell, excited by a lower-frequency tone, increases its
stiffness and becomes less sensitive to components at its characteristic
frequency.
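
The two features above can be illustrated with a minimal sketch. It assumes a triangular masking curve on the Bark (critical-band rate) scale, with a fixed slope below the masker and an upper slope that flattens as the masker level grows. The function names, the slope values, and the 5 dB/Bark floor are assumptions made for this illustration and are not taken from the text; the sketch also ignores the absolute threshold of hearing and any offset between masker level and masked threshold.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the critical-band rate in Bark."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def masked_threshold_db(probe_hz, masker_hz, masker_db):
    """Illustrative masked threshold (dB) produced at probe_hz by one sine masker.

    Assumed model (hypothetical, for illustration only):
      - below the masker the threshold falls by a fixed 27 dB per Bark;
      - above the masker the slope is shallower and flattens further
        as the masker level increases (upward spread of masking).
    """
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    if dz < 0:
        # Probe below the masker: steep, level-independent slope.
        return masker_db + 27.0 * dz
    # Probe above the masker: level-dependent slope, floored at 5 dB/Bark.
    upper_slope = max(24.0 + 230.0 / masker_hz - 0.2 * masker_db, 5.0)
    return masker_db - upper_slope * dz


if __name__ == "__main__":
    for level in (40.0, 80.0):
        below = masked_threshold_db(500.0, 1000.0, level)
        above = masked_threshold_db(2000.0, 1000.0, level)
        print(f"masker 1 kHz at {level:.0f} dB: "
              f"threshold at 500 Hz = {below:.1f} dB, at 2 kHz = {above:.1f} dB")
```

Under these assumptions the threshold at 2 kHz stays well above the threshold at 500 Hz for the same 1 kHz masker, and the drop from masker level to threshold at 2 kHz shrinks when the masker is raised from 40 dB to 80 dB.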
In complex tones, the partials mask one another, so it may well happen that in
a tone with a few dozen partials only five or six emerge from a collective
masking threshold. In a sound coding task, we should clearly spend our resources
(i.e., the bits) on encoding those partials and neglect the components that are
masked. This idea is the basis for perceptual audio coding, as found in the
MPEG-1 standard [69].
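
The selection of partials that emerge from a collective masking threshold can be sketched as follows. This is not the psychoacoustic model of MPEG-1; it is a toy example that assumes a triangular masking curve in Bark with illustrative slopes, and it assumes that the thresholds produced by individual maskers combine by power summation. All function names and numeric constants are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the critical-band rate in Bark."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def single_threshold_db(probe_hz, masker_hz, masker_db):
    """Assumed triangular masking curve: steep below the masker, shallower above."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    if dz < 0:
        return masker_db + 27.0 * dz
    upper_slope = max(24.0 + 230.0 / masker_hz - 0.2 * masker_db, 5.0)
    return masker_db - upper_slope * dz


def audible_partials(partials):
    """Return the (freq_hz, level_db) partials that exceed the collective threshold.

    The collective threshold at each partial is the power sum of the
    thresholds produced by all the other partials (an assumption).
    """
    kept = []
    for i, (freq, level) in enumerate(partials):
        power = 0.0
        for j, (m_freq, m_level) in enumerate(partials):
            if j != i:
                power += 10.0 ** (single_threshold_db(freq, m_freq, m_level) / 10.0)
        collective_db = 10.0 * math.log10(power) if power > 0.0 else float("-inf")
        if level > collective_db:
            kept.append((freq, level))
    return kept


if __name__ == "__main__":
    tone = [(1000.0, 80.0), (1100.0, 30.0), (4000.0, 60.0)]
    # The weak 1100 Hz partial sits just above a strong masker and is dropped.
    print(audible_partials(tone))
```

A coder built on this idea would allocate bits only to the partials returned by `audible_partials`, here the 1000 Hz and 4000 Hz components.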
For coding purposes, it is also useful to look at temporal masking. Namely,
the effects of masking extend into the future for up to 40 ms (forward masking)
and into the past for up to 10 ms (backward masking). These temporal effects
may occur because the brain integrates sound information over time, and there
are inherent delays in this operation. Therefore, a soft tone preceding a louder
tone by a couple of milliseconds is likely to be simply canceled from our perceptual