D. Rocchesso: Sound Processing
Exercise
The reader is invited to write a chorus/flanger based on comb or allpass comb
filters using a language for sound processing (e.g., Csound). As input signals,
try a sine wave and a noisy signal. Then, implement a phaser by cascading
several first-order allpass filters having coefficients between 0 and 1.
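As a starting point for the exercise, the following is a minimal sketch in plain Python rather than in a dedicated sound-processing language. The function names (`flanger`, `allpass1`, `phaser`), the default parameter values, the sinusoidal modulation of the delay, the linear interpolation of the delay line, and the dry/wet mix are all illustrative assumptions, not prescriptions of the text. The allpass section assumes the common first-order form y[n] = c x[n] + x[n-1] - c y[n-1].

```python
import math


def flanger(x, sample_rate=44100.0, max_delay_s=0.003, rate_hz=0.5, depth=0.7):
    """Feedforward comb filter whose delay is swept by a slow sine (assumed LFO).

    y[n] = x[n] + depth * x[n - d(n)], with d(n) between 0 and max_delay_s.
    Fractional delays are read with linear interpolation.
    """
    out = []
    for n, xn in enumerate(x):
        # LFO in the range [0, 1]
        lfo = 0.5 * (1.0 + math.sin(2.0 * math.pi * rate_hz * n / sample_rate))
        d = lfo * max_delay_s * sample_rate  # delay in samples, d >= 0
        i = int(d)                           # integer part of the delay
        frac = d - i                         # fractional part of the delay
        a = x[n - i] if n - i >= 0 else 0.0
        b = x[n - i - 1] if n - i - 1 >= 0 else 0.0
        delayed = (1.0 - frac) * a + frac * b
        out.append(xn + depth * delayed)
    return out


def allpass1(x, c):
    """First-order allpass section: y[n] = c*x[n] + x[n-1] - c*y[n-1]."""
    y = []
    x_prev = 0.0
    y_prev = 0.0
    for xn in x:
        yn = c * xn + x_prev - c * y_prev
        y.append(yn)
        x_prev, y_prev = xn, yn
    return y


def phaser(x, coeffs, mix=0.5):
    """Cascade of first-order allpass sections mixed with the dry signal.

    The notches arise where the cascade shifts the phase by an odd
    multiple of pi, so that dry and wet signals cancel.
    """
    wet = list(x)
    for c in coeffs:
        wet = allpass1(wet, c)
    return [(1.0 - mix) * dry + mix * w for dry, w in zip(x, wet)]


def sine(freq_hz, n_samples, sample_rate=44100.0):
    """Test input: a sine wave of the given frequency."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]
```

For instance, `phaser(sine(440.0, 44100), [0.3, 0.5, 0.7, 0.9])` processes one second of a 440 Hz sine with four static sections. A more typical phaser would also modulate the coefficients over time, which this sketch leaves out.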
3.6 Spatial sound processing
The spatial processing of sound is a wide topic that would require at least a thick
book chapter on its own [82]. Here we only describe very briefly a few techniques
for sound spatialization and reverberation. In particular, techniques for sound
spatialization differ depending on whether the sound is delivered through
headphones or loudspeakers.
3.6.1 Spatialization
Spatialization with headphones
Humans can localize sound sources in a 3D space with good accuracy using
several cues. If we can assume that the listener receives the
sound material via stereo headphones, we can reproduce most of the cues that
are due to the filtering effect of the pinna-head-torso system, and deliver the
artificially filtered signal directly to the ears.
Sound spatialization for headphones can be based on interaural intensity
and time differences (see appendix C). It is possible to use only one of the
two cues, but using both provides a stronger spatial impression. Of
course, interaural time and intensity differences can only move the
apparent azimuth of a sound source, without any sense of elevation. Moreover,
the apparent source position is likely to be located inside the head of the listener,
without any sense of externalization. Special measures have to be taken in order
to push the virtual sources out of the head.
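The combination of the two interaural cues described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `spatialize_itd_iid` and its parameters are hypothetical, the ITD is rounded to a whole number of samples, and the IID is applied as a single broadband gain. The ITD and IID values themselves must be supplied by the caller.

```python
def spatialize_itd_iid(mono, sample_rate, itd_s, iid_db, source_on_right=True):
    """Return (left, right) channels from a mono signal.

    The ear farther from the source receives the signal delayed by
    itd_s seconds (rounded to an integer number of samples, an assumption)
    and attenuated by iid_db decibels (broadband gain, an assumption).
    Both itd_s and iid_db are taken as non-negative magnitudes.
    """
    delay = int(round(abs(itd_s) * sample_rate))   # ITD in samples
    gain = 10.0 ** (-abs(iid_db) / 20.0)           # IID as linear gain

    # Near ear: unmodified signal, zero-padded so both channels match in length.
    near = list(mono) + [0.0] * delay
    # Far ear: delayed and attenuated copy.
    far = [0.0] * delay + [gain * v for v in mono]

    if source_on_right:
        return far, near   # left ear is the far ear
    return near, far       # right ear is the far ear
```

As the text notes, a rendering of this kind only shifts the apparent azimuth. It conveys no elevation, and the source is likely to be heard inside the head.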
A finer localization can be achieved by introducing frequency-dependent
interaural differences. In fact, due to diffraction the low frequency components are
barely affected by IID, and the ITD is larger in the low frequency range.
Calculations done with a spherical head model and a binaural model [49, 73] allow
us to draw approximated frequency-dependent ITD curves, one being displayed in
fig. 9.a for 30° of azimuth. The curve can be further approximated by constant
segments, one corresponding to a delay of about 0.38 ms in low frequency, and
the other corresponding to a delay of about 0.26 ms in high frequency. The
low-frequency limit can be obtained for a general incident angle by the
formula
$$\mathrm{ITD} = \frac{1.5\,\delta}{c}\,\sin\theta , \tag{27}$$
where $\delta$ is the inter-ear distance in meters, $\theta$ is the incident angle, and $c$ is the speed of sound. The
crossover point between high and low frequency is located around 1 kHz. Similarly,
the IID should be made frequency dependent. Namely, the difference is larger for
high-frequency components, so that we have IID curves such as the one reported
in fig. 9.b for 30° of azimuth. The IID and ITD have been shown to change when
the source is very close to the head [32]. In particular, sources closer than five