proper order P is chosen, its magnitude frequency response follows the envelope
of the signal spectrum, with its broad resonances called formants. The filter
A(z) is called the inverse formant filter because it extracts from the voice signal
a residual resembling the vocal tract excitation. A(z) is also called a whitening
filter because it produces a residual having a flat spectrum. However, we
distinguish between two kinds of residuals, both having a flat spectrum: the pulse
train and the white noise, the first being the idealized vocal-fold excitation for
voiced speech, the second being the idealized excitation for unvoiced speech. In
reality, the residual matches neither of the two idealized excitations. At the
resynthesis stage the choice is either to use an encoded residual, possibly chosen
from a codebook of templates, or to use one of the two idealized excitations
according to a voiced/unvoiced decision made by the analysis stage.
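The analysis and resynthesis chain described above can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the procedure of the text: it assumes the sign convention A(z) = 1 + a_1 z^-1 + ... + a_P z^-P, estimates the coefficients with the autocorrelation method, and uses hypothetical helper names (`lpc_autocorr`, `inverse_filter`, `all_pole_filter`, `idealized_excitation`) and an arbitrary synthetic test signal (one resonance at 700 Hz, pulse period of 80 samples).

```python
import numpy as np

def lpc_autocorr(x, order):
    """Coefficients [1, a_1, ..., a_P] of A(z), autocorrelation method."""
    x = np.asarray(x, dtype=float)
    full = np.correlate(x, x, mode="full")
    r = full[len(x) - 1 : len(x) + order]  # autocorrelation at lags 0..order
    # Normal equations: sum_k a_k r[|i-k|] = -r[i], for i = 1..order
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def inverse_filter(x, a):
    """Apply the FIR whitening filter A(z) to x, giving the residual."""
    return np.convolve(x, a)[: len(x)]

def all_pole_filter(e, a):
    """Apply 1/A(z) to the excitation e with zero initial conditions."""
    y = np.zeros(len(e))
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y[n] = acc
    return y

def idealized_excitation(n, voiced, period=80, seed=0):
    """Pulse train (voiced) or white noise (unvoiced), both spectrally flat."""
    if voiced:
        e = np.zeros(n)
        e[::period] = 1.0
        return e
    return np.random.default_rng(seed).standard_normal(n)

# Synthetic "voiced" frame: a pulse train through one assumed resonance.
fs, n = 8000, 800
pole_radius, pole_freq = 0.97, 700.0
theta = 2 * np.pi * pole_freq / fs
a_true = np.array([1.0, -2 * pole_radius * np.cos(theta), pole_radius ** 2])
voiced = all_pole_filter(idealized_excitation(n, voiced=True), a_true)

a_est = lpc_autocorr(voiced, order=2)      # analysis
residual = inverse_filter(voiced, a_est)   # whitening by A(z)
resynth = all_pole_filter(residual, a_est) # resynthesis from the residual
noise_driven = all_pole_filter(idealized_excitation(n, voiced=False), a_est)
```

Driving 1/A(z) with the exact residual reproduces the analyzed frame, whereas driving it with one of the idealized excitations keeps only the spectral envelope. That is the trade-off between the two resynthesis choices mentioned above.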

replicas of a basic pulse, with the correct inter-pulse period. Several techniques
are available for pitch detection, using either the residual or the target signal [53].
Although not particularly efficient, one possibility is to perform a Fourier analysis
of the residual and estimate the fundamental frequency by the techniques of
section 4.1.5.
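As a rough illustration of Fourier-based pitch detection on the residual, the sketch below computes the autocorrelation of the residual from its power spectrum and picks the strongest lag within a plausible pitch range. This is a swapped-in, simple technique and not necessarily the one of section 4.1.5. The function name `estimate_f0`, the search range `fmin`/`fmax`, and the synthetic residual are assumptions made for the example.

```python
import numpy as np

def estimate_f0(residual, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of a voiced residual frame."""
    e = np.asarray(residual, dtype=float)
    e = e - e.mean()
    # Zero-pad so the FFT-based autocorrelation is linear, not circular.
    nfft = 1 << (2 * len(e) - 1).bit_length()
    power = np.abs(np.fft.rfft(e, nfft)) ** 2
    acf = np.fft.irfft(power, nfft)  # inverse transform of the power spectrum
    # Search only lags that correspond to the assumed pitch range.
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(e) - 1)
    lag = lag_min + int(np.argmax(acf[lag_min : lag_max + 1]))
    return fs / lag

# Synthetic residual: pulse train with an 80-sample period plus weak noise.
fs = 8000
rng = np.random.default_rng(0)
test_residual = 0.05 * rng.standard_normal(800)
test_residual[::80] += 1.0
f0 = estimate_f0(test_residual, fs)
```

For this synthetic input the estimate should come out close to fs / 80 = 100 Hz. The lag resolution is one sample, so a practical detector would typically interpolate around the peak and also decide whether the frame is voiced at all.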

in section 5.1.3.

the filter 1/A(z). As we mentioned in section 2.2.4, the reflection coefficients
are related to a piecewise cylindrical model of the vocal tract. The LPC
analysis proceeds by frames lasting a few milliseconds. In each frame the signal