Linear predictive coding

D. Rocchesso: Sound Processing


where $A(m)e^{j\phi(m)}$ contains the amplitude and instantaneous phase of the
sinusoid that falls within the $k$-th bin, and $W(\cdot)$ is the window transform.

If we have access to the instantaneous phase, we can deduce the instantaneous
frequency by taking the backward difference between two adjacent frames. This
works as long as we deal with the problem of phase unwrapping, which arises
because the phase is known only modulo $2\pi$.
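As a minimal sketch of the backward-difference estimate, the following assumes
the common phase-vocoder convention that each bin has a known center frequency
and that the true frequency deviates from it by less than $\pi/H$ radians per
sample, so that the wrapped phase deviation is unambiguous. The function names
and the parameters `bin_freq` and `hop` are illustrative assumptions, not
notation from the text.

```python
import math


def principal_arg(phi):
    """Wrap an angle in radians into the interval (-pi, pi]."""
    wrapped = math.fmod(phi + math.pi, 2.0 * math.pi)
    if wrapped <= 0.0:
        wrapped += 2.0 * math.pi
    return wrapped - math.pi


def instantaneous_frequency(phase_prev, phase_curr, bin_freq, hop):
    """Estimate a frequency (rad/sample) from the phases of two adjacent frames.

    bin_freq: assumed center frequency of the bin, in rad/sample.
    hop:      hop size H between the two frames, in samples.
    """
    # Phase advance expected over one hop for a sinusoid at the bin center.
    expected = bin_freq * hop
    # Deviation from the expected advance, wrapped back into (-pi, pi].
    deviation = principal_arg(phase_curr - phase_prev - expected)
    # Backward difference, corrected by the unwrapped deviation.
    return bin_freq + deviation / hop
```

For example, with `hop = 64` and `bin_freq = 0.3`, two measured phases produced
by a sinusoid at 0.31 rad/sample should yield an estimate close to 0.31, because
the deviation of 0.01 * 64 = 0.64 rad stays well inside the unambiguous range.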

It can be shown [52, pp. 287-288] that phase unwrapping is unambiguous under

Assumption 2 Let $H$ be the hop size and $\Delta$ the separation between
adjacent bins, and let

$$\Delta H < \pi . \qquad (26)$$

Assumption 2 holds for rectangular windows, where it imposes $H < N/2$ for a
window of length $N$. For Hann or Hamming windows the hop size must be such
that $H < N/4$ (75% overlap). Therefore the frame rate to be used for accurate
partial estimation is higher than the minimal frame rate needed for perfect
reconstruction.

4.2 Linear predictive coding

(with Federico Fontana)

The analysis/synthesis method known as linear predictive coding (LPC) was
introduced in the sixties as an efficient and effective means to achieve
synthetic speech and speech signal communication [92]. The efficiency of the
method is due to the speed of the analysis algorithm and to the low bandwidth
required for the encoded signals. The effectiveness is related to the
intelligibility of the decoded vocal signal.

LPC implements a type of vocoder [10], which is an analysis/synthesis scheme
where the spectrum of a source signal is weighted by the spectral components of
the target signal that is being analyzed. The phase vocoder of figures 2 and 5
is a special kind of vocoder where the amplitude and phase information of the
analysis channels is retained and can be used as weights for complex sinusoids
in the synthesis stage.

In the standard formulation of LPC, the source signal is either white noise or
a pulse train, resembling unvoiced or voiced excitations of the vocal tract,
respectively.
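The two source signals can be sketched as follows. This is only an
illustration: the function names, the unit pulse amplitude, the Gaussian noise
distribution, and the `period` and `seed` parameters are assumptions made here,
not part of the standard formulation.

```python
import random


def pulse_train(num_samples, period):
    """Unit impulses every `period` samples: a crude voiced excitation."""
    return [1.0 if n % period == 0 else 0.0 for n in range(num_samples)]


def white_noise(num_samples, seed=0):
    """Zero-mean, unit-variance Gaussian noise: a crude unvoiced excitation."""
    rng = random.Random(seed)  # local generator, so results are repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
```

The pulse period sets the pitch of the synthetic voiced sound: at a sampling
rate of 8000 Hz, `period = 80` would correspond to a 100 Hz fundamental.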

The basic assumption behind LPC is the correlation between the $n$-th sample
and the $P$ previous samples of the target signal. Namely, the $n$-th signal
sample is represented as a linear combination of the previous $P$ samples, plus
a residual representing the prediction error:

$$x(n) = -a_1 x(n-1) - a_2 x(n-2) - \dots - a_P x(n-P) + e(n) . \qquad (27)$$
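Equation (27) can be read in two directions: given the coefficients, the
residual is obtained from the signal, and the signal is rebuilt from the
residual. The sketch below illustrates both, assuming zero initial conditions
($x(n) = 0$ for $n < 0$) and a coefficient list `a` holding $a_1, \dots, a_P$;
the function names are hypothetical.

```python
def lpc_residual(x, a):
    """Prediction error e(n) = x(n) + sum_k a_k x(n-k), with x(n) = 0 for n < 0."""
    order = len(a)
    e = []
    for n in range(len(x)):
        acc = x[n]
        for k in range(1, order + 1):
            if n - k >= 0:
                acc += a[k - 1] * x[n - k]
        e.append(acc)
    return e


def allpole_synthesis(e, a):
    """Rebuild x(n) = -sum_k a_k x(n-k) + e(n) recursively from the residual."""
    order = len(a)
    x = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, order + 1):
            if n - k >= 0:
                acc -= a[k - 1] * x[n - k]
        x.append(acc)
    return x
```

Feeding the output of `lpc_residual` into `allpole_synthesis` with the same
coefficients should return the original signal up to rounding error, which is
what makes the residual plus the coefficients a complete description of $x(n)$.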

Equation (27) is an autoregressive formulation of the target signal, and the
analysis problem is equivalent to the identification of the coefficients
$a_1, \dots, a_P$ of an all-pole filter. If we try to minimize the error in a
mean square sense, the problem translates into a set of $P$ equations

$$\sum_{k=1}^{P} a_k \sum_{n} x(n-k)\,x(n-i) = -\sum_{n} x(n)\,x(n-i) , \qquad (28)$$
