Sound Analysis

where the third member of the equality is obtained by defining $r = m - n$, and $m$
is a variable accounting for the temporal position of the window. The STFT
therefore turns out to be a function of two variables: one can be thought of as
frequency, while the other is essentially a time shift.
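
The change of variable can be spelled out step by step. The lines below are a sketch that assumes definition (2) is the usual windowed DTFT; the symbol $Y_m(\omega)$ for the STFT is a notational assumption made here.

```latex
\begin{align*}
Y_m(\omega) &= \sum_{n=-\infty}^{\infty} w(m-n)\, y(n)\, e^{-j\omega n} \\
            &= \sum_{r=-\infty}^{\infty} w(r)\, y(m-r)\, e^{-j\omega (m-r)}
               \qquad (r = m - n) \\
            &= e^{-j\omega m} \sum_{r=-\infty}^{\infty} w(r)\, y(m-r)\, e^{j\omega r}
\end{align*}
```

The factor $e^{-j\omega m}$ depends only on the window position $m$, which is why $m$ behaves as a time shift.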

The DTFT is a periodic function of a continuous variable, and it can be
inverted by means of an integral computed over a period:

$$w(m-n)\,y(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} Y_m(\omega)\,e^{j\omega n}\,d\omega . \tag{3}$$

By a proper alignment of the window ($m = n$) we can compute, if $w(0) \neq 0$,

$$y(n) = \frac{1}{2\pi\, w(0)}\int_{-\pi}^{\pi} Y_n(\omega)\,e^{j\omega n}\,d\omega . \tag{4}$$

The STFT in its formulation (2) can be seen as a convolution

$$Y_m(\omega) = (w * y_\omega)(m) , \tag{5}$$

where $y_\omega(n) = y(n)\,e^{-j\omega n}$ is the demodulated signal. If $w$ is set to the impulse
response of the ideal lowpass filter, and if we set $\omega = \omega_k$, we get a channel of the
filterbank of fig. 2. In general, $w(\cdot)$ will be the impulse response of a non-ideal
lowpass filter, but the filterbank view will keep its validity.
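
The filterbank view can be checked numerically. The following is a minimal sketch in plain Python, not a reference implementation: the helper names `stft_point` and `convolve`, the raised-cosine window, and the test signal are all illustrative assumptions. The window is assumed causal, that is, nonzero only for $0 \le r \le R-1$.

```python
import cmath
import math

def stft_point(y, w, m, omega):
    """Direct sum: Y_m(omega) = sum_n w(m - n) y(n) e^{-j omega n}."""
    total = 0j
    for n, y_n in enumerate(y):
        r = m - n
        if 0 <= r < len(w):  # assumption: w is zero outside 0..R-1
            total += w[r] * y_n * cmath.exp(-1j * omega * n)
    return total

def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for k, b_k in enumerate(b):
            out[i + k] += a_i * b_k
    return out

# Illustrative data: a cosine sitting exactly on the channel frequency omega_k.
R = 16
omega_k = 2 * math.pi * 3 / 16
y = [math.cos(omega_k * n) for n in range(64)]
w = [0.5 - 0.5 * math.cos(2 * math.pi * (r + 0.5) / R) for r in range(R)]

# Demodulate, then lowpass filter with the window: one filterbank channel.
y_demod = [y_n * cmath.exp(-1j * omega_k * n) for n, y_n in enumerate(y)]
channel = convolve(w, y_demod)

# Sample m of the channel output should match the direct STFT sum.
max_diff = max(abs(channel[m] - stft_point(y, w, m, omega_k))
               for m in range(len(y)))
print(max_diff)
```

The printed difference should be at the level of floating-point rounding, which is the content of the convolution identity: sliding the window over the demodulated signal and evaluating the STFT at a fixed frequency are the same computation.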

In practice, we need to compute the STFT on a finite set of $N$ points. In
what follows we assume that the window is $R \le N$ samples long, so that we
can use the DFT on $N$ points, thus obtaining a sampling of the frequency axis
between $0$ and $2\pi$ in multiples of $2\pi/N$.

The $k$-th point in the transform domain (called the $k$-th bin of the DFT) is
given by

$$Y_m(k) = \sum_{n=0}^{N-1} w(m-n)\,y(n)\,e^{-j\frac{2\pi}{N}kn} \tag{6}$$

and, by means of an inverse DFT,

$$w(m-n)\,y(n) = \frac{1}{N}\sum_{k=0}^{N-1} Y_m(k)\,e^{j\frac{2\pi}{N}kn} \tag{7}$$

By a proper alignment of the window ($m = n$), and assuming that $w(0) \neq 0$,
we get

$$y(n) = \frac{1}{N\,w(0)}\sum_{k=0}^{N-1} Y_n(k)\,e^{j\frac{2\pi}{N}kn} \tag{8}$$

More generally, we can reconstruct (resynthesize) the time-domain signal by
means of

$$y(n) = \frac{1}{N\,w(m-n)}\sum_{k=0}^{N-1} Y_m(k)\,e^{j\frac{2\pi}{N}kn} \tag{9}$$

where $w(m-n) \neq 0$, which is true, given an integer $n_0$, for a non-trivial window
defined for

$$m + n_0 \le n \le m + n_0 + R - 1 . \tag{10}$$
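
The analysis and resynthesis formulas (6), (9) and (10) can be exercised on a short frame. This is a hedged sketch in plain Python: the function names, the raised-cosine window, and the choice of a causal window (which corresponds to $n_0 = -(R-1)$, so the support is $m-R+1 \le n \le m$) are assumptions made for illustration. The DFT is written as an explicit sum to mirror the equations; a real implementation would use an FFT.

```python
import cmath
import math

def window(r, R):
    """Hypothetical raised-cosine window, nonzero only for 0 <= r <= R-1."""
    if 0 <= r < R:
        return 0.5 - 0.5 * math.cos(2 * math.pi * (r + 0.5) / R)
    return 0.0

def stft_frame(y, m, R):
    """Eq. (6): N-point DFT of w(m - n) y(n), n = 0..N-1."""
    N = len(y)
    return [sum(window(m - n, R) * y[n] * cmath.exp(-2j * math.pi * k * n / N)
                for n in range(N))
            for k in range(N)]

def resynthesize(Y, m, R):
    """Eq. (9): recover y(n) wherever w(m - n) != 0, i.e. on the range (10)."""
    N = len(Y)
    out = {}
    for n in range(N):
        w_mn = window(m - n, R)
        if w_mn != 0.0:
            # Inverse DFT gives w(m - n) y(n), as in eq. (7).
            idft = sum(Y[k] * cmath.exp(2j * math.pi * k * n / N)
                       for k in range(N)) / N
            out[n] = (idft / w_mn).real
    return out

# Illustrative frame: N = 16 samples, window of R = 8 samples ending at m = 10.
N, R, m = 16, 8, 10
y = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(N)]
Y = stft_frame(y, m, R)
rec = resynthesize(Y, m, R)

for n in sorted(rec):
    print(n, round(y[n], 6), round(rec[n], 6))
```

Only the samples under the window are recovered from a single frame, here $n = 3, \dots, 10$. Samples where the window is small are divided by a small number in (9), so in finite precision they are reconstructed less accurately than samples near the window centre.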
