Systems, Sampling and Quantization
numbers represented with b bits requires 2b − 1 bits to represent the result
without any precision loss. If successive operations use operands represented
with b bits, it is clear that the least-significant bits must be eliminated, thus
introducing a quantization. The effects of these quantizations can be studied
by resorting to the additive white noise model, where the points of injection of
the noises are the points where the quantization actually occurs.
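The additive white noise model can be checked numerically. In the following sketch (the step size and the test signal are illustrative assumptions, not taken from the text), rounding to a quantization step Δ produces an error that behaves like white noise of power Δ²/12:

```python
import random

def quantize(x, delta):
    """Round x to the nearest multiple of the quantization step delta."""
    return delta * round(x / delta)

# Illustrative parameters: a random test signal and a step corresponding
# to eliminating least-significant bits after an arithmetic operation.
random.seed(0)
delta = 2 ** -8
signal = [random.uniform(-1.0, 1.0) for _ in range(10000)]

errors = [quantize(x, delta) - x for x in signal]
noise_power = sum(e * e for e in errors) / len(errors)

# The additive white noise model predicts an error variance of delta**2 / 12.
predicted = delta ** 2 / 12
```

The measured noise power agrees with the model to within about one percent for a signal of this length, which is why the model is a practical tool for locating and sizing the noise sources injected at each quantization point.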
The fixed-point implementations of linear systems are subject to disappointing
phenomena related to quantization: limit cycles and overflow oscillations.
Both phenomena manifest as nonzero signals that persist even after the system
has stopped producing useful signals. Limit cycles are usually small oscillations
due to the fact that, because of rounding, the sources of quantization noise
determine a local amplification or attenuation of the signal (see fig. 4). If the
signals within the system have a physical meaning (e.g., they are propagating
waves), limit cycles can be avoided by forcing a lossy quantization, which
always truncates numbers toward zero. This operation corresponds to
introducing a small numerical dissipation. Overflow oscillations are more
serious because they produce signals as large as the maximum representable
amplitude. They can be produced by operations whose results exceed the largest
representable number, so that the result wraps around into the legal range of
two's-complement numbers. Such a destructive oscillation can be avoided by using
overflow-protected operations, i.e., operations that saturate the result to the
largest representable number (or to the most negative representable number).
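Both remedies can be sketched in a few lines (the filter coefficient, initial state, and word length below are illustrative choices): a stable one-pole recursion whose output is sustained forever by rounding but decays to zero under truncation toward zero, and a saturating addition that replaces two's-complement wraparound:

```python
def rounding_filter(a, y0, steps):
    """First-order recursion y[n] = a*y[n-1], quantized by rounding to integers."""
    y = y0
    for _ in range(steps):
        y = round(a * y)
    return y

def truncating_filter(a, y0, steps):
    """Same recursion, but with lossy quantization: truncation toward zero."""
    y = y0
    for _ in range(steps):
        y = int(a * y)  # int() truncates toward zero
    return y

# With a stable pole (|a| < 1) the ideal output decays to zero, but
# rounding sustains a limit cycle: round(0.95 * 10) == 10 forever.
cycle = rounding_filter(0.95, 10, 100)    # stuck at a nonzero value
decay = truncating_filter(0.95, 10, 100)  # small numerical dissipation -> 0

def saturating_add(x, y, b=16):
    """Overflow-protected addition for b-bit two's-complement integers."""
    lo, hi = -(1 << (b - 1)), (1 << (b - 1)) - 1
    return max(lo, min(hi, x + y))
```

Clamping to the ends of the representable range is exactly the saturation described above: the result of an overflowing sum stays at the largest (or most negative) representable number instead of flipping sign.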
The quantizations introduce nonlinear elements within otherwise linear
structures. Indeed, limit cycles and overflow oscillations can persist only
because there are nonlinearities, since a linear and stable system cannot give
a persistent nonzero output with a zero input.
Quantization in floating-point implementations is usually less of a concern for
the designer. In this case, quantization occurs only in the mantissa. Therefore,
the relative error

    ε_r(n) = (y_q(n) − y(n)) / y(n) ,                                    (46)
is more meaningful for the analysis. We refer to [65] for a discussion on the
effects of quantization with floating point implementations.
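Because only the mantissa is quantized, the relative error of eq. (46) is bounded by the machine epsilon of the format, independently of the magnitude of the signal. A small sketch (the test values are arbitrary) shows this for single precision, whose 23-bit mantissa gives a round-to-nearest bound of 2^−24:

```python
import struct

def to_float32(x):
    """Quantize a double-precision value to single precision (23-bit mantissa)."""
    return struct.unpack('f', struct.pack('f', x))[0]

# The relative error of eq. (46) stays below 2**-24 for values spanning
# sixty orders of magnitude, because only the mantissa is quantized.
worst = 0.0
for y in (1e-30, 0.1, 3.14159265358979, 1e30):
    yq = to_float32(y)
    worst = max(worst, abs(yq - y) / abs(y))
```

This magnitude-independence of the relative error is the reason the designer can usually ignore quantization in floating-point implementations.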
Some digital audio formats, such as the µ-law and A-law encodings, use
a fixed-point representation where the quantization levels are distributed
nonlinearly in the amplitude range. The idea, reminiscent of the quasi-logarithmic
sensitivity of the ear, is to have many more levels where signals are small and
a coarser quantization for large amplitudes. This is justified if the signals being
quantized do not have a uniform statistical distribution but tend to assume small
amplitudes more often than large amplitudes. Usually the distribution of levels
is exponential, in such a way that the intervals between points increase exponen-
tially with magnitude. This kind of quantization is called logarithmic because,
in practical realizations, a logarithmic compressor precedes a linear quantization
stage [69]. Floating-point quantization can be considered as a piecewise-linear
logarithmic quantization, where each linear piece corresponds to a value of the
exponent.
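The compressor-before-linear-quantizer structure can be sketched with the standard µ-law characteristic, using µ = 255 as in the G.711 standard (the 8-bit word length below is an illustrative choice):

```python
import math

MU = 255.0  # value of mu used in the G.711 standard

def mu_law_compress(x):
    """Map x in [-1, 1] through the logarithmic mu-law compressor."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse mapping (the expander)."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_mu(x, bits=8):
    """Logarithmic quantization: compressor, uniform quantizer, expander."""
    levels = (1 << bits) - 1
    y = mu_law_compress(x)
    yq = round((y + 1) / 2 * levels) / levels * 2 - 1  # uniform in [-1, 1]
    return mu_law_expand(yq)
```

Because the uniform quantizer acts on the compressed signal, the effective levels are much denser near zero: the quantization error for a small amplitude such as 0.01 comes out roughly two orders of magnitude smaller than for an amplitude near full scale.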