comp.dsp FAQ [2 of 4]

Archive-name: dsp-faq/part2
Last-modified: Wed Apr 11 2007


                          Q2: Algorithms and standards

Q2.1: Where can I get public domain algorithms for general-purpose DSP?

   Updated 12/31/96

           The following archives contain things such as matrix operations,
           FFTs, and other generally useful routines, as opposed to
           complete applications.


           Netlib serves some of this software via email. Try mail to
           netlib@ORNL.GOV with "send help" in the subject field.

   To Obtain:
           For Europe:

           X.400: s=netlib; o=nac; c=no;
           EUNET/uucp: nac!netlib

   For more information:
           See Jack J. Dongarra and Eric Grosse, "Distribution of
           Mathematical Software Via Electronic Mail," Comm. ACM (1987)

           A similar collection of statistical software is available from

           The symbolic algebra system REDUCE is supported by

    NSWC Library

           The Naval Surface Warfare Center (NSWC) has a library of
           general-purpose Fortran subroutines that provide a basic
           computational capability in a variety of mathematical
           activities. Emphasis has been placed on the transportability of
           the codes. Subroutines are available in the following areas:
           Elementary Operations, Geometry, Special Functions, Polynomials,
           Vectors, Matrices, Large Dense Systems of Linear Equations, Banded
           Matrices, Sparse Matrices, Eigenvalues and Eigenvectors, l1
           Solution of Linear Equations, Least-Squares Solution of Linear
           Equations, Optimization, Transforms, Approximation of Functions,
           Curve Fitting, Surface Fitting, Manifold Fitting, Numerical
           Integration, Integral Equations, Ordinary Differential Equations,
           Partial Differential Equations

   For more information:
           NSWC Library of Mathematical Subroutines
           Report No.: NSWC TR 90-21, January 1990
           by Alfred H. Morris, Jr.

           Naval Surface Warfare Center (E43)
           Dahlgren, VA 22448-5000

           [Witold Waldman]

    IEEE Press book "Programs For Digital Signal Processing"

           You can get the Fortran source code from the IEEE Press book
           "Programs For Digital Signal Processing." See question 1.3.6.


Q2.2: What are CELP and LPC? Where can I get the source for CELP and LPC?

   Updated 09/10/01

           CELP stands for "code excited linear prediction". LPC stands for
           "linear predictive coding". They are compression algorithms used
           for low bit rate (2400 and 4800 bps) speech coding.

           The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited
           linear prediction voice coder version 3.2 (CELP 3.2) Fortran and C
           simulation source codes are available for worldwide distribution
           (on DOS diskettes, but configured to compile on Sun SPARC
           stations) from NTIS and DTIC. Example input and processed speech
           files are included. A Technical Information Bulletin (TIB),
           "Details to Assist in Implementation of Federal Standard 1016
           CELP," and the official standard, "Federal Standard 1016,
           Telecommunications: Analog to Digital Conversion of Radio Voice by
           4,800 bit/second Code Excited Linear Prediction (CELP)," are also
           available.

   To obtain CELP:
           Available through the National Technical Information Service:

              U.S. Department of Commerce
              5285 Port Royal Road
              Springfield, VA 22161
              (800) 553-6847

           FS-1016 CELP 3.2 may also be obtained from

           LPC-10 (2.4 Kbps) is available from

           LPC (4.8 Kbps) can be downloaded in SpeakFreely, or in HawkVoice.
           HawkVoice includes versions of OpenLPC, LPC-10, LPC, GSM, and
           Intel/DVI ADPCM. These versions have been rewritten to support
           multiple encoding and decoding streams, and the interfaces have
           been standardized. [Phil Frisbie]

           OpenLPC (1.4 and 1.8 Kbps) can be downloaded from

           MATLAB software for LPC-10 is also available, as are PostScript
           copies of tutorials on speech coding. [Andreas Spanias]

   For more information:

     * The following articles describe the Federal-Standard-1016 4.8-kbps
       CELP coder (it's unnecessary to read more than one):

         Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The
         Federal Standard 1016 4800 bps CELP Voice Coder, Digital Signal
         Processing, Academic Press, 1991, Vol. 1, No. 3, p. 145-155.

         Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The
         DoD 4.8 kbps Standard (Proposed Federal Standard 1016), in Advances
         in Speech Coding, ed. Atal, Cuperman and Gersho, Kluwer Academic
         Publishers, 1991, Chapter 12, p. 121-133.

         Campbell, Joseph P. Jr., Thomas E. Tremain and Vanoy C. Welch, The
         Proposed Federal Standard 1016 4800 bps Voice Coder: CELP, Speech
         Technology Magazine, April/May 1990, p. 58-64.

       Additional information on CELP can also be found in the comp.speech

     * The voicing classifier used in the enhanced LPC-10 (LPC-10e) is
       described in: Campbell, Joseph P., Jr. and T. E. Tremain,
       Voiced/Unvoiced Classification of Speech with Applications to the U.S.
       Government LPC-10E Algorithm, Proceedings of the IEEE International
       Conference on Acoustics, Speech, and Signal Processing, 1986, p.

       The U. S. Federal Standard 1015 (NATO STANAG 4198) is described in:
       Thomas E. Tremain, The Government Standard Linear Predictive Coding
       Algorithm: LPC-10, Speech Technology Magazine, April 1982, pp. 40-49.

   [Most of the above from Joe Campbell, with additions from Dan Frankowski
   and Ed Hall]


Q2.3: What is ADPCM? Where can I get source for it?

   Updated: 04/03/01

   ADPCM stands for Adaptive Differential Pulse Code Modulation. It is a
   family of speech compression and decompression algorithms. A common
   implementation takes 16-bit linear PCM samples and converts them to 4-bit
   samples, yielding a compression ratio of 4:1.

   To obtain:
           There is public domain C code, written by Jack Jansen, available
           via anonymous ftp. It is very programmer-friendly. The ADPCM code
           used is the Intel/DVI ADPCM code which is being recommended by
           the IMA Digital Audio Technical Working Group. It allows the
           following calls:

 void adpcm_coder(short inbuf[], char outbuf[], int nsample,
         struct adpcm_state *state);
 void adpcm_decoder(char inbuf[], short outbuf[], int nsample,
         struct adpcm_state *state);

           Note that this is NOT a G.722 coder. The ITU (formerly CCITT)
           ADPCM standards are much more complicated, probably resulting in
           better quality sound but also in much more computational overhead.

           The routines have been tested on numerous platforms, and will
           easily compress and decompress millions of samples per second on
           current hardware.

   For more information:
           The G.721/722/723 packages are available from ITU at

           [From Dan Frankowski; Jack Jansen]


Q2.4: What is GSM? Where can I get source for it?

   Updated 4/27/00

           GSM (Global System for Mobile Communication) is a standard for
           digital cellular telephony used in Europe. GSM also refers to the
           speech coder used in GSM telephones, which is what this section of
           the FAQ is concerned with.

           The Communications and Operating Systems Research Group (KBS) at
           the Technische Universitaet Berlin is currently working on a set
           of UNIX-based tools for computer-mediated telecooperation that
           will be made freely available.

           As part of this effort they are publishing an implementation of
           the European GSM 06.10 provisional standard for full-rate speech
           transcoding, prI-ETS 300 036, which uses RPE/LTP (regular pulse
           excitation/long term prediction) coding at 13 kbit/s.

           GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling
           rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility
           with typical UNIX applications, our implementation turns frames of
           160 16-bit linear samples into 33-byte frames (1650 Bytes/s). The
           quality of the algorithm is good enough for reliable speaker
           recognition; even music often survives transcoding in recognizable
           form (given the bandwidth limitations of 8 kHz sampling rate).
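
           The frame arithmetic above can be checked with a few lines of C.
           This is only a sketch: the function names are made up for
           illustration and are not part of the TU Berlin library.

```c
#include <assert.h>

/* Frames per second for a given sampling rate and frame length.
 * (Hypothetical helper, not part of any GSM library.) */
long frames_per_second(long sample_rate_hz, long samples_per_frame)
{
    assert(samples_per_frame > 0);
    assert(sample_rate_hz % samples_per_frame == 0);
    return sample_rate_hz / samples_per_frame;
}

/* Throughput for a per-frame payload counted in bits or in bytes. */
long units_per_second(long units_per_frame, long frame_rate_hz)
{
    assert(units_per_frame >= 0 && frame_rate_hz >= 0);
    return units_per_frame * frame_rate_hz;
}
```

           With 8000 Hz sampling and 160-sample frames this gives 50
           frames/s, so 260 bits per frame is 13000 bit/s and 33 bytes per
           frame is 1650 bytes/s. A 33-byte frame holds 264 bits, 4 more
           than the 260 coded bits.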

           The interfaces offered are a front end modeled after compress(1),
           and a library API. Compression and decompression run faster than
           real time on most SPARCstations. The implementation has been
           verified against the ETSI standard test patterns.

           Jutta Degener, Carsten Bormann

           Communications and Operating Systems Research Group, TU Berlin
           Fax: +49.30.31425156, Phone: +49.30.31424315

   To obtain:
           An alternative site is
  Try also:

   [From Dan Frankowski; Jutta Degener]


Q2.5: How does pitch perception work, and how do I implement it on my DSP chip?

   Updated 04/02/01

           Pitch is officially defined as "That attribute of auditory
           sensation in terms of which sounds may be ordered on a musical
           scale." Several good examples illustrating the subtleties of pitch
           perception are included in the "Auditory Demonstrations CD" which
           is available from the Acoustical Society of America, Woodbury, NY
           10797 for $20.

           A good general reference about the psychology of pitch perception
           is the book:

             B.C.J. Moore, An Introduction to the Psychology of Hearing,
             Academic Press, London, 1997.

           This book is available in paperback and makes a good desk
           reference.

           An algorithm implementation that matches a large body of
           psycho-acoustical work, but which is computationally very
           intensive, is presented in the paper:

             Malcolm Slaney and Richard Lyon, "A Perceptual Pitch Detector,"
             Proceedings of the International Conference of Acoustics,
             Speech, and Signal Processing, 1990, Albuquerque, New Mexico.
             Available for ftp at

           The definitive papers describing the use of such a perceptual
           pitch detector as applied to the classical pitch literature are:

             Ray Meddis and M. J. Hewitt, "Virtual pitch and phase
             sensitivity of a computer model of the auditory periphery,"
             Journal of the Acoustical Society of America 89 (6), 1991,
             pp. 2866-2882 and 2883-2894.

           The current work that argues for a pure spectral method starts
           with the work of Goldstein:

             J. Goldstein, "An optimum processor theory for the central
             formation of the pitch of complex tones," Journal of the
             Acoustical Society of America 54, 1496-1516, 1973.

           Two approaches are worth considering if something approximating
           pitch is appropriate. The people at IRCAM have proposed a harmonic
           analysis approach that can be implemented on a DSP:

             Boris Doval and Xavier Rodet, "Estimation of Fundamental
             Frequency of Musical Sound Signals," Proceedings of the 1991
             International Conference on Acoustics, Speech, and Signal
             Processing, Toronto, Volume 5, pp. 3657-3660.

           The classic paper for time domain (peak picking) pitch algorithms
           is:

             B. Gold and L. Rabiner, "Parallel processing techniques for
             estimating pitch periods of speech in the time domain," Journal
             of the Acoustical Society of America, 46, pp 441-448, 1969.

   Finally, a word of caution:
           Pitch is not single-valued. We can hear a sound and match it to
           several different pitches. Imagine the number of instruments in an
           orchestra, each with its own pitch. Even a single sound can have
           more than one pitch. See for example Demonstration 27 from the ASA
           Auditory Demonstrations CD.

           [The above from Malcolm Slaney, Interval Research, and John
           Lazzaro, U.C. Berkeley.]

           Information about independently changing the pitch and speed of a
           digital recording can be found at
  [Stephan M.


Q2.6: What standards exist for digital audio? What is AES/EBU? What is S/PDIF?

   Updated 1/8/97

  Q2.6.1: Where can I get copies of ITU (formerly CCITT) standards?

           Try the ITU (International Telecommunication Union) homepage at


  Q2.6.2: What standards are there for digital audio?


           The "AES/EBU" (Audio Engineering Society / European Broadcasting
           Union) digital audio standard is probably the most popular digital
           audio standard today. Most consumer and professional digital audio
           devices (CD players, DAT decks, etc.) that feature digital audio
           I/O support AES/EBU.

           AES/EBU is a bit-serial communications protocol for transmitting
           digital audio data through a single transmission line. It provides
           two channels of audio data (up to 24 bits per sample), a method
           for communicating control and status information ("channel status
           bits"), and some error detection capabilities. Clocking
           information (i.e., sample rate) is derived from the AES/EBU bit
           stream, and is thus controlled by the transmitter. The standard
           mandates use of 32 kHz, 44.1 kHz, or 48 kHz sample rates, but some
           interfaces can be made to work at other sample rates.

           AES/EBU provides both "professional" and "consumer" modes. The big
           difference is in the format of the channel status bits mentioned
           above. The professional mode bits include alphanumeric channel
           origin and destination data, time of day codes, sample number
           codes, word length, and other goodies. The consumer mode bits have
           much less information, but do include information on copy
           protection (naturally). Additionally, the standard provides for
           "user data", which is a bit stream containing user-defined (i.e.,
           manufacturer-defined) data. According to Tim Channon, "CD user
           data is almost raw CD subcode; DAT is StartID and SkipID. In
           professional mode, there is an SDLC protocol or, if DAT, it may be
           the same as consumer mode."

           Three physical connection media are commonly used with AES/EBU:
           balanced (differential), using two wires and shield in three-wire
           microphone cable with XLR connectors; unbalanced (single-ended),
           using audio coax cable with RCA jacks; and optical (via fiber
           optics).

           "S/P-DIF" (Sony/Philips Digital Interface Format) typically refers
           to AES/EBU operated in consumer mode over unbalanced RCA cable.
           Note that S/P-DIF and AES/EBU mean different things depending on
           how much of a purist you are in the digital audio world; see the
           Finger article below.

           Finger, Robert, AES3-199X: The Revised Two Channel Digital Audio
           Interface (DRAFT), presented at the 91st Convention of the Audio
           Engineering Society, October 4-8, 1991. Reprints: AES, 60 East
           42nd St., New York, NY, 10165.

           [The above from Phil Lapsley and Tim Channon]

           Painter, E. M., and Spanias, A. S. (1997 and revised 1999). A
           Review of Algorithms for Perceptual Coding of Digital Audio
           Signals. (PostScript, 3MB)

           [Andreas Spanias]


Q2.7: What is mu-law encoding? Where can I get source for it?

   Updated 9/13/99

           Mu-law (also "u-law") encoding is a form of logarithmic
           quantization or companding. It's based on the observation that
           many signals are statistically more likely to be near a low signal
           level than a high signal level. Therefore, it makes more sense to
           have more quantization points near a low level than a high level.
           In a typical mu-law system, linear samples of 14 to 16 bits are
           companded to 8 bits. Most telephone quality codecs (including the
           Sparcstation's audio codec) use mu-law encoded samples.
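
           As an illustration of the companding idea, here is a minimal
           sketch of a segment-based mu-law encoder and decoder for 16-bit
           linear samples. The function names, the bias of 132, and the clip
           level of 32635 follow a widely circulated form of the algorithm;
           they are assumptions here, not taken from this FAQ, and the code
           has not been checked against the G.711 tables.

```c
#include <assert.h>
#include <stdint.h>

#define MULAW_BIAS 0x84    /* 132, added before the segment search (assumed) */
#define MULAW_CLIP 32635   /* largest magnitude before the bias (assumed) */

/* Compress one 16-bit linear sample to an 8-bit mu-law code. */
uint8_t linear_to_mulaw(int16_t pcm)
{
    int32_t mag = pcm;          /* widen so that -32768 can be negated */
    int sign = 0;
    int exponent = 7;
    int32_t mask = 0x4000;
    int mantissa;

    if (mag < 0) {
        mag = -mag;
        sign = 0x80;
    }
    if (mag > MULAW_CLIP)
        mag = MULAW_CLIP;
    mag += MULAW_BIAS;
    assert(mag >= 0x80 && mag <= 0x7FFF);

    /* The segment (exponent) is the position of the highest set bit. */
    while ((mag & mask) == 0) {
        exponent--;
        mask >>= 1;
    }
    mantissa = (int)((mag >> (exponent + 3)) & 0x0F);

    /* The code word is transmitted with all bits inverted. */
    return (uint8_t)~(sign | (exponent << 4) | mantissa);
}

/* Expand an 8-bit mu-law code back to a 16-bit linear sample. */
int16_t mulaw_to_linear(uint8_t code)
{
    int c = (uint8_t)~code;
    int exponent = (c >> 4) & 0x07;
    int mantissa = c & 0x0F;
    int32_t mag = (((int32_t)(mantissa << 3) + MULAW_BIAS) << exponent)
                  - MULAW_BIAS;

    return (int16_t)((c & 0x80) ? -mag : mag);
}
```

           Small inputs land in segments with fine steps and large inputs in
           segments with coarse steps, which is the "more quantization
           points near a low level" behaviour described above.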

           Desktop Sparc machines come with routines to convert between
           linear and mu-law samples. On a desktop Sparc, see the man page
           for audio_ulaw2linear in /usr/demo/SOUND/man.

   To obtain:
           Craig Reese posted the source of similar routines to comp.dsp in
           August '92. These are archived on

           ITU-T (formerly CCITT) Recommendation G.711 (very difficult to

           Michael Villeret, et al., A New Digital Technique for
           Implementation of Any Continuous PCM Companding Law, IEEE Int.
           Conf. on Communications, 1973, vol. 1, pp. 11.12-11.17.

           MIL-STD-188-113, Interoperability and Performance Standards for
           Analog-to-Digital Conversion Techniques, 17 February 1987.

           TI Digital Signal Processing Applications with the TMS320 Family
           (TI literature number SPRA012A), pp. 169-198.

   [From Joe Campbell; Craig Reese; Sepehr Mehrabanzad; Keith Kendall]


Q2.8: How can I do CD <=> DAT sample rate conversion?

   Updated 9/13/99

           CD players use a 44.1 kHz sample rate, whereas DAT uses a 48 kHz
           sample rate. This means that you must do sample rate conversion
           before you can get data from a CD player directly into a DAT deck.

           [From Ed Hall]

           For a start, look at Multirate Digital Signal Processing by
           Crochiere and Rabiner (see FAQ section 1.1).

           Almost any technique for producing good digital low-pass filters
           will be adaptable to sample-rate conversion. 44.1:48 and
           vice-versa is pretty hairy, though, because the lowest
           whole-number ratio is 147:160. To do all that in one go would
           require a FIR with thousands of coefficients, of which only
           1/147th or 1/160th are used for each sample--the real problem is
           memory, not CPU for most DSP chips. You could chain several
           interpolators and decimators, as suggested by factoring the ratio
           into 3*7*7:2*2*2*2*2*5. This adds complexity, but reduces the
           number of coefficients required by a considerable amount.
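
           The 147:160 ratio comes from dividing both rates by their
           greatest common divisor. A small sketch (the function names are
           illustrative only):

```c
#include <assert.h>

/* Greatest common divisor by Euclid's algorithm. */
long gcd(long a, long b)
{
    assert(a > 0 && b > 0);
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Reduce a conversion from rate_in to rate_out to "interpolate by *up,
 * then decimate by *down" with the smallest whole numbers. */
void resample_ratio(long rate_in, long rate_out, long *up, long *down)
{
    long g = gcd(rate_in, rate_out);

    *up = rate_out / g;
    *down = rate_in / g;
}
```

           For 44100 Hz to 48000 Hz the common divisor is 300, giving
           interpolation by 160 and decimation by 147; the intermediate rate
           is 44100 * 160 = 7056000 Hz.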

           [From Lou Scheffer:]

           Theory of operation: 44.1 and 48 are in the ratio 147/160. To
           convert from 44.1 to 48, for example, we (conceptually):

             1. Interpolate 159 zeros between every input sample. This raises
                the data rate to 7.056 MHz. Since it is equivalent to
                reconstructing with delta functions, it also creates images
                of frequency f at 44.1-f, 44.1+f, 88.2-f, 88.2+f, ... kHz.
             2. We remove these with an FIR digital filter, leaving a signal
                containing only 0-20 kHz information, but still sampled at a
                rate of 7.056 MHz.
             3. We discard 146 of every 147 output samples. It does not hurt
                to do so since we have no content above 24 kHz. In practice,
                of course, we never compute the values of the samples we will
                throw out.
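
           The three steps above, including the remark that discarded
           samples are never computed, can be sketched as a polyphase loop.
           This is an illustrative sketch, not code from any of the packages
           mentioned in this FAQ: the function name and argument layout are
           assumptions, the filter h is assumed to be designed at the high
           (zero-stuffed) rate, and for unity passband gain its DC gain
           should be L.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Resample x by the ratio L/M: conceptually insert L-1 zeros after each
 * input sample, filter with h, and keep every M-th result. Only the kept
 * outputs are computed, and only the taps that line up with non-zero
 * (original) samples are used.
 *
 * y must have room for (nx * L) / M samples; that count is returned.
 */
size_t resample_polyphase(const double *x, size_t nx,
                          const double *h, size_t nh,
                          size_t L, size_t M, double *y)
{
    size_t nout, n;

    assert(L > 0 && M > 0);
    nout = (nx * L) / M;

    for (n = 0; n < nout; n++) {
        size_t m = n * M;        /* position in the zero-stuffed stream */
        size_t idx = m / L;      /* newest original sample at or before m */
        size_t phase = m % L;    /* first tap that meets a non-zero sample */
        double acc = 0.0;
        size_t j, k;

        /* Taps phase, phase+L, phase+2L, ... meet x[idx], x[idx-1], ... */
        for (j = phase, k = 0; j < nh && k <= idx; j += L, k++)
            acc += h[j] * x[idx - k];
        y[n] = acc;
    }
    return nout;
}
```

           For 44.1 kHz to 48 kHz, L = 160 and M = 147. Each output touches
           only about nh/160 taps, which is why the cost is dominated by
           coefficient memory rather than by arithmetic.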

           So we need to design an FIR filter that is flat to 20 kHz, and
           down at least X dB at 24 kHz. How big does X need to be? You might
           think about 100 dB, since the max signal size is roughly +-32767,
           and the input quantization +- 1/2, so we know the input had a
           signal to broadband noise ratio of 98 dB at most. However, the
           noise in the stopband (20 kHz-3.5 MHz) is all folded into the
           passband by the decimation in step 3, so we need another 22 dB
           (that's a power ratio of 160 expressed in dB) to account for the
           noise folding. Thus 120 dB rejection yields a broadband noise
           equal to the original quantizing noise. If you are a fanatic, you
           can shoot for 130 dB to make the original quantizing errors
           dominate, and a 22.05 kHz cutoff to eliminate even ultrasonic
           aliasing. You will pay for your fanaticism with a penance of more
           taps, however.

   To obtain:
           There's a free implementation of Julius O. Smith III and someone
           else's "bandwidth-limited interpolation" rate conversion.

           A paper explains the algorithm. Free source code, as well as an
           HTML discussion of the algorithm, is also available. It all works
           quite well.

           [From Kevin Bradley]

           There is an implementation of polyphase resampling for various
           rates as a part of the Sox audio toolkit. See file polyphas.c
           for details.

           Sox also contains an implementation of bandlimited interpolation
           and linear interpolation, and serves as a ready vehicle for module

           [From Fritz M. Rothacher]

           You can add my Ph.D. thesis on sample-rate conversion to the FAQ:

           Fritz M. Rothacher, Sample-Rate Conversion: Algorithms and VLSI
           Implementation, Ph.D. thesis, Integrated Systems Lab, Swiss
           Federal Institute of Technology, ETH Zuerich, 1995, ISBN

           It can also be downloaded from my homepage at


Q2.9: Wavelets

   Updated 6/3/98

  Q2.9.1: What are wavelets? Where can I get more information?

           In short, wavelets are a way to analyze a signal using basis
           functions which are localized both in time (like Dirac impulses,
           but unlike sine waves), and in frequency (like sine waves, but
           unlike Dirac impulses). They can be used for efficient numerical
           algorithms and many DSP or compression applications.

           Sources of information on wavelets include:

              * a newsletter, "Wavelet Digest". Subscriptions for Wavelet
                Digest: E-mail to with "subscribe"
                as subject. The Wavelet Digest can also be found at


  Q2.9.2: What are some good books and papers on wavelets?

           The best introduction to wavelet transforms is in:

             Olivier Rioul and Martin Vetterli, "Wavelets and Signal
             Processing," IEEE Signal Processing Magazine, Oct. 1991,
             pp. 14-38.

           A good introductory book on wavelets:

             Randy K. Young, Wavelet Theory and Its Applications, Kluwer
             Academic Publishers, ISBN 0-7923-9271-X, 1993.

           A more thorough book:

             Ali N. Akansu and Richard A. Haddad, Multiresolution Signal
             Decomposition: Transforms, Subbands, Wavelets, Academic Press,
             Inc., ISBN 0-12-047140-X.

           A couple more interesting papers:

             Martin Vetterli and Cormac Herley, "Wavelets and Filter Banks:
             Theory and Design," IEEE Transactions on Signal Processing,
             Vol. 40, No. 9, Sept. 1992, pp. 2207-2232.

             Mac Cody's articles in Dr. Dobb's Journal, April 1992 and April

             A paper by Ingrid Daubechies in IEEE Trans. on Information
             Theory, Vol. 36, No. 5, Sept. 1990, and a book titled "Ten
             Lectures on Wavelets" deal with the mathematical aspects of
             the WT.


  Q2.9.3: Where can I get some software for wavelets?

           Binaries are available for the following platforms: Sun
           Sparcstations running SunOS 4.1 or Solaris 2.3, NeXT machines
           running NeXTstep 3.0 or higher, with an X server, Silicon Graphics
           machines (IRIS), DEC Alpha AXP running OSF/1 1.2 or higher,
           i386/i486 PC compatible with Linux 0.99.

           There is also a sample data directory containing interesting

   More information:

           [From Fazal Majid]:

    Rice Wavelet Tools

           The Rice Wavelet Toolbox (RWT) is a collection of Matlab M-files
           and C MEX-files for 1D and 2D wavelet and filter bank design,
           analysis, and processing. The toolbox provides tools for denoising
           and interfaces directly with our Matlab code for wavelet domain
           hidden Markov models and wavelet regularized deconvolution. Also
           included is a simple converter to the data format used by the
           official Matlab wavelet toolbox.

           The current distribution, Version 2.3 (Dec 1, 2000), has been
           streamlined and packaged for different systems, including Solaris,
           Linux, and Microsoft Windows. Functions omitted in Version 2.3 can
           be found in the Version 2.01 distribution.

   To obtain:

           Send mail to (or


Q2.10: How do I calculate the coefficients for a Hilbert transformer?

   Updated 6/3/98

           For all the gory details, I suggest the paper: Andrew Reilly and
           Gordon Frazer and Boualem Boashash: Analytic signal
           generation---tips and traps, IEEE Transactions on Signal
           Processing, no. 11, vol. 42, Nov. 1994, pp. 3241-3245.

           For comp.dsp, the gist is:

             1. Design a half-bandwidth real low-pass FIR filter using
                whatever optimal method you choose, with the principal design
                criterion being minimization of the maximum gain (that is,
                maximization of the minimum attenuation) in the band f_s/4
                to f_s/2.

             2. Modulate this by exp(j 2 pi (f_s/4) t), so that now your
                stop-band is the negative frequencies, the pass-band is the
                positive frequencies, and the roll-off at each end does not
                extend into the negative frequency band.

             3. Either use it as a complex FIR filter, or as a pair of I/Q
                real filters in whatever FIR implementation you have
                available.

           If your original filter design produced an impulse response with
           an even number of taps, then the filtering in 3 will introduce a
           spurious half-sample delay (resampling the real signal component),
           but that does not matter for many applications, and such filters
           have other features to recommend them.
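
           Step 2 needs no trigonometry: at sample n, a complex exponential
           at f_s/4 is j^n, which cycles through 1, j, -1, -j. The sketch
           below splits a real prototype into the I and Q filters of step 3.
           The function name is illustrative, and taking tap 0 as the phase
           reference (rather than the filter centre) is an assumption; it
           only rotates the result by a constant phase.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Modulate a real low-pass prototype h[0..ntaps-1] by a complex
 * exponential at a quarter of the sampling rate. The complex filter is
 * h_i[n] + j*h_q[n].
 */
void modulate_quarter_rate(const double *h, size_t ntaps,
                           double *h_i, double *h_q)
{
    /* Real and imaginary parts of j^n for n mod 4 = 0, 1, 2, 3. */
    static const double re[4] = { 1.0, 0.0, -1.0, 0.0 };
    static const double im[4] = { 0.0, 1.0, 0.0, -1.0 };
    size_t n;

    assert(h != NULL && h_i != NULL && h_q != NULL);
    for (n = 0; n < ntaps; n++) {
        h_i[n] = h[n] * re[n & 3];
        h_q[n] = h[n] * im[n & 3];
    }
}
```

           Every other tap of each output filter is zero, so the I and Q
           filters together cost about as many multiplies as the original
           real prototype.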

           Andrew Reilly


Q2.11: Algorithm implementation: floating-point versus fixed-point

           According to the WWWebster Dictionary, an algorithm is "a
           procedure for solving a mathematical problem (as of finding the
           greatest common divisor) in a finite number of steps that
           frequently involves repetition of an operation; broadly: a
           step-by-step procedure for solving a problem or accomplishing some
           end especially by a computer."

           Typical (although by no means the only) operations are those of
           addition and multiplication. When expressing the algorithm with
           pencil and paper, these operations are commonly taken to be within
           an algebraically complete number system such as the integers or
           the reals. However, when the time comes to implement the algorithm
           on a computer, these "ideal" number systems must be exchanged for
           something realizable. The number systems available today on common
           processors and digital hardware are broadly categorized as
           floating-point and fixed-point.

           In a floating-point representation, the total number of bits
           available is partitioned into an exponent and a mantissa.
           Generally speaking, the mantissa stores the "significant digits"
           of the value while the exponent scales the significant digits to
           the desired magnitude. The action of the exponent is to move, or
           "float," the binary point depending on the magnitude being
           represented; thus the term "floating-point."

           Because floating-point representations are typically at least 32
           bits long (IEEE-754 is a popular standard for 32-bit and 64-bit
           floating-point numbers), they offer high precision and high
           dynamic range simultaneously. These traits of floating-point
           numbers allow most algorithms to be ported directly to
           floating-point implementations with little or no change, and this
           is the key reason floating-point representations are highly
           desirable. The disadvantage of floating-point implementations is
           that they require a significant amount of extra hardware over
           fixed-point implementations, which translates to higher parts
           costs, higher power consumption, slower execution, larger chip
           area, or a combination of these.

           As the term "fixed-point" implies, fixed-point representations
           have the binary point at a fixed location. There are two subsets
           of fixed-point implementations: fractional and integer. In a
           fractional fixed-point implementation, such as that provided on
           the Motorola 56K series of DSPs, the binary point is always
           assumed to be to the left of the most-significant digit. In an
           integer fixed-point implementation, such as that provided by the
           Texas Instruments TMS320C54xx series of DSPs, the binary point is
           to the right of the least-significant digit. In either case, the
           arithmetic operations implemented in the hardware are essentially
           integer, which results in a much simpler arithmetic logic unit in
           hardware that allows lower cost, lower power consumption, faster
           execution, smaller chip area, or a combination of these, over that
           of floating-point implementations.

    Fixed-Point Arithmetic: The Basics

           In essence, a fixed-point representation is a simple integer
           scaled (divided) by a power of two. If we denote an unscaled
           integer variable by upper case "X" and the scaled, fixed-point
           variable by lower case "x," then x = X/2^b, where b is the number
           of digits the binary point is shifted left. For example, if X is a
           16-bit, two's complement integer, and b=4, then "X" has values
           ranging from -2^(15) to +2^(15)-1 and with minimum step size of 1,
           while the scaled value "x" ranges from -2^(11) to +2^(11) -
           1/(2^4) with a minimum step size of 1/(2^4).
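
           The relation x = X/2^b can be tried out with a small sketch. The
           helper names are made up for illustration, and the choice of
           round-to-nearest with saturation in the conversion from real
           values is an assumption, not something prescribed above.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the raw 16-bit integer X as the fixed-point value X / 2^b. */
double fixed_to_real(int16_t X, int b)
{
    assert(b >= 0 && b <= 15);
    return (double)X / (double)((int32_t)1 << b);
}

/* Quantize a real value to the nearest 16-bit integer with b fractional
 * bits, saturating at the ends of the 16-bit range. */
int16_t real_to_fixed(double x, int b)
{
    double scaled;

    assert(b >= 0 && b <= 15);
    scaled = x * (double)((int32_t)1 << b);
    scaled += (scaled >= 0.0) ? 0.5 : -0.5;   /* round half away from zero */
    if (scaled >= 32768.0)
        return INT16_MAX;
    if (scaled <= -32769.0)
        return INT16_MIN;
    return (int16_t)scaled;                   /* truncates toward zero */
}
```

           With b = 4, the raw values -32768, 32767, and 1 correspond to
           -2048, 2047.9375, and 0.0625, matching the range and step size
           given above.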

           Note that the value of "b" is not part of the representation. You
           won't see it in a register or as part of the data anywhere; it is
           a parameter that the algorithm implementer must determine and

           Fixed-point representations place some very different rules on
           operations than their floating-point counterparts. For example,
           two variables must be scaled the same in order to be added (or
           subtracted). Thus it may be necessary to shift one or the other
           operand prior to adding. Another example is that when multiplying
           two N-bit values with scale factors b0 and b1, the result is
           scaled (b0+b1) and requires 2*N bits in general in order to avoid
           overflow and maintain precision.
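
           The two rules above can be sketched in C as follows. The names
           fx_add and fx_mul are hypothetical. Aligning the operands to the
           finer of the two scales inside a 32-bit result is one possible
           choice among several; a real implementation might instead shift
           right and accept a loss of precision.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only. Each operand is a 16-bit
 * integer X with an associated scale b (number of fractional bits). */

/* Addition: both operands are first brought to the same scale. Here the
 * coarser operand is scaled up to the finer scale, and the sum is kept
 * in 32 bits so that it cannot overflow. */
int32_t fx_add(int16_t X0, int b0, int16_t X1, int b1, int *b_out)
{
    assert(b0 >= 0 && b0 <= 15 && b1 >= 0 && b1 <= 15);
    int b = (b0 > b1) ? b0 : b1;
    /* Multiply by a power of two rather than left-shifting, because
     * left-shifting a negative value is undefined in C. */
    int32_t a = (int32_t)X0 * ((int32_t)1 << (b - b0));
    int32_t c = (int32_t)X1 * ((int32_t)1 << (b - b1));
    *b_out = b;
    return a + c;
}

/* Multiplication: no alignment is needed. The scales add, and the full
 * product of two 16-bit values needs 32 bits. */
int32_t fx_mul(int16_t X0, int b0, int16_t X1, int b1, int *b_out)
{
    assert(b0 >= 0 && b0 <= 15 && b1 >= 0 && b1 <= 15);
    *b_out = b0 + b1;
    return (int32_t)X0 * (int32_t)X1;
}
```

           For example, 1.5 stored as X0=24 with b0=4 and 1.5 stored as X1=3
           with b1=1 add to 48 at scale 4 (that is, 3.0) and multiply to 72
           at scale 5 (that is, 2.25).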

           There are several other rules and considerations for fixed-point
           arithmetic that are commonly encountered when implementing
           algorithms. For more information, see

           Randy Yates []


                 Q3: Programmable DSP chips and their software

Q3.1: What are the available DSP chips and chip architectures?

   Updated 05/07/02

           The "big four" programmable DSP chip manufacturers are Texas
           Instruments, with the TMS320C2000, TMS320C5000, and TMS320C6000
           series of chips; Freescale, with the DSP56300, DSP56800, and
           MSC8100 (StarCore) series; Agere Systems (formerly Lucent
           Technologies), with the DSP16000 series; and Analog Devices, with
           the ADSP-2100 and ADSP-21000 ("SHARC") series. A good overview of
           programmable DSP chips is published periodically in EDN and
           Computer Design magazines.

           You may also want to check out Berkeley Design Technology's home
           page, which has a number of articles on choosing DSP processors,
           as well as a "Pocket Guide to Processors for DSP" in HTML format.
           Brief overviews of various DSP processors, cores, and
           general-purpose processors can be found at

           Here's a less ambitious chip breakdown by manufacturer:

  Agere Systems (formerly Lucent Technologies):

           100 to 170 MHz 16-bit fixed-point DSP. The DSP16000 core features
           two multipliers with SIMD-like capabilities, a 20-bit address bus,
           a 32-bit data bus, and eight 40-bit accumulators. The chips
           feature two serial ports and two timers.

           The first-generation processor, the DSP16210, contains a single
           DSP16000 core and 120 KB of internal RAM. The second-generation
           DSP16410 incorporates two DSP16000 cores and 386 KB of internal
           RAM.

  Analog Devices:

           10 to 80 MHz 16-bit fixed point DSPs; 40-bit accumulator; 24-bit
           instructions. Large number of family members with different
           configurations of on-chip memory and serial ports, timers, and
           host ports. ADSP-21mspxx members include an on-chip codec.

           160 MHz 16-bit fixed point DSPs; 40-bit accumulator; 24-bit
           instructions. Based on the ADSP-21xx family, and is mostly, but
           not completely, assembly source-code upward compatible with the
           ADSP-21xx. Adds new addressing modes and an instruction cache,
           expands address space, and lengthens pipeline (six stages compared
           to three on the ADSP-21xx). Family includes members containing
           multiple ADSP-219x cores.

   ADSP-21xxx ("SHARC"):
           33 to 100 MHz floating-point DSP; supports 32-bit fixed-point,
           IEEE format 32-bit floating-point, and 40-bit floating-point;
           40-bit registers plus an 80-bit accumulator that can be divided
           into two 32-bit registers and a 16-bit register.

           The first-generation SHARC, the ADSP-2106x, features a single data
           path, a 32-bit address bus, and 40-bit data bus. Versions are
           available with up to 512 KB of on-chip memory, up to six
           communication ports, and up to 10 DMA channels.

           The second-generation ADSP-2116x has two parallel data paths, a
           32-bit address bus, and a 64-bit data bus. Versions are available
           with up to 512 KB of on-chip memory, up to six communication
           ports, and up to 14 DMA channels.

           Analog Devices also sells the AD14000 series, which contain four
           ADSP-2106x SHARC processors in a single-chip package.

           200 to 300 MHz 16-bit fixed point DSPs that can execute two MAC
           instructions per cycle; based on the ADI/Intel MSA core. Uses a
           mix of 16-, 32-, and 64-bit instructions. Features include ability
           to operate over a wide range of frequencies and voltages.

  Freescale:


           66 to 160 MHz 24-bit fixed-point DSP; most family members have
           24-bit address and data busses. The DSP563xx also features 56-bit
           accumulators (2), timers, serial interface, host interface port.
           The DSP56307 and DSP56311 contain a filter co-processor. Up to 1
           MB of internal RAM.

           40 MHz 16-bit fixed point DSP; 36-bit accumulators (2), three
           internal address buses (two 16-bit, one 19-bit) and one 16-bit
           external address bus; three 16-bit internal data buses, one 16-bit
           external data bus; serial ports, timers. 4-12 KB of internal RAM.
           Most family members include an on-chip A/D.

           160 MHz 16-bit fixed point DSP based on the DSP568xx. Adds an
           exponent detector and two accumulators, extends shifter and the
           logic unit to 32 bits, and widens internal address and data buses.
           The DSP5685x uses a 1X master clock rate rather than the 2X master
           clock rate used by the DSP568xx.

           The 300 MHz MSC8101 is the first processor based on the StarCore
           SC140 core. It contains four parallel ALU units that can execute
           up to four MAC operations in a single clock cycle. The MSC8101
           uses variable-length instructions. Features include: 512 KB
           on-chip RAM; 16 DMA channels; an on-chip filter co-processor; and
           interfaces for ATM, Ethernet, E1/T1 and E3/T3, and the PowerPC

  Texas Instruments:

           20-40 MHz 16-bit fixed-point DSPs oriented toward low-cost control
           applications; 16-bit data, 32-bit registers. The family members
           have a variety of peripherals, such as A/D converters, 41 I/O
           pins, and 16 PWM outputs. A variety of RAM and ROM configurations
           are available.

           TI also sells the TMS320C2x family, an older version of the chip
           with fewer features.

           33-75 MHz floating point DSPs; 32-bit floating-point, 24-bit
           fixed-point data, 40-bit registers; DMA controller; serial ports;
           some support for multi-processor arrays. Various ROM and RAM

           40 to 160 MHz 16-bit fixed-point DSPs with a large number of
           specialized instructions. Many family members; the processors
           differ in configuration of on-chip ROM/RAM, serial ports,
           autobuffered serial ports, host ports, and time-division
           multiplexed ports. On-chip RAM ranges from 10 KB to over 1 MB.

           144 to 200 MHz dual-ALU variant of the TMS320C54xx that can
           execute two MAC instructions per cycle. Variable instruction word
           width. Features include up to 320 KB internal RAM; 6 DMA channels;
           2 serial ports; and 2 timers.

           150-300 MHz 16-bit fixed-point DSP with VLIW (very long
           instruction word), load/store architecture; 32 32-bit registers;
           very deep pipeline; two multipliers, ALUs, and shifters; cache.

           400-600 MHz 16-bit fixed-point DSP based on the TMS320C62xx. Adds
           SIMD support to most execution units, including extensive 8-bit
           SIMD support. Also doubles data bandwidth and increases size of
           on-chip memory.

           100-167 MHz 32-bit and 64-bit IEEE-754 floating-point DSP with
           VLIW (very long instruction word), load/store architecture; 32
           32-bit registers; very deep pipeline; two multipliers, ALUs, and
           shifters; cache.


Q3.2: What is the difference between a DSP and a microprocessor?

   Updated 04/02/01

           The essential difference between a DSP and a microprocessor is
           that a DSP processor has features designed to support
           high-performance, repetitive, numerically intensive tasks. In
           contrast, general-purpose processors or microcontrollers
           (GPPs/MCUs for short) are either not specialized for a specific
           kind of application (in the case of general-purpose processors),
           or they are designed for control-oriented applications (in the
           case of microcontrollers). Features that accelerate performance in
           DSP applications include:

              * Single-cycle multiply-accumulate capability; high-performance
                DSPs often have two multipliers that enable two
                multiply-accumulate operations per instruction cycle; some
                DSPs have four or more multipliers
              * Specialized addressing modes, for example, pre- and
                post-modification of address pointers, circular addressing,
                and bit-reversed addressing
              * Most DSPs provide various configurations of on-chip memory
                and peripherals tailored for DSP applications. DSPs generally
                feature multiple-access memory architectures that enable DSPs
                to complete several accesses to memory in a single
                instruction cycle
              * Specialized execution control. Usually, DSP processors
                provide a loop instruction that allows tight loops to be
                repeated without spending any instruction cycles for updating
                and testing the loop counter or for jumping back to the top
                of the loop
              * DSP processors are known for their irregular instruction
                sets, which generally allow several operations to be encoded
                in a single instruction. For example, a processor that uses
                32-bit instructions may encode two additions, two
                multiplications, and four 16-bit data moves into a single
                instruction. In general, DSP processor instruction sets allow
                a data move to be performed in parallel with an arithmetic
                operation. GPPs/MCUs, in contrast, usually specify a single
                operation per instruction
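
           To make several of the features above concrete, here is a
           minimal C sketch of one step of an FIR filter. The names
           fir_state, fir_step, and FIR_TAPS are hypothetical, and the Q15
           coefficient format is an assumption made for the example. On a
           general-purpose processor each commented operation costs
           instructions; on a typical DSP the multiply-accumulate, the
           wrapping pointer update, and the loop control are the parts that
           the hardware is designed to perform without extra cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIR_TAPS 4   /* hypothetical filter length for this sketch */

typedef struct {
    int16_t delay[FIR_TAPS];   /* circular delay line */
    size_t  head;              /* index of the newest sample */
} fir_state;

/* Process one input sample. Coefficients and samples are assumed to be
 * in Q15 format (15 fractional bits); the result is also Q15. */
int16_t fir_step(fir_state *s, const int16_t coeff[FIR_TAPS], int16_t x)
{
    assert(s != NULL && s->head < FIR_TAPS);

    /* Circular addressing: advance the write index and wrap it. */
    s->head = (s->head + 1) % FIR_TAPS;
    s->delay[s->head] = x;

    int64_t acc = 0;           /* wide accumulator with guard bits */
    size_t idx = s->head;
    for (size_t k = 0; k < FIR_TAPS; k++) {   /* hardware loop on a DSP */
        /* Multiply-accumulate: 16x16 -> 32-bit product, summed wide. */
        acc += (int32_t)coeff[k] * (int32_t)s->delay[idx];
        /* Step backward through the delay line, wrapping at zero. */
        idx = (idx == 0) ? (size_t)(FIR_TAPS - 1) : idx - 1;
    }

    acc /= 32768;              /* rescale the Q30 sum back to Q15 */
    if (acc > INT16_MAX) acc = INT16_MAX;     /* saturate */
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

           With all four coefficients set to 8192 (0.25 in Q15) and a
           zeroed state, feeding a constant input of 4000 produces 1000,
           2000, 3000, and then 4000 as the delay line fills.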

           While the above differences traditionally distinguish DSPs from
           GPPs/MCUs, in practice it is not important what kind of processor
           you choose. What is really important is to choose the processor
           that is best suited for your application; if a GPP/MCU is better
           suited for your DSP application than a DSP processor, the
           processor of choice is the GPP/MCU. It is also worth noting that
           the difference between DSPs and GPPs/MCUs is fading: many
           GPPs/MCUs now include DSP features, and DSPs are increasingly
           adding microcontroller features.


Q3.3: Software for Analog Devices DSPs

   Updated 12/01/2006

  Q3.3.1: Where can I get a C compiler for the ADSP-21xx and ADSP-21xxx?

           The G21 package collects the free source code for the Analog
           Devices GCC-based C compilers for their 21xxx (SHARC) and 21xx
           series DSPs. These compilers are all based on GCC version 2.3.3.
           Full source code for the compiler, assembler, linker, etc. is
           available at

           The C compilers are available for the 210x series as well as for
           the SHARC. The assemblers and linkers are only available for the
           SHARC. The source code is based on what is released under GPL by
           ADI, but is adapted for use with Linux and other Unix variants.

           [Egil Kvaleberg,]


  Q3.3.2: Where can I get tools for the ADSP-21xxx?

           SHARC development tools are available for Acorn/BSD, Linux, and
           other platforms. The tools include a frontend/preprocessor,
           assembler, linker, archiver, a utility to generate ROM images for
           EPROM burners, and other utilities. The supplied assembler is not
           part of the GNU archive, but is based on an assembler originally
           written by P. Lantto. Source code and binaries are available at:


  Q3.3.3: Where can I get algorithms or libraries for Analog Devices DSPs?

           The number for the Analog Devices DSP BBS is (617) 461-4258 (300,
           1200, 2400, 9600, 14400 bps), 8N1.

           You can also find files on Analog Devices' web site, or at their
           FTP site.

           [Analog Devices DSP Applications,]


Q3.4: Software for Agere Systems (Formerly Lucent Technologies) DSPs

           Agere Systems provides application libraries for their DSPs at


Q3.5: Software for Freescale DSPs

   Updated 12/01/2006

           Freescale provides free software development tools that may be
           downloaded from the Freescale Web site at

  Q3.5.1: Where can I get a free assembler for the Freescale DSP56000?

           A free assembler for the Freescale DSP56000 exists, thanks to
           Quinn Jensen. The current version is 1.2. It is also available at


  Q3.5.2: Where can I get a free C compiler for the Freescale DSP56000?

           There are two separate compiler sources for the Freescale
           DSP56000. One is the port of gcc 1.40 done by Andrew Sterian,
           and the other is a port of gcc 1.37.1 done by Freescale and
           returned to the FSF. Andrew's port has bowed to Freescale's
           version. Both may be portable to gcc 2.x.x with some effort.
           Neither of these comes with an assembler, but you can get a free
           DSP56000 assembler elsewhere (see question 3.5.1, above). The
           Freescale gcc source is available for FTP from:

           From Andrew Sterian: "My DSP56K compiler,
           while not supported nor as well tested as Freescale's, implements
           fixed-point arithmetic rather than floating-point arithmetic. This
           may be suitable for some applications. The 5615 compiler also
           implements fixed-point arithmetic. To the best of my knowledge,
           Freescale does not have a C compiler for the 5615 family, although
           alternatives may exist. As of this writing (January 1997) I have
           not worked with Freescale DSPs or compiler software for nearly 5
           years so questions regarding my compilers may well be met with
           "Ummm... I have no idea."

           Both compilers were posted to alt.sources so any Usenet site that
           archives this newsgroup will have a copy. I have also found the
           5616 compiler at

           Pete Gray has announced the availability of a Small C
           cross-compiler (with source) and assembler for the Freescale
           DSP56800, available for download. It targets a simple DOS-box
           host and was developed and tested using djgpp and Metrowerks
           CodeWarrior, in conjunction with NMI's IsoPod(TM), which is based
           on the DSP56F805. The assembler generates output suitable for
           Freescale's free JTAG flash loader.

           Small C language reference available online at


  Q3.5.3: Where can I get a disassembler for the Freescale DSP56000?

           Miloslaw Smyk has released an open source (BSD style) 5600x
           disassembly library. It is available for download at

           [Miloslaw Smyk]


  Q3.5.4: Where can I get algorithms and libraries for Freescale DSPs?

           Freescale provides a software archive that is available via the
           World-Wide Web from its software page. The archive includes
           macros for filters (FIR, IIR, adaptive) and floating-point
           functions.

           [Tim Baggett]


  Q3.5.5: Where can I get NeXT-compatible Freescale DSP56001 code?

           Try FTP at The /pub/ directory contains
           free code for the Freescale DSP56001 and the NeXT platform.


  Q3.5.6: Where can I get emulators for the 68HC11 (6811) processor?

           While the 68HC11 is not a DSP processor, emulators are available
           for those who might be interested in doing DSP on these

              * New Mexico State University (NMSU) simulator engine (Unix).
                Simulator engine with a command-line interface.
              * Sim6811 (Mac). Screen-oriented user interface based on the
                NMSU simulator engine (plus bug fixes).
              * THRSim11 allows you to edit, assemble, simulate and debug
                programs for the 68HC11 on Windows 95/98. THRSim11 simulates
                the CPU, ROM, RAM, all memory mapped I/O ports, and the on
                board


Q3.6: Software for Texas Instruments DSPs

   Updated 12/01/2006

  Q3.6.1: Where can I get free algorithms or libraries for TI DSPs?

   has some old, apparently public
           domain, assembler and related tools from TI for the TMS320

           TI has a number of free algorithms available on their website at

           TI's world-wide web site is The TI DSP bulletin
           board is mirrored on The TI site is the official one,
           but has no user-contributed software. [Brad Hards]

           { If anyone knows of any other sources for TI DSP software, please
           let us know. Thanks! }


  Q3.6.2: Where can I get free development tools for TI DSPs?

           TI development tools are available for a free 30-day evaluation on
           the TI website.


  Q3.6.3: Where can I get a free C compiler for the TI TMS320C3x/4x?

           The GNU binutils 2.11 and later have been ported to the TI
           C54xx/IBM C54DSP. Most of the binutils tools are supported,
           including the assembler, linker and objdump. The assembler is
           source-compatible with the TI assembler. The GNU binutils are
           available online. GDB ports for c25/c5x/c54x are also available.

           [Timothy Wall]

           Dr. Michael P. Hayes has written a GNU C-based compiler for the
           TMS320C30 and TMS320C40 families. The current version patches
           against gcc-2.8.1; support is moving to egcs-1.2. The compiler is
           freely redistributable under the terms of the GNU General Public
           License. Front-ends are also available for C++, Java, Fortran 77,
           Pascal, and Ada 95, among others.

           [Dr. Michael P. Hayes,]


  Q3.6.4: Where can I get a free assembler for the TI TMS320C3x/4x?

           Ted Rossin has written an assembler and linker for the TMS320C30.
           In his words, "It is somewhat limited by the fact that it can't
           handle expressions but it has worked fine for me over the past few
           years. There is no manual because it is a clone of the TI
           assembler and linker. However the linker command files use a
           different (easier to use) syntax. It runs on HP-UX workstations,
           Macs, IBM clones and believe it or not the Atari-ST (because I
           developed the code on it)."

           [Ted Rossin,]

           Dr. Michael P. Hayes has written a GNU-based assembler for the
           TMS320C30 and TMS320C40 families. The current version patches
           against binutils-2.7. According to Michael Hayes, the assembler
           syntax is compatible with the Texas Instruments TMS320C30
           assembler, although not all the Texas Instruments directives are
           supported. The binutils include a linker (ld), archiver (ar),
           disassembler (objdump), and other miscellaneous utilities. The
           object format of the assembler is compatible with the COFF format
           used by the Texas Instruments assembler. The assembler and other
           binary utilities are freely redistributable under the terms of the
           GNU General Public License.

           [Dr. Michael P. Hayes,]


  Q3.6.5: Where can I get a free simulator for the TI TMS320C3x/4x?

           A freely distributable instruction set architecture simulator is
           available for the TMS320C30 DSP as part of the Web-Enabled
           Simulation framework from UT Austin at

           We have released all of the source code, as well as prebuilt C30
           simulators for Windows '95/NT and Solaris 2.5 architectures.

           The C30 simulator is bit-, cycle-, and instruction-accurate. The
           behavior of the C30 simulator has been validated against a C30 DSK
           board. The C30 simulator correctly reports interlocking and
           pipeline flushes, so it provides a convenient way to check C30
           programs for these hidden delays. The C30 simulator is based on
           the C30 DSK tools by Keith Larson at Texas Instruments.

           [Brian Evans,]

           Herman Ten Brugge has also written a GNU debugger (GDB) based
           simulator for the TMS320C30 and TMS320C40, available via
           anonymous FTP. This is freely redistributable under the terms of
           the GNU General Public License.

           This simulator allows you to debug your programs without having to
           connect to a real C[34]x target system. It will also profile
           your code showing you where the pipeline conflicts are occurring.
           You can connect I/O ports to files (or TCP/IP sockets), trigger
           interrupts, examine the cache etc. It will detect different
           threads of control running and generate a profile summary for each
           thread, annotating both the C code and assembler code with the
           number of executed cycles.

           [Dr. Michael P. Hayes,]


  Q3.6.6: What is Tick? Where can I get it?

           Tick is a TMS320C40 parallel network detection and loader utility.

           It is available from:

           Supports: Transtech, Hunt, and Traquair boards hosted by DOS,
           SunOS, Linux


Send corrections/additions to the FAQ Maintainer: (Seth Benton)

Last Update March 27 2014 @ 02:11 PM