comp.compression Frequently Asked Questions (part 1/3)
Section - [26] Are there algorithms and standards for audio compression?

Top Document: comp.compression Frequently Asked Questions (part 1/3)
Previous Document: [25] Fast DCT (Discrete Cosine Transform) algorithms
Next Document: [30] My archive is corrupted!

Yes. See the introduction to MPEG given in part 2 of this FAQ.

A lossless compressor for 8bit and 16bit audio data (.au) is available
in ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/coding/shorten.tar.gz
or http://www.softsound.com/ShortenDownload.html
Shorten works by using Huffman coding of prediction residuals.
Compression is generally better than that obtained by applying general
purpose compression utilities to audio files. Shorten also supports lossy
compression.  Contact: Tony Robinson <ajr@eng.cam.ac.uk>.
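
As a rough illustration of "coding of prediction residuals", the sketch
below predicts each sample from the previous ones with a fixed polynomial
predictor and counts the bits needed to code the residuals. Rice codes
stand in for the Huffman stage here. The function names, the choice of
predictor orders 0 to 3 and the test signal are assumptions made for this
example; this is not Shorten's actual file format or source code.

```python
import math

# Fixed polynomial predictors: order 1 predicts x[n-1], order 2 predicts
# 2*x[n-1] - x[n-2], order 3 predicts 3*x[n-1] - 3*x[n-2] + x[n-3].
COEFFS = {0: [], 1: [1], 2: [2, -1], 3: [3, -3, 1]}

def predict(history, n, order):
    """Predict sample n from earlier samples; missing history counts as 0."""
    return sum(c * history[n - 1 - k]
               for k, c in enumerate(COEFFS[order]) if n - 1 - k >= 0)

def residuals(samples, order):
    """Prediction errors that the entropy coder would store."""
    return [x - predict(samples, n, order) for n, x in enumerate(samples)]

def reconstruct(res, order):
    """Invert residuals(): rebuild the samples exactly (lossless)."""
    samples = []
    for n, e in enumerate(res):
        samples.append(e + predict(samples, n, order))
    return samples

def rice_bits(res, k):
    """Bits used by a Rice code with parameter k (unary quotient + k bits)."""
    total = 0
    for e in res:
        u = 2 * e if e >= 0 else -2 * e - 1   # fold signed to unsigned
        total += (u >> k) + 1 + k
    return total

def best_fixed_predictor(samples):
    """Try every order and Rice parameter; return (order, k, total bits)."""
    best = None
    for order in range(4):
        res = residuals(samples, order)
        for k in range(16):
            bits = rice_bits(res, k)
            if best is None or bits < best[2]:
                best = (order, k, bits)
    return best

# Hypothetical test signal: a 16-bit sine wave.
signal = [round(8000 * math.sin(2 * math.pi * n / 100)) for n in range(1000)]
order, k, bits = best_fixed_predictor(signal)
print("order", order, "k", k, "bits", bits, "vs raw", 16 * len(signal))
```

On a smooth signal the residuals are much smaller than the samples, which
is why a general purpose compressor working on the raw bytes tends to do
worse than a coder that models the waveform first.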

Benchmarks of shorten and other lossless audio compression programs
are in http://www.firstpr.com.au/audiocomp/lossless/

Audio software is available in subdirectories of
ftp://sunsite.unc.edu/pub/electronic-publications/IUMA/audio_utils/ :
- An MPEG audio player is in mpeg_players/Workstations/maplay1_2.tar.Z.
- The sources of the XING MPEG audio player for Windows are in
  mpeg_players/Windows/mpgaudio.zip.
- An encoder/decoder is in converters/source/mpegaudio.tar.Z.

MSDOS audio software is available in
ftp://ftp.simtel.net/pub/simtelnet/msdos/sound/
In particular, MPEG-2 audio software is in ampegsrc.zip and ampeg43.zip.

MPEG audio files are available in ftp://ftp.iuma.com and http://www.iuma.com/

The site http://www.mp3tech.com is dedicated to the MP3 audio compression
standard. It has information about the MP3 standard, audio compression
techniques, tests, sources, etc.

Copied from the comp.dsp FAQ posted by guido@cwi.nl (Guido van Rossum):

  Strange though it seems, audio data is remarkably hard to compress
  effectively.  For 8-bit data, a Huffman encoding of the deltas between
  successive samples is relatively successful.  For 16-bit data,
  companies like Sony and Philips have spent millions to develop
  proprietary schemes.

  Public standards for voice compression are slowly gaining popularity,
  e.g. CCITT G.721 and G.723 (ADPCM at 32 and 24 kbits/sec).  (ADPCM ==
  Adaptive Differential Pulse Code Modulation.)  Free source code for a *fast*
  32 kbits/sec ADPCM (lossy) algorithm is available by ftp from ftp.cwi.nl
  as /pub/audio/adpcm.shar.  (** NOTE: if you are using v1.0, you should get
  v1.1, released 17-Dec-1992, which fixes a serious bug -- the quality
  of v1.1 is claimed to be better than uLAW **)

  (Note that U-LAW and silence detection can also be considered
  compression schemes.)
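
The remark above about Huffman coding the deltas between successive 8-bit
samples can be tried out with a small sketch. It builds Huffman code
lengths for the raw samples and for the deltas and compares the totals.
The function names and the sine-wave test data are assumptions made for
illustration, and the cost of storing the code table is ignored.

```python
import heapq
import math
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tiebreak, {symbol: depth}); the unique
    # tiebreak keeps the dictionaries from ever being compared.
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

def coded_bits(symbols):
    """Total bits needed to Huffman-code the sequence (table not counted)."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

def deltas_8bit(samples):
    """Differences between successive samples, wrapped modulo 256."""
    prev, out = 0, []
    for x in samples:
        out.append((x - prev) & 0xFF)
        prev = x
    return out

def undo_deltas_8bit(deltas):
    """Inverse of deltas_8bit(): running sum modulo 256."""
    prev, out = 0, []
    for d in deltas:
        prev = (prev + d) & 0xFF
        out.append(prev)
    return out

# Hypothetical unsigned 8-bit test signal: a slow sine wave around 128.
samples = [128 + round(100 * math.sin(2 * math.pi * n / 200))
           for n in range(2000)]
print("raw samples:", coded_bits(samples), "bits")
print("deltas:     ", coded_bits(deltas_8bit(samples)), "bits")
```

The deltas of a slowly varying signal cluster around zero, so a few short
codes cover most of them. Noisy or high-frequency material gains far less.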

Information and source code for adpcm are available in
http://www.rss.rockwell.com/techinfo/pc/adpcm/adpcm.html

Source for Sun's free implementation of CCITT compression types G.711,
G.721 and G.723 is in ftp://ftp.cwi.nl/pub/audio/ccitt-adpcm.tar.gz

You can get a G.721/722/723 package by email to teledoc@itu.arcom.ch, with
GET ITU-3022
as the *only* line in the body of the message.


A note on u-law from Markus Kuhn <mskuhn@immd4.informatik.uni-erlangen.de>:

  u-law (more precisely µ-law, written with the Greek letter mu) is more
  an encoding than a compression method, although a 12 to 8 bit
  reduction is normally part of the encoding.
  The official definition is CCITT recommendation G.711. If you want
  to know how to get CCITT documents, check the Standards FAQ
  posted to news.answers or get the file standards-faq by ftp in
  directory ftp://rtfm.mit.edu/pub/usenet/news.answers/
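
To show why u-law is "more an encoding", here is a sketch of the textbook
continuous companding curve with mu = 255, followed by a plain 8-bit
quantization of the companded value. G.711 itself specifies a
piecewise-linear approximation with its own bit layout, so this is not
bit-exact G.711; the function names and the quantizer are assumptions made
for illustration.

```python
import math

MU = 255  # companding constant for the 8-bit mu-law curve

def mulaw_compress(x):
    """Map x in [-1, 1] to [-1, 1] using logarithmic companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress()."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def encode8(x):
    """Quantize the companded value to a code in 0..255 (illustrative)."""
    y = mulaw_compress(x)
    return min(255, int((y + 1) / 2 * 256))

def decode8(code):
    """Map a code back to the middle of its quantization interval."""
    y = (code + 0.5) / 256 * 2 - 1
    return mulaw_expand(y)

for x in (0.001, 0.01, 0.1, 0.9):
    print(x, encode8(x), decode8(encode8(x)))
```

The logarithmic curve spends more of the 256 codes on quiet samples, so
small amplitudes are reproduced with a finer step than a uniform 8-bit
quantizer would give.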


See also the comp.dsp FAQ for more information on:

- The U.S. DoD's Federal-Standard-1016 based 4800 bps code excited linear
  prediction voice coder version 3.2a (CELP 3.2a)
- The U.S. DoD's Federal-Standard-1015/NATO-STANAG-4198 based 2400 bps
  linear prediction coder version 53 (LPC-10e v53)
- Realtime DSP code and hardware for FS-1015 and FS-1016

The comp.dsp FAQ is in comp.dsp with subject "FAQ: Audio File Formats" and in
ftp://rtfm.mit.edu/pub/usenet/news.answers/audio-fmts/


CELP C code for Sun SPARCs is in ftp://ftp.super.org/pub/speech/
An LPC10 speech coder is in ftp://ftp.super.org/pub/speech/ ;
a derived version is available from http://www.arl.wustl.edu/~jaf/lpc/

Source code for ITU-T (CCITT) G.728 Low Delay CELP speech compression
is in ftp://svr-ftp.eng.cam.ac.uk/pub/comp.speech/sources/


Recommended reading:
  Digital Coding of Waveforms: Principles and Applications to Speech and
  Video.  N. S. Jayant and Peter Noll.  Prentice-Hall, 1984, ISBN
  0-13-211913-7.

Information on GSM sound compression is available at
http://ccnga.uwaterloo.ca/~jscouria/gsm.html


from Markus Kuhn <mskuhn@immd4.informatik.uni-erlangen.de>:

  One of the highest-quality sound compression formats is called ASPEC.
  It has been developed by a team at the Fraunhofer Institut in Erlangen
  (Germany) and others.

  ASPEC produces CD-like quality and offers several bitrates, one of which
  is 128 kbit/s. It is a lossy algorithm that discards frequency components
  the human cochlea does not register and then applies sophisticated
  entropy coding. The 64 kbit/s ASPEC variant might soon bring hifi
  quality ISDN phone connections. It has been implemented on standard DSPs.

  The Layer 3 MPEG audio compression standard now contains what is officially
  called the best parts of the ASPEC and MUSICAM algorithms. A reference is:

    K.Brandenburg, G.Stoll, Y.F.Dehery, J.D.Johnston, L.v.d.Kerkhof,
    E.F.Schroeder: "The ISO/MPEG-Audio Codec: A Generic Standard for Coding
    of High Quality Digital Audio",
    92nd. AES-convention, Vienna 1992, preprint 3336


from Jutta Degener <jutta@cs.tu-berlin.de> and Carsten Bormann
<cabo@cs.tu-berlin.de>:

  GSM 06.10 13 kbit/s RPE/LTP speech compression available
  --------------------------------------------------------

  The Communications and Operating Systems Research Group (KBS) at the
  Technische Universitaet Berlin is currently working on a set of
  UNIX-based tools for computer-mediated telecooperation that will be
  made freely available.

  As part of this effort we are publishing an implementation of the
  European GSM 06.10 provisional standard for full-rate speech
  transcoding, prI-ETS 300 036, which uses RPE/LTP (regular pulse
  excitation/long term prediction) coding at 13 kbit/s.

  GSM 06.10 compresses frames of 160 13-bit samples (8 kHz sampling
  rate, i.e. a frame rate of 50 Hz) into 260 bits; for compatibility
  with typical UNIX applications, our implementation turns frames of 160
  16-bit linear samples into 33-byte frames (1650 Bytes/s).
  The quality of the algorithm is good enough for reliable speaker
  recognition; even music often survives transcoding in recognizable 
  form (given the bandwidth limitations of 8 kHz sampling rate).

  Version 1.0 of the implementation is available via anonymous ftp from
  ftp.cs.tu-berlin.de in the directory /pub/local/kbs/tubmik/gsm/ ;
  more information about the library can be found on the World-Wide Web
  at http://www.cs.tu-berlin.de/~jutta/toast.html .
  Questions and bug reports should be directed to jutta@cs.tu-berlin.de
  and cabo@informatik.uni-bremen.de .
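
The frame figures quoted in the announcement above can be checked with a
few lines of arithmetic. The constant names are made up for this example;
the numbers are the ones given in the text.

```python
SAMPLE_RATE_HZ = 8000         # sampling rate
SAMPLES_PER_FRAME = 160       # samples in one GSM 06.10 frame
BITS_PER_FRAME = 260          # coded size of one frame
BYTES_PER_STORED_FRAME = 33   # frame size used by the implementation
BYTES_PER_INPUT_SAMPLE = 2    # 16-bit linear input

frames_per_second = SAMPLE_RATE_HZ // SAMPLES_PER_FRAME
bit_rate = BITS_PER_FRAME * frames_per_second
stored_bytes_per_second = BYTES_PER_STORED_FRAME * frames_per_second
spare_bits_per_frame = BYTES_PER_STORED_FRAME * 8 - BITS_PER_FRAME
input_bytes_per_second = SAMPLE_RATE_HZ * BYTES_PER_INPUT_SAMPLE
ratio = input_bytes_per_second / stored_bytes_per_second

print("frames per second:      ", frames_per_second)
print("coded bit rate (bit/s): ", bit_rate)
print("stored rate (bytes/s):  ", stored_bytes_per_second)
print("spare bits per frame:   ", spare_bits_per_frame)
print("compression vs. 16-bit: ", round(ratio, 1))
```

The 260 coded bits do not fill 33 bytes exactly, which is why the stored
rate of 1650 bytes/s is slightly above 13 kbit/s.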


from Nicola Ferioli <ser1509@cdc835.cdc.polimi.it>:

  ftp://ftp.simtel.net/pub/simtelnet/msdos/sound/vocpak20.zip
    Lossless 8-bit sound file compressor

  VOCPACK is a compressor/decompressor for 8-bit digital sound using a
  lossless algorithm; it is useful to save disk space without degrading
  sound quality.  It can compress signed and unsigned data, sampled at any
  rate, mono or stereo.  Since the method used is not lossy, it isn't
  necessary to strip file headers before compressing.

  VOCPACK was developed for use with .VOC (SoundBlaster) and .WAV (Windows)
  files, but any 8-bit sound can be compressed since the program makes no
  assumptions about the file structure.

  The typical compression ratio obtained ranges from 0.8 for files sampled at
  11 kHz to 0.4 for 44 kHz files.  The best results are obtained with 44 kHz
  sounds (mono or stereo): general-purpose archivers create files that can be
  twice as long as the output of VOCPACK.  You can obtain smaller values
  using lossy compressors but if your goal is to keep the sound quality
  unaltered you should use a lossless program like VOCPACK.

from Harald Popp <popp@iis.fhg.de>:

  new version 1.0 of ISO/MPEG1 Audio Layer 3 Shareware available

  major improvements of the new version:
       - encoder works twice as fast
       - improved file handling for encoder including .WAV files

  You may download the shareware from fhginfo.fhg.de (153.96.1.4)
  from the directory /pub/layer3

  The source code for the MPEG1 audio decoder layer 1, 2 and 3 is
  now available on fhginfo.fhg.de (153.96.1.4) in /pub/layer3/public_c.

  There are two files:
     mpeg1_iis.tar.Z     (Unix: lines separated by line feed only)
     mpeg1iis.zip        (PC: lines separated by carriage return and line feed)

For more information about this product and MPEG Audio Layer 3, see
the document "Informations about MPEG Audio Layer-3" maintained by
Juergen Zeller <zeller@iis.fhg.de>, available in
ftp://fhginfo.fhg.de/pub/layer3/

from Monty <xiphmont@athena.mit.edu>:

  OggSquish is a compression package designed to reduce the file size of
  digitized 8 and 16 bit audio samples (or samples of any periodic
  data).  OggSquish will operate on files sampled at any speed, but it is
  designed to work with very high quality samples, for example, CD
  quality samples.

  [OggSquish is now at http://www.xiph.com/OggSquish/index.html or
   http://world.std.com/~xiph/OggSquish/ or
   http://web.mit.edu/afs/sipb/user/xiphmont/OggSquish/html-pages/ ]

from Dmitrij V. Schmunk <shmunk@csd.inp.nsk.su>:

  Take a look at http://www.inp.nsk.su/~shmunk/
  This compressor gives you about 2-3 times better compression
  for 44.1 kHz stereo sound than MPEG layer-3.

from Dennis Lee <denlee@ecf.utoronto.ca>:

  WA incorporates lossless audio codecs (similar to SHORTEN and OggSquish)
  into an easy to use archiver program.  WA only supports compression of the
  popular ".WAV" audio format.  To the author's knowledge, WA can compress
  waveform data better than any existing software.  With default settings, WA
  also compresses faster than PKZIP, so it is convenient to use.  This
  software can be found at http://www.ecf.utoronto.ca/~denlee/software.html

from Jack Berlin <jberlin@jpg.com>:

  Pegasus has a new lossless sound compressor based on our patent-pending
  arithmetic coder. It currently beats Shorten in all cases. Trial app for
  Windows: ftp://207.69.208.43/jpg.com/SPSEXE.ZIP  ($39 to register)
  http://www.pegasusimaging.com/sound.html


Send corrections/additions to the FAQ Maintainer:
jloup@gzip.OmitThis.org





Last Update March 27 2014 @ 02:11 PM