Search the FAQ Archives

3 - A - B - C - D - E - F - G - H - I - J - K - L - M
N - O - P - Q - R - S - T - U - V - W - X - Y - Z
faqs.org - Internet FAQ Archives

comp.compression Frequently Asked Questions (part 2/3)
Section - [73] What is the theoretical compression limit?

( Part1 - Part2 - Part3 - Single Page )
[ Usenet FAQs | Web FAQs | Documents | RFC Index | Cities ]


Top Document: comp.compression Frequently Asked Questions (part 2/3)
Previous Document: [72] What is wavelet theory?
Next Document: [74] Introduction to JBIG
See reader questions & answers on this topic! - Help others by sharing your knowledge

This question can be understood in two different ways:

(a) For a given compressor/decompressor, what is the best possible lossless
    compression for an arbitrary string (byte sequence) given as input?

(b) For a given string, what is the best possible lossless
    compressor/decompressor?

For case (a), the question is generally meaningless, because a specific
compressor may compress one very large input file down to a single bit, and
enlarge all other files by only one bit.  There is no lossless compressor that
is guaranteed to compress all possible input files. If it compresses some
files, then it must enlarge some others.  This can be proven by a simple
counting argument (see item 9).  In case (a), the size of the decompressor is
not taken into account for the determination of the compression ratio since the
decompressor is fixed and it may decompress an arbitrary number of files of
arbitrary length.

For case (b), it is of course necessary to take into account the size of the
decompressor. The problem may be restated as "What is the shortest program P
which, when executed, produces the string S?".  The size of this program
is known as the Kolmogorov complexity of the string S.  Some (actually most)
strings are not compressible at all, by any program: the smallest
representation of the string is the string itself.  On the other hand, the
output of a pseudo-random number generator can be extremely compressible, since
it is sufficient to know the parameters and seed of the generator to reproduce
an arbitrary long sequence.

References: "An Introduction to Kolmogorov Complexity and its Applications",
   Ming Li and Paul Vitanyi, 2nd edition, Springer-Verlag, ISBN 0-387-94868-6
   http://www.cwi.nl/~paulv/kolmogorov.html

If you don't want to read a whole book, I recommend the excellent lecture
"Randomness & Complexity in Pure Mathematics" by G. J. Chaitin:
  http://www.cs.auckland.ac.nz/CDMTCS/chaitin/ijbc.html
The decimal and binary expansions of Chaitin's number Omega are examples of
uncompressible strings. There are more papers on
http://www.cs.auckland.ac.nz/CDMTCS/chaitin/

User Contributions:

Comment about this article, ask questions, or add new information about this topic:

CAPTCHA




Top Document: comp.compression Frequently Asked Questions (part 2/3)
Previous Document: [72] What is wavelet theory?
Next Document: [74] Introduction to JBIG

Part1 - Part2 - Part3 - Single Page

[ Usenet FAQs | Web FAQs | Documents | RFC Index ]

Send corrections/additions to the FAQ Maintainer:
jloup@gzip.OmitThis.org





Last Update March 27 2014 @ 02:11 PM