# Codes and Ciphers

LARRY GILMAN

Codes and ciphers are forms of cryptography, a term from the Greek *kryptos*, hidden, and *graphia*, writing. Both transform legible messages into series of symbols that are intelligible only to specific recipients. Codes do so by substituting arbitrary symbols for meanings listed in a codebook; ciphers do so by performing rule-directed operations directly on original message text. Because codes can only communicate concepts that are listed in their codebooks, they have limited flexibility. Modern cryptography therefore relies almost entirely on ciphers implemented by digital computers, and is widely employed in industry, diplomacy, espionage, warfare, and personal communications.

**Codes.** A *code* is a set of symbolic strings ("code groups") that are listed, along with their assigned meanings, in a codebook.

Codes encode messages by substitution; that is, they substitute code groups for components of the original message. "Kill the king at midnight" could thus be encoded, for example, as "OAKEN 7890 SPINDRIFT." Without the codebook, it would be difficult for a reader of the encoded message to form an idea of its meaning.

Either a word or a number can be used as a code group. Code groups that are words are termed code words and those that are numbers are termed code numbers. Note that a single code group can encode a single word ("king") or an entire phrase ("deliver the films to agent number 3"). A coded message may, therefore, be shorter than the original message. It can also be made as long as or longer than the original message, if the codebook provides lengthy code phrases for single concepts or nonsense code groups for padding purposes. Such techniques can be used to make encoded messages harder for opponents to read.
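The codebook lookup described above can be sketched in a few lines of Python. The three codebook entries are hypothetical, chosen only to reproduce the example message; a real codebook would list thousands of words and phrases.

```python
# Hypothetical codebook: each entry maps a word or a whole phrase to a code group.
CODEBOOK = {
    "kill": "OAKEN",
    "the king": "7890",
    "at midnight": "SPINDRIFT",
}
# The reverse listing, used by the recipient to decode.
DECODE_BOOK = {group: meaning for meaning, group in CODEBOOK.items()}


def encode(phrases):
    """Replace each plaintext phrase with its code group.

    Raises KeyError for any phrase the codebook does not list, which is
    the inflexibility of codes noted earlier in the article.
    """
    return " ".join(CODEBOOK[phrase] for phrase in phrases)


def decode(message):
    """Look up each code group in the reverse listing."""
    return " ".join(DECODE_BOOK[group] for group in message.split())


encoded = encode(["kill", "the king", "at midnight"])
print(encoded)          # OAKEN 7890 SPINDRIFT
print(decode(encoded))  # kill the king at midnight
```

Note that no calculation is involved, only table lookups, and that a phrase missing from the codebook simply cannot be sent.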

**Ciphers.** A cipher uses a system of fixed rules (an "algorithm") to transform a legible message ("plaintext") into an apparently random string of characters ("ciphertext"). For example, a cipher might be defined by the following rule: "For every letter of plaintext, substitute a two-digit number specifying the plaintext letter's position in the alphabet plus a constant between 1 and 73 that shall be agreed upon in advance." If 46 is the agreed-upon constant, then the plaintext word ZAP enciphers to 724762 as follows:

- Plaintext letter Z = ciphertext 72 (alphabet position 26 + 46).
- Plaintext letter A = ciphertext 47 (alphabet position 1 + 46).
- Plaintext letter P = ciphertext 62 (alphabet position 16 + 46).
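The rule above is simple enough to write out directly. The following is a minimal Python sketch; the function names `encipher` and `decipher` are illustrative, and the sketch assumes the plaintext contains only the letters A to Z.

```python
def encipher(plaintext: str, key: int) -> str:
    """Replace each letter with a two-digit number: alphabet position + key."""
    if not 1 <= key <= 73:
        raise ValueError("key must be between 1 and 73 so results stay two digits")
    return "".join(f"{ord(ch) - ord('A') + 1 + key:02d}" for ch in plaintext.upper())


def decipher(ciphertext: str, key: int) -> str:
    """Split the ciphertext into two-digit groups and subtract the key."""
    pairs = [ciphertext[i:i + 2] for i in range(0, len(ciphertext), 2)]
    return "".join(chr(int(pair) - key + ord("A") - 1) for pair in pairs)


print(encipher("ZAP", 46))     # 724762
print(decipher("724762", 46))  # ZAP
```

The upper limit of 73 on the constant follows from the two-digit requirement: the largest alphabet position is 26, and 26 + 73 = 99.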

Incorporation of a variable term into a fixed algorithm, as in this example, is typical of real-world ciphers. The variable component is termed a key. A real key would be longer and would have a more complex relationship to the cipher algorithm than the key in this example, but its basic role would be the same: a key fits into an algorithm so as to enable enciphering and deciphering, just as a physical key fits into a lock to enable locking and unlocking. Without a key, a cipher algorithm is missing an essential part. In fact, so important is the concept of the key that in real-world ciphering it is not algorithms that are kept secret, but keys. Cipher designers assume that their algorithms will always become known to their opponents, but design the relationship between key and algorithm so that even knowing the algorithm it is almost impossible to decipher a ciphertext without knowing the appropriate key. Before a cipher can work, therefore, a key or set of keys must be in the possession of both the sender and the receiver.

If the key were always the same, it would simply constitute a permanent part of the algorithm, and keying would have no special advantage over trying to keep one's algorithm secret to begin with. Keys must, therefore, be changed occasionally. A new key may be employed every day, for every message, or on some other schedule.

**Comparison of codes and ciphers.** Codes have the advantage of simplicity. No calculations are required to encode or decode messages, only lookups in a codebook. Further, because a code uses no fixed system for associating code groups with their meanings (even the amount of meaning assigned to a code word can vary, as seen above), a code may fail gracefully; that is, an enemy may discern the meaning of a few code groups but still be unable to interpret others. In contrast, a cipher produces ciphertext from plaintext (and vice versa) according to a fixed algorithm. Thus, if an enemy determines the algorithm and steals or guesses a key, they can at once interpret all messages sent using that key. Changing the key may restore cipher security, unless the enemy has developed a system for guessing keys. One such system, always possible in theory, is to try all possible keys until one is found that works.
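To illustrate how trying every key works, here is a hedged Python sketch that attacks the two-digit additive cipher used as an example earlier in this article. It assumes the attacker knows the algorithm and that the plaintext consists only of letters; the function name `candidate_keys` is illustrative. A key is kept only if every ciphertext group deciphers to a valid alphabet position.

```python
def candidate_keys(ciphertext: str) -> dict:
    """Try every key from 1 to 73 and keep those that yield only valid letters."""
    groups = [int(ciphertext[i:i + 2]) for i in range(0, len(ciphertext), 2)]
    results = {}
    for key in range(1, 74):
        positions = [group - key for group in groups]
        # A correct key must map every group back into the range 1..26.
        if all(1 <= pos <= 26 for pos in positions):
            results[key] = "".join(chr(pos + ord("A") - 1) for pos in positions)
    return results


print(candidate_keys("724762"))  # {46: 'ZAP'}
```

With only 73 possible keys the search is instantaneous, and here a single key survives because the message happens to contain both A and Z. Real ciphers rely on key spaces far too large to search this way.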

Codes, however, have two great disadvantages. Users can only send messages that can be expressed using the terms defined in the codebook, whereas ciphers can transmit all possible messages. Additionally, all codes are vulnerable to codebook capture. If a codebook is captured, there is no recourse but to distribute new codebooks to all users. In contrast, the key–algorithm concept makes cipher secrecy dependent on small units of information (keys) that can be easily altered.

Secure ciphers, however, entail complex calculations. This made the use of complex ciphers impractical before the invention of ciphering machines in the early twentieth century; codes and simple ciphers were the only feasible methods of secret communication. Yet a cipher that is simple to implement is proportionately simple to crack, and a cracked cipher can be disastrous. It is better to have to communicate "in the clear" (to send messages that can be easily read by the enemy) than to suppose that one's communications are secret when they are not. Mary, Queen of Scots (1542–1587) was executed for treason on the basis of deciphered letters that frankly discussed plans for murdering Queen Elizabeth of England; likewise, simple ciphers used by the Confederacy during the U.S. Civil War were easily cracked by Union cryptographers. What is more, even sophisticated ciphers, such as the Enigma cipher used by Nazi Germany during World War II or those implemented today on digital computers, are subject to attack. As soon as any new cipher is invented, someone, somewhere starts attacking it. The result is that ciphers, like some antibiotics, have limited lifespans, and must be regularly replaced.

**Historical perspective.** Throughout much of the ancient world, writing was either completely unknown or was an arcane art accessible only to priests. There was little motive, therefore, to develop coding or ciphering. Eventually, however, writing came to serve military, personal, and commercial as well as sacred purposes, creating a need for secure communications. To meet this need, ciphers based on scrambling the order of plaintext characters or on substituting other characters for them were developed. The first recorded use of ciphering was by the Greek general Lysander in the fifth century B.C. The *Kamasutra*, a Hindu text compiled in the fourth century A.D. from manuscripts dating back as far as the fourth century B.C., recommends monoalphabetic substitution ciphering (the replacement of each letter of a plaintext message with a different letter of the alphabet) as one of the 64 arts to be mastered by an ideally educated woman. By the first century B.C., codes had also been developed.

Cryptography fell out of use during the early Middle Ages, but Arab scholars during the heyday of medieval Muslim civilization, the Abbasid caliphate (A.D. 750–1258), revived it. Muslim writers not only ciphered, but invented *cryptanalysis*, the systematic breaking of ciphers. Ninth-century Arab philosopher Abu Yusuf al-Kindi wrote the earliest known description of the cryptanalytic technique known as frequency analysis, which breaks substitution ciphers by matching ciphertext letters with plaintext letters according to their frequency of use in the language. In English, for example, the most frequently used letter is E; in an English-language ciphertext produced using a monoalphabetic substitution cipher, therefore, the most frequently used character probably stands for E.
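The idea can be sketched in Python. To keep the example short, it uses a shift cipher, the simplest kind of monoalphabetic substitution, in which one correct letter match reveals the whole key; a general substitution cipher would require matching many letters. The sample sentence and the function name `shift` are illustrative assumptions, and the sentence was chosen so that E is clearly its most common letter.

```python
from collections import Counter
import string


def shift(text: str, k: int) -> str:
    """Shift each capital letter k places through the alphabet, wrapping around."""
    out = []
    for ch in text:
        if ch in string.ascii_uppercase:
            out.append(chr((ord(ch) - ord("A") + k) % 26 + ord("A")))
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)


plaintext = "MEET ME BENEATH THE ELM TREE WHERE THE RIVER BENDS"
ciphertext = shift(plaintext, 3)

# The attacker sees only the ciphertext and counts letter frequencies.
counts = Counter(ch for ch in ciphertext if ch.isalpha())
most_common_letter = counts.most_common(1)[0][0]

# Guess that the most frequent ciphertext letter stands for E.
guessed_shift = (ord(most_common_letter) - ord("E")) % 26
recovered = shift(ciphertext, -guessed_shift)

print(most_common_letter, guessed_shift)  # H 3
print(recovered == plaintext)             # True
```

On short or unusual messages the most frequent letter may not be E, which is why the article says "probably"; practical frequency analysis compares the whole distribution rather than a single letter.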

During the late Middle Ages and the Renaissance, a literate ruling class arose throughout Europe, and ciphering regained importance in that part of the world for purposes of intrigue, espionage, and war. English monk and scientist Roger Bacon (1220–1292) wrote a book describing several cryptographic methods; Italian artist Leon Battista Alberti (1404–1472) wrote the first European text on cryptanalysis in 1466. Under pressure from cryptanalysis, codes and cipher systems gradually became more complex.

Beginning in the mid-nineteenth century, the importance of coding and ciphering was rapidly amplified by the invention of electronic information technologies: the telegraph (1837), the telephone (1876), radio (1895), and electronic computers (1940s). Non-secret commercial codes were developed in conjunction with telegraphy to make messages more compact (therefore cheaper); ciphers were widely used (and cracked) during the U.S. Civil War and the first and second world wars. The cracking of German and Japanese ciphers by Allied cryptographers during World War II was of particular importance, enabling the British and Americans to avoid submarines, intercept ships and aircraft, and otherwise frustrate enemy plans. Ciphering has since become basic to military and government communications. Since the 1960s, commercial and personal communications have become increasingly dependent on digital computers, making sophisticated ciphering a practical option for those sectors as well. In the late 1970s, the U.S. government defined a cipher algorithm for standard use by all government departments, available also to the public; this now-elderly algorithm, the Data Encryption Standard, is today in the process of being replaced by a new algorithm, the Advanced Encryption Standard.

**Types of codes.** Codes can be generally divided into *one-part* and *two-part* codes. In a one-part code, the same codebook is used for encoding and decoding. The problem with this system is that some systematic ordering of the code groups and their assigned meanings must be made, or it will be difficult to locate code groups when encoding or their meanings when decoding. (A randomly ordered list of words or numbers thousands of terms long is difficult to search except by computer.) Thus, code groups tend to be arranged in alphabetic or numerical order in a one-part code, an undesirable property, since an opponent seeking to crack the code can exploit the fact that code groups that are numerically or alphabetically close probably encode words or phrases that are alphabetically close. To avoid this weakness, a two-part code employs one codebook for encoding and another for decoding. In the encoding codebook, alphabetically ordered meanings (e.g., A, ABDICATE, ABLE) are assigned randomly ordered code groups (e.g., 6897, 1304, 0045). In the decoding codebook, the code groups are arranged in order (e.g., 0045, 1304, 6897), for easy location.
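A two-part code can be sketched in Python as two listings of the same pairs, each sorted on a different column. The three entries are the ones given in the text; the variable names are illustrative.

```python
# First part: meanings in alphabetical order, each with a randomly assigned code group.
encoding_book = {"A": "6897", "ABDICATE": "1304", "ABLE": "0045"}

# Second part: the same pairs, re-sorted by code group for fast lookup when decoding.
decoding_book = dict(
    sorted((group, meaning) for meaning, group in encoding_book.items())
)

print(list(encoding_book))    # ['A', 'ABDICATE', 'ABLE']
print(list(decoding_book))    # ['0045', '1304', '6897']
print(decoding_book["1304"])  # ABDICATE
```

Because the code groups were assigned at random, neighboring groups in the second listing (0045 and 1304) reveal nothing about the alphabetical position of their meanings (ABLE and ABDICATE).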

Code security can be improved by combining ciphering with coding. In this technique, messages are first encoded and then enciphered; at the receiving end, they are first deciphered and then decoded. A standard method for combining coding and ciphering is the "code plus additive" technique, which employs numbers as code groups and adds a pseudorandom number to each code group to produce a disguised code group. The pseudorandom numbers used for this purpose are generated by modulo-arithmetic techniques closely related to those used in stream ciphering.
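The following Python sketch shows the general shape of a code-plus-additive scheme under several stated assumptions: code groups are four-digit numbers, the additive is combined by ordinary addition modulo 10000, and the additives come from a deliberately tiny linear congruential generator whose constants (21 and 7) are arbitrary illustrative choices. Such a generator is far too weak for real use; it stands in for the modulo-arithmetic techniques mentioned above.

```python
def additives(seed: int, count: int, modulus: int = 10000) -> list:
    """Produce `count` pseudorandom additives with a toy linear congruential generator."""
    x = seed
    out = []
    for _ in range(count):
        x = (21 * x + 7) % modulus  # illustrative constants, not a secure generator
        out.append(x)
    return out


def disguise(code_groups, seed):
    """Add one additive to each four-digit code group, modulo 10000."""
    adds = additives(seed, len(code_groups))
    return [f"{(int(g) + a) % 10000:04d}" for g, a in zip(code_groups, adds)]


def undisguise(disguised, seed):
    """Subtract the same additives to recover the original code groups."""
    adds = additives(seed, len(disguised))
    return [f"{(int(g) - a) % 10000:04d}" for g, a in zip(disguised, adds)]


groups = ["6897", "1304", "0045"]   # already-encoded message
sent = disguise(groups, seed=1234)  # the seed plays the role of the shared key
print(sent)                         # ['2818', '5652', '1360']
print(undisguise(sent, seed=1234))  # ['6897', '1304', '0045']
```

The receiver strips the additives first and only then looks the recovered code groups up in the codebook, matching the decipher-then-decode order described above.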

**Block ciphers.** Ciphers that encrypt whole blocks of characters at once (such as 10 letters at a time, or 128 bits) are termed block ciphers. Block ciphers have the advantage that each character in each ciphertext block can be made to depend complexly on all characters of the corresponding message block, thus scrambling or smearing out the message content over many characters of ciphertext. The widely used Data Encryption Standard (DES) is a block cipher that employs a 56-bit key to encrypt 64-bit blocks. In DES, the key and each message block are used as inputs to a complex algorithm that produces a 64-bit block of ciphertext. The same key is used to decrypt the block of ciphertext at the receiving end.

**Stream ciphers.** Stream ciphers operate upon series of binary digits ("bits," usually symbolized as 1s and 0s), enciphering them one by one rather than in blocks of fixed length. In stream encipherment, a series of bits termed the key-stream is made available by some means to both the sender and receiver. This stream is as long as the message to be sent. At the sending end, the key-stream is combined with the message-stream in a bit-by-bit fashion using the exclusive-or (XOR) operation of Boolean algebra, producing the ciphertext. At the receiving end, the same key-stream is combined again with the ciphertext to recover the message stream. This system of ciphering is unbreakable in both theory and practice, provided that the key-stream is truly random, is never reused, and remains secret. Ongoing breakthroughs in quantum cryptography may soon make perfectly secret key-streams available by exploiting certain properties of photons. If these techniques can be made technologically practical, truly unbreakable cipher systems will have become available for the first time in history.
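A minimal Python sketch of the exclusive-or combination follows. For convenience it works on bytes, combining eight bits at a time, which gives the same result as combining them bit by bit. The function name `xor_stream` and the sample message are illustrative; the key-stream here comes from Python's `secrets` module and is assumed to have been shared with the receiver by some secure means.

```python
import secrets


def xor_stream(data: bytes, keystream: bytes) -> bytes:
    """Combine data with the key-stream using exclusive or, one byte at a time."""
    if len(keystream) < len(data):
        raise ValueError("key-stream must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, keystream))


message = b"ATTACK AT DAWN"
keystream = secrets.token_bytes(len(message))  # as long as the message itself

ciphertext = xor_stream(message, keystream)    # sender
recovered = xor_stream(ciphertext, keystream)  # receiver applies the same operation

print(recovered == message)  # True
```

The same function both enciphers and deciphers because applying exclusive or twice with the same key-stream bit restores the original bit.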

**Public-key ciphers.** All ciphers require the use of a secret key. Public-key ciphers, first developed in the late 1970s, are no exception. However, public-key ciphers have the important advantage that the secret key possessed by the sender need not be the same secret key possessed by the receiver; thus, no secure transfer of keys between the sender and receiver is ever necessary.

Public-key ciphers exploit the computational difficulty of discovering the prime factors of large numbers. (The prime factors of a number are the primes that, when multiplied together, produce the number: e.g., the prime factors of 15 are 5 and 3.) To create a public key, two large (50-digit or longer) primes are chosen and their product calculated. This number ($r$) is made public. Further mathematical operations by the user produce two numbers based on $r$; one of these is the user's public key $k_p$, and the other is retained as the user's private key $k_s$. Anyone who knows $r$ and a given user's public key $k_p$ can send encrypted messages to that particular user; the recipient decrypts the message using their private key $k_s$.
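The description above matches the general shape of the RSA cipher, so the following Python sketch uses textbook RSA as an assumed concrete instance. The primes are deliberately tiny so the arithmetic can be followed, the variable names `r`, `k_p`, and `k_s` mirror the symbols in the text, and the specific values (61, 53, 17) are illustrative. Real systems use far larger primes and add padding steps not shown here. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
# Two secret primes (tiny here; real ones are hundreds of digits long).
p, q = 61, 53
r = p * q                 # 3233, made public
phi = (p - 1) * (q - 1)   # 3120, kept secret; easy to compute only if p and q are known

k_p = 17                  # public key: any number sharing no factor with phi
k_s = pow(k_p, -1, phi)   # private key: the inverse of k_p modulo phi

message = 65              # a message encoded as a number smaller than r

ciphertext = pow(message, k_p, r)    # anyone can do this using r and k_p
recovered = pow(ciphertext, k_s, r)  # only the holder of k_s can undo it

print(k_s)         # 2753
print(ciphertext)  # 2790
print(recovered)   # 65
```

An opponent who could factor `r` back into 61 and 53 could recompute `phi` and therefore `k_s`, which is why security depends on `r` being too large to factor.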

Public-key cryptography has seen wide use since the 1970s. Its security is limited by the ability of opponents to determine the prime factors of $r$, and the difficulty of this task is a function both of the size of $r$ and of the speed of available digital computers. (Large $r$ also makes encryption and decryption more computation-intensive, so it is not practical to defeat opponents by simply making $r$ extremely large.)

Pretty Good Privacy (PGP), a powerful software package that implements public-key encryption, is downloadable for free from many sites on the Internet.

**Attacking codes and ciphers.** Codes and ciphers can be attacked by two basic means. The first is theft of codebooks or keys, that is, espionage. The second is cryptanalysis, which is any attempt to crack a code or cipher without direct access to keys or codebooks. Cryptanalysis may proceed either by trial and error or by systematic analysis of plaintext and ciphertext. The analytic approach may involve both looking for patterns in ciphertext and solving mathematical equations representing the encryption algorithm.

Cryptanalysis by trial and error usually means guessing cipher keys. A cipher key can be guessed by trying all possible keys using a computer. However, designers of encryption systems are aware of this threat, and are constantly employing larger and larger keys to keep ahead of growing computer speed. Systematic cryptanalysis may seek patterns in ciphertext, either by itself or in conjunction with a known plaintext (the so-called "known-plaintext attack"). Mathematical modeling of cipher algorithms may assist trial-and-error methods by reducing the number of guesses required to within (or near) practical limits. For example, in 2002, cryptographers announced that the recently standardized Advanced Encryption Standard of the U.S. government might be vulnerable to a mathematical attack that would reduce the number of computations needed for a successful trial-and-error attack from order $2^{256}$ to order $2^{100}$. The latter number is still not computationally practical, but may be soon.

Quantum cryptography holds out the promise of truly attack-proof ciphering. In a quantum-cryptographic system, not only would messages be undecipherable if intercepted, but also the act of interception would always be detectable by the intended receiver. Such systems may become available to military and government users around 2010.

## FURTHER READING:

### BOOKS:

Churchhouse, Robert. *Codes and Ciphers.* Cambridge, UK: Cambridge University Press, 2002.

Meyer, Carl H., and Stephen M. Matyas. *Cryptography: A New Dimension in Computer Data Security.* New York: John Wiley & Sons, 1982.

Mollin, Richard A. *An Introduction to Cryptography.* New York: Chapman & Hall, 2001.

Singh, Simon. *The Code Book.* New York: Doubleday, 1999.

Stinson, Douglas R. *Cryptography: Theory and Practice.* New York: Chapman & Hall, 2002.

### PERIODICALS:

Seife, Charles. "Crucial Cipher Flawed, Cryptographers Claim." *Science* no. 5590 (2002): 2193.

## SEE ALSO

- ADFGX Cipher
- Cipher Disk
- Cipher Key
- Cipher Machines
- Code Name
- ENIGMA
- FISH (German Geheimschreiber Cipher Machine)
- French Underground During World War II, Communication and Codes
- Playfair Cipher
- World War I: Loss of the German Codebook
- World War II, United States Breaking of Japanese Naval Codes
