An Introduction to the Use of Encryption
by Peter Meyer
Originally written January 1994
Last modified: 1997-07-14
The purpose of this article is to provide information in the area
of practical cryptography of interest to anyone wishing to use
cryptographic software. I have mostly avoided discussion of
technical matters in favor of a more general explanation of what I
regard as the main things to be understood by someone beginning to
use encryption. Those wishing to get more deeply into the
theoretical aspects should consult Bruce Schneier's book (see
bibliography at end).
Cryptography is the art or science of secret writing, or more
exactly, of storing information (for a shorter or longer period of
time) in a form which allows it to be revealed to those you wish
to see it yet hides it from all others. A cryptosystem is a method
to accomplish this. Cryptanalysis is the practice of defeating
such attempts to hide information. Cryptology includes both
cryptography and cryptanalysis.
The original information to be hidden is called "plaintext". The
hidden information is called "ciphertext". Encryption is any
procedure to convert plaintext into ciphertext. Decryption is any
procedure to convert ciphertext into plaintext.
A cryptosystem is designed so that decryption can be accomplished
only under certain conditions, which generally means only by
persons in possession of both a decryption engine (these days,
generally a computer program) and a particular piece of
information, called the decryption key, which is supplied to the
decryption engine in the process of decryption.
Plaintext is converted into ciphertext by means of an encryption
engine (again, generally a computer program) whose operation is
fixed and determinate (the encryption method) but which functions
in practice in a way dependent on a piece of information (the
encryption key) which has a major effect on the output of the
encryption process.
A cryptosystem could be designed which made use of several
different methods of encryption, the particular method chosen for
a particular encryption process being key-dependent. The
combination of encryption methods results again in an encryption
method, which is just as deterministic as a simpler cryptosystem,
although probably harder for a cryptanalyst to crack. A good
cryptosystem should in fact vary the details of its encryption
method in a key-dependent way, though high security does not
require the combination of distinct encryption algorithms.
The result of using the decryption method and the decryption key
to decrypt ciphertext produced by using the encryption method and
the encryption key should always be the same as the original
plaintext (except perhaps for some insignificant differences).
In this process the encryption key and the decryption key may or
may not be the same. When they are the cryptosystem is called a
"symmetric key" system; when they are not it is called an
"asymmetric key" system. The most widely-known instance of a
symmetric cryptosystem is DES (the so-called "Data Encryption
Standard"). The most widely-known instance of an asymmetric key
cryptosystem is PGP ("Pretty Good Privacy").
An encryption algorithm (a precise specification of the steps to
be taken when encrypting plaintext and when decrypting the
resulting ciphertext) is known as an "asymmetric algorithm" if the
encryption and decryption keys that it uses are different;
otherwise it is a "symmetric algorithm".
There are many reasons for using encryption (examples are given
below), and the cryptosystem that one should use is the one best
suited for one's particular purpose and which satisfies the
requirements of security, reliability and ease-of-use.
* Ease-of-use is easy to understand.
* Reliability means that the cryptosystem, when used as its
designer intended it to be used, will always reveal exactly
the information hidden when it is needed (in other words,
that the ciphertext will always be recoverable and the
recovered data will be the same as to the original
plaintext).
* Security means that the cryptosystem will in fact keep the
information hidden from all but those persons intended to see
it despite the attempts of others to crack the system.
Ease-of-use is the quality easiest to ascertain. If the encryption
key is a sequence of 64 hexadecimal digits (a 256-bit key), such
as:
B923A24C98D98F83E24234CF8492C384E9AD19A128B3910F3904C324E920DA31
then you may have a problem not only in remembering it but also in
using it (try typing the sequence above a few times). With such a
key it is necessary to write it down or store it in a disk file,
in which case there is the danger that it may be discovered by
someone else. Thus such a key is not only inconvenient to use but
also is a security risk.
A cryptosystem which is easy to use should allow keyboard entry of
a string of from 10 to 60 characters, and thus a phrase which is
easy to remember, e.g. "Lay on MacDuff!" Spaces should not be
significant, and upper and lower case should be equivalent, so you
don't have to be concerned about variants such as "Lay on Mac
Duff!" or "Lay on Macduff!"
Reliability is the quality next easiest to test for. If it is not
possible to provide a formal proof that the decryption of the
encryption of the plaintext is always identical to the plaintext
it is at least possible to write software to perform multiple
encryptions and decryptions with many different keys to test for
reliability (though this testing cannot be exhaustive). Such test
software rarely accompanies commercial encryption software.
Finally there is the question of security. The security of a
cryptosystem is always relative to the task it is intended to
accomplish and the conditions under which it will be used. A
theoretically secure system becomes insecure if used by people who
write their encryption keys on pieces of paper which they stick to
their computer terminals.
In general a cryptosystem can never be shown to be completely
secure in practice, in the sense that without knowledge of the
decryption key it is impossible to recover the plaintext with
real-world computing power in less than, say, a thousand years.
There is always something that could go wrong, and future advances
in computing power (sufficient to render a cryptanalyst's task
easy) cannot be known in advance.
In theory there is only one cryptosystem, a symmetric key system
known as the "one-time pad", which is completely secure, but in
practice it is cumbersome and the key can be used only once
without compromising the security of the system. Because of its
security, this is one of the methods used by governments to
protect secrets, since they can afford the expense involved in
dealing with the concomitant inconvenience.
In some cases it is possible to show that cracking a cryptosystem
is equivalent to solving some particular mathematical problem,
e.g. the problem of factoring large numbers ("large" here means
numbers with several hundred decimal digits). If many
mathematicians working for many years have been unable to solve a
problem then this is a reason to regard a cryptosystem based on it
as secure. However, there is no guarantee that a solution to the
mathematical problem may not be found tomorrow, in which case the
security of the cryptosystem would disappear overnight (or at
least, as soon as word got around).
In the case of PGP and other encryption software such as RIPEM
which rely on an asymmetric encryption algorithm known as the RSA
Algorithm, it is widely believed that these are secure if and only
if the problem of factoring large numbers is insoluble (that is,
computationally infeasible in real time). In 1994 a claim was made
by William H. Payne (but apparently not confirmed) that a method
of cryptanalysis of the RSA Algorithm had been found which did not
depend on a general solution to the problem of factoring large
numbers. A poster to the Usenet newsgroup sci.crypt (Francis
Barrett) remarked:
Although factoring is believed to be hard, and factoring
breaks RSA, breaking RSA does not simplify factoring.
Trivial non-factoring methods of breaking RSA could
therefore exist. Whether this paper [by Payne] is
legitimate remains to be seen, but it is certainly not
beyond the realm of possiblity.
Some have claimed that PGP is the most secure encryption program
available for PCs, a claim that does not withstand critical
examination. Given two encryption programs, each of which
generates random-looking ciphertext, how does one decide that one
of them is "more secure" than the other -- even if full details of
the encryption algorithms are known? Short of breaking one of the
systems there is no clear answer. If one cannot provide criteria
for determining when one program is more secure than another then
it does not make sense to assert that one is the most secure.
A "brute force attack" upon a cryptosystem is one which involves
trying every possible key to decrypt some ciphertext until finding
one that works. Millions of keys are used in successive (or
simultaneous) attempts to decrypt the ciphertext -- assuming the
decryption method is known -- and the result in each case is
tested to ascertain whether it is something intelligible (it is
easy to distinguish text from random bytes).
Brute force attacks against specific cryptosystems can be compared
since the average time required by a brute force attack is half
the number of possible keys multiplied by the time required to
test each key. It is true that if the size of the key space
associated with a cryptosystem is small (e.g. 2^16 = 65,536) then
the cryptosystem is vulnerable to a brute force attack. But if a
cryptosystem has a very large key space (e.g. about 10^100 if a
60-character key is permitted) then a brute force attack is not
feasible and so any weakness in the system, if it exists, must be
sought elsewhere.
In practice, the security of a cryptosystem can only be measured
by its resistance to actual attempts to break it. Those that have
been broken are obviously insecure. (There are several
commercially available PC encryption packages that have been
broken; see for example the articles by Kochanski in the
bibliography at the end of this article.) Those that have resisted
the attentions of many cryptanalysts for many years may be deemed
secure, at least until better methods of cryptanalysis are
invented. All assertions that a new cryptosystem is weak are so
much hot air unless a demonstration can be given that the system
can be broken or unless specific theoretical weaknesses can be
pointed out.
In the case of DES there has long been widespread suspicion that
the National Security Agency (NSA) influenced its designers at IBM
so that it was strong enough to withstand most attacks but not
strong enough to withstand the NSA computers.
The original design submitted by IBM permitted all 16 x
48 = 768 bits of key used in the 16 rounds to be
selected independently. A U.S. Senate Select Committee
ascertained in 1977 that the U.S. National Security
Agency (NSA) was instrumental in reducing the DES secret
key to 56 bits that are each used many times, although
this had previously been denied by IBM ... --ÊMassey,
p.541.
But the best attempts by cryptanalysts over the years have
produced only meager results (in particular, the demonstration of
Adi Shamir that cryptanalysis of DES ciphertext, in the simplest
DES mode (electronic code book), can be done with somewhat less
effort than that required for a brute force attack). But recently
a new method of DES cryptanalysis has been proposed which involves
the use of parallel processing (using many computers
simultaneously), and it now seems clear that for a few million
dollars a computer can be built which can crack DES ciphertext in
a few hours. Since NSA has practically unlimited funding and has
the largest concentration of computing power and mathematical
talent in the world, it is likely that NSA possesses the ability
to decrypt DES ciphertext fairly easily.
In fact a brute force attack (using many computers working in
parallel) has recently been used to break DES (though only in its
simplest form -- so-called ECB mode).
[Image]
NSA has, of course, never affirmed or denied their ability to
crack DES. ("NSA" also is an acronym for "Never Say Anything" and
-- in earlier years -- "No Such Agency".) However, the absence of
publication of a demonstration that a particular cryptosystem has
been cracked is no proof that it hasn't. Anyone who discovered a
way to crack DES, RSA, etc., could make a lot more money by
quietly providing a decryption service than by telling the world
about his discovery. In fact if he did announce it people would
quickly stop using that cryptosystem and he would have few
clients.
When selecting a cryptosystem, or cryptographic software, you
should first consider what you want it to accomplish. There are
numerous (entirely law-abiding) reasons why you might wish to
conceal information, for example:
1. Companies often possess data files on employees which are
confidential, such as medical records, salary records, etc.
Employees will feel safer knowing that these files are
encrypted and are not accessible to casual inspection by data
entry clerks (who may be bribed to obtain information on
someone).
2. Individuals may share working space with others, of whose
honor they are not entirely sure, and may wish to make
certain that in their absence no-one will find anything by
snooping about in their hard disk.
3. A company may wish to transfer sensitive business information
between sites such as branch offices. Or it may wish to send
confidential information (for example, a negotiating
position, operating procedures or proprietary data) to an
agent in the field (perhaps abroad). If the information is
encrypted before transmission then one does not have to worry
about it being intercepted since if this happens the
encrypted data is incomprehensible (without the encryption
key).
4. A company may have information that a competitor would like
to see, such as information concerning legal or financial
problems, results of research, who the customers are and what
they are buying, information revealing violations of
government regulations, secret formulas or details of
manufacturing processes, plans for future expansion or for
thedevelopment of new products.
5. A person or company may wish to transport to a distant
location a computer which contains sensitive information
without being concerned that if the computer is examined en
route (e.g. by foreign customs agents) then the information
will be revealed.
6. Two individuals may wish to correspond by email on matters
that they wish to keep private and be sure that no-one else
is reading their mail.
From the above examples it can be seen that there are two general
cases when encryption is needed:
(a) When information, once encrypted, is simply to be stored
on-site (and invulnerable to unauthorized access) until there is a
need to access that information.
(b) When information is to be transmitted somewhere and it is
encrypted so that if it is intercepted before reaching its
intended destination the interceptor will not find anything they
can make sense of.
In case (b) there arises the problem of secure key exchange. This
problem exists because the person who will decrypt the information
is usually not the same as the person who encrypted the
information. Assuming that the decryptor is in posssession of the
decryption engine (normally a piece of software) how does the
decryptor know which decryption key to use? This information must
be communicated to the decryptor in some way. If, during the
course of this communication, the key is intercepted by a third
party then that third party can intercept and decrypt the
ciphertext subsequently sent by the encryptor to the decryptor.
This is a problem which all users of symmetric key systems, such
as DES, must face when transmitting encrypted data, because in
such systems the decryption key is the same as the encryption key.
The encryptor can choose any encryption key they wish, but how are
they to communicate that key to the decryptor in a secure way?
Governments typically solve this problem by putting the key in a
locked briefcase, handcuffing it to the wrist of a trusted minion,
and despatching him with several armed guards to deliver the
briefcase in person (typically at an embassy in a foreign
country). This solution is generally too expensive for ordinary
citizens.
If you know that your mail is not being opened then you can send
the key that way, but who can be sure of this? Even registered
mail may be opened. The best way to pass the key to whoever you
will be sending encrypted material to is by personal contact
someplace where there is no chance of being overheard. If this is
not possible then various less secure means are available. For
example, if you used to live in the same city as the person for
some years then you might call them and say, "Remember that
restaurant in San Diego where we used to have breakfast? Remember
the name of that cute waitress?" Then you have a key that only you
two know, unless someone has extensive information on your
breakfast habits in San Diego several years ago and the names of
the waitresses you might have come in contact with.
There is a class of cryptosystems knowns as "public key" systems
which were first developed in the 1970s to solve this problem of
secure key exchange. These are the systems referred to above as
"asymmetric key" systems, in which the decryption key is not the
same as the encryption key. Such public key systems can, if used
properly, go a long way toward solving the problem of secure key
exchange because the encryption key can be given out to the world
without compromising the security of communication, provided that
the decryption key is kept secret.
Let's say you wish to receive encrypted email from your girlfriend
Alice. You send her your public key - the one used to perform
encryption. Alice writes a passionate love letter, encrypts it
with your public key and sends it to you. You decrypt it with your
private key. If your other girlfriend Cheryl intercepts this then
there is no way she can decrypt it because the public key (assumed
to be known to everyone and thus to her) is no good for
decryption. Decryption can only be performed with the private key,
which only you know (unless Cheryl finds it on your PC when
browsing through your files while you're taking a shower).
A public key cryptosystem relies on some mathematical procedure to
generate the public and private keys. The mathematical nature of
these systems usually allows the security of the system to be
measured by the difficulty of solving some mathematical problem.
There are several public key cryptosystems, the most well known
being the one based on the RSA Algorithm (which is named after,
and patented by, its inventors, Rivest, Shamir and Adelman),
which, as noted above, relies for its security on the difficulty
of factoring large numbers. There are two other public key systems
available for licensing for commercial use, such as the LUC public
key system (from LUC Encryption Technology, Sierra Madre, CA), and
one developed by the computer manufacturer Next, Inc.
One should know that RSA is very vulnerable to chosen
plaintext attacks. There is also a new timing attack
that can be used to break many implementations of RSA.
The RSA algorithm is believed to be safe when used
properly, but one must be very careful when using it to
avoid these attacks. -- Public Key Algorithms
Public key cryptography has applications beyond the classical one
of hiding information. As a consequence of the encryption key and
the decryption key being different, public key cryptography makes
possible digital signatures (for authentification of documents)
and digital forms of such activities as simultaneous contract
signing. Digital cash is also an idea which builds on the use of
an asymmetric cryptosystem.
Although public key cryptography in theory solves the problem of
secure key exchange, it does in general have a couple of
disadvantages compared to asymmetric (or secret) key systems. The
first is speed. Generally public key systems, such as PGP, are
much slower than secret key systems, and so may be suitable for
encrypting small amounts of data, such as messages sent by email,
but are not suitable for bulk encryption, where it may be required
to encrypt megabytes of data. Secret key systems can be very fast
(especially if implemented by instructions hard-coded into chips).
The more complex such a system is the slower it tends to be, but
even complex systems are generally of acceptable speed, especially
given the developments in computer technology in recent years. A
program that might have encrypted data at 4 KB/sec. in 1992 will
soon (if not already) be able to encrypt at 100 Kb/sec.
The second disadvantage of public key systems is that there is a
problem of key validation. If you wish to send encrypted data to a
person, Fred, say, and you have obtained what is claimed to be
Fred's public key, how do you know it really is Fred's public key?
What if a third party, Louis, were to publish a public key in
Fred's name? If Louis works for a government that likes to
maintain a close watch over its citizens then perhaps he can
monitor communications channels used by Fred, intercept encrypted
data sent to him, and can then decrypt it (since he has the
corresponding private key). If Louis were really cunning, and knew
Fred's real public key, he could re-encrypt your message to Fred
using the real public key (perhaps after altering your message in
ways you might not approve of) and deliver it to Fred as if it had
come directly from you. Fred would then decrypt it with his
private key and read a message which he assumes is from you, but
which may in fact be quite different from what you sent. In theory
Louis could sit in the middle of an assumed two-way email
correspondence between you and Fred, read everything each of you
send to the other, and pass to each of you faked messages saying
anything he wanted you to believe was from the other.
In 1993 a contributor to sci.crypt (Terry Ritter, 11/29/93) wrote:
When we have a secret-key cipher, we have the serious
problem of transporting a key in absolute secrecy.
However, after we do this, we can depend on the cipher
providing its level of technical secrecy as long as the
key is not exposed.
When we have a public-key cipher, we apparently have
solved the problem of transporting a key. In fact,
however, we have only done so if we ignore the security
requirement to validate that key. Now, clearly,
validation must be easier than secure transport, so it
can be a big advantage. But validation is not trivial,
and many people do not understand that it is necessary.
When we have a public-key cipher and use an unvalidated
key, our messages could be exposed to a spoofer who has
not had to "break" the cipher. The spoofer has not had
to break RSA. The spoofer has not had to break IDEA.
Thus, discussion of the technical strength of RSA and
IDEA are insufficient to characterize the overall
strength of such a cipher. In contrast, discussion of
the technical strength of a secret-key cipher *IS*
sufficient to characterize the strength of that cipher.
Discussion of the strength of public-key cipher
mechanisms is irrelevant without a discussion of the
strength of the public-key validation protocol.
Private-key ciphers need no such protocol, nor any such
discussion. And a public-key cipher which includes the
required key-validation protocol can be almost as much
trouble as a secret-key cipher which needs none.
When encryption is used in case (a), to be stored on-site (and
invulnerable to unauthorized access) until there is a need to
access that information, a secret key cryptosystem is clearly
preferable, since such a system has the virtue of speed, and there
is no problem of key validation and no problem of key exchange
(since there is no need to transmit the encryption key to anyone
other than by face-to-face communication).
However, many people are still using secret key cryptosystems that
are relatively easy to break since those people don't know any
better. For example, the WordPerfect word processing program
allows you to lock the information in a file by means of a
password. In a bad marriage one spouse might think that by locking
their WordPerfect files they can write what they like and not
worry that the other spouse might later use this against them.
What the first spouse doesn't know is that there are programs
around that can automatically (and in a few seconds) find the
password used to lock a WordPerfect file.
In fact the WordPerfect encryption method (at least for Versions
5.1 and earlier) has been shown to be very easy to break. Full
descriptions are given in the articles by Bennett, for Version
4.2, and by Bergen and Caelli, for Version 5.0 (see the
bibliography below).
Another case is the encryption scheme used by Microsoft's word
processing program Word. A method to crack encrypted Word files
was published on Usenet late in 1993, so this method of protecting
information is now obsolete. There is even a company, Access Data
Recovery (in Orem, Utah) that sells software that automatically
recovers the passwords used to encrypt data in a number of
commercial software applications, including Lotus 123.
For a cryptosystem to be considered strong it should possess the
following properties :
1. The security of a strong system resides with the secrecy of
the key rather than with the supposed secrecy of the
algorithm. In other words, even if an attacker knows the full
details of the method used to encrypt and to decrypt, this
should not allow him to decrypt the ciphertext if he does not
know the key which was used to encrypt it (although obviously
his task is even more difficult if he does not know the
method -- and one might note that the Pentagon does not
reveal details of its encryption techniques).
2. A strong cryptosystem has a large keyspace, that is, there
are very many possible encryption keys. DES is considered by
many to be flawed in this respect, because there are only
2^56 (about 10^17) possible keys. The size of the keyspace
should be over 10^100, which is possible in systems which
allow keys up to 60 characters in length (provided that no
two keys are equivalent in their effect on the encryption
process).
3. A strong cryptosystem will produce ciphertext which appears
random to all standard statistical tests. A full discussion
of these tests is beyond the scope of an introductory article
such as this on the use of encryption software, but we may
consider one interesting test, the so-called kappa test,
otherwise known as the index of coincidence.
The idea behind this is as follows: Suppose that the elements of
the cipher text are any of the 256 possible bytes (0 through FF).
Consider the ciphertext to be a sequence of bytes (laid out in a
row). Now duplicate this sequence and place it beneath the first
(with the first byte of the second sequence below the first byte
of the first sequence). We then have a sequence of pairs of
identical bytes. Slide the lower sequence to the right a certain
distance, say, 8 places. Then count how many pairs there are in
which the bytes are identical. If the sequence of bytes were truly
random then we would expect about 1/256 of the pairs to consist of
identical bytes, i.e. about 0.39% of them. It is not difficult to
write a program which analyzes a file of data, calculating the
indices of coincidence (also known as the kappa value) for
multiple displacement values.
When we run such a program on ordinary English text we obtain
values such as the following ("IC" means "index of coincidence"):
Offset IC coincidences
1 5.85% 2397 in 40968
2 6.23% 2551 in 40967
3 9.23% 3780 in 40966
4 8.31% 3406 in 40965
5 7.91% 3240 in 40964
6 7.88% 3227 in 40963
7 7.78% 3187 in 40962
8 7.92% 3244 in 40961
9 8.24% 3377 in 40960
10 7.98% 3268 in 40959
11 8.16% 3341 in 40958
12 8.09% 3315 in 40957
13 8.15% 3337 in 40956
14 7.97% 3264 in 40955
15 7.97% 3265 in 40954
16 8.07% 3306 in 40953
17 8.04% 3293 in 40952
18 7.85% 3214 in 40951
Typically only 80 or so different byte values occur in a file of
English text. If these byte values occurred randomly then we would
expect an index of coincidence for each displacement of about
1/80, i.e. about 1.25%. However, the distribution of characters in
English text is not random ("e", "t" and the space character occur
most frequently), which is why we obtain the larger IC values
shown above.
The kappa test can be used to break a weak cryptosystem, or at
least, to provide a clue toward breaking it. The index of
coincidence for the displacement equal to the length of the
encryption key will often be significantly higher than the other
indices, in which case one can infer the length of the key.
For example, here are the indices of coincidence for a file of
ciphertext (2048 bytes in size) produced by encrypting a text file
using a weak cryptosystem (one which was discussed on sci.crypt in
December 1993):
Offset IC coincidences
1 0.15% 3 in 2047
2 0.34% 7 in 2046
3 0.34% 7 in 2045
4 0.54% 11 in 2044
5 0.44% 9 in 2043
6 0.39% 8 in 2042
7 0.24% 5 in 2041
8 0.49% 10 in 2040
9 0.49% 10 in 2039
10 0.29% 6 in 2038
11 0.15% 3 in 2037
12 0.10% 2 in 2036
13 0.64% 13 in 2035
14 0.74% 15 in 2034
15 0.39% 8 in 2033
16 0.20% 4 in 2032
17 0.30% 6 in 2031
18 0.34% 7 in 2030
256 different byte values occur in the ciphertext, so if it were
to appear as random then the kappa value should be about 0.39% for
each displacement. But the kappa values for displacements 13 and
14 are significantly higher than the others, suggesting that the
length of the key used in the encryption was either 13 or 14. This
clue led to the decryption of the ciphertext and it turned out
that the key length was in fact 13.
As an example of how non-random some ciphertext produced by
commercial cryptosystems may be it is instructive to consider the
proprietary encryption algorithm used by the Norton Diskreet
program. The file named NORTON.INI, which comes with the Diskreet
program, contains 530 bytes and 41 different byte values,
including 403 instances of the byte value 0. The non-zero byte
values are dispersed among the zero values. If we encrypt this
file using Diskreet's proprietary encryption method and the key
"ABCDEFGHIJ" we obtain a file, NORTON.SEC, which contains 2048
bytes, including 1015 0-bytes. When we examine this file with a
hex editor we find that it consists of the letters "PNCICRYPT",
seven 0-bytes or 1-bytes, 1024 bytes of apparent gibberish (the
ciphertext) and finally 1008 0-bytes. Suppose we extract the 1024
bytes of ciphertext. There are 229 different byte values in this
ciphertext, so if it really appeared random we would expect the
kappa values to be about 1/229, i.e. about 0.44%. What we find is
the following:
Offset IC coincidences
1 0.29% 3 in 1023
2 21.72% 222 in 1022
3 0.69% 7 in 1021
4 1.08% 11 in 1020
5 0.49% 5 in 1019
6 0.20% 2 in 1018
7 0.39% 4 in 1017
8 0.00% 0 in 1016
9 0.79% 8 in 1015
10 0.39% 4 in 1014
11 0.69% 7 in 1013
12 0.69% 7 in 1012
13 0.30% 3 in 1011
14 0.99% 10 in 1010
15 0.20% 2 in 1009
16 0.30% 3 in 1008
17 0.40% 4 in 1007
18 0.20% 2 in 1006
The figure of 21.72% for offset 2 is quite astounding. When we
look at the ciphertext with a hex editor we see that there are
many lines which have a byte pattern:
xx yy aa bb aa bb cc dd cc dd ee ff ee ff gg hh
gg hh ...
that is, in which pairs of bytes tend to be repeated, for example:
4B 25 4B 25 8D 28 8D 28 2D F8 2D F8 21 AC
21 AC E8 9E E8 9E F2 FC F2 FC C6 C5 C6 C5 7E 4F
7E 4F B2 8B B2 8B 32 EE 32 EE 25 2C 25 2C A5 32
A5 32 8D 61 8D 61 E5 C1 E5 C1 D4 F7 D4 F7
This explains why sliding the ciphertext against itself two places
to the right produces such a large number of coincidences.
Clearly this ciphertext shows obvious regularities, and appears to
be very far from random. Such regularities are what a cryptanalyst
looks for, as a clue to the encryption method and to the key, and
which a good cryptosystem denies him.
In contrast to Diskreet, a good cryptosystem would encrypt the
same file, NORTON.INI, using the same key, to a file of a few
hundred bytes with kappa values such as the following:
Offset IC coincidences
1 0.45% 2 in 449
2 0.45% 2 in 448
3 0.00% 0 in 447
4 0.45% 2 in 446
5 0.00% 0 in 445
6 0.23% 1 in 444
7 0.45% 2 in 443
8 0.23% 1 in 442
9 0.23% 1 in 441
10 0.23% 1 in 440
11 0.46% 2 in 439
12 0.23% 1 in 438
13 0.23% 1 in 437
14 0.46% 2 in 436
15 0.23% 1 in 435
16 0.69% 3 in 434
17 0.00% 0 in 433
18 0.46% 2 in 432
The essentially discrete distribution of these indices of
coincidence (0.00, 0.23, 0.46, 0.69) are due to the small size of
the ciphertext (400 - 500 bytes). When the same test is done with
about 60,000 bytes of ciphertext produced by a good cryptosystem
(in which all 256 possible byte values should be present, implying
a desired kappa value of 0.39%) we obtain a result such as:
Offset IC coincidences
1 0.41% 248 in 60200
2 0.43% 258 in 60199
3 0.44% 263 in 60198
4 0.43% 258 in 60197
5 0.43% 257 in 60196
6 0.34% 205 in 60195
7 0.40% 239 in 60194
8 0.42% 252 in 60193
9 0.40% 241 in 60192
10 0.40% 242 in 60191
11 0.41% 247 in 60190
12 0.36% 216 in 60189
13 0.41% 245 in 60188
14 0.37% 223 in 60187
15 0.36% 219 in 60186
16 0.41% 247 in 60185
17 0.40% 238 in 60184
18 0.37% 222 in 60183
A good cryptosystem should produce ciphertext which passes the
kappa test and other statistical tests and reveals no regularities
or pattern of any kind.
------------------------------------------------------------------
Selected Bibliography
Cryptology is an academic discipline which has implications for
the security of life and property, and thus there is a vast
literature on the subject, often highly technical in nature. Much
of the research is secret and unpublished. The following are just
a few of the many books and journal articles available. The
history of codes and code-breaking is especially interesting. The
best book on this subject is David Kahn's The Codebreakers (the
bound edition is recommended). Among the following works those
marked with an asterisk are more historical than technical and
tend to be somewhat easier reading. Those marked "#" contain
commentary on some contemporary political aspects of the civilian
use of cryptography.
Andreassen, K.: Computer Cryptology, Prentice-Hall.
Angluin, D. and Lichtenstein, D.: Provable Security in
Cryptosystems, Yale University, 1983.
#Bamford, J.: The Puzzle Palace, Penguin Books.
#Barlow, J. P.: "Decrypting the Puzzle Palace", Communications of
the ACM, July 1992, pp. 25-31.
*Barker, W. G.: History of Codes and Ciphers in the U.S., several
volumes, Aegean Park Press, P. O. Box 2837, Laguna Hills, CA
92654.
Beker, H. and Piper, F.: Cipher Systems, Wiley, 1982.
Bennett, J.: "Analysis of the Encryption Algorithm Used in the
WordPerfect Word Processing Program", Cryptologia 11(4), pp.
206-210, 1987.
Bergen, H. A. and Caelli, W. J.: "File Security in WordPerfect
5.0", Cryptologia 15(1), pp. 57-66, January 1991.
Biham, E. and Shamir, A.: "Differential cryptanalysis of DES-like
cryptosystems", Journal of Cryptology, vol. 4, #1, pp. 3-72, 1991.
*Boyd, C.: "Anguish under Siege: High-Grade Japanese Signal
Intelligence and the Fall of Berlin", Cryptologia 8(3), July 1989,
pp. 193-209.
Brassard, G.: Modern Cryptology, Springer-Verlag, 1988.
Deavours, C. A. and Kruh, L.: Machine Cryptography and Modern
Cryptanalysis, Artech House, 610 Washington St., Dedham, MA 02026,
1985.
DeLaurentis, J. M.: "A Further Weakness in the Common Modulus
Protocol in the RSA Cryptoalgorithm", Cryptologia, 8(3), July
1984, pp. 253-259.
Denning, D.: Cryptography and Data Security, Addison-Wesley, 1982.
*Diffie, W.: "The first ten years of public key cryptography",
IEEE proceedings, 76(5), 560--577, 1988.
---- and Hellman, M.: "Privacy and authentication: an introduction
to cryptography", IEEE proceedings, 67(3), 397-427, 1979.
Feistel, H.: "Cryptography and Computer Privacy", Scientific
American, 228(5), pp. 15-23, 1973.
*Flicke, W. F.: War Secrets in the Ether, Volumes 1 & 2, Aegean
Park Press.
*Friedman, W. F.: Solving German Codes in World War I, Aegean Park
Press.
*---- and Mendelsohn, C. J.: The Zimmermann Telegram of 1917 and
its Cryptographic Backround, Aegean Park Press.
Gaines, H. F.: Cryptanalysis, Dover, 1956.
Garon, G. and Outerbridge, R.: "DES watch: an examination of the
sufficiency of the Data Encryption Standard for financial
institutions in the 1990's", Cryptologia 15(3), 1991, pp. 177-193.
*Hinsley, F. H. et al.: British Intelligence in the Second World
War, Cambridge U. P., volumes 1 - 4.
*---- and Stripp, A. (eds.): Codebreakers: The Inside Story of
Bletchley Park, Oxford U.P., 1993.
Held, G.: Top Secret Data Encryption Techniques, Sams Publishing,
1993.
Hellman, M.: "The mathematics of public key cryptography",
Scientific American, pp. 130-139, 1979.
*Kahn, D.: The Codebreakers, Macmillan, 1967.
*----: Seizing the Enigma, Houghton Mifflin, 1991.
Kochanski, M.: "A Survey of Data Insecurity Packages", Cryptologia
11(1), pp. 1-15, 1987.
----: "Another Data Insecurity Package", Cryptologia 12(3),
pp.165-177, July 1988.
Konheim, A. G.: Cryptography: A Primer, John Wiley, 1981.
#Kruh, L.: "The Control of Public Cryptography and Freedom of
Speech - A Review", Cryptologia 10(1), January 1986, pp. 2-9.
Lysing, H.: Secret Writing, Dover, 1974.
Marotta, M.: The Code Book, Loompanics, 1987.
Massey, J.: "An Introduction to Contemporary Cryptology", IEEE
Proceedings, 76(5), pp. 533-549, May 1988.
Meyer, C. H., and Matyas, S. M.: Cryptography, John Wiley, 1982.
#Pierce, K. J.: "Public Cryptography, Arms Export Controls, and
the First Amendment: A Need for Legislation", Cornell
International Law Journal, Vol. 17, No. 3 (Winter 1984), pp.
197-236.
Rivest, R. L., Shamir, A. and Adelman, L.: "A Method for Obtaining
Digital Signatures and Public-key Cryptosystems," Communications
of the ACM, February 1979.
Salomaa, A.: Public Key Cryptography, Springer-Verlag, 1990.
Schneier, B.: "Untangling Public Key Cryptography", Dr Dobb's
Journal, May 1992, pp. 16-28.
----: "The IDEA Encryption Algorithm", Dr Dobb's Journal, December
1993, pp. 50-56.
----: Practical Cryptography, John Wiley & Sons, 1994.
Simmons, G. (ed.): Contemporary Cryptology: the Science of
Information Integrity, IEEE Press, 1991.
Smith, L. D.: Cryptography, Dover, 1955.
*Weber, R. E.: United States Diplomatic Codes and Ciphers
1775-1938, Precedent, 1979.
Welsh, D.: Codes and Cryptography, Claredon Press, 1988.
*Yardley, H. O.: The American Black Chamber, Ballantine 1981.
Encryption Software
[Image]
Email
Index Home Page