17crypto_L03
Traditional Encryption
Algorithms
Crpto & SecDev 2017 © Ron Poet: Lecture 3 1
Algorithms
Traditional Encryption
� Covers the time period up to the invention of the electronic
computer.
� All single key systems, two main variations.
� Transposition ciphers, where the characters are
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
2
� Transposition ciphers, where the characters are
rearranged.
� Substitution ciphers, where the characters are substituted.
� Transposition and substitution can be combined.
Types Of Attack: Information Available
� Attempts to break the code depend on the information
available.
� Ciphertext only
� Known plaintext
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
3
� Known plaintext
�Know part of the message and use this to find the key.
� Chosen plaintext
�Plant some plain text and examine the cipher text to
find the key.
Types of Attack: Methods
� Brute force (exhaustive search)
�Try all possible keys.
� Letter frequency.
�Use the letter frequencies in the ciphertext.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
4
�Use the letter frequencies in the ciphertext.
� Digram and trigram frequencies
�Frequencies of pairs and triples of characters in the
ciphertext.
� The following slide shows the letter frequencies: totals and
occurrences per 1000 letters from the file dracula.txt.
Letters in dracula.txt
a — 52337 82
b — 8987 14
c — 13516 21
d — 28539 45
e — 79302 124
n — 43597 68
o — 50331 79
p — 9158 14
q — 625 1
r — 34951 55e — 79302 124
f — 13991 22
g — 12670 20
h — 43201 68
i — 42602 67
j — 813 1
k — 6201 10
l — 26115 41
m — 17758 28
r — 34951 55
s — 39484 62
t — 58123 91
u — 17923 28
v — 5871 9
w — 18057 28
x — 781 1
y — 12671 20
z — 351 1
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
5
Confusion and Diffusion
� A good encryption system will try to make sure that there
is little relationship between plain text letters, the key and
cipher text letters.
� In particular, small changes to the plain text or the key
should produce large changes to the cipher text.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
6
should produce large changes to the cipher text.
� An encryption system has good confusion if changing one
bit in the key changes roughly half the bits of the cipher
text.
� An encryption system has good diffusion if changing one
bit in the plain text changes roughly half the bits of the
cipher text.
Transposition Ciphers
� Split characters into blocks of fixed length d.
� Rearrange the characters inside a block according to a key dependent
permutation.
� Decryption uses the inverse permutation to recover the original text.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
7
� We can define the permutation by a table e.g.
0→1, 1→3, 2→0, 3→2.
� It is shorter to list the destination positions in order.
(1, 3, 0, 2) or 1302
� The inverse permutation is
(2, 0, 3, 1) or 2031
Example
� Starting with the text
securityandcryptography
� Arrange it in blocks of 4 characters.
secu rity andc rypt ogra phy
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
8
secu rity andc rypt ogra phy
� Encrypting the text with the above permutation yields
csue tryi dacn prty roag yph.
Unicity Distance
� The first step in calculating the unicity distance is to
examine the key space in more detail.
� Assume that all permutations are equally likely
� There are d! possible permutations for an encryption with
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
9
� There are d! possible permutations for an encryption with
block length d.
� d! is very large for moderate values of d.
Unicity Distance (2)
� The entropy of the key space is log2(d!).
�Assume all permutations are equally likely.
� We can estimate this using Stirling’ approximation for the
factorial function:
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
10
factorial function:
�loge(n!) ≈ (n + 1/2) loge(n) – n
�so log2(x) = log2(e) × loge(x) = 1.4427 loge(x)
� If d = 25 then H(K) = 83.6767
�There are a lot of possible keys.
� Hence Nu = 83.6767 / 3.2 = 26
Confusion and Diffusion
� This algorithm can have good confusion.
�It all depends on the way the notation for the key is
related to the permutation.
� Diffusion is poor because the plain text bits are shuffled
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
11
� Diffusion is poor because the plain text bits are shuffled
round in character sized blocks.
�Replacing the first s in the plain text by a t does not
change the cipher text very much.
Breaking this Code
� Cipher text only attack.
�The only information we have is the cipher text.
� The letter frequencies in the cipher text remain the same as
those for the plain text.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
12
those for the plain text.
� This is a good indication that a transposition cipher is
being used.
Breaking this Code (2)
� The order of the letters has been changed, and the original
order can be recovered by the use of anagramming
techniques.
� This is greatly helped by the use of digram and trigram
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
13
� This is greatly helped by the use of digram and trigram
frequency tables, showing which letter pairs are common,
and also which pairs never occur.
Breaking this Code (3)
� Start by guessing that the permutation length is 4, then
look for clues in the blocks.
csue tryi dacn prty roag yph
� Notice that the third block contains the letters for a-n-d, a
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
14
� Notice that the third block contains the letters for a-n-d, a
very common trigram.
� If this block contained the word “and”, there would be
only two possible permutations.
�c the first in the block “cand” (2, 1, 3, 0)
�c the last in the block “andc” (1, 3, 0, 2).
Breaking this Code (4)
� Making c the first letter in the block leads to
usec yrit cand tryp aogr phy
� It is quite hard to automatically detect that this is wrong.
� Making c the last letter of the block leads to the correct
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
15
� Making c the last letter of the block leads to the correct
pain text.
Substitution Ciphers
� Substitution ciphers provide a matching between letters in
the alphabet to other letters in the same or a different
alphabet.
� In the simple case, each English letter is replaced by
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
16
� In the simple case, each English letter is replaced by
another English letter.
� In more complex cases, English letters can be replaced by
letters in another alphabet.
Shift Substitution
� A shift substitution shifts each letter of the alphabet along
by a fixed amount, with a wrap around at the end. The
formula is
c = (p + k) % n
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
17
� where c is the cipher letter, p the plain text, n is the size of
the alphabet and k, the key, is the number of characters to
shift.
� Decryption uses the inverse transformation
p = (c – k) % n
Caesar Cipher
� An early example of a shift substitution is the Caesar
cipher with k = 3
� This confused the Gauls during Caesar’s wars.
�As reported by Caesar!
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
18
�As reported by Caesar!
� The Caesar cipher relied on a secret algorithm, which is
usually a bad idea.
�But it worked for Caesar.
� Diffusion is again poor. Changing one letter in the plain
text only changes one letter in the cipher text.
Caesar Cipher Example
� securityandcryptography becomes
� vhfxulwbdogfubswrjudskb.
� The letter frequencies have been shifted.
� We can use the most frequent letters in the ciphertext to
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
19
� We can use the most frequent letters in the ciphertext to
guess the shift.
� If the most common letter in the ciphertexst is h, we guess
that it corresponds to a common letter such as e, with a
shift of 3.
�Works better for longer messages.
Multiplicative Substitution
� A multiplicative substitution is slightly more sophisticated,
and uses the formula
c = (p × k) % n
� Decryption uses k’, the inverse of k
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
20
� Decryption uses k’, the inverse of k
p = (c × k ‘) % n, where kk’ = 1 (mod n).
� A more complex version would combine shift and
multiplication.
� c = (p × k1 + k2) % n
Breaking These Codes
� The unicity distance is quite small, since there are not
many possible keys.
�There are 25 keys, assuming shifting with 0 or
multiplying with 1 are not used.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
21
H(K) = log2(25) and so Nu = log2(25) / 3.2 = 1.5
� Single letter frequencies can be used to break these codes
quite easily.
� An exhaustive search is also possible.
� Shift and multiply has 252 = 625 possible keys.
General English to English Substitutions
� The key is a substitution between letters of the alphabet.
�There are a large number of different possible
substitutions, and so keys.
� The unicity distance calculation is similar to a
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
22
� The unicity distance calculation is similar to a
transposition cipher, and Nu = 27.
� A cipher text only attack will then use single letter
frequencies to break the code.
Homophonic Ciphers
� A homophonic cipher matches each letter in the plain text
alphabet to possibly more than one letter in another
alphabet used for the cipher text.
� For example, English letters in the plain text to numbers
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
23
� For example, English letters in the plain text to numbers
between 0 and 127 in the cipher text.
� Each English letter in the plaintext can corresponds to
several numbers in the cipher text.
�The key is this match between letters and numbers.
�It can be quite large.
Homophonic Ciphers (2)
� More frequently used letters have more matching
characters in the other alphabet.
� This severely weakens attacks based on single letter
frequencies
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
24
frequencies
� It can still be attacked by digram frequencies.
� The following slide shows how many numbers between 0
and 127 should be allocated to each of the letters in the
alphabet, based on the earlier frequencies.
� It also shows the relative frequency of plain text letters to
ciphertext codes. 1.00 means no statistical information.
Codes Per Letter
a — 10 1.05
b — 2 0.90
c — 3 0.90
d — 6 0.95
e — 16 0.99
n — 8 1.09
o — 10 1.01
p — 2 0.92
q — 1 0.13
r — 7 1.00e — 16 0.99
f — 3 0.94
g — 2 1.27
h — 8 1.08
i — 8 1.07
j — 1 0.16
k — 1 1.24
l — 5 1.05
m — 3 1.19
r — 7 1.00
s — 8 0.99
t — 11 1.06
u — 4 0.90
v — 1 1.18
w — 4 0.91
x — 1 0.16
y — 2 1.27
z — 1 0.07
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
25
Homophonic Ciphers (3)
� The larger the ciphertext alphabet, the more secure the
code.
� In the limit where every plain text letter is encrypted to a
different cipher text letter, the code cannot be broken
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
26
different cipher text letter, the code cannot be broken
�But the key size is at least as large as the length of the
plain text.
� The ciphertext alphabet requires more bits to encode each
letter.
�26 letters require 5 bits per letter
�Numbers between 0 and 127 require 7 bits per number.
Beale Ciphers
� These form a set of 3 encrypted documents locating hidden
treasure
�Widely believed to be a hoax.
� The key to the second document is the American
declaration of independence.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
27
declaration of independence.
�Every word in the declaration is numbered
consecutively, starting with 1
�Each letter in the plain text can be coded with the
sequence number of any word in the American
declaration of independence that starts with that letter.
� This is a homophonic cipher.
Second Order Homophonics
� Encrypt two different plain text messages of the same
length with two different keys to produce a composite
cipher text.
� Each key will decrypt the cipher text to a different
message.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
28
message.
� The first key decrypts to the real message:
� The other message can be innocuous and the second key is
a distress key, to be revealed under duress.
Second Order Homophonics (2)
� The plaintext alphabet forms the rows and columns of an
n x n matrix.
� The ciphertext alphabet consists of integers between 0 and
n2-1 inclusive
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
29
n2-1 inclusive
� The key is the order in which they appear in the matrix.
� Encryption uses letters from the first message to locate the
row and the second message the column.
� Each cipher text character is the number appearing in the
appropriate element in the matrix.
Second Order Homophonics (3)
� Decrypting to the first message decodes all the numbers in
a given row to the same letter
� Decrypting to the second message uses the numbers in a
given column to produce the same letter.
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
30
given column to produce the same letter.
Example: Alphabet of 5 letters EILMS
� Encryption Matrix
E I L M S
E 10 22 18 02 11
I 12 01 00 05 20
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
31
I 12 01 00 05 20
L 19 06 23 13 07
M 03 16 08 24 15
S 17 09 21 14 04
� If the real message is SMILE and the decoy message
LIMES, then the cipher text will be 21 16 05 19 11
Example (2)
� The real decryption key will be
E(10,22,18,02,11)
I(12,01,00,05,20)
L(19,06,23,13,07)
M(03,16,08,24,15)
Crpto & SecDev 2017 © Ron Poet:
Lecture 3
32
M(03,16,08,24,15)
S(17,09,21,14,04)
� while the decoy key will be
E(10,12,19,03,17)
I(22,01,06,16,09)
L(18,00,23,08,21)
M(02,05,13,24,14)
S(11,20,07,15,04)
� Note that letter frequencies are not destroyed.