PowerPoint Presentation
Lec – 3b
Substitution and Transposition, Steganography
1
Overview
CIA-AA & Cryptography relationship
Categories/Dimensions of cryptographic systems
Substitution techniques:
Single-letter
Caesar
Rot13
Multi-letters substitution
Playfair
Hill Cipher
Vigenere
Vernam
One Time Pad
Transposition techniques: rail fence and rectangular (route)
Cryptanalysis: frequency analysis
Steganography
2
Confidentiality & Cryptography
Username Password
Imran 123456
Username Password
Imran -xcvzuy
Encryption
Symmetric
DES/3DES
Blowfish
AES
—
Asymmetric
RSA
ElGamal
ECC
—
System Developer
Hacker/Bad person
Unencrypted asset/data
Integrity and Hash
SHA- family
SHA1-SHA2, SHA3
RC family
RC4, RC5..
MD family
MD5
Authentication & Cryptography
Asymmetric cryptography is used to confirm authenticity of the sender to the receiver.
This will be discussed when digital certificate and public key cryptography is covered.
Overview of lab work for next two-three weeks
1) OpenSSL – A general purpose cryptographic library [Week 3]
2) Simple web application
– You can use any programming language
– Java
– Django
Please note that I don’t know much about those but you are free to use if you are confident
– I will be using Apache/PHP/MySQL, Lab sheet will cover it, so don’t worry
– XAMP local Apache/PHP environment will be set up [Week 3]
– PHP to create User registration and login [Week 3]
– MySQL to create DB [Week 3]
– Will use Ciphers to encrypt password [Week 4]
– Will use verities of ciphers using PHP syntax [Week 4]
3) Module is not about web programming
– But I will provide some notes about PHP which will be sufficient for this module
4) This will application that you will developed from next will also be used in
exploiting web application vulnerabilities especially injection attacks.
Cryptographic systems
Cryptographic systems are generically classified along three independent dimensions:
1. The type of operations used for transforming plaintext to ciphertext. All
encryption algorithms are based on two general principles: substitution,
in which each element in the plaintext (bit, letter, group of bits or letters)
is mapped into another element, and transposition, in which elements
in the plaintext are rearranged. The fundamental requirement is that no
information be lost (i.e., that all operations be reversible). Most systems,
referred to as product systems, involve multiple stages of substitutions and
transpositions.
2. The number of keys used. If both sender and receiver use the same key, the
system is referred to as symmetric, single-key, secret-key, or conventional
encryption. If the sender and receiver each use a different key, the system is
referred to as asymmetric, two-key, or public-key encryption.
3. The way in which the plaintext is processed. A block cipher processes the input
one block of elements at a time, producing an output block for each input
block. A stream cipher processes the input elements continuously, producing
output one element at a time, as it goes along.
7
Classified along three independent dimensions:
The type of operations used for transforming plaintext to ciphertext
Substitution – each element in the plaintext is mapped into another element
Transposition – elements in plaintext are rearranged
The number of keys used
Sender and receiver use same key – symmetric
Sender and receiver each use a different key – asymmetric
The way in which the plaintext is processed
Block cipher – processes input one block of elements at a time
Stream cipher – processes the input elements continuously
Cryptographic system dimensions: substitution & transposition
Recall: secret key ciphers use a secret key for encryption
Almost all secret key ciphers (no matter how complicated) are essentially a combination of two simple techniques:
Transposition: rearranging order of the plaintext characters
Substitution: it is a method of encrypting by which units of plaintext are replaced with ciphertext, according to a fixed system; the “units” may be single letters, pairs of letters, triplets of letters, mixtures of the above, and so forth
Substitution
Substitution
Monoalphabetic (simple substitution)
– Simple substitution – each character is replaced
– e.g. caesar cipher
Polygraphic (larger groups of letters)
Uniform substitution is performed on blocks of letters
Plaintext: Welcome to Sussex and Welcome to Introduction to Computer Security
for example
if pair of letters moved e.g. we in welcome will be bigraphic
if three letters moved e.g. wel in welcome will be trigraphic
Polyalphabetic (a number of substitutions at different positions in the message)
– For example Playfair cipher, Vigenere
– In our plaintext listed above, suppose, when wel is
substituted by 3 position, come by 4 and so on.
10
Monoalphabetic: Caesar cipher
Julius Caesar a roman military general, 2000 years ago
Earliest known and the simplest substitution cipher
Involves replacing each letter of the alphabet with the letter standing three places further down the alphabet
a b c d e f g h i j k l m n o p q r s t u v w x y z
Example
Plain: meet me after the toga party
Cipher: PHHW PH DIWHU WKH WRJD SDUWB
What is the key used in this example?
Task your turn – Plaintext is: Hello Ciphertext : ?
Formula can be expressed as…
C = E(3, p) = (p + 3) mod 26 where p is plaintext
What is mod?
11
For letter b value is computed as following,
its index is 1
C= (1+3) mod 26 = 4 = e
Mod value can be checked on
https://www.miniwebtool.com/modulo-calculator/
General Caesar algorithm takes on a value in the range 1 to 25.
C = E(k, p) = (p + k) mod 26
The decryption algorithm is simply p = D (k, C) = (C – k) mod 26
For example for encrypted e p = (C – k) mod 26 = 4-3 mod 26 = 1 mod 26 = 1 = b
C = E(k, p) = (p + k) mod 26 Where p is index of plaintext character
P = D (k, c) = (c – k) mod 26 Where c is index of ciphertext character
Bruteforce against Caesar cipher
Three important characteristics enabled us to use a brute force:
The encryption and decryption algorithms are known.
There are only 25 keys to try.
3. The language of the plaintext is known and easily recognizable.
12
ciphertext: PHHW PH DIWHU WKH WRJD SDUWB
plaintext: meet me after the toga party
Large number of keys makes brute-force cryptanalysis impractical largely
The triple DES algorithm makes use of a 168-bit key, giving a key space of or greater than 3.7 * 1050 possible keys.
13
Problems with monoalphabetic
Key-space
Why a need for multi-letter substitution?
Total characters in English language are 26, key-space is 26
To increase key-space, use permutation
Monoalphabetic easy to break because they reflect the frequency data of the original alphabet.
A countermeasure is to provide multiple substitutes known as homophones. For example, the letter e could be assigned a number of different cipher symbols such as 3, 4 and 16.
Each homophones assigns these symbols in rotation or randomly
With homophones, each element of plaintext affects only one element of ciphertext, and multiple letter patterns will still survive in the ciphertext two solutions for this:
1)Encrypt multiple letters of plaintext
2)Use multiple cipher alphabets
Monoalphabetic ciphers
Permutation: A permutation of a finite set of elements is an ordered sequence of all the elements of S, with each element appearing exactly once.
For example, if S = {a,b,c} there are six permutations of S:
abc, acb, bac, bca, cab, cba
Permutations calculator
Instead of simple cipher (26 keys) there are 26!
What this symbol ! means?
Provides bigger key space than DES.
15
Monoalphabetic ciphers
But the problem is:
If language is known, it is still possible to find the plaintext
How? Frequency analysis
16
Polyalphabetic: playfair cipher
Best known multiple-letter cipher
Use a number of substitutions over the entire plaintext
Treats digrams in the plaintext as single units and translates these units into ciphertext digrams
Use of 5 x 5 matrix
I/J takes on space
What is a matrix? – If you don’t know then research on it yourself.
Netflix prize – 1 million dollar
Playfair – Example
key: MONARCHY plaintext: FRIEND
1st step: write the key in the matrix
2nd step: along with the key,
write remaining letters
3rd step: make pairs,
not possible then write filler
Fill in the letters minus duplicates from left to right and from top to bottom
Fill in the remainder of the matrix with the remaining letters in alphabetic order
Letters I/J count as one letter
Plaintext is encrypted two letters at a time according to the following rules:
Playfair rules
Repeating plaintext letters that are in the same pair are separated with a filler letter, such as x, so that balloon would be treated as ba lx lo on.
Two plaintext letters that fall in the same row of the matrix are each replaced by the letter to the right, with the first element of the row circularly following the last. For example, ar is encrypted as RM.
Two plaintext letters that fall in the same column are each replaced by the letter beneath, with the top element of the column circularly following the last. For example, mu is encrypted as CM.
Otherwise, each plaintext letter in a pair is replaced by the letter that lies in its own row and the column occupied by the other plaintext letter. Thus, hs
becomes BP and ea becomes IM (or JM, as the encipherer wishes).
Playfair
Identification of digrams is more difficult
Relative frequencies of individual letters exhibit a much greater range than that of digrams
Used extensively in World war 1 and 2 especially first world war
There are ways to break this cipher; one method is to find relative frequency.
Substitution multi letter: Hill Cipher
Research yourself about this cipher
Look at how encryption and decryption process works.
Polyalphabetic: vigenere cipher
A set of monoalphabetic ciphers is used
Vigenère cipher: generalize the rotate cipher by shifting different plaintext letters by different amounts
Shift amount determined by the letter in the key (a: shift 0, b: shift 1, …, z: shift 25)
Repeat the key if necessary
Example: [p:We are discovered save yourself ;k:deceptive]
Polyalphabetic: Vigenere cipher
Observation: Repeated plaintext patterns separated by integer multiples of key length results in repeated ciphertext patterns
1) Attacker looks for repeated ciphertext patterns and guess key length
Same example: can guess
key length = 3 or 9
If (say) key length = 9,
the 1st, 10th, 19th, …
letter is encrypted
using same key letter
3) The problem reduces
to solving a number of
monoalphabetic ciphers.
Break each such
monoalphabetic cipher
as before.
Combine results
Vigenere cipher
Idea: append plaintext as key
Example:
Problem with auto-key is that key and plaintext share same frequency
Vernam
Aim is to choose a keyword that is as long as the plaintext and has not statistical relationship to the plaintext
The system can be expressed
Ci = Pi XOR Ki
Pi = ith binary digit of plaintext
Ki = ith binary digit of key
Ci = ith binary digit of ciphertext
XOR = Exclusive OR operation
Focus in this technique is on the construction of the key
Long key is a benefit, makes cryptanalysis difficult
Can be broken with availability of sufficient cipher texts
For further details and example, look at https://isaaccomputerscience.org/concepts/data_encrypt_vernam
One Time Pad
Use a truly random key that is as long as the message
An unconditionally secure type of cipher
A brute force attack will fail: you will decrypt into many possible plaintexts
There is no way to distinguish which is the correct plaintext
One Time Pad
Why unconditionally secure?
There are no statistical relationship between plaintext and ciphertext
The ciphertext contains no information about the plaintext
In fact, there is always a key to decrypt into whatever plaintext you want
However, usually not feasible in practice
Need huge amount of random numbers
How to securely distribute the secret key, which is as long as the message? (If there is such a way, why not use it for the message directly)
Transposition techniques
Transposition techniques
Permutation is a mathematics concept
In permutation order does matter
P(n,r)=n!/(n−r)!
e.g. 4 digits code and total digits are 0-9
P(n,r)=10!/(10−4)!=10⋅9⋅8⋅7⋅6⋅5⋅4⋅3⋅2⋅1/6⋅5⋅4⋅3⋅2⋅1=5040
Transposition ciphers mean that some sort of permutation has been performed on plain text
Transposition: Rail fence technique
Slight variations of it exists
Two steps
Plaintext is written down as a sequence of diagonals
Read off as a sequence of rows
Example
Plaintext = meet me after the toga party; rail fence of depth two, depth is the key
m e m a t r h t g p r y
e t e f e t e o a a t
The encrypted message is : MEMATRHTGPRYETEFETEOAAT
Key/Depth is 3 (Note variation)
mmtetefeear
mtaeemfreet
m e m a t r h t g p r y
e t e f e t e o a a t
m m t
e t e f e
e a r
m t a e
e m f r
e e t
Route/Rectangular cipher
Research on this yourself. Find how encryption and decryption works in this particular cipher
Cryptanalysis using frequency analysis
ciphertext:UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZVUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSXEPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
1st step: Frequency of letters
2nd step: Compare it with
Frequency of letters in English
text
P and Z in ciphers are equal to e and t, not sure which is which; high frequency
S,U,O, M and H probably correspond to a,h,i,n,o,r,s ; relatively high frequency
A,B,G,Y,I,J likely included in b,j,k,q,v,x,z
; lower frequency
3rd step: Find frequency of two letter combinations and compare with English language two letter frequency. For details on English language, frequency look at this link. Two letters combinations is called digrams or bigrams.
1) th in English language
2) ZW in cipher text above
3) Can be concluded that Z correspond with t and
W with h
4) This helps to conclude that P correspond with e
5) Thus, ZWP appears to be the, most frequent
trigram in English, indicate we are on the right
track
4th step: continued analysis of frequencies plus trial and error should easily yield a solution from this point.
It was disclosed yesterday that several informal but direct contacts have been made with political representatives of the viet cong in Moscow
Steganography
Conceal the existence of the message
Various techniques:
Character marking
Pin punctures
Typewriter correction ribbon
Through social media chat
Behind images
Does not use a lot as it is not very secure