Cryptography – CMSC 414
Cryptography
CMSC 414
Goals of Cryptography
There are two main goals of cryptography:
I Keep secrets secret (Confidentiality, Privacy, Anonymity)
I Ensure that data is correct (Integrity, Authenticity)
Cryptosystem Principles
Kerckhoffs’s Principle: “A cryptosystem should be secure even if
everything about the system, except the key, is public knowledge.”
Schneier’s Law: “Any person can invent a security system so
clever that he or she can’t imagine a way of breaking it.”
⇒ There is no security through obscurity!
Terminology
Cryptography and related topics have terminology all their own:
I Techniques
I Technologists
I Systems and operations
I Participants
Keeping Secrets
We want to replace some or all of our message with something
else, and being able to recover the original
Encoding refers to replacing a semantic unit (generally, a word or
phrase) with something else
Enciphering refers to replacing individual letters or bits with
something else
Plaintext or Cleartext refers to a message as written and
intended to be read
Ciphertext refers to a transformation of a plaintext so that it
cannot be read, other than by the intended recipient
Code Examples
Codewords can replace single or multiple words
Telegraph operators have standard replacements using 4-letter
non-words for common longer phrases
I CIMA: Reserved seats fully booked
I EZUK: I am arranging supply of trucks
I JOPO: No damages noted on discharge
A criminal organization replaced “kilo of cocaine” with “shirt”
⇒ Got awkward when the police intercepted a call promising
delivery of 312 shirts
Cipher Types
Classified based on unit of operation
Substitution and Stream Ciphers transform one symbol at a time
Block Ciphers transform multiple symbols as a group
Substitution Ciphers
Vigenère Ciphers (one type of substitution)
Plaintext Ciphertext
⇐ Key
A
B
C
D
E
K E Y W
L
M
N
O
P
F
G
H
I
J
Z
A
B
C
D
X
Y
Z
A
B
C = P + K mod 26
P = C − K mod 26
BEADED ⇒ MJZAPI
Monoalphabetic or Polyalphabetic Cipher
⇒ Vulnerable to Frequency Analysis
http://thephoenixsociety.org/puzzles/puzzles.htm
http://thephoenixsociety.org/puzzles/puzzles.htm
One-Time Pads
Message M: “Attack at dawn”
⇒ 0x41747461636b206174206461776e
Random key stream K :
⇒ 0x964a5c33708882d481643e50f301e79c…
Ciphertext C is M ⊕ K :
M: 0x41747461636b206174206461776e
K: 0x964a5c33708882d481643e50f301e79c…
C: 0xd73e285213e3a2b5f5445a31846f
Random key stream is a One-Time Pad
⇒ Never repeats!
Information-theoretic security — bit Ci has equal probability of
being a 0 or 1
Requires a reliable stream of random numbers, shared by sender
and receiver, with no reuse of the stream (this is what Quantum
Key Distribution does)
Block Ciphers
I Playfair replaces pairs of letters based on rectangles within a
grid
I DES and AES replace fixed-size blocks of bits according to a
key and a complex algorithm of shifts and mathematical
operations
I RSA also operates on fixed-size blocks, but has two keys,
which are related by a mathematical property
Technologists
Cryptographers create ciphers
Cryptanalysts break ciphers
Cryptologists study ciphers, both how they’re created and how
they’re broken
I generally not as thoroughly as cryptographers and cryptanalyts
I often focus on how to use cryptography
Systems and Operations
Encryption is the process of transforming a plaintext into the
corresponding ciphertext
Decryption is the process of transforming a ciphertext into the
corresponding plaintext
A Cryptosystem is the combination of an encryption and
decryption scheme, which may involve multiple algorithms to
prepare and modify the inputs and outputs, as well as protocols to
use those schemes
Participants
We generally use the following names:
Alice, Bob, and Carol are the typical participants in a
cryptosystem. They may be cooperative or adversarial.
Eve is an eavesdropper, and may have wiretapping abilities:
Passive Wiretapping Eve can see all messages exchanged
Active Wiretapping Eve can also add, remove, or modify messages
between when they’re sent and received
Trudy is an intruder
Some Notation
Cryptography involves math, so let’s review
Specifying a function’s domain and range:
f : D 7→R
H : {0, 1}∗ 7→{0, 1}n
E : {0, 1}k 7→{0, 1}k
Specifying a function’s transformation on an input:
f : x →f (x)
E : m→me mod n
D : c →cd mod n
Categorizing Algorithms
A Cryptographic Hash function is a one-way function of the form
H : {0, 1}∗ 7→ {0, 1}n
such that c = H(m) is easy to compute, but m = H−1(c) (the
preimage) is infeasible
A Symmetric-Key Cryptosystem has a single key, shared by the
sender and receiver; c = E (k,m) and m = D(k, c) are easy to
compute with the key k, but infeasible to compute otherwise
An Asymmetric-Key Cryptosystem has a public key that can be
shared with any potential senders and a private key known only to
the key owner; c = E (Kpub,m) can be computed by anyone, but
m = D(Kpriv, c) can only be computed efficiently by the key owner
Data Correctness
Cryptography can provide more than just secrecy
Cryptographic hashes can provide an Integrity check on data
Need some way to check the integrity of the hash
⇒ Digital Signatures
Digital Signatures
Asymmetric cryptosystems provide the following functions:
E : {0, 1}b 7→ {0, 1}b; D : {0, 1}b 7→ {0, 1}b
Since the domain and range of E and D are identical, we can
switch them:
s = D(Kpriv,m); m = E (Kpub, s)
s is a digital signature of m using the private key Kpriv, which
can be verified using the public key Kpub
Typically, m = H(M) for some cryptographic hash function H
Evaluating Cryptosystems
The Random Oracle model is the standard model of
cryptographic analysis
The oracle produces random output for any input, but the same
input will always produce the same output
A cryptographic primitive is cryptographically secure if it’s
indistinguishable from a random oracle
Birthday Paradox
Counter-intuitive statistical result
Given m samples from a set of N options, the probability of a
collision within the samples becomes appreciable when m ≈
√
N
H : {0, 1}∗ 7→ {0, 1}b likely to have collisions if 2b/2 inputs are
tried
This has implications for
I choice of block size based on computing capabilities
I number of times a block cipher can safely be used
Symmetric Key Cryptosystems
Also called Shared Key cryptosystems
Sender and receiver have the same key
Key has to be transmitted by some out-of-band mechanism or
generated by a key-agreement protocol
You can also use this to encrypt your own files, in which case you
are in both roles, and there is no need to transmit the key
XOR Block Cipher
K is a block of length b
M is a message of length b
C = M ⊕ K
M = C ⊕ K
Similar to One-Time Pad, but…
Breaking a Short XOR
AA D5 BF C0 A8 CA AA D5 AF C0 BC CF
We (somehow) already know b = 16 (2-byte XOR)
C: AA D5 BF C0 A8 CA AA D5 AF C0 BC CF
K: K1 K2 K1 K2 K1 K2 K1 K2 K1 K2 K1 K2
Break into 2 1-byte groups:
1: AA BF A8 AA AF BC
2: D5 C0 CA D5 C0 CF
Note repetitions — these are the same character in the plaintext!
AA1,D52,C02
Breaking a Short XOR
Recall:
1: AA BF A8 AA AF BC
2: D5 C0 CA D5 C0 CF
Let’s try some common
letters!
AA1
?= E (45)⇒ K1 = AA⊕ 45 = EF
E ? P ? G ? E ? @ ? S ?
AA1
?= T (54)⇒ K1 = AA⊕ 54 = FE
T ? A ? V ? T ? Q ? B ?
AA1
?= A(41)⇒ K1 = AA⊕ 41 = EB
A ? T ? C ? A ? D ? W ?
Looks promising…
D52
?= T (54)⇒ K2 = D5⊕ 54 = 81
A T T A C K A T D A W N
K = EB81
Longer blocks require more ciphertext, but technique is the same
This can be automated, based on derived letter frequencies
Better Symmetric-Key Ciphers
A good block cipher will employ:
Confusion Each bit in C depends on many bits in K (ideally
non-linearly)
Diffusion Flip a bit in M ⇒ flip about half the bits in C
Flip a bit in C ⇒ flip about half the bits in M
Our simple XOR has neither
SP-Networks
Substitution and Permutation
S-boxes mix input bits into output bits (confusion)
I each input bit mixes into every output bit
I changing an input bit changes roughly half of the output bits
I invertible
I usually block size is several S-boxes wide
P-boxes permute outputs of S-boxes (diffusion)
I 1-1 mapping
I full-width of the block, to mix output from separate S-boxes
This defines one round, of which there are typically several
SP-Networks
One Block
S:
P:
O
ne
R
ound
SP-Networks
Let’s look at an analogy:
Say you have a very large deck of cards to shuffle
⇒ Too much to fit in your hands all at once!
Break the deck into several smaller “sub-decks”, and shuffle these
⇒ This is what S-boxes do!
After shuffling each individually, cut them back together to form
one deck, but reordering parts of sub-decks
⇒ This is (roughly) what P-boxes do!
Obviously, one round isn’t enough to randomize the cards, so
repeat several times
Feistel Ciphers
In general, Feistel Cipher is ψ(f1, f2, …, f2k)
ψ−1(f1, f2, …, f2k) = ψ(f2k , …, f2, f1)
Often, fi built from a function F and keys Ki
Block size b bits
F : {0, 1}b/2 7→ {0, 1}b/2
K → {Ki}, i ∈ [0, n] (Key Schedule)
P → (L0,R0) — L and R both b/2 bits
Li+1 = Ri Ri+1 = Li ⊕ F (Ri ,Ki )
C = (Rn+1, Ln+1)
Ri = Li+1 Li = Ri+1 ⊕ F (Li+1,Ki )
P = (L0,R0)
F does not need to be invertible
Feistel Ciphers
That’s a lot of math, even for us
Let’s visualize this:
R1
L1
F1 ⊕
K1
R2
L2
F2 ⊕
K2
R3
L3
C 2-fn encrypt
R1
L1 F1⊕
K1
R2
L2
1-round decrypt
DES
Data Encryption Standard
64-bit blocks, 56-bit keys (plus 8 parity bits)
16-round Feistel cipher, operating on 32 bits of data in each Feistel
function
F is 8 different S-boxes and a P-box
S-boxes have 6-bit input and 4-bit output
DES is not secure ⇒ 3DES is three applications of DES, with 3
(sometimes 2) different DES keys
C = EK3(DK2(EK1(P))),P = DK1(EK2(DK3(C)))
AES
3DES was good enough for a while; eventually replaced with AES
(Advanced Encryption Standard)
Not a Fesitel cipher
Supports multiple key/block sizes: 128 (10 rounds), 192 (12
rounds), and 256 (14 rounds)
No practical attacks known (yet) against algorithm
Side-channel attacks exploit implementation details (an issue for
all algorithms)
Modes of Operation
Rare to use a block cipher on only one block
Need a way to extend this to multiple blocks
I Electronic Code Book (ECB)
I Cipher Block Chaining (CBC)
I Output Feedback (OFB)
I Counter Encryption (CTR)
I Message Authentication Code (MAC)
Electronic Code Book
Ci = EK (Pi)
P0 P1 P2 P3 P4
EK EK EK EK EK
C0 C1 C2 C3 C4
I Easy to parallelize
I Repeated plaintext blocks ⇒
repeated ciphertext
I Vulnerable to splicing attacks
I OK for challenge-response
systems
If you see a “secure” connection with “ECB” in the ciphersuite, be
wary!
Cipher Block Chaining
Ci = EK (Pi ⊕ Ci−1)
P0 P1 P2 P3 P4
EK EK EK EK EK
C0 C1 C2 C3 C4
IV
I Obscures repeated patterns
I Resistant to splicing
I Random Initialization
Vector (IV) used for first
block
Can’t parallelize block encryptions
Output Feedback
Ki = EK (Ki−1)
K−1 = IV
Ci = Pi ⊕ Ki
EK EK EK EK EK
K0 K1 K2 K3 K4
IV
P0 P1 P2 P3 P4⊕ ⊕ ⊕ ⊕ ⊕
C0 C1 C2 C3 C4
I Creates a key stream
I Additive stream cipher
I Random Initialization
Vector (IV) used for first
key
Given Pi, can recover Ki & replace with any block you like!
⇒ no message integrity
Parallelize as many blocks as willing/able to store keystream
Counter Encryption
Similar to OFB, but
Ki = EK (IV + i)
Can compute Ki as needed to parallelize!
Still vulnerable to the attack described for OFB
CTR has a longer cycle length (BP collisions), though (2n instead
of 2n/2 for CBC and OFB)
Message Authentication Code
Not an encryption mode, but an integrity mode
CBC-MAC builds off of CBC mode, keeping only the last block
Effective for fixed-length messages
CMAC mixes K into the next-to-last block encryption to prevent
an extension attack, where a variable-length message is
concatenated with a malicious extension
We can also use a cryptographic hash function with a key to create
a MAC
Cryptographic Hash Functions
H : {0, 1}∗ 7→ {0, 1}b
Small changes in the input should result in large changes in the
output
Given H(x), it should be very hard to find x (there will be many)
⇒ 2b/2 guesses, on average
Algorithm b Maximum message size
SHA-1 160 264 − 1
SHA-256 256 264 − 1
SHA-512 512 2128 − 1
We can also build cryptographic hash functions from block ciphers
⇒ must consider Birthday Paradox!
Using Hash Functions
Given a good hash function H and a key K , we can create a good
MAC
MAC = H(K |M)
I M is the message being authenticated
I a|b denotes concatenation of a and b
We can also create a Feistel cipher:
ψ(f1, f2, f3, f4) fi (x) = H(Ki |x)
The Luby-Rackoff Result proves this is a good block cipher
Encryption and Integrity
Encrypt-then-MAC
I Encrypt plaintext
I Keyed hash on ciphertext
I Used in IPsec
Encrypt-and-MAC
I Encrypt plaintext
I Keyed hash on plaintext
I Used in SSH
MAC-then-Encrypt
I Keyed hash on plaintext
I Plaintext and hash encrypted as a single message
I CCM: CBC-MAC and CTR
I Used in SSL/TLS
Limitations of Secret Key Cryptosystems
Secret keys are pairwise ⇒ for n principals, O(n2) keys/exchanges
If someone distributes an encrypted file and then disappears, no
way to verify file correctness or obtain a new key ⇒ requires
principals to be online
Key exchanges are often vulnerable to man-in-the-middle attacks,
so we often lack authentication
How can we overcome these limitations?
Trusted Third Parties
One way is to use a Trusted Third Party (TTP)
If Trent is a trusted third party, then Alice and Bob can use Trent
as an intermediary
Each principal only needs to exchange keys with Trent ⇒ O(n),
good
Possible to strongly authenticate Trent, since it only needs to be
done once ⇒ good
Trent becomes a bottleneck and central point of failure ⇒ bad
Trust Models
Just like a Security Model or Threat Model, we also need a Trust
Model
I Who are we trusting?
I Who is doing the trusting?
I What are we trusting them with?
I What are we trusting them to do or not do?
Trusting Trent
What is the trust model for our TTP?
I Who are we trusting?
Trent
I Who is doing the trusting?
Alice, Bob, and other communicants
I What are we trusting them with?
all of the communications, including timely delivery
I What are we trusting them to do or not do?
deliver messages with strong Confidentiality and Integrity, and
with constant Availability ; do not disclose messages, drop or
delay them, nor modify them
Kerberos
Kerberos uses symmetric key cryptography and a TTP
Delegates service support and decentralizes some of the TTP
functionality
Authentication Server (AS) grants Alice a Ticket Granting
Ticket (TGT) for a Service Server (SS), encrypted with that
service’s symmetric key
⇒ Alice cannot decrypt this ticket, but the Service Server can in
order to verify her credentials
The AS is required for users to log in, but SS can interpret the
TGT whether the AS is available or not
Further tickets can be granted hierarchically
Asymmetric Key Cryptosystems
Also called Public Key cryptosystems
Sender and receiver have different keys
Public key is used to encrypt, and can be given to anyone
Private key is used to decrypt, and must be kept secret by the
owner
This is incredibly useful for communications, like email and
messaging
Can also be used to generate digital signatures
Hard Problems
Form the core of any public key cryptosystem
One-way Trapdoor Functions
With some piece of information (the private key), a function
inverse is easy to compute ⇒ this is the trapdoor part
Without it, the inverse is infeasible to compute ⇒ this is the
one-way part
I Factoring products of large prime numbers
I Discrete logarithms
Modular Arithmetic
Most public key cryptography uses modular arithmetic
a ≡ b (mod n) ⇐⇒ a mod n = b mod n
Some things are easy
I a + b mod n
I a − b mod n
I a ∗ b mod n
I ab mod n
Some things are hard
I find b such that c ≡ ab (mod n)
Diffie-Hellman
Hard problem: given
I prime p
I α that generates Z∗p (integers modulo p, excluding 0)
I αx mod p
I αy mod p
find αxy mod p
Alice Bob
x ←
R
[1, p − 2]
αx mod p
y ←
R
[1, p − 2]
K ← (αx )y mod p
α
y mod p
K ← (αy )x mod p
These shared keys are discarded after a session completes
RSA
Ron Rivest, Adi Shamir, Leonard Adleman — 1978
Most widely-used public key
algorithm
First developed by GCHQ in 1973, but Official Secrets Act meant
they couldn’t take credit until 1997
RSA
Hard problem: given
I n = p · q for p, q prime (but known only to one party)
I e ∈ Z∗n (integers modulo n, excluding 0, p · a, q · a (∀a))
I me mod n
find m
One party knows d s.t. e · d ≡ 1 (mod (p − 1)(q − 1))
With d , finding m is easy: m ≡ (me)d mod n
Computing d is hard without factoring n into p and q, even given e
Kpub = (n, e), Kpriv = (p, q, d) e is usually 3 or 65537
c = E (Kpub,m) = me mod n, m = D(Kpriv, c) = cd mod n
Weaknesses of RSA
E (Kpub,m) always the same
⇒ Chosen-plaintext attack using potential messages
E ((n, e),m) ,E ((n′, e),m)
⇒ Chinese Remainder Theorem can be used to decrypt
Random Padding of plaintexts can alleviate these problems
Practical Considerations of RSA
No two principals can have the same modulus n, or they will be
able to decrypt each other’s messages
A message m must be less than n
Due to the need for padding, m must be even smaller (fewer bits
of padding means greater vulnerability to chosen-plaintext attacks)
Encryption is fairly fast, but decryption is rather slow
Long messages have to be split into many blocks
We will re-visit this soon
Other Public Key Cryptosystems
ElGamal generates a pair of elements as a ciphertext, and employs
a random exponent which is incorporated into both elements
I It relies on the difficulty of solving the Discrete Logarithm
problem
Elliptic-Curve Cryptograpy is based on a variant of the Discrete
Logarithm problem as it relates to calculations involving elliptic
curves
I There are multiple cryptosystems that employ elliptic curves
I Elliptic-curve cryptography typically requires much shorter
keys than RSA for the same level of security
Digital Signatures
As previously discussed, a public key encryption scheme can be
used to create a digital signature scheme
We will limit our discussion to RSA signatures, but there are other
digital signature schemes to be aware of:
I The Digital Signature Algorithm is based on the ElGamal
cryptosystem
I The Schnorr signature algorithm is similar to DSA, but is
based on more general group theoretical principles
RSA Signatures
Define
S(Kpriv,M) = D(Kpriv,H(M))
V (Kpub, s,M) = H(M)
?
≡ E (Kpub, s) (mod n)
H is a cryptographic hash function, such as SHA-1 or SHA-256
If s = S(Kpriv,M) = H(M)d mod n, then
E (Kpub, s) = (H(M)d mod n)e mod n = H(M)
A signature scheme thus requires both an encryption scheme and a
cryptographic hash function
E and D are called encrypt and decrypt operations; S and V are
called sign and verify operations