Cryptography basics – Integrity: Hashes and MACs
ECEN 4133 Jan 19, 2021
Alice and Bob
Alice wants to send message m to Bob
◦ Can’t fully trust the messenger or network carrying the message ◦ Want to be sure what Bob receives is actually what Alice sent
✉✉
Alice Mallory Bob
Threat model:
◦ Mallory can see, modify, forge messages
◦ Mallory wants to trick Bob into accepting a message Alice didn’t send
Solution:
Message Authentication Code (MAC)
One approach:
◦ Alice computes v := f(m)
◦ Bob verifies that v′ = f(m′)
Alice
Function f ?
Easily computable by Alice and Bob;
not computable by Mallory
(Idea: Secret only Alice & Bob know)
We’re sunk if Mallory can learn f(x) for any x ≠ m!
m, v
m’, v’
Mallory
Bob
Candidate f: Random Function
Input: Any size
Output: Fixed size (e.g. 256 bits)
Defined by a giant lookup table that’s filled in by flipping coins
Completely impractical [why?]
in
0 → 1 → 2 →
out
0011111001010001… 1110011010010100… 0101010001010000…
Provably secure [why?]
(Mallory can’t do better than randomly guessing)
…
…
Hash Functions
Random Functions are impractical
Hash functions approximate a random function:
• Any size input
• Fixed size output (e.g. 256 bits)
• Hard (but not impossible!) to invert (given output, find input)
Input
N bits
Hash
256 bits
Properties of a secure cryptographic hash:
◦ First pre-image resistant – Given H(m), hard to find m
◦ Second pre-image resistant – Given m1, hard to find m2 s.t H(m1)==H(m2) ◦ Collision resistant – Hard to find m1 != m2 s.t H(m1)==H(m2)
Output
Example Hash Function: SHA256
What is SHA256?
“Cryptographic hash function”
Input: arbitrary length data (No key) Output: 256 bits
Built with “compression function” h
(256 bits, 512 bits) in → 256 bits out Designed to be really hairy (64 rounds of this:)
Compression functions
Compression function h take (two) fixed-length inputs, produce fixed-length output
?
256 bits
How do we build a hash function from h that takes an arbitrary length input?
256 bitsH(M)
512 bits
?
M N bits
Hash
h
256 bits
Solution: Merkle–Damgård Construction
Entire algorithm:
1. Pad input M to a multiple of 512 bits 2. Break into 512-bit blocks b0, b1, … bn-1
3. y0 = const (IV), y1 = h(y0,b0),
…,
yi = h(yi-1,bi-1)
4. Return yn
IV
b0
Fixed-length inputs/output
s
h h
M
Arbitrary length input
pad
b1 ……
bn-1 h
Fixed length output
H(M)
Merkle–Damgård Problem:
Length Extension Attacks
Given H(m), attacker can compute H(m || x)
IV
b0 h
for arbitrary x, without knowing m!
[How?]
Fixed-length inputs/output
s
b1 ……
bn-1 h
M
pad
h
Fixed length output
H(M)
Length Extension Attack
Given H(m), attacker can compute H(m || x) for arbitrary x, without knowing m!
IV
h
IV
b0 h b0
H(m) is essentially the internal state
of the hash function after hashing m. b
Can just feed H(m) as IV to compute b1 h 1 hash of H(m || x)
H(b0 || b1)
H(b0 || b1)
h xhxh
H(b0 || b1 || x)
H(b0 || b1 || x)
Other hash functions
MD5
Once ubiquitous, broken in 2004 Turns out to be easy to find collisions
(pairs of messages with same MD5 hash) You’ll investigate this in Project 1
SHA1
Deprecated in 2011, but still widely used
Collisions found in 2017:
Took 9,223,372,036,854,775,808 SHA1 computations to find (6,500+ CPU-years)
Don’t use!
SHA3
Different “sponge” construction Not susceptible to length-extension
Try hash functions yourself!
$ echo -n “Hello, World” | sha256sum 03675ac53ff9cd1535ccc7dfcdfa2c458c5218371f418dc136f2d19ac1fbe8a5 –
$ echo -n “Hello, World” | openssl dgst -sha3-256
(stdin)= 844af7bf6a5d414359dcd8845cb52d515397410e1668e00c8469ea8728c4ffe8 $ echo -n “Hello, Worle” | openssl dgst -sha3-256
(stdin)= 7cccfd7c35d3b321c85bce74564b8da936bcd9eed4877ad775f262f106b71f3d
Hash functions -> Integrity?
Can we use hash functions to provide integrity?
m, H(m)
Alice
Mallory Bob
Hash functions -> Integrity?
Can we use hash functions to provide integrity?
m, H(m)
Alice
Not directly: Mallory could still change m to m’, and compute H(m’) [Alternative?]
m’, H(m’)
Mallory
Bob
Keyed hash function:
Message Authentication Code (MAC)
Assume Alice and Bob have a
shared secret k Alice computes MAC
over the message m with her key k:
k
m, v
m’, v’?
k
v = MACk(m)
Mallory doesn’t know k, so cannot produce v’ = MACk(m’)
Mallory
Alice
Bob
Building a MAC from a hash function: HMAC
HMAC-SHA256(k, m) =
SHA256kc SHA256kc m
12
Not vulnerable to length extension!
XOR 0x3636
… Concatenation …
SHA256 function
takes arbitrary length input,
returns 256-bit output
0x5c5c
Using HMAC
$ echo “Hello, World” | openssl dgst -sha256 -hmac “NotVerySecret” (stdin)= a502137ae2ad88313fcb267747a0474f0286ae671a1e639d5448a82bc5efb44a
Tricky question: are hashes secure?
Tricky question: are hashes secure?
Answer: we don’t know!
Hashes have been broken in the past:
◦ MD5 introduced in 1992, first collision in 2004
◦ SHA1 introduced in 1995, first collision in 2017
◦ SHA2 introduced in 2001, no known collision …yet! ◦ SHA3 introduced in 2015, no known collision …yet!
We know collisions exist, but hope they are difficult to find [Why?]
MAC crypto game
Game against Mallory
1. Give Mallory MAC(K, mi) ∀ 𝑚𝑖 ∈ 𝑀
and 𝑀 (but not K!)
2. Mallory tries to discover
MAC(K, m′) for a new m′ ∉ 𝑀
3. If Mallory succeeds, MAC is insecure
Other uses for hashes/HMACs?