程序代写代做代考 algorithm database chain C graph game CS306: Introduction to IT Security Fall 2020

CS306: Introduction to IT Security Fall 2020
Labs 5 & 6: More on Hashing
Instructor: Nikos Triandopoulos October 8 & 15, 2020

2
6.5 Hash functions

Cryptographic hash functions
Basic cryptographic primitive
u maps “objects” to a fixed-length binary strings
u core security property: mapping avoids collisions
input
arbitrarily
output
long string
u collision: distinct objects (x ≠ y) are mapped to the same hash value (H(x) = H(y))
short digest, fingerprint, “secure”
H
description
u although collisions necessarily exist, they are infeasible to find
Important role in modern cryptography
u lie between symmetric- and asymmetric-key cryptography
u capture different security properties of “idealized random functions” u qualitative stronger assumption than PRF
77

Hash & compression functions
Map messages to short digests
u a general hash function H() maps
u a message of an arbitrary length to a l(n)-bit string
input
arbitrarily long string
H
output
l(n)-bit string
u a compression (hash) function h() maps
u a long binary string to a shorter binary string
u an l’(n)-bit string to a l(n)-bit string, with l’(n) > l(n)
input
l’(n)-bit string
h
output
l(n)-bit string
4

Collision resistance (CR)
Attacker wins the game if x ≠ x’ & H(x) = H(x’)
description of H
x, x’
H is collision-resistant if any PPT A wins the game only negligibly often. 5
A
Hash function H
T

Weaker security notions
Given a hash function H: X ® Y, then we say that H is
u preimage resistant (or one-way)
u if given y Î Y, finding a value x Î X s.t. H(x) = y happens negligibly often
u 2-nd preimage resistant (or weak collision resistant)
u ifgivenauniformxÎX,findingavaluex’ÎX,s.t.x’1xandH(x’)=H(x)
happens negligibly often
u cf. collision resistant (or strong collision resistant)
u if finding two distinct values x’, x Î X, s.t. H(x’) = H(x) happens negligibly often 6

7
6.6 Design framework

Domain extension via the Merkle-Damgård transform
General design pattern for cryptographic hash functions
u reduces CR of general hash functions to CR of compression functions
input
H
u thus, in practice, it suffices to realize a collision-resistant compression function h
input
output
arbitrarily long string
output
≤ l’(n)-bit string
l(n)-bit string
l(n)-bit string
h
u compressing by 1 single bit is a least as hard as compressing by any number of bits!
8

Merkle-Damgård transform: Design
Suppose that h: {0,1}2n→ {0,1}n is a collision-resistant compression function Consider the general hash function H: M = {x : |x|<2n} → {0,1}n, defined as Merkle-Damgård design u H(x) is computed by applying h() in a “chained” manner over n-bit message blocks u pad x to define a number, say B, message blocks x1, ..., xB, with |xi| = n u set extra, final, message block xB+1 as an n-bit encoding L of |x| u starting by initial digest z0 = IV = 0n, output H(x) = zB+1, where zi = hs(zi-1||xi) 9 Merkle-Damgård transform: Security If the compression function h is CR, then the derived hash function H is also CR! 10 Compression function design: The Davies-Meyer scheme Employs PRF w/ key length m & block length n u define h: {0,1}n+m → {0,1}n as h(x||k) = Fk(x) XOR x Security u hisCR,ifFisanidealcipher 11 Well known hash functions u MD5 (designed in 1991) u output 128 bits, collision resistance completely broken by researchers in 2004 u today (controlled) collisions can be found in less than a minute on a desktop PC u SHA1 – the Secure Hash Algorithm (series of algorithms standardized by NIST) u output 160 bits, considered insecure for collision resistance u broken in 2017 by researchers at CWI u SHA2 (SHA-224, SHA-256, SHA-384, SHA-512) u outputs 224, 256, 384, and 512 bits, respectively, no real security concerns yet u based on Merkle-Damgård + Davies-Meyer generic transforms u SHA3(Kessac) u completely new philosophy (sponge construction + unkeyed permutations) 12 SHA-2-512 overview 13 Current hash standards 14 15 6.7 Generic attacks Generic attacks against cryptographic hashing Assume a CR compression function h : {0,1}l’(n) → {0,1}l(n) u brute-force attack u for each string x in the domain u compute and record hash value h(x) u if h(x) equals a previously recorded hash h(y) (i.e., x ≠ y but h(x)=h(y)), halt and output collision on x ≠ y u birthday attack u surprisingly, a more efficient generic attack exists! 16 Birthday paradox “In any group of 23 people (or more), it is more likely (than not) that at least two individuals have their birthday on the same day” u based on probabilistic analysis of a random “balls-into-bins” experiment: “k balls are each, independently and randomly, thrown into one out of m bins” u captures likelihood that event E = “two balls land into the same bin” occurs u analysis shows: Pr[E] » 1 - e-k(k-1)/2m (1) u ifPr[E]=1/2,Eq.(1)givesk»1.17m1⁄2 u thus,form=365,kisaround23(!) u assuming a uniform birth distribution 17 Pr[E] m=365 k Birthday attack Applies “birthday paradox” against cryptographic hashing u exploits the likelihood of finding collisions for hash function h using a randomized search, rather than an exhausting search uanalogy u k balls: distinct messages chosen to hash u m bins: number of possible hash values u independent & random throwing u how is this achieved? u message selection, hash mapping Mk ...M4 M 3 hash function h M2 M1 bin1 bin2 ... 18 binm Probabilistic analysis Experiment u k balls are each, independently and randomly, thrown into one out of m bins Analysis u the probability that the i-th ball lands in an empty bin is: 1 - (i - 1)/m u the probability Fk that after k throws, no balls land in the same bin is: Fk =(1-1/m)(1-2/m)(1-3/m)...(1-(k-1)/m) u by the standard approximation 1 - x » e-x: Fk » e-(1/m + 2/m + 3/m + ... + (k-1)/m) = e-k(k-1)/2m u thus, two balls land in same bin with probability Pr[E] = 1 - Fk = 1 - e-k(k-1)/2m u lower bound – Pr[E] increases if the bin-selection distribution is not uniform 19 What birthday attacks mean in practice... u # hash evaluations for finding collisions on n-bit digests with probability p nm u for large m = 2n, average # hash evaluations before finding the first collision is 1.25(m)1/2 20 Overall Assume a CR function h producing hash values of size n u brute-force attack u evaluate h on 2n + 1 distinct inputs u by the “pigeon hole” principle, at least 1 collision will be found u birthday attack u evaluate h on (much) fewer distinct inputs that hash to random values u by “balls-into-bins” probabilistic analysis, at least 1 collision will likely be found u when hashing only half distinct inputs, it’s more likely to find a collision! u thus, in order to get n-bit security, we (at least) need hash values of length 2n 21 22 6.8 Applications of hashing to cryptography Hash functions enable efficient MAC design! Back to problem of designing secure MAC for messages of arbitrary lengths u so far, we have seen two solutions u block-based “tagging” u based on PRFs u inefficient u CBC-MAC u also based on PRFs u more efficient r||1||δ||m1 r||2||δ||m1 r||d||δ||m1 Fk Fk ... Fk t=r,t1, t2, td 23 [1] Hash-and-MAC: Design Generic method for designing secure MAC for messages of arbitrary lengths u based on CR hashing and any fix-length secure MAC m m H H(m) Mac m t u new MAC (Gen’, Mac’, Vrf’) as the name suggests u Gen’: instantiate H and Mack with key k u Mac’: hash message m into h = H(m), output Mack-tag t on h u Vrf’: canonical verification Mac’ hash h = H(m) Mac 24 t [1] Hash-and-MAC: Security The Hash-and-MAC construction is a secure as long as u Hiscollisionresistant;and u theunderlyingMACissecure Intuition u since H is CR: authenticating digest H(m) is a good as authenticating m itself! m Mac’ hash h = H(m) Mac 25 t [2] Hash-based MAC u so far, MACs are based on block ciphers u can we construct a MAC based on CR hashing? 26 [2] A naïve, insecure, approach Set tag t as: Mack(m) = H(k||m) u intuition: given H(k||m) it should be infeasible to compute H(k||m’), m’ ≠ m Insecure construction u practical CR hash functions employ the Merkle-Damgård design u length-extension attack u knowledge of H(m1) makes it feasible to compute H(m1||m2) u by knowing the length of m1, one can learn internal state zB even without knowing m1! 27 [2] HMAC: Secure design Set tag t as: HMACk[m] = H[ (kÅopad) || H[ (kÅipad) || m ] ] u intuition: instantiation of hash & sign paradigm u two layers of hashing H u upper layer u y = H( (k Å ipad) || m ) u y = H’(m), i.e., “hash” u lower layer ut=H((kÅopad) ||y’) u t = Mac’(kout, y’), i.e., “sign” 28 [2] HMAC: Security If used with a secure hash function and according to specs, HMAC is secure u no practical attacks are known against HMAC 29 30 6.9 Applications to security Recall: Weaker security notions Given a hash function H: X ® Y, then we say that H is u preimage resistant (or one-way) u if given y Î Y, finding a value x Î X s.t. H(x) = y happens negligibly often u 2-nd preimage resistant (or weak collision resistant) u ifgivenauniformxÎX,findingavaluex’ÎX,s.t.x’1xandH(x’)=H(x) happens negligibly often u cf. collision resistant (or strong collision resistant) u if finding two distinct values x’, x Î X, s.t. H(x’) = H(x) happens negligibly often 31 Generally: Message digests Short secure description of data primarily used to detect changes M Hash function Encrypted for authe o n e way “ n ti c ity compression” Message digest 32 Application 1: Secure cloud storage u Bob has files f1, f2,...,fn u Bob sends to a cloud storage provider u the hashes h(f1||r), h(f2||r),..., h(fn||r) u files f1, f2,...,fn u Bob stores locally randomness r and keeps it secret u Every time Bob reads a file fi, he also reads h(fi||r) and verifies the integrity of fi u Any problems with writes? 33 Application 2: Fairness (I) Suppose Alice, Bob, Charlie are bidders in an online auction u Alice plans to bid A, Bob B and Charlie C u they do not trust that bids will be secret u nobody is willing to submit their bid u solution u Alice, Bob, Charlie submit hashes h(A), h(B), h(C) of their bids u all received hashes are posted online u then parties’ bids A, B and C revealed u analysis u “hiding:” hashes do not reveal bids u “binding:” cannot change bid after hash sent 34 (which property?) (which property?) Application 2: Fairness (II) A general issue with concealing private data via hashing u due to the small search space, this protocol is not secure! u a forward search attack is possible u e.g., Bob computes h(A) for the most likely bids A u how to prevent this? u increase search space u e.g., Alice computes h(A||R), where R is randomly chosen u at the end, Alice must reveal A and R u but before he chooses B, Bob cannot try all A and R combination 35 Application 2: Digital envelops Commitment schemes u two operations u commit(x, r) = C u i.e., put message x into an envelop (using randomness r) u e.g., commit(x, r) = h(x || r) u hiding property: you cannot see through an (opaque) envelop u open(C, m, r) = ACCEPT or REJECT u i.e., open envelop (using r) to check that it has not been tampered with u e.g., open(C, m, r): check if h(x || r) =? C u binding property: you cannot change the contents of a sealed envelop 36 Application 2: Security properties Hiding: perfect opaqueness u similar to indistinguishability; commitment reveals nothing about message u adversary selects two messages x1, x2 which he gives to challenger u challenger randomly selects bit b, computes (randomness and) commitment Ci of xi u challenger gives Cb to adversary, who wins if he can find bit b (better than guessing) Binding: perfect sealing u similar to unforgeability; cannot find a commitment “collision” u adversary selects two distinct messages x1, x2 and two corresponding values r1, r2 u adversary wins if commit(x1, r1) = commit(x2, r2) 37 Example 2: Fair decision via coin flipping Alice is to “call” the coin flip and Bob is to flip the coin u to decide who will do the dishes... u problem: Alice may change her mind, Bob may skew the result u protocol u Alice "calls" the coin flip but only tells Bob a commitment to her call u Bob flips the coin and reports the result u Alice reveals what she committed to u Bob verifies that Alice's call matches her commitment u If Alice's revelation matches the coin result Bob reported, Alice wins u hiding: Bob does not get any advantage by seeing Alice commitment u binding: Alice cannot change her mind after the coin is flipped 38 Application 3: Forward-secure key rotation Alice and Bob secretly communicate using symmetric encryption u Eve intercepts their messages and later breaks into Bob’s machine to steal the shared key Alice key k s1 = k Bob key k h s2 h s3 h ...sk ✗ h sk+1 leakage 39 key Application 4: Hash values as file identifiers Consider a cryptographic hash function H applied on a file F u the hash (or digest) H(M) of F serves as a unique identifier for F u “uniqueness” u if another file F’ has the same identifier, this contradicts the security of H u thus u the hash H(F) of F is like a fingerprint u one can check whether two files are equal by comparing their digests Many real-life applications employ this simple idea! 40 Examples 4.1 Virus fingerprinting u When you perform a virus scan over your computer, the virus scanner application tries to identify and block or quarantine programs or files that contain viruses u This search is primarily based on comparing the digest of your files against a database of the digests of already known viruses u The same technique is used for confirming that is safe to download an application or open an email attachment 4.2 Peer-to-peer file sharing u In distributed file-sharing applications (e.g., systems allowing users to contribute contents that are shared amongst each other), both shared files and participating peer nodes (e.g., their IP addresses) are uniquely mapped into identifiers in a hash range u When a given file is added in the system it is consistently stored at peer nodes that are responsible to store files those digests fall in a certain sub-range u When a user looks up a file, routing tables (storing values in the hash range) are used to eventually locate one of the machines storing the searched file 41 Example 4.3: Data deduplication Goal: Elimination of duplicate data u Consider a cloud provider, e.g., Gmail or Dropbox, storing data from numerous users. u A vast majority of stored data are duplicates; e.g., think of how many users store the same email attachments, or a popular video... u Huge cost savings result from deduplication: u a provider stores identical contents possessed by different users once! u this is completely transparent to end users! Idea: Check redundancy via hashing u Files can be reliably checked whether they are duplicates by comparing their digests. u When a user is ready to upload a new file to the cloud, the file’s digest is first uploaded. u The provider checks to find a possible duplicate, in which case a pointer to this file is added. u Otherwise, the file is being uploaded literally u This approach saves both storage and bandwidth! 42 Example 4.4: Password hashing Goal: User authentication Problem: How to protect password files u Today, passwords are the dominant means for u user authentication, i.e., the process of verifying the identity of a user (requesting access to some computing resource). If password are stored at the server in the clear, an attacker can steal the password file after breaking into the authentication server – this type of attack happens routinely nowadays... Password hashing involved having the server storing the hashes of the users passwords. Thus, even if a password file leaks to an attacker, the onewayness of the used hash function can guarantee some protections against user- impersonation simply by providing the stolen password for a victim user. u This is a “something you know” type of user authentication, assuming that only the legitimate user knows the correct password. u When you provide your password to a computer system (e.g., to a server through a web interface), the system checks if your submitted password matches the password that was initially stored in the system at setup. u u 43 Example 4.4: Password storage Plaintext Concealed via hashing 44 Application 5: Hash-and-digitally-sign (looking ahead) Very often digital signatures are used with hash functions u the hash of a message is signed, instead of the message itself Signing message M u let h be a cryptographic hash function, assume RSA setting (n, d, e) u compute signature σ = h(M)d mod n u send σ, M Verifying signature σ u use public key (e,n) u computeΗ=σe modn u if H = h(M) output ACCEPT, else output REJECΤ 45 46 6.10 Database-as-a- service authentication model Securing your cloud storage optimally – setup server user files 2. upload files 3. send h & hashing scheme d files 1. pre-process files using CR hash function h files = F = (F1, F2, ..., F7, F8) digest d is computed over all files |d| << |F| 47 Securing your cloud storage optimally – data authentication 1. give me file F1 2. here it is, F1’ goals security: Step 4 is reliable test efficiency: u disassmallaspossible u proofisassmallaspossible u verificationisasfastaspossible server files + 3. “proof” d 4. verification F = (F1, F2, ..., F7, F8)(or helper information) user has u authenticdigestd(locallystored) u fileF1’(tobechecked) u proof(tohelpcheckingintegrity) “is F1’ intact?” verification involves user 48 u combinefileF1’withtheproofto re-compute candidate digest d’ u checkifd’=d u ifyes,thenF1isintact;otherwise tampering is detected! Hashing the files: Individually or as a whole? server user files F = (F1, F2, ..., Fn) 1. give me file F1 2. here it is, F1’ + 3. “proof” (or helper information) d 4. verification goals security: Step 4 is reliable test efficiency: u disassmallaspossible u proof is as small as possible “isF1’intact?”u verificationisasfastaspossible lethi =h(Fi),1≤i≤n d = (h1, h2,..., hn) Vs. d = h(h1||h2||...||hn) Vs. d = h(F1||F2||...||Fn) 49 Hashing the files: In a chain or in a partition? 1. give me file F1 2. here it is, F1’ goals security: Step 4 is reliable test efficiency: u disassmallaspossible u proof is as small as possible server user files + 3. “proof” (or helper information) d 4. verification “isF1’intact?”u verificationisasfastaspossible lethi =h(Fi),1≤i≤n F = (F1, F2, ..., Fn) d=hn,hi =h(Fi||hi−1),12:
u hashing k files in a hash chain is preferable to hashing files as a whole
u hash chains over k objects (hash values/files) result in unbalanced verification costs
u e.g., proof size/verification time are sensitive to a file’s position in the hash chain u hashing file subsets individually across a balanced partition of the files:
u offers trade-offs between space and verification costs u allows for hierarchical hashing schemes
51

Hashing via the Merkle tree
server
user
files
2. upload files
3. send h & hashing scheme
d
files = F = (F1, F2, …, F7, F8)
digest d is computed over all files |d| << |F| h: collision resistant iterative hash function (e.g., SHA-2) 1. pre-process files using CR hash function h F1 F2 dd ... F7 F8 F1 F2 ... F7 F8 52 The Merkle tree files 1. give me file F1 2. here it is, F1’ server user + 3. “proof” (or helper information) d 4. verification F = (F1, F2, ..., F8) h(h12 ||h34 )= h14 h h58 =h(h56 ||h78 ) h(h ||h )= h h56 h 1 2 12 34 78 d=h(h14 ||h48 ) h1 h2 ... h7 h8 lethi =h(Fi),1≤i≤n 53 Generalization: DB-as-a-service authentication model source server query answer user “is answer correct?” DB DB cloud-based DB portals, hubs 54 Generalization: DB-as-a-service authentication model source server query answer user “is answer correct?” DB DB Internet protocols (DNS, OCSP) 55 Generalization: DB-as-a-service authentication model source digest server user query answer DB file hosting 56 DB-as-a-service authentication model source server DB DB query answer user “is answer correct?” Integrity checks offer authenticated queries guarantees correct answer, as if coming directly from the source! 57 DB-as-a-service authentication model malicious source DB DB query answer + proof user “is answer correct?” verification server Integrity checks offer authenticated queries u crypto-based: harden data/computations to provide verifiable answers u reliable: allow no false positives or negatives crypto does its work! 58 DB-as-a-service authentication model malicious source DB DB query answer + proof user “is answer correct?” verification server Integrity checks offer authenticated queries u crypto-based: harden data/computations to provide verifiable answers u reliable: allow no false positives or negatives u utility-preserving: practical & easy to adopt minimize all overheads u fast times for query, verification u near total storage, low bandwidth usage for proof 59 DB-as-a-service authentication model source DB update malicious server DB query user “is answer correct?” answer + proof verification Integrity checks offer authenticated queries u crypto-based: harden data/computations to provide verifiable answers u reliable: allow no false positives or negatives u utility-preserving: practical & easy to adopt u fast times for query, verification , update u near total storage, low bandwidth usage for proof , update 60 Intro to authenticated querying (AQ101) δ server DB query answer user source DB compute compact & secure digest δ of DB 61 Intro to authenticated querying (AQ101) δ query user answer + proof source DB compute compact & secure digest δ of DB DB compute proof that links answer to digest server 62 Intro to authenticated querying (AQ101) Assumption: user possesses authentic δ verification use proof to securely link answer to authentic digest δ query answer DB + compute proof that proof links answer to digest source server user DB compute compact & secure digest δ of DB 63 Intro to authenticated querying (AQ101) Assumption: user possesses authentic δ easily removed using PKI, digital signatures verification use proof to securely link answer to authentic digest δ query answer DB + compute proof that proof links answer to digest source server user DB compute compact & secure digest δ of DB 64 Intro to authenticated querying (AQ101) δ query answer DB + compute proof that proof links answer to digest source server DB compute compact & secure digest δ of DB (given δ) verification use proof to securely link answer to authentic digest Using digital signatures to provide authentic digest δ 65 user Example: Set membership – digest computation δ server is 3 in DB? yes source { 2, 3, 6, 7} = DB DB δ=h(h12 ||h34 ) h34=h(6||7) compute compact & secure digest δ of DB Merkletreehash h(2||3)= h12 h: collision resistant terative hash function (e.g., SHA-2) user 2367 66 Example: Set membership – proof computation δ is 3 in DB? yes + proof source server user {2,3,6,7}= DB compute compact & secure digest δ of DB Merkle tree hash h12 DB δ h34 compute proof that links answer to digest 2367 67 Example: Set membership – proof verification δ is 3 in DB? yes DB + compute proof that links answer to digest h34 verification (given δ) use proof to securely link answer to authentic digest i.e.: recompute h12 recompute δ compare against given δ source server user {2,3,6,7}= DB compute compact & secure digest δ of DB Merkle tree hash h12 proof δ 2367 68