
17crypto_L02

Information Theory

Crypto & SecDev 2017 © Ron Poet: Lecture 2

Entropy

• The entropy of a set of messages is the number of bits needed to encode all possible messages with an optimal encoding.

• Let X = {Xi} be the set of all possible messages, with message Xi occurring with probability p(Xi).

• The entropy H is defined as
  - H(X) = Σ p(Xi) log2(1/p(Xi))
• Σ means sum over all messages Xi.
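
As a quick illustration (not part of the original slides), the definition translates directly into Python; the helper name entropy is our choice:

    import math

    def entropy(probs):
        # H(X) = sum over i of p(Xi) * log2(1 / p(Xi)), in bits.
        # Zero-probability messages contribute nothing to the sum.
        return sum(p * math.log2(1 / p) for p in probs if p > 0)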

A Note On Logs

• The log function is the inverse of the power function.

• If a = 2^b then b = log2(a).

• Special values
  - If b = 1 then a = 2, so log2(2) = 1.
  - If b = 0 then a = 1, so log2(1) = 0.

• Note: a^b × a^c = a^(b+c)

• Note: (a^b)^c = a^(bc)

• Note: 1/a^b = a^(-b)

• Logs base 2 are convenient for computing.
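
These identities can be checked numerically; a small sketch (the values here are powers of two, which are exact in floating point, so the equalities hold exactly):

    import math

    a, b, c = 2.0, 3.0, 4.0
    assert math.log2(2) == 1 and math.log2(1) == 0   # special values
    assert a**b * a**c == a**(b + c)                 # a^b * a^c = a^(b+c)
    assert (a**b)**c == a**(b * c)                   # (a^b)^c = a^(bc)
    assert 1 / a**b == a**(-b)                       # 1/a^b = a^(-b)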

Entropy Example

• Consider the Gender field in a database.
  - The two possible values, MALE and FEMALE, have equal probability.
  - p1 = p2 = 1/2

• H = 1/2 log2(2) + 1/2 log2(2) = 1/2 + 1/2 = 1

• Therefore 1 bit is needed to encode this information.
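
Checking this with a few lines of Python (a sketch of the formula above):

    import math

    probs = [1/2, 1/2]                  # MALE and FEMALE, equally likely
    H = sum(p * math.log2(1 / p) for p in probs)
    print(H)                            # 1.0 -> one bit per entry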

Non Equal Probabilities

• Now change the probabilities
  - p1 = 7/8
  - p2 = 1/8

• H = 7/8 log2(8/7) + 1/8 log2(8) ≈ 0.544

• Fewer bits are needed when one value is more likely than the other.

• We could group the values in tens and use 6 bits per group (10 × 0.544 ≈ 5.4 < 6).
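
The same calculation for the skewed distribution, including the grouping arithmetic:

    import math

    probs = [7/8, 1/8]
    H = sum(p * math.log2(1 / p) for p in probs)
    print(round(H, 4))                  # ~0.5436 bits per value
    print(math.ceil(10 * H))            # 10 values fit in 6 bits per group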

Null Values Allowed

• Now assume that the database allows NULL values for the Gender field.
  - The gender information is not provided in half of the entries.
  - p1 = p2 = 1/4
  - p3 = 1/2

• H = 1/4 log2(4) + 1/4 log2(4) + 1/2 log2(2) = 1/2 + 1/2 + 1/2 = 1.5
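
And with NULL as a third value (a sketch):

    import math

    probs = [1/4, 1/4, 1/2]             # MALE, FEMALE, NULL
    H = sum(p * math.log2(1 / p) for p in probs)
    print(H)                            # 1.5 bits per entry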

Another Example

• There are n messages, all equally likely.
  - pi = 1/n for every message Xi

• H = n × (1/n) log2(n) = log2(n)

• log2(n) is the number of bits needed to give each of the n messages a distinct binary code (exactly so when n is a power of 2).
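
A numeric check for equally likely messages (n = 256 chosen arbitrarily):

    import math

    n = 256
    H = sum((1/n) * math.log2(n) for _ in range(n))
    print(H, math.log2(n))              # both 8.0: H = log2(n)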

The Redundancy of a Language

• All ways of encoding data have some structure that makes them different from purely random collections.

• English is highly structured, with a lot of redundancy.

• Compressing a file of English text reduces the structure, but there is still some structure in the compression encoding table.

• Information theory provides quantitative estimates of the amount of redundancy.

• Code-breaking attacks that exploit this redundancy are called statistical attacks.

The Actual Rate of a Language

• The actual rate of a language (r) is defined as the average number of bits of information per character in the language.
  - It is essentially the entropy per character.

• We can calculate it by defining the rate per character for all N-character strings, rN, and making N large.

• We can define it in terms of entropy as:
  - rN = H(X) / N
  - where X is the set of all messages of length N using characters in the language.
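
A crude sketch of the idea: estimating the N = 1 rate from single-character frequencies of a sample string. Real estimates of r need much longer strings and far more text; this only gives an upper bound:

    import math
    from collections import Counter

    text = "the quick brown fox jumps over the lazy dog".replace(" ", "")
    counts = Counter(text)
    total = sum(counts.values())
    # r1 = H(single characters) / 1: entropy per character for N = 1.
    r1 = sum((c / total) * math.log2(total / c) for c in counts.values())
    print(round(r1, 2))                 # an upper bound on r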

The Actual Rate of a Language (2)

• As N increases, the rate decreases because there are fewer choices and certain choices are more likely.
  - This decrease tapers off quickly to a constant value.

• In English, rN is between 1.0 and 1.5 bits per character for large N.

• r is the value of rN when N is large.

The Absolute Rate of a Language

• The absolute rate (R) is defined as the maximum number of bits of information that could be encoded by the language, assuming all sequences of characters are equally likely.

• Let L be the number of characters in the alphabet.

• R = log2(L), from the equally-likely example earlier.
  - Note, this means L = 2^R.

• In English, R = log2(26) = 4.7
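
In code (a one-liner, really):

    import math

    L = 26                              # letters in the English alphabet
    R = math.log2(L)
    print(round(R, 2))                  # 4.7 bits per character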

The Redundancy of a Language

• The redundancy (D) of a language is defined as
  - D = R - r

• In English, D is between 3.2 and 3.7. We will use 3.2.

• This corresponds to 68% redundancy (3.2 / 4.7 ≈ 0.68).

• The redundancy comes from
  - Uneven single letter distribution;
  - Uneven digram (2 letter) frequency;
  - Uneven trigram (3 letter) frequency.

• By deleting vowels and double letters, mst ids cn b xprsd n fwr ltrs, bt th xprnc s mst nplsnt.
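
Putting the numbers together (using r = 1.5, the value adopted above):

    import math

    R = math.log2(26)                   # absolute rate, ~4.7
    r = 1.5                             # actual rate for English (upper end)
    D = R - r
    print(round(D, 2))                  # ~3.2 bits of redundancy per character
    print(f"{D / R:.0%}")               # ~68% redundancy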

Unicity Distance

• We are now in a position to ask an interesting question.

• If we decode some cipher text to produce plain text, how can we tell whether we have succeeded?

• If the text reads as English rather than gibberish then we are part of the way there.

• On the other hand, could we have produced a seemingly meaningful message by accident?

Unicity Distance (2)

• Assume we have an alphabet of L symbols.

• The number of possible messages of length N is
  - L^N = (2^R)^N, since L = 2^R
  - = 2^(RN)

• The number of meaningful messages can be expressed in terms of the rate of the language:
  - 2^(rN)

Unicity Distance (3)

• If we assume that all messages are equally likely, then the probability of getting a meaningful message by chance is
  - (number of apparently meaningful messages) / (number of possible messages)
  - = 2^(rN) / 2^(RN)
  - = 2^((r - R)N)
  - = 2^(-DN)
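
This probability falls off very fast with N; a sketch using D = 3.2 for English:

    D = 3.2                             # redundancy of English, from earlier
    for N in (1, 5, 10, 20):
        # Chance that a random N-character string looks meaningful.
        print(N, 2 ** (-D * N))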

The Keyspace

• Try all possible keys and keep those that look right.
  - An exhaustive search or brute force attack.

• Let K be the set of all possible keys.

• Now the number of keys is 2^H(K).
  - We must use the entropy of the keyspace, since the chosen key values may exhibit some redundancy.

• So the expected number of wrong keys is
  - 2^H(K) - 1, approximately 2^H(K)

• where we assume there are a lot of possible keys, all but one of which is wrong.

The Keyspace (2)

• The expected number of false solutions that look like they might be right is
  - (number of wrong keys) × (probability of a chance meaningful message)
  - = 2^H(K) × 2^(-DN)
  - = 2^(H(K) - DN)

• This is a rapidly decreasing function of N.

• We can define the value of N at the crossover point, where the exponent = 0, as the unicity distance Nu.
  - H(K) - D × Nu = 0, so Nu = H(K) / D
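
A sketch of the crossover, using the DES figures from the example on the next slide (H(K) = 56, D = 6.5):

    HK, D = 56, 6.5                     # key entropy and redundancy
    Nu = HK / D
    print(round(Nu, 1))                 # unicity distance, ~8.6 characters
    for N in (5, 9, 20):
        # Expected number of false but meaningful-looking solutions.
        print(N, 2 ** (HK - D * N))     # drops below 1 once N > Nu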

Unicity Distance (4)

• If N > Nu then the chance of getting a false positive is negligible, since the number of correct looking false messages is much less than 1.

• On the other hand, if N < Nu then many of the correct looking messages will be false.

• Nu is roughly the number of characters needed to unambiguously break the code and determine the key.

DES Example

• The DES algorithm has a key length of 56 bits. If the keys are chosen at random then the entropy H(K) is 56.

• Assume an English language document is encrypted and that one byte is used for each letter.

• Thus R = 8 and hence D = R - r = 8 - 1.5 = 6.5.

• Hence Nu = H(K) / D = 56 / 6.5 = 8.6

• So we need to decode about 9 letters to make sure that a message that looks like English is the real message.

Removing Redundancy

• Reducing the value of D increases the unicity distance, and so makes it harder to break the code.
  - More characters are needed before an unambiguous solution is found.

• If the plain text were nearly random, so that D is very small, then the real message would be hard to tell apart from actual random text.

The Random Assumption

• The preceding model assumes a random encryption system.

• In particular, there is no relationship between:
  - similar keys and the cipher text;
  - similar plain text and the cipher text.

• The unicity distance is less if the encryption is non random.

Confusion and Diffusion

• This random concept is usually expressed in the two terms: confusion and diffusion.

• An encryption system has good confusion if changing one bit in the key changes roughly half the bits of the cipher text.

• An encryption system has good diffusion if changing one bit in the plain text changes roughly half the bits of the cipher text.

Usefulness of Unicity

• The unicity distance does not provide a way of breaking the code.

• It just indicates how much information we need to gather before we are able to attempt to break it.

• It is based on the rate for a language, a statistical property, which is less valid for very short character strings.
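
To make the confusion/diffusion test from the slide above concrete, here is a rough avalanche check. SHA-256 stands in for a "random-looking" function; it is not an encryption scheme from the lecture, just an illustration of counting how many output bits flip:

    import hashlib

    def bits_changed(a: bytes, b: bytes) -> int:
        # Hash both inputs and count differing output bits.
        da = hashlib.sha256(a).digest()
        db = hashlib.sha256(b).digest()
        return sum(bin(x ^ y).count("1") for x, y in zip(da, db))

    msg = b"attack at dawn"
    flipped = bytearray(msg)
    flipped[0] ^= 0x01                  # flip a single input bit
    print(bits_changed(msg, bytes(flipped)), "of 256 output bits changed")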