程序代写代做代考 scheme information theory COMP2610 / 6261 — Information Theory – Lecture 13: Symbol Codes for Lossless Compression

COMP2610 / 6261 — Information Theory – Lecture 13: Symbol Codes for Lossless Compression

COMP2610 / 6261 — Information Theory
Lecture 13: Symbol Codes for Lossless Compression

Robert C. Williamson

Research School of Computer Science
The Australian National University

1 L O G O U S E G U I D E L I N E S T H E A U S T R A L I A N N A T I O N A L U N I V E R S I T Y

ANU Logo Use Guidelines

Deep Gold
C30 M50 Y70 K40

PMS Metallic 8620

PMS 463

Black
C0 M0 Y0 K100

PMS Process Black

Preferred logo Black version

Reverse version
Any application of the ANU logo on a coloured
background is subject to approval by the Marketing
Office, contact

brand@anu.edu.au

The ANU logo is a contemporary
reflection of our heritage.
It clearly presents our name,
our shield and our motto:

First to learn the nature of things.
To preserve the authenticity of our brand identity, there are
rules that govern how our logo is used.

Preferred logo – horizontal logo
The preferred logo should be used on a white background.
This version includes black text with the crest in Deep Gold in
either PMS or CMYK.

Black
Where colour printing is not available, the black logo can
be used on a white background.

Reverse
The logo can be used white reversed out of a black
background, or occasionally a neutral dark background.

17 September, 2018

1 / 25

Last time

Proof of the source coding theorem

Foundational theorem, but impractical
Requires potentially very large block sizes

The theorem also only considers uniform coding schemes

Could variable length coding help?
Does entropy turn up for such codes as well?

2 / 25

This time

Variable-length codes

Prefix codes

Kraft’s inequality

3 / 25

1 Variable-Length Codes
Unique Decodeability
Prefix Codes

2 The Kraft Inequality

3 Summary

4 / 25

1 Variable-Length Codes
Unique Decodeability
Prefix Codes

2 The Kraft Inequality

3 Summary

5 / 25

Codes: A Review

Notation:

If A is a finite set then AN is the set of all strings of length N.
A+ =


N A

N is the set of all finite strings

Examples:

{0, 1}3 = {000, 001, 010, 011, 100, 101, 110, 111}
{0, 1}+ = {0, 1, 00, 01, 10, 11, 000, 001, 010, . . .}

Binary Symbol Code
Let X be an ensemble with AX = {a1, . . . , aI}.
A function c : AX → {0, 1}+ is a code for X .

The binary string c(x) is the codeword for x ∈ AX
The length of the codeword for for x is denoted `(x).
Shorthand: `i = `(ai) for i = 1 . . . , I.
The extension of c assigns codewords to any sequence x1x2 . . . xN
from A+ by c(x1 . . . xN) = c(x1) . . . c(xN)

6 / 25

Codes: A Review

Notation:

If A is a finite set then AN is the set of all strings of length N.
A+ =


N A

N is the set of all finite strings

Examples:

{0, 1}3 = {000, 001, 010, 011, 100, 101, 110, 111}
{0, 1}+ = {0, 1, 00, 01, 10, 11, 000, 001, 010, . . .}

Binary Symbol Code
Let X be an ensemble with AX = {a1, . . . , aI}.
A function c : AX → {0, 1}+ is a code for X .

The binary string c(x) is the codeword for x ∈ AX

The length of the codeword for for x is denoted `(x).
Shorthand: `i = `(ai) for i = 1 . . . , I.
The extension of c assigns codewords to any sequence x1x2 . . . xN
from A+ by c(x1 . . . xN) = c(x1) . . . c(xN)

6 / 25

Codes: A Review

Notation:

If A is a finite set then AN is the set of all strings of length N.
A+ =


N A

N is the set of all finite strings

Examples:

{0, 1}3 = {000, 001, 010, 011, 100, 101, 110, 111}
{0, 1}+ = {0, 1, 00, 01, 10, 11, 000, 001, 010, . . .}

Binary Symbol Code
Let X be an ensemble with AX = {a1, . . . , aI}.
A function c : AX → {0, 1}+ is a code for X .

The binary string c(x) is the codeword for x ∈ AX
The length of the codeword for for x is denoted `(x).
Shorthand: `i = `(ai) for i = 1 . . . , I.

The extension of c assigns codewords to any sequence x1x2 . . . xN
from A+ by c(x1 . . . xN) = c(x1) . . . c(xN)

6 / 25

Codes: A Review

Notation:

If A is a finite set then AN is the set of all strings of length N.
A+ =


N A

N is the set of all finite strings

Examples:

{0, 1}3 = {000, 001, 010, 011, 100, 101, 110, 111}
{0, 1}+ = {0, 1, 00, 01, 10, 11, 000, 001, 010, . . .}

Binary Symbol Code
Let X be an ensemble with AX = {a1, . . . , aI}.
A function c : AX → {0, 1}+ is a code for X .

The binary string c(x) is the codeword for x ∈ AX
The length of the codeword for for x is denoted `(x).
Shorthand: `i = `(ai) for i = 1 . . . , I.
The extension of c assigns codewords to any sequence x1x2 . . . xN
from A+ by c(x1 . . . xN) = c(x1) . . . c(xN)

6 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000

Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}

All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4

The extension of c maps aba ∈ A3X ⊂ A
+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111

Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}

In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3

The extension of c maps aba ∈ A3X ⊂ A
+
X to 0100

7 / 25

Codes: A Review
Examples

X is an ensemble with AX = {a, b, c, d}

Example 1 (Uniform Code):

Let c(a) = 0001, c(b) = 0010, c(c) = 0100, c(d) = 1000
Shorthand: C1 = {0001, 0010, 0100, 1000}
All codewords have length 4. That is, `1 = `2 = `3 = `4 = 4
The extension of c maps aba ∈ A3X ⊂ A

+
X to 000100100001

Example 2 (Variable-Length Code):

Let c(a) = 0, c(b) = 10, c(c) = 110, c(d) = 111
Shorthand: C2 = {0, 10, 110, 111}
In this case `1 = 1, `2 = 2, `3 = `4 = 3
The extension of c maps aba ∈ A3X ⊂ A

+
X to 0100

7 / 25

Unique Decodeability

Recall that a code is lossless if for all x , y ∈ AX

x 6= y =⇒ c(x) 6= c(y)

This ensures that if we work with a single outcome, we can uniquely
decode the outcome

When working with variable-length codes, it will be convenient to also
require the following:

Uniquely Decodable

A code c for X is uniquely decodable if no two strings from A+X have the
same codeword. That is, for all x, y ∈ A+X

x 6= y =⇒ c(x) 6= c(y)

This ensures that if we work with a sequence of outcomes, we can still
uniquely decode the individual elements

8 / 25

Unique Decodeability

Recall that a code is lossless if for all x , y ∈ AX

x 6= y =⇒ c(x) 6= c(y)

This ensures that if we work with a single outcome, we can uniquely
decode the outcome

When working with variable-length codes, it will be convenient to also
require the following:

Uniquely Decodable

A code c for X is uniquely decodable if no two strings from A+X have the
same codeword. That is, for all x, y ∈ A+X

x 6= y =⇒ c(x) 6= c(y)

This ensures that if we work with a sequence of outcomes, we can still
uniquely decode the individual elements

8 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable

C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless

I Lossless ; Uniquely decodable
C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable

I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable
I We can easily segment a given code string scanning left to right

I e.g. 0110010→ 0, 110, 0, 10

9 / 25

Examples of uniquely decodable codes

Examples:
C1 = {0001, 0010, 0100, 1000} is uniquely decodable

I Uniform + Lossless⇒ Uniquely decodable
C2 = {1, 10, 110, 111} is not uniquely decodable because

c(aaa) = c(d) = 111 and c(ab) = c(c) = 110

I The code is of course lossless
I Lossless ; Uniquely decodable

C3 = {0, 10, 110, 111} is uniquely decodable
I We can easily segment a given code string scanning left to right
I e.g. 0110010→ 0, 110, 0, 10

9 / 25

“Self-punctuating” property

The code C3 = {0, 10, 110, 111} has a “self-punctuating” property

Trivial to segment a given code string into individual codewords

Keep scanning until we match a codeword
Once matched, add new segment boundary, and proceed to rest of
string

Once our current segment matches a codeword, no ambiguity to resolve

Why? No codeword is a prefix of any other

Not true for every uniquely decodable code, e.g. C4 = {0, 01, 011}
First bit 0→ no certainty what the symbol is

10 / 25

“Self-punctuating” property

The code C3 = {0, 10, 110, 111} has a “self-punctuating” property

Trivial to segment a given code string into individual codewords

Keep scanning until we match a codeword
Once matched, add new segment boundary, and proceed to rest of
string

Once our current segment matches a codeword, no ambiguity to resolve

Why? No codeword is a prefix of any other

Not true for every uniquely decodable code, e.g. C4 = {0, 01, 011}
First bit 0→ no certainty what the symbol is

10 / 25

“Self-punctuating” property

The code C3 = {0, 10, 110, 111} has a “self-punctuating” property

Trivial to segment a given code string into individual codewords

Keep scanning until we match a codeword
Once matched, add new segment boundary, and proceed to rest of
string

Once our current segment matches a codeword, no ambiguity to resolve

Why? No codeword is a prefix of any other

Not true for every uniquely decodable code, e.g. C4 = {0, 01, 011}
First bit 0→ no certainty what the symbol is

10 / 25

Prefix Codes
a.k.a prefix-free or instantaneous codes

A simple property of codes guarantees unique decodeability

Prefix property

A codeword c ∈ {0, 1}+ is said to be a prefix of another codeword
c′ ∈ {0, 1}+ if there exists a string t ∈ {0, 1}+ such that c′ = ct.

Can you create c′ by gluing something to the end of c?

Example: 01101 has prefixes 0, 01, 011, 0110.

Prefix Codes
A code C = {c1, . . . , cI} is a prefix code if for every codeword ci ∈ C
there is no prefix of ci in C.

In a stream, no confusing one codeword with another

11 / 25

Prefix Codes
a.k.a prefix-free or instantaneous codes

A simple property of codes guarantees unique decodeability

Prefix property

A codeword c ∈ {0, 1}+ is said to be a prefix of another codeword
c′ ∈ {0, 1}+ if there exists a string t ∈ {0, 1}+ such that c′ = ct.

Can you create c′ by gluing something to the end of c?

Example: 01101 has prefixes 0, 01, 011, 0110.

Prefix Codes
A code C = {c1, . . . , cI} is a prefix code if for every codeword ci ∈ C
there is no prefix of ci in C.

In a stream, no confusing one codeword with another

11 / 25

Prefix Codes
a.k.a prefix-free or instantaneous codes

A simple property of codes guarantees unique decodeability

Prefix property

A codeword c ∈ {0, 1}+ is said to be a prefix of another codeword
c′ ∈ {0, 1}+ if there exists a string t ∈ {0, 1}+ such that c′ = ct.

Can you create c′ by gluing something to the end of c?

Example: 01101 has prefixes 0, 01, 011, 0110.

Prefix Codes
A code C = {c1, . . . , cI} is a prefix code if for every codeword ci ∈ C
there is no prefix of ci in C.

In a stream, no confusing one codeword with another

11 / 25

Prefix Codes: Examples

Examples:

C1 = {0001, 0010, 0100, 1000} is prefix-free

C2 = {0, 10, 110, 111} is prefix-free

C′2 = {1, 10, 110, 111} is not prefix free since c3 = 110 = c1c2

C′′2 = {1, 01, 110, 111} is not prefix free since c3 = 110 = c110

12 / 25

Prefix Codes as Trees

C1 = {0001, 0010, 0100, 1000}

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et

13 / 25

Prefix Codes as Trees

C2 = {0, 10, 110, 111}

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et

13 / 25

Prefix Codes as Trees

C′2 = {1, 10, 110, 111}

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et

Each codeword choice eliminates its descendants 13 / 25

Prefix Codes are Uniquely Decodeable

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et If `
∗ = max{`1, . . . , `I} then symbol is

decodeable after seeing at most `∗

bits
Consider C2 = {0, 10, 110, 111}

I If c(x) = 0 . . . then x1 = a
I If c(x) = 1 . . . then x1 ∈ {b, c, d}
I If c(x) = 10 . . . then x1 = b
I If c(x) = 11 . . . then x1 ∈ {c, d}

14 / 25

Uniquely Decodeable Codes are Not Always Prefix
Codes

A uniquely decodeable code is not necessarily a prefix code

Example: C1 = {0, 01, 011}
00 . . .→ first codeword
010 . . .→ second codeword
011 . . .→ third codeword

Example: C2 = {0, 01, 011, 111}
This is the reverse of the prefix code C′2 = {0, 10, 110, 111}

15 / 25

Relating various types of codes

All codes

Lossless

Uniquely
decodable

Prefix

Note that e.g.

Prefix =⇒ Uniquely Decodable

but

Not Prefix 6=⇒ Not Uniquely Decodable

16 / 25

Why prefix codes?

While prefix codes do not represent all uniquely decodable codes, they
are convenient to work with

It will be easy to generate prefix codes (Huffman coding, next lecture)

Further, we can quickly establish if a given code is not prefix

Testing for unique decodability is non-trivial in general

17 / 25

1 Variable-Length Codes
Unique Decodeability
Prefix Codes

2 The Kraft Inequality

3 Summary

18 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}

— C1 = {0001, 0010, 0100, 1000}

L2 = {1, 2, 3, 3}

— C2 = {0, 10, 110, 111}

L3 = {2, 2, 3, 4, 4}

— C3 = {

00

,

01

,

100

,

1010

,

1011

}

L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}

— C2 = {0, 10, 110, 111}

L3 = {2, 2, 3, 4, 4}

— C3 = {

00

,

01

,

100

,

1010

,

1011

}

L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}

— C3 = {

00

,

01

,

100

,

1010

,

1011

}

L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00,

01

,

100

,

1010

,

1011

}
L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00, 01,

100

,

1010

,

1011

}
L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00, 01, 100,

1010

,

1011

}
L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00, 01, 100, 1010,

1011

}
L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00, 01, 100, 1010, 1011}
L4 = {1, 3, 3, 3, 3, 4}

— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

Lengths and Trees

Suppose someone said “I want prefix codes with codewords lengths”:
L1 = {4, 4, 4, 4}— C1 = {0001, 0010, 0100, 1000}
L2 = {1, 2, 3, 3}— C2 = {0, 10, 110, 111}
L3 = {2, 2, 3, 4, 4}— C3 = {00, 01, 100, 1010, 1011}
L4 = {1, 3, 3, 3, 3, 4}— Impossible!

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et
19 / 25

The Kraft Inequality
a.k.a. The Kraft-McMillan Inequality

Kraft Inequality
For any prefix (binary) code C, its codeword lengths {`1, . . . , `I} satisfy

I∑
i=1

2−`i ≤ 1 (1)

Conversely, if the set {`1, . . . , `I} satisfy (1) then there exists a prefix code
C with those codeword lengths.

Examples:

1 C1 = {0001, 0010, 0100, 1000} is prefix and
∑4

i=1 2
−4 = 14 ≤ 1

2 C2 = {0, 10, 110, 111} is prefix and
∑4

i=1 2
−`i = 12 +

1
4 +

2
8 = 1

3 Lengths {1, 2, 2, 3} give
∑4

i=1 2
−`i = 12 +

2
4 +

1
8 > 1 so no prefix code

20 / 25

The Kraft Inequality
a.k.a. The Kraft-McMillan Inequality

Kraft Inequality
For any prefix (binary) code C, its codeword lengths {`1, . . . , `I} satisfy

I∑
i=1

2−`i ≤ 1 (1)

Conversely, if the set {`1, . . . , `I} satisfy (1) then there exists a prefix code
C with those codeword lengths.

Examples:
1 C1 = {0001, 0010, 0100, 1000} is prefix and

∑4
i=1 2

−4 = 14 ≤ 1

2 C2 = {0, 10, 110, 111} is prefix and
∑4

i=1 2
−`i = 12 +

1
4 +

2
8 = 1

3 Lengths {1, 2, 2, 3} give
∑4

i=1 2
−`i = 12 +

2
4 +

1
8 > 1 so no prefix code

20 / 25

The Kraft Inequality
a.k.a. The Kraft-McMillan Inequality

Kraft Inequality
For any prefix (binary) code C, its codeword lengths {`1, . . . , `I} satisfy

I∑
i=1

2−`i ≤ 1 (1)

Conversely, if the set {`1, . . . , `I} satisfy (1) then there exists a prefix code
C with those codeword lengths.

Examples:
1 C1 = {0001, 0010, 0100, 1000} is prefix and

∑4
i=1 2

−4 = 14 ≤ 1
2 C2 = {0, 10, 110, 111} is prefix and

∑4
i=1 2

−`i = 12 +
1
4 +

2
8 = 1

3 Lengths {1, 2, 2, 3} give
∑4

i=1 2
−`i = 12 +

2
4 +

1
8 > 1 so no prefix code

20 / 25

The Kraft Inequality
a.k.a. The Kraft-McMillan Inequality

Kraft Inequality
For any prefix (binary) code C, its codeword lengths {`1, . . . , `I} satisfy

I∑
i=1

2−`i ≤ 1 (1)

Conversely, if the set {`1, . . . , `I} satisfy (1) then there exists a prefix code
C with those codeword lengths.

Examples:
1 C1 = {0001, 0010, 0100, 1000} is prefix and

∑4
i=1 2

−4 = 14 ≤ 1
2 C2 = {0, 10, 110, 111} is prefix and

∑4
i=1 2

−`i = 12 +
1
4 +

2
8 = 1

3 Lengths {1, 2, 2, 3} give
∑4

i=1 2
−`i = 12 +

2
4 +

1
8 > 1 so no prefix code

20 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}

4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}

8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}

In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i

≤ 2`

excluded `∗-bit codewords.

But there are only 2`

possible `∗-bit codewords

21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i ≤ 2`

excluded `∗-bit codewords. But there are only 2`

possible `∗-bit codewords
21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

1
2`∗

I∑
i=1

2`
∗−`i ≤ 1

excluded `∗-bit codewords. But there are only 2`

possible `∗-bit codewords
21 / 25

Prefixes Exclude Codes

We are constrained when constructing prefix codes, as selecting a codeword
eliminates a whole subtree
Choosing a prefix codeword of length 1 — e.g., c(a) = 0 — excludes:

1111

1110
1101

1100
1011
1010

1001
1000
0111

0110
0101
0100

0011

0010
0001
0000

111

110

101

100

011

010

001

000

11

10

01

00

0

1

T
he

to
ta

l s
ym

bo
l c

od
e

bu
dg

et 2 x 2-bit codewords: {00, 01}
4 x 3-bit codewords: {000, 001, 010, 011}
8 x 4-bit codewords: {0000, 0001, . . . , 0111}
In general, an `-bit codeword excludes

2k−` x k -bit codewords

For lengths L = {`1, . . . , `I} and `∗ = max{`1, . . . , `I}, there will be

I∑
i=1

2−`i ≤ 1

excluded `∗-bit codewords. But there are only 2`

possible `∗-bit codewords
21 / 25

Kraft inequality: other direction

Suppose we are given lengths satisfying

I∑
i=1

2−`i ≤ 1

Then, we can construct a code by:

Picking the first (remaining) node at depth `1, and using it as the first
codeword

Removing all descendants of the node (to ensure the prefix condition)

Picking the next (remaining) node at depth `2, and using it as the
second codeword

Removing all descendants of the node (to ensure the prefix condition)

22 / 25

Kraft inequality: comments

Kraft’s inequality actually holds more generally for uniquely decodable
codes

Harder to prove

Note that if a given code has lengths that satisfy

I∑
i=1

2−`i ≤ 1

it does not mean the given code necessarily is prefix

Just that we can construct a prefix code with these lengths

23 / 25

Summary

Key ideas from this lecture:

Prefix and Uniquely Decodeable variable-length codes

Prefix codes are tree-like

Every Prefix code is Uniquely Decodeable but not vice versa

The Kraft Inequality:
I Code lengths satisfying


i 2
−`i ≤ 1 implies Prefix/U.D. code exists

I Prefix/U.D. code implies

i 2
−`i ≤ 1

Relevant Reading Material:

MacKay: §5.1 and §5.2

Cover & Thomas: §5.1, §5.2, and §5.5

24 / 25

Next time

Bound on expected length for a prefix code

Shannon codes

Huffman coding

25 / 25

Variable-Length Codes
Unique Decodeability
Prefix Codes

The Kraft Inequality
Summary