COMP2610/COMP6261 – Information Theory
Tutorial 6: Source Coding
Young Lee and Bob Williamson
Tutors: Debashish Chakraborty and Zakaria Mhammedi
Week 8 (25th – 29th Sep), Semester 2, 2017
1. Probabilistic inequalities
Suppose a coin is tossed n times. The coin is known to land “heads” with probability p. The number of observed
“heads” is recorded as a random variable X .
(a) What is the exact probability of X being n− 1 or more?
(b) Using Markov’s inequality, compute a bound on the same probability as the previous part.
(c) Suppose n = 2. For what values of p will the bound from Markov’s inequality be within 1% of the exact
probability?
2. AEP and source coding (cf. Cover & Thomas, Problem 3.7)
A sequence of bits its generated by i.i.d. draws from an ensemble with probabilities p0 = 0.995 and p1 = 0.005.
Sequences are coded in 100-bit blocks. Every 100-bit block with at most three 1s is assigned a codeword. Those
blocks with more than three 1s are not assigned codewords.
(a) What is the minimum required length of the assigned codewords if they are all to be of the same length?
(b) Calculate the probability of observing a 100-bit block that has no associated codeword.
(c) (Harder) Use Chebyshev’s inequality to bound the probability of observing a 100-bit block for which no
codeword has been assigned. Compare the bound to the probability just calculated.
3. Typical Sets and Smallest δ-Sufficient Subsets (cf. Cover & Thomas, Problem 3.13)
Let XN be an extended ensemble for X with AX = {0, 1} and PX = {0.4, 0.6}.
(a) Calculate the entropy H(X).
(b) Let N = 25 and β = 0.1.
i. Which sequences inXN fall in the typical set TNβ? (You may find it helpful to refer to Table 1 below.)
ii. Compute P (x ∈ TNβ), the probability of a sequence from XN falling in the typical set.
iii. With reference to Table 1 below, how many elements are there in TNβ?
iv. How many elements are in the smallest δ-sufficient subset Sδ for δ = 0.9?
v. What is the essential bit content Hδ(XN ) for δ = 0.9?
1
4. Source Coding Theorem
Recall that the source coding theorem (for uniform codes) says that for any ensemble X:
(∀� > 0) (∀δ ∈ (0, 1)) (∃N0) (∀N > N0)
∣∣∣∣ 1NHδ(XN )−H(X)
∣∣∣∣ ≤ �.
(a) Near, an enthusiastic software developer, has just learned about the source coding theorem. He exclaims:
“The theorem allows us to pick any � > 0. So, if I pick � = H(X)− �′, I get that for sufficiently large N ,
1
N
Hδ(X
N ) ≥ �′.
This means that by making �′ tiny, I can get away with using virtually zero bits per outcome. Great!”.
Is Near’s reasoning correct? Explain why or why not.
(b) Mello, a skeptical econometrician, has also just learned about the source coding theorem. He complains:
“The thereom is not really relevant to me. I am interested in coding blocks of outcomes where each outcome
is dependent on the previous outcome, rather than them all being independent of each other. The source
coding theorem is not useful in this case.”
Is Mello’s reasoning correct? Explain why or why not.
5. Prefix Codes
Consider the codes C1 = {0, 01, 1101, 10101}, C2 = {00, 01, 100, 101}, and C3 = {0, 1, 00, 11}
(a) Are C1, C2, and C3 prefix codes? Are they uniquely decodable?
(b) Construct new prefix codes C ′1, C
′
2, and C
′
3 that have the same lengths as C1, C2, and C3, respectively. If
this is not possible, explain why.
(c) Is it possible to construct a prefix code that has codeword lengths {1, 2, 3, 4, 4, 4}? If so, construct one.
(d) Is it possible to construct a uniquely decodable code with lengths {1, 3, 3, 4, 4, 4}? If so, construct one.
2
k
(
N
k
) (
N
k
)
pk1p
N−k
0 −
1
N
log2 p(x)
0 1 0.000000 1.321928
1 25 0.000000 1.298530
2 300 0.000000 1.275131
3 2300 0.000001 1.251733
4 12650 0.000007 1.228334
5 53130 0.000045 1.204936
6 177100 0.000227 1.181537
7 480700 0.000925 1.158139
8 1081575 0.003121 1.134740
9 2042975 0.008843 1.111342
10 3268760 0.021222 1.087943
11 4457400 0.043410 1.064545
12 5200300 0.075967 1.041146
13 5200300 0.113950 1.017748
14 4457400 0.146507 0.994349
15 3268760 0.161158 0.970951
16 2042975 0.151086 0.947552
17 1081575 0.119980 0.924154
18 480700 0.079986 0.900755
19 177100 0.044203 0.877357
20 53130 0.019891 0.853958
21 12650 0.007104 0.830560
22 2300 0.001937 0.807161
23 300 0.000379 0.783763
24 25 0.000047 0.760364
25 1 0.000003 0.736966
Table 1: Table for Question 3. Column 1 shows k, the number of 1s in a block of length N = 25. Column 2 shows
the number of such blocks. Column 3 shows the probability p(x) =
(
N
k
)
pk1p
N−k
0 of drawing such a block x. Column
4 shows the Shannon information per symbol in x.
3