COMP2610/COMP6261
Tutorial 6 Sample Solutions
Tutorial 6: Source Coding
Young Lee and Bob Williamson
Tutors: Debashish Chakraborty and Zakaria Mhammedi
Week 8 (25th – 29th Sep), Semester 2, 2017
1. (a) We know that X is binomial with parameters n, p. Thus,
$$p(X \geq n-1) = p(X = n-1) + p(X = n) = n \cdot p^{n-1} \cdot (1-p) + p^n.$$
(b) By Markov's inequality,
$$p(X \geq n-1) \leq \frac{E[X]}{n-1} = \frac{n}{n-1} \cdot p.$$
(c) When n = 2,
$$p(X \geq 1) = 2 \cdot p \cdot (1-p) + p^2 = p \cdot (2-p).$$
The bound from Markov's inequality is
$$p(X \geq 1) \leq 2 \cdot p.$$
The difference between the two is
$$2 \cdot p - (2 \cdot p - p^2) = p^2.$$
Thus, for the Markov bound to be within 0.01 of the exact probability, we will need $p \leq 0.1$.
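A quick numerical check of the n = 2 case, as a minimal Python sketch:

```python
# Exact P(X >= 1) vs. the Markov bound 2p for X ~ Binomial(2, p).
for p in [0.05, 0.1, 0.2, 0.5]:
    exact = p * (2 - p)   # 2p(1-p) + p^2
    markov = 2 * p        # E[X]/(n-1) with n = 2
    print(f"p = {p}: exact = {exact:.4f}, bound = {markov:.4f}, gap = {markov - exact:.4f}")
```

The gap is exactly $p^2$, so it drops to 0.01 at $p = 0.1$.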
2. (a) We need to code all sequences of length 100 with three or fewer 1's. There is 1 sequence with zero 1's. There are 100 sequences with one 1 (one sequence for each possible position in which the 1 appears). Similarly, there are $\binom{100}{2}$ and $\binom{100}{3}$ sequences with two and three 1's respectively. In total we have
$$1 + 100 + \binom{100}{2} + \binom{100}{3} = 166{,}751.$$
If we want a uniform code over this number of elements, we need $\lceil \log_2(166{,}751) \rceil = 18$ bits.
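The count and code length can be verified with a couple of lines of Python:

```python
from math import comb, ceil, log2

# Number of length-100 binary sequences with at most three 1's.
total = sum(comb(100, k) for k in range(4))  # k = 0, 1, 2, 3
print(total, ceil(log2(total)))              # 166751, 18 bits
```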
(b) We need to find the probability of observing more than three 1's. Let K be the number of 1's observed. Then
$$P(K > 3) = 1 - P(K \leq 3) = 1 - P(K = 0) - P(K = 1) - P(K = 2) - P(K = 3)$$
$$= 1 - \left( 0.995^{100} + 100 \times 0.995^{99} \times 0.005 + \binom{100}{2} \times 0.995^{98} \times 0.005^2 + \binom{100}{3} \times 0.995^{97} \times 0.005^3 \right) = 0.00167.$$
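The same tail probability, computed directly in Python:

```python
from math import comb

# P(K > 3) for K ~ Binomial(100, 0.005), via the complement rule.
tail = 1 - sum(comb(100, k) * 0.005**k * 0.995**(100 - k) for k in range(4))
print(f"{tail:.5f}")  # 0.00167
```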
(c) The number of 1's, K, is a binomial random variable with probability of success 0.005. It has expectation E[K] = 100 × 0.005 = 0.5 and variance V[K] = 100 × 0.995 × 0.005 = 0.4975. By Chebyshev's inequality we have
$$P(K \geq 4) \leq P(|K - 0.5| \geq 3.5) = P(|K - E[K]| \geq 3.5) \leq \frac{V[K]}{3.5^2} = \frac{0.4975}{3.5^2} = 0.0406.$$
This is larger than the probability we found in part (b); the bound is off by a factor of roughly 24.
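In Python, comparing the bound against the exact value from part (b):

```python
# Chebyshev bound for P(K >= 4) when K ~ Binomial(100, 0.005).
n, p = 100, 0.005
var = n * p * (1 - p)   # V[K] = 0.4975
bound = var / 3.5**2    # P(|K - E[K]| >= 3.5) <= V[K] / 3.5^2
print(f"bound = {bound:.4f}, ratio to exact = {bound / 0.00167:.1f}")  # 0.0406, ~24.3
```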
3. (a) Using the definition of entropy,
$$H(X) = 0.4 \times \log_2 \frac{1}{0.4} + 0.6 \times \log_2 \frac{1}{0.6} = 0.971.$$
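A two-line check of the entropy value:

```python
from math import log2

# H(X) for P(X = 1) = 0.6, P(X = 0) = 0.4.
print(round(0.4 * log2(1 / 0.4) + 0.6 * log2(1 / 0.6), 3))  # 0.971
```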
(b) i. Recall that
$$T_{N\beta} := \left\{ x : \left| -\frac{1}{N} \log_2 P(x) - H(X) \right| < \beta \right\},$$
so we are looking for k (the number of ones) such that
$$0.871 = H(X) - 0.1 < -\frac{1}{N} \log_2 P(x) < H(X) + 0.1 = 1.071$$
for a sequence x containing k ones. Referring to the table, we see this is the case for values of k ranging from 11 to 19. That is, the typical set consists of sequences with between 11 and 19 1's.
ii. Summing the corresponding entries in the table (rows 11–19 in the second column), we see
$$P(x \in T_{N\beta}) = 0.043410 + 0.075967 + \ldots + 0.044203 = 0.936247.$$
iii. The number of elements in the typical set is
$$4457400 + 5200300 + \ldots + 177100 = 26{,}366{,}510.$$
iv. By definition, $S_\delta$ is the smallest set of sequences such that $P(S_\delta) \geq 1 - \delta$; here $1 - \delta = 0.1$, so we need total probability of at least 0.1. You always want to throw in the highest-probability elements in order to get the smallest set, and the highest-probability sequences are the ones with the most 1's. If we start adding up probabilities from the third column of the table, starting from the bottom (more 1's), we see that P(K ≥ 19) = 0.073564 while P(K ≥ 18) = 0.15355, so we need to add
$$\left\lceil \frac{0.1 - 0.073564}{0.079986} \times 480700 \right\rceil = 158{,}876$$
elements with 18 1's (here 0.079986 = P(K ≥ 18) − P(K ≥ 19) = P(K = 18), and 480,700 is the number of sequences with exactly 18 1's) to the set of sequences with more than 18 1's. This gives us a total of
$$1 + 25 + 300 + 2300 + 12650 + 53130 + 177100 + 158876 = 404{,}382$$
elements in $S_\delta$.
v. The essential bit content is simply $\log_2 |S_\delta| = \log_2(404{,}382) = 18.625$.
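A minimal Python sketch reproducing parts i–v, assuming (as the numbers above imply) a source with P(1) = 0.6, block length N = 25, β = 0.1, and target probability 1 − δ = 0.1:

```python
from math import comb, log2, ceil

N, p1, beta, target = 25, 0.6, 0.1, 0.1
H = 0.4 * log2(1 / 0.4) + 0.6 * log2(1 / 0.6)   # ~0.971 bits, from part (a)

def info(k):
    # -(1/N) log2 P(x) for a sequence x with k ones, P(x) = p1^k (1-p1)^(N-k)
    return -(k * log2(p1) + (N - k) * log2(1 - p1)) / N

def prob(k):
    # P(K = k): total probability of all sequences with exactly k ones
    return comb(N, k) * p1**k * (1 - p1)**(N - k)

# i. Typical set: all k with |info(k) - H| < beta.
typical = [k for k in range(N + 1) if abs(info(k) - H) < beta]
print(typical[0], typical[-1])                  # 11 19

# ii./iii. Probability and size of the typical set.
print(sum(prob(k) for k in typical))            # ~0.9362
print(sum(comb(N, k) for k in typical))         # 26366510

# iv./v. Smallest set with probability >= 0.1: all sequences with
# k >= 19, plus just enough of the k = 18 sequences to cross the target.
tail = sum(prob(k) for k in range(19, N + 1))   # ~0.0736
extra = ceil((target - tail) / prob(18) * comb(N, 18))
size = sum(comb(N, k) for k in range(19, N + 1)) + extra
print(size, log2(size))  # ~404,360 and ~18.63 (the table's rounding gives 404,382)
```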
4. (a) This is incorrect. Certainly it is true that $\frac{1}{N} H_\delta(X^N) \geq 0$, as the essential bit content is nonnegative. But this does not mean we can compress down to zero bits per outcome. If we were able to show that $\frac{1}{N} H_\delta(X^N) \leq \epsilon'$ for arbitrary $\epsilon' > 0$, on the other hand, we would be able to make such a statement. Of course, we cannot show such a thing, because we know that the asymptotic fraction converges to the entropy H(X).
(b) This is correct. The source coding theorem assumes that we are coding blocks created from extended
ensembles. By definition this involves performing independent trials. When there is dependence amongst
outcomes in a block, the theorem no longer holds.
5. (a) The code C1 is not prefix-free (0 is a prefix of 01), nor is it uniquely decodable (0110101 could decompose into 0 1101 01 or 01 10101). The code C2 is prefix-free and thus uniquely decodable. The code C3 is not uniquely decodable (00 can be 00 or 0 0) and clearly not prefix-free.
(b) C1′ = {0, 10, 1110, 11110}, C2′ = C2. For C3′ we need a code with lengths 1, 1, 2, and 2. However, this cannot be done, as the only codewords of length one are 0 and 1, and one of these will be a prefix of any other codeword.
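To make the prefix condition concrete, here is a minimal Python check of the prefix-free property, applied to C1′ above and to C1 (whose codewords 0, 01, 1101, and 10101 are implied by the decompositions in part (a)):

```python
def is_prefix_free(code):
    # True iff no codeword is a prefix of another: after lexicographic
    # sorting, any prefix relation shows up between adjacent words.
    words = sorted(code)
    return not any(b.startswith(a) for a, b in zip(words, words[1:]))

print(is_prefix_free(["0", "10", "1110", "11110"]))  # True:  C1' is prefix-free
print(is_prefix_free(["0", "01", "1101", "10101"]))  # False: 0 is a prefix of 01
```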
(c) No. The Kraft inequality tells us that for any prefix code
$$\sum_i 2^{-l_i} \leq 1,$$
but for the given code lengths
$$\sum_i 2^{-l_i} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + 3 \times \frac{1}{16} = \frac{17}{16} > 1.$$
(d) Yes, these lengths satisfy the Kraft inequality: $\sum_i 2^{-l_i} = \frac{1}{2} + 2 \times \frac{1}{8} + 3 \times \frac{1}{16} = \frac{15}{16} \leq 1$. For instance, C = {0, 100, 101, 1100, 1101, 1110}.
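Both Kraft sums can be checked exactly in Python with rational arithmetic:

```python
from fractions import Fraction

def kraft_sum(lengths):
    # Sum of 2^(-l) over the codeword lengths of a binary code.
    return sum(Fraction(1, 2**l) for l in lengths)

print(kraft_sum([1, 2, 3, 4, 4, 4]))  # 17/16 > 1: no prefix code, part (c)
print(kraft_sum(len(w) for w in ["0", "100", "101", "1100", "1101", "1110"]))  # 15/16, part (d)
```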