2020/8/14 COMP9444 Exercise 3 Solutions
COMP9444 Neural Networks and Deep Learning Term 2, 2020
Solutions to Exercises 3: Probability
This page was last updated: 06/15/2020 11:40:18
1. Bayes’ Rule
One bag contains 2 red balls and 3 white balls. Another bag contains 3 red balls and 2 green balls. One of these bags is chosen at random, and two balls are drawn randomly from that bag, without replacement. Both of the balls turn out to be red. What is the probability that the first bag is the one that was chosen?
Let B = the first bag is chosen, R = both balls are red. Then
P(R | B) = (2/5)*(1/4) = 1/10
P(R | ¬B) = (3/5)*(2/4) = 3/10
P(R) = P(B)*P(R | B) + P(¬B)*P(R | ¬B) = (1/2)*(1/10) + (1/2)*(3/10) = 1/5
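The quantities in this solution can be checked by brute-force enumeration of ordered draws. The following is a minimal sketch using exact fractions; the helper name `prob_both_red` and the list encoding of the bags are illustrative assumptions, not part of the original exercise.

```python
from fractions import Fraction
from itertools import permutations


def prob_both_red(bag):
    """Probability that two balls drawn without replacement are both red."""
    # permutations() treats balls as distinct by position, so this
    # enumerates every ordered draw of two different balls.
    draws = list(permutations(bag, 2))
    hits = sum(1 for first, second in draws if first == "R" and second == "R")
    return Fraction(hits, len(draws))


bag1 = ["R"] * 2 + ["W"] * 3   # 2 red, 3 white
bag2 = ["R"] * 3 + ["G"] * 2   # 3 red, 2 green
prior = Fraction(1, 2)         # each bag is equally likely to be chosen

like1 = prob_both_red(bag1)    # P(R | B)
like2 = prob_both_red(bag2)    # P(R | not B)
marginal = prior * like1 + (1 - prior) * like2   # P(R)
posterior = prior * like1 / marginal             # P(B | R) by Bayes' rule

print(like1, like2, marginal, posterior)  # 1/10 3/10 1/5 1/4
```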
P(B | R) = P(R | B)*P(B) / P(R) = (1/10)*(1/2)/(1/5) = 1/4
2. Entropy and Kullback-Leibler Divergence
Consider these two probability distributions on the same space Ω = {A, B, C, D}:
p = ⟨ 1⁄2, 1⁄4, 1⁄8, 1⁄8 ⟩
q = ⟨ 1⁄4, 1⁄8, 1⁄8, 1⁄2 ⟩
a. Construct a Huffman tree for each distribution p and q
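As a rough sketch of the construction, the standard Huffman procedure repeatedly merges the two least probable nodes, and each merge adds one bit to the code of every symbol underneath. The function name `huffman_code_lengths` and the dictionary encoding of the distributions are assumptions made for illustration; only the code lengths (tree depths) are computed, not the tree drawing itself.

```python
import heapq
from itertools import count


def huffman_code_lengths(dist):
    """Return {symbol: code length in bits} for a dict of symbol probabilities."""
    tie = count()  # unique tie-breaker so the heap never compares the dicts
    heap = [(prob, next(tie), {sym: 0}) for sym, prob in dist.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable nodes; every symbol below moves one level deeper.
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]


p = {"A": 1/2, "B": 1/4, "C": 1/8, "D": 1/8}
q = {"A": 1/4, "B": 1/8, "C": 1/8, "D": 1/2}

lengths_p = huffman_code_lengths(p)
lengths_q = huffman_code_lengths(q)
print(sorted(lengths_p.items()))  # [('A', 1), ('B', 2), ('C', 3), ('D', 3)]
print(sorted(lengths_q.items()))  # [('A', 2), ('B', 3), ('C', 3), ('D', 1)]
```

Because every probability here is a power of 1⁄2, each code length equals -log2 of the symbol's probability.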
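b. Compute the entropy H(p)
Since q is a permutation of p, the two entropies are equal. With logarithms to base 2 (so the result is in bits):
H(p) = H(q) = 1⁄2(-log 1⁄2) + 1⁄4(-log 1⁄4) + 1⁄8(-log 1⁄8) + 1⁄8(-log 1⁄8) = 1⁄2(1) + 1⁄4(2) + 1⁄8(3) + 1⁄8(3) = 1.75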
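The entropy value can be checked numerically. This is a minimal sketch assuming base-2 logarithms and a plain list encoding of each distribution; the function name `entropy` is illustrative.

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a list of probabilities."""
    # Zero-probability outcomes contribute nothing, so they are skipped.
    return sum(-prob * math.log2(prob) for prob in dist if prob > 0)


p = [1/2, 1/4, 1/8, 1/8]
q = [1/4, 1/8, 1/8, 1/2]

print(entropy(p), entropy(q))  # 1.75 1.75
```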
c. Compute the KL-Divergence in each direction, D_KL(p || q) and D_KL(q || p)
D_KL(p || q) = 1⁄2(2-1) + 1⁄4(3-2) + 1⁄8(3-3) + 1⁄8(1-3) = 0.5
https://www.cse.unsw.edu.au/~cs9444/20T2/tut/sol/Ex3_Probability_sol.html
D_KL(q || p) = 1⁄4(1-2) + 1⁄8(2-3) + 1⁄8(3-3) + 1⁄2(3-1) = 0.625
Which one is larger? Why?
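D_KL(q || p) is larger, mainly because the frequency of D has increased from 1⁄8 (under p) to 1⁄2 (under q). The code built for p spends 3 bits on D where the optimal code for q would spend 1, so it incurs a cost of 3-1=2 additional bits every time D occurs (which is often).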
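The per-symbol contributions make this comparison concrete. The sketch below assumes base-2 logarithms and the symbol order A, B, C, D; the function name `kl_terms` is illustrative.

```python
import math


def kl_terms(a, b):
    """Per-symbol contributions a_i * log2(a_i / b_i) to D_KL(a || b)."""
    return [ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0]


p = [1/2, 1/4, 1/8, 1/8]   # order: A, B, C, D
q = [1/4, 1/8, 1/8, 1/2]

terms_pq = kl_terms(p, q)
terms_qp = kl_terms(q, p)

print(terms_pq, sum(terms_pq))  # [0.5, 0.25, 0.0, -0.25] 0.5
print(terms_qp, sum(terms_qp))  # [-0.25, -0.125, 0.0, 1.0] 0.625
```

In the second direction the last term (symbol D) contributes a full bit on its own, which outweighs the small savings on A and B.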