
IV. Problems
Probabilistic Graphical Models (PGMs) (40 points)
1. (3 points) Consider the following PGM of five random variables A, B, C, D, and E:
The joint distribution 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) according to a PGM can be decomposed as follows:

𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = ∏_{𝑋 ∈ {𝐴, 𝐵, 𝐶, 𝐷, 𝐸}} 𝑝(𝑋 | 𝑝𝑎𝑟𝑒𝑛𝑡𝑠(𝑋))
Decompose 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) for the PGM below in the above-mentioned form. Your answer should be an expression consisting solely of probabilities over one or more of the random variables A, B, C, D, E. For example, 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = 𝑝(𝐶)𝑝(𝐴|𝐶)𝑝(𝐵|𝐴, 𝐶)𝑝(𝐷|𝐴, 𝐵, 𝐶)𝑝(𝐸|𝐴).
2. (3 points) Draw the PGM of five random variables A, B, C, D, E from the joint distribution 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = 𝑝(𝐴)𝑝(𝐵)𝑝(𝐶)𝑝(𝐷|𝐴, 𝐵)𝑝(𝐸|𝐷, 𝐶).
3. (24 points) Consider the following PGM of three binary random variables (A, B, and C) and corresponding probability terms. Remember, each binary random variable has two outcomes.
For parts a–f (3 points each), compute the probabilities. Each answer must be a floating-point decimal. Show your work (your derivation/calculation) to receive full credit.
a) 𝑝(𝐵 = 𝑏)
b) 𝑝(𝐶 = ¬𝑐)

c) 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐)
d) 𝑝(𝐴 = 𝑎 | 𝐵 = 𝑏)
e) 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐 | 𝐵 = 𝑏)
f) 𝑝(𝐴 = 𝑎 | 𝐶 = ¬𝑐)
For parts g and h (3 points each), answer “Yes” or “No” and state your reason for full credit.
g) Is 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐) equal to 𝑝(𝐴 = 𝑎)𝑝(𝐶 = ¬𝑐)?
h) Is 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐 | 𝐵 = 𝑏) equal to 𝑝(𝐴 = 𝑎 | 𝐵 = 𝑏) 𝑝(𝐶 = ¬𝑐 | 𝐵 = 𝑏)?
Hints:
𝑝(𝐶 = ¬𝑐) = 𝑝(𝐶 = ¬𝑐, 𝐵 = 𝑏) + 𝑝(𝐶 = ¬𝑐, 𝐵 = ¬𝑏) = 𝑝(𝐶 = ¬𝑐 | 𝐵 = 𝑏)𝑝(𝐵 = 𝑏) + 𝑝(𝐶 = ¬𝑐 | 𝐵 = ¬𝑏)𝑝(𝐵 = ¬𝑏)
𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐) = 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐, 𝐵 = 𝑏) + 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐, 𝐵 = ¬𝑏)
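To see how the first hint is applied numerically, here is a minimal sketch; the probability values in it are hypothetical placeholders, not the values given in this problem's probability tables.

# Minimal sketch of the law of total probability from the hints above.
# All numbers are hypothetical placeholders, NOT the values from the problem.
p_b = 0.3                      # hypothetical p(B = b); p(B = ~b) = 1 - p_b
p_notc_given_b = 0.2           # hypothetical p(C = ~c | B = b)
p_notc_given_notb = 0.6        # hypothetical p(C = ~c | B = ~b)

# p(C = ~c) = p(C = ~c | B = b) p(B = b) + p(C = ~c | B = ~b) p(B = ~b)
p_notc = p_notc_given_b * p_b + p_notc_given_notb * (1 - p_b)
print(p_notc)                  # 0.2*0.3 + 0.6*0.7 = 0.48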
4. (10 points) The following is a PGM of five random variables A, B, C, D, and E.
For each part below (2 points each), answer “Yes” or “No” and state your reason for full credit.
a) Is 𝐴 ⊥ 𝐸 (i.e., 𝑝(𝐴, 𝐸) = 𝑝(𝐴)𝑝(𝐸))?
b) Is 𝐴 ⊥ 𝐷 | 𝐵 (i.e., 𝑝(𝐷|𝐴, 𝐵) = 𝑝(𝐷|𝐵))?
c) Is 𝐴 ⊥ 𝐸 | 𝐵 (i.e., 𝑝(𝐴, 𝐸|𝐵) = 𝑝(𝐴|𝐵)𝑝(𝐸|𝐵))?
d) Is 𝐴 ⊥ 𝐸 | 𝐵, 𝐶 (i.e., 𝑝(𝐴, 𝐸|𝐵, 𝐶) = 𝑝(𝐴|𝐵, 𝐶)𝑝(𝐸|𝐵, 𝐶))?
e) Is 𝐴 ⊥ 𝐶 | 𝐷 (i.e., 𝑝(𝐴, 𝐶|𝐷) = 𝑝(𝐴|𝐷)𝑝(𝐶|𝐷))?

K-means (15 points)
5. Consider the following one-dimensional dataset of 8 data instances:
Perform 10 iterations of the K-means algorithm given 𝐾 = 2 and the initializations below. 𝑐1(𝑡) and 𝑐2(𝑡) represent the 𝐾 = 2 centers at the end of the t-th iteration. Use squared Euclidean
distance (as we do in the slides).
Note, if no points are assigned to a center (e.g., if no points are assigned to 𝑐1(𝑡)), then 𝑐1(𝑡+1) = 𝑐1(𝑡).
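As a reference for the procedure just described, here is a minimal sketch of one possible 1-D K-means implementation with squared Euclidean distance and the empty-cluster rule; the data values are hypothetical placeholders, not the 8 instances in this problem, while the initial centers are the ones given below.

# Minimal sketch of 1-D K-means with K = 2, squared Euclidean distance,
# and the rule that an empty cluster keeps its previous center.
# The data values are hypothetical placeholders, NOT this problem's dataset.
data = [10, 12, 20, 35, 40, 42, 60, 65]
c1, c2 = 25.0, 50.0            # initial centers c1(0) and c2(0)

for t in range(10):            # 10 iterations, as the problem requests
    # Assign each point to the closer center (ties go to c1 here).
    cluster1 = [x for x in data if (x - c1) ** 2 <= (x - c2) ** 2]
    cluster2 = [x for x in data if (x - c1) ** 2 > (x - c2) ** 2]
    # Update each center to the mean of its assigned points, if any.
    c1 = sum(cluster1) / len(cluster1) if cluster1 else c1
    c2 = sum(cluster2) / len(cluster2) if cluster2 else c2

print(c1, c2)                  # c1(10) and c2(10) for the placeholder data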
Show your work to receive full credit.
Let 𝑐1(𝑡) = 25 and 𝑐2(𝑡) = 50 at 𝑡 = 0 (i.e., initialization). Compute 𝑐1(10) and 𝑐2(10).

Gaussian mixture models (15 points)
6. Consider the following one-dimensional dataset of 6 data instances:
You will follow steps to construct a Gaussian mixture model with two components (i.e., 1 & 2) to model the distribution of these 6 data points.
Note that to construct a Gaussian mixture model, you are to estimate:
• 𝜇1, 𝜎1, such that 𝑝(𝑋 = 𝑥|𝑌 = 1) = (1 / (𝜎1√2𝜋)) 𝑒𝑥𝑝(−(𝑥 − 𝜇1)² / (2𝜎1²))
• 𝜇2, 𝜎2, such that 𝑝(𝑋 = 𝑥|𝑌 = 2) = (1 / (𝜎2√2𝜋)) 𝑒𝑥𝑝(−(𝑥 − 𝜇2)² / (2𝜎2²))
• 𝜆, such that 𝑝(𝑌 = 1) = 𝜆 (and hence 𝑝(𝑌 = 2) = 1 − 𝜆),
so that together you have the GMM distribution
𝑝(𝑋 = 𝑥) = 𝑝(𝑌 = 1)𝑝(𝑋 = 𝑥|𝑌 = 1) + 𝑝(𝑌 = 2)𝑝(𝑋 = 𝑥|𝑌 = 2).
Let us denote by 𝜃 all the parameters {𝜇1, 𝜎1, 𝜇2, 𝜎2, 𝜆}.

Like K-means, GMM also needs initialization. Let
• 𝜇1(0) = 4, 𝜎1(0) = 4
• 𝜇2(0) = 10, 𝜎2(0) = 4
• 𝜆(0) = 0.5
be the initialization and let 𝜃(0) denote {𝜇1(0), 𝜎1(0), 𝜇2(0), 𝜎2(0), 𝜆(0)}. Here, "(0)" means the parameters at the 0-th iteration.
Show your work to receive full credit.
a) E-step: Compute 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) for each data point 𝑥𝑖 in {𝑥1, …, 𝑥6} (a numerical value for each).
𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) = 𝑝(𝑌 = 1, 𝑥𝑖; 𝜃(0)) / 𝑝(𝑥𝑖; 𝜃(0))
= [𝑝(𝑌 = 1; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 1; 𝜃(0))] / [𝑝(𝑌 = 1; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 1; 𝜃(0)) + 𝑝(𝑌 = 2; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 2; 𝜃(0))]
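A minimal sketch of this E-step computation, using the initialization 𝜃(0) given above; the data values in the sketch are hypothetical placeholders, not the 6 data points from this problem's dataset.

import math

# Minimal sketch of the E-step above. The data values are hypothetical
# placeholders, NOT the 6 instances given in the problem.
data = [1.0, 3.0, 5.0, 8.0, 11.0, 13.0]
mu1, sigma1, mu2, sigma2, lam = 4.0, 4.0, 10.0, 4.0, 0.5   # theta(0) from the problem

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# r[i] = p(Y = 1 | x_i; theta(0)) via Bayes' rule, exactly as in the fraction above
r = []
for x in data:
    num = lam * normal_pdf(x, mu1, sigma1)
    den = num + (1 - lam) * normal_pdf(x, mu2, sigma2)
    r.append(num / den)
print(r)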
b) M-step: Based on the values 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) for 𝑖 ∈ {1, …, 6} that you computed earlier, compute the updated parameters 𝜃(1) = {𝜇1(1), 𝜎1(1), 𝜇2(1), 𝜎2(1), 𝜆(1)}.
𝜇𝑘(1) = [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0)) × 𝑥𝑖] / [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0))]
𝜎𝑘(1) = √( [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0)) × (𝑥𝑖 − 𝜇𝑘(1))²] / [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0))] )
𝜆(1) = [∑𝑖 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0))] / 6
where 𝑘 ∈ {1, 2} is the index of each component.
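Continuing the E-step sketch above (it reuses that sketch's `data` and responsibilities `r`, so the values remain hypothetical placeholders), a minimal sketch of these M-step updates:

import math

# Minimal sketch of the M-step above, reusing `data` and `r` from the E-step sketch.
resp = {1: r, 2: [1 - ri for ri in r]}          # p(Y = k | x_i; theta(0)) for k = 1, 2

new_params = {}
for k in (1, 2):
    w = resp[k]
    total = sum(w)
    # Responsibility-weighted mean and standard deviation, as in the formulas above.
    mu_k = sum(wi * xi for wi, xi in zip(w, data)) / total
    sigma_k = math.sqrt(sum(wi * (xi - mu_k) ** 2 for wi, xi in zip(w, data)) / total)
    new_params[k] = (mu_k, sigma_k)

lam_new = sum(resp[1]) / len(data)              # lambda(1)
print(new_params, lam_new)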

Decision Trees (10 points)
7. The following is a training set with three binary attributes A, B, and C:
Example #   A       B       C       Class
1           True    False   False   False
2           True    False   True    False
3           False   True    False   False
4           True    True    True    True
5           True    True    False   True
Determine the decision tree using the majority rule. Show each step of the algorithm and the final tree for full credit.
K-nearest neighbor (10 points)
8. Use k-nearest neighbor classification on the following training set with two classes {+, -}:
Class   x1   x2
+       1    2
+       2    1
-       2    2
+       2    3
-       3    1
+       3    2
a) (2 points) Plot these six training points on an x1-x2 axis.
b) (4 points) Separate the classes {+, -} with a decision boundary on your plot in part a for k = 1. Fill in the region(s) that will be classified as the "-" class.
c) (4 points) Separate the classes {+, -} with a decision boundary on your plot in part a for k = 3. Fill in the region(s) that will be classified as the "-" class.
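For parts b and c, here is a minimal sketch of how a k-NN decision at a single point of the plane could be checked programmatically; it uses the six training points from the table above, and the query point is an arbitrary example chosen for illustration.

# Minimal sketch of k-nearest-neighbor voting with Euclidean distance,
# using the six training points from the table above.
train = [((1, 2), '+'), ((2, 1), '+'), ((2, 2), '-'),
         ((2, 3), '+'), ((3, 1), '-'), ((3, 2), '+')]

def knn_classify(query, k):
    # Sort training points by squared Euclidean distance to the query point.
    by_dist = sorted(train, key=lambda p: (p[0][0] - query[0]) ** 2 + (p[0][1] - query[1]) ** 2)
    labels = [label for _, label in by_dist[:k]]
    # Majority vote among the k nearest neighbors.
    return max(set(labels), key=labels.count)

print(knn_classify((2.2, 1.8), k=1))   # arbitrary query point, k = 1
print(knn_classify((2.2, 1.8), k=3))   # same query point, k = 3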

Support Vector Machine (10 Points)
9. Use a support vector machine to classify the following training set:
Class   x1   x2
+       1    1
+       2    2
+       2    0
-       0    0
-       1    0
-       0    1
a) (2 points) Plot these six training points on an x1-x2 axis.
b) (8 points) Separate the classes {+, -} with a maximum margin separator defined as the set of points {x : w ∙ x + b = 0}. Give your choice for vector w and intercept b as your answer.
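One way to sanity-check a hand-derived (w, b) is to fit an (approximately) hard-margin linear SVM numerically. This sketch assumes scikit-learn is available and uses the six training points from the table above; note the learned w and b may differ from a hand answer by an overall positive scale factor.

# Minimal sketch: fit a linear SVM with a very large C (approximating a hard
# margin) to the six points above, then print w and b for comparison.
# Assumes scikit-learn is installed.
from sklearn.svm import SVC

X = [[1, 1], [2, 2], [2, 0], [0, 0], [1, 0], [0, 1]]
y = [+1, +1, +1, -1, -1, -1]

clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)
print(clf.coef_, clf.intercept_)    # w and b for the separator w . x + b = 0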
Carmen submission
Submit this portion of Homework #4 (Problem Set) into the drop box for the problem set portion. You will submit the programming portion into a different drop box.
