
IV. Problems
Probabilistic Graphical Models (PGMs) (40 points)
1. (3 points) Consider the following PGM of five random variables A, B, C, D, and E:
The joint distribution 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) according to a PGM can be decomposed as follows:

𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = ∏_{𝑋 ∈ {𝐴, 𝐵, 𝐶, 𝐷, 𝐸}} 𝑝(𝑋 | 𝑝𝑎𝑟𝑒𝑛𝑡𝑠(𝑋))
Decompose 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) for the PGM below in the above-mentioned form. Your answer should be an expression consisting solely of probabilities over one or more of the random variables A, B, C, D, E. For example, 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = 𝑝(𝐶)𝑝(𝐴|𝐶)𝑝(𝐵|𝐴, 𝐶)𝑝(𝐷|𝐴, 𝐵, 𝐶)𝑝(𝐸|𝐴).
2. (3 points) Draw the PGM of five random variables A, B, C, D, E from the joint distribution 𝑝(𝐴, 𝐵, 𝐶, 𝐷, 𝐸) = 𝑝(𝐴)𝑝(𝐵)𝑝(𝐶)𝑝(𝐷|𝐴, 𝐵)𝑝(𝐸|𝐷, 𝐶).
3. (24 points) Consider the following PGM of three binary random variables (A, B, and C) and corresponding probability terms. Remember, each binary random variable has two outcomes.
For parts a–f (3 points each), compute the probabilities. Each answer must be a floating-point decimal. Show your work (your derivation/calculation) to receive full credit.
a) 𝑝(𝐵 = 𝑏)
b) 𝑝(𝐶 = ¬𝑐)

c) 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐)
d) 𝑝(𝐴 = 𝑎 | 𝐵 = 𝑏)
e) 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐 | 𝐵 = 𝑏)
f) 𝑝(𝐴 = 𝑎 | 𝐶 = ¬𝑐)
For parts g and h (3 points each), answer “Yes” or “No” and state your reason for full credit.
g) Is 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐) equal to 𝑝(𝐴 = 𝑎)𝑝(𝐶 = ¬𝑐)?
h) Is 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐 | 𝐵 = 𝑏) equal to 𝑝(𝐴 = 𝑎 | 𝐵 = 𝑏) 𝑝(𝐶 = ¬𝑐 | 𝐵 = 𝑏)?
Hints:
𝑝(𝐶 = ¬𝑐) = 𝑝(𝐶 = ¬𝑐, 𝐵 = 𝑏) + 𝑝(𝐶 = ¬𝑐, 𝐵 = ¬𝑏) = 𝑝(𝐶 = ¬𝑐 | 𝐵 = 𝑏)𝑝(𝐵 = 𝑏) + 𝑝(𝐶 = ¬𝑐 | 𝐵 = ¬𝑏)𝑝(𝐵 = ¬𝑏)
𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐) = 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐, 𝐵 = 𝑏) + 𝑝(𝐴 = 𝑎, 𝐶 = ¬𝑐, 𝐵 = ¬𝑏)
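To see how the first hint is applied numerically, here is a minimal sketch; the probability values in it are hypothetical placeholders, not the values given in this problem's probability tables.

# Minimal sketch of the law of total probability from the hints above.
# All numbers are hypothetical placeholders, NOT the values from the problem.
p_b = 0.3                      # hypothetical p(B = b); p(B = ~b) = 1 - p_b
p_notc_given_b = 0.2           # hypothetical p(C = ~c | B = b)
p_notc_given_notb = 0.6        # hypothetical p(C = ~c | B = ~b)

# p(C = ~c) = p(C = ~c | B = b) p(B = b) + p(C = ~c | B = ~b) p(B = ~b)
p_notc = p_notc_given_b * p_b + p_notc_given_notb * (1 - p_b)
print(p_notc)                  # 0.2*0.3 + 0.6*0.7 = 0.48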
4. (10 points) The following is a PGM of five random variables A, B, C, D, and E.
For each part below (2 points each), answer “Yes” or “No” and state your reason for full credit.
a) Is 𝐴 ⊥ 𝐸 (i.e., 𝑝(𝐴, 𝐸) = 𝑝(𝐴)𝑝(𝐸))?
b) Is 𝐴 ⊥ 𝐷 | 𝐵 (i.e., 𝑝(𝐷|𝐴, 𝐵) = 𝑝(𝐷|𝐵))?
c) Is 𝐴 ⊥ 𝐸 | 𝐵 (i.e., 𝑝(𝐴, 𝐸|𝐵) = 𝑝(𝐴|𝐵)𝑝(𝐸|𝐵))?
d) Is 𝐴 ⊥ 𝐸 | 𝐵, 𝐶 (i.e., 𝑝(𝐴, 𝐸|𝐵, 𝐶) = 𝑝(𝐴|𝐵, 𝐶)𝑝(𝐸|𝐵, 𝐶))?
e) Is 𝐴 ⊥ 𝐶 | 𝐷 (i.e., 𝑝(𝐴, 𝐶|𝐷) = 𝑝(𝐴|𝐷)𝑝(𝐶|𝐷))?

K-means (15 points)
5. Consider the following one-dimensional dataset of 8 data instances:
Perform 10 iterations of the K-means algorithm given 𝐾 = 2 and the initializations below. 𝑐1(𝑡) and 𝑐2(𝑡) represent the 𝐾 = 2 centers at the end of the t-th iteration. Use squared Euclidean
distance (as we do in the slides).
Note, if no points are assigned to a center (e.g., if no points are assigned to 𝑐1(𝑡)), then 𝑐1(𝑡+1) = 𝑐1(𝑡).
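As a reference for the procedure just described, here is a minimal sketch of one possible 1-D K-means implementation with squared Euclidean distance and the empty-cluster rule; the data values are hypothetical placeholders, not the 8 instances in this problem, while the initial centers are the ones given below.

# Minimal sketch of 1-D K-means with K = 2, squared Euclidean distance,
# and the rule that an empty cluster keeps its previous center.
# The data values are hypothetical placeholders, NOT this problem's dataset.
data = [10, 12, 20, 35, 40, 42, 60, 65]
c1, c2 = 25.0, 50.0            # initial centers c1(0) and c2(0)

for t in range(10):            # 10 iterations, as the problem requests
    # Assign each point to the closer center (ties go to c1 here).
    cluster1 = [x for x in data if (x - c1) ** 2 <= (x - c2) ** 2]
    cluster2 = [x for x in data if (x - c1) ** 2 > (x - c2) ** 2]
    # Update each center to the mean of its assigned points, if any.
    c1 = sum(cluster1) / len(cluster1) if cluster1 else c1
    c2 = sum(cluster2) / len(cluster2) if cluster2 else c2

print(c1, c2)                  # c1(10) and c2(10) for the placeholder data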
Show your work to receive full credit.
Let 𝑐1(𝑡) = 25 and 𝑐2(𝑡) = 50 at 𝑡 = 0 (i.e., initialization). Compute 𝑐1(10) and 𝑐2(10).

Gaussian mixture models (15 points)
6. Consider the following one-dimensional dataset of 6 data instances:
You will follow steps to construct a Gaussian mixture model with two components (i.e., 1 & 2) to model the distribution of these 6 data points.
Note that to construct a Gaussian mixture model, you are to estimate:
• 𝜇1, 𝜎1, such that 𝑝(𝑋 = 𝑥|𝑌 = 1) = (1 / (𝜎1√2𝜋)) 𝑒𝑥𝑝(−(𝑥 − 𝜇1)² / (2𝜎1²))
• 𝜇2, 𝜎2, such that 𝑝(𝑋 = 𝑥|𝑌 = 2) = (1 / (𝜎2√2𝜋)) 𝑒𝑥𝑝(−(𝑥 − 𝜇2)² / (2𝜎2²))
• 𝜆, such that 𝑝(𝑌 = 1) = 𝜆 (and hence 𝑝(𝑌 = 2) = 1 − 𝜆),
so that together you have the GMM distribution
𝑝(𝑋 = 𝑥) = 𝑝(𝑌 = 1)𝑝(𝑋 = 𝑥|𝑌 = 1) + 𝑝(𝑌 = 2)𝑝(𝑋 = 𝑥|𝑌 = 2).
Let us denote by 𝜃 all the parameters {𝜇1, 𝜎1, 𝜇2, 𝜎2, 𝜆}.

Like K-means, GMM also needs initialization. Let
• 𝜇1(0) = 4, 𝜎1(0) = 4
• 𝜇2(0) = 10, 𝜎2(0) = 4
• 𝜆(0) = 0.5
be the initialization and let 𝜃(0) denote {𝜇1(0), 𝜎1(0), 𝜇2(0), 𝜎2(0), 𝜆(0)}. Here, "(0)" means the parameters at the 0-th iteration.
Show your work to receive full credit.
a) E-step: Compute 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) for each data point 𝑥𝑖 in {𝑥1, …, 𝑥6} (a numerical value for each).
𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) = 𝑝(𝑌 = 1, 𝑥𝑖; 𝜃(0)) / 𝑝(𝑥𝑖; 𝜃(0))
= [𝑝(𝑌 = 1; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 1; 𝜃(0))] / [𝑝(𝑌 = 1; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 1; 𝜃(0)) + 𝑝(𝑌 = 2; 𝜃(0)) 𝑝(𝑥𝑖 | 𝑌 = 2; 𝜃(0))]
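A minimal sketch of this E-step computation, using the initialization 𝜃(0) given above; the data values in the sketch are hypothetical placeholders, not the 6 data points from this problem's dataset.

import math

# Minimal sketch of the E-step above. The data values are hypothetical
# placeholders, NOT the 6 instances given in the problem.
data = [1.0, 3.0, 5.0, 8.0, 11.0, 13.0]
mu1, sigma1, mu2, sigma2, lam = 4.0, 4.0, 10.0, 4.0, 0.5   # theta(0) from the problem

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# r[i] = p(Y = 1 | x_i; theta(0)) via Bayes' rule, exactly as in the fraction above
r = []
for x in data:
    num = lam * normal_pdf(x, mu1, sigma1)
    den = num + (1 - lam) * normal_pdf(x, mu2, sigma2)
    r.append(num / den)
print(r)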
b) M-step: Based on the values 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0)) for 𝑖 ∈ {1, …, 6} that you computed earlier, compute the updated parameters 𝜃(1) = {𝜇1(1), 𝜎1(1), 𝜇2(1), 𝜎2(1), 𝜆(1)}.
𝜇𝑘(1) = [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0)) × 𝑥𝑖] / [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0))]
𝜎𝑘(1) = √( [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0)) × (𝑥𝑖 − 𝜇𝑘(1))²] / [∑𝑖 𝑝(𝑌 = 𝑘 | 𝑥𝑖; 𝜃(0))] )
𝜆(1) = [∑𝑖 𝑝(𝑌 = 1 | 𝑥𝑖; 𝜃(0))] / 6
where 𝑘 ∈ {1, 2} is the index of each component.
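Continuing the E-step sketch above (it reuses that sketch's `data` and responsibilities `r`, so the values remain hypothetical placeholders), a minimal sketch of these M-step updates:

import math

# Minimal sketch of the M-step above, reusing `data` and `r` from the E-step sketch.
resp = {1: r, 2: [1 - ri for ri in r]}          # p(Y = k | x_i; theta(0)) for k = 1, 2

new_params = {}
for k in (1, 2):
    w = resp[k]
    total = sum(w)
    # Responsibility-weighted mean and standard deviation, as in the formulas above.
    mu_k = sum(wi * xi for wi, xi in zip(w, data)) / total
    sigma_k = math.sqrt(sum(wi * (xi - mu_k) ** 2 for wi, xi in zip(w, data)) / total)
    new_params[k] = (mu_k, sigma_k)

lam_new = sum(resp[1]) / len(data)              # lambda(1)
print(new_params, lam_new)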

Decision Trees (10 points)
7. The following is a training set with three binary attributes A, B, and C:
Example #   A       B       C       Class
1           True    False   False   False
2           True    False   True    False
3           False   True    False   False
4           True    True    True    True
5           True    True    False   True
Determine the decision tree using the majority rule. Show each step of the algorithm and the final tree for full credit.
K-nearest neighbor (10 points)
8. Use k-nearest neighbor classification on the following training set with two classes {+, -}:
Class   x1   x2
+       1    2
+       2    1
-       2    2
+       2    3
-       3    1
+       3    2
a) (2 points) Plot these six training points on an x1-x2 axis.
b) (4 points) Separate the classes {+, -} with a decision boundary on your plot in part a for k = 1. Fill in the region(s) that will be classified as the "-" class.
c) (4 points) Separate the classes {+, -} with a decision boundary on your plot in part a for k = 3. Fill in the region(s) that will be classified as the "-" class.
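For parts b and c, here is a minimal sketch of how a k-NN decision at a single point of the plane could be checked programmatically; it uses the six training points from the table above, and the query point is an arbitrary example chosen for illustration.

# Minimal sketch of k-nearest-neighbor voting with Euclidean distance,
# using the six training points from the table above.
train = [((1, 2), '+'), ((2, 1), '+'), ((2, 2), '-'),
         ((2, 3), '+'), ((3, 1), '-'), ((3, 2), '+')]

def knn_classify(query, k):
    # Sort training points by squared Euclidean distance to the query point.
    by_dist = sorted(train, key=lambda p: (p[0][0] - query[0]) ** 2 + (p[0][1] - query[1]) ** 2)
    labels = [label for _, label in by_dist[:k]]
    # Majority vote among the k nearest neighbors.
    return max(set(labels), key=labels.count)

print(knn_classify((2.2, 1.8), k=1))   # arbitrary query point, k = 1
print(knn_classify((2.2, 1.8), k=3))   # same query point, k = 3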

Support Vector Machine (10 Points)
9. Use a support vector machine to classify the following training set:
Class   x1   x2
+       1    1
+       2    2
+       2    0
-       0    0
-       1    0
-       0    1
a) (2 points) Plot these six training points on an x1-x2 axis.
b) (8 points) Separate the classes {+, -} with a maximum margin separator defined as the set of points {x : w ∙ x + b = 0}. Give your choice for vector w and intercept b as your answer.
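One way to sanity-check a hand-derived (w, b) is to fit an (approximately) hard-margin linear SVM numerically. This sketch assumes scikit-learn is available and uses the six training points from the table above; note the learned w and b may differ from a hand answer by an overall positive scale factor.

# Minimal sketch: fit a linear SVM with a very large C (approximating a hard
# margin) to the six points above, then print w and b for comparison.
# Assumes scikit-learn is installed.
from sklearn.svm import SVC

X = [[1, 1], [2, 2], [2, 0], [0, 0], [1, 0], [0, 1]]
y = [+1, +1, +1, -1, -1, -1]

clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)
print(clf.coef_, clf.intercept_)    # w and b for the separator w . x + b = 0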
Carmen submission
Submit this portion of Homework #4 (Problem Set) into the drop box for the problem set portion. You will submit the programming portion into a different drop box.
