
EECS 391 Introduction to Artificial Intelligence
Fall 2018, Written Assignment 5 (“W5”)

Due: Tue Nov 27. Submit a single pdf document along with your code for the whole assignment to Canvas
before class. You may scan a handwritten page for the written portions, but make sure you submit a single pdf
file.

Total Points: 100 + 20 extra credit

Remember: include your name and Case ID, staple your pages, and keep answers concise, neat, and legible. Submit in class on the due date.

Note: Some of the questions below ask you to make plots and/or write simple programs. This might be more
convenient to do in a language with a good plotting library such as Matlab, Mathematica, or python using
matplotlib. Submit your code via Canvas, but turn in the homework writeup in class.

Q1. Bernoulli trials and bias beliefs

Recall the binomial distribution describing the likelihood of getting y heads for n flips

p(y|θ,n) = (n choose y) θ^y (1−θ)^(n−y)
where θ is the probability of heads.

a) Using the fact that

∫₀¹ p(y|θ,n) dθ = 1/(1+n),

derive the posterior distribution for θ assuming a uniform prior. (5 P.)

b) Plot the likelihood for n = 4 and θ = 3/4. Make sure your plot includes y = 0. (5 P.)
c) Plot the posterior distribution of θ after each of the following coin flips: head, head, tail, head.
You should have four plots total. (10 P.)
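As a starting point for the plots in (b) and (c), the following sketch (standard library only; function names are illustrative) computes the likelihood values and the normalized posterior curves, which can then be fed to matplotlib:

```python
import math

def binom_likelihood(y, n, theta):
    """p(y | theta, n): probability of exactly y heads in n flips."""
    return math.comb(n, y) * theta**y * (1 - theta)**(n - y)

def posterior(theta, y, n):
    """p(theta | y, n) under a uniform prior: the factor (n + 1) normalizes
    the likelihood, since its integral over theta in [0, 1] is 1/(1 + n)."""
    return (n + 1) * binom_likelihood(y, n, theta)

# Part (b): likelihood over y = 0..4 for n = 4, theta = 3/4.
lik = [binom_likelihood(y, 4, 0.75) for y in range(5)]

# Part (c): posterior curve after each flip of head, head, tail, head.
thetas = [i / 100 for i in range(101)]
heads, flips_seen, curves = 0, 0, []
for flip in ("H", "H", "T", "H"):
    flips_seen += 1
    heads += flip == "H"
    curves.append([posterior(t, heads, flips_seen) for t in thetas])
# e.g. plt.plot(thetas, curves[i]) for each of the four plots
```

Note that the four posterior curves are the Beta(1+heads, 1+tails) densities, so each should integrate to one.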

Q2. After R&N 20.1 Bags O’ Surprise

The data used for Figure 20.1 on page 804 can be viewed as being generated by h5.

a) For each of the other four hypotheses, write code to generate a data set of length 100 and plot the
corresponding graphs for P(hi|d1, . . . ,dN ) and P(DN+1 = lime|d1, . . . ,dN ). The plots should follow the
format of Figure 20.1. Comment on your results. (15 P.)

b) What is the mathematical expression for how many candies you need to unwrap before you are more
than 90% sure which type of bag you have? (5 P.)

c) Make a plot that illustrates the reduction in variability of the posterior-probability curves for each
type of bag, obtained by averaging the curves from multiple data sets. (15 P.)
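One possible sketch of the simulation for part (a), assuming the lime fractions (0, 1/4, 1/2, 3/4, 1) and priors (0.1, 0.2, 0.4, 0.2, 0.1) used for Figure 20.1; the returned lists are what you would plot:

```python
import random

# Hypotheses h1..h5: fraction of lime candies and prior, per Figure 20.1.
LIME_FRAC = [0.0, 0.25, 0.5, 0.75, 1.0]
PRIOR = [0.1, 0.2, 0.4, 0.2, 0.1]

def generate_data(h, n, rng=random):
    """Draw n candies (True = lime) from hypothesis h (0-indexed)."""
    return [rng.random() < LIME_FRAC[h] for _ in range(n)]

def posteriors(data):
    """P(h_i | d_1..d_N) after each candy, and P(D_{N+1} = lime | d_1..d_N)."""
    post = PRIOR[:]
    history = [post[:]]
    pred = [sum(p * f for p, f in zip(post, LIME_FRAC))]
    for lime in data:
        post = [p * (f if lime else 1 - f) for p, f in zip(post, LIME_FRAC)]
        z = sum(post)
        post = [p / z for p in post]
        history.append(post[:])
        pred.append(sum(p * f for p, f in zip(post, LIME_FRAC)))
    return history, pred

# Example: a bag generated by h5 (all lime), as in the figure.
data = generate_data(4, 20)
history, pred = posteriors(data)
```

For part (b), printing `history` shows directly at which N the posterior of the true bag first exceeds 0.9.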

Q3. Classification with Gaussian Mixture Models

Suppose you have a random variable x that is drawn from one of two classes, C1 and C2. Each class follows
a Gaussian distribution, with means µ1 and µ2 (assume µ1 < µ2) and variances σ1 and σ2. Assume that the
prior probability of C1 is twice that of C2.

a) What is the expression for the probability of x, i.e. p(x), when the class is unknown? (5 P.)

b) What is the expression for the probability of total error in this model, assuming that the decision
boundary is at x = θ? (10 P.)

c) Derive an expression for the value of the decision boundary θ that minimizes the probability of
misclassification. (10 P.)

Q4. k-means Clustering

In k-means clustering, µk is the vector mean of the kth cluster. Assume the data vectors have I dimensions,
so µk is a (column) vector [µ1, . . . , µI]Tk, where the symbol T indicates vector transpose.

a) Derive the update rule for µk using the objective function

D = Σn Σk rn,k ‖xn − µk‖², with n = 1, . . . , N and k = 1, . . . , K,

where xn is the nth data vector, rn,k is 1 if xn is in the kth class and 0 otherwise, and
‖x‖² = xTx = Σi xixi = Σi xi². The update rule is derived by computing the gradient for each element
of the kth mean and solving for the value where the gradient is zero. Express your answer first in
scalar form for µk,i and then in vector form for µk. (20 P.)

b) Extra credit. Write a program that implements the k-means clustering algorithm on the iris data
set. Plot the results of the learning process by showing the initial, intermediate, and converged
cluster centers for k = 2 and k = 3. (+20 P. bonus)
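For the extra-credit part, one possible standard-library sketch of the k-means loop, demonstrated here on synthetic 2-D blobs standing in for the iris data (for the real exercise, substitute the four iris measurements per flower):

```python
import random

def kmeans(data, k, iters=100, rng=random):
    """Lloyd's algorithm; the update step is the rule from part (a):
    mu_k is the mean of the points currently assigned to cluster k."""
    centers = rng.sample(data, k)
    snapshots = [list(centers)]  # initial / intermediate / converged centers
    for _ in range(iters):
        # Assignment step: r_{n,k} = 1 for the nearest center, else 0.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda j: sum(
                (a - b) ** 2 for a, b in zip(x, centers[j])))
            clusters[nearest].append(x)
        # Update step (gradient of D set to zero): mean of assigned points;
        # an empty cluster keeps its old center.
        new = [tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
               for j, cl in enumerate(clusters)]
        snapshots.append(list(new))
        if new == centers:  # converged
            break
        centers = new
    return centers, snapshots

# Demo on two well-separated synthetic blobs.
rng = random.Random(1)
data = ([(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)]
        + [(rng.gauss(4, 0.3), rng.gauss(4, 0.3)) for _ in range(50)])
centers, snapshots = kmeans(data, 2, rng=rng)
```

The `snapshots` list records the centers after every iteration, which is exactly what the plots of initial, intermediate, and converged centers require.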