
EECE5644 Fall 2018 – Homework 4

This assignment is due on Blackboard by 10:00am ET on Tuesday, November 13, 2018. Please submit your solutions on Blackboard in a single PDF file that includes all math, visual, and quantitative results (plots, tables, etc.), as well as your code (appended after your answers/solutions for each question). In this PDF file, start your answer for each question on a new page and clearly indicate on that page which question is being answered. Clearly label the axes of plots and the rows/columns of tables, and use descriptive captions for results displayed in this fashion. It is recommended that you use LaTeX to prepare this PDF document. If you have not used LaTeX before, you can start quickly by opening an account with an online LaTeX editor service such as Overleaf. If using Matlab for your code, please use built-in functions whenever possible. Many useful functions were explained in the ML online tutorial.

1 Order Selection for Gaussian Mixture Model (GMM) Fitting (30 points)
Let g(x; μ, Σ) denote the probability density function (pdf) of a Gaussian random vector with mean μ and covariance matrix Σ. Write code that implements the following:

• Generate N samples from p(x; θ) = α₁ g(x; μ₁, Σ₁) + α₂ g(x; μ₂, Σ₂) + α₃ g(x; μ₃, Σ₃), where μ₁ = [0; 0], μ₂ = [3; 0], μ₃ = [0; 2], and Σ₁ = Σ₂ = Σ₃ = I₂ (the 2×2 identity matrix).

• Fit a mixture of M Gaussian pdfs to the data generated in the previous step using the EM algorithm to obtain θ̂_M. Initialize the EM algorithm as you see appropriate, but do not use the true model parameters.
• Evaluate the Bayesian Information Criterion (BIC) for the GMM with θ̂_M. Note that an order-M 2-dimensional GMM has fewer than 7M independent parameters. Explain how you decide on the number of parameters that you use in the model-order penalty term of BIC (a starter sketch follows this list).
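
As a starting point, here is a minimal MATLAB sketch of these steps, assuming the Statistics and Machine Learning Toolbox; the helper names generateGMMData and gmmBIC, the number of EM restarts, and the 6M − 1 parameter count are illustrative assumptions, not requirements (you still need to justify whatever count you use):

    % Place each function in its own .m file (or use them as local functions).
    function X = generateGMMData(N, alpha)
    % Draw N samples from the true 3-component GMM of Question 1.
    mu = [0 0; 3 0; 0 2];                 % component means (one per row)
    z  = randsample(3, N, true, alpha);   % latent component label per sample
    X  = mvnrnd(mu(z,:), eye(2));         % all covariances are identity
    end

    function bic = gmmBIC(X, M)
    % Fit an order-M GMM by EM (random restarts; the true parameters are
    % never used for initialization) and return its BIC value.
    gm = fitgmdist(X, M, 'Replicates', 5, 'RegularizationValue', 1e-6);
    % Assumed parameter count for a full-covariance 2-D GMM:
    % 2M (means) + 3M (symmetric covariances) + (M-1) (weights) = 6M - 1.
    k = 6*M - 1;
    bic = 2*gm.NegativeLogLikelihood + k*log(size(X,1));
    end

BIC is computed by hand here (rather than read from the fitted object) so that the parameter count entering the penalty term is explicit and can be justified in your report.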

Repeat the following for R = 1000 Monte Carlo runs:
• Generate N = 100 samples from p(x; θ) for each of the following mixture-coefficient vectors: α = [0.25; 0.70; 0.05], α = [0.15; 0.80; 0.05], and α = [0.05; 0.90; 0.05].

• For all three data sets (obtained using the different α values), and for all M ∈ {1, …, 5}, determine θ̂_M and compute BIC(θ̂_M).

Prepare three separate plots, one for each value of α, showing the median BIC value for each M, together with the 5% and 95% levels of the BIC values obtained over the R Monte Carlo runs for each M, in order to indicate 90% confidence intervals around these medians. Also prepare three additional bar plots, one for each value of α, showing the histogram of the M values at which BIC is minimized (i.e., the number of Gaussians that BIC would have selected for the GMM). Indicate how you initialized each algorithm. Briefly discuss your results. A sketch of the Monte Carlo loop and summary figures follows.
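
One possible shape for the Monte Carlo study and summary figures, for a single mixture-coefficient vector and reusing the helpers above; prctile, errorbar, and histogram are convenient built-ins, but the exact layout is up to you:

    R = 1000; N = 100; Mmax = 5;
    alpha = [0.25 0.70 0.05];             % repeat for the other two vectors
    BIC = zeros(R, Mmax);
    for r = 1:R
        X = generateGMMData(N, alpha);    % one data set per Monte Carlo run
        for M = 1:Mmax
            BIC(r, M) = gmmBIC(X, M);     % same data set for every order M
        end
    end
    q = prctile(BIC, [5 50 95]);          % 3-by-Mmax percentile summary
    figure; errorbar(1:Mmax, q(2,:), q(2,:)-q(1,:), q(3,:)-q(2,:));
    xlabel('model order M'); ylabel('BIC');
    title('Median BIC with 90% interval over 1000 runs');
    [~, Mbest] = min(BIC, [], 2);         % order selected by BIC in each run
    figure; histogram(Mbest, 0.5:1:Mmax+0.5);
    xlabel('M selected by BIC'); ylabel('count');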

2 Clustering Using GMMs and K-Means Algorithm (30 points)
Let g(x; μ, Σ) denote the probability density function (pdf) of a Gaussian random vector with mean μ and covariance matrix Σ. Write code that implements the following:
• Generate N samples from p(x; θ) = α₁ g(x; μ₁, Σ₁) + α₂ g(x; μ₂, Σ₂) + α₃ g(x; μ₃, Σ₃), where μ₁ = [0; 0], μ₂ = [3; 0], μ₃ = [0; 2], and Σ₁ = [1 0; 0 1], Σ₂ = [1 0; 0 0.5], Σ₃ = [0.5 0; 0 1].
• Fit a mixture of M = 3 Gaussian pdfs to the data generated in the previous step using the EM algorithm to obtain θ̂_M. Initialize the EM algorithm as you see appropriate, but do not use the true model parameters.

• Using the components of the estimated GMM, assign cluster labels to the samples, such that each sample is assigned the index of the Gaussian component with the largest posterior probability (in the textbook, see Chapter 10, Equation 27 on page 526 for the calculation of the posterior probabilities).
• Using the K-Means clustering algorithm with K = 3, and employing the Euclidean distance between centroids and samples for the label-assignment updates during the iterations, assign a cluster label to each sample. Initialize the centroids as you see appropriate, but do not use the true model parameters (a sketch of both clustering steps follows this list).
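
A minimal sketch of the data generation and both clustering procedures, again assuming the Statistics and Machine Learning Toolbox; the restart and seeding choices shown are illustrative, not required:

    N = 100; alpha = [0.4 0.3 0.3];
    mu = [0 0; 3 0; 0 2];                                % component means
    Sigma = cat(3, eye(2), [1 0; 0 0.5], [0.5 0; 0 1]);  % component covariances
    gmTrue = gmdistribution(mu, Sigma, alpha);
    X = random(gmTrue, N);                               % N-by-2 data matrix
    % EM fit with M = 3 (random restarts, none seeded with true parameters).
    gm = fitgmdist(X, 3, 'Replicates', 5, 'RegularizationValue', 1e-6);
    P = posterior(gm, X);                % N-by-3 matrix of component posteriors
    [~, gmmLabels] = max(P, [], 2);      % assign the arg-max posterior index
    % K-means with K = 3; 'plus' seeding draws initial centroids from the data.
    kmLabels = kmeans(X, 3, 'Distance', 'sqeuclidean', 'Start', 'plus');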

Run the code you developed to generate N = 100 samples from this GMM with α = [0.4; 0.3; 0.3], and cluster these samples using both the GMM and K-Means procedures. Present your results as three separate plots that show the generated data set as a 2-dimensional scatter plot: (1) with all samples having the same marker shape/color; (2) with the marker shape/color of each sample indicating the cluster label assigned by the GMM procedure; (3) with the marker shape/color of each sample indicating the cluster label assigned by the K-Means algorithm. Indicate how you initialized each algorithm. Briefly discuss your results. A plotting sketch follows.
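
A sketch of the three requested scatter plots, assuming X, gmmLabels, and kmLabels from the sketch above; gscatter is a convenient built-in for label-colored scatter plots:

    figure; plot(X(:,1), X(:,2), 'k.'); axis equal;
    xlabel('x_1'); ylabel('x_2'); title('Generated data');
    figure; gscatter(X(:,1), X(:,2), gmmLabels); axis equal;
    xlabel('x_1'); ylabel('x_2'); title('GMM cluster labels');
    figure; gscatter(X(:,1), X(:,2), kmLabels); axis equal;
    xlabel('x_1'); ylabel('x_2'); title('K-means cluster labels');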


3 Training an SVM Classifier (40 points)
Let g(x; μ, Σ) denote the probability density function (pdf) of a Gaussian random vector with mean μ and covariance matrix Σ. Write code that implements the following:

• Generate Ntrain = 100 samples from p(x; θ) = α₁ g(x; μ₁, I₂) + α₂ g(x; μ₂, I₂) + α₃ g(x; μ₃, I₂), where μ₁ = [0; 0], μ₂ = [3; 0], μ₃ = [0; 2], with α = [0.5; 0.25; 0.25]. Label the samples that originate from Component 1 (with mean at the origin) as coming from class +1, and the remaining samples as coming from class −1.

• For a support vector machine classifier that uses spherically symmetric Gaussian kernels with width parameter σ, use K-fold cross-validation to determine the best kernel width σ and the best overlap penalty weight (referred to as C in class), taking the minimum empirical probability-of-error estimate as the cross-validation objective (see the sketch after this list).
• Using the best hyperparameters σ and C, train a final SVM on the entire training data set.
• Generate Ntest = 10000 samples from the same GMM that generated the training set, and assign labels in the same way. This large test data set is independent of, and identically distributed with, the training set used to train the SVM classifier, so we can use it to assess test (generalization) performance.
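
As a starting point for the training-set generation and the hyperparameter search, a minimal MATLAB sketch; the 10-fold split and the log-spaced grids are illustrative assumptions, not prescriptions:

    Ntrain = 100; alpha = [0.5 0.25 0.25];
    mu = [0 0; 3 0; 0 2];
    gmTrue = gmdistribution(mu, repmat(eye(2), 1, 1, 3), alpha);
    [X, z] = random(gmTrue, Ntrain);      % z records the source component
    y = 2*(z == 1) - 1;                   % component 1 -> +1, others -> -1
    sigmas = logspace(-1, 1, 10);         % candidate kernel widths (sigma)
    Cs     = logspace(-1, 2, 10);         % candidate box constraints (C)
    cvErr  = zeros(numel(sigmas), numel(Cs));
    for i = 1:numel(sigmas)
        for j = 1:numel(Cs)
            cvModel = fitcsvm(X, y, 'KernelFunction', 'rbf', ...
                'KernelScale', sigmas(i), 'BoxConstraint', Cs(j), 'KFold', 10);
            cvErr(i, j) = kfoldLoss(cvModel);  % empirical P(error) estimate
        end
    end
    [~, idx] = min(cvErr(:));
    [iBest, jBest] = ind2sub(size(cvErr), idx);
    finalSVM = fitcsvm(X, y, 'KernelFunction', 'rbf', ...  % retrain on all data
        'KernelScale', sigmas(iBest), 'BoxConstraint', Cs(jBest));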

Apply the trained SVM to classify the test samples and report the following results: (1) the empirical probability-of-error estimate on the test set; (2) a single plot that shows (a) all test data, with each sample colored green if correctly classified and red if incorrectly classified by the SVM, and (b) the decision boundary of the SVM overlaid on the data. A sketch of this evaluation follows.
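
One possible shape for the test-set evaluation and the requested plot, reusing gmTrue and finalSVM from the sketch above; the grid resolution and styling are illustrative:

    Ntest = 10000;
    [Xt, zt] = random(gmTrue, Ntest);
    yt = 2*(zt == 1) - 1;                 % same labeling rule as the training set
    yhat = predict(finalSVM, Xt);
    pErr = mean(yhat ~= yt);              % empirical probability of error
    ok = (yhat == yt);
    figure; hold on;
    plot(Xt(ok,1),  Xt(ok,2),  'g.');     % correctly classified: green
    plot(Xt(~ok,1), Xt(~ok,2), 'r.');     % incorrectly classified: red
    % Decision boundary: zero level set of the SVM score over a dense grid.
    [g1, g2] = meshgrid(linspace(min(Xt(:,1)), max(Xt(:,1)), 200), ...
                        linspace(min(Xt(:,2)), max(Xt(:,2)), 200));
    [~, score] = predict(finalSVM, [g1(:) g2(:)]);
    contour(g1, g2, reshape(score(:,2), size(g1)), [0 0], 'k');
    xlabel('x_1'); ylabel('x_2');
    title(sprintf('SVM test classification, empirical P(error) = %.3f', pErr));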
