Machine Learning 1 TU Berlin, WiSe 2020/21
Gaussian Processes
In this exercise, you will implement Gaussian process regression and apply it to a toy dataset and a real dataset. We use the notation of the paper "Rasmussen (2005), Gaussian Processes in Machine Learning" linked on ISIS.
Let us first draw a training set X = (x1,…,xn) and a test set X⋆ = (x⋆1,…,x⋆m) from a d-dimensional input distribution. The Gaussian process is a model under which the real-valued outputs f = (f1,…,fn) and f⋆ = (f⋆1,…,f⋆m) associated with X and X⋆ follow the Gaussian distribution:
$$\begin{bmatrix} f \\ f_\star \end{bmatrix} \sim \mathcal{N}\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \Sigma & \Sigma_\star \\ \Sigma_\star^\top & \Sigma_{\star\star} \end{bmatrix} \right)$$

where

$$\Sigma = k(X, X) + \sigma^2 I, \qquad \Sigma_\star = k(X, X_\star), \qquad \Sigma_{\star\star} = k(X_\star, X_\star) + \sigma^2 I$$
and where k(·, ·) is the Gaussian kernel function. (The kernel function is implemented in utils.py.) Predicting the outputs for new data points X⋆ is achieved by conditioning the joint probability distribution on the training set. This conditional distribution, called the posterior distribution, can be written as:
$$f_\star \mid f \sim \mathcal{N}\Big(\underbrace{\Sigma_\star^\top \Sigma^{-1} f}_{\mu_\star}\,,\; \underbrace{\Sigma_{\star\star} - \Sigma_\star^\top \Sigma^{-1} \Sigma_\star}_{C_\star}\Big) \tag{1}$$
Having inferred the posterior distribution, the log-likelihood of observing the outputs y⋆ for the inputs X⋆ is given by evaluating the distribution f⋆|f at y⋆:
$$\log p(y_\star \mid f) = -\tfrac{1}{2}(y_\star - \mu_\star)^\top C_\star^{-1}(y_\star - \mu_\star) - \tfrac{1}{2}\log|C_\star| - \tfrac{m}{2}\log 2\pi \tag{2}$$
where | · | is the determinant. Note that the likelihood of the data given this posterior distribution can be
measured both for the training data and the test data.
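
To make equations (1) and (2) concrete, below is a minimal numpy sketch of the conditioning step and the log-likelihood computation. The squared-exponential kernel used here is an assumption serving as a stand-in; the exercise provides its own kernel in utils.py, whose exact form may differ.

import numpy

def gaussian_kernel(XA, XB, width):
    # Assumed squared-exponential kernel k(x, x') = exp(-||x - x'||^2 / (2 w^2));
    # a stand-in for the kernel function actually provided in utils.py.
    sqdist = ((XA[:, None, :] - XB[None, :, :]) ** 2).sum(axis=2)
    return numpy.exp(-sqdist / (2 * width ** 2))

def gp_posterior(X, f, Xs, width, noise):
    # Covariance blocks of the joint Gaussian over (f, f_star)
    n, m = len(X), len(Xs)
    S   = gaussian_kernel(X, X, width) + noise ** 2 * numpy.eye(n)    # Sigma
    Ss  = gaussian_kernel(X, Xs, width)                               # Sigma_star
    Sss = gaussian_kernel(Xs, Xs, width) + noise ** 2 * numpy.eye(m)  # Sigma_star_star
    Sinv = numpy.linalg.inv(S)
    mu = Ss.T @ Sinv @ f         # posterior mean mu_star, Eq. (1)
    C  = Sss - Ss.T @ Sinv @ Ss  # posterior covariance C_star, Eq. (1)
    return mu, C

def log_likelihood(ys, mu, C):
    # Eq. (2): Gaussian log-density of the observed outputs ys under the posterior
    m, diff = len(ys), ys - mu
    return float(-0.5 * diff @ numpy.linalg.solve(C, diff)
                 - 0.5 * numpy.linalg.slogdet(C)[1]
                 - 0.5 * m * numpy.log(2 * numpy.pi))

# Example usage on random toy data:
X  = numpy.random.randn(20, 1)
f  = numpy.sin(X[:, 0])
Xs = numpy.linspace(-3, 3, 50)[:, None]
mu, C = gp_posterior(X, f, Xs, width=0.5, noise=0.1)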
Part 1: Implementing a Gaussian Process (30 P)

Tasks:
• Create a class GP_Regressor that implements a Gaussian process regressor and has the following three methods (a sketch of one possible implementation is given after the code cell below):
• def __init__(self,Xtrain,Ytrain,width,noise): Initialize a Gaussian process with noise parameter σ and width parameter w. The variable Xtrain is a two-dimensional array where each row is one data point from the training set. The variable Ytrain is a vector containing the associated targets. The function must also precompute the matrix Σ⁻¹ for subsequent use by the methods predict() and loglikelihood().
• def predict(self,Xtest): For the test set X⋆ of m points received as parameter, return the mean vector of size m and covariance matrix of size m × m of the corresponding output, that is, return the parameters (μ⋆,C⋆) of the Gaussian distribution f⋆|f.
• def loglikelihood(self,Xtest,Ytest): For a dataset X⋆ of m test points received as first parameter, return the log-likelihood of observing the outputs y⋆ received as second parameter.
In [1]: # --------------------------
        # TODO: Replace by your code
        # --------------------------
        import solutions

        class GP_Regressor(solutions.GP_Regressor):
            pass

        # --------------------------
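
For orientation, here is one possible sketch of the required class, transcribing equations (1) and (2) directly. It again uses an assumed squared-exponential kernel as a stand-in; in your submission, call the kernel function provided in utils.py instead.

import numpy

def gaussian_kernel(XA, XB, width):
    # Stand-in for the kernel implemented in utils.py (assumed form)
    sqdist = ((XA[:, None, :] - XB[None, :, :]) ** 2).sum(axis=2)
    return numpy.exp(-sqdist / (2 * width ** 2))

class GP_Regressor:
    def __init__(self, Xtrain, Ytrain, width, noise):
        self.Xtrain, self.Ytrain = Xtrain, Ytrain
        self.width, self.noise = width, noise
        # Precompute Sigma^{-1}, with Sigma = k(X, X) + sigma^2 I
        Sigma = gaussian_kernel(Xtrain, Xtrain, width) \
                + noise ** 2 * numpy.eye(len(Xtrain))
        self.Sigma_inv = numpy.linalg.inv(Sigma)

    def predict(self, Xtest):
        # Posterior parameters (mu_star, C_star) of f_star | f, Eq. (1)
        Ss  = gaussian_kernel(self.Xtrain, Xtest, self.width)
        Sss = gaussian_kernel(Xtest, Xtest, self.width) \
              + self.noise ** 2 * numpy.eye(len(Xtest))
        mean = Ss.T @ self.Sigma_inv @ self.Ytrain
        cov  = Sss - Ss.T @ self.Sigma_inv @ Ss
        return mean, cov

    def loglikelihood(self, Xtest, Ytest):
        # Eq. (2) evaluated at the observed outputs Ytest
        mean, cov = self.predict(Xtest)
        diff = Ytest - mean
        return float(-0.5 * diff @ numpy.linalg.solve(cov, diff)
                     - 0.5 * numpy.linalg.slogdet(cov)[1]
                     - 0.5 * len(Ytest) * numpy.log(2 * numpy.pi))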
• Test your implementation by running the code below (it visualizes the mean and variance of the prediction at every location of the input space), and compare the behavior of the Gaussian process for various noise parameters σ and width parameters w.
In [2]: import utils, datasets, numpy
        import matplotlib.pyplot as plt
        %matplotlib inline

        # Open the toy data
        Xtrain, Ytrain, Xtest, Ytest = utils.split(*datasets.toy())

        # Create an analysis distribution
        Xrange = numpy.arange(-3.5, 3.51, 0.025)[:, numpy.newaxis]

        f = plt.figure(figsize=(18, 15))

        # Loop over several parameters:
        for i, noise in enumerate([2.5, 0.5, 0.1]):
            for j, width in enumerate([0.1, 0.5, 2.5]):

                # Create Gaussian process regressor object
                gp = GP_Regressor(Xtrain, Ytrain, width, noise)

                # Compute the predicted mean and variance for test data
                mean, cov = gp.predict(Xrange)
                var = cov.diagonal()

                # Compute the log-likelihood of training and test data
                lltrain = gp.loglikelihood(Xtrain, Ytrain)
                lltest  = gp.loglikelihood(Xtest, Ytest)

                # Plot the data
                p = f.add_subplot(3, 3, 3*i+j+1)
                p.set_title('noise=%.1f width=%.1f lltrain=%.1f, lltest=%.1f'
                            % (noise, width, lltrain, lltest))
                p.set_xlabel('x')
                p.set_ylabel('y')
                p.scatter(Xtrain, Ytrain, color='green', marker='x')  # training data
                p.scatter(Xtest, Ytest, color='green', marker='o')    # test data
                p.plot(Xrange, mean, color='blue')                    # GP mean
                p.plot(Xrange, mean + var**.5, color='red')           # GP mean + std
                p.plot(Xrange, mean - var**.5, color='red')           # GP mean - std
                p.set_xlim(-3.5, 3.5)
                p.set_ylim(-4, 4)
Part 2: Application to the Yacht Hydrodynamics Data Set (20 P)
In the second part, we would like to apply the Gaussian process regressor that you have implemented to a real dataset: the Yacht Hydrodynamics Data Set available on the UCI repository at the webpage http://archive.ics.uci.edu/ml/datasets/Yacht+Hydrodynamics. As stated on the web page, the input variables for this regression problem are:
1. Longitudinal position of the center of buoyancy
2. Prismatic coefficient
3. Length-displacement ratio
4. Beam-draught ratio
5. Length-beam ratio
6. Froude number
and we would like to predict from these variables the residuary resistance per unit weight of displacement (last column in the file yacht_hydrodynamics.data).
Tasks:
• Load the data using datasets.yacht() and partition it into a training and a test set using the function utils.split(). Standardize the data (center and rescale) so that each dimension of the training data and the labels have mean 0 and standard deviation 1 over the training set.
• Train several Gaussian processes on the regression task using various combinations of width and noise parameters.
• Draw two contour plots where the training and test log-likelihood are plotted as a function of the noise and width parameters. Choose suitable ranges of parameters so that the best parameter combination for the test set is in the plot. Use the same ranges and contour levels for the training and test plots. The contour levels can be chosen linearly spaced between e.g. 50 and the maximum log-likelihood value. (A sketch of one possible pipeline is given after the solution cell below.)
In [3]: # --------------------------
        # TODO: Replace by your code
        # --------------------------
        import solutions
        %matplotlib inline
        solutions.yacht()
        # --------------------------
Noise params: 0.005 0.007 0.008 0.010 0.011 0.013 0.014 0.016 0.017 0.019 0.020 0.022 0.023 0.025 0.026
Width params: 0.050 0.135 0.220 0.304 0.389 0.474 0.559 0.643 0.728 0.813 0.898 0.983 1.067 1.152 1.237
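
For orientation, below is a minimal sketch of one possible pipeline for this part. It assumes the GP_Regressor class from Part 1 is defined in the session; the parameter ranges are illustrative and not the ones used by the reference solution printed above.

import numpy, utils, datasets
import matplotlib.pyplot as plt

# Load the yacht data and split it into training and test sets
Xtrain, Ytrain, Xtest, Ytest = utils.split(*datasets.yacht())

# Standardize inputs and labels using training-set statistics only
mx, sx = Xtrain.mean(axis=0), Xtrain.std(axis=0)
my, sy = Ytrain.mean(), Ytrain.std()
Xtrain, Xtest = (Xtrain - mx) / sx, (Xtest - mx) / sx
Ytrain, Ytest = (Ytrain - my) / sy, (Ytest - my) / sy

# Scan a grid of (noise, width) parameters (illustrative ranges)
noises = numpy.linspace(0.005, 0.05, 15)
widths = numpy.linspace(0.05, 1.5, 15)
lltrain = numpy.zeros((len(noises), len(widths)))
lltest  = numpy.zeros((len(noises), len(widths)))
for i, noise in enumerate(noises):
    for j, width in enumerate(widths):
        gp = GP_Regressor(Xtrain, Ytrain, width, noise)  # class from Part 1
        lltrain[i, j] = gp.loglikelihood(Xtrain, Ytrain)
        lltest[i, j]  = gp.loglikelihood(Xtest, Ytest)

# Contour plots with shared ranges and levels for training and test
# (assumes the maximum log-likelihood exceeds 50, as suggested in the
# task; adapt the lower level otherwise)
levels = numpy.linspace(50, max(lltrain.max(), lltest.max()), 20)
f = plt.figure(figsize=(12, 5))
for k, (ll, name) in enumerate([(lltrain, 'training'), (lltest, 'test')]):
    p = f.add_subplot(1, 2, k + 1)
    p.contour(widths, noises, ll, levels=levels)
    p.set_title('log-likelihood (%s)' % name)
    p.set_xlabel('width')
    p.set_ylabel('noise')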