
QBUS2820 – Predictive Analytics
Tutorial 3 Tasks

In [1]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_context('notebook')
%matplotlib inline

Split the data into train and test sets

In [2]:

data = pd.read_csv('credit.csv', index_col='Obs')
train = data.sample(frac=0.7, random_state=1)
test = data[~data.index.isin(train.index)]
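As an optional sanity check (a short sketch using only the objects created above), we can confirm that the split is roughly 70/30 and that the train and test indices do not overlap:

In [ ]:

print(train.shape, test.shape)
print(train.index.intersection(test.index).empty)  # should print True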

In [3]:

values = np.arange(1, 101)
print(values)

[ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72
73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100]

Complete the code in the companion notebook to generate a plot of the test performance of the KNN model with one predictor as we change the number of neighbours. Interpret the results and relate them to our discussion in the first module.

In [ ]:

from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error

losses = []
for k in values:
    # 1. Specify and fit the model (there is no need to store it)
    # 2. Compute predictions for the test data
    # 3. Compute the root mean squared error and assign it to a variable called loss
    losses.append(loss)

fig, ax = plt.subplots()
# add the command to plot the required figure
ax.set_xlabel('Number of neighbours')
ax.set_ylabel('Test error')
plt.show()
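A possible completion is sketched below, reusing the imports and the values array defined above. It assumes a single numeric predictor, Income, and the response Balance from the credit data; substitute the variables used earlier in the companion notebook.

In [ ]:

# Sketch of the completed loop (assumed columns: Income as predictor, Balance as response)
X_train, y_train = train[['Income']], train['Balance']
X_test, y_test = test[['Income']], test['Balance']

losses = []
for k in values:
    # 1. Specify and fit the model
    knn = KNeighborsRegressor(n_neighbors=k)
    knn.fit(X_train, y_train)
    # 2. Compute predictions for the test data
    y_pred = knn.predict(X_test)
    # 3. Compute the root mean squared error
    loss = np.sqrt(mean_squared_error(y_test, y_pred))
    losses.append(loss)

fig, ax = plt.subplots()
ax.plot(values, losses)
ax.set_xlabel('Number of neighbours')
ax.set_ylabel('Test error')
plt.show()

Typically the test error falls as $k$ increases from 1 (lower variance) and then rises again once $k$ is so large that the fit becomes too smooth (higher bias), illustrating the bias-variance trade-off discussed in the first module.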

This will give you the value of $k$ with the lowest test error. Since values starts at 1 while np.argmin returns a zero-based position, adding 1 recovers the corresponding $k$.

In [ ]:

1 + np.argmin(losses)
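To report the minimum test error alongside the chosen $k$ (assuming losses has been filled in as above):

In [ ]:

best_k = 1 + np.argmin(losses)
print('Best k:', best_k, 'with test RMSE:', losses[best_k - 1])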