CSE475 HW3, Monday, 03/15/2021, Due: Thursday, 03/25/2021
Linear Regression and Gradient Descent
Instructions
Please submit your Jupyter Notebook file (the .ipynb file) containing your code and the outputs produced by your code (note that a .ipynb file can contain both code and outputs) to Canvas. Please name your file CSE475-HW3-LastName-FirstName.ipynb.
If you have any questions on the homework problems, you should post your question on the Canvas discussion board (under HW3 Q&A), instead of sending emails to the instructor or TA. We will answer your questions there. In this way, we can avoid repeated questions, and help the entire class stay on the same page whenever any clarification/correction is made.
Introduction
This homework uses the fish length dataset for linear regression; the goal is to predict the length of a fish. The dataset has 44 rows, with columns for an index, the age of the fish, the temperature of the water, and the fish length.
In [1]:
from __future__ import print_function
import os
Import the data using Pandas and examine the shape. There are 3 columns (the index, the age of the fish, and the water temperature) plus the target column, the fish length (Length of fish).
In [2]:
import pandas as pd
import numpy as np
# Import the data using the file path
filepath = 'Fish_Length.csv'
data = pd.read_csv(filepath, sep=',')
print(data.shape)
(44, 4)
Drop the Index column, as it is not required for predictions.
In [3]:
# Drop the Index column, which is not needed for prediction
data.drop('Index', axis=1, inplace=True)
data.shape
Out[3]:
(44, 3)
Create train and test splits of the dataset.
Scale the features using StandardScaler.
In [4]:
from sklearn.model_selection import train_test_split
y_col = 'Length of fish'
# Separate the feature columns from the target column
feature_cols = [x for x in data.columns if x != y_col]
X_data = data[feature_cols]
y_data = data[y_col]
X_train, X_test, y_train, y_test = train_test_split(X_data, y_data, test_size=0.2, random_state=0)
# Converting Pandas to Numpy
X_train, X_test, y_train, y_test = X_train.values, X_test.values, y_train.values, y_test.values
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
(35, 2) (9, 2) (35,) (9,)
In [5]:
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
Question 1 (5pts)
The linear regression model considered in this assignment is $y = W^T x + b$, where $y$ is the outcome and $x$ is a data point (a 2-by-1 column vector containing the two features). $W$ is a 2-by-1 column vector of weights, and $b$ is a scalar called the intercept. We will learn the intercept in the linear regression model, so please set fit_intercept=True. Please do the following tasks:
Fit a basic linear regression model on the training data (X_train, y_train) using Scikit-learn and print W and b.
Calculate and print the root mean squared error (not mean squared error) on both the train and test sets.
Hint: the necessary packages have been imported for your convenience. You do not need to import additional packages. You can use np.sqrt(mean_squared_error(…)) (or call mean_squared_error with squared=False) to compute the root mean squared error.
In [6]:
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
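A minimal sketch of one possible solution follows, assuming the scaled data from the cells above; the variable names lr, rmse_train, and rmse_test are illustrative choices, not part of the assignment.
In [ ]:
# Fit ordinary least squares on the scaled training data
lr = LinearRegression(fit_intercept=True)
lr.fit(X_train, y_train)
# W is the learned coefficient vector; b is the intercept
print('W =', lr.coef_)
print('b =', lr.intercept_)
# Root mean squared error on the train and test sets
rmse_train = np.sqrt(mean_squared_error(y_train, lr.predict(X_train)))
rmse_test = np.sqrt(mean_squared_error(y_test, lr.predict(X_test)))
print('Train RMSE:', rmse_train)
print('Test RMSE:', rmse_test)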
Question 2 (10pts)
We can follow these steps to compute the closed-form solution for W and b:
1) Append a new column to X_train in which every element is 1. Name the resulting extended training data X. The new column is the rightmost column of X.
2) Calculate the coefficients W_ using the normal equation below:
W_ = $(X^T X)^{-1} X^T Y$, where Y is y_train.
Then W is the first two elements of W_ and b is the last element of W_.
Print W and b. The obtained W and b should match the results obtained from the scikit-learn model.
In [ ]:
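A minimal sketch of the closed-form computation under the steps above; it assumes the explicit matrix inverse is acceptable here, though np.linalg.solve(X.T @ X, X.T @ y_train) would be a numerically safer alternative.
In [ ]:
# 1) Append a rightmost column of ones to X_train
X = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
# 2) Normal equation: W_ = (X^T X)^{-1} X^T Y
W_ = np.linalg.inv(X.T @ X) @ X.T @ y_train
# The first two elements are W; the last element is b
W, b = W_[:2], W_[-1]
print('W =', W)
print('b =', b)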
Question 3 (10pts)
Initialize W and b as random values drawn from a normal distribution with mean 0 and variance 1. Use gradient descent for n epochs [10
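A hedged sketch of the gradient-descent loop follows, assuming a mean-squared-error loss; the epoch count n_epochs, the learning rate learning_rate, and the seed are placeholder assumptions to be replaced with the values specified in the question.
In [ ]:
# Initialize W and b from N(0, 1)
rng = np.random.default_rng(0)   # seed chosen arbitrarily
W = rng.normal(0.0, 1.0, size=X_train.shape[1])
b = rng.normal(0.0, 1.0)
n_epochs = 100        # placeholder; use the epoch values from the question
learning_rate = 0.1   # placeholder learning rate
n = X_train.shape[0]
for epoch in range(n_epochs):
    residual = X_train @ W + b - y_train
    # Gradients of the mean squared error loss
    grad_W = (2.0 / n) * (X_train.T @ residual)
    grad_b = (2.0 / n) * residual.sum()
    W -= learning_rate * grad_W
    b -= learning_rate * grad_b
print('W =', W, 'b =', b)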