CPE 695 Applied Machine Learning
HW – 1 Linear Regression
Name
I pledge on my honor that I have not given or received any unauthorized
assistance on this assignment/examination. I further pledge that I have
not copied any material from a book, article, the Internet or any other
source except where I have expressly cited the source.
Signature _________________________ Date: _____________
We learned in class that there are multiple ways to find the
coefficients of a linear regression model. In this homework, use any
language you prefer to implement:
1. Normal equation, and
2. Gradient descent (batch or stochastic mode)
   a. Print the cost function over iterations
   b. Describe how you choose the right alpha (learning rate)
3. Summarize the results: for example, report confidence intervals
   for the estimated coefficients
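As an illustration of items 1 and 3, here is a minimal R sketch, not a reference solution. It assumes data simulated with the same recipe as the sample R code in this handout. The names `X`, `beta_hat`, `se`, and `ci` are illustrative, and the intervals rely on the usual assumption of independent, normally distributed errors.

```r
# Simulated data, same recipe as the sample R code in this handout
set.seed(123)
x1 <- runif(1000, -5, 5)
x2 <- runif(1000, -5, 5)
y  <- x1 + 4 * x2 + rnorm(1000) + 3

# Design matrix with a leading column of ones for the intercept
X <- cbind(intercept = 1, x1 = x1, x2 = x2)

# Normal equation: solve (X'X) beta = X'y without forming an explicit inverse
XtX <- crossprod(X)
beta_hat <- drop(solve(XtX, crossprod(X, y)))

# Standard errors from the residual variance and the diagonal of (X'X)^-1
n <- nrow(X)
p <- ncol(X)
res <- y - drop(X %*% beta_hat)
sigma2 <- sum(res^2) / (n - p)
se <- sqrt(diag(sigma2 * solve(XtX)))

# 95% confidence intervals based on the t distribution (assumed error model)
t_crit <- qt(0.975, df = n - p)
ci <- cbind(estimate = beta_hat,
            lower = beta_hat - t_crit * se,
            upper = beta_hat + t_crit * se)
print(round(ci, 3))

# Cross-check against R's built-in least squares fit
print(coef(lm(y ~ x1 + x2)))
```

Solving the linear system with `solve(XtX, ...)` rather than inverting `XtX` first is a common choice for numerical stability. The comparison with `lm()` is only a sanity check on the hand-written estimate.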
A simulated dataset will be provided. Your job is to find
coefficients that accurately estimate the true coefficients.
Your solution should be approximately 3 for the intercept, and 1 and 4
for the coefficients of x1 and x2, respectively.
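Batch gradient descent, with the cost printed over iterations, might be sketched as follows. This is one possible implementation under stated assumptions: the helper names `cost` and `gradient_descent`, the learning rates, and the iteration counts are all illustrative choices, not prescribed by the assignment.

```r
# Simulated data, same recipe as the sample R code in this handout
set.seed(123)
x1 <- runif(1000, -5, 5)
x2 <- runif(1000, -5, 5)
y  <- x1 + 4 * x2 + rnorm(1000) + 3

# Design matrix with a leading column of ones for the intercept
X <- cbind(intercept = 1, x1 = x1, x2 = x2)

# Cost J(theta) = sum((X theta - y)^2) / (2m)
cost <- function(X, y, theta) {
  sum((X %*% theta - y)^2) / (2 * length(y))
}

# Batch gradient descent (hypothetical helper); records the cost per iteration
gradient_descent <- function(X, y, alpha = 0.05, n_iter = 500) {
  m <- length(y)
  theta <- matrix(0, nrow = ncol(X), ncol = 1)
  history <- numeric(n_iter)
  for (i in seq_len(n_iter)) {
    gradient <- t(X) %*% (X %*% theta - y) / m  # gradient of J at theta
    theta <- theta - alpha * gradient           # step against the gradient
    history[i] <- cost(X, y, theta)
  }
  list(theta = drop(theta), cost = history)
}

fit <- gradient_descent(X, y)
print(fit$cost[c(1, 10, 50, 100, 500)])  # cost at selected iterations
print(fit$theta)                         # estimated intercept and slopes

# Compare a few learning rates: too large an alpha makes the cost grow
alphas <- c(0.001, 0.01, 0.05, 0.3)
final_cost <- sapply(alphas, function(a) {
  tail(gradient_descent(X, y, alpha = a, n_iter = 50)$cost, 1)
})
print(data.frame(alpha = alphas, final_cost = final_cost))
```

One way to choose alpha is to try a few values on a roughly logarithmic grid, as in the last lines, and keep the largest one for which the cost still decreases steadily. Plotting `fit$cost` against the iteration number gives the same information visually.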
[Bonus 20 points] Given the Advertising.csv dataset that we used in
class, use the same code you wrote for the simulated data to
estimate the coefficients for the advertising dataset. Explain why it
does not work directly on the raw data and how to fix the problem.
Sample R code:
1. Simulated data
# artificial dataset
set.seed(123)
x1 <- runif(1000, -5, 5)
x2 <- runif(1000, -5, 5)
y <- x1 + 4*x2 + rnorm(1000) + 3
2. Advertising dataset
ads <- read.csv(file='data/Advertising.csv')
X <- ads[, c('TV', 'Radio', 'Newspaper')]
Y <- ads$Sales