ELEC 400M / EECE 571M
Homework 2: The Linear Model
Homework is not handed in. Please solve for your own benefit.
• Problem 3.1, solution attached
• Problem 3.2, solution attached
• Problem 3.3, solution attached
• Problem 3.4, solution attached
• Problem 3.16, solution attached
• Extra: Problem 3.15
Problems 3.1-3.3
January 26, 2021
1 Problems 3.1-3.3 from Learning from Data
[12]: %reset -f
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
[13]: def ErrorRate (w, x, y):
### Your error rate here
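One possible implementation, as a reference sketch (it takes the error to be the fraction of misclassified points, which matches how ErrorRate is used in the cells below):

def ErrorRate(w, x, y):
    # fraction of points where sign(x_n^T w) disagrees with the label y_n
    return np.mean(np.sign(np.dot(x, w)) != y)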
[14]: def PLA(w, x, y, maxIter):
### Your PLA here
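A possible reference implementation (a sketch; it returns the final weights and the number of iterations used, matching the calls wPLA, iters = PLA(...) below):

def PLA(w, x, y, maxIter):
    # perceptron learning algorithm: correct one misclassified point per iteration
    for t in range(int(maxIter)):
        mis = np.nonzero(np.sign(np.dot(x, w)) != y)[0]
        if len(mis) == 0:
            return w, t              # data separated: stop early
        n = mis[0]                   # e.g. the first misclassified point
        w = w + y[n] * x[n]          # PLA update step
    return w, int(maxIter)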
[42]: def pocket(x, y, T):
### Your Pocket algorithm here
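A possible sketch of the pocket algorithm. Since Ein is later plotted against np.arange(0, T+1), this version returns the best weights found and the running best in-sample error after 0, 1, ..., T iterations:

def pocket(x, y, T):
    # pocket algorithm: run PLA updates but keep the best weights seen so far
    w = np.zeros(x.shape[1])
    w_best = w.copy()
    E_best = ErrorRate(w, x, y)
    E_trace = [E_best]                 # best error after 0, 1, ..., T updates
    for t in range(T):
        mis = np.nonzero(np.sign(np.dot(x, w)) != y)[0]
        if len(mis) == 0:              # separated: error stays at zero
            E_trace.extend([0.0] * (T - t))
            return w, np.array(E_trace)
        n = np.random.choice(mis)      # a random misclassified point
        w = w + y[n] * x[n]            # PLA update step
        E = ErrorRate(w, x, y)
        if E < E_best:                 # put better weights in the pocket
            E_best, w_best = E, w.copy()
        E_trace.append(E_best)
    return w_best, np.array(E_trace)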
[30]: def LinReg(X, y):
    # least-squares weights w = (X^T X)^{-1} X^T y (the pseudo-inverse solution)
    return np.dot(np.linalg.inv(np.dot(X.T, X)), np.dot(X.T, y))
1.1 Problem 3.1
[31]: def gen_semi_circle_data(nbr_x, rad, thk, sep):
    # bounding box enclosing the two semicircular rings
    x_l = thk + rad
    x_r = 2*rad + 1.5*thk
    x_u = thk + rad
    x_d = thk + rad + sep
    nbr_x_gen = 0
    x = []
    y = []
    while nbr_x_gen < nbr_x:
        # sample a candidate point uniformly over the bounding box
        x1 = np.random.uniform(-x_l, x_r)
        x2 = np.random.uniform(-x_d, x_u)
        # upper ring: centered at the origin, label -1
        if (x2 >= 0) and (rad**2 <= x1**2 + x2**2 <= (rad + thk)**2):
            x.append((x1, x2))
            y.append(-1)
            nbr_x_gen += 1
        # lower ring: centered at (rad + thk/2, -sep), label +1
        elif (x2 < -sep) and (rad**2 <= (x1 - rad - thk/2)**2 + (x2 + sep)**2 <= (rad + thk)**2):
            x.append((x1, x2))
            y.append(1)
            nbr_x_gen += 1
    return np.squeeze(np.array(x)), np.array(y)
[32]: # generate data
rad = 10
thk = 5
sep = 5
x, y = gen_semi_circle_data(2000, rad, thk, sep)
x = np.hstack((np.ones((len(x), 1)), x))
# run PLA
w_init = np.zeros(len(x[0]))
wPLA, iters = PLA(w_init, x, y, 100)
# run Linear Regression
wLinReg = LinReg(x, y)
# plot
plt.figure(figsize=(12, 12))
plt.plot(x[y == 1][:, 1], x[y == 1][:, 2], 'bo')
plt.plot(x[y == -1][:, 1], x[y == -1][:, 2], 'ro')
xlin = np.linspace(-thk-rad, 2*rad+1.5*thk, 100)
plt.plot(xlin, -wPLA[1]/wPLA[2]*xlin - wPLA[0]/wPLA[2], label="PLA")
plt.plot(xlin, -wLinReg[1]/wLinReg[2]*xlin - wLinReg[0]/wLinReg[2], label="Linear Regression")
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()
1.2 Problem 3.2
[27]: rad = 10
thk = 5
iters = []
for sep in np.arange(0.2, 5.1, 0.2):
    x, y = gen_semi_circle_data(2000, rad, thk, sep)
    x = np.hstack((np.ones((len(x), 1)), x))
    w_init = np.zeros(len(x[0]))
    w, n_iter = PLA(w_init, x, y, 1e5)
    iters.append(n_iter)
# plot
plt.figure(figsize=(12, 12))
plt.plot(np.arange(0.2, 5.1, 0.2), iters)
plt.xlabel('Separation of semicircles')
plt.ylabel('Number of iterations of PLA');
1.3 Problem 3.3
[46]: def nonLinearTransQ3(x1, x2):
    # third-order polynomial feature transform
    return np.hstack((np.ones((len(x1), 1)), x1, x2, x1**2, x1*x2, x2**2,
                      x1**3, x1**2*x2, x1*x2**2, x2**3))
rad = 10
thk = 5
sep = -5
# generate data
x, y = gen_semi_circle_data (2000, rad, thk, sep)
x = np.hstack((np.ones((len(x),1)),x))
# run pocket
T = 10000
wPocket, Ein = pocket(x, y, T)
# run linear regression
wLinReg = LinReg(x, y)
Ein_LinReg = ErrorRate(wLinReg, x, y)
# nonlinear transform
x1=x[:,1:2]
x2=x[:,2:3]
z=nonLinearTransQ3(x1,x2)
# run pocket
wPocket_NL, Ein_NL = pocket(z, y, T)
# run linear regression
wLinReg_NL = LinReg(z, y)
Ein_LinReg_NL = ErrorRate(wLinReg_NL, z, y)
# plot error rate
plt.figure(figsize = (12,12))
plt.plot(np.arange(0,T+1),Ein,label="Pocket")
plt.plot(np.arange(0,T+1),np.ones(T+1)*Ein_LinReg,label="Linear regression")
plt.plot(np.arange(0,T+1),Ein_NL,label="Pocket, nonlinear transform")
plt.plot(np.arange(0,T+1),np.ones(T+1)*Ein_LinReg_NL,label="Linear regression, nonlinear transform")
plt.legend()
plt.xlabel('Number of iterations')
plt.ylabel('Classification error $E_{in}$');
[54]: # plot decision regions
plt.figure(figsize = (12,12))
plt.plot(x[y == 1][:,1], x[y == 1][:,2], 'bo')
plt.plot(x[y == -1][:,1], x[y == -1][:,2], 'ro')
xlin = np.linspace(-thk-rad, 2*rad+1.5*thk, 100)
# Pocket
plt.plot(xlin, -wPocket[1]/wPocket[2]*xlin - wPocket[0]/wPocket[2], label="Pocket", linewidth=3)
# Linear Regression
plt.plot(xlin, -wLinReg[1]/wLinReg[2]*xlin - wLinReg[0]/wLinReg[2], label="Linear regression", linewidth=3)
# for NL
x_l = thk+rad
x_r = 2*rad+1.5*thk
x_u = thk+rad
x_d = thk+rad+sep
h = (x_r-x_l)/100
xx1, xx2 = np.meshgrid(np.arange(-x_l, x_r, h), np.arange(-x_d, x_u, h))
x1 = xx1.reshape(-1, 1)
x2 = xx2.reshape(-1, 1)
z = nonLinearTransQ3(x1,x2)
# Pocket
yhat = np.sign(np.dot(z,wPocket_NL))
yhat = yhat.reshape(xx1.shape)
#plt.contourf(xx1, xx2, yhat, cmap=plt.cm.jet, alpha=.8)
cp=plt.contour(xx1, xx2, yhat, colors='k', linewidths=2)
cp.collections[0].set_label("Pocket, nonlinear transform")
# Linear Regression
yhat = np.sign(np.dot(z,wLinReg_NL))
yhat = yhat.reshape(xx1.shape)
cp=plt.contour(xx1, xx2, yhat, colors='g', linewidths=2)
cp.collections[0].set_label("Linear regression, nonlinear transform")
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()
• Problem 3.4
(a) With $e_n(\mathbf{w}) = \left(\max(0,\, 1 - y_n\mathbf{w}^T\mathbf{x}_n)\right)^2$, the only critical point is the transition between the two branches of the max function, i.e., where $1 - y_n\mathbf{w}^T\mathbf{x}_n = 0$. The error function converges to 0 from both sides when approaching this point,
$$\lim_{\mathbf{w}:\, y_n\mathbf{w}^T\mathbf{x}_n \to 1^{\pm}} e_n(\mathbf{w}) = 0,$$
and hence it is continuous everywhere.
The gradient is obtained as
$$\nabla e_n(\mathbf{w}) = \begin{cases} 0, & \text{if } y_n\mathbf{w}^T\mathbf{x}_n > 1, \\ -2(1 - y_n\mathbf{w}^T\mathbf{x}_n)\, y_n\mathbf{x}_n, & \text{if } y_n\mathbf{w}^T\mathbf{x}_n < 1, \end{cases}$$
and
$$\lim_{\mathbf{w}:\, y_n\mathbf{w}^T\mathbf{x}_n \to 1^{\pm}} \nabla e_n(\mathbf{w}) = \mathbf{0},$$
so the two branches of the gradient agree at the transition. Hence, the function is differentiable everywhere.
(b) We need to consider two cases.
1. $\mathbb{1}\{\mathrm{sign}(\mathbf{w}^T\mathbf{x}_n) \neq y_n\} = 1$: Then $y_n\mathbf{w}^T\mathbf{x}_n < 0$, so the max is attained by its second branch, and thus $e_n(\mathbf{w}) = (1 - y_n\mathbf{w}^T\mathbf{x}_n)^2 > 1 \geq \mathbb{1}\{\mathrm{sign}(\mathbf{w}^T\mathbf{x}_n) \neq y_n\}$.
2. $\mathbb{1}\{\mathrm{sign}(\mathbf{w}^T\mathbf{x}_n) \neq y_n\} = 0$: Since $e_n(\mathbf{w}) \geq 0$ for all $\mathbf{w}$, we have $e_n(\mathbf{w}) \geq \mathbb{1}\{\mathrm{sign}(\mathbf{w}^T\mathbf{x}_n) \neq y_n\}$.
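A quick numerical spot-check of this upper bound, for illustration only (the random test points and the variable names are my own):

import numpy as np
rng = np.random.default_rng(0)
for _ in range(1000):
    w = rng.normal(size=3)
    x_n = rng.normal(size=3)
    y_n = rng.choice([-1, 1])
    # e_n(w) = (max(0, 1 - y_n w^T x_n))^2 must dominate the 0/1 error
    e_n = max(0.0, 1.0 - y_n * np.dot(w, x_n))**2
    assert e_n >= (np.sign(np.dot(w, x_n)) != y_n)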
(c) We perform stochastic gradient descent on $\frac{1}{N}\sum_{n=1}^{N} e_n(\mathbf{w})$. Introducing $s(t) = \mathbf{w}^T(t)\,\mathbf{x}_n$, the update equation is
$$\mathbf{w}(t+1) = \mathbf{w}(t) - \eta\,\nabla e_n(\mathbf{w}(t)) = \mathbf{w}(t) + \begin{cases} 0, & \text{if } y_n s(t) > 1, \\ \underbrace{2\eta}_{\eta'}\,(y_n - s(t))\,\mathbf{x}_n, & \text{if } y_n s(t) \leq 1, \end{cases}$$
where we used $y_n^2 = 1$ to write $(1 - y_n s(t))\,y_n = y_n - s(t)$. This is the Adaline algorithm from Problem 1.5.
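For concreteness, a minimal sketch of one pass of this SGD update in Python (the function name adaline_epoch, the default step size, and the random visiting order are illustrative choices, not from the problem statement; eta_p plays the role of η′ = 2η):

import numpy as np

def adaline_epoch(w, X, y, eta_p=0.01):
    # one SGD pass of the update derived above:
    # w <- w + eta' * (y_n - s) * x_n whenever y_n * s <= 1, with s = w^T x_n
    for n in np.random.permutation(len(y)):
        s = np.dot(w, X[n])
        if y[n] * s <= 1:
            w = w + eta_p * (y[n] - s) * X[n]
    return w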
• Problem 3.16
(a) If we decide to accept the person, then the expected cost is
$$\mathrm{E}[\text{cost} \mid \text{accept}] = 0 \cdot \Pr[y = +1 \mid \mathbf{x}] + c_a \cdot \Pr[y = -1 \mid \mathbf{x}]\,.$$
Since our estimate for $\Pr[y = +1 \mid \mathbf{x}]$ is $g(\mathbf{x})$, we have
$$\text{cost(accept)} = c_a\,(1 - g(\mathbf{x}))\,.$$
Similarly,
$$\text{cost(reject)} = c_r\, g(\mathbf{x})\,.$$
(b) We accept a person if $\text{cost(reject)} \geq \text{cost(accept)}$, i.e., if
$$c_r\, g(\mathbf{x}) \geq c_a\,(1 - g(\mathbf{x})) \quad\Longleftrightarrow\quad g(\mathbf{x}) \geq \frac{c_a}{c_a + c_r}\,,$$
and hence
$$\kappa = \frac{c_a}{c_a + c_r}\,.$$
(c) Supermarket: $c_a = 1$ and $c_r = 10$, and hence $\kappa = 1/11$. The low threshold reflects that you would rather accidentally admit somebody who is not in the bonus program than falsely reject a patron who is, so false rejects are minimized.
CIA: $c_a = 1000$ and $c_r = 1$, and hence $\kappa = 1000/1001$. In this example, a false accept would be a disaster, so we only accept persons whose authorization we are highly confident of.
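As a quick arithmetic check of the two thresholds, a tiny sketch (the helper name threshold is mine):

def threshold(c_a, c_r):
    # accept iff g(x) >= kappa = c_a / (c_a + c_r)
    return c_a / (c_a + c_r)

print(threshold(1, 10))    # supermarket: 1/11 ~ 0.091
print(threshold(1000, 1))  # CIA: 1000/1001 ~ 0.999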