
COMP5329 – Deep Learning¶
Tutorial 2 – Multilayer Neural Network¶

Semester 1, 2021

Objectives:

To understand the multi-layer perceptron.
To become familiar with backpropagation.

Instructions:

Go to File->Open. Drag and drop the "lab2MLP_student.ipynb" file onto the home interface and click Upload.
Read the code and complete the exercises.
To run the cell you can press Ctrl-Enter or hit the Play button at the top.

Loading the packages¶

In [ ]:

import numpy as np
import matplotlib.pyplot as pl
from ipywidgets import interact, widgets
from matplotlib import animation

The Dataset¶
The following script allows you to create a 2D dataset by using the mouse. A left click adds a point belonging to class A (blue), and a right click adds a point belonging to class B (red). You can create as many points as you like. Each point in the final dataset therefore contains three values: the x coordinate in [-1,1], the y coordinate in [-1,1], and the class label in {1,-1}.

In [ ]:

%matplotlib notebook

# create the figure
fig = pl.figure(figsize=(6, 6))
pl.title("Input Dataset")
pl.xlim((-2, 2))
pl.ylim((-2, 2))

dataset = []

# setting the click event: a left click adds a blue point (class -1),
# a right click adds a red point (class +1)
def onclick(event):
    global dataset
    cx = event.xdata
    cy = event.ydata
    co = event.button
    dataset.append((cx, cy, co - 2))

    pl.scatter(cx, cy, c=(['b', 'r'])[co > 2], s=100, lw=0)
    pl.grid(True)

cid = fig.canvas.mpl_connect('button_press_event', onclick)

In [ ]:

%matplotlib inline

Show the dataset¶

In [ ]:

# Run this code block to generate a random dataset instead, if you are using Google Colab (the interactive plot above is not supported there)
class_1 = np.hstack([np.random.normal( 1, 1, size=(25, 2)), np.ones(shape=(25, 1))])
class_2 = np.hstack([np.random.normal(-1, 1, size=(25, 2)), -np.ones(shape=(25, 1))])
dataset = np.vstack([class_1, class_2])
dataset

Out[ ]:

array([[ 2.08053124, 0.7997481 , 1. ],
[ 1.01020104, -0.67326865, 1. ],
[ 1.61462389, 0.59748769, 1. ],
[ 1.05209536, 1.00923937, 1. ],
[ 1.32721156, 0.51245666, 1. ],
[ 1.30524576, 2.25972211, 1. ],
[ 0.88241679, 1.43789203, 1. ],
[ 1.00659085, 0.50823154, 1. ],
[-0.07855533, 0.79471839, 1. ],
[ 0.69424139, 1.90253347, 1. ],
[-0.65202019, 0.58262893, 1. ],
[ 1.51663836, 2.01803715, 1. ],
[ 1.44157754, 0.81980129, 1. ],
[ 0.19362019, -0.45953656, 1. ],
[ 0.27078822, 1.64881697, 1. ],
[ 0.21677635, 0.93895035, 1. ],
[ 0.36153144, 0.20814551, 1. ],
[ 0.8564538 , 0.99844893, 1. ],
[ 1.51092535, -0.20891597, 1. ],
[-0.45555107, 1.50849382, 1. ],
[ 1.33746865, -0.02471705, 1. ],
[-0.14216097, 0.82048818, 1. ],
[-0.11370976, -0.92828622, 1. ],
[ 0.18958969, 1.18666251, 1. ],
[-0.3829876 , -0.04470097, 1. ],
[-0.83731959, -1.18313309, -1. ],
[-1.51843651, 0.75433994, -1. ],
[-1.09662947, -1.94547105, -1. ],
[-0.26206777, 0.03161099, -1. ],
[-0.77807205, -1.07878211, -1. ],
[ 1.10024545, -1.00090903, -1. ],
[-0.71288403, -1.11606537, -1. ],
[-0.74053771, 1.71102965, -1. ],
[-1.11566994, -2.00410355, -1. ],
[-2.07805826, -1.31067104, -1. ],
[-1.09452175, -1.62146432, -1. ],
[-1.65006709, -1.35826716, -1. ],
[-0.60935775, -0.53560331, -1. ],
[-0.41974783, 0.28725026, -1. ],
[-0.86820094, -2.16221091, -1. ],
[-1.20596335, 0.15974258, -1. ],
[-2.25399166, -1.31475396, -1. ],
[-1.61958938, -2.50899222, -1. ],
[-1.55776355, -0.82686885, -1. ],
[-0.60181084, -2.36415966, -1. ],
[-1.46061271, -1.1791191 , -1. ],
[-1.86549811, -2.2066424 , -1. ],
[ 0.86628608, -2.03850892, -1. ],
[ 0.81031179, -0.595103 , -1. ],
[ 0.57334688, -0.55846204, -1. ]])

In [ ]:

pl.figure(figsize=(6, 6))
pl.scatter(class_1[:,0], class_1[:,1], label='+1')
pl.scatter(class_2[:,0], class_2[:,1], label='-1')
pl.grid()
pl.legend()
pl.show()

Definition of some activation functions¶

Linear
$$output = x$$

Tanh

$$output = \tanh(x)$$

Sigmoid
$$output = \frac {1}{1 + e^{-x}}$$

In [ ]:

# create an activation class
# each time, we can initialize an activation function object with one specific function
# for example: f = Activation("tanh") creates a tanh activation function.
# you can also define more activation functions yourself, such as ReLU (a sketch is given after this cell)!

class Activation(object):
    def __tanh(self, x):
        return np.tanh(x)

    def __tanh_deriv(self, a):
        # a = np.tanh(x)
        return 1.0 - a**2

    def __logistic(self, x):
        return 1.0 / (1.0 + np.exp(-x))

    def __logistic_deriv(self, a):
        # a = logistic(x)
        return a * (1 - a)

    def __init__(self, activation='tanh'):
        if activation == 'logistic':
            self.f = self.__logistic
            self.f_deriv = self.__logistic_deriv
        elif activation == 'tanh':
            self.f = self.__tanh
            self.f_deriv = self.__tanh_deriv
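
As the comment above suggests, you can add other nonlinearities. The cell below is a minimal sketch (not part of the original tutorial code) of how a ReLU option could be added by subclassing the Activation class defined above; the name ActivationWithReLU is only illustrative.

In [ ]:

# illustrative sketch: extend Activation with a ReLU option
class ActivationWithReLU(Activation):
    def __relu(self, x):
        return np.maximum(0.0, x)

    def __relu_deriv(self, a):
        # a = relu(x); the gradient is 1 where the output is positive, 0 elsewhere
        return np.where(a > 0, 1.0, 0.0)

    def __init__(self, activation='relu'):
        if activation == 'relu':
            self.f = self.__relu
            self.f_deriv = self.__relu_deriv
        else:
            # fall back to the tanh / logistic options defined above
            super().__init__(activation)

# example: f = ActivationWithReLU('relu'); f.f(np.array([-1.0, 2.0]))  ->  array([0., 2.])

Note that HiddenLayer below constructs Activation directly, so to actually train with ReLU you would also need to point HiddenLayer (and the loss) at this extended class, or simply add the two ReLU methods to Activation itself.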

Define HiddenLayer¶

$$output = f_{act}\Big(\sum_{i=0}^{1} I_{i} W_{i} + b\Big)$$

In [ ]:

# now we define the hidden layer for the MLP
# for example, h1 = HiddenLayer(10, 5, activation="tanh") creates a layer with 10-dimensional input and 5-dimensional output, using the tanh activation function.
# note: the input size of a hidden layer must match the output size of the previous layer!

class HiddenLayer(object):
    def __init__(self, n_in, n_out,
                 activation_last_layer='tanh', activation='tanh', W=None, b=None):
        """
        Typical hidden layer of an MLP: units are fully connected and have
        a sigmoidal activation function. The weight matrix W is of shape (n_in, n_out)
        and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input, W) + b)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: string
        :param activation: non-linearity to be applied in the hidden layer
        """
        self.input = None
        self.activation = Activation(activation).f

        # activation derivative of the previous layer
        self.activation_deriv = None
        if activation_last_layer:
            self.activation_deriv = Activation(activation_last_layer).f_deriv

        # we randomly assign small values for the weights as the initialization
        self.W = np.random.uniform(
            low=-np.sqrt(6. / (n_in + n_out)),
            high=np.sqrt(6. / (n_in + n_out)),
            size=(n_in, n_out)
        )
        if activation == 'logistic':
            self.W *= 4

        # we set the size of the bias as the size of the output dimension
        self.b = np.zeros(n_out,)

        # we set the size of the weight gradient as the size of the weights
        self.grad_W = np.zeros(self.W.shape)
        self.grad_b = np.zeros(self.b.shape)

    # the forward and backward passes for each training step
    # please study the week 2 lecture contents carefully to understand this code.
    def forward(self, input):
        '''
        :type input: numpy.array
        :param input: a symbolic tensor of shape (n_in,)
        '''
        lin_output = np.dot(input, self.W) + self.b
        self.output = (
            lin_output if self.activation is None
            else self.activation(lin_output)
        )
        self.input = input
        return self.output

    def backward(self, delta, output_layer=False):
        self.grad_W = np.atleast_2d(self.input).T.dot(np.atleast_2d(delta))
        self.grad_b = delta
        if self.activation_deriv:
            delta = delta.dot(self.W.T) * self.activation_deriv(self.input)
        return delta
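
To see the formula above in action, the next cell is a small sanity check (not part of the original tutorial): it pushes a single 2-D input through one HiddenLayer and then runs a backward step with a dummy delta, just to confirm the shapes of the output and the gradients. The numbers are random, so exact values will differ between runs.

In [ ]:

# illustrative sanity check: forward and backward through a single HiddenLayer
layer = HiddenLayer(2, 3, activation_last_layer=None, activation='tanh')

x = np.array([0.5, -0.2])        # one 2-D input point
out = layer.forward(x)           # shape (3,): tanh(x.dot(W) + b)
print('output shape:', out.shape)

dummy_delta = np.ones(3)         # pretend gradient coming from the next layer
layer.backward(dummy_delta)      # fills layer.grad_W and layer.grad_b
print('grad_W shape:', layer.grad_W.shape)   # (2, 3), same as W
print('grad_b shape:', layer.grad_b.shape)   # (3,), same as b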

The MLP¶
This class implements an MLP with a fully configurable number of layers and neurons. It adapts its weights with the backpropagation algorithm in an online (one sample at a time) manner.

In [ ]:

class MLP:
    """
    """

    # for initialization, the code will create all layers automatically based on the provided parameters.
    def __init__(self, layers, activation=[None, 'tanh', 'tanh']):
        """
        :param layers: A list containing the number of units in each layer.
        Should be at least two values
        :param activation: The activation function to be used. Can be
        "logistic" or "tanh"
        """
        ### initialize layers
        self.layers = []
        self.params = []

        self.activation = activation
        for i in range(len(layers) - 1):
            self.layers.append(HiddenLayer(layers[i], layers[i + 1], activation[i], activation[i + 1]))

    # forward pass: pass the information through the layers and return the results of the final output layer
    def forward(self, input):
        for layer in self.layers:
            output = layer.forward(input)
            input = output
        return output

    # define the objective/loss function; we use mean square error (MSE) as the loss
    # you can try other losses, such as cross entropy (a sketch is given after this cell).
    def criterion_MSE(self, y, y_hat):
        activation_deriv = Activation(self.activation[-1]).f_deriv
        # MSE
        error = y - y_hat
        loss = error**2
        # calculate the delta of the output layer
        delta = -error * activation_deriv(y_hat)
        # return loss and delta
        return loss, delta

    # backward pass
    def backward(self, delta):
        delta = self.layers[-1].backward(delta, output_layer=True)
        for layer in reversed(self.layers[:-1]):
            delta = layer.backward(delta)

    # update the network weights after backward.
    # make sure you run the backward function before the update function!
    def update(self, lr):
        for layer in self.layers:
            layer.W -= lr * layer.grad_W
            layer.b -= lr * layer.grad_b

    # define the training function
    # it will return all losses within the whole training process.
    def fit(self, X, y, learning_rate=0.1, epochs=100):
        """
        Online learning.
        :param X: Input data or features
        :param y: Input targets
        :param learning_rate: parameter defining the speed of learning
        :param epochs: number of times the dataset is presented to the network for learning
        """
        X = np.array(X)
        y = np.array(y)
        to_return = np.zeros(epochs)

        for k in range(epochs):
            loss = np.zeros(X.shape[0])
            for it in range(X.shape[0]):
                i = np.random.randint(X.shape[0])

                # forward pass
                y_hat = self.forward(X[i])

                # backward pass
                loss[it], delta = self.criterion_MSE(y[i], y_hat)
                self.backward(delta)

                # update
                self.update(learning_rate)
            to_return[k] = np.mean(loss)
        return to_return

    # define the prediction function
    # we can use the predict function to predict the results of new data, using the well-trained network.
    def predict(self, x):
        x = np.array(x)
        output = np.zeros(x.shape[0])
        for i in np.arange(x.shape[0]):
            output[i] = self.forward(x[i, :])
        return output
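
As the comment in criterion_MSE suggests, other losses can be plugged in. Below is a minimal sketch (not part of the original tutorial code) of a binary cross-entropy criterion, written as a standalone function; it assumes the final layer uses a logistic activation so that y_hat lies in (0, 1), and that the targets are re-coded as {0, 1} instead of {-1, 1}.

In [ ]:

# illustrative sketch: a binary cross-entropy criterion
# assumes a 'logistic' final activation (y_hat in (0, 1)) and targets coded as {0, 1}
def criterion_cross_entropy(y, y_hat, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1 - eps)          # avoid log(0)
    loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    # with a logistic output unit, the delta w.r.t. the output pre-activation simplifies to:
    delta = y_hat - y
    return loss, delta

To use it, you would re-encode the targets from {-1, 1} to {0, 1}, choose 'logistic' as the last activation, and call this function instead of self.criterion_MSE inside fit.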

Learning¶

In [ ]:

### Try different MLP models
nn = MLP([2,3,1], [None,'logistic','tanh'])
input_data = dataset[:,0:2]
output_data = dataset[:,2]

In [ ]:

### Try different learning rate and epochs
MSE = nn.fit(input_data, output_data, learning_rate=0.001, epochs=500)
print('loss:%f'%MSE[-1])

loss:0.343023

Plot loss in epochs¶
We can visualize how the loss changes during the training process to understand how well the network learns. As we can see, the loss starts at a high level but drops quickly during training. A small final loss value indicates a well-trained network.

In [ ]:

pl.figure(figsize=(15,4))
pl.plot(MSE)
pl.grid()

In [ ]:

### Try different MLP models
# we can compare the loss curves to understand how the network parameters (such as the number of layers and the activation functions)
# and the training hyper-parameters affect the performance of the network.
nn = MLP([2,3,1], [None,'logistic','tanh'])
input_data = dataset[:,0:2]
output_data = dataset[:,2]
MSE = nn.fit(input_data, output_data, learning_rate=0.0001, epochs=500)
print('loss:%f'%MSE[-1])
pl.figure(figsize=(15,4))
pl.plot(MSE)
pl.grid()

loss:1.085496

In [ ]:

### Try different MLP models
nn = MLP([2,3,1], [None,'logistic','tanh'])
input_data = dataset[:,0:2]
output_data = dataset[:,2]
MSE = nn.fit(input_data, output_data, learning_rate=0.1, epochs=500)
print('loss:%f'%MSE[-1])
pl.figure(figsize=(15,4))
pl.plot(MSE)
pl.grid()

loss:0.334366
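
The three runs above vary only the learning rate. The cell below is an optional sketch (not part of the original tutorial) that sweeps a few learning rates in a loop and overlays the resulting loss curves, which makes the comparison easier to read; it trains a fresh network per run (the local name nn_lr is illustrative), so the nn trained above is left untouched.

In [ ]:

# optional sketch: compare several learning rates on the same plot
pl.figure(figsize=(15, 4))
for lr in [0.0001, 0.001, 0.01, 0.1]:
    nn_lr = MLP([2, 3, 1], [None, 'logistic', 'tanh'])
    curve = nn_lr.fit(input_data, output_data, learning_rate=lr, epochs=500)
    pl.plot(curve, label='lr=%g' % lr)
pl.xlabel('epoch')
pl.ylabel('MSE')
pl.legend()
pl.grid()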

Testing¶

In [ ]:

output = nn.predict(input_data)

In [ ]:

# visualizing the prediction results
# note: since we use the tanh function for the final layer, the output will be in the range [-1, 1]
pl.figure(figsize=(8,6))
pl.scatter(output_data, output, s=100)
pl.xlabel('Targets')
pl.ylabel('MLP output')
pl.grid()
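
Since the targets are {-1, 1} and the tanh output lies in [-1, 1], a simple way to quantify the fit (an optional sketch, not part of the original tutorial) is to threshold the output at zero and compute the classification accuracy:

In [ ]:

# optional sketch: turn the continuous outputs into class predictions and measure accuracy
predicted_labels = np.where(output > 0, 1.0, -1.0)   # threshold the tanh output at zero
accuracy = np.mean(predicted_labels == output_data)
print('training accuracy: %.2f%%' % (100 * accuracy))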

In [ ]:

# create a mesh to plot in
xx, yy = np.meshgrid(np.arange(-2, 2, .02), np.arange(-2, 2, .02))

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max] x [y_min, y_max].
Z = nn.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)

pl.figure(figsize=(15,7))
pl.subplot(1,2,1)
pl.pcolormesh(xx, yy, Z>0, cmap='cool')
pl.scatter(input_data[:,0], input_data[:,1], c=[['b', 'r'][int(d > 0)] for d in output_data], s=100)
pl.xlim(-2, 2)
pl.ylim(-2, 2)
pl.grid()
pl.title('Targets')
pl.subplot(1,2,2)
pl.pcolormesh(xx, yy, Z>0, cmap='cool')
pl.scatter(input_data[:,0], input_data[:,1], c=[['b', 'r'][int(d > 0)] for d in output], s=100)
pl.xlim(-2, 2)
pl.ylim(-2, 2)
pl.grid()
pl.title('MLP output')


Out[ ]:

Text(0.5, 1.0, 'MLP output')

The figure on the left shows the ground-truth label of each data point.¶
The figure on the right shows the label predicted for each data point by the MLP model.¶
Based on the visualization, we can see that the network has learned a boundary between the positive and negative data!¶