Lecture 5
Artificial Neural Networks
Examine the basic principles of artificial neural networks.
Discuss the operation of the Multi Layer Perceptron through the use of suitable examples.
Discuss the derivation of the weight update formula through the use of backpropagation.
A family of algorithms inspired by the structure of the human brain
Neural Networks are used for classification, clustering and numeric prediction tasks.
Most popular types are
Multi Layer Perceptron (MLP) used for classification
Radial Basis Function (RBF) used for classification and numeric prediction
Self Organizing Map (SOM) used for clustering
Convolutional Neural Network (CNN) used for image/text classification
Long Short Term Memory (LSTM) used for modelling time series
Brain performs classifications, predictions and associations
Huge connectivity: each neuron sends and receives on the order of 10^4 synapses (contacts)
Huge complexity
around 10^11 neurons in the brain
Output Y is 1 if at least two of the three inputs are equal to 1.
Model is an assembly of inter-connected nodes and weighted links
The output node sums up its input values according to the weights of its links
The weighted sum is compared against some threshold t
Perceptron Model
Simple perceptrons can be used to classify problems which are linearly separable
For such problems a single line can be drawn which separates the two classes with zero (or near zero) error
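A minimal sketch of this computation, using the weights 0.3, 0.3, 0.3 and threshold t = 0.4 from the slide's example model (any other values could be substituted):

```python
def perceptron(x, w=(0.3, 0.3, 0.3), t=0.4):
    """Return 1 if the weighted sum of the inputs exceeds the threshold t, else 0."""
    s = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if s - t > 0 else 0

# Y is 1 whenever at least two of the three inputs are 1:
for x in [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]:
    print(x, perceptron(x))   # -> 0, 0, 1, 1
```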
Training an ANN means learning the weights of the neurons
In this and the next four slides, the function f and the sigmoid are one and the same.
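For reference, a minimal sketch of the sigmoid, assuming the usual logistic form 1 / (1 + e^(-x)):

```python
import math

def sigmoid(x):
    """Logistic sigmoid: maps any real value into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))    # 0.5
print(sigmoid(20))   # close to 1
print(sigmoid(-20))  # close to 0
```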
X1  X2  NOT X1  NOT X2  y   desired
0   0   1       1       ?   1
0   1   1       0       ?   0
1   0   0       1       ?   0
1   1   0       0       ?   0
a1 and a2 can be computed in parallel, so two neurons can be assigned to do the computation in the hidden (intermediate) layer; a sketch of this computation follows the diagram below.
[Network diagram: bias inputs (1, 1), hidden units a1 and a2, and output y; the weight values shown are -30, -30, 20, 20, 20, 20, 20, 20 and -10.]
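Below is a minimal sketch of how a1, a2 and y could be evaluated with the bias/weight values shown in the diagram (-30, 20, 20 into each hidden unit and -10, 20, 20 into the output). The exact wiring of the inputs to the hidden units is an assumption made for illustration: a1 is fed X1 and X2, and a2 is fed NOT X1 and NOT X2.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x1, x2):
    """Evaluate the hidden units a1, a2 and the output y for binary inputs.

    Assumed wiring: a1 sees (X1, X2), a2 sees (NOT X1, NOT X2), and y combines
    a1 and a2. The bias/weight values are the ones shown on the slide.
    """
    not_x1, not_x2 = 1 - x1, 1 - x2
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)           # close to 1 only if X1 = X2 = 1
    a2 = sigmoid(-30 + 20 * not_x1 + 20 * not_x2)   # close to 1 only if X1 = X2 = 0
    y = sigmoid(-10 + 20 * a1 + 20 * a2)            # close to 1 if a1 or a2 is high
    return a1, a2, y

# a1 and a2 depend only on the inputs, so they can be computed in parallel.
for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a1, a2, y = forward(x1, x2)
    print(x1, x2, round(a1, 3), round(a2, 3), round(y, 3))
```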
Classification problems involving more than two classes are solved through the Softmax function, which is implemented as an additional layer
[Diagram: hidden layers feed a logits layer followed by the Softmax, producing a yes/no decision for each class: apple, bear, candy, dog, egg.]
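A minimal sketch of the Softmax layer, which converts the five logits (one per candidate class) into probabilities that sum to 1; the logit values below are invented for illustration:

```python
import math

def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["apple", "bear", "candy", "dog", "egg"]
logits = [2.0, -1.0, 0.5, 3.0, 0.0]       # hypothetical logit values
for name, p in zip(classes, softmax(logits)):
    print(f"{name}: {p:.3f}")
```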
Initialize the weights (w0, w1, …, wk)
Compute the error at each output node (k), and the hidden node (j) connected to it.
Now adjust the weights wjk such that
wjk(new) = wjk(current) + Δwjk
where Δwjk = r · Error(k) · Oj
r = learning rate parameter (a value between 0 and 1)
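A minimal sketch of this update rule, assuming Error(k) and Oj have already been computed (the names and numbers below are illustrative only):

```python
def update_weight(w_jk, r, error_k, o_j):
    """Apply wjk(new) = wjk(current) + r * Error(k) * Oj."""
    delta_w = r * error_k * o_j
    return w_jk + delta_w

# Example: learning rate 0.1, output-node error 0.8, hidden-node output 0.5.
print(update_weight(w_jk=0.3, r=0.1, error_k=0.8, o_j=0.5))   # ~0.34
```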
Model for the three-input example above: Y = I(0.3·X1 + 0.3·X2 + 0.3·X3 − 0.4 > 0), where I(z) = 1 if z is true and 0 otherwise.
[Figure: the perceptron as a black box: input nodes X1, X2, X3 with weights w1, w2, w3 feed an output node with threshold t, which produces Y.]
Y = I(Σi wi·Xi − t)  or, equivalently,  Y = sign(Σi wi·Xi − t)
[Figure: two classes of points ('+' and '−') separated by a linear decision boundary; a 2×2 pattern (+ −, − +) that cannot be separated by a single line is also shown.]
[Figure: neuron i: inputs I1, I2, I3 arrive over weights wi1, wi2, wi3 and are summed to Si; the activation function g(Si), together with the threshold t, gives the output Oi.]
[Figure: Multi-Layer Perceptron with an input layer (x1 ... x5), a hidden layer, and an output layer producing y.]
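A forward-pass sketch matching this architecture (five inputs, one hidden layer, one output). The hidden-layer width and the weight values here are assumptions chosen only for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Architecture from the figure: 5 inputs -> one hidden layer -> 1 output.
# The hidden-layer width (3) and the random weights are illustrative only.
n_in, n_hidden, n_out = 5, 3, 1
W1 = rng.normal(size=(n_hidden, n_in))    # input -> hidden weights
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))   # hidden -> output weights
b2 = np.zeros(n_out)

def forward(x):
    """Propagate one input vector through the network."""
    h = sigmoid(W1 @ x + b1)   # hidden-layer activations
    y = sigmoid(W2 @ h + b2)   # output activation
    return y

x = np.array([0.2, 0.7, 0.1, 0.9, 0.4])   # example input x1..x5
print(forward(x))
```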