Foundations of Machine Learning: Neural Networks
Kate Farrahi
ECS Southampton
November 19, 2020
References
Pattern Recognition and Machine Learning by Christopher Bishop
Michael Nielsen's online book: http://neuralnetworksanddeeplearning.com
Deep Learning by Ian Goodfellow, Y. Bengio, and A. Courville: http://www.deeplearningbook.org
Step-by-Step Example of Backpropagation: http://mattmazur.com/2015/03/17/a-step-by-step-backpropagation-example
The Neuron
The Human Brain
A highly complex, non-linear, and parallel "computer"
Structural constituents: neurons
The structure of the brain is extremely complex and not fully understood
The human brain contains billions of nerve cells (neurons) and trillions of interconnections
Scientists proposed the artificial neural network (ANN) in an attempt to mimic the brain's behaviour
The human brain is the inspiration for ANNs, though ANNs are extremely simplified and cannot be said to replicate the brain's behaviour very well
Great video about the brain: https://www.youtube.com/watch?v=nvXuq9jRWKE
The Neuron
[Figure: a biological neuron compared with an artificial neuron]
The Perceptron
History of Neural Networks: McCulloch-Pitts Model
1943 McCulloch and Pitts introduced the first model of an extremely simple artificial neuron.
The inputs and outputs could be either a zero or a one.
They introduced the idea of excitatory and inhibitory potentials using positive and negative weights.
Each input is weighted, and the summed activation is either transmitted (output of 1) or not (output of 0).
The McCulloch-Pitts model lacked a mechanism for learning, which was crucial for it to be usable for AI.
Link to the original paper: https://link.springer.com/article/10.1007%2FBF02478259
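As a concrete illustration, here is a minimal Python sketch of a McCulloch-Pitts unit; the function name and the threshold value are illustrative choices, not taken from the original paper.

def mcculloch_pitts(inputs, weights, threshold):
    # A McCulloch-Pitts unit: binary inputs, positive (excitatory) or
    # negative (inhibitory) weights, and a hard-threshold binary output.
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# A two-input unit with excitatory weights that fires only when
# both inputs are active (an AND-like gate).
print(mcculloch_pitts([1, 1], [1, 1], threshold=2))  # -> 1
print(mcculloch_pitts([1, 0], [1, 1], threshold=2))  # -> 0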
History of the Perceptron
1957 Rosenblatt introduced the perceptron which was an electronic device constructed using biological principles and showed the ability to learn.
1962 Rosenblatt wrote a book about the Perceptron and received international recognition.
1969 Marvin Minsky and Seymour Papert published the book "Perceptrons", which proved some limitations of the perceptron (notably that a single-layer perceptron cannot represent functions that are not linearly separable, such as XOR), having a big effect on the community.
History of the Perceptron
Initially the perceptron seemed promising, but it was quickly shown that perceptrons could not be used to classify many classes of patterns.
This caused the field of neural networks to stagnate for many years, before it was recognised that feedforward neural networks with two or more layers (multilayer perceptrons) have far greater representational power.
The popularity of neural networks resurged in the 1980s.
Today deep learning is state of the art for many applications in machine learning.
The Perceptron
[Figure: schematic of the perceptron]
Source: http://www.andreykurenkov.com/writing/ai/a-brief-history-of-neural-nets-and-deep-learning/
The Perceptron Algorithm
Initialization:
Set all of the weights w_i to small random numbers.
Training:
For T iterations (or until the convergence criterion is met), for each input vector x_j:
Compute the output y_j:
y_j = f\left( \sum_{i=0}^{m} w_i x_{ij} \right) = \begin{cases} 1 & \text{if } \sum_{i=0}^{m} w_i x_{ij} > 0 \\ 0 & \text{otherwise} \end{cases}  (1)
Update each weight as follows:
w_i \leftarrow w_i - \eta (y_j - t_j) \cdot x_{ij}  (2)
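A minimal Python sketch of this algorithm follows. It is one possible reading of the pseudocode above: each input vector is augmented with a constant component x_0 = 1 so that w_0 acts as the bias weight, and names such as train_perceptron are illustrative.

import random

def step(activation):
    # Threshold activation function f from Eq. (1)
    return 1 if activation > 0 else 0

def train_perceptron(data, targets, eta=0.1, max_iters=100, w=None):
    # data    : input vectors, each augmented with a leading 1 (bias input x_0)
    # targets : desired outputs t_j (0 or 1)
    if w is None:
        w = [random.uniform(-0.1, 0.1) for _ in range(len(data[0]))]
    for _ in range(max_iters):
        errors = 0
        for x, t in zip(data, targets):
            y = step(sum(wi * xi for wi, xi in zip(w, x)))
            if y != t:
                errors += 1
                # Eq. (2): w_i <- w_i - eta * (y_j - t_j) * x_ij
                w = [wi - eta * (y - t) * xi for wi, xi in zip(w, x)]
        if errors == 0:  # convergence: a full pass with no misclassifications
            break
    return w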
Example
Solve the logical AND function using the perceptron algorithm.
Given b = 1, w_1 = 0, w_2 = 0, η = 0.1, find a solution.
After how many iterations did the perceptron converge?
Do you need a bias term?
What are some suitable convergence criteria?
Is there more than one possible solution?
What happens if you set η to a very large number?
What happens if you set η to a very small number?
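Using the train_perceptron sketch from the algorithm slide, one way to set this exercise up is the following; reading the given b = 1 as a constant bias input of 1 (with all weights, including the bias weight, starting at zero) is an assumption about the notation.

# Truth table for logical AND, inputs augmented with the bias input b = 1.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
T = [0, 0, 0, 1]
w = train_perceptron(X, T, eta=0.1, w=[0.0, 0.0, 0.0])
print(w)  # one of many weight vectors that classify AND correctly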
Example
Solve the logical XOR function using the perceptron algorithm.
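As the Minsky and Papert slide anticipates, XOR is not linearly separable, so a single perceptron never converges on it; a quick check with the earlier sketch:

# Truth table for XOR: no single line separates the 1s from the 0s,
# so the weight updates never reach an error-free pass.
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]
T = [0, 1, 1, 0]
w = train_perceptron(X, T, eta=0.1, w=[0.0, 0.0, 0.0], max_iters=1000)
# Whatever w is returned, at least one pattern remains misclassified.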