
COMP3308/3608, Lecture 8a
ARTIFICIAL INTELLIGENCE
Introduction to Neural Networks. Perceptrons.
Reference: Russell and Norvig, pp.727-731



• Introduction to Neural Networks
• Human nervous system
• Structure and operation of biological neural systems
• What is an artificial neural network?
• Taxonomy of neural networks
• Perceptron
• Neuron model
• Investigation of decision boundary
• Learning rule
• Capabilities and limitations

Artificial Neural Networks
• Field of AI that studies networks of artificial neurons
• It was inspired by the desire to:
• produce artificial systems capable of “intelligent” computations, similar to what the human brain does
• enhance our understanding of the human brain
• Artificial neurons are very simple abstractions of biological neurons
• They are connected to form networks of neurons
• These networks:
• are implemented as a computer program or specialized hardware
• do not have a fraction of the power of the human brain but can be trained to perform useful functions, e.g. classification or prediction

For interested students, not examinable
Efficiency of Biological Neural Systems
The brain performs tasks such as pattern recognition, perception and motor control many times faster than the fastest computers
Efficiency of the human visual system
• Humans – can do perceptual recognition in 100-200 ms (e.g. recognition of a familiar face in an unfamiliar scene)
• Computers – still not able to do this well enough
Efficiency of the sonar system of a bat
• Sonar = SOund NAvigation and Ranging (emitting sounds and listening for echoes; used underwater to navigate and detect other vessels)
• A bat sonar provides information about the distance from a target, its relative velocity, size, azimuth, elevation; the size of various features of the target
• This complex computation (extracting information from the echo) occurs in a brain the size of a plum!
• The precision of target location achieved by the bat is still unmatched by current radar systems
How does the human brain or the brain of a bat do this?

Human Brain
• We still don’t fully understand how it operates
• What we know is that we have a huge number of neurons (brain cells) and connections between them
• About 10 billion (i.e. 10^10) neurons in the brain
• a full Olympic-size swimming pool contains about 10^10 raindrops; the number of stars in the Milky Way is of the same order of magnitude
• About 10^4 connections per neuron
=> total number of connections = 10^10 × 10^4 = 10^14 (100 000 000 000 000)
• Biological neurons are much slower than computers
• Neurons operate in milliseconds (10^-3 s), computers in nanoseconds (10^-9 s)

Biological Neuron
• Purpose: transmit information in the form of electrical signals
• Structure:
• body – contains the chromosomes
• dendrites – inputs
• axon – output
• synapse – a narrow gap (not a link)
• between the axon of one neuron and a dendrite of another neuron
• can get activated chemically, allowing information from one neuron to pass to another
• Operation:
• a neuron accepts many input signals via its dendrites and adds them
• if the summed input signal is strong enough, the neuron gets activated and an electrical signal is generated at its output
• this electrical signal activates the synapse (chemically) and the signal is passed to the input of the connected neuron
Image ref: R. Beale and T. Jackson, Neural Computing – An Introduction, 1990, CRC Press

For interested students, not examinable
More on the operation of biological neurons
• Signals are transmitted between neurons by electrical pulses (action potentials, AP) traveling along the axon
• When the potential at the synapse is raised sufficiently by the AP, it releases chemicals called neurotransmitters
• it may take the arrival of more than one AP before the synapse is triggered
• The neurotransmitters diffuse across the gap and chemically activate the gates on the dendrites, which allows charged ions to flow
• The flow of ions alters the potential of the dendrite and provides a voltage pulse on the dendrite (post-synaptic-potential, PSP)
• some synapses excite the dendrite they affect, while others inhibit it
• the synapses also determine the strength of the new input signal
• each PSP travels along its dendrite and spreads over the body
• the body sums the effects of thousands of PSPs; if the resulting PSP exceeds a threshold, the neuron fires and generates an AP

Biological Neurons – Examples
Image ref: J. Anderson, Introduction to Neural Networks, MIT press, 1995

Learning in Humans
• We were born with some of our neural structures (e.g. neurons); others (e.g. synapses) are formed and modified via learning and experience
• Learning is achieved by:
• creation of new synaptic connections between neurons
• changing the strength of existing synaptic connections
• The synapses are thought to be mainly responsible for learning
• Hebb proposed his famous learning rule in 1949:
The strength of a synapse between 2 neurons is increased by the repeated activation of one neuron by the other across the synapse

What is an Artificial Neural Network (NN)?
A network of many simple neurons (units, nodes)
• Neurons are linked by connections
• Each connection has a numeric weight
• Neurons:
• receive numeric inputs (from the environment or other neurons) via the connections
• produce numeric output using their weights and the inputs
• are organised into layers: input, hidden and output neurons
• A NN can be represented as a directed graph
NNs learn from examples
• They can be used for supervised or unsupervised learning
• The knowledge from the training examples is stored in the weights
• Learning rule – a procedure for modifying the weights in order to perform a certain task
(Figure: a network with layers of input, hidden and output neurons)

A Neuron
(Figure: input vector p and weight vector w feed a summation of products (wp + b), followed by a transfer function that produces the output a)
• w and b are the parameters of the neuron; they are learned using a given learning rule for the specific type of NN
• p comes from the data

A Neuron – More Details
• A connection from neuron i to neuron j has a numeric weight w_ij which determines its strength
• Given an input vector p, the neuron first computes the weighted sum wp and then applies a transfer function f to determine the output a
• The transfer function f has different forms depending on the type of NN
• A neuron typically has a special weight called the bias weight b, which is connected to a fixed input of 1
• A NN represents a function of the inputs p and the weights w and b. By adjusting the weights, we change this function. This is done by using a learning rule.
(Figure: neuron i with input vector p, weights w, bias b connected to a fixed input of 1, a summer Σ and a transfer function f)
Output: a_i = f(wp + b_i)
Example: if there are 2 inputs, p_1 = 2 and p_2 = 3, and w_11 = 3, w_12 = 1, b = -1.5, then a = f(2·3 + 3·1 - 1.5) = f(7.5)
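A minimal Python sketch of this computation (the function and variable names are mine, not from the lecture; the transfer function defaults to the identity, since f depends on the type of NN):

```python
# Net input n = wp + b, output a = f(n)
def neuron_output(w, p, b, f=lambda n: n):
    n = sum(wi * pi for wi, pi in zip(w, p)) + b  # weighted sum plus bias
    return f(n)

print(neuron_output(w=[3, 1], p=[2, 3], b=-1.5))  # -> 7.5, as in the example above
```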

Correspondence Between Artificial and Biological Neurons
• How does this artificial neuron relate to the biological one?
• input vector p – input signals at the dendrites
• weight w (or weight vector w) – strength of the synapse (or synapses)
• summer & transfer function – body
• output a – signal at the axon
(Figure: the artificial neuron a_i = f(wp + b_i) annotated with these correspondences)

Taxonomy of NNs
• Feedforward (acyclic) and recurrent (cyclic, feedback)
• Supervised and unsupervised
• Feedforward supervised networks
• typically used for classification and regression
• perceptrons, ADALINEs, multilayer backpropagation networks, Radial- Basis Function (RBF) networks, Learning Vector Quantization (LVQ) networks, deep neural networks
• Feedforward unsupervised networks
• Hebbian networks used for associative learning
• Competitive networks performing clustering and visualization, e.g. Self-Organizing Feature Maps (SOM)
• Recurrent networks – temporal data processing
• recurrent backpropagation, associative memories, adaptive resonance networks, LSTM deep neural networks

Image ref: www-cse.ucsd.edu/~dasgupta/250B/lec3.ppt

Perceptron
• A supervised NN
• Uses a step transfer function => its output is binary: either 0 or 1 (or either -1 or 1)
• Its output is a weighted sum of its inputs, passed through the step transfer function
• Forms a linear decision boundary

Single-Neuron Perceptron – Example
Step transfer function:
 a=f(n)=0 if n0
• 2 inputs p1 and p2 ; 1 output a
• Each input is weighted (weights w1 and w2)
• An additional weight b (bias weight) associated with the neuron
• The sum of the weighted inputs is sent to a step transfer function (denoted as step; other names: threshold or hard limit)
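As a sketch, this single-neuron perceptron is a few lines of Python (the names are mine; the convention step(n) = 1 for n >= 0 matches the definition above):

```python
def step(n):
    # Step (threshold / hard limit) transfer function
    return 1 if n >= 0 else 0

def perceptron(p1, p2, w1, w2, b):
    # a = step(w1*p1 + w2*p2 + b)
    return step(w1 * p1 + w2 * p2 + b)

print(perceptron(2, 0, w1=1, w2=1, b=-1))  # -> 1 (the example used on the next slides)
```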

Single-Neuron Perceptron – Investigation of the Decision Boundaries
We will use both analytical and graphical representation
Images in next slides from: Hagan, Demuth, Beale, Neural Network Design, 1996, PWS, Thomson

Decision Boundary (1)
• 1. Output: a = f(wp + b) = step(wp + b) = step(w_1 p_1 + w_2 p_2 + b)
• 2. Decision boundary: n = wp + b = w_1 p_1 + w_2 p_2 + b = 0
• 3. Let's assign values to the parameters of the perceptron (w and b): w_1 = 1; w_2 = 1; b = -1
• 4. The decision boundary is: n = wp + b = w_1 p_1 + w_2 p_2 + b = p_1 + p_2 - 1 = 0
• It is a line in the input space
• It separates the input space into 2 subspaces: output = 1 and output = 0
• 5. Draw the decision boundary:
p_1 = 0 => p_2 = 1 (p_2 intercept); p_2 = 0 => p_1 = 1 (p_1 intercept)
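A quick sketch checking these results numerically, assuming the parameter values from step 3 (w_1 = w_2 = 1, b = -1):

```python
w1, w2, b = 1, 1, -1

def output(p1, p2):
    return 1 if w1 * p1 + w2 * p2 + b >= 0 else 0

# The two intercepts lie exactly on the boundary p1 + p2 - 1 = 0 (net input 0):
print(output(0, 1), output(1, 0))  # -> 1 1, since step(0) = 1
print(output(2, 2))                # -> 1: above the line (output = 1 subspace)
print(output(-1, -1))              # -> 0: below the line (output = 0 subspace)
```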

Decision Boundary (2)
• 6. Find the side corresponding to an output of 1 by testing a point, e.g. p = [2 0]^T:
a = step(wp + b) = step([1 1][2 0]^T - 1) = step(1) = 1
• Properties of the decision boundary: it is orthogonal to the weight vector w
• => The decision boundary is defined by the weight vector (if we know the weight vector, we know the decision boundary)
• Most important conclusion: the decision boundary of a perceptron is a line

Perceptron Learning Rule – Derivation
• Supervised learning
• Given: a set of 3 training examples from 2 classes, 1 and 0:
p_1 = [1 2]^T, t_1 = 1;  p_2 = [-1 2]^T, t_2 = 0;  p_3 = [0 -1]^T, t_3 = 0
• Goal: learn to classify these examples correctly with a perceptron (i.e. learn to associate input p_i with output t_i)
• Idea: start with a random set of weights (i.e. a random initial decision boundary); feed each example and iteratively adjust the weights (i.e. adjust the decision boundary) until the examples are correctly classified, i.e. classes 0 and 1 are separated
• Is it possible to solve this problem? We know that a perceptron forms a linear decision boundary; can we separate the examples with a line?
• How many lines can we draw to separate the examples?
• How many inputs and outputs for our perceptron?

Example – Starting Point and First Input Vector
1) Initialization of the weights
• 1 neuron with 2 inputs; suppose that there is no bias weight
• Let our random initial weight vector be w = [1 -0.8]
• It defines our initial classification boundary
2) Start training: feed the input examples to the perceptron iteratively (one by one), calculate the output and adjust w until all 3 examples are correctly classified
• First input example: p_1 = [1 2]^T, t_1 = 1
a = step(w p_1) = step([1 -0.8][1 2]^T) = step(-0.6) = 0
Incorrect classification! Output should be 1 but it is 0!

Tentative Learning Rule
• We need to alter the weight vector so that it points more toward p_1, so that in the future it has a better chance of classifying it correctly
• => Let's add p_1 to w – repeated presentations of p_1 would cause the direction of w to approach the direction of p_1
• => Tentative learning rule (rule 1):
If t = 1 and a = 0, then w_new = w_old + p^T
• Applying the rule:
w_new = w_old + p_1^T = [1 -0.8] + [1 2] = [2 1.2]

Second Input Vector
p_2 = [-1 2]^T, t_2 = 0
a = step(w p_2) = step([2 1.2][-1 2]^T) = step(0.4) = 1
Incorrect classification! Output should be 0 but it is 1!
• We can move the weight vector w away from the input (i.e. subtract it) =>
Rule 2: If t = 0 and a = 1, then w_new = w_old - p^T
• Applying the rule:
w_new = w_old - p_2^T = [2 1.2] - [-1 2] = [3 -0.8]

Third Input Vector
p_3 = [0 -1]^T, t_3 = 0
a = step(w p_3) = step([3 -0.8][0 -1]^T) = step(0.8) = 1
Incorrect classification! Output should be 0 but it is 1!
• We already have a rule for this case – apply rule 2:
If t = 0 and a = 1, then w_new = w_old - p^T
w_new = w_old - p_3^T = [3 -0.8] - [0 -1] = [3 0.2]
• All examples have been fed once – we say that 1 epoch has been completed
• Check how each example is classified by the current classifier:
• if all are correctly classified => stop
• otherwise => repeat
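A short sketch replaying this epoch in Python with rules 1 and 2 (helper names are mine); it reproduces the three weight vectors derived by hand:

```python
def step(n):
    return 1 if n >= 0 else 0

w = [1.0, -0.8]  # the random initial weight vector from the example (no bias)
examples = [([1, 2], 1), ([-1, 2], 0), ([0, -1], 0)]

for p, t in examples:
    a = step(w[0] * p[0] + w[1] * p[1])
    if t == 1 and a == 0:    # rule 1: move w toward p
        w = [w[0] + p[0], w[1] + p[1]]
    elif t == 0 and a == 1:  # rule 2: move w away from p
        w = [w[0] - p[0], w[1] - p[1]]
    print(w)  # -> [2, 1.2], then [3, -0.8], then [3, 0.2]
```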

Unified Learning Rule
• Covers all combinations of output and target values (0 and 1):
Rule 1: If t = 1 and a = 0, then w_new = w_old + p^T
Rule 2: If t = 0 and a = 1, then w_new = w_old - p^T
Rule 3: If t = a, then w_new = w_old
• Defining the error e = t - a, the three rules become:
If e = 1, then w_new = w_old + p^T
If e = -1, then w_new = w_old - p^T
If e = 0, then w_new = w_old
• Unified rule:
w_new = w_old + e p^T
b_new = b_old + e
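In code, the unified rule is just two updates (a minimal sketch; vectors as plain Python lists, names are mine):

```python
def perceptron_update(w, b, p, t, a):
    # e is 1, -1 or 0, reproducing rules 1, 2 and 3 respectively
    e = t - a
    w = [wi + e * pi for wi, pi in zip(w, p)]  # w_new = w_old + e*p^T
    return w, b + e                            # b_new = b_old + e
```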

Perceptron Learning Rule – Summary
1. Initialize the weights (including biases) to small random values; set epoch = 1.
2. Choose an example (input–target output pair {p, t}) from the training set.
3. Calculate the actual output a of the network for this example (also called the network activation).
4. Compute the output error e = t - a.
5. Update the weights: w_new = w_old + e p^T, b_new = b_old + e
6. Repeat steps 2-5 (by choosing another example from the training data).
7. At the end of each epoch check if the stopping criterion is satisfied: all examples are correctly classified or a maximum number of epochs is reached; if yes – stop, otherwise epoch++ and repeat from step 2.
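A compact Python sketch of steps 1-7 (the function and parameter names are mine, not from the lecture; for reproducibility the weights start at zero rather than the small random values step 1 prescribes):

```python
def step(n):
    return 1 if n >= 0 else 0

def net(w, b, p, use_bias):
    # Net input: weighted sum of the inputs (plus the bias, if used)
    return sum(wi * pi for wi, pi in zip(w, p)) + (b if use_bias else 0.0)

def train_perceptron(data, n_inputs, max_epochs=100, use_bias=True):
    # data: list of (input vector p, target t) pairs
    w, b = [0.0] * n_inputs, 0.0                       # step 1 (zeros, for reproducibility)
    for epoch in range(1, max_epochs + 1):
        for p, t in data:                              # steps 2-3
            e = t - step(net(w, b, p, use_bias))       # step 4
            w = [wi + e * pi for wi, pi in zip(w, p)]  # step 5
            if use_bias:
                b += e
        # step 7: re-check all examples at the end of the epoch (no weight change)
        if all(step(net(w, b, p, use_bias)) == t for p, t in data):
            return w, b, epoch
    return w, b, max_epochs                            # max number of epochs reached
```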

Perceptron – Stopping Criterion
• The stopping criterion is checked at the end of each epoch
• Epoch (= training epoch) – one pass through the whole training set (i.e. each training example is passed, the perceptron output is computed and the weights are changed, then the next example is passed, etc.)
• The epoch numbering starts from 1: epoch 1, epoch 2, etc.
• To check if all examples are correctly classified at the end of the epoch:
• all training examples are passed again, the actual output is calculated and compared with the target output; there is no weight change
• note: this does not count as another epoch, as there is no weight change

Capabilities and Limitations
• The output values of a perceptron can take only 2 values: 0 and 1 (or -1 and 1)
• Theorem: if the training examples are linearly separable, the perceptron learning rule is guaranteed to converge to a solution (i.e. a set of weights that correctly classifies the training examples) in a finite number of steps
• When the examples are linearly separable, the perceptron will find a line (hyperplane) that separates the two classes
• It doesn't try to find an "optimal" line (e.g. a line that is in the middle of the positive and negative examples); it will simply stop when a solution (a separating line) is found
(Figures: a linearly separable dataset vs. a linearly inseparable one)
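To illustrate the theorem with the train_perceptron sketch above (an assumed illustration, not from the lecture): AND is linearly separable, so training converges; XOR is the classic linearly inseparable function, so the stopping criterion is never satisfied:

```python
AND = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
XOR = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

print(train_perceptron(AND, n_inputs=2))  # converges in a few epochs
print(train_perceptron(XOR, n_inputs=2))  # runs to max_epochs: no separating line exists
```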

Perceptron Learning Rule – Example
• Given is the following training data:
Ex# | input | output
1 | [1 0] | 1
2 | [0 1] | 0
3 | [1 1] | 1
• Train a perceptron without a bias on this data. Stopping criterion: all examples are correctly classified or a maximum number of 3 epochs is reached. Assume that all initial weights are 0.

Solution
Perceptron learning rule: w_new = w_old + e p^T, where e = t - a
How many inputs? How many outputs? (2 inputs, 1 output)
Starting point: w = [0 0]
Epoch 1:
Apply ex.1 [1 0], t = 1:
a = step([0 0][1 0]^T) = step(0) = 1, correct, no weight change
Apply ex.2 [0 1], t = 0:
a = step([0 0][0 1]^T) = step(0) = 1, incorrect
w_new = [0 0] + (0-1)[0 1] = [0 0] - [0 1] = [0 -1]
Apply ex.3 [1 1], t = 1:
a = step([0 -1][1 1]^T) = step(-1) = 0, incorrect
w_new = [0 -1] + (1-0)[1 1] = [0 -1] + [1 1] = [1 0]
End of epoch 1

Solution – cont.
Check if the stopping criterion is satisfied:
1) Are all training examples correctly classified? w = [1 0] (current weight vector)
Apply ex.1 [1 0], t = 1:
a = step([1 0][1 0]^T) = step(1) = 1, correct
Apply ex.2 [0 1], t = 0:
a = step([1 0][0 1]^T) = step(0) = 1, incorrect => condition not satisfied
2) Epoch = 3 reached? No
=> Stopping criterion is not satisfied => keep training
Start epoch 2: training ...
End of epoch 2: check the stopping criterion ...
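For reference, running the earlier train_perceptron sketch on this data (no bias, at most 3 epochs) reproduces the hand trace:

```python
data = [([1, 0], 1), ([0, 1], 0), ([1, 1], 1)]
w, b, epochs = train_perceptron(data, n_inputs=2, max_epochs=3, use_bias=False)
print(w, epochs)  # -> [1.0, -1.0] 2: epoch 1 matches the trace above,
                  #    and epoch 2 ends with all examples classified correctly
```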

Read at home
Another Example – Apple/Banana Sorter
• A farmer has a warehouse that stores a variety of fruits. He wants to automatically sort the different types of fruit.
• There is a conveyer belt on which the fruit is loaded. It is then passed through a set of sensors, which measure 3 properties of the fruit: shape, texture and weight.
• shape sensor: -1 if the fruit is round, 1 – if it is more elliptical
• texture sensor: -1 if the surface is smooth, 1 – if it is rough
• weight sensor: -1 if the fruit is > 500g, 1 if < 500g
• The sensor outputs are inputs to a NN
• NN's purpose: recognise the fruit type, so that fruits are directed to the correct bin
• For simplicity, only 2 types of fruit: apples and bananas
• How many inputs? How many outputs?
Image ref: Hagan, Demuth, Beale, Neural Network Design, 1996, PWS, Thomson
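As a sketch of the encoding (the specific fruit vectors below are my assumption, following the sensor conventions above): each fruit becomes a 3-element input vector, so the perceptron needs 3 inputs and 1 output (the binary apple/banana decision).

```python
# Hypothetical encodings using the slide's sensor conventions:
# shape: -1 round / 1 elliptical; texture: -1 smooth / 1 rough;
# weight: -1 if > 500 g / 1 if < 500 g
banana = [1, -1, 1]   # elliptical, smooth, under 500 g
apple = [-1, -1, 1]   # round, smooth, under 500 g
# => 3 inputs (one per sensor), 1 binary output (apple vs banana)
```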