COMP3308/COMP3608
Back Propagation of a Multi-layer Neural Network
Josh Stretton, May 3, 2017

1 Notation
This is the notation that will be used for this example:


[Figure: a generic three-layer network used for the notation: input neurons with outputs ok, hidden neurons with combined inputs netj and outputs oj, and output neurons with combined inputs neti and outputs oi.]
Given by the example:
– input neuron outputs (ok)
– all initial edge weights (wkj, wji)
– all initial biases (bj, bi)
– output neuron desired outputs (di)
Left for us to calculate:
– combined inputs into the hidden and output neurons (netj, neti)
– hidden neuron and output neuron outputs (oj, oi)
2 The Example
The example is of a machine that categorises fruit as either a banana or an orange. Presumably we have a magical fruit tree of the future that grows both bananas and oranges, and we need a machine to sort the bananas from the oranges.
Each fruit is scanned, and we pick up two attributes about each fruit: how yellow the fruit is (from 0 to 1) and how round the fruit is (0 to 1).
Two example fruit are picked, and they have the following attributes:

Fruit   Yellowness   Roundness   Class
1       0.6          0.1         Banana
2       0.2          0.3         Orange

3 Neural Network Architecture
To train the machine how to distinguish between these fruits, we use a neural network. The first thing to do is to determine the architecture of the network. We will have:
• two input neurons; one for each of the two attributes.
• two output neurons; one for each possible class value. In an ideal scenario, for any given example, all output neurons will output the value 0, except for the neuron that corresponds to the correct class, which will output the value 1. e.g. for a banana, the outputs will be 1 0 and for an orange the outputs will be 0 1.
• three hidden neurons in a single layer, because I said so. In reality it is not easy to determine the correct number of hidden neurons, so you will often have to try a few structures and see which gives the optimal output.
This gives us the following network:
[Figure: the fruit network. Input neurons 1 and 2 feed hidden neurons 3, 4 and 5 through weights w13, w23, w14, w24, w15 and w25; the hidden neurons feed output neurons 6 and 7 through weights w36, w46, w56, w37, w47 and w57. Each hidden and output neuron also has a bias (b3, b4, b5, b6, b7).]

4 Transfer Function
When provided with some input values, each neuron will combine its inputs as a weighted sum, and then put that sum through a transfer function in order to get a final output value. In our case, we are going to use the sigmoid function:
f(x) = 1 / (1 + e^(−x))

We use the sigmoid function because the function needs to be differentiable (as seen in the lectures), and this one limits values to the range (0, 1). It also has the nice property that f′(x) = f(x)(1 − f(x)) (also proven in lectures).
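As a quick sanity check of these properties, here is a minimal Python sketch of this transfer function and its derivative (the function names are mine, not part of the original example):

```python
import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)); the output always lies in (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_derivative(x):
    # Uses the identity f'(x) = f(x) * (1 - f(x))
    fx = sigmoid(x)
    return fx * (1.0 - fx)
```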
5 Training the Network
To train the network, we need to (perhaps repeatedly) put each fruit through the network.
5.1 Example 1
Let’s put fruit number 1 through the network. This example has the following values:
• o1 = 0.6, o2 = 0.1
• d6 = 1, d7 = 0
Filling out all the information we know, plus adding in initial weights and biases (chosen at random from [-1, 1]):
[Figure: the network with the initial values filled in. Input-to-hidden weights: w13 = 0.1, w23 = −0.2, w14 = 0, w24 = 0.2, w15 = 0.3, w25 = −0.4. Hidden-to-output weights: w36 = −0.4, w46 = 0.1, w56 = 0.6, w37 = 0.2, w47 = −0.1, w57 = −0.2. Biases: b3 = 0.1, b4 = 0.2, b5 = 0.5, b6 = −0.1, b7 = 0.6.]

5.1.1 Feed-forward step
Note: all values will be calculated and rounded to two decimal places, for the purposes of simplicity.
To start, we calculate the output of each of the neurons, starting from the bottom layer and moving upwards. We already have that o1 = 0.6 and o2 = 0.1. For every hidden and output neuron, the combined input and the output are given by:

netj = Σk (wkj × ok) + bj
oj = f(netj)

net3 = w13 × o1 + w23 × o2 + b3
     = 0.1 × 0.6 + (−0.2) × 0.1 + 0.1 = 0.14
o3 = sigmoid(net3) = 0.53

net4 = w14 × o1 + w24 × o2 + b4
     = 0 × 0.6 + 0.2 × 0.1 + 0.2 = 0.22
o4 = sigmoid(net4) = 0.55

net5 = w15 × o1 + w25 × o2 + b5
     = 0.3 × 0.6 + (−0.4) × 0.1 + 0.5 = 0.64
o5 = sigmoid(net5) = 0.65

net6 = w36 × o3 + w46 × o4 + w56 × o5 + b6
     = (−0.4) × 0.53 + 0.1 × 0.55 + 0.6 × 0.65 + (−0.1) = 0.13
o6 = sigmoid(net6) = 0.53

net7 = w37 × o3 + w47 × o4 + w57 × o5 + b7
     = 0.2 × 0.53 + (−0.1) × 0.55 + (−0.2) × 0.65 + 0.6 = 0.52
o7 = sigmoid(net7) = 0.63
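For readers who want to check the arithmetic, the whole feed-forward step can be sketched in a few lines of Python. It reuses sigmoid from the sketch above; the dictionary layout and the function name are my own, and the weights and biases are the initial values from the figure.

```python
# Initial weights w[(from, to)] and biases b[neuron], as given in the figure above.
w = {(1, 3): 0.1, (2, 3): -0.2, (1, 4): 0.0, (2, 4): 0.2, (1, 5): 0.3, (2, 5): -0.4,
     (3, 6): -0.4, (4, 6): 0.1, (5, 6): 0.6, (3, 7): 0.2, (4, 7): -0.1, (5, 7): -0.2}
b = {3: 0.1, 4: 0.2, 5: 0.5, 6: -0.1, 7: 0.6}

HIDDEN, OUTPUT = (3, 4, 5), (6, 7)

def forward(o1, o2, w, b):
    # Returns the output of every neuron, keyed by neuron number.
    o = {1: o1, 2: o2}
    for j in HIDDEN:                      # hidden layer first
        net = w[(1, j)] * o[1] + w[(2, j)] * o[2] + b[j]
        o[j] = sigmoid(net)               # sigmoid() from the earlier sketch
    for i in OUTPUT:                      # then the output layer
        net = sum(w[(j, i)] * o[j] for j in HIDDEN) + b[i]
        o[i] = sigmoid(net)
    return o

o = forward(0.6, 0.1, w, b)
# Rounded to two decimals: o3 ≈ 0.53, o4 ≈ 0.55, o5 ≈ 0.65, o6 ≈ 0.53, o7 ≈ 0.63
```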
5.1.2 Back propagation of errors
Now we need to calculate the error on the output of each neuron, and then update the weights leading to those neurons in order to attempt to fix those errors. Our learning rate η will be given as 0.1.
For an output layer neuron, this error is given by:
δi = (di − oi) · f′(neti)
which when using the sigmoid transfer function, is equivalent to:
δi =(di −oi)·oi ·(1−oi)
δ6 = (d6 − o6) · o6 · (1 − o6)
   = (1 − 0.53) × 0.53 × (1 − 0.53) = 0.12

δ7 = (d7 − o7) · o7 · (1 − o7)
   = (0 − 0.63) × 0.63 × (1 − 0.63) = −0.15

Calculating the changes in the weights is done with the following:

∆wji = η × δi × oj
wji_new = wji + ∆wji
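Continuing the earlier sketch, the output-layer errors and this update rule could look as follows (ETA, delta and the *_new dictionaries are my names; the new values are kept separate because the hidden-layer errors further below are computed with the old weights, exactly as in the worked calculation). The worked values that follow plug in the same numbers by hand.

```python
ETA = 0.1  # learning rate given in the example

# delta_i = (d_i - o_i) * o_i * (1 - o_i) for a sigmoid output neuron
delta = {6: (1 - o[6]) * o[6] * (1 - o[6]),   # d6 = 1  ->  delta6 ≈  0.12
         7: (0 - o[7]) * o[7] * (1 - o[7])}   # d7 = 0  ->  delta7 ≈ -0.15

# New weights and biases into the output layer. The old values in w and b are
# kept as well, because the hidden-layer errors use the pre-update weights.
w_new, b_new = dict(w), dict(b)
for i in OUTPUT:
    for j in HIDDEN:
        w_new[(j, i)] = w[(j, i)] + ETA * delta[i] * o[j]   # ∆w_ji = η·δ_i·o_j
    b_new[i] = b[i] + ETA * delta[i]                        # ∆b_i  = η·δ_i
```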

∆w36 = η × δ6 × o3 = 0.1 × 0.12 × 0.53 = 0.01
w36_new = w36 + ∆w36 = (−0.4) + 0.01 = −0.39

∆w37 = η × δ7 × o3 = 0.1 × (−0.15) × 0.53 = −0.01
w37_new = w37 + ∆w37 = 0.2 + (−0.01) = 0.19

∆w46 = η × δ6 × o4 = 0.1 × 0.12 × 0.55 = 0.01
w46_new = w46 + ∆w46 = 0.1 + 0.01 = 0.11

∆w47 = η × δ7 × o4 = 0.1 × (−0.15) × 0.55 = −0.01
w47_new = w47 + ∆w47 = (−0.1) + (−0.01) = −0.11

∆w56 = η × δ6 × o5 = 0.1 × 0.12 × 0.65 = 0.01
w56_new = w56 + ∆w56 = 0.6 + 0.01 = 0.61

∆w57 = η × δ7 × o5 = 0.1 × (−0.15) × 0.65 = −0.01
w57_new = w57 + ∆w57 = (−0.2) + (−0.01) = −0.21
Similarly, calculating the changes in the biases is done with the following:

∆bi = η × δi
bi_new = bi + ∆bi

∆b6 = η × δ6 = 0.1 × 0.12 = 0.01
b6_new = b6 + ∆b6 = (−0.1) + 0.01 = −0.09

∆b7 = η × δ7 = 0.1 × (−0.15) = −0.01
b7_new = b7 + ∆b7 = 0.6 + (−0.01) = 0.59

Now for a hidden layer neuron, the error is given by:

δj = f′(netj) · Σi (wji · δi)

which when using the sigmoid transfer function, is equivalent to:

δj = oj · (1 − oj) · Σi (wji · δi)
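Before plugging in the numbers below, here is the same hidden-layer rule as a continuation of the earlier sketch (again using the pre-update weights, as the worked calculation does):

```python
# delta_j = o_j * (1 - o_j) * sum_i(w_ji * delta_i), using the pre-update weights w
for j in HIDDEN:
    downstream = sum(w[(j, i)] * delta[i] for i in OUTPUT)
    delta[j] = o[j] * (1 - o[j]) * downstream
# Rounded to two decimals: delta3 ≈ -0.02, delta4 ≈ 0.01, delta5 ≈ 0.02
```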
δ3 = o3 · (1 − o3) · (w36 × δ6 + w37 × δ7)
   = 0.53 · (1 − 0.53) · ((−0.4) × 0.12 + 0.2 × (−0.15)) = −0.02

δ4 = o4 · (1 − o4) · (w46 × δ6 + w47 × δ7)
   = 0.55 · (1 − 0.55) · (0.1 × 0.12 + (−0.1) × (−0.15)) = 0.01

δ5 = o5 · (1 − o5) · (w56 × δ6 + w57 × δ7)
   = 0.65 · (1 − 0.65) · (0.6 × 0.12 + (−0.2) × (−0.15)) = 0.02
Weight and bias changes are calculated the same way as previously:

∆w13 = η × δ3 × o1 = 0.1 × (−0.02) × 0.6 = 0
w13_new = w13 + ∆w13 = 0.1 + 0 = 0.1

∆w23 = η × δ3 × o2 = 0.1 × (−0.02) × 0.1 = 0
w23_new = w23 + ∆w23 = (−0.2) + 0 = −0.2

∆w14 = η × δ4 × o1 = 0.1 × 0.01 × 0.6 = 0
w14_new = w14 + ∆w14 = 0 + 0 = 0

∆w24 = η × δ4 × o2 = 0.1 × 0.01 × 0.1 = 0
w24_new = w24 + ∆w24 = 0.2 + 0 = 0.2

∆w15 = η × δ5 × o1 = 0.1 × 0.02 × 0.6 = 0
w15_new = w15 + ∆w15 = 0.3 + 0 = 0.3

∆w25 = η × δ5 × o2 = 0.1 × 0.02 × 0.1 = 0
w25_new = w25 + ∆w25 = (−0.4) + 0 = −0.4

∆b3 = η × δ3 = 0.1 × (−0.02) = 0
b3_new = b3 + ∆b3 = 0.1 + 0 = 0.1

∆b4 = η × δ4 = 0.1 × 0.01 = 0
b4_new = b4 + ∆b4 = 0.2 + 0 = 0.2

∆b5 = η × δ5 = 0.1 × 0.02 = 0
b5_new = b5 + ∆b5 = 0.5 + 0 = 0.5
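Putting the pieces together, one incremental-learning step can be sketched as a single function. This is only an illustration of the procedure above; forward, HIDDEN, OUTPUT and the weight dictionaries come from the earlier sketches, and every other name is mine.

```python
def train_on_example(inputs, targets, w, b, eta=0.1):
    # One forward pass plus one weight/bias update, modifying w and b in place.
    # inputs = (o1, o2); targets = (d6, d7).
    o = forward(inputs[0], inputs[1], w, b)

    # Output-layer errors, then hidden-layer errors using the current weights.
    delta = {i: (targets[i - 6] - o[i]) * o[i] * (1 - o[i]) for i in OUTPUT}
    for j in HIDDEN:
        delta[j] = o[j] * (1 - o[j]) * sum(w[(j, i)] * delta[i] for i in OUTPUT)

    # Only after all errors are known are the weights and biases updated.
    for (src, dst) in list(w):
        w[(src, dst)] += eta * delta[dst] * o[src]
    for n in b:
        b[n] += eta * delta[n]
    return o
```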
5.2 Example 2
Now, if we were doing batch learning, we would continue with the next example and only update the weights after all examples have been processed. We will do incremental (stochastic) learning, so we first update the weights and then put fruit number 2 through the network. This example has the following values:
• o1 = 0.2, o2 = 0.3
• d6 = 0, d7 = 1
[Figure: the network after training on fruit number 1. The hidden-to-output weights are now w36 = −0.39, w46 = 0.11, w56 = 0.61, w37 = 0.19, w47 = −0.11, w57 = −0.21, with biases b6 = −0.09 and b7 = 0.59; all input-to-hidden weights and hidden biases are unchanged.]
5.2.1 Feed-forward step
Calculate all output values:

net3 = w13 × o1 + w23 × o2 + b3
     = 0.1 × 0.2 + (−0.2) × 0.3 + 0.1 = 0.06
o3 = sigmoid(net3) = 0.51

net4 = w14 × o1 + w24 × o2 + b4
     = 0 × 0.2 + 0.2 × 0.3 + 0.2 = 0.26
o4 = sigmoid(net4) = 0.56

net5 = w15 × o1 + w25 × o2 + b5
     = 0.3 × 0.2 + (−0.4) × 0.3 + 0.5 = 0.44
o5 = sigmoid(net5) = 0.61

net6 = w36 × o3 + w46 × o4 + w56 × o5 + b6
     = (−0.39) × 0.51 + 0.11 × 0.56 + 0.61 × 0.61 + (−0.09) = 0.14
o6 = sigmoid(net6) = 0.53

net7 = w37 × o3 + w47 × o4 + w57 × o5 + b7
     = 0.19 × 0.51 + (−0.11) × 0.56 + (−0.21) × 0.61 + 0.59 = 0.50
o7 = sigmoid(net7) = 0.62
5.2.2 Back propagation of errors
Calculate errors for the output neurons.
δ6 = (d6 − o6) · o6 · (1 − o6)
   = (0 − 0.53) × 0.53 × (1 − 0.53) = −0.13

δ7 = (d7 − o7) · o7 · (1 − o7)
   = (1 − 0.62) × 0.62 × (1 − 0.62) = 0.09
Calculate weight and bias updates for the layer under the output neurons:
∆w36 = η × δ6 × o3 = 0.1 × (−0.13) × 0.51 = −0.01
w36_new = w36 + ∆w36 = (−0.39) + (−0.01) = −0.4

∆w37 = η × δ7 × o3 = 0.1 × 0.09 × 0.51 = 0
w37_new = w37 + ∆w37 = 0.19 + 0 = 0.19

∆w46 = η × δ6 × o4 = 0.1 × (−0.13) × 0.56 = −0.01
w46_new = w46 + ∆w46 = 0.11 + (−0.01) = 0.1

∆w47 = η × δ7 × o4 = 0.1 × 0.09 × 0.56 = 0.01
w47_new = w47 + ∆w47 = (−0.11) + 0.01 = −0.1

∆w56 = η × δ6 × o5 = 0.1 × (−0.13) × 0.61 = −0.01
w56_new = w56 + ∆w56 = 0.61 + (−0.01) = 0.6

∆w57 = η × δ7 × o5 = 0.1 × 0.09 × 0.61 = 0.01
w57_new = w57 + ∆w57 = (−0.21) + 0.01 = −0.2

Similarly for the biases:

∆b6 = η × δ6 = 0.1 × (−0.13) = −0.01
b6_new = b6 + ∆b6 = (−0.09) + (−0.01) = −0.1

∆b7 = η × δ7 = 0.1 × 0.09 = 0.01
b7_new = b7 + ∆b7 = 0.59 + 0.01 = 0.6

Calculate errors for the hidden neurons:

δ3 = o3 · (1 − o3) · (w36 × δ6 + w37 × δ7)
   = 0.51 · (1 − 0.51) · ((−0.39) × (−0.13) + 0.19 × 0.09) = 0.02

δ4 = o4 · (1 − o4) · (w46 × δ6 + w47 × δ7)
   = 0.56 · (1 − 0.56) · (0.11 × (−0.13) + (−0.11) × 0.09) = −0.01

δ5 = o5 · (1 − o5) · (w56 × δ6 + w57 × δ7)
   = 0.61 · (1 − 0.61) · (0.61 × (−0.13) + (−0.21) × 0.09) = −0.02

Calculate weight and bias updates for the layer under the hidden neurons:

∆w13 = η × δ3 × o1 = 0.1 × 0.02 × 0.2 = 0
w13_new = w13 + ∆w13 = 0.1 + 0 = 0.1

∆w23 = η × δ3 × o2 = 0.1 × 0.02 × 0.3 = 0
w23_new = w23 + ∆w23 = (−0.2) + 0 = −0.2

∆w14 = η × δ4 × o1 = 0.1 × (−0.01) × 0.2 = 0
w14_new = w14 + ∆w14 = 0 + 0 = 0

∆w24 = η × δ4 × o2 = 0.1 × (−0.01) × 0.3 = 0
w24_new = w24 + ∆w24 = 0.2 + 0 = 0.2

∆w15 = η × δ5 × o1 = 0.1 × (−0.02) × 0.2 = 0
w15_new = w15 + ∆w15 = 0.3 + 0 = 0.3

∆w25 = η × δ5 × o2 = 0.1 × (−0.02) × 0.3 = 0
w25_new = w25 + ∆w25 = (−0.4) + 0 = −0.4

∆b3 = η × δ3 = 0.1 × 0.02 = 0
b3_new = b3 + ∆b3 = 0.1 + 0 = 0.1

∆b4 = η × δ4 = 0.1 × (−0.01) = 0
b4_new = b4 + ∆b4 = 0.2 + 0 = 0.2

∆b5 = η × δ5 = 0.1 × (−0.02) = 0
b5_new = b5 + ∆b5 = 0.5 + 0 = 0.5
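If you are following along with the code sketches, applying the same function to both fruits reproduces these two updates. Because the code keeps full precision while the worked example rounds to two decimal places at every step, the last decimal place can occasionally differ.

```python
# Starting from the initial w and b, apply the step for fruit 1 and then fruit 2.
train_on_example((0.6, 0.1), (1, 0), w, b)
train_on_example((0.2, 0.3), (0, 1), w, b)

# Inspect the updated weights and biases, rounded as in the worked example.
print({edge: round(value, 2) for edge, value in w.items()})
print({neuron: round(value, 2) for neuron, value in b.items()})
```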

5.3 Stopping Conditions
Now we have updated the network to look like this:
[Figure: the network after training on fruit number 2. The hidden-to-output weights are now w36 = −0.4, w46 = 0.1, w56 = 0.6, w37 = 0.19, w47 = −0.1, w57 = −0.2, with biases b6 = −0.1 and b7 = 0.6; all input-to-hidden weights and hidden biases are again unchanged.]
We need to check whether we have done enough to correctly classify both of our training examples. We run them through the forward pass of the algorithm (exactly as previously done) and find that we get the following outputs:
Fruit   o6     o7     Predicted Class   Actual Class
1       0.53   0.63   Orange            Banana
2       0.53   0.63   Orange            Orange
Since these outputs are not all correct, we would run another epoch (everything we just did). We would continue until either all examples are correctly classified, or until we have performed a maximum number of epochs.
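To make the epoch structure concrete, here is a sketch of this outer loop built on the forward and train_on_example functions from the earlier snippets. The predicted class is read off as the output neuron with the larger activation, and max_epochs is an illustrative cap, not a value from the original example.

```python
def predict(inputs, w, b):
    # Forward pass only: the predicted class is the output neuron with the
    # larger activation (neuron 6 = Banana, neuron 7 = Orange).
    o = forward(inputs[0], inputs[1], w, b)
    return "Banana" if o[6] > o[7] else "Orange"

def train(data, w, b, eta=0.1, max_epochs=1000):
    # One epoch = one incremental pass over every training example.
    for epoch in range(max_epochs):
        for inputs, targets, label in data:
            train_on_example(inputs, targets, w, b, eta)
        # Stop early once every training example is classified correctly.
        if all(predict(inputs, w, b) == label for inputs, targets, label in data):
            return epoch + 1
    return max_epochs

data = [((0.6, 0.1), (1, 0), "Banana"),
        ((0.2, 0.3), (0, 1), "Orange")]
epochs_used = train(data, w, b)
```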