3 Neural Networks
1. Give brief definitions of the following terms:
• a neuron
an information processing unit.
• action potential
the signal output by a biological neuron.
• firing rate
the number of action potentials emitted during a defined time-period.
• a synapse
the connection between two neurons.
• an artificial neural network
a parallel architecture composed of many simple processing elements interconnected to achieve certain
collective computational capabilities.
2. A neuron has a transfer function which is a linear weighted sum of its inputs and an activation function that is the Heaviside function. If the weights are w = [0.1, −0.5, 0.4] and the threshold is zero, what is the output of this neuron when the input is x1 = [0.1, −0.5, 0.4]T and x2 = [0.1, 0.5, 0.4]T?
The output of the neuron is defined as:
y = H(wx − θ)
y1 = H(wx1 − θ) = H((0.1 × 0.1) + (−0.5 × −0.5) + (0.4 × 0.4) − 0) = H(0.42) = 1
y2 = H(wx2 − θ) = H((0.1 × 0.1) + (−0.5 × 0.5) + (0.4 × 0.4) − 0) = H(−0.08) = 0
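This calculation can be checked numerically. A minimal sketch in NumPy (the helper name `ltu` is just for illustration), assuming H(a) = 1 for a ≥ 0:

```python
import numpy as np

def ltu(w, x, theta=0.0):
    """Linear threshold unit: Heaviside of the weighted sum minus the threshold."""
    return 1 if np.dot(w, x) - theta >= 0 else 0

w = np.array([0.1, -0.5, 0.4])
x1 = np.array([0.1, -0.5, 0.4])
x2 = np.array([0.1, 0.5, 0.4])

print(ltu(w, x1))  # wx = 0.42, so the output is 1
print(ltu(w, x2))  # wx = -0.08, so the output is 0
```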
3. A Linear Threshold Unit has one input, x1, that can take binary values. Apply the sequential Delta learning rule so that the output of this neuron, y, is equal to x̄1 (or NOT(x1)), i.e., such that:
x1  y
0   1
1   0
Assume initial values of θ = 1.5 and w1 = 2, and use a learning rate of 1.
Using Augmented notation, y = H(wx) where w = [−θ, w1] and x = [1, x1]T.
For the Delta rule, weights are updated such that: w ← w + η(t − y)xT. Initial w = [−1.5, 2].

xT      t   y = H(wx)                        t − y   η(t − y)xT   new w
(1, 0)  1   H(−1.5×1 + 2×0) = H(−1.5) = 0    1       [1, 0]       [−0.5, 2]
(1, 1)  0   H(−0.5×1 + 2×1) = H(1.5) = 1     −1      [−1, −1]     [−1.5, 1]
(1, 0)  1   H(−1.5×1 + 1×0) = H(−1.5) = 0    1       [1, 0]       [−0.5, 1]
(1, 1)  0   H(−0.5×1 + 1×1) = H(0.5) = 1     −1      [−1, −1]     [−1.5, 0]
(1, 0)  1   H(−1.5×1 + 0×0) = H(−1.5) = 0    1       [1, 0]       [−0.5, 0]
(1, 1)  0   H(−0.5×1 + 0×1) = H(−0.5) = 0    0       [0, 0]       [−0.5, 0]
(1, 0)  1   H(−0.5×1 + 0×0) = H(−0.5) = 0    1       [1, 0]       [0.5, 0]
(1, 1)  0   H(0.5×1 + 0×1) = H(0.5) = 1      −1      [−1, −1]     [−0.5, −1]
(1, 0)  1   H(−0.5×1 − 1×0) = H(−0.5) = 0    1       [1, 0]       [0.5, −1]
(1, 1)  0   H(0.5×1 − 1×1) = H(−0.5) = 0     0       [0, 0]       [0.5, −1]
(1, 0)  1   H(0.5×1 − 1×0) = H(0.5) = 1      0       [0, 0]       [0.5, −1]

Learning has converged (a full pass through the data produces no weight change), so the required weights are w = [0.5, −1], or equivalently θ = −0.5, w1 = −1.
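The iteration table above can be reproduced in code. A sketch assuming H(0) = 1, using the same augmented notation:

```python
import numpy as np

def heaviside(a):
    return 1 if a >= 0 else 0

# Augmented patterns x = [1, x1] with targets t = NOT(x1)
X = [np.array([1, 0]), np.array([1, 1])]
T = [1, 0]

w = np.array([-1.5, 2.0])  # w = [-theta, w1]
eta = 1.0

# Sequential Delta rule: update after each pattern; stop once a full
# pass through the data produces no weight change.
converged = False
while not converged:
    converged = True
    for x, t in zip(X, T):
        y = heaviside(w @ x)
        if y != t:
            w = w + eta * (t - y) * x
            converged = False

print(w)  # [ 0.5 -1. ]
```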
4. Repeat the above question using the batch Delta learning rule.
For the batch Delta rule, the weight changes for all patterns are accumulated and applied once at the end of each epoch: w ← w + η Σ(t − y)xT.

Epoch 1, initial w = [−1.5, 2]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(−1.5×1 + 2×0) = H(−1.5) = 0   1       [1, 0]
(1, 1)  0   H(−1.5×1 + 2×1) = H(0.5) = 1    −1      [−1, −1]
total weight change: [0, −1]

Epoch 2, initial w = [−1.5, 1]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(−1.5×1 + 1×0) = H(−1.5) = 0   1       [1, 0]
(1, 1)  0   H(−1.5×1 + 1×1) = H(−0.5) = 0   0       [0, 0]
total weight change: [1, 0]

Epoch 3, initial w = [−0.5, 1]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(−0.5×1 + 1×0) = H(−0.5) = 0   1       [1, 0]
(1, 1)  0   H(−0.5×1 + 1×1) = H(0.5) = 1    −1      [−1, −1]
total weight change: [0, −1]

Epoch 4, initial w = [−0.5, 0]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(−0.5×1 + 0×0) = H(−0.5) = 0   1       [1, 0]
(1, 1)  0   H(−0.5×1 + 0×1) = H(−0.5) = 0   0       [0, 0]
total weight change: [1, 0]

Epoch 5, initial w = [0.5, 0]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(0.5×1 + 0×0) = H(0.5) = 1     0       [0, 0]
(1, 1)  0   H(0.5×1 + 0×1) = H(0.5) = 1     −1      [−1, −1]
total weight change: [−1, −1]

Epoch 6, initial w = [−0.5, −1]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(−0.5×1 − 1×0) = H(−0.5) = 0   1       [1, 0]
(1, 1)  0   H(−0.5×1 − 1×1) = H(−1.5) = 0   0       [0, 0]
total weight change: [1, 0]

Epoch 7, initial w = [0.5, −1]
xT      t   y = H(wx)                       t − y   η(t − y)xT
(1, 0)  1   H(0.5×1 − 1×0) = H(0.5) = 1     0       [0, 0]
(1, 1)  0   H(0.5×1 − 1×1) = H(−0.5) = 0    0       [0, 0]
total weight change: [0, 0]

Learning has converged, so the required weights are w = [0.5, −1], or equivalently θ = −0.5, w1 = −1.
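The batch variant accumulates the weight changes over a whole epoch before applying them. A sketch of these seven epochs, again assuming H(0) = 1:

```python
import numpy as np

def heaviside(a):
    return 1 if a >= 0 else 0

X = [np.array([1, 0]), np.array([1, 1])]  # augmented inputs [1, x1]
T = [1, 0]                                # targets for NOT(x1)

w = np.array([-1.5, 2.0])
eta = 1.0

# Batch Delta rule: sum the weight changes over all patterns, then
# apply the total once per epoch.
for epoch in range(1, 20):
    delta = np.zeros_like(w)
    for x, t in zip(X, T):
        delta += eta * (t - heaviside(w @ x)) * x
    if not delta.any():
        break  # no change over a whole epoch: converged
    w = w + delta

print(epoch, w)  # stops in epoch 7 with w = [0.5, -1]
```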
5. A Linear Threshold Unit has two inputs, x1 and x2, that can take binary values. Apply the sequential Delta learning rule so that the output of this neuron, y, is equal to x1 AND x2, i.e., such that:
x1  x2  y
0   0   0
0   1   0
1   0   0
1   1   1
Assume initial values of θ = −0.5, w1 = 1 and w2 = 1, and use a learning rate of 1.
Using Augmented notation, y = H(wx) where w = [−θ, w1, w2] and x = [1, x1, x2]T. For the Delta rule, weights are updated such that: w ← w + η(t − y)xT.
Initial w = [0.5, 1, 1].

xT         t   y = H(wx)                               t − y   η(t − y)xT    new w
(1, 0, 0)  0   H(0.5×1 + 1×0 + 1×0) = H(0.5) = 1       −1      [−1, 0, 0]    [−0.5, 1, 1]
(1, 0, 1)  0   H(−0.5×1 + 1×0 + 1×1) = H(0.5) = 1      −1      [−1, 0, −1]   [−1.5, 1, 0]
(1, 1, 0)  0   H(−1.5×1 + 1×1 + 0×0) = H(−0.5) = 0     0       [0, 0, 0]     [−1.5, 1, 0]
(1, 1, 1)  1   H(−1.5×1 + 1×1 + 0×1) = H(−0.5) = 0     1       [1, 1, 1]     [−0.5, 2, 1]
(1, 0, 0)  0   H(−0.5×1 + 2×0 + 1×0) = H(−0.5) = 0     0       [0, 0, 0]     [−0.5, 2, 1]
(1, 0, 1)  0   H(−0.5×1 + 2×0 + 1×1) = H(0.5) = 1      −1      [−1, 0, −1]   [−1.5, 2, 0]
(1, 1, 0)  0   H(−1.5×1 + 2×1 + 0×0) = H(0.5) = 1      −1      [−1, −1, 0]   [−2.5, 1, 0]
(1, 1, 1)  1   H(−2.5×1 + 1×1 + 0×1) = H(−1.5) = 0     1       [1, 1, 1]     [−1.5, 2, 1]
(1, 0, 0)  0   H(−1.5×1 + 2×0 + 1×0) = H(−1.5) = 0     0       [0, 0, 0]     [−1.5, 2, 1]
(1, 0, 1)  0   H(−1.5×1 + 2×0 + 1×1) = H(−0.5) = 0     0       [0, 0, 0]     [−1.5, 2, 1]
(1, 1, 0)  0   H(−1.5×1 + 2×1 + 1×0) = H(0.5) = 1      −1      [−1, −1, 0]   [−2.5, 1, 1]
(1, 1, 1)  1   H(−2.5×1 + 1×1 + 1×1) = H(−0.5) = 0     1       [1, 1, 1]     [−1.5, 2, 2]
(1, 0, 0)  0   H(−1.5×1 + 2×0 + 2×0) = H(−1.5) = 0     0       [0, 0, 0]     [−1.5, 2, 2]
(1, 0, 1)  0   H(−1.5×1 + 2×0 + 2×1) = H(0.5) = 1      −1      [−1, 0, −1]   [−2.5, 2, 1]
(1, 1, 0)  0   H(−2.5×1 + 2×1 + 1×0) = H(−0.5) = 0     0       [0, 0, 0]     [−2.5, 2, 1]
(1, 1, 1)  1   H(−2.5×1 + 2×1 + 1×1) = H(0.5) = 1      0       [0, 0, 0]     [−2.5, 2, 1]
(1, 0, 0)  0   H(−2.5×1 + 2×0 + 1×0) = H(−2.5) = 0     0       [0, 0, 0]     [−2.5, 2, 1]
(1, 0, 1)  0   H(−2.5×1 + 2×0 + 1×1) = H(−1.5) = 0     0       [0, 0, 0]     [−2.5, 2, 1]
Learning has converged, so required weights are w = [−2.5, 2, 1], or equivalently θ = 2.5, w1 = 2, w2 = 1.
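As a quick check, the learned weights do implement AND. A small verification sketch:

```python
import numpy as np

w = np.array([-2.5, 2.0, 1.0])  # learned weights [-theta, w1, w2]

for x1 in (0, 1):
    for x2 in (0, 1):
        y = 1 if w @ np.array([1, x1, x2]) >= 0 else 0
        print(x1, x2, y)  # only (1, 1) gives y = 1
```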
The decision surface is the line 2x1 + x2 = 2.5; of the four binary inputs, only (1, 1) falls on the positive side. [Figure: plot of the decision boundary.]
6. Consider the following linearly separable data set.
x         class
(0, 2)    1
(1, 2)    1
(2, 1)    1
(−3, 1)   0
(−2, −1)  0
(−3, −2)  0
Apply the Sequential Delta Learning Algorithm to find the parameters of a linear threshold neuron that will correctly classify this data. Assume initial values of θ = −1, w1 = 0 and w2 = 0, and a learning rate of 1.
Using Augmented notation, y = H(wx) where w = [−θ, w1, w2] and x = [1, x1, x2]T. So the initial weight values are w = [1, 0, 0] and the dataset is:
xT           t
(1, 0, 2)    1
(1, 1, 2)    1
(1, 2, 1)    1
(1, −3, 1)   0
(1, −2, −1)  0
(1, −3, −2)  0
For the Sequential Delta Learning Algorithm, weights are updated such that: w ← w + η(t − y)xT. Here, η = 1.
iteration   w           xT            y = H(wx)   t   w ← w + (t − y)xT
1           [1, 0, 0]   (1, 0, 2)     1           1   [1, 0, 0]
2           [1, 0, 0]   (1, 1, 2)     1           1   [1, 0, 0]
3           [1, 0, 0]   (1, 2, 1)     1           1   [1, 0, 0]
4           [1, 0, 0]   (1, −3, 1)    1           0   [1, 0, 0] − [1, −3, 1] = [0, 3, −1]
5           [0, 3, −1]  (1, −2, −1)   0           0   [0, 3, −1]
6           [0, 3, −1]  (1, −3, −2)   0           0   [0, 3, −1]
7           [0, 3, −1]  (1, 0, 2)     0           1   [0, 3, −1] + [1, 0, 2] = [1, 3, 1]
8           [1, 3, 1]   (1, 1, 2)     1           1   [1, 3, 1]
9           [1, 3, 1]   (1, 2, 1)     1           1   [1, 3, 1]
10          [1, 3, 1]   (1, −3, 1)    0           0   [1, 3, 1]
11          [1, 3, 1]   (1, −2, −1)   0           0   [1, 3, 1]
12          [1, 3, 1]   (1, −3, −2)   0           0   [1, 3, 1]
13          [1, 3, 1]   (1, 0, 2)     1           1   [1, 3, 1]
Learning has converged (we have gone through all the data without needing to update the weights), so required parameters are w = (1, 3, 1).
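The same sequential procedure can be run on this dataset to recover w = (1, 3, 1). A sketch, assuming H(0) = 1:

```python
import numpy as np

# Augmented dataset: x = [1, x1, x2]
data = [((0, 2), 1), ((1, 2), 1), ((2, 1), 1),
        ((-3, 1), 0), ((-2, -1), 0), ((-3, -2), 0)]
X = [np.array([1, x1, x2]) for (x1, x2), _ in data]
T = [t for _, t in data]

w = np.array([1.0, 0.0, 0.0])  # [-theta, w1, w2] with theta = -1

converged = False
while not converged:
    converged = True
    for x, t in zip(X, T):
        y = 1 if w @ x >= 0 else 0
        if y != t:
            w = w + (t - y) * x
            converged = False

print(w)  # [1. 3. 1.]
```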
7. A negative feedback network has three inputs and two output neurons, that are connected with weights
W = | 1 1 0 |
    | 1 1 1 |
Determine the activation of the output neurons after 5 iterations when the input is x = (1, 1, 0)T, assuming that the output neurons are updated using parameter α = 0.25, and the activations of the output neurons are initialised to be all zero.
The activation of a negative feedback network is determined by iteratively evaluating the following equations:
e = x − WT y
y ← y + αWe

iteration  eT                               (We)T                yT                  (WT y)T
1          (1, 1, 0)                        (2, 2)               (0.5, 0.5)          (1, 1, 0.5)
2          (0, 0, −0.5)                     (0, −0.5)            (0.5, 0.375)        (0.875, 0.875, 0.375)
3          (0.125, 0.125, −0.375)           (0.25, −0.125)       (0.5625, 0.34375)   (0.90625, 0.90625, 0.34375)
4          (0.09375, 0.09375, −0.34375)     (0.1875, −0.15625)   (0.60938, 0.30469)  (0.91406, 0.91406, 0.30469)
5          (0.085938, 0.085938, −0.30469)   (0.17188, −0.13281)  (0.65234, 0.27148)  (0.92383, 0.92383, 0.27148)

So the output is y = (0.65234, 0.27148).
Note, competition results in the first neuron increasing its output, while the output of the second neuron is suppressed. Also, note that the vector WT y becomes similar to the input x. WT y converges towards a reconstruction of the input.
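These iterations are straightforward to reproduce. A sketch of the negative feedback updates:

```python
import numpy as np

W = np.array([[1, 1, 0],
              [1, 1, 1]], dtype=float)
x = np.array([1.0, 1.0, 0.0])
alpha = 0.25

y = np.zeros(2)  # output activations start at zero
for _ in range(5):
    e = x - W.T @ y   # error between the input and its reconstruction
    y = y + alpha * W @ e

print(y)  # approximately [0.65234, 0.27148]
```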
8. Repeat the previous question using a value of α = 0.5.

e = x − WT y
y ← y + αWe

iteration  eT                       (We)T           yT                (WT y)T
1          (1, 1, 0)                (2, 2)          (1, 1)            (2, 2, 1)
2          (−1, −1, −1)             (−2, −3)        (0, −0.5)         (−0.5, −0.5, −0.5)
3          (1.5, 1.5, 0.5)          (3, 3.5)        (1.5, 1.25)       (2.75, 2.75, 1.25)
4          (−1.75, −1.75, −1.25)    (−3.5, −4.75)   (−0.25, −1.125)   (−1.375, −1.375, −1.125)
5          (2.375, 2.375, 1.125)    (4.75, 5.875)   (2.125, 1.8125)   (3.9375, 3.9375, 1.8125)

So the output is y = (2.125, 1.8125).
Note, competition results in oscillatory responses. If α is too large the network becomes unstable. Instability is a common problem with recurrent neural networks.
9. A more stable method of calculating the activations in a negative feedback network is to use the following update rules:
e = x ⊘ [WT y]ε2
y ← [y]ε1 ⊙ (W̃ e)
where [v]ε = max(ε, v); ε1 and ε2 are parameters; W̃ is equal to W but with each row normalised to sum to one; and ⊘ and ⊙ indicate element-wise division and multiplication respectively. This is called Regulatory Feedback or Divisive Input Modulation.
Determine the activation of the output neurons after 5 iterations when the input is x = (1, 1, 0)T and
W = | 1 1 0 |
    | 1 1 1 |
assuming that ε1 = ε2 = 0.01, and the activations of the output neurons are initialised to be all zero.
W̃ = | 1/2 1/2 0   |
    | 1/3 1/3 1/3 |

iteration  eT                    (W̃ e)T             yT                  (WT y)T
1          (100, 100, 0)         (100, 66.667)       (1, 0.66667)        (1.6667, 1.6667, 0.66667)
2          (0.6, 0.6, 0)         (0.6, 0.4)          (0.6, 0.26667)      (0.86667, 0.86667, 0.26667)
3          (1.1538, 1.1538, 0)   (1.1538, 0.76923)   (0.69231, 0.20513)  (0.89744, 0.89744, 0.20513)
4          (1.1143, 1.1143, 0)   (1.1143, 0.74286)   (0.77143, 0.15238)  (0.92381, 0.92381, 0.15238)
5          (1.0825, 1.0825, 0)   (1.0825, 0.72165)   (0.83505, 0.10997)  (0.94502, 0.94502, 0.10997)

So the output is y = (0.83505, 0.10997).
Note, competition results in the first neuron increasing its output, while the output of the second neuron is suppressed. Also, note that the vector WT y becomes similar to the input x. WT y converges towards a reconstruction of the input.
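The regulatory feedback updates can be sketched the same way; note that the reconstruction WTy is clipped from below by ε2 before the element-wise divide, and the old activations y by ε1 before the element-wise multiply:

```python
import numpy as np

W = np.array([[1, 1, 0],
              [1, 1, 1]], dtype=float)
W_tilde = W / W.sum(axis=1, keepdims=True)  # rows normalised to sum to one
x = np.array([1.0, 1.0, 0.0])
eps1 = eps2 = 0.01

y = np.zeros(2)
for _ in range(5):
    e = x / np.maximum(eps2, W.T @ y)        # element-wise divide, clipped below
    y = np.maximum(eps1, y) * (W_tilde @ e)  # element-wise multiply

print(y)  # approximately [0.83505, 0.10997]
```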
10. The figure below shows an autoencoder neural network.
Draw a diagram of a de-noising autoencoder and briefly explain how a de-noising autoencoder is trained.
The network is trained so that the output, r, reconstructs the input, x. However, before encoding is performed the input is corrupted with noise, while the reconstruction error is still measured against the clean input. This mitigates overfitting and prevents the network from simply learning the identity mapping.
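To make this concrete, one training step of a de-noising autoencoder can be sketched as follows. The layer sizes, noise level, learning rate, and tied linear weights here are illustrative assumptions, not part of the question:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative tied-weight linear autoencoder: 4 inputs, 2 hidden units.
V = rng.normal(scale=0.1, size=(2, 4))   # encoder weights; decoder uses V.T
x = rng.normal(size=4)                   # a clean training input
lr = 0.01

# De-noising: corrupt the input before encoding ...
x_noisy = x + rng.normal(scale=0.3, size=4)
h = V @ x_noisy          # encode the corrupted input
r = V.T @ h              # decode to a reconstruction
err = r - x              # ... but measure the error against the CLEAN input

# One gradient step on the squared reconstruction error ||r - x||^2
grad_V = 2 * (np.outer(h, err) + np.outer(V @ err, x_noisy))
V -= lr * grad_V
```

Repeating this step over many noisy corruptions of many inputs forces the hidden code h to capture structure that survives the noise, rather than copying the input.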