Student ID: 19M18926  Name: WUJIEYU
Task 1: Deterministic Model of Neuron
● Method
Give random initial values (0 or 1) to neurons x1-x9. The "winner takes all" model is
applied to solve the problem, which can be represented by an energy function.
Make the energy formula: E=(x1+x2+x3-1)^2+(x4+x5+x6-1)^2+(x7+x8+x9-1)^2 +(x1+x4+x7-1)^2+(x2+x5+x8-1)^2+(x3+x6+x9-1)^2
Compare the coefficients of the formula to obtain the weights and thresholds.
Choose one neuron i and update its output by computing the weighted sum for that neuron:
S_i = Σ_{j=1..n} w_ij * x_j
Use the deterministic model to update the neurons one by one:
if S_i > 0 then y = 1; if S_i < 0 then y = 0.
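To make the update rule concrete, the following C sketch (not the report's original program) runs the deterministic update on the 9 neurons. The weights w_ij = -2 for neurons sharing a row or column and the bias of +2 are one possible result of expanding the energy above using x^2 = x for binary neurons; the actual values in Fig 1.1 may differ by a scale factor or sign convention.

/* Minimal sketch of the deterministic winner-takes-all update (assumed
 * weights/thresholds from expanding the energy; Fig 1.1 may differ). */
#include <stdio.h>

#define N 9

int same_group(int i, int j) {            /* same row or same column of the 3x3 grid */
    return (i / 3 == j / 3) || (i % 3 == j % 3);
}

double energy(const int x[N]) {           /* E from the formula in the text */
    double e = 0.0;
    for (int r = 0; r < 3; r++) {         /* rows */
        double s = x[3*r] + x[3*r+1] + x[3*r+2] - 1.0;
        e += s * s;
    }
    for (int c = 0; c < 3; c++) {         /* columns */
        double s = x[c] + x[c+3] + x[c+6] - 1.0;
        e += s * s;
    }
    return e;
}

int main(void) {
    int x[N] = {0,0,1,1,1,0,0,0,0};       /* initial values used in the report */
    for (int step = 0; step < 50; step++) {
        int i = step % N;                 /* update the neurons one by one */
        double S = 2.0;                   /* assumed bias term */
        for (int j = 0; j < N; j++)
            if (j != i && same_group(i, j))
                S += -2.0 * x[j];         /* weighted sum S_i = sum_j w_ij x_j + bias */
        x[i] = (S > 0.0) ? 1 : 0;         /* deterministic rule */
        printf("step %2d  E = %.0f\n", step, energy(x));
    }
    return 0;
}

Running this sketch shows the energy falling to 0 within one or two sweeps, after which the state no longer changes, which is the behaviour described in the result below.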
● Result
The initial values are given as 0 0 1 1 1 0 0 0 0. The weights and thresholds are shown
in the figure. The neurons are updated one by one and the energy is calculated at each step. As shown in the figure, the energy decreases with updating. Once the energy reaches 0, the neurons continue to be updated but their values no longer change. At that point the neuron states satisfy the "winner takes all" model.
Fig 1.1: Weights and thresholds in Task 1; energy change during updating
Task 2: Deterministic Model of Recurrent Neural Networks
● Method
The energy formula is: E=(x1-x2+2x3-x4-3)^2+(2x1+x2-2x3+x4)^2 +(-x1+2x2+x3+2x4)^2+(x2-x3-x4+1)^2
The other procedures are the same as in Task 1.
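The coefficient comparison can also be done numerically instead of by hand. The sketch below (an assumed helper, not part of the report) evaluates the Task 2 energy on the zero vector, on unit vectors, and on pairs of unit vectors to read off the linear and pairwise coefficients, and converts them to thresholds and weights under the convention E = -Σ_{i<j} w_ij x_i x_j - Σ_i θ_i x_i + const; a different convention would only change the signs and scaling.

/* Sketch: extract weights and thresholds by numerical coefficient comparison. */
#include <stdio.h>

#define N 4

double sq(double v) { return v * v; }

double E(const int x[N]) {                /* Task 2 energy, copied from the text */
    return sq(x[0]-x[1]+2*x[2]-x[3]-3) + sq(2*x[0]+x[1]-2*x[2]+x[3])
         + sq(-x[0]+2*x[1]+x[2]+2*x[3]) + sq(x[1]-x[2]-x[3]+1);
}

int main(void) {
    int zero[N] = {0};
    double e0 = E(zero);
    for (int i = 0; i < N; i++) {
        int xi[N] = {0};
        xi[i] = 1;
        double ci = E(xi) - e0;                       /* linear coefficient of x_i */
        printf("theta[%d] = %.0f\n", i, -ci);
        for (int j = i + 1; j < N; j++) {
            int xj[N] = {0}, xij[N] = {0};
            xj[j] = 1;
            xij[i] = 1; xij[j] = 1;
            double cij = E(xij) - E(xi) - E(xj) + e0; /* coefficient of x_i*x_j */
            printf("w[%d][%d] = %.0f\n", i, j, -cij);
        }
    }
    return 0;
}

This works because, with binary neurons (x^2 = x), the energy is exactly E(x) = E(0) + Σ_i c_i x_i + Σ_{i<j} c_ij x_i x_j, so the coefficients can be read off from a few evaluations.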
● Result
The initial values are given as 1 0 0 0. The weights and thresholds are shown in the
figure. The neurons are updated one by one and the energy is calculated at each step. With updating, the energy decreases until it reaches a minimum value of 1. The neurons keep being updated, but their values remain constant. The network is therefore trapped in a local minimum and cannot reach 0.
Fig 2.1: Weights and thresholds in Task 2; energy change during updating
Task 3: Probabilistic Model of Recurrent Neural Networks
● Method
The initial values of the neurons are 0 0 0 0, and the energy formula is the same as in Task 2.
The only change is the way the neurons are updated; Task 3 uses the
probabilistic model:
1. Choose one neuron i to update and compute its weighted sum:
S_i = Σ_{j=1..n} w_ij * x_j
2. Calculate the probability:
P = 1 / (1 + e^(-a * S_i))
3. Generate a random number between 0 and 1, then update the output with the following rule:
if rand() < P, output = 1; if rand() > P, output = 0.
4. Update the neurons one by one until the values no longer change. Run the program 1000 times to obtain 1000 copies and record the number of copies in each state. With 4 neurons there are 16 possible states.
5. Calculate the Boltzmann distribution of every state and the normalization coefficient A:
Num_i = 1000 * A * e^(-a * E_i)
A = 1 / Σ_{i=1..16} e^(-a * E_i)
6. Compare the numbers of copies with the Boltzmann distribution to see whether the states obey the distribution (a sketch of this procedure is given below).
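A minimal C sketch of the whole procedure follows (an assumed re-implementation, not the report's code). Instead of storing the weights explicitly, S_i is computed as the energy drop obtained by setting x_i = 1, which is equivalent to the weighted sum under the convention used above, and a fixed number of sweeps per copy is used in place of a convergence test.

/* Sketch: 1000 copies updated with the probabilistic rule, compared with
 * the Boltzmann distribution Num_i = 1000 * A * exp(-a * E_i). */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N 4
#define COPIES 1000
#define SWEEPS 100

double sq(double v) { return v * v; }

double E(const int x[N]) {                           /* Task 2/3 energy from the text */
    return sq(x[0]-x[1]+2*x[2]-x[3]-3) + sq(2*x[0]+x[1]-2*x[2]+x[3])
         + sq(-x[0]+2*x[1]+x[2]+2*x[3]) + sq(x[1]-x[2]-x[3]+1);
}

int main(void) {
    double a = 1.0;
    int count[16] = {0};

    for (int c = 0; c < COPIES; c++) {
        int x[N] = {0, 0, 0, 0};                     /* initial values used in the report */
        for (int s = 0; s < SWEEPS * N; s++) {
            int i = s % N;                           /* update the neurons one by one */
            x[i] = 0; double e0 = E(x);              /* energy with x_i = 0 */
            x[i] = 1; double e1 = E(x);              /* energy with x_i = 1 */
            double P = 1.0 / (1.0 + exp(-a * (e0 - e1)));
            x[i] = (rand() / (double)RAND_MAX < P) ? 1 : 0;
        }
        count[x[0] + 2*x[1] + 4*x[2] + 8*x[3]]++;    /* record the final state 0..15 */
    }

    double Z = 0.0;                                  /* normalization of the Boltzmann distribution */
    for (int s = 0; s < 16; s++) {
        int x[N] = { s & 1, (s >> 1) & 1, (s >> 2) & 1, (s >> 3) & 1 };
        Z += exp(-a * E(x));
    }
    for (int s = 0; s < 16; s++) {
        int x[N] = { s & 1, (s >> 1) & 1, (s >> 2) & 1, (s >> 3) & 1 };
        printf("state %2d: copies %4d  Boltzmann %7.1f\n",
               s, count[s], COPIES * exp(-a * E(x)) / Z);
    }
    return 0;
}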
● Result
The weights and thresholds are shown in the figure, and the energies of the 16 states are
calculated. As can be seen, the energy is minimal in the twelfth state, 1 0 1 1. Unlike in Task 2, the minimum of 0 can now be reached, so the probabilistic model can escape the local minimum.
Fig 3.1: Energies of the 16 states
Change the value of a and observe its influence on the numbers of copies:
1. a = 0.1
As shown in Fig 3.2, the blue line is the curve of the number of copies and the red line is the curve of the Boltzmann distribution; the two almost match. Because a is small, the peak at the minimum-energy state is not very pronounced.
Fig 3.2.1: a = 0.1, curves of the two quantities
Fig 3.2.2: a = 0.1, the numbers of copies
(num1: number of copies from updating; num2: Boltzmann distribution)
2. a = 1
When a = 1, the copies are already almost entirely concentrated in the 12th
state, whose energy is the minimum (see Fig 3.3).
Fig 3.3: a = 1, curves and numbers of copies for the two quantities
3. a = 100
When a is large enough, the probabilistic model effectively becomes the deterministic model,
and all the copies are concentrated in the state whose energy is the minimum. So the value of a affects the distribution of the states: if a is too small, it is hard to single out the best state.
Fig 3.4: a = 100, curves and numbers of copies for the two quantities
Task 4: Application to Eight Queen Problem
● Method
Give 9 neurons random values (0 or 1). The procedure is almost the same as in Task 3, but this time there are no 1000 copies. Ergodicity is used to solve the problem: when the ergodicity conditions are satisfied, we only need to generate one copy and keep updating it, and we can then observe the series of states generated by a single Boltzmann machine instead of the set of states generated by many copies. The neurons are updated 100,000 times.
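A sketch of this single-chain version is shown below (assumptions: a = 1 and the same energy-difference form of S_i as in the Task 3 sketch). Every state visited during the 100,000 updates is counted, so the histogram is a time average over one Boltzmann machine rather than an ensemble average over many copies.

/* Sketch: single-chain (ergodic) sampling of the 9-neuron winner-takes-all network. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N 9
#define UPDATES 100000

double energy(const int x[N]) {                     /* Task 1 energy */
    double e = 0.0;
    for (int r = 0; r < 3; r++) {
        double s = x[3*r] + x[3*r+1] + x[3*r+2] - 1.0;
        e += s * s;
    }
    for (int c = 0; c < 3; c++) {
        double s = x[c] + x[c+3] + x[c+6] - 1.0;
        e += s * s;
    }
    return e;
}

int main(void) {
    double a = 1.0;
    static int count[512] = {0};
    int x[N];
    for (int i = 0; i < N; i++) x[i] = rand() % 2;  /* random initial state */
    for (int t = 0; t < UPDATES; t++) {
        int i = t % N;                              /* update the neurons in turn */
        x[i] = 0; double e0 = energy(x);
        x[i] = 1; double e1 = energy(x);
        double P = 1.0 / (1.0 + exp(-a * (e0 - e1)));
        x[i] = (rand() / (double)RAND_MAX < P) ? 1 : 0;
        int s = 0;
        for (int k = 0; k < N; k++) s |= x[k] << k; /* state index 0..511 */
        count[s]++;
    }
    for (int s = 0; s < 512; s++)
        if (count[s] * 512 > UPDATES)               /* states visited more often than the uniform average */
            printf("state %3d visited %d times\n", s, count[s]);
    return 0;
}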
● Result
The weights and thresholds are the same as in Task 1. The 9 neurons generate 512 states.
When the number of updates is large enough, the counts of all the states reach a balance and follow the Boltzmann distribution (see Fig 4.1).
The largest counts belong to the 6 states that satisfy the "winner takes all" model.
Fig 4.1: Distribution of the 512 states for a = 1 and a = 10. Blue bars: state counts; red line: Boltzmann distribution.
Task 5: Application to Travelling Salesman Problem
● Method
16 neurons are used to solve the problem. Give the 16 neurons random initial values (0 or 1). There are 2^16 states in total.
Assign the distance matrix D = [0 1 2 3; 1 0 1 2; 2 1 0 1; 3 2 1 0]. Its geometric meaning is that the four cities lie on a straight line, and the distance between any two neighboring cities is 1.
Calculate the length energy E_l and the constraint energy E_c to obtain the total energy, and compare the coefficients to obtain the weights and thresholds. Each weight equals the sum of the weights from the two energy terms, and the weights change as the value of b changes:
E_total = E_l + b * E_c
The updating method is similar to Task 3, and it satisfies the condition for ergodicity, so many copies are not needed; only one is enough. Update 1,000,000 times and observe the time sequence of the generated states.
Change the value of b to see the difference.
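The sketch below shows one standard way to evaluate E_total = E_l + b * E_c for a given state (this encoding is an assumption, since the report does not spell out the two terms): neuron x[c][t] = 1 means city c is visited at time step t, E_c is the winner-takes-all constraint on every row and column, and E_l sums the distances between cities visited at consecutive steps, including the return to the starting city.

/* Sketch: TSP total energy for the 4x4 neuron grid (assumed encoding). */
#include <stdio.h>

#define C 4                                     /* 4 cities, 4 time steps */

double d[C][C] = {{0,1,2,3},{1,0,1,2},{2,1,0,1},{3,2,1,0}};

double E_c(const int x[C][C]) {                 /* constraint ("winner takes all") energy */
    double e = 0.0;
    for (int i = 0; i < C; i++) {
        double row = -1.0, col = -1.0;
        for (int j = 0; j < C; j++) { row += x[i][j]; col += x[j][i]; }
        e += row * row + col * col;
    }
    return e;
}

double E_l(const int x[C][C]) {                 /* length (distance) energy */
    double e = 0.0;
    for (int t = 0; t < C; t++)
        for (int c1 = 0; c1 < C; c1++)
            for (int c2 = 0; c2 < C; c2++)
                e += d[c1][c2] * x[c1][t] * x[c2][(t + 1) % C];
    return e;
}

int main(void) {
    double b = 1000.0;
    /* one of the minimum-distance tours, 1-2-3-4-1, as a check */
    int x[C][C] = {{1,0,0,0},{0,1,0,0},{0,0,1,0},{0,0,0,1}};
    printf("E_l = %.0f, E_c = %.0f, E_total = %.0f\n",
           E_l(x), E_c(x), E_l(x) + b * E_c(x));
    return 0;
}

For the tour 1-2-3-4-1 this gives E_l = 6 and E_c = 0, consistent with the minimum distance of 6 reported in the result below.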
● Result
1. b = 1
When the value of b is small, the constraint energy E_c contributes little to E_total. The low-energy states therefore do not satisfy the "winner takes all" model even though the total length is small, and they have no meaning for the real problem, because only one city is allowed to be visited at each step. Therefore, the gap between the actual number of occurrences and the ideal number is rather large.
2. b = 1000
When b is large enough, E_c becomes the dominant contribution to
E_total. The low-energy states then satisfy the "winner takes all" model, which means exactly one city is visited at each step, while the E_l term keeps the total visiting distance at its minimum. All the results with the minimum total distance are shown in the figure, and they all correspond to valid tours.
Ideally, these 16 states would appear with the same probability and therefore have the same number of occurrences. However, when the randomly generated initial values are updated for long enough, the states concentrate on only one of the best solutions, so the model effectively becomes a "deterministic model". The total energy after convergence is also the minimum visiting distance; in this case the minimum energy, i.e., the minimum distance, is 6, which agrees with reality.
Task 6: Learning of Recurrent Neural Network
● Method
Give random initial values (0-1) to the connection weights:
wxh[i][j] = rand()  (i = 0-2, j = 1-3)
why[j] = rand()  (j = 1-3)
wxy0 = rand()
Set the inputs x1 and x2 and fix their values: x1 = x2 = 0.
Update h1-h3 and y using the probabilistic model 10,000 times. Compute the averages of xi * hi.
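A rough C sketch of this setup is given below. The wiring is largely an assumption, because the method description is cut off here: x0 is treated as a bias unit fixed at 1, wxh connects x0-x2 to h1-h3, why connects h1-h3 to y, wxy0 is treated as the bias of y, and the averages are accumulated over all x_i * h_j pairs.

/* Sketch (assumed wiring): initialize weights, update h1-h3 and y with the
 * probabilistic rule 10000 times, and accumulate the averages <x_i * h_j>. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

double frand(void) { return rand() / (double)RAND_MAX; }   /* uniform value in 0..1 */
double sigmoid(double s) { return 1.0 / (1.0 + exp(-s)); }

int main(void) {
    double wxh[3][3], why[3], wxy0;
    double avg_xh[3][3] = {{0}};
    int x[3] = {1, 0, 0};                  /* x0 = bias (assumption), x1 = x2 = 0 (fixed) */
    int h[3] = {0, 0, 0}, y = 0;

    for (int i = 0; i < 3; i++) {          /* random initial weights in 0..1 */
        why[i] = frand();
        for (int j = 0; j < 3; j++) wxh[i][j] = frand();
    }
    wxy0 = frand();

    for (int t = 0; t < 10000; t++) {
        for (int j = 0; j < 3; j++) {      /* update h1-h3 */
            double S = why[j] * y;
            for (int i = 0; i < 3; i++) S += wxh[i][j] * x[i];
            h[j] = (frand() < sigmoid(S)) ? 1 : 0;
        }
        double Sy = wxy0;                  /* update y */
        for (int j = 0; j < 3; j++) Sy += why[j] * h[j];
        y = (frand() < sigmoid(Sy)) ? 1 : 0;

        for (int i = 0; i < 3; i++)        /* accumulate the x-h correlations */
            for (int j = 0; j < 3; j++)
                avg_xh[i][j] += x[i] * h[j];
    }
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            printf("<x%d*h%d> = %.3f\n", i, j + 1, avg_xh[i][j] / 10000.0);
    return 0;
}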