Department of Engineering/Informatics, King’s College London Nature-Inspired Learning Algorithms (7CCSMBIM)
Tutorial 2
Q1. What are the advantages and disadvantages of gradient descent method?
Q2. Show how gradient descent method works using pseudo code.
2 12
Q3. Consider a least-squares problem, min f (x) =|| Ax − B ||2 where A = and
x34 B = 6 . Find the optimal solution x.
5
Q4. Consider the minimisation problem of the function: f (x, y) = (x − 1)x + (y + 1)y
using Nelder-Mead Downhill Simplex Method.
a. Starting with 3 vertices (0, 0), (1, 2), (3, 4), determine the points B, G and
W.
b. Find the midpoint M and f(M).
c. Find the reflection R and f(R).
d. Determine the 3 vertices (B, G and W) in the next iteration. If Case (ii)
is performed in the pseudo code of Nelder-Mead downhill simplex method,
C=W+M isused. 2
Q5. Considering the minimisation problem of the function: f (x, y) = (x − 1)x + (y + 1)y using the gradient descent method, find x and y, and f (x, y) for the first 3 iterations with step size hk = 0.1 and initial guess (7,8).
Q6. Consider the minimisation problem of the function: f (x, y) = (x − 1)x + (y + 1)y using the gradient descent method.
a. Determine the optimal step size hk.
b. Show that it requires one iteration for the gradient descent method to converge
with the optimal step size hk obtained in Q6a.
Q7. Consider the minimisation problem of the function: f (x, y) = (x − 1)x + (y + 1)y using the random walk optimisation algorithm. The Threshold (for accepting the worse solution) is 0.75 and the random number generator will generate a repeating sequence of {0.8, 0.7, 0.2, 0.6, 0.9, 0.7, 0.5, 0.6} (The left-most number is the first number generated). The diagonal entries of Dk are all 1 and all the entries of hk are 0.5. Fill in the content of the following table.
k
xTk
f(xk)
xTk+1
f(xk+1)
rand()
xbest
f(xbest)
0
[−1 −2]
1
2
3
4
5
1
Q8. Consider a system taking two input variables x1 and x2 and generate an output y. A set of M input-output data is collected in an experiment, e.g., the dataset is (x1(1), x2(1), y(1)), (x1(2), x2(2), y(2)), · · · , (x1(M), x2(M), y(M)). Design a linear regressor in the form of yˆ = w1x1 + w2x2 + b to best fit the data in terms of Mean Squared Error using the gradient descent method, where w1 and w2 and b are the parameters to be determined.
a. Formulate the data fitting problem as a minimisation problem.
b. Denote the step size as hk. Derive the update rule for each parameter.
c. Use hk = 0.1 and the initial guess for all variables is 1. Considering the dataset: (1,−2,3), (2,4,−1), (3,0,5), obtain the best set of parameters for the linear regressor.
d. Plot “iteration k” against “MSE” for 200 iterations. Is the choice of hk = 0.1 right or wrong?
Q9. Find the least-squares solution of Ax = B where: 101 0
A=0 1 1andB=1. −1 0 1 3
0 −1 1 4
Q10. Consider the three data points (x, y) given as follows: (0, 6), (1, 0), (2, 0),
which are the data obtained from a single line but have errors due to measurement. Find a straight line that best fits these points in the sense of least-squares? Explain the quantity that is being minimized.
(0,6)
(1,0) (2,0)
2
Q11. Find a parabola equation that best fits the below data points (x, y) in the sense of least-squares.
(−1, 1), (1, −1), (2, −1), (3, 2). 22
What quantity is being minimized?
(3,2)
(
-1,1) 2
(2,- 1 2
(1,-1)
)
Q12. According to the following data, find the best approximated linear function f(x,y) in the sense of least-squares and identify the quantity being minimized.
x y f(x,y) 100 011
−1 0 3 0 −1 4
Q13. There are three categories of machines I, II and III installed in a factory. Machines have different requirements on operating time. Machines I and II can be operated for most 12 hours whereas machine III needs to be operated for at least 5 hours per day. This factory produces two items M and N. Both require the use of all the three machines. The operating time of producing 1 unit of each item on the three machines are presented in Table 1 as follows:
Items Machine III M1
N 1.25
Table 1: Operating time of producing one unit of each item on each machine.
The profit of item M and N are £600 and £400 each per unit, respectively. What would be the most reasonable strategy to produce the items M and N as to maximise the profit of the factory? What would be the maximum profit in this case?
Machine I
Machine II
1
2
2
1
3
Q14. There are two factories located in place P and Q. The commodity produced by these two factories will be delivered to three depots situated at A, B and C. Each week, the depots at A, B and C requires 5, 5 and 4 units of the commodity, while the production capacity of the factories at P and Q are 8 and 6 units correspondingly. The commodities produced at both factories will be all transported. The cost of transportation varies from different factories to different depots where the details of the cost of transportation per unit commodity are in Table 2.
From/To Cost of C
Cost of A
Cost of B
160
100
100
120
P Q
150 100
Table 2: Cost of transportation per unit commodity.
Find the units of commodity from factories to depots so the cost of transportation is minimum? What will be the minimum cost of transportation?
Q15. A factory wishes to produce a new type of food which has two ingredients denoted as Ingredient I and Ingredient II. This new type of food need to contains vitamin A and vitamin C with minimum value 8 and 10, respectively. Ingredient I contains 2 units/kg of vitamin A and 1 units/kg of vitamin C. Ingredient II contains 1 units/kg of vitamin A and 2 units/kg of vitamin C. The price of ingredient I and ingredient II are £50/kg and £70/kg, respectively. Decide the formula (the weight of each ingredient) of this new type of food so that the cost is at the minimum?
Q16. Explain how “Recursive least-squares” works.
Q17. Consider a least-squares problem as Ax = B, where
5 6 5 A=5 8,B=2
451
Given that there are five new samples coming in the the following order,
aT1=34, b1=7; aT2 =10 10, b2=5; aT3=38, b3=8; aT4=88, b4=2; aT5=75, b5=3.
calculate the initial solution, denoted as x0, for Ax = B and update the solution using the recursive least-squares technique with the given new samples. Show the working and the solution after adding the samples one by one according to the order shown in the above.
4