COMP9414: Artificial Intelligence
Tutorial 9: Neural Networks/Reinforcement Learning
1. (i) Construct by hand a perceptron which correctly classifies the following data; use your
knowledge of plane geometry to choose values for the weights w0, w1 and w2.
Training Example x1 x2 Class
a 0 1 −
b 2 0 −
c 1 1 +
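For part (i), one candidate answer (among many) can be checked mechanically. The sketch below assumes the usual convention that the perceptron outputs + when w0 + w1·x1 + w2·x2 > 0, and tries the separating line x1 + 2·x2 = 2.5, i.e. w0 = −2.5, w1 = 1, w2 = 2; any line with example c on one side and a, b on the other would do.

```python
# Check one hand-chosen weight vector for Question 1(i).
# Convention assumed: output is "+" when w0 + w1*x1 + w2*x2 > 0.
examples = {"a": ((0, 1), "-"), "b": ((2, 0), "-"), "c": ((1, 1), "+")}
w0, w1, w2 = -2.5, 1.0, 2.0   # the line x1 + 2*x2 = 2.5

results = {}
for name, ((x1, x2), cls) in examples.items():
    s = w0 + w1 * x1 + w2 * x2
    results[name] = "+" if s > 0 else "-"
    print(name, s, results[name], "target:", cls)
```

Running this shows a and b land at s = −0.5 (class −) and c at s = +0.5 (class +), so the line separates the data correctly.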
(ii) Simulate the perceptron learning algorithm on the above data, using a learning rate of
1.0 and initial weight values of w0 = −0.5, w1 = 0 and w2 = 1. In your answer, clearly
indicate the new weight values at the end of each training step.
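The hand simulation in part (ii) can be cross-checked with a short script. This is a sketch, not the required hand trace; it assumes the convention that the output is +1 when w0 + w1·x1 + w2·x2 > 0 and that a mistake triggers the update w ← w + η·t·(1, x1, x2) with target t ∈ {−1, +1} (check this against the lecture's exact update rule).

```python
# Perceptron learning sketch for Question 1(ii):
# eta = 1.0, initial weights w0 = -0.5, w1 = 0, w2 = 1.
def predict(w, x):
    # threshold unit: +1 if w0 + w1*x1 + w2*x2 > 0, else -1
    s = w[0] + w[1] * x[0] + w[2] * x[1]
    return 1 if s > 0 else -1

def train(data, w, eta=1.0, max_epochs=100):
    for epoch in range(max_epochs):
        mistakes = 0
        for x, t in data:
            if predict(w, x) != t:
                # update on a mistake: w <- w + eta * t * (1, x1, x2)
                w = [w[0] + eta * t,
                     w[1] + eta * t * x[0],
                     w[2] + eta * t * x[1]]
                mistakes += 1
                print(f"after ({x[0]},{x[1]}): w = {w}")
        if mistakes == 0:
            return w   # converged: a full pass with no errors
    return w

data = [((0, 1), -1), ((2, 0), -1), ((1, 1), +1)]  # examples a, b, c
w = train(data, [-0.5, 0.0, 1.0])
```

The printed lines give the new weight values after each training step, which is exactly what the question asks you to record by hand.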
2. Explain how each of the following could be constructed:
(i) Perceptron to compute the OR function of m inputs
(ii) Perceptron to compute the AND function of n inputs
(iii) 2-Layer neural network to compute any (given) logical expression written in CNF
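One standard construction for all three parts, sketched in code so the weight/bias choices can be verified exhaustively. It assumes inputs in {0, 1} and units that fire when w0 + Σ wi·xi > 0; the CNF encoding below (clauses as layer-1 OR-like units, negated literals with weight −1 and the bias shifted by +1 each, an AND unit on top) is one common scheme, not the only one.

```python
def threshold_unit(w0, ws, xs):
    # fires (outputs 1) when w0 + sum(wi * xi) > 0
    return 1 if w0 + sum(w * x for w, x in zip(ws, xs)) > 0 else 0

def OR(xs):
    # (i) OR of m inputs: bias -0.5, each weight 1;
    # any single 1 pushes the sum past zero
    return threshold_unit(-0.5, [1] * len(xs), xs)

def AND(xs):
    # (ii) AND of n inputs: bias 0.5 - n, each weight 1;
    # only all n ones overcome the bias
    return threshold_unit(0.5 - len(xs), [1] * len(xs), xs)

def cnf_network(clauses, xs):
    # (iii) 2-layer network for a CNF expression.
    # Each clause (pos, neg) lists indices of positive and negated literals.
    # Layer 1: one unit per clause, weight +1 on positive literals, -1 on
    # negated ones, bias len(neg) - 0.5, so it fires iff some literal holds.
    hidden = []
    for pos, neg in clauses:
        s = sum(xs[i] for i in pos) - sum(xs[i] for i in neg)
        hidden.append(1 if (len(neg) - 0.5) + s > 0 else 0)
    # Layer 2: AND of the clause outputs
    return AND(hidden)
```

For example, `clauses = [([0], [1]), ([1, 2], [])]` encodes (x1 ∨ ¬x2) ∧ (x2 ∨ x3) over inputs (x1, x2, x3).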
3. Consider a world with two states S = {S1, S2} and two actions A = {a1, a2}, where the
transitions δ and reward r for each state and action are as follows:
δ(S1, a1) = S1 r(S1, a1) = 0
δ(S1, a2) = S2 r(S1, a2) = −1
δ(S2, a1) = S2 r(S2, a1) = +1
δ(S2, a2) = S1 r(S2, a2) = +5
(i) Draw a picture of this world, using circles for the states and arrows for the transitions.
(ii) Assuming a discount factor of γ = 0.9, determine:
(a) the optimal policy π∗ : S → A
(b) the optimal value function V∗ : S → R
(c) the Q function Q : S × A → R for the optimal policy
(iii) Write the Q values in a table.
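Parts (ii) and (iii) are meant to be derived by hand from the Bellman equations, but the answers can be sanity-checked with value iteration on this small deterministic world:

```python
# Value-iteration check for Question 3(ii)-(iii), gamma = 0.9.
GAMMA = 0.9
STATES, ACTIONS = ["S1", "S2"], ["a1", "a2"]
delta = {("S1", "a1"): "S1", ("S1", "a2"): "S2",
         ("S2", "a1"): "S2", ("S2", "a2"): "S1"}
reward = {("S1", "a1"): 0, ("S1", "a2"): -1,
          ("S2", "a1"): 1, ("S2", "a2"): 5}

# iterate V(s) <- max_a [ r(s,a) + gamma * V(delta(s,a)) ] to a fixed point
V = {s: 0.0 for s in STATES}
for _ in range(500):
    V = {s: max(reward[s, a] + GAMMA * V[delta[s, a]] for a in ACTIONS)
         for s in STATES}

# Q(s,a) = r(s,a) + gamma * V*(delta(s,a)); the optimal policy is argmax_a Q
Q = {(s, a): reward[s, a] + GAMMA * V[delta[s, a]]
     for s in STATES for a in ACTIONS}
policy = {s: max(ACTIONS, key=lambda a: Q[s, a]) for s in STATES}
print(policy, V, Q)
```

The fixed point satisfies V∗(S1) = −1 + γ·V∗(S2) and V∗(S2) = 5 + γ·V∗(S1), i.e. V∗(S1) = 3.5/0.19 ≈ 18.42 and V∗(S2) ≈ 21.58, with the optimal policy taking a2 in both states.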
(iv) Trace through the first few steps of the Q-learning algorithm on some randomly chosen
input, with all Q values initially set to zero. Explain why it is necessary for the agent
to explore the environment through probabilistic choice of actions in order to ensure
convergence to the true Q values.
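A sketch of the Q-learning trace in part (iv), assuming the deterministic-world update Q(s, a) ← r + γ·max_a′ Q(s′, a′) and ε-greedy action choice as one form of the probabilistic exploration the question refers to:

```python
# Tabular Q-learning sketch for Question 3(iv): all Q values start at 0.
import random

random.seed(0)
GAMMA, EPSILON = 0.9, 0.5
STATES, ACTIONS = ["S1", "S2"], ["a1", "a2"]
delta = {("S1", "a1"): "S1", ("S1", "a2"): "S2",
         ("S2", "a1"): "S2", ("S2", "a2"): "S1"}
reward = {("S1", "a1"): 0, ("S1", "a2"): -1,
          ("S2", "a1"): 1, ("S2", "a2"): 5}

Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
s = "S1"
for step in range(10000):
    if random.random() < EPSILON:
        a = random.choice(ACTIONS)               # explore: random action
    else:
        a = max(ACTIONS, key=lambda a: Q[s, a])  # exploit: current best
    s2 = delta[s, a]
    # deterministic-world update: Q(s,a) <- r + gamma * max_a' Q(s',a')
    Q[s, a] = reward[s, a] + GAMMA * max(Q[s2, a2] for a2 in ACTIONS)
    s = s2
```

Setting EPSILON = 0 illustrates why exploration is necessary: with this tie-breaking, the greedy agent starting in S1 sees Q(S1, a1) = Q(S1, a2) = 0, picks a1, stays in S1 forever, and every Q estimate remains stuck at zero, so the true Q values are never learned.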