More on Reinforcement Learning
AIMA 22.3

All CS188 materials are available at http://ai.berkeley.edu.

● Q

● Q

Value Iteration Cannot Work



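● Value iteration needs the transition model T(s,a,s’), which a learning agent does not know
○ Updating the value of a state s takes an expectation over its successor states
○ Extracting a policy from the values also requires T(s,a,s’)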

Solution: Q-Value Iteration

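● Compute Q-values directly instead of state values
○ Start with Q0(s,a) = 0
○ Given Qk, compute the depth k+1 Q-values for all Q-states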
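
The standard Q-value iteration update is Q_{k+1}(s,a) = Σ_{s'} T(s,a,s') [R(s,a,s') + γ max_{a'} Q_k(s',a')], starting from Q0(s,a) = 0. The following is a minimal sketch of that update, assuming the model is fully known. The names `q_value_iteration` and `toy_model`, the three-state MDP, and the discount of 0.5 are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

State = str
Action = str
# model[(s, a)] lists (next_state, probability, reward) triples,
# i.e. T(s, a, s') and R(s, a, s') stored together (an assumed layout).
Model = Dict[Tuple[State, Action], List[Tuple[State, float, float]]]


def q_value_iteration(model: Model, gamma: float, iterations: int) -> Dict[Tuple[State, Action], float]:
    """Run a fixed number of synchronous Q-value iteration sweeps."""
    # Q_0(s, a) = 0 for every Q-state.
    q = {sa: 0.0 for sa in model}

    # Actions available in each state; states with no entry are terminal.
    actions_in: Dict[State, List[Action]] = {}
    for (s, a) in model:
        actions_in.setdefault(s, []).append(a)

    for _ in range(iterations):
        new_q = {}
        for (s, a), outcomes in model.items():
            total = 0.0
            for s_next, prob, reward in outcomes:
                # max over a' of Q_k(s', a'); 0 for terminal states.
                best_next = max(
                    (q[(s_next, a2)] for a2 in actions_in.get(s_next, [])),
                    default=0.0,
                )
                total += prob * (reward + gamma * best_next)
            new_q[(s, a)] = total
        q = new_q  # Q_{k+1} replaces Q_k only after the full sweep
    return q


# Hypothetical deterministic MDP: A --go--> B --exit--> end (terminal).
toy_model: Model = {
    ("A", "stay"): [("A", 1.0, 0.0)],
    ("A", "go"): [("B", 1.0, 1.0)],
    ("B", "exit"): [("end", 1.0, 10.0)],
}
q = q_value_iteration(toy_model, gamma=0.5, iterations=10)
```

With Q-values in hand, the greedy action in a state is simply the action with the largest Q(s,a), so no further use of T(s,a,s') is needed to act.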

● Q

● Q(s,a)

● ɑ




○ t
1/t

k

● k

● t

k

● q a t
● Qt(a) a
● t, a Qt(a)
● Qt(a)

○ Qt(a)


○ Qt(a) q*(a)

Sutton & Barto, Reinforcement Learning, Fig. 2.2
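
In Sutton & Barto's bandit notation, q*(a) is the true expected reward of action a and Qt(a) is the estimate of it at step t. The following is a minimal sketch of how Qt(a) can track q*(a), assuming sample-average estimates (a step size of 1/n for the n-th pull of an arm) and ε-greedy action selection. The function name `run_bandit`, the arm means, the unit-variance Gaussian rewards, and the values of ε and the step count are hypothetical.

```python
import random
from typing import List


def run_bandit(true_means: List[float], steps: int, epsilon: float, seed: int = 0) -> List[float]:
    """Return the final estimates Q_t(a) after `steps` pulls."""
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # Q_t(a), initialised to 0
    counts = [0] * k       # number of times each arm was pulled

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            a = rng.randrange(k)
        else:
            # Exploit: pick the arm with the highest current estimate.
            a = max(range(k), key=lambda i: estimates[i])

        # Assumed reward model: Gaussian around q*(a) with unit variance.
        reward = rng.gauss(true_means[a], 1.0)

        # Incremental sample average: step size 1/n for the n-th pull.
        counts[a] += 1
        estimates[a] += (reward - estimates[a]) / counts[a]
    return estimates


estimates = run_bandit([0.0, 1.0, 2.0], steps=20000, epsilon=0.1)
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
```

Because every arm keeps being sampled with probability at least ε/k, each count grows without bound and each sample average approaches its arm's true mean.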





Q

● Q
○ s, a, r, s’, a’
○ Q a’
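
An experience tuple s, a, r, s’, a’ is enough for a one-step temporal-difference update of Q(s,a). The following is a minimal sketch, assuming the SARSA-style target r + γ Q(s’,a’), where a’ is the action actually taken in s’. The names `td_update` and `q_table`, the state and action labels, and the values of α and γ are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def td_update(q: QTable, s: Hashable, a: Hashable, r: float,
              s_next: Hashable, a_next: Hashable,
              alpha: float, gamma: float) -> None:
    """Move Q(s, a) a fraction alpha toward r + gamma * Q(s', a')."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])


# Unseen Q-states default to 0.
q_table: QTable = defaultdict(float)
td_update(q_table, "s0", "right", 1.0, "s1", "right", alpha=0.5, gamma=0.9)
td_update(q_table, "s1", "right", 2.0, "s0", "right", alpha=0.5, gamma=0.9)
```

Replacing Q(s’,a’) in the target with the maximum of Q(s’,·) over all actions would give the Q-learning form of the same update, which no longer needs a’.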

Q- s’

Q

● Q



● Q