> git clone https://github.com/ipab-rad/rl-cw1
> cd rl-cw1
> python keyboard_agent.py
https://github.com/mgbellemare/Arcade-Learning-Environment
# Replace sNNNNNNN with your own student number
> ssh -X sNNNNNNN@student.ssh.inf.ed.ac.uk
# Enter your DICE password, then SSH again to a student machine
> ssh -X student.login
> git clone https://github.com/ipab-rad/rl-cw1
> cd rl-cw1/
> python keyboard_agent.py
[Figure: game screen annotated with the number of cars left to pass, the distance travelled, the agent, and the opponents]
        Col 0             Col 9
Row 0   0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 1 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
        0 0 1 0 0 0 0 0 0 0
        0 0 0 0 0 0 0 0 0 0
Row 10  0 0 1 0 0 2 0 0 0 0
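
The grid above can be scanned for occupied cells with a few lines of Python. This is only a sketch: it assumes that 2 marks the agent, 1 marks an opponent and 0 is empty road, which matches the figure labels but is not stated explicitly here. The names GRID, AGENT, OPPONENT and locate are hypothetical and are not part of the coursework code.

```python
# Assumption: 2 = agent, 1 = opponent, 0 = empty road.
AGENT, OPPONENT = 2, 1

# The 11 x 10 grid from the slide, row 0 first.
GRID = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # row 0
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 2, 0, 0, 0, 0],  # row 10
]


def locate(grid):
    """Return the agent's (row, col) and a list of opponent (row, col) cells."""
    agent = None
    opponents = []
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == AGENT:
                agent = (r, c)
            elif cell == OPPONENT:
                opponents.append((r, c))
    return agent, opponents


agent, opponents = locate(GRID)
print(agent)      # (10, 5)
print(opponents)  # [(4, 5), (8, 2), (10, 2)]
```

Positions like these are one possible starting point for building a compact state representation, for example the column offset between the agent and the nearest opponent.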
Agent classes: KeyboardAgent, RandomAgent, QAgent
Agent
def run(self, learn, episodes) -> None
def getActionsSet(self) -> [Action.ACCELERATE,
                            Action.BREAK,
                            Action.RIGHT,
                            Action.LEFT]
def move(self, action) -> reward
Agent
def initialise(self, grid)
def act(self)
def sense(self, grid)
def learn(self)
def callback(self, learn, episode, iteration)
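
To show how the five methods above might fit together, here is a minimal sketch of an agent that picks actions at random. It does not use the coursework code: the Action and Agent classes below are hypothetical stand-ins written only so the example runs on its own, and the call order inside run, the stub reward in move, and the five iterations per episode are all assumptions. In the real repository you would subclass the provided Agent instead.

```python
import random
from enum import Enum


class Action(Enum):
    # Stand-in for the coursework's Action; member names follow the slide.
    ACCELERATE = 0
    BREAK = 1
    RIGHT = 2
    LEFT = 3


class Agent:
    """Hypothetical stand-in for the provided base class."""

    def getActionsSet(self):
        return [Action.ACCELERATE, Action.BREAK, Action.RIGHT, Action.LEFT]

    def move(self, action):
        # Stub reward; the real environment derives it from the game.
        return 1 if action is Action.ACCELERATE else 0

    def run(self, learn, episodes):
        # Assumed call order; the real loop may differ.
        for episode in range(1, episodes + 1):
            grid = [[0] * 10 for _ in range(11)]
            self.initialise(grid)
            for iteration in range(1, 6):
                self.act()
                self.sense(grid)
                if learn:
                    self.learn()
                self.callback(learn, episode, iteration)


class RandomSketchAgent(Agent):
    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.total_reward = 0
        self.grid = None
        self.log = []

    def initialise(self, grid):
        # Reset per-episode state.
        self.grid = grid
        self.total_reward = 0

    def act(self):
        # Choose uniformly among the available actions and collect the reward.
        action = self.rng.choice(self.getActionsSet())
        self.total_reward += self.move(action)

    def sense(self, grid):
        # Store the latest observation.
        self.grid = grid

    def learn(self):
        # A random agent has nothing to update.
        pass

    def callback(self, learn, episode, iteration):
        # Record progress after every step.
        self.log.append((episode, iteration, self.total_reward))


agent = RandomSketchAgent()
agent.run(learn=False, episodes=2)
print(len(agent.log))  # 10
```

A learning agent would follow the same shape, with act choosing actions from its value estimates and learn updating them from the reward and the newly sensed grid.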