6CCS3AIN, Tutorial 01 (Version 1)
1. Write down an example of an intelligent agent, either fictional (from a book, a play, a film, a television show and so on) or from the real world. Write a few lines to describe this example, and say:
(a) what kind of sensors it uses;
(b) what actions it can carry out; and
(c) what environment it operates in.
2. Classify the environment from the previous question.
You should classify the environment in the way that we described in the lecture — accessible or inaccessible,
static or dynamic, and so on. Explain why you give the answer that you do.
3. Come up with another example agent that operates in a different kind of environment. Classify that environment
also.
4. Figure 1 (below) shows an agent situated in an environment. We can think of the environment E as being made up of 36 states:
e_{0,0}, e_{0,1}, ...
where the subscript of each e indicates a square in the grid. Thus the agent, sitting in the bottom left-hand corner, is in state e_{0,0}, while if the agent were at the goal, in the top right-hand corner, it would be in state e_{5,5}. e_{0,0} is the initial state of the environment.
The filled squares indicate obstacles: the agent cannot be in these states. The agent can move north, south, east or west, which we write as:
α_n, α_s, α_e, α_w
and these have the effects you would expect. If the agent is in state e_{0,0} and takes action α_n, it will end up in state e_{0,1}, while if the agent is in state e_{3,2} and takes action α_e it will end up in e_{4,2}. If the agent tries to move outside the grid then it does not move (for example, if the agent is in e_{0,5} and tries to do α_n then it stays in e_{0,5}).
If the agent enters the state e_{5,5}, marked with the word "goal", then it gets a reward of 10. If it enters state e_{1,4}, marked with a dark circle, it gets a reward of −10 (i.e., it takes a loss).
(a) Write down a run of the agent that takes it from e_{0,0} to e_{1,4}.
(b) Consider the following control program:
while( not in state e_{5,5} ){
    randomly pick either α_n or α_e (each with probability 0.5)
    execute the action that was selected
}
Write down two runs that the agent might carry out when executing this program.
(c) If the agent executes the program, can it ever reach e_{5,0}? Why?
(d) What are the maximum and minimum rewards that the agent can get by running this program? Why?
(e) How likely is it that the agent will get the reward of −10 when it runs the above program?
(Hint: There is a precise probability that the agent will get to the relevant state. You should work out how to calculate this.)
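As a check on parts (b)-(e), here is a minimal Python sketch that simulates the control program. Two points in it are assumptions rather than things the question fixes: the obstacle positions (given only in Figure 1, so left as a placeholder set here), and the convention that a move into an obstacle, like a move off the grid, leaves the agent where it is.

    import random

    OBSTACLES = set()   # placeholder: fill in the filled squares from Figure 1
    GOAL = (5, 5)       # e_{5,5}
    PENALTY = (1, 4)    # e_{1,4}

    def step(state, action):
        # Apply an action; moves off the grid (or, by assumption, into an
        # obstacle) leave the state unchanged.
        x, y = state
        dx, dy = {"n": (0, 1), "s": (0, -1), "e": (1, 0), "w": (-1, 0)}[action]
        nx, ny = x + dx, y + dy
        if not (0 <= nx <= 5 and 0 <= ny <= 5) or (nx, ny) in OBSTACLES:
            return state
        return (nx, ny)

    def run_once():
        # One run of the part (b) program; reports whether e_{1,4} was entered.
        state = (0, 0)
        hit_penalty = False
        while state != GOAL:
            state = step(state, random.choice(["n", "e"]))  # each with prob 0.5
            if state == PENALTY:
                hit_penalty = True
        return hit_penalty

    trials = 100_000
    estimate = sum(run_once() for _ in range(trials)) / trials
    print(f"Estimated P(reward of -10) = {estimate:.4f}")

Note that the simulation only estimates the part (e) probability; the hint asks for the exact value, which you can get by summing the probabilities of the north/east paths that enter e_{1,4}. Also be aware that for some obstacle layouts a program that only moves north and east can get stuck, so check Figure 1 before trusting the loop to terminate.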
5. Suppose that the vacuum world from the lecture (the one on the slides for Lecture 1) contains obstacles that the agent has to avoid, and that the agent has a sensor to detect them. That is, there are squares that are blocked because they are simply inaccessible (such as a wall) and other squares where there are obstacles (e.g., a human foot). Write down a new control program for the agent.
Explain what guarantees, if any, you can make about your solution's ability to make the room clean, and justify your answer.
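If you want to test a candidate answer, a control rule in the style of a simple reflex agent can be sketched in a few lines of Python. Everything about the interface here is an assumption, since the slides' exact percepts and actions are not reproduced on this sheet: the sketch supposes a percept pair (dirty, obstacle_ahead) and actions "suck", "forward" and "turn".

    import random

    def vacuum_control(dirty, obstacle_ahead):
        # A minimal reflex rule under the assumed interface: clean first,
        # avoid sensed obstacles, and otherwise wander with an occasional
        # random turn so the agent does not shuttle along a single wall.
        if dirty:
            return "suck"
        if obstacle_ahead:
            return "turn"
        return "turn" if random.random() < 0.25 else "forward"

A rule like this carries at best a probabilistic guarantee: under reasonable assumptions about how "turn" behaves, a randomly wandering agent eventually visits every reachable square with probability 1, but no finite-time bound holds; this is exactly the kind of point your justification should address.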
Figure 1: The 6 × 6 grid environment. Rows and columns are numbered 0 to 5, north points up the page, and the goal square e_{5,5} is at the top right.
6. Suppose that the vacuum world agent's sensors (the one from the lecture that sees the dust, and the new one from Question 5 that can see obstacles) are now noisy, so that they only give the right answer 80% of the time.
(You can interpret this to mean that if there is dust or an obstacle, the relevant sensor will say that it is there 80% of the time. You can assume that when there is no dust or obstacle, the sensor always correctly reports this.)
How does this change the logic-based control program? Assume that if the agent tries to move into a square that contains an obstacle, it does not move.
Explain what guarantees, if any, you can make about your solution's ability to make the room clean, and justify your answer.
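One standard trick to consider in your answer: because the noise here is one-sided (an absent object is never misreported), polling the sensor several times and combining the readings with "or" drives the error down quickly. The Python sketch below assumes a read_sensor callable and, crucially, that successive readings are independent; the question states neither.

    def sense_repeatedly(read_sensor, k=3):
        # Poll a noisy sensor k times and report "present" if any reading
        # says so. A present object is detected with probability 0.8 per
        # reading and absence is always reported correctly, so under the
        # independence assumption the chance of missing a present object
        # falls from 0.2 to 0.2 ** k (0.008 for k = 3), with no false alarms.
        return any(read_sensor() for _ in range(k))

Whether this restores any cleaning guarantee depends on the same point: repeated sensing shrinks the failure probability per decision but never makes it zero, so at best the guarantees from Question 5 hold with high probability rather than with certainty.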