Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4
CMPSC 442
Week 4, Meeting 10, Three Segments
Outline
● Non-Deterministic Problems and Belief States
● Sensorless Problems
● Exploration Problems
Outline, Wk 4, Mtg 10 2
Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4
CMPSC 442
Week 4, Meeting 10, Segment 1 of 3: Non-Deterministic
Problems and Belief States
Non-Deterministic Worlds
● Deterministic Problems: given the states S and actions A
○ Current state + specific action → resulting successor state
● Non-Deterministic Problems: given the states S and actions A
○ Current state + specific action → belief state consisting of alternative
possible successors
○ Solution to search is a strategy for taking actions, rather than a specific
action sequence
○ The strategy execution is contingent on detecting results of actions
Non-deterministic Problems 4
Non-deterministic Vacuum Cleaner Agent
Non-deterministic Problems
8 States
● Three actions
○ Right: applies to odd-numbered states to
produce even-numbered states
○ Left: inverse of Right
○ Suck
■ When vacuum is in a dirty square, the
action will clean the square, and sometimes
an adjacent square
For example, Results(1, Suck) = {5, 7}
5
And-Or Search Trees: Specification
Search in non-deterministic worlds must consider alternative outcomes
● Or Nodes: As in the deterministic search methods, search trees will contain
state nodes with one or more possible action arcs; e.g., from state 1, an action
arc for each member of actions A = {Right, Suck}
● And Nodes: When an action has alternative outcomes {s
i
, s
i+1
, . . . s
n
}, the search
tree must consider the path to s
i
and to s
i+1
and . . . to s
n
Non-deterministic Problems 6
Partial And-Or Solution
● Solution is a tree instead of a path, with contingent states
○ If the action = Suck, and the result = state 7, goal
○ Else if the result = state 5, then move Right, then Suck
● At execution time, the agent can test which state resulted from a
non-deterministic action
7
● Rectangles
○ OR nodes for states
○ Arcs are actions
● Circles:
○ AND nodes
○ All possible outcomes
Non-deterministic Problems
Complete And-Or
Search Tree
● Two actions available from every state
○ Suck
○ Move to other square
● Move actions (Right/Left) are
deterministic: resulting AND node
always has a single child
● Suck actions are non-deterministic:
resulting AND node always has two
children
● The solution is a sub-tree (thicker
arrows)
Non-deterministic Problems 8
And-Or Search Tree
Solution SubTree
● Contains a GOAL at every leaf
○ LOOP states are not leaves
○ LOOP states are re-entry points
● A single action is chosen from each
OR node
● Every outcome branch is shown at
each AND node
Non-deterministic Problems 9
And-Or Search Algorithm: Recursive Depth First (1)
Non-deterministic Problems 10
And-Or Search Algorithm: Recursive Depth First (2)
Non-deterministic Problems 11
And-Or Search Algorithm: Recursive Depth First (3)
Non-deterministic Problems 12
Belief States
● When actions are non-deterministic, the agent must maintain a state
representation that includes all the possible action outcomes
● A state representation with alternative states that might exist is
referred to as a belief state
● Later in the course, we will see how uncertainty can be incorporated
into belief states through probabilistic reasoning
13Non-deterministic Problems
Search in Non-Deterministic or Partially
Observable Worlds
AIMA 4.3-4.4
CMPSC 442
Week 4, Meeting 10, Segment 2 of 3: Sensorless Problems
Not Observable Environments (Searching Blind)
● A solution to a non-deterministic problem assumes that at execution
time, the agent’s percepts can resolve the outcome of an action
● What if the agent’s percepts do not provide enough information?
○ The environment is partially observable or not observable
● Sensorless (or conformant) problems: states are not observable
○ Sensors can be time consuming, unreliable or can suffer damage
○ Example: in manufacturing, placing parts in the correct location could rely
on constraints/physics rather than sensing
○ Example: in medicine, a broad-spectrum antibiotic can treat many
infections, so no need to wait for test results to identify the pathogen
15Sensorless Problems
Coercion in a Sensorless World
Agent has no information about start state
● Start belief state = {1, 2, 3, 4, 5, 6, 7, 8}
● If agent moves right
○ Resulting belief state = {2, 4, 6, 8}
○ New belief state is less uncertain
● If agent moves right, then sucks
○ Resulting belief state = {4, 8}
● If agent action sequence = [Right, Suck, Left, Suck]
○ Resulting belief state = {7}
○ Coercion of certainty in a non-deterministic, sensorless world
16
8 States
Sensorless Problems
Using Search to Solve Sensorless Problems
● Applying search algorithms to sensorless problems
○ So far, we have used search algorithms to search the state space
○ The same algorithms can search the belief state
● Why should this work?
○ Percepts are irrelevant: they are always empty
○ The solution to a sensorless problem is a single sequence of actions
○ The belief state is deterministic: the agent knows its own beliefs by
definition
17Sensorless Problems
Search through Belief State Space
● States: ?
● Initial state: ?
● Actions: ?
● Transition model: ?
● Goal test: ?
● Action cost: ?
18Sensorless Problems
Search States, Initial State and Actions
● States: Belief state space
○ Every possible subset of physical states P
○ Where |P| = N, equal to 2N
(every subset is T or F)
● Initial state: P in the absence of prior knowledge
● Actions: Two cases
○ All actions are safe:
○ Some actions cause disaster:
19Sensorless Problems
Transition Model
● Transition model: Two cases
○ Deterministic actions:
○
○ Non-deterministic actions
20Sensorless Problems
Goal Test and Actions
● Goal Test
○ Agent possibly achieves the goal if any state s in the belief state satisfies
the goal test
○ Agent necessarily achieves the goal if all states s in the belief state satisfy
the goal test
● Action cost
○ Case one: if all actions have same cost from any state, then the cost is the
same as in the state space problem
○ Case two: if the same action can have different costs, depending on the
state, then the cost is a function of the belief state
21Sensorless Problems
Solving Sensorless Problems with Search
● The belief state-space can become too large for efficient search
● Methods to handle search in belief state spaces
○ Prune the belief space: e.g., if the belief state space at node ni is a
superset of the belief state space at node nj, then prune node ni
○ Use a more compact representation of belief (see chapter 7)
○ Incremental search:
■ A solution to an initial belief state S that contains {s1, s2, . . . sn} must work for
each state si ∈ S
■ So, find a solution to s1; test the solution for each next state; iterate
22Sensorless Problems
Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4
CMPSC 442
Week 4, Meeting 10, Segment 3 of 3: Partially Observable
Environments
8-Puzzle as Sensorless or Partially Observable
● 8-Puzzle cannot be solved if the environment is
non-observable/sensorless
● 8-Puzzle can be solved if just one cell ci can be observed:
○ If cell ci is empty, move an adjacent tile into ci and observe its value vi
○ Else record the value vi of the tile in ci
○ For every successor belief state sk
■ Keep track of the new location of vi in sk
■ Every time a new tile is moved into ci, observe its value vj
24Partially Observable Environments
Transition Model in Partially Observable Environments
● Prediction stage computes the hypothesized belief that results from
taking action a in belief state b:
● Possible Percepts stage computes the possible observations in the
predicted state:
● Update stage computes the belief state resulting from the percepts
25Partially Observable Environments
Belief Updating
● Observations cannot increase uncertainty:
● Sensing can be deterministic or non-deterministic
○ Deterministic sensing leads to disjoint belief states for the possible
percepts, thus a partition of the predicted belief state
26Partially Observable Environments
Putting it All Together
The belief states resulting from an action a in belief state b and the
observations o resulting from the resulting possible percepts
27Partially Observable Environments
Example: Partially Observable, Deterministic
28Partially Observable Environments
Example: Partially Observable, non-Deterministic
29Partially Observable Environments
Solving Partially Observable Problems
● In place of the successor function from fully observable deterministic
search, we now have:
○ A PERCEPT function to produce possible observations in successor belief
state
○ A RESULTS function to update the belief state
● Given the above, the AND-OR search algorithm can be applied to the
belief state space to find a solution
● The solution is a conditional plan
30Partially Observable Environments
Execution of the Conditional Plan
● To execute an if-then-else expression in the conditional plan
○ Agent receives percept, then executes the appropriate branch of the
condition
○ Agent updates its beliefs after each action
● Similar to, but simpler than, the search
○ Percepts are actual observations in the environment, rather than possible
observations maintained for all ways the belief state space could evolve
31Partially Observable Environments
Partially Observable Problems
● Most real-world problems involve partial knowledge of the state of the
world
● Belief update is central to agents that operate in partially observable
worlds
32Partially Observable Environments
Robot Localization: First Percept
● Robot percept: length 4 vector representing sonar percept of wall or
no wall to North, East, South and West
● First percept [1011] = walls on the North, South, West orientations only
● Four locations robot could be shown by target circles
33Partially Observable Environments
Robot Localization: First + Second Percept
● Second percept [1010] = walls on the North & South orientations only
● Because of the prior percept [1011], there is only one possible location
for the robot
34Partially Observable Environments
Summary
● AND-OR search is used in non-deterministic environments
● A partially observable environment requires a new data structure, the
agent’s belief state
● Standard search algorithms can be applied to sensorless problems,
producing a solution that is a sequence of actions
● AND-OR search can be applied to partially observable environments
35Wk 4, Mtg 10 Summary