CS计算机代考程序代写 data structure algorithm Search in Non-Deterministic or Partially

Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4

CMPSC 442
Week 4, Meeting 10, Three Segments

Outline

● Non-Deterministic Problems and Belief States
● Sensorless Problems
● Exploration Problems

Outline, Wk 4, Mtg 10 2

Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4

CMPSC 442
Week 4, Meeting 10, Segment 1 of 3: Non-Deterministic
Problems and Belief States

Non-Deterministic Worlds

● Deterministic Problems: given the states S and actions A
○ Current state + specific action → resulting successor state

● Non-Deterministic Problems: given the states S and actions A
○ Current state + specific action → belief state consisting of alternative

possible successors
○ Solution to search is a strategy for taking actions, rather than a specific

action sequence
○ The strategy execution is contingent on detecting results of actions

Non-deterministic Problems 4

Non-deterministic Vacuum Cleaner Agent

Non-deterministic Problems

8 States

● Three actions
○ Right: applies to odd-numbered states to

produce even-numbered states
○ Left: inverse of Right
○ Suck

■ When vacuum is in a dirty square, the
action will clean the square, and sometimes
an adjacent square

For example, Results(1, Suck) = {5, 7}

5

And-Or Search Trees: Specification

Search in non-deterministic worlds must consider alternative outcomes
● Or Nodes: As in the deterministic search methods, search trees will contain

state nodes with one or more possible action arcs; e.g., from state 1, an action
arc for each member of actions A = {Right, Suck}

● And Nodes: When an action has alternative outcomes {s
i
, s

i+1
, . . . s

n
}, the search

tree must consider the path to s
i
and to s

i+1
and . . . to s

n

Non-deterministic Problems 6

Partial And-Or Solution

● Solution is a tree instead of a path, with contingent states
○ If the action = Suck, and the result = state 7, goal
○ Else if the result = state 5, then move Right, then Suck

● At execution time, the agent can test which state resulted from a
non-deterministic action

7

● Rectangles
○ OR nodes for states
○ Arcs are actions

● Circles:
○ AND nodes
○ All possible outcomes

Non-deterministic Problems

Complete And-Or
Search Tree
● Two actions available from every state

○ Suck
○ Move to other square

● Move actions (Right/Left) are
deterministic: resulting AND node
always has a single child

● Suck actions are non-deterministic:
resulting AND node always has two
children

● The solution is a sub-tree (thicker
arrows)

Non-deterministic Problems 8

And-Or Search Tree
Solution SubTree
● Contains a GOAL at every leaf

○ LOOP states are not leaves
○ LOOP states are re-entry points

● A single action is chosen from each
OR node

● Every outcome branch is shown at
each AND node

Non-deterministic Problems 9

And-Or Search Algorithm: Recursive Depth First (1)

Non-deterministic Problems 10

And-Or Search Algorithm: Recursive Depth First (2)

Non-deterministic Problems 11

And-Or Search Algorithm: Recursive Depth First (3)

Non-deterministic Problems 12

Belief States

● When actions are non-deterministic, the agent must maintain a state
representation that includes all the possible action outcomes

● A state representation with alternative states that might exist is
referred to as a belief state

● Later in the course, we will see how uncertainty can be incorporated
into belief states through probabilistic reasoning

13Non-deterministic Problems

Search in Non-Deterministic or Partially
Observable Worlds
AIMA 4.3-4.4

CMPSC 442
Week 4, Meeting 10, Segment 2 of 3: Sensorless Problems

Not Observable Environments (Searching Blind)

● A solution to a non-deterministic problem assumes that at execution
time, the agent’s percepts can resolve the outcome of an action

● What if the agent’s percepts do not provide enough information?
○ The environment is partially observable or not observable

● Sensorless (or conformant) problems: states are not observable
○ Sensors can be time consuming, unreliable or can suffer damage
○ Example: in manufacturing, placing parts in the correct location could rely

on constraints/physics rather than sensing
○ Example: in medicine, a broad-spectrum antibiotic can treat many

infections, so no need to wait for test results to identify the pathogen

15Sensorless Problems

Coercion in a Sensorless World

Agent has no information about start state
● Start belief state = {1, 2, 3, 4, 5, 6, 7, 8}
● If agent moves right

○ Resulting belief state = {2, 4, 6, 8}
○ New belief state is less uncertain

● If agent moves right, then sucks
○ Resulting belief state = {4, 8}

● If agent action sequence = [Right, Suck, Left, Suck]
○ Resulting belief state = {7}
○ Coercion of certainty in a non-deterministic, sensorless world

16

8 States

Sensorless Problems

Using Search to Solve Sensorless Problems

● Applying search algorithms to sensorless problems
○ So far, we have used search algorithms to search the state space
○ The same algorithms can search the belief state

● Why should this work?
○ Percepts are irrelevant: they are always empty
○ The solution to a sensorless problem is a single sequence of actions
○ The belief state is deterministic: the agent knows its own beliefs by

definition

17Sensorless Problems

Search through Belief State Space

● States: ?
● Initial state: ?
● Actions: ?
● Transition model: ?
● Goal test: ?
● Action cost: ?

18Sensorless Problems

Search States, Initial State and Actions

● States: Belief state space
○ Every possible subset of physical states P
○ Where |P| = N, equal to 2N

(every subset is T or F)

● Initial state: P in the absence of prior knowledge
● Actions: Two cases

○ All actions are safe:
○ Some actions cause disaster:

19Sensorless Problems

Transition Model

● Transition model: Two cases
○ Deterministic actions:

○ Non-deterministic actions

20Sensorless Problems

Goal Test and Actions

● Goal Test
○ Agent possibly achieves the goal if any state s in the belief state satisfies

the goal test
○ Agent necessarily achieves the goal if all states s in the belief state satisfy

the goal test

● Action cost
○ Case one: if all actions have same cost from any state, then the cost is the

same as in the state space problem
○ Case two: if the same action can have different costs, depending on the

state, then the cost is a function of the belief state

21Sensorless Problems

Solving Sensorless Problems with Search

● The belief state-space can become too large for efficient search
● Methods to handle search in belief state spaces

○ Prune the belief space: e.g., if the belief state space at node ni is a
superset of the belief state space at node nj, then prune node ni

○ Use a more compact representation of belief (see chapter 7)
○ Incremental search:

■ A solution to an initial belief state S that contains {s1, s2, . . . sn} must work for
each state si ∈ S

■ So, find a solution to s1; test the solution for each next state; iterate

22Sensorless Problems

Search in Non-Deterministic or Partially
Observable Environments
AIMA 4.3-4.4

CMPSC 442
Week 4, Meeting 10, Segment 3 of 3: Partially Observable
Environments

8-Puzzle as Sensorless or Partially Observable

● 8-Puzzle cannot be solved if the environment is
non-observable/sensorless

● 8-Puzzle can be solved if just one cell ci can be observed:
○ If cell ci is empty, move an adjacent tile into ci and observe its value vi
○ Else record the value vi of the tile in ci
○ For every successor belief state sk

■ Keep track of the new location of vi in sk
■ Every time a new tile is moved into ci, observe its value vj

24Partially Observable Environments

Transition Model in Partially Observable Environments

● Prediction stage computes the hypothesized belief that results from
taking action a in belief state b:

● Possible Percepts stage computes the possible observations in the
predicted state:

● Update stage computes the belief state resulting from the percepts

25Partially Observable Environments

Belief Updating

● Observations cannot increase uncertainty:

● Sensing can be deterministic or non-deterministic
○ Deterministic sensing leads to disjoint belief states for the possible

percepts, thus a partition of the predicted belief state

26Partially Observable Environments

Putting it All Together

The belief states resulting from an action a in belief state b and the
observations o resulting from the resulting possible percepts

27Partially Observable Environments

Example: Partially Observable, Deterministic

28Partially Observable Environments

Example: Partially Observable, non-Deterministic

29Partially Observable Environments

Solving Partially Observable Problems

● In place of the successor function from fully observable deterministic
search, we now have:
○ A PERCEPT function to produce possible observations in successor belief

state
○ A RESULTS function to update the belief state

● Given the above, the AND-OR search algorithm can be applied to the
belief state space to find a solution

● The solution is a conditional plan

30Partially Observable Environments

Execution of the Conditional Plan

● To execute an if-then-else expression in the conditional plan
○ Agent receives percept, then executes the appropriate branch of the

condition
○ Agent updates its beliefs after each action

● Similar to, but simpler than, the search
○ Percepts are actual observations in the environment, rather than possible

observations maintained for all ways the belief state space could evolve

31Partially Observable Environments

Partially Observable Problems

● Most real-world problems involve partial knowledge of the state of the
world

● Belief update is central to agents that operate in partially observable
worlds

32Partially Observable Environments

Robot Localization: First Percept

● Robot percept: length 4 vector representing sonar percept of wall or
no wall to North, East, South and West

● First percept [1011] = walls on the North, South, West orientations only
● Four locations robot could be shown by target circles

33Partially Observable Environments

Robot Localization: First + Second Percept

● Second percept [1010] = walls on the North & South orientations only
● Because of the prior percept [1011], there is only one possible location

for the robot

34Partially Observable Environments

Summary

● AND-OR search is used in non-deterministic environments
● A partially observable environment requires a new data structure, the

agent’s belief state
● Standard search algorithms can be applied to sensorless problems,

producing a solution that is a sequence of actions
● AND-OR search can be applied to partially observable environments

35Wk 4, Mtg 10 Summary