COMP 424 – Artificial Intelligence Searching Under Uncertainty
Instructor: Jackie CK Cheung and Readings: R&N 4.3, 4.4
Major assumptions so far
Copyright By PowCoder代写 加微信 powcoder
• The environment is fully observable
• The agent always knows exactly what state it’s in
• The environment is deterministic
• The outcome of every action is known with certainty
As a consequence:
• Perception doesn’t even matter
• The solution to a problem is a single sequence of actions
• The solution can be predetermined via search
COMP-424: Artificial intelligence 2
Major assumptions so far
• The environment is fully observable
• The environment is deterministic
• Clearly, these assumptions don’t hold in many real-world problems
• We typically don’t know the complete state of the world
• Actions tend to have unintended consequences
• So let’s relax the assumptions…
• Let’s allow for uncertainty
COMP-424: Artificial intelligence 3
Sources of uncertainty
There are two major sources:
• The agent may not be able to observe which state it is in • We can classify problems as observable, partially
observable, or non-observable
• The agent may not know what the outcomes of its actions
• In this case, the actions are non-deterministic
COMP-424: Artificial intelligence 4
Examples of partial observability
Card games Pong (without memory)
Word games
When sensors are imperfect
COMP-424: Artificial intelligence
Examples of non-deterministic actions
Rolling dice
Driving on uneven terrain
Attacking, etc., in an RPG
Anything you say or do as a human being…
COMP-424: Artificial intelligence 6
Searching with non-deterministic actions
• Consider the case where the next state is not fully determined by the current state and action
e.g., Play backgammon
• The sets of states and actions are finite
• Each action can have multiple outcomes depending on the
result of the dice roll
• Future states cannot be determined in advance
• The solution to this problem is not a path, but a contingency plan (or strategy)
COMP-424: Artificial intelligence 7
Different search settings
Let’s consider a simple problem under different assumptions about observability and determinism: the vacuum world
• Two rooms
• Vacuum in one of the rooms
• Rooms can be dirty or clean
COMP-424: Artificial intelligence 8
Observable vacuum world
Observable, deterministic case:
• Start in a given initial state: {5}
• Plan: {Right; Sweep}
{Left, Right, Sweep}
Transitions: As expected Goal states: {7, 8}
COMP-424: Artificial intelligence
Observable vacuum world: state space
Observable Deterministic actions
COMP-424: Artificial intelligence 10
Non-observable case:
• Start in any state: {1,2,3,4,5,6,7,8}
• Begin in a set of possible states
Non-observable vacuum world
1. Observable case:
• Start in a given state: {5}
• Plan: {Right; Sweep}
{Left, Right, Sweep}
Transitions: As expected Goal states: {7, 8}
COMP-424: Artificial intelligence
Non-observable vacuum world
1. Observable case:
• Start in a given state: {5}
• Plan: {Right; Sweep}
2. Non-observable case:
• Start in any state: {1,2,3,4,5,6,7,8}
• Begin in a set of possible states
• Plan: {Right; ? }
• Taking an action (e.g., Right) moves the agent to a new set of states
• We must reason over sets now
• We call these sets “beliefs”
{Left, Right, Sweep}
Transitions: As expected Goal states: {7, 8}
COMP-424: Artificial intelligence
Searching with unobservable states
• Consider the case where agent is in one of several possible states, but cannot observe which
• We can represent uncertainty using a belief state
• i.e., the agent’s current belief about the possible states, given
the sequence of actions and observations up to that point
• a belief state is a set of physical states
• What’s the total number of possible beliefs?
• Here, each of the 8 possible states may be included (or not) in the belief state
• We can think of a belief state as an 8-bit vector: 28 = 256
• Technically 28-1, since the belief has to include at least 1 state
COMP-424: Artificial intelligence 13
Searching with unobservable states
• Consider the case where agent is in one of several possible states, but cannot observe which
• We can represent uncertainty using a belief state
• i.e., the agent’s current belief about the possible states, given
the sequence of actions and observations up to that point
• a belief state is a set of physical states
• What’s the total number of possible beliefs? 28-1
• But what’s the number of reachable belief states?
COMP-424: Artificial intelligence 14
Non-observable vacuum: reachable belief space
Non-observable Deterministic actions
Initial belief state = total ignorance 😰 =
8 physical states
NB: this figure omits self links!
COMP-424: Artificial intelligence 15
Searching with unobservable states
• Consider the case where agent is in one of several possible states, but cannot observe which
• We can represent uncertainty using a belief state
• i.e., the agent’s current belief about the possible states, given
the sequence of actions and observations up to that point
• a belief state is a set of physical states
• What’s the total number of possible beliefs? 28-1
• But what’s the number of reachable belief states? Only 12
COMP-424: Artificial intelligence 16
Searching with unobservable states
• Assume (for now) that there are no observations and actions are deterministic.
• Start in any state: b = {1, 2, 3, 4, 5, 6, 7, 8}
• How to use actions?
b = {1, 2, 3, 4, 5, 6, 7, 8} →Right {2, 4, 6, 8} = b’
For deterministic actions: |b’| ≤ |b|
COMP-424: Artificial intelligence 17
Searching with unobservable states
• Assume (for now) that there are no observations and actions are deterministic.
• Start in any state: b = {1, 2, 3, 4, 5, 6, 7, 8}
• How to use actions?
b = {1, 2, 3, 4, 5, 6, 7, 8} →Right {2, 4, 6, 8} = b’
For deterministic actions: |b’| ≤ |b|
• Plan? Use a conformant plan
{1, 2, 3, 4, 5, 6, 7, 8} →Right {2, 4, 6, 8} →Sweep{4, 8} →Left {3,7} →Sweep{7}
From a state of total uncertainty, we can coerce the world into state 7 COMP-424: Artificial intelligence 18
Conformant planning
• Goal: a plan that is guaranteed to take us from any initial state to a goal state, no matter what the effect of actions is
• A good search heuristic is to favor actions that reduce uncertainty
→ reduce the belief state to a single physical state → you might need to back up from bad belief states
• Once you know what physical state you’re in, apply standard search to reach the goal!
• Use conformant planning when there are NO observations 19
Visual analogy
• Feeding parts for manufacturing: http://goldberg.berkeley.edu/fences.mpg
Faulty vacuum world
1. Observable case:
• Start in a given state: {5}
• Plan: {Right; Sweep}
2. Non-observable case:
• Start in any of: {1,2,3,4,5,6,7,8}
• Plan: {Right; Sweep; Left; Sweep}
Non-deterministic (but observable) case:
• Start in a given state: {1}
• Transitions are no longer as expected:
• Sweeping in a dirty square sometimes also cleans the adjacent square;
Sweeping in a clean square sometimes deposits dirt • Plan?
COMP-424: Artificial intelligence 21
Non-deterministic actions: AND-OR search
In the case of stochastic actions, we also start by constructing a search tree
We augment trees to use two node types: AND and OR
Sweep OR Right
COMP-424: Artificial intelligence
Non-deterministic actions: AND-OR search
AND-OR search tree:
• OR nodes: branching is induced by the agent’s choice between actions
• AND nodes: branching is induced by the environment’s choice of outcome
Sweep OR Right
COMP-424: Artificial intelligence
Observable Non-deterministic actions
(faulty vacuum)
COMP-424: Artificial intelligence
deterministic
stochastic
AND nodes only branch when the related action is stochastic.
The two node types alternate.
Observable Non-deterministic actions
(faulty vacuum)
COMP-424: Artificial intelligence
deterministic
stochastic
Non-deterministic vacuum: AND-OR search
A solution for an AND-OR search problem is a subtree that satisfies the following conditions:
1. It specifies one action at each OR node
2. It includes every outcome at each AND node
3. It has a goal node at every leaf
COMP-424: Artificial intelligence 26
Observable Non-deterministic actions
(faulty vacuum)
COMP-424: Artificial intelligence
deterministic
stochastic
In the AND-OR setting we can find a solution with breadth-first or best-first search (or other variants).
Observable Non-deterministic actions
(faulty vacuum)
COMP-424: Artificial intelligence
deterministic
stochastic
Another case: The slippery vacuum
Movement actions, {Left, Right}, sometimes fail, leaving the agent in the same location. E.g. {1} ->Right {1, 2}
COMP-424: Artificial intelligence 29
Another case: The slippery vacuum
Movement actions, {Left, Right}, sometimes fail, leaving the agent in the same location. E.g., {1} ->Right {1, 2}
There are no longer any acyclic solutions starting from State 1. Thus, AND-OR search will return with failure. What now?
COMP-424: Artificial intelligence 30
Keep on trying until it works
Cyclic solution: keep on trying an action until it works.
Is this a good solution?
Depends on source of the stochasticity:
• OK if caused by a random event • e.g., Die roll
• Not OK if caused by an unobserved event • e.g., broken vacuum
Observable Non-deterministic actions
(slippery vacuum)
COMP-424: Artificial intelligence
Exercise: Tic-Tac-Toe
• What would the AND-OR tree for tic-tac-toe look like? Draw it for a few levels.
• What is a goal node?
• Suppose you are the starting player (X), and your
opponent is (O).
COMP-424: Artificial intelligence 32
Exercise: Tic-Tac-Toe
COMP-424: Artificial intelligence 33
Exercise: Tic-Tac-Toe
COMP-424: Artificial intelligence 34
Exercise: Tic-Tac-Toe
COMP-424: Artificial intelligence 35
Exercise: Tic-Tac-Toe
COMP-424: Artificial intelligence 36
Partial observability
• Partially observable states lie between fully observable and unobservable states
• E.g., perhaps the vacuum can only sense locally: it senses if the current room is dirty or clean but not the adjacent room
• In this case, the agent must account for possible observations
• Observations (percepts) are very useful: they tell the agent
something partial about its new state
• But they don’t tell it everything. It must again maintain a
belief over possibilities
COMP-424: Artificial intelligence 37
Partial observability
• The problem formulation must specify how the environment generates observations from the state (observation fcn)
• Typically, several states can generate the same observation
• We can think of transitions from one belief state to the next
for a given action as occurring in three stages:
1. prediction: from the given action and the current belief, predict the
next belief state (same as in unobservable case)
2. observation prediction: predict which observations could be
emitted in the predicted belief state
3. update: for each possible observation, determine the resulting
belief state; this is the set of states that could have produced the observation
COMP-424: Artificial intelligence 38
Search with partial observations
• Deterministic case: 1.
• Non-deterministic case:
Partially observable (local sensing)
Non-deterministic actions (slippery vacuum)
COMP-424: Artificial intelligence
Partially observable (local sensing)
Deterministic actions
Searching the belief space
In general, we can apply any standard search algorithm for partially observable problems
E.g., AND-OR search for the local-sensing vacuum:
AND-OR search returns a conditional plan that tests the belief state rather than the actual state
COMP-424: Artificial intelligence 40
Searching the belief space
• In general, can apply any standard search algorithm for partially observable problems
• However, scalability is a major issue!
• Problem #1: Number of reachable beliefs can be very large
• Use sampling or pruning to reduce this
• Problem #2: Number of physical states in each belief state can be very large
• Use a compact state representation
• Plan for each state separately (not as a search over belief space)
COMP-424: Artificial intelligence 41
• Actions can be {deterministic, nondeterministic}
• States can be {fully observable, unobservable, partially observable (deterministic), partially observable (non-deterministic)}
• Belief states represent the set of possible physical states an agent is in
• AND-OR trees can be used to represent and search through a belief state space to produce a plan, even when there is uncertainty
COMP-424: Artificial intelligence 42
Exercise: Mastermind
http://www.web-games-online.com/mastermind/index.php
• Belief space?
• Initial belief?
• Action space?
• Deterministic / non-deterministic actions?
• Possible percepts?
• Perception function?
• Goal test?
• Step cost?
COMP-424: Artificial intelligence 43
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com