CISC 6525 Artificial Intelligence
Time and Uncertainty
Chapter 15
Temporal Probabilistic Agent
[Diagram: agent with sensors and actuators interacting with the environment over time steps t1, t2, t3, …]
Time and Uncertainty
The world changes; we need to track and predict it
Examples: diabetes management, traffic monitoring
Basic idea: copy state and evidence variables for each time step
Xt – set of unobservable state variables at time t
e.g., BloodSugart, StomachContentst
Et – set of evidence variables at time t
e.g., MeasuredBloodSugart, PulseRatet, FoodEatent
Assumes discrete time steps
States and Observations
Process of change is viewed as series of snapshots, each describing the state of the world at a particular time
Each time slice involves a set of random variables indexed by t:
the set of unobservable state variables Xt
the set of observable evidence variables Et
The observation at time t is Et = et for some set of values et
The notation Xa:b denotes the set of variables from Xa to Xb
Stationary Process/Markov Assumption
Markov Assumption: Xt depends on a bounded subset of the previous states X0:t-1
First-order Markov process:
P(Xt|X0:t-1) = P(Xt|Xt-1)
kth order: depends on previous k time steps
Sensor Markov assumption:
P(Et|X0:t, E0:t-1) = P(Et|Xt)
Assume stationary process: transition model P(Xt|Xt-1) and sensor model P(Et|Xt) are the same for all t
In a stationary process, the changes in the world state are governed by laws that do not themselves change over time
Complete Joint Distribution
Given:
Transition model: P(Xt|Xt-1)
Sensor model: P(Et|Xt)
Prior probability: P(X0)
Then we can specify the complete joint distribution:
P(X0, X1, …, Xt, E1, …, Et) = P(X0) ∏i=1..t P(Xi|Xi-1) P(Ei|Xi)
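The factorization above can be evaluated directly for concrete value sequences. A minimal Python sketch, using the rain/umbrella CPTs from the DBN example later in these slides (the function name and dictionary layout are assumptions of this example):

```python
# Joint probability of a concrete state/evidence sequence, using
# P(x0, x1..xt, e1..et) = P(x0) * prod_i P(xi|xi-1) * P(ei|xi).

prior = {True: 0.5, False: 0.5}                  # P(R0)
trans = {True: {True: 0.7, False: 0.3},          # P(Rt | Rt-1 = key)
         False: {True: 0.3, False: 0.7}}
sensor = {True: {True: 0.9, False: 0.1},         # P(Ut | Rt = key)
          False: {True: 0.2, False: 0.8}}

def joint(states, evidence):
    """states = [x0, x1, ..., xt], evidence = [e1, ..., et]."""
    p = prior[states[0]]
    for i in range(1, len(states)):
        p *= trans[states[i - 1]][states[i]]     # transition term
        p *= sensor[states[i]][evidence[i - 1]]  # sensor term
    return p

# P(R0=true, R1=true, U1=true) = 0.5 * 0.7 * 0.9 = 0.315
print(joint([True, True], [True]))
```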
Probabilistic Temporal Models
Hidden Markov Models (HMMs)
Kalman Filters
Dynamic Bayesian Networks (DBNs)
DBN Example
Raint-1
Umbrellat-1
Raint
Umbrellat
Raint+1
Umbrellat+1
Rt-1 | P(Rt = T | Rt-1)
  T  | 0.7
  F  | 0.3

Rt | P(Ut = T | Rt)
 T | 0.9
 F | 0.2
Transition model: P(Raint | Raint-1)
Sensor model: P(Umbrellat | Raint )
Inference Tasks
Filtering or monitoring: P(Xt|e1,…,et)
computing current belief state, given all evidence to date
Prediction: P(Xt+k|e1,…,et)
computing prob. of some future state
Smoothing: P(Xk|e1,…,et)
computing prob. of past state (hindsight)
Most likely explanation: arg max over x1:t of P(x1:t|e1:t)
given a sequence of observations, find the sequence of states that is most likely to have generated those observations
Examples
Filtering: What is the probability that it is raining today, given all the umbrella observations up through today?
Prediction: What is the probability that it will rain the day after tomorrow, given all the umbrella observations up through today?
Smoothing: What is the probability that it rained yesterday, given all the umbrella observations through today?
Most likely explanation: if the umbrella appeared the first three days but not on the fourth, what is the most likely weather sequence to produce these umbrella sightings?
Filtering
We use recursive estimation to compute P(Xt+1|e1:t+1) as a function of et+1 and P(Xt|e1:t)
We can write this as follows:
P(Xt+1|e1:t+1) = α P(et+1|Xt+1) Σxt P(Xt+1|xt) P(xt|e1:t)
This leads to a recursive definition
f1:t+1 = FORWARD(f1:t, et+1)
Umbrella Example
Day 0 – no observations: P(R0) = (0.5,0.5) (note: using "(" ")" instead of "⟨" "⟩")
Day 1 – umbrella appears, U1 = true
prediction t=0 to t=1:
P(R1) = Σr0 P(R1|r0)P(r0) = (0.7,0.3)·0.5 + (0.3,0.7)·0.5 = (0.5,0.5)
update with evidence for t=1:
P(R1|u1) = α P(u1|R1)P(R1) = α (0.9,0.2)(0.5,0.5) = α (0.45,0.1) = (0.818,0.182)
Day 2 – umbrella appears, U2 = true
prediction t=1 to t=2:
P(R2|u1) = Σr1 P(R2|r1)P(r1|u1) = (0.7,0.3)·0.818 + (0.3,0.7)·0.182 = (0.627,0.373)
update with evidence for t=2:
P(R2|u1,u2) = α P(u2|R2)P(R2|u1) = α (0.9,0.2)(0.627,0.373) = α (0.565,0.075) = (0.883,0.117)
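The two filtering steps above can be reproduced in a few lines of Python (a sketch; the tuple layout (P(rain), P(¬rain)) and function name are assumptions of this example):

```python
# Forward (filtering) step for the umbrella HMM, reproducing the
# day-1 and day-2 numbers. Vectors are (P(rain), P(not rain)).

T = {True: (0.7, 0.3), False: (0.3, 0.7)}   # P(Rt | Rt-1 = key)
S = {True: (0.9, 0.2), False: (0.1, 0.8)}   # (P(u|rain), P(u|not rain)) for u = key

def forward(f, u):
    # prediction: sum over the previous state
    pred = (T[True][0] * f[0] + T[False][0] * f[1],
            T[True][1] * f[0] + T[False][1] * f[1])
    # update with evidence, then normalize (the alpha step)
    up = (S[u][0] * pred[0], S[u][1] * pred[1])
    z = up[0] + up[1]
    return (up[0] / z, up[1] / z)

f = (0.5, 0.5)                 # prior P(R0)
f = forward(f, True)           # day 1: umbrella seen
print(round(f[0], 3))          # 0.818
f = forward(f, True)           # day 2: umbrella seen
print(round(f[0], 3))          # 0.883
```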
Smoothing
Compute P(Xk|e1:t) for 0 ≤ k < t
Using a backward message bk+1:t = P(ek+1:t|Xk), we obtain
P(Xk|e1:t) = α f1:k × bk+1:t
The backward message can be computed using
P(ek+1:t|Xk) = Σxk+1 P(ek+1|xk+1) P(ek+2:t|xk+1) P(xk+1|Xk)
This leads to a recursive definition
bk+1:t = BACKWARD(bk+2:t, ek+1:t)
Umbrella Example
Compute the smoothed estimate of the probability of rain at k=1, given the umbrella observations on days 1 and 2:
P(R1|u1,u2) = α P(R1|u1) P(u2|R1)
The first term is (0.818,0.182), from the filtering example.
The second term is the backward message:
P(u2|R1) = Σr2 P(u2|r2) · 1 · P(r2|R1) = (0.9 × 1 × (0.7,0.3)) + (0.2 × 1 × (0.3,0.7)) = (0.69,0.41)
(the factor 1 is the empty backward message b3:2)
giving us
P(R1|u1,u2) = α (0.818,0.182) × (0.69,0.41) = (0.883,0.117)
Note: the smoothed estimate for rain on day 1 is higher than the filtered estimate (0.818), because the day-2 umbrella makes rain on day 2 more likely, and rain tends to persist
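The backward message and smoothing step above can be checked with a short sketch (tuple layout and names are assumptions, matching the filtering example):

```python
# Backward message and smoothing for the two-day umbrella example,
# reproducing P(R1 | u1, u2) ~ (0.883, 0.117).

T = {True: (0.7, 0.3), False: (0.3, 0.7)}   # P(R_{k+1} | Rk = key)
S = {True: (0.9, 0.2), False: (0.1, 0.8)}   # (P(u|rain), P(u|not rain))

def backward(b, u):
    # b_{k+1:t}(Xk) = sum_{x'} P(u|x') * b(x') * P(x'|Xk)
    return (S[u][0] * b[0] * T[True][0] + S[u][1] * b[1] * T[True][1],
            S[u][0] * b[0] * T[False][0] + S[u][1] * b[1] * T[False][1])

f1 = (0.818, 0.182)             # filtered P(R1|u1) from the forward pass
b = backward((1.0, 1.0), True)  # P(u2|R1) = (0.69, 0.41); (1,1) is the empty message
sm = (f1[0] * b[0], f1[1] * b[1])
z = sm[0] + sm[1]               # normalization (the alpha step)
print(round(sm[0] / z, 3))      # 0.883
```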
Most Likely Explanation: Viterbi Algorithm
Viterbi Example
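A compact Viterbi sketch for the umbrella model (not the book's pseudocode; dict layout is an assumption). For the observation sequence from the inference-task example, umbrella on the first three days but not the fourth, it returns rain on the first three days and no rain on the fourth:

```python
# Viterbi: track the probability of the best path ending in each state,
# remember the argmax predecessors, then backtrack from the best final state.

def viterbi(obs):
    T = {True: {True: 0.7, False: 0.3}, False: {True: 0.3, False: 0.7}}
    S = {True: {True: 0.9, False: 0.2}, False: {True: 0.1, False: 0.8}}
    prior = {True: 0.5, False: 0.5}
    states = [True, False]
    # best-path score for the first state
    m = {x: S[obs[0]][x] * max(T[x0][x] * prior[x0] for x0 in states)
         for x in states}
    back = []
    for u in obs[1:]:
        prev = {x: max(states, key=lambda x0: T[x0][x] * m[x0]) for x in states}
        m = {x: S[u][x] * T[prev[x]][x] * m[prev[x]] for x in states}
        back.append(prev)
    # backtrack from the best final state
    x = max(states, key=lambda s: m[s])
    path = [x]
    for prev in reversed(back):
        x = prev[x]
        path.append(x)
    return list(reversed(path))

print(viterbi([True, True, True, False]))  # [True, True, True, False]
```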
Hidden Markov Model
HMM Example: Robot Localization
Xt - location of robot on grid {s1,…,sn}
NEIGHBORS(s) – { empty squares adjacent to s }
N(s) = size of NEIGHBORS(s)
Transition Model
P(Xt+1 = j | Xt = i) = 1/N(i) if j ∈ NEIGHBORS(i), else 0
Priors
P(X0 = i) = 1/n. If n = 42, the transition matrix T has 42 × 42 = 1764 entries
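The transition model and uniform prior can be sketched for a small grid (the 3×3 layout and the blocked square are illustration assumptions, not from the slides):

```python
# Transition matrix for HMM robot localization on a tiny grid:
# P(Xt+1 = j | Xt = i) = 1/N(i) if j is an empty neighbor of i, else 0.

blocked = {(1, 1)}                      # assumed obstacle
cells = [(r, c) for r in range(3) for c in range(3) if (r, c) not in blocked]

def neighbors(s):
    r, c = s
    cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [p for p in cand if p in cells]

n = len(cells)
T = [[0.0] * n for _ in range(n)]
for i, s in enumerate(cells):
    for nb in neighbors(s):
        T[i][cells.index(nb)] = 1.0 / len(neighbors(s))

prior = [1.0 / n] * n                   # uniform P(X0)
print(n, sum(T[0]))                     # 8 cells; each row sums to 1
```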
HMM Example: Robot Localization
Et has 16 possible values, each a four-bit reading giving the presence or absence of an obstacle in the N, S, E, W directions from the robot; e.g. et = NS
Assume the error rate for each bit is ε, independently
The probability that all four bits are correct is (1 − ε)^4, and ε^4 that all four are wrong
The discrepancy dit is the number of bits that differ between the true values for square i and the reading et, giving the sensor model
P(Et = et | Xt = si) = (1 − ε)^(4 − dit) ε^dit
For example, the probability that a square with obstacles to the north and south would produce the sensor reading NSE is (1 − ε)^3 ε^1
Even when ε is 20% (which means that the overall sensor reading is wrong 59% of the time), the robot is usually able to work out its location within two squares after 25 observations.
When ε is 10%, the performance after a half-dozen observations is hard to distinguish from the performance with perfect sensing.
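The sensor model above is a two-line function (bit encoding N,S,E,W and the function name are assumptions of this sketch):

```python
# Four-bit sonar sensor model with per-bit error rate eps:
# P(e | square i) = (1 - eps)^(4 - d) * eps^d, where d is the bit discrepancy.

def sensor_prob(true_bits, reading, eps=0.2):
    d = sum(t != r for t, r in zip(true_bits, reading))   # discrepancy d_it
    return (1 - eps) ** (4 - d) * eps ** d

# square with obstacles N and S (bits N,S,E,W), reading claims N,S,E:
# one bit differs, so P = (1-eps)^3 * eps = 0.8^3 * 0.2 = 0.1024
print(sensor_prob((1, 1, 0, 0), (1, 1, 1, 0)))
```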
Kalman Filters
Linear Gaussian
Next state is a linear function of the current state PLUS Gaussian noise
Bayes network structure
for a linear dynamical system
Updating a Gaussian distribution
Simple 1D Example: Random Walk
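One step of the scalar Kalman update for a Gaussian random walk can be sketched as follows (the noise variances and starting values are arbitrary illustration numbers):

```python
# 1D Kalman update: predict (variances add), then correct toward the
# measurement z with gain k = var_pred / (var_pred + var_z). This is
# algebraically the same as mu' = ((var+var_x)*z + var_z*mu)/(var+var_x+var_z).

def kalman_step(mu, var, z, var_x=2.0, var_z=1.0):
    var_pred = var + var_x                      # transition noise adds variance
    k = var_pred / (var_pred + var_z)           # Kalman gain
    mu_new = mu + k * (z - mu)                  # pulled toward the measurement
    var_new = (1 - k) * var_pred                # uncertainty shrinks after update
    return mu_new, var_new

mu, var = 0.0, 1.0
mu, var = kalman_step(mu, var, z=2.5)
print(mu, var)  # 1.875 0.75
```

Note that the posterior variance (0.75) is smaller than the predicted variance (3.0): each observation sharpens the Gaussian belief.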
General (Multivariate) Kalman Update
Tracking Example: Filtering
Where it Breaks!
Remember
Dynamic Bayesian Networks?
DBN versus HMM?
DBN versus KF?
Exact Inference in DBNs
Inexact Inference: Particle Filtering
Particle Filtering
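Particle filtering on the umbrella model can be sketched in a few lines (a sketch, not the slides' algorithm text; the particle count and seed are arbitrary): propagate each particle through the transition model, weight by the sensor model, then resample in proportion to weight.

```python
# Minimal particle filter for the umbrella model.
import random

random.seed(0)
T = {True: 0.7, False: 0.3}        # P(rain_t = true | rain_{t-1} = key)
S = {True: 0.9, False: 0.2}        # P(umbrella | rain_t = key)

def particle_filter_step(particles, u):
    # propagate each particle through the transition model
    moved = [random.random() < T[p] for p in particles]
    # weight by the evidence likelihood
    w = [S[x] if u else 1 - S[x] for x in moved]
    # resample N particles in proportion to weight
    return random.choices(moved, weights=w, k=len(moved))

particles = [random.random() < 0.5 for _ in range(5000)]   # sample from prior
for u in [True, True]:                                     # umbrella on days 1, 2
    particles = particle_filter_step(particles, u)
print(sum(particles) / len(particles))  # close to the exact answer 0.883
```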
Localization using PF
Localization using Sonar
Red dots are possible poses
Summary
Equations

Complete joint distribution:
P(X0, X1, …, Xt, E1, …, Et) = P(X0) ∏i=1..t P(Xi|Xi-1) P(Ei|Xi)

Filtering derivation:
P(Xt+1|e1:t+1) = P(Xt+1|e1:t, et+1)
= α P(et+1|Xt+1, e1:t) P(Xt+1|e1:t)   (Bayes' rule)
= α P(et+1|Xt+1) P(Xt+1|e1:t)   (sensor Markov assumption)
= α P(et+1|Xt+1) Σxt P(Xt+1|xt) P(xt|e1:t)

Backward message:
P(ek+1:t|Xk) = Σxk+1 P(ek+1|xk+1) P(ek+2:t|xk+1) P(xk+1|Xk)