CS 4610/5335
Sequential Bayes Filtering
Robert Platt Northeastern University
Material adapted from:
1. Russell & Norvig, AIMA
2. Dan Klein & Pieter Abbeel, UC Berkeley CS 188
3. Lawson Wong, CS 5335
4. Chris Amato, CS 5100
5. Sebastian Thrun, Wolfram Burgard, & Dieter Fox,
Probabilistic Robotics
1-D example
From “Probabilistic Robotics” (Ch. 7-8)
by Sebastian Thrun, Wolfram Burgard, & Dieter Fox
Challenge: Robot can only sense wall/door at its current location
Robot starts somewhere random, keeps on heading right,
detects Wall, Door, Wall, Wall, …
Where is it now?
Robot starts somewhere random, keeps on heading right. Where is it after:
WWDWDWWW WWWDWWW WWWDDDWWW WDWDDWWDDDWWW
2-D Example
Robot is actually located here, but it doesn’t know it.
Gray level denotes estimated probability (0 to 1) that robot is in that square
Goal: localize the robot based on sequential observations
– robot is given a map of the world; robot could be in any square
– initially, robot doesn’t know which square it’s in
Image: Berkeley CS188 course notes (downloaded Summer 2015)
2-D Example
Robot perceives that there are walls above and below, but no walls either left or right
Gray level denotes estimated probability (0 to 1) that robot is in that square
On each time step, the robot moves, and then observes the directions in which there are walls.
– observes a four-bit binary number
– observations are noisy: there is a small chance that each bit will be flipped.
Image: Berkeley CS188 course notes (downloaded Summer 2015)
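To make the noisy four-bit observation concrete, here is a minimal Python sketch of the likelihood of an observed bit pattern given the true walls at a square. The flip probability eps = 0.1 and the (up, down, left, right) bit ordering are illustrative assumptions; the slides only say each bit has a small chance of flipping.

```python
# Likelihood of a 4-bit wall observation given the true wall pattern,
# assuming each bit flips independently with probability eps.
def obs_likelihood(observed, true_walls, eps=0.1):
    p = 1.0
    for o, t in zip(observed, true_walls):
        p *= (1 - eps) if o == t else eps
    return p

# Walls above and below, none left/right: (up, down, left, right) = (1, 1, 0, 0)
print(obs_likelihood((1, 1, 0, 0), (1, 1, 0, 0)))  # (1-eps)^4 = 0.6561
print(obs_likelihood((1, 0, 0, 0), (1, 1, 0, 0)))  # one flipped bit: 0.0729
```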
2-D Example
[Figure: belief distribution after successive move/observe steps]
Image: Berkeley CS188 course notes (downloaded Summer 2015)
2-D Example
Question: how do we update this probability distribution from time t to t+1?
Image: Berkeley CS188 course notes (downloaded Summer 2015)
Rules of Probability
Conditional Probabilities
Conditional probability: P(A | B) = P(A, B) / P(B) (if P(B) > 0)
Bayes’ Rule
It’s easy to derive from the product rule:
P(A, B) = P(A | B) P(B) = P(B | A) P(A)
Solve for P(A | B):
P(A | B) = P(B | A) P(A) / P(B)
Bayesian inference
Interpretation: Posterior = (Likelihood * Prior) / Evidence
P(state | obs) = P(obs | state) P(state) / P(obs)
E.g., State = Is it raining outside right now?
– Prior: It rains on 30% of days in Boston
– P(rain) = 0.3 ; P(not rain) = 1 - 0.3 = 0.7
Observation = People are carrying wet umbrellas
– P(obs | rain) = 0.9 ; P(obs | not rain) = 0.2
Bayes’ rule: P(rain | obs) ∝ P(obs | rain) * P(rain)
– P(rain | obs) ∝ 0.9 * 0.3 = 0.27
– P(not rain | obs) ∝ P(obs | not rain) * P(not rain) = 0.2 * 0.7 = 0.14
Normalize to get answer: P(rain | obs) = 0.27 / (0.27 + 0.14) ≈ 0.66
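The rain example is small enough to check in a few lines of Python; this sketch just multiplies likelihood by prior and normalizes, reproducing P(rain | obs) ≈ 0.66.

```python
# Bayes' rule on the rain/umbrella example from the slides.
prior = {"rain": 0.3, "not rain": 0.7}
likelihood = {"rain": 0.9, "not rain": 0.2}    # P(obs | state)

unnormalized = {s: likelihood[s] * prior[s] for s in prior}
evidence = sum(unnormalized.values())          # P(obs) = 0.27 + 0.14 = 0.41
posterior = {s: p / evidence for s, p in unnormalized.items()}
print(posterior)                               # rain: ~0.659, not rain: ~0.341
```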
1-D example (Init)
P(x1) = Uniform
1-D example (Observation)
Detect z1 = Door. Likelihood P(z1=Door | x1):
1-D example (Posterior update)
Apply Bayes’ rule: P(x1 | z1=Door) ∝ P(z1=Door | x1) * P(x1)
Normalization step not shown explicitly:
– Calculate the RHS for all values of x1
– Divide the RHS by the sum (so that the posterior sums to 1)
1-D example (Transition)
P(x1 | z1=Door) :
Take one step to the right. Prediction P(x2 | z1=Door) = Σ_{x1} P(x2 | x1) P(x1 | z1=Door):
1-D example (Observation)
Predicted belief P(x2 | z1=Door):
Detect z2 = Door. Likelihood P(z2=Door | x2):
1-D example (Posterior update)
Apply Bayes’ rule and normalize (not shown explicitly):
P(x2 | z2=Door, z1=Door) ∝ P(z2=Door | x2) * P(x2 | z1=Door)
1-D example (Transition)
P(x2 | z2=Door, z1=Door) :
Take more steps, etc.
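A minimal sketch of this predict/observe loop for a discretized hallway. The door locations, the sensor accuracy of 0.8, and the deterministic one-cell motion are all assumptions; the slides’ figures imply noisy motion and particular door positions that are not reproduced here.

```python
N = 20
DOORS = {3, 4, 8, 9, 14, 15}              # hypothetical door cells

def observe(belief, z, p_correct=0.8):
    # Bayes update: weight each cell by P(z | x), then renormalize.
    new = []
    for x, b in enumerate(belief):
        match = (z == "Door") == (x in DOORS)
        new.append(b * (p_correct if match else 1.0 - p_correct))
    total = sum(new)
    return [b / total for b in new]

def predict(belief):
    # Deterministic one-cell step to the right (mass running off the end
    # is dropped; a noisy motion model would smear the belief instead).
    return [0.0] + belief[:-1]

belief = [1.0 / N] * N                    # uniform initial belief
for z in ["Door", "Door", "Wall"]:        # z1, z2, z3
    belief = observe(belief, z)
    belief = predict(belief)
print(max(range(N), key=lambda x: belief[x]))  # most likely current cell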
Mobile robot localization (Hallway example)
From “Probabilistic Robotics” (Ch. 7-8) by Sebastian Thrun, Wolfram Burgard, & Dieter Fox
Continuous case; Ideal outcome
Mobile robot localization (Hallway example)
From “Probabilistic Robotics” (Ch. 7-8) by Sebastian Thrun, Wolfram Burgard, & Dieter Fox
This is an important problem!
– Ch. 2-4 (half of Part 1) is HMM inference
– Ch. 7-8 (entire Part 2!) is about localization
– There are 4 parts, 17 chapters in total
Mobile robot localization (Hallway example)
From “Probabilistic Robotics” (Ch. 7-8) by Sebastian Thrun, Wolfram Burgard, & Dieter Fox
Discrete case; Exact HMM inference
1-D example
[Figure: 1-D grid world, cells 1–4]
Consider a 1-D grid world
The agent can only move right
Single action still has stochastic outcomes
P(Actually moves right) = 0.8 ; P(Stays still) = 0.2
What if agent does not know where it is exactly?
– New: State uncertainty
Belief: Probability distribution over states
P(1) = 0.8 ; P(2) = 0.2 ; P(3) = 0.0 ; P(4) = 0.0
What information can we use to infer our state?
– New: Noisy observations
Robot detects whether upper cell is door/empty
P(Empty | In state 1) = 0.9 ; P(Door | In state 1) = 0.1
P(Empty | In state 2) = 0.1 ; P(Door | In state 2) = 0.9 …
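These two distributions can be written down directly as arrays; a sketch follows using the numbers above. The behavior at the right boundary (stay in cell 4) and the door probabilities for cells 3 and 4 are assumptions, since the slide elides them with “…” (the 0.1 for cell 3 is confirmed by a computation later in the deck).

```python
import numpy as np

# T[i, j] = P(x_{t+1} = j+1 | x_t = i+1): move right w.p. 0.8, stay w.p. 0.2.
# The last row is an assumption: the slides don't say what happens at the
# right boundary, so we let the agent stay in cell 4.
T = np.array([[0.2, 0.8, 0.0, 0.0],
              [0.0, 0.2, 0.8, 0.0],
              [0.0, 0.0, 0.2, 0.8],
              [0.0, 0.0, 0.0, 1.0]])

# P_DOOR[i] = P(y = Door | x = i+1). Cells 1 and 2 come from the slide;
# the values for cells 3 and 4 are assumed.
P_DOOR = np.array([0.1, 0.9, 0.1, 0.9])
```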
1-D example
Consider a 1-D grid world
The agent can only move right
Suppose we know that x0 = 1
P(x1): P(x1=1) = 0.2 ; P(x1=2) = 0.8
What if we did not know x0, but only had a belief P(x0)?
P(x0): P(x0=1) = 0.8 ; P(x0=2) = 0.2
With prob. 0.8, we start in x0 = 1, then:
– With prob. 0.8, we end up in x1 = 2 → p = 0.8 * 0.8 = 0.64
– With prob. 0.2, we end up in x1 = 1 → p = 0.8 * 0.2 = 0.16
With prob. 0.2, we start in x0 = 2, then:
– With prob. 0.8, we end up in x1 = 3 → p = 0.2 * 0.8 = 0.16
– With prob. 0.2, we end up in x1 = 2 → p = 0.2 * 0.2 = 0.04
P(x1 = 1) = 0.16
P(x1 = 2) = 0.64 + 0.04 = 0.68
P(x1 = 3) = 0.16
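The same prediction step written as a few lines of Python over the four-cell belief (indices are 0-based, so cell 1 is index 0); it reproduces the numbers above. The boundary behavior is the same assumption as before.

```python
# Prediction step: push the belief over x0 through the motion model.
P_x0 = [0.8, 0.2, 0.0, 0.0]        # belief over x0 (cells 1-4)
P_x1 = [0.0, 0.0, 0.0, 0.0]
for i, b in enumerate(P_x0):
    if i + 1 < len(P_x0):
        P_x1[i + 1] += b * 0.8     # actually moves right
        P_x1[i]     += b * 0.2     # stays still
    else:
        P_x1[i]     += b           # assumed boundary behavior: stay put
print(P_x1)                        # [0.16, 0.68, 0.16, 0.0]
```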
1-D example
What information can we use to infer our state?
Suppose we detect that the upper cell is a door
Observation at state x1: y1 = Door
How do we use y1 = Door to update P(x1)? What do you expect to happen?
Recall: Bayes’ Rule
One interpretation:
P(cause | effect) = P(effect | cause) P(cause) / P(effect)
Another interpretation: Posterior = (Likelihood * Prior) / Evidence
P(state | obs) = P(obs | state) P(state) / P(obs)
1-D example
P(x1 = 1) = 0.16 ; P(x1 = 2) = 0.68 ; P(x1 = 3) = 0.16
Observation at state x1: y1 = Door
How do we use y1 = Door to update P(x1)?
P(x1 | y1 = Door) = P(y1 = Door | x1) P(x1) / P(y1 = Door)
1-D example
P(x1 | y1 = Door) = P(y1 = Door | x1) P(x1) / P(y1 = Door)
Notice that y1 = Door is fixed – we cannot change this
The posterior probability varies with x1, but the denominator does not!
So we can drop the denominator and work up to proportionality:
P(x1 | y1 = Door) ∝ P(y1 = Door | x1) P(x1)
1-D example
Observation at state x1: y1 = Door
Compute the numerator product for each x1, then normalize:
P(x1 = 1 | y1 = Door) ∝ P(y1 = Door | x1 = 1) P(x1 = 1) = 0.1 * 0.16 = 0.016
P(x1 = 2 | y1 = Door) ∝ P(y1 = Door | x1 = 2) P(x1 = 2) = 0.9 * 0.68 = 0.612
P(x1 = 3 | y1 = Door) ∝ P(y1 = Door | x1 = 3) P(x1 = 3) = 0.1 * 0.16 = 0.016
0.016 + 0.612 + 0.016 = 0.644 ≠ 1 → That’s okay, we normalize:
P(x1 = 1 | y1 = Door) = 0.016 / 0.644 ≈ 0.025
P(x1 = 2 | y1 = Door) = 0.612 / 0.644 ≈ 0.950
P(x1 = 3 | y1 = Door) = 0.016 / 0.644 ≈ 0.025
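The observation update is the same compute-then-normalize pattern in code; this sketch reproduces the numbers above.

```python
# Observation update for y1 = Door: multiply prior by likelihood, normalize.
P_x1   = [0.16, 0.68, 0.16]        # predicted belief over cells 1-3
P_door = [0.1, 0.9, 0.1]           # P(y1 = Door | x1) for cells 1-3
unnorm = [lik * pri for lik, pri in zip(P_door, P_x1)]
total  = sum(unnorm)               # 0.644 -- not 1, so normalize
posterior = [u / total for u in unnorm]
print([round(p, 3) for p in posterior])   # [0.025, 0.95, 0.025]
```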
Recap: 1-D example
Consider a 1-D grid world
The agent can only move right
What if agent does not know where it is exactly?
– Belief: Probability distribution over states
What information can we use to infer our state?
– Noisy observations
Hidden Markov Models
Recursive forward filtering in HMMs
Consider a 1-D grid world
The agent can only move right
A Hidden Markov Model (HMM) is specified by 3 things:
– Initial belief P(x0)
– Transition model P(xt+1 | xt)
– Observation model P(yt | xt) (or emission model P(et | xt))
Recursive forward filtering in HMMs
More generally, at every time step:
Prediction: P(xt+1) = Σ_{xt} P(xt+1 | xt) P(xt)
Observation: P(xt | yt) ∝ P(yt | xt) P(xt)
The equations above are actually not quite right!
They drop the conditioning on the history of observations seen so far. Writing the belief as P(xt | y1:t), the correct recursive updates are:
Prediction: P(xt | y1:t-1) = Σ_{xt-1} P(xt | xt-1) P(xt-1 | y1:t-1)
Observation: P(xt | y1:t) ∝ P(yt | xt) P(xt | y1:t-1)
Recursive forward filtering in HMMs
b0 = Initial belief
For t = 1 to T:
– Prediction step: compute bt-1→t from bt-1 using the transition model
– Observation step: compute bt from bt-1→t using the observation model
Forward filtering (prediction step)
More generally, at every time step:
P(xt | y1:t-1) = Σ_{xt-1} P(xt | xt-1) P(xt-1 | y1:t-1)
Forward filtering (observation step)
More generally, at every time step:
P(xt | y1:t) ∝ P(yt | xt) P(xt | y1:t-1)
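Putting the two steps together gives the complete filter. A sketch in Python/numpy for the four-cell world, reusing the T and P_DOOR arrays sketched earlier (cell 4’s door probability and the right-boundary behavior remain assumptions):

```python
import numpy as np

T = np.array([[0.2, 0.8, 0.0, 0.0],       # transition model, as before
              [0.0, 0.2, 0.8, 0.0],
              [0.0, 0.0, 0.2, 0.8],
              [0.0, 0.0, 0.0, 1.0]])
P_DOOR = np.array([0.1, 0.9, 0.1, 0.9])   # P(Door | x); cell 4 is assumed

def forward_filter(b0, observations):
    """Recursive forward filtering: predict, weight by likelihood, normalize."""
    b = b0.copy()
    for y in observations:
        b = T.T @ b                               # prediction step
        lik = P_DOOR if y == "Door" else 1 - P_DOOR
        b = lik * b                               # observation step
        b /= b.sum()                              # normalize
    return b

b0 = np.array([0.8, 0.2, 0.0, 0.0])
print(forward_filter(b0, ["Door"]))               # ~[0.025, 0.950, 0.025, 0.0]
```

Running it with the slides’ numbers recovers the worked example: the belief after one move and one Door observation is approximately (0.025, 0.950, 0.025, 0).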
Mobile robot localization (Hallway example)
From “Probabilistic Robotics” (Ch. 7-8) by Sebastian Thrun, Wolfram Burgard, & Dieter Fox
Discrete case; Exact HMM inference