Quantifying Uncertainty
AIMA 12.1-12.4
CMPSC 442
Week 7, Meeting 19, Three Segments
Outline
● Acting under Uncertainty: Probability as Belief
● Basic Probability Notation
● Probabilistic Inference; Independence
Segment 1 of 3: Acting under Uncertainty
Logical Agents and Uncertainty
● Logical agents have belief states
● Probability theory can be incorporated into logical agents
○ To change epistemological commitments from truth values to degrees of
belief in truth
○ Ontological commitments (what is believed to be in the world) remain the
same
Example: Automated Taxi
● Problems: The world is not
○ Fully observable (road state, other drivers’ plans, etc.)
○ Deterministic (flat tire, etc.)
○ Single-agent (modeling and predicting other traffic is immensely complex)
○ Noise-free (sensors are noisy: traffic reports, etc.)
● Let action A_t = leave for airport t minutes before flight
○ Will A_15 get me there on time? true/false
○ Will A_20 get me there on time? true/false
○ Will A_30 get me there on time? true/false
○ Will A_200 get me there on time? true/false
Pure Logic Risks Falsehood or Irrelevance
● A_65
○ Possibly true, if everything goes smoothly
● A_1440
○ Probably true, but the passenger would have to stay overnight at the airport!
● (¬∃x (car(x) ∧ onRoad(x))) ∧ no_rain(now) ∧ (∀y mytire(y) ⇒ ¬flat(y)) ⇒ A_20
○ True, but it is useless to try to logically specify all cases in minute detail
○ The logical qualification problem (discussed in Chapter 7)
Methods to Handle Uncertainty for Logical Agents
● Default or nonmonotonic logic:
○ Assume A_65 works (the default) unless contradicted by evidence
○ Problems: What default assumptions are reasonable? What evidence?
● Fuzzy logic: truth values in [0,1]
○ Can handle different interpretations of the same predicate
○ Problems: How many truth values? How to assign?
● Subjectivist (Bayesian) Probability
○ Model agent’s degree of belief
○ Estimate probabilities from experience (e.g., expectation as average)
○ Probabilities have a clear calculus of combination
Probability Summarizes the Unknown
● Theoretical ignorance:
○ Often we have no complete theory of the domain, e.g., medicine
● Poor cost-benefit tradeoff even when we have fairly complete theories:
○ It is difficult to formulate knowledge and inference rules for a domain that
handle all (important) cases
● Unavoidable uncertainty (partial observability):
○ Even when we know all the implication relations (rules), we may be uncertain
about the premises
Subjectivist or Bayesian Probability
● Degrees of belief
○ P(A) = 1: Agent completely believes A is true.
○ P(A) = 0: Agent completely believes A is false.
○ 0.5 < P(A) < 1: Agent believes A is more likely to be true than false.
● New evidence
○ Updates the agent’s degrees of belief (prior → posterior)
○ Therefore changes the probability estimates
Uncertainty and Rational Decisions
● Suppose the autonomous taxi agent has the following degrees of
belief on what time to leave for the airport:
○ P(A_25 gets me there on time | B) = 0.04
○ P(A_90 gets me there on time | B) = 0.70
○ P(A_120 gets me there on time | C) = 0.95
○ P(A_1440 gets me there on time | D) = 0.9999
● Which action to choose depends on the given context
● A rational agent must also consider preferences for missing flight vs.
time spent waiting, etc.
Decision Theory
● Decision Theory develops methods to make optimal decisions in the
presence of uncertainty
● Decision Theory = utility theory + probability theory
● Utility theory (AIMA Ch 16. Making Simple Decisions) is used to
represent and infer preferences
○ Every state has a degree of usefulness
○ An agent is rational if and only if it chooses an action A that yields the
maximum expected utility (expected usefulness)
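To make this concrete, here is a minimal Python sketch of maximum-expected-utility action selection for the airport example. All numbers (success probabilities, the miss penalty, the per-minute waiting cost) are hypothetical illustrations, not values from the text:

```python
# Minimal sketch of maximum-expected-utility (MEU) action selection.
# P(on time | leave t minutes early), for a few candidate actions A_t.
p_on_time = {25: 0.04, 90: 0.70, 120: 0.95, 1440: 0.9999}

U_MISS = -1000.0   # hypothetical utility of missing the flight
U_WAIT = -1.0      # hypothetical utility per minute spent waiting

def expected_utility(t: int) -> float:
    """EU(A_t) = P(on time) * U(wait ~t minutes) + P(late) * U(miss)."""
    p = p_on_time[t]
    return p * (U_WAIT * t) + (1 - p) * U_MISS

best = max(p_on_time, key=expected_utility)
for t in sorted(p_on_time):
    print(f"A_{t}: EU = {expected_utility(t):8.1f}")
print(f"Rational choice: A_{best}")   # A_120 under these made-up numbers
```

Note how the preferences (a large miss penalty, a small waiting cost) rule out both the risky A_25 and the safe-but-wasteful A_1440.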
Decision Theoretic Agent
● Sketch of a decision-theoretic agent’s main loop (per AIMA’s outline):
○ Update the belief state given the action and the percept
○ Calculate outcome probabilities for each possible action
○ Select the action with the highest expected utility
Segment 2 of 3: Basic Probability Notation
Sample Spaces
● The set of all possible worlds (e.g., for a given logical agent) is called
the sample space
○ The sample space is an exhaustive set of mutually exclusive possibilities
● Example: Rolling two dice
○ Sample space Ω = D × D, where D = {1, 2, 3, 4, 5, 6}
○ Each member ω_i of a sample space Ω is called an elementary event
○ |Ω| = 36
Probabilities of Elementary Events
● Every elementary event ω_i ∈ Ω in the sample space is assigned a
probability P(ω_i)
○ 0 ≤ P(ω_i) ≤ 1
● Assuming Ω is finite (ω_1, …, ω_n), we require
○ P(Ω) = Σ_i P(ω_i) = 1
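A minimal Python sketch of these definitions, assuming two fair dice as above: enumerate the sample space and check that the elementary-event probabilities sum to one.

```python
from itertools import product

D = range(1, 7)
omega = list(product(D, D))             # sample space: 36 elementary events
P = {w: 1 / len(omega) for w in omega}  # uniform probability for fair dice

assert len(omega) == 36
assert abs(sum(P.values()) - 1.0) < 1e-12  # P(Omega) = 1
print(P[(5, 6)])                           # each elementary event: 1/36
```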
Prior (Unconditional) versus Conditional Probabilities
● Prior probability: the probability of an event, apart from any conditioning
evidence
○ P(roll of 2 dice sums to 11) = P((5, 6)) + P((6, 5)) = 1/36 + 1/36 = 1/18
● Conditional (or posterior) probability: the probability of an event
conditioned on the occurrence of another event
○ P(Die2=6 | Die1=5) = 1/6
Product Rule of Conditional Probabilities
● Definition of conditional probability: P(a | b) = P(a ∧ b) / P(b), whenever P(b) > 0
● Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a)
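The two-dice example again in Python, computing the prior P(sum = 11), the conditional P(Die2=6 | Die1=5) via the definition, and checking the product rule numerically (a sketch under the uniform two-dice model above):

```python
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # two fair dice, 36 worlds

def P(event):
    """Probability of an event: the fraction of (equiprobable) worlds in it."""
    return sum(1 for w in omega if event(w)) / len(omega)

# Prior: P(dice sum to 11) = 2/36 = 1/18
print(P(lambda w: w[0] + w[1] == 11))

# Conditional via the definition: P(Die2=6 | Die1=5) = P(Die1=5, Die2=6) / P(Die1=5)
p_joint = P(lambda w: w == (5, 6))
p_cond = p_joint / P(lambda w: w[0] == 5)
print(p_cond)                                  # 1/6

# Product rule: P(a AND b) = P(a | b) * P(b)
assert abs(p_joint - p_cond * P(lambda w: w[0] == 5)) < 1e-12
```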
Propositions and Random Variables
● Probabilistic propositions are factored representations consisting of
variables and values (combining elements of PL and CSPs)
○ Variable name: Cavity
○ Values: cavity, ¬cavity
● Variables in probability theory are called random variables
○ Uppercase names for the variables, e.g., P(A = true)
○ Lowercase names for the values, e.g., P(a) abbreviates P(A = true)
● A random variable is a function from a domain of possible worlds Ω to
a range of values
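A random variable really is just a function on the sample space. A small sketch: Total maps each two-dice world to a value, and summing the probabilities of the worlds mapped to each value induces a distribution over those values.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))   # two fair dice

def total(w):
    """A random variable: a function from possible worlds to values."""
    return w[0] + w[1]

# Induced distribution P(Total): sum probabilities of worlds per value
p = Fraction(1, len(omega))
dist = Counter()
for w in omega:
    dist[total(w)] += p

print(dist[11])                 # 1/18, matching the prior computed earlier
assert sum(dist.values()) == 1  # values are exhaustive and mutually exclusive
```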
Values of Random Variables
● A random variable V can take on one of a set of different values
○ Each value has an associated probability
○ The value of V at a particular time is subject to random variation
○ Discrete random variables have a discrete (often finite) range of values
○ Domain values must be exhaustive and mutually exclusive
● For us, random variables will have a discrete, countable (usually finite)
domain of arbitrary values
○ Here we will use categorical or Boolean variables
○ Example: A Boolean random variable has the domain {true,false}
Some Notation
● Distinguish the probability P of a specific event in the sample space
from the probability distribution P of a random variable
○ Weather = sunny, abbreviated sunny
○ P(Weather = sunny) = 0.72, abbreviated P(sunny) = 0.72
○ Cavity = true, abbreviated cavity
○ Cavity = false, abbreviated ¬cavity
● Vector notation:
○ Fix an order of the domain elements
○ Specify P(Weather) by a vector that sums to one: ⟨0.72, 0.10, 0.10, 0.08⟩
Continuous Variables, i.e., Infinitely Many Values
● The range of a random variable can be continuous, e.g., an interval of the reals
● Example: P(NoonTemp = x)
○ Defined in terms of a probability density function (pdf)
○ A parameterized function of x, e.g., Uniform(x; 18C, 26C)
■ 100% probability that x falls in the 8C range 18C–26C
■ 50% probability that x falls in any 4C subrange of [18C, 26C]
○ Intuitively, P(x) is the probability that X falls within a small region beginning
at x, divided by the width of the region
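A small sketch of the Uniform(x; 18, 26) density in Python: the density is a constant 1/8 per degree, and probability mass comes from integrating the density over an interval (trivial here, since the density is constant).

```python
def uniform_pdf(x: float, a: float = 18.0, b: float = 26.0) -> float:
    """Density of Uniform(x; a, b): constant 1/(b-a) inside [a, b], 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

def prob_interval(lo: float, hi: float, a: float = 18.0, b: float = 26.0) -> float:
    """P(lo <= X <= hi): integral of the (constant) density over the overlap."""
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

print(uniform_pdf(20.0))          # 0.125 per degree C (a density, not a probability)
print(prob_interval(18.0, 26.0))  # 1.0: the full 8-degree range
print(prob_interval(20.0, 24.0))  # 0.5: any 4-degree subrange
```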
Two Key Axioms of Probability
● If probabilities are used as degrees of an agent’s belief, a rational agent’s
beliefs must obey the axioms of probability:
○ (1) 0 ≤ P(ω) ≤ 1 for every ω ∈ Ω
○ (2) Σ_{ω∈Ω} P(ω) = 1
Inclusion-Exclusion Principle
● From axioms (1) and (2) on the preceding slide, the inclusion-exclusion
principle follows:
○ P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
Inclusion-Exclusion Principle in Practice
● P(a) = 0.4
● P(b) = 0.3
● Possibilities for P(a ∧ b)
○ Can P(a ∧ b) = 0?
○ Can P(a ∧ b) = 0.3?
○ Can P(a ∧ b) > 0.3?
● Possibilities for P(a ∨ b)
○ Can P(a ∨ b) = 0.7?
○ Can P(a ∨ b) > 0.7?
Inclusion-Exclusion Principle in Practice
● P(a) = 0.4
● P(b) = 0.3
● Possibilities for P(a ∧ b)
○ Can P(a ∧ b) = 0? Yes
○ Can P(a ∧ b) = 0.3? Yes
○ Can P(a ∧ b) > 0.3? No
● Possibilities for P(a ∨ b)
○ Can P(a ∨ b) = 0.7? Yes
○ Can P(a ∨ b) > 0.7? No
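These answers follow from general bounds implied by the axioms: max(0, P(a)+P(b)−1) ≤ P(a ∧ b) ≤ min(P(a), P(b)), and by inclusion-exclusion max(P(a), P(b)) ≤ P(a ∨ b) ≤ min(1, P(a)+P(b)). A sketch that computes them:

```python
def conj_bounds(pa: float, pb: float) -> tuple[float, float]:
    """Feasible range for P(a and b) given only P(a) and P(b)."""
    return max(0.0, pa + pb - 1.0), min(pa, pb)

def disj_bounds(pa: float, pb: float) -> tuple[float, float]:
    """Feasible range for P(a or b); by inclusion-exclusion,
    P(a or b) = P(a) + P(b) - P(a and b), so these mirror conj_bounds."""
    return max(pa, pb), min(1.0, pa + pb)

print(conj_bounds(0.4, 0.3))  # (0.0, 0.3): 0 and 0.3 are possible, > 0.3 is not
print(disj_bounds(0.4, 0.3))  # (0.4, 0.7): 0.7 is possible, > 0.7 is not
```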
Argument for Axioms of Probability
● De Finetti argued that an agent’s belief in proposition α should correspond
to the odds at which it is indifferent between betting for or against α
● Equivalently: beliefs that violate the axioms of probability lead to
guaranteed losing bets
● E.g., an agent believes P(a) = 0.4, P(b) = 0.3, P(a ∨ b) = 0.8, violating
inclusion-exclusion (which requires P(a ∨ b) ≤ 0.7)
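A sketch of De Finetti's argument applied to these beliefs: with unit stakes, an opponent can choose the bet directions (here, making the agent bet against a, against b, and for a ∨ b) so that the agent loses money in every possible world.

```python
# Dutch book against beliefs P(a)=0.4, P(b)=0.3, P(a or b)=0.8.
# A bet "for" proposition q at belief p pays (1-p) if q holds, costs p if not;
# at those odds the agent considers either side of the bet fair (unit stakes).

def payoff(belief: float, holds: bool, side_for: bool) -> float:
    gain = (1 - belief) if holds else -belief
    return gain if side_for else -gain

for a in (True, False):
    for b in (True, False):
        net = (payoff(0.4, a, side_for=False)         # agent bets against a
               + payoff(0.3, b, side_for=False)       # ... against b
               + payoff(0.8, a or b, side_for=True))  # ... for (a or b)
        print(f"a={a!s:5} b={b!s:5} net={net:+.1f}")  # negative in every world
```

The four nets are −1.1, −0.1, −0.1, −0.1: the agent accepts each bet as fair yet is guaranteed to lose overall.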
Segment 3 of 3: Probabilistic Inference; Independence
Full Joint Probability Distributions
● Consider three Boolean random variables: Toothache, Cavity, Catch
○ Gives a sample space of 2 × 2 × 2 = 8 distinct events
■ Four with cavity and four with ¬cavity
■ Four with toothache and four with ¬toothache
■ Four with catch and four with ¬catch
○ The probabilities in the full joint distribution sum to 1
Full Joint Probability Distributions
● The full joint probability distribution can be used to answer questions
about the probabilities of all possible events
○ Six ways to have (toothache ∨ cavity):
P(toothache ∨ cavity) = 0.108 + 0.012 + 0.016 + 0.064 + 0.072 + 0.008 = 0.28
○ Marginal probability of cavity:
P(cavity) = 0.108 + 0.012 + 0.072 + 0.008 = 0.2
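A sketch that answers both queries by summing over worlds; the eight joint entries are assumed from AIMA's dentist-example table.

```python
# Full joint distribution P(Toothache, Cavity, Catch); the eight entries
# are the dentist-example values as given in the AIMA table.
joint = {
    # (toothache, cavity, catch): probability
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}
assert abs(sum(joint.values()) - 1.0) < 1e-12

def prob(event):
    """Sum the probabilities of the possible worlds where the event holds."""
    return sum(p for w, p in joint.items() if event(*w))

print(prob(lambda t, c, k: t or c))  # P(toothache or cavity) = 0.28 (six worlds)
print(prob(lambda t, c, k: c))       # marginal P(cavity)     = 0.2
```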
Marginalization and Conditioning
● Marginalizing (summing out) a variable sums over all of its values in the
joint distribution:
○ P(Y) = Σ_z P(Y, z)
● Conditioning is derived by applying the product rule inside the rule for
marginalizing:
○ P(Y) = Σ_z P(Y | z) P(z)
Illustration of Conditional Probabilities
● Probability of a cavity conditioned on toothache:
○ P(cavity | toothache) = P(cavity ∧ toothache) / P(toothache) = 0.12 / 0.2 = 0.6
● Probability of not having a cavity conditioned on toothache:
○ P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = 0.08 / 0.2 = 0.4
Inference: Queries about Probabilities
● Let X be the query variable, E the list of evidence variables, e the list of
observed values for the evidence, and Y the remaining unobserved (hidden)
variables; then a query takes the form P(X | e)
● Compute the probability of X conditioned on e by summing over all
combinations of values of the unobserved variables:
○ P(X | e) = α P(X, e) = α Σ_y P(X, e, y), where α = 1/P(e) normalizes the result
● In theory, this general query can be answered for any conditioning context
of any variable using the full joint probability distribution (a sketch follows below)
● In practice, full joint probability distributions are impractical for large sets of
variables
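A sketch of this inference-by-enumeration query on the dentist joint distribution above: match the evidence, sum out the hidden variables, and normalize (the 1/P(e) factor is the normalizing constant α).

```python
# General inference by enumeration over a full joint distribution:
# P(X | e) = alpha * sum over hidden values y of P(X, e, y).
# Same dentist table as above; variable order: (toothache, cavity, catch).
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}
VARS = ("toothache", "cavity", "catch")

def query(x_var: str, evidence: dict) -> dict:
    """Return the distribution P(x_var | evidence) by summing and normalizing."""
    xi = VARS.index(x_var)
    dist = {True: 0.0, False: 0.0}
    for world, p in joint.items():
        if all(world[VARS.index(v)] == val for v, val in evidence.items()):
            dist[world[xi]] += p        # sums out all unobserved variables
    alpha = 1.0 / sum(dist.values())    # normalize: divide by P(evidence)
    return {v: alpha * p for v, p in dist.items()}

print(query("cavity", {"toothache": True}))  # {True: 0.6, False: 0.4}
```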
Factoring a Single Joint Distribution
● Given n random variables, some may be independent of the others
● If Weather is independent of the other three variables, the joint factors:
○ P(Weather, Cavity, Toothache, Catch) = P(Weather) P(Cavity, Toothache, Catch)
● This reduces the number of entries:
○ 4 weather × 2 cavity × 2 toothache × 2 catch = 32
○ 4 weather + 2 cavity × 2 toothache × 2 catch = 12
Independence
● Random variables X and Y are independent iff:
○ P(X, Y) = P(X) P(Y), equivalently P(X | Y) = P(X) or P(Y | X) = P(Y)
● Taking any independence into account is essential for efficient
probabilistic reasoning
● Unfortunately, complete independence is rare
Independence Exemplified: Two Coins
● Example: Ω = {HH, HT, TH, TT}
● Given: ∀ω ∈ Ω, P(ω) = 1/|Ω|
● Event A = first flip is H: A = {HH, HT}
● Event B = second flip is H: B = {HH, TH}
○ P(A ∩ B) = P(HH) = 1/4
○ P(A) P(B) = 1/2 × 1/2 = 1/4
● Therefore A and B are independent events
Non-Independence Exemplified: Two Coins
● Example: Ω = {HH, HT, TH, TT}
● Given: ∀ω ∈ Ω, P(ω) = 1/|Ω|
● Event A = first flip is H: A = {HH, HT}
● Event B = first or second flip has a T: B = {HT, TH, TT}
○ P(A ∩ B) = P(HT) = 1/4
○ P(A) P(B) = 1/2 × 3/4 = 3/8 ≠ 1/4
● Therefore A and B are not independent events
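Both coin examples checked in Python, comparing P(A ∩ B) with P(A)·P(B) under the uniform distribution over Ω (a minimal sketch using exact fractions):

```python
from fractions import Fraction
from itertools import product

omega = set(product("HT", repeat=2))   # {HH, HT, TH, TT}, all equally likely

def P(event):
    """Probability of an event (a set of worlds) under the uniform measure."""
    return Fraction(len(event & omega), len(omega))

A  = {w for w in omega if w[0] == "H"}   # first flip is H
B1 = {w for w in omega if w[1] == "H"}   # second flip is H
B2 = {w for w in omega if "T" in w}      # first or second flip has a T

print(P(A & B1) == P(A) * P(B1))  # True:  1/4 == 1/2 * 1/2  -> independent
print(P(A & B2) == P(A) * P(B2))  # False: 1/4 != 1/2 * 3/4  -> not independent
```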
Conditional Independence
● Random variables X and Y are conditionally independent given Z iff:
○ P(X, Y | Z) = P(X | Z) P(Y | Z), equivalently P(X | Y, Z) = P(X | Z)
Role of Conditional Independence
● In most cases, the use of conditional independence reduces the size of
the representation of the joint distribution from exponential in n to
linear in n.
● Conditional independence is much more common in the real world
than complete independence.
● Conditional independence is our most basic and robust form of
knowledge about uncertain environments.
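In the dentist example, Toothache and Catch are conditionally independent given Cavity: each is caused by the cavity, but neither directly affects the other. A sketch verifying P(t ∧ k | c) = P(t | c) P(k | c) against the joint table from earlier:

```python
# Verify conditional independence of Toothache and Catch given Cavity
# on the dentist joint distribution (variable order: toothache, cavity, catch).
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.016, (True,  False, False): 0.064,
    (False, True,  True):  0.072, (False, True,  False): 0.008,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(event):
    return sum(p for w, p in joint.items() if event(*w))

for c in (True, False):
    pc   = prob(lambda t, cv, k: cv == c)
    p_t  = prob(lambda t, cv, k: t and cv == c) / pc         # P(toothache | c)
    p_k  = prob(lambda t, cv, k: k and cv == c) / pc         # P(catch | c)
    p_tk = prob(lambda t, cv, k: t and k and cv == c) / pc   # P(toothache, catch | c)
    assert abs(p_tk - p_t * p_k) < 1e-9
    print(f"cavity={c}: P(t,k|c)={p_tk:.3f} == P(t|c)P(k|c)={p_t * p_k:.3f}")
```

This is why the factored form P(Toothache, Catch, Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity) needs only 2 + 2 + 1 independent numbers instead of 7.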
Summary – Page One
● Uncertainty is inescapable in complex, nondeterministic, or partially
observable environments.
● Probabilities summarize the agent’s beliefs relative to the evidence.
Basic probability statements include prior probabilities and
conditional probabilities over simple and complex propositions.
● The axioms of probability constrain the possible assignments of
probabilities to propositions.
Summary – Page Two
● The full joint probability distribution is usually too large to create or
use in its explicit form, but if available it can be used to answer queries
simply by adding up entries for the possible worlds corresponding to
the query propositions.
● Absolute independence between subsets of random variables allows
the full joint distribution to be factored into smaller joint distributions,
greatly reducing its complexity. Absolute independence seldom occurs
in practice.