COMP 424 – Artificial Intelligence: Modelling Uncertainty
Instructor: Jackie CK Cheung. Readings: R&N Ch. 13
Uncertainty in the course so far
Sources of uncertainty:
• Non-deterministic actions
• Non-deterministic environment
• Opponent
Techniques related to uncertainty:
• AND-OR trees (+ contingency planning)
• Game search trees (+ minimax or alpha-beta pruning)
• So far, we have had to either ignore uncertainty, or enumerate all the possibilities we can think of, and plan for the worst case.
• Uncertainty is implicitly modelled
Explicit modelling of uncertainty
• Build a model of the world that explicitly describes the uncertainty about the system’s:
• dynamics
• sensors
• Reason about the effect of actions given the model
How do we represent uncertainty?
• What language should we use? What are the semantics of our representations?
• What queries can we answer with our representations? How do we answer them?
• How do we construct a representation? Do we need to ask an expert or can we learn it from data?
Why not use logic?
A purely logical approach has two problems:
1. Risks falsehood:
“Check the tire => Tire is ok.” (Noisy sensors, unexpected effects.)
2. Leads to conclusions that are too weak:
“Check the tire after every other action” (Too slow!)
Methods for handling uncertainty
Logic: make assumptions unless contradicted by evidence.
• E.g., “Assume my car doesn’t have a flat tire.”
Issue: What assumptions are reasonable? How do we handle contradictions?
Reasoning under uncertainty: consider all possible states.
• E.g., “Assume my car does or doesn’t have a flat tire.”
Issue: All possible states are treated as equally likely. How do we draw conclusions?
Probability: associate a probability of occurrence with facts.
• E.g., “Driving → FlatTire with probability 0.01”, “FlatTire → Lateness with probability 0.9”
Let’s explore this!
Bayesian probability
• Framework we will adopt for handling uncertainty
• Probabilities characterize a degree of belief of an agent
• Provides principled answers for:
• Combining evidence
• Incorporating new evidence
• Performing predictive and diagnostic reasoning
• Can be learned from data
• Intuitive to human experts
• There exist other frameworks for probability:
• Frequentist statistics (see textbook Section 13.3)
Probabilistic beliefs
• Probabilities describe the world and its uncertainties
• Beliefs relate logical propositions to the current state of knowledge.
• Beliefs are subjective assertions, given one’s state of knowledge.
e.g., P(FlatTire | driving on dry pavement) = 0.05
• Different agents may hold different beliefs.
Making decisions under uncertainty
Besides modelling the state of the world, we also want to take actions
• State of the world: e.g., P(FlatTire | road conditions)
• Action: e.g., should I take this route if I have a 1% chance of getting a flat?
Probability theory alone is not enough to model decision making – also need to model preferences
• Utility theory is used to represent and infer preferences.
• Decision theory = utility theory + probability theory.
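Probability theory and utility theory combine in a simple expected-utility calculation. Here is a minimal sketch in Python, where the routes, probabilities, and utility values are all invented for illustration:

```python
# Expected-utility comparison of two hypothetical routes.
# Decision theory: pick the action with the highest expected utility.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs; probabilities sum to 1."""
    return sum(p * u for p, u in outcomes)

# Route A: 1% chance of a flat tire (utility -100), else on time (utility 10).
route_a = [(0.01, -100), (0.99, 10)]
# Route B: safer but slower: 0.1% chance of a flat, else slightly late (utility 5).
route_b = [(0.001, -100), (0.999, 5)]

print("EU(A) =", expected_utility(route_a))  # 0.01*(-100) + 0.99*10 = 8.9
print("EU(B) =", expected_utility(route_b))  # 0.001*(-100) + 0.999*5 = 4.895
# Despite the higher flat-tire risk, route A has the higher expected utility.
```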
Probabilities: What you should know
• Basic axioms of probability.
• Joint probabilities.
• Conditional probabilities.
• Chain rule and Bayes rule.
• Conditional independence.
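For reference, these can be written out as follows (standard identities, not specific to this course's notation):

```latex
% Axioms of probability
\[ 0 \le P(a) \le 1, \qquad P(\mathit{true}) = 1, \qquad
   P(a \lor b) = P(a) + P(b) - P(a \land b) \]
% Conditional probability and the chain rule
\[ P(a \mid b) = \frac{P(a, b)}{P(b)}, \qquad
   P(x_1, \dots, x_n) = \prod_{i=1}^{n} P(x_i \mid x_1, \dots, x_{i-1}) \]
% Bayes' rule
\[ P(a \mid b) = \frac{P(b \mid a)\, P(a)}{P(b)} \]
```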
Defining probabilistic models
• Define the world as a set of random variables: Ω = {X1, …, Xn}.
• The world is divided into a set of elementary, mutually exclusive events (also called states).
• A probabilistic model is an encoding of probabilistic information that allows us to compute the probability of any event in the world.
• A joint probability distribution function assigns non-negative weights to each event
• The weights of all possible events must sum up to 1
• If you know the joint probability distribution, you can compute the probability of any event of interest!
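A minimal sketch of this idea in Python, over a made-up two-variable world (the variables and numbers are invented for illustration):

```python
# Toy joint distribution over two Boolean variables: FlatTire and Late.
# Each entry is the weight of one elementary event; weights sum to 1.
joint = {
    (True,  True):  0.008,   # flat tire and late
    (True,  False): 0.002,   # flat tire, not late
    (False, True):  0.090,   # no flat, late anyway
    (False, False): 0.900,   # no flat, on time
}
assert abs(sum(joint.values()) - 1.0) < 1e-9

# Any event's probability is a sum of joint entries, e.g. P(Late):
p_late = sum(p for (flat, late), p in joint.items() if late)
# Conditional probabilities follow too: P(FlatTire | Late) = P(FlatTire, Late) / P(Late)
p_flat_given_late = joint[(True, True)] / p_late
print(p_late, p_flat_given_late)  # 0.098, ~0.082
```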
Inference using joint distributions
Probabilistic inference
• Typical setup:
• Infer change in belief given new evidence
Initial belief (prior) + New info (evidence) → Updated belief (posterior)
Bayes’ Rule
P(A | B) = P(B | A) P(A) / P(B)
Example: Medical Diagnosis
• You go to the doctor complaining of having a fever (F).
• A doctor knows that bird flu (B) causes a fever 95% of the time.
• The doctor knows that if a person is selected randomly from the population, there is a 10⁻⁷ chance of the person having bird flu.
• In general, 1 in 100 people in the population suffer from fever.
What is the probability that you have bird flu?
Example: Medical Diagnosis (cont’d)
• What are the relevant facts? F = fever B = bird flu
• What are the probabilities of these facts? P(F) = 0.01, P(B) = 10⁻⁷
• Anything else we know about the world? P(F|B) = 0.95
• Given these facts, we can answer the question using Bayes rule:
P(B|F) = P(F|B) P(B) / P(F) = 0.95 × 10⁻⁷ / 0.01
= 0.95 × 10⁻⁵
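The same calculation as a short Python sketch, using the numbers from the example:

```python
# Bayes' rule: P(B|F) = P(F|B) * P(B) / P(F)
p_f_given_b = 0.95   # P(fever | bird flu)
p_b = 1e-7           # prior: P(bird flu)
p_f = 0.01           # P(fever)

p_b_given_f = p_f_given_b * p_b / p_f
print(p_b_given_f)   # 9.5e-06, i.e. 0.95 × 10⁻⁵: bird flu remains very unlikely
```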
Computing conditional probabilities
Generally speaking:
• Consider a world described by a set of variables X
• Typically, we are interested in the posterior joint distribution of some query variables Y, given specific values e for some evidence variables E.
• Often, there are some hidden variables Z = X – Y – E
• If we know the joint probability distribution, we compute
the answer by “summing out” the hidden variables:
P(Y, e) = ∑_z P(Y, e, z)   ⇐ “summing out”
Problem: the joint distribution is too big to handle!
• Consider medical diagnosis, where there are 100 different symptoms and test results that the doctor could consider.
• A patient comes in complaining of fever, cough and chest pains.
• The doctor wants to compute the probability of bronchitis:
• Probability table has ≥ 2¹⁰⁰ entries!
• To compute the probability of bronchitis, we have to sum out over 97 hidden variables (a sum over 2⁹⁷ entries).
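A sketch of summing out on a tiny joint over three Boolean variables (Bronchitis, Fever, Cough); the joint entries below are invented for illustration:

```python
# Toy joint P(B, F, C); the eight entries sum to 1.
joint = {
    (True,  True,  True):  0.040, (True,  True,  False): 0.010,
    (True,  False, True):  0.008, (True,  False, False): 0.002,
    (False, True,  True):  0.020, (False, True,  False): 0.070,
    (False, False, True):  0.050, (False, False, False): 0.800,
}

def p_b_given_fever():
    """P(B | F=True): sum out the hidden variable C, then normalize."""
    unnormalized = {
        b: sum(joint[(b, True, c)] for c in (True, False))  # summing out C
        for b in (True, False)
    }
    z = sum(unnormalized.values())  # = P(F=True)
    return {b: p / z for b, p in unnormalized.items()}

print(p_b_given_fever())  # {True: ~0.357, False: ~0.643}
# With 100 variables the table would need 2^100 entries, and the inner sum
# would range over 2^97 assignments of the hidden variables.
```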
Independence of random variables
• Two random variables X and Y are independent if knowledge about X does not change the uncertainty about Y (and vice versa.)
P(x | y) = P(x) ∀x ∈ SX, ∀y ∈ SY
P(y | x) = P(y) ∀x ∈ SX, ∀y ∈ SY
• If n Boolean variables are independent, the whole joint distribution can be computed from the marginals:
P(x1, …, xn) = ∏i P(xi)
• Only n numbers are needed to specify the joint, instead of 2ⁿ.
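A small sketch of the savings (the marginals are invented for illustration): with independence, any joint entry is a product of the n stored marginals:

```python
from itertools import product

# P(X_i = True) for three independent Boolean variables.
marginals = [0.1, 0.5, 0.3]

def p_joint(assignment):
    """Probability of a full assignment, as a product of marginals."""
    prob = 1.0
    for p_true, value in zip(marginals, assignment):
        prob *= p_true if value else (1 - p_true)
    return prob

print(p_joint((True, False, True)))  # 0.1 * 0.5 * 0.3 = 0.015
# Sanity check: the 2^n reconstructed entries still sum to 1.
print(sum(p_joint(a) for a in product((True, False), repeat=3)))  # 1.0
```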
Problem: Absolute independence is a very strong requirement!
Conditional independence
• Two variables X and Y are conditionally independent given Z if:
P(x | y, z) = P(x | z) ∀x, y, z
• This means that knowing the value of Y does not change the prediction about X if the value of Z is known. This is much more common!
• Consider a patient with three random variables:
B (patient has bronchitis), F (patient has fever), C (patient has cough)
The full joint distribution has 2³ − 1 = 7 independent entries.
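If we additionally assume that F and C are conditionally independent given B (the assumption the Naïve Bayes model below makes), the joint factorizes and needs fewer parameters:

```latex
% Assuming F and C are conditionally independent given B:
\[ P(B, F, C) = P(B)\, P(F \mid B)\, P(C \mid B) \]
% Parameters: 1 for P(B), 2 for P(F | B), 2 for P(C | B),
% i.e. 5 numbers instead of the full joint's 7 independent entries.
```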
Example (cont’d)
Naïve Bayes model
• A common assumption in early diagnosis systems is that the symptoms are independent of each other given the disease.
• Let s1, .., sn be the symptoms exhibited by a patient (e.g. fever, headache, etc.). Let D be the patient’s disease.
• Using the Naive Bayes assumption:
P(D, s1, …, sn) = P(D) P(s1 | D) … P(sn | D)
[Figure: Naïve Bayes graphical model, with a Diagnosis node pointing to Symptom 1 through Symptom n]
• How many parameters are there in the joint probability distribution, assuming all diagnoses and symptoms are binary:
P(D, s1, …, sn)
• How many parameters are there assuming the Naïve Bayes assumption?
P(D, s1, …, sn) = P(D) P(s1 | D) … P(sn | D)
• What is the conditional probability that you would use to actually diagnose a patient, given their symptoms?
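To diagnose a patient we need the posterior P(D | s1, …, sn), which Bayes’ rule gives up to a normalizing constant. A minimal Naïve Bayes sketch in Python follows; the diseases, priors, and likelihoods are all invented for illustration:

```python
# Naive Bayes diagnosis: P(D | s1..sn) is proportional to P(D) * prod_i P(s_i | D).
p_disease = {"flu": 0.05, "cold": 0.20, "healthy": 0.75}   # prior P(D)
p_symptom_given_d = {                                       # P(symptom | D)
    "flu":     {"fever": 0.90, "cough": 0.80},
    "cold":    {"fever": 0.10, "cough": 0.70},
    "healthy": {"fever": 0.01, "cough": 0.05},
}

def diagnose(observed):
    """observed: dict symptom -> bool. Returns the posterior P(D | symptoms)."""
    scores = {}
    for d, prior in p_disease.items():
        score = prior
        for s, present in observed.items():
            p = p_symptom_given_d[d][s]
            score *= p if present else (1 - p)
        scores[d] = score
    z = sum(scores.values())            # normalize over diseases
    return {d: s / z for d, s in scores.items()}

print(diagnose({"fever": True, "cough": True}))
# flu: 0.05*0.9*0.8 = 0.036; cold: 0.2*0.1*0.7 = 0.014; healthy: 0.000375
# -> flu is the most probable diagnosis (~0.71 after normalizing).
```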
Naïve Bayes example
Example solutions – patient 1
Example solutions – patient 2
Example solutions – patient 3