CSE 3521: Probability Refresher
[Many slides are adapted from previous CSE 5521 course at OSU.]
What is Probability?
“The probability the coin will land heads is 0.5”
Q: what does this mean?
Interpretations:
Frequentist (repeated trials)
If we flip the coin many times…
On average, we will see heads half of the time
Bayesian (degree of belief)
We believe there is an equal chance of heads/tails
Advantage: covers events that do not have long-term frequencies
Q: What is the probability the polar ice caps will melt by 2050?
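The frequentist interpretation can be illustrated with a quick simulation (a minimal Python sketch, not from the slides; the fair coin and the value 0.5 are the slide's example):

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Frequentist reading of "P(heads) = 0.5": simulate many flips and
# check that the observed fraction of heads approaches 0.5.
flips = [random.random() < 0.5 for _ in range(100_000)]
frequency = sum(flips) / len(flips)
print(frequency)
```

With more flips, the running frequency converges to 0.5; no such experiment exists for one-off events like the polar ice caps, which is where the degree-of-belief reading is needed.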
Probability Basics (a less strict version)
Begin with a set Ω (i.e., the sample space)
e.g., the 6 possible rolls of a die
ω ∈ Ω is an outcome, a sample point (e.g., ω = 1), or atomic event
An event A is a subset of Ω
e.g., the atomic event {6}, i.e., the sample 6
A probability space or probability model is a sample space together with a probability function P for every event A ⊆ Ω, s.t.
0 ≤ P(A) ≤ 1
P(Ω) = 1
P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
In the case where every sample can be an “atomic” event:
P(A) = Σ_{ω ∈ A} P(ω)
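These definitions can be checked mechanically for the die (a sketch in Python; the fair-die probabilities are an assumption for illustration):

```python
# A fair die as a probability model: a sample space, atomic
# probabilities, and event probabilities obtained by summing.
omega = {1, 2, 3, 4, 5, 6}           # sample space
p = {w: 1 / 6 for w in omega}        # P(w) for each atomic event

def prob(event):
    # P(A) = sum of P(w) over w in A, for A a subset of omega
    return sum(p[w] for w in event)

A, B = {1, 2, 3}, {3, 4}
print(prob(omega))                                   # P(Omega) = 1
print(prob(A | B), prob(A) + prob(B) - prob(A & B))  # inclusion-exclusion
```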
Random Variables
A random variable X is a function from Ω to some range (e.g., the reals or Booleans)
e.g., X(1) = True; X(3) = True; X(5) = True; X(2) = False; X(4) = False; X(6) = False
e.g., X(“head”) = 0; X(“tail”) = 1
e.g., X(“Temperature is 3.5 degrees”) = 3.5
P induces a probability measure for any random variable X:
P(X = x) = Σ_{ω: X(ω) = x} P(ω); X = x is an assignment
e.g., P(X = True) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
X = x can be viewed as an event
X = x and X = x′ with x ≠ x′ can be viewed as disjoint events
Denote random variables with capital letters
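The first example above (X is True exactly on odd rolls) can be written out directly, treating the random variable as a plain function on the sample space (a sketch; exact fractions avoid rounding):

```python
from fractions import Fraction

# A random variable as a function on the sample space: X(w) is True
# iff the die roll w is odd, matching the slide's example.
p = {w: Fraction(1, 6) for w in range(1, 7)}

def X(w):
    return w % 2 == 1

def prob_X(x):
    # P(X = x) = sum of P(w) over outcomes w with X(w) = x
    return sum(p[w] for w in p if X(w) == x)

print(prob_X(True))    # 1/6 + 1/6 + 1/6 = 1/2
```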
Proposition
Think of a proposition as the event (set of sample points) where the proposition is true
Given Boolean random variables A and B
event a = set of sample points where A(ω) = true
event ¬a = set of sample points where A(ω) = false
event a ∧ b = points where A(ω) = true and B(ω) = true
event a ∨ b = points where A(ω) = true or B(ω) = true
Often in AI applications, the “multi-dimensional” sample points/data instances are defined by the values of a set of random variables (of different sample spaces),
i.e., the sample space is the Cartesian product of the ranges of the variables
e.g., coin = (weight, height)
The definitions imply that certain logically related events must have related probabilities
e.g., P (a ∨ b) = P (a) + P (b) − P (a ∧ b)
Both events are in the same “sample” space.
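The set view of propositions makes these identities checkable by hand; a small sketch (the propositions a = “roll is even” and b = “roll is at most 3” are illustrative choices, not from the slides):

```python
# Propositions as events (sets of sample points) on one die roll.
omega = {1, 2, 3, 4, 5, 6}
p = {w: 1 / 6 for w in omega}
prob = lambda e: sum(p[w] for w in e)

a = {w for w in omega if w % 2 == 0}   # a = "roll is even"
b = {w for w in omega if w <= 3}       # b = "roll is at most 3"

not_a = omega - a          # event ¬a
a_and_b = a & b            # event a ∧ b
a_or_b = a | b             # event a ∨ b

# P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
print(prob(a_or_b), prob(a) + prob(b) - prob(a_and_b))
```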
Why should we use probability?
Propositional or Boolean random variables
e.g., Cavity (do I have a cavity?)
Cavity = true is a proposition, also written cavity
Discrete random variables (finite or infinite)
e.g., Weather is one of sunny, rain, cloudy, snow
Weather = rain is a proposition
Values must be exhaustive and mutually exclusive
Continuous random variables (bounded or unbounded)
e.g., Temp = 21.6; also allow, e.g., Temp < 22.0.
Arbitrary Boolean combinations of basic propositions
Prior probability (before seeing evidence/data)
Prior or unconditional probabilities of propositions
e.g., P (Cavity = true) = 0.2 and P (Weather = sunny) = 0.72
correspond to belief prior to arrival of any (new) evidence
Probability distribution gives values for all possible assignments:
P(Weather = sunny/rain/cloudy/snow) = 0.72/0.1/0.08/0.1 (normalized, i.e., sums to 1)
Joint probability distribution for a set of random variables gives the probability of every atomic event on those random variables
i.e., every sample point/data instance
P(Weather, Cavity) = a 4 × 2 matrix of values, summing to 1:

Weather          sunny   rain   cloudy   snow
Cavity = true    0.144   0.02   0.016    0.02
Cavity = false   0.576   0.08   0.064    0.08
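The joint table above can be stored as a dictionary, and marginals read off by summing (a sketch; the eight numbers are exactly those in the table):

```python
# The P(Weather, Cavity) table from the slide, keyed by (weather, cavity).
joint = {
    ("sunny", True): 0.144, ("rain", True): 0.02,
    ("cloudy", True): 0.016, ("snow", True): 0.02,
    ("sunny", False): 0.576, ("rain", False): 0.08,
    ("cloudy", False): 0.064, ("snow", False): 0.08,
}

total = sum(joint.values())   # the eight entries sum to 1

# Marginalizing out Cavity recovers the prior P(Weather = sunny) = 0.72
p_sunny = sum(v for (w, c), v in joint.items() if w == "sunny")
print(total, p_sunny)
```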
Probability for Continuous Variables
Express distribution as a parameterized function of value:
P(X = x) = U[18, 26](x) = uniform density between 18 and 26
Here P is a density (i.e., probability density function: PDF); it integrates to 1
P(X = 20.5) = 0.125 really means lim_{dx→0} P(20.5 ≤ X ≤ 20.5 + dx) / dx = 0.125
The density can be larger than 1
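A quick numerical check of the uniform density on [18, 26] (a sketch; the midpoint Riemann sum is an illustrative choice):

```python
# Uniform density U[18, 26]: constant height 1/(26-18) = 0.125 inside
# the interval, 0 outside; P(X = 20.5) = 0.125 is a density value,
# not a probability of a point.
def pdf(x):
    return 0.125 if 18.0 <= x <= 26.0 else 0.0

# Midpoint Riemann sum over [17, 27]: the density integrates to ~1.
n = 100_000
dx = 10.0 / n
integral = sum(pdf(17.0 + (i + 0.5) * dx) * dx for i in range(n))
print(pdf(20.5), integral)
```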
Gaussian Density
P(x) = 1/√(2πσ²) · exp(−(x − μ)² / (2σ²))
Conditional Probability given evidence
Conditional or posterior probabilities
e.g., P (Cavity = true | Toothache = true) = 0.8
i.e., given that toothache is all I know, the probability of cavity is 0.8
Notation for conditional probabilities:
Denote Cavity = true with cavity, Cavity = false with ¬ cavity, and so on!
P(cavity | toothache)
P(cavity | toothache) + P(¬ cavity | toothache) = 1
P(cavity | ¬ toothache) + P(¬ cavity | ¬ toothache) = 1
“conditional”     Toothache = true   Toothache = false
Cavity = true     0.8                0.2
Cavity = false    0.2                0.8
Conditional Probability given evidence
If we know more, e.g., not-brush-teeth, the probability may change
P (cavity | toothache, not-brush-teeth) = 0.95
P (cavity | toothache, visit-a-dentist-last-week) = 0.2
New evidence may be irrelevant, allowing simplification
e.g., P (cavity | toothache, 49℃) = P (cavity | toothache) = 0.8
This kind of inference, sanctioned by domain knowledge, is crucial
Inference by Enumeration
Start with the joint distribution of three Boolean variables: (1) Cavity, (2) Toothache, (3) Catch

             toothache            ¬toothache
           catch   ¬catch       catch   ¬catch
cavity     0.108   0.012        0.072   0.008
¬cavity    0.016   0.064        0.144   0.576

For any proposition φ, sum the atomic events where it is true: P(φ) = Σ_{ω: ω ⊨ φ} P(ω)
P(toothache) = P(toothache, cavity, catch) + P(toothache, cavity, ¬catch) + P(toothache, ¬cavity, catch) + P(toothache, ¬cavity, ¬catch) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2
We call this a “marginal distribution”
P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28
P(¬cavity | toothache) = P(¬cavity, toothache) / P(toothache) = (0.016 + 0.064) / 0.2 = 0.4
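Enumeration is mechanical enough to code directly (a sketch in Python; the eight probabilities are the table's values):

```python
# Inference by enumeration over the slide's joint distribution,
# keyed by (cavity, toothache, catch).
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(holds):
    # Sum the atomic events where the proposition `holds` is true.
    return sum(p for event, p in joint.items() if holds(*event))

p_toothache = prob(lambda cav, tooth, cat: tooth)
p_cav_or_tooth = prob(lambda cav, tooth, cat: cav or tooth)
p_not_cav_given_tooth = (
    prob(lambda cav, tooth, cat: not cav and tooth) / p_toothache
)
print(p_toothache, p_cav_or_tooth, p_not_cav_given_tooth)
```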
Normalization
Start with joint distribution
Denominator can be viewed as a normalization constant
P(Cavity | toothache) = α P(Cavity, toothache)
= α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
= α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩] = α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩
General idea: compute distribution on query variable
by fixing evidence variables and summing over hidden variables
             toothache            ¬toothache
           catch   ¬catch       catch   ¬catch
cavity     0.108   0.012        0.072   0.008
¬cavity    0.016   0.064        0.144   0.576
P(cavity, ¬catch) = P(cavity, ¬catch, toothache) + P(cavity, ¬catch, ¬toothache) = 0.012 + 0.008 = 0.02
P(toothache | cavity, ¬catch) = P(toothache, cavity, ¬catch) / P(cavity, ¬catch) = 0.012 / 0.02 = 0.6
P(cavity, ¬catch | toothache) = P(cavity, ¬catch, toothache) / P(toothache) = 0.012 / 0.2 = 0.06

             toothache            ¬toothache
           catch   ¬catch       catch   ¬catch
cavity     0.108   0.012        0.072   0.008
¬cavity    0.016   0.064        0.144   0.576
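The general recipe, fix the evidence, sum over the hidden variables, then normalize, can be sketched as follows (the joint table is the one above):

```python
# Normalization: P(Cavity | toothache) = alpha * P(Cavity, toothache),
# with the joint keyed by (cavity, toothache, catch).
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

# Fix the evidence (toothache) and sum over the hidden variable (catch).
unnormalized = {
    cav: sum(p for (c, tooth, _), p in joint.items() if c == cav and tooth)
    for cav in (True, False)
}
alpha = 1.0 / sum(unnormalized.values())   # 1 / P(toothache)
posterior = {cav: alpha * u for cav, u in unnormalized.items()}
print(posterior)
```

Note that alpha is computed from the unnormalized values themselves, so P(toothache) never has to be evaluated separately.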
The product rule
P(Cavity, Toothache)= P(Cavity)P(Toothache|Cavity)
P(cavity, toothache)= P(cavity)P(toothache|cavity)
P(cavity, ¬ toothache)= P(cavity)P(¬ toothache|cavity)
And so on for all assignments
The chain rule
P(Cavity, Toothache, Catch)=
P(Cavity)P(Toothache , Catch |Cavity) = P(Cavity)P(Toothache|Cavity)P(Catch| Toothache, Cavity)
P(Toothache)P(Cavity,Catch|Toothache) = P(Toothache)P(Catch|Toothache)P(Cavity|Catch,Toothache)
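The chain rule can be verified numerically on the dental table by reconstructing each joint entry from its conditional factors (a sketch; the ordering cavity, toothache, catch is one of the two shown above):

```python
from itertools import product

# Check P(cav, tooth, cat) = P(cav) P(tooth | cav) P(cat | tooth, cav)
# for every assignment, using the slide's joint distribution.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(holds):
    return sum(p for event, p in joint.items() if holds(*event))

max_err = 0.0
for cav, tooth, cat in product((True, False), repeat=3):
    p1 = prob(lambda c, t, k: c == cav)                      # P(cav)
    p2 = prob(lambda c, t, k: (c, t) == (cav, tooth)) / p1   # P(tooth|cav)
    p3 = joint[(cav, tooth, cat)] / (p1 * p2)                # P(cat|tooth,cav)
    max_err = max(max_err, abs(joint[(cav, tooth, cat)] - p1 * p2 * p3))
print(max_err)
```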
Independence (not always)
A and B are independent if and only if
P(A|B) = P(A) or P(B|A) = P(B) or P(A, B) = P(A)P(B)
Need to check all the assignments: e.g., P(A=a, B=b) = P(A=a)P(B=b) and so on
P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
Absolute independence is powerful but rare
Dentistry is a large field with hundreds of variables, none of which are independent.
What to do?
[Diagram: the joint over Cavity, Toothache, Catch decomposes into smaller factors using conditional independence.]
Conditional Independence (not always)
A and B are conditionally independent given C if and only if
P(A, B|C=c) = P(A|C=c) P(B|C=c) or P(A|B, C=c) = P(A|C=c) or P(B|A, C=c) = P(B|C=c)
Need to check all the assignments: e.g., A = a, B = b, C = c
If I have a cavity, the probability that the probe catches in it doesn’t depend on whether I have a toothache:
P (catch|toothache, cavity) = P (catch| ¬ toothache, cavity) = P (catch|cavity)
The same independence holds if I haven’t got a cavity:
P (catch|toothache, ¬cavity) = P (catch| ¬ toothache, ¬cavity) = P (catch|¬cavity)
             toothache            ¬toothache
           catch   ¬catch       catch   ¬catch
cavity     0.108   0.012        0.072   0.008
¬cavity    0.016   0.064        0.144   0.576
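Both equalities can be verified directly against the joint table (a sketch; note the table was in fact built so that they hold exactly):

```python
# Check conditional independence of Catch and Toothache given Cavity:
# P(catch | toothache, cavity) should equal P(catch | cavity), for
# both values of Cavity.
joint = {
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(holds):
    return sum(p for event, p in joint.items() if holds(*event))

results = {}
for cav in (True, False):
    # P(catch | toothache, Cavity = cav)
    lhs = (prob(lambda c, t, k: k and t and c == cav)
           / prob(lambda c, t, k: t and c == cav))
    # P(catch | Cavity = cav)
    rhs = (prob(lambda c, t, k: k and c == cav)
           / prob(lambda c, t, k: c == cav))
    results[cav] = (lhs, rhs)
print(results)   # both pairs agree: 0.9 given cavity, 0.2 given ¬cavity
```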
Conditional Independence
Catch is conditionally independent of Toothache given cavity (or ¬cavity):
P(Catch|Toothache, Cavity = true) = P(Catch|Cavity = true)
P(Catch|Toothache, Cavity = false) = P(Catch|Cavity = false)
Equivalent statements:
P(Toothache|Catch, Cavity) = P(Toothache|Cavity)
P(Toothache, Catch|Cavity) = P(Toothache|Cavity) P(Catch|Cavity)
P(Toothache, Catch, Cavity) needs to store 2³ − 1 = 7 independent numbers
If Toothache and Catch are conditionally independent given Cavity:
P(Toothache, Catch, Cavity) = P(Toothache, Catch | Cavity) P(Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
i.e., only 2 + 2 + 1 = 5 independent numbers
Conditional Independence
Write out full joint distribution using chain rule:
P(toothache, catch, cavity)
= P(toothache| catch, cavity)P(catch, cavity)
= P(toothache| catch, cavity)P(catch | cavity) P(cavity)
= P(toothache| cavity)P(catch | cavity) P(cavity)
In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n.
Conditional independence is our most basic and robust form of knowledge about uncertain environments.
Bayes' Rule
Product rule: P(a, b) = P(a|b) P(b) = P(b|a) P(a)
⇒ Bayes’ rule: P(a|b) = P(b|a) P(a) / P(b)
or in distribution form
P(Y|X) = P(X|Y) P(Y) / P(X) = α P(X|Y) P(Y); α is the normalization factor (1 / P(X))
Useful for assessing diagnostic probability from causal probability:
P(cause|effect) = P(effect|cause) P(cause) / P(effect)
e.g., let M be meningitis, S be stiff neck: P(m|s) = P(s|m) P(m) / P(s)
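A worked diagnostic example using the dental numbers (a sketch; the three inputs are all read off the earlier joint table, so the answer should match the direct computation of P(cavity | toothache)):

```python
# Bayes' rule in the diagnostic direction:
# P(cavity | toothache) = P(toothache | cavity) P(cavity) / P(toothache)
p_cavity = 0.2
p_toothache = 0.2
p_toothache_given_cavity = 0.6      # = 0.12 / 0.2 from the joint table

p_cavity_given_toothache = p_toothache_given_cavity * p_cavity / p_toothache
print(p_cavity_given_toothache)     # ≈ 0.6, matching the table
```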
Bayes' Rule and Conditional Independence
P(cavity|toothache, catch)
= α P(toothache, catch| cavity)P( cavity)
= α P(toothache|cavity)P(catch|cavity)P(cavity)
This is an example of a naive Bayes model:
P(cause, effect₁, …, effect_n) = P(cause) Π_i P(effect_i | cause)
Total number of parameters is linear in n
[Diagram: naive Bayes model, Cavity as the cause with Toothache and Catch as effects; in general, a cause node with children effect₁, …, effect_n.]
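The naive Bayes computation for the dental example can be sketched as follows (the prior and per-effect likelihoods are read off the earlier joint table; the final normalization reuses the α trick from the Normalization slide):

```python
# Naive Bayes posterior P(Cavity | toothache, catch).
prior = {True: 0.2, False: 0.8}            # P(Cavity)
p_toothache = {True: 0.6, False: 0.1}      # P(toothache | Cavity)
p_catch = {True: 0.9, False: 0.2}          # P(catch | Cavity)

# P(Cavity | toothache, catch)
#   proportional to P(Cavity) P(toothache|Cavity) P(catch|Cavity)
scores = {c: prior[c] * p_toothache[c] * p_catch[c] for c in (True, False)}
alpha = 1.0 / sum(scores.values())
posterior = {c: alpha * s for c, s in scores.items()}
print(posterior)   # cavity is highly likely given both symptoms
```

Because the table satisfies the conditional independence exactly, this posterior agrees with the one computed by full enumeration.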
Summary
Let X₁, …, X_N be N random variables, and x₁, …, x_N the assignments for them
Joint: P(x₁, …, x_N)
Marginal: P(x₁) = Σ_{x₂, …, x_N} P(x₁, x₂, …, x_N)
Conditional: P(x₁ | x₂) = P(x₁, x₂) / P(x₂)
Independence, conditional independence
Product and chain rules: P(x₁, …, x_N) = P(x₁) P(x₂ | x₁) ⋯ P(x_N | x₁, …, x_{N−1})
Bayes’ rule: P(x₁ | x₂) = P(x₂ | x₁) P(x₁) / P(x₂)