Introduction to
Artificial Intelligence
with Python

Uncertainty

Probability

Possible Worlds
P(ω)

0 ≤ P(ω) ≤ 1

∑_{ω ∈ Ω} P(ω) = 1

[Figure: the six faces of a die]
P(any given face) = 1/6

[Figure: 6 × 6 grid of the sums of two dice]
2   3   4   5   6   7
3   4   5   6   7   8
4   5   6   7   8   9
5   6   7   8   9   10
6   7   8   9   10  11
7   8   9   10  11  12

P(sum to 12) = 1/36
P(sum to 7) = 6/36 = 1/6
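
The two values above can be checked by listing every possible world for two dice. The following is a minimal sketch; the names `worlds` and `p_sum` are illustrative and not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every possible world ω: an ordered pair (first die, second die).
worlds = list(product(range(1, 7), repeat=2))

# All 36 worlds are equally likely, so P(sum = s) is the number of
# worlds with that sum divided by 36.
counts = Counter(a + b for a, b in worlds)
p_sum = {total: Fraction(n, len(worlds)) for total, n in counts.items()}

print(p_sum[12])            # 1/36
print(p_sum[7])             # 1/6
print(sum(p_sum.values()))  # 1
```

Using `Fraction` keeps the results exact, so the probabilities visibly add up to 1.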

unconditional probability
degree of belief in a proposition
in the absence of any other evidence

conditional probability
degree of belief in a proposition given some evidence that has already been revealed

conditional probability
P(a | b)

P(rain today | rain yesterday)

P(route change | traffic conditions)

P(disease | test results)

P(a|b) = P(a ∧ b) / P(b)

P(sum 12 | first die is 6)

P(first die is 6) = 1/6
P(sum 12 ∧ first die is 6) = P(sum 12) = 1/36
P(sum 12 | first die is 6) = (1/36) / (1/6) = 1/6

P(a|b) = P(a ∧ b) / P(b)
P(a ∧ b) = P(b) P(a|b)
P(a ∧ b) = P(a) P(b|a)
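
The definition of conditional probability can be tried on the two-dice example by counting worlds. This is a sketch under the assumption that events are written as Python predicates over (first die, second die) pairs; `prob`, `conditional`, `sum_is_12`, and `first_is_6` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import product

worlds = list(product(range(1, 7), repeat=2))


def prob(event):
    """P(event) when every world is equally likely."""
    return Fraction(sum(1 for w in worlds if event(w)), len(worlds))


def conditional(a, b):
    """P(a | b) = P(a ∧ b) / P(b)."""
    return prob(lambda w: a(w) and b(w)) / prob(b)


def sum_is_12(w):
    return w[0] + w[1] == 12


def first_is_6(w):
    return w[0] == 6


print(conditional(sum_is_12, first_is_6))  # 1/6
```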

random variable
a variable in probability theory with a domain of possible values it can take on

random variable
Roll
{1, 2, 3, 4, 5, 6}

random variable
Weather
{sun, cloud, rain, wind, snow}

random variable
Traffic
{none, light, heavy}

random variable
Flight
{on time, delayed, cancelled}

probability distribution
P(Flight = on time) = 0.6
P(Flight = delayed) = 0.3
P(Flight = cancelled) = 0.1

probability distribution
P(Flight) = ⟨0.6, 0.3, 0.1⟩

independence
the knowledge that one event occurs does not affect the probability of the other event

independence
P(a ∧ b) = P(a)P(b|a)

independence
P(a ∧ b) = P(a)P(b)

independence
P(first die is 6 ∧ second die is 6) = P(first die is 6) P(second die is 6)
= 1/6 ⋅ 1/6 = 1/36

independence
P(first die is 6 ∧ first die is 4) ≠ P(first die is 6) P(first die is 4)
= 1/6 ⋅ 1/6 = 1/36

independence
P(first die is 6 ∧ first die is 4) = P(first die is 6) P(first die is 4 | first die is 6)
= 1/6 ⋅ 0 = 0

Bayes’ Rule

P(a ∧ b) = P(b) P(a|b)
P(a ∧ b) = P(a) P(b|a)

P(a) P(b|a) = P(b) P(a|b)

Bayes’ Rule
P(b|a) = P(b) P(a|b) / P(a)

Bayes’ Rule
P(b|a) = P(a|b) P(b) / P(a)

[Figure: a morning (AM) and an afternoon (PM)]
Given clouds in the morning,
what’s the probability of rain in the afternoon?
• 80% of rainy afternoons start with cloudy mornings.
• 40% of days have cloudy mornings.
• 10% of days have rainy afternoons.

P(rain | clouds) = P(clouds | rain) P(rain) / P(clouds)
= (0.8)(0.1) / 0.4
= 0.2
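
The same calculation as a short function. This is only a sketch; the function name `bayes` and its parameter names are illustrative.

```python
def bayes(p_a_given_b, p_b, p_a):
    """Return P(b | a) = P(a | b) P(b) / P(a)."""
    return p_a_given_b * p_b / p_a


# a = clouds in the morning, b = rain in the afternoon
p_rain_given_clouds = bayes(p_a_given_b=0.8, p_b=0.1, p_a=0.4)
print(round(p_rain_given_clouds, 3))  # 0.2
```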

Knowing
P(cloudy morning | rainy afternoon)
we can calculate
P(rainy afternoon | cloudy morning)

Knowing
P(visible effect | unknown cause)
we can calculate
P(unknown cause | visible effect)

Knowing
P(medical test result | disease)
we can calculate
P(disease | medical test result)

Knowing
P(blurry text | counterfeit bill)
we can calculate
P(counterfeit bill | blurry text)

Joint Probability

AM
C = cloud     C = ¬cloud
0.4           0.6

PM
R = rain      R = ¬rain
0.1           0.9

AM and PM
              R = rain    R = ¬rain
C = cloud     0.08        0.32
C = ¬cloud    0.02        0.58

P(C | rain)
P(C | rain) = P(C, rain) / P(rain) = α P(C, rain)
= α⟨0.08, 0.02⟩ = ⟨0.8, 0.2⟩

              R = rain    R = ¬rain
C = cloud     0.08        0.32
C = ¬cloud    0.02        0.58
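
The normalization constant α is just one over the sum of the selected entries. A minimal sketch, assuming the joint table is stored as a dictionary keyed by (C, R) pairs; the names `joint` and `normalize` and the string labels are illustrative.

```python
# Joint distribution over (C, R), copied from the table above.
joint = {
    ("cloud", "rain"): 0.08,
    ("cloud", "no rain"): 0.32,
    ("no cloud", "rain"): 0.02,
    ("no cloud", "no rain"): 0.58,
}


def normalize(values):
    """Scale the values so that they sum to 1 (multiply by α)."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


# Keep only the column R = rain, then normalize it.
p_c_and_rain = {c: p for (c, r), p in joint.items() if r == "rain"}
p_c_given_rain = normalize(p_c_and_rain)

print({c: round(p, 3) for c, p in p_c_given_rain.items()})
# {'cloud': 0.8, 'no cloud': 0.2}
```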

Probability Rules

Negation
P(¬a) = 1 − P(a)

Inclusion-Exclusion
P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

Marginalization
P(a) = P(a, b) + P(a, ¬b)

Marginalization
P(X = xᵢ) = ∑ⱼ P(X = xᵢ, Y = yⱼ)

Marginalization
              R = rain    R = ¬rain
C = cloud     0.08        0.32
C = ¬cloud    0.02        0.58

P(C = cloud)
= P(C = cloud, R = rain) + P(C = cloud, R = ¬rain)
= 0.08 + 0.32
= 0.40
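
Marginalization amounts to summing the joint table over the variable that is not of interest. A sketch under the same dictionary representation as the table above; `marginal` is a hypothetical helper name.

```python
joint = {
    ("cloud", "rain"): 0.08,
    ("cloud", "no rain"): 0.32,
    ("no cloud", "rain"): 0.02,
    ("no cloud", "no rain"): 0.58,
}


def marginal(joint, index):
    """Sum out every variable except the one at position `index`."""
    result = {}
    for outcome, p in joint.items():
        key = outcome[index]
        result[key] = result.get(key, 0.0) + p
    return result


p_c = marginal(joint, 0)  # distribution over C
p_r = marginal(joint, 1)  # distribution over R

print({k: round(v, 2) for k, v in p_c.items()})  # {'cloud': 0.4, 'no cloud': 0.6}
print({k: round(v, 2) for k, v in p_r.items()})  # {'rain': 0.1, 'no rain': 0.9}
```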

Conditioning
P(a) = P(a|b)P(b) + P(a|¬b)P(¬b)

Conditioning
P(X = xᵢ) = ∑ⱼ P(X = xᵢ | Y = yⱼ) P(Y = yⱼ)

Bayesian Networks

Bayesian network
data structure that represents the dependencies among random variables

Bayesian network
• directed graph
• each node represents a random variable
• arrow from X to Y means X is a parent of Y
• each node X has probability distribution P(X | Parents(X))

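Rain {none, light, heavy}
Maintenance {yes, no}
Train {on time, delayed}
Appointment {attend, miss}
[Figure: directed graph with edges Rain → Maintenance, Rain → Train, Maintenance → Train, Train → Appointment]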

Rain {none, light, heavy}
none    light   heavy
0.7     0.2     0.1

Maintenance {yes, no}
R       yes     no
none    0.4     0.6
light   0.2     0.8
heavy   0.1     0.9

Train {on time, delayed}
R       M       on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5

Appointment {attend, miss}
T         attend   miss
on time   0.9      0.1
delayed   0.6      0.4

Computing Joint Probabilities

P(light)
= P(light)

P(light, no)
= P(light) P(no | light)

P(light, no, delayed)
= P(light) P(no | light) P(delayed | light, no)

P(light, no, delayed, miss)
= P(light) P(no | light) P(delayed | light, no) P(miss | delayed)
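
The product above can be evaluated directly from the four tables of the network. The following is a sketch that stores each table as a plain dictionary; the names `P_RAIN`, `P_MAINT`, `P_TRAIN`, `P_APPT`, and `joint` are illustrative choices, not an established library API.

```python
# Conditional probability tables copied from the network's slides.
P_RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_MAINT = {  # P(Maintenance | Rain)
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}
P_TRAIN = {  # P(Train | Rain, Maintenance)
    ("none", "yes"): {"on time": 0.8, "delayed": 0.2},
    ("none", "no"): {"on time": 0.9, "delayed": 0.1},
    ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
    ("light", "no"): {"on time": 0.7, "delayed": 0.3},
    ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
    ("heavy", "no"): {"on time": 0.5, "delayed": 0.5},
}
P_APPT = {  # P(Appointment | Train)
    "on time": {"attend": 0.9, "miss": 0.1},
    "delayed": {"attend": 0.6, "miss": 0.4},
}


def joint(rain, maint, train, appt):
    """P(rain, maint, train, appt) as a product of one entry per node."""
    return (
        P_RAIN[rain]
        * P_MAINT[rain][maint]
        * P_TRAIN[(rain, maint)][train]
        * P_APPT[train][appt]
    )


# 0.2 * 0.8 * 0.3 * 0.4
print(round(joint("light", "no", "delayed", "miss"), 4))  # 0.0192
```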

Inference

Inference
• Query X: variable for which to compute distribution
• Evidence variables E: observed variables for event e
• Hidden variables Y: non-evidence, non-query variables
• Goal: Calculate P(X | e)

P(Appointment | light, no)
= α P(Appointment, light, no)
= α [P(Appointment, light, no, on time) + P(Appointment, light, no, delayed)]

Inference by Enumeration
P(X | e) = α P(X, e) = α ∑_y P(X, e, y)
X is the query variable.
e is the evidence.
y ranges over values of hidden variables.
α normalizes the result.
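
Inference by enumeration for the query P(Appointment | light, no) can be sketched as follows. The tables are the ones from the train network; the dictionary layout and the function names `normalize` and `appointment_given` are assumptions made for this example. Train is the only hidden variable, so the sum runs over its two values.

```python
P_RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_MAINT = {
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}
P_TRAIN = {
    ("none", "yes"): {"on time": 0.8, "delayed": 0.2},
    ("none", "no"): {"on time": 0.9, "delayed": 0.1},
    ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
    ("light", "no"): {"on time": 0.7, "delayed": 0.3},
    ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
    ("heavy", "no"): {"on time": 0.5, "delayed": 0.5},
}
P_APPT = {
    "on time": {"attend": 0.9, "miss": 0.1},
    "delayed": {"attend": 0.6, "miss": 0.4},
}


def normalize(values):
    """Multiply by α so that the values sum to 1."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def appointment_given(rain, maint):
    """P(Appointment | rain, maint), summing out the hidden variable Train."""
    unnormalized = {}
    for appt in ("attend", "miss"):
        unnormalized[appt] = sum(
            P_RAIN[rain]
            * P_MAINT[rain][maint]
            * P_TRAIN[(rain, maint)][train]
            * P_APPT[train][appt]
            for train in ("on time", "delayed")
        )
    return normalize(unnormalized)


result = appointment_given("light", "no")
print({k: round(v, 2) for k, v in result.items()})  # {'attend': 0.81, 'miss': 0.19}
```

Enumeration is exact, but the number of terms grows exponentially with the number of hidden variables, which motivates the approximate methods that follow.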

Approximate Inference

Sampling

[Figure: the network with nodes Rain, Maintenance, Train, Appointment]

R = none
Rain {none, light, heavy}
none    light   heavy
0.7     0.2     0.1

R = none
M = yes
Maintenance {yes, no}
R       yes     no
none    0.4     0.6
light   0.2     0.8
heavy   0.1     0.9

R = none
M = yes
T = on time
Train {on time, delayed}
R       M       on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5

R = none
M = yes
T = on time
A = attend
Appointment {attend, miss}
T         attend   miss
on time   0.9      0.1
delayed   0.6      0.4

R = light, M = no,  T = on time, A = miss
R = light, M = yes, T = delayed, A = attend
R = none,  M = no,  T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = none,  M = yes, T = on time, A = attend
R = heavy, M = no,  T = delayed, A = miss
R = light, M = no,  T = on time, A = attend

P(Train = on time) ?

P(Rain = light | Train = on time) ?

Rejection Sampling
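
Both questions above can be estimated by drawing many samples from the network and, for the conditional query, discarding the samples that do not match the evidence. The following is a sketch using only the standard library; `pick`, `sample`, and `rejection_sample` are hypothetical helper names, and the seed is arbitrary.

```python
import random

P_RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_MAINT = {
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}
P_TRAIN = {
    ("none", "yes"): {"on time": 0.8, "delayed": 0.2},
    ("none", "no"): {"on time": 0.9, "delayed": 0.1},
    ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
    ("light", "no"): {"on time": 0.7, "delayed": 0.3},
    ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
    ("heavy", "no"): {"on time": 0.5, "delayed": 0.5},
}
P_APPT = {
    "on time": {"attend": 0.9, "miss": 0.1},
    "delayed": {"attend": 0.6, "miss": 0.4},
}


def pick(dist, rng):
    """Draw one value from a {value: probability} dictionary."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]


def sample(rng):
    """Sample every variable once, parents before children."""
    rain = pick(P_RAIN, rng)
    maint = pick(P_MAINT[rain], rng)
    train = pick(P_TRAIN[(rain, maint)], rng)
    appt = pick(P_APPT[train], rng)
    return {"rain": rain, "maintenance": maint, "train": train, "appointment": appt}


def rejection_sample(query, evidence, n, rng):
    """Estimate P(query | evidence) from the samples that match the evidence."""
    counts = {}
    for _ in range(n):
        s = sample(rng)
        if all(s[var] == val for var, val in evidence.items()):
            counts[s[query]] = counts.get(s[query], 0) + 1
    kept = sum(counts.values())
    return {value: c / kept for value, c in counts.items()}


rng = random.Random(0)
print(rejection_sample("train", {}, 10_000, rng))                   # P(Train)
print(rejection_sample("rain", {"train": "on time"}, 10_000, rng))  # P(Rain | Train = on time)
```

The estimates vary from run to run and approach the exact values as the number of samples grows. When the evidence is unlikely, most samples are rejected and the work spent on them is wasted.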

Likelihood Weighting

Likelihood Weighting
• Start by fixing the values for evidence variables.
• Sample the non-evidence variables using conditional probabilities in the Bayesian Network.
• Weight each sample by its likelihood: the probability of all of the evidence.

P(Rain = light | Train = on time) ?

[Figure: the network with nodes Rain, Maintenance, Train, Appointment]

R = light
T = on time
Rain {none, light, heavy}
none    light   heavy
0.7     0.2     0.1

R = light
M = yes
T = on time
Maintenance {yes, no}
R       yes     no
none    0.4     0.6
light   0.2     0.8
heavy   0.1     0.9

R = light
M = yes
T = on time
A = attend
Appointment {attend, miss}
T         attend   miss
on time   0.9      0.1
delayed   0.6      0.4

R = light
M = yes
T = on time
A = attend
Train {on time, delayed}
R       M       on time   delayed
none    yes     0.8       0.2
none    no      0.9       0.1
light   yes     0.6       0.4
light   no      0.7       0.3
heavy   yes     0.4       0.6
heavy   no      0.5       0.5
Weight of this sample: P(T = on time | R = light, M = yes) = 0.6
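
The three steps of likelihood weighting can be sketched for the query P(Rain | Train = on time). The evidence is hard-coded for brevity, and `pick`, `weighted_sample`, and `likelihood_weighting` are illustrative names rather than part of any library.

```python
import random

P_RAIN = {"none": 0.7, "light": 0.2, "heavy": 0.1}
P_MAINT = {
    "none": {"yes": 0.4, "no": 0.6},
    "light": {"yes": 0.2, "no": 0.8},
    "heavy": {"yes": 0.1, "no": 0.9},
}
P_TRAIN = {
    ("none", "yes"): {"on time": 0.8, "delayed": 0.2},
    ("none", "no"): {"on time": 0.9, "delayed": 0.1},
    ("light", "yes"): {"on time": 0.6, "delayed": 0.4},
    ("light", "no"): {"on time": 0.7, "delayed": 0.3},
    ("heavy", "yes"): {"on time": 0.4, "delayed": 0.6},
    ("heavy", "no"): {"on time": 0.5, "delayed": 0.5},
}
P_APPT = {
    "on time": {"attend": 0.9, "miss": 0.1},
    "delayed": {"attend": 0.6, "miss": 0.4},
}


def pick(dist, rng):
    """Draw one value from a {value: probability} dictionary."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]


def weighted_sample(rng, train="on time"):
    """Fix the evidence Train, sample the rest, and return (sample, weight)."""
    rain = pick(P_RAIN, rng)
    maint = pick(P_MAINT[rain], rng)
    # Train is evidence: it is not sampled. Its likelihood is the weight.
    weight = P_TRAIN[(rain, maint)][train]
    appt = pick(P_APPT[train], rng)
    s = {"rain": rain, "maintenance": maint, "train": train, "appointment": appt}
    return s, weight


def likelihood_weighting(query, n, rng):
    """Estimate P(query | Train = on time) from weighted samples."""
    totals = {}
    for _ in range(n):
        s, weight = weighted_sample(rng)
        totals[s[query]] = totals.get(s[query], 0.0) + weight
    norm = sum(totals.values())
    return {value: t / norm for value, t in totals.items()}


print(likelihood_weighting("rain", 10_000, random.Random(0)))
```

Unlike rejection sampling, every sample contributes to the estimate; samples that make the evidence unlikely simply count for less.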

Uncertainty over Time

Xt: Weather at time t

Markov assumption
the assumption that the current state depends on only a finite fixed number of previous states

Markov Chain

Markov chain
a sequence of random variables where the distribution of each variable follows the Markov assumption

Transition Model
                 Tomorrow (Xt+1)
Today (Xt)       0.8     0.2
                 0.3     0.7

X0 → X1 → X2 → X3 → X4
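
A Markov chain with this transition model can be simulated one step at a time. In the sketch below the state names "sun" and "rain" are assumptions: the transition table gives only the four numbers, so which row corresponds to which weather is a guess made for illustration. `sample_chain` and `step` are hypothetical helper names.

```python
import random

# TRANSITION[today][tomorrow]; the state labels are assumed, not from the table.
TRANSITION = {
    "sun": {"sun": 0.8, "rain": 0.2},
    "rain": {"sun": 0.3, "rain": 0.7},
}


def sample_chain(start, length, rng):
    """Sample X0, X1, ..., each state depending only on the previous one."""
    states = [start]
    for _ in range(length - 1):
        nxt = TRANSITION[states[-1]]
        states.append(rng.choices(list(nxt), weights=list(nxt.values()))[0])
    return states


def step(dist):
    """Push a distribution over today's state one day forward."""
    return {
        tomorrow: sum(dist[today] * TRANSITION[today][tomorrow] for today in dist)
        for tomorrow in TRANSITION
    }


print(sample_chain("sun", 5, random.Random(0)))

after_one = step({"sun": 1.0, "rain": 0.0})
after_two = step(after_one)
print({s: round(p, 2) for s, p in after_two.items()})  # {'sun': 0.7, 'rain': 0.3}
```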

Sensor Models

Hidden State         Observation
robot’s position     robot’s sensor data
words spoken         audio waveforms
user engagement      website or app analytics
weather              umbrella

Hidden Markov Models

Hidden Markov Model
a Markov model for a system with hidden states that generate some observed event

Sensor Model
                 Observation (Et)
State (Xt)       0.2     0.8
                 0.9     0.1

sensor Markov assumption
the assumption that the evidence variable depends only on the corresponding state

X0 X1 X2 X3 X4
E0 E1 E2 E3 E4

Task                       Definition
filtering                  given observations from start until now, calculate distribution for current state
prediction                 given observations from start until now, calculate distribution for a future state
smoothing                  given observations from start until now, calculate distribution for a past state
most likely explanation    given observations from start until now, calculate most likely sequence of states
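
Of these tasks, filtering is the simplest to sketch: alternate between predicting one step ahead with the transition model and reweighting by the sensor model. Several things here are assumptions made for illustration: the state names "sun" and "rain", the observation names "umbrella" and "no umbrella", the assignment of table rows to those names, and the uniform starting distribution. `filter_states` is a hypothetical helper name.

```python
# Assumed labels for the transition and sensor tables.
TRANSITION = {
    "sun": {"sun": 0.8, "rain": 0.2},
    "rain": {"sun": 0.3, "rain": 0.7},
}
SENSOR = {
    "sun": {"umbrella": 0.2, "no umbrella": 0.8},
    "rain": {"umbrella": 0.9, "no umbrella": 0.1},
}


def normalize(values):
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


def predict(belief):
    """One transition step: distribution over the next state."""
    return {
        nxt: sum(belief[cur] * TRANSITION[cur][nxt] for cur in belief)
        for nxt in TRANSITION
    }


def filter_states(observations, prior):
    """Return P(X_t | e_0, ..., e_t) for the last time step t."""
    belief = None
    for obs in observations:
        # The first observation E0 belongs to X0, so no transition yet.
        predicted = prior if belief is None else predict(belief)
        belief = normalize({s: SENSOR[s][obs] * predicted[s] for s in predicted})
    return belief if belief is not None else dict(prior)


prior = {"sun": 0.5, "rain": 0.5}  # assumed starting distribution
belief = filter_states(["umbrella", "umbrella", "no umbrella"], prior)
print({s: round(p, 3) for s, p in belief.items()})
```

Prediction reuses `predict` without a new observation, while the most likely explanation needs a different algorithm (commonly the Viterbi algorithm) that tracks the best path into each state instead of a sum.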
