CM3112 Artificial Intelligence
Knowledge and reasoning: Constructing Bayesian networks
Steven Schockaert
SchockaertS1@cardiff.ac.uk
School of Computer Science & Informatics Cardiff University
Construction of Bayesian networks
Construction algorithm
109 109 ↵ = 1715+285 = 2000
1) Choose an ordering of the random variable9s X1,…,Xn
P (john | burglary, earthquake) = 10 · 1715 · 10 9 = 1715
2) For each i = 1,…,n: select a minimal subset of parents for X among
X1,…Xi-1 such that
2000 i 2000 P(Xi = xi |X1 = x1,…,Xi 1 = xi 1)
=P(Xi =xi|{Xj =xj|Xj isaparentofXi}) A good ordering is important to ensure a compact network
The choice of parents reflects which assumptions we make on conditional independence
MaryCalls
JohnCalls
Alarm
Burglary
)? No
Earthquake
Construction of Bayesian networks
Example
Suppose we take the ordering: mary, john, alarm, burglary, earthquake
ose the ordering M, J, A, B, E
J
P(A|J)? P(A|J,M) = P(A)? No
=P(B|A)? Yes =P(B)? No
·
m
J?o
P ? P(A| (
=Y =P(B)? No
alarm mary
= ·
alarm mary
P (john|alarm⇥) · P (mary⇥|alarm⇥) · P (alar Construction of Bayesian networks
· P (burglary) · P (earthquake) Suppose we take the ordering: mary, john, alarm, burglary, earthquake
Example
ose the ordering M, J, A, B, E
⌅x⇥1 , …, x⇥i+1 . P (x⇥i |parents⇥ (xi )) = P (x⇥i |x⇥1 , …, x⇥i P (john|mary) ⇤= P (john)
7
) (
MaryCalls
JohnCalls
Burgla
Alarm
ry
Ear
J, M ) = P es
N
A|J ) P(B|A)?
thquake
A)? No
1715 + 285 2000
=
i
J PA
=P
=P(B)? No
109 9 P (john | burglary, earthquake) = 2000 · 1715 · 10
Construction of Bayesian networks
Example
P(Xi = xi |X1 = x1,…,Xi 1 = xi 1) =P(Xi =xi|{Xj =xj|Xj isaparentofX
P (alarm|john, mary) 6= P (alarm)
P (alarm|john, mary) 6= P (alarm|john) P (alarm|john, mary) 6= P (alarm|mary)
Suppose we take the ordering: mary, john, alarm, burglary, earthquake
ose the ordering M, J, A, B, E
)? (
MaryCalls
JohnCalls Alarm
Burglary
No
|J)? P(A|J,M) = P(A)? No
(B|A)? Yes
Earthquake
Construction of Bayesian networks
Example
Suppose we take the ordering: mary, john, alarm, burglary, earthquake
ose the ordering M, J, A, B, E
P (alarm|john, mary) = P (alarm) P (alarm|john, mary) = P (john) P (alarm|john, mary) = P (mary)
P (burglary|alarm, john, mary) = P (burglary)
P (burglary|alarm, john, mary) = P (burglary|alarm)
o
MaryCalls
JohnCalls Alarm
Burglary
Earthquake
) = P(A)?
)? No
(A|J)? P(A|J,
P(B|A)? Ye
J PMN
=s =P(B)? No
Construction of Bayesian networks
Example
P (alarm|john, mary) = P (alarm) P (alarm|john, mary) = P (john)
P (alarm|john, mary) = P (mary)
P (burglary|alarm, john, mary) = P (burglary)
P (burglary|alarm, john, mary) = P (burglary|alarm)
Suppose we take the ordering: mary, john, alarm, burglary, earthquake
ose the ordering M, J, A, B, E MaryCalls
JohnCalls
Alarm
P (earthquake|burglary, alarm, john, mary) = P (earthquake)
P (earthquake|burglary, alarm, john, mary) = P (earthquake|burglary)
P (earthquake|burglary, alarm, john, mary) = P (earthquake|alarm)
P (earthquake|burglary, alarm, john, mary) = P (earthquake|burglary, alarm)
Burglary
)
Earthquake
)=P(A)? No
? No
A|J)? P(A|J,
P(B|A)? Yes
J
P( M
=
=P(B)? No
ppose we choose the ordering M, J, A, B, E
Construction of Bayesian networks
choose the ordering M, J, A, B, E
MaryCalls
Example
JohnCalls
Eplecontd.
J|M)
A|J, M
B|A,J
F F
T T
Alarm
Burglary
Earthquake
B
E P(A|B,E)
T .95 F .94
Ear
thquake
T .29 F .001
A P(J|A)
T .90 F .05
P(E)
.002
A P(
T. F.
BA( (
P (A)? No JohnCalls
MaryCalls
|
MaryCalls
Alarm
JohnCalls
,M)=P(B|A)? Yes
Burglary Alarm
=P(J)? No
) = P(A|J)? P(A|J,M) =
Burglary
, J, M ) = P
B)? No
Earthquake
J) |
? No
,A,J,M) =
P| B,A,
P(E|A)? No
E| J,M)=P(E|A,B)? Yes
EB
= (A J)? P(A|J,M) = P(A)? No
)=P(B|A)? Yes )=P(B)? No
Usually a more compact network is obtained by choosing the ordering
,M)=P(E|A)? No
of the random variables in accordance with the causal relations
,M)=P(E|A,B)? Yes
Chapt
xam
P(B)
.001
Chapter 14.1–3 17
u
(
(
(
M
7
(0
e
P
( (
M M J
J
Exercise
John spends 60% of his work time in his office and 40% of his work time in meetings. When he is in his office, he puts on the light 90% of the time. When he goes to a meeting, he leaves the light on 20% of the time. When he is in his office, he is logged onto his computer 80% of the time. When he is in a meeting, he is logged onto his computer (remotely) 10% of the time.
Construct a Bayesian network to represent this scenario
Evaluate the probability that the light is on in John’s office, given the evidence that he is logged onto his computer
Construction from data
Consider the probability distribution encoded in the following table
c
¬c
a
d b 3/128
¬d
9/128
d
1/128
¬d
3/128
¬b
3/128
9/128
18/128
18/128
¬a
b 2/128
6/128
2/128
6/128
¬b
6/128
18/128
12/128
12/128
We want to encode this probability distribution as a Bayesian network, starting from the ordering A,B,C,D
Construction from data P(a) = (3+9+1+3+3+9+18+18) / 128 = 1/2
P(A)
1/2
A
c
¬c
d
¬d
d
¬d
a
b
3/128
9/128
1/128
3/128
¬b
3/128
9/128
17/128
17/128
¬a
b
2/128
6/128
2/128
6/128
¬b
6/128
18/128
12/128
12/128
Construction from data
P(b) = (3+9+1+3+2+6+2+6) / 128 = 1/4
P(b|a) = (3+9+1+3) / P(b|¬a) = (2+6+2+6) /
P(B|A) = P(B)
P(A)
1/2
P(B)
1/4
(3+9+1+3+3+9+18+18)
= 1/4
= 1/4
(2+6+2+6+6+18+12+12)
A
B
c
¬c
d
¬d
d
¬d
a
b
3/128
9/128
1/128
3/128
¬b
3/128
9/128
18/128
18/128
¬a
b
2/128
6/128
2/128
6/128
¬b
6/128
18/128
12/128
12/128
Construction from data
P(c|a) = (3+9+3+9) / (3+9+1+3+3+9+18+18) = 3/8
P(c|b) = (3+9+2+6) / 5/8
P(c|a,b) = (3+9) / (3+9+1+3) = 3/4 P(c|a,¬b) = (3+9) / (3+9+18+18) = 1/4
P(c|¬a,b) = (2+6) / (2+6+2+6) = 1/2 P(c|¬a,¬b) = (6+18) / (6+18+12+12) = 1/2
P(C|A,B) ≠ P(C|A) P(C|A,B) ≠ P(C|B)
(3+9+1+3+2+6+2+6) =
P(A)
1/2
P(B)
1/4
A
B
P(C|A,B)
T
T
3/4
T
F
1/4
F
T
1/2
F
F
1/2
A
C
B
c
¬c
d
¬d
d
¬d
a
b
3/128
9/128
1/128
3/128
¬b
3/128
9/128
18/128
18/128
¬a
b
2/128
6/128
2/128
6/128
¬b
6/128
18/128
12/128
12/128
Construction from data
P(d|b,c) = (3+2)/(3+9+2+6) = 1/4
P(d|b,¬c) = (1+2)/(1+3+2+6) = 1/4 P(d|¬b,c) = (3+6)/(3+9+6+18) = 1/4 P(d|¬b,¬c) = (18+12)/(18+18+12+12) = 1/2
P(d|a,b,c) = 3/(3+9) = 1/4 P(d|a,b,¬c) = 1/(1+3) = 1/4
P(d|a,¬b,c) = 3/(3+9) = 1/4
P(d|a,¬b,¬c) = 18/(18+18) = 1/2 P(d|¬a,b,c) = 2 / (2+6) = 1/4 P(d|¬a,b,¬c) = 2/(2+6) = 1/4 P(d|¬a,¬b,c) = 6 / (6+18) = 1/4 P(d|¬a,¬b,¬c) = 12/ (12+12) = 1/2
P(D|A,B,C) = P(D|B,C) P(D|A,B,C) ≠ P(D|B)
P(D|A,B,C) ≠ P(D|C)
A
C
D
P(A)
1/2
P(B)
1/4
A
B
P(C|A,B)
T
T
3/4
T
F
1/4
F
T
1/2
F
F
1/2
B
B
C
P(D|B,C)
T
T
1/4
T
F
1/4
F
T
1/4
F
F
1/2
Exercise
Can the DAG on the right be the structure of a Bayesian network which encodes the probability distribution on the left?
A
BC D
c
¬c
d
¬d
d
¬d
a
b
16/82
4/82
4/82
4/82
¬b
4/82
1/82
4/82
4/82
¬a
b
4/82
4/82
1/82
4/82
¬b
4/82
4/82
4/82
16/82