COMP 424 – Artificial Intelligence Bayesian Networks 2
Instructor: Jackie CK Cheung. Readings: R&N Ch 14
Final Project
• Colosseum Survival
• Please get started on the project now!
How can we speed up (exact) inference with Bayes nets?
Variable elimination algorithm
• Sum up probabilities in an efficient way
Bayes Ball
• Figure out which variables are conditionally independent given other variables
Inference in BNs: Leveraging structure
[Figure: the alarm network over Tampering (T), Fire (F), Alarm (A), Smoke (S), Leaving (L), with CPTs P(T), P(F), P(A | T, F), P(S | F), P(L | A)]
Recall: entries in the full joint probability, e.g., P(~T, F, A, S, L)?
P(~T, F, A, S, L) = P(~T) P(F) P(A | ~T, F) P(S | F) P(L | A)
= 0.98 x …
Say we want: P(T | L=1) = P(T, L=1) / P(L=1)
Inference in BNs: Leveraging structure
Naïve approach:
Start with
P(T | L=1) = P(T, L=1) / P(L=1)
P(L=1) = P(T=1, L=1) + P(T=0, L=1)
P(T=1, L=1) = ∑a,s,f P(A=a, S=s, F=f, T=1, L=1)
            = ∑a,s,f P(s | f) P(f) P(a | f, T=1) P(L=1 | a) P(T=1)
P(T=0, L=1) = ∑a,s,f P(A=a, S=s, F=f, T=0, L=1)
            = ∑a,s,f P(s | f) P(f) P(a | f, T=0) P(L=1 | a) P(T=0)
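To make the naïve approach concrete, here is a minimal brute-force enumeration sketch in Python. The network structure is the one on these slides (T, F, A, S, L); apart from P(~T) = 0.98, which appears above, every CPT number is a made-up placeholder.

from itertools import product

# Hypothetical CPTs; only P(T=1) = 0.02 (i.e., P(~T) = 0.98) comes from the slides.
P_T = {1: 0.02, 0: 0.98}                                       # P(T=t)
P_F = {1: 0.01, 0: 0.99}                                       # P(F=f)
P_A = {(1, 1): 0.95, (1, 0): 0.85, (0, 1): 0.90, (0, 0): 0.01} # P(A=1 | T=t, F=f)
P_S = {1: 0.90, 0: 0.05}                                       # P(S=1 | F=f)
P_L = {1: 0.88, 0: 0.02}                                       # P(L=1 | A=a)

def joint(t, f, a, s, l):
    """P(T=t, F=f, A=a, S=s, L=l), using the factorization shown above."""
    pa = P_A[(t, f)] if a == 1 else 1 - P_A[(t, f)]
    ps = P_S[f] if s == 1 else 1 - P_S[f]
    pl = P_L[a] if l == 1 else 1 - P_L[a]
    return P_T[t] * P_F[f] * pa * ps * pl

# Naive inference: sum the joint over every hidden variable (a, s, f).
def p_T_L1(t):
    return sum(joint(t, f, a, s, 1) for f, a, s in product((0, 1), repeat=3))

p1, p0 = p_T_L1(1), p_T_L1(0)
print("P(T=1 | L=1) =", p1 / (p1 + p0))   # P(T, L=1) normalized over T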
Inference in BNs: Leveraging structure
A better solution:
• Re-arrange the sums:
P(T, L=1) = ∑a,s,f P(s | f) P(f) P(a | f, T) P(L=1 | a) P(T)
          = ∑a,f P(f) P(a | f, T) P(L=1 | a) P(T) ∑s P(s | f)
• Replace: ∑s P(s | f) = mS(f)   (Note that mS(f) = 1 here, but ignore that for now.)
• Now we have:
P(T, L=1) = ∑a,f P(f) P(a | f, T) P(L=1 | a) P(T) mS(f)
• Repeat with the other hidden variables (A, F).
Instead of O(2^n) terms, we only have to sum over O(2^k n) terms, where k is the maximum number of variables in any intermediate factor.
Basic idea of variable elimination
1. Impose an ordering over variables, with the query variable coming last.
2. Maintain a list of “factors”, which depend on given variables.
3. Sum over variables in the order in which they appear in the list.
4. Memorize the result of intermediate computations.
This is a kind of dynamic programming.
A bit of notation
• Let Xi be an evidence variable with observed value xi.
• Let the evidence potential be an indicator function:
δ(xj, xi) = 1 if and only if xj = xi.
This way we can turn conditionals into sums, e.g.
P(s | F=1) = ∑f P(s | f) δ(f, 1)
• This is convenient for notation, but not effective as a
practical implementation.
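A minimal Python illustration of this idea (variable names are mine, and the P(S | F) numbers are made up): the indicator δ turns "plug in the observed value" into an ordinary sum over all values, so evidence can be handled as just another factor.

def delta(xj, xi):
    """Evidence potential: 1 iff xj equals the observed value xi, else 0."""
    return 1.0 if xj == xi else 0.0

# Hypothetical table for P(S=s | F=f), keyed by (s, f).
P_S_given_F = {(1, 1): 0.90, (0, 1): 0.10,
               (1, 0): 0.05, (0, 0): 0.95}

# P(s | F=1) = sum_f P(s | f) * delta(f, 1): the indicator keeps only f = 1.
def p_s_given_F1(s):
    return sum(P_S_given_F[(s, f)] * delta(f, 1) for f in (0, 1))

print(p_s_given_F1(1))   # 0.90, i.e., P(S=1 | F=1)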
Variable elimination algorithm
1. Pick a variable ordering with Y (query variable) at the end of the list.
2. Initialize the active factor list with the conditional probability distributions (tables) in the Bayes net.
3. Add to the active factor list the evidence potentials δ(ej, ei) for all evidence variables E.
4. For i=1..n
1. Take the next variable Xi from the ordering.
2. Take all the factors that have Xi as an argument off the active factor list, and multiply them, then sum over all values of Xi, creating a new factor mXi
3. Put mXi on the active factor list.
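A compact sketch of the algorithm above in Python (my own naming, not code supplied with the course; binary variables assumed). A factor is a pair (variables, table), where the table maps each assignment of those variables to a number; eliminating X means multiplying together every active factor that mentions X and summing X out.

from itertools import product

# A factor is (vars, table): vars is a tuple of variable names, and table maps
# each assignment of those variables (a tuple of 0/1 values) to a number.

def multiply(f1, f2):
    """Pointwise product of two factors (all variables assumed binary)."""
    (v1, t1), (v2, t2) = f1, f2
    vs = v1 + tuple(v for v in v2 if v not in v1)
    table = {}
    for vals in product((0, 1), repeat=len(vs)):
        assign = dict(zip(vs, vals))
        table[vals] = (t1[tuple(assign[v] for v in v1)] *
                       t2[tuple(assign[v] for v in v2)])
    return (vs, table)

def sum_out(f, var):
    """Sum a factor over all values of one of its variables."""
    vs, t = f
    keep = tuple(v for v in vs if v != var)
    table = {}
    for vals, p in t.items():
        key = tuple(val for v, val in zip(vs, vals) if v != var)
        table[key] = table.get(key, 0.0) + p
    return (keep, table)

def eliminate(factors, order):
    """Variable elimination: eliminate the variables in `order`, then return
    the product of the remaining factors (a factor over the query variable)."""
    active = list(factors)
    for x in order:
        with_x = [f for f in active if x in f[0]]      # factors mentioning x
        if not with_x:
            continue
        active = [f for f in active if x not in f[0]]  # take them off the list
        prod = with_x[0]
        for f in with_x[1:]:
            prod = multiply(prod, f)
        active.append(sum_out(prod, x))                # the new factor m_x
    result = active[0]
    for f in active[1:]:
        result = multiply(result, f)
    return result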
Example
To calculate: P(T | L=1) = P(T, L=1) / P(L=1)
1. Pick a variable ordering: S, F, L, A, T
2. Initialize the active factor list and introduce the evidence:
   List: P(S|F), P(F), P(T), P(A|F,T), P(L|A), δ(L,1)
3. Eliminate S: replace P(S|F) in the list by computing mS(F) = ∑s P(s | F)
   List: P(F), P(T), P(A|F,T), P(L|A), δ(L,1), mS(F)
Example (cont'd)
List: P(F), P(T), P(A|F,T), P(L|A), δ(L,1), mS(F)
4. Eliminate F: mF(A,T) = ∑f P(f) P(A|f,T) mS(f)
   List: P(T), P(L|A), δ(L,1), mF(A,T)
5. Eliminate L: mL(A) = ∑l P(l|A) δ(l,1)
   List: P(T), mF(A,T), mL(A)
6. Eliminate A: mA(T) = ∑a mF(a,T) mL(a)
   List: P(T), mA(T)
7. Compute the answer for T=0 and T=1: P(T=0) mA(T=0) and P(T=1) mA(T=1).
This is the (unnormalized) answer we are looking for! Normalizing these two numbers gives P(T | L=1).
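Continuing the sketch above (it reuses the multiply, sum_out and eliminate helpers, and the same made-up CPT numbers as the earlier enumeration example), the worked example becomes: put the CPTs and the evidence potential δ(L, 1) on the factor list, eliminate S, F, L, A in that order, and normalize what is left over T.

# CPTs written as factors; the numbers are illustrative guesses, not course data.
P_T   = (('T',), {(1,): 0.02, (0,): 0.98})
P_F   = (('F',), {(1,): 0.01, (0,): 0.99})
P_AFT = (('A', 'F', 'T'),
         {(a, f, t): (p if a == 1 else 1 - p)
          for (t, f), p in {(1, 1): 0.95, (1, 0): 0.85,
                            (0, 1): 0.90, (0, 0): 0.01}.items()
          for a in (0, 1)})                        # P(A=a | F=f, T=t)
P_SF  = (('S', 'F'), {(s, f): (p if s == 1 else 1 - p)
         for f, p in {1: 0.90, 0: 0.05}.items() for s in (0, 1)})
P_LA  = (('L', 'A'), {(l, a): (p if l == 1 else 1 - p)
         for a, p in {1: 0.88, 0: 0.02}.items() for l in (0, 1)})
delta_L1 = (('L',), {(1,): 1.0, (0,): 0.0})        # evidence potential δ(L, 1)

factors = [P_SF, P_F, P_T, P_AFT, P_LA, delta_L1]
m = eliminate(factors, order=['S', 'F', 'L', 'A'])  # a factor over T, ∝ P(T, L=1)
z = m[1][(0,)] + m[1][(1,)]                         # = P(L=1)
print("P(T=1 | L=1) =", m[1][(1,)] / z)             # matches the brute-force value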
Complexity of variable elimination
• We need at most O(n) multiplications to create an entry in a factor (where n is the total number of variables).
• If m is the maximum number of values that a variable can take, a factor depending on k variables will have O(mk) entries.
• So it is important to have small factors. But the size of the factors depends on the ordering of the variables!
• Choosing an optimal ordering is NP-complete.
Strategy #2: DAGs and independence
Given a graph G, what independence assumptions are implied?
DAGs and independencies
Absence of links implies certain conditional independence relationships.
• Are there other independencies or conditional independencies between variables?
• Are A and S independent?
• Is S conditionally indep. of L and R, given F?
What variables are independent, or conditionally independent, in general?
• Can be read off the graph with the right tools!
• Why do we care? Faster inference!
• E.g., suppose we want to know P(S | A)
  • Do we need to sum over all values of T, L, R?
  • Skip variables that are conditionally indep. of S given A!
Implied independencies
• Given evidence for variable Q, what can we say about the sets of variables P and R?
• If we get information about P, does this change our belief about R?
• If P and R are not directly connected, and having information about P gives information about R, then it must be because it gives information about the variables along the path from P to R.
Three types of connections
Indirect connection
• Interpret lack of edge between X and Z as conditional independence:
  P(Z | X, Y) = P(Z | Y). Is this justified?
• Based on the graph structure, we have:
  P(X, Y, Z) = P(X) P(Y | X) P(Z | Y)
  P(Z | X, Y) = P(X, Y, Z) / P(X, Y)
              = P(X) P(Y | X) P(Z | Y) / [P(X) P(Y | X)]
              = P(Z | Y)
• So Z is independent of X whenever the value of Y is known.
  E.g., X = Travel, Y = Bird flu, Z = Fever
• On the other hand, Z is not independent of X if Y is not known.
  (Can check that P(Z | X) does not simplify.)
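A quick numeric sanity check of this case (a sketch with made-up numbers for the chain X → Y → Z): P(Z | X, Y) matches P(Z | Y) whatever CPTs we pick, while P(Z | X) genuinely depends on X when Y is unknown.

# Chain X -> Y -> Z with arbitrary made-up CPTs; all variables are binary.
pX = 0.3                              # P(X=1)
pY = {1: 0.8, 0: 0.1}                 # P(Y=1 | X=x)
pZ = {1: 0.7, 0: 0.2}                 # P(Z=1 | Y=y)

def joint(x, y, z):
    px = pX if x == 1 else 1 - pX
    py = pY[x] if y == 1 else 1 - pY[x]
    pz = pZ[y] if z == 1 else 1 - pZ[y]
    return px * py * pz

def p_z1(**given):
    """P(Z=1 | given) by brute-force enumeration; given fixes x and/or y."""
    def ok(x, y):
        return all({'x': x, 'y': y}[k] == v for k, v in given.items())
    num = sum(joint(x, y, 1) for x in (0, 1) for y in (0, 1) if ok(x, y))
    den = sum(joint(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)
              if ok(x, y))
    return num / den

print(p_z1(x=1, y=1), p_z1(x=0, y=1))  # equal: Z is independent of X given Y
print(p_z1(x=1), p_z1(x=0))            # differ: Z depends on X when Y is unknown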
Common cause
• Again, interpret lack of edge between X and Z as conditional independence given Y. Is this true?
  P(Z | X, Y) = P(X, Y, Z) / P(X, Y)
              = P(Y) P(X | Y) P(Z | Y) / [P(Y) P(X | Y)]
              = P(Z | Y)
• This is a hidden variable scenario: if Y is unknown then X and Z can appear dependent on each other.
  E.g., Y = Bronchitis, X = Cough, Z = Fever
• On the other hand, Z is not independent of X if Y is not known.
  (Can check that P(Z | X) does not simplify.)
V-structure
• Interpret lack of an edge between X and Z as statement of marginal independence.
• In this case, when given Y, X is not independent of Z.
(You can check that P(Z | X, Y) does not simplify.)
• Same argument if a child of Y is given!
• This is a case of explaining away when there are multiple competing explanations.
E.g., X = Bird flu, Z = Cold, Y = Fever
Explaining away (cont'd)
• Tampering and Fire are independent variables (not the same as mutually exclusive! They can both be True, for example).
• The agent hears the Alarm sound. It could be due to Tampering and/or Fire.
  P(F | A) > P(F) ; P(T | A) > P(T)
• The agent then gets another piece of evidence: it sees Smoke.
  P(F | A, S) > P(F | A)
  P(T | A, S) < P(T | A)
  P(T | A, S) > P(T)
• If instead there was some additional evidence directly related to Tampering, what happens to our belief about Fire, as compared to just having heard the Alarm?
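A small numeric illustration of explaining away on this network, reusing the made-up CPT numbers from the earlier sketches (Leaving is irrelevant here, so it is left out); only the qualitative inequalities matter, not the particular values.

from itertools import product

P_T  = {1: 0.02, 0: 0.98}                                        # P(T=t)
P_F  = {1: 0.01, 0: 0.99}                                        # P(F=f)
P_A1 = {(1, 1): 0.95, (1, 0): 0.85, (0, 1): 0.90, (0, 0): 0.01}  # P(A=1 | T, F)
P_S1 = {1: 0.90, 0: 0.05}                                        # P(S=1 | F)

def joint(t, f, a, s):
    pa = P_A1[(t, f)] if a == 1 else 1 - P_A1[(t, f)]
    ps = P_S1[f] if s == 1 else 1 - P_S1[f]
    return P_T[t] * P_F[f] * pa * ps

def prob(query, evidence):
    """P(query | evidence) by enumeration; both are dicts over 't','f','a','s'."""
    def mass(constraints):
        return sum(joint(t, f, a, s)
                   for t, f, a, s in product((0, 1), repeat=4)
                   if all({'t': t, 'f': f, 'a': a, 's': s}[k] == v
                          for k, v in constraints.items()))
    return mass({**evidence, **query}) / mass(evidence)

print(prob({'f': 1}, {}), prob({'f': 1}, {'a': 1}), prob({'f': 1}, {'a': 1, 's': 1}))
print(prob({'t': 1}, {}), prob({'t': 1}, {'a': 1}), prob({'t': 1}, {'a': 1, 's': 1}))
# With these numbers: P(F|A,S) > P(F|A), while P(T|A,S) < P(T|A) but still > P(T).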
Summary of the three cases
Case 1: X → Y → Z (indirect connection); Case 2: X ← Y → Z (common cause); Case 3: X → Y ← Z (v-structure)
• Cases 1 & 2: path between X and Z is:
• open if Y is unknown.
• blocked if Y is known.
• Case 3: path between X and Z is:
• blocked if Y is unknown.
• open if Y is known.
Bayes Ball algorithm
https://www.cs.nyu.edu/~roweis/csc412-2004/notes/lec2x.pdf
Boundary Cases and Explaining Away
• Boundary cases
• Case b) above means that explaining away happens if Y or any of its descendants is shaded (i.e., observed).
Summary of rules
Bayes Ball Example 1
Bayes Ball Example 2
Bayes Ball
• Bayes Ball implies the two properties we discussed before
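For reference, here is one way to implement the Bayes Ball reachability test in Python. This is a sketch with my own naming, following the standard d-separation algorithm described in the linked notes, not code provided with the course. X is conditionally independent of Y given Z exactly when no "ball" can travel from X to Y.

from collections import deque

def d_separated(x, y, given, parents):
    """Bayes Ball / d-separation test.

    parents: dict mapping each node to the list of its parents (defines the DAG).
    Returns True iff x is conditionally independent of y given the set `given`,
    according to the graph structure alone.
    """
    children = {v: [] for v in parents}
    for v, ps in parents.items():
        for p in ps:
            children[p].append(v)

    given = set(given)
    # Evidence nodes and all their ancestors: needed for the v-structure
    # ("explaining away") rule, where an observed descendant opens the path.
    anc = set()
    frontier = list(given)
    while frontier:
        v = frontier.pop()
        if v not in anc:
            anc.add(v)
            frontier.extend(parents[v])

    # Each state is (node, direction): 'up' = ball arrived from a child,
    # 'down' = ball arrived from a parent.
    visited, queue = set(), deque([(x, 'up')])
    while queue:
        v, direction = queue.popleft()
        if (v, direction) in visited:
            continue
        visited.add((v, direction))
        if v not in given and v == y:
            return False                      # found an active path to y
        if direction == 'up' and v not in given:
            # Chain / common-cause node that is not observed: pass through.
            for p in parents[v]:
                queue.append((p, 'up'))
            for c in children[v]:
                queue.append((c, 'down'))
        elif direction == 'down':
            if v not in given:
                for c in children[v]:
                    queue.append((c, 'down'))
            if v in anc:
                # v-structure: v (or a descendant of v) is observed, so the
                # path bounces back up to v's other parents.
                for p in parents[v]:
                    queue.append((p, 'up'))
    return True

# Example: the T, F, A, S, L network from the earlier slides.
G = {'T': [], 'F': [], 'A': ['T', 'F'], 'S': ['F'], 'L': ['A']}
print(d_separated('T', 'F', [], G))       # True: marginally independent
print(d_separated('T', 'F', ['A'], G))    # False: explaining away
print(d_separated('T', 'F', ['L'], G))    # False: a descendant of A is observed
print(d_separated('S', 'L', ['F'], G))    # True: the path is blocked at F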
Summary of inference in Bayes nets
• Complexity of inference depends a lot on network’s structure.
• Inference is efficient (poly-time) for tree-structured networks.
• In the worst case, inference is NP-complete.
• Best exact inference algorithm converts network to a tree, then does exact inference on the tree.
• In practice, for large nets, approximate inference methods work much better.
What you should know
• Basic rules of probabilistic inference.
• Variable elimination
• Algorithm, complexity
• Conditional independence in a Bayes net
• Three base cases and the boundary leaf case.
• Bayes ball algorithm
Quiz: Bayes net inference
1. What is the unconditional probability of WetGrass, P(W=1)?
a) P(W=1|S,R)P(S,R)
b) P(W=1|S,R)P(S|C)P(R|C)P(C)
c) Σs,r P(W=1 | S=s, R=r)
d) Σs,r P(W=1, S=s, R=r)
e) Σs,r,c P(W=1,S=s, R=r, C=c)
2. What is the conditional probability of Rain, given that the grass is wet, P(R=1|W=1)?
a) P(R=1,W=1) / (P(W=1) + P(R=1))
b) P(R=1|W=1,C=0)+P(R=1|W=1,C=1)
c) Σs,c P(W=1, R=1, S=s, C=c)
d) Σs,c P(W=1, R=1, S=s, C=c) / P(W=1, R=r, S=s, C=c)
To-do later: Calculate the exact probability for each.
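For the to-do, a brute-force enumeration sketch, assuming the standard Cloudy/Sprinkler/Rain/WetGrass network from R&N Ch 14 (Cloudy → Sprinkler, Cloudy → Rain, Sprinkler and Rain → WetGrass). The CPT numbers below are the usual textbook ones; double-check them against the figure used in class.

from itertools import product

# Assumed CPTs (standard R&N sprinkler network -- verify against the slides).
P_C1 = 0.5                                      # P(C=1)
P_S1 = {1: 0.1, 0: 0.5}                         # P(S=1 | C=c)
P_R1 = {1: 0.8, 0: 0.2}                         # P(R=1 | C=c)
P_W1 = {(1, 1): 0.99, (1, 0): 0.90,
        (0, 1): 0.90, (0, 0): 0.00}             # P(W=1 | S=s, R=r)

def joint(c, s, r, w):
    pc = P_C1 if c == 1 else 1 - P_C1
    ps = P_S1[c] if s == 1 else 1 - P_S1[c]
    pr = P_R1[c] if r == 1 else 1 - P_R1[c]
    pw = P_W1[(s, r)] if w == 1 else 1 - P_W1[(s, r)]
    return pc * ps * pr * pw

# Q1: P(W=1), summing the joint over C, S, R.
p_w1 = sum(joint(c, s, r, 1) for c, s, r in product((0, 1), repeat=3))

# Q2: P(R=1 | W=1) = P(R=1, W=1) / P(W=1).
p_r1_w1 = sum(joint(c, s, 1, 1) for c, s in product((0, 1), repeat=2))

print("P(W=1) =", p_w1)
print("P(R=1 | W=1) =", p_r1_w1 / p_w1)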