
COMP30024 Artificial Intelligence
Week 11 Problem Sheet
Weekly Topic Review
In Bayesian inference we are often interested in P (X|e) – the posterior probability over some random variable X, given we know e, the evidence, has occurred. Bayesian networks, also known as graphical models, are a useful modelling tool for problems involving a reasonable number of random variables. In today’s tutorial, we will see how to conduct inference in the context of Bayesian networks.


Let H denote all variables outside the variable of interest, X (the 'query'), and the observed evidence e. Then the union X ∪ e ∪ H encompasses all variables in the Bayesian network. Bayes' Theorem says:
P(X|e) = (1/P(e)) P(X, e)
       = (1/P(e)) Σ_H P(X, e | H) P(H)
       = (1/P(e)) Σ_H P(X, e, H)
This says that the posterior P(X|e) can be found by computing the full joint distribution over all variables in the problem, P(X, e, H), using the Bayesian network, then marginalizing out the hidden variables (for example, using the variable elimination algorithm), and normalizing appropriately.
In summary, to solve a Bayesian inference problem in the context of Bayesian network modelling:
• Identify query (X), evidence (e) and hidden (H) variables.
• Use Bayesian network structure to factorize the joint P(X, e, H) into a product of simpler conditional distributions.
• Fix evidence variables to observed values e.
• Sum (marginalize) over remaining hidden variables H.
The last step is typically the hardest/most computationally demanding, and in this subject you will only need to understand how to perform inference schematically, for the most part.
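As a schematic illustration of these four steps, here is a minimal Python sketch for a toy chain A → B → C of boolean variables, with query A, evidence on C, and hidden variable B. All CPT numbers are made-up placeholders, not values from any problem below.

```python
# Toy chain A -> B -> C; all CPT values are illustrative placeholders.
P_A = {True: 0.3, False: 0.7}                                  # P(A)
P_B_A = {True: {True: 0.8, False: 0.2},                        # P(B | A)
         False: {True: 0.1, False: 0.9}}
P_C_B = {True: {True: 0.9, False: 0.1},                        # P(C | B)
         False: {True: 0.4, False: 0.6}}

def joint(a, b, c):
    # Step 2: factorize P(A, B, C) = P(A) P(B|A) P(C|B) using the structure.
    return P_A[a] * P_B_A[a][b] * P_C_B[b][c]

def posterior_A(c_obs):
    # Steps 1, 3, 4: query X = A, evidence e = {C = c_obs}, hidden H = {B};
    # fix the evidence, sum out B, then normalize by P(e).
    unnorm = {a: sum(joint(a, b, c_obs) for b in (True, False))
              for a in (True, False)}
    z = sum(unnorm.values())
    return {a: p / z for a, p in unnorm.items()}

print(posterior_A(True))  # P(A | C = true)
```

The same pattern (factorize, fix evidence, sum out hidden variables, normalize) applies unchanged to larger networks; only the factorization in `joint` changes.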

(Based on RN 14.15). Leo is a botanist who lives in the Bay Area. His neighbourhood is a hotspot for burglars, so his house is fitted with an alarm system. Unfortunately, the alarm is not perfectly reliable: it doesn't always trigger during a home invasion, and it may be erroneously triggered during minor earthquakes, which occur occasionally. Leo has asked his neighbours John and Mary to call him if they hear the alarm. This setup can be represented as the following Bayesian network.

[Figure 13.2 of Russell & Norvig: "A typical Bayesian network, showing both the topology and the conditional probability tables (CPTs). In the CPTs, the letters B, E, A, J, and M stand for Burglary, Earthquake, Alarm, JohnCalls, and MaryCalls, respectively." Topology: Burglary → Alarm ← Earthquake, with Alarm → JohnCalls and Alarm → MaryCalls; the CPTs give P(B), P(E), P(A|B,E), P(J|A) and P(M|A). For Boolean variables, once you know that the probability of a true value is p, the probability of false must be 1 − p, so each CPT lists only the probability of True.]

1. Explaining Away [T] Consider the above Bayesian network.

(a) How many parameters (i.e. probability values) are required to model this network? If we add another boolean-valued parent to the alarm variable, how many parameters are needed to model the alarm node?

(b) For boolean X_i, how many parameters are required to model the joint distribution p(X_1, . . . , X_n)? Is there a unique factorization of the joint distribution?

(c) Now place a Bayesian network structure on the joint p(X_1, . . . , X_n). Derive an expression, in terms of the maximum number of parents for any RV in the Bayes net, P* = max_i |Parents(X_i)|, for the number of parameters required to model the joint.

(d) If no evidence is observed, are the Burglary and Earthquake nodes independent?

(e) Assume we observe Alarm = True; are Burglary and Earthquake now independent? Justify your answer by calculating whether the probabilities involved satisfy the definition of conditional independence.

(f) (Discussion) Explain intuitively why knowledge of the Alarm random variable renders the random variables Earthquake and Burglary no longer conditionally independent.
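The "explaining away" effect behind questions (d)–(f) can be demonstrated numerically on the Burglary → Alarm ← Earthquake fragment. This is a sketch with illustrative placeholder CPT values (not the Figure 13.2 numbers); the helper names are invented here.

```python
# Placeholder CPTs for the B -> A <- E fragment (illustrative numbers only).
P_B = {True: 0.01, False: 0.99}
P_E = {True: 0.02, False: 0.98}
P_A_BE = {(True, True): 0.95, (True, False): 0.90,
          (False, True): 0.30, (False, False): 0.01}

def joint(b, e, a):
    """Factorized joint P(B, E, A) = P(B) P(E) P(A | B, E)."""
    pa = P_A_BE[(b, e)]
    return P_B[b] * P_E[e] * (pa if a else 1.0 - pa)

def p_burglary(a_obs, e_obs=None):
    """P(B = true | A = a_obs [, E = e_obs]); E is summed out if unobserved."""
    es = (True, False) if e_obs is None else (e_obs,)
    unnorm = {b: sum(joint(b, e, a_obs) for e in es) for b in (True, False)}
    return unnorm[True] / sum(unnorm.values())

print(p_burglary(True))        # P(B | a): the alarm makes burglary far likelier
print(p_burglary(True, True))  # P(B | a, e): the earthquake "explains away"
```

With these numbers, observing an earthquake sharply lowers the posterior probability of a burglary given the alarm, so B and E are not conditionally independent given A, even though they are unconditionally independent.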

2. Burglary or Earthquake? [T] When Leo gets to work in the morning, he sees that John tried to call him, but left no message. There has also been a call from Mary. Assume John and Mary did not confer before calling. Using the Bayesian network above, derive an expression for the probability of a burglary having occurred given the above information (though possible, there is no need to compute the exact numerical value here).
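Schematically, the setup follows the recipe from the topic review: query X = B, evidence e = {J = true, M = true}, hidden H = {E, A}. A sketch of the resulting expression (left symbolic, with α = 1/P(j, m) the normalizing constant) is:

P(B | j, m) = α P(B, j, m)
            = α Σ_E Σ_A P(B) P(E) P(A | B, E) P(j | A) P(m | A)
            = α P(B) Σ_E P(E) Σ_A P(A | B, E) P(j | A) P(m | A)

where the second line uses the network's factorization of the joint and the third pushes factors outside the sums they do not depend on.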
3. Inference along a Chain [T] Suppose a network has the form of a chain: a sequence of Boolean variables X_1, . . . , X_n where Parents(X_i) = {X_{i−1}} for i = 2, . . . , n.

X_1 → X_2 → X_3 → · · · → X_n    (1)

First derive an expression for the probability P(X_1 | X_n = True).
What is the time complexity of computing the inference P(X1|Xn = True) using:
(a) Enumeration?
(b) Variable Elimination? Contrast the space complexity with naïve enumeration.
(c) (Bonus [∗]) What is the complexity of variable elimination on a general Bayes net structure? Express your answer in terms of:
• The number of variables in the full joint, n.
• The maximum number of values that any variable may assume, k = max_i |X_i|.
• The size of the largest intermediate factor generated, S.
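Both approaches to the chain query can be sketched and cross-checked in Python. This is a sketch only: the chain is randomly parameterized (all CPT values are generated placeholders) and the helper names are invented for illustration.

```python
import random
from itertools import product

# P(X1 | Xn = true) on a random boolean chain, computed two ways.
random.seed(0)
n = 8  # small enough that brute-force enumeration is feasible

p1 = random.random()
prior = {True: p1, False: 1.0 - p1}   # P(X1)
cpt = []                              # cpt[i][u][v] = P(X_{i+2} = v | X_{i+1} = u)
for _ in range(n - 1):
    pt, pf = random.random(), random.random()
    cpt.append({True: {True: pt, False: 1.0 - pt},
                False: {True: pf, False: 1.0 - pf}})

def by_enumeration():
    """Sum the full joint over all 2^(n-2) hidden assignments: O(k^n) time."""
    unnorm = {}
    for x1 in (True, False):
        total = 0.0
        for hidden in product((True, False), repeat=n - 2):
            xs = (x1,) + hidden + (True,)        # fix evidence Xn = true
            p = prior[x1]
            for i in range(n - 1):
                p *= cpt[i][xs[i]][xs[i + 1]]
            total += p
        unnorm[x1] = total
    z = sum(unnorm.values())
    return {v: p / z for v, p in unnorm.items()}

def by_elimination():
    """Sweep a message right-to-left: m[v] = P(Xn = true | X_i = v).
    O(n k^2) time, O(k) space for the single intermediate factor."""
    m = {v: cpt[n - 2][v][True] for v in (True, False)}
    for i in range(n - 3, -1, -1):
        m = {v: sum(cpt[i][v][u] * m[u] for u in (True, False))
             for v in (True, False)}
    unnorm = {v: prior[v] * m[v] for v in (True, False)}
    z = sum(unnorm.values())
    return {v: p / z for v, p in unnorm.items()}
```

With k values per variable, the elimination sweep costs O(n k^2) time and stores only one factor of size k at a time, whereas enumeration visits all k^(n−2) hidden assignments; both return the same posterior.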
