Solutions to Exercises for Module 6 – Defining Bayesian Network Probabilities Discrete Nodes
Exercise 1: Suppose A is a ranked node with 5 states and that it has 3 parent nodes each with 5 states. How many cells are there in the NPT of A? What kind of problems would you expect to encounter if you tried to complete this NPT manually? How could you avoid these problems?
Solution:
The NPT for node A would be 5 x 5 x 5 = 125 long (because 3 parent nodes with 5 states each create 5 x 5 x 5 parent-state combinations) and 5 wide (because node A has 5 states). This results in 125 x 5 = 625 cells in the NPT for node A.
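The count above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name npt_cells is made up for this example and is not part of any Bayesian network tool.

```python
from math import prod

def npt_cells(child_states, parent_states):
    """Number of cells in an NPT: one set of entries per combination of
    parent states, with one entry per child state in each set."""
    return child_states * prod(parent_states)

# Node A: 5 states, 3 parents with 5 states each
print(npt_cells(5, [5, 5, 5]))  # 625

# The same structure with only 3 states per node
print(npt_cells(3, [3, 3, 3]))  # 81
```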
The sheer size of the table makes it impractical to complete manually, but even if it were attempted, the most serious problem would be ensuring consistency. For example, suppose all nodes have states (poor, below average, average, above average, high) and that the parents of A are nodes B, C, D. Now suppose that the entry for A being ‘below average’ given that B, C, D are all ‘average’ is 0.2. Then, if B, C, D are all expected to have a positive effect on A, you would have to make sure that whenever B, C, D are all at least ‘average’, the entry for A being ‘below average’ is 0.2 or less.
To avoid the problems of manually completing such a table you could reduce the number of states in each node. However, even 3 states for each node results in an NPT of 3 x 3 x 3 x 3 = 81 cells, which is still extremely difficult to complete manually. Hence the best solution is to use a predefined function, such as a weighted mean, for the NPT.
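To illustrate how a predefined function can fill in the whole table, here is a minimal sketch of a weighted-mean NPT for a ranked node. It rests on several assumptions that are not stated in the exercise: each ranked state is mapped to an equal-width interval on [0, 1], each parent state is represented by the midpoint of its interval, and the child follows a normal distribution truncated to [0, 1] whose mean is the weighted mean of the parents. The weights and the standard deviation (0.15) are arbitrary illustrative values, and the function names are hypothetical.

```python
from itertools import product
from math import erf, sqrt

STATES = 5  # poor, below average, average, above average, high

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, using only the standard library."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))

def ranked_column(mean, sd, n_states=STATES):
    """Probabilities of the child's states for one parent combination:
    a normal(mean, sd) truncated to [0, 1] and split into equal intervals."""
    edges = [i / n_states for i in range(n_states + 1)]
    cdf = [normal_cdf(e, mean, sd) for e in edges]
    total = cdf[-1] - cdf[0]  # mass inside [0, 1], used to renormalise
    return [(cdf[i + 1] - cdf[i]) / total for i in range(n_states)]

def weighted_mean_npt(weights, sd=0.15, n_states=STATES):
    """Build the full NPT, keyed by the tuple of parent state indices."""
    midpoints = [(i + 0.5) / n_states for i in range(n_states)]
    npt = {}
    for combo in product(range(n_states), repeat=len(weights)):
        mean = sum(w * midpoints[s] for w, s in zip(weights, combo)) / sum(weights)
        npt[combo] = ranked_column(mean, sd, n_states)
    return npt

# Three equally weighted parents B, C, D (assumed weights)
npt = weighted_mean_npt(weights=[1, 1, 1])
print(len(npt), sum(len(col) for col in npt.values()))  # 125 625
```

Because every entry is generated from the same function, the consistency requirement discussed above holds automatically in this sketch: with these settings, the probability of A being ‘below average’ never increases when the parents move from all ‘average’ to any combination that is at least ‘average’.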
Exercise 2: Open the “Chest Clinic” Bayesian network. Suppose that the only information you have about a patient is that their x-ray result is abnormal. What would you conclude about their most likely condition? If you had the opportunity to ask just one more question about the patient in order to diagnose their disease, what would it be?
Solution:
Their most likely condition would be either Lung Cancer or Bronchitis, but we would still be unsure which. The least likely condition would be Tuberculosis.
The symptom we are least sure about is shortness of breath (Dyspnea), so asking about this would help determine whether they are more likely to have Lung Cancer or Bronchitis.
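The conclusions above can be reproduced outside the modelling tool by brute-force enumeration of the joint distribution. The sketch below assumes the commonly published textbook probabilities for this network (the Lauritzen and Spiegelhalter "Asia" example); the NPTs in the "Chest Clinic" model file you open may differ, so treat the numbers as illustrative only. The variable names and the posterior helper are hypothetical.

```python
from itertools import product

# Variables: a = visit to Asia, s = smoker, t = tuberculosis, l = lung cancer,
# b = bronchitis, x = abnormal x-ray, d = dyspnea.
NAMES = ("a", "s", "t", "l", "b", "x", "d")

def joint(a, s, t, l, b, x, d):
    """Joint probability of one full assignment (assumed textbook NPTs)."""
    p = 0.01 if a else 0.99              # P(visit to Asia)
    p *= 0.5                             # P(smoker) = P(non-smoker) = 0.5
    pt = 0.05 if a else 0.01             # P(tuberculosis | Asia)
    p *= pt if t else 1 - pt
    pl = 0.10 if s else 0.01             # P(lung cancer | smoker)
    p *= pl if l else 1 - pl
    pb = 0.60 if s else 0.30             # P(bronchitis | smoker)
    p *= pb if b else 1 - pb
    e = t or l                           # "tuberculosis or cancer" node
    px = 0.98 if e else 0.05             # P(abnormal x-ray | e)
    p *= px if x else 1 - px
    pd = {(True, True): 0.9, (True, False): 0.7,
          (False, True): 0.8, (False, False): 0.1}[(e, b)]
    p *= pd if d else 1 - pd             # P(dyspnea | e, bronchitis)
    return p

def posterior(query, evidence):
    """P(query = True | evidence) by summing over all 2^7 assignments."""
    num = den = 0.0
    for values in product([True, False], repeat=len(NAMES)):
        world = dict(zip(NAMES, values))
        if any(world[k] != v for k, v in evidence.items()):
            continue
        p = joint(**world)
        den += p
        if world[query]:
            num += p
    return num / den

xray = {"x": True}
for name, label in (("l", "Lung Cancer"), ("b", "Bronchitis"), ("t", "Tuberculosis")):
    print(label, round(posterior(name, xray), 3))

# Effect of the follow-up question about shortness of breath
for answer in (True, False):
    ev = {"x": True, "d": answer}
    print("dyspnea =", answer,
          round(posterior("l", ev), 3), round(posterior("b", ev), 3))
```

Under these assumed values, an abnormal x-ray alone gives roughly 0.49 for Lung Cancer, 0.51 for Bronchitis and 0.09 for Tuberculosis, which matches the qualitative answer: the first two are close to even and Tuberculosis is the least likely.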
Exercise 3: Open the “Heart Attack” Bayesian network. Suppose that a new independent risk factor is found: “Loneliness”. The impact of this factor on “heart attack before 60” is believed to be similar to that of “Poor diet”. Construct the revised model with the additional risk factor. It is also proposed that “Overweight” is a risk factor that should be added. Why is it not so simple to add this risk factor?
Solution:
NPT for Loneliness:
NPT for Heart Attack Before 60:
It is not so simple to add Overweight as a risk factor because it is related to Poor Diet and Lack of Exercise. In other words, being Overweight is not independent of the other risk factors. You could add Overweight to the model; however, you would need to know its contribution to heart attack risk as well as the contribution of Poor Diet and Lack of Exercise to being Overweight.