程序代写代做代考 data structure algorithm Lab 4 – Belief Propagation

Lab 4 – Belief Propagation

Lab 4: Belief Propagation¶
This lab is built around the process of identifying the fault with a coffee machine.

Your task is to:

Given the structure of a graphical model for the state of a coffee machine learn the distributions from data.
Implement belief propagation, so you can evaluate the probability of each failure given the available evidence.
Identify the most probable failure for a set of broken coffee machines.

Marking and Submission¶
These lab exercises are marked, and contribute to your final grade. For this lab exercise there are 3 places where you are expected to enter your own code, for 15 marks overall. Every place you have to add code is indicated by

# **************************************************************** n marks

with instructions above the code block.

Please submit your completed workbook using Moodle before 5pm on the 28th November 2018. The workbook you submit must be an .ipynb file, which is saved into the directory you’re running Jupyter; alternatively you can download it from the menu above using File -> Download As -> Notebook (.ipynb). Remember to save your work regularly (Save and checkpoint in the File menu, the icon of a floppy disk, or Ctrl-S); the version you submit should have all code blocks showing the results (if any) of execution below them. You will normally receive feedback within a week.

In [1]:

%matplotlib inline

import zipfile
import io
import csv

import numpy
import matplotlib.pyplot as plt

Coffee Machine Data Set¶
You can download from moodle a zip file that contains the data as a csv file. The below code will load the data directly from the zip file, so you don’t have to unzip it.

Each row of the file contains a fully observed coffee machine, with the state of every random variable. The random variables are all binary, with False represented by 0 and True represented by 1. The variables are:

Failures (you’re trying to detect these):

he – No electricity

fp – Fried power supply unit

fc – Fried circuit board

wr – Water reservoir empty

gs – Group head gasket forms seal

dp – Dead pump

fh – Fried heating element

Mechanism (these are unobservable):

pw – Power supply unit works

cb – Circuit board works

gw – Get water out of group head

Diagnostic (these are the tests the mechanic can run – observable):

ls – Room lights switch on

vp – A voltage is measured across power supply unit

lo – Power light switches on

wv – Water visible in reservoir

hp – Can hear pump

me – Makes espresso

ta – Makes a hot, tasty espresso

For the above the number is the column number of the provided data (dm) and the two letter code a suggested variable name.

For if you are unfamiliar with an espresso coffee machine here is a brief description of how one works (you can ignore this if you want):

The user puts ground coffee into a portafilter (round container with a handle and two spouts at the bottom), tamps it (compacts the coffee down), and clamps the portafilter into the group head at the front of the machine. A gasket (rubber ring) forms a seal between the portafilter and group head. A button is pressed. Water is drawn from a reservoir by a pump into a boiler. In the boiler a heating element raises the waters temperature, before the pump pushes it through the group head and into the portafilter at high pressure. The water passes through the coffee grinds and makes a tasty espresso.

The graphical model showing how the variables are related is included on moodle as coffee machine.pdf; here it is given as conditional probabilities:

Failures:

P_he: $P(\texttt{no electricity})$
P_fp: $P(\texttt{fried psu})$
P_fc: $P(\texttt{fried circuit board})$
P_wr: $P(\texttt{water reservoir empty})$
P_gs: $P(\texttt{group head gasket seal broken})$
P_dp: $P(\texttt{dead pump})$
P_fh: $P(\texttt{fried heating element})$

Mechanism:

P_pw_he_fp: $P(\texttt{psu works}\enspace|\enspace\texttt{no electricity},\enspace\texttt{fried psu})$
P_cb_pw_fc: $P(\texttt{circuit board works}\enspace|\enspace\texttt{psu works},\enspace\texttt{fried circuit board})$
P_gw_cb_wr_dp: $P(\texttt{get water}\enspace|\enspace\texttt{circuit board works},\enspace\texttt{water reservoir empty},\enspace\texttt{dead pump})$

Diagnostic:

P_ls_he: $P(\texttt{lights switch on}\enspace|\enspace\texttt{no electricity})$
P_vp_pw: $P(\texttt{voltage across psu}\enspace|\enspace\texttt{psu works})$
P_lo_cb: $P(\texttt{power light on}\enspace|\enspace\texttt{circuit board works})$
P_wv_wr: $P(\texttt{water visible}\enspace|\enspace\texttt{water reservoir empty})$
P_hp_dp: $P(\texttt{can hear pump}\enspace|\enspace\texttt{dead pump})$
P_me_gw_gs: $P(\texttt{makes espresso}\enspace|\enspace\texttt{get water},\enspace\texttt{group head gasket seal broken})$
P_ta_me_fh: $P(\texttt{tasty}\enspace|\enspace\texttt{makes espresso},\enspace\texttt{fried heating element})$

Note that while the model is close to what you may guess the probabilities are not absolute, to account for mistakes and unknown failures. For instance, the mechanic may make a mistake while brewing an espresso and erroneously conclude that the machine is broken when it is in fact awesome. The probabilities associated with each failure are not uniform.

In [2]:

# It may prove helpful to have a mapping between the suggested variable names and
# column indices in the provided file…
nti = dict() # ‘name to index’

nti[‘he’] = 0
nti[‘fp’] = 1
nti[‘fc’] = 2
nti[‘wr’] = 3
nti[‘gs’] = 4
nti[‘dp’] = 5
nti[‘fh’] = 6

nti[‘pw’] = 7
nti[‘cb’] = 8
nti[‘gw’] = 9

nti[‘ls’] = 10
nti[‘vp’] = 11
nti[‘lo’] = 12
nti[‘wv’] = 13
nti[‘hp’] = 14
nti[‘me’] = 15
nti[‘ta’] = 16

# Opposite to the above – index to name…
itn = [‘he’, ‘fp’, ‘fc’, ‘wr’, ‘gs’, ‘dp’, ‘fh’,
‘pw’, ‘cb’, ‘gw’,
‘ls’, ‘vp’, ‘lo’, ‘wv’, ‘hp’, ‘me’, ‘ta’]

# For conveniance this code loads the data from the zip file,
# so you don’t have to decompress it (takes a few seconds to run)…
with zipfile.ZipFile(‘coffee_machines.zip’) as zf:
with zf.open(‘coffee_machines.csv’) as f:
sf = io.TextIOWrapper(f)
reader = csv.reader(sf)
next(reader)
dm = []
for row in reader:
dm.append([int(v) for v in row])
dm = numpy.array(dm, dtype=numpy.int8)

print(‘Data: {} exemplars, {} features’.format(dm.shape[0], dm.shape[1]))
print(‘ Broken machines =’, dm.shape[0] – dm[:,nti[‘ta’]].sum())
print(‘ Working machines =’, dm[:,nti[‘ta’]].sum())

Data: 262144 exemplars, 17 features
Broken machines = 122603
Working machines = 139541

In [3]:

nti

Out[3]:

{‘cb’: 8,
‘dp’: 5,
‘fc’: 2,
‘fh’: 6,
‘fp’: 1,
‘gs’: 4,
‘gw’: 9,
‘he’: 0,
‘hp’: 14,
‘lo’: 12,
‘ls’: 10,
‘me’: 15,
‘pw’: 7,
‘ta’: 16,
‘vp’: 11,
‘wr’: 3,
‘wv’: 13}

1. Learn Model¶
Below a set of variables to represent conditional probability distributions have been defined. They are a Bernoulli trial for each combination of conditional variables, given as $P(\texttt{False}|…)$ in [0,…] and $P(\texttt{True}|…)$ in [1,…] (It may be easier to think of them as boring categorical distributions).

To fit a maximum likelihood model you should first use them as counters – loop over the data set and count how many of each combination exist. You must then normalise them so that the sum over axis 0 is always 1. There is an extra mark for doing the right thing and including a prior. You may want to know that the conjugate prior to the Bernoulli trial represented as $\left[P(\texttt{False}), P(\texttt{True})\right]$ is a Beta distribution; a uniform prior would be a reasonable choice (it can be argued that the expected failure probability is low, and you should adjust the first 7 variables accordingly, but given the quantity of data available it’s not going to matter).

Hint:

The use of 0=False and 1=True both in the dm table and in the conditional probability distributions is very deliberate.
Consider putting all of the variables into a list with extra information about them (such as indices from nti) to make your code neater.
If you make a mistake you could easily end up with NaN or infinity – which would break the next step. Print them out so you can check they are valid!
Do not write unique code for each variable – that will be hundreds of lines of code and very tedious. It’s possible to get all of the marks in 7 lines of code, if you’re particularly sneaky.

(4 marks)

2 marks for counting all conditions
1 mark for normalising distributions
1 mark for including a sensible prior

In [1074]:

# A set of variables that will ultimately represent conditional probability distributions.
# The naming convention is that P_he means P(he), or that P_ta_me_hw means P(ta|me,hw)…

P_he = numpy.zeros(2)
P_fp = numpy.zeros(2)
P_fc = numpy.zeros(2)
P_wr = numpy.zeros(2)
P_gs = numpy.zeros(2)
P_dp = numpy.zeros(2)
P_fh = numpy.zeros(2)

P_pw_he_fp = numpy.zeros((2,2,2))
P_cb_pw_fc = numpy.zeros((2,2,2))
P_gw_cb_wr_dp = numpy.zeros((2,2,2,2))

P_ls_he = numpy.zeros((2,2))
P_vp_pw = numpy.zeros((2,2))
P_lo_cb = numpy.zeros((2,2))
P_wv_wr = numpy.zeros((2,2))
P_hp_dp = numpy.zeros((2,2))
P_me_gw_gs = numpy.zeros((2,2,2))
P_ta_me_fh = numpy.zeros((2,2,2))

# It may prove conveniant to have a structure that describes the above in a computer
# readable form, including labeling what each dimension is…
ops = [(P_he, nti[‘he’]),
(P_fp, nti[‘fp’]),
(P_fc, nti[‘fc’]),
(P_wr, nti[‘wr’]),
(P_gs, nti[‘gs’]),
(P_dp, nti[‘dp’]),
(P_fh, nti[‘fh’]),
(P_pw_he_fp, nti[‘pw’], nti[‘he’], nti[‘fp’]),
(P_cb_pw_fc, nti[‘cb’], nti[‘pw’], nti[‘fc’]),
(P_gw_cb_wr_dp, nti[‘gw’], nti[‘cb’], nti[‘wr’], nti[‘dp’]),
(P_ls_he, nti[‘ls’], nti[‘he’]),
(P_vp_pw, nti[‘vp’], nti[‘pw’]),
(P_lo_cb, nti[‘lo’], nti[‘cb’]),
(P_wv_wr, nti[‘wv’], nti[‘wr’]),
(P_hp_dp, nti[‘hp’], nti[‘dp’]),
(P_me_gw_gs, nti[‘me’], nti[‘gw’], nti[‘gs’]),
(P_ta_me_fh, nti[‘ta’], nti[‘me’], nti[‘fh’])]

# **************************************************************** 4 marks
for i in range(len(ops)):

ind = ops[i][1:]
comb = np.argwhere(ops[i][0]==0)

for com in comb:

com1 = com.copy()
com1[0] = 1 – com[0]

num1=np.all(dm[:,ind]== com, axis=1).sum()
num2=np.all(dm[:,ind]== com1, axis=1).sum()

ops[i][0][tuple(com)] = (1+num1)/(2+num2+num1)

In [460]:

P_me

Out[460]:

array([0, 1])

Brute Force¶
The below code does exactly what you want to implement for step 2, but it brute forces it. Slow and memory inefficient of course, but lets you test it.

In [793]:

def brute_marginals(known):
“””known is a dictionary, where a random variable index existing as a key in the dictionary
indicates it has been observed. The value obtained using the key is the value the
random variable has been observed as. Returns a 17×2 matrix, such that [rv,0] is the
probability of random variable rv being False, [rv, 1] the probability of being True.”””

# Calculate the full joint (please don’t ask)…
params = []
for op in ops:
params.append(op[0])
params.append(op[1:])
params.append(range(17))
joint = numpy.einsum(*params)
#print(params)
# Multiply in the known states (zero out the dimensions we know it’s not)…
for key, value in known.items():
print(key, value)
other = abs(value-1)
index = [slice(None)] * len(joint.shape)
index[key] = other
joint[tuple(index)] = 0.0

# Calculate and return all marginals…
ret = numpy.empty((len(joint.shape), 2))
for row in range(ret.shape[0]):
ret[row,:] = numpy.einsum(joint, range(len(joint.shape)), [row])

ret /= ret.sum(axis=1)[:,None]
return ret

print(brute_marginals({}))
print()
print(brute_marginals({nti[‘me’] : 1}))

[[0.99047096 0.00952904]
[0.80056152 0.19943848]
[0.98004547 0.01995453]
[0.89956742 0.10043258]
[0.95010414 0.04989586]
[0.94968071 0.05031929]
[0.97033714 0.02966286]
[0.20705955 0.79294045]
[0.22287459 0.77712541]
[0.32884651 0.67115349]
[0.10854219 0.89145781]
[0.21506203 0.78493797]
[0.22367574 0.77632426]
[0.28021332 0.71978668]
[0.14461369 0.85538631]
[0.42213307 0.57786693]
[0.46711295 0.53288705]]

15 1
[[9.99988680e-01 1.13197454e-05]
[9.99951986e-01 4.80137278e-05]
[9.99987176e-01 1.28242015e-05]
[9.89199860e-01 1.08001398e-02]
[9.94152722e-01 5.84727836e-03]
[9.99974793e-01 2.52068210e-05]
[9.70337140e-01 2.96628596e-02]
[4.00504313e-05 9.99959950e-01]
[3.23840843e-05 9.99967616e-01]
[1.31563083e-05 9.99986844e-01]
[9.99797579e-02 9.00020242e-01]
[1.01366225e-02 9.89863378e-01]
[1.06817175e-03 9.98931828e-01]
[2.08497919e-01 7.91502081e-01]
[9.93173565e-02 9.00682644e-01]
[0.00000000e+00 1.00000000e+00]
[7.78509487e-02 9.22149051e-01]]

2. Belief Propagation¶
Your task is to write a function that take known states and calculates the marginal probability for every state of the graphical model. This will require using the sum-product algorithm and passing all messages on the graph.

Here is the wikipedia page for reference: https://en.wikipedia.org/wiki/Belief_propagation (not the greatest, but not horrible)

Hints:

The order of the variables above is such that each variable is dependent only on variables that occur before it. This should save you some hassle in terms of calculating a message passing order.
You may want to add code before the function to prepare. This might involve creating a list of edges in the graphical model, in the order they need to be processed for instance. It can also be done with dictionaries or classes – choose whatever you are comfortable with, but spend time thinking it through first. If you get it wrong your code will get messy!
A good approach to recording messages is a dictionary indexed [from, to]
The easiest way to include a known value in the model is often to add/replace the unary term for that value.
This problem is small enough that you can brute force it – code to do so has been provided above. Do test that your implementation is correct by comparing them.

(10 marks)

3 marks for sensible data structures
5 marks for sending the messages correctly
2 marks for correctly calculating the marginals at the end

In [1131]:

def prob(con):
for i in range(len(con)):

ind = con[i][1:]
comb = np.argwhere(con[i][0]==0)

for com in comb:

com1 = com.copy()
com1[0] = 1 – com[0]

num1=np.all(dm[:,ind]== com, axis=1).sum()
num2=np.all(dm[:,ind]== com1, axis=1).sum()

con[i][0][tuple(com)] = (1+num1)/(2+num2+num1)
return con

P_fp_he = numpy.zeros((2,2))
P_fc_fp = numpy.zeros((2,2))
P_wr_fc = numpy.zeros((2,2))
P_gs_wr = numpy.zeros((2,2))
P_dp_gs = numpy.zeros((2,2))
P_fh_dp = numpy.zeros((2,2))
P_pw_fh = numpy.zeros((2,2))
P_cb_pw = numpy.zeros((2,2))
P_gw_cb = numpy.zeros((2,2))
P_ls_gw = numpy.zeros((2,2))
P_vp_ls = numpy.zeros((2,2))
P_lo_vp = numpy.zeros((2,2))
P_wv_lo = numpy.zeros((2,2))
P_hp_wv = numpy.zeros((2,2))
P_me_hp = numpy.zeros((2,2))
P_ta_me = numpy.zeros((2,2))

P_he = numpy.zeros(2)
P_fp = numpy.zeros(2)
P_fc = numpy.zeros(2)
P_wr = numpy.zeros(2)
P_gs = numpy.zeros(2)
P_dp = numpy.zeros(2)
P_fh = numpy.zeros(2)
P_pw = numpy.zeros(2)
P_cb = numpy.zeros(2)
P_gw = numpy.zeros(2)
P_ls = numpy.zeros(2)
P_vp = numpy.zeros(2)
P_lo = numpy.zeros(2)
P_wv = numpy.zeros(2)
P_hp = numpy.zeros(2)
P_me = numpy.zeros(2)
P_ta = numpy.zeros(2)

unary = [(P_he, nti[‘he’]),
(P_fp, nti[‘fp’]),
(P_fc, nti[‘fc’]),
(P_wr, nti[‘wr’]),
(P_gs, nti[‘gs’]),
(P_dp, nti[‘dp’]),
(P_fh, nti[‘fh’]),
(P_pw, nti[‘pw’]),
(P_cb, nti[‘cb’]),
(P_gw, nti[‘gw’]),
(P_ls, nti[‘ls’]),
(P_vp, nti[‘vp’]),
(P_lo, nti[‘lo’]),
(P_wv, nti[‘wv’]),
(P_hp, nti[‘hp’]),
(P_me, nti[‘me’]),
(P_ta, nti[‘ta’])
]

con = [(P_fp_he, nti[‘fp’],nti[‘he’]),
(P_fc_fp, nti[‘fc’],nti[‘hp’]),
(P_wr_fc, nti[‘wr’],nti[‘fc’]),
(P_gs_wr, nti[‘gs’],nti[‘wr’]),
(P_dp_gs, nti[‘dp’],nti[‘gs’]),
(P_fh_dp, nti[‘fh’],nti[‘dp’]),
(P_pw_fh, nti[‘pw’],nti[‘fh’]),
(P_cb_pw, nti[‘cb’],nti[‘pw’]),
(P_gw_cb, nti[‘gw’],nti[‘cb’]),
(P_ls_gw, nti[‘ls’],nti[‘gw’]),
(P_vp_ls, nti[‘vp’],nti[‘ls’]),
(P_vp_pw, nti[‘vp’],nti[‘pw’]),
(P_lo_vp, nti[‘lo’],nti[‘vp’]),
(P_wv_lo, nti[‘wv’],nti[‘lo’]),
(P_hp_wv, nti[‘hp’],nti[‘wv’]),
(P_me_hp, nti[‘me’],nti[‘hp’]),
(P_ta_me, nti[‘ta’],nti[‘me’])
]

unary = prob(unary)
con = prob(con)

#Graph Structure
graph = [(P_he, P_fp_he ),
(P_fp, P_fc_fp ),
(P_fc, P_wr_fc ),
(P_wr, P_gs_wr ),
(P_dp, P_dp_gs ),
(P_fh, P_fh_dp ),
(P_pw, P_cb_pw, P_he, P_fp, P_pw_he_fp),
(P_cb, P_gw_cb, P_pw, P_fc, P_cb_pw_fc),
(P_gw, P_ls_gw, P_cb, P_wr, P_dp, P_gw_cb_wr_dp),
(P_ls, P_vp_ls, P_he, P_ls_he ),
(P_vp, P_vp_pw, P_pw, P_vp_pw ),
(P_lo, P_wv_lo, P_cb, P_lo_cb ),
(P_wv, P_hp_wv, P_wr, P_wv_wr ),
(P_hp, P_me_hp, P_dp, P_hp_dp ),
(P_me, P_ta_me, P_gw, P_gs, P_me_gw_gs ),
(P_ta, P_me, P_fh, P_ta_me_fh)
]

def marginals(known):
“””known is a dictionary, where a random variable index existing as a key in the dictionary
indicates it has been observed. The value obtained using the key is the value the
random variable has been observed as. Returns a 17×2 matrix, such that [rv,0] is the
probability of random variable rv being False, [rv, 1] the probability of being True.”””

# **************************************************************** 10 marks
pass

In [1137]:

Out[1137]:

5

In [ ]:

In [ ]:

In [1130]:

def cond(con):
for i in range(len(con)):

ind = con[i][1:]
comb = np.argwhere(con[i][0]==0)

for com in comb:

com1 = com.copy()
com1[0] = 1 – com[0]

num1=np.all(dm[:,ind]== com, axis=1).sum()
num2=np.all(dm[:,ind]== com1, axis=1).sum()

con[i][0][tuple(com)] = (1+num1)/(2+num2+num1)
return con

In [1104]:

P_fh

Out[1104]:

array([0.97033714, 0.02966286])

In [1110]:

#cond(con)
#P_fh = np.array([0,1])
m_Fhefp_he = P_he
m_Fhefp_fp = numpy.einsum(‘b,ab->b’, m_Fhefp_he, con[0][0])

m_fp_Ffpfc = numpy.einsum(‘a,b->a’,P_fp,m_Fhefp_fp)
m_Ffpfc_fc = numpy.einsum(‘b,ab->b’,m_fp_Ffpfc,con[1][0])

m_fc_Ffcwr = numpy.einsum(‘a,b->a’,P_fc,m_Ffpfc_fc)
m_Ffcwr_wr = numpy.einsum(‘b,ab->b’,m_fc_Ffcwr,con[2][0])

m_wr_Fwrgs = numpy.einsum(‘a,b->a’, P_wr, m_Ffcwr_wr)
m_Fwrgs_gs = numpy.einsum(‘b,ab->b’,m_wr_Fwrgs,con[3][0])

m_gs_Fgsdp = numpy.einsum(‘a,b->a’, P_gs,m_Fwrgs_gs)
m_Fgsdp_dp = numpy.einsum(‘b,ab->b’,m_gs_Fgsdp,con[4][0])

m_dp_Fdpfh = numpy.einsum(‘a,b->a’, P_dp,m_Fgsdp_dp)
m_Fdpfh_fh = numpy.einsum(‘b,ab->b’,m_dp_Fdpfh,con[5][0])

m_fh_Ffhpw = numpy.einsum(‘a,b->a’, P_fh,m_Fdpfh_fh)
m_Ffhpw_pw = numpy.einsum(‘b,ab->b’,m_fh_Ffhpw,con[6][0])

m_Fhe_Ffp = numpy.einsum(‘a,b’, P_he, P_fp)
m_Fpwhefp_pw = numpy.einsum(‘bc,abc->a’, m_Fhe_Ffp, P_pw_he_fp)
m_pw_Fpwcb = numpy.einsum(‘a,b->a’, m_Fpwhefp_pw, m_Ffhpw_pw)
m_pw_Fpwcb = numpy.einsum(‘a,b->b’, m_pw_Fpwcb, P_pw)

m_Fpwcb_cb = numpy.einsum(‘b,ab->b’,m_pw_Fpwcb,con[7][0])
m_Fpw_Ffc = numpy.einsum(‘a,b’, P_pw, P_fc)
m_Fcbpwfc_cb = numpy.einsum(‘bc,abc->a’, m_Fpw_Ffc, P_cb_pw_fc)
m_cb_Fcbgw = numpy.einsum(‘a,b->a’, m_Fcbpwfc_cb, m_Fpwcb_cb)
m_cb_Fcbgw = numpy.einsum(‘a,b->b’, m_cb_Fcbgw, P_cb)

m_Fcbgw_gw = numpy.einsum(‘b,ab->b’,m_cb_Fcbgw,con[8][0])
m_Fcb_Fwr_Fdp = numpy.einsum(‘a,b,c’, P_cb, P_wr, P_dp)
m_Fgwcbwrdp_gw = numpy.einsum(‘bcd,abcd->a’, m_Fcb_Fwr_Fdp,P_gw_cb_wr_dp)
m_gw_Fgwls = numpy.einsum(‘a,b->a’,m_Fgwcbwrdp_gw,m_Fcbgw_gw)
m_gw_Fgwls = numpy.einsum(‘a,b->b’,m_gw_Fgwls,P_gw)

m_Fgwls_ls = numpy.einsum(‘b,ab->b’,m_gw_Fgwls,con[9][0])
m_Flshe_ls = numpy.einsum(‘b,ab->a’, P_he, P_ls_he)
m_ls_Flsvp = numpy.einsum(‘a,b->a’,m_Flshe_ls,m_Fgwls_ls)
m_ls_Flsvp = numpy.einsum(‘a,b->b’,m_ls_Flsvp,P_ls)

m_Flsvp_vp = numpy.einsum(‘b,ab->b’,m_ls_Flsvp,con[10][0])
m_Fpwvp_vp = numpy.einsum(‘b,ab->a’, P_pw, P_vp_pw)
m_vp_Fvplo = numpy.einsum(‘a,b->a’,m_Fpwvp_vp,m_Flsvp_vp)
m_vp_Fvplo = numpy.einsum(‘a,b->b’,m_vp_Fvplo,P_vp)

m_Fvplo_lo = numpy.einsum(‘b,ab->b’,m_vp_Fvplo,con[11][0])
m_Flocb_lo = numpy.einsum(‘b,ab->a’, P_cb, P_lo_cb)
m_lo_Flowv = numpy.einsum(‘a,b->a’,m_Flocb_lo,m_Fvplo_lo)
m_lo_Flowv = numpy.einsum(‘a,b->b’,m_lo_Flowv,P_lo)

m_Flowv_wv = numpy.einsum(‘b,ab->b’,m_lo_Flowv,con[12][0])
m_Fwvwr_wv = numpy.einsum(‘b,ab->a’, P_wr, P_wv_wr)
m_wv_Fwvhp = numpy.einsum(‘a,b->a’,m_Fwvwr_wv,m_Flowv_wv)
m_wv_Fwvhp = numpy.einsum(‘a,b->b’,m_wv_Fwvhp,P_wv)

m_Fwvhp_hp = numpy.einsum(‘b,ab->b’,m_wv_Fwvhp,con[12][0])
m_Fhpdp_hp = numpy.einsum(‘b,ab->a’, P_dp, P_hp_dp)
m_hp_Fhpme = numpy.einsum(‘a,b->a’,m_Fhpdp_hp,m_Fwvhp_hp)
m_hp_Fhpme = numpy.einsum(‘a,b->b’,m_hp_Fhpme,P_hp)

m_Fhpme_me = numpy.einsum(‘b,ab->b’,m_hp_Fhpme,con[13][0])
m_Fgw_Fgs = numpy.einsum(‘a,b’, P_gw, P_gs)
m_Fmegwgs_me = numpy.einsum(‘bc,abc->a’, m_Fgw_Fgs, P_me_gw_gs)
m_me_F_meta = numpy.einsum(‘a,b->a’,m_Fmegwgs_me,m_Fhpme_me)
m_me_F_meta = numpy.einsum(‘a,b->b’,m_me_F_meta,P_me)

m_Fme_Ffh = numpy.einsum(‘a,b’, P_me, P_fh)
m_Ftamefh_ta = numpy.einsum(‘bc,abc->a’, m_Fme_Ffh, P_ta_me_fh)

ret = np.array([
m_he_Fhefp,
m_fp_Ffpfc,
m_fc_Ffcwr,
m_wr_Fwrgs,
m_gs_Fgsdp,
m_dp_Fdpfh,
m_fh_Ffhpw,
m_pw_Fpwcb,
m_cb_Fcbgw,
m_gw_Fgwls,
m_ls_Flsvp,
m_vp_Fvplo,
m_lo_Flowv,
m_wv_Fwvhp,
m_hp_Fhpme,
m_me_F_meta,
m_Ftamefh_ta,
])
ret /= ret.sum(axis=1)[:,None]

In [1111]:

ret

Out[1111]:

array([[0.99047096, 0.00952904],
[0.80056152, 0.19943848],
[0.98004547, 0.01995453],
[0.89956742, 0.10043258],
[0.95010414, 0.04989586],
[0.94968071, 0.05031929],
[0.97033714, 0.02966286],
[0.20699534, 0.79300466],
[0.22294447, 0.77705553],
[0.32961022, 0.67038978],
[0.10854257, 0.89145743],
[0.21499851, 0.78500149],
[0. , 1. ],
[1. , 0. ],
[0.14461407, 0.85538593],
[0.42276823, 0.57723177],
[0.46769865, 0.53230135]])

In [1082]:

brute_marginals({nti[‘fh’] : 1})

6 1

Out[1082]:

array([[9.90470959e-01, 9.52904107e-03],
[8.00561519e-01, 1.99438481e-01],
[9.80045471e-01, 1.99545292e-02],
[8.99567417e-01, 1.00432583e-01],
[9.50104140e-01, 4.98958596e-02],
[9.49680712e-01, 5.03192877e-02],
[0.00000000e+00, 1.00000000e+00],
[2.07059547e-01, 7.92940453e-01],
[2.22874588e-01, 7.77125412e-01],
[3.28846512e-01, 6.71153488e-01],
[1.08542188e-01, 8.91457812e-01],
[2.15062033e-01, 7.84937967e-01],
[2.23675738e-01, 7.76324262e-01],
[2.80213317e-01, 7.19786683e-01],
[1.44613692e-01, 8.55386308e-01],
[4.22133073e-01, 5.77866927e-01],
[9.99742983e-01, 2.57016897e-04]])

In [818]:

m_Fme_Ffh = numpy.einsum(‘a,b’, np.array([0,1]), P_fh)
m_Ftamefh_ta = numpy.einsum(‘bc,abc->a’, m_Fme_Ffh, P_ta_me_fh)
m_ta_Fmeta = m_Ftamefh_ta#numpy.einsum(‘a,b->a’, m_Ftamefh_ta, P_ta)

m_Fmeta_ta = numpy.einsum(‘b,ab->b’,m_me_F_meta,con[14][0])

g_ta = numpy.einsum(‘a,b->b’,m_Fmeta_ta,m_Ftamefh_ta)

m_Fmeta_me = numpy.einsum(‘a,ab->b’, m_ta_Fmeta, P_ta_me)
m_Fmeta_me

Out[818]:

array([0.07785857, 0.85643306])

In [824]:

(P_ta*m_Ftamefh_ta*m_Fmeta_ta*np.array([0,1]))/(P_ta*m_Ftamefh_ta*m_Fmeta_ta*np.array([0,1])).sum()

Out[824]:

array([0., 1.])

In [795]:

m_Fme_Ffh = numpy.einsum(‘a,b,c’, np.array([0,1]), P_fh, P_ta)
m_Ftamefh_ta = numpy.einsum(‘abc,abc->a’, m_Fme_Ffh, P_ta_me_fh)
m_ta_Fmeta = m_Ftamefh_ta
m_Ftamefh_ta/=m_Ftamefh_ta.sum()
m_Ftamefh_ta

Out[795]:

array([0., 1.])

In [788]:

brute_marginals({nti[‘me’] : 1})

15 1

Out[788]:

array([[9.99988680e-01, 1.13197454e-05],
[9.99951986e-01, 4.80137278e-05],
[9.99987176e-01, 1.28242015e-05],
[9.89199860e-01, 1.08001398e-02],
[9.94152722e-01, 5.84727836e-03],
[9.99974793e-01, 2.52068210e-05],
[9.70337140e-01, 2.96628596e-02],
[4.00504313e-05, 9.99959950e-01],
[3.23840843e-05, 9.99967616e-01],
[1.31563083e-05, 9.99986844e-01],
[9.99797579e-02, 9.00020242e-01],
[1.01366225e-02, 9.89863378e-01],
[1.06817175e-03, 9.98931828e-01],
[2.08497919e-01, 7.91502081e-01],
[9.93173565e-02, 9.00682644e-01],
[0.00000000e+00, 1.00000000e+00],
[7.78509487e-02, 9.22149051e-01]])

In [1113]:

cond(con)

#{nti[‘wv’] : False, nti[‘lo’] : True}
P_wv = np.array([1,0])
P_lo = np.array([0,1])

m_he_Fhefp = P_he
m_Fhefp_fp = numpy.einsum(‘b,ab->a’, m_he_Fhefp, con[0][0])

m_fp_Ffpfc = numpy.einsum(‘a,b->a’,P_fp,m_Fhefp_fp)
m_Ffpfc_fc = numpy.einsum(‘b,ab->a’,m_fp_Ffpfc,con[1][0])

m_fc_Ffcwr = numpy.einsum(‘a,b->a’,P_fc,m_Ffpfc_fc)
m_Ffcwr_wr = numpy.einsum(‘b,ab->a’,m_fc_Ffcwr,con[2][0])

m_wr_Fwrgs = numpy.einsum(‘a,b->a’, P_wr, m_Ffcwr_wr)
m_Fwrgs_gs = numpy.einsum(‘b,ab->a’,m_wr_Fwrgs,con[3][0])

m_gs_Fgsdp = numpy.einsum(‘a,b->a’, P_gs,m_Fwrgs_gs)
m_Fgsdp_dp = numpy.einsum(‘b,ab->a’,m_gs_Fgsdp,con[4][0])

m_dp_Fdpfh = numpy.einsum(‘a,b->a’, P_dp,m_Fgsdp_dp)
m_Fdpfh_fh = numpy.einsum(‘b,ab->a’,m_dp_Fdpfh,con[5][0])

m_fh_Ffhpw = numpy.einsum(‘a,b->a’, P_fh,m_Fdpfh_fh)
m_Ffhpw_pw = numpy.einsum(‘b,ab->a’,m_fh_Ffhpw,con[6][0])

m_Fhe_Ffp = numpy.einsum(‘a,b’, P_he, P_fp)
m_Fpwhefp_pw = numpy.einsum(‘bc,abc->a’, m_Fhe_Ffp, P_pw_he_fp)
m_pw_Fpwcb = numpy.einsum(‘a,b->a’, m_Fpwhefp_pw, m_Ffhpw_pw)
m_pw_Fpwcb = numpy.einsum(‘a,b->b’, m_pw_Fpwcb, P_pw)

m_Fpwcb_cb = numpy.einsum(‘b,ab->a’,m_pw_Fpwcb,con[7][0])
m_Fpw_Ffc = numpy.einsum(‘a,b’, m_pw_Fpwcb, m_fc_Ffcwr) ###related node
m_Fcbpwfc_cb = numpy.einsum(‘bc,abc->a’, m_Fpw_Ffc, P_cb_pw_fc)
m_cb_Fcbgw = numpy.einsum(‘a,b->b’, m_Fcbpwfc_cb, m_Fpwcb_cb)
m_cb_Fcbgw = numpy.einsum(‘a,b->b’, m_cb_Fcbgw, P_cb)

m_Fcbgw_gw = numpy.einsum(‘b,ab->a’,m_cb_Fcbgw,con[8][0])
m_Fcb_Fwr_Fdp = numpy.einsum(‘a,b,c’, P_cb, m_wr_Fwrgs, P_dp)
m_Fgwcbwrdp_gw = numpy.einsum(‘bcd,abcd->a’, m_Fcb_Fwr_Fdp,P_gw_cb_wr_dp)
m_gw_Fgwls = numpy.einsum(‘a,b->b’,m_Fgwcbwrdp_gw,m_Fcbgw_gw)
m_gw_Fgwls = numpy.einsum(‘a,b->b’,m_gw_Fgwls,P_gw)

m_Fgwls_ls = numpy.einsum(‘b,ab->a’,m_gw_Fgwls,con[9][0])
m_Flshe_ls = numpy.einsum(‘b,ab->a’, P_he, P_ls_he)
m_ls_Flsvp = numpy.einsum(‘a,b->a’,m_Flshe_ls,m_Fgwls_ls)
m_ls_Flsvp = numpy.einsum(‘a,b->b’,m_ls_Flsvp,P_ls)

m_Flsvp_vp = numpy.einsum(‘b,ab->a’,m_ls_Flsvp,con[10][0])
m_Fpwvp_vp = numpy.einsum(‘b,ab->a’, m_pw_Fpwcb, P_vp_pw) ###related node
m_vp_Fvplo = numpy.einsum(‘a,b->a’,m_Fpwvp_vp,m_Flsvp_vp)
m_vp_Fvplo = numpy.einsum(‘a,b->b’,m_vp_Fvplo,P_vp)

m_Fvplo_lo = numpy.einsum(‘b,ab->a’,m_vp_Fvplo,con[11][0])
m_Flocb_lo = numpy.einsum(‘b,ab->a’, m_cb_Fcbgw, P_lo_cb) ###related node
m_lo_Flowv = numpy.einsum(‘a,b->a’,m_Flocb_lo,m_Fvplo_lo)
m_lo_Flowv = numpy.einsum(‘a,b->b’,m_lo_Flowv,P_lo)

m_Flowv_wv = numpy.einsum(‘b,ab->a’,m_lo_Flowv,con[12][0])
m_Fwvwr_wv = numpy.einsum(‘b,ab->a’, m_wr_Fwrgs, P_wv_wr)
m_wv_Fwvhp = numpy.einsum(‘a,b->a’,m_Fwvwr_wv,m_Flowv_wv)
m_wv_Fwvhp = numpy.einsum(‘a,b->b’,m_wv_Fwvhp,P_wv)

m_Fwvhp_hp = numpy.einsum(‘b,ab->a’,m_wv_Fwvhp,con[12][0])
m_Fhpdp_hp = numpy.einsum(‘b,ab->a’, P_dp, P_hp_dp)
m_hp_Fhpme = numpy.einsum(‘a,b->a’,m_Fhpdp_hp,m_Fwvhp_hp)
m_hp_Fhpme = numpy.einsum(‘a,b->b’,m_hp_Fhpme,P_hp)

m_Fhpme_me = numpy.einsum(‘b,ab->a’,m_hp_Fhpme,con[13][0])
m_Fgw_Fgs = numpy.einsum(‘a,b’, m_gw_Fgwls, m_gs_Fgsdp) ###related note
m_Fmegwgs_me = numpy.einsum(‘bc,abc->a’, m_Fgw_Fgs, P_me_gw_gs)

m_me_F_meta = numpy.einsum(‘a,b->a’,m_Fmegwgs_me,m_Fhpme_me)
m_me_F_meta = numpy.einsum(‘a,b->b’,m_me_F_meta,P_me)

m_me_gwgs = numpy.einsum(‘a,b->a’, m_me_F_meta, P_me)
m_Fme_Ffh = numpy.einsum(‘a,b’, m_me_gwgs, m_fh_Ffhpw)
m_Ftamefh_ta = numpy.einsum(‘bc,abc->a’, m_Fme_Ffh, P_ta_me_fh)

ret = np.array([
m_he_Fhefp,
m_fp_Ffpfc,
m_fc_Ffcwr,
m_wr_Fwrgs,
m_gs_Fgsdp,
m_dp_Fdpfh,
m_fh_Ffhpw,
m_pw_Fpwcb,
m_cb_Fcbgw,
m_gw_Fgwls,
m_ls_Flsvp,
m_vp_Fvplo,
m_lo_Flowv,
m_wv_Fwvhp,
m_hp_Fhpme,
m_me_F_meta,
m_Ftamefh_ta,
])
ret /= ret.sum(axis=1)[:,None]
ret

Out[1113]:

array([[0.99047096, 0.00952904],
[0.80056152, 0.19943848],
[0.98004547, 0.01995453],
[0.89956742, 0.10043258],
[0.95010414, 0.04989586],
[0.94968071, 0.05031929],
[0.97033714, 0.02966286],
[0.20699534, 0.79300466],
[0.22294447, 0.77705553],
[0.32961022, 0.67038978],
[0.10854257, 0.89145743],
[0.21499851, 0.78500149],
[0. , 1. ],
[1. , 0. ],
[0.14461407, 0.85538593],
[0.42276823, 0.57723177],
[0.46769865, 0.53230135]])

In [1117]:

m_ta_F_meta

Out[1117]:

array([0.46769358, 0.53230642])

In [1114]:

m_ta_F_meta = numpy.einsum(‘a,b->b’,m_Ftamefh_ta,P_ta)
m_F_meta_me = numpy.einsum(‘a,ab->b’,m_ta_F_meta,P_ta_me)

m_me_F_hpme = numpy.einsum(‘a,b->b’,m_Fmegwgs_me,P_me)
m_me_F_hpme = numpy.einsum(‘a,b->b’,m_F_meta_me,m_me_F_hpme)
m_F_hpme_hp = numpy.einsum(‘a,ab->b’,m_me_F_hpme,P_me_hp)

m_me_F_hpme*m_Fhpme_me*m_ta_F_meta

m_hp_F_wvhp = numpy.einsum(‘a,b->b’,m_Fhpdp_hp,P_hp)
m_hp_F_wvhp = numpy.einsum(‘a,b->b’,m_F_hpme_hp,m_hp_F_wvhp)

m_F_hpme_hp = numpy.einsum(‘a,ab->b’,m_me_F_hpme,P_me_hp)

m_hp_F_wvhp = numpy.einsum(‘a,b->b’,m_F_hpme_hp,m_Fhpdp_hp)

(P_hp*m_hp_F_wvhp*m_F_hpme_hp*m_Fwvhp_hp)/(P_hp*m_hp_F_wvhp*m_F_hpme_hp*m_Fwvhp_hp).sum()

Out[1114]:

array([0.41475939, 0.58524061])

In [1116]:

(m_me_F_hpme*m_Fhpme_me*m_ta_F_meta)/(m_me_F_hpme*m_Fhpme_me*m_ta_F_meta).sum()

Out[1116]:

array([0.20049136, 0.79950864])

In [1112]:

ret

Out[1112]:

array([[0.99047096, 0.00952904],
[0.80056152, 0.19943848],
[0.98004547, 0.01995453],
[0.89956742, 0.10043258],
[0.95010414, 0.04989586],
[0.94968071, 0.05031929],
[0.97033714, 0.02966286],
[0.20699534, 0.79300466],
[0.22294447, 0.77705553],
[0.32961022, 0.67038978],
[0.10854257, 0.89145743],
[0.21499851, 0.78500149],
[0. , 1. ],
[1. , 0. ],
[0.14461407, 0.85538593],
[0.42276823, 0.57723177],
[0.46769865, 0.53230135]])

In [1095]:

brute_marginals({nti[‘wv’] : False, nti[‘lo’] : True})

13 False
12 True

Out[1095]:

array([[9.99989854e-01, 1.01464022e-05],
[9.99976567e-01, 2.34329020e-05],
[9.99989634e-01, 1.03657912e-05],
[6.41598872e-01, 3.58401128e-01],
[9.50104140e-01, 4.98958596e-02],
[9.49680712e-01, 5.03192877e-02],
[9.70337140e-01, 2.96628596e-02],
[1.45290999e-05, 9.99985471e-01],
[4.91213203e-06, 9.99995088e-01],
[3.57451562e-01, 6.42548438e-01],
[9.99787023e-02, 9.00021298e-01],
[1.01113593e-02, 9.89888641e-01],
[0.00000000e+00, 1.00000000e+00],
[1.00000000e+00, 0.00000000e+00],
[1.44613692e-01, 8.55386308e-01],
[4.46761196e-01, 5.53238804e-01],
[4.89823305e-01, 5.10176695e-01]])

3. What’s broken?¶
Simply print out what the most probable broken part of each of the below machines is. Please print it as
dictionary key: name of broken part, e.g. A: wr. This will allow automarking to be used.

If you can’t get your belief propagation implementation to work feel free to use brute_marginals instead.

(1 mark)

In [ ]:

repair = {}
repair[‘A’] = {nti[‘me’] : True}
repair[‘B’] = {nti[‘wv’] : True}
repair[‘C’] = {nti[‘wv’] : False, nti[‘lo’] : True}
repair[‘D’] = {nti[‘hp’] : False, nti[‘ta’] : False}
repair[‘E’] = {nti[‘hp’] : True, nti[‘wv’] : True, nti[‘vp’]: True}

# **************************************************************** 1 mark

In [ ]:

brute_marginals(repair[‘E’])

In [740]:

P_wv

Out[740]:

array([0.28021408, 0.71978592])

In [ ]: