(Answer)
1. baskets and contents (0.5 pts) a: {b,c,d}
b: {c, d}
c: {b,d,e,f}
d: {e,f}
e: {b,c,d}
f: {}
frequent itemsets (s = 3)
– single: b, c, d
– pair: {b,d}, {c,d}
2. Bipartite subgraph (0.5 pts for one graph)
1
(Answer)
C = {w, x, y} (green) and D= {w, y, z} (red)
Pwx = Pxy = PC, Pyz = Pwz = PD, Pwy = 1 – (1 – PC)(1 – PD) = PC+PD -PCPD, Pxz = ɛ (tiny value)
The likelihood of the graph:
Pwx Pwy Pxy Pyz (1 -Pwz) (1 -Pxz)
=PC (PC+PD -PCPD)PC PD(1-PD)(1-ɛ)
= PC2(PC+PD -PCPD) PD (1 – PD) (1 – ɛ)
≈ PC2(PC+PD -PCPD) PD (1 – PD)
PC, PD makes the likelihood maximizes.
2
PC should be as large as possible, that is, PC = 1
The likelihood becomes PD (1 – PD), and we can find PD by using derivative, PD = 0.5 (1 pt)
(Meaning: 1 pt)
Given the assumption, C = {w, x, y} and D= {w, y, z}, the maximum likelihood is 0.25 by plugging PC = 1, PD = 0.5.
By repeating the MLE (maximum likelihood estimation) process with different assumptions (the number of communities, members in the communities), we can find communities which maximizes the likelihood.
(Answer)
(1 pt)
C: at least one community
𝑃(𝑢,𝑣)=1−∏ (1−𝑃 (𝑢,𝑣))
𝐶 𝐶
=1−∏ (exp(−𝐹 𝐹 ))
𝑢𝐶 𝑣𝐶 =1−exp(−∑𝐹 𝐹 )
𝐶
𝑢𝐶 𝑣𝐶 𝐶
=1−exp(−𝑭 𝑭𝑻) 𝒖𝒗
(2 pts) Initialize F
Locally minimal neighborhoods are used. N(u) is locally minimal if N(u) has lower conductance than all the N(v) for nodes v connected to node u. (Neighborhoods N(u) of node u: community of u and its neighbors)
Fuk = 1 (node u who belongs to a locally minimal neighborhood k) Fuk = 0 (otherwise)
3
(Answer)
(step 2 graph: 0.25 pts)
(steps: 0.5 pts)
1. Perform a breadth-first search (BFS), starting at node B.
2. Label each node by the number of shortest paths from the root B. 3. Calculate each edge and node credit from the bottom with rules: – Each node other than the root: credit = 1
(step 3 graph: 0.25 pts)
4
– Leaf node: credit = 1
– node other than leaves: credit =1 + sum of the credits of the edges from children
– edge: credit = the credit of the child node * the fraction of shortest paths
the credit for the edge (Yi,Z): the credit of Z * 𝒑 / ∑𝒌 𝒑 𝒊 𝒋=𝟏𝒋
Z: child node
Y1,Y2,…,Yk: Z’s parents
𝑝𝑖: the number of shortest paths from the root to Yi (computed in Step 2)
(Repeat this process for each node as root, the betweenness: sum edge values and then divide by 2)
(Answer)
=1∑ 𝑘𝑖×2𝑚=1∑ 𝑘𝑖=12𝑚=𝑚 4𝑚𝑖∈𝑁 2𝑖∈𝑁2
(∑𝑢∈𝑁 𝑘𝑢 = 2𝑚)
5
(Answer)
(1 pt)
𝑥𝑇𝐿𝑥 = 𝑥𝑇(𝐷 − 𝐴)𝑥 = 𝑥𝑇𝐷𝑥 − 𝑥𝑇𝐴𝑥 𝑛
=∑𝑑𝑥2 − ∑ 2𝑥𝑥 𝑖𝑖 𝑖𝑗
𝑖=1 (𝑖,𝑗)∈𝐸
=∑
(meaning: 1 pt)
Minimum Ncut is obtained by minimum 𝑥𝑇𝐿𝑥. If both xi and xj have the same sign, (𝑥 − 𝑥 )2 tends to
clustered in the same community. The sign of x can be used to split the graph.
(𝑥2+𝑥2−2𝑥𝑥)=∑
(𝑖,𝑗)∈𝐸𝑖 𝑗 𝑖𝑗 (𝑖,𝑗)∈𝐸𝑖 𝑗
(𝑥 −𝑥)2
𝑖𝑗 decrease, which can minimize 𝑥𝑇𝐿𝑥. Thus, the nodes i corresponding xi with the same signs can be
6