Mining Social Networks
COMPSCI 753 Kaiqi Zhao
Overview
§ Social network analysis
§ Social network introduction § Community detection
§ Influence maximization
14
How to define good communities?
1 2456
3
5
6
5
6 Actual
§ Edge betweenness
1 2
3
Cut
§ Best cuts § Modularity
2
3
4
1
1 -2
3
5
6
15
4
4 Expected
Community detection
§ Girvan-Newman algorithm § Spectral clustering
§ Modularity maximization
16
Weak ties and communities
The weak tie between Ego and his acquaintance, therefore, becomes not merely a trivial acquaintance tie but rather a crucial bridge between the two densely knit clumps of close friends
— by Mark Granovetter
17
Weak ties and communities
§ How can we measure the importance of an edge as a bridge between communities?
§ Edge betweenness: Number of shortest paths passing over the edge
18
Computing Betweenness
§ Ulrik Brandes, 2001
§ Idea on unweighted graph – breadth-first-search, level = distance
19
Brandes’ algorithm
§ Step 1: Top-down to compute # shortest path from root to a node § Step 2: Bottom-up to distribute # shortest path to edges
Top-down:
Ø Start from root and assign 1 to root
Ø For each node, add the number of shortest path of its parents
1 1111
212
3
3 6
20
Brandes’ algorithm
§ Step 1: Top-down to compute # shortest path from root to a node § Step 2: Bottom-up to distribute # shortest path to edges
Bottom-up:
ØInitially, each node gets 1 credit Ø Start from leaves, backpropagate
the credits
Ø For each node with 𝑘 parents,
each has 𝑝% shortest paths:
Ø Distribute the credit of the
node to each edge proportional to 𝑝%
2 1
1
2 2 4 2
4 1 2 112211
1 1 221222
1 0.5 1.5 3
0.5
6
1
0.5 1
3 1.5
0.5
21
Brandes’ algorithm
§ Step 1: Top-down to compute # shortest path from root to a node § Step 2: Bottom-up to distribute # shortest path to edges
Complexity
ØWe need to traverse the (unweighted graph) with BFS fromtop,andthereversefrom bottom
Ø BFS takes O(|E|) time
Ø Need to run BFS for every node! Overall complexity – 𝑂(|𝐸||𝑉|)
2 1 4
2 1 2
0.5 0.5
1
2
2 1 1
2 2 1 1 0.5
1 1
1.5 3
0.5
3 1.5 6
2
4
1
1 1 2
1 2 2 2
22
Girvan-Newman algorithm
§ Divisive hierarchical clustering based on edge betweenness
§ How?
1. ApplyBrandes’algorithmtocomputebetweenness
2. Removetheedge(s)withhighestbetweenness
3. Repeat1and2untilnofurtheredgesareleftorwegetanumberof𝑘 clusters
1
12
33
49
Communities with different granularities
23