PowerPoint Presentation
Lecture 10: Mining Social-Network Graphs
Prof. Michael R. Lyu
Computer Science & Engineering Dept.
The Chinese University of Hong Kong
1
CMSC5741 Big Data Tech. & Apps.
1
Prediction of 2016 US Election
2
2016 ELECTION PREDICTIONS:
HILLARY CLINTON TOPS
DONLAD TRUMP IN
ELECTORAL COLLEGE,
FORECASTERS AGREE
POLITICS
Social Network Prediction of
2016 US Election
3
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
4
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
5
Networks: Communication
6
Graph of the Internet (Autonomous Systems)
Power-law degrees [Faloutsos-Faloutsos-Faloutsos, 1999]
Robustness [Doyle-Willinger, 2005]
Networks: Media
7
Connections between political blogs
Polarization of the network [Adamic-Glance, 2005]
Network: Science
8
Citation networks and Maps of science
[Börner et al., 2012]
Networks: Biology
9
Protein-Protein Interaction Networks:
Nodes: Proteins
Edges: ‘physical’ interactions
Metabolic networks:
Nodes: Metabolites and enzymes Edges: Chemical reactions
Metabolic: 代谢
Enzyme: 酶
9
Networks: Brain
10
Human brain has between 10-100 billion neurons [Sporns, 2011]
10
Networks: Social
11
Country with Largest World Population?
12
Population ??? China India USA Indonesia Brazil 1999 1378 1324 323 261 208
13
13
If Facebook were a Country..
14
4
Not including Facebook’s market share (Nov. 20, 2016): $337 Billion.
Mark Zuckerberg’s net worth: 50 Billion.
Not including China yet!
14
15
Acquired by Facebook
Oct. 6, 2014 Forbes News: WhatsApp founders Jan Koum and Brian Acton became billionaires last February when Facebook announced it was buying the company they had started five years ago for a jaw-dropping $19 billion.
Facebook Inc (FB.O) closed its acquisition of mobile messaging service WhatsApp on Monday, with the final price tag rising an additional $3 billion to roughly $22 billion because of the increased value of Facebook’s stock in recent months.
15
Networks!
Behind each such system there is an intricate wiring diagram, a network, that defines the interactions between the components
We will never understand these systems unless we understand the networks behind them!
16
Social Web – The Lab of Knowledge and Humanity
17
The web is our “laboratory” for knowledge and understanding humanity
Networks: Impact
18
Cisco Market cap: $211b billion (2y ago it was 150b)
Google Market cap: $740b billion (2y ago it was 536b)
Facebook Market cap: $400b billion (2y ago it was 336b)
Networks: Problems
Social networks:
Link prediction, friend recommendation
Social circle detection, community detection
Social recommendations
Identifying influential nodes, viral marketing
Communication networks:
Intrusion detection, fraud
Churn prediction
Information networks:
Navigational aids
19
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
20
Social Network as Graph
Undirected graphs
Links: undirected (symmetrical, reciprocal relations)
Undirected links:
Collaborations
Friendship on Facebook
Directed graphs
Links: directed (asymmetrical relations)
Directed links:
Phone calls
Following on Twitter
21
Adjacency Matrix
22
1
0
Node Degrees
Node degree : the number of edges adjacent to node
Avg. degree:
In directed networks we define an in‐degree and out‐degree.
The (total) degree of a node is the sum of in‐ and out‐degrees.
23
Degree Matrix
24
This graph has entries only on the diagonal.
The entry for row and column i is the degree of the ith node
Laplacian Matrix
Suppose a graph as adjacency matrix and degree matrix .
Laplacian matrix is defined as .
the Laplacian matrix
Has the same entries as on the diagonal.
Off the diagonal, at row and column ,
has −1 if there is an edge between nodes and
0 if not.
25
In-class Practice 1
Go to practice
26
Complete Graph
The maximum number of edges in an undirected graph on N nodes is
A graph with the number of edges is a complete graph
and its average degree is
N-1
27
Networks are Sparse Graphs
Most real‐world networks are sparse
E << Emax (or k << N-1)
Consequence: Adjacency matrix is filled with zeros!
Density of the matrix (E/N2):
WWW=1.51*10-5, MSN IM Instant Messenger = 2.27*10-8
28
More Types of Graphs
Unweighted
Weighted
29
More Types of Graphs
Self-edges (Self-loops)
Multigraph
30
Network Representations
31
Bipartite Graph
Bipartite graph is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V;
that is, U and V are independent sets
Examples:
Authors‐to‐papers (they authored)
Actors‐to‐Movies (they appeared in)
Users‐to‐Movies (they rated)
“Folded” networks:
Author collaboration networks
Movie co‐rating networks
32
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
33
Degree Distribution
Degree distribution : Probability that a randomly chosen node has degree k
Nk = # nodes with degree k
Normalized:
P(k) = Nk / N ➔ plot
34
Paths in a Graph
A path is a sequence of nodes in which each node is linked to the next one
Path can intersect itself and pass through the same edge multiple times
E.g.: ACBDCDEG
In a directed graph a path can only follow the direction of the “arrow”
35
Distance in a Graph
Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes
*If the two nodes are disconnected, the distance is usually defined as infinite
In directed graphs paths need to follow the direction of the arrows
Consequence: Distance is not symmetric: hA,C ≠ hC, A
36
Network Diameter
Diameter: the maximum (shortest path) distance between any pair of nodes in a graph
Average path length for a connected graph (component) or a strongly connected (component of a) directed graph:
Many times we compute the average only over the connected pairs of nodes (we ignore “infinite” length paths)
37
Finding the Shortest Path
Breath‐First Search:
Start with node u, mark it to be at distance hu(u)=0, add u to the queue
While the queue not empty:
Take node v off the queue, put its unmarked neighbors w into the queue and mark hu(w)=hu(v)+1
38
Clustering Coefficient
Clustering coefficient:
What portion of i’s neighbors are connected?
Node i with degree ki
39
where ei is the number of edges
between the neighbors of node i
Clustering Coefficient
Clustering coefficient:
What portion of i’s neighbors are connected?
Node i with degree ki
40
where ei is the number of edges
between the neighbors of node i
Key Network Properties
Degree distribution:
Path length:
Clustering coefficient:
41
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
42
The MSN Messenger
MSN Messenger activity in June 2006:
150Gb/day (compressed)
4.5Tb / month
245 million users logged in
180 million users engaged in conversations
More than 30 billion conversations
More than 255 billion exchanged messages
43
Communication: Geography
44
Communication Network
45
Messaging as a Network
46
MSN Network: Connectivity
47
MSN: Degree Distribution
48
MSN: Log-Log Degree Distribution
49
We plot the same
data as on the
previous slide,
just the axes are
now logarithmic.
MSN: Clustering
50
MSN: Diameter
51
Avg. path length 6.6
90% of the people can be reached in < 8 hops
Number of links
between pairs of
nodes
MSN: Key Network Properties
Degree distribution: heavily skewed. avg degree = 14.4
Path length: 6.6
Clustering coefficient: 0.11
52
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
53
Community Detection
54
How to find communities?
We will work with undirected (unweighted) networks
Betweenness: Strength of Ties
Edge betweenness: number of shortest paths (among all pair of vertices) passing over the edge.
If there is more than one shortest path between a pair of nodes, each path is assigned equal weight such that the total weight of all the paths is equal to 1.
55
Simple algorithm for calculating betweeness.
Repeat for each vertex
compute its shortest path to all other vertices.
for each edge, count the number of the shortest paths passing over the edge
Sum up the results from each vertex.
.5
Betweenness: Strength of Ties
56
Calculation for 1 vertex
Calculation for all vertices
In-class Practice 2
Go to practice
57
Method for Community Detection: Girvan-Newman
Divisive hierarchical clustering based on the notion of edge betweenness:
Number of shortest paths passing through the edge
Girvan-Newman Algorithm:
Undirected unweighted networks
Repeat until no edges are left:
Calculate betweenness of edges
Remove edges with highest betweenness
Connected components are communities
Gives a hierarchical decomposition of the network
58
Girvan-Newman: Example
59
Need to re-compute betweenness at every step
Girvan-Newman: Example
60
Girvan-Newman
61
Communities in physics collaborations
Problems of Girvan-Newman
Computing betweenness is slow.
No theoretical guarantees.
Not widely applied.
62
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
63
Graph Partitioning
Undirected graph 𝑮 (𝑽,𝑬):
Bi-partitioning task:
Divide vertices into two disjoint groups 𝑨,𝑩
Questions:
How can we define a “good” partition of 𝑮?
How can we efficiently identify such a partition?
64
Graph Partitioning
What makes a good partition?
Maximize the number of within-group connections
Minimize the number of between-group connections
65
Graph Cuts
Express partitioning objectives as a function of the “edge cut” of the partition
Cut: Set of edges with only one vertex in a group:
66
Graph Cut Criterion
Criterion: Minimum-cut
Minimize weight of connections between groups
Degenerate case:
Problem:
Only considers external cluster connections
Does not consider internal cluster connectivity
67
Graph Cut Criterion
68
Criterion: Normalized-cut [Shi-Malik, ’97]
Connectivity between groups relative to the density of each group
vol(A): total weight of the edges with at least one endpoint in 𝑨
Why use this criterion?
Produces more balanced partitions
How do we efficiently find a good partition?
Problem: Computing optimal cut is NP-hard
Spectral Graph Partitioning
A: adjacency matrix of undirected G
Aij =1 if (𝒊,𝒋) is an edge, else 0
x is a vector in ℜn with components (𝒙𝟏,…,𝒙𝒏). Think of it as a label/value of each node of 𝑮
What is the meaning of Ax?
Entry yi is a sum of labels xj of neighbors of i
69
xj
What’s the Meaning of Ax?
jth coordinate of Ax:
Sum of the x-values of neighbors of j
Make this a new value at node j
Spectral Graph Theory:
Analyze the “spectrum” of matrix representing 𝑮
Spectrum: Eigenvectors 𝒙𝒊 of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues 𝝀𝒊:
70
Example: d-Regular Graph
Suppose all nodes in 𝑮 have degree 𝒅 and 𝑮 is connected
What are some eigenvalues/vectors of 𝑮?
𝑨⋅ 𝒙 =𝝀⋅𝒙 What is λ? What x?
Let’s try: 𝒙 = (𝟏,𝟏,…,𝟏)
Then: 𝑨⋅𝒙 =𝒅,𝒅,…,𝒅=𝝀⋅𝒙. So: 𝝀=𝒅
We found eigenpair of 𝑮: 𝒙 = (𝟏,𝟏,…,𝟏), 𝝀=𝒅
71
xj
d is the Largest Eigenvalue of A
G is d-regular connected, A is its adjacency matrix
Claim:
d is largest eigenvalue of A,
d has multiplicity of 1 (there is only 1 eigenvector associated with eigenvalue d)
Proof: Why no eigenvalue 𝒅’ > 𝒅?
To obtain d we needed 𝒙𝒊=𝒙𝒋 for every 𝑖,𝑗
This means 𝒙=𝑐⋅(1,1,…,1) for some const. 𝑐
Define: 𝑺 = nodes 𝒊 with maximum possible value of 𝒙𝒊
Then consider some vector 𝒚 which is not a multiple of vector (𝟏,…,𝟏). So not all nodes 𝒊 (with labels 𝒚𝒊 ) are in 𝑺
Consider some node 𝒋∈𝑺 and a neighbor 𝒊∉𝑺 then node 𝒋 gets a value strictly less than 𝒅
So 𝑦 is not eigenvector! And so 𝒅 is the largest eigenvalue!
72
Example: Graph on 2 Components
What if G is not connected?
G has 2 components, each d-regular
What are some eigenvectors?
𝒙= Put all 𝟏s on 𝑨 and 𝟎s on 𝑩 or vice versa
And so in both cases the corresponding
A bit of intuition
73
More Intuition
More Intuitions
If the graph is connected (right example) then we already know that 𝒙𝒏=(𝟏,…𝟏) is an eigenvector
Since eigenvectors are orthogonal then the components of 𝒙𝒏−𝟏 sum to 0.
So we can look at the eigenvector of the 2nd largest eigenvalue and declare nodes with positive label in A and negative label in B. But there is still lots to sort out.
74
Matrix Representation
Adjacency matrix (A):
n*n matrix
A=[aij], aij=1 if edge between node i and j
Important properties:
Symmetric matrix
Eigenvectors are real and orthogonal
75
Matrix Representation
Degree matrix (D):
n*n diagonal matrix
D=[dii], dii = degree of node i
76
Matrix Representation
Laplacian matrix (L):
n*n symmetric matrix
What is trivial Eigenpair?
𝒙=(𝟏,…,𝟏) then 𝑳⋅𝒙=𝟎 and so 𝝀=𝝀𝟏=𝟎
Important properties:
Eigenvalues are non-negative real numbers
Eigenvectors are real and orthogonal
77
Facts about Laplacian L
(a) All eigenvalues are ≥ 0
(b) for every x
(c)
That is, 𝐿 is positive semi-definite
78
as Optimization Problem
Fact: For symmetric matrix L:
79
as Optimization Problem
What else do we know about x?
𝒙 is unit vector:
𝒙 is orthogonal to 1st eigenvector (𝟏,…,𝟏) thus:
Remember
80
We want to assign values 𝒙i to nodes i such that few edges cross 0. (we want xi and xj to subtract each other)
Finding Optimal Cut [Fielder’73]
Back to finding the optimal cut
Express partition (A,B) as a vector
We can minimize the cut of the partition by finding a non-trivial vector x that minimizes
81
Rayleigh Theorem
The minimum value of 𝒇(𝒚) is given by the 2nd smallest eigenvalue λ2 of the Laplacian matrix L
The optimal solution for 𝒇(𝒚) is given by the corresponding eigenvector 𝒙, referred as the Fiedler vector
82
Approx. Guarantee of Spectral (details)
Suppose there is a partition of G into A and B where 𝐴≤|𝐵|, s.t. then .
This is the approximation guarantee of the spectral clustering.
It says the cut spectral is at most 2 away from the optimal one of score .
83
So Far..
How to define a “good” partition of a graph?
Minimize a given graph cut criterion
How to efficiently identify such a partition?
Approximate using information provided by the eigenvalues and eigenvectors of a graph
Spectral Clustering
84
Spectral Clustering Algorithms
Three basic stages:
1) Pre-processing
Construct a matrix representation of the graph
2) Decomposition
Compute eigenvalues and eigenvectors of the matrix
Map each point to a lower-dimensional representation based on one or more eigenvectors
3) Grouping
Assign points to two or more clusters, based on the new representation
85
Spectral Partition Algorithms
86
86
Spectral Partition Algorithms
3) Grouping
Sort components of reduced 1-dimensional vector
Identify clusters by splitting the sorted vector in two
How to choose a splitting point?
Naïve approaches:
Split at 0 or median value
More expensive approaches:
Attempt to minimize normalized cut in 1-dimension (sweep over ordering of nodes induced by the eigenvector)
87
Example: Spectral Partitioning
88
Example: Spectral Partitioning
89
Example: Spectral Partitioning
90
Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
91
Conclusions
How do we reason about networks?
Empirical: Study network data to find organizational principles. How do we measure and quantify networks?
Mathematical models:
Graph theory, statistical models
allow us to understand behaviors and distinguish surprising from expected phenomena
Algorithms
for analyzing graphs
Hard computational challenges
92
Conclusions: This is Just a Beginning
93
One-Slide Takeaway
Key concepts of social graphs
Undirected/Directed Graph/Adjacency Matrix
Shortest path/Diameter
Degree distribution/Path length/Clustering Coefficient
Real world graph
Degree distribution: high-skewed
Most path lengths are small
Community detection algorithms
Purpose of community detection
Spectral clustering
94
Reference
Adamic, Lada A., and Eytan Adar. “Friends and neighbors on the web.” Social networks 25.3 (2003): 211-230.
Kossinets, Gueorgi, and Duncan J. Watts. “Empirical analysis of an evolving social network.” Science 311.5757 (2006): 88-90. APA
Liben-Nowell, David, et al. “Geographic routing in social networks.” Proceedings of the National Academy of Sciences of the United States of America 102.33 (2005): 11623-11628.
Leskovec, Jure, and Eric Horvitz. “Planetary-scale views on a large instant-messaging network.” Proceedings of the 17th international conference on World Wide Web. ACM, 2008.
J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE
Trans. on Pattern Analysis and Machine Intelligence,” 22:8 (2000), pp. 888–905.
S. Fortunato, “Community detection in graphs,” Physics Reports 486:3–5(2010), pp. 75–174.
M. Girvan and M.E.J. Newman, “Community structure in social and bi-ological networks,” Proc. Natl. Acad. Sci. 99 (2002), pp. 7821–7826.
95
Reference
U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing bf17:4 (2007), 2007, pp. 395–416.
J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney, “Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters,” http://arxiv.org/abs/0810.1355.
L. Backstrom and J. Leskovec, “Supervised random walks: predicting and recommending links in social networks,” Proc. Fourth ACM Intl. Conf. on Web Search and Data Mining (2011), pp. 635–644.
Predicting Positive and Negative Links in Online Social Networks by J. Leskovec, D. Huttenlocher, J. Kleinberg. ACM WWW International conference on World Wide Web (WWW), 2010.
Learning to Discover Social Circles in Ego Networks by J. McAuley, J. Leskovec. Neural Information Processing Systems (NIPS), 2012.
Defining and Evaluating Network Communities based on Ground-truth by J. Yang, J. Leskovec. IEEE International Conference On Data Mining (ICDM), 2012.
The Life and Death of Online Groups: Predicting Group Growth and Longevity by S. Kairam, D. Wang, J. Leskovec. ACM International Conference on Web Search and Data Mining (WSDM), 2012
96
In-class Practice 1
For the graph on the right, compute:
The adjacency matrix
The degree matrix
The Laplacian matrix
97
In-class Practice 2
For the graph on the right, compute:
The shortest path between each pairs of nodes
The betweenness of each edge
98
6
1
4
2
5
3
569697395MT00007_Liverpool_
/docProps/thumbnail.jpeg