程序代写代做代考 data mining algorithm PowerPoint Presentation

PowerPoint Presentation

Lecture 10: Mining Social-Network Graphs
Prof. Michael R. Lyu
Computer Science & Engineering Dept.
The Chinese University of Hong Kong

1
CMSC5741 Big Data Tech. & Apps.

1

Prediction of 2016 US Election

2

2016 ELECTION PREDICTIONS:
HILLARY CLINTON TOPS
DONLAD TRUMP IN
ELECTORAL COLLEGE,
FORECASTERS AGREE
POLITICS

Social Network Prediction of
2016 US Election

3

Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
4

Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
5

Networks: Communication

6

Graph of the Internet (Autonomous Systems)
Power-law degrees [Faloutsos-Faloutsos-Faloutsos, 1999]
Robustness [Doyle-Willinger, 2005]

Networks: Media
7

Connections between political blogs
Polarization of the network [Adamic-Glance, 2005]

Network: Science

8

Citation networks and Maps of science
[Börner et al., 2012]

Networks: Biology

9

Protein-Protein Interaction Networks:
Nodes: Proteins
Edges: ‘physical’ interactions

Metabolic networks:
Nodes: Metabolites and enzymes Edges: Chemical reactions

Metabolic: 代谢
Enzyme: 酶
9

Networks: Brain

10

Human brain has between 10-100 billion neurons [Sporns, 2011]

10

Networks: Social

11

Country with Largest World Population?
12

Population ??? China India USA Indonesia Brazil 1999 1378 1324 323 261 208

13

13

If Facebook were a Country..

14

4

Not including Facebook’s market share (Nov. 20, 2016): $337 Billion.
Mark Zuckerberg’s net worth: 50 Billion.
Not including China yet!
14

15
Acquired by Facebook

Oct. 6, 2014 Forbes News: WhatsApp founders Jan Koum and Brian Acton became billionaires last February when Facebook announced it was buying the company they had started five years ago for a jaw-dropping $19 billion.

Facebook Inc (FB.O) closed its acquisition of mobile messaging service WhatsApp on Monday, with the final price tag rising an additional $3 billion to roughly $22 billion because of the increased value of Facebook’s stock in recent months.
15

Networks!
Behind each such system there is an intricate wiring diagram, a network, that defines the interactions between the components

We will never understand these systems unless we understand the networks behind them!
16

Social Web – The Lab of Knowledge and Humanity

17
The web is our “laboratory” for knowledge and understanding humanity

Networks: Impact

18

Cisco Market cap: $211b billion (2y ago it was 150b)
Google Market cap: $740b billion (2y ago it was 536b)
Facebook Market cap: $400b billion (2y ago it was 336b)

Networks: Problems
Social networks:
Link prediction, friend recommendation
Social circle detection, community detection
Social recommendations
Identifying influential nodes, viral marketing
Communication networks:
Intrusion detection, fraud
Churn prediction
Information networks:
Navigational aids

19

Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions
20

Social Network as Graph
Undirected graphs
Links: undirected (symmetrical, reciprocal relations)

Undirected links:
Collaborations
Friendship on Facebook
Directed graphs
Links: directed (asymmetrical relations)

Directed links:
Phone calls
Following on Twitter
21

Adjacency Matrix

22

1
0

Node Degrees
Node degree : the number of edges adjacent to node

Avg. degree:

In directed networks we define an in‐degree and out‐degree.
The (total) degree of a node is the sum of in‐ and out‐degrees.

23

Degree Matrix

24

This graph has entries only on the diagonal.
The entry for row and column i is the degree of the ith node

Laplacian Matrix
Suppose a graph as adjacency matrix and degree matrix .
Laplacian matrix is defined as .
the Laplacian matrix
Has the same entries as on the diagonal.
Off the diagonal, at row and column ,
has −1 if there is an edge between nodes and
0 if not.

25

In-class Practice 1
Go to practice
26

Complete Graph
The maximum number of edges in an undirected graph on N nodes is

A graph with the number of edges is a complete graph
and its average degree is
N-1

27

Networks are Sparse Graphs
Most real‐world networks are sparse
E << Emax (or k << N-1) Consequence: Adjacency matrix is filled with zeros! Density of the matrix (E/N2): WWW=1.51*10-5, MSN IM Instant Messenger = 2.27*10-8 28 More Types of Graphs Unweighted Weighted 29 More Types of Graphs Self-edges (Self-loops) Multigraph 30 Network Representations 31 Bipartite Graph Bipartite graph is a graph whose nodes can be divided into two disjoint sets U and V such that every link connects a node in U to one in V; that is, U and V are independent sets Examples: Authors‐to‐papers (they authored) Actors‐to‐Movies (they appeared in) Users‐to‐Movies (they rated) “Folded” networks: Author collaboration networks Movie co‐rating networks 32 Outline Introduction Social Networks as Graph Basics of Graph Theory Important properties of social networks as graphs Analysis of a real-world social network Community detection algorithms Girvan-Newman algorithm Spectral clustering Conclusions 33 Degree Distribution Degree distribution : Probability that a randomly chosen node has degree k Nk = # nodes with degree k Normalized: P(k) = Nk / N ➔ plot 34 Paths in a Graph A path is a sequence of nodes in which each node is linked to the next one Path can intersect itself and pass through the same edge multiple times E.g.: ACBDCDEG In a directed graph a path can only follow the direction of the “arrow” 35 Distance in a Graph Distance (shortest path, geodesic) between a pair of nodes is defined as the number of edges along the shortest path connecting the nodes *If the two nodes are disconnected, the distance is usually defined as infinite In directed graphs paths need to follow the direction of the arrows Consequence: Distance is not symmetric: hA,C ≠ hC, A 36 Network Diameter Diameter: the maximum (shortest path) distance between any pair of nodes in a graph Average path length for a connected graph (component) or a strongly connected (component of a) directed graph: Many times we compute the average only over the connected pairs of nodes (we ignore “infinite” length paths) 37 Finding the Shortest Path Breath‐First Search: Start with node u, mark it to be at distance hu(u)=0, add u to the queue While the queue not empty: Take node v off the queue, put its unmarked neighbors w into the queue and mark hu(w)=hu(v)+1 38 Clustering Coefficient Clustering coefficient: What portion of i’s neighbors are connected? Node i with degree ki 39 where ei is the number of edges between the neighbors of node i Clustering Coefficient Clustering coefficient: What portion of i’s neighbors are connected? Node i with degree ki 40 where ei is the number of edges between the neighbors of node i Key Network Properties Degree distribution: Path length: Clustering coefficient: 41 Outline Introduction Social Networks as Graph Basics of Graph Theory Important properties of social networks as graphs Analysis of a real-world social network Community detection algorithms Girvan-Newman algorithm Spectral clustering Conclusions 42 The MSN Messenger MSN Messenger activity in June 2006: 150Gb/day (compressed) 4.5Tb / month 245 million users logged in 180 million users engaged in conversations More than 30 billion conversations More than 255 billion exchanged messages 43 Communication: Geography 44 Communication Network 45 Messaging as a Network 46 MSN Network: Connectivity 47 MSN: Degree Distribution 48 MSN: Log-Log Degree Distribution 49 We plot the same data as on the previous slide, just the axes are now logarithmic. MSN: Clustering 50 MSN: Diameter 51 Avg. path length 6.6 90% of the people can be reached in < 8 hops Number of links between pairs of nodes MSN: Key Network Properties Degree distribution: heavily skewed. avg degree = 14.4 Path length: 6.6 Clustering coefficient: 0.11 52 Outline Introduction Social Networks as Graph Basics of Graph Theory Important properties of social networks as graphs Analysis of a real-world social network Community detection algorithms Girvan-Newman algorithm Spectral clustering Conclusions 53 Community Detection 54 How to find communities? We will work with undirected (unweighted) networks Betweenness: Strength of Ties Edge betweenness: number of shortest paths (among all pair of vertices) passing over the edge. If there is more than one shortest path between a pair of nodes, each path is assigned equal weight such that the total weight of all the paths is equal to 1. 55 Simple algorithm for calculating betweeness. Repeat for each vertex compute its shortest path to all other vertices. for each edge, count the number of the shortest paths passing over the edge Sum up the results from each vertex. .5 Betweenness: Strength of Ties 56 Calculation for 1 vertex Calculation for all vertices In-class Practice 2 Go to practice 57 Method for Community Detection: Girvan-Newman Divisive hierarchical clustering based on the notion of edge betweenness: Number of shortest paths passing through the edge Girvan-Newman Algorithm: Undirected unweighted networks Repeat until no edges are left: Calculate betweenness of edges Remove edges with highest betweenness Connected components are communities Gives a hierarchical decomposition of the network 58 Girvan-Newman: Example 59 Need to re-compute betweenness at every step Girvan-Newman: Example 60 Girvan-Newman 61 Communities in physics collaborations Problems of Girvan-Newman Computing betweenness is slow. No theoretical guarantees. Not widely applied. 62 Outline Introduction Social Networks as Graph Basics of Graph Theory Important properties of social networks as graphs Analysis of a real-world social network Community detection algorithms Girvan-Newman algorithm Spectral clustering Conclusions 63 Graph Partitioning Undirected graph 𝑮 (𝑽,𝑬): Bi-partitioning task: Divide vertices into two disjoint groups 𝑨,𝑩 Questions: How can we define a “good” partition of 𝑮? How can we efficiently identify such a partition? 64 Graph Partitioning What makes a good partition? Maximize the number of within-group connections Minimize the number of between-group connections 65 Graph Cuts Express partitioning objectives as a function of the “edge cut” of the partition Cut: Set of edges with only one vertex in a group: 66 Graph Cut Criterion Criterion: Minimum-cut Minimize weight of connections between groups Degenerate case: Problem: Only considers external cluster connections Does not consider internal cluster connectivity 67 Graph Cut Criterion 68 Criterion: Normalized-cut [Shi-Malik, ’97] Connectivity between groups relative to the density of each group vol(A): total weight of the edges with at least one endpoint in 𝑨 Why use this criterion? Produces more balanced partitions How do we efficiently find a good partition? Problem: Computing optimal cut is NP-hard Spectral Graph Partitioning A: adjacency matrix of undirected G Aij =1 if (𝒊,𝒋) is an edge, else 0 x is a vector in ℜn with components (𝒙𝟏,…,𝒙𝒏). Think of it as a label/value of each node of 𝑮 What is the meaning of Ax? Entry yi is a sum of labels xj of neighbors of i 69 xj What’s the Meaning of Ax? jth coordinate of Ax: Sum of the x-values of neighbors of j Make this a new value at node j Spectral Graph Theory: Analyze the “spectrum” of matrix representing 𝑮 Spectrum: Eigenvectors 𝒙𝒊 of a graph, ordered by the magnitude (strength) of their corresponding eigenvalues 𝝀𝒊: 70 Example: d-Regular Graph Suppose all nodes in 𝑮 have degree 𝒅 and 𝑮 is connected What are some eigenvalues/vectors of 𝑮? 𝑨⋅ 𝒙 =𝝀⋅𝒙 What is λ? What x? Let’s try: 𝒙 = (𝟏,𝟏,…,𝟏) Then: 𝑨⋅𝒙 =𝒅,𝒅,…,𝒅=𝝀⋅𝒙. So: 𝝀=𝒅 We found eigenpair of 𝑮: 𝒙 = (𝟏,𝟏,…,𝟏), 𝝀=𝒅 71 xj d is the Largest Eigenvalue of A G is d-regular connected, A is its adjacency matrix Claim: d is largest eigenvalue of A, d has multiplicity of 1 (there is only 1 eigenvector associated with eigenvalue d) Proof: Why no eigenvalue 𝒅’ > 𝒅?
To obtain d we needed 𝒙𝒊=𝒙𝒋 for every 𝑖,𝑗
This means 𝒙=𝑐⋅(1,1,…,1) for some const. 𝑐
Define: 𝑺 = nodes 𝒊 with maximum possible value of 𝒙𝒊
Then consider some vector 𝒚 which is not a multiple of vector (𝟏,…,𝟏). So not all nodes 𝒊 (with labels 𝒚𝒊 ) are in 𝑺
Consider some node 𝒋∈𝑺 and a neighbor 𝒊∉𝑺 then node 𝒋 gets a value strictly less than 𝒅
So 𝑦 is not eigenvector! And so 𝒅 is the largest eigenvalue!

72

Example: Graph on 2 Components
What if G is not connected?
G has 2 components, each d-regular
What are some eigenvectors?
𝒙= Put all 𝟏s on 𝑨 and 𝟎s on 𝑩 or vice versa

And so in both cases the corresponding
A bit of intuition

73

More Intuition
More Intuitions

If the graph is connected (right example) then we already know that 𝒙𝒏=(𝟏,…𝟏) is an eigenvector
Since eigenvectors are orthogonal then the components of 𝒙𝒏−𝟏 sum to 0.
So we can look at the eigenvector of the 2nd largest eigenvalue and declare nodes with positive label in A and negative label in B. But there is still lots to sort out.

74

Matrix Representation
Adjacency matrix (A):
n*n matrix
A=[aij], aij=1 if edge between node i and j

Important properties:
Symmetric matrix
Eigenvectors are real and orthogonal

75

Matrix Representation
Degree matrix (D):
n*n diagonal matrix
D=[dii], dii = degree of node i

76

Matrix Representation
Laplacian matrix (L):
n*n symmetric matrix

What is trivial Eigenpair?
𝒙=(𝟏,…,𝟏) then 𝑳⋅𝒙=𝟎 and so 𝝀=𝝀𝟏=𝟎
Important properties:
Eigenvalues are non-negative real numbers
Eigenvectors are real and orthogonal

77

Facts about Laplacian L
(a) All eigenvalues are ≥ 0
(b) for every x
(c)

That is, 𝐿 is positive semi-definite

78

as Optimization Problem
Fact: For symmetric matrix L:

79

as Optimization Problem
What else do we know about x?
𝒙 is unit vector:
𝒙 is orthogonal to 1st eigenvector (𝟏,…,𝟏) thus:

Remember

80

We want to assign values 𝒙i to nodes i such that few edges cross 0. (we want xi and xj to subtract each other)

Finding Optimal Cut [Fielder’73]
Back to finding the optimal cut
Express partition (A,B) as a vector

We can minimize the cut of the partition by finding a non-trivial vector x that minimizes

81

Rayleigh Theorem

The minimum value of 𝒇(𝒚) is given by the 2nd smallest eigenvalue λ2 of the Laplacian matrix L
The optimal solution for 𝒇(𝒚) is given by the corresponding eigenvector 𝒙, referred as the Fiedler vector

82

Approx. Guarantee of Spectral (details)
Suppose there is a partition of G into A and B where 𝐴≤|𝐵|, s.t. then .
This is the approximation guarantee of the spectral clustering.
It says the cut spectral is at most 2 away from the optimal one of score .

83

So Far..
How to define a “good” partition of a graph?
Minimize a given graph cut criterion
How to efficiently identify such a partition?
Approximate using information provided by the eigenvalues and eigenvectors of a graph
Spectral Clustering

84

Spectral Clustering Algorithms
Three basic stages:
1) Pre-processing
Construct a matrix representation of the graph
2) Decomposition
Compute eigenvalues and eigenvectors of the matrix
Map each point to a lower-dimensional representation based on one or more eigenvectors
3) Grouping
Assign points to two or more clusters, based on the new representation

85

Spectral Partition Algorithms

86

86

Spectral Partition Algorithms
3) Grouping
Sort components of reduced 1-dimensional vector
Identify clusters by splitting the sorted vector in two
How to choose a splitting point?
Naïve approaches:
Split at 0 or median value
More expensive approaches:
Attempt to minimize normalized cut in 1-dimension (sweep over ordering of nodes induced by the eigenvector)

87

Example: Spectral Partitioning
88

Example: Spectral Partitioning

89

Example: Spectral Partitioning

90

Outline
Introduction
Social Networks as Graph
Basics of Graph Theory
Important properties of social networks as graphs
Analysis of a real-world social network
Community detection algorithms
Girvan-Newman algorithm
Spectral clustering
Conclusions

91

Conclusions
How do we reason about networks?
Empirical: Study network data to find organizational principles. How do we measure and quantify networks?
Mathematical models:
Graph theory, statistical models
allow us to understand behaviors and distinguish surprising from expected phenomena
Algorithms
for analyzing graphs
Hard computational challenges

92

Conclusions: This is Just a Beginning

93

One-Slide Takeaway
Key concepts of social graphs
Undirected/Directed Graph/Adjacency Matrix
Shortest path/Diameter
Degree distribution/Path length/Clustering Coefficient
Real world graph
Degree distribution: high-skewed
Most path lengths are small
Community detection algorithms
Purpose of community detection
Spectral clustering

94

Reference
Adamic, Lada A., and Eytan Adar. “Friends and neighbors on the web.” Social networks 25.3 (2003): 211-230.
Kossinets, Gueorgi, and Duncan J. Watts. “Empirical analysis of an evolving social network.” Science 311.5757 (2006): 88-90. APA
Liben-Nowell, David, et al. “Geographic routing in social networks.” Proceedings of the National Academy of Sciences of the United States of America 102.33 (2005): 11623-11628.
Leskovec, Jure, and Eric Horvitz. “Planetary-scale views on a large instant-messaging network.” Proceedings of the 17th international conference on World Wide Web. ACM, 2008.
J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE
Trans. on Pattern Analysis and Machine Intelligence,” 22:8 (2000), pp. 888–905.
S. Fortunato, “Community detection in graphs,” Physics Reports 486:3–5(2010), pp. 75–174.
M. Girvan and M.E.J. Newman, “Community structure in social and bi-ological networks,” Proc. Natl. Acad. Sci. 99 (2002), pp. 7821–7826.
95

Reference
U. von Luxburg, “A tutorial on spectral clustering,” Statistics and Computing bf17:4 (2007), 2007, pp. 395–416.
J. Leskovec, K.J. Lang, A. Dasgupta, and M.W. Mahoney, “Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters,” http://arxiv.org/abs/0810.1355.
L. Backstrom and J. Leskovec, “Supervised random walks: predicting and recommending links in social networks,” Proc. Fourth ACM Intl. Conf. on Web Search and Data Mining (2011), pp. 635–644.
Predicting Positive and Negative Links in Online Social Networks by J. Leskovec, D. Huttenlocher, J. Kleinberg. ACM WWW International conference on World Wide Web (WWW), 2010.
Learning to Discover Social Circles in Ego Networks by J. McAuley, J. Leskovec. Neural Information Processing Systems (NIPS), 2012.
Defining and Evaluating Network Communities based on Ground-truth by J. Yang, J. Leskovec. IEEE International Conference On Data Mining (ICDM), 2012.
The Life and Death of Online Groups: Predicting Group Growth and Longevity by S. Kairam, D. Wang, J. Leskovec. ACM International Conference on Web Search and Data Mining (WSDM), 2012
96

In-class Practice 1
For the graph on the right, compute:
The adjacency matrix
The degree matrix
The Laplacian matrix

97

In-class Practice 2
For the graph on the right, compute:
The shortest path between each pairs of nodes
The betweenness of each edge
98
6
1
4
2
5
3

569697395MT00007_Liverpool_

/docProps/thumbnail.jpeg