程序代写代做代考 database Recommender Systems

Recommender Systems

Social Network Analysis
Graph Representations
Robin Burke
DePaul University
Chicago, IL

1

Outline
Internal representations
edge list
adjacency matrix
adjacency list
database table
Bipartite networks
Example

2

Network representation
Pictures are nice!
easy for humans to examine at a glance
A picture is just one representation of a network
not necessarily the best one for all purposes
Computers are not so good at looking at pictures

3

Internal representations
Edge list
Adjacency matrix
Adjacency list
A
B
C
D
E

4

Edge list
A graph is a list of edges
usually we include a list of nodes, too
but not strictly necessary
Edge list…
A
B
C
D
E

5

Hard to compute with
Suppose we want to see if there’s a path from node D to node E
Repeated searches of the list
A
B
C
D
E

6

Adjacency Matrix
Most important graph representation is the adjacency matrix
also “sociomatrix”
Square marix with an entry for each pair of nodes
non-zero entry if the nodes are connected
We will use this representation a lot
fundamental to network, sna, and other R packages

7

Matrix

A
B
C
D
E

8

Directed graph?

A
B
C
D
E

9

Matrix
Great for graph computing
Matrix multiplication = path traversal
more on this later

10

Sparsity
Most networks are sparse
only a small percentage of possible edges exist
Make sense especially in social networks
9.5 million people in metro Chicago
If 1% of the edges existed
everyone would have to know 95,000 others
0.001% more likely
Adjacency matrix is inefficient for sparse networks
matrix is 99.999% zeros

11

Adjacency list
Adjacency list is a compromise
For each node, a list of nodes you can get to
Adjacency list…
There is redundant information
but computation is easier
Typically this is the internal representation for sparse matrices

A
B
C
D
E

12

Quick question
What is the adjacency list representation for node A
A: A -> { A, B, C, D, E }
B: A -> { B, E, D }
C: A -> { B, C, E }
D: A -> { A, B, C, E }
A
B
C
D
E

13

What is the adjacency list representation for node A?
https://www.polleverywhere.com/multiple_choice_polls/T6lx24IekhUnvRP
14

Graph representations
Edge list
compact, good for loading the network
bad for computation
Adjacency matrix
space-inefficient, mathematically elegant
Adjacency list
reasonable compromise

15

Database storage
Edge list
table of pairs for each edge
Adjacency matrix
table of for all i,j
where value = 0 if no edge, 1 if edge
still inefficient
Adjacency list
not so easy to store directly
index against edge list for the same effect

16

Bipartite Networks
Data where the vertices are divided into classes
edges can only go between classes
Typical example
affiliation network
DePaul courses -> students enrolled in those courses
only edges between students and courses
no course-course or student-student edges
Also two-mode network
Can be extended to additional types
Professors -> courses -> students
three-mode network

17

Bipartite networks in
social media
Very common structure
users are connected
by commenting on the same item
by using the same hashtag in posts
by “liking” the same post
etc.
A way to extract a community with common interests

18

Bipartite Example
Two-mode data
women and social events in Nachez, MS in the 1930s
“davis” data

19

Key point
Bipartite networks are not that useful in themselves
strange structure
computations have different meaning

20

Example
Degree distribution
Davis data
This is really hard to make sense of
Two different kinds of nodes mixed together

21

The highest degree a node can have in an undirected one-mode network is n-1. What is the highest degree a node can have in a bipartite network?
https://www.polleverywhere.com/multiple_choice_polls/GiWmd6Q9qcT2p1E
22

Projections
We can “project” a two-mode network
into two single mode networks
Example
Davis
data

23

Projection 1: Person-Person

24

Projection 2: Event-Event

25

Projection results
Dense, weighted networks

P1
P2
P3
P4
E1
E2
P1
P2
P3
P4
1
1
1
2
2
2

26

Projections
Dense
because n incident edges = n(n-1)/2 edges in projection
Weighted
so that we can keep track of the number of shared connections

27

Filtering
Generally, filtering is necessary to clarify structure
P1
P2
P3
P4
1
1
1
2
2
2
P1
P2
P3
P4
2
2
2

28

Bipartite networks example

29

ú
ú
ú
ú
ú
ú
û
ù
ê
ê
ê
ê
ê
ê
ë
é
0
0
0
0
1
0
0
1
1
0
0
1
0
0
1
0
1
0
0
1
1
0
1
1
0

/docProps/thumbnail.jpeg