程序代写代做代考 algorithm Recommender Systems

Recommender Systems

Social Network Analysis
Visualization
Robin Burke
DePaul University
Chicago, IL

1

Network Visualization

Jean Ensminger
Social network analysis of 12 African Villages, 2006

Harry Potter
https://www.behance.net/gallery/24910555/Harry-Potter-relations-network

Visual Elements
Spatial layout of nodes
Spatial layout of edges
determined (mostly) by node layout
Properties of nodes
size, color, label, shape, texture
Properties of edges
size, color, label, texture

Caveat
Network size matters
~ 20 nodes
~ 100 nodes
~ 1000
~ 10000
~1,000,000
~10,000,000

Layout: Node Distance
Distances correspond to ties
connected vertices should be closer than disconnected ones
Try to equalize edge length


Layout: Avoid
Vertices and edges too close together
hard to distinguish
Small angles between incident edges
Edges intersect non-incident edges
crossing lines look like vertices
Non-edges pass through vertices
this is usually avoidable

Edge Layout
Straight is customary
easy to see where an edge is heading
Curved edges have advocates
avoids collinearity (sometimes)
this is a default in Gephi preview
I am NOT a fan

Layout Algorithms
Placing nodes by hand very tedious
Need a system doing this for you
Different algorithms
aim to preserve different properties
impose different contraints
work better or worse on networks of different types

But
Layout algorithms will only get you 80-90% of a good result
You should expect to tweak algorithmic layouts by hand
For finished product
Easy to do in Gephi

Inspiration: Maps

Different scale

Another Type

Node Properties
Mappings
value  display property
Label
Size
Color
Shape

Node Label
Makes sense if
network is small
labels are meaningful
Can also identify a small number of nodes to label
leave the rest blank
Can scale the label by value
easy to do in Gephi

Anti-example

Node ids are not meaningful

Anti-example
Too much text
Unreadable

Area
Large size highlights important nodes
they occupy more visual space
Area is 5th on the quantitative list
But other not available for nodes
Usually size parameter controls radius but
actual 2D size is the square of radius
a value twice as big appears 4x more
Acceptable size range is limited
too small nodes disappear
too large node overlap
Layout algorithms usually do not take node size into account
Some Gephi layouts do

Anti-example
Small nodes hard to see
Central nodes overlap

Color
Eye can detect many different colors
but distinguishing between shades somewhat difficult
esp. if not adjacent
people vary in color perception
Hue has no natural order
is blue > green?
Hue is last in the quantitative list

Anti-example
Shades hard to distinguish
Scale hard to interpret
Hue is last on list

Better

Mapping to density

Shape
Node shape can be used
to distinguish a nominal set
ex. multimode network
Supported in igraph only
not Gephi
Hard to distinguish shapes
for small node sizes

Multidimensional Display
Possible to show multiple dimensions
by using color and size
Helps identify contrasts and outliers

Example

Mapped Ranges
Every mapping has an input range and an output range
and sometimes a transformation
Example: dolphin degree
input: 1 – 12
output: 3 – 15
transform: none
Example: Les Mis weighted degree
input: 1 -158
output: 3 – 18
transform: log
makes the scale manageable
esp. when only a few large outliers
“Spline” technique in Gephi

Node Properties
Effective use of node property mappings
makes for informative visualizations
Have to choose
best mapping for a data type
appropriate transformations

Edge Properties
Definitely less important than node properties
Many more edges than nodes
can’t hang too much information there
Unless the network is very small

Edge Properties
Mappings
value  display property
Label
Width
Color
Arc

Edge Label
Similar to node labels
Makes sense if
network is small
labels are meaningful
Can also identify a small number of edges to label
leave the rest blank
Can scale the label by edge weight

Width
Width highlights important edges
they occupy more visual space
Effectively area
Acceptable width range is limited
too small edges disappear
too large they compete with nodes visually

Example
Les Miserable network
Size by weighted degree
Width by edge weight
# of co-appearances

Color
Similar to node color
Gephi by default colors edges based on the nodes they connect
Allows contrast without problems of size
But remember
Hue is the worst
Density is better

Example
Les Miserables
with edge color = weight

Multidimensional Display
Possible to show multiple dimensions
by using color and size
Usually you don’t have as many edge attributes

Layout is Complex
Impossible to meet all the constraints in a large graph
connected nodes close
edges similar length
avoid crossings
avoid small angles
etc.
Layout algorithms try to approximate solutions in different ways

Force-directed layout
Basic idea
simulate a physical system
For example
Vertices are charged particles
Exert a repulsive force
Edges are springs
Keep connected vertices together

37

Goal: minimum energy
Vertices not too close
repulsive energy
Edges not too long
stretch energy

Forces

Forces

Forces

Forces

Physical models
FR
particle: attract + repel
Others
springs only
Optimizations
consider only local forces
multi-scale

Termination
When to stop?
50 iterations?
Energy threshold?
Local minimum reached?
User input?
In R
default to 500 iterations
In Gephi
user stops the process

What layout is best?
Depends
on the graph
on what you are trying to show
More in the lab
Rule of thumb
if graph is disconnected or small
try FR first
otherwise try Force Atlas 2 first
Kamada-Kawai in igraph
if graph is very large
Yifan Hu Proportional in Gephi
use drl or lgl in igraph

Gephi
much more user-friendly for graph output
interactive control over layout parameters
ability to drag and drop nodes

Break
Lab Daley 505