Recommender Systems
Social Network Analysis
Visualization
Robin Burke
DePaul University
Chicago, IL
1
Network Visualization
Jean Ensminger
Social network analysis of 12 African Villages, 2006
Harry Potter
https://www.behance.net/gallery/24910555/Harry-Potter-relations-network
Visual Elements
Spatial layout of nodes
Spatial layout of edges
determined (mostly) by node layout
Properties of nodes
size, color, label, shape, texture
Properties of edges
size, color, label, texture
Caveat
Network size matters
~ 20 nodes
~ 100 nodes
~ 1000
~ 10000
~1,000,000
~10,000,000
Layout: Node Distance
Distances correspond to ties
connected vertices should be closer than disconnected ones
Try to equalize edge length
✖
✔
Layout: Avoid
Vertices and edges too close together
hard to distinguish
Small angles between incident edges
Edges intersect non-incident edges
crossing lines look like vertices
Non-edges pass through vertices
this is usually avoidable
✖
Edge Layout
Straight is customary
easy to see where an edge is heading
Curved edges have advocates
avoids collinearity (sometimes)
this is a default in Gephi preview
I am NOT a fan
Layout Algorithms
Placing nodes by hand very tedious
Need a system doing this for you
Different algorithms
aim to preserve different properties
impose different contraints
work better or worse on networks of different types
But
Layout algorithms will only get you 80-90% of a good result
You should expect to tweak algorithmic layouts by hand
For finished product
Easy to do in Gephi
Inspiration: Maps
Different scale
Another Type
Node Properties
Mappings
value display property
Label
Size
Color
Shape
Node Label
Makes sense if
network is small
labels are meaningful
Can also identify a small number of nodes to label
leave the rest blank
Can scale the label by value
easy to do in Gephi
Anti-example
Node ids are not meaningful
Anti-example
Too much text
Unreadable
Area
Large size highlights important nodes
they occupy more visual space
Area is 5th on the quantitative list
But other not available for nodes
Usually size parameter controls radius but
actual 2D size is the square of radius
a value twice as big appears 4x more
Acceptable size range is limited
too small nodes disappear
too large node overlap
Layout algorithms usually do not take node size into account
Some Gephi layouts do
Anti-example
Small nodes hard to see
Central nodes overlap
Color
Eye can detect many different colors
but distinguishing between shades somewhat difficult
esp. if not adjacent
people vary in color perception
Hue has no natural order
is blue > green?
Hue is last in the quantitative list
Anti-example
Shades hard to distinguish
Scale hard to interpret
Hue is last on list
Better
Mapping to density
Shape
Node shape can be used
to distinguish a nominal set
ex. multimode network
Supported in igraph only
not Gephi
Hard to distinguish shapes
for small node sizes
Multidimensional Display
Possible to show multiple dimensions
by using color and size
Helps identify contrasts and outliers
Example
Mapped Ranges
Every mapping has an input range and an output range
and sometimes a transformation
Example: dolphin degree
input: 1 – 12
output: 3 – 15
transform: none
Example: Les Mis weighted degree
input: 1 -158
output: 3 – 18
transform: log
makes the scale manageable
esp. when only a few large outliers
“Spline” technique in Gephi
Node Properties
Effective use of node property mappings
makes for informative visualizations
Have to choose
best mapping for a data type
appropriate transformations
Edge Properties
Definitely less important than node properties
Many more edges than nodes
can’t hang too much information there
Unless the network is very small
Edge Properties
Mappings
value display property
Label
Width
Color
Arc
Edge Label
Similar to node labels
Makes sense if
network is small
labels are meaningful
Can also identify a small number of edges to label
leave the rest blank
Can scale the label by edge weight
Width
Width highlights important edges
they occupy more visual space
Effectively area
Acceptable width range is limited
too small edges disappear
too large they compete with nodes visually
Example
Les Miserable network
Size by weighted degree
Width by edge weight
# of co-appearances
Color
Similar to node color
Gephi by default colors edges based on the nodes they connect
Allows contrast without problems of size
But remember
Hue is the worst
Density is better
Example
Les Miserables
with edge color = weight
Multidimensional Display
Possible to show multiple dimensions
by using color and size
Usually you don’t have as many edge attributes
Layout is Complex
Impossible to meet all the constraints in a large graph
connected nodes close
edges similar length
avoid crossings
avoid small angles
etc.
Layout algorithms try to approximate solutions in different ways
Force-directed layout
Basic idea
simulate a physical system
For example
Vertices are charged particles
Exert a repulsive force
Edges are springs
Keep connected vertices together
37
Goal: minimum energy
Vertices not too close
repulsive energy
Edges not too long
stretch energy
Forces
Forces
Forces
Forces
Physical models
FR
particle: attract + repel
Others
springs only
Optimizations
consider only local forces
multi-scale
Termination
When to stop?
50 iterations?
Energy threshold?
Local minimum reached?
User input?
In R
default to 500 iterations
In Gephi
user stops the process
What layout is best?
Depends
on the graph
on what you are trying to show
More in the lab
Rule of thumb
if graph is disconnected or small
try FR first
otherwise try Force Atlas 2 first
Kamada-Kawai in igraph
if graph is very large
Yifan Hu Proportional in Gephi
use drl or lgl in igraph
Gephi
much more user-friendly for graph output
interactive control over layout parameters
ability to drag and drop nodes
Break
Lab Daley 505