SWEN90004
Modelling Complex Software Systems
Lecture Cx.09 Networks I: Theory and Models
Artem Polyvyanyy, Nic Geard
artem.polyvyanyy@unimelb.edu.au;
nicholas.geard@unimelb.edu.au
Semester 1, 2021
SLIDE 1
Objectives
To understand how the structure of complex systems can be described in terms of the pattern of interaction between their components, and represented as a network.
To describe the structure of a network in terms of it’s basic properties.
To use algorithms to generate particular classes of networks (small world and
scale-free)
To recognise how the structural characteristics of a system can influence its
functional characteristics.
SLIDE 2
Introduction to networks
SLIDE 3
A brief history of network research
Leonard Euler and the Seven Bridges of Konigsberg (1735)
SLIDE 4
A brief history of network research
Leonard Euler and the Seven Bridges of Konigsberg (1735)
SLIDE 5
Milgram’s “small world” experiments (1960s)
SLIDE 6
Milgram’s “small world” experiments (1960s)
Social psychologist Stanley Milgram conducted a series of experiments in the 1960s to measure the degree to which modern populations were interconnected.
His experimental procedure was:
1. Individuals in various midwest cities were chosen at random and sent a package with basic information about a target person in Boston.
2. If they knew the target person, they were to send the package directly to them.
3. If (as was more likely) they didn’t know the target person, they were to send the package to someone who they
did know, who they thought was more likely to know the target person.
4. When (and if!) the parcel reached the target person, the researchers could count how many times it had been forwarded between origin and destination.
The average number path lengths of the chains of forwarding was between five and six, a phenomenon that has since been popularly referred to as six degrees of separation.
SLIDE 7
Networks with low “degrees of separation” have since been sought and found in many different domains. One notable example is the network of movie actors, where nodes represent actors, and edges indicate that two actors have starred in a film together. In the early eighties, the actor Kevin Bacon was named “the center of the entertainment universe” on the basis that paths of length six or less could be traced to him from arbitrary starting points (other actors).
A brief history of network research
Operations Research (1900s—)
logistics, shortest paths, optimisation Social networks (1960s—)
shift in sociology from individuals to groups to relationships
development of abstract models and patterns: cliques, clusters, power brokers, . . . Modern network science (1990s—)
more abstract models: seeking insights into systems on the basis of shared structural properties
application in a wide range of domains
SLIDE 8
Example networks—biology
http://www.nature.com/nbt/journal/v30/n10/full/nbt.2356.html
SLIDE 9
Red and blue nodes indicate two proteins that that are preferentially secreted by different cell types. Edges indicate interactions between proteins, such as excitatory or inhibitory effects on transcription regulation.
Example networks—biology
http://www.biomedcentral.com/1752-0509/2/101/
SLIDE 10
Again, nodes represent proteins and edges represent interactions between proteins. This example is based on the regulatory network that controls the development of the fruit fly Drosophila melanogaster. Green and red nodes indicate proteins that are over- or under-represented in a mutant compared to the wild type.
Example networks—sociology
http://www.fmsasg.com/SocialNetworkAnalysis/
SLIDE 11
Nodes in social networks typically represent people, while edges represent some type of interaction between those people. Examples may include: “know each other”, “work together”, “are friends on Facebook”, etc. Note that the structure of the network will differ depending on how edges are defined.
It is also possible to represent more complex types of information by labelling edges with the type of interaction; eg: “married to”, “sibling of”, “friend of”, “workmate of”, etc.
Example networks—epidemiology
http://www.freerepublic.com/focus/news/1327834/posts
SLIDE 12
Social encounter networks are of great interest when studying how infectious diseases are transmitted through populations. Different pathogens may require different levels of contact in order to spread, so different types of social network may be relevant for specific diseases.
In this network, the blue and pink nodes represent two different classes of people. What do you think they might be? What might the edges represent? What type of disease might this network be relevant to?
Example networks—computing
http://en.wikipedia.org/wiki/Opte_Project
SLIDE 13
Several projects have attempted to map (portions of) the Internet. This network represents one such attempt based on packet routing between IP addresses. Besides being of aesthetic value, such maps can help to visualise how the internet has grown over time, and identify structural features of interest, such as weakly or densely connected regions.
The structure of a computing network has implications for its function. For example, short path lengths will enable fast communication. The presence of central “hubs” can provide weak points against a targetted attack, while redundant paths can provide protection against random failure.
Why study networks?
More and more data on patterns of interaction is becoming available.
Network science offers tools to help make sense of this data.
We can learn something about the functional properties of these systems by studying their structural properties.
Insights developed in one domain may generalise to other domains.
Networks are everywhere: gene networks, protein networks, many types of social networks (influence, encounter, friendship, etc), food webs, transport networks (roads, bus routes, flight paths), electricity distribution networks, telecommunication networks, supply and distribution networks, company networks, political support networks, . . .
SLIDE 14
Network properties
SLIDE 15
What is a network?
A network consists of:
A set of nodes or vertices
A set of edges or links
Data about nodes and edges
Formally, a graph G = (N, E) is an ordered pair of finite sets. Elements of N are called nodes or vertices. Elements of E ⊂ N × N are called edges or links.
SLIDE 16
Network properties
The structural properties of networks often have functional consequences. Structural properties are often used to compare and contrast network models of different systems.
Networks may be unipartite (all nodes of the same type) or bipartite (nodes of two different types), or k-partite (nodes of k different types).
Edges may be directed or undirected.
bipartite networks: directors and the companies whose boards they site on; actors
and the movies they have appeared in, . . .
directed networks: web pages and the (directed) links between them; food webs with directed edges indicating predation, . . .
SLIDE 17
Network properties
To start, we will consider properties of networks that are unipartite, with undirected edges—these are the most studied, although bipartite networks and directed networks are also very common.
Other common assumptions are that there are:
no parallel edges (ie, there can only be one edge between two nodes) no self-connections (ie, there are no edges between a vertex and itself)
SLIDE 18
Network properties—density
The density D of a network is the ratio of the number of edges in the network to the number of possible edges.
FigureN = 20;E = 30 FigureN = 20;E = 100
A network with N nodes and E edges could have up to N(N−1) possible edges, therefore: 2
2E N(N − 1)
D=
SLIDE 19
Network properties—degree
The degree ki of node i is the number of edges between node i and other nodes in the network. Nodes with high degree are often considered to be more “important”.
The average degree < k > of a network, where < k >= 2E/N.
The degree distribution P(k) of a network is the probability distribution of degrees over
all nodes in the network; P(k) = Nk/N where Nk is the number of nodes of degree k. The degree distributions of many real world networks are strongly skewed.
SLIDE 20
Network properties—degree
FigureYeast protein interaction network FigureWestern US power grid
http://konect.uni-koblenz.de/networks/
SLIDE 21
Network properties—distance
The distance dij between nodes i and j is the length of the shortest path between those two nodes.
FigureA “path” network FigureA “star” network
The characteristic path length L is the average distance (ie, shortest path) between all pairs of nodes in the network:
L= 1 dij
N(N − 1)
i̸=j
The diameter of a network is the longest shortest path between any two nodes; ie, it is the largest number of nodes that must be traversed to move from one node to another without backtracking or looping.
SLIDE 22
Network properties—clustering
The clustering coefficient C is a measure of “cliquishness” in a network; ie, the extent to which, if Alice is friends with Bob and Carol, Bob and Carol are likely to be friends with each other.
Another way of thinking about the clustering coefficient of a node i is in terms of the proportion of possible triangles containing i that exist:
SLIDE 23
Network properties—clustering
The clustering coefficient is defined as:
C=
where:
Tc is the number of closed triplets—sets of three nodes each of which is connected to the other two
To is the number of open triplets—sets of three nodes in which exactly one node links the other two
Formally, the clustering coefficient Ci of node i is defined as the ratio between the number of edges Ei that actually do exist between these ki nodes and the total number ki(ki − 1)/2 possible:
Ci = 2Ei ki(ki −1)
The clustering coefficient of a network is often calculated as the average clustering coefficient over all the nodes in the network
Tc Tc + To
SLIDE 24
Network properties—centrality and assortativity
Centrality is another approach to measuring which are the most important nodes or edges in a network. There are many different definitions including:
the number of shortest paths that pass through a particular node or edge
the extent to which a node is connected to other “important” nodes
Assortativity measures the extent to which similar nodes tend to be connected to each other. With respect to degree, a highly assortative network is one in which the high degree nodes are all connected to each other, while a dissasortative network is one in which high degree nodes tend to be connected to low degree nodes.
SLIDE 25
Types of networks
SLIDE 26
Regular lattices
Every node in a regular lattice has exactly m edges; ie, the degree distribution is given by:
pk = 1 if k = m 0 otherwise
FigureA regular 2D lattice
FigureA circular lattice
(Can have) high clustering coefficient, long characteristic path length
SLIDE 27
These classes of network from the cellular automata models we discussed in week 3. The 1D cellular automata can be represented as a ring (because we allowed the cells on each end to “wrap around”. The example shown here has a larger neighbourhood than the simple 1D CA we studied: each node is connected to its nearest four neighbours (two on each side), rather than its nearest two neighbours (one on each side).
A 2D cellular automata (such as the Game of Life) can be representd as a regular 2D lattice. The example on the left shows a 2D CA with a neighbourhood of four cells, and boundaries that do not wrap around. What would the lattice look like with a neighbourhood of eight cells? What would happen to the clustering coefficient in this case? What would the lattice look like if the boundaries wrapped around? (as in the NetLogo implementation of the Game of Life)
Random networks
Specifically, Erdos-Renyi random graphs, which starts with a set of N nodes and then adds edges between pairs of nodes at random (eg, sampling each edge with a given probability).
Low clustering coefficient, short characteristic path length
SLIDE 28
Small world networks
1. Start with a regular lattice of even degree m 2. Randomly rewire a proportion p of the edges 3. (alternatively, add new edges at random)
SLIDE 29
Rows (a) and (b) show two different mechanisms for creating a small world network: rewiring links and adding links. What differences would arise in other network properties as a result of these different approaches.
Small world networks
Consider the rewiring option:
when p = 0, the network is a regular lattice
when p = 1, the network is completely random
As p approaches 1:
clustering coefficient: C ≈ 3(m/2−1) (1 − p)3
2(m−1)
characteristic path length approaches logN
logm
For some range of value of p between 0 and 1, we observe:
High clustering coefficient and low characteristic path length
SLIDE 30
Small world networks
SLIDE 31
C(p)/C(0) is the average clustering coefficient of a lattice with p edges rewired, relative to the clustering coefficient with no edges rewired (p = 0).
L(p)/L(0) is the average shortest path length of a lattice with p edges rewired, relative to the average shortest path length with no edges rewired (p = 0).
Note that, after only a very small number of rewirings, the average shortest path length drops rapidly, while the clustering coefficient remains relatively high.
Scale free networks
A scale free network is one whose degree distribution follows a power law distribution; ie, pk ∼ k−γ , where γ is usually between 2 and 3.
One straightforward way of constructing a scale free network is via a process of growth and preferential attachment:
1. Start with a small number of nodes, connected at random.
2. Add a new node to the network.
3. Connect the new node to m existing nodes with a probability proportional to the current degree of those nodes.
Repeat steps 2 and 3. As the network grows, positive feedback (“the rich get richer”) will lead to a small number of nodes gaining a disproportionate number of edges.
(Relatively) high clustering coefficient and short average path length
SLIDE 32
pk is the proportion of nodes with k neighbours.
Scale free networks
SLIDE 33
The degree distribution in a scale free network: the x-axis is the degree k; the y-axis is the proportion of nodes in the network that are of degree k. Most nodes in the network are of very low degree (few neighbours), but a small number of nodes are of very high degree (many neighbours).
Scale free networks
SLIDE 34
An example scale free network, in which the size of the node is scaled to reflect it’s degree: large nodes have many neighbours while small nodes have few.
Note the presence of a well-connected “core” of the network, and a more weakly connected “periphery”.
Scale free networks—error and attack tolerance
An interesting property of scale free networks is that they are highly robust to random error.
ie, if a random node “fails” the connectivity of the overall network is not likely to be affected.
However, a trade-off is that they are highly sensitive to targeted attack.
ie, if a specific, high degree node is targeted and “incapacitated”, the connectivity of the overall network can be affected dramatically.
SLIDE 35
Think back to the structure of computer networks (such as the Internet) and the effects of targeted attack and random failure.
Summary
A broad variety of systems can be described in terms of their network structure.
Doing so allows us to quantify and analyse system structure in terms of relatively
simple properties.
These properties can have functional consequences, that may generalise across different systems.
Next lecture:
network dynamics
networks and agent-based models
SLIDE 36