Graph Theory
Data Structures and Abstractions
Graph Theory
Lecture 32
*
The Origins of Graph Theory
Graph Theory (unlike a lot of what we do) dates back to before 1736.
In Konisberg there were two islands in the middle of a river, connected by 7 bridges.
The question was: “is it possible
to cross each bridge exactly once and end up back where you started?”
In 1736, Euler answered this problem by establishing “Graph Theory” as a discipline. (The answer is “no”)
Another Common Problem
As a child you may have met something similar:
Draw the shape below without taking your pen off the page and without going over any line or node more than once.
It is a graph problem, just as the Konisberg problem is a graph problem.
But…
But the theory itself remained a kind of mathematician-only esoteric field until
Computers became available that could handle graph processing algorithms in reasonable time.
Many of the complex problems of society were recognised to be graph problems.
It was realised that Network traffic and the WWW were graphs.
Some AI applications (simulations, neural networks etc) were discovered to use graph theory.
Computer game playing required graph theory.
So What is a Graph?
A graph is a set of vertices connected by edges.
Two vertices are adjacent if they are connected by a single edge.
A graph is weighted if there is a number associated with each edge. (can be cost, distance, ..etc)
A graph is directed if any of the edges are one-way.
vertex
edge
adjacent vertices
non-adjacent vertices
5
8
6
7
11
12
Graph Definitions
A path is a sequence of adjacent vertices
A simple path is one that has distinct edges: no vertex is visited twice.
A cyclic path is one where the start and finish are the same vertex.
Two paths are disjoint if they have no vertices in common, other than, possibly, their endpoints. [see the red and blue paths]
1
2
3
4
5
6
In an unweighted graph, the length of a path is the number of traversed edges.
In a weighted graph, the length of a path is the sum of the weights of the traversed edges.
Length = 5
40
55
70
65
95
Length = 95+65+40+55+70 = 325
A tour is a cyclic path that touches every vertex.
* of 22
A connected graph is one where every vertex is reachable from every other vertex
Connected
Not connected
A graph with no cycles (an acyclic graph) is a tree.
A complete graph is one where every vertex is adjacent to every other vertex.
Data Structures to Represent Graphs
Representing a graph as vertices and edges is fine in the abstract (physical) sense but makes processing too difficult.
Two alternatives are therefore used within programming:
Adjacency matrices
Constant access time
Slow search time
Adjacency list
Fast search time
Slow access time
For both of these, the vertices are arbitrarily numbered.
Adjacency Matrix Representation
The graph is represented as a two dimensional array of boolean.
A vertex is not considered to connect to itself.
3
1
4
2
5
0
false false true true false false
false false true true false true
true true false false false true
true true false false false true
false false false false false true
false true true true true false
0
1
2
3
4
5
0 1 2 3 4 5
Drawing an Adjacency Matrix
Make sure you can draw an adjacency matrix for a graph. Use the Graph program to check your answers.
Adjacency List Representation
The graph is represented as a one dimensional sorted list of connected vertices:
3
1
4
2
5
0
2
3
2
3
5
0
1
5
0
1
5
5
1
2
3
4
0
1
2
3
4
5
Drawing an Adjacency List
Make sure you can draw an adjacency list for a graph. Use the Graph program to check your answers.
Matrix and List Comparison
Advantages of Lists
More flexible as the size is not fixed
Less space used: O(V+E) rather than O(V2) for a matrix.
Faster processing (searching) at each vertex
Advantages of Matrices
Easier to program
Access time to find out if a pair of vertices are connected is constant time as opposed to O(V) for lists.
Directed Graph Definitions
Directed graphs are also known as di-graphs.
A vertex is reachable from another vertex if there is a path between them.
It is assumed that each vertex can reach itself.
The in-degree of a vertex is the number of edges leading into a vertex.
The out-degree of a vertex is the number of edges leading out of a vertex.
A sink is a vertex with out-degree of zero.
A source is a vertex of in-degree 1: it is reachable only from itself.
Sink
Source
A map is a di-graph where every vertex has out-degree 1.
A di-graph is strongly connected if every vertex is reachable from every other vertex.
A di-graph with no cycles is an Acyclic Directed Graph, or DAG.
Adjacency Matrix Representation of a Di-graph
The di-graph is represented as a two dimensional array of boolean.
Note that in a di-graph vertices are considered to be connected to themselves.
3
1
4
2
5
0
true false true true false false
false true true false false true
false true true false false true
true true false true false true
false false false false true true
false true false true true true
0
1
2
3
4
5
0 1 2 3 4 5
*
Adjacency List Representation
The di-graph is represented as a one dimensional sorted list of connected vertices:
2
2
3
5
1
5
0
1
5
5
1
3
4
3
1
4
2
5
0
0
1
2
3
4
5
0
1
2
3
4
5
Graph Domains
Traversal problems
Travel itineraries
Neural networks
The WWW (the biggest graph of them all)
Electric circuits
Schedules
Financial transactions
Compilers use graphs to represent call structures
Within games software
UML diagrams, data flow diagrams, E-R diagrams etc
Automatic diagram generation
etc
Some Graph, Di-Graph and DAG Processing Problems
Searching: how do we get from a particular vertex to another.
Connectivity: is a given graph connected.
Find the minimum length set of edges that connects all vertices (the Minimum Spanning Tree or MST).
Find the shortest path between two vertices.
Find the shortest path from a specific vertex to all other vertices (the Shortest Path Tree or SPT).
Planarity: can a specific graph be drawn without any intersecting lines?
Matching: what is the largest subset of edges with the property that no two are connected to the same vertex?
Find the tour with the shortest path (mail carrier problem).
Topological Sort: sort the vertices of a DAG in order of the number of dependencies.
In Fact
These problems are NP-Hard.
There is no solution for any of them that is guaranteed to be solvable in a reasonable amount of time.
Restricted case solutions are possible but not in the general case.
There are only solutions that work quite well in some circumstances.
This, combined with the large number of domains, makes this field one that is rich in research possibilities.
Readings
Textbook Chapter on Graphs.
Reference book, Introduction to Algorithms. Part on Graph Algorithms contains a number of chapters on graph algorithms. Please read for further study. The lecture notes and textbook is sufficient for this unit.