COMP251: Graphs, Probability and Binary numbers
Jérôme Waldispühl School of Computer Science McGill University
Outline
• Graphs
o Terminology,definitionsandproperties
o Graphtraversal:Depth-FirstSearchandBreadth-
first search • Probability
• Binary numbers
Background
Graphs
Graph
• A graph is a pair (V, E), where
– V is a set of nodes, called vertices
– E is a collection of pairs of vertices, called edges
• Example:
– Avertexrepresentsanairportandstorestheairportcode – Anedgerepresentsaflightroutebetweentwoairports
PVD LGA
MIA
SFO LAX
ORD
HNL
DFW
• Directed edge
– ordered pair of vertices (u,v)
– first vertex u is the origin
– second vertex v is the destination
– e.g., a flight
• Undirected edge
– unordered pair of vertices (u,v)
– e.g., a street
• Directed graph: all edges are directed
• Weighted edge: has a real number
associated to it
– e.g. distance between cities
– e.g. bandwidth between internet routers
• Weighted graph: all edges have weights
ORD ORD
ORD
PVD PVD
849 PVD miles
Edge Types
Labeled graphs
• Labeledgraphs:verticeshaveidentifiers
SFO ORD
LAX
PVD
LGA MIA
= SFO
ORD
PVD
HNL
LAX DFW
HNL MIA
LGA
DFW
– Note: Geometric layout doesn’t matter – only connections matter
• Unlabeledgraph:verticeshavenoidentifiers
=
Electronic circuits
n Printed circuit board n Integrated circuit
Transportation networks n Highway network
n Flight network
Computer networks n Local area network n Internet
n Web
math.brown.edu
Applications
cslab1a cslab1b
cs.brown.edu
att.net
qwest.net
brown.edu
cox.net
John
Databases
n Entity-relationship diagram
Paul
David
• Endpoints of an edge
– U and V are the endpoints of a
• Edges incident on a vertex
– a,b,anddareincidentonV
• Adjacentvertices
– Connected by an edge – UandVareadjacent
• Degree of a vertex
– Number of incident edges
– X has degree 5
• Parallel edges
– h and i are parallel edges
• Self-loop
– j is a self-loop
a V b
U d X Z
c
Terminology
h j
W
i f
e
g Y
Terminology (cont.) • Path
– sequenceofadjacentvertices • Simplepath
– pathsuchthatallitsvertices are distinct
• Examples
– P1=(V,X,Z)isasimplepath
– P2=(U,W,X,Y,W,V)isapath that is not simple
• Graph is connected iff
– Forallpairofverticesuandv, there is a path between u and v
V d
c P2 W
bP X
a
U
1
e
h
Z
f
g Y
Terminology (cont.)
• Cycle
– paththatstartsandendsatthe same vertex
• Simple cycle
– cyclewhereeachvertexis
distinct
• Examples
– C1=(V,X,Y,W,U,¿)isasimple cycle
– C2=(U,W,X,Y,W,V,¿)isa cycle that is not simple
• A tree is a connected acyclic graph
aVb d
UC XhZ 2 e C
c 1
W
g f
Y
Property 1
SvÎV deg(v)=2|E| Why?
Property 2
In an undirected graph with no self-loops and no multiple edges
|E| £ |V| (|V| – 1)/2 Why?
Notation
|V| number of vertices |E| number of edges
deg(v) degree of vertex v
Example
n |V| = 4
n |E| = 6
n deg(v) = 3
Properties
Data structure for graphs – Adjacency lists
• Graph can be stored as
– Adictionaryofpairs(key,info)where
– key=vertexidentifier
– infocontainsalist(calledadj)ofadjacentvertices
• Example: if the dictionary is implemented as a linked-list vertices
SFO |
LAX |
ORD |
DFW |
LGA |
SFO LAX
ORD
LGA
DFW
ORD SFO
LAX
SFO LAX DFW ORD LAX ORD
DFW DWF LGA
Adjacency lists – Operations
• addVertex(key k): vertices.insert(k, emptyList)
• addEdge(key k, key l): vertices.find(k).adj.insert(l) vertices.find(l).adj.insert(k)
• areAdjacent(key k, key l):
return vertices.find(k).adj.find(l)
Data structure for graphs – Adjacency matrix
Define some order on the vertices, for example: DFW, LAX, LGA, ORD, SFO
Graph with n vertices is stored as n n x n array M of boolean, where
n M[i][j] = ORD
1 if there is an edge between i-th and j-th vertices 0 otherwise
SFO LAX
LGA DFW
DFW LAX LGA ORD SFO DFW 0 1 1 1 0 LAX 1 0 0 1 1 LGA 1 0 0 0 0 ORD 1 1 0 0 1 SFO 0 1 0 1 0
Adjacency matrix – Operations
• addEdge(i,j): matrix[i][j] = 1
• removeEdge(i,j): matrix[i][j] = 0
• Not great for inserting/removing vertices because it requires shifting elements of matrix.
• Requires space O(n2)
Lists vs Matrices
• Adjacencylistsarebetterif:
– You frequently need to add/remove vertices – The graph has few edges
– Need to traverse the graph
• Adjacencymatricesarebetterif
– you frequently need to
• add/remove edges, but NOT vertices
• Check for the presence/absence of an edge between i,j
– matrix is small enough to fit in memory
Graph traversal – Idea • Problem:
– you visit each node in a graph, but all you have to start with is:
• One vertex A
• A method getNeighbors(vertex v) that returns the
set of vertices adjacent to v
A
BDE C
Graph traversal – Motivations • Applications
– Exploration of graph not known in advance, or too big to be stored:
• Web crawling
• Exploration of a maze
– Graph may be computed as you go. Example: game strategy:
• Vertices = set of all configurations of a Rubik’s cube
• Edges connect pairs of configuration that are one rotation away.
Depth-First Search • Idea:GoDeep!
– Intuition: Adventurous web browsing: always click the first unvisited link available. Click “back” when you hit a deadend.
– Start at some vertex v
– Let w be the first neighbor of v that is not yet visited.
Move to w.
– If no such unvisited neighbor exists, move back to the vertex that lead to v
Example
A
A
unexplored vertex
visited vertex
unexplored edge
discovery edge
A BDE
C
AA BDEBDE
CC
Example (cont.)
AA BDEBDE
CC
AA BDEBDE
CC
DFS Algorithm
Algorithm DFS(G, v)
Input: graph G with no parallel edges and a start
vertex v of G
Output: Visits each vertex once (as long as G is
connected)
print v // or do some kind of processing on v v.setLabel(VISITED)
for all u Î v.getNeighbors()
if ( u.getLabel() != VISITED ) then DFS(G, u)
DFS and Maze Traversal
• The DFS algorithm is similar to a classic strategy for exploring a maze
– Wemarkeach intersection, corner and dead end (vertex) visited
– Wemarkeachcorridor (edge ) traversed
– Wekeeptrackofthe path back to the
entrance (start vertex) by means of a rope (recursion stack)
DFS and Rubik’s cube
• Rubik’s cube game can be represented as a graph:
– Vertices: Set of all possible configurations of the cube
– Edges: Connect configurations that are just one rotation away from each other
• Given a starting configuration S, find a path to the “perfect” configuration P
• Depth-first search could in principle be used:
– start at S and making rotations until P is reached, avoiding
• Problem: The graph is huge: 43,252,003,274,489,856,000 vertices
configurations already visited
Running time of DFS
• DFS(G, v) is called once for every vertex v (if G is connected)
• Whenvisitingnodev,thenumberofiterationsof the for loop is deg(v).
• Conclusion: The total number of iterations of all for loops is: Sv deg(v) = ?
• Thus, the total running time is O(|E|)
Applications of variants of DFS
• DFScanbeusedto:
– Determine if a graph is connected
– Determine if a graph contains cycles
– Solve games single-player games like Rubik’s cube
Breadth-First Search
Idea:
n Explore graph layers by layers
n Start at some vertex v
n Then explore all the neighbors of v
n Then explore all the unvisited neighbors of the neighbors of v
n Then explore all the unvisited neighbors of the neighbors of the neighbors of v
n until no more unvisited vertices remain
Example
L0 BCD
E F
L0 A
A
A
unexplored vertex
visited vertex
unexplored edge
discovery edge
L0 A
A
L1
L1 B C D L1 B C D EF EF
Example (cont.)
L0 A L0 A
L1 B C D L1 B C D
EF L2EF
L0 BCD
L0 BCD
A
A
L1
L2
EF
L1
L2
EF
Example (cont.)
L0 A L0 A
L1 B C D L1 B C D
L2 E F
L2 E F
L0 BCD
A
L1
L2
EF
Depth-First Search
30
Iterative BFS
Idea: use a queue to remember the set of vertices on the frontier
Algorithm iterativeBFS(G, v)
Input graph G with no parallel edges and a start vertex v of G Output Visits each vertex once (as long as G is connected)
q ¬ new Queue()
v.setLabel(VISITED)
q.enqueue(v)
while (! q.empty()) do
w ¬ s.deque()
print w // or do some kind of processing on w for all u Î w.getNeighbors() do
if ( u.getLabel() != VISITED ) then u.setLabel(VISITED) s.enqueue(u)
Running time and applications
Running time of BFS: Same as DFS, O(|E|)
BFS can be used to:
n Find a shortest path between two vertices
n Rubik’s cube’s fastest solution
n Determine if a graph is connected
n Determine if a graph contains cycles n Get out of an infinite maze…
Iterative DFS
• Use a stack to remember your path so far
Algorithm iterativeDFS(G, v)
Input graph G with no parallel edges and a start vertex v of G
Output Visits each vertex once (as long as G is connected) s ¬ new Stack()
v.setLabel(VISITED)
s.push(v)
while (! s.empty()) do w ¬ s.pop()
print w
for all u Î w.getNeighbors() do
if ( u.getLabel() != VISITED ) then u.setLabel(VISITED)
s.push(u)
Note: Code is identical to BFS, but with a stack instead of a queue!
Background
Expectation & Indicators
Expectation
• Averageormean
• TheexpectedvalueofadiscreterandomvariableXis
E[X] = åx x Pr{X=x}
• LinearityofExpectation
– E[X+Y] = E[X]+E[Y], for all X, Y
– E[aX+Y] = a E[X] + E[Y], for constant a and all X, Y
• FormutuallyindependentrandomvariablesX1,…,Xn – E[X1X2 …Xn]=E[X1]·E[X2]·…·E[Xn]
Expectation – Example
• LetXbetheRVdenotingthevalueobtainedwhenafair die is thrown. What will be the mean of X, when the die is thrown n times.
– Let X1, X2, …, Xn denote the values obtained during the n throws.
– The mean of the values is (X1+X2+…+Xn)/n.
– Since the probability of getting values 1 to 6 is (1/6) in average,
we can expect each of the 6 values to show up (1/6)n times.
– So, the numerator in the expression for mean can be written as
(1/6)n·1+(1/6)n·2+…+(1/6)n·6
– The mean, hence, reduces to (1/6)·1+(1/6)·2+…(1/6)·6,
which is what we get if we apply the definition of expectation.
Indicator Random Variables
• Asimpleyetpowerfultechniqueforcomputingthe expected value of a random variable.
• Convenientmethodforconvertingbetween probabilities and expectations.
• Helpfulinsituationsinwhichtheremaybe dependence.
• Takesonly2values,1and0.
• IndicatorRandomVariableforaneventAofa sample space is defined as:
I { A } = #”
!
1 if A occurs, #$ 0 if A does not occur.
Indicator Random Variable
Lemma 5.1
Given a sample space S and an event A in the sample space S, let XA= I{A}. Then E[XA] = Pr{A}.
Proof:
Let Ā = S – A (Complement of A) Then,
E[XA] = E[I{A}]
= 1·Pr{A} + 0·Pr{Ā} = Pr{A}
Indicator RV – Example Problem: Determine the expected number of
heads in n coin flips.
Method 1 (without indicator random variables)
Let X be the random variable for the number of heads in n flips.
Then, E[X] = åk=0..nk·Pr{X=k}
We can solve this with a lot of math.
Indicator RV – Example
Method 2 (with Indicator Random Variables)
• Definenindicatorrandomvariables,Xi,1£i£n.
• Let Xi be the indicator random variable for the event that the ith flip results in a Head.
⇒ Xi = I{the ith flip results in H}
• ThenX=X1 +X2 +…+Xn =åi=1..nXi.
• ByLemma5.1,E[Xi]=Pr{H}=1⁄2,1£i£n.
• ExpectednumberofheadsisE[X]=E[åi=1..nXi].
• Bylinearityofexpectation,E[åi=1..nXi]=åi=1..nE[Xi].
• E[X]=åi=1..nE[Xi]=åi=1..n1⁄2=n/2.
Background
Binary numbers
Decimal (base 10)
Digits = { 0, 1, 2, 3, 4, 5, 7, 8, 9 }
Example of numerals: 11, 923, 5548, etc.
11 = 1 ∗ 10! + 1 ∗ 10″
923 = 9 ∗ 10# + 2 ∗ 10! + 3 ∗ 10″ 5548 = 5 ∗ 10$5 ∗ 10# + 4 ∗ 10! + 8 ∗ 10″
𝑚 = – 𝑑 𝑖 ∗ 10% %&”
digit
Binary (base 2)
Bits = { 0, 1 }
Example of numerals: 11, 101, 1010, etc.
11 = 1 ∗ 2! + 1 ∗ 2″
101 = 1 ∗ 2# + 0 ∗ 2! + 1 ∗ 2″ 1010 = 1 ∗ 2$ + 0 ∗ 2# + 1 ∗ 2! + 0 ∗ 2″
𝑚 = – 𝑏 𝑖 ∗ 2% %&”
bit
Relationship
Decimal Binary 00 11
2 10
3 11 4 100 5 101 6 110 7 111 8 1000
Fixed size representation
Decimal
0 00000000
1 00000001
2 00000010
3 00000011
4 00000100
5 00000101
6 00000110
7 00000111
8 00001000
Binary
Fixed number of bits (typically 8, 16, 32, 64…).
8 bits is called “byte”.
Conversion • How to convert from binary to decimal?
11010 # = 1∗2′ +1∗2$ +0∗2# +1∗2! +0∗2″ = 16+8+0+2+0
= 26
Note: You need to know powers of 2…
• How to convert from decimal to binary?
241 !” =???
Operations on decimals
Use this property of any positive integer 𝑚: 𝑚= 𝑚⁄10 ∗10+𝑚%10
Ex: 238 = 23 * 10 + 8
• (integer) division by 10 = dropping rightmost digit Ex: 238/10 = 23
• Multiplication by 10 = shifting left by one digit Ex: 23*10 = 230
• Remainder of integer division by 10 = rightmost digit Ex: 238%10 = 8
Operations on binary Same property holds for binary:
Example:
𝑚=𝑚%2+ 𝑚⁄2 ∗2
𝑚 =1011# 𝑚⁄2 = 0101 # 𝑚⁄2∗2= 1010# 𝑚%2 = 0001#
Decimal⟶Binary (Algorithm)
Algorithm decimal2binary(m) Input: a decimal m
Output: a binary b
i⟵0
while m>0 do b[i] ⟵m%2 m ⟵m/2
i ⟵i+1
Decimal⟶Binary (Example)
i
0
1
2
3 471 531 611 701 8–
m/2 m%2 (b[i]) 241
120 1
60 0 30 0 15 0
Answer:
b[ ] = …011110001
Why is the algorithm working?
𝑚 = 𝑚⁄2 ∗ 2 + 𝑚 % 2
…𝑏3𝑏2𝑏1
…𝑏3𝑏2𝑏10! 𝑏0 !
…𝑏 3 𝑏 2 𝑏 1 𝑏[0] !
!
Decimal
0+1=1
1+1=2
1+2=3
Binary
0+1=1
1+1=10
1+10=11
Additions
Decimal
26 + 15 = 41
Binary
11010 + 01111 = ?????
Additions
Addition in binary
11010 +01111 = ? ? ? ?! 1
Addition in binary
11010 +01111 = ? ? ?” 0! 1
Addition in binary
11010 +01111 = ? ?” 0″ 0! 1
Addition in binary
11010 +01111
= ?” 1″ 0″ 0! 1
Addition in binary
11010 +001111 = 1 0″ 1″ 0″ 0! 1
Addition in binary
1 1 0 1 0 =26 + 0 0 1 1 1 1 =15 = 1 0 1 0 0 1 =41
2( 2′ 2$ 2# 2! 2″ =2(+2$+2″ =32+8+1=41 101001
Operation in binary
Recall grade-school algorithm for addition, subtraction, multiplication, and division.
There is nothing special about base 10.
These algorithms work for binary (base 2), and work for other bases too!
Representation size
$%&
𝑚=,𝑏𝑖 ∗2! !”#
What is the relationship between 𝑚 and N ?
(How many bits N do we need to represent a positive integer m?)
Lower bound
– 2% = 1 + 2 + 4 + ⋯ + 2)*! = 2) − 1 %&”
(This is a special case of ∑)*! 𝑥% = +!*! where 𝑥 = 2) %&” +*!
)*!
)*!
)*!
𝑚=-𝑏𝑖∗2% ≤ -1∗2%
%&” %&”
= 2)−1
< 2)
log# 𝑚 < 𝑁 (apply log on both sides)
Upper bound
We can assume that 𝑁 − 1 is the index 𝑖 of the leftmost bit b[i]
such that 𝑏[𝑖] = 1 (we ignore leftmost 0’s: 00001101). )*!
𝑚=-𝑏𝑖∗2% ≥ 2)*!
%&"
log#𝑚 ≥ 𝑁−1
log# 𝑚 + 1 ≥ 𝑁
How many bit do we need? log'𝑚<𝑁≤ log'𝑚 +1
Answer: The largest integer less than or equal to log)𝑚 +1.
We write it as 𝑁 = log) 𝑚 + 1 (a.k.a. ”floor” that means “round down”)
Examples
m (decimal) m (binary) 𝑁 = log# 𝑚 + 1 00- 111
2 10 2
3 11 2
4 100 3
5 101 3 6 110 3 7 111 4 81000 4 91001 4
To think about...
• Howarenegativeintegersrepresented?
• Howmanybitsareusedtorepresentint,short,longina computer?
• Howarenon-integers(fractionalnumbers)represented?
• Howarecharactersrepresented?