MAST30012 Discrete Mathematics
School of Mathematics and Statistics
The University of Melbourne
2021
This compilation has been made in accordance with the provisions of Part VB
of the copyright act for the teaching purposes of the University. For the use of
students of the University of Melbourne enrolled in the subject MAST30012
Discrete Mathematics.
Contents
1 Sets and Tuples 1
2 Probability by Enumeration 1
3 Multiplication Principle 2
4 Addition Principle 2
5 Complement Principle 3
6 Birthday Problem 4
7 Multisets 5
8 Words 6
9 Pigeonhole Principle 9
10 Maps and Object Arrangements 14
11 Permutations 15
12 Combinations 17
13 Interpreting Binomial Coe�cients 19
14 Lattice Path Problem 20
15 Combinations: Ordering, Replacement and Repetition 21
16 Combinatorial Identies 23
17 Binomial Theorem 26
18 Inclusion-Exclusion Principle 28
19 Bijection Principle 33
20 Bijection Principle and Identities 36
21 More on Multisets 38
22 Multinomial Coe�cients 39
23 Sampling with Repetition 41
24 Graph Theory 42
25 Pigeonhole Principles 47
26 Ramsey Theory 48
i
27 Ramsey Numbers 52
28 Parity 56
29 Sperner’s Lemma for an Interval 58
30 Intermediate Value Theorem 60
31 Sperner’s Lemma 61
32 Brouwer fixed point theorem 63
33 Lattice Paths 66
34 Ballot Paths 66
35 Lattice Paths 72
36 Generating Functions 76
37 Binomial Expansion 77
38 Fibonacci Numbers 80
39 Recursions and generating functions 83
40 Formal Power Series 95
41 Catalan Numbers 100
42 Proper Binary trees 101
43 More Catalania 106
44 Asymptotic Results 111
45 Asymptotics from Generating Functions 111
46 Functional Equations 113
47 Dyck Paths 117
48 Parenthesised Expressions 120
49 Penrose Tilings 122
50 Permutations 129
51 Two-line Arrays 130
52 Cycle Notation 131
53 Composition of Permutations 133
ii
54 Inverse permutations 134
55 Involutions 136
56 Stirling numbers 137
57 Symmetric group 139
58 Group theory 140
59 Parity of Permutations 142
60 Alternating Group 143
61 Elementary transpositions 144
62 Inversions 146
63 Algebraic Relations for si’s 148
64 Alternating group 149
65 The 15-puzzle 150
66 7-Puzzles 153
67 Designs 156
68 Incidence matrix 159
69 Kirkman’s schoolgirl problem 161
70 Steiner triple systems 162
71 Resolvable designs 164
72 Hadamard matrices 166
73 Error correcting codes 169
iii
1 Sets and Tuples
Sets and Tuples
Definition 1 (Sets and Tuples). • A set is a list of objects or elements.
• A tuple is an ordered list of objects or elements.
Notation 2 (Sets and Tuples). Sets are denoted by enclosing the elements
in braces. So {a, b, c, d} is the set with elements a, b, c, and d. Tuples are
denoted by enclosing the elements in parenthesis, e.g. (a, c, d).
Definition 3 (Cardinality). Let X be a finite set. The cardinality or size of
X, denoted by |X | is the number of elements in X.
Note: The sets {a, b, c} and {b, c, a} are the same because they contain
the same elements. However, the tuples (a, b, c) and (b, c, a) are distinct.
2 Probability by Enumeration
Probability by Enumeration
Axiom 4 (Probability by Enumeration). Let T be the set of possible outcomes
and F the set of favourable outcomes. Assuming each outcome is equally
likely the probability of F is:
Pr(F ) =
number of favourable outcomes
total number of outcomes
=
|F |
|T |
Note: This axiom provides the fundamental link between counting or
enumeration of finite sets and probability theory.
Example:
1. ‘Birthday’ problem: In a room with r people how likely is it that two
have the same birthday?
2. When two dice are thrown what is the probability that the total is 8?
Two Dice Problem
Example: Back to the two dice problem. Probability total is 8.
T = { (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6),
(2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6),
(4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6),
(6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6) }
There are six rows and six columns. So |T | = 6⇥ 6 = 36.
1
The favourable outcomes are F = {(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)}. Hence
Pr(Total is 8) =
|F |
|T |
=
5
36
.
Note: (2, 6) and (6, 2) etc. are distinct outcomes, i.e. we distinguish the
dice. We are counting tuples!
3 Multiplication Principle
Multiplication Principle
Axiom 5 (Multiplication Principle). Let A and B be finite sets and let
A⇥B = {(a, b) | a 2 A, b 2 B}.
Then
|A⇥B | = |A | · |B | .
Example: For the two dice problem: A = B = {1, 2, 3, 4, 5, 6} and
T = A⇥B. Then we have that |T | = |A | · |B | = 62 = 36.
Example: How many number plates are there with 3 letters then 3
digits?
With L = {a, b, c, . . . , y, z} and D = {0, 1, . . . , 9} we have
|L⇥ L⇥ L⇥D ⇥D ⇥D | = |L |3 · |D |3 = 263 · 103.
4 Addition Principle
Addition Principle
Axiom 6 (Addition Principle). Let A and B be finite disjoint sets, i.e. A\B =
;, then
|A [B | = |A |+ |B | .
More generally, if A1, A2, . . . , An are finite pairwise disjoint sets, then
|A1 [A2 [ · · · [An | = |A1 |+ |A2 |+ · · ·+ |An | .
Example: Identify 3 ‘natural’ disjoint sets from this class of students.
L = {Left-handed students inside this lecture theatre}
R = {Right-handed students inside this lecture theatre}
A = {Ambidextrous students inside this lecture theatre}
Then
L \R = L \A = R \A = ;
|L [R [A | = |L |+ |R |+ |A | = # of students in this lecture theatre.
2
Addition Principle – Example
Example: How many integers between 100 and 100,000 are there that
use di↵erent odd digits?
We solve this by breaking the problem into subsets according to the number
of digits in the integers. There are either three, four or five digits (the cases).
Addition Principle – Example
Case 1: Integers with three digits. There are five odd digits. We have to
choose three di↵erent odd digits. There are 5 choices for the first odd digit, 4
for the second, 3 for the third. So all in all 5 · 4 · 3 = 60 such integers.
Case 2: There are 5 · 4 · 3 · 2 = 120 integers with four distinct odd digits.
Case 3: Five distinct odd digits. There are 5! = 120 such integers.
The three subsets or cases are clearly disjoint.
By the Addition Principle there are 60 + 120 + 120 = 300 integers of the
required type.
5 Complement Principle
Complement Principle
Theorem 7 (Complement Principle). Let A and B be finite sets with B ⇢ A.
Then
|A\B | = |A |� |B | .
Proof: This is a simple consequence of the Addition Principle:
A = (A\B) [B and (A\B) \B = ;
) |A | = |A\B |+ |B | (Addition Principle)
) |A\B | = |A |� |B | .
A
B
Complement Principle
The complement principle is useful in counting sets with a restriction where
we can embed the sought after set in a larger easy to count set and we can
easily count the complementary set.
3
Complement Principle
Example: Count the integers from 0 to 999, 999 containing the digit ‘1’.
Solutions: View the integers as 6-tuples with entries from {0, 1, 2, . . . , 9}.
By the multiplication principle there are 106 integers in total.
Consider the set of integers Y not containing the digit ‘1’.
These are 6-tuples with entries from {0, 2, . . . , 9}, i.e.,
Y =
�
(x1, x2, . . . , x6) |xi 2 {0, 2, . . . , 9}
By the multiplication principle there are 96 such integers.
By the complement principle the number of integers containing the digit
‘1’ is
106 � 96.
6 Birthday Problem
Birthday Problem
Example: Assuming that there are 365 days per year, and birthdays are
uniformly spread among these days, what is the probability that in a group of
r randomly chosen people, at least two have the same birthday?
Solution: Record birthdays by an integer from 1 to 365 starting from Jan
1, e.g., March 1st = 31 + 28 + 1 = 60.
Form a tuple of birthdays, (b1, b2, . . . , br), where br is the birthday of person
r, and each bj 2 {1, 2, . . . , 365}.
Use the basic probability formula:
Pr(2 or more birthdays coinciding) =
|F |
|T |
Here
F = {(b1, b2, . . . , br) | bj 2 {1, . . . , 365}, two or more bj ’s are the same}
T = {(b1, b2, . . . , br) | bj 2 {1, . . . , 365}
Birthday Problem (continued)
Immediately, by the multiplication principle, |T | = 365r.
To evaluate |F |, we consider instead the complement of F :
C = {(b1, b2, . . . , br) | bj 2 {1, . . . , 365} with all of the bj ’s di↵erent}.
We have by the complement principle: |F | = |T |� |C | .
4
To compute |C |, we have:
365 ways to choose b1
364 ways to choose b2
…
365� r + 1 ways to choose br
Hence, by the multiplication principle,
|C | = 365 · 364 · . . . · (365� r + 1)
Birthday Problem (continued)
Hence,
|F | = 365r � 365 · 364 · . . . · (365� r + 1)
= 365r
✓
1� 1 ·
364
365
· . . . ·
(365� r + 1)
365
◆
) Pr( � 2 birthdays coinciding) = 1�
364
365
· . . . ·
(365� r + 1)
365
In particular:
r = 2, Pr(� 2 birthdays coinciding) = 1/365
r = 23, Pr(� 2 birthdays coinciding) ⇡ 0.5072972
r = 116, Pr(� 2 birthdays coinciding) ⇡ 0.999999998846
So in a group of just 23 people there is a better than even chance of two
people having the same birthday.
In a class the size of this (enrolment as of July 25) it is almost certain
there is at least one shared birthday.
7 Multisets
Multisets
Sometimes we want to repeat elements such as in B = {a, a, c, d, d}.
Problem is B is not a set. In a set all elements are distinct.
Solution: Multisets or “Sets with repetition”.
A more formal definition of a multiset is
Definition 8 (Multiset). Let A be a finite set and N0 = {0, 1, 2, . . .} then a
multiset M is a map
M : A! N0
where M(a) is the number of repetitions of the element a 2 A.
5
Example: If A = {a, b, c, d} then for B above
M(a) = 2, M(b) = 0, M(c) = 1, M(d) = 2.
Multisets
Notation 9 (Multiset). 1. Conventional: B = {a, a, c, d, d}. That is use
set notation with repeated elements (though strictly speaking not a set).
2. Exponents: B = {a2, b0, c1, d2}.
3. Map: B = (M(a),M(b),M(c),M(d)).
Note: As with sets the order of the elements in a multiset is irrelevant.
Example: All the following multisets are identical (based on A =
{a, b, c, d})
{a, a, b, d, d, d} {b, d, a, d, d, a} {d, a, b, a, d, d}
{a2, b1, c0, d3} {c0, b1, a2, d3} {d3, a2, b1, c0}
Multisets
Definition 10 (Cardinality of Multiset). The size or cardinality of a multiset
M = {an1
1
, a
n2
2
, · · · , anmm } is
|M | = n1 + n2 + · · ·+ nm.
Example: | {b, d, a, d, d, a} | = 6. The size of the “set with repetitions”.
Example: List all distinct 3-scoop icecreams with two flavours F =
{a, b}.
3-scoop possibilities: a, a, a a, a, b a, b, b b, b, b
Better to list as multisets: {a, a, a}, {a, a, b}, {a, b, b}, {b, b, b}.
Scoop configurations ! Multisets
Note: If you care where scoops go use tuples, e.g.,
(b, b, a) 6= (b, a, b) 6= (a, b, b).
8 Words
Words
Definition 11 (Words). A word in an alphabet A = {a1, a2, . . . , ak} is a tuple
w = (x1, x2, . . . , xn) 2 An.
Words are often written by concatenating the entries w = x1x2 · · ·xn.
6
The elements of A are called ‘letters’ or ‘characters’ or ‘symbols’ etc.
The set of all words in A is denoted A⇤ – called the Kleene closure of A.
Note that A⇤ includes the empty word.
Words
Example:
1. A = {a, b, c} Words: abc, bbc, b, ca, ccbbac, · · ·
2. A = {4,⇤} Words: 44,⇤,4⇤⇤,⇤4⇤⇤44⇤⇤⇤, · · ·
Note: Using concatenation is convenient. But be careful that there is no
ambiguity. If A = {0, 1, 01} is ‘01’ a single letter or a two letter word?
Definition 12 (Unambiguous Alphabets). If an alphabet does not have the
above sort of ambiguity then it is called an unambiguous alphabet.
Words
Definition 13 (Properties of Words). • Length of a word w = number
of letters in w, denoted |w |
• |w |ai = number of times ai occurs in w
• Two words w1 = x1x2 · · ·xn 2 A⇤ and w2 = y1y2 · · · yn 2 A⇤ are equal
i↵ |w1 | = |w2 | and xi = yi 8i
• Product of w1 and w2 is the concatenation of the two words:
w1 · w2 = w1w2 = x1x2 · · ·xny1y2 · · · ym
• A subword of w is any non-empty sequence of consecutive letters.
• An ordered alphabet is an alphabet whose letters are “totally or-
dered”.
Words
Example:
• | aca | = 3, | abba |b = 2, | abba |d = 0
• w = abbca; a, ca, bbc, abb are subwords
• A = {a, b, c, e} with a < b < c < e is an ordered alphabet • B = {4,⇤} with 4 < ⇤ is an ordered alphabet 7 Multisets and Words A multiset M taken from an unambiguous set A can be represented by a unique word in the ordered alphabet A in which the order of the letters of the word have the same ordering as A. Example: Let A = {a, b, c} a < b < c. {a, a, b} �! aab {b, a, c, b} �! abbc {a, b, b, a} �! aabb Ordering of the letters is required since the order of the elements of a multiset is not important. Problem Example: Four di↵erent coloured cars are parked in a row. The red and blue cars are always next to each other and the yellow car must be next to a red car or a blue car. How many ways can the cars be parked? Solution: Let C = {r, b, y, g} be the set of cars: r ! red, b! blue, y ! yellow, g ! green. Let P = set of parking configurations. Then P ⇢ C⇤ s.t. if w 2 P then 1. |w | = 4. 2. Each letter is used once. 3. rb or br is a subword. 4. yr or ry or yb or by is a subword. Method: Use Addition Principle on a partition of P . Problem (continued) Claim: Restrictions 3 and 4 ) w must have a subword: yrb, ybr, rby, or bry (i) All words in P formed by adding g to the front or back of these subwords. (ii) Easy to check (exercise) that this results in four disjoint subsets of size 2. Hence, |P | = 2 + 2 + 2 + 2 = 8. Note: Make sure that (i) the partition into subsets produces the whole set 8 (ii) the subsets in the partition are mutually disjoint is fulfilled for more complex problems is di�cult! It is very easy to miss cases! 9 Pigeonhole Principle Pigeonhole Principle (PHP) Theorem 14 (Pigeonhole Principle). Let n be an integer. If (n+ 1) pigeons occupy n pigeonholes, then at least one hole will house 2 or more pigeons. Proof Without Words: The Pigeonhole Principle RAN LIBESKIND-HADAS Harvey Mudd College Claremont, CA 91711 Proof Without Words: The Pigeonhole Principle RAN LIBESKIND-HADAS Harvey Mudd College Claremont, CA 91711 Letter to the Editor Dear Editor: The article by Dresden in the October 2001 issue presents a solution to the problem of the periodicity of the sequence of rightmost nonzero digits of n!. This problem has appeared in several places. I first learned about it frorn Crux Mathematicorum 19 (1993), 260-261 and 20 (1994) 45, where is was presented as an unused Olympiad problem. A discussion of the problem that focuses on the issue of a fast algorithm to get, say, the rightmost nonzero digit of the facto- rial of a googol appears as Problem 90 in my book Which Way Did the Bicycle Go? (with D. Velleman and J. Konhauser, MAA, 1996). And in Exercise 4.40 of Concrete Mathematics by Graham, Knuth, and Patashnik (Addison-Wesley, 1989) one finds a formula (see also Problem 90.2 of my book) for this digit in basep,whenpisprime. Stan Wagon Macalester College St. Paul, MN 55105 wagon @ macalester.edu Letter to the Editor Dear Editor: The article by Dresden in the October 2001 issue presents a solution to the problem of the periodicity of the sequence of rightmost nonzero digits of n!. This problem has appeared in several places. I first learned about it frorn Crux Mathematicorum 19 (1993), 260-261 and 20 (1994) 45, where is was presented as an unused Olympiad problem. A discussion of the problem that focuses on the issue of a fast algorithm to get, say, the rightmost nonzero digit of the facto- rial of a googol appears as Problem 90 in my book Which Way Did the Bicycle Go? (with D. Velleman and J. Konhauser, MAA, 1996). And in Exercise 4.40 of Concrete Mathematics by Graham, Knuth, and Patashnik (Addison-Wesley, 1989) one finds a formula (see also Problem 90.2 of my book) for this digit in basep,whenpisprime. Stan Wagon Macalester College St. Paul, MN 55105 wagon @ macalester.edu VOL. 75, NO. 1, FEBRUARY 2002 VOL. 75, NO. 1, FEBRUARY 2002 39 39 Seven pigeons in six boxes Seven pigeons in six boxes This content downloaded from 128.250.144.144 on Thu, 30 Jul 2015 01:10:48 UTC All use subject to JSTOR Terms and Conditions Pigeonhole Principle (PHP) Theorem 15 (Pigeonhole Principle). Let n be an integer. If (n+ 1) pigeons occupy n pigeonholes, then at least one hole will house 2 or more pigeons. Note: This is not an enumeration principle, but an existence theorem. It states that under certain conditions (i.e. (n+ 1) pigeons) a certain type of configuration must exist. As an exercise in mathematical logic we will state the theorem more for- mally. All possible pigeon hole occupancies correspond to the set of n-tuples: Pn = � (p1, p2, · · · , pn) 2 Nn0 �� nP i=1 pi = n+ 1 where pi = # pigeons in box i. Example: n = 2 : P2 = {(0, 3), (1, 2), (2, 1), (3, 0)} Pigeonhole Principle Then the PHP can be stated (very) formally as Theorem 16 (Pigeonhole Principle). 8p 2 Nn 0 p 2 Pn ! 9 pi 2 p s.t. pi � 2. 9 Here we recall that 8 means “for all”, 9 means “there exists”, s.t. is just short-hand for “such that” and N0 = N [ {0} = {0, 1, 2, 3, · · · }. Also pi 2 p means pi is an entry in the n-tuple p. Here we shall give two simple proofs of the PHP. 1. Prove the contrapositive statement 2. Proof using the bounded set theorem Pigeonhole Principle: Proof 1 Proof by contrapositive: • The logic statement p �! q is equivalent to the contrapositive: ¬q �! ¬p (not q �! not p). • The negation of: 9x 2 S s.t. p(x) is 8x 2 S ¬p(x) or ¬ (9x 2 S s.t. p(x)) () (8x 2 S ¬p(x)) Thus we first find the contrapositive. p 2 Pn �! 9 pi 2 p s.t. pi � 2 () ¬ (9 pi 2 p s.t. pi � 2) �! ¬ (p 2 Pn) () 8pi 2 p ¬(pi � 2) �! p /2 Pn () 8pi 2 p pi < 2 �! p /2 Pn Pigeonhole Principle: Proof 1 Proof by contrapositive continued: 8pi 2 p pi < 2 �! p /2 Pn 8pi 2 p pi 2 {0, 1} (Premis) ) nX i=1 pi nX i=1 1 = n (All the pi are 0 or 1) ) p /2 Pn (If p 2 Pn then nX i=1 pi = n+ 1) ) 8pi 2 p pi < 2 �! p /2 Pn This completes the proof. 10 Bounded Set Theorem Theorem 17 (Bounded Set). Given a bounded set of real numbers X then sup(X) � E(X). Note: sup is the supremum or least upper bound. For a finite set this is equal to the value of the maximum element. E denotes the expected value, which for a finite set is the average value. So if X is a finite set X = {a1, a2, . . . , an}, ai 2 R, then the theorem above says max {ai} � a1 + a2 + . . .+ an n . That is, in a finite set the element with the maximum value is at least as large as the average value of the elements in the set. Pigeonhole Principle: Proof 2 Proof by bounded set: Need to prove: p 2 Pn �! 9 pi 2 p s.t. pi � 2 p 2 Pn (Premis) ) max {pi} � p1 + p2 + . . .+ pn n (Bounded Set Theorem) = n+ 1 n (Since p 2 Pn) = 1 + 1 n ) max {pi} � 2 (n � 1, max {pi} an integer) ) 9 pi 2 p s.t. pi � 2 Application of Pigeonhole Principle Theorem 18. Given any sequence of n positive integers there exists a subse- quence of consecutive integers whose sum is divisible by n. Example: • Take n = 4 • Consider the sequence 1, 5, 9. • Here 4 does not divide 1, 5, 9, 1 + 5 = 6, 5 + 9 = 14, 1 + 5 + 9 = 15. 11 • The claim is that if we include one further positive integer, either the new integer is itself divisible by 4, or a sum of consecutive members from the sequence including the new integer is divisible by 4. • Say we add 5 to the sequence to get 1, 5, 9, 5. Here 1 + 5 + 9 + 5 = 20, which is divisible by 4. • Maybe check some further cases yourself. Application of Pigeonhole Principle Proof: Let the n positive integers be (a1, a2, . . . , an). Define bi = � iX p=1 ap � mod n. [From example n = 4, (1, 5, 9, 5), b1 = 1, b2 = 2, b3 = 3, b4 = 0] Case 1: For some i, bi = 0. Then the statement is true. Case 2 (remaining case): 1 bi n� 1. Application of Pigeonhole Principle Proof (continued): Apply the pigeonhole principle. The remainders 1, 2, . . . , n � 1 are the pigeonholes, and b1, b2, . . . , bn are the pigeons (bi goes to hole j if bi = j). We then have that at least two of the bi’s have the same value. Let these be bp and bq with p q, say. Thus bq � bp = 0 ) (ap+1 + ap+2 + . . .+ aq) mod n = 0. So the statement is again true. Generalisation of Pigeonhole Principle Theorem 19 (Generalised Pigeonhole Principle). Let n, k 2 N. If there are n identical objects and k boxes into which the objects are distributed then there exists a box containing at least l n k m objects. Notation 20 (Floor and Ceiling). If x 2 R then dxe, pronounced “ceiling of x”, is the smallest integer greater than or equal to x dxe = min {n 2 Z |n � x}. Similarly, bxc, pronounced “floor of x”, is bxc = max {n 2 Z |n x}. 12 Generalisation of Pigeonhole Principle Proof: We have k boxes and n objects so consider Pk(n) = � (p1, p2, · · · , pk) 2 Nk0 �� kP i=1 pi = n . Take any p 2 Pk(n). Then by the bounded set theorem max {pi} � P pi2p pi | p | = n k ) max {pi} � l n k m (max {pi} 2 N) ) 9 pi 2 p s.t. pi � l n k m . Application of Pigeonhole Principle Theorem 21 (Increasing or Decreasing Subsequences). Every sequence of n 2 + 1 distinct real numbers contains a subsequence of length n + 1 that is either strictly increasing or strictly decreasing. Proof: Let x1, x2, . . . , xn2+1 be a sequence of n 2+1 distinct real numbers. To each xk associate an ordered pair (ik, dk) : • ik: length of longest increasing subsequence starting at xk. • dk: length of longest decreasing subsequence starting at xk. Application of Pigeonhole Principle Proof (continued): Supposition: No increasing or decreasing subsequences of length n+ 1. Now 1 ik, dk n so there are n2 possible ordered pairs (ik, dk). By the Pigeonhole Principle two of the pairs must be equal. ) 9 xs and xt, with s < t s.t. is = it and ds = dt. The terms of the sequence are distinct so either xs < xt or xs > xt.
Application of Pigeonhole Principle
Proof (continued):
If xs < xt we can form an increasing subsequence of length is + 1. Take xs followed by the increasing subsequence starting at xt of length it = is. This contradicts is = it (and is longest increasing subsequence). 13 A similar argument leads to a contradiction of ds = dt when xs > xt.
Hence our original supposition, “No increasing or decreasing subsequences
of length n+ 1”, must be false.
This completes the proof by contradiction.
10 Maps and Object Arrangements
Maps and Object Arrangements
Let f : X ! A be a map or function (both X and A are finite sets).
We can interpret f combinatorially in two ways:
I Arranging objects in boxes:
– X = set of objects to be sorted into
– A = set of boxes
Each way of sorting the objects is a map f : X ! A, e.g.,
X = {1, 2, 3, 4}, A = {a, b, c, d}, f(1) = a, f(2) = d, f(3) =
b, f(4) = a
Box sorting:
�� 1, 4
�� �� 3
�� �� �� �� 2
��
a b c d
Maps and Object Arrangements
II Tuples or words:
– X = set of positions of letters (entries in tuple)
– A = alphabet (assumed unambiguous)
X = {1, 2, 3, 4}, A = {a, b, c, d}, f(1) = a, f(2) = a, f(3) =
c, f(4) = d
w = aacd Note: |w | = |X | .
Duality of Object Arrangements and Words
Each objects-into-boxes arrangement corresponds to a unique word:
Objects ! Positions
Boxes ! Letters
This duality is often used to recast an enumeration into an easier form.
14
Duality of Object Arrangements and Words
1. Duality is a bijection (one-to-one onto map) between the set of n-object
arrangements into boxes and words of length n defined by interchanging
objects & positions and boxes & letters.
2. We can put constraints on the map f which in turn become constraints
on box occupancy and by duality on word forms.
(a) f injective :
⇢
! at most one object per box
! no repeated letters
(b) f surjective :
⇢
! no empty boxes
! every letter used at least once
Example
Example: X = {1, 2, 3} A = {a, b}, f surjective
Arrangements: f : X ! | | |
a b
with no empty boxes.
) |1 2 | 3 | |1 3 | 2 | |2 3 | 1 |
| 1 |2 3 | | 2 |1 3 | | 3 |1 2 |
Words: Each letter {a, b} used at least once:
aab, aba, baa, abb, bab, bba
Duality map:
|1 3 | 2 |
dual
�! a b a
1 2 3
| 3 |1 2 |
dual
� b b a
1 2 3
11 Permutations
Permutations
• Di↵erent orderings of distinct objects are referred to as permutations.
• Let the distinct objects be denoted a1, a2, · · · , an. From these objects
form the set
X = {a1, a2, · · · , an}. (Note: |X| = n)
• The task is to enumerate (count) all tuples (x1, x2, · · · , xn) with each
xj 2 X and furthermore all the xj ’s distinct.
More generally we have
Definition 22 (Permutation). A k-permutation of a finite set X is a k-tuple
(x1, x2, · · · , xk) xi 2 X, xi 6= xj 8i 6= j
If k = |X | the tuple is called a permutation of X.
15
Permutations
Note:
1. xi 6= xj means all entries are distinct and requires k |X |.
2. If X is unambiguous k-permutations are often written as words.
3. A k-permutation is an injective map:
p : positions in the k-tuple �! X.
4. A permutation is a bijective map.
Example: X = {1, 2, 3, 4}
• 2-permutations: 12, 21, 13, 31, 14, . . .
• 3-permutations: 123, 124, 213, 214, 132, . . .
• 4-permutations: 1234, 2134, 2314, 2341, . . .
Permutations
Theorem 23 (Number of k-permutations). The number of k-permutations of
a finite set X with |X | = n � k is
(n)k := n(n� 1)(n� 2) · · · (n� k + 1)| {z }
k factors
.
Note: (n)k is called a falling factorial.
Corollary 24 (Number of Permutations). The number of permutations of X
is (n)n = n!, where n = |X |.
Example: All permutations for n = 3:
(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)
| {z }
6 permutations in total=3⇥2⇥1
Permutations
Proof: Let Pk be the set of all k-permutations of X.
Let Xn = X
Xn�1 = Xn\{x1}, x1 2 Xn
Xn�2 = Xn�1\{x2}, x2 2 Xn�1
…
Xn�k+1 = Xn�k+2\{xk�1}, xk�1 2 Xn�k+2
Then
16
Pk = Xn ⇥Xn�1 ⇥Xn�2 ⇥ · · ·⇥Xn�k+1
) |Pk | = |Xn | · |Xn�1 | · |Xn�2 | · · · |Xn�k+1 |
= n(n� 1)(n� 2) · · · (n� k + 1) = (n)k
Problem
Example: There are seven distinct red cars and 3 distinct blue cars.
How many ways can they be parked (in a line) so that:
1. The blue cars form a block (no red cars between them)?
2. The cars at the ends are red and no blue cars are adjacent?
Solution 1: Treat the blue cars as a single object:
X = {r1, r2, . . . , r7, B}, B = {b1, b2, b3}
Parking configurations of X are permutations of X so (7 + 1)! configura-
tions.
Must also account for parking configurations within B (= 3!).
So total number of parking configurations = 8! · 3!
Problem continued
Solution 2: To solve this problem consider permutations of positions of
cars.
The two constraints imply configurations are of the type:
R1
1
R2
2
R3
3
R4
4
R5
5
R6
6
R7
Where: Ri ! red car,
j
! possible blue car position.
• Permutations of the seven positions {R1, . . . , R7} give the possible con-
figurations of red cars.
• 3-permutations of the six positions {
1
, . . . ,
6
} give the possible po-
sitions of the blue cars.
• Total number of parking configurations = 7! · (6)3 = 7! · 6 · 5 · 4.
12 Combinations
Combinations
17
Definition 25 (Combinations). A selection of r distinct elements from a set
of size n without regard to order is called a combination of size r and the
number of possible selections or combinations is denoted Cnr .
Note: The order of elements is irrelevant so combinations are represented
by sets (in fact a combination is just a subset).
There are many ways of thinking about combinations such as
• Given n elements and two boxes, count the number of ways of putting
r elements in box 1, and n� r elements in box 2 when the order of the
elements within the boxes doesn’t matter.
• Number of ways of choosing a subset of size r from a set of size n.
• Number of ways to colour n elements so that r elements are red and
n� r elements are blue.
Number of Combinations
Theorem 26 (Number of Combinations). The number of combinations of r
elements from a set of size n is
C
n
r =
(n)r
r!
=
n!
r! (n� r)!
=
✓
n
r
◆
;
that is, they are counted by the binomial coe�cients
�
n
r
�
.
Proof: There are (n)r ways of forming r-permutations of the set. We can
form any r-permutation by first choosing a combination of size r and then
ordering the r elements chosen. There are r! ways of ordering the chosen
elements. This shows that
(n)r = C
n
r · r!
and the result follows.
Note: This explains why the binomial coe�cient
�
n
r
�
is often pronounced
or referred to as ‘n choose r’.
Basic Properties of Combinations
Corollary 27 (Cnr = C
n
n�r). Let n and r be nonnegative integers with r n.
Then Cnr = C
n
n�r
Proof: From previous theorem it follows that
C
n
n�r =
n!
(n� r)! (n� (n� r))!
=
n!
(n� r)! r!
= Cnr .
Note: Combinatorially this makes perfect sense. In forming a combi-
nation choosing r elements that goes into the combination is equivalent to
choosing (n� r) elements that does not go into the combination.
18
Corollary 28 (Cn
0
= Cnn = 1). With the convention (or definition) that 0! = 1
we have
�
n
0
�
= 1 and
�
n
n
�
= 1.
Note: There is 1 way of selecting 0 or n elements from n elements. Even
for n = 0 the result makes sense: There is 1 way of choosing no element from
zero elements (or the empty set is the only subset of the empty set).
13 Interpreting Binomial Coe�cients
Example
Example: List subsets of size 2 from {a, b, c, d}, and so determine
�
4
2
�
.
• The subsets are:
{a, b}, {a, c}, {a, d}
{b, c}, {b, d}
{c, d}
• Hence:
�
4
2
�
= 6.
Note: We can put the elements from these subsets in the first box and
the remaining elements from the set in the second box. This gives an inter-
pretation or the problem in terms of boxes.
Combinations
Example: Calculate
�
n
3
�
using the boxes interpretation.
• Procedure:
– Choose any of the n elements to place in box 1 (n choices).
– Choose any of the remaining n� 1 elements to also place in box 1.
– Choose any of the remaining n� 2 elements to also place in box 1.
– The remaining n� 3 elements are placed in box 2.
• Order is not important. So suppose we have chosen the elements a, d, g.
– Then {a, d, g} = {a, g, d} = {d, a, g} = {d, g, a} = {g, a, d} =
{g, d, a}.
– There are 3! ways of arranging 3 objects on a line (permutations).
Must divide out by this.
• Hence ✓
n
3
◆
=
n(n� 1)(n� 2)
3!
.
19
Interpreting
�
n
r
�
– Example
Example: Number of ways to colour n elements so that r elements are
coloured and n� r elements are not.
• To see the equivalence of this viewpoint, list the elements in order.
• e.g. n = 6, r = 2
x1 x2 x3 x4 x5 x6
⇤ ⇤ ⇤ ⇤ ⇤ ⇤
• Form a subset from the coloured elements e.g.
• {x3, x6}
x1 x2 x3 x4 x5 x6
⇤ ⇤ ⌅ ⇤ ⇤ ⌅
Interpreting
�
n
r
�
• Colouring: can think of placing n blocks in a row,
�
n
r
�
is the number of
ways of colouring r blocks red, n� r blocks blue.
• Or, nr red blocks, nb blue blocks. Number of arrangements is
✓
nr + nb
nr
◆
=
(nr + nb)!
nr!nb!
• (nr + nb)! ways of arranging (nr + nb) blocks, dividing out the number
of ways of arranging the red blocks (nr!) and the blue blocks (nb!).
• Note that this is unchanged by nr $ nb.
14 Lattice Path Problem
Lattice Path Problem
Example: Consider an integer grid. The number of lattice paths which
go from (0, 0) to (n,m) with each step either to the North or East is equal to�
n+m
n
�
.
20
(0, 0)
e e e e
(3, 0)
e e e e
e e e e
e e e e
e e e e
(0, 5) e e e e (3, 5)
6
6
–
6
6
–
6
–
Lattice Path Problem
• Count the number of ways to go from (0, 0) to (n,m), n,m > 0, according
to the rule that paths consist of North and East steps only.
• There is an obvious one-to-one correspondence (or bijection) between
the lattice path problem and words in the alphabet {N,E}, e.g.,
6
6
—
() NNEE
-6
-6 () ENEN
There are n east and m north steps so each path corresponds to a word
of length n+m with n letters E and m letters N .
• Also equivalent to arranging n red blocks| {z }
east steps
and m blue blocks| {z }
north steps
in a line.
• Hence the number of paths is
�
n+m
n
�
or
�
n+m
m
�
.
• As these paths are counted by binomial coe�cients they are called bi-
nomial paths.
15 Combinations: Ordering, Replacement and Rep-
etition
Combinations: Ordering, Replacement and Repetition
Choosing r elements from a set of size n can be viewed as a sampling
problem. In general sampling problems can be classified as follows.
1. Order: Is the sample (the chosen elements) ordered or not?
2. Replacement: Can an element be chosen more than once?
21
3. Repetition: Are the elements of the set distinct or not?
Thus far in forming combinations we have mostly considered cases with
n distinct elements (no repetition) and the chosen elements were all di↵erent
(no replacement).
The following example will demonstrate how to count with replacement.
Sample with Replacement
Example: Alice has a purse containing small change, i.e. 5, 10, 20, and
50 cent pieces. If the purse contains 12 coins how many possible combinations
of coins are there?
Solution: A sample of size r = 12 is to be chosen from the set {5, 10, 20, 50}
so replacement is necessary and the sample is unordered.
Place the coins in order of their denomination. A typical sample may look
like:
We need to count all such configurations. A nice trick does this.
Place a separator between di↵erent denominations:
Sample with Replacement
Configurations can now be represented in a two-letter alphabet {C,L}:
CCCLCCCCCLCCLCC
Note: There is no need to specify the denomination of the coins. This is
determined by the placement of the separators, i.e., every coin to the left of
the first separator is a five cent piece etc.
The counting problem is now easy. There are 12 + 3 positions 3 of which
are separators and 12 of which are coins. So the answer is:
# configurations =
✓
15
12
◆
=
✓
15
3
◆
= 455.
Combinations: Ordering and Replacement
Theorem 29 (Unordered Sample with Replacement). The number of un-
ordered samples of size r with replacement chosen from a set of size n is
C
n+r�1
r =
✓
n+ r � 1
r
◆
=
✓
n+ r � 1
n� 1
◆
= Cn+r�1n�1
22
Proof: This is straightforward generalisation of the previous example.
We summarise our results on combinations in the following table:
Set Size Sample Size Replacement Ordered Number
n r No Yes (n)r
n r No No Cnr
n r Yes Yes nr
n r Yes No Cn+r�1r
At the Icecream Parlour
Example: An icecream parlour sells 3 scoop icecreams from 20 flavours.
1. How many combinations are possible if all scoops are di↵erent?
2. What if scoops with the same flavour are allowed?
Solution 1:
✓
20
3
◆
=
20⇥ 19⇥ 18
3!
= 1140
Solution 2: This a problem of selection with replacement.
Here n = 20, r = 3, giving
✓
22
3
◆
=
22⇥ 21⇥ 20
3!
= 1540
At the Icecream Parlour (continued)
• Problem 2 can also be solved using “natural” disjoint sets:
S1 = {all scoops are di↵erent}
S2 = {two scoops are the same, the other di↵erent}
S3 = {three scoops all the same}
• So
# configurations = |S1|+ |S2|+ |S3| by the addition principle
=
✓
20
3
◆
+ 20⇥ 19 + 20
=
✓
20
3
◆
+ 2
✓
20
2
◆
+
✓
20
1
◆
= 1140 + 380 + 20 = 1540 =
✓
22
3
◆
.
16 Combinatorial Identies
Combinatorial Identies
23
Theorem 30 (Sum of Binomial Coe�cients).✓
n
0
◆
+
✓
n
1
◆
+
✓
n
2
◆
+ · · ·+
✓
n
n
◆
=
nX
r=0
✓
n
r
◆
= 2n.
Proof: Count total number of subsets (from set of size n) in two ways.
LHS: Decomposition according to the subset size:
{subsets with 0 elements} [ {subsets with 1 element} [ · · ·
· · · [ {subsets with n elements} = {all subsets}
Since the subsets are disjoint the Addition Principle gives the LHS.
Next we show that from n elements there are a total of 2n subsets:
Think of the n elements as a tuple of size n. Record a 1 if the element is
in the subset and a zero otherwise, e.g., for n = 3:
(0, 0, 0)$ ; (0, 1, 0)$ {x2} (1, 1, 0)$ {x1, x2}
The problem is reduced to counting tuples of size n with elements from
{0, 1}. ) Total number = RHS = 2n, by the multiplication principle.
Combinatorial Identies
Theorem 31 (Recurrence for binomial coe�cients).✓
n
r
◆
=
✓
n� 1
r � 1
◆
+
✓
n� 1
r
◆
.
Proof: The LHS is just the total number of subsets of size r from a set of
size n, say, {1, 2, · · · , n}.
Now partition these subsets into two disjoint sets: Those which include
the element n and those which don’t.
The first set can be identified with forming subsets of size r � 1 (element
n is already included) from a set of size n� 1 (the remaining elements).
The second set entails forming subsets of size r from a set of size n� 1.
Then by the Addition Principle we get the RHS.
Note: Pascal’s triangle follows directly from this recursion relation.
Back to the Icecream Parlour
In the icecream parlour example we used two di↵erent methods to derive
the number of ways to choose three scoops of icecream when scoops with the
same flavour are allowed.
This way we found:
�
22
3
�
=
�
20
3
�
+ 2
�
20
2
�
+
�
20
1
�
.
This is a special case of Vandermonde’s identity.
Theorem 32 (Vandermonde’s Identity). Let m,n and r be nonnegative inte-
gers with r not exceeding m or n. Then
✓
m+ n
r
◆
=
rX
k=0
✓
m
r � k
◆✓
n
k
◆
.
24
Note: The above special result follows by setting n = 20, r = 3,m =
r � 1 = 2. Now this violates the restriction r m. But we can use
�
2
3
�
= 0
and its OK.
In the general case, m = r� 1, the identity still holds if we use
�
r�1
r
�
= 0.
Vandermonde’s Identity – Proof
Proof: We prove Vandermonde’s identity by showing that the LHS and
RHS count the same objects in two di↵erent ways:
Suppose we have two disjoint sets A and B with m and n elements.
We can pick r elements from the union of the two sets in Cm+nr = LHS
ways.
Next pick elements from the union but keep track of where they came from.
We can pick r � k elements from A and k elements from B. By the
Multiplication Principle we know this can be done in Cmr�kC
n
k ways.
Clearly k can range from 0 to r and the resulting sets that we pick are
disjoint so by the Addition Principle we get the RHS.
This proves Vandermonde’s identity.
Vandermonde and Icecream
Is there a general r-scoop problem related to Vandermonde’s identity?
Answer: Yes, but we need to engage in a bit of reverse engineering.
Look at the identity in the case m = r � 1:
✓
n+ r � 1
r
◆
=
rX
k=0
✓
r � 1
r � k
◆✓
n
k
◆
.
Now we have n = # of flavours and r = # of scoops.
From previous work we know that LHS = number of ways to choose r
scoops with replacement from n flavours.
What about the RHS?
It would seem we are breaking the problem into ‘partitions’ according to
k. So could k be the number of distinct flavours in our selection of scoops?
Vandermonde and Icecream
This makes sense since
�
n
k
�
is “n choose k”. But what about the factor⇣
r�1
r�k
⌘
?
A concrete example should clarify things. Say n = 20, r = 5:
25
k Factor Flavours Remaining Scoops Comment
0
�
4
5
� �
20
0
�
None None Can’t choose no flavours
1
�
4
4
� �
20
1
�
a aaaa All scoops the same
2
�
4
3
� �
20
2
�
ab aaa, aab, abb, bbb Choose 2 flavours then
3 scoops from 2 flavours
3
�
4
2
� �
20
3
�
abc aa, ab, ac, bb, bc, cc Choose 3 flavours then
2 scoops from 3 flavours
4
�
4
1
� �
20
4
�
abcd a, b, c, d Choose 4 flavours then
1 scoop from 4 flavours
5
�
4
0
� �
20
5
�
abcde None All scoops distinct
Vandermonde and Icecream
It should now be clear what is going on.
The term on the RHS comes about as:
We partition the problem on number of chosen flavours k.
First we choose k distinct flavours in
�
n
k
�
ways.
Then we choose remaining r�k scoops from k flavours with replacement.
This can be done in
⇣
k+(r�k)�1
r�k
⌘
=
⇣
r�1
r�k
⌘
ways.
So we can choose r scoops, using k out of n flavours in
⇣
r�1
r�k
⌘ �
n
k
�
ways.
Then sum over 0 k r to get the RHS (scoop partitions are clearly
disjoint).
17 Binomial Theorem
Binomial Theorem
Theorem 33 (Binomial Theorem). Let x and y be variables and n 2 N. Then
(x+ y)n =
nX
k=0
✓
n
k
◆
x
k
y
n�k
.
Proof: We shall give a combinatorial proof.
Now (x+ y)n = (x+ y)(x+ y) · · · (x+ y)
| {z }
n factors in total
.
Think of each x as a white block and each y as a black block. Expanding
out the factors then gives terms which correspond to arranging a total of n
blocks (coloured white or black) in a line.
The term xkyn�k corresponds to having k white blocks and n � k black
blocks, showing that its coe�cient must be
�
n
k
�
.
26
Binomial expansion
Example: Illustration in the case n = 3:
(x+ y)(x+ y)(x+ y)
= (⇤+⌅)(⇤+⌅)(⇤+⌅)
= ⇤⇤⇤+ (⌅⇤⇤+⇤⌅⇤+⇤⇤⌅) + (⇤⌅⌅+⌅⇤⌅+⌅⌅⇤) +⌅⌅⌅
= All white + 1 black, 2 white + 2 black, 1 white + All black
= x3 + 3x2y + 3xy2 + y3
=
✓
3
0
◆
x
3 +
✓
3
1
◆
x
2
y +
✓
3
2
◆
xy
2 +
✓
3
3
◆
y
3
Binomial expansion
Setting y = 1, we obtain
(1 + x)n =
nX
k=0
✓
n
k
◆
x
k
.
What do we find if we set x = �1? How can we interpret the identity?
LHS is 0. RHS is a sum of positive (k even) and negative (k odd) terms.
Theorem 34 (Even and Odd Subsets). For any set the number of subsets
with an even number of elements equals the number of subsets with an odd
number of elements.
Example: X = {1, 2, 3, 4}
• Even subsets: ;, {1, 2}, {1, 3}, {1, 4}, {2, 3}, {2, 4}, {3, 4}, {1, 2, 3, 4} – 8
subsets.
• Odd subsets: {1}, {2}, {3}, {4}, {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4} – 8
subsets.
Yet Another Identity
Theorem 35. ✓
n
k
◆✓
k
m
◆
=
✓
n
m
◆✓
n�m
k �m
◆
Proof: From a class with n students choose a committee of size k with a
subcommittee of size m.
LHS: First choose a committee with k members in
�
n
k
�
ways then from the
k members choose a subcommittee with m members in
⇣
k
m
⌘
ways.
27
RHS: First choose the m members of the subcommittee from the whole
class then choose the remaining k �m members from the remaining class.
A little more Number Theory
This last identity has an interesting application to Number Theory.
Theorem 36 (Erdös and Szekeres, 1978). For integers 0 < m k < n, � n m � and � n k � have a nontrivial common factor. That is, gcd �� n m � , � n k �� > 1.
Proof: Suppose, to the contrary that
�
n
m
�
and
�
n
k
�
are relatively prime.
By the previous identity
�
n
m
�
divides
�
n
k
� ⇣
k
m
⌘
.
By supposition
�
n
m
�
and
�
n
k
�
have no common factors so
�
n
m
�
must divide⇣
k
m
⌘
.
This is impossible since clearly
�
n
m
�
>
⇣
k
m
⌘
.
18 Inclusion-Exclusion Principle
Inclusion-Exclusion Principle – Warm Up
The Addition Principle tells us how to find the size of a union of disjoint
sets.
But what if the sets A and B aren’t disjoint? What is |A [B |?
|A [B | = |A |+ |B |� |A \B |
On the RHS, |A\B| is subtracted since elements occurring in both A and
B are counted twice in |A|+ |B|.
Somewhat more formally we can argue as follows: Let C = A \B.
Now A\C, B\C and C are disjoint sets and
A [B = (A\C) [ (B\C) [ C
Hence by the Addition Principle and the Complement Principle
|A [B | = |A\C |+ |B\C |+ |C | = (|A |� |C |) + (|B |� |C |) + |C |
= |A |+ |B |� |A \B | .
Inclusion-Exclusion Principle – Graphically
Graphically we have the following illustration:
28
A B
A B
C
And we get for the case with three sets:
|A [B [ C | = |A |+|B |+|C |�|A \B |�|A \ C |�|B \ C |+|A \B \ C | .
Licence Plate Problem
Example: A license plate contains 3 letters followed by 3 digits. How
many di↵erent plates can be produced if the plates cannot contain both the
letter ‘O’ and the digit ‘0’.
Solution: We have L = {A,B, . . . , Z} and D = {0, 1, . . . , 9} and a valid
licence plate is a word w = l1l2l3d1d2d3 with li 2 L and dj 2 D s.t. the letter
‘O’ and digit ‘0 don’t both occur in w. Let N denote the set of valid license
plates.
We previously looked at the unrestricted case T and found |T | = 263 ·103.
Let S1 = set of plates with no ‘O’ and S2 = set of plates with no ‘0
0.
Let LO = L\{O} and D0 = D\{0}. Then S1 = L3O⇥D
3 and S2 = L
3⇥D3
0
.
Hence
|S1 | = |LO |3 · |D |3 = 253 · 103
|S2 | = |L |3 · |D0 |3 = 263 · 93
Both S1 and S2 satisfy the “no O-0” constraint ) N = S1 [ S2.
Licence Plate Problem (continued)
29
) |N | = |S1 |+ |S2 | (Addition Principle)
= 253 · 103 + 263 · 93
= 263 · 103 ·
✓
25
26
◆
3
+
✓
9
10
◆
3
!
| {z }
=1.6…
) |N | > 263 · 103
But |T | = 263 · 103 — Ooops!!
What has gone wrong?
Actually: S1 \ S2 6= ; ) can’t use Addition Principle!
e.g. AAA222 2 S1 (no ‘O’) and AAA222 2 S2 (no ‘0’.)
Licence Plate Problem (resolved)
We are over-counting words in
S1 \ S2 = L3O ⇥D
3
0
) |S1 \ S2 | = 253 · 93.
By the Inclusion-Exclusion Principle we then get:
|N | = |S1 |+ |S2 |� |S1 \ S2 |
= 253 · 103 + 263 · 93 � 253 · 93
= 263 · 103 ·
✓
25
26
◆
3
+
✓
9
10
◆
3
�
✓
25
26
◆
3
✓
9
10
◆
3
!
| {z }
=0.96…
Inclusion-Exclusion Principle
Following on from the concrete examples we can state the general theorem.
30
Theorem 37 (Inclusion-Exclusion Principle).
|A1 [A2 [ · · · [An | = |A1 |+ |A2 |+ · · ·+ |An |
� |A1 \A2 |� |A1 \A3 |� · · · |An�1 \An |
+ |A1 \A2 \A3 |+ · · ·+ |An�2 \An�1 \An |
…
+ (�1)n�1 |A1 \A2 \ · · · \An |
=
nX
k=1
(�1)k�1
X
{s1,…,sk}⇢{1,2,…,n}
|As1 \As2 \ · · · \Ask | .
Inclusion-Exclusion Principle – Interpreted in Words
Inclusion-Exclusion is intuitively simpler than the theorem looks.
So let us decipher it line-by-line:
On the LHS we ask for the number of elements in a union of n sets, which
are not necessarily disjoint.
On the RHS we say this is the sum of the number of elements in each set.
Were the sets disjoint that would be it by the Addition Principle.
However, some elements occur in more than one set so we remove the count
of these.
But we may have removed too much, namely elements occurring in more
than two sets. So we add these back in and so on.
That is first we “include” all elements from all sets, then we “exclude”
all the elements occurring in two or more sets, then we “include” elements
occurring in three or more sets, and so on.
Example
Example: In a European languages class there are 25 students: 14
speak Spanish, 12 French, 6 French and Spanish, 5 German and Spanish, 2
Spanish, French and German.
The 6 that speak German all speak either French or Spanish as well.
How many students speak none of these languages?
Solution: The given information can be summarized as:
|S| = Number of Spanish speakers = 14
|F | = Number of French speakers = 12
|G| = Number of German speakers = 6
|F \ S| = 6
|G \ S| = 5
|F \ S \G| = 2
Missing information is: |F \G|.
31
Example – Continued
We are given that G ⇢ (F [ S).
We also know that |G \ S| = 5, |F \ S \G| = 2 and |G| = 6.
Can therefore conclude that |F \G| = 3.
Applying the Inclusion-Exclusion principle, we have
|S [ F [G| = 14 + 12 + 6
� (6 + 5 + 3)
+ 2
= 20.
Let |T | denote the total number of students.
Given |T | = 25, we therefore have |T |� |S [ F [G| = 25� 20 = 5.
An Exercise
Example: Calculate the number of positive integers less than or equal
to 1,000 divisible by 7, 10 or 15.
Solution: For 1 k 1000 , let Ak denote the set of integers divisible
by k.
Ak =
�
k, 2k, 3k, . . . ,
�
1000
k
⌫
k
) |Ak| =
�
1000
k
⌫
Also,
Ak \Al = An
where n = the lowest common multiple of k and l, e.g. A10 \A15 = A30
Final answer = 256.
Inclusion-Exclusion Principle – Towards a Proof
For 4 sets the Inclusion-Exclusion formula read:
|A [B [ C [D|
= |A|+ |B|+ |C|+ |D|
�(|A \B|+ |A \ C|+ |A \D|+ |B \ C|+ |B \D|+ |C \D|)
+(|A \B \ C|+ |A \B \D|+ |A \ C \D|+ |B \ C \D|)
�|A \B \ C \D|
• To check this formula, our task is to show that for an element x in some
given sets, it is counted once on the RHS.
• e.g. suppose x 2 A,C,D, x /2 B
32
• Then subsets containing x, formed by intersections, are
”size” 1 A,C,D = +3
”size” 2 A \ C,A \D,C \D = �3
”size” 3 A \ C \D = +1
+1
as required.
Inclusion-Exclusion Principle – Proof
Proof: The proof follows by formalising the above example. We shall
show that the elements in A1 [A2 [ · · · [An are counted exactly once on the
RHS:
nX
k=1
(�1)k�1
X
{s1,…,sk}⇢{1,2,…,n}
|As1 \As2 \ · · · \Ask | .
Suppose x is an element in r of the sets A1, A2, . . . , An where 1 r n.
This element is counted C r
1
times by
P
|Ai | and C r2 times by
P
|Ai \Aj |.
In general, it is counted C rm times in the term involving the intersection of
m distinct sets. Thus this element is counted
C
r
1
� C r
2
+ C r
3
� · · ·+ (�1)r�1C rr =
rX
k=1
(�1)k�1C rk
times on the RHS. From our work on the binomial theorem we know
rX
k=0
(�1)kC rk = 0 )
rX
k=1
(�1)k�1C rk = C
r
0
= 1.
This completes the proof of the inclusion-exclusion principle.
19 Bijection Principle
Bijection Principle
Bijections have already been mentioned several times. e.g., between object
arrangements and words or binomial paths and words.
We have seen that it can be useful to count the same set in two di↵erent
ways (e.g. in proving binomial identities) or to replace one counting problem
with an “equivalent” problem (e.g. the small coins example).
These ideas are formalised in the following fundamental principle.
Bijection Principle
Axiom 38 (Bijection Principle (BP)). Let A and B be finite sets. If there is
a bijection from A to B, then |A | = |B | .
1. The BP only tells us two sets have the same size. It only gives us a
formula for |A | if we already have one for |B |.
33
2. With di↵erent formulae for |A | and |B | the BP gives us an identity.
3. The BP is frequently used to replace the problem of finding |A | by that
of finding |B |. To do this we must know (or prove) that a bijection
exists between A and B.
Bijection Principle – Example 1
Example: Bijective proof that Cnr =
(n)r
r!
.
Proof: We shall give a bijective proof that (n)r = r!C
n
r .
Task: ‘Find’ two sets A and B enumerated by (n)r and r!C
n
r , respectively,
and a bijection T : A! B.
Let N = {1, 2, . . . , n}
• A = the set of all r-permutations of N . By definition |A | = (n)r.
• B = {(S,⇡) |S ✓ N, |S | = r,⇡ a permutation of S} .
There are Cnr subsets of N of size r and r! permutations of each subset.
• Hence |B | = r!Cnr .
But what is T? Do a simple example to get idea for general case.
Bijection Principle – Example 1
Take n = 3, r = 2 : r! = 2, Cnr = C
3
2
= 3, (n)r = 3 · 2 = 6.
A = {12, 21, 13, 31, 23, 32} |A | = 6
Subsets: S ✓ {1, 2, 3} s.t. |S | = 2 {1, 2}, {1, 3}, {2, 3}
.& .& .&
Permutations: 12 21 13 31 23 32
) B = {({1, 2}, 12), ({1, 2}, 21), ({1, 3}, 13), ({1, 3}, 31), ({2, 3}, 23), ({2, 3}, 32)}
The bijection should be clear: T (12) = ({1, 2}, 12), T (21) = ({1, 2}, 21), . . .
General Case: If the permutation of r objects is ⇡ = a1a2 · · · ar, then
T (⇡) = ({a1, a2, . . . , ar},⇡).
Bijection Principle – Example 2
Example: Give a bijective proof of the statement:
Number of subsets of even size = Number of subsets of odd size.
Warm-up: n = 3, there are 23 = 8 subsets in total.
34
subsets of even size
;
{x1, x2}
{x1, x3}
{x2, x3}
()
subsets of odd size
{x1}
{x2}
{x3}
{x1, x2, x3}
Here, the rule is that the element x1 is to be added to a subset if it wasn’t
already present, and removed if it was present.
This works for general n and establishes the result.
Bijection Principle – Example 3
Example: Binomial Paths to Binary Words
• A = set of binomial paths: (0, 0)! (n,m).
• With D = {0, 1}, B = {w 2 D⇤ | |w | = n+m, |w |
0
= n, |w |
1
= m}.
The bijection maps east steps to the digit 0 and north steps to the digit 1.
Formally we can write this as:
p = s1s2 · · · sn+m 2 A si = step i
T (p) = d1d2 · · · dn+m 2 B
where
di =
8
<
:
0 if si = East step
1 if si = North step
Bijection Principle – General Notes
1. If |A | = |B | = n then there always exists n! bijections T : A ! B.
How? Just list all elements in A and B. Then each permutation of B
defines a T .
2. What we want is an “intrinsic” bijection. That is a map defined by the
structure of the elements rather than just their position in a list.
3. In all the examples we have defined a map T : A! B, but we have not
explicitly proved that the map is a bijection (injective and surjective).
In this course our bijections are su�ciently simple that we skip this step.
Bijection Principle – General Notes
4. The power of bijections goes well beyond the scope suggested by these
simple examples. Say the objects in A can be partitioned into subsets Ai
having property pi. The bijection may then map to subsets of Bi with
well-defined properties qi so |Ai | = |Bi |.
35
• Proving theorems about the subsets Ai may be ‘easy’ and this then
gives proofs for the subsets Bi.
• Objects in A can be partitioned into subsets with given properties in
a natural way. In B what are the corresponding properties? This
can give new and surprising insights into problem B.
20 Bijection Principle and Identities
Bijection Principle and Identities
Next we look at how to use bijections to prove identities such as
1.
✓
n
r
◆
=
✓
n
n� r
◆
2.
nX
r=0
✓
n
r
◆
= 2n
3.
✓
n
r
◆
=
✓
n� 1
r � 1
◆
+
✓
n� 1
r
◆
The general idea is to use that if LHS is |A | and RHS is |B | with T :
A! B a bijection then invoking the BP gives a proof that |A | = |B | .
Note: We have already used this idea when proving binomial identities by
counting the same set in di↵erent ways. Now we shall formally give bijections
which were only implied by these proofs.
Identity :
�
n
r
�
=
⇣
n
n�r
⌘
Bijective Proof:
• A= set of all r-subsets of an n-set.
• B= set of all (n� r)-subsets of an n-set.
Example: N = {a, b, c}, r = 1 : A = {{a}, {b}, {c}} and B =
{{a, b}, {a, c}, {b, c}}.
We know |A | =
�
n
r
�
and |B | =
⇣
n
n�r
⌘
. We need bijection T : A! B.
Let S ✓ N, |S | = r. Then
T (S) = N\S
is a bijection and S ✓ N ) |N\S | = |N | � |S | = n � r by the
complement principle so we have that N\S is a (n� r)-subset of N .
Thus by the BP we get |A | = |B | as required.
36
Identity :
nP
r=0
�
n
r
�
= 2n
Bijective Proof: The proof given in Week 2 (slide 25) is essentially
bijective.
So we just need to formalise it a bit.
• A = set of all subsets of an n-set N = {1, 2, . . . , n}.
• B = set of binary strings (words in {0, 1}) of length n.
LHS is obtained by partitioning A according to the size of the subsets.
RHS is the number of binary strings (or words) of length n.
The bijection is: Let S = {a1, a2, . . . ar} ✓ N with |S | = r. Then
T (S) = b1b2 · · · bn, bi =
8
<
:
1 if i 2 S
0 if i /2 S
Thus |A | = |B | = 2n.
Binary Words, Binomial Paths and Subsets
Previously we found a bijection between binary words and binomial paths.
The above example gives a bijection between binary words and subsets.
This gives an (indirect) bijection between binomial paths and subsets.
First reexamine binary words and binomial paths. The bijection to subsets
requires all binary words of length n. This is not quite what we had before.
Binary Words, Binomial Paths and Subsets
Bijection still maps east steps to the digit 0 and north steps to the digit
1, but
• A = set of binomial paths: (0, 0)! ???.
• B = set of binary words of length n.
A binomial path from (0, 0) ! (x, y) 2 N2
0
has length x + y so need
x+ y = n. Hence
A = set of binomial paths from (0, 0) ending on the line y = �x+ n.
The bijection from paths to subsets of N = {a1, a2, · · · , an} is: label the
steps of the path so si labelled ai then read of the labels of the North steps,
e.g.,
a3 | a4
| a2 ! {a1, a2, a4}
| a1
a1 a2 a3 | a4 ! {a4}
37
21 More on Multisets
Back to Multisets
Theorem 39 (Number of Multisets). The number of size k multisets with
elements from a set of size n is
✓✓
n
k
◆◆
=
✓
n+ k � 1
k
◆
=
✓
n+ k � 1
n� 1
◆
Proof: If this reminds you of sampling with replacement it should!
A multiset is essentially just a sample from a n-set with replacement.
More formally we note that a multiset is defined by the tuple
(am1
1
, a
m2
2
, . . . , a
mn
n ), with
nX
i=1
mi = k, mi 2 N0.
If the alphabet is ordered we only need the exponents m1,m2, . . . ,mn
which we can view as (biject to) an arrangement of k identical objects into n
boxes.
Note: The objects must be identical as we don’t care which box they go
to only how many go into a given box.
Multisets – Example
Example: 2 boxes containing 3 identical objects (represented by •).
We have the following ‘obvious’ bijections (delete the first and last “wall”
from the box):
n1 = 3, n2 = 0 n1 = 2, n2 = 1 n1 = 1, n2 = 2 n1 = 0, n2 = 3
|• • •| | | •• | • | | • | •• | | |• • •|
a b a b a b a b
" " " "
# # # #
• • • | • • | • • | • • | • • •
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
" " " "
# # # #
(a3, b0) (a2, b1) (a1, b2) (a0, b3)
or
{a, a, a} {a, a, b} {a, b, b} {b, b, b}
In the middle rows we have 4 positions and choose one position for the
wall.
Dot–Wall Diagrams
Definition 40 (Dot–Wall Diagrams). A word w 2W ⇤, where W = {•, | }, is
called a dot-wall diagram.
38
Theorem 41 (Number of Dot–Wall Diagrams). The number of dot-wall dia-
grams with m walls and k dots is
⇣
m+k
k
⌘
.
Proof: Dot-wall diagrams clearly bijects to an object arrangement of
black and white blocks on a line with (also recall Alice and her purse)
#positions = #dots + #walls.
Now choose k walls from the m+ k possible positions.
Dot–Wall Diagrams and Multisets
Dot-wall diagrams with m walls and k dots bijects to multisets of size k
chosen from a set of size n = m+ 1 (as per previous example):
T (
m=n�1 wallsz }| {
· · ·|{z}
m1
| · · ·|{z}
m2
| · · · · · · | · · ·|{z}
mn
) = (am1
1
, a
m2
2
, . . . , a
mn
n )
and from this we get
✓✓
n
k
◆◆
=
✓
n+ k � 1
k
◆
=
✓
n+ k � 1
n� 1
◆
with
nX
i=1
mi = k.
22 Multinomial Coe�cients
Multinomial Coe�cients
We can generalise the binomial coe�cients by considering blocks of more
than two colours:
1. n blocks (identical except for their colour).
2. k colours: c1, c2 · · · , ck.
3. ni blocks of colour ci with n1 + n2 + · · ·+ nk = n.
How many ways can the blocks be arranged?
Example: n = 4, n1 = n2 = 1, n3 = 2, i.e., three colours r, g, b:
rgbb, rbgb, rbbg grbb, gbrb, gbbr brgb, brbg, bbrg, bgrb, bgbr, bbgr
So 12 arrangements in total.
Multinomial Coe�cients
If we use the (object arrangements) $ (positions) duality or bijection we
can state an equivalent box filling problem: the k colours are the k boxes
(Colour ci occurs ni times) (ni objects in box ci)
39
From previous example we have:
positions:
duality
boxes:
?
r g b b
1 2 3 4
?
1 2 3,4
r g b
r b g b · · ·
1 2 3 4
?
1 3 2,4
r g b
g b r b · · ·
1 2 3 4
?
3 1 2,4
r g b
| {z }
nr = 1, ng = 1, nb = 2Multinomial Coe�cients
Theorem 42 (Multinomials). The number of ways of placing n distinct un-
ordered objects into k distinct boxes with exactly ni objects in box i is given by
the multinomial coe�cient
✓
n
n1, n2, . . . , nk
◆
=
n!
n1!n2! · · · nk!
with n1 + n2 + · · ·+ nk = n.
Example: For the previous example n = 4, n1 = 1, n2 = 1, n3 = 2 we
get
✓
4
1, 1, 2
◆
=
4!
1! 1! 2!
= 12.
Example: k = 2
✓
n
n1, n2
◆
=
n!
n1!n2!
=
n!
n1! (n� n1)!
=
✓
n
n1
◆
.
So we recover the binomials.
Multinomial Coe�cients – Proof
Proof: Similar to the proof that binomials count combinations. It is an
exercise in the practice classes.
Note: Mathematically we can write arrangements of n objects into k
boxes (order of boxes is significant while order of objects within boxes is not)
as k-tuples of sets:
Ex. 3 boxes, 7 objects:
2, 4, 5 1, 3 6, 7
?
({2, 4, 5}, {1, 3}, {6, 7})
If order within the boxes is significant then we can use k-tuples of tuples.
((2, 4, 5), (1, 3), (6, 7)) 6= ((2, 4, 5), (3, 1), (6, 7))
40
Multinomial Expansion
We have the following generalization of the binomial expansion:
Theorem 43 (Multinomial Expansion). For all n, k 2 N with m � 2 and for
commuting variables x1, . . . , xk, we have
(x1 + · · ·+ xk)
n =
X
n1,...,nk�0
n1+···+nk=n
✓
n
n1, · · · , nk
◆
x
n1
1
· · · xnkk
Proof: We will use a bijection to an “objects into boxes arrangement”.
On the LHS there are (n factors! n objects) and (k variables! k boxes).
Then ni objects goes into box i ni factors of xi in product on RHS.
Hence the coe�cient of xn1
1
x
n2
2
· · · xnkk in the expansion of the LHS is ex-
actly the multinomial coe�cient
⇣
n
n1,n2,...,nk
⌘
appearing on the RHS.
Multinomial Identities
Theorem 44 (Sum of Multinomial Coe�cients).
X
n1,...,nk�0
n1+···+nk=n
✓
n
n1, · · · , nk
◆
= kn.
Proof: Set x1 = x2 = · · · = xk = 1 in the multinomial theorem.
Multinomial Identities
Theorem 45 (Multinomial Recursion).
n+ 1
n1, n2, . . . , nk
!
=
n
n1�1, n2, . . . , nk
!
+
n
n1, n2�1, . . . , nk
!
+ · · ·+
n
n1, n2, . . . , nk�1
!
Proof: LHS counts the number of to ways to distribute n + 1 elements
in k boxes with ni elements in box i.
RHS counts the same thing by taking a given element a and placing it in
box i.
Distribute remaining n elements in k boxes with nj elements in box j 6= i
and ni�1 elements in box i. Sum over the box i in which we placed element
a.
Note: This is a generalisation of Pascal’s Triangle to multinomials.
23 Sampling with Repetition
Sampling with Repetition
Example: How many di↵erent strings can be made by reordering the
letters of the word ‘SUCCESS’ ?
41
Solution: Since the word has repeated letters the answer is not 7!
We are looking for all ordered samples of size 7 without replacement.
Let x = number of possible strings.
Label repeated letters, e.g., S1, S2, S3 and C1, C2. Then there are 7! ways
of ordering (permutations) the labelled distinct letters.
Now there are x ways of making the required string a letters. The letters
S can be permuted (or labelled) in 3! ways and the C’s can be labelled in 2!
ways.
So
7! = x(3!)(2!) ) x =
7!
3! 2!
=
✓
7
3, 2, 1, 1
◆
= 420.
Number of Samples with Repetition
Theorem 46 (Number of Samples with Repetition). If a set of k distinct
objects has repetitions, say ni repetitions of object ai, with n =
Pk
i ni, then
the number of ordered samples without replacement of size n is given
by the multinomial coe�cient
✓
n
n1, n2, . . . , nk
◆
=
n!
n1!n2! · · · nk!
Proof: This is a straightforward generalisation of the previous example.
24 Graph Theory
Graph Theory
Definition 47 (Graphs, Vertices and Edges). A graph G is an ordered pair
G = (V,E) consisting of a finite nonempty set V of objects called vertices
and a (possibly empty) set E of 2-element subsets of V called edges. Vertices
u, v 2 V are adjacent if {u, v} 2 E.
Example: G = (V,E), V = {a, b, c, d}, E = {{a, b}, {a, c}, {b, d}}
Graph Theory
Notation 48 (Vertices and Edges). The set of vertices is often denoted V (G)
and is called the vertex set of G. Similarly the set of edges is denoted E(G)
and called the edge set of G.
If u, v 2 V (G) then an edge {u, v} 2 E(G) can be denoted uv (we have
vu = uv).
Definition 49 (Order and Size). The number of vertices in G, that is |V (G) |,
is called the order of G, while |E(G) | (number of edges in G) is called the
size of G.
42
Graph Theory
Definition 50 (Adjacent and Incident). Let u, v 2 V (G) and uv 2 E(G). We
say that u and v are adjacent and that the edge uv is incident on the vertex
u (v).
It is customary to draw graphs with vertices represented as points or circles
(open or filled) and edges represented by lines between adjacent vertices.
Example: G = (V,E), V = {a, b, c, d}, E = {{a, b}, {a, c}, {b, d}}
a b
c d
=G =
b a
c d
Definition 51 (Degree of Vertices). Let u 2 V (G). The number of edges
incident on u (adjacent vertices) is called the degree of u and denoted deg(u).
Example: In the above graph deg(a) = deg(b) = 2 and deg(c) =
deg(d) = 1.
Graph Theory
Definition 52 (Complement). The complement of a graph G, denoted G0,
has the same vertex set as G, V (G0) = V (G), but an edge e 2 E(G0) if and
only if e /2 E(G).
Example: a b
c d
) G0 =G =
a b
c d
Definition 53 (Subgraph). A subgraph H of G has V (H) ✓ V (G) and
E(H) ✓ E(G) such that if uv 2 E(H) then u, v 2 V (H) (we can have
u, v 2 V (H) but uv /2 E(H)).
Example: a b
c d
G =
a
c
H1 =
a b
c d
H2 =
Complete Graph Kn
Definition 54 (Complete Graphs Kn). The complete graph on n vertices,
denoted Kn, is the graph where all vertices are mutually adjacent. Every
vertex has degree n� 1 and there are
�
n
2
�
edges.
Example:
43
K2 K3
K4
K5 K6
Note: Any graph with n vertices is a subgraph of Kn. If we colour edges
of Kn red if e 2 E(G) and blue if e /2 E(G) then the blue edges are the edges
of G0.
Walks, Trails, Paths and Cycles
Definition 55 (Walks, Trails, Paths and Cycles). A walk in a graph is an
alternating sequence
v0, e1, v1, e2, v2, . . . , ek, vk
of vertices and edges, where k � 0, v0, v1, v2, . . . , vk are vertices, and ei =
vi�1vi for i = 1, 2, . . . , k. This walk is called a (v0, vk)-walk of length k.
Note: A walk in a graph can also be specified by omitting the edges (but
in a multigraph, where parallel edges can occur, the edges should be included).
A trail is a walk with no edge repeated.
A path is a walk in which no vertex is repeated.
The length of a walk is the number of edges in it.
A cycle is a walk with distinct vertices v0, v1, . . . , vn where n � 3 and
v0 = vn.
A cycle of length n is called an n-cycle.
Trees
Definition 56 (Tree). A tree is a connected graph with no cycles.
44
Trees – Examples
Example:
Path Caterpillar
Star
Binary Tree
Leafs
Definition 57 (Leaf). A leaf in a tree is a vertex of degree 1.
Example: Leafs are red.
Equivalent Definitions of Trees
Definition 58 (Tree). 1. A connected graph with no cycles.
2. A connected graph G with |E(G)| = |V (G)|� 1.
45
3. A graph G with |E(G)| = |V (G)|� 1 and no cycles.
4. A graph in which every pair of vertices are joined by exactly one path.
Properties of Trees
A tree T with order n and size m has the following properties:
1. connected
2. no cycle
3. exactly one path between any two vertices
4. deleting an edge gives a graph with two components
5. deleting a non-leaf vertex v gives a graph with deg(v) components
6. m = n� 1
7. at least two leafs.
Forests
Definition 59 (Forest). A graph is called a forest if it has no cycles.
Equivalently, a graph is a forest if each of its components is a tree.
A forest may or may not be connected. A forest is connected i↵ it is a
tree.
Theorem 60. A forest F of order n has exactly n� k(F ) edges, where k(F )
is the number of components of F .
Proof: Let k = k(F ) and let T1, . . . , Tk denote the components of F .
Then each Ti is a tree and so |E(Ti)| = |V (Ti)|� 1. Hence
|E(F )| =
kX
i=1
|E(Ti)| =
kX
i=1
(|V (Ti)|� 1) =
kX
i=1
|V (Ti)|
!
� k = n� k
Spanning Trees
Definition 61 (Spanning Tree). A spanning tree of a graph G is a spanning
subgraph of G that is a tree.
Example:
G Spanning trees of G
46
Note: Construct spanning trees by repeatedly deleting an edge that lies
on a cycle (edges that are not bridges).
25 Pigeonhole Principles
Back to Pigeonhole Principles
There are many di↵erent formulations of Pigeonhole Principles:
1. n+ 1 pigeons in n holes ) at least one hole has two pigeons.
2. When m balls go into n boxes at least one box contains at least dmn e
balls.
3. a1 + a2 � 1 balls in 2 boxes ) either box 1 has at least a1 balls or box
2 has at least a2 balls.
4. More generally if m = a1 + · · · + an � (n � 1) balls are placed into n
boxes then at least one box, say j, contains at least aj balls.
Often easy to prove when proposition written in the contrapositive form.
Consider the third statement: if there are less than a1 balls in box 1 and less
that a2 balls in box 2, then there is not a1 + a2 � 1 balls in total.
Note: The values n+1 and a1+a2�1 are the smallest possible for which
statements 1 and 3 are true.
Statement 1 is a special case of statement 4 with ai = 2 for all i.
Examples
Example: 51 numbers from the integers 1 n 100 are chosen. Show
that at least two of them must be consecutive.
Solution: Form 50 pigeonholes labelled by (2j�1, 2j) for j = 1, 2, . . . , 50.
⇤ ⇤ · · · ⇤
box(1, 2) box(3, 4) box(99, 100)
Place the number k from the set in box (2j�1, 2j) if k = 2j�1 or k = 2j.
There are 51 numbers in the range from 1 to 100 to be placed so at least one
of the boxes contains two numbers. Thus these numbers are consecutive.
Examples
Example: There are 5 points in a square of side length 2. Show that
there are 2 points having a distance less than or equal to
p
2.
Solution: Divide the square up into 4 sub-squares of side length 1.
The maximum separation of points in the unit square is
p
2 (= the length
of the hypotenuse). Since 5 points are placed into 4 unit squares then two or
more points must be placed into the same unit square. But points in a unit
square are no more than
p
2 units apart.
47
Examples
Example: In an 11 week period, a chess player plays at least one game
a day, but no more than 12 games in a week. Show that there is a succession
of consecutive days in which exactly 21 games have been played.
Solution: Let ai = the number of games played on day i, 1 i 77 (=
11 weeks).
Let bi = number of games played up to and including day i, i.e.,
bi = a1 + a2 + · · ·+ ai
Since at least 1 game is played each day,
1 b1 < b2 < b3 < · · · < b77
Also, since no more than 12 games are played in a week,
b77 12⇥ 11 = 132
Example Continued
Now introduce ’21’:
We know that
b1 + 21 < b2 + 21 < · · · < b77 + 21 132 + 21 = 153
Now consider the 154 numbers
b1, b2, . . . , b77, b1 + 21, b2 + 21, . . . , b77 + 21
These are between 1 and 153, so by the pigeonhole principle we can con-
clude that at least two must be the same.
But b1, b2, . . . , b77 are distinct, as are b1 + 21, b2 + 21, . . . , b77 + 21.
Hence we must have bi = bj + 21 for some i > j
) 21 = bi � bj =
iX
k=j+1
ak
Thus on the consecutive days j + 1, j + 2, . . . , i exactly 21 games were
played.
26 Ramsey Theory
Ramsey Theory
Introduction: Loosely speaking Ramsey Theory is concerned with finding
the smallest size system of objects certain to satisfy some given property:
48
• Every size n configuration has property p.
• At least one size n� 1 configuration does not have property p.
Ramsey Theory is thus concerned with existence theorems as was the Pi-
geon Hole Principle, and indeed Ramsey Theory can be viewed as an advanced
generalisation of the PHP.
Example: Restate the pigeon hole problem in a Ramsey like fashion.
Question: What is the minimum number of balls, m, placed into n boxes
such that there is certainly at least one box with 2 or more balls?
Answer: Clearly m � n + 1 (if m = n a ‘1 ball per box’ configuration
exists). The PHP tells us that m n+ 1 ) m = n+ 1.
Ramsey Theory
The metastatement of Ramsey theory is: “complete disorder is impossi-
ble”.
In other words, in a large system, however complicated, there is always a
smaller subsystem which exhibits some sort of special structure.
Perhaps the oldest answered question of this type is the following.
Example: What is the smallest group of people such that
1. All pairs are either friends or strangers and
2. Surely there are at least 3 mutual friends or 3 mutual strangers.
We can represent this problem as a graph. People are vertices, friendships
are indicated by an edge while there is no edge between strangers.
a b
c
d
e
)
{b, c, d} are mutual friends
{a, b, e} and {a, c, e} are mutual strangers
Ramsey Theory
We can now rephrase the problem in a more formal way. Let Rn be the
set of all possible graphs of order n.
Question: What is the minimum value of n such that
A Every G 2 Rn has the property
(a) G contains the subgraph K3 (a triangle)
OR
(b) the complement of G contains the subgraph K3.
B At least one G = Rn�1 does not have the above property.
49
Rather than looking at the complement it is visually easier and equivalent
to colour edges red (friends) or blue (strangers). Previous example then gives
a b
c
d
e
)
{b, c, d} are mutual friends ! Red K3
{a, b, e} and {a, c, e} are mutual strangers
! Blue K3
Ramsey Theory
Theorem 62 (3 Mutual Friends/Stangers). In any group of 6 or more people
(all pairs either friends or strangers) there always exists at least 3 mutual
friends or 3 mutual strangers. Moreover 6 is the smallest number of people in
a group that guarantees this is true.
The equivalent more formal graph theoretical theorem says
Theorem 63 (K3 Subgraphs of a Bicoloured Kn). In any bicolouring of Kn
with n � 6 there is a monochromatic K3 and n = 6 is the smallest number for
which this is true.
Example: K5 can be bicoloured with no monochromatic K3:
Ramsey Theory
The general problem concerns edge-colourings of the complete graph Kn.
Definition 64 (Ramsey Numbers). The Ramsey number R(a, b) is the
smallest value of n s.t. an edge-colouring of Kn using red and blue edges
always contains either a subgraph Ka in red or a subgraph Kb in blue. By
symmetry (interchange colours) we have R(a, b) = R(b, a).
Example: The Ramsey number R(3, 3) is the smallest value of n such
that two colouring Kn always give a red triangle or a blue triangle.
The previous theorem tells us that R(3, 3) = 6.
Example: R(4, 4) is the smallest number n such that any two colouring
of Kn always contains a red K4 or a blue K4. It has been proved that R(4, 4) =
18.
Red K4 Blue K4
50
R(2, n)
Theorem 65 (R(2, n) = R(n, 2) = n). The Ramsey numbers R(2, n) =
R(n, 2) = n.
Example: Consider the case R(2, 3).
R(2, 3) is the smallest value of n such that a two colouring of Kn always
gives a red K2 (an edge) or a blue K3 (a triangle).
R(2, 3) > 2 since by colouring K2 blue we do not have the sought pattern.
Consider now the 2-colouring of K3. We can have:
or or or
+ edge permutations + edge permutations
all of which satisfy the condition, ) R(2, 3) 3.
Combining with R(2, 3) > 2 gives R(2, 3) = 3.
Proof that R(2, n) = n
Proof: This is essentially a straightforward generalization of the R(2, 3)
case.
We must have R(2, n) > n� 1. By colouring Kn�1 blue we do not have a
subgraph Kn in blue or a subgraph K2 in red.
Consider a 2-colouring of Kn: Using blue to colour all the edges gives Kn
blue as a subgraph (as required).
All other colourings have at least one red edge.
Hence for all 2-colourings of Kn, there is either a red edge or a blue Kn.
This tells us that: R(2, n) n.
Thus
R(2, n) = n (as R(2, n) > n� 1).
Proof that R(3, 3) = 6
Theorem 66 (R(3, 3) = 6). The Ramsey number R(3, 3) = 6.
Proof: We already showed explicitly that we can bicolour K5 so there
isn’t a red K3 or a blue K3. Thus, R(3, 3) > 5.
Show via the pigeonhole principle that R(3, 3) 6. Key argument:
• Start with K6 and single out one of the vertices.
• There are 5 edges to the other vertices and by the pigeonhole principle
there must either be at least 3 red edges or 3 blue edges.
51
• Assume that at least 3 edges are red. If the adjacent vertices have any
red edges between them we must have a red triangle; else all the edges
between these vertices are blue, in which case we have a blue triangle.
• Hence, any two-coloured complete graph of 6 vertices must have either
a blue or red triangle. Thus R(3, 3) 6.
Combining results we get R(3, 3) = 6.
27 Ramsey Numbers
Ramsey Numbers
Next we shall prove that R(3, 4) = 9. After this it should be clear that
calculating R(a, b) is a very hard problem.
The general methods is as for R(3, 3).
1. First we establish a lower bound by explicitly finding a two-colouring of
a Kn which has no red K3 or blue K4. For the R(3, 4) case this is K8
showing that R(3, 4) > 8.
2. Next we establish the upper bound R(3, 4) 9.
This entails proving the general results:
(a) The general upper bound R(a, b) R(a�1, b) +R(a, b�1), refined
to
(b) R(a, b) R(a�1, b) + R(a, b�1) � 1, when R(a�1, b), R(a, b�1)
even.
We shall also prove these general results.
3. The Ramsey number R(a, b) is finite ) R(a, b) is well-defined.
4. The general upper bound: R(a, b)
⇣
a+b�2
a�1
⌘
.
Lower Bound on R(3, 4)
We show that we have the lower bound R(3, 4) > 8 by presenting an
explicit two-colouring of K8 containing neither a red K3 nor a blue K4.
• K8 has total number of edges equal to
�
8
2
�
= 28.
• Mark in the red edges (no K3’s allowed).
• Furthermore, the remaining edges don’t permit any blue K4.
K8 K8 without red K3 K8 without blue K4
52
Handshaking Theorem – Graph Version
Theorem 67 (Handshaking Theorem). In any graph G the number of odd
degree vertices is even.
Proof: Let G be a graph with |V (G) | = p and |E(G) | = q. Then
X
v2V (G)
deg(v) = 2q,
since each edge is counted twice once for each incident vertex.
Let Ve denote the set of even degree vertices and Vo the set of odd degree
vertices of G. Then
X
v2V (G)
deg(v) =
X
v2Ve
deg(v) +
X
v2Vo
deg(v) = 2q,
so that X
v2Vo
deg(v) = 2q �
X
v2Ve
deg(v).
Every term on the RHS is even so the sum on the LHS is even. A sum of
odd numbers is even if and only if it contains an even number of terms.
Upper Bound on Ramsey Numbers
Theorem 68 (Upper Bound). The Ramsey numbers satisfy the inequality
R(a, b) R(a� 1, b) +R(a, b� 1).
This can be sharpened to
Theorem 69 (Upper Bound for R(a�1, b) and R(a, b�1) Even). The Ramsey
numbers satisfy the inequality
R(a, b) R(a� 1, b) +R(a, b� 1)� 1,
when R(a� 1, b) and R(a, b� 1) are even.
• The first bound implies that R(3, 4) R(2, 4) +R(3, 3) = 4 + 6 = 10,
which is then sharpened to R(3, 4) R(2, 4) +R(3, 3)� 1 = 9.
• Combined with the lower bound R(3, 4) is thus equal to 9.
Upper Bound – Proof
Proof: Argument similar (in structure) to proof of upper bound for
R(3, 3).
Form a graph with R(a� 1, b) +R(a, b� 1) vertices.
Single out a particular vertex, say x, which is connected to the remaining
R(a� 1, b) +R(a, b� 1)� 1 vertices by red and blue edges.
A version of the pigeonhole principle says that if we put a1 + a2 � 1 balls
in two boxes, either box 1 contains at least a1 balls or box 2 contains at least
a2 balls.
Therefore, there must be either R(a � 1, b) red edges or R(a, b � 1) blue
edges.
53
Upper Bound – Proof
Say there are R(a�1, b) red edges. They are incident to R(a�1, b) vertices.
By definition of the Ramsey number, these vertices either have a red Ka�1,
which together with the edges incident from x form Ka, or they have a blue
Kb.
Similarly for the case when there are R(a, b� 1) blue edges.
Thus, any graph with R(a� 1, b)+R(a, b� 1) vertices must either contain
a red Ka or a blue Kb, and hence R(a, b) R(a� 1, b) +R(a, b� 1).
Upper Bound – Proof – Even Case
If R(a� 1, b) and R(a, b� 1) are even, we can improve the bound to:
R(a, b) R(a� 1, b) +R(a, b� 1)� 1.
Proof: Form a graph with R(a� 1, b) +R(a, b� 1)� 1 vertices and single
out a special vertex, x. There are now 2 cases to consider:
(a) At least R(a� 1, b) red edges or at least R(a, b� 1) blue edges.
(b) R(a� 1, b)� 1 red edges and R(a, b� 1)� 1 blue edges.
In case (a) we have subgraphs of the sought type as proved above.
For case (b) to be relevant all vertices must have the stated property )
all vertices have R(a� 1, b)� 1 red edges and R(a, b� 1)� 1 blue edges.
But this leads to a contradiction: The total number of red edges is
(R(a� 1, b) +R(a, b� 1)� 1)(R(a� 1, b)� 1).
This is an odd number times an odd number.
This is not possible by the Handshaking Theorem.
Known Results
Small Ramsey Numbers, http://www.combinatorics.org under Surveys
by Stanislaw P. Radziszowski. Values and bounds for R(a, b).
b 3 4 5 6 7 8 9 10 11 12
a
3 6 9 14 18 23 28 36
40
42
47
50
53
59
4 18 25
36
41
49
61
59
84
73
115
92
149
102
191
128
238
5
43
48
58
87
80
143
101
216
133
316
149
442
183
633
203
848
6
102
165
115
298
134
495
183
780
204
1171
256
1804
294
2566
7
205
540
217
1031
252
1713
292
2826
405
4553
417
6954
8
282
1870
329
3583
343
6090 10630 16944
9
565
6588
581
12677 22325 38832
10
798
23556 45881 81123
54
More Ramsey Theory
Theorem 70 (R(a, b) is finite). The Ramsey number R(a, b) is finite and
R(a, b) is a well-defined function.
Proof: We showed that R(a, b) R(a � 1, b) + R(a, b � 1), valid for
a, b � 3.
We also proved that R(2, n) = R(n, 2) = n.
This guarantees that all Ramsey numbers are finite.
Apply inequality repeatedly to RHS:
R(a, b) R(a� 1, b) +R(a, b� 1)
R(a� 2, b) + 2R(a� 1, b� 1) +R(a, b� 2)
R(a� 3, b) + 3R(a� 2, b� 1) + 3R(a� 1, b� 2) +R(a, b� 3)
…
Remind you of something?
The process stops once all terms are known.
Thus since a, b are finite we get a finite sum of integers.
Upper Bound: R(a, b)
⇣
a+b�2
a�1
⌘
Theorem 71 (Upper Bound). The Ramsey numbers
R(a, b)
✓
a+ b� 2
a� 1
◆
.
Proof: We will prove it by using induction on the value of a+ b.
Base case: a + b = 4, R(2, 2) = 2. RHS =
⇣
2+2�2
2�1
⌘
=
�
2
1
�
= 2, hence the
result is true.
Suppose it is true for a + b = p � 1. We need to prove that it is true for
a+ b = p:
R(a, b)
| {z }
a+b=p
case p
R(a� 1, b)
| {z }
a+b=p�1
case p�1
+R(a, b� 1)
| {z }
a+b=p�1
case p�1
✓
a+ b� 3
a� 2
◆
+
✓
a+ b� 3
a� 1
◆
=
✓
a+ b� 2
a� 1
◆
Here we used that
�
n
k
�
=
�
n�1
k
�
+
⇣
n�1
k�1
⌘
.
Generalisations of Ramsey Problems
Example: Suppose in a group of 17 people, each pair are either friends,
strangers or enemies. Show that there must always by either 3 mutual friends,
3 mutual strangers or 3 mutual enemies.
1. After singling out 1 person there are 16 persons left to put into 3 groups.
By the PHP a group contains at least 6 people. Suppose this is group 1.
55
2. Group 1 are friends of person 1 so if there is a pair of mutual friends in
group 1 this pair and person 1 form 3 mutual friends.
3. If there are no mutual friends in group 1 then they are either mutual
strangers or enemies. Since R(3, 3) = 6 we can conclude that there is at
least 3 mutual strangers or 3 mutual enemies in group 1.
4. If there is less than 6 people in group 1 then either group 2 or group 3 has
at least 6 people and similar arguments applies with ‘friends’ exchanged
for ‘strangers’ or ‘enemies’ as appropriate.
Note: R(3, 3, 3) = 17 is the only non-trivial results for 3 colours. No
results are known for more than 3 colours. So this ain’t an easy problem.
28 Parity
Parity Arguments
Parity refers to the even/odd property of integers.
odd integer ! odd parity
even integer ! even parity
Notation 72 (Parity). If n 2 Z, then parity(n) denotes the parity of n.
Like the pigeonhole principle it is an elementary idea and it can make
apparently di�cult problems simpler.
We already used a parity type argument in a Ramsey Theory proof.
In the proof of the sharpened upper bound for Ramsey numbers we used
that the product of two odd numbers is an odd number.
Parity arguments are not about counting but more about existence, prov-
ing whether a statement is true or not, demonstrating whether certain events
can happen or not, etc..
Exercise
Example: On a table sits 5 drinking glasses all turned upside down.
Turning over two glasses at a time can we make all glasses right way up?
Turning over three glasses at a time can we make all glasses right way up?
Solution: Let T = # up glasses. When we flip two glasses, we could
have
2 up! 2 down, �T = �2;
1 up, 1 down! 1 down, 1 up, �T = 0;
2 down! 2 up, �T = 2.
Starting with T even it must remain even.
Starting with T = 0 we can never reach T = 5.
Turning over 3 glasses at a time change T by ±3,±1.
56
Answer is yes! Every move changes the parity of T .
Explicitly we have (glasses to be turned in boldface):
ddddd! uuudd! uddud! uuuuu
Example
Example: Let (x, y, z) 2 R3 be a point with positive integer coordi-
nates. Show that if we choose 9 such points then the midpoint of at least one
pair of these points has integer coordinates.
Proof: Consider any two points (x1, y1, z1) and (x2, y2, z2) in R3 with
integer coordinates. The midpoint of these is
((x2 + x1)/2, (y2 + y1)/2, (z2 + z1)/2).
The midpoint has integer coordinates if and only if x1 and x2 have the
same parity, y1 and y2 have the same parity and z1 and z2 have the same
parity.
(x, y, z) has 8 possible triples of parity (such as (even,odd,odd)).
By the PHP at least two of the nine points have the same triple of parities.
The midpoint of two such points has integer coordinates.
Handshaking Theorem
Theorem 73 (Handshaking Theorem). At a party the number of people who
shake an odd number of hands is even.
Example: Let there be 5 people indicate that any two have shaken hands
by drawing an edge between them.
If V = {1, 2, 3, 4, 5} and E = {{1, 4}, {2, 4}, {2, 5}, {3, 5}, {4, 5}} then 1, 3,
4, 5 shook hands an odd number of times.
1 2 3
4 5
Parity and Subdivisions
Starting with the interval [0, 1]:
1. Subdivide into any finite number of subintervals
0 1
#
57
0 1
Here five ticks gives us six intervals.
2. Label each new tick 0 or 1 (in any way):
0 11 1 0 1 0
Parity and Subdivisions
Notation 74. #(a, b) denotes the number of intervals of type (a, b).
Note: There are no ticks between the labels a and b.
Question: What is the parity of (#(0, 1) + #(1, 0))?
• For our previous example we have:
0 11 1 0 1 0
• 3 (green) (0,1) intervals, 2 red (1,0) intervals. Note that green and red
intervals alternate.
29 Sperner’s Lemma for an Interval
Sperner’s Lemma for an Interval
Lemma 75 (Sperner’s Lemma for an Interval). The total number of (0, 1)
and (1, 0) subintervals is odd.
Proof: Suppose we have a given subdivision. Idea is to track the change
in (#(0, 1) + #(1, 0)) as a single tick is added (an interval is divided). There
are four types of intervals and two possible labels for the new tick:
00! 000 (+0) 00! 010 (+2)
01! 001 (+0) 01! 011 (+0)
10! 100 (+0) 10! 110 (+0)
11! 101 (+2) 11! 111 (+0)
The number of (0,1) and (1,0) intervals always changes by an even number.
Given that the initial subdivision had exactly 1 such interval, any subdi-
vision resulting from a sequence of insertions must also have an odd number
of (0,1) and (1,0) intervals.
58
Sperner’s Lemma for an Interval
Here is an alternative proof. It is more involved, but it will prepare us for
the generalisation of Sperner’s Lemma to two dimensions.
Alternative Proof: Idea: Map subdivisions to floor plans of 1D rooms.
Biject intervals to sequence of rooms and doors:
1. Interval ! Room
2. ‘1’ tick ! Door
3. ‘0’ tick ! Wall
0 1 1 0 1 0 1
Note: Every room has at most two doors.
Sperner’s Lemma for an Interval
Consider “room paths” defined such that:
1. Paths always start and end in one door rooms. The “outside” is con-
sidered a one-door room.
2. Paths traverses each door exactly once.
0 1 1 0 1 0 1
Paths:
Sperner’s Lemma for an Interval
There are two possible types of room path
1. Both ends are inside (an in-in path).
2. One end inside and the other end outside (in-out path).
We then have that
# rooms with one door = 2(# in-in paths) + (# in-out paths)
Clearly there is only one in-out path so the parity above is odd.
But # rooms with one door = # di↵erent labelled intervals
So the total number of (0, 1) and (1, 0) subintervals is odd.
59
30 Intermediate Value Theorem
Sperner’s Lemma and Intermediate Value Theorem
We shall now make one of our few excursions into the continuum and relate
Sperner’s lemma to the intermediate value theorem for continuous functions.
Theorem 76 (Intermediate Value Theorem). Let f : R 7! R be a continuous
function, a, b 2 R, a < b with f(a) < f(b). For any y 2 R with f(a) < y <
f(b) there exists x 2 (a, b) such that f(x) = y.
The relation is easier to see if we look to the equivalent statement:
Theorem 77. Let f : [0, 1] 7! [�1, 1] be continuous and suppose that f(0) <
0, f(1) > 0.
Then there exists x 2 (0, 1) such that f(x) = 0.
We show how Sperner’s lemma for the interval implies the second theorem.
Intermediate Value Theorem – Proof
Proof: Let f : [0, 1] 7! [�1, 1] be continuous with f(0) < 0, f(1) > 0.
Assume there is no x 2 (0, 1) such that f(x) = 0.
Divide the interval [0, 1] at equidistant points and label the tick at position
a in the subdivision by zero if f(a) < 0 and by 1 if f(a) > 0.
Sperner’s lemma guarantees a (0, 1) subinterval. Repeat this procedure
on the (0, 1) subinterval and so on.
Define two sequences {an} and {bn} such that an is the position of the ‘0’
tick and bn the position of the ‘1’ tick after n subdivisions.
We now appeal to the Bolzano-Weierstrass theorem to show that both {an}
and {bn} must contain convergent subsequences {ank} and {bnk}.
Since |an � bn| 1/2n these subsequences must have a common limit x.
By the continuity of f :
0 � lim
k!1
f(ank) = f( lim
k!1
ank) = f(x)
0 lim
k!1
f(bnk) = f( lim
k!1
bnk) = f(x)
Note: Labels merge at x so can view x as labelled both 0 and 1.
Parity in two-dimensional floor plans
Setup: Two dimensional floor plans where each room has at most 2 doors.
Why a max of 2 doors? Paths must enter via one door and exit (if possible)
via another unique door.
Example: Indicate doors by a red circle:
60
Draw all possible paths on the floor plan above. Three kinds of paths:
1. Completely inside: Path joins 2 rooms with a single door at either end.
2. Inside to outside: Path from room with a single door exit via outside
door.
3. Outside to outside: Path entering and exiting via outside doors.
Parity in two-dimensional floor plans
Theorem 78 (Floor Plan Lemma). # (rooms with one door) has the same
parity as # (outside doors).
Proof: Consider all possible paths.Three natural ways to categorize these:
1. Completely inside (one door room to one door room). Count p11.
2. Inside to outside (one door room to outside). Count p10.
3. Outside to outside. Count p00.
The number of rooms with one door and the number of outside doors is:
# of rooms with one door = 2p11 + p10
# outside doors = 2p00 + p10
Numbers are expressed as (even number) + p10 and hence have same
parity.
31 Sperner’s Lemma
Sperner’s Lemma – Setup
Suppose we are given a triangle whose main vertices are labelled 1, 2, and
3.
This triangle is then triangulated (subdivided into triangles) such that each
vertex is labelled 1, 2, and 3 with the restrictions that:
Definition 79 (Sperner Labelling). 1. Vertices on the edge (a,b) of the
large triangle are labelled a or b.
2. Interior vertices may be labelled arbitrarily
1 2
3
7�!
1 2
3
3 2
2
7�!
1 2
3
3 2
22 1
3
3
3
2
1
3 3
61
Sperner’s Labelling – Example
Example:
1 2
3
3
1
2
1
3
3
2
?
?
?
Sperner’s Lemma
Theorem 80 (Sperner’s Lemma). Any Sperner labelled triangulation (of a
triangle) contains an odd number of triangles possesing all the labels (1,2,3).
Proof: Basic idea: Bijection between Sperner triangulations and floor
plans.
Use FP lemma and Sperner’s lemma for an interval to prove parity state-
ment.
Map of triangulations to floor plans:
1. Triangles are rooms.
2. Edges labelled (1,2) have a door. All other edges are walls.
The only possible (1,2) triangles are:
1
�3
2 1
�2
2 1
�1
2
) all rooms have at most 2 doors and only (1,2,3)-triangles have one
door.
Sperner’s Lemma – Proof
Floor plan lemma ) parity of outside doors is the same as the parity of
rooms with a single door.
All single doors must lie on the (1,2) edge: subdivision lemma ) there
must be an odd number of (1,2) and (2,1) intervals.
Number of outside doors is odd ) number of rooms with a single door is
odd.
Thus the number of triangles labelled (1,2,3) must be odd, as required.
62
Aside: This argument can be generalized to higher dimensions, where one
labels faces of simplices (tetrahedra, and generalizations). Note that we re-
quired that the theorem was true in one dimension, in order to prove the
theorem in two dimensions. Can prove for all dimensions via induction.
Sperner’s Lemma – Illustration
Doors marked by red circles:
1 2
3
1
2
1
3
3
2
?
?
?
32 Brouwer fixed point theorem
Brouwer Fixed Point Theorem in 1D
Theorem 81 (Brouwer Fixed Point Theorem in 1D). f : [0, 1] 7! [0, 1] con-
tinuous. Then f has a fixed point x⇤ 2 [0, 1], i.e., f(x⇤) = x⇤.
Proof: Let f : [0, 1] 7! [0, 1] continuous.
Define g(x) = x� f(x).
From the intermediate value theorem we have that:
g : [0, 1] 7! [�1, 1] continuous with g(0) = �f(0) 0 and g(1) = 1�f(1) �
0
) 9×0 2 [0, 1] for which g(x0) = 0 ) f(x0) = x0.
Brouwer Fixed Point Theorem in 1D – Proof
As a warm-up to the 2D case we give a proof for 1D using Sperner’s Lemma
in a di↵erent setting to the proof for the intermediate value theorem.
We shall embed a 1D interval in 2D and consider
I = {(x1, x2) 2 R2 |x1 + x2 = 1, x1, x2 � 0}.
Let f : I 7! I be a continuous function and let x = (x1, x2). Then
f(I) = {( f1, f2) 2 R2 | f1(x) + f2(x) = 1, f1(x), f2(x) � 0}.
63
1
1
x1
x2
7�!
f
1
1
f1
f2
Brouwer Fixed Point Theorem in 1D – Proof
Now divide I into smaller segments by placing ticks (or vertices) along I.
Look at such a vertex at say a = (a1, a2) 2 I, then we can label this vertex
‘1’ if f1(a) a1 and ‘2’ if f2(a) a2.
Note that the restrictions a1 + a2 = 1 and f1(a) + f2(a) = 1 with all
numbers positive guarantees that one or the other inequality from above must
hold.
The vertex (1, 0) can be labelled ‘1’ since we can’t have f2(1, 0) < 0 and (0, 1) can be labelled ‘2’ since we can’t have f1(0, 1) < 0. This then gives us a Sperner labelling of I and we can proceed along the lines of the proof of the intermediate value theorem to argue that there must exist a convergent sequence to x⇤ such that f(x⇤) = x⇤ (the point x⇤ is labelled both 1 and 2). We naturally must ensure that the length of the sub-intervals vanish as the number of sub-divivions goes to 1. Brouwer Fixed Point Theorem Theorem 82 (Brouwer Fixed Point Theorem for Triangles). Let � denote a triangle in the plane. Every continuous function f : � 7! � has a fixed point in �. • Proved here in the case of a triangle, but this can be generalised to the unit disc or any other two-dimensional region (without a hole). • Generalises the fact that every continuous mapping from the unit interval to the unit interval has a fixed point. • Can be straightforwardly generalised to arbitrary dimension by using Sperner’s lemma for simplices in arbitrary dimension. • Important, fundamental result in topology. Brouwer Fixed Point Theorem – Proof Proof: Choose a triangular region, �, in three dimensional space, with vertices (1,0,0), (0,1,0), and (0,0,1), corresponding to a region in the plane defined by the equation x1 + x2 + x3 = 1. This has the (slight) notational advantage that the vertices have extremal values of the coordinate directions. 64 Consider a continuous function f which maps this triangular region onto itself. (0, 1, 0) (0, 0, 1) (1, 0, 0) x1 x2 x3 7�! f (0, 1, 0) (0, 0, 1) (1, 0, 0) f1 f2 f3 Brouwer Fixed Point Theorem – Proof Let Sj be a sequence of triangulations of the triangle �, such that Sj is obtained from Sj�1 via further subdivisions, i.e., the vertices in Sj�1 form a proper subset of the vertices in Sj . We impose the condition that all triangles in Sj approach size 0 as j !1. In particular, we define d(Sj) to be the maximum edge length in Sj and impose the condition lim j!1 d(Sj) = 0. Assign labels to vertices in Sj as a value of i such that fi(x) xi. There is always such an i since P i xi = P i fi(x) = 1. Vertex (1,0,0) can be labelled 1 since we can’t have f2(x) < 0 or f3(x) < 0. Similarly the vertex (0,1,0) can be labelled 2 and (0,0,1) can be labelled 3. Brouwer Fixed Point Theorem – Proof Vertices on the edge between (1,0,0) and (0,1,0) can be labelled 1 or 2, vertices between (0,1,0) and (0,0,1) by 2 or 3, and vertices between (1,0,0) and (0,0,1) by 1 or 3. Interior vertices may have labels 1, 2 or 3. Thus each triangulation Sj satisfies the conditions of Sperner’s lemma; hence we are guaranteed that for each Sj there will be at least one triangle with vertices labelled 1, 2, and 3. The vertices of these triangles labelled 1, 2, and 3 form an infinite sequence. Due to the compactness of �, we are guaranteed that there must be a sub- sequence of triangles whose vertices converge to a point x⇤ 2 � (in practice, we can consider this subsequence as arising from finer and finer subdivisions around x⇤). The point x⇤ has all three labels. f is continuous so for x(j) 2 Sj , lim j!1 f(x(j)) = f( lim j!1 x(j)) = f(x⇤). 65 But if x⇤ has all three labels f(x⇤) = x⇤, which establishes the result. A point of detail • In 1-D, we always have a Sperner labelling for each subdivision - this gives our nesting. • Compare to 2-D: A subtriangle is of the form 1 a 2 3 . We can’t be sure that this is a Sperner labelling - e.g. only 1, 3’s along the bottom. • To get around this, we subdivide by a half, say, at each step. • According to Sperner’s lemma in 2-D, there is at least one subtriangle (1, 2, 3) for each such subdivision. • We record the coordinates of one from each sequence member. • Appeal to the Boltzano-Weierstrass theorem from real analysis guaran- tees a convergent sub-sequence that converges to a point, that further- more involves nested subtriangles. 33 Lattice Paths Lattice Paths We shall confine ourselves to 2 dimensions and (largely) to paths on the integer grid N2 0 also known as the first quadrant of the square lattice. Most definitions can easily be generalised to higher dimensions and other lattices. Definition 83 (Lattice Path). A lattice path of length n is a sequence v0v1 · · · vn of vertices vi = (xi, yi) 2 N20 where N0 = {0, 1, 2, . . .}, subject to the conditions: 1. v0 = (0, 0). 2. Steps, si = vi � vi�i 2 S, where S is some step set. Example: Binomial paths are lattice paths with S = {(0, 1), (1, 0)}. The step set S is just: (1, 0) $ East step and (0, 1) $ North step. Then a path, say, ENEN $ (0, 0)(1, 0) | {z } s1=(1,0) (1, 1) (2, 1)(2, 2) | {z } s4=(0,1) . Note: Vertex notation is good for definitions but otherwise cumber- some. 34 Ballot Paths Ballot Paths Definition 84 (Ballot Paths). Ballot paths are lattice paths with the addi- tional constraint: 66 3. If vi = (xi, yi) is a vertex in the paths then xi yi for all i = 1, 2, . . . , n. If vn = (m,m+ h) denote set of all such ballot paths by B h m. Note: All vertices in a ballot path lie on or above the line y = x. Example: b = NENNE 2 B1 2 , but b0 = NEENNE is not a ballot path: 0 1 2 3 0 1 2 3 0 1 2 3 0 1 2 3 Note: In a binomial ballot path b = s1s2 . . . sn with sj 2 {N,E} in any subword b0 = s1s2 . . . sk we have | b0 |N � | b 0 |E , i.e., at any stage of the path the number of north steps cannot be less than the number of east steps. General Lattice Paths We shall generally only consider binomial ballot paths (ballot paths for short). However, many interesting problems involve more involved step sets. The only restriction on a step set is that any step must connect vertices in N2 0 and very often the path can only visit a vertex once (self-avoiding paths). Example: 1. S1 = {(0, 1), (1, 0), (1, 1)}. Words in alphabet: {N,E,D}. 2. S2 = {(0, 1), (1, 0), (0,�1)}. Words in alphabet: {N,E, S}. 0 1 2 3 4 5 0 1 2 3 4 5 w1 = NEDNDNE (ballot path) 0 1 2 3 4 5 0 1 2 3 4 5 w2 = NEENNESSENNNNESSS Back to Ballot Paths Geometrically we have: 67 (0, 0) (m, 0) (m,m) (m,m+ h) 1. h = height above y = x. 2. m+h = y-coor of last vertex. B h m 1. m = # east steps. 2. m = x-coor of last vertex. Number of Ballot Paths How many ballot paths are there? Example: 68 B 2 1 ={ }, ��B21 �� = 3 { }B02 = , ��B02 �� = 2 { }B12 = , ��B12 �� = 5 Ballot Number Recurrence Theorem 85 (Ballot Number Recurrence). The number of ballot paths satisfy the recurrence ���Bhm ��� = ���Bh�1m ���+ ���Bh+1m�1 ��� , h � 0 with ���Bhm ��� = 0 for h < 0 and ���Bh0 ��� = 1. Proof: Every b 2 Bhm can be uniquely obtained by adding one (north) step to a b0 2 Bh�1m (h > 0) or one (east) step to a b00 2 B
h+1
m�1.
Note: This is a bijective proof. A b 2 Bhm is mapped to a unique
b
0 2 Bh�1m [B
h+1
m�1. The map deletes the last step of b.
y = x
2 Bh�1m (h > 0)
2 Bhm.
2 Bh+1m�1
#
Ballot Triangle
Theorem gives the ballot triangle:
69
���Bh�1m
���
���Bh+1m�1
���
& .+
���Bhm
���
���Bhm
��� = 0, h < 0
���Bh0
��� = 1
1
0 1
1 1
0 2 1
2 3 1
0 5 4 1
5 9 5 1
0 14 14 6 1
14 28 20 7 1
# h=�1
# h=0
# h=1
# h=2
# h=3
# h=4
m=0 &
m=1 &
m=2 &
m=3 &
m=4 &
Ballot Grid
Can also write counts on the m⇥ (m+ h) grid (above y = x):
0 1 2 3 4 5 6
0
1
2
3
4
5
6
y = x
m :
m+ h
0
0
0
0
0
0
1
1
1
1
1
1
1
1
2
3
4
5
6
2
5
9
14
20
5
14
28
48
14
42
90
42
142 142
Dyck Paths and Catalan Numbers
Definition 86 (Dyck Paths). Ballot paths with h = 0 are calledDyck paths.
The numbers
��B0m
�� are called Catalan numbers, denoted Cm.
Note: The number of steps n in a Dyck path is n = 2m.
Example:
70
{ }B01 = , ��B01 �� = 1 { }B02 = , ��B02 �� = 2
{ }B03 = , ��B03 �� = 5
Number of Ballot Paths
Theorem 87 (Number of Ballot Paths). The number of ballot paths is
���Bhm
��� =
h+ 1
m+ h+ 1
✓
2m+ h
m+ h
◆
.
Proof (Algebraic): From the ballot number recurrence we have:
���Bhm
��� =
���Bh�1m
���+
���Bh+1m�1
��� , h � 0
with
���Bhm
��� = 0 for h < 0 and
���Bh0
��� = 1.
First we just check the special cases:
���Bh0
��� =
h+ 1
h+ 1
✓
h
h
◆
= 1 and
���B�1m
��� =
0
m
✓
2m� 1
m� 1
◆
= 0.
Number of Ballot Paths – Proof
RHS =
���Bh�1m
���+
���Bh+1m�1
���
=
h
m+ h
✓
2m+ h� 1
m+ h� 1
◆
+
h+ 2
m+ h+ 1
✓
2m+ h� 1
m+ h
◆
=
h (2m+ h� 1)!
(m+ h) (m+ h� 1)!m!
+
(h+ 2) (2m+ h� 1)!
(m+ h+ 1) (m+ h)! (m� 1)!
=
(2m+ h� 1)!
(m+ h)!m!
h+
m(h+ 2)
(m+ h+ 1)
�
=
(2m+ h� 1)!
(m+ h)!m!
⇥
(2m+ h)(h+ 1)
m+ h+ 1
=
✓
2m+ h
m+ h
◆
h+ 1
m+ h+ 1
= LHS.
71
Number of Dyck Paths
Corollary 88 (Number of Dyck Paths). The number of Dyck paths is
���B0m
��� = Cm =
1
m+ 1
✓
2m
m
◆
=
(2m)!
m! (m+ 1)!
.
Note: Dyck paths is just one of more than a hundred known combi-
natorial problems which are counted by the Catalan numbers. A few others
are
• Binary trees.
• Triangulations of polygons with n vertices.
• Perfectly balanced parenthesis.
• Non-crossing arcs on 2n nodes.
• etc. etc.
We shall explore some of these in week seven.
35 Lattice Paths
Ballot Numbers – Bijective Proof
Bijective Proof: We shall prove the formula for ballot numbers using a
bijection and inclusion-exclusion.
The starting point is that it is easy to count unconstrained binomial paths.
Bm,k ={ }, |Bm,k | = ✓k +mm ◆
(0, 0)
(m, k)
Not so easy if some constraint is imposed:
Ballot paths binomial paths with no vertex below y = x.
72
Ballot Numbers – Bijective Proof
Consider the binomial paths Bm,m+h. Partition Bm,m+h into two disjoint
subsets:
• G = binomial paths with no vertices below y = x, i.e., ballot paths.
• H = binomial paths with at least one vertex below y = x.
Then
|Bm,m+h | = |G |+ |H | .
So
���Bhm
��� = |G | = |Bm,m+h\H |
= |Bm,m+h |� |H |
=
✓
2m+ h
m+ h
◆
� |H | .
Ballot Numbers – Bijective Proof
How does this help?
H is another constrained binomial path problem; so still not easy!
Idea: Biject the constrained problem H to an unconstrained binomial
path problem so that |H | = some binomial coe�cient.
To see why and how this might work look at binomial and ballot grid
counts:
Binomial Numbers
1
1
1
1
1
1
1 1 1 1 1 1
2
3
4
5
6
3
6
10
15
21
4
10
20
35
56
5
15
35
70
126
6
21
56
126
252
7
28
84
210
462
Ballot Numbers
0
0
0
0
0
1
1
1
1
1
1
1
2
3
4
5
2
5
9
14
5
14
28
14
42 42
Ballot Numbers – Bijective Proof
Binomial Numbers
1
1
1
1
1
1
1 1 1 1 1 1
2
3
4
5
6
3
6
10
15
21
4
10
20
35
56
5
15
35
70
126
6
21
56
126
252
7
28
84
210
462
Ballot Numbers
0
0
0
0
0
1
1
1
1
1
1
1
2
3
4
5
2
5
9
14
5
14
28
14
42 42
73
Now something amazing happens!
Every entry in position (i, j) in the ballot grid equals the entry at position
(i, j) in the binomial grid minus another entry in the binomial grid. But which
entry?
Pos (2,2) 2 = 6�4 Pos (2,2) – Pos (3,1) Pos
(2,4) 9 = 15 � 6 Pos (2,4) – Pos (5,1) Pos (3,5)
28 = 56� 28 Pos (3,5) – Pos (6,2) Pos (4,5) 42 =
126� 84 Pos (4,5) – Pos (6,3)
Ballot Numbers – Bijective Proof
Now something amazing happens!
Every entry in position (i, j) in the ballot grid equals the entry at position
(i, j) in the binomial grid minus another entry in the binomial grid. But which
entry?
• Pos (2,2) 2 = 6� 4 Pos (2,2) – Pos (3,1)
• Pos (2,4) 9 = 15� 6 Pos (2,4) – Pos (5,1)
• Pos (3,5) 28 = 56� 28 Pos (3,5) – Pos (6,2)
• Pos (4,5) 42 = 126� 84 Pos (4,5) – Pos (6,3)
Notice: Sums of coordinates of the two entries are equal!
On closer inspection we see that we have: Pos (i, j) – Pos (j+1, i�1)
This corresponds to the position of the second entry being obtained from
the first entry by reflection in the line: y = x� 1.
Ballot Numbers – Bijective Proof
In ballot ‘notation’ we have i = m and j = m + h so i + j = 2m + h and
we get if our observation is true:
���Bhm
��� =
✓
i+ j
j
◆
�
✓
i+ j
j + 1
◆
=
✓
2m+ h
m+ h
◆
�
✓
2m+ h
m+ h+ 1
◆
=
(2m+ h)!
m!(m+ h)!
�
(2m+ h)!
(m� 1)!(m+ h+ 1)!
=
✓
2m+ h
m+ h
◆✓
1�
m
m+ h+ 1
◆
=
h+ 1
m+ h+ 1
✓
2m+ h
m+ h
◆
.
So that is fine.
74
Ballot Numbers – Bijective Proof
Task: To complete bijective proof we must now ‘find’ the bijection between
the set H and the set Bm+h+1,m�1.
Find a bijection between:
1. Binomial paths ending at (m,m + h) with at least one vertex below
y = x.
2. Binomial paths ending at (m+ h+ 1,m� 1).
The hint was reflection in the line y = x� 1.
Bijection: T : H 7! Bm+h+1,m�1 , where b0 = T (b) 2 Bm+h+1,m�1 is the
path obtained from b 2 H by “reflecting” all steps after the last vertex on
y = x� 1.
Note: Such a vertex must exist as every path in B has a vertex below
y = x.
By “reflecting” the steps we mean that every north step after the “last”
vertex becomes an east step and east steps become north steps.
I think we need to look at some examples.
Ballot Numbers – Bijective Proof
Example:
�!
Ballot Numbers – Bijective Proof
Say “last” vertex occurs at (i, i�1). Path continues to position (m,m+h).
The “reflection” then takes
• Remaining m�i East steps m�i North steps.
• Remaining m+h�i+1 North steps m+h�i+1 East steps.
So the reflected path b0 = T (b) goes from (i, i�1) to (m+h+1,m�1), i.e.,
b
0 2 Bm+h+1,m�1.
Note: Every b0 2 Bm+h+1,m�1 must cross y = x� 1 and so have at least
one vertex on y = x� 1, i.e., b0 is the image of some path in H.
This establishes the sought after bijection.
75
36 Generating Functions
Generating Functions – Definition
Generating functions are a way to encode information about sequences in
an intuitive and powerful way.
Notation 89 (Sequences). We denote sequences as {an} or a0, a1, a2, · · ·
where n 2 N0.
Definition 90 (Generating Function). The generating function f(x) for a
sequence {an} is
f(x) = a0 + a1x+ a2x
2 + · · ·+ anxn + · · · =
1X
n=0
anx
n
Example: Given a sequence, say, {aj} = 1, 2, 3, 4, · · · we could encode
this information as a function
f(x) = 1 + 2x+ 3x2 + 4x3 + · · · =
d
dx
(1 + x+ x2 + x3 + · · · )
=
d
dx
✓
1
1� x
◆
=
1
(1� x)2
Generating Functions
Notation 91 (Taylor Coe�cient). Let f(x) be an analytic function. The
coe�cient an of x
n in the Taylor series of f(x) about 0 is denoted an =
[xn] f(x).
Thus if An is a set of objects of size n rather than calculating an = |An |
directly we could form the associated generating function f(x) and obtain an
from the Taylor expansion (provided it exists of course).
At first glance this does not seem to help does it? Why should it be better
(‘easier’) to find f(x) rather than |An |. Well I hope the next few weeks will
convince you that often it is ‘easier’ to find the generating function f(x).
Secondly if we have a generating function f(x) we gain access to very
powerful methods from real and complex analysis which we can use to study
the function (and hence the problem). In particular we can use methods for
analyzing the asymptotic behaviour of |An |.
Note: The generating function representation is completely equivalent to
the original representation.
Generating Functions – Examples
Example: Binary numbers: Bn = {(b1, b2, . . . , bn) | bj 2 {0, 1}} )
|Bn | = 2n.
The generating function is
f(x) = 1 + 2x+ 4x2 + 8x3 + · · · =
1X
n=0
2nxn =
1X
n=0
(2x)n =
1
1� 2x
.
76
Example: Fibonacci numbers: (Fn = Fn�1 + Fn�2, F1 = F2 = 1) have
the generating function
F (x) =
1X
n=0
Fnx
n =
x
1� x� x2
.
We shall prove this next lecture.
Example: Number of Dyck paths of length 2n is dn =
1
n+1
�
2n
n
�
.
The generating function is
D(x) =
1X
n=0
dnx
n =
1
2x
�
1�
p
1� 4x
�
.
We shall prove this later on.
Multivariable Generating Functions
Natural generalisation is to have functions of more than one variable.
Example: Consider Dyck paths. It is natural to enumerate (count)
these by the number of steps n and the number of times k the path touches
the line y = x. Then
D(u, z) =
X
n,k
dn,ku
k
z
n
,
where dn,k is the number of Dyck paths of length n with k returns to y = x.
Physically we can view the path as a polymer (chain molecule) and y = x as
a surface. Then u represents an interaction energy between the monomers and
the surface. This is a simple physical model for the attachment (adsorption)
or detachment (desorption) of polymers that interact with a surface.
37 Binomial Expansion
Paving or Tiling
Definition 92 (Paving or Tiling). An n-board is a line of n squares or cells.
A paving or tiling is a covering of an n-board with a set of pavers or tiles
such as:
A monomer is a single square while a dimer is a pair of adjacent squares.
Notation: A 5-board a monomer • and a dimer •��• .
A paving of a 9-board: • •��• • •��• •��• • .
Questions:
1. How many pavings are there of an n-board (variable number of pavers)?
n = 3 : • • •
(k=3)
,
• •��•
(k=2)
,
•��• •
(k=2)
2. How many pavings with k pavers (variable length boards)?
k = 3 : • • •
(n=3)
,
• •��• •
(n=4)
,
•��• •��• •��•
(n=6)
77
Pavings and the Binomial Expansion
How to e�ciently count the number of pavings? Use generating functions!
We have already seen the basic idea when proving the binomial theorem.
In expanding (x+ y)n we had to choose a term x OR y from each factor.
Now think of the pavers as the variables. We fill the board from left to
right by filling the next empty cell with a monomer OR a dimer (which takes
up 2 cells).
Hence putting down k pavers gives us a term: fk(x) = (x+ x
2)k.
This answers question 2.
The total number of pavings is fk(1) = 2
k. But that is obvious!.
More interestingly the number of pavings of length n 2 [k, 2k] is the coef-
ficient of xn in the generating function fk(x), i.e., [x
n] fk(x) =
⇣
k
n�k
⌘
.
Pavings and Generating Functions
What about question 1? Answer is not (x+ x2)n.
Rather we look to the generation function formed by putting down any
possible number of pavers:
F (x) =
1X
k=0
(x+ x2)k =
1
1� (x+ x2)
=
1
1� x� x2
. (Geometric series)
The number a pavings of length n is: [xn]F (x).
What if we want to track the number of monomers or dimers? Easy!
Just add a factor y for a monomer so a k paving is (xy + x2)k and hence:
F (x, y) =
1X
k=0
(xy + x2)k =
1
1� (xy + x2)
=
1
1� xy � x2
.
In the expansion of this generating function the coe�cient of xnym is the
number of pavings of length n using m monomers!!
This is a hint that generating functions are objects of great beauty and
power!
Algebra and Generating Functions
This example tells us something important about the relationship between
algebra and the combinatorics of generating functions.
• Algebra:
– Multiplication ! Concatenation of ‘base’ objects
– Addition ! ‘OR’
Where ‘OR’ means: Choose a ‘replacement’ for base object.
78
• Paving:
– Base object �!
– (x+ x2) : �! • OR •��•
| {z }
pavers
Binomial Expansion
Example: Consider a box with two balls, say red and purple. What
are all the possible selections of balls we can take from the box?
f(r, p) = (1 + r)(1 + p) = 1 + r + p+ rp
What if we don’t care about the colour of the balls? Then we must change
the labels (variables) r ⌘ red ball and p ⌘ purple ball to b ⌘ ball. We then
obtain
g(b) = f(b, b) = (1 + b)(1 + b) = 1 + 2b+ b2.
So there is one way of choosing no balls, 2 ways of choosing 1 balls, and
one way of choosing 2 balls.
But then we knew that already so what is the point?
Wait there is more.
Binomial Expansion
Example: Consider a box with four balls, say red, purple, green and
yellow. What are all the possible selections of balls we can take from the box?
f(r, p, g, y) = (1 + r)(1 + p)(1 + g)(1 + y)
= 1 + r + p+ g + y
+ rp+ rg + ry + pg + py + gy
+ pgy + rgy + rpy + rpg
+ rpgy
Lets say we’re interested in keeping track of whether the red ball has been
chosen, but we don’t care about the colours of the other balls. Then we set
p = b, g = b, and y = b, and obtain:
f(r, b, b, b) = (1 + r)(1 + b)(1 + b)(1 + b) = (1 + r)(1 + b)3
= (1 + 3b+ 3b2 + b3) + r(1 + 3b+ 3b2 + b3).
Thus, for example, the number of ways of choosing 3 balls, of which one
of them must be red, is 3 (coe�cient of rb2).
79
Exercise
Example: How many ways are there to give k bananas to 3 people such
that persons 1 and 2 get no more than 2 bananas, while person 3 gets at most
1 banana?
Solution via generating functions: For each banana given to a person
there is a factor of x, first factor is for person 1, second factor for person 2,
third factor for person 3.
f(x) = (1 + x+ x2)(1 + x+ x2)(1 + x)
= 1 + 3x+ 5x2 + 5x3 + 3x4 + x5
We can read o↵, for example, that there are 5 ways to distribute 2 bananas
according to this rule.
Note: The problem has no solution for k � 6 since all terms [xk] f(x) =
0.
Why are Generating Functions Useful?
• Recurrences: Given a recurrence, we may be able to find an elegant,
closed form solution.
• Asymptotic formulae: Wemay mostly be interested in how a sequence
grows for large n. Generating functions encode this succinctly.
Recall Stirling’s formula for n!: n! '
p
2⇡nnne�n, valid for ”large” n.
We gain access to powerful methods from real and complex analysis.
• Prove identities: Given an identity, often the easiest way to prove it
is to multiply each side by a power of x, and sum, to obtain expressions
involving generating functions.
• Relating combinatorial problems: Basic manipulations (algebraic,
change of variables, di↵erentiation) have combinatorial interpretations
and can be used to turn one combinatorial problem into another.
38 Fibonacci Numbers
Fibonacci Numbers – Introduction
• Study Fibonacci numbers and Fibonacci sequence.
• Apply generating function methods to the problem.
• Fibonacci numbers relate to the golden ratio with origin in classical
geometry.
Motivation: Consider a population of rabbits, which reproduce according
to the following rule. Each month,
80
1. Every baby female rabbit becomes an adult.
2. Female adult rabbits give birth to one female baby rabbit.
Denoting a for a baby rabbit, A for an adult rabbit, then this is equivalent
to the substitution rule:
a 7! A
A 7! Aa
Fibonacci Numbers
If we start with a single baby rabbit, a, then the population evolves as:
n = 1 a
n = 2 A
n = 3 Aa
n = 4 AaA
n = 5 AaAAa
n = 6 AaAAaAaA
n = 7 AaAAaAaAAaAAa
...
...
Count the population as a function of time, with xn being the number of
baby rabbits, Xn adult rabbits, and Fn = xn + Xn total number of rabbits
after n months.
We have the initial data x1 = 1, X1 = 0, and the substitution rules lead
to the recurrence relations:
xn = Xn�1
Xn = Xn�1 + xn�1
Fibonacci Numbers
Thus,
Fn = xn +Xn
= Xn�1 +Xn�1 + xn�1
= Xn�2 + xn�2 +Xn�1 + xn�1
= Fn�2 + Fn�1.
This is a 2-term linear recurrence relation with initial conditions F1 = 1,
F2 = 1.
Questions:
1. Can we compute the generating function for {Fn}?
i.e. F (x) = x+ x2 + 2x3 + 3x4 + 5x5 + 8x6 + . . .
2. Is there a ”simple” formula for Fn?
81
3. How fast does Fn grow with n? Is there a ”simple” formula in that case?
Fibonacci Number Generating Function
Theorem 93 (Fibonacci Number Generating Function). The Fibonacci num-
bers (defined by Fn = Fn�1+Fn�2, F1 = F2 = 1) have the generating function
F (x) =
1X
n=1
Fnx
n =
x
1� x� x2
.
Proof: We prove the theorem using a general method that can be used
to derive generating functions from some recurrence relations.
The starting point is
Fn = Fn�1 + Fn�2
Multiply by xn and sum from n = 3 (start from n = 3 so Fn�2 ! F1),
1X
n=3
Fnx
n =
1X
n=3
Fn�1x
n +
1X
n=3
Fn�2x
n
Fibonacci Number Generating Function
Shift the index so all sums explicitly involve Fn:
1X
n=3
Fnx
n =
1X
n=3
Fn�1x
n +
1X
n=3
Fn�2x
n
=
1X
n=2
Fnx
n+1 +
1X
n=1
Fnx
n+2
= x
1X
n=2
Fnx
n + x2
1X
n=1
Fnx
n
Only last generating function equals F (x). The others are missing some
terms.
Fibonacci Number Generating Function
Add/Subtract missing terms to get F (x) series:
1X
n=3
Fnx
n = �xF1 � x2F2 +
1X
n=1
Fnx
n = �x� x2 + F (x)
x
1X
n=2
Fnx
n = x
�xF1 +
1X
n=1
Fnx
n
!
= �x2 + xF (x)
Solve for F (x):
82
�x� x2 + F (x) = �x2 + xF (x) + x2F (x)
) (1� x� x2)F (x) = x
) F (x) =
x
1� x� x2
.
39 Recursions and generating functions
Fibonacci Numbers – Asymptotics
Question: Can we extract useful information from the generating func-
tion?
Asymptotic behaviour of the Fibonacci numbers; an exact expression for
Fn?
Answer: Yes. Use a partial fraction decomposition.
The zeroes of 1� x� x2 are: �
1±
p
5
2
.
Defining ↵± =
1±
p
5
2
, we have 1� x� x2 = �(↵+ + x)(↵� + x). So
F (x) = �
x
(↵+ + x)(↵� + x)
= �
x
↵+↵�(1 + x/↵+)(1 + x/↵�)
Since ↵+↵� = �1, this finally leads to
F (x) =
x
(1� ↵�x)(1� ↵+x)
.
Note: ↵+ is the so-called “golden ratio”.
Partial Fraction Decomposition
Now we are going to perform a partial fraction decomposition of F (x):
F (x) =
x
(1� ↵�x)(1� ↵+x)
=
A
1� ↵�x
+
B
1� ↵+x
Multiply each side of the equation by (1� ↵�x)(1� ↵+x) to obtain
x = A(1� ↵+x) +B(1� ↵�x)
= (A+B)� (A↵+ +B↵�)x
Equating terms on LHS and RHS and solving for A and B gives:
A = �
1
p
5
and B =
1
p
5
and we find
F (x) =
x
1� x� x2
=
1
p
5
✓
�
1
1� ↵�x
+
1
1� ↵+x
◆
.
83
Fibonacci Numbers – Exact Formula
We have now arrived at the moment of truth. We can immediately expand
each of the terms on the RHS as they are simply geometric series. Thus
F (x) =
1
p
5
1X
n=1
(�(↵�x)n + (↵+x)n)
=
1
p
5
1X
n=1
�
↵
n
+
� ↵n�
�
x
n
.
Hence, by using the fact that the coe�cient of xn is, by definition, Fn, we
obtain the exact closed form expression for the Fibonacci numbers.
Theorem 94 (Fibonacci Numbers I). The Fibonacci numbers are given by
the formula
Fn =
1
p
5
�
↵
n
+
� ↵n�
�
, where ↵± =
1±
p
5
2
.
Fibonacci Numbers – Exact Simplified Formula
In this particular case we can simplify things a little bit further.
We have that ↵� ⇡ �0.618, and therefore
����
↵
n
�p
5
���� ⇡
0.618n
2.21
<
1
2
8n � 1
We know that Fn must be an integer. The ↵� term contributes a value of
less than 1/2.This means that Fn is the nearest integer to the remaining part.
We can write the “nearest integer” to a number x as
⌅
x+ 1
2
⇧
. Thus
Theorem 95 (Fibonacci Numbers II). The Fibonacci numbers are given by
the formula
Fn =
�
↵
n
+p
5
+
1
2
⌫
, where ↵+ =
1 +
p
5
2
.
Fibonacci Numbers – Asymptotic
What is the asymptotic behaviour of Fn?
Frequently, this is the question we most want to answer about a sequence.
This follows immediately from the previous theorem:
Corollary 96 (Fibonacci Numbers Asymptotic). The asymptotic behaviour
of the Fibonacci numbers is
Fn ⇠
↵
n
+p
5
, where ↵+ =
1 +
p
5
2
.
84
Golden Ratio
In classical geometry, the Golden Ratio � = 1+
p
5
2
= 1.618 . . . occurs in
• A rectangle with a special subdivision property.
• Regular pentagons and pentagrams.
• Plus many other situations.
Rectangles can also be used to give a proof of the Fibonacci numbers
identity
F
2
1
+ F 2
2
+ F 2
3
+ · · ·+ F 2n = FnFn+1.
Golden Ratio and Rectangles
Question: Find x > 1 with property that the subdivided rectangle has
the same proportions.
11
1
x
x� 1
We require:
long side
short side
=
x
1
=
1
x� 1
) x2 � x� 1 = 0
) x =
1
2
⇣
1 +
p
5
⌘
= �.
Fibonacci Identity and Rectangles
Fibonacci identity using rectangles,
F
2
1
+ F 2
2
+ F 2
3
+ · · ·+ F 2n = FnFn+1.
85
n = 3 :
1
F
2
1
1
F
2
2
2F 2
3
=
6
?
1 + 1
F3
2
+
1
F4
n = 4 :
3
3 2
2
1 1
F
2
4
F
2
3
F
2
2
F
2
1
= F4
F5
Golden Ratio and Pentagons
Regular unit pentagon
x
1
1�x
1
Similar triangles:
x
1
=
1
x� 1
) x =
1
2
⇣
1 +
p
5
⌘
= �
Golden Ratio and Pentagrams
Regular Pentagrams
Here the ratio of the lengths of the coloured lines are �:
� =
|red line|
|green line|
=
|green line|
|blue line|
=
|blue line|
|magenta line|
86
Odom’s Construction of the Golden Ratio
George Odom has given a construction for � involving an equilateral tri-
angle: if an equilateral triangle is inscribed in a circle and the line segment
joining the midpoints of two sides is drawn so as to intersect the circle in either
of two points, then these three points are in golden proportion:
A
B
C
D
E
F
|AB|
|BC|
=
|AC|
|AB|
= �.
Fibonacci Numbers – Alternative Derivation
In the next few slides we give an alternative derivation for F (x).
The Fibonacci recurrence can be represented in matrix form:
xn
Xn
�
=
0 1
1 1
�
xn�1
Xn�1
�
subject to the initial condition
x1
X1
�
=
1
0
�
.
In physics a matrix like T =
0 1
1 1
�
is known as a Transfer Matrix.
Fibonacci Numbers – Alternative Derivation
The solution of the above matrix recurrence is
x2
X2
�
=
0 1
1 1
�
1
0
�
x3
X3
�
=
0 1
1 1
�
x2
X2
�
=
0 1
1 1
�2
1
0
�
…
xn
Xn
�
=
0 1
1 1
�n�1
1
0
�
87
Question: How does one compute matrix powers? We write
0 1
1 1
�
= QDQ�1, D =
�1 0
0 �2
�
Fibonacci Numbers – Alternative Derivation
Then
0 1
1 1
�n�1
= QDn�1Q�1 = Q
�
n�1
1
0
0 �n�1
2
�
Q
�1
Here �1 and �2 are the eigenvalues of the matrix T , obtained from the
characteristic equation:
det (T � �I) = det
✓
0 1
1 1
�
� �
1 0
0 1
�◆
= 0
) �2 � �� 1 = 0 ) �1 = ↵+, �2 = ↵�
With this we can then show that
Q =
1
51/4
1 �1
↵+ �↵�
�
and Q�1 =
1
51/4
�↵� 1
�↵+ 1
�
Fibonacci Numbers – Alternative Derivation
Note that Fn = Xn + xn is the (2,2) entry in QD
n�1
Q
�1. Now
QD
n�1
Q
�1 =
1
p
5
1 �1
↵+ �↵�
�
↵
n�1
+
0
0 ↵n�1�
�
�↵� 1
�↵+ 1
�
=
1
p
5
1 �1
↵+ �↵�
�
�↵�↵n�1+ ↵
n�1
+
↵+↵
n�1
� ↵
n�1
�
�
which shows that:
Fn =
1
p
5
�
↵
n
+
� ↵n�
�
Now to find the generating function F (x) we simply have to reverse the
steps used previously to go from F (x) to the above result.
Walking on a Graph
Question: How many walks of length n are there on a graph starting at
a vertex v and ending at a vertex u or ending at any vertex?
This question can be answered using linear algebra. We need
88
Definition 97 (Adjacency Matrix). Given a graphG with V (G) = {v1, v2, . . . , vp},
the adjacency matrix of G is the p ⇥ p matrix A = (aij) where aij = 1 if
{vi, vj} 2 E(G) and aij = 0 otherwise.
Note: The adjacency matrix relies on the given order of vertices, but is
unique up to permutation of rows and permutation of columns.
Example:
1
3
5
2
4
6
7
0
BBBBBBBB
@
0 1 1 1 0 0 0
1 0 1 0 0 0 0
1 1 0 1 1 0 0
1 0 1 0 1 1 0
0 0 1 1 0 0 0
0 0 0 1 0 0 0
0 0 0 0 0 0 0
1
CCCCCCCC
A
Walking on a Graph
The adjacency matrix can be viewed as a transfer or transition matrix
telling us which vertices can be reached from vertex vi in a single step. We
therefore get the following result
Theorem 98 (Random Walks on a Graph). Let A be the adjacency matrix
of a graph G. The ij entry a
(n)
ij in A
n equals the number of (random) walks
of length n from vertex vi to vj.
Proof: This is a direct consequence of the definition of matrix multiplica-
tion:
a
(n)
ij = (A
n)ij =
X
aii1ai1i2ai2i3 · · · ain�1j
where the sum is over all sequences (ii, i2, . . . , in�1) with ik 2 {1, 2, . . . , p}.
The summand is 1 if there is a walk along edges {vik , vik+1} of G from vertex
vi to vj and 0 otherwise (aikik+1 = 0 unless {vik , vik+1} 2 E(G)). Hence the
sum equals the number of walks from vi to vj .
Note: This concept is readily generalised to weighted graphs, directed
graphs and/or pseudo graphs where loops are allowed.
Walking on a Graph
Theorem 99 (Rational Generating Functions). Define the generating func-
tion Fij(G,�) =
P
a
(n)
ij �
n
, then
Fij(G,�) =
(�1)i+jdet (I � �A : j, i)
det (I � �A)
,
where (B : j, i) is the matrix obtained by removing row j and column i
from B.
Thus in particular Fij(G,�) is a rational function of �.
Proof: Theorem 4.7.2 in “Enumerative Combinatorics, Vol 1” by R P
Stanley.
89
Note: This fundamental result tells us that finite dimensional transfer
matrices leads to rational generating functions. Thus only infinite dimensional
transfer matrices can lead to generating functions that aren’t rational.
Walking on a Graph
Example: Consider the graph
1
2
3
4 5
A =
0
BBBB
@
0 1 1 0 0
1 0 1 1 0
1 1 0 0 0
0 1 0 0 1
0 0 0 1 0
1
CCCC
A
A
2 =
0
BBBB
@
2 1 1 1 0
1 3 1 0 1
1 1 2 1 0
1 0 1 2 1
0 1 0 0 1
1
CCCC
A
Walking on a Graph
A
3 =
0
BBBB
@
2 4 3 1 1
4 2 4 4 0
3 4 2 1 1
1 2 1 0 2
1 0 1 0 0
1
CCCC
A
A
5 =
0
BBBB
@
12 18 13 7 5
18 14 18 16 2
13 18 12 7 5
7 16 7 2 6
5 2 5 6 0
1
CCCC
A
Fibonacci as Walks on a Graph
Example: Consider the weighted pseudo di-graph
u v
1
1
1
A =
0 1
1 1
�
90
Walks on this graph generate the Fibonacci numbers. Theorem by Stanley
gives expressions for the generating functions for the entries a
(n)
ij of A
n.
det
�
I � �A
�
=
����
1 ��
�� 1� �
���� = 1� �� �
2
.
We start at u so initial vector is (1, 0). So we need the sum of the generating
functions of the entries a
(n)
11
and a
(n)
21
. These are
1� �
1� �� �2
and
�
1� �� �2
.
The sought after generating function is
G(�) =
1
1� �� �2
.
A Checker Jumping Problem
Here is a problem in combinatorial logic which makes use of the golden
ratio:
Consider the integer grid in the xy-plane.
Suppose that “checkers” are allowed to be placed on lattice points on or
below the x-axis.
These checkers can jump over neighbouring checkers, with the checker
being jumped over removed, according to the usual rule.
The problem is to determine the least number of checkers it takes to reach
a prescribed height y above the x-axis.
For y = k, k = 1, 2, 3, 4 the answer is 2, 4, 8, 20.
However, amazingly, for k � 5 it is impossible!
A Checker Jumping Problem
Initially, checkers must be on or below x-axis.
Height y = 1 :
x
y
Height y = 2 :
x
y
91
A Checker Jumping Problem
The aim is to get a checker to the point (0, 5).
Associate with each lattice point a variable xd. Define d to be the minimum
number of steps along grid lines it takes to reach (0, 5).
The value of a set of checkers is the sum of the corresponding variables.
We will show that by an appropriate choice of x, the value of a set of
checkers can never increase.
On the other hand we will show that the value of the set of all checker in
the region y 0 is equal to 1, which is the same as the value of a checker at
the point (0, 5).
Hence no finite configuration of checkers can ever reach (0, 5).
A Checker Jumping Problem
Step 1: Introduce a weighting to each lattice site.
Weigh the site we’d like to reach, (0, 5) by x0 = 1.
Weigh the other sites by xd, where d is the shortest distance from the site
to (0, 5) as measured along the x and y axes, e.g.,
x
y
x0
x1
x2
x3
x4
x5
x1 x1
x2 x2
x3 x3
x4 x4
x5 x5
x6 x6
x3 x3
x4 x4
x5 x5
x6 x6
x7 x7
x5 x5
x6 x6
x7 x7
x8 x8
x7 x7
x8 x8
x9 x9
A Checker Jumping Problem
Step 2: Investigate change in weight associated with di↵erent moves:
• Class 1 – Moves which carry a checker closer to the destination (0, 5).
Change in weight: xn � (xn+1 + xn+2) = xn(1� x� x2).
That is • • � ! � � • with weights xn+2, xn+1, xn, from left to right.
• Class 2 – Moves taking a checker further away.
Change in weight: xn+2 � (xn+1 + xn) = xn(x2 � x� 1).
• Class 3 – Moves that leave the distance unchanged.
Change in weight: �xn (central checker xn ones next to it xn+1).
• Choose x > 0 such that the change in weight of a ‘closing’ move is zero.
) x =
1
2
(�1 +
p
5) =
1
�
.
Then change in weight of the other moves is negative!
92
A Checker Jumping Problem
The conclusion from the analysis above is that to be able to reach (0, 5),
the initial configuration of checkers must have weight � 1, i.e. the weight at
(0, 5).
Step 3: Add up the weight of all the lattice sites on or below the x-axis.
Total initial weight of checkers on x-axis =
x
5 + 2×6 + 2×7 + . . . = x5 + 2×6(1 + x+ x2 + x3 + . . .)
= x5 +
2×6
1� x
= x5
1 + x
1� x
.
Similarly total weight on line at y = �1 is x6
1 + x
1� x
, at y = �2 it is x7
1 + x
1� x
,
etc. So total initial weight is:
x
5
1 + x
1� x
+ x6
1 + x
1� x
+ x7
1 + x
1� x
+ · · · = x5
1 + x
(1� x)2
.
Now insert x = 1/� and we find that the total initial weight is at most 1.
So no finite configuration of checkers can ever reach (0, 5).
Euclid’s Algorithm
Euclid’s Algorithm appears as a solution to Proposition VII.2 in the Ele-
ments:
• Given two numbers not prime to one another, to find their greatest
common measure.
What Euclid called ”common measure” is termed nowadays a common
factor or a common divisor. Euclid VII.2 then o↵ers an algorithm for finding
the greatest common divisor (gcd) of two integers.
Not surprisingly, the algorithm bears Euclid’s name.
Algorithm 100 (Euclid’s Algorithm). Let a � b be integers then gcd(a, b) =
gcd(b, a mod b). Euclid’s algorithm iterates this equation and stops once a
mod b = 0 since gcd(b, 0) = b.
Euclid’s Algorithm
Example: Let a = 2322, b = 654
gcd(2322, 654) = gcd(654, 2322 mod 654) = gcd(654, 360)
gcd(654, 360) = gcd(360, 654 mod 360) = gcd(360, 294)
gcd(360, 294) = gcd(294, 360 mod 294) = gcd(294, 66)
gcd(294, 66) = gcd(66, 294 mod 66) = gcd(66, 30)
gcd(66, 30) = gcd(30, 66 mod 30) = gcd(30, 6)
gcd(30, 6) = gcd(6, 30 mod 6) = gcd(6, 0) = 6
93
Therefore, gcd(2322, 654) = 6.
Question: Given a and b how many iterations are required to find gcd(a, b)?
Lamé’s Theorem
I used “Lamé’s Theorem – the Very First Application of Fibonacci Num-
bers’ at http://www.cut-the-knot.org/blue/LamesTheorem.shtml
Theorem 101 (Lamé’s Theorem – Honsberger Version). The number of di-
vision steps in an application Euclid’s algorithm never exceeds 5 times the
number of digits in the lesser number.
Theorem 102 (Lamé’s Theorem – Knuth Version). For n � 1, let integers a
and b, a > b > 0, be such that processing a and b by Euclid’s algorithm takes
exactly n division steps. Moreover, assume that a is the least possible number
satisfying that requirement. Then a = Fn+2 and b = Fn+1, where {Fn} is the
Fibonacci sequence.
Theorem 103 (Lamé’s Theorem – Moll Version). Let a > b be integers. The
number of steps in Euclid’s algorithm is about log
10
b/ log
10
�. This is at most
five times the number of digits of b.
Proof – Honsberger Version
Here we define the Fibonacci numbers by Fn = Fn�1+Fn�2, F0 = 1, F1 = 1.
Lemma 104. For all n � 1, Fn+5 > 10Fn.
Proof:
Fn+5 = Fn+4 + Fn+3
= 2Fn+3 + Fn+2
= 3Fn+2 + 2Fn+1
= 5Fn+1 + 3Fn
= 8Fn + 5Fn�1
= 13Fn�1 + 8Fn�2
= 21Fn�2 + 13Fn�3
> 10(2Fn�2 + Fn�3)
= 10Fn
Proof – Honsberger Version
Let dn denote the number of digits in Fn.
From the above lemma it follows that Fn+5 has at least one more digit
than Fn.
By direct inspection, numbers Fn, for 1 n 5 are single-digit. They
94
have at least 2 digits for 5 < n 10, have at least 3 digits for 10 < n 15, have at least 4 digits for 15 < n 20, and, in general, have at least k digits for 5(k � 1) < n 5k. Thus it is always the case that dn � n/5. Now back to Euclid’s algorithm. Proof – Honsberger Version Given two integers, a > b > 0, we set a = r0, b = r1 and divide with
remainder. Assuming Euclid’s algorithm takes n steps,
r0 = r1q1 + r2, 0 < r2 < r1 r1 = r2q2 + r3, 0 < r3 < r2 · · · rn�2 = rn�1qn�1 + rn, 0 < rn < rn�1 rn�1 = rnqn Note that qn � 2 (otherwise, we would have rn�1 = rn). Proceed back- wards, starting with rn � 1, which implies rn � F1. Next, rn�1 � 2rn = F2. And further, rn�2 = rn�1qn�1 + rn > rn�1 + rn > F2 + F1 = F3.
Proof – Honsberger Version
In general:
rn�k = rn�k+1qn�k1 + rn�k+2 > rn�k+1 + rn�k+2 > Fk + Fk�1 = Fk+1,
such that at the end of the process (i.e., at the beginning of the algorithm)
b = r1 > Fn and a = r0 > Fn+1.
Therefore if Euclid’s algorithm takes exactly n steps, then necessarily b >
Fn and has at least as many digits, which, as we found previously, is at least
n/5. And this proves the Honsberger version.
40 Formal Power Series
Formal Power Series
It is very useful to view generating functions as “formal” mathematical
objects.
Certain standard operations (such as term-by-term di↵erentiation) can be
given nice and useful combinatorial interpretations.
Term-by-term di↵erentiation of the Taylor series of a function is only per-
mitted when the series is uniformly convergent (the function is analytic).
95
However, the combinatorial interpretation is true irrespective of the ana-
lytic properties of the series. In combinatorics it is therefore very useful to
always permit term-by-term di↵erentiation (or other operations) regardless of
their ‘analytic’ validity.
This view-point leads to the concept of Formal Power Series. But first
Definition 105 (Counting Function). A counting function is a map f :
N0 ! C.
Note: We are mainly concerned with the case f : N0 ! N0.
Formal Power Series
Definition 106 (Formal Power Series). Let f : N0 ! C. We associate to f
the formal power series
F (z) =
X
n�0
f(n)zn,
and say that F (z) is the generating function of the counting function
f .
Formal Power Series
• The adjective “formal” refers to the fact that we regard these series as
algebraic objects to be added, multiplied or otherwise manipulated.
• The symbol zn is just a mark for where f(n) is placed.
• We do not need to consider F (z) as a function is the usual sense; that
is, we do not have to concern ourselves with its convergence etc.
• Two formal power series are equal if and only if they agree term by term.
• If we have manipulated two such series F (z) and G(z) algebraically and
obtained F (z) = G(z), we can read of f(n) = g(n) for all n, and this
proves a corresponding combinatorial identity.
Manipulation of Generating Functions
Operations such as addition, multiplication and di↵erentiation are associ-
ated with certain operations on the coe�cients of the generating functions.
These operations often have natural combinatorial interpretations.
Let
f(x) =
1X
k=0
fkx
k
g(x) =
1X
k=0
gkx
k
h(x) =
1X
k=0
hkx
k
96
Theorem 107 (Addition). If h(x) = f(x) + g(x) then hk = fk + gk.
Proof: Trivial.
Manipulation of Generating Functions
Theorem 108 (Multiplication). If h(x) = f(x) g(x), then
hk =
kX
j=0
fj gk�j =
kX
j=0
fk�j gj =
X
i+j=k
i,j�0
fi gj .
These sums are called the convolution of the sequences { fk} and {gk}.
Proof:
h(x) = f(x)g(x) =
0
@
1X
j=0
fjx
j
1
A
1X
i=0
gix
i
!
=
1X
i=0
1X
j=0
fj gi x
i+j (k = i+ j, i = k � j)
=
1X
k=0
kX
j=0
fj gk�j x
k ) [xk]h(x) =
kX
j=0
fj gk�j
Manipulation of Generating Functions
There are two very important special cases of the multiplication theorem.
Corollary 109 (Shifting). f(x) = xm : hk = gk�m or [x
k] (xmg(x)) =
[xk�m]g(x).
Note: Multiplication by xm shifts backwards by m units.
Corollary 110 (Summing). f(x) =
1
1� x
: hk =
kX
j=0
gj or [x
k]
✓
g(x)
1� x
◆
=
kX
j=0
gj =
kX
j=0
[xj ]g(x) .
Proof: f(x) =
1
1� x
=
1X
k=0
x
k ) fk = 1 8k.
Note: hk =
kX
j=0
gj are the partial sums of the sequence {gk}.
Manipulation of Generating Functions
Theorem 111 (Di↵erentiation). If h(x) =
✓
x
d
dx
◆
f(x), then hk =
k fk.
97
Proof: Di↵erentiate term-by-term and use x
d
dx
x
k = x k xk�1 = k xk.
Note: Why x d
dx? Here n is the ‘size’ of the combinational object.
Di↵erentiating and then multiplying by x ensures that the placeholder xn
in the formal power series still refers to objects of size n.
Theorem 112 (Integration). If h(x) =
1
x
Z
f(x) dx, then hk =
1
k + 1
fk.
Note: Integration is rarely used in combinatorics.
Combinatorial Interpretations
1. Addition: Disjoint union of two sets of objects F and G so objects of
same size are put together.
2. Multiplication:
• Distribution of objects (balls into boxes etc.).
• Convolution corresponds to concatenating (gluing) objects together.
Form objects of size n by taking objects of size k from set F and
‘glue’ them to objects of size n� k from G.
• Particularly useful when F = G, then we are saying ‘large’ objects
in F can be constructed be gluing together ‘smaller’ objects.
• Very important when the objects in F have a recursive structure.
3. Di↵erentiation: ‘Marking’ or singling out a component in an object.
4. Integration: No natural combinatorial interpretation.
Combinatorial Interpretations
Definition 113 (Marking). If an object consists of several indistinguishable
parts thenmarking a part distinguishes this part in some way say by colouring
an edge or vertex in a graph, putting a dot on an edge etc.
Example: Edge marked binomial paths.
Mark an edge by placing a dot on the edge, by di↵erent colour etc.
Bn,m = {b | b 2 Bn.m with one horizontal step marked}
B2,1 = { }, ��B2,1 �� = 6
98
Edge Marked Binomial Paths
In general we have n horizontal steps gives a factor of n
��Bn,m
�� = n |Bn,m | = n
✓
n+m
n
◆
= n
✓
p
n
◆
(p = n+m).
Thus
B(x) =
pX
n=0
��Bn,p�n
��xn (length p is fixed)
=
pX
n=0
n
✓
p
n
◆
x
n =
pX
n=0
✓
p
n
◆
x
d
dx
x
n
�
= x
d
dx
”
pX
n=0
✓
p
n
◆
x
n
#
= x
d
dx
(1 + x)p
= px(1 + x)p�1.
Proving an Identity
Example: Use generating functions to prove that
1 + 2 + 3 + · · ·+ n = n(n+ 1)/2
Solution: Let g(x) = 1+2x+3×2+4×3+ · · · =
1X
n=0
(n+1)xn =
1
(1� x)2
.
Coe�cients are the numbers to be summed.
Define h(x) as the product h(x) = g(x)
1
1� x
=
1
(1� x)3
.
By the Summing Theorem [xn�1]h(x) = 1 + 2 + 3 + · · ·+ n.
Geometric series gives, for |x| < 1,
1 + x+ x2 + . . . =
1
1� x
Di↵erentiating term-by-term we get (OK since series uniformly convergent)
0 + 1 + 2x+ 3x2 + 4x3 + . . . =
d
dx
1
1� x
=
1
(1� x)2
Proving an Identity
However, we also have
h(x) =
1
(1� x)3
=
1
2
d
dx
1
(1� x)2
=
1
2
d
dx
1X
n=0
(n+ 1)xn =
1
2
1X
n=1
n(n+ 1)xn�1
and hence the coe�cient of xn�1 of h(x) is n(n+ 1)/2.
99
The two expressions for h(x) must be equal and we have proved the
identity
1 + 2 + 3 + · · ·+ n = n(n+ 1)/2.
Combinatorial interpretation as areas:
+ + + . . .
n = . . . n =
n
1
+
n
n= 1
2
n + 1
2
n2 = 1
2
n(n + 1)
Derivatives and Averages
Example: Consider Dyck paths counted by length and number of con-
tacts.
D(u, z) =
X
n,k
dn,ku
k
z
n
,
where dn,k is the number of Dyck paths of length n with k returns to y = x.
First derivative w.r.t u gives an expansion for the average number of con-
tacts:
D
0(1, z) =
X
n
X
k
kdn,k
!
z
n
.
We first take the derivative and then set u = 1.
Second (and higher) derivatives give higher moments of the distribution of
contacts., e.g., the fluctuation in the number of contacts is � = hk2i � hki2.
Where the m’th moment given by
hkmi =
P
kk
m
dn,kP
kdn,k
.
Fluctuation quantities are particularly useful because they can give quite
sensitive indications of phase-transitions.
41 Catalan Numbers
Catalan Numbers
Leonhard Euler, in the year 1751, introduced and found a closed formula for
the Catalan numbers including the generating function. Euler was interested
in the number of di↵erent ways of dividing a polygon into triangles. The proof
of the result eluded him, until he was assisted by Christian Goldbach, and
more substantially by Johann Segner. By 1759 a proof was completed.
The sequence is named after Eugène Charles Catalan, who discovered a
connection to parenthesized expressions during his exploration of the Towers
of Hanoi puzzle.
In 1988, it came to light that the Catalan number sequence had been used
in China by the Mongolian mathematician Ming Antu by 1730. He wrote a
book (completed by his student and not published until 1839) “Quick Methods
for Accurate Values of Circle Segments” which included many trigonometric
identities and power series, some involving Catalan numbers.
100
For more on the history of Catalan numbers see: Igor Pak Catalan Num-
bers Page at http://www.math.ucla.edu/~pak/lectures/Cat/pakcat.htm
and check out his brief article under ‘Historical articles’.
Dyck Paths – Application
Example: In a two candidate election with 2n+ 1 voters, n voters cast
their vote for candidate A and n+ 1 for B. What is the probability that A is
always ahead of B (as the votes are counted) but loses at the last count?
Say n = 3 so there are 7 voters. The possible vote counting sequences are:
ABABABB
ABAABBB
AABBABB
AABABBB
AAABBBB
9
>>>>=
>>>>;
A never losing except at last vote
The vote count can be mapped (bijection) to a binomial path.
A vote 7! North step B vote 7! East step
The restriction A never losing except at last vote means that the path
stays above y = x except for the last step which must be an East step.
Dyck Paths – Application
So up until the last step the path is a Dyck path. Thus
Pr(A ahead till last vote) =
��B0
2n
��
|Bn+1,n |
=
1
n+ 1
✓
2n
n
◆
✓
2n+ 1
n+ 1
◆
=
1
n+ 1
(2n)!
n!n!
n! (n+ 1)!
(2n+ 1)!
=
1
2n+ 1
=
1
v
, v = # voters.
So with v = 101 the probability is about 1%.
42 Proper Binary trees
Proper Binary Trees
Definition 114 (Proper Binary Trees). A proper binary tree is either a leaf
or a node with a left and a right tree.
Example:
101
Generating Function for Proper Binary Trees
Theorem 115 (Generating Function for Proper Binary Trees). The generat-
ing function for proper binary trees has the functional equation
T (x) = 1 + xT (x)2
with exact closed form solution
T (x) =
1
2x
�
1�
p
1� 4x
�
.
The number of binary trees with n nodes tn is given by the Catalan numbers
tn =
1
n+ 1
✓
2n
n
◆
Proof: We have t0 = 1. When constructing a tree of size n, we have a
node at the root, and then we can have a tree on the LHS of size j and a tree
on the RHS of size n� j � 1. Thus,
tn = t0tn�1 + t1tn�2 + t2tn�3 + · · · tn�1t0
Generating Function for Proper Binary Trees
tn = t0tn�1 + t1tn�2 + t2tn�3 + · · · tn�1t0
Note: The RHS is a convolution of {tn} with itself. It should look famil-
iar – this kind of convolution occurs when multiplying generating functions.
It suggests that the RHS involves T (x)2.
Recipe: Let T (x) =
P1
n=0 tnx
n, multiply both sides of the recurrence
by xn, and sum over allowed values of n (t�1 is not a valid member of the
sequence) so
102
tn =
n�1X
j=0
tjtn�1�j
)
1X
n=1
x
n
tn =
1X
n=1
x
n
n�1X
j=0
tjtn�1�j
) T (x)� t0 =
1X
n=1
x
n�1X
j=0
�
tjx
j
� �
tn�1�jx
n�1�j�
Changing Summation Order
T (x)� t0 = x
1X
n=1
n�1X
j=0
�
tjx
j
� �
tn�1�jx
n�1�j�
j
0
n
j = n� 1
1 P1
n=1
Pn�1
j=0
,
j
0
n
n = j + 1
1P1
j=0
P1
n=j+1
Sum over n from 1 to 1, for each fixed
value of n, sum over j from 0 to n� 1.
Sum over j from 0 to 1, for each fixed
value of j, sum over n from j+1 to 1.
T (x)� t0 = x
1X
j=0
1X
n=j+1
�
tjx
j
� �
tn�1�jx
n�1�j�
Functional Equation for T (x)
Note: “Convergence” of the infinite sums is not an issue as they are
formal power series. We only need the two forms to be equal term-by-term.
Using that t0 = 1, and substituting k = n� 1� j, we have
T (x)� 1 = x
1X
j=0
1X
n=j+1
�
tjx
j
� �
tn�1�jx
n�1�j�
= x
1X
j=0
1X
k=0
�
tjx
j
� ⇣
tkx
k
⌘
= x
0
@
1X
j=0
tjx
j
1
A
1X
k=0
tkx
k
!
) T (x) = 1 + x (T (x))2
103
Solving the Functional Equation for T (x)
• It is also possible to simply write down the equation for the generating
function directly from the recursive definition of the binary tree.
• Getting a functional equation for the generating function is enough to
be able to extract useful information.
• Here we are “lucky” and find a simple closed form expression for T (x).
T (x) = 1 + x (T (x))2
+
x (T (x))2 � T (x) + 1 = 0
+
T (x) =
1±
p
1� 4x
2x
+
T (x) =
1
2x
�
1�
p
1� 4x
�
,
choosing the branch with t0 = 1.
Taylor Expansion of T (x)
To extract a closed form expression for tn we perform a Taylor expansion
of T (x) around x = 0. In general, the Taylor expansion of a function f(x) is
f(x) =
1X
n=0
f
(n)(0)
n!
x
n
,
and the expansion for f(x) = (1 + x)↵ is
(1 + x)↵ =
1X
n=0
↵(↵� 1) · · · (↵� n+ 1)
n!
x
n
.
Let f(x) =
p
1� 4x = (1� 4x)1/2. Then
T (x) =
1
2x
1�
1X
n=0
1
2
·
�
�1
2
�
·
�
�3
2
�
· · ·
�
1
2
� n+ 1
�
n!
(�4x)n
!
.
Taylor Expansion of T (x)
104
T (x) =
1
2x
1�
1X
n=0
1
2
·
�
�1
2
�
·
�
�3
2
�
· · ·
�
1
2
� n+ 1
�
n!
(�4x)n
!
Pull out the first two terms from the infinite sum.
=
1
2x
1�
1� 2x+
1X
n=2
1
2
·
�
�1
2
�
·
�
�3
2
�
· · ·
�
1
2
� n+ 1
�
n!
(�4x)n
!!
= 1�
1
2x
1X
n=2
1
2
·
�
�1
2
�
·
�
�3
2
�
· · ·
�
1
2
� n+ 1
�
n!
(�4x)n
!
Manipulate all the ‘minus’ signs. n terms, n� 1 are negative.
= 1�
1
2x
1X
n=2
(�1)n�1 1
2
· 1
2
· 3
2
· · ·
�
n� 3
2
�
n!
(�1)n(4x)n
!
Taylor Expansion of T (x)
T (x) = 1 +
1
2x
1X
n=2
1
2
· 1
2
· 3
2
· · ·
�
n� 3
2
�
n!
22nxn
!
= 1 +
1X
n=2
1
2
· 1
2
· 3
2
· · ·
�
n� 3
2
�
n!
22n�1xn�1
Make the substitution r = n� 1.
= 1 +
1X
r=1
1
2
· 1
2
· 3
2
· · ·
�
r � 1
2
�
(r + 1)!
22r+1xr
Multiply each factor in numerator by 2. There are r + 1 factors.
= 1 +
1X
r=1
1 · 3 · · · (2r � 1)
(r + 1)!
2rxr
Taylor Expansion of T (x)
105
T (x) = 1 +
1X
r=1
1 · 3 · · · (2r � 1)
(r + 1)!
2rxr
Make numerator equal (2r)! . The missing terms are 2rr!
= 1 +
1X
r=1
(2r)!
2rr!(r + 1)!
2rxr
= 1 +
1X
r=1
1
r + 1
✓
2r
r
◆
x
r
=
1X
n=0
1
n+ 1
✓
2n
n
◆
x
n
) [xn]T (x) = tn =
1
n+ 1
✓
2n
n
◆
= Cn.
43 More Catalania
More Catalania
We explore several combinatorial problems counted by Catalan numbers.
1. Balanced parentheses
2. Two row standard Young tableaux
3. Non-itersecting arches on 2n points
4. Triangulations of n+ 2 sided polygons.
How does one prove that these are counted by Catalan numbers?
1. Direct construction
2. Show the counting problem satisfies the Catalan recurrence
3. Bijection to a problem known to be Catalan
Balanced Parentheses
Definition 116 (Balanced Parentheses). Balanced parentheses consists
of open ‘(’ and closed ‘)’ parentheses such that any open parentheses has a
matching closed parentheses.
Example: n=1 : (), n=2 : ()() (()), n=3 : ()()() (())() ()(()) (()()) ((())).
Note: The matching (or balanced) condition means that as we scan the
parentheses from left to right the number of ( parentheses is never less than
the number of ) parentheses.
106
Claim: Balanced parentheses are counted by Catalan numbers.
Proof: We shall show that the number bn of balanced parentheses of size
n satisfies the Catalan recurrence (i.e., the same recurrence as binary trees).
Consider n = 3. Note that the balanced parentheses fall into two sets.
• Indecomposable parentheses, e.g., (()()) or ((())).
• Parentheses composed from smaller components, e.g., ()()() or ()(()).
Balanced Parentheses
How do we then construct all possible parentheses of size n?
Break them into two parts. The first indecomposable component and the
rest.
(()(()))
| {z }
1st comp
(())()(()()) · · ·
| {z }
The rest
Look at indecomposable components. Remove first ( and last ). What is
left?
Balanced parentheses one size smaller, but not necessarily indecomposable.
Construct parentheses as: (A)B where both A and B are parentheses.
Parentheses of size n are parentheses of size 0 j n � 1 placed inside
( ) with parentheses of size n� 1� j added on. This leads to the recurrence:
bn = b0bn�1 + b1bn�2 + · · ·+ bn�2b1 + bn�1b0.
This is exactly the same recurrence as for binary trees (also need that b0 = 1).
Another Catalan Application
Example: Six people are all of di↵erent heights, with the tallest denoted
6 down to the smallest denoted 1. List all the ways the people can be arranged
into two rows of length three such that the heights increase along each row,
and each person in the second row is taller than the person in the first row.
1 2 3
4 5 6
1 3 4
2 5 6
1 2 4
3 5 6
1 3 5
2 4 6
1 2 5
3 4 6
These are examples of Standard Young Tableaux
Young Diagrams – Integer Partitions
Definition 117 (Young Diagram). A Young diagram is a finite collection of
boxes, arranged in left-justified rows, with the row lengths weakly decreasing.
The shape � of a Young diagram is an ordered list of the lengths of the rows.
The size of a Young diagram is the total number of boxes.
Example: Young diagrams:
107
Shape � = (5, 4, 2) Shape � = (4, 4)
Note: A Young diagram of size n represents an integer partition of
n, that is, n expressed as a sum of integers. The number of Young diagrams
of size n equals the number of integer partitions of n. This is a fundamental
problem in number theory.
Standard Young Tableux
Definition 118 (Standard Young Tableux). AYoung tableau is obtained by
filling in the boxes of a Young diagram with symbols taken from some alphabet,
which is usually required to be a totally ordered set, often just {1, 2, . . . , n}.
In a standard Young tableau the entries are strictly increasing along both
rows and columns.
Example:
a c d f h
b g i k l
e j
1 3 4 6
2 5 7 8
Bijection to Parentheses
Claim: Two-row SYT of length n are in bijection to balanced parentheses.
Proof: Let p1 . . . p2n denote a sequence of parentheses, i.e. pi 2 {(, )}.
Let pi = (, if i is in the first row, and let pi = ), if i is in the second row:
1 2 3
4 5 6
7! ((())) 1 3 4
2 5 6
7! ()(()) 1 2 4
3 5 6
7! (()())
1 3 5
2 4 6
7! ()()() 1 2 5
3 4 6
7! (())()
Using the recurrence for parentheses one can now deduce (exercise!) the
recurrence for the number Tn of two-row SYT’s of length n:
Tn = T0Tn�1 + T1Tn�1 + · · ·+ Tn�1T0.
Non-intersecting Arches Connecting 2n Points
Example:
108
n = 1 n = 2
n = 3
n = 4
· · · · · · · · ·
Bijection to parentheses is obvious.
Triangulations of n+ 2 Sided Polygons
Example:
n = 3
n = 4
n = 5
n = 6 · · · · · ·
Triangulations of n+ 2 Sided Polygons
Constructive Proof: Let Cn = # triangulations of n+2 sided polygons.
Note: At this stage we don’t claim that Cn is a Catalan number.
Given a polygon P with n+2 sides. First mark one of its sides as the base.
P is then triangulated. We can further choose and orient one of its 2n+1
edges. There are (4n+ 2)Cn such decorated triangulations.
Now given a polygon Q with n + 3 sides. Again mark one of its sides as
the base. If Q is triangulated, we can further mark one of the sides other than
the base side. There are (n+ 2)Cn+1 such decorated triangulations.
109
There is a simple bijection between the two kinds of decorated triangula-
tions: We can either collapse the triangle in Q whose side is marked, or in
reverse expand the oriented edge in P to a triangle and mark its new side.
Thus
(4n+ 2)Cn = (n+ 2)Cn+1.
The formula for Catalan numbers follows from this relation and the initial
condition C1 = 1.
Triangulations of n+ 2 Sided Polygons
Bijective Proof: Bijection of polygons to parentheses:
1. Start at top edge of polygon. Step inside polygon and walk around the
perimeter.
2. Each time an edge is crossed the first time record a ‘(’ or .
3. Each time a polygon (outside) edge is passed record a ‘)’ or .
4. Delete the last parentheses.
(())() ()(()) ((())) ()()() (()())
Geometric Series vs Catalan
We have seen that both the geometric series and Catalan generating func-
tion appears again and again as the solution to numerous combinatorial prob-
lems.
Why is this?
Let us look at the general structure of the combinatorial objects they count.
We are given a set of irreducible object A. Both the geometric series and
Catalan series arise when we form large objects by concatenating objects from
A, e.g., an objects could be:
a1a2a3 · · · an�1an
with each ai chosen at random from A (including the empty object).
This explains the prevalence of these series. But why two cases?
• Geometric: This is the generating function in cases where |A | = n.
• Catalan: In this case |A | =1. To get the Catalan generating function
the object in A need to have a ‘nested or recursive inner structure’.
110
44 Asymptotic Results
Asymptotic Results
• The number of proper binary trees with n nodes is tn =
1
n+ 1
✓
2n
n
◆
.
• This is the nth Catalan number.
• Is this a “good” answer?
If you want to know the number of trees with 23 nodes, then yes.
If you want to know whether, for example, tn > F
3
n , then we would do
better with an asymptotic formula.
– Fn = 0, 1, 1, 2, 3, 5, 8, · · ·
– F 3n = 0, 1, 1, 8, 27, 125, 512, · · ·
– tn = 1, 1, 2, 5, 14, 42, 132, · · ·
Asymptotic Results
We have
tn =
1
n+ 1
✓
2n
n
◆
=
1
n+ 1
(2n)!
(n!)2
.
Using Stirling’s formula for n!
n! ⇠
p
2⇡n
⇣
n
e
⌘n
we obtain
tn ⇠
1
n
p
2⇡2n
�
2n
e
�
2n
2⇡n
�
n
e
�
2n
⇠
4n
n3/2
p
⇡
• tn ⇠ ⇡�1/2n�3/24n.
• We have Fn ⇠ �n ⇡ (1.618 · · · )n which leads to F 3n ⇠ (4.236 · · · )n.
• Thus, for su�ciently large n we see that F 3n is exponentially larger than
tn.
45 Asymptotics from Generating Functions
Asymptotics from Generating Functions
Recall the generating functions for Fibonacci and Catalan numbers:
111
F (x) =
1X
n=0
Fnx
n =
1
1� x� x2
,
C(x) =
1X
n=0
Cnx
n =
1
2x
�
1�
p
1� 4x
�
.
In these expressions, x is a formal variable used for bookkeeping and the
generating functions are to be interpreted as formal power series:
1
1� x� x2
= 1 + x+ 2×2 + 3×3 + 5×4 + 8×5 + . . .
What if we were to interpret generating functions as real-valued functions?
Or even complex valued functions?
Asymptotics from Generating Functions
If we interpret F (z) =
1X
n=0
Fnz
n =
1
1� z � z2
,
as a complex valued function, we have to worry about the convergence of
the infinite series. Plot of the (real-valued) Fibonacci generating function:
0.2 0.4 0.6 0.8 1 1.2
-8
-6
-4
-2
2
4
6
8
This function has a pole at x = 1/� = 0.618034 . . ..
Asymptotics from Generating Functions
Plot of the (real-valued) Catalan generating function: C(x) =
1
2x
�
1�
p
1� 4x
�
0 0.05 0.1 0.15 0.2 0.25 0.3
1
1.2
1.4
1.6
1.8
2
112
This function has a singularity (branch-point) at x = 1/4.
A complex function has a singularity at a point a if it is not analytic at a
(i.e. if not all complex derivatives exist).
Asymptotics from Generating Functions
Recall that: Fn ⇠ �n and Cn ⇠ ⇡�1/2n�3/24n.
The generating functions F (x) and C(x) have singularities at the inverse
of the dominant asymptotic, i.e., at x = 1/� and at x = 1/4, respectively.
More generally, if G(x) =
P1
n=0 gnx
n, the dominant asymptotics is deter-
mined by the singularity of G(x) that is closest to the origin x = 0.
Asymptotics from Generating Functions
Many of the generating functions studied in combinatorics are believed to
have algebraic singularities, though in many cases this has not been proved.
That is to say, the generating function is believed to behave as
G(x) ⇠ A(1� x/xc)✓ as x! x�c . (1)
Hence it follows that
gn = [x
n]G(x) ⇠
An
�✓�1
�(✓)xnc
. (2)
A is referred to as the critical amplitude, xc as the critical point, and ✓
as the critical exponent. It follows that the dominant asymptotic is µn where
µ = 1/xc.
46 Functional Equations
Functional Equations
We have seen several examples that suggest that the best way to find a
closed form solution for the generating functions of a sequence is to find a
functional equation for the generating function.
Up till now we did this by finding a recurrence for the sequence and then
solving this using generating functions.
Combinatorial Problems ! Recurrences ! Generating Functions !
Functional Equations ! Solutions
Here we wish to show that in many cases we can jump straight to the
functional equation.
Combinatorial Problems ! Functional Equations ! Solutions
Definition 119 (Counting Variable). By a counting variable we mean the
variable, say z, used as a placeholder in a formal power series.
113
Functional Equation for Binomial Paths
Definition 120 (Binomial Paths). Binomial paths are lattice paths in N2
0
with step set S = {(1, 0), (0, 1)}.
Let x be the counting variable for East steps and y for North steps.
Functional Equation for Binomial Paths:
B(x, y) = 1 + (x+ y)B(x, y).
This equation ‘reads’: A binomial path is either an empty walk or it starts
with a step along the x- or y-axis followed by any binomial path.
This functional equation is just a special case of the functional equation
which arises by considering objects constructed entirely by concatenating ele-
ments from a finite set of basic units.
To get tilings with monomers and dimers set x = z, y = z2, B(x, y) =
B(z).
To get tilings with tiles up to length k: x+y 7! z+z2+ · · ·+zk, B(x, y) =
B(z).
Proper Binary Trees
Definition 121 (Proper Binary Trees). A proper binary tree is either a leaf
or a node with a left and a right tree.
Let x be the counting variable for the number of nodes in a binary tree.
The recursive definition then allows us to write down the functional equa-
tion:
T (x) = 1 + xTL(x)TR(x),
where ‘1’ is the leaf, x is the node and attached to the node we have left
and right trees. But the left and right trees are themselves just proper binary
trees. So
T (x) = 1 + xT (x)2.
Dyck Paths
Definition 122 (Dyck Path). Consider Dyck paths as lattice paths in N2
0
with step set S = {(1, 1), (1,�1)} which start and end on the line y = 0.
Example:
114
Theorem 123 (Dyck Path Generation Function). The number of Dyck paths
of length 2n are the Catalan number Cn and hence the generating function
D(x) for Dyck paths is
D(x) =
1
2×2
⇣
1�
p
1� 4×2
⌘
.
Dyck Paths
Proof: Partition (or factorise) a Dyck path at the first return to y = 0.
The left-most component is an irreducible Dyck path and the right-most
component a Dyck path. Now, remove the first and last step of an irreducible
Dyck path and the ‘interior’ is just another Dyck path. Pictorially we have:
x D(x) x D(x)
Thus
D(x) = 1 + x2D(x)2 ) D(x) =
1
2×2
⇣
1�
p
1� 4×2
⌘
.
Motzkin Paths
Definition 124 (Motzkin Path). A Motzkin path is a lattice path in N2
0
with step set S = {(1, 0), (1, 1), (1,�1)} which starts and ends on the line
y = 0.
Note: The restriction to N2
0
means the path never goes below y = 0 (it
is a ballot version of a more general path on N0 ⇥ Z).
Example:
Note: The associated Motzkin numbers also counts the number of ways
of drawing any number of nonintersecting chords joining n (labeled) points on
a circle; the number of standard Young tableaux of height 3, etc.
Motzkin Paths
Theorem 125 (Motzkin Path Generating Function). The generating function
M(x) for n-step Motzkin paths is
M(x) =
1
2×2
⇣
1� x�
p
1� 2x� 3×2
⌘
115
Proof by functional equation: ‘Factorise’ the path by first return to
y = 0.
There are now two cases to consider:
Case 1: First step is (1, 0). What follows is any Motzkin path so factor
xM(x).
Case 2: First step is (1, 1). Akin to Dyck paths and gives factor x2M(x)2.
So we get:
M(x) = 1 + xM(x) + x2M(x)2.
Solve for M(x) and choose the solution analytic at x = 0.
Motzkin Paths
Proof by Dyck paths (change of variable): We can also ‘construct’
Motzkin paths out of Dyck paths by a change of variable.
Consider the bijection from Dyck paths to parentheses. We get Motzkin
paths by allowing any number of ‘�’ to be inserted between the parentheses.
(()())() ! �(�� (�)(���)�)����(�)��
The insertion of any number of ‘�’ is given by a factor 1/(1 � y). The
change of variable x 7! x/(1 � y) takes care of every string of ‘�’ except the
first one . So
M(x, y) =
1
1� y
D
✓
x
1� y
◆
=
1
1� y
(1� y)2
2×2
⇣
1�
p
1� 4×2/(1� y)2
⌘
=
1
2×2
⇣
1� y �
p
(1� y)2 � 4×2
⌘
.
Motzkin Paths
But y was just a placeholder (or counting variable) for (1, 0) steps.
If we don’t want to distinguish steps we can set y = x and elementary
algebra gives us the result from the theorem.
But wait there is more!
x ‘counts’ the total number of steps.
For Motzkin paths we may ask a question such as ‘what is the generating
function when steps are weighted by their length’?
Now (1, 0) steps have length 1 while (1, 1) and (1,�1) steps have lengthp
2.
116
Thankfully, we kept track of the two types of step so we can easily answer
the question. Just do x 7!
p
2y and we get the generating function for Motzkin
paths with length weighted steps:
M(y) =
1
4y2
⇣
1� y �
p
1� 2y � 7y2
⌘
.
47 Dyck Paths
Dyck Paths by Returns to the Surface
Definition 126 (Dyck Paths at a Surface). Consider Dyck paths as lattice
paths in N2
0
with step set S = {(1, 1), (1,�1)} which start and end on the
surface given by the line y = 0.
Example:
Theorem 127 (Dyck Path Generation Function). The generating function of
Dyck paths of length n with k returns to y = 0 is
D(z, u) =
2
2� u+ u
p
1� 4z2
where z ‘counts’ the number of steps and u the number of returns.
Dyck Paths by Returns to the Surface
Proof: Partition (or factorise) a Dyck path at the first return to y = 0.
Pictorially we have:
z zuD(z, 1) D(z, u)
The left-most component is an irreducible Dyck path and the right-most
component a Dyck path.
Now, remove the first and last step of an irreducible Dyck path and the
‘interior’ is a Dyck path. However, this path has no contacts with the surface.
The generating function for Dyck paths without contacts isD(z, 1) = D(z),
where D(z) = 1
2z2
⇣
1�
p
1� 4z2
⌘
.
117
Dyck Paths by Returns to the Surface
Thus
D(z, u) = 1 + uz2D(z, 1)D(z, u) = 1 + uz2D(z)D(z, u)
+
D(z, u) =
1
1� uz2D(z)
=
1
1� u1
2
⇣
1�
p
1� 4z2
⌘
=
2
2� u+ u
p
1� 4z2
= 1 + uz2 + (u+ u2) z4 + (2u+ 2u2 + u3) z6 +
(5u+ 5u2 + 3u3 + u4) z8 +
(14u+ 14u2 + 9u3 + 4u4 + u5) z10 +
(42u+ 42u2 + 28u3 + 14u4 + 5u5 + u6) z12 + · · ·
Singular Behaviour
Consider: D(z, u) =
2
2� u+ u
p
1� 4z2
The singularities (in z) of this function (may) depend on the parameter u.
Where is D(z, u) singular?
Branch point from
p
1� 4z2 at z = 1
2
or denominator is zero, i.e.,
2� u+ u
p
1� 4z2 = 0 )
p
1� 4z2 =
u� 2
u
.
Since
p
1� 4z2 � 0 we must have u � 2 for this to be relevant.
Solving for z gives z =
p
u� 1
u
< 1 2 , u > 2.
So the series for D(z, u) (with u fixed) has dominant asymptotic
µ =
(
2 u 2
u
p
u� 1
u > 2.
Average Number of Contacts
Let Dm(z, u) =
�
u
d
du
�m
D(z, u). A basic physical quantitiy of the model
is the average number of contacts at a fixed length and fixed value u = u0
hki =
1
n
[zn]D1(z, u0)
[zn]D(z, u0)
.
118
We see hki ! 0 for u 2 and hki ! constant > 0 when u > 2.
Furthermore, hki ‘ 1/n, u < 2; 1/n1/2, u = 2.
Fluctuations of Number of Contacts
For fixed n a quantity of interest is the fluctuations in the number of
contacts:
�(u) =
1
n
�
hk2i � hki2
�
Fluctuations of Number of Contacts – Peak Position
Expect position of peak up to approach the critical values u = uc = 2.
Figures show up � 2 against 1/n and 1/n1/3.
119
48 Parenthesised Expressions
Parenthesised Expressions
Recall the ‘construction’ of Motzkin paths from Dyck paths by a change
of variable. Consider the bijection from Dyck paths to parentheses.
We get the Motzkin path generating function by allowing any number of
‘•’ to be inserted between the parentheses, e.g.,
(()())() ! •(• • (•)(• • •)•) • •(•) • •
The insertion of any number of ‘•’ is given by a factor 1/(1 � b). The
change of variable z 7! z/(1 � b) takes care of every string of ‘•’ except the
first one. So
M(z, b) =
1
1� b
D
✓
z
1� b
◆
Now if we want ‘properly’ parenthesised expressions it is not ‘natural’ to
allow symbols outside the parenthesis.
Can we fix this? Oh yes we can!
Parenthesised Expressions
The problem arises in terms such as
(•(•)) • •()
that is where a ‘•’ is inserted after the first set parenthesis was closed.
NOTE this is exactly when a Dyck path returns to the surface!
If we take our paths with returns, counted by generating function D(z, u),
then we can use u to fix the problem of counting ‘illegal’ configurations.
A factor of u occurs whenever the path returns to the surface OR (and
this is the crucial observation) when a set of parenthesis are properly
closed.
We can ‘cancel’ or ‘delete’ the o↵ending strings of ‘•’ by setting u = (1�b)
and z = z/(1� b).
Here the change z 7! z/(1� b) as before inserts a string of ‘•’ after every
parenthesis while u 7! (1� b) deletes the ones we don’t want.
Parenthesised Expressions
120
So
P (z) = D
✓
z
1� z
, 1� z
◆
=
2
2� (1� z) + (1� z)
p
1� 4z2/(1� z)2
=
2
1 + z +
p
1� 2z � 3z2
= 1 + z2 + z3 + 3z4 + 6z5 + 15z6 + 36z7 + 91z8 + · · ·
counts the number of parenthesised expressions where there is a factor z
for each parenthesis and ‘•’.
Parenthesised Expressions
You may object that it is not natural to have empty parenthesis.
And you would be quite right! So let us try to fix that too.
We distinguish between opening and closing parenthesis.
In D(z, u) we only have the factor z2. This really represents an opening
and closing parenthesis. We can replace the term z2 by xy, where x counts
opening parenthesis and y closing parenthesis.
We then have
D(x, y, u) =
2
2� u+ u
p
1� 4xy
Two cases arise naturally: We count both parenthesis and ‘•’ or only
‘•’.
Parenthesised Expressions
If we count both then we make the substitutions:
x 7!
xb
1� b
, y 7!
y
1� b
, u 7! (1� b).
If we don’t care to distinguish between parenthesis and ‘•’ then
x 7!
z
2
1� z
, y 7!
z
1� z
, u 7! (1� z)
so
121
P (z) = D
✓
z
2
1� z
,
z
1� z
, 1� z
◆
=
2
2� (1� z) + (1� z)
p
1� 4z3/(1� z)2
=
2
1 + z +
p
1� 2z + z2 � 4z3
= 1 + z3 + z4 + z5 + 3z6 + 6z7 + 10z8 + · · ·
Parenthesised Expressions
If we only count ‘•’ then we make the substitutions:
x 7!
b
1� b
, y 7!
1
1� b
, u 7! (1� b)
and if we now set b = z we get
P (z) = D
✓
z
1� z
,
1
1� z
, 1� z
◆
=
2
2� (1� z) + (1� z)
p
1� 4z/(1� z)2
=
2
1 + z +
p
1� 6z + z2
= 1 + z + 3z2 + 11z3 + 45z4 + 197z5 + 903z6
+4279z7 + 20793z8 + · · ·
49 Penrose Tilings
Penrose Tilings
• The Fibonacci substitution sequence is generated by the rules a ! A
and A! Aa, with initial string S1 = a.
• Today we are going to study recursively defined geometric structures
called Penrose tilings (tiles outside western entrance of Peter Hall).
• We derive a substitution rule to create a tiling of the plane which has
five-fold symmetry (not possible for lattices to have five-fold symmetry).
• This forms a two-dimensional quasicrystal ; with long range order, but no
translational invariance. 3-dimensional quasicrystals are seen in nature.
122
Penrose Tilings
• There are various formulations, but one formulations has triangles A and
B with dimensions and substitution rules derived below.
• Further reading: http://en.wikipedia.org/wiki/Quasicrystal http://
en.wikipedia.org/wiki/Penrose_tiling http://en.wikipedia.org/wiki/
Pinwheel_tiling (as seen at Federation square)
Tilings Encyclopedia
http://tilings.math.uni-bielefeld.de
Well worth a visit if only for the beautiful pictures!
Some Geometry
Given the regular pentagon below, what is ↵j and �?
↵1
↵2
↵3
� �
11
11
1
Theorem 128. 1. ↵1 = ↵2 = ↵3 = ⇡/5.
2. � = 1
2
(1 +
p
5) = � (The golden ratio)
Some Geometry
Proof: 1 By symmetry we have ↵1 = ↵3 = ✓. Now consider
� �
�
✓
✓
2�
We have from the left-most figure 5� = 2⇡ and 2�+� = ⇡ ) 2� = 3⇡/5.
From the right-most figure we have 2� + 2✓ = ⇡ ) ✓ = ⇡/5.
Thus ↵2 = 2� � 2✓ = 3⇡/5� 2⇡/5 = ⇡/5 and hence
↵1 = ↵2 = ↵3 = ⇡/5.
123
Some Geometry
Proof: 2 We use similar triangles and consider
✓
✓
✓
2✓
3✓
P
Q
R
S
4PQR ⇠= 4QSR (angles the same). 4QPS is isosceles so |PS| = 1.
Let |PR| = � and |RS| = �0, then (using 1 + �0 = �)
|QR|
|RS|
=
|PR|
|QR|
)
1
�0
= � ) ��0 = 1 ) �(�� 1) = 1 ) � =
1
2
⇣
1 +
p
5
⌘
.
Note: We have �2 = �+ 1.
Tiles and Tilings
Definition 129 (Tiles and Tilings). • A tile is the interior of some region
in R2.
• A tiling of R2 is a (countable) set of tiles which is a covering (their
union is R2) and a packing (the intersection of any pair of tiles is empty).
This definition is obviously very general. To be a little more concrete we
shall create a tiling via the following procedure:
Algorithm 130 (Tiling). 1. Start with an initial set of tiles {T1, T2, · · · , Tn}.
2. Scale each tile: {T1, T2, · · · , Tn} 7! {Q(T1), Q(T2), · · · , Q(Tn)}
3. Subdivide scaled tiles: Q(Tj) 7! {Tj1 , Tj2 , · · · , Tjk}
4. Repeat 2 and 3 ad infinitum.
Note: The scaling must increase the size of the tiles.
Penrose Tiling
Back to the triangles of our pentagon:
Initial tiles:
3✓
✓ ✓
11
�
A =
✓
2✓ 2✓
��
1
B =
Q : Scale by � (i.e. inflate as � > 1) and use �2 = �+ 1:
124
��
�+ 1
Q(A) = �+ 1�+ 1
�
Q(B) =
Penrose Tiling
Subdivide the “1 + �” side of the triangles:
1 + � 1 �
OR
� 1
The choice of which way to subdivide leads to many di↵erent tilings of
R2.
Use arrows to indicate which of the two subdivisions to choose.
Arrow points to the shorter length.
��
�+ 1
Q(A) : ��
1�
=
AB
Penrose Tiling
�+ 1�+ 1
�
Q(B) :
�
1
�
1
�
=
A
B
B
125
Penrose Tiling
We start with a five-fold symmetric configuration of type A triangles.
Penrose Tiling
And after successive substitutions we obtain:
Penrose Tiling
Dropping the arrows gives the aperiodic tiling:
126
Penrose Tiling
How many A and B triangles are there after n substitutions?
A! AB, B ! ABB, and therefore
An+1 = An +Bn
Bn+1 = An + 2Bn.
In matrix form
✓
An+1
Bn+1
◆
=
✓
1 1
1 2
◆✓
An
Bn
◆
We have initial conditions A0 = 10, B0 = 0, thus
✓
An
Bn
◆
=
✓
1 1
1 2
◆n✓
10
0
◆
Penrose Tiling
What is
✓
1 1
1 2
◆n
? The first few iterates are
✓
1 1
1 2
◆ ✓
2 3
3 5
◆ ✓
5 8
8 13
◆ ✓
13 21
21 34
◆
· · ·
Lets guess that ✓
1 1
1 2
◆n
=
✓
F2n�1 F2n
F2n F2n+1
◆
127
This leads to✓
An
Bn
◆
=
✓
F2n�1 F2n
F2n F2n+1
◆✓
10
0
◆
= 10
✓
F2n�1
F2n
◆
Thus as we perform more iterations
lim
n!1
Bn
An
= lim
n!1
F2n
F2n�1
= lim
n!1
1p
5
�
2n
1p
5
�2n�1
= �.
The ratio between the two types of triangles approaches the golden ratio.
Penrose Tiling
The elementary shapes in the original formulation of the Penrose tiling are
On the RMIT building the elementary shapes are rhombi
Penrose Tiling
128
Penrose Tiling
One final image
50 Permutations
Permutations and Bijective Maps
129
Definition 131 (Permutation). A permutation of N = {1, 2, 3, . . . , n} is an
arrangement or n-tuple
(�1,�2, · · · ,�n) �i 2 N, �i 6= xj 8i 6= j
such that each i 2 N occurs exactly once.
Definition 132 (Permutations as Bijective Maps). A permutation can be
viewed as a bijective map on N = {1, 2, 3, . . . , n}
� : N 7! N, � bijective, i.e., one-to-one onto.
Notation: We write � : i 7! �i or � : i 7! �(i).
Example: If � maps: 1 7! 2, 2 7! 4, 3 7! 1, 4 7! 3 then we may write
� = (2, 4, 1, 3) (or just 2413)
Note: Recall that there are n! permutations or arrangements of n ob-
jects.
51 Two-line Arrays
Permutations as Two-line Arrays
Notation 133 (Two-line Arrays). A permutation can be encoded in a two-
line array, telling us where elements in position j are mapped to
✓
1 2 · · · j · · · n
�1 �2 · · · �j · · · �n
◆
original positions, in order
new positions
Example: For the above example we have
✓
1 2 3 4
2 4 1 3
◆
For our example there are 4! di↵erent two-line arrays, corresponding to
permutations of 1, 2, 3, 4 in the bottom line.
For n positions (objects) there are n! possible orderings in the second line
) n! two-line arrays.
Permutations as Bipartite Graphs
Definition 134 (Bipartite Graph). A graph G is bipartite if the vertices
V (G) can be partitioned into two subsets V1 6= ; and V2 6= ; such that for all
edges, e = v1v2 2 E(G), v1 2 V1, v2 2 V2. That is, every edge joins a vertex
of V1 and a vertex of V2.
Example: Permutations can be represented on bipartite graphs. Con-
sider
130
✓
1 2 3 4
4 2 1 3
◆
.
This permutation can be depicted as
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
Note: The horizontal line is there as a ‘guide to the eye’.
Exercises
Example: Give in two-line notation the permutation � with the action
� : (a, b, c, d, e, f) 7! (b, a, f, c, e, d).
Solution: ✓
1 2 3 4 5 6
2 1 6 3 5 4
◆
Example: Give the above permutation as a bipartite graph:
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
5
5
5
5
6
6
6
6
52 Cycle Notation
Permutations in Cycle Notation
A notation which is often more convenient is so-called cycle notation.
To see where this comes from we represent permutations as directed graphs.
Definition 135 (Permutation as Directed Graph). Draw a directed graph
(edges have arrows on them) with n vertices and an arc (directed edge) from
vertex i to vertex j = �i.
Example:
Consider the permutation: � =
✓
1 2 3 4 5 6 7
3 6 5 2 1 4 7
◆
Note: Every vertex has exactly one ‘in’ arc and one ‘out’ arc.
131
Cycle Notation
Definition 136 (k-cycle). A k-cycle is a cyclic mapping of length k that takes
an element back to itself:
a1 7! a2 7! a3 7! · · · 7! ak 7! a1
Notation: The k-cycle is written as (a1a2a3 · · · ak).
Example: From the above example we have
1 7! 3 7! 5 7! 1 = (135); 2 7! 6 7! 4 7! 2 = (264); 7 7! 7 = (7).
) � =
✓
1 2 3 4 5 6 7
3 6 5 2 1 4 7
◆
= (135)(264)(7).
Note: We often express permutations as “products” of disjoint cycles.
Here are some more examples:
✓
1 2 3 4
2 4 1 3
◆
= (1243)
✓
1 2 3 4 5 6
5 2 6 3 4 1
◆
= (15436)(2)
✓
1 2 3 4
3 4 1 2
◆
= (13)(24)
✓
1 2 3 4 5 6
3 6 5 2 1 4
◆
= (135)(264)
Cycle Notation
• The start of a cycle is not important.
(2314), (4231), (1423) and (3142) are all the same cycle.
So a k-cycle can be written in k ways (all cyclic permutations).
• If a permutation contains several cycles it is written as a “product” of
disjoint cycles, e.g., � = 21435 = (12)(34)(5)
• Cycles are disjoint ) each i 2 N occurs in exactly one cycle.
• Since there are k ways to write any k-cycle and r! ways to arrange or
order the cycles in any product of r cycles it is convenient to have a
standard way for writing down a cycle form.
Definition 137 (Fixed Points and Transpositions). 1. 1-cycles are called
fixed points.
2. 2-cycles are called transpositions.
3. If 2-cycle is (i i+1) it is called an elementary or simple transposition.
Standard Cycle Form
Definition 138 (Standard Cycle Form). 1. Of the k ways to write a k-
cycle choose the one which starts with the smallest member.
132
2. Order the product so the cycle starting with lowest number is to the left,
then write down the cycle involving the next lowest number and so on
until all elements are exhausted.
3. Omit any 1-cycles.
This “prescription” makes cycle notation unique (? is standard form):
(123)(456)? = (456)(123)
(1)(25)(3)(46) = (25)(46)? = (52)(1)(64)(3)
(456123) = (123456)?
Note: In standard form omitted 1-cycles may mean n has to be specified.
� = (12)(34) =
✓
1 2 3 4
2 1 4 3
◆
(n = 4) but
✓
1 2 3 4 5
2 1 4 3 5
◆
(n = 5)
53 Composition of Permutations
Composition of Permutations
Since permutations can be considered to be maps we can form composi-
tions.
Definition 139 (Composition of Permutations). Let ⌧ and � be permutations
of N = {1, 2, . . . , n}. The composition of ⌧ and �, ↵ = ⌧ ��, is a permutation
of N defined as
↵ : i 7! ↵(i) = ⌧(�(i)).
Note: � is applied first then ⌧ . Often we drop the ‘�’ and talk of “prod-
ucts”.
Given � and ⌧ we can compute ↵ = ⌧ � � using any representation.
Example: Let � = (4, 3, 1, 2) and ⌧ = (2, 3, 1, 4) then
↵ : 1 7! ⌧(�(1)) = ⌧(4) = 4
↵ : 2 7! ⌧(�(2)) = ⌧(3) = 1
↵ : 3 7! ⌧(�(3)) = ⌧(1) = 2
↵ : 4 7! ⌧(�(4)) = ⌧(2) = 3
) ↵ = (4, 1, 2, 3)
Composition or Multiplication of Bipartite Graphs
Multiplication in graph notation is easily done by top to bottom concate-
nation:
With � = (4, 3, 1, 2) and ⌧ = (2, 3, 1, 4) we have:
133
1
1
2
2
3
3
4
4
�
�
⌧
=
1
1
2
2
3
3
4
4
↵
Composition or Multiplication of Two-line Arrays
We can multiply (compose) two-line arrays, e.g.
✓
1 2 3 4
2 3 1 4
◆✓
1 2 3 4
4 3 1 2
◆
When acting on arrangement of objects, the rightmost “permutation” acts
first.
✓
1 2 3 4
2 3 1 4
◆✓
1 2 3 4
4 3 1 2
◆
For this example we have 1 ! 4 ! 4, 2 ! 3 ! 1, 3 ! 1 ! 2,
4! 2! 3.
We can express the resulting permutation in a two-line array:
✓
1 2 3 4
2 3 1 4
◆✓
1 2 3 4
4 3 1 2
◆
=
✓
1 2 3 4
4 1 2 3
◆
Composition or Multiplication in Cycle Notation
In standard cycle notation we have
� = (4, 3, 1, 2) = (1423) ⌧ = (2, 3, 1, 4) = (123)
For the multiplication put back in the (4) in ⌧ so
↵ = (123)(4) � (1423) = (1432)
We read of the action: 1 7! 4 7! 4, 4 7! 2 7! 3, 3 7! 1 7! 2, 2 7! 3 7! 1.
Example: What is � � ⌧ = (1423) � (123)(4)?
54 Inverse permutations
Identity and Inverse Permutations
Definition 140 (Identity and Inverse Permutations). The identity permu-
tation, denoted �I or “id”, is:
134
�I = (1, 2, 3, . . . , n) =
✓
1 2 3 · · · n
1 2 3 · · · n
◆
, · · ·
Since � is a bijection it has an inverse ��1 defined by
� � ��1 = ��1 � � = �I
Next we consider:
• Given � in various representations; compute ��1.
• For each representation there is a simple way to compute ��1.
Inverse Permutations from Bipartite Graphs
Procedure 141 (Inverse Permutation from Bipartite Graph). Given a bipar-
tite graph of a permutation � we get the inverse permutation ��1 by “reading”
the graph backwards.
Example:
� =
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
�
�1 =
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
1
1
1
1
1
1
2
2
2
2
2
2
3
3
3
3
3
3
4
4
4
4
4
4
� � ��1 =
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
=
Inverse Permutations from Two-line Arrays
Procedure 142 (Inverse Permutation from Two-line Array). Given a two-
line array of a permutation � we get the inverse permutation ��1 by “reading”
the two-line array from bottom to top.
Example:
� =
✓
1 2 3 4 5 6
2 1 6 3 5 4
◆
) ��1 =
✓
1 2 3 4 5 6
2 1 4 6 5 3
◆
Exercise: Check that � � ��1 = ��1 � � = �I
135
Inverse Permutations from Cycles
Procedure 143 (Inverse Permutation from Cycles). Given a permutation �
in standard cycle form we get the inverse permutation ��1 by “reading” each
cycle backwards, i.e., from right to left.
Example: ��1 is obtained by ‘reversing the arrows’ in cycle notation.
a1 7! a2 7! a3 7! · · · 7! ak 7! a1
becomes
a1 7! ak 7! ak�1 7! · · · 7! a2 7! a1
So
� = (1357)(24) ) ��1 = (7531)(42) = (1753)(24)
Note: The inverse of a 2-cycle is the same 2-cycle.
55 Involutions
Involutions
Definition 144 (Involution). A permutation � which contains only 1-cycles
and 2-cycles is an involution.
Theorem 145 (Basic Involution Property). If � is an involution then �2 = �I
and � = ��1.
Proof: A 1-cycle is already the identity.
We showed above that the inverse of a 2-cycle is the same 2-cycle.
Now � is a product of 1- and 2-cycles so each factor in � �� = �2 becomes
the identity on these 1 or 2-cycles.
So �2 leaves every element in place and hence is the identity.
Fixed Point Free Involutions
Theorem 146 (Fixed Point Free Involutions). The number of fixed point free
involutions of N = {1, 2, 3, . . . , n} is
(n� 1)!! := (n� 1)(n� 3)(n� 5) · · · 3 · 1
Note: n must be even.
Proof (outline): Since � is fixed point free (it has no 1-cycles) all i 2 N
are part of some 2-cycle (i�i).
Represent � by a “dot-link” diagram (which is a line with n dots where
the dots are connected by “arcs” or edges above the line).
Link set = { (i�i) | i 2 N, i < �i}. Example: � = (13)(24)(56) 136 Fixed Point Free Involutions We calculate the number of such link-dot diagrams as: • First link dot at position 1 to any of the remaining n� 1 dots. • Among the remaining n � 2 dots take the one in the left-most position and link to any of the remaining n� 3 dots. • Next from the remaining n�4 dots take the one in the left-most position and link to any of the remaining n� 5 dots. • etc. etc. ) |Link set| = (n� 1)(n� 3)(n� 5) · · · 3 · 1 = (n� 1)!! (n even) (Use induction to do a proper proof.) 56 Stirling numbers Counting Cycles Definition 147 (Sn). Let Sn be the set of all permutations ofN = {1, 2, 3, . . . , n}. Definition 148 (Stirling Numbers). Stirling numbers of the first kind, S1(n, k) are defined as the number of permutations � 2 Sn with exactly k disjoint cycles (in standard form). Example: n = 4, k = 2 : S1(4, 2) = 11. (123)(4), (132)(4), (124)(3), (142)(3), (134)(2), (143)(2), (1)(234), (1)(243), (12)(34), (13)(24), (14)(23). Example: List all permutations of S4 which consist of a single cycle. (1234) (1243) (1324) (1342) (1423) (1432) i.e. fix 1, and then consider all possible orderings of 2, 3, 4. 137 Seating Arrangements An equivalent representation is that S1(n, k) counts the number of ways of seating n people at k round tables, where the tables are indistinguishable, no table is empty, and we only care about the order of people around the table. Example: (134)(25) = 1 3 4 2 5 (1256)(3)(478) = 1 2 56 3 4 7 8 Special Values of Stirling Numbers 1. S1(n, k) = 0, k > n or k < 0. By convention, but obvious combinatorially. 2. S1(0, 0) = 1, otherwise S1(n, 0) = 0, n > 0.
By convention to give correct boundary conditions for recurrence.
3. S1(n, 1) = (n� 1)!
There is one cycle which in standard form starts with 1. There are
(n� 1)! permutations of the remaining elements.
4. S1(n, n) = 1.
Only one way to seat n people at n tables.
5. S1(n, n� 1) =
�
n
2
�
.
There are n � 1 tables and exactly one table has 2 people. We can
choose these two people in
�
n
2
�
ways. After that there is only one way to
seat the remaining people.
Basic Properties of Stirling Numbers
Theorem 149 (Basic Properties of Stirling Numbers). 1. Summation for-
mula:
nX
k=1
S1(n, k) = n!
2. Recurrence relation: S1(n, k) = S1(n�1, k�1)+(n�1)S1(n�1, k).
138
3. Generating function:
nX
k=1
S1(n, k)x
k = x(x+1)(x+2) · · · (x+n�1).
Proof: 1 Obvious from the definition.
2 On the RHS we have S1(n� 1, k� 1) from ‘add the 1-cycle (n)’ and
(n � 1)S1(n � 1, k) from ‘insert ‘n’ into standard form of a � 2 Sn�1 with k
cycles.’ ‘n’ can be inserted in any position in a cycle except before the first
number. Now check there are n � 1 such positions, e.g., � = (15)(243) 5
positions for 6.
3 Q5 from Practice Class 9.
Sterling Triangle
Recurrence relation and the boundary conditions leads to a Sterling trian-
gle resembling Pascal’s triangle for binomial coe�cients:
n increases down rows while k is constant along ‘right to left diagonals’.
57 Symmetric group
Symmetric Group
The fact that permutations can be multiplied to form other permutations,
and have inverses, tells us that the set of all permutations from a group.
This group is called the symmetric group, and denoted Sn where n is the
size of the permutation.
Example: The six elements of S3 are
✓
1 2 3
1 2 3
◆
= (1)(2)(3),
✓
1 2 3
2 1 3
◆
= (12)(3),
✓
1 2 3
1 3 2
◆
= (1)(23),
✓
1 2 3
2 3 1
◆
= (123),
✓
1 2 3
3 1 2
◆
= (132),
✓
1 2 3
3 2 1
◆
= (13)(2).
139
58 Group theory
Groups
Definition 150 (Group). A group, G, is a collection of elements with a mul-
tiplication relation satisfying:
Closed under multiplication:
a, b 2 G) a · b 2 G
Associativity:
a, b, c 2 G) (a · b) · c = a · (b · c)
Existence of unique identity element:
9 e 2 G s.t. e · a = a · e = a 8 a 2 G
Existence of inverse:
8 a 2 G, 9 b 2 G s.t. a · b = b · a = e
Note: Group multiplication need not be commutative: can have a · b 6=
b · a.
Permutations – Exercise
Exercise: multiplying permutations.
✓
1 2 3 4
2 3 1 4
◆✓
1 2 3 4
1 3 4 2
◆
=?
Solution: ✓
1 2 3 4
2 1 4 3
◆
✓
1 2 3 4
1 3 4 2
◆✓
1 2 3 4
2 3 1 4
◆
=?
Solution: ✓
1 2 3 4
3 4 1 2
◆
Permutations form a Group
Let Sn = set of all permutations of n elements {1, 2, 3 . . . , n}. With com-
position of permutations as multiplication Sn is a group called the symmetric
group.
Definition 151 (Symmetric Group). Let �, ⌧,↵ 2 Sn.
1. Sn is closed under multiplication: � � ⌧ 2 Sn.
2. Multiplication is associative: (� � ⌧) � ↵ = � � (⌧ � ↵). Which follows
since composition of maps is associative.
140
3. Unique identity element: �I = (1, 2, 3, · · · , n).
4. Every � 2 Sn has a unique inverse (property of bijections).
Note: Multiplication in Sn is not commutative: in general � � ⌧ 6= ⌧ � �.
As we saw in previous example (2314)(1342) = (2143) 6= (1342)(2314) =
(3412).
Transpositions
Definition 152 (Transpositions). If � 2 Sn and � = (i j) (i.e. � is a single
2-cycle and n� 2 1-cycles) then � is called a transposition. If j = i+1 then
� is called a simple or elementary transposition.
Question: Can any � 2 Sn be written as a product of (simple) transpo-
sitions?
That is, can we decompose � into a product on non-disjoint 2-cycles?
Is this decomposition unique and if so how many are there?
Example:
(134) = (14)(13)
1! 3, 3! 1! 4, 4! 1
(1423) = (13)(12)(14)
1! 4, 4! 1! 2, 2! 1! 3, 3! 1
Writing a Cycle as Product of Transpositions
Theorem 153 (Decomposition of Permutations). If (a1a2 · · · ak) is a k-cycle
then
(a1a2a3 · · · ak) = (a1ak)(a1ak�1) · · · (a1a4)(a1a3)(a1a2)
or
(a1a2a3 · · · ak) = (a1a2)(a2a3) · · · (ak�2ak�1)(ak�1ak).
Proof: Outline of proof for first form.
To see why this is true we read RHS from right to left:
a1 ! a2. Then we never touch the element in position 2 again (it does not
appear in any more cycles) and therefore a1 is in the correct place.
a2 ! a1 ! a3. We never touch a3 again so a2 remains in the correct place.
a3 ! a1 ! a4. We never touch a4 again so a3 remains in the correct place.
Finally, ak�1 ! a1 ! ak, and ak ! a1.
This shows that all elements end up in their correct places.
141
Writing a Cycle as Product of Transpositions
Corollary 154 (Permutations as Products of Transpositions). All permuta-
tions can be written as a product of transpositions (and 1-cycles).
Note:
1. Decomposition of permutations is not unique (two forms given).
2. The transpositions need not be simple.
3. In both of the given forms all transpositions are distinct. If repeats are
allowed then there are even more possibilities, e.g.,
(25) = (45)(24)(45)
Check: on RHS, 4! 5! 4, 5! 4! 2, 2! 4! 5. Same as LHS.
59 Parity of Permutations
Parity of Permutations
Definition 155 (Parity of Permutation). The parity of a permutation is even
if it can be written as the product of an even number of transpositions, and
odd if it can be written as a product of an odd number of transpositions.
Definition 156 (Sign of Permutation). The sign of a permutation, denoted
sgn is defined as
sgn(�) =
(
+1 if � is even
�1 if � is odd
Note: Fixed points (1-cycles) have sgn = +1.
Parity of Permutations
Theorem 157 (Parity of Permutations). The parity of the number of trans-
positions cannot vary and sgn(�) is constant.
Proof – Exercise: Given two di↵erent expressions for a permutation �
in terms of transpositions, can you prove that the number of transpositions in
these two expressions must either both be even, or both be odd?
From � = (a1a2a3 · · · ak) = (a1ak)(a1ak�1) · · · (a1a4)(a1a3)(a1a2)
) sgn(�) = (�1)k�1 (for a k-cycle),
and hence in general
sgn(�) =
Y
cycles
(�1)|cycle|�1,
142
where |cycle| is the length of the cycle.
Example: � = 43512768 = (14)(235)(67) ) sgn(�) = (�1)(�1)2(�1) =
+1
60 Alternating Group
Alternating Group
From the following facts
1. If �, ⌧ 2 Sn and sgn(�) = sgn(⌧) = +1 ) sgn(� � ⌧) = +1.
2. sgn(��1) = sgn(�), since cycles written backwards have the same length.
3. sgn(�I) = +1
we get
Theorem 158 (Alternating Group). The set of even permutations in Sn form
a subgroup called the alternating group which is denoted An and we have
|An | = n!2 .
Element of S3 and A3
Write out all elements of S3, first in two line notation, then as cycles, and
finally in terms transpositions. Read o↵ the elements of A3.
✓
1 2 3
1 2 3
◆
= �I even
✓
1 2 3
1 3 2
◆
= (23) odd
✓
1 2 3
2 1 3
◆
= (12) odd
✓
1 2 3
2 3 1
◆
= (123) = (13)(12) even
✓
1 2 3
3 1 2
◆
= (132) = (12)(13) even
✓
1 2 3
3 2 1
◆
= (13) odd
We read o↵ that: A3 = {�I, (123), (132)}
Exercise
Example: Let � be a permutation of {1, 2, · · · , n} consisting of l disjoint
cycles, including possible fixed points. Show that sgn(�) = (�1)n�l.
Example: Show that (123) generates A3, i.e. all elements of A3 can be
expressed as products of (123).
143
Notation: A3 = h(123)i.
61 Elementary transpositions
Reordering Permutations
The following simple ‘bubble sort’ algorithm can be used to reorder any
permutation back to �I.
Algorithm 159 (Bubble Sort). Any list of elements from {1, 2, . . . , n} can
be sorted (ordered) by repeatedly switching the leftmost ‘out-of-order’ pair of
elements.
Example: Sort the permutation 31542 by bubble sort.
31542
switch
pos (12)7�! 13542
switch
pos (34)7�! 13452
switch
pos (45)7�! 13425
switch
pos (34)7�! 13245
switch
pos (23)7�! 12345
In terms of neighbour switching we can write
�
(12)
7�! �(1)
(34)
7�! �(2)
(45)
7�! �(3)
(34)
7�! �(4)
(23)
7�! �I
Reordering Permutations
Note: ✓
1 2 3 4 5
3 1 5 4 2
◆
� (12) =
✓
1 2 3 4 5
1 3 5 4 2
◆
✓
1 2 3 4 5
1 3 5 4 2
◆
� (34) =
✓
1 2 3 4 5
1 3 4 5 2
◆
We see that switching entries i and i+1 is the same as composition on the
right with the 2-cycle (i i + 1), i.e., � � (12) = �(1), �(1) � (34) = �(2), etc.,
and
� � (12) � (34) � (45) � (34) � (23) = �I
From which is follows that
� = (23) � (34) � (45) � (34) � (12)
Note: We have thus written � as a composition (product) of elementary
transpositions and our working suggests this can always be done.
Permutations from Elementary Transpositions
Theorem 160 (Permutations from Elementary Transpositions). Permuta-
tions � 2 Sn can be written as products of elementary transpositions.
The theorem follows from the following procedure
1. Decompose each cycle of � 2 Sn into a composition of transpositions
using the results from Week 9 Lecture 3.
144
2. Decompose each transposition into a composition of elementary trans-
positions using the next theorem.
Definition 161 (Elementary Transpositions of Sn). The elementary trans-
positions for Sn are si = (i i+ 1), i.e.,
s1 = (12), s2 = (23), s3 = (34), · · · , sn�1 = (n� 1 n).
Transpositions from Elementary Transpositions
Theorem 162 (Transpositions from Elementary Transpositions). Let si be
the elementary transpositions for Sn. Then, for i < j the transposition (i j)
can be written
(i j) = sj�1sj�2 · · · si+1sisi+1 · · · sj�2sj�1,
Proof: “Follow the composition” and check i 7�! j 7�! i.
sj�1sj�2 · · · si+1sisi+1 · · · sj�2sj�1 =
(j�1 j)�(j�2 j�1)�· · ·�(i+1 i+2)�(i i+1)�(i+1 i+2)�· · ·�(j�2 j�1)�(j�1 j).
We see j 7! j � 1 7! j � 2 7! · · · 7! i+ 2 7! i+ 1 7! i.
Now i doesn’t appear again so all in all j 7! i.
If we start with i nothing happens until we hit (i i+1) and then
i 7! i+ 1 7! i+ 2 7! · · · 7! j � 2 7! j � 1 7! j.
Inverses as Elementary Transpositions
Theorem 163 (Inverses as Elementary Transpositions). If � 2 Sn and
� = si1si2 · · · sik�1sik
then
�
�1 = siksik�1 · · · si2si1 ,
i.e., just reverse the order of the elementary transpositions.
Proof: Form � � ��1 and use s2i = �I.
Transpositions from Elementary Transpositions
Any transposition can be written as a product of elementary transpositions.
Example:
(24) = (34)(23)(34) = s3s2s3.
This can be shown by the shu✏ing action on (a, b, c, d):
s3s2s3(a, b, c, d) = s3s2(a, b, d, c) = s3(a, d, b, c)
= (a, d, c, b) = (24)(a, b, c, d).
(25) = (45)(34)(23)(34)(45) = s4s3s2s3s4
145
s4s3s2s3s4(a, b, c, d, e) = s4s3s2s3(a, b, c, e, d)
= s4s3s2(a, b, e, c, d) = s4s3(a, e, b, c, d)
= s4(a, e, c, b, d) = (a, e, c, d, b) = (25)(a, b, c, d, e).
Symmetric Group
We have shown that any cycle can be written as a product of transpositions.
Then we showed that any transposition can be written as a product of
elementary transpositions (si = (i i+ 1)).
Thus any permutation � 2 Sn can be written as a product of si’s.
We have therefore shown that
Theorem 164 (Sn is Generated by Elementary Transpositions). Let si be the
elementary transpositions for Sn. Every � 2 Sn can be written as a product of
si’s and we say that the set of si’s generate Sn, and write
Sn = hsi | i = 1, 2, · · · , n� 1i.
Permutations as Words
Previous theorem says a permutation � 2 Sn is a word in the alphabet
S = {s1, s2, . . . , sn�1}
Question: What is the shortest word?
There are many ways to write a given � 2 Sn as (shortest length) product
of elementary transpositions.
Example: Consider � = 4213
� = 4213 = (143) = (13) � (14) = s2s1s2 � s3s2s1s2s3 length = 8.
However, � = s3s1s2s1 = s1s3s2s1 = s3s2s1s2
Check: s3s1s2s1 = (34)(12)(23)(12) = (143)(2)
or s1s3s2s1 = (12)(34)(23)(12) = (143)(2)
62 Inversions
Inversions
So if � =
✓
1 2 3 4
4 2 1 3
◆
, then the shortest word is w� = s3s1s2s1.
Looking at the bipartite graph for � reveals a secret:
146
1
1
1
1
2
2
2
2
3
3
3
3
4
4
4
4
=
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
s1
s2
s1
s3
Graph has four crossings, and it is no coincidence that the length of w�
is 4.
Inversions
A cross
i j
�j �i
corresponds to two entries out of order: i < j,
�i > �j .
Definition 165 (Inversion). An inversion of a permutation � is a pair (i, j)
such that i < j and �i > �j . The set of inversions of � is denoted
I� = {(i, j) | i < j, �i > �j}.
Note: The bigger number �i appear before �j in (�1,�2, . . . ,�n).
Bonus: Bipartite graph gives decomposition as elementary transpositions.
Read from top to bottom and record crossings in order of appearance.
Inversions
Example: � =
✓
1 2 3 4
4 2 1 3
◆
then
I� = {(1, 2), (1, 3), (1, 4), (2, 3)}.
All pairs in � = (4, 2, 1, 3) are given by the table
4 2 1 3
4 2
4 1
4 3
2 1
2 3
1 3
1 2 3 4
Four pairs out of order. Positions of the elements give the entries in I�.
147
Reduced Words
Theorem 166 (Length of Shortest Word). The length `(w�) of a shortest
word w� 2 Sn representing � by elementary transpositions equals the number
of inversions of �, i.e., if `(w�) is minimal then `(w�) = |I�|.
Definition 167 (Reduced Word). A minimum length word is called a reduced
word.
63 Algebraic Relations for si’s
Algebraic Relations for si’s
Elementary transpositions, si, are characterised by three algebraic rela-
tions:
1. s2i = �I (the identity permutation)
2. sisj = sjsi, |i� j| > 1 (commutation relation)
3. sisi+1si = si+1sisi+1 (the braid relation)
Exercise: Check that these relations are true.
Note: The proof that any two expressions for a permutation � in terms
of si’s, can be converted from one expression to the other via a sequence of
operations of the form (1), (2), and (3) is lengthy and not part of the course.
Relation (1) reduces the number of si’s by two (shortens the word).
Relations (2) and (3) don’t change the number si’s, but re-orders them.
Thus the parity of the number of elementary transpositions is conserved
by the algebraic relations.
Braid Relation
We can represent the braid relation visually as follows:
sisi+1si
i i+1 i+2
i i+1 i+2
apply si
apply si+1
apply si
=
si+1sisi+1
i i+1 i+2
i i+1 i+2
apply si+1
apply si
apply si+1
So pictorially the braid relation is like a reflection in a line through i+1.
148
Algebraic Relations in Action – Graphically
Return to � =
✓
1 2 3 4
4 2 1 3
◆
, with w� = s3s1s2s1.
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
s2
s1
s2
s3
=
(3)
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
s1
s2
s1
s3
=
(2)
1
1
1
1
1
1
1
1
1
1
2
2
2
2
2
2
2
2
2
2
3
3
3
3
3
3
3
3
3
3
4
4
4
4
4
4
4
4
4
4
s1
s2
s3
s1
Reduced Word – Example
Example: Find all reduced words for w� = s1s3s2s3s1.
64 Alternating group
The Alternating Group
Recall: If � and ⇡ are even permutations, then by inspection �⇡ is also
even. We also have that ��1 is even, and the identity permutation I is even,
and so even permutations form a group, called the alternating group.
We denote the alternating group of n elements as An.
We will now specify a generating set for An. Elements of An can be
written as a product of an even number of transpositions. We know that the
elementary transpositions generate Sn, and so we have
An = hsisj | 1 i < j n� 1i. How can we be sure i < j is su�cient? Because we can generate sjsi with i < j sjsi = sisj when j > i+ 1 from (2)
si+1si = s
2
i si+1si = si(sisi+1si) = si(si+1sisi+1) = (sisi+1)
2
.
Generating Set for An
We can go one step further. Note that for i < j we have
sisj = sis
2
i+1s
2
i+2 · · · s
2
j�1sj = (sisi+1)(si+1si+2) · · · (sj�1sj).
So any sisj with i < j can be generated by sisi+1! Thus
An = hsisi+1 | 1 i n� 2i = h(i i+ 1 i+ 2) | 1 i n� 2i
Exercise: Show that sisi+1 = (i i+1 i+2)
Definition 168 (Elementary 3-cycles). A 3-cycle is simple or elementary
if it is of the form (i i+1 i+2).
149
Theorem 169 (An Generated by Elementary 3-cycles). If � 2 An then � can
be decomposed into a product of elementary 3-cycles and An is thus generated
by elementary 3-cycles.
This generating set will form the basis for our understanding of the 15-
puzzle.
65 The 15-puzzle
The 15-puzzle
The 15-puzzle was invented towards the end of the 19th Century, and
started a world-wide craze bigger than that of the Rubik’s cube in the 1980’s.
See http://en.wikipedia.org/wiki/15-puzzle.
The 15-puzzle consists of a 4 ⇥ 4 grid with tile labelled 1 to 15 and an
empty cell. The tiles may be slid into an adjacent empty cell (called a legal
move).
Typical challenge: Change configuration A to configuration B?
Change a random (scrambled) configuration into the ordered configuration:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Objectives
We are going to study the 15-puzzle as a mathematical object.
• How to prove or characterise which configurations can be legally ob-
tained?
• Idea: map configurations to permutations and moves to the action of a
permutation on the configuration.
We prove a connection to the alternating group An by showing two things:
• Only even permutations of the numbers 1, · · · , 15 can be achieved. (Hence,
the tiles labelled 14 and 15 can’t be swapped, as this corresponds to a
single transposition).
• All even permutations of the numbers 1, · · · , 15 can be achieved.
Note: The 15-puzzle can be generalised to an n⇥m grid with n⇥m� 1
tiles and a empty cell.
150
Grid Representation
There are many ways to map configurations to permutations. The obvious
one is to label cells 1 to 16 and then �(i) = label of the tile above i.
For example (with clockwise cell numbering)
3 2
4 1
�! � =
✓
1 2 3 4
4 1 2 3
◆
= (4, 1, 2, 3) = (1432)
Move 3 up:
4 2
3 1
�! µ = (3, 1, 2, 4) = (132)(4) = (1432)(14)
Moving the tile with 3 resulted in right-composition with the transposition
(14).
Right-Composition
Recall: Right composition by two-cycle (ij) swaps values at positions i
and j.
�
0 = � (ij), � = (�1, . . . ,�i, . . . ,�j , . . . ,�n)
= (�1, . . . ,�j , . . . ,�i, . . . ,�n)
Exercise: Characterise left composition by (ij).
Only Even Permutations are Possible
Theorem 170 (Only Even Permutations are Possible). On an m ⇥ n board
with moves consisting of swapping the “empty” cell with a neighbouring tile
only even permutations of the tiles are permitted.
Proof: Our goal is to characterise possible configurations where the
empty tile remains fixed, i.e. in the bottom right corner.
Swapping the empty tile with a neighbour corresponds to a transposition.
If we must return the empty tile to its starting position, then we have
# of up slides = # of down slides
# of left slides = # of right slides
) Total slides = (# up) + (# down) + (# left) + (# right)
= 2(# up) + 2(# left) = even number
So the final configuration must be reached via an even number of transpo-
sitions, and hence corresponds to an even permutation in Snm�1.
Exercise: Why is this true for Snm�1, even though we have n⇥m tiles?
151
Notes on the Grid Representation
1. Grid permutation representation requires us to work with Sn⇥m.
2. Configurations related by sliding the empty cell along cell labels corre-
spond to di↵erent permutations, e.g.:
3 2
4 1
(4,1,2,3)
3 2
1 4
(1,4,2,3)
3 4
1 2
(1,2,4,3)
4 3
1 2
(1,2,3,4)
3. To get the permutation we must “look under” the tile to see the cell
label.
The representation we shall look at next reduces n⇥m to n⇥m� 1 and
fixes the problems with (2) and (3) from the grid representations.
All Even Permutations are Possible
Theorem 171 (All Even Permutations are Possible). On an m⇥n board with
moves consisting of swapping the “empty” tile with a neighbouring tile all even
permutations of the tiles are permitted.
Proof: This is surprisingly non-trivial. See e.g. http://www.cs.cmu.edu/afs/
cs/academic/class/15859-f01/www/notes/15-puzzle.pdf for a proof by Archer in the
American Mathematical Monthly (1999).
First we need a di↵erent representation: “Snake pattern representation”
1 2 3 4
56
15
�(i) is the position of tile i along the snake ignoring the blank ) � 2 S15
Snake Pattern Representation
Example: Can target configuration be reached from initial configura-
tion?
1 2 3 4
5 6 7 8
9 10 11 12
13 15 14
Target configuration
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15
Initial configuration
152
�i = (1, 2, 3, 4, 8, 7, 6, 5, 9, 10, 11, 12, 15, 14, 13)
= (58)(67)(13 15) = odd permutation
�t = (1, 2, 3, 4, 8, 7, 6, 5, 9, 10, 11, 12, 14, 15, 13)
= (58)(67)(13 14 15) = even permutation
Since moves correspond to even permutations, they cannot produce the
target configuration from the initial configuration.
Blank can be moved along the snake without changing the permutation)
map from puzzle configurations to permutations is not one-to-one but 16 to
1.
66 7-Puzzles
7-Puzzles
For simplicity we now focus on 2⇥ 4 grids.
We prove that each permutation gives rise to a puzzle configuration.
In fact 7 configurations related to each other by blank slides along the
snake. Initial configuration:
1 2 3 4
5 6 7
�i = (1, 2, 3, 4, 7, 6, 5) = (57)
Target configuration:
1 2 3 4
5 7 6
�t = (1, 2, 3, 4, 6, 7, 5) = (567)
7-Puzzles
Moving a tile into the blank changes the permutation of the configuration
p
move�! q
which is represented by right composition
�p = �qµ,
where µ is the permutation corresponding to the slide.
Many slide moves correspond to longer composition sequences:
�r = �0µ1µ2 · · ·µr.
So for 7-puzzles the questions are
• (567) ?= (57)µ1 · · ·µr with µi permutations corresponding to legal moves.
153
• If not, does every even permutation correspond to a legal configuration?
7-Puzzle: Fundamental Moves
For the 7-puzzle and the snake representation we only need three di↵erent
fundamental permutations.
Move:
Permutation: µ1 µ2 µ3
Thus µi corresponds to moving the tile in column i up from bottom to top
row (and µ�1i is the top to bottom move).
Recall that horizontal slides do not change the permutation.
So we can always make those “for free”.
Note: is not required. Why is that?
7-Puzzle: Fundamental Permutations
As permutations the moves correspond to
�1 �2 �3
�7 �6 �5 �4
(�7,�1, . . . ,�6) = (�1, . . . ,�7)µ1
Hence µ1 = (1765432), a 7-cycle. Check this!
Note that µ1 is independent of the tile permutation �.
Likewise for µ2 and µ3:
�1 �2 �3
�7 �6 �5 �4
(�1,�6,�2,�3,�4,�5,�7) = (�1, . . . ,�7)µ2
Hence µ2 = (26543), a 5-cycle (check).
�1 �2 �3
�7 �6 �5 �4
(�1,�2,�5,�3,�4,�6,�7) = (�1, . . . ,�7)µ3
Hence µ3 = (354), a 3-cycle (check).
7-Puzzle: Even Permutations
Now
sign(µ1) = (�1)6 = +1
sign(µ2) = (�1)4 = +1
sign(µ3) = (�1)2 = +1
so these are all even permutations.
�t = (567) can never equal �iµ1 · · ·µr = (57)µ1 · · ·µr as their signs are
di↵erent.
154
We now show that µ1, µ2, µ3 and their inverses generate A7, the alternating
group on seven symbols.
Recall that An is generated by all 3-cycles (i i+ 1 i+ 2), i = 1, . . . , n� 2.
A7 is generated by:
h1 = (123), h2 = (234), h3 = (345) = µ
�1
3
, h4 = (456), h5 = (567).
Relation to Alternating Group
We need the following result.
Lemma 172. If ↵ = (↵1↵2 . . .↵k) is a k-cycle and � 2 Sn then
�(↵1↵2 . . .↵k)�
�1 = (�(↵1)�(↵2) . . .�(↵k)).
Proof : Note that ↵ sends ↵i 7! ↵i+1, and consequently �↵��1 sends
�(↵i) 7! ↵i 7! ↵i+1 7! �(↵i+1).
Example:
µ1µ
�1
3
µ
�1
1
= µ1h3µ
�1
1
= (1765432)(345)(1234567) = (234) = h2.
Relation to Alternating Group
Likewise
µ
2
1
h3µ
�2
1
= (1642753)(345)(1357246) = (123) = h1,
µ
�1
2
h3µ2 = (23456)(345)(26543) = (456) = h4,
µ
�1
1
h4µ1 = (1234567)(456)(1765432) = (567) = h5.
So all the elementary 3-cycles in A7 can be written as compositions of the
7-puzzle permutations µ1, µ2 and µ3 and their inverses!
h1 = (123) = µ
2
1
µ
�1
3
µ
�2
1
,
h2 = (234) = µ1µ
�1
3
µ
�1
1
,
h3 = (345) = µ
�1
3
,
h4 = (456) = µ
�1
2
µ
�1
3
µ2,
h5 = (567) = µ
�1
1
µ
�1
2
µ
�1
3
µ2µ1.
Note: The inverse of a 3-cycle hi is h2i (as h
3
i = �I).
Puzzle Moves Generate An⇥m�1
Theorem 173 (Puzzle Moves Generate An⇥m�1). The three moves µ1, µ2
and µ3 and their inverses generate the group A7. Similarly, the moves of the
15-puzzle generate A15.
155
Note: For the 15-puzzle the snake-permutation has 9 moves and inverses.
To prove the result using the “grid permutation” representation requires
48 moves µi (and their inverses).
For each of the 48 moves (and inverses) need to
1. Compute the parity.
2. Write the 14 generators hi of A16 as products of the moves permutations.
For the snake-representation we need to write the 13 generators hi of A15
as products of the 9 moves permutations (and their inverses)
7-Puzzle
Example: Are these positions of a 7-puzzle related by legal moves?
4 1 2
7 3 6 5
6 2 1 5
4 3 7
Answer: The positions are not related by legal moves.
15-Puzzle
Example: Are these positions of a 15-puzzle related by legal moves?
1 3 2 4
5 6 7 15
12 10 11 9
13 14 8
15 2 3 4
7 5 6 8
9 10 11 12
1 14 13
Answer: The positions are not related by legal moves.
67 Designs
Designs – Introduction
Suppose we wish to conduct a taste test to compare 9 di↵erent varieties of
co↵ee. We have limited resources (and time), and so wish to do this as fairly
as possible. What are some considerations?
• Assign varieties of co↵ee to di↵erent households, but can’t have a house-
hold comparing all 9 varieties. Their opinion of the 9th variety would be
unreliable due to saturation of taste.
• Compare pairs of varieties.
156
• Want any given variety to be compared with all the others.
• For fairness, all pairs should be compared the same number of times.
How can we do this?
Designs – Introduction
Solution: Let the varieties of co↵ee be numbered 1, · · · , 9:
1 2 3
4 5 6
7 8 9
Now there are
�
9
2
�
= 36 possible pairs of co↵ee varieties.
Twelve households are each given 3 varieties to compare.
3 varieties has 3 pairs so each possible pairing amongst the 9 varieties will
be rated by exactly one household.
This way every variety is compared against the others by some household.
The varieties of co↵ee can be distributed as follows.
{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {1, 4, 7}, {2, 5, 8}, {3, 6, 9},
{1, 5, 9}, {2, 6, 7}, {3, 4, 8}, {1, 6, 8}, {2, 4, 9}, {3, 5, 7}.
Designs – Definition
The previously problem is an example of a combinatorial design prob-
lem.
The term design comes from the design of statistical experiments, i.e. the
field of “experimental design”.
Used in biology, social sciences, software testing and other fields.
More formally, a design is a (finite) family of subsets of a set which obeys
certain properties.
Definition 174 (Design). A design is a pair (X,A) such that
1. X is a set of elements (points, varieties).
2. A is a collection (multiset) of nonempty subsets of X called blocks.
Designs – Definition
Note: The properties or conditions onA can be very general but typically
involve incidence: set membership, set intersection and so on. For this reason,
a design as defined above is also called an incidence structure.
We will see that the study of designs is a “big” field. Block designs are
also related to (block) error correcting codes.
The general definition of designs is extremely broad and in particular A
may contain repeated blocks (hence multiset in the definition).
In the previous design, each household tests the same number of varieties,
and thus the opinion of each household has equal weight.
157
Designs – More Definitions
Each pair of varieties was in the same number of subsets (blocks).
Definition 175 (Designs). • A design with no repeated blocks is said to
be simple.
• A design is regular if every variety (element) occurs equally often. A
regular design is also called a 1-design.
• A block is complete if it contains every variety (incomplete otherwise).
• A design is incomplete if at least one block in incomplete.
• If x and y are any two di↵erent varieties in an incomplete design the
number of blocks containing both x and y is the covalency denoted
�xy.
• A balanced incomplete block design (BIBD), or a 2-design, is a
regular incomplete design in which �xy = � is constant.
Designs – Notation
Notation 176 (Designs). • # subsets = # blocks = b.
• # elements = # varieties = v.
• # subsets containing any one variety = r.
• # varieties in each subset = k (< b).
• # times a pair of varieties are compared = �.
A BIBD can be specified by these parameters in the order (b, v, r, k,�).
Note: We will prove in a following theorem that the number of appear-
ances of a given variety, r, is determined by v, k, and �.
A BIBD with parameters (b, v, r, k,�) is also called a 2-(v, k,�) design.
Example: Our previous example is a (12, 9, 4, 3, 1) design.
{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {1, 4, 7}, {2, 5, 8}, {3, 6, 9},
{1, 5, 9}, {2, 6, 7}, {3, 4, 8}, {1, 6, 8}, {2, 4, 9}, {3, 5, 7}.
Balanced Incomplete Block Designs
Theorem 177 (BIBD). In a balanced incomplete block design with (b, v, r, k,�),
r(k � 1) = �(v � 1).
Proof: Count the number of pairs a given variety, x is in, in two ways.
Method 1: The element x is in rx blocks (note, we don’t assume rx is the
same for each x). In each block, it can be paired with k � 1 other varieties,
and hence x is in rx(k � 1) pairs.
158
Method 2: The element x can be paired with v� 1 other varieties. Each
of these pairs must, by definition, appear � times in the design, and hence x
is in (v � 1)� pairs. Thus,
rx(k � 1) = �(v � 1) ) rx = �
v � 1
k � 1
.
Note: As a consequence of the fairness requirements (each block has the
same weight k and each pair is compared exactly � times) it follows that rx is
independent of x, and given by the equation above.
Regular Designs
Theorem 178 (Regular Designs). In any regular design, bk = vr.
Proof: Count the number of varieties in all of the blocks in two ways.
Method 1: There are v varieties, which each appear r times, and thus
the total number of appearances is vr.
Method 2: There are b blocks, which each contain k varieties, and thus
the total number of appearances is bk. Thus
rv = bk ) r =
bk
v
.
Note: Taken together rv = bk and r(k � 1) = �(v � 1), gives us two
equations between the 5 parameters of a BIBD; thus only 3 parameters can
be chosen independently. For this reason BIBD’s are often written simply as
(v, k,�).
Importantly, not all choices are possible, as b, v, r, k,� must be integers.
68 Incidence matrix
Incidence Matrix
Definition 179 (Incidence Matrix). A design can be represented by an in-
cidence matrix M where the rows represent blocks Si and the columns
represent varieties vj and the entries Mij = 1 if variety vj 2 Si, 0 otherwise.
Example: The incidence matrix for the (7, 7, 4, 4, 2) design
{3, 5, 6, 7}, {1, 4, 6, 7}, {1, 2, 5, 7}, {1, 2, 3, 6}, {2, 3, 4, 7}, {1, 3, 4, 5}, {2, 4, 5, 6}
is
M =
2
66666666
4
0 0 1 0 1 1 1
1 0 0 1 0 1 1
1 1 0 0 1 0 1
1 1 1 0 0 1 0
0 1 1 1 0 0 1
1 0 1 1 1 0 0
0 1 0 1 1 1 0
3
77777777
5
.
159
Incidence Matrix
Key facts about the incidence matrix of any (b, v, r, k,�) BIBD.
• There are b rows.
• Each row has k 1’s (number of varieties in a block).
• There are v columns.
• Each column has r 1’s (number of appearances of each variety).
• There are exactly � 1’s in common between any two columns.
Consider columns i and j: The values of entries are are both 1 on those
occasions when varieties i are j are in the same block. This must occur
exactly � times, by definition.
Properties of the Incidence Matrix
Theorem 180 (MTM). Let J be the v ⇥ v matrix consisting of all 1’s, then
the incidence matrix M of any BIBD satisfies
M
T
M = (r � �)I + �J.
Proof: Assume that we have i, j 2 {1, · · · , v}. Now,
(MTM)ij = (row i of M
T) · (column j of M)
= (column i of M) · (column j of M)
= r�ij + �(1� �ij)
= (r � �)�ij + � · 1.
Hence
M
T
M = (r � �)I + �J.
Inequalities for BIBD
Theorem 181 (Inequalities for BIBD). The parameters of any BIBD satisfies
b � v and r � k.
Proof: We have r(k�1) = �(v�1) ) � = r
k � 1
v � 1
< r since k < v,
so
160
det(MTM) =
�����������
r � � · · · �
� r � · · · �
� � r · · · �
...
...
� � � · · · r
�����������
=
�����������
r � � · · · �
�� r r � � 0 · · · 0
�� r 0 r � � · · · 0
...
...
�� r 0 0 · · · r � �
�����������
=
�����������
r + (v � 1)� � � · · · �
0 r � � 0 · · · 0
0 0 r � � · · · 0
...
...
0 0 0 · · · r � �
�����������
= (r + (v � 1)�)(r � �)v�1 = rk(r � �)v�1
6= 0.
Proof – Continued
det(MTM) = rk(r � �)v�1 6= 0.
Assume b < v.
Then we could add (v � b) additional rows of 0s to make a square matrix,
M1.
Now,
M
T
1
M1 = M
T
M by inspection additional rows just give 0⇥ 0
) det(MTM) = det(MT
1
M1) = det(M1)
2 = 0
But, det(MTM) 6= 0 so assumption is false. Thus b � v.
Since b � v, the relation bk = vr implies r � k.
69 Kirkman’s schoolgirl problem
Kirkman’s Schoolgirl Problem and Resolvable Designs
• Designs were a topic of recreational interest for their combinatorial prop-
erties in the 19th Century, pioneered by Steiner.
• The most famous design problem was Kirkman’s schoolgirl problem,
which was posed in 1850:
Every day of the week a teacher takes 15 schoolgirls on a walk. During
the walk the girls are grouped in triplets. Can the teacher construct
the triplets so that after the seven walks each pair of girls has walked in
the same triplet once and only once?
161
• AKirkman triple is a Steiner triple (v, 3, 1), with the additional condition
that it must be possible to partition the blocks such that each partition
contains each element once (the design is said to be resolvable).
• Question makes sense for number of schoolgirls v = 3, 9, 15, 21, · · · .
70 Steiner triple systems
Steiner Triple Systems
Definition 182 (Steiner Triple System). A Steiner triple system or order
v is a BIBD with parameters (v, 3, 1).
Note: Since BIBD’s with k = 2 are trivial Steiner triple system are the
simplest “interesting” designs.
Example: The simplest case is (7, 3, 1) so v = 7 varieties, k = 3 varieties
in each sub-set and each pair of varieties are compared � = 1 times.
Using our equalities from previous lecture we also have
r =
�(v � 1)
k � 1
=
6
2
= 3, rv = bk ) b = v = 7
The (unique) solution is:
{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}
Steiner Triple Systems
The incidence matrix of this design
{1, 2, 3}, {1, 4, 5}, {1, 6, 7}, {2, 4, 6}, {2, 5, 7}, {3, 4, 7}, {3, 5, 6}
is:
M =
0
BBBBBBBB
@
1 1 1 0 0 0 0
1 0 0 1 1 0 0
1 0 0 0 0 1 1
0 1 0 1 0 1 0
0 1 0 0 1 0 1
0 0 1 1 0 0 1
0 0 1 0 1 1 0
1
CCCCCCCC
A
Example: The first design we looked at is a (9, 3, 1) Steiner triple system.
This again is a unique solution (up to permutations and so on).
Steiner Triples
How many Steiner triples are there?
• (v, 3, 1) designs exist if and only if v = 1 or 3 (mod 6).
162
• Known up to v = 19, but it is a hard problem. No reason to think that
there is a simple closed form answer and so one has to count them via
computer.
• Online encylopedia of integer sequences (OEIS) # A030128, number
of Steiner triples: 1, 0, 1, 0, 0, 0, 30, 0, 840, 0, 0, 0, 1197504000, 0,
60281712691200, 0, 0, 0, 1348410350618155344199680000,· · ·
• # A030129, number of non-isomorphic (distinct) Steiner triples: 1, 0, 1,
0, 0, 0, 1, 0, 1, 0, 0, 0, 2, 0, 80, 0, 0, 0, 11084874829,· · ·
• Unless someone finds a better algorithm (which avoids generating all dis-
tinct triples), then finding out the answer for n = 21 will take hundreds
of thousands of CPU years.
Steiner Triple-(7,3,1)
(7,3,1) can be represented geometrically such that the elements 1, 2, · · · ,7
are points, and the blocks are lines.
This is the simplest example of a finite projective plane.
• Any 2 points lie on exactly 1 line.
• Any 2 lines meet at exactly 1 point.
• There are 4 points such that no line is incident with more than two of
them. This condition excludes trivial, non-interesting cases.
Finite projective planes of order n are equivalent to square designs of order
(n2 + n+ 1, n+ 1, 1).
Square Designs
Definition 183 (Square Designs). A square design (or symmetric design)
is a BIBD with b = v and since rv = bk it follows that r = k.
A square design is usually specified by the three parameters (v, k,�) only.
Square Designs
Theorem 184 (Rows of Square Designs). For square designs the number of
1’s agreeing in any two rows equals �.
163
Note: For any design the number of 1’s agreeing in any two columns is
�.
Proof: In Practice Class 11 you will prove that MTM = M MT.
Assume that we have i, j 2 {1, · · · , v}, with i 6= j. Now,
(MTM)ij = (row i of M
T) · (column j of M)
= (column i of M) · (column j of M)
= # of 1’s which match up between colums i and j = �
(M MT)ij = (row i of M) · (column j of MT)
= (row i of M) · (row j of M)
= # of 1’s which match up between rows i and j,
as required.
71 Resolvable designs
Resolvable Designs
Definition 185 (Resolvable Design). A design is said to be resolvable if the
subsets can be partitioned so that each partition contains each of the elements
once.
Note: A necessary condition for this to be possible is that k divide v.
Example: Our previous (12,9,4,3,1) deign is resolvable.
{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {1, 4, 7}, {2, 5, 8}, {3, 6, 9},
{1, 5, 9}, {2, 6, 7}, {3, 4, 8}, {1, 6, 8}, {2, 4, 9}, {3, 5, 7}.
{123} {456} {789}
{147} {258} {369}
{159} {267} {348}
{168} {249} {357}
Properties of Kirkman Triples and Resolvable Designs
Definition 186 (Parallel Classes and Resolvable Designs). Let (X,A) be a
BIBD with parameters (v, k,�). A parallel class in (X,A) is a subset of
disjoint blocks from A whose union is X. A partition of A into r parallel
classes is called a resolution, and (X,A) is said to be a resolvable BIBD if
A has at least one resolution.
For Kirkman triples (v, 3, 1) we have
• r =
�(v � 1)
k � 1
=
1
2
(v � 1), b =
vr
k
=
1
6
v(v � 1).
• # parallel classes = r, # blocks per class = v/k.
164
Theorem 187 (Inequality for Resolvable Designs). For resolvable designs,
the inequality b � v is strengthened to b � v + r � 1.
Note: This was proved by Bose in 1942.
Solutions for Kirkman Triples
• v = 3: trivial, the 3 schoolgirls take a walk (note this is not an incomplete
block as b = k).
• v = 9: corresponds to our earlier (12,9,4,3,1) design.
1 2 3
4 5 6
7 8 9
1 4 7
2 5 8
3 6 9
1 5 9
2 6 7
3 4 8
1 6 8
2 4 9
3 5 7
Only 1 solution in this case.
Solution can be obtained from a simple geometric design. Take a circle and
place 8 points equidistantly on the perimeter with a 9th point at the centre.
This construction is from the chapter “Dinner Guests, Schoolgirls and
Handcu↵ed Prisoners”, in The Last Recreations, by Martin Gardner. Many
nice constructions and fun problems - highly recommended reading.
Solution for Kirkman Triple, v = 9
1
2
3
4
5
6
7
8
9
Solution for Kirkman Triple, v = 9
Now rotate the circle in either direction one step at a time to four di↵erent
positions. (The fifth step brings the pattern back to what it was originally.)
At each step copy down the triplet indicated by the ends and the centre
of the straight line and the two triplets indicated by the corners of the two
triangles. The three triplets found at each of the disk’s four position give the
triplets for each of the four days.
Labels of points on circle change as a 7! (a mod 8) + 1, while 9 stays put.
So
165
1 5 9
2 7 8
3 4 6
1 3 8
2 6 9
4 5 7
1 2 4
3 7 9
5 6 8
1 6 7
2 3 5
4 8 9
This solution seems to be di↵erent from the design given above for the
schoolgirl problem, but by substituting
1 7! 1, 2 7! 5, 3 7! 7, 4 7! 9, 5 7! 3, 6 7! 8, 7 7! 6, 8 7! 4, 9 7! 2
we get the original design.
Solution for Kirkman triple, v = 15
Since 1922 the case of v = 15 has been known to have seven basic solutions.
They can be generated by di↵erent patterns of triangles, with or without
a diameter line.
One pattern of five triangles is shown below. In this case the disk must be
rotated two units at a time to seven di↵erent positions. At each position the
corners of each triangle provide one of the five triplets for that day.
So labels of points: a 7! (a+ 1 mod 14) + 1, a 14, 15 7! 15.
1 2 15
3 7 10
4 5 13
6 9 11
8 12 14
1 6 7
2 10 14
3 4 15
5 9 12
8 11 13
· · · · · ·
1 5 8
2 3 11
4 7 9
6 10 12
13 14 15
Geometric Solution for Kirkman Triple, v = 15
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
72 Hadamard matrices
Hadamard Matrices
166
Definition 188 (Hadamard Matrix). A Hadamard matrix is a n⇥n matrix
with ±1 entries which satisfies
HH
T = nI.
Note: By the definition the rows of a Hadamard matrix are orthogonal.
det(HHT) = det(H)2 = nn, so | det(H)| = nn/2.
This is the maximum possible value for ±1 matrices.
HH
T = nI implies H�1 = 1nH
T and since H�1H = I we also have
H
T
H = nI.
So HT is also a Hadamard matrix.
It also shows that the columns of a Hadamard matrix are orthogonal.
Hadamard Matrices
Example: For n = 1 and 2 we have
H1 =
�
+1
�
H2 =
✓
+1 +1
�1 +1
◆
It is possible to construct larger Hadamard matrices from smaller ones via
matrix tensor product, e.g.,
H4 = H2 ⌦H2 =
✓
+H2 +H2
�H2 +H2
◆
=
0
BB
@
+1 +1 +1 +1
�1 +1�1 +1
�1�1 +1 +1
+1�1�1 +1
1
CC
A
Note: This construction can be repeated so
H8 = H2 ⌦H4 =
✓
+H4 +H4
�H4 +H4
◆
This shows that there are Hadamard matrices of every size n = 2k, k 2 N.
Size of Hadamard Matrices
Theorem 189 (Size of Hadamard Matrices). Hadamard matrices must have
size n = 1, 2, or 4k.
Proof: Assume that n � 3. Choose the Hadamard matrix to have its
first row is all 1’s (just multiply columns by ±1 as required).
All other rows are orthogonal to the first, and so have n/2 +1’s and n/2
�1’s.
Permute columns so that all 1’s on the second row are to the left.
Finally, permute the columns within the left half so that the +1’s on the
third row appear before the �1’s, and likewise for the right half. Then
167
H =
0
BBB
@
+1 · · ·+1 +1 · · ·+1 +1 · · ·+1 +1 · · ·+1
+1 · · ·+1 +1 · · ·+1 �1 · · ·�1 �1 · · ·�1
+1 · · ·+1 �1 · · ·�1 +1 · · ·+1 �1 · · ·�1
...
...
...
...
1
CCC
A
.
Let the number of columns that start + + + equal a, that start + + �
equal b, that start +�+ equal c, that start +�� equal d.
Size of Hadamard Matrices
Let the number of columns that start + + + equal a, that start + + �
equal b, that start +�+ equal c, and that start +�� equal d.
Clearly,
a+ b+ c+ d = n. (3)
The rows are orthogonal so we get by taking dot-products of rows 1 and
2:
a+ b� c� d = 0. (4)
And similarly from rows 1 and 3 and rows 2 and 3 we get
a� b+ c� d = 0, (5)
a� b� c+ d = 0. (6)
Now (2) + (3)) a = d, (2) + (4)) a = c, (3) + (4)) a = b, so
4a = n.
Since a is an integer, the result follows.
Hadamard Matrices and Square Designs
Theorem 190 (Hadamard Matrices and Square Designs). Hadamard matrices
of size n = 4k are in bijection to incidence matrices of square designs with
parameters (4k � 1, 2k � 1, k � 1).
Proof: Let n = 4k. Set first row and first column to all +1’s.
All other rows and columns must have n/2 +1’s and n/2 �1’s.
Delete the first row and column.
Convert the �1’s to 0’s, and call the matrix M .
Consider M as an incidence matrix: it has b = 4k � 1, v = 4k � 1, there
are 2k � 1 1’s in each row and column, and the number of 1’s which align in
any two rows (or columns) is k � 1.
This last fact is equivalent to our derivation that a = n/4 above. Since
rows 2 and 3 are not special, any 2 rows must have exactly n/4 pairs of 1’s in
common columns, n/4 pairs of 0’s.
Thus M is the incidence matrix of a (4k�1, 2k�1, k�1) square design.
168
Hadamard Conjecture
The key open question in this topic is whether or not this statement is
true:
Conjecture 191 (Hadamard). Hadamard matrices of size 4k ⇥ 4k exists for
all k 2 N.
Note: Size n = 668 = 4⇥ 167 smallest unresolved case, i.e., it is known
that Hadamard matrices exists for k 166.
• Topic of much interest.
• Deep connections to other topics including error-correcting codes.
• Some asymptotic results exist, based on constructions and recursions,
but they are not su�ciently strong.
• Major advance will be needed to find a Hadamard matrix of size n = 668,
let alone answer the general question.
73 Error correcting codes
Error Correcting Codes
Context: We are attempting to transmit a message in binary code, and
wish to avoid loss or corruption of data in case of noise, which may “flip” some
of the binary digits.
Definition 192 (Codewords and Hamming Distance). A codeword is a bi-
nary word, i.e., a word consisting of 0’s and 1’s. The Hamming distance p
between codewords is defined as the number of digits which are di↵erent, e.g.
for
p = 4 :
1 1 0 1 0 0 0
0 1 1 0 1 0 0
" " " "
Note: If
j
p�1
2
k
digits are reversed then the altered codeword is closest
to a unique codeword in the set, and can be corrected.
Error Correcting Codes
Example: Suppose we have just two codewords
1 1 0 1 0 0 0
0 1 1 0 1 0 0
" " " "
Here p = 4 ) then
⌅
4�1
2
⇧
= 1 bit can be corrected.
If the message received was 0 1 0 1 0 0 0. This has distance 1 from the
first code word, and distance 3 from the second codeword, and so could be
corrected.
169
If the message received was 0 1 0 0 0 0 0. This has distance 2 from the
first code word, and distance 2 from the second codeword, and so the message
cannot be corrected.
Error Correcting Codes and Designs
The topics of block codes and block designs are intimately connected.
The first error correcting code, invented by Hamming in 1950, is related
to the (7,4,1) square design and projective plane.
To convert a square design to a code:
• Take each row as a codeword.
• Each codeword has k 1s and v � k 0s.
• Each pair of codewords have � 1s in common.
• The remaining (k��) 1s in each codeword must be aligned with zeroes.
• Thus there are 2(k��) di↵erent bits, which can correct k���1 errors.
Error Correcting Codes from Hadamard Matrices
• From the square design corresponding to a Hadamard matrix one obtains
a code that can correct n/4� 1 errors.
• Some of these codes are linear, which makes them easily computable.
• In 1961, using the computational resources of JPL, Golomb and Baumert
found the first examples of Hadamard matrices of size 92⇥92 and 184⇥
184.
• In 1971 Mariner 9 used the code derived from the Hadamard matrix
n = 32.
170
74 Pattern Avoiding Permutations
Pattern Avoiding Permutations
• In the 1st edition of Vol. 1 of “The Art of Computer Programming”,
Knuth asked which permutations various data structures can generate
or sort.
• One question was “which permutations could be generated by a single
stack?” Knuth solved this problem completely.
• Answer: � is a stack sortable permutation if and only if it does not
involve the permutation 231.
• Trying to sort 231 with a single stack:
Two Stacks in Parallel or Series
• In the problem section Knuth asked the same question of several other
structures, such as two stacks in parallel. This can sort 231.
• Two stacks in series can sort 2431, which a single stack can’t.
171
Patterns and Pattern Avoidance
Definition 1 (Pattern). Let � 2 Sn be a permutation of length n and ⌧ 2 Sk
a permutation of length k, with k n. ⌧ occurs as a pattern in � if for a
subsequence of � of length k the elements of the subsequence occur in the
same relative order as the elements of ⌧.
Example: 1324 occurs twice as a pattern in 153264: 153264 and 153264
Definition 2 (Pattern Avoiding Permutations). If a permutation ⌧ does not
occur in �, � is said to be a pattern avoiding permutation (PAP) with
respect to ⌧.
Notation 3 (Number of Pattern Avoiding Permutations). The number of
permutations � 2 Sn avoiding ⌧ will be denoted as Sn(⌧).
Grid Representation of Permutations
One can represent � 2 Sn on a n ⇥ n square grid by filling in the cells
(i,�i).
Example: The grid representation of 47328156 2 S8:
The pattern 231 The pattern 132 The pattern 1423
Pattern Avoiding Permutations
Theorem 4 (Number of 132 Avoiding Permutations). The number of permu-
tations, Sn(132), of length n avoiding the pattern 132 is Cn, the n’th Catalan
number.
Proof: Consider any permutation � 2 Sn avoiding 132.
Assume that �i = n, that is, n appears in position i.
Any entry to the left of n must be larger than any entry to the right of n.
If not we could find �l < �r such that �l · · ·n · · ·�r forms a 132 pattern.
This now means that the entries to the left of n and those to the right of
n must both avoid 132 independently of each other.
172
Now, n can appear anywhere, so
Sn(132) =
nX
i=1
Si�1(132)Sn�i(132), S0(132) = 1.
Beyond Knuth
• Knuth’s work inspired responses by Tarjan (1972) and Pratt (1973).
• Pratt noted that the questions about permutations were more general,
and of more interest, than the restricted questions on data structures.
• Since then, the study of PAPs has become an active field in its own right.
• Stanley and Wilf conjectured (late 1980s) that for every permutation ⌧ ,
the number Sn(⌧) of length n PAPs is at most
n.
Note: There are n! permutations so non-trivial conjecture.
• Arratia (1999) observed that this is equivalent to the convergence of
= lim
n!1
Sn(⌧)
1/n.
• This was proved by Marcus and Tardos in 2004.
• Key question. Classify PAPs by the possible values of . PAPs with the
same form a so-called Wilf class.
Wilf Classes for Patterns with n < 4.
• For n = 2, ⌧ = 12 or ⌧ = 21 and only the strictly decreasing/increasing
permutations avoid these patterns, respectively.
• For n = 3, there are six permutations, forming 3 classes as the problem
is symmetrical.
For a given pattern ⌧ look at the reverse pattern ⌧ 0 (for ⌧ = 132 we have
⌧ 0 = 231). If � avoids ⌧ then the reverse of � avoids ⌧ 0. This sets up an
easy bijection to show Sn(⌧) = Sn(⌧
0).
• For all 3 classes, Sn(⌧) = Cn, the n’th Catalan number.
• Proof that Sn(132) = Sn(312).
Consider the complement � of � 2 Sn defined by �i = n+ 1� �i.
Now 312 is the complement of 132 and if � avoids 132 then � avoids 312
and vice versa again setting up a simple bijection.
173
Left-Right Minima
Definition 5 (Left-Right Minimum). A left-right minimum of a permutation
� is an index �j such that �j < �i, i < j.
�j is a left-right minimum if all entries appearing before �j and larger than
�j .
Note: The left-right minima of � form a decreasing subsequence.
Example: ✓
1 2 3 4 5 6
4 6 3 5 1 2
◆
has three left-right minima, 4, 3 and 1.
Note: �1 (leftmost entry) and the entry 1 are always left-right minima.
Wilf Class for 123
Theorem 6 (Wilf Class for 123). For all n > 0, Sn(123) = Sn(132).
Proof: We define a bijection from 132– to 123–avoiding permutations.
Take a 132–avoiding permutation and fix the left-right minima in place.
Remove all the elements that are not left-right minima.
Put the removed elements back into the empty slots in decreasing order.
Note that this will not change the set of left-right minima. Why is this?
The resulting permutation in a union of two decreasing sequences and it
is therefore 123-avoiding.
Wilf Classes for Patterns with n = 4.
• For n = 4, it turns out that the 24 possible permutations give rise to 3
distinct Wilf classes.
• For 1234, Gessel (1990) proved that
Sn(1234) =
1
(n+ 1)2(n+ 2)
nX
k=0
✓
2k
k
◆✓
n+ 1
k + 1
◆✓
n+ 2
k + 1
◆
,
Sn(1234) ⇠
81
p
3
16⇡
· 9n · n�4.
• The associated generating function is a 3rd order linear, homogeneous
ODE whose solution is known in terms of hypergeometric functions.
�
x2 � 11×3 + 19×4 � 9×5
�d3G1234(x)
dx3
+
�
9x� 90×2 + 153×3 � 72×4
�d2G1234(x)
dx2
+
�
16� 154x+ 264×2 � 126×3
�dG1234(x)
dx
�
�
32� 72x+ 36×2
�
G1234(x)
= 0
174
The second class, 1342.
• For 1342, Bóna (1997) showed that
Sn(1342) = (�1)n
(2 + 3n� 7n2)
2
+6
nX
i=2
(�1)n�i 2i
(2i� 4)!
i!(i� 2)!
✓
n� i+ 2
2
◆
.
• The generating function satisfies the linear ODE
(1� 7x� 8×2)
d2G1342(x)
dx2
+ (8� 28x)
dG1342(x)
dx
� 12G1342(x) = 0.
• It can be exactly solved
G1342(x) =
32x
1 + 20x� 8×2 � (1� 8x)3/2
,
giving
Sn(1342) ⇠
64
243
p
⇡
8nn�5/2.
The third class, 1324.
• This class remains unsolved. Can prove
Sn(1342) < Sn(1234) < Sn(1324) for n > 6.
• is not known. Rigorous bounds are 9.81 < < 13.002. • Johansson and Nakamura (2014) calculated Sn(1324) for n 30. • Conway, Guttmann and Zinn-Justin (2017) improved this to n 50. • They conjectured Sn(1324) ⇠ B · n · µ p n 1 · n g, where = 11.600± 0.003, µ1 = 0.0400± 0.0005 and g = �1.1± 0.1. • The stretched exponential term µ p n 1 is a new feature that had not pre- viously been reported and it is not proved. 75 Rubik’s cube Rubik’s cube objectives 175 • Prove or characterise which configurations can be legally obtained. • Idea: Map configurations to permutations and moves to the action of a permutation on the configuration. • Calculate the number of di↵erent colour patterns possible by rotating the six outer faces of the cube. Moves There are two types of move: • Middle moves and Face move Middle moves: only middle layer moves (picture) • Middle right, down or back All can be achieved by face moves so won’t be considered any further. 176 Face moves The outer face moves are called front (F ) right (R) up (U) back (B) left (L) down (D) and refer to the position of the faces relative to the person holding the cube. F R U Rotations are words in {F,L,B, U,R,D} (clockwise rotations by 90� and their inverses F�1, L�1, B�1, U�1, R�1, D�1 (anti-clockwise rotations). Example: The set of face moves: U�1RU, (reading right to left) denotes rotating the up face 90� clockwise, then rotating the right face 90� clockwise and finally rotating the up face 90� anti-clockwise. 75.1 Configurations Configurations How do we characterise all possible configurations? We can treat the following as (almost) independent pieces of information • position of corner cubes (with 3 visible faces) • position of edge cubes (with 2 visible faces) • orientation of corner cubes • orientation of edge cubes Corners Label corners: 177 1 2 3 4 5 6 7 8 The elementary rotations can be expressed as permutations of the corner labels: F = (1234), i.e. 1 goes to 2, 2 goes to 3, 3 goes to 4 and 4 goes to 1. Corners Notation 7 (Elementary Moves). Each elementary move corresponds to a permutation ⇢ 2 S8 F = (1234) F�1 = (1432) B = (5876) B�1 = (5678) L = (1562) L�1 = (1265) R = (3784) R�1 = (3487) U = (2673) U�1 = (2376) D = (1485) D�1 = (1584) Example: What is the e↵ect of U�1R�1UR? (2376)(3487)(2673)(3784) = (34)(67) i.e. corners 3 and 4 are swapped, as well as corners 6 and 7. Corners 1 2 3 4 5 6 7 8 Example: Without computing, show that ⌧ = F�1UF�1URU�1 cannot interchange two corners leaving others fixed. • A single pair swap is odd: ⇢ = (ij). • All elementary moves are 4-cycles, and hence odd. 178 • ⌧ consists of six elementary moves, and hence is even, and therefore cannot swap a single pair. Edges Label the edges: 1 2 3 4 5 6 7 8 9 a b c Each elementary move corresponds to a permutation � 2 S12. F = (1234) F�1 = (1432) B = (9cba) B�1 = (9abc) L = (1596) L�1 = (1695) R = (37b8) R�1 = (38b7) U = (26a7) U�1 = (27a6) D = (48c5) D�1 = (45c8) Note: Same elementary moves but di↵erent permutations. Positions Thus the positions of all cubes represented by a pair of permutations (⇢,�) with ⇢ 2 S8 and � 2 S12 with composition rule (⇢0,�0) � (⇢,�) = (⇢0 � ⇢,�0 � �). Example: Compute the e↵ect on the cube positions of the movesB�1RB. B�1RB = ⇣ (5678), (9abc) ⌘⇣ (3784), (37b8) ⌘⇣ (5876), (9cba) ⌘ = ⇣ (3854), (37c8) ⌘ B�1RB cyclically permutes the corner cubes 3, 8, 5 and 4 as well as cy- clially permutes the edge cubes 3, 7, c and 8. Orientations What about orientations? Define a rotation with respect to a mark or standard configuration: ) 179 With respect to the standard (= solve) configuration, label each corner cube 0, 1, 2 clockwise with the 0 inside the mark: 0 12 Orientations Under F , corner 3 goes to corner 4, and the orientation of corner 4 becomes that of corner 3 rotated anti-clockwise over 2⇡/3. In mod 3 arithmetic: 0 12 corner at 3 F�! 1 20 corner at 4 mod 3 arithmetic Let xi be the value of the orientation variable in mark of corner i (i.e. in the solved configuration xi = 0 for all i). 0 12 �! 1 20 1 20 �! 2 01 2 01 �! 0 12 Are all of the form xi 7! xi + 1 (mod 3). Orientations Example: F (x1, x2, x3, x4, x5, x6, x7, x8) = (x4 + 2, x1 + 1, x2 + 2, x3 + 1, x5, x6, x7, x8) or R(x1, x2, x3, x4, x5, x6, x7, x8) = (x1, x2, x4 + 1, x8 + 2, x5, x6, x3 + 2, x7 + 1) or U(x1, x2, x3, x4, x5, x6, x7, x8) = (x1, x3, x7, x4, x5, x2, x6, x8) 180 All calculations are (mod 3). So corner configurations are labeled by 8-tuples (x1, x2, x3, x4, x5, x6, x7, x8), and a key observation is that for cube moves x1 + x2 + . . .+ x8 = 0 (mod 3). Configurations How do we characterise all possible configurations? We can treat the following as (almost) independent pieces of information • position of corner cubes (with 3 visible faces) • position of edge cubes (with 2 visible faces) • orientation of corner cubes • orientation of edge cubes We have almost characterised cube configurations using triples (⇢,�, x): • ⇢ 2 S8 labels corner positions • � 2 S12 labels edge positions • x = (x1, x2, x3, x4, x5, x6, x7, x8) with x1 + x2 + . . . + x8 = 0 (mod 3) labels corner orientations Edge Orientations Edges come in two orientations, so we put marks and associate a variable yi with each edge cube with yi = 0 if the mark is present, and yi = 1 if not. 0 0 1 1 0 0 1 1 0 0 1 1 1 2 3 4 5 6 7 8 9 a b c Example: So mod 2 we get R(y1, . . . , yc) = (. . . , 3 y8 + 1, . . . , 7 y3 + 1, 8 yb + 1, . . . , b y7 + 1, yc). 181 Configurations So a configuration of the cube is represented by (⇢,�, x, y) where • ⇢ 2 S8 are corner positions • � 2 S12 are edge positions • x = (x1, . . . , x8) with x1+ x2+ . . . + x8 = 0 (mod 3) are corner orien- tations • y = (y1, . . . , yc) with y1+y2+. . .+yc = 0 (mod 2) are edge orientations As a group this is S8 ⇥ S12 ⇥ Z73 ⇥ Z 11 2 /Z2 • not Z83 because one corner is fixed as a reference • not Z122 because one edge is fixed as a reference • quotient of Z2 because mixed parities of ⇢ and � are not allowed. The group structure implies that there exist moves that, say, change posi- tion of corners without changing edge cubes or orientations. Cubology Theorem 8 (First Law of Cubology). For any sequence of elementary moves: 1. sgn(⇢) = sgn(�) 2. x1 + x2 + . . .+ x8 = 0 (mod 3) 3. y1 + y2 + . . .+ y12 = 0 (mod 2) Proof : First, this holds for the standard configuration. 1. Elementary moves on corners and edges are 4-cycles ) all odd parity. Corners and edges change simultaneously so parities are “in sync”. 2. Check that this holds for all elementary moves: • xi’s in U or D do not change • xi’s in R,L, F or B: two xi’s increase by 1 (mod 3) and two de- crease by 1 (mod 2), so no change in x1+x2+ . . .+x8 = 0 (mod 3) 3. Check explicitly that for all moves U,D,R,L, F and B the sum y1+y2+ . . .+ y12 = 0 (mod 2) does not change. Cubology Theorem 9 (Existence of Move Sequence). Every configuration satisfying the First Law can be attained by a sequence of elementary moves. Sketch of proof : The six-step strategy in the printed notes shows that any configuration can be obtained from the standard configuration (by “reverse solving”). 182 Cubology Theorem 10 (Number of Configurations). The number of possible configura- tions of Rubik’s cube is (consistent with First Law) 1 12 · 8! · 12! · 38 · 212 ⇡ 4.3⇥ 109. Proof: First we count the sizes of the 4 contributions to the configurations • 8! = |S8|. • 12! = |S12|. • 38 ) each xi 2 {0, 1, 2}. • 212 ) each yi 2 {0, 1}. But there are some constraints which reduce the count: • 12 as sgn(⇢) = sgn(�). • 13 as last corner cannot be chosen arbitrarily. • 12 as last edge cannot be chosen arbitrarily. 183