CPSC 320: Steps in Algorithm Design and Analysis Solutions*
In this worksheet, you will practice five useful steps for designing and analyzing algorithms, starting from a possibly vague problem statement. These steps will be useful throughout the class. They could also be useful when you find yourself thinking on your feet in an interview situation. And hopefully they will serve you well in your work post-graduation too!
We will use the Stable Matching Problem (SMP) as our working example. Following the historical literature, the text formulates the problem in terms of marriages between men and women. We will avoid the gender binaries inherent in that literature and use employers and job applicants instead. Imagine for example the task faced by UBC's co-op office each semester, which seeks to match hundreds of student applicants to employer internships. To keep the problem as simple as possible for now, assume that every applicant has a full ranking of employers and the other way around (no ties).
Step 1: Build intuition through examples.
1. Write down small and trivial instances of the problem. We will use the words instance and input interchangeably.
SOLUTION: Here’s an instance with three employers and three applicants, with preferences listed in order from most to least preferred.
e1: a2 a1 a3
e2: a1 a2 a3
e3: a2 a1 a3
a1: e3 e1 e2
a2: e3 e2 e1
a3: e2 e1 e3
And here's one with two employers and two applicants:
e1: a1 a2
e2: a1 a2
a1: e1 e2
a2: e2 e1
Neither of these is trivial, since there is more than one way to match employers and applicants. It's tempting to say that the smallest possible instance has one employer and one applicant. Could we go even smaller? Degenerate cases are often helpful as base cases in algorithm design and analysis; here the degenerate case has zero employers and zero applicants.
2. Write down potential solutions for your instances. Are some solutions better than others? How so?
SOLUTION: One potential solution for our instance of size three is the set M = { (e1, a1), (e2, a2), (e3, a3) }. An alternative solution is the set M′ = { (e1, a1), (e2, a3), (e3, a2) }, drawn below. This one seems better than the first, since both e3 and a2 rank each other first and are matched with each other. Admittedly, e2 gets a lower-ranked applicant in M′ than in M.
e1 ↔ a1
e2 ↔ a3
e3 ↔ a2
* Copyright Notice: UBC retains the rights to this document. You may not distribute this document without permission.
Step 2: Develop a formal problem specification
1. Develop notation for describing a problem instance. What quantities or data (such as numbers, sets, lists, etc.) matter? Give them short, usable names. Think of these as input parameters to the algorithm code. Use your earlier examples to illustrate your notation.
SOLUTION: You may have come up with different names and quantities, but here are some useful ones.
n ≥ 0, the number of employers and the number of applicants.
E, the set of employers {e1,e2,…,en} (so, n = |E|)
A, the set of applicants {a1,a2,…,an} (again, n = |A|)
PE[i], a preference list for each employer ei. This can be a permutation of [1..n].
We could use PE to denote the list of all employers' preference lists (so PE is a list of lists).
PA[j], a preference list for each applicant aj, also a permutation of [1..n].
We could use PA to denote the list of all applicants' preference lists.
The quantity n and lists of lists PE and PA fully describe a problem instance:
◃ n ≥ 0 is the number of employers and also the number of applicants
◃ PE is the collection of complete preference lists (>e) of the employers
◃ PA is the collection of complete preference lists (>a) of the applicants
function SMP-Algorithm(n,PE,PA)
◃ return a stable matching M for the stable matching instance (n,PE,PA)
…
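To make the notation concrete, here is one way (a minimal Python sketch, with variable names of our choosing) to encode the three-employer instance from Step 1 using this notation:

    # n, PE, PA for the size-three instance from Step 1.
    # PE[i-1] is employer e_i's preference list; PA[j-1] is applicant a_j's list.
    # Employers and applicants are referred to by their 1-based numbers.
    n = 3
    PE = [[2, 1, 3],   # e1: a2 a1 a3
          [1, 2, 3],   # e2: a1 a2 a3
          [2, 1, 3]]   # e3: a2 a1 a3
    PA = [[3, 1, 2],   # a1: e3 e1 e2
          [3, 2, 1],   # a2: e3 e2 e1
          [2, 1, 3]]   # a3: e2 e1 e3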
2. Develop notation for describing a potential solution. Use your earlier examples to illustrate your notation.
SOLUTION: A potential solution (or just solution) for SMP is a set M of employer-applicant pairs, where each employer and each applicant appears in exactly one pair. We call such a set a perfect matching. We use (ei,aj) to indicate that employer ei and applicant aj are matched.
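Continuing the sketch above (hypothetical names, not part of the worksheet), the two potential solutions M and M′ from Step 1 can be written as sets of (employer, applicant) index pairs:

    M = {(1, 1), (2, 2), (3, 3)}          # M  = { (e1, a1), (e2, a2), (e3, a3) }
    M_prime = {(1, 1), (2, 3), (3, 2)}    # M′ = { (e1, a1), (e2, a3), (e3, a2) }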
3. Describe what you think makes a solution good.
SOLUTION: With respect to a given matching M, an instability is a pair (e, a′) ∈ E × A that is not in M, where (e, a) ∈ M and (e′, a′) ∈ M, but e prefers a′ to a and a′ prefers e to e′. A solution is good if it contains no instabilities, in which case we say that the solution is stable. That is, a stable solution is "self-enforcing" in the sense that no employer and applicant who aren't matched to each other would both prefer to abandon their assigned matches for each other.
We will write a′ >e a to mean “e prefers a′ to a” and similarly e′ >a e to mean “a prefers e′ to e”.
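These comparisons are cheap to evaluate if we precompute, for each preference list, the rank (position) of every partner; then a′ >e a is a single comparison. A minimal Python sketch (the helper names are ours, not the worksheet's):

    def ranks(prefs):
        # Map each partner to its position in a preference list; smaller rank = more preferred.
        return {p: pos for pos, p in enumerate(prefs)}

    def prefers(rank_of, x, y):
        # True if x is ranked above (preferred to) y in this list.
        return rank_of[x] < rank_of[y]

    # Example with e1's list a2 a1 a3: does e1 prefer a2 to a1?
    # prefers(ranks([2, 1, 3]), 2, 1) evaluates to True.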
Step 3: Identify similar problems. What are the similarities?
SOLUTION: We haven’t seen many problems and algorithms yet, but as the course goes on, we will have more to draw on here. If you’ve worked with bipartite graphs and matching problems, anything associated with them seems promising, especially maximum matching. This might also feel a bit like an election or auction, which takes us toward game theory. There are many other possibilities. (The point is not to be “right”; it is to have potential tools on hand. As you collect more tools, you will start to judge which are more promising and which less.)
Step 4: Evaluate simple algorithmic approaches, such as brute force.
1. Design a brute force algorithm for the SMP problem.
A "brute force" algorithm has the form given in the following pseudocode. Flesh out the details: what are the problem instance (i.e., input) and potential solution in this case? How would you test whether a potential solution is a good solution? Do not worry about how you would generate every possible perfect matching.
SOLUTION: Our brute force algorithm for SMP is as follows:
function SMP-Brute-Force(n, PE, PA)
◃ n ≥ 1 is the number of employers and also the number of applicants
◃ PE is the collection of complete preference lists (>e) of the employers
◃ PA is the collection of complete preference lists (>a) of the applicants
for each perfect matching M (i.e., potential solution) do
    if M is stable (i.e., a good solution) then
        return M
return "no stable matching"
Next, we flesh out our algorithm to check whether a potential solution is a good solution, i.e., whether a perfect matching is stable. This algorithm enumerates employer-applicant pairs (e, a′) and checks whether (1) they are not matched to each other and (2) they'd rather be with each other than with their matches; if both hold, the pair is an instability and the matching is not stable.
function IsStable(M)
◃ M is a perfect matching
◃ return true if M is a stable matching for instance (n, PE, PA), and false otherwise
for all e ∈ E do
    for all a′ ∈ A do
        if (e, a′) ∉ M then
            find a such that (e, a) ∈ M
            find e′ such that (e′, a′) ∈ M
            if e >a′ e′ and a′ >e a then    ◃ (e, a′) is an instability
                return false
return true
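Here is how the same check could look in Python, a sketch assuming the instance representation from Step 2 (1-based indices) and a matching given as a set of (employer, applicant) pairs; it returns False as soon as an instability is found:

    def is_stable(n, PE, PA, M):
        # Who is matched with whom (M is a perfect matching, so every index appears once).
        match_of_e = {e: a for (e, a) in M}   # employer -> its applicant
        match_of_a = {a: e for (e, a) in M}   # applicant -> its employer
        # rank_E[e-1][a] = position of applicant a in employer e's list (smaller = preferred).
        rank_E = [{a: pos for pos, a in enumerate(p)} for p in PE]
        rank_A = [{e: pos for pos, e in enumerate(p)} for p in PA]
        for e in range(1, n + 1):
            for a_prime in range(1, n + 1):
                if match_of_e[e] == a_prime:
                    continue                   # (e, a_prime) is already in M
                a = match_of_e[e]              # e's current match
                e_prime = match_of_a[a_prime]  # a_prime's current match
                # (e, a_prime) is an instability if each prefers the other to its match.
                if rank_E[e - 1][a_prime] < rank_E[e - 1][a] and rank_A[a_prime - 1][e] < rank_A[a_prime - 1][e_prime]:
                    return False
        return True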
2. Analyze the worst case running time of brute force.
Count the number of iterations of the for loop. We need to count the number of potential solutions, i.e., perfect matchings. Imagine that the n applicants are arranged in some order. How many different ways can we arrange (permute) the n employers next to them?
SOLUTION: There are n applicants that can be lined up with the first employer. Once we've chosen the first applicant, there are n − 1 to line up next to the second. Then, n − 2 next to the third, and so on. Overall, then, that's n × (n − 1) × (n − 2) × … × 2 × 1 = n!. There are n! perfect matchings, or potential solutions. That's already super-exponential! In the worst case, since the order in which solutions are enumerated is not specified, there could be n! iterations of the for loop, e.g., when there is exactly one stable matching. (Can you find an instance of size n with exactly one stable matching?)
Bound the running time of your procedure to test if a potential solution is a good solution.
SOLUTION: Let's assume that the basic steps of the algorithm, like comparing e's preferences for applicants a and a′ and checking if (e, a) is in M, can be done in O(1) time. It's not immediately obvious how to do this, but with some careful use of data structures and O(n²) preprocessing, it's doable. For example, suppose that the input M is represented as a list of pairs (e, a) such that e is matched with a, where 1 ≤ e, a ≤ n. We can build an array EmployerMatches[1..n], whose e-th entry is a. This can be done in O(n) time. Then, we can determine which applicant is matched with a given employer e in O(1) time. We can build a similar array, ApplicantMatches[1..n], whose a-th entry is e, so that we can determine which employer is matched with a given applicant a in O(1) time. So the body of the inner loop runs in O(1) time. Each loop iterates at most n times, and so the overall time is O(n²).
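A short Python sketch of exactly this preprocessing (array and function names are ours): given M as a list of 1-based (e, a) pairs, build both lookup arrays in O(n) time.

    def build_match_arrays(n, M):
        # Index 0 is unused so that employer e and applicant a index directly.
        employer_matches = [0] * (n + 1)    # employer_matches[e] = applicant matched with e
        applicant_matches = [0] * (n + 1)   # applicant_matches[a] = employer matched with a
        for (e, a) in M:                    # one O(1) step per pair, O(n) total
            employer_matches[e] = a
            applicant_matches[a] = e
        return employer_matches, applicant_matches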
Put the previous two parts together to bound the overall running time of brute force.
SOLUTION: The overall worst case running time is O(n! · n²). That's horrendous. It won't do for even quite modest values of n.
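Putting the pieces together, here is a hypothetical Python sketch of the whole brute-force algorithm; it assumes the is_stable sketch above is in scope and enumerates perfect matchings as permutations of the applicants:

    from itertools import permutations

    def smp_brute_force(n, PE, PA):
        # Each permutation pairs employer e with applicant perm[e-1]: n! candidates in total.
        for perm in permutations(range(1, n + 1)):
            M = {(e, perm[e - 1]) for e in range(1, n + 1)}
            if is_stable(n, PE, PA, M):     # O(n^2) stability check
                return M
        return None                         # the pseudocode's "no stable matching" case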
Step 5: Design a better algorithm.
1. Brainstorm some ideas, then sketch out an algorithm. Try out your algorithm on some examples.
SOLUTION: You may have lots of ideas. For example, you might have noticed that if an employer and an applicant both most-prefer each other, we must match them; that might form the kernel of some kind of algorithm. For our algorithm, we will work with the Gale-Shapley algorithm, which also appears in the textbook and in part (b) of this worksheet.
function Gale-Shapley(n, PE, PA)
◃ n ≥ 1 is the number of employers and also the number of applicants
◃ PE is the collection of complete preference lists (>e) of the employers
◃ PA is the collection of complete preference lists (>a) of the applicants
◃ return a stable matching M for the stable matching instance (n, PE, PA)
M ← ∅    ◃ matching M is initially empty
while some employer is unmatched and has not considered every applicant do
    choose any such employer e
    let a be the highest-ranked applicant in e's list that e has not yet considered
    ◃ e now considers a (i.e., "makes an offer" to a) as follows:
    if a is unmatched then
        add match (e, a) to M    ◃ a accepts e's offer; a is matched
    else
        let e′ be the employer to whom a is currently matched
        if e >a e′ then    ◃ a prefers e to e′
            remove match (e′, a) from M    ◃ a rejects e′'s offer
            add match (e, a) to M    ◃ a accepts e's offer
        ◃ otherwise a prefers e′ to e, in which case M does not change
return M
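For reference, here is a Python sketch of the same employer-proposing procedure (a direct, hypothetical translation of the pseudocode, assuming the representation from Step 2 with 1-based indices):

    from collections import deque

    def gale_shapley(n, PE, PA):
        # rank_A[a-1][e] = position of employer e in applicant a's list (smaller = preferred).
        rank_A = [{e: pos for pos, e in enumerate(p)} for p in PA]
        next_offer = [0] * (n + 1)     # next_offer[e]: how many applicants e has already considered
        match_of_a = {}                # applicant -> employer currently holding the match
        free = deque(range(1, n + 1))  # unmatched employers
        while free:                    # with complete lists, every employer is eventually matched
            e = free.popleft()
            a = PE[e - 1][next_offer[e]]       # highest-ranked applicant e has not yet considered
            next_offer[e] += 1
            if a not in match_of_a:
                match_of_a[a] = e              # a is unmatched and accepts e's offer
            elif rank_A[a - 1][e] < rank_A[a - 1][match_of_a[a]]:
                free.append(match_of_a[a])     # a prefers e: rejects its current employer...
                match_of_a[a] = e              # ...and accepts e's offer
            else:
                free.append(e)                 # a prefers its current match; e stays unmatched
        return {(e, a) for a, e in match_of_a.items()}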
For fun, we will work through G-S with the applicants considering (making offers to) the employers. G-S correctly terminates immediately on any n = 0 example, returning an empty matching. With n = 1, the one applicant considers the one employer, who accepts the match, and the algorithm correctly terminates.
Going back to our first example (with the roles of employers and applicants swapped, since the applicants are now making the offers):
a1: e2 e1 e3
a2: e1 e2 e3
a3: e2 e1 e3
e1: a3 a1 a2
e2: a3 a2 a1
e3: a2 a1 a3
G-S doesn't specify the order in which the applicants are chosen. We will work from top to bottom:
(a) a1 considers e2, who accepts. M = { (e2, a1) }
(b) a2 considers e1, who accepts. M = { (e2, a1), (e1, a2) }
(c) a3 considers e2. e2 rejects a1 and accepts a3's offer. M = { (e2, a3), (e1, a2) }
(d) a1 considers e1 (2nd on its list). e1 rejects a2 and accepts a1's offer. M = { (e2, a3), (e1, a1) }
(e) a2 considers e2, who prefers a3 to a2 and so rejects the offer. M = { (e2, a3), (e1, a1) }
(f) a2 considers e3 (last on its list!), who accepts. M = { (e2, a3), (e1, a1), (e3, a2) }
(g) The algorithm terminates with the correct solution M = { (e2, a3), (e1, a1), (e3, a2) }.
2. Analyze the running time of your algorithm.
Using big-O notation, can you bound the number of iterations of the while loop? How much time does each iteration take? Is your bound tight in the worst case?
SOLUTION: To bound the number of iterations of the while loop, note that the algorithm terminates if the condition of the loop eventually becomes false, that is, when all employers are matched or when all employers have considered all applicants, whichever comes first. On each iteration of the loop, some employer e considers some applicant a that e has not considered in previous iterations. There are n² employer-applicant pairs, so the condition of the while loop must become false within n² iterations, and the algorithm terminates.
There are a constant number of steps at each iteration of the while loop. Using similar data structures to our IsStable algorithm, each of these steps takes O(1) time. So the algorithm runs in time O(n²).
To show that this bound is tight, we need to find instances on which the G-S algorithm takes Θ(n²) time. Let's try the following ranking for employers:
e1: a1, a2, …, an
e2: a1, a2, …, an
⋮
ei: a1, a2, …, an
In this case, each ei can first consider a1. This takes n iterations of the while loop, and just one employer is matched to a1. Then, each of the n − 1 unmatched ei's consider a2. This takes n − 1 iterations and results in one additional match, to a2. Continuing on, in the jth round, each of the n − j + 1 unmatched ei's consider aj, with one being matched to aj at the end. This takes n − j + 1 iterations. Finally, the last remaining unmatched employer considers, and is matched to, an.
The total number of iterations is n + (n − 1) + … + 2 + 1, which is n(n + 1)/2. Although this might not be the worst possible instance of size n, it is sufficient to show that the algorithm requires Ω(n²) iterations of the while loop in the worst case.
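A tiny Python sketch (hypothetical helper name) that builds this family of instances for any n; as in the analysis above, the Ω(n²) behaviour assumes the while loop is free to pick the unmatched employers in an order that sends them all to a1 first, then all remaining ones to a2, and so on:

    def identical_lists_instance(n):
        # Every employer has the same list a1, a2, ..., an.
        PE = [list(range(1, n + 1)) for _ in range(n)]
        # Applicant preferences don't affect the counting argument; here everyone ranks e1, ..., en in order.
        PA = [list(range(1, n + 1)) for _ in range(n)]
        return n, PE, PA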
Challenge problems
These are just for fun; some are easier than others.
1. Design an algorithm to generate each possible perfect matching between n employers and n applicants. (As always, it will help tremendously to start by giving your algorithm and its parameters names! Your algorithm will almost certainly be recursive.)
2. A “local search” algorithm might pick a matching and then “repair” instabilities one at a time by matching the pair causing the instability and also matching their partners. Use the smallest possible instance to show how bad this algorithm can get.
3. Design a scalable SMP instance that forces the Gale-Shapley algorithm to take its maximum possible number of iterations. How many is that? (A “scalable instance” is really an algorithm that takes a size and produces an instance of that size, just like the “input” in worst case analysis is scalable to any n.)