Topics Covered
● Stable Matching
● Greedy Algorithms
● Dynamic Programming
● Divide and Conquer
● Randomized Algorithms
Stable Matching
Stable Matching Problem
Consider a set M = {m_1, …, m_n} of n men and a set W = {w_1, …, w_n} of n women. Each person has a preference list ranking all members of the other gender. We want to find a stable matching S between the men and the women.
A matching is stable if
1) it is perfect
2) there is no instability
a) Instability: We match (m, w’) and (m’, w) but m prefers w to w’ and w prefers m to m’.
Example: Is it stable?
No: m_1 prefers w_2 to w_3 and w_2 prefers m_1 to m_2, so (m_1, w_2) is an instability.
Gale-Shapley Algorithm
● Every man proposes to women in the order he prefers.
● Women wait for a proposal. (Suppose m proposes to w)
○ If w is free, w becomes engaged to m.
○ Else w is currently engaged to some m’; w becomes engaged to m only if she prefers m to m’.
● Keep running this algorithm while there is a man who is free and hasn’t yet proposed to every woman.
● (Refer to textbook page 6 for more details and proof)
● Q: Is it possible that there are still unmatched men or women after this algorithm finishes?
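A runnable sketch of Gale-Shapley in Python (the dictionary-based representation and all names are ours, not from the textbook; preference lists are assumed complete):

```python
from collections import deque

def gale_shapley(men_prefs, women_prefs):
    """men_prefs[m] / women_prefs[w]: preference lists, most preferred first."""
    # rank[w][m] = position of m on w's list, for O(1) preference comparisons
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}  # index of the next woman m will try
    engaged_to = {}                            # w -> her current fiancé
    free_men = deque(men_prefs)
    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged_to:                    # w is free: engage
            engaged_to[w] = m
        elif rank[w][m] < rank[w][engaged_to[w]]:  # w prefers m to her fiancé
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        else:                                      # w rejects m; he stays free
            free_men.append(m)
    return {m: w for w, m in engaged_to.items()}
```

Each man proposes at most n times, so with the rank table the whole loop runs in O(n^2).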
Quick Properties of G-S
● It always returns a perfect, stable matching
● No matter the order in which the free proposers propose, it returns the same matching
● The proposers are always matched to their best valid match
● The respondents are always matched to their worst valid match
Example Problem
Consider a version of the stable matching problem in which men and women can be indifferent between certain options. As before we have a set M of n men and a set W of n women. Assume each man and each woman ranks the members of the opposite gender – but now we allow ties in the ranking. We will say that w prefers m to m’ if m is ranked higher than m’ on her preference list (they are not tied).
We define that a strong instability in a perfect matching S consists of a man m and a woman w, such that m and w prefer each other to their partners in S. Find an O(n^2) algorithm that is guaranteed to find a perfect matching with no strong instability.
Hint: Reduce this problem to the vanilla stable matching problem.
Reduce input of P to a stable matching problem.
Preferences: We break the ties lexicographically – that is, if a man m is indifferent between two women w_i and w_j, then w_i appears before w_j on m’s preference list if i < j (and w_j before w_i if j < i).
Time complexity: O(n^2)
Run Gale-Shapley algorithm and get a stable matching. Time complexity: O(n^2)
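The tie-breaking step can be sketched as follows (Python; representing a preference list with ties as a list of "tiers" is our assumption, not part of the original problem):

```python
def break_ties(pref_with_ties):
    """pref_with_ties: list of tiers, e.g. [['w2'], ['w3', 'w1']] means
    w2 is strictly preferred, and the man is indifferent between w1 and w3.
    Returns a strict preference list with each tie broken lexicographically."""
    strict = []
    for tier in pref_with_ties:
        strict.extend(sorted(tier))  # lexicographic order within a tie
    return strict
```

Flattening all n lists takes O(n^2) total, so the reduction stays within the required bound.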
We claim that the stable matching found by Gale-Shapley algorithm is a perfect matching with no strong instability in the modified stable matching problem.
Suppose we have a strong instability in the modified problem. It means there exist m and w such that they prefer each other to their partners. It implies an instability in the instance of the vanilla stable matching problem we generated. However, Gale-Shapley algorithm is guaranteed to find a stable matching. By contradiction, there is no strong instability in the modified stable matching problem.
Greedy Algorithms
Algorithms We’ve Covered
● Interval Scheduling
● Minimum Spanning Tree
○ Be familiar with (and know how to run) Kruskal’s and Prim’s algorithms
Proof Techniques to know
● Greedy Stays Ahead
● Exchange Argument
● Minimax
Interval Scheduling
Interval Scheduling (Section 4.1) Setup and Greedy Rule
Given a set of n requests, as well as start time s(i) and finish time f(i) for any request i, schedule as many requests as possible
What is the optimal greedy rule we should use? Earliest Finish Time
Interval Scheduling Algorithm
R = set of requests, A = {}
sort R by increasing finish time
while R is not empty:
    choose the request i ∈ R with the smallest finish time
    add i to A
    delete all requests from R that conflict with i
return A
Runs in O(n log n). Sorting by finishing time (using merge or quicksort) takes time O(n log n). In an additional O(n) time, we construct the array S[1..n] where S[i] contains the value s(i). Now, we select requests by processing in order of increasing f(i). We always select the first interval, then we reach the first interval j for which s(j) ≥ f(1). In general, if the most recent interval we’ve chosen ended at f, we iterate through subsequent intervals until we reach first request, j, for which s(j) ≥ f. This is done through one pass of intervals which is O(n). Thus, overall time is O(n log n).
Intuitive Explanation: Get the resource available asap after completing one task.
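The earliest-finish-time rule above can be sketched in Python (names ours):

```python
def interval_schedule(requests):
    """requests: list of (start, finish) pairs. Returns a maximum-size
    subset of pairwise non-conflicting requests via earliest finish time."""
    selected = []
    last_finish = float('-inf')
    for s, f in sorted(requests, key=lambda r: r[1]):  # sort by finish time
        if s >= last_finish:          # no conflict with the last chosen interval
            selected.append((s, f))
            last_finish = f
    return selected
```

The sort dominates, giving the O(n log n) bound discussed above.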
Interval Scheduling Proof: Greedy Stays Ahead
● What are we trying to prove for this problem?
○ The size of the output set of our greedy algorithm is equal to the size of an optimal set of intervals.
● How to prove?
○ Compare your greedy algorithm solution to an optimal solution and show how it “stays ahead”
○ Need to define what it means to “stay ahead”
○ Induction!
Proof (cont)
● For the Interval Scheduling problem, what does it mean for our greedy algorithm to “stay ahead”
○ Let ik and jk be the kth interval in the greedy algorithm set and optimal set of requests, respectively.
○ Want to show that f(ik) ≤ f(jk) for all k
○ Can show this using induction
■ Detailed proof on page 120 in the textbook
Interval Scheduling Proof: Minimax
● Suppose we select m out of n jobs, and we want to show that no more than m jobs can be selected.
● Claim: Let F = {f(i) : job i selected}. Then, the interval for every job
j in the input contains some element of F.
job 2 contains f(1), job 4 contains f(3), job 6 contains f(5), job 7 contains f(8).
● Proof:
(a) if job j was selected, then f(j) is in F by definition.
(b) if job j was not selected, then it was eliminated due to a conflict with an “earlier” element of F:
To show this formally, you can examine the way we iterated through the intervals in the example specification of the algorithm: we picked an interval to include (category a), and then passed on all the intervals that conflicted with its finish time (category b).
Finishing up Minimax
● Claim: Let F = {f(i) : job i selected}.
Then, the interval for every job j in the input contains some element of F. Proved!
● How does F restrict the size of any non-overlapping subset of intervals?
F is of size m, so if a non-overlapping set G of intervals had m + 1 jobs, two of them would have to contain the same point of F. Conflict!
● Hence, there are no non-overlapping subsets of intervals of size greater than m.
● We found a non-overlapping subset of size m, so this shows optimality.
Minimum Spanning Tree
Minimum Spanning Tree (Section 4.5) Setup
Given an (undirected!) graph G = (V,E) with costs for each edge, find a subset of the edges T⊆E so that the graph (V, T) is connected and the total cost of the edges in T is as small as possible
a. Find a spanning tree, T of G.
■ Spanning tree? All nodes in V are connected through some set of edges in T.
■ Note that T won’t have cycles. Why?
b. Find the cheapest spanning tree
Greedy algorithm #1 - Kruskal’s Algorithm
sortedE <- sort E by increasing cost (smaller cost edges earlier in list)
T = {}
for edge e in sortedE:
    add e to T if it does not create a cycle
output T
Runs in O(|E| log |V|). To achieve this, we need to use the Union-Find data structure. If you want more details on this, see Section 4.6.
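A runnable sketch of Kruskal’s algorithm with a simple Union-Find (Python; the representation and names are ours):

```python
def kruskal(n, edges):
    """n: number of vertices labeled 0..n-1; edges: list of (cost, u, v).
    Union-Find with path compression and union by size gives the
    O(|E| log |V|) bound from Section 4.6."""
    parent = list(range(n))
    size = [1] * n

    def find(x):                      # find root, compressing the path
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):  # increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                  # adding (u, v) creates no cycle
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru           # union by size
            size[ru] += size[rv]
            tree.append((u, v, cost))
    return tree
```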
Greedy algorithm #2 - Prim’s Algorithm
select a root node s from which to build the spanning tree
let S = {s}, T = {}
while S ≠ V:
    select the min-cost edge e = (u, v) such that u ∈ S and v ∈ V \ S
    add e to T
    add v to S
output T
Runs in O(|E| log |V|). This runtime is achieved by using a heap-based priority queue
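A runnable sketch of Prim’s algorithm using Python’s heapq as the priority queue (adjacency-dict representation and names are ours):

```python
import heapq

def prim(adj, root):
    """adj: {node: [(cost, neighbor), ...]} for an undirected graph.
    Returns the MST edges grown from root; the heap gives O(|E| log |V|)."""
    in_tree = {root}
    tree = []
    # frontier holds candidate edges (cost, u, v) with u in S
    frontier = [(c, root, v) for c, v in adj[root]]
    heapq.heapify(frontier)
    while frontier:
        cost, u, v = heapq.heappop(frontier)
        if v in in_tree:              # stale edge: both ends already in S
            continue
        in_tree.add(v)
        tree.append((u, v, cost))
        for c, w in adj[v]:
            if w not in in_tree:
                heapq.heappush(frontier, (c, v, w))
    return tree
```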
Key concepts for proof
● Cut property – Let S be a nonempty subset of nodes not equal to V. Let e = (v, w) be the minimum-cost edge with one end in S and the other in V - S. Every MST contains e!
● Exchange argument - used to prove cut property
○ More details on page 145 in the textbook
○ Summary: If e (defined above) is not in our spanning tree T, find an edge e’ in T such that T − {e’} ∪ {e} is still a spanning tree and c(e’) > c(e). Exchanging e’ for e yields a spanning tree T’ with cost of T’ < cost of T.
● Cut property is used in the optimality proofs for both Kruskal and Prim
Minimum Spanning Tree - Optimality
Cut Property:
(4.17 from K&T) Assume that all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v, w) be the minimum- cost edge with one end in S and the other in V − S. Then every minimum spanning tree contains the edge e.
In K&T, the proofs of Kruskal’s and Prim’s algorithms both rely on the cut property. They argue that each edge the algorithm selects is guaranteed, by the cut property, to be in the MST.
Minimum Spanning Tree - Cycle Property
Used to show that an edge is not in the minimum spanning tree: assuming distinct edge costs, the most expensive edge on any cycle of G cannot belong to any MST.
MST example: Find an MST using Kruskal and Prim’s algorithm (start at node e)
Kruskal’s solution #1
Order (g, h) (d,f) (b, h) (e, g) (e, f) (e, c) (a, b)
Kruskal Solution #2
Order (g, h) (d,f) (b, h) (e, g) (f, h) (e, c) (a, b)
Prim’s solution #1
{e} -> (e,g)
{e,g} -> (g, h)
{e,g,h} -> (h, b)
{e,g,h,b} -> (e, f)
{e,g,h,b,f} -> (f, d)
{e,g,h,b,f,d} -> (e, c)
{e,g,h,b,f,d,c} -> (b, a)
{e,g,h,b,f,d,c,a} Done
Prim’s solution #2
{e} -> (e,g)
{e,g} -> (g, h)
{e,g,h} -> (h, b)
{e,g,h,b} -> (h, f)
{e,g,h,b,f} -> (f, d)
{e,g,h,b,f,d} -> (e, c)
{e,g,h,b,f,d,c} -> (b, a)
{e,g,h,b,f,d,c,a} Done
Practice MST Question
You are given a weighted undirected graph G and a minimum weight spanning tree T for this graph. The graph G has n vertices and m edges. Suppose a new edge (v, w) is added to the graph. Let G′ denote this new graph. Design an algorithm with running time O(n) to compute a minimum weight spanning tree T′ of the graph G′. You may assume that all edge weights are distinct.
● How is T disrupted if we add (v, w) to it? How can we derive T′ from this version?
● If we add (v, w) to T, there must be a cycle, since T now has n edges. By the cycle property, the maximum-cost edge in this cycle cannot be in the new MST T′, so we remove it. We are left with the needed minimum spanning tree T′.
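The cycle-based update above can be sketched in Python (representation and names are ours): walk the unique tree path between v and w, then drop the heaviest edge on the cycle formed with the new edge. One DFS over the n − 1 tree edges keeps this O(n).

```python
def update_mst(tree_edges, new_edge):
    """tree_edges: list of (u, v, cost) forming an MST T; new_edge: (v, w, cost)
    with v != w. Returns an MST of the graph with new_edge added, in O(n)."""
    v, w, c_new = new_edge
    adj = {}
    for u, x, c in tree_edges:
        adj.setdefault(u, []).append((x, c))
        adj.setdefault(x, []).append((u, c))
    # DFS from v to recover the unique tree path to w
    stack, parent = [v], {v: None}
    while stack:
        u = stack.pop()
        if u == w:
            break
        for x, c in adj.get(u, []):
            if x not in parent:
                parent[x] = (u, c)
                stack.append(x)
    # walk the path back, tracking the maximum-cost edge on the cycle
    max_edge, max_cost = None, c_new
    u = w
    while parent[u] is not None:
        p, c = parent[u]
        if c > max_cost:
            max_edge, max_cost = (p, u, c), c
        u = p
    if max_edge is None:              # the new edge is heaviest: keep T as-is
        return list(tree_edges)
    dropped = {(max_edge[0], max_edge[1]), (max_edge[1], max_edge[0])}
    result = [e for e in tree_edges if (e[0], e[1]) not in dropped]
    result.append((v, w, c_new))      # swap the new edge in
    return result
```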
Greedy Algorithm Tips
Tips for designing greedy algorithms
● First, make sure that the problem isn’t actually a DP problem!
● What are some metrics that can be used to fuel the greedy rule?
○ e.g. earliest deadline, smallest interval, etc.
● Does it look similar to any of the problems you’ve encountered this semester?
○ Maybe try a reduction
● Proofs
○ Exchange argument
■ Examples: Section 4.2, Section 4.5 for cut-property
○ Greedy stays ahead (induction)
■ Examples: Section 4.1
General Guideline for Greedy Stays Ahead
1. Define the solution generated by your algorithm and an optimal solution (could be one of many) to compare it with
○ Example: Interval Scheduling; A={a1, a2, … , ai} is our solution and an optimal is O = {o1, o2, … , om}
2. Define some kind of metric for which your greedy algorithm stays ahead of an optimal solution.
○ Example: the last finish time of the first k jobs in your solution ends before the last finish time of the first k jobs in the optimal
○ The metric should also mention the way you are ordering the elements of O.
3. Use induction to prove greedy stays ahead
○ Base case
■ Example: r = 1, f(a1) ≤ f(o1)
○ Inductive step
■ Inductive hypothesis: Assume that statement holds for previous step (weak induction)
■ Example: for r > 1, assume inductive hypothesis holds for r – 1, prove statement holds for r.
4. Use what you’ve proved about greedy staying ahead about any arbitrary optimal solution to prove that with respect to the metric you’ve chosen, your solution is optimal
○ This step should follow naturally from the proof above
○ Example: Your output A in interval scheduling contains same amount of requests as an optimal set
Greedy Stays Ahead Example (K&T Problem 3 in chapter 4)
You are consulting for a trucking company that does a large amount of business shipping packages between New York and Boston. The volume is high enough that they have to send a number of trucks each day between the two locations. Trucks have a fixed limit W on the maximum amount of weight they are allowed to carry. Boxes arrive at the station one by one, and each box i has a weight w_i. The trucking station is quite small, so at most one truck can be at the station at any time. Company policy requires that boxes are shipped in the order they arrive; otherwise, a customer might get upset upon seeing a box that arrived after his own make it to Boston faster. At the moment, the company is using a simple greedy algorithm for packing: they pack boxes in the order they arrive, and whenever the next box does not fit, they send the truck on its way.
But they wonder if they might be using too many trucks, and they want your opinion on whether the situation can be improved. Here is how they are thinking. Maybe one could decrease the number of trucks needed by sometimes sending off a truck that was less full, and in this way allow the next few trucks to be better packed.
Prove that, for a given set of boxes with specified weights, the greedy algorithm currently in use actually minimizes the number of trucks that are needed. Your proof should follow the type of analysis we used for the Interval Scheduling Problem: it should establish the optimality of this greedy packing algorithm by identifying a measure under which it “stays ahead” of all other solutions.
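The company’s greedy packing rule can be sketched in Python (names ours; we assume every single box fits in an empty truck, i.e. w_i ≤ W):

```python
def pack_trucks(weights, W):
    """weights: box weights in arrival order; W: truck capacity.
    Greedy rule: fill the current truck until the next box does not fit,
    then send the truck on its way."""
    trucks = []
    current, load = [], 0
    for wt in weights:
        if load + wt > W:            # next box doesn't fit: dispatch truck
            trucks.append(current)
            current, load = [], 0
        current.append(wt)
        load += wt
    if current:                      # dispatch the final partial truck
        trucks.append(current)
    return trucks
```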
1. Define the greedy solution and an optimal solution
Let A be the total number of trucks our greedy solution uses and O be the total number of trucks an optimal solution uses
2. Define the metric for which the greedy solution stays ahead.
For the first k trucks, let our greedy solution pack boxes b1, b2, … ,bm while the optimal solution we chose packs boxes b1, b2, … , bn. Note that the ordering constraint defined by the problem must be satisfied. That is, if box bi is sent before box bj, then bi came to the company first. Our metric is that m ≥ n. That is, for a set number of trucks, k, our greedy solution packs at least as many boxes as the optimal.
Solution (cont)
3. Use induction to prove greedy stays ahead
Base Case: k = 1. Our greedy algorithm packs as many boxes as possible into the first truck so this is true.
Inductive hypothesis: Assume that for k – 1 trucks, our greedy solution fits m boxes, the optimal fits n boxes, and m ≥ n.
Inductive step: We need to prove that for k trucks, our greedy solution fits m’ boxes, the optimal fits n’ boxes, and m’ ≥ n’.
For the kth truck, the optimal solution packs boxes [n+1, …, n’]. By the inductive hypothesis m ≥ n, so every box the optimal has packed in its first k – 1 trucks, our greedy solution has also packed. When the greedy fills its kth truck, it starts at box m+1 ≥ n+1, and the boxes [m+1, …, n’] are a subset of the boxes [n+1, …, n’] that the optimal fit into a single truck, so their total weight is at most W. Hence the greedy can pack at least through box n’: if it packs boxes [m+1, …, m’], then m’ ≥ n’.
4. Use what you’ve proved about greedy staying ahead about any arbitrary optimal solution to prove that with respect to the metric you’ve chosen, your solution is optimal
If the total number of trucks our greedy solution uses to pack all boxes is N trucks, our proof shows that we pack at least as many boxes as optimal. Since we actually pack all boxes, this means that N is the smallest number of trucks needed to pack all the boxes.
General Guidelines for Exchange Argument
1. Define the solution generated by your algorithm and an optimal solution (could be one of many) to compare it with
○ Example: Let A be the matchings your algorithm chooses. Let O be an optimal set of matchings (there could be many optimals!)
2. Assume that your greedy solution and the optimal solution you choose differ in some way and make sure to specify how. For example:
○ O could include an element A does not include, vice versa.
○ There could be an inversion which is a pair of elements that are ordered in reverse order in A and O.
3. “Exchange” elements in O to make it more similar to your solution, A, while preserving the
optimality of O
○ Note the order! You should look to make O more similar to A while preserving optimality, not the other way around!
○ Need to consider all different cases of swapping to show optimality is preserved.
4. Conclusion: By doing these swaps any time you find differences between O and A, we can
eventually eliminate all differences between O and A without worsening the optimality of the solution. This shows that your greedy solution is as good as any optimal solution so it is optimal
Exchange Argument Example (K&T Problem 7 in chapter 4)
Summary: You have a single large supercomputer as well as an essentially unlimited supply of high-end PCs. Additionally, there is a computation that is broken up in n distinct jobs, labeled J1, J2, …, Jn. Each job consists of two stages: first it needs to be preprocessed on the supercomputer and then it needs to be finished on the PCs. For each job Ji, the job needs pi seconds of time on the supercomputer followed by fi seconds of time on a PC.
There are at least n PCs so the finishing of the jobs can be performed fully in parallel. However, the supercomputer can only work on a single job at at time so you need to figure out an order in which to feed the jobs to the supercomputer. As soon as a job is done on the supercomputer, it is immediately handed off to a PC for finishing; at the same time, the next job is fed into the supercomputer.
A schedule is an ordering of the jobs for the supercomputer and the completion time of the schedule is the earliest time at which all jobs have finished processing on the PCs.
Give a polynomial-time algorithm that finds a schedule with as small a completion time as possible. Prove that it is optimal.
Solution – Polynomial algorithm
First, we note that since the supercomputer can only work on one job at a time, passing in different orderings of jobs does not affect the preprocessing time. Rather, the ordering only affects the total finishing time. This intuitively means that when the last job is preprocessed at the supercomputer, we want it to finish as fast as possible.
Let our greedy schedule be: pass in jobs in order of decreasing finishing time f_i, so we run the jobs with the largest finishing times first.
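A minimal Python sketch of this rule and of computing a schedule’s completion time (function and variable names are ours, not from the textbook):

```python
def schedule(jobs):
    """jobs: list of (p_i, f_i) pairs. Returns the jobs in decreasing-f_i
    order together with that schedule's completion time."""
    order = sorted(jobs, key=lambda job: -job[1])  # largest f_i first
    t, completion = 0, 0
    for p, f in order:
        t += p                        # job leaves the supercomputer at time t
        completion = max(completion, t + f)  # its PC stage ends at t + f
    return order, completion
```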
Solution – Proof of Optimality
1. Define the solution generated by your algorithm and an optimal solution (could be one of many) to compare it with
a. Let A be our greedy solution and let O be any arbitrary optimal schedule.
2. Assume that your greedy solution and the optimal solution you choose differ in some way and make sure to specify how.
a. If O differs from A, then O contains an inversion: two consecutive jobs where the one scheduled first has the smaller finishing time. Swapping such a pair never increases the completion time, so repeatedly removing inversions transforms O into A without making it worse, proving A is optimal.