THE UNIVERSITY OF NEW SOUTH WALES
4. THE GREEDY METHOD
Raveen de Silva,
office: K17 202
Course Admin: ,
School of Computer Science and Engineering UNSW Sydney
Term 1, 2022
Table of Contents
1. Introduction
2. Assorted problems
3. Applications to graphs
3.1 Example problem
3.2 Single source shortest paths
3.3 Minimum spanning trees
The Greedy Method
What is a greedy algorithm?
A greedy algorithm is one that solves a problem by dividing it into stages, and rather than exhaustively searching all the ways to get from one stage to the next, instead only considers the choice that appears best.
This obviously reduces the search space, but it is not always clear whether the locally optimal choice leads to the globally optimal outcome.
The Greedy Method
Does the greedy method always work?
Suppose you are searching for the highest point in a mountain range. If you always climb upwards from the current point in the steepest possible direction, you will find a peak, but not necessarily the highest point overall.
The Greedy Method
Is there a framework to decide whether a problem can be solved using a greedy algorithm?
Yes, but we won’t use it.
The study of matroids is covered in CLRS. We will instead prove the correctness of greedy algorithms on a problem-by-problem basis. With experience, you will develop an intuition for whether the greedy method is useful for a particular problem.
The Greedy Method
How do we prove that a greedy algorithm is correct?
There are two main methods of proof:
1. Greedy stays ahead: prove that at every stage, no other algorithm could do better than our proposed algorithm.
2. Exchange argument: consider an optimal solution, and gradually transform it to the solution found by our proposed algorithm without making it any worse.
Again, experience is most important!
Table of Contents
1. Introduction
2. Assorted problems
3. Applications to graphs
3.1 Example problem
3.2 Single source shortest paths
3.3 Minimum spanning trees
Activity Selection
Instance: A list of n activities, with starting times si and finishing times fi . No two activities can take place simultaneously.
Task: Find a maximum size subset of compatible activities.
Activity Selection
Always choose the shortest activity which does not conflict with the previously chosen activities, then remove the conflicting activities and repeat.
In the above example, our proposed algorithm chooses the activities in blue, then has to discard all the red activities, so clearly this does not work.
Activity Selection
Maybe we should always choose an activity which conflicts with the fewest of the remaining activities? It may appear that in this way we minimally restrict our next choice...
As appealing as this idea is, the above figure shows that it again does not work!
Activity Selection
Among those activities which do not conflict with the previously chosen activities, always choose the activity with the earliest end time (breaking ties arbitrarily).
Activity Selection
To prove the correctness of our algorithm, we will use an exchange argument. We will show that any optimal solution can be transformed into a solution obtained by our greedy algorithm with the same number of activities.
Find the first place where the chosen activity violates the greedy choice.
What if we replace that activity with the greedy choice?
Activity Selection
Does the new selection have any conflicts? No!
Does the new selection have the same number of activities? Yes!
So the greedy choice is actually just as good as the choice used in the optimal solution!
We replace it and repeat.
Continuing in this manner, we eventually “morph” the optimal solution into the greedy solution, thus proving the greedy solution is also optimal.
Activity Selection
What is the time complexity of the algorithm?
We represent activities by ordered pairs of their starting and their finishing times and sort them in increasing order of their finishing time (the second coordinate), in O(n log n) time.
We go through this sorted list in order. How do we tell whether an activity conflicts with the already chosen activities?
Activity Selection
Suppose we are up to activity i, starting at si and finishing at fi , with all earlier finishing activities already processed.
If all previously chosen activities finished before si , activity i can be chosen without a conflict. Otherwise, there will be a clash, so we discard activity i.
We would prefer not to go through all previously chosen activities each time.
Activity Selection
We need only keep track of the latest finishing time among chosen activities.
Since we process activities in increasing order of finishing time, this is just the finishing time of the last activity to be chosen.
Every activity is therefore either chosen (and the last finishing time updated) or discarded in constant time, so this part of the algorithm takes O(n) time.
Thus, the algorithm runs in total time O(n log n), dominated by sorting.
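As a concrete illustration (not part of the original slides), here is a minimal Python sketch of this algorithm, assuming activities are given as (start, finish) pairs:

def select_activities(activities):
    # Greedy activity selection: always take the compatible activity
    # with the earliest finishing time.
    activities = sorted(activities, key=lambda a: a[1])  # sort by finishing time: O(n log n)
    chosen = []
    last_finish = float("-inf")  # finishing time of the last chosen activity
    for start, finish in activities:
        # Compatible iff it starts no earlier than the last chosen finish.
        if start >= last_finish:
            chosen.append((start, finish))
            last_finish = finish
    return chosen

For example, select_activities([(1, 4), (3, 5), (4, 7)]) returns [(1, 4), (4, 7)].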
Activity Selection
A related problem
Instance: A list of n activities with starting times si and finishing times fi = si + d; thus, all activities are of the same duration. No two activities can take place simultaneously.
Task: Find a subset of compatible activities of maximal total duration.
Since all activities are of the same duration, this is equivalent to finding a selection with the largest number of non-conflicting activities, i.e., the previous problem.
Activity Selection
What happens if the activities are not all of the same duration?
The greedy strategy no longer works – we will need a more sophisticated technique.
Cell Towers
Instance: Along the long, straight road from Loololong (in the West) to Goolagong (in the East), houses are scattered quite sparsely, sometimes with long gaps between two consecutive houses. Telstra must provide mobile phone service to people who live alongside the road, and the range of Telstra’s cell tower is 5km.
Task: Design an algorithm for placing the minimal number of cell towers alongside the road sufficient to cover all houses.
Cell Towers
Let’s attempt a greedy algorithm, processing the houses west to east.
The first house must be covered by some tower, which we place 5km to the east of this house.
This tower may cover some other houses, but eventually we should reach a house that is out of range of this tower. We then place a second tower 5km to the east of that house.
Continue in this way until all houses are covered.
Cell Towers
At each house, we need to decide whether to place a new tower. This can be done in constant time by referring to the most recently created tower, which can itself be updated in constant time if necessary.
Therefore this algorithm runs in O(n) time if the houses are provided in order, and O(n log n) time otherwise.
We can prove the correctness of this algorithm using an exchange argument; this is left as an exercise.
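A minimal Python sketch of this placement, assuming houses are given as positions in kilometres along the road:

def place_towers(houses, radius=5.0):
    # Greedy west-to-east placement: put each new tower `radius` km
    # east of the first house not covered by the previous tower.
    towers = []
    last_tower = float("-inf")
    for h in sorted(houses):           # O(n log n), or O(n) if already sorted
        if h > last_tower + radius:    # house is out of range of the last tower
            last_tower = h + radius
            towers.append(last_tower)
    return towers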
Cell Towers
One of Telstra’s engineers started with the house closest to Loololong and put a tower 5 km away to the east. He then found the westernmost house not already in the range of that tower, placed another tower 5 km to the east of it, and continued in this way until he reached Goolagong.
His junior associate did exactly the same, but starting from Goolagong and moving westwards, and claimed that his method required fewer towers.
Is there a placement of houses for which the associate is right?
Minimising Job Lateness
Instance: A start time T0 and a list of n jobs, with duration times ti and deadlines di . Only one job can be performed at any time; all jobs have to be completed. If a job i is completed at a finishing time fi > di then we say that it has incurred lateness li = fi − di .
Task: Schedule all the jobs so that the lateness of the job with the largest lateness is minimised.
Minimising Job Lateness
Ignore job durations and schedule jobs in the increasing order of deadlines.
Proof of optimality
Consider any optimal solution. We say that jobs i and j form an inversion if job i is scheduled before job j but dj < di .
Minimising Job Lateness
We will show that there exists a scheduling without inversions which is also optimal.
Recall the BubbleSort algorithm: by repeatedly swapping adjacent inverted jobs, we eventually eliminate all inversions.
Minimising Job Lateness
Note that swapping adjacent inverted jobs never increases the larger of the two latenesses involved, so it never increases the maximum lateness!
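A minimal Python sketch of the resulting algorithm, assuming jobs are given as (duration, deadline) pairs:

def min_max_lateness(t0, jobs):
    # Schedule jobs back to back in increasing order of deadline,
    # starting at time t0, and report the largest lateness incurred.
    max_lateness = 0
    finish = t0
    for duration, deadline in sorted(jobs, key=lambda j: j[1]):
        finish += duration
        max_lateness = max(max_lateness, finish - deadline)
    return max_lateness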
Tape Storage
Instance: A list of n files of lengths li which have to be stored on a tape. Each file is equally likely to be needed. To retrieve a file, one must start from the beginning of the tape and scan it until the file is found and read.
Task: Order the files on the tape so that the average (expected) retrieval time is minimised.
Tape Storage
If the files are stored in order l1, l2, . . . , ln, then the expected time is proportional to
l1 + (l1 + l2) + (l1 + l2 + l3) + ... + (l1 + l2 + l3 + ... + ln) = n·l1 + (n − 1)·l2 + (n − 2)·l3 + ... + 2·ln−1 + ln.
This is minimised if l1 ≤ l2 ≤ l3 ≤ ... ≤ ln, so we simply sort the files by increasing order of length for an O(n log n) solution.
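A minimal Python sketch, which also computes the quantity being minimised (the sum of prefix sums):

def order_files(lengths):
    # Store files in increasing order of length; the total retrieval
    # time is the sum of prefix sums of the chosen order.
    order = sorted(lengths)
    total, prefix = 0, 0
    for l in order:
        prefix += l        # time to scan up to and including this file
        total += prefix
    return order, total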
Tape Storage II
Instance: A list of n files of lengths li and probabilities pi of being needed, where p1 + p2 + ... + pn = 1, which have to be stored on a tape. To retrieve a file, one must start from the beginning of the tape and scan it until the file is found and read.
Task: Order the files on the tape so that the expected retrieval time is minimised.
Tape Storage II
If the files are stored in order l1, l2, ..., ln, then the expected time is proportional to
l1p1 + (l1 + l2)p2 + (l1 + l2 + l3)p3 + ... + (l1 + l2 + l3 + ... + ln)pn.
We now show that this is minimised if the files are ordered in decreasing order of the ratios pi/li.
Tape Storage II
Let us see what happens if we swap two adjacent files, say files k and k + 1.
The expected times before and after the swap are, respectively:
E = l1p1 + (l1 + l2)p2 + (l1 + l2 + l3)p3 + ... + (l1 + ... + lk−1 + lk)pk + (l1 + ... + lk−1 + lk + lk+1)pk+1 + ... + (l1 + ... + ln)pn,
E′ = l1p1 + (l1 + l2)p2 + (l1 + l2 + l3)p3 + ... + (l1 + ... + lk−1 + lk+1)pk+1 + (l1 + ... + lk−1 + lk+1 + lk)pk + ... + (l1 + ... + ln)pn.
Tape Storage II
Thus, E − E′ = lkpk+1 − lk+1pk, which is positive whenever lkpk+1 > lk+1pk, i.e., when pk/lk < pk+1/lk+1.
Consequently, the swap decreases the expected time exactly when pk/lk < pk+1/lk+1, i.e., when there is an inversion: file k + 1, with a larger ratio pk+1/lk+1, has been put after file k, with a smaller ratio pk/lk.
As long as the sequence is not sorted, there will be inversions of consecutive files, and swapping will reduce the expected time. Consequently, the optimal solution is the one with no inversions.
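A minimal Python sketch, assuming files are given as (length, probability) pairs with positive lengths:

def order_files_weighted(files):
    # Store files in decreasing order of p/l; the expected retrieval
    # time is the probability-weighted sum of prefix sums.
    order = sorted(files, key=lambda f: f[1] / f[0], reverse=True)
    expected, prefix = 0.0, 0.0
    for length, prob in order:
        prefix += length
        expected += prefix * prob
    return order, expected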
Interval Stabbing
Instance: Let X be a set of n intervals on the real line, described by two arrays XL[1..n] and XR[1..n], representing their left and right endpoints. We say that a set P of points stabs X if every interval in X contains at least one point in P.
Task: Describe and analyse an efficient algorithm to compute the smallest set of points that stabs X.
Interval Stabbing
Is it a good idea to stab the largest possible number of intervals?
No! In the above example, this strategy needs three stabbing points rather than two.
Interval Stabbing
The interval which ends the earliest has to be stabbed somewhere. What is the best place to stab it?
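Its right endpoint: every interval that overlaps the earliest-ending interval must contain that endpoint (no interval ends earlier), so no other stab can do better. A minimal Python sketch of the resulting greedy:

def stab_intervals(XL, XR):
    # Repeatedly stab the right endpoint of the earliest-ending
    # interval not yet stabbed.
    intervals = sorted(zip(XL, XR), key=lambda iv: iv[1])  # by right endpoint
    points = []
    last_stab = float("-inf")
    for left, right in intervals:
        if left > last_stab:      # not stabbed by any point chosen so far
            last_stab = right
            points.append(right)
    return points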
Fractional Knapsack
Instance: A list of n items described by their weights wi and values vi, and a maximal weight limit W of your knapsack. You can take each item in any non-negative quantity (not necessarily an integer).
Task: Select a non-negative quantity of each item, with total weight not exceeding W and maximal total value.
Fractional Knapsack
Fill the entire knapsack with the item of highest value per unit weight!
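Under the stated rules (any non-negative quantity of each item is allowed), this greedy is a one-liner; a minimal Python sketch:

def fractional_knapsack(items, capacity):
    # items: list of (weight, value) pairs; capacity: the weight limit W.
    # Fill the whole knapsack with the item of highest value per unit weight.
    best_rate = max(v / w for w, v in items)
    return capacity * best_rate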
0-1 Knapsack
Instance: A list of n discrete items described by their weights wi and values vi , and a maximal weight limit W of your knapsack.
Task: Find a subset S of the items with total weight not exceeding W and maximal total value.
0-1 Knapsack
Can we always choose the item of highest value per unit weight?
Assume there are just three items with weights and values:
A (10 kg, $60), B (20 kg, $100), C (30 kg, $120)
and a knapsack of capacity W = 50 kg.
The greedy strategy would choose items A and B, while the optimal solution is to take items B and C!
So when does the greedy strategy work? Unfortunately, there is no easy rule...
Array Merging
Assume you are given n sorted arrays of different sizes.
You are allowed to merge any two arrays into a single new sorted array and proceed in this manner until only one array is left.
Design an algorithm which achieves this task and moves array elements as few times as possible.
Give an informal justification why your algorithm is optimal.
This problem is somewhat related to the next problem, which is arguably among the most important applications of the greedy method!
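A minimal Python sketch of one natural greedy, counting only the number of element moves: always merge the two shortest remaining arrays, kept in a min-heap (the structure of the argument mirrors Huffman’s algorithm below).

import heapq

def merge_cost(sizes):
    # Merging arrays of sizes a and b moves a + b elements, so we keep
    # large arrays out of as many merges as possible by always merging
    # the two smallest.
    heap = list(sizes)
    heapq.heapify(heap)
    moves = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        moves += a + b
        heapq.heappush(heap, a + b)
    return moves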
The Huffman Code
Assume you are given a set of symbols, for example the English alphabet plus punctuation marks and a blank space (to be used between words).
You want to encode these symbols using binary strings, so that sequences of such symbols can be decoded in an unambiguous way.
The Huffman Code
One way of doing so is to reserve bit strings of equal and sufficient length, given the number of distinct symbols to be encoded. For example, if you have 26 letters and up to 6 punctuation symbols, you could use strings of 5 bits, since 2^5 = 32 codes suffice for 32 symbols.
To decode a piece of text you would partition the bit stream into groups of 5 bits and use a lookup table to decode the text.
The Huffman Code
However this might not be the most economical way: all the symbols have codes of equal length but the symbols are not equally frequent.
One would prefer an encoding in which frequent symbols such as ‘a’, ‘e’, ‘i’ or ‘t’ have short codes, while infrequent ones, such as ‘q’, ‘x’ and ‘z’, can have longer codes.
The Huffman Code
However, if the codes are of variable length, then how can we partition a bitstream uniquely into segments each corresponding to a code?
One way of ensuring unique readability of codes from a single bitstream is to ensure that no code of a symbol is a prefix of a code for another symbol.
Codes with this property are called prefix codes.
The Huffman Code
We can now formulate the problem as follows:
Given the frequencies (probabilities of occurrences) of each symbol, design an optimal prefix code, i.e. a prefix code which minimises the expected length of an encoded text.
Note that this amounts to saying that the average number of bits per symbol in an “average” text is as small as possible.
We now sketch the algorithm informally; please see the textbook for details and the proof of optimality.
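As an illustration only (the textbook has the details and the proof of optimality), here is a minimal Python sketch of the algorithm: repeatedly merge the two least frequent subtrees, then read each symbol’s code off the resulting binary tree.

import heapq
from itertools import count

def huffman_code(freqs):
    # freqs: dict mapping symbol -> frequency (or probability).
    # The counter breaks ties so the heap never compares tree nodes.
    tiebreak = count()
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    code = {}
    def assign(node, prefix):
        if isinstance(node, tuple):          # internal node: recurse
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                # leaf: a symbol
            code[node] = prefix or "0"
    assign(heap[0][2], "")
    return code

For example, huffman_code({"a": 0.5, "b": 0.3, "c": 0.2}) gives ‘a’ a 1-bit code and ‘b’, ‘c’ 2-bit codes.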
MATH3411 Information, Codes & Ciphers covers this and much more!
Table of Contents
1. Introduction
2. Assorted problems
3. Applications to graphs
3.1 Example problem
3.2 Single source shortest paths
3.3 Minimum spanning trees
Tsunami Warning
Instance: There are n radio towers for broadcasting tsunami warnings. You are given the (x,y) coordinates of each tower and its radius of range. When a tower is activated, all towers within the radius of range of the tower will also activate, and those can cause other towers to activate and so on.
You need to equip some of these towers with seismic sensors so that, when the sensors activate the towers at which they are located, all towers will eventually be activated and send a tsunami warning.
Task: Design an algorithm which finds the minimum number of towers you must equip with seismic sensors.
Tsunami Warning
Algorithm 1: Find the unactivated tower with the largest radius (if there are multiple with the same radius, pick any of them). Place a sensor at this tower. Find and remove all activated towers. Repeat.
Algorithm 2: Find the unactivated tower with the largest number of towers within its range. If there is none, place a sensor at the leftmost tower. Repeat.
Give examples which show that neither Algorithm 1 nor Algorithm 2 solves the problem correctly.
Tsunami Warning
It is useful to consider the towers as vertices of a directed graph, where an edge from tower A to tower B indicates that the activation of A directly causes the activation of B, that is, B is within the radius of A.
Observation
Suppose that activating tower A causes tower B to also be activated, and vice versa. Then we never want to place sensors at both towers; indeed, placing a sensor at A is equivalent to placing a sensor at B.
Tsunami Warning
Observation
We can extend this notion to a larger number of towers.
Let S be a subset of the towers such that activating any tower in S causes the activation of all towers in S.
We never want to place more than one sensor in S, and if we place one, then it doesn’t matter where we put it.
In this way, we can treat all of S as a unit: a super-tower.
Strongly Connected Components
Definition
Given a directed graph G = (V, E) and a vertex v, the strongly connected component of G containing v consists of all vertices u ∈ V such that there is a path in G from v to u and a path from u to v. We will denote it by Cv.
In the terms of our problem, strongly connected components are maximal super-towers.
Strongly Connected Components
How do we find the strongly connected component Cv ⊆ V containing v?
Construct another graph Grev = (V , Erev ) consisting of the same set of vertices V but with the set of edges Erev obtained by reversing the direction of all edges E of G.
u is in Cv if and only if u is reachable from v and v is reachable from u.
Equivalently, u is reachable from v in both G and Grev .
Strongly Connected Components
Use BFS to find the set R ⊆ V of all vertices in V which are reachable in G from v.
Similarly find the set R′ ⊆ V of all vertices which are reachable in Grev from v.
The strongly connected component of G containing v is given by Cv = R ∩ R′.
Finding all strongly connected components in this way could require O(V) traversals of the graph.
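A minimal Python sketch of this two-BFS approach, assuming adjacency lists (dicts mapping each vertex to a list of out-neighbours) for both G and Grev:

from collections import deque

def scc_of(v, adj, radj):
    # Cv = (vertices reachable from v in G) ∩ (vertices reachable in Grev).
    def bfs(start, graph):
        seen = {start}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in graph.get(u, []):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return seen
    return bfs(v, adj) & bfs(v, radj)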
Strongly Connected Components
Faster algorithms exist! Famously, Kosaraju’s algorithm and Tarjan’s algorithm find all strongly connected components of a directed graph in O(V + E).
We won’t cover them in this course, but we will in COMP4128.
These faster algorithms are linear time (by which we mean linear in the size of the input, which consists of the vertex set and edge set).
A linear time algorithm is asymptotically “no slower than” reading the input, so we can run these algorithms “for free”, i.e. without worsening the time complexity of our solution to a problem.
The Condensation Graph
It should be clear that distinct strongly connected components are disjoint sets, so the strongly connected components form a partition of V .
Let CG be the set of all strongly connected components of a graph G.
Definition
Define the condensation graph ΣG = (CG, E∗), where E∗ = {(Cu1, Cu2) | (u1, u2) ∈ E and Cu1 ≠ Cu2}.
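A minimal Python sketch of building ΣG, assuming each vertex has already been labelled with (a representative of) its strongly connected component; for the tsunami problem, one can then check that it suffices to place one sensor in each component with no incoming edge in ΣG.

def condensation(vertices, edges, component):
    # component: dict mapping each vertex to a label of its SCC.
    nodes = {component[v] for v in vertices}
    star_edges = {(component[u], component[w])
                  for (u, w) in edges
                  if component[u] != component[w]}
    return nodes, star_edges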