CSC373 Fall’20
Assignment 1 Solutions: Dinosaurian Dilemmas
Q1 [25 Points] Fruit Frenzy
Dinosaur Bobby is hungry and wants to eat some fruits. He knows about n fruit gardens. Each garden i is described by two integers: ei, the easiness of reaching garden i from where Bobby is, and qi, the number of fruits in garden i.
Bobby wants to pick a garden i to visit. But he wants to make a smart choice: he wants to make sure that no other garden j is simultaneously easier to get to (ej > ei) and has more fruits (qj > qi). Bobby wants your help in counting the number of gardens i that meet this requirement.
(a) [15 Points] Give an O(n log n) time divide-and-conquer algorithm for this problem. (b) [10 Points] Provide a brief justification of its correctness and running time.
Solution to Q1
(a) The high-level algorithm works as follows.
1. Sort all the gardens in a nondecreasing order of their ei-s (i.e., so e1 ≤ e2 ≤ . . . ≤ en).
2. Call the recursive function BestGardens with this sorted list as the first parameter and 0 as the second parameter. Let the function return (list,value).
3. Return the number of gardens in list.
The recursive function BestGardens is given below. For a description of what the algorithm
does, see the justification of correctness in (b). Algorithm 1: BestGardens(inputList,einputThr)
1 Break inputList into two (roughly) equal halves at the middle: leftList and rightList
2 ethr ← e-value of the last garden in leftList
3 (leftOutput,-,-) ← BestGardens(leftList,0)
4 (rightOutput,rightMaxFruits,rightMaxFruitsAboveThr) ← BestGardens(rightList,ethr) 5 leftOutputFiltered ← list of gardens i from leftOutput such that ei < ethr and
qi ≥ rightMaxFruits, or ei = ethr and qi ≥ rightMaxFruitsAboveThr
6 maxFruits ← maximum q-value among all gardens in inputList
7 maxFruitsAboveThr ← maximum q-value among all gardens i in inputList with
ei > einputThr
8 return (leftOutputFiltered ∪ rightOutput, maxFruits, maxFruitsAboveThr)
(b) Justifications:
Running time: Sorting the gardens takes O(n log n) time. Let T (n) denote the worst-case running time of BestGardens on a list of size n. Then, Lines 3 and 4 take T(n/2) time, line 2 takes O(1) time, and lines 5, 6, and 7 take O(n) time. Hence, the worst-case running time is given by the equation T (n) ≤ 2T (n/2) + O(n), which gives T (n) = O(n log n).
1
Correctness: We say that garden i is dominated by garden j when ej > ei and qj > qi, and call garden i undominated if it is not dominated by any other garden.
When we recursively call BestGardens on the two halves (leftList and rightList), we get the lists (leftOutput and rightOutput) of gardens from each half that are not dominated by other gardens from the same half. Since gardens that are undominated must be part of this list, our desired list is a subset of the union of the two lists.
Further, each garden i in rightOutput must be undominated: it is not dominated by any other garden from rightList, and all gardens j from leftList have ej ≤ ei due to the sorting. Finally, for each garden i in leftOutput, we already know that it is not dominated by any other garden from leftList, so we only need to make sure that it is also not dominated by any garden j from rightList.
If all the e-values were distinct, then we would know that all gardens from rightList have higher e-value than any garden from leftList (and thus from leftOutput). Hence, we would only need to know the maximum q-value among gardens from rightList in order to decide which gardens from leftOutput to rule out. However, there is a slight subtlety: some gardens with equal e-values may get split between the left and right halves.
Hence, if ethr denotes the greatest e-value from leftList (i.e. the e-value of the last garden in leftList), then we need to check a garden i from leftOutput against:
• maximum q-value among those gardens j in rightList with ej > ethr, if ei = ethr;
• maximum q-value from rightList, if ei < ethr (as any garden j from rightList has ej ≥ ethr >
ei ).
Function BestGardens is amended such that ethr is passed as an argument to the call on the right half, which then returns precisely these two additional values (the additional parameter passed to the call on the left half or its additional return values do not matter). Then, Line 5 filters gardens from leftOutput using these two values, and Lines 6 and 7 compute these values in the parent call.
Q2 [25 Points] Missing Member
The dinosaur society has reached a magical point where it consists of m = 2n dinosaurs, each with a distinct age from the set {0,1,…,2n − 1}. One day, an assembly is called, and only m − 1 dinosaurs show up. Chief Seema enlists your help to find the age of the missing dinosaur.
She has lined up the dinosaurs (including herself) in an arbitrary order. You can walk up to any dinosaur and query their age. However, the dinosaurs are ex-computer-scientists and like binary encodings. So they will only provide a single bit of their age at a time. That is, in one query, you can walk up to some dinosaur i and ask the j-th bit of their age, where i ∈ {1,…,m} and j ∈ {1,…,n}.
Design an algorithm to find the age of the missing dinosaur in O(m) queries and O(m) running time. [Note: Each dinosaur’s age is represented by n bits, so there are a total of mn bits. That means
you will need to be smart and not ask every dinosaur every bit of their age.]
2
Solution to Q2
The idea of the algorithm is as follows:
• We run an iterative algorithm that determines the bits in the age of the missing dinosaurs one by one (say, from the most significant to the least). Let X0 be the list of all present dinosaurs.
• Consider iteration k, when the first k − 1 bits have already been determined.
• Let Xk−1 be the set of dinosaurs present such that the first k − 1 bits of their age match those
of the age of the missing dinosaur. (We will maintain this list as the algorithm progresses.)
• Note that out of {0, 1, . . . , 2n − 1}, there are precisely 2n−k+1 numbers whose first k − 1 bits
take given values. Hence, |Xk−1| ≤ 2n−k+1.
• Out of these, precisely 2n−k have the next (k-th) bit as 0, while the remaining 2n−k have the
next bit as 1.
• Thus, by querying all the dinosaurs from Xk−1 the k-th bit of their age, we can determine
the k-th bit of the age of the missing dinosaur.
• Before proceeding to the next iteration, we let Xk be the list of those dinosaurs from Xk−1
whose k-th bit also matches that of the missing dinosaur.
This is formally presented as the algorithm below. In iteration k, the number of queries made (in line 4) as well as the running time (of lines 4 and 6) is O(|Xk−1|). Hence, the overall running time and number of queries is O(2n + 2n−1 + . . . + 2) = O(2n+1) = O(m).
Algorithm 2: FindMissing
1 2 3 4
5 6 7 8
Letk←0,X0 ←{1,…,m−1}(indicesofpresentdinosaurs) whilek
4
2. OPT′ is also an optimal solution. By Property 1, we are removing at least one hospital, and we are adding only one at fik+1. Hence, we are not increasing the number of hospitals compared to OPT.
3. The k + 1 leftmost hospitals in OP T ′ match those in G. By Property 2, the k leftmost hospitals in OPT′ are still at fi1,…,fik. Since we remove all hospitals in (fik,fik+1] and add a hospital at fik+1 , the (k + 1)-st leftmost hospital is now at fik+1 .
This is the desired contradiction. Hence, G must be optimal. Q4 [25 Points] Traveller’s Dilemma
Dinosaur Eli is organizing a road trip across Canada. She owns an electric car, which has a battery that can last for 200km on a single charge. If the car runs out of battery, she will be stranded. Luckily for her, n − 1 charging stations are located along the road she plans to take; the distance of the ith charging station from her starting point is d[i]. The distance of her final destination from her starting point is d[n]. Here, d[1] < d[2] < . . . < d[n].
Assume that she begins the trip with a full charge, and the trip is feasible because the distance between any two consecutive charging stations is at most 200km (i.e. d[i + 1] − d[i] ≤ 200 for all i).
(a) [10 Points] Design a greedy algorithm which takes the array d as input and finds locations of fewest possible charging stations where Eli should stop to full-charge the battery in order to successfully complete her trip. Analyze its worst-case running time.
(b) [10 Points] Prove correctness of your algorithm.
(c) [5 Points] Now suppose that there is a cost c[i] to charge the battery at charging station i, regardless of the current amount of power in the battery. Show that the greedy algorithm given in part (a) does not necessarily minimize the total cost of travel.
Solution to Q4
(a) The greedy algorithm is given below. The algorithm always seeks the next charging station i that Eli will not be able to reach given the charge she has left, and decides to recharge at the charging station just before (i − 1).
Algorithm 4: Recharge(d[1], . . . , d[n])
1 2 3 4 5 6 7 8 9
G←∅(greedysolution)
h ← 0 (location of last recharge) for i = 1,...,n do
if d[i]−h>200then G ← G ∪ {d[i − 1]} h ← d[i − 1]
end end
return G
5
Running time: The running time of the algorithm is trivially O(n) as it takes O(1) time to process each i in the loop.
(b) We prove, by induction on i, that there exists an optimal solution that matches the greedy so- lution on the set of charging stations that it selects from {1, . . . , i}. The base case of i = 0 is trivial.
Suppose the statement holds for some i, and let OPTi be such an optimal solution. Let h be the last charging station up to i that both G and OP Ti select. For i + 1, we want to show that there is an optimal solution OP Ti+1 that matches G in its selection from {1, . . . , i + 1}. There are three cases.
1. Both G and OP Ti select station i + 1. In this case, OP Ti+1 = OP Ti works.
2. G selects i + 1 but OP Ti does not. This is not possible. Because G selects i + 1, we must have d[i + 1] − d[h] > 200, in which case OP Ti would not be a valid solution as Eli would run out of charge.
3. OPTi selects i+1 but G does not. In this case, consider OPTi+1 obtained by replacing station i+1 with i+2. Because G does not select i+1, we must have that d[i+2]−d[h] ≤ 200, so Eli can feasibly reach charging station i + 2 under OP Ti+1, and charging later only helps in reaching the later charging stations.
Thus, the inductive hypothesis holds for i = n, at which point we have G = OPTn, i.e., G is an optimal solution.
(c) Consider a simple example with n = 3: d[1] = 199, d[2] = 200, d[3] = 201, and c[1] = 1, c[2] = 100, c[3] = 1.
We can ignore the last station: since we only need to reach d[n], we will never recharge there. The greedy solution will skip the first charging station and go to the second since it can reach the second station. But that has a cost of 100. Instead, recharging early at the first station is still sufficient to reach d[3], but has a much lower cost of 1.
Bonus Question
Q5 [20 Points] No More Slaps, Please!
There are n dinosaurs in a line, sorted in a nondecreasing order of their age. You are given a number x and told that there is at least one dinosaur in the line whose age is x. Your goal is to find one such dinosaur.
At first glance, this may seem like a simple application of binary search that you, as a computer scientist, already know by heart. That is until you remember an important detail about the dinosaur society: Dinosaurs, unlike many humans, want to seem older and get offended if you guess them to be younger than they actually are. So if you approach dinosaur i and ask him “Are you x years old?”, three things can happen:
1. If he is younger than x, he says “Oh my, I’m flattered. But no, I’m younger than x.” 2. If his age is exactly x, he says, “Why yes! How did you know that?”
6
3. If he is older than x, he gets offended and slaps you.
Now, you are strong, but not that strong. You know you can withstand at most k dino-slaps. Any
more slaps and you’re done for.
Design an algorithm to search a dinosaur of the given age x such that you are guaranteed to get slapped no more than k times. What is the asymptotic number of queries that your algorithm makes in the worst case as a function of k?
√
[Note: Think of k as a constant. As a warm start, design an O( n) algorithm for k = 1. To receive any partial credit, you need to satisfactorily solve the problem at least for k = 2.]
Solution to Q5 Let D[1…n] denote the age-sorted array of dinosaurs. We show that for each 1
constant k ≥ 0, Algorithm Slapk is correct and makes O(nk+1 ) queries. We show this via induction on k.
Algorithm 5: Slapk(D[1…n])
1 2 3 4 5
6
7 8 9
10 11 12
ifk=0then
for i = 1,…,n do
Query the age of D[i], and if it is x, return D[i] end
else
1 fort=0,1,…,nk+1 −1do
k i←t·nk+1 +1
Query the age of D[i] If it is x, return D[i]
kk Ifitismorethanx,returnSlapk−1(D[(t−1)·nk+1 +1 … t·nk+1]).
end end
In the base case of k = 0, the algorithm simply does a linear search in the increasing order of age. Hence, it must encounter the promised dinosaur of age x without receiving any slap.
1
Suppose Slapk−1 finds the dinosaur of age x, receives at most k−1 slaps, and makes O(nk ) queries given n dinosaurs.
1
In Slapk, it is easy to see that line 8 makes O(nk+1 ) queries. Further, since the dinosaurs are sorted
by age, it will either encounter the promised dinosaur of age x (which it will return in line 9), or it
will overshoot and receive a slap. If it receives a slap at a given value of t but did not receive a slap kk
at t−1, then we know that the promised dinosaur must be in the range [(t−1)·nk+1 +1 … t·nk+1 ].
Since it calls Slapk−1 on this range, given the induction hypothesis, it will find the correct dinosaur k
with at most k − 1 additional slaps. Thus, Slapk is correct. Since Slapk−1 is called with nk+1 k+1 k k+1
dinosaurs in line 10, it will make O(n k · 1 ) = O(n 1 ) queries. Adding the queries made in line 8, 1
we have that the total number of queries is also O(nk+1 ).
7