
Oliver Braun
CSE 101 (Summer Session 2021)

Homework 3: Divide-And-Conquer.

Instructions

Homework questions will be similar to your future exam questions. The homework will not be graded; however, you are strongly encouraged to practice with these questions to better prepare for the exams.

Key Concepts Divide-And-Conquer.

One of the most powerful techniques for solving problems is to break them down into smaller, more easily
solved pieces. Smaller problems are less overwhelming, and they permit us to focus on details that are
lost when we are studying the entire problem. A recursive algorithm starts to become apparent when
we can break the problem into smaller instances of the same type of problem.

Divide-and-Conquer splits the problem in (say) halves, solves each half, then stitches the pieces back
together to form a full solution. To use Divide-and-Conquer as an algorithm design technique, we must
divide the problem into two smaller subproblems, solve each of them recursively, and then meld the two
partial solutions into one solution to the full problem. Whenever the merging takes less time than solving
the two subproblems, we get an efficient algorithm.

Divide-and-Conquer is a design technique with many important algorithms to its credit, including Merge-
sort and Karatsuba’s algorithm to multiply large integers. Mergesort is the classic example of a Divide-
and-Conquer algorithm. It takes only linear time to merge two sorted lists of n/2 elements, each of
which was obtained in O(n log n) time. Analyzing the running time of a divide and conquer algorithm
generally involves solving a recurrence relation that bounds the running time recursively in terms of the
running time on smaller instances.
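
As a concrete sketch of this pattern (an illustrative Python version; the problems below develop the pseudocode and analysis carefully):

```python
def mergesort(a):
    # Base case: a list of 0 or 1 elements is already sorted.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = mergesort(a[:mid])     # solve the left half recursively
    right = mergesort(a[mid:])    # solve the right half recursively
    # Stitch the two sorted halves back together in linear time.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])       # append whatever remains
    merged.extend(right[j:])
    return merged
```

Here the merge step is the "stitching" that costs less than solving the two subproblems, which is what makes the overall algorithm efficient.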

One thing to note about many settings in which Divide-and-Conquer is applied, including several of the problems below, is that the natural brute-force algorithm may already run in polynomial time, and the divide-and-conquer strategy serves to reduce the running time to a lower polynomial. This is in contrast to most of the problems in the previous sections, where brute force was exponential and the goal in designing a more sophisticated algorithm was to achieve any kind of polynomial running time. We will see that a Divide-and-Conquer approach can often reduce the runtime from O(n^2) to O(n log n).

Sources:

[1] Dasgupta, Sanjoy / Papadimitriou, Christos / Vazirani, Umesh (2008): Algorithms, McGraw-Hill.
[2] Kleinberg, Jon / Tardos, Eva (2006): Algorithm Design, Pearson.
[3] Skiena, Steven S. (2020): The Algorithm Design Manual, 3rd ed., Springer.


Problem 1. For each of the following, can the Master Theorem be used to determine the asymptotic complexity
of T (n)? If so, apply the theorem; if not, state why the Master Theorem cannot be applied.

(a) T(n) = 16T(n/4) + n

(b) T(n) = 2^n · T(n/2) + n^n

(c) T(n) = 3T(n/3) + n/2

(d) T(n) = 0.5 · T(n/2) + n

(e) T(n) = 2T(n/4) + n^0.51

(f) T(n) = 64T(n/8) − n^2 · log n

(g) T(n) = 6T(n/2) + n^2

(h) T(n) = 3T(n/2) + n·√n

(i) T(n) = T(n − 1) + n

(j) T(n) = 4T(n/2) + n^2 + 3n

(k) T(n) = T(n/2) + n

Solution:

(a) T(n) = 16T(n/4) + n =⇒ T(n) ∈ O(n^2) (Case 1)

(b) M.T. cannot be used; a is not a constant, and the number of subproblems should be fixed

(c) T(n) = 3T(n/3) + n/2 =⇒ T(n) ∈ O(n log n) (Case 2)
(d) M.T. cannot be used; a < 1, and we cannot have fewer than one subproblem

(e) T(n) = 2T(n/4) + n^0.51 =⇒ T(n) ∈ O(n^0.51) (Case 3)

(f) T(n) = 64T(n/8) − n^2 · log n =⇒ it depends on the statement of the Master Theorem that you use. In our statement of the theorem this recurrence is inadmissible (the combination term is negative), but there is a more general statement for which this recurrence works.

(g) T(n) = 6T(n/2) + n^2 =⇒ T(n) ∈ O(n^(log_2 6)) (Case 1, since a = 6 > b^d = 4)

(h) T(n) = 3T(n/2) + n·√n =⇒ T(n) ∈ O(n^(log_2 3)) (Case 1, since a = 3 > b^d = 2^1.5)

(i) M.T. cannot be used; the recurrence is not of the required form T(n) = a · T(n/b) + f(n).

(j) T(n) = 4T(n/2) + n^2 + 3n =⇒ T(n) ∈ O(n^2 · log n) (Case 2)

(k) T(n) = T(n/2) + n =⇒ T(n) ∈ O(n) (Case 3, since a = 1 < b^d = 2)

Problem 2. Mergesort. Given an array A of n numbers, sort the array using Mergesort.

(a) Describe the algorithm or write the pseudocode.

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Algorithm:

Mergesort(A[1,...,n]):
    if (n == 1) then return A[1]
    return Merge(Mergesort(A[1,...,n/2]), Mergesort(A[n/2+1,...,n]))

Merge(A[1,...,k], B[1,...,l]: sorted arrays):
    if (k == 0) then return B[1,...,l]
    if (l == 0) then return A[1,...,k]
    if (A[1] <= B[1]) then return A[1] ◦ Merge(A[2,...,k], B[1,...,l])
    else return B[1] ◦ Merge(A[1,...,k], B[2,...,l])

(Here x ◦ C denotes the array obtained by prepending the element x to the array C.)

(b) Correctness of Merge, by induction on the total number of elements n = k + l:

Base Case: When n = 1, either k = 0 or l = 0, and the algorithm returns either the sorted array B[1..l] (if k = 0) or the sorted array A[1..k] (if l = 0).

Induction Hypothesis: Assume that Merge(A[2..k], B[1..l]) and Merge(A[1..k], B[2..l]) each return a sorted array containing all elements of both input arrays.

Induction Step (n − 1 → n):

Case 1: A[1] <= B[1]. Then Merge(A[1..k], B[1..l]) returns a sorted array containing all n = k + l elements of both arrays: A[1] is the smallest of all elements, and (by the Induction Hypothesis) Merge(A[2..k], B[1..l]) returns a sorted array.

Case 2: B[1] < A[1].
Then Merge(A[1..k], B[1..l]) returns a sorted array containing all n = k + l elements of both arrays: B[1] is the smallest of all elements, and (by the Induction Hypothesis) Merge(A[1..k], B[2..l]) returns a sorted array.

For further details, please refer to the lecture slides and videos.

(c) Time complexity: Supposing that n is even, the algorithm spends O(n) time to divide the input into two pieces of size n/2 each; it then spends time T(n/2) to solve each one (since T(n/2) is the worst-case running time for an input of size n/2); and finally it spends n − 1 comparisons to combine the solutions from the two recursive calls. Thus the running time T(n) satisfies the following recurrence relation:

T(1) = 0
T(n) = 2 · T(n/2) + (n − 1)
     = 2^2 · T(n/2^2) + (n − 2) + (n − 1)
     = 2^2 · T(n/2^2) + 2n − (1 + 2)
     = 2^2 · (2 · T(n/2^3) + (n/2^2 − 1)) + 2n − (1 + 2)
     = 2^3 · T(n/2^3) + 3n − (1 + 2 + 2^2)
     = . . .
     = 2^k · T(n/2^k) + kn − Σ_{i=0}^{k−1} 2^i
     = 2^k · T(n/2^k) + kn − (2^k − 1)
     = . . .
     = n · T(1) + n · log_2 n − (n − 1)
     = n · log_2 n − (n − 1)
     = Θ(n log n)

By the Master Theorem we can also write T(n) = O(n^1 · log n), since a = 2 = b^d (with a = 2, b = 2, d = 1).

For further details, please refer to the lecture slides and videos.

Problem 3. Consider the problem of multiplying two large integers (each with n digits). Let our time measure be the number of multiplications of two digits. Describe an O(n^2) algorithm and describe Karatsuba's algorithm. What is Karatsuba's main idea and why is it a Divide-And-Conquer algorithm? Analyze the runtime of Karatsuba's algorithm.

Solution:

O(n^2) algorithm: The multiplication method you learned in primary school. It involves n^2 multiplications of two digits and therefore T(n) = n^2 multiplications of two digits.

Karatsuba's main idea: Compute the product of two n-digit numbers with only 3 multiplications of two n/2-digit numbers.

procedure Karatsuba(a = a1a2...an, b = b1b2...bn: two n-digit integers in decimal)
1. if n == 1
2.     return a · b
3. m := ⌊n/2⌋
4. al := a1...am, ar := am+1...an
5. bl := b1...bm, br := bm+1...bn
6. p1 := Karatsuba(al, bl)
7. p2 := Karatsuba(ar, br)
8. p3 := Karatsuba(al − ar, br − bl) + p1 + p2
9. return p1 · 10^(2(n−m)) + p3 · 10^(n−m) + p2

Time complexity: Let T(n) be the number of multiplications of Karatsuba's algorithm on two n-digit inputs.

T(1) = 1
T(n) = 3T(n/2) = 3^2 · T(n/2^2) = . . . = 3^k · T(n/2^k) = . . . = 3^(log_2 n) · T(1) = n^(log_2 3)

Problem 4. Medians (Dasgupta et al. 53-56). Given a list S = s1, . . . , sn of integers.

(a) Design a Divide-and-Conquer algorithm to find the median integer in the list in time O(n) (on average).

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Algorithm: To find the median of a list, we split the list into 3 groups. Initially set k = ⌈n/2⌉, the position of the median element. (If n is even, call the same algorithm again with k = n/2 + 1, add the two solutions, and divide by 2.)

Given a list S, choose a random element from the list, say v. Split list S into 3 categories: elements smaller than the value v, those equal to v (there might be duplicates), and those greater than v. Call these SL, Sv, SR respectively.

Example: S = (2, 36, 5, 21, 8, 13, 11, 20, 5, 4, 1)

We may choose v = 5, resulting in:

SL = (2, 4, 1)
Sv = (5, 5)
SR = (36, 21, 8, 13, 11, 20)

The search can instantly be narrowed down to one of these sublists:

selection(S, k) =
    selection(SL, k)                   if k ≤ |SL|
    v                                  if |SL| < k ≤ |SL| + |Sv|
    selection(SR, k − |SL| − |Sv|)     if k > |SL| + |Sv|
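
A minimal Python sketch of this selection recurrence (illustrative only; the `median` wrapper implements the even-n averaging described above, and the names are chosen to mirror the text):

```python
import random

def selection(s, k):
    # Returns the k-th smallest element of s (1-indexed).
    v = random.choice(s)              # random pivot, as in the text
    sl = [x for x in s if x < v]      # SL: elements smaller than v
    sv = [x for x in s if x == v]     # Sv: duplicates of v
    sr = [x for x in s if x > v]      # SR: elements greater than v
    if k <= len(sl):
        return selection(sl, k)
    if k <= len(sl) + len(sv):
        return v
    return selection(sr, k - len(sl) - len(sv))

def median(s):
    n = len(s)
    if n % 2 == 1:
        return selection(s, (n + 1) // 2)
    # For even n, average the two middle order statistics.
    return (selection(s, n // 2) + selection(s, n // 2 + 1)) / 2
```

On the example list from the text, median([2, 36, 5, 21, 8, 13, 11, 20, 5, 4, 1]) yields 8.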

(b) Correctness: Proof by induction.

Base Case: If n = 1, the algorithm returns the single element of the list, which is the median of a list of size 1.

Induction Hypothesis: Suppose n ≥ 2. Assume that selection(S′, i′) returns the i′-th smallest element of S′ for every active sublist S′ of size less than n. (The active sublist S′ consists of all elements of S between min(S′) and max(S′). Also, the current index i′ has the property that the i′-th smallest element of S′ is the k-th smallest element of S.)

Induction step: selection recurses on 1 instance, a sublist of size less than n. By the IH, this call returns the i′-th smallest element of that sublist. Because the index is updated to maintain that the i′-th smallest element of the sublist is the k-th smallest element of S, the output is correct.

(selection(S, k) recurses when the k-th smallest element is not the chosen pivot v. It calls selection on one of the sublists with an adjusted index, and this preserves the invariant: if k ≤ |SL|, then the k-th smallest element of S is the k-th smallest element of SL. Otherwise, if k > |SL| + |Sv|, the element lies to the right of both SL and Sv, so it is the (k − |SL| − |Sv|)-th smallest element of SR. The property holds.)

(c) Runtime: Our algorithm depends on the random choices of v. If we constantly choose the worst pivot, as in Quicksort's worst case, the algorithm would run in O(n^2) time. For the average case, we check whether v is "lucky" or "unlucky": a lucky choice is one that lies between the 25th and 75th percentiles of the array, an unlucky one is outside.

Using expected values: if v lands between the 25th and 75th percentiles, we are lucky; if it does not, we need to continue. Each possibility has probability 0.5. Thus E = 1 + 0.5E, and so E = 2: on average, 2 split operations suffice before a lucky one occurs.

After 2 split operations, on average, the array shrinks to at most 3/4 of its size, so T(n) has an expected running time of T(n) = T(3n/4) + O(n). Using the Master Theorem, we get that the total runtime is O(n).


Problem 5. Counting Inversions (Kleinberg/Tardos 127-131). We are given an array L of n > 0 distinct
integers. We say that two indices i < j form an inversion if L[i] > L[j]; that is, if the two elements
L[i] and L[j] are "out of order". For example, consider the array (6, 1, −4, 10, 2, 7). This array has
six inversions: the index pairs (0, 1), (0, 2), (0, 4), (1, 2), (3, 4), (3, 5).

(a) One possible solution is to brute-force check every ordered pair of indices in our array. What
would be the runtime of such an algorithm?

(b) Design a Divide-and-Conquer algorithm that solves the counting inversion problem. Hint:
Modify the Mergesort algorithm accordingly.

(c) Prove the correctness of your algorithm.

(d) Prove the runtime of the algorithm provided in (b). Compare this to your findings in (a).

Solution:

(a) There are (n choose 2) = n(n − 1)/2 ordered pairs to check (constant time for each pair). Therefore, a brute-force method would take O(n^2) time.
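
Such a brute-force count can be sketched as:

```python
def count_inversions_bruteforce(arr):
    # Check every index pair (i, j) with i < j: O(n^2) comparisons.
    n = len(arr)
    return sum(1 for i in range(n)
                 for j in range(i + 1, n)
                 if arr[i] > arr[j])
```

On the example array (6, 1, −4, 10, 2, 7) it counts the six inversions listed above.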

(b) We follow the idea of the Mergesort algorithm. This means that we will need a sub-algorithm
that takes two arrays, and computes the number of new inversions created by putting them
together. It will also sort our array, just like with Mergesort.

Let’s call this sub-algorithm Merge-and-Count. Merge-and-Count takes two sorted arrays, A
and B, as input. Merge-and-Count will output two things: an integer and an array.

We first initialize two pointers: one starting at the beginning of A (pA), and one starting at the
beginning of B (pB). We also initialize a variable count to keep track of the new inversions.
Now, while both pA and pB are still within their respective arrays, suppose pA points to A[i]
and pB points to B[j].

We compare A[i] and B[j]:

1. If A[i] < B[j], then increment pA and put A[i] at the end of our output array.

2. If A[i] > B[j], then increment pB and put B[j] at the end of our output array.

In the second case, furthermore, increment count by the number of elements remaining in A (as, by our assumption that A is sorted, every remaining element in A is greater than B[j]).

Once one of the pointers moves past the end of its array, we append the remaining elements of the other array to the output, then output count and our output array. This completes the description of Merge-and-Count. Observe that if A and B are both of length n, then the run-time of Merge-and-Count is O(n).

Now that Merge-and-Count is complete, it only remains to write our recursive algorithm for
Divide-and-Conquer, which we will call Sort-and-Count. The idea is again quite similar to that
of Mergesort.

Sort-and-Count takes in an array L as input and outputs an integer (the number of inversions
in L) and an array (the sorted array of L). If L has size 1, then we will output 0 and L.

Otherwise, we can divide L into two arrays. Suppose L has length n. We define A to be the first ⌈n/2⌉ elements of L, and define B to be the remaining ⌊n/2⌋ elements of L. Now let

(A, countA) = Sort-and-Count(A)

(B, countB) = Sort-and-Count(B)

(L, count) = Merge-and-Count(A,B).

We see now that the number of inversions is c := countA + countB + count. Furthermore, L is
sorted. So we return c and L. This completes the design of Sort-and-Count.
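
The pair of routines described above can be sketched in Python (an illustrative sketch; names follow the text):

```python
def merge_and_count(a, b):
    # Merge two sorted lists; count pairs (i, j) with a[i] > b[j].
    merged, count = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i]); i += 1
        else:
            # b[j] is smaller than every remaining element of a,
            # so each of those elements forms a new inversion with b[j].
            count += len(a) - i
            merged.append(b[j]); j += 1
    merged.extend(a[i:]); merged.extend(b[j:])
    return merged, count

def sort_and_count(lst):
    # Returns (sorted copy of lst, number of inversions in lst).
    if len(lst) <= 1:
        return list(lst), 0
    mid = (len(lst) + 1) // 2           # A = first ceil(n/2) elements
    a, count_a = sort_and_count(lst[:mid])
    b, count_b = sort_and_count(lst[mid:])
    merged, count = merge_and_count(a, b)
    return merged, count_a + count_b + count
```

On the example array, sort_and_count([6, 1, -4, 10, 2, 7]) returns the sorted array together with the inversion count 6.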

(c) Correctness Proof. We first prove the correctness of Merge-and-Count on two sorted arrays A and B. (Remember that Merge-and-Count should output the number of new inversions created by combining A and B. More precisely, Merge-and-Count counts the number of pairs (i, j) such that A[i] > B[j].)

Fix j. Observe that count is only increased whenever we are in the second condition of our
while loop, where A[i] > B[j].

Claim 1: count is increased by precisely the number of pairs (i, j) such that A[i] > B[j].


If we can prove Claim 1, then the correctness of Merge-and-Count will be complete (the proof that the sorting is correct is the same as that of Mergesort). We first prove the following claim.

Claim 2: In our while loop, whenever we are in the second case, (A[i] > B[j]), i is the least
integer such that A[i] > B[j].

Suppose towards contradiction that there is some integer k < i such that A[k] > B[j]. Then, because B is sorted, our while loop would have run the second case on A[k] > B[j], contradicting our original assumption that we are in the second case on A[i] > B[j].

With Claim 2 proven, and the fact that A is sorted, Claim 1 easily follows. This completes the proof of Merge-and-Count.

We now proceed by induction on n, the length of our array, to prove the correctness of Sort-
and-Count.

If n = 1, then Sort-and-Count outputs 0 inversions and L, which is clearly correct. This
establishes our basis.

Assume for our induction hypothesis that Sort-and-Count is valid on every array of length at most n. Suppose L has length n + 1. Then, informally, countA is the number of inversions with both indices among the first ⌈(n + 1)/2⌉ positions of L, countB is the number of inversions with both indices among the last ⌊(n + 1)/2⌋ positions of L, and count is the number of inversions with pairs mixed between the first and last halves of L. Therefore c matches our definition of inversions for
L. We have already proved that Merge-and-Count provides the correct sorted array for L, so
Sort-and-Count is valid for L. This completes the induction step for Sort-and-Count.

(d) We can see that Sort-and-Count has the run-time equation

T (n) = 2T (n/2) +O(n).

The Master Theorem now tells us that T(n) = O(n log n) – a drastic improvement over O(n^2).


Problem 6. Counting Occurrences (Source: Skiena, 148-149). We want to count the number of times a given
key (say “Arth”) occurs in a sorted array of strings. As an example, when the input is

(Arth, Anvitaa, Danica, Yihe, Alexis, Arth, Alexis, Gaurav,

Gaurav, Yihe, Anvitaa, Danica, Wilson, Arth, Danica, Arth)

the answer would be 4, because Arth occurs 4 times in the array.

(a) Given a sorted array A of strings and key k, design a Divide-and-Conquer algorithm that
counts the number of occurrences of k in the array A and has a runtime of O(log n). Hint: Use
a modified version of Binary search.

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Algorithm:

The algorithm uses the binary search routine presented below to find the index of an element
k of the correct block. To identify the boundaries of the block, the algorithm sequentially tests
elements to the left of k until we find one that differs from the search key, and then repeat this
search to the right of k. The difference between the indices of these boundaries plus one gives
the number of occurrences of k. If the binary search returns −1, then we set occurrence to 0.
Finally, we return the occurrence.

binary_search(A, k, low, high):
    if (low > high):
        return -1
    middle = ⌊(low + high) / 2⌋
    if (A[middle] == k):
        return middle
    if (A[middle] > k):
        return binary_search(A, k, low, middle - 1)
    else:
        return binary_search(A, k, middle + 1, high)

An alternative algorithm would be to use binary search twice: first time to find the index of
the right boundary and second time to find the index of the left boundary. We could use this
modified version of binary search in which we remove the line

if (A[middle] == k) then return middle

and return index high instead of -1 on an unsuccessful search. The index obtained from the first search would be the right boundary of key k. For the second search, we reverse the comparison of A[middle] with k to obtain the index of the left boundary of k. If k does not occur in A, the two boundaries bracket an empty range and we set the occurrence to 0. Finally, the algorithm returns (right index − left index + 1), which is the occurrence of key k.
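
The two-boundary idea corresponds to the two binary searches provided by Python's bisect module; a sketch:

```python
from bisect import bisect_left, bisect_right

def count_occurrences(a, k):
    # a is sorted. bisect_left finds the left boundary of the block
    # of k's; bisect_right finds one past the right boundary.
    # Each search is O(log n).
    left = bisect_left(a, k)
    right = bisect_right(a, k)
    return right - left        # 0 if k does not occur in a
```

On the (sorted) example array, the count for the key "Arth" is 4.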

(b) Correctness: For the first method, the algorithm uses binary search to find one location of key k, since A is sorted. If k is not present in the array A, then the algorithm returns 0 for the occurrence. Otherwise, binary search returns an index i of k. Then the algorithm traverses to the left of i in A until it sees an element that differs from k (it stops before moving to the different element). This current index is the correct left boundary of key k. Likewise, moving to the right until it encounters a different key yields the correct right boundary of key k, since the array A is sorted. The algorithm returns the difference between the right and left index plus 1, which gives the correct occurrence of k.

For the second method, the algorithm uses modified binary search twice to find the left and
right boundary of k. Since the array A is sorted, binary search would return the correct index
of k in A. If k is not in A, then the algorithm returns 0, which is the correct result. For the


modified version of binary search, every search is unsuccessful since there is no equality test. The search proceeds to the right half whenever the key is compared to an identical array element, eventually terminating at the right boundary. For the second search, since we reverse the direction of the comparison between A[middle] and k, the search proceeds to the left half when the key is compared to an identical array element. This leads to the left boundary. The algorithm returns the difference between the right and left index plus 1, which also gives the correct occurrence of k.

(c) Runtime: The first method uses binary search to find a location of key k in O(log(n)) time. Then the algorithm moves to the left of k to find the first differing element and then moves to the right to find another differing element. This takes O(s) time, where s is the number of occurrences of k in array A. Thus, the runtime for the first method is O(log(n) + s). However, this method can approach O(n) when s becomes close to n. This brings us to the second method. The second method uses binary search twice to obtain the right and left index of key k and returns the difference plus 1. Since binary search takes O(log(n)) time, the second method runs in O(log(n)) time.


Problem 7. Largest Subrange (Source: Skiena 157-158). Suppose you are tasked with writing the advertising
copy for a hedge fund whose monthly performance this year was

(−17, 5, 3,−10, 6, 1, 4,−3, 8, 1,−13, 4)

You lost money for the year, but from May through October you had your greatest gains over any
period, a net total of 17 units of gains. This gives you something to brag about.

The largest subrange problem takes an array A of n numbers, and asks for the index pair i and j that maximizes S = Σ_{k=i}^{j} A[k].

Remember: Summing the entire array does not necessarily maximize S because of negative num-
bers.

(a) Given an array A = [a1, . . . , an], design a Divide-and-Conquer algorithm that finds a maximum subarray and has a runtime of O(n log n). Hint: Split the array into halves, as in Mergesort.

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Algorithm. Given an array A = [aL, . . . , aH ], where the initial call is L = 1 and H = n:

- Divide [aL, . . . , aH ] into 2 (almost) equal subarrays AL = A[aL, . . . , aM ] and AH = A[aM+1, . . . , aH ], where M = ⌊(L + H)/2⌋.

- Find the maximum subarrays of AL and AH recursively.

- Find a maximum subarray that crosses the midpoint (by checking the strip straddling the centre).

- Return the maximum of the three.
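
These steps can be sketched in Python (returning only the maximum sum; tracking the index pair i, j is a straightforward extension):

```python
def max_subrange(a, lo=0, hi=None):
    # Maximum sum of a contiguous, non-empty subarray of a[lo:hi].
    if hi is None:
        hi = len(a)
    if hi - lo == 1:
        return a[lo]
    mid = (lo + hi) // 2
    best_left = max_subrange(a, lo, mid)    # entirely in the left half
    best_right = max_subrange(a, mid, hi)   # entirely in the right half
    # Best range crossing the midpoint: extend leftwards and rightwards.
    s, best_l = 0, float('-inf')
    for i in range(mid - 1, lo - 1, -1):
        s += a[i]
        best_l = max(best_l, s)
    s, best_r = 0, float('-inf')
    for i in range(mid, hi):
        s += a[i]
        best_r = max(best_r, s)
    return max(best_left, best_right, best_l + best_r)
```

On the hedge-fund example, max_subrange([-17, 5, 3, -10, 6, 1, 4, -3, 8, 1, -13, 4]) returns 17, the May-through-October gain from the text.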

(b) Correctness:

The largest subrange is either entirely to the left of the centre, entirely to the right of it, or (as in the example above) the sum of the largest centre-bordering ranges on the left and on the right.

Base case: When n = 1, the optimal subrange is A[1] since there are no other ranges.

Induction hypothesis: Suppose n ≥ 2. Assume that subrange(A[aL, . . . , aM ]) and subrange(A[aM+1, . . . , aH ]) are correct, where L = 1, M = ⌊n/2⌋ and H = n.

Induction step: subrange is called on 2 instances of size n/2. By the IH, these return the 2 optimal subranges of the halves. In addition, the best subrange crossing the midpoint is computed by extending from the centre into both halves. The maximum of the 3 candidates is returned. Thus the subrange algorithm is correct: it outputs a maximum subrange of input A.

(c) Runtime: Checking the strip straddling the centre takes O(n) time (one scan leftwards and one scan rightwards from the midpoint). Calculating the maximum subarrays of AL, AH recurses on 2 halves of the problem (the crossing subarray is not a recursive subproblem because it is found by a direct scan around the centre). Thus this part is T(n) = 2T(n/2). Lastly, choosing the maximum of the 3 candidates is O(1). Thus we get T(n) = 2T(n/2) + O(n) + O(1) = O(n log n) by the Master Theorem.


Problem 8. Closest Pair of Points (Source: Skiena 158-159, Kleinberg/Tardos 131-137). Given an array of n
points on a plane P = [(x1, y1), . . . , (xn, yn)], find the distance of the closest pair of points in the
array.

(a) Devise a Divide-and-conquer algorithm that solves this problem in O(n log n) time.

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Using known sorting algorithms, we can sort P into Px and Py, in order of x-coordinate and
y-coordinate, respectively. Furthermore, this can be done in O(n log n) time.

We now design a sub-algorithm Closest-Pair-Rec that takes in as input (Px, Py), where Px and
Py contain the same points but ordered by x-coordinate and y-coordinate, respectively.

If |P | ≤ 3, then we find a closest pair by measuring all pairwise distances. Closest-Pair-Rec
now outputs this pair.

Otherwise, construct Q by taking the first ⌈n/2⌉ elements of Px, and construct R by taking the remaining elements. Construct Qx, Qy, Rx, and Ry in the same way that Px and Py were constructed. This step only takes O(n) time since we already have Px and Py to work from.
Let

(q0*, q1*) = Closest-Pair-Rec(Qx, Qy)
(r0*, r1*) = Closest-Pair-Rec(Rx, Ry).

Furthermore, set

δ = min(d(q0*, q1*), d(r0*, r1*))
x* = maximum x-coordinate of a point in Q
L = {(x, y) : x = x*}
S = points in P within distance δ of L

where d denotes the Euclidean distance between two points. Construct Sy (S sorted by y-coordinate). For every point s ∈ Sy, compute the distance from s to each of the next 15 points in Sy. Let (s, s′) be a pair in S that has minimum distance.
Finally, if d(s, s′) < δ, then our sub-algorithm outputs (s, s′). Otherwise, it returns the pair that has minimum distance out of (q0*, q1*) and (r0*, r1*). This completes the sub-algorithm.

(b) Correctness. Proof sketch – for a full proof, please refer to Kleinberg/Tardos 131. We proceed by induction on |P| to prove the correctness of Closest-Pair-Rec.

If |P| ≤ 3, Closest-Pair-Rec brute-forces the pairs and provides the correct answer, establishing our basis.

Assume for our induction hypothesis that Closest-Pair-Rec outputs the correct answer whenever |P| ≤ n. Let P be a set of points with |P| = n + 1. With Q and R defined as in the algorithm, (q0*, q1*) and (r0*, r1*) are a pair of closest points for Q and R, respectively. δ now provides an upper bound for the minimum of distances in P. The only way that we could have a smaller distance is by considering pairs of points with one point from Q and one point from R. One can prove that if such a pair existed, the two points within the pair would have to be within S. Furthermore, one can prove that if such a pair existed, the two points in the pair would be at most 15 indices away from each other in Sy (S sorted by y-coordinate). [In fact, one can shorten this down to 7 indices.] Closest-Pair-Rec now takes the minimum of the pairs found in Q and R, or the pair found within S, which exhausts the pairs that we need to check. This completes the correctness proof of Closest-Pair-Rec.

(c) We can see that Closest-Pair-Rec has the run-time equation T(n) = 2T(n/2) + O(n). The Master Theorem now tells us that T(n) = O(n log n). Therefore, our overall algorithm also has run-time O(n log n).

Problem 9. Suppose that each person in a group of n people votes for a person from a slate of candidates to fill a position on a committee. The top finisher wins the position as long as the winner receives more than n/2 votes.
(a) Devise an O(n log n) Divide-and-Conquer algorithm that determines whether the candidate who received the most votes received more than n/2 votes and, if so, determines who that person is. Hint: Assume that n is even and split the sequence of votes into two sequences, each with n/2 elements. Note that a candidate could not receive a majority of votes without receiving a majority of votes in at least one of the two halves.

(b) Prove the correctness of the algorithm.

(c) Perform a time analysis for your algorithm (time measure: number of comparisons).

Solution:

(a) We first note that our base case is that a sequence with one element means that the one person on the list is the winner. For our recursive step, we divide the list into two equal parts and determine which name occurs most often in each part. We know that the winner requires a majority of votes, and that means the winner must have a majority in at least one of the two parts. Keep applying this recursive step to each half, and we will eventually have at most two candidate names. Count the number of occurrences in the whole list of the two remaining names; this decides the winner. This only requires at most 2n additional comparisons for a list of length n.

(b) Correctness. For the base case, when n = 1, the single voter's choice is the winner, and the algorithm returns that person. Assume the algorithm correctly decides the winner, and whether the winner receives more than half of the votes, for every input of size k with 1 ≤ k < n. Consider the case where there are n people, and assume the person who gets the most votes gets m votes in total. We discuss two cases for m and show that our algorithm gives the correct output in each.

1. m > n/2: Assume the person gets m1 votes in the first half of the votes and m2 votes in the second half. Then m = m1 + m2 > n/2 implies m1 > n/4 or m2 > n/4. So this person has to be the winner in at least one of the halves and, by the IH, would be returned by the subroutine called on that half of the votes. There is a possibility that in the other half another candidate earns more than half of the votes; our algorithm then scans through all the votes, correctly counts the number of votes earned by each of the two candidates, and correctly returns the one with more votes, who is the winner of this election.

2. m ≤ n/2: In this case, no matter whether the procedure called on each half returns a winner or not, after examining the total votes of the candidates returned by the subroutines, our algorithm determines that they receive no more than half of the votes and correctly reports that there is no valid winner in this election.

(c) Our recurrence is f(n) = 2f(n/2) + 2n, so a = 2, b = 2, d = 1. By the Master Theorem, since a = b^d, the big-O is O(n^d log n) = O(n log n).
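
The recursion described above can be sketched in Python (an illustrative sketch; each half proposes at most one candidate, and candidates are then verified by counting over the whole list, which accounts for the 2n extra comparisons):

```python
def majority(votes):
    # Returns the candidate with more than len(votes)/2 votes, or None.
    if len(votes) == 1:
        return votes[0]
    mid = len(votes) // 2
    # A majority overall must be a majority in at least one half.
    candidates = {majority(votes[:mid]), majority(votes[mid:])}
    for c in candidates:
        if c is not None and votes.count(c) > len(votes) / 2:
            return c
    return None
```

For example, majority(["a", "b", "a", "a"]) returns "a", while a list with no strict majority yields None.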


Problem 10. Let’s say you have a stack of (rectangular) windows open on your computer and you have a setting
that tethers each program window so that the lower left corner of each window is in the lower left
corner of the screen. In order for your computer to know what part of the screen you are using, it
needs to know the general outline of all the windows put together.

Notice that since all windows are tethered to the lower left corner, each window can be identified
by the upper right corner.

Given a list of n upper right corners of windows: [(x1, y1), . . . , (xn, yn)], find the set of corners that
identify the outline of the stack of windows.

Example: Suppose that you had windows with upper right coordinates at:

[(4, 14), (7, 11), (10, 6), (8, 16), (15, 2), (2, 18), (13, 8), (17, 5)]

Then the outline of the collection of windows is identified by the points:

[(2, 18), (8, 16), (13, 8), (17, 5)]

(a) Devise a Divide-and-conquer algorithm that solves this problem in O(n log n) time.

(b) Prove the correctness of the algorithm.

(c) Prove the time complexity of the algorithm.

Solution:

(a) Algorithm: Sort the points in increasing order by x coordinate. If there is only one window,
then return the top right corner of that window. Otherwise split the list of points in half to
get the left half L and the right half R. Recursively find the outline of L: OL and the outline
of R: OR.

Notice that the points that define the outline of R will be part of the outline of the original
set. The same cannot be said about the points that define the outline of OL. There could be
a point in OR that is higher than a point in OL. So, only keep the points from OL that are
higher than the highest point of OR.


procedure Windows([(x1, y1), . . . , (xn, yn)]: sorted by x-coordinates)

1. if n == 1
2.     return (x1, y1)
3. m = ⌊n/2⌋
4. L = [(x1, y1), . . . , (xm, ym)]
5. R = [(xm+1, ym+1), . . . , (xn, yn)]
6. OL = Windows(L)
7. OR = Windows(R)
8. output = [ ] (initialize the output to be the empty list)
9. Set Y to be the maximum y-value of OR
10. for (x, y) ∈ OL
11.     if y > Y
12.         add (x, y) as the last element of output
13. for (x, y) ∈ OR
14.     add (x, y) as the last element of output
15. return output
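
A Python transcription of this pseudocode (assuming, as stated, that the input list is already sorted by x-coordinate):

```python
def windows(pts):
    # pts: list of (x, y) upper-right corners, sorted by x-coordinate.
    if len(pts) == 1:
        return pts
    m = len(pts) // 2
    out_l = windows(pts[:m])          # outline of the left half
    out_r = windows(pts[m:])          # outline of the right half
    y_max = max(y for (x, y) in out_r)
    # Keep only left-outline points taller than everything on the right;
    # every right-outline point survives.
    return [(x, y) for (x, y) in out_l if y > y_max] + out_r
```

On the example (after sorting by x), this returns [(2, 18), (8, 16), (13, 8), (17, 5)], matching the outline given above.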

(b) Correctness:

Base case: When n = 1, the outline is the single corner A[1], since there are no other windows that can cover it.

Induction hypothesis: Suppose n ≥ 2. Assume that Windows([(x1, y1), . . . , (xm, ym)]) and Windows([(xm+1, ym+1), . . . , (xn, yn)]) are correct, where m = ⌊n/2⌋.

Induction step: Windows is called on 2 instances of size n/2. By the IH, these return the outlines of the two subarrays. Lines 10-12 contain a for loop in which all windows in OL that are taller than the tallest window in OR are added to the final output; lines 13-14 then add all elements of OR to the output. This works because the outline windows of OL that are taller than every window in OR cannot be covered by OR, while every other point of OL is covered by the tallest window of OR, which lies further to the right. Thus, the Windows algorithm is correct: it outputs the outline of the windows.

(c) Runtime Analysis: There are two recursive calls, each on a problem of half the size. Then,
finding the maximum y coordinate will take O(n) time and adding in the coordinates to output
will take O(n) time. All in all, the non-recursive part takes O(n) time. So if T (n) is the runtime
of the algorithm on an input of length n, then T (n) = 2T (n/2) +O(n).

By the Master Theorem with a = 2, b = 2, d = 1, we get T (n) = O(n log n). Don’t forget that
we sorted the points before we ran the algorithm. Sorting takes O(n log n). So, all in all, the
entire algorithm takes O(n log n).
