CSC373 Fall’20 Midterm 1 Solutions
Note 1: There were two variants of the midterm; let’s call them versions A and B. They differed only in finer details of Q1 and Q2; the key ideas behind the solutions were the same. For each of Q1 and Q2, I’ll first describe the difference between the two variants, then provide the solution for version A, and then identify the minor differences in the solution for version B.
Note 2: Please note that the solutions below differ from the kind of solutions expected from you in the actual test in two significant ways.
1. The solutions below are quite detailed. This is to help you understand them without confusion. Your solutions in the actual test can be a lot more succinct.
2. The solutions below are written with the purpose of highlighting the process of arriving at them starting from the problem statement. For example, in Q1, I describe how to figure out that we need the recursive function to return some additional information. In the actual test, you can directly provide the algorithm and a justification for its correctness without explaining how you arrived at it.
Solution to Q1
Variants:
• A: The subgraph is allowed to be empty (empty subgraphs have total node weight 0).
• B: The subgraph is required to be non-empty.
Solution for version A:
For a node v, let T(v) denote the subtree of node v; hence, the entire tree is T = T(r), where r is the root node.
We will design a function which, when applied to node v, finds the maximum weight of any connected subgraph of T(v) (allowing the empty subgraph). To apply divide-and-conquer, we will call the function recursively on the children of v and use the solutions returned from these calls.
Note that a maximum-weight connected subgraph H of T(v) can have one of two possibilities:
• H doesn’t contain v. Then, H must be a maximum-weight connected subgraph of T(vi) for some child vi of v. This is because if H contains parts of subtrees of two or more children, then H won’t be connected.
• H contains v. Then, for each child vi of v, H must either include no nodes from T(vi) or a maximum-weight connected subgraph of T(vi) that contains vi itself. Once again, the condition of containing vi itself is necessary and sufficient to make sure that H is connected.
Here, we realize that from the recursive call on a child vi, we not only need the maximum weight of any connected subgraph of T(vi) (let’s call this A[vi]), but also the maximum weight of any connected subgraph of T(vi) that includes vi (let’s call this B[vi]). We will compute both quantities in our recursive function simultaneously. Note that B[v] is simply the maximum weight among the subgraphs H in the second possibility above. Translating the description of the two possibilities above to code, we get the following algorithm:
Function f(v):
• If v is a leaf, return (A[v], B[v]) = (max(0, wv), wv).
• (A[vi], B[vi]) ← f(vi) for each child vi of v.
• B[v] ← wv + Σ_{vi child of v} max(B[vi], 0).
  // This is the 2nd possibility. The 0 inside the max allows not including any node from T(vi).
• A[v] ← max( max_{vi child of v} A[vi], B[v] ).
  // The first term in the outer max corresponds to the 1st possibility; A[v] is the max of the two.
The desired solution is obtained by calling f(r) (at root node r) and returning A[r].
Running time: When we call f(r) and let the function make its recursive calls, the function is called at each node v exactly once (from the call on its parent node). The time spent in the call at node v is O(d(v)), where d(v) is the degree of node v. Hence, the total running time is O(n + Σ_v d(v)) = O(n + m) = O(n), because a tree has m = n − 1 edges.
Changes for version B:
We note that the 2nd possibility already enforces the subgraph H to be non-empty because node v itself is included. Hence, B[v] remains unchanged. We only need to change the definition of A[v] to the maximum weight of any non-empty connected subgraph of T(v). Interestingly, this does not change the recurrence relation of A[v] from the last line of function f (because now each A[vi] will also be forced to ignore the empty subgraph). The only change is in the base case of a leaf v, where we return (A[v], B[v]) = (wv, wv).
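For concreteness, here is a minimal Python sketch of the function f above. It assumes the tree is given as a dictionary children mapping each node to the list of its children and a dictionary w mapping each node to its weight (both names are hypothetical), and a flag allow_empty selects between the version-A and version-B base cases.

def f(v, children, w, allow_empty=True):
    # Returns (A[v], B[v]): the max weight of a connected subgraph of T(v),
    # and the max weight of one that is required to contain v.
    if not children[v]:                           # base case: v is a leaf
        return (max(0, w[v]), w[v]) if allow_empty else (w[v], w[v])
    best_child_A = float("-inf")
    B = w[v]                                      # 2nd possibility: v itself is included
    for vi in children[v]:
        A_i, B_i = f(vi, children, w, allow_empty)
        best_child_A = max(best_child_A, A_i)     # 1st possibility: H lies inside T(vi)
        B += max(B_i, 0)                          # optionally attach the best subgraph of T(vi) hanging from vi
    return (max(best_child_A, B), B)

# Hypothetical example: root 1 with children 2 and 3; node 3 has child 4.
children = {1: [2, 3], 2: [], 3: [4], 4: []}
w = {1: 5, 2: -2, 3: -1, 4: 4}
print(f(1, children, w)[0])                       # 8, from the subgraph {1, 3, 4}

For very deep trees, the recursion can be replaced by an explicit stack, but this does not change the O(n) bound.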
Solution to Q2
Variants:
• A: The algorithm starts at a, maintains a covered region [a, t], and constantly expands the region more to the right.
• B: The algorithm starts at b, maintains a covered region [t, b], and constantly expands the region more to the left.
Solution for version A:
(a) Let Ir be the interval added by the greedy solution in the r-th iteration of the while loop. Let OPT be an optimal solution that contains intervals I1, ..., Ik for the maximum possible k. If k = |G|, then we must have OPT = G, so we are done. If k < |G|, we derive a contradiction by
designing an optimal solution that contains intervals I1, ..., Ik+1.
Note that I1, ..., Ik already cover the interval [a, f_{Ik}]. Now, OPT must contain at least one other interval [si, fi] such that si ≤ f_{Ik} < fi; otherwise, points just to the right of f_{Ik} would not be covered. Note that this interval covers f_{Ik}, and since the greedy algorithm chooses the interval with the greatest endpoint among such intervals, we have fi ≤ f_{Ik+1}. Hence, I1 ∪ ... ∪ Ik ∪ [si, fi] ⊆ I1 ∪ ... ∪ Ik ∪ Ik+1. Thus, OPT′, obtained by replacing the interval [si, fi] in OPT with Ik+1, is also an optimal solution covering the entire interval [a, b], and it contains I1, ..., Ik+1, which is the desired contradiction.
(b) The algorithm sorts the intervals in non-descending order of their starting points, i.e., so that s1 ≤ ... ≤ sn. Then, it makes one pass over the list, maintaining i∗, the interval with the maximum fi observed so far. When it encounters any si > t, it adds interval i∗ to the solution, sets t = f_{i∗}, and resets (uninitializes) i∗.
Let t_r be the value of t at the end of the r-th iteration (with t_0 = a). Then, the key insight is this: in the (r + 1)-st iteration, we want to consider intervals that cover t_r, i.e., for which si ≤ t_r ≤ fi. However, we only really need to consider those intervals for which si ∈ (t_{r−1}, t_r]. Any interval with si ≤ t_{r−1} will have already been considered in a previous iteration, and since we always choose the interval with the greatest endpoint, it will not cover any additional point. On the other hand, some interval with si ∈ (t_{r−1}, t_r] may not have fi ≥ t_r (i.e., it may not actually cover t_r), but this gets ignored anyway, since we choose the interval with the maximum fi (which will inevitably have fi > t_r).
The sorting takes O(n log n) time and then the single pass takes O(n) additional time. Hence, the algorithm runs in O(n log n) time.
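For reference, here is a minimal Python sketch of the algorithm in part (b). It assumes the input is a list intervals of (si, fi) pairs that together cover [a, b]; the names intervals and greedy_cover are hypothetical, and i∗ is represented simply by its (si, fi) pair.

def greedy_cover(intervals, a, b):
    # One pass over the intervals sorted by starting point; [a, t] is the covered region.
    intervals = sorted(intervals)          # non-descending si; O(n log n)
    solution = []
    t = a
    best = None                            # i*: the interval with the max fi seen since the last reset
    for s, f in intervals:
        if t >= b:                         # [a, b] is already covered
            break
        if s > t:                          # cannot postpone any longer: commit i*
            solution.append(best)
            t = best[1]
            best = None
        if best is None or f > best[1]:
            best = (s, f)
    if t < b:                              # commit the final pending i*, if part of [a, b] remains
        solution.append(best)
    return solution

print(greedy_cover([(0, 3), (1, 4), (2, 7), (6, 10)], 0, 10))   # [(0, 3), (2, 7), (6, 10)]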
Changes for version B:
The two versions are completely symmetric. To see this, visualize the intervals in front of you on a piece of paper and now imagine going to the opposite side of the paper. The algorithm from version A will now become the algorithm from version B. So if one is optimal, so is the other. In the solution above, you can correspondingly swap a and b, si-s and fi-s, and max and min.
Solution to Q3
(a) Define an array Q[1..n] where Q[i] denotes the maximum number of pairwise friendly heads that we can choose from houses 1, ..., i subject to choosing the head of house i. The desired solution is then max(Q[1], ..., Q[n]).
(b) The difficulty in writing a recurrence relation that describes Q[i] in terms of Q[j] for j < i is that, while we can check if i is friendly with j, we have no idea what other heads are chosen as part of the solution of Q[j], and therefore, cannot directly check if i is friendly with them too.
At this point, we notice the following non-trivial insight:
• For k < j < i, if head j is friendly with both head i and head k, then heads i and k are friendly with each other as well, because
|A[i] − A[k]| ≤ |A[i] − A[j]| + |A[j] − A[k]| ≤ (i − j) + (j − k) = i − k.
Once we realize this insight, we notice that for j < i, head i can be added to any friendly committee from {1, ..., j} that includes head j, as long as i and j are friendly (because j will already be friendly with any other head k < j in the committee, so i will be friendly with k too). Hence, we get the following Bellman equation (the max over j returns 0 if there is no j < i who is friendly with i):
Q[i] = 1, if i = 1,
Q[i] = 1 + max_{j ∈ {1, ..., i−1} : (i, j) are friendly} Q[j], if i > 1.
(c) Since each Q[i] depends on Q[j] for smaller j, we compute the entries in the order i = 1, 2, ..., n.
(d) Since each entry takes O(n) time to compute (due to taking the maximum over j) and O(1) space to store, the time complexity is O(n²) and the space complexity is O(n).
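Finally, here is a minimal Python sketch of this dynamic program, assuming (as suggested by the insight above) that heads i and j are friendly exactly when |A[i] − A[j]| ≤ |i − j|; the function name is hypothetical, and the list A is 0-indexed, so house i of the write-up corresponds to index i − 1.

def max_friendly_heads(A):
    n = len(A)
    Q = [1] * n                  # Q[i]: largest friendly set from houses 0..i that includes house i
    for i in range(1, n):
        best = 0                 # max over friendly j < i; stays 0 if there is none
        for j in range(i):
            if abs(A[i] - A[j]) <= i - j:    # assumed friendliness test
                best = max(best, Q[j])
        Q[i] = 1 + best
    return max(Q)                # the desired answer from part (a)

print(max_friendly_heads([3, 1, 4, 2]))      # 2, e.g., the heads of houses 1 and 3 (1-indexed)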