CPSC 320 2020W1: Assignment 4
This assignment is due Sunday, November 22nd, 2020 at 22:00 Pacific Time. Assignments submitted within 24 hours after the deadline will be accepted, but a penalty of 15% will be applied. Please follow these guidelines:
Prepare your solution using LATEX, and submit a pdf file. Easiest will be to use the .tex file provided. For questions where you need to select a circle, you can simply change \fillinMCmath to \fillinMCmathsoln .
Enclose each paragraph of your solution with \begin{soln}Your solution here…\end{soln}.Your solution will then appear in dark blue, making it a lot easier for TAs to find what you wrote.
Start each problem on a new page, using the same numbering and ordering as this assignment handout.
Submit the assignment via GradeScope at https://gradescope.ca. Your group must make a single submission via one group member’s account, marking all other group members in that submission using GradeScope’s interface.
After uploading to Gradescope, link each question with the page of your pdf containing your solution. There are instructions for doing this on the CPSC 121 website, see https://www.students.cs.ubc.ca/ ~cs-121/2020W1/index.php?page=assignments&menu=1&submenu=3. Ignore the statement about group size that is on that page.
Before we begin, a few notes on pseudocode throughout CPSC 320: Your pseudocode should commu- nicate your algorithm clearly, concisely, correctly, and without irrelevant detail. Reasonable use of plain English is fine in such pseudocode. You should envision your audience as a capable CPSC 320 student unfamiliar with the problem you are solving. If you choose to use actual code, note that you may neither include what we consider to be irrelevant detail nor assume that we understand the particular language you chose. (So, for example, do not write #include
Remember also to justify/explain your answers. We understand that gauging how much justifi- cation to provide can be tricky. Inevitably, judgment is applied by both student and grader as to how much is enough, or too much, and it’s frustrating for all concerned when judgments don’t align. Justifica- tions/explanations need not be long or formal, but should be clear and specific (referring back to lines of pseudocode, for example). Proofs should be a bit more formal.
On the plus side, if you choose an incorrect answer when selecting an option but your reasoning shows partial understanding, you might get more marks than if no justification is provided. And the effort you expend in writing down the justification will hopefully help you gain deeper understanding and may well help you converge to the right selection :).
Ok, time to get started…
1
1 Statement on Collaboration and Use of Resources
To develop good practices in doing homeworks, citing resources and acknowledging input from others, please complete the following. This question is worth 2 marks.
1. All group members have read and followed the guidelines for groupwork on assignments in CPSC 320 (see https://www.students.cs.ubc.ca/~cs-320/2020W1/index.php?page=assignments&menu=1& submenu=3).
Yes No
2. We used the following resources (list books, online sources, etc. that you consulted):
3. One or more of us consulted with course staff during office hours. Yes No
4. One or more of us collaborated with other CPSC 320 students; none of us took written notes during our consultations and we took at least a half-hour break afterwards.
Yes No
If yes, please list their name(s) here:
5. One or more of us collaborated with or consulted others outside of CPSC 320; none of us took written notes during our consultations and we took at least a half-hour break afterwards.
Yes No
If yes, please list their name(s) here:
2
2 The ISGM problem comes back
In this question, we will revisit the ISGM problem you encountered in assignment 2 and on the midterm. This time, however, we will not be asking you to use it as the target for a reduction. Recall that this problem is defined by:
Given an undirected graph G = (V,E) with vertices V and edges E, an integer (positive, zero, or negative) weight hv for each v ∈ V , and an integer (positive, zero, or negative) weight Ju,v for each edge {u, v} ∈ E, find an assignment of +1 or −1 to the variables xv for each v ∈ V that minimizes the objective function:
hvxv + Ju,vxuxv v∈V (u,v)∈E
In this problem, we will consider the case where the graph G is a k-ary tree: a rooted tree where every node has at most k children for some fixed positive integer k. For instance, binary trees are the case k = 2.
a. [8 marks] Design an O(|V |) time algorithm to find the assignment of +1 or −1 to the variables xv for each v ∈ V that minimizes the objective function. Hints: (1) think about dynamic programming, (2) there will be some brute-force work involved.
Solution: If the optimal solution assigns a value x to a child c of the root, then it will contain an optimal solution (given the fixed value x at c) for the subtree rooted at c, since there is no advantage to using a suboptimal solution for that subtree (the only interaction between that subtree and the rest of the graph is the node c). We don’t know whether this optimal solution assigns −1 or +1 to the root, so we will try both options.
We thus perform a post-order traversal of the tree, computing for each node the optimal value of the objective function in two cases:
When we assign −1 to the node; When we assign +1 to the node.
In order to obtain these values for a node N, we consider every value that might be assigned to each of the up to k children of N. This algorithm falls under the dynamic programming paradigm because we store the values we have computed in the nodes (so we have a “tree-shaped” table instead of an array).
Here is pseudo-code for an implementation of the algorithm:
Algorithm SolveISGM(T, h, J)
//
// T is the root of the tree.
//
PostOrderTraversal(T, Optimize{h, J}) If T[-1] < T[+1]
Assign{h, J}(T, -1) else
Assign{h, J}(T, +1)
Function PostOrderTraversal(N, function)
//
// Call function on each node of the tree rooted at N
//
3
for each child c of N
PostOrderTraversal(c, function)
function(c)
Function Optimize{h, J}(N) //
// Compute N[-1] and N[+1] by considering all possible choices for each of
// N’s children.
//
// Initialization N[-1] ← -h[N] N[+1] ← h[N]
// Find the best choice for each child, for each of N[-1], N[+1] for i ← 0 to N.children().size() - 1
c = N.children(i)
// First compute what choice to make for N[-1] value[-1] ← c[-1] + J[c, N]
value[+1] ← c[+1] - J[c, N]
N[-1] = N[-1] + min(value[-1], value[+1])
// Now compute what choice to make for N[+1] value[-1] ← c[-1] - J[c, N]
value[+1] ← c[+1] + J[c, N]
N[+1] = N[+1] + min(value[-1], value[+1])
Function Assign{h, J}(N, v) //
// Assign values to the nodes in the subtree that minimize the // objective function, starting with value v at the root.
//
N[x-value] ← v
// Assign values to each child
for i ← 0 to N.children().size() - 1
c = N.children(i)
if c[-1] - J[c,N] * v <= c[+1] + J[c,N] * v then
Assign(c, -1)
else
Assign(c, +1)
b. [4 marks] Prove the correctness of your algorithm from part 1. Hint: this should be simple.
Solution: The correctness of our solution follows from the fact that we are considering both possible assignment of values to each node of the tree, and for each child we consider the contribution of its subtree to the objective function for both of its possible assigned values.
c. [3 marks] Analyze the worst-case running time of your algorithm from part 1 as a function of both n and k.
4
d.
Solution: A post-order traversal of a tree with n nodes takes times in O(nα) where α is the amount of time spent visiting each node. Function Optimize has a loop that runs at most 2k time, since each node has at most k children and we try every possible assignment of ±1 to each child. So the running time of our algorithm is in O(kn). This is linear in n as long as k is a constant.
[3 marks] What happens to your algorithm’s running time if nodes of the tree may have an arbitrary number of children (i.e. maybe a single root node with n − 1 leaves, or a root node with √n children,
of the leaves), the running time of the algorithm will be O(k1)+O(k2)+···+O(kn)∈O(k1 +k2 +···+kn)=O(n−1)=O(n)
University of Building Construction
3
√
n leaves)?
Solution: If the n nodes of the tree have k1, k2, ...kn children (many of these values will be 0 because
each of which has about
The University of Building Construction is a (fictitious) prestigious, public university, located on a beautiful peninsula. However, after decades of government underfunding, the Board of Governors has become desperate for funding and has decided to use core campus land to build luxury condos to sell to wealthy foreign investors. In this question, you will create efficient algorithms to help them plan where and how high to build the new buildings.
In particular, we assume that there are n potential building sites, and that they are all located along a straight line. There is already an academic building at each building site, and the height of the building at site i is stored in array h[i], for i ∈ {1, . . . , n}. Also along the same line, there is a famous landmark, and for each building site i, the distance from this landmark to building site i is stored in array d[i]. Note that the building sites are ordered by their distance from the landmark, so d[i] < d[j] for all i < j.
h[1]
h[3]
...
...
h[n]
Landmark
h[2] i=123
n
d(1)
d(2)
d(3)
To keep the math simple, we’ll ignore the width of the buildings — think of them as vertical line segments, as in the picture. And we’ll assume the earth is flat, so the line from the landmark through building site 1, 2, 3, etc. through building site n is straight.
a. [4 marks] (This first part should be easy. Don’t overthink it.)
In focus groups with wealthy investors, the UBC development team found that buyers would pay top dollar if they can see the entire famous landmark. So, the team wants to compute, for each building site 2 ≤ i ≤ n, how high would they have to build so that the top of the new building can see the entire landmark, assuming no other building is changed. (Building site i = 1 has a fancy, windowless
5
d(n)
performing arts center on it, so it’s not being considered. Besides, even at height 0, it would be possible to see the landmark from site 1.) For example, in this figure:
h[3]
...
...
∆h[n]
h[n]
h[1] i=123
Landmark
h[2]
the top of building n currently cannot see the base of the landmark, because it’s blocked by buildings 1 and 3 (red, dotted line. We’ll ignore other buildings between 3 and n for this example.), but if we increased the height by the red amount ∆h[n], then the top of the new building could see over the tops of the other buildings (green, dashed line).
Mathematically speaking, we are computing for each building site 2 ≤ i ≤ n a new height h′[i] (which could be greater than or less than or equal to the current height h[i]), such that the new slope h′[i]/d[i] is equal to the greatest slope h[j]/d[j] over all 1 ≤ j < i.
The development team had previously hired a programmer to solve this problem, but it runs too slowly, having worst-case runtime Θ(n2). (There are a lot of building sites!) Here is the current pseudocode:
// Loop over all building sites to compute h_prime[i]
for (int i=2; i<=n; i++) {
// ***** START OF THE BLOCK OF CODE TO REPLACE *****
// Find the largest slope on a site less than i
int bestj = 1;
double bestslope = h[1]/d[1];
for (int j=2; j bestslope) {
bestj = j;
bestslope = h[j]/d[j];
}
}
double answer = bestslope*d[i];
// ***** END OF THE BLOCK OF CODE TO REPLACE *****
h_prime[i] = answer;
}
Your job is to replace the marked part of the code so that the overall code computes the exact same results, but the overall code runs in worst-case O(n) time instead. Your code must be a drop-in replacement for the marked part of the code, and you may not make other changes to the code.
Solution: The key insight is that when you compute h′[i − 1], that’s based on the best slope over all 1 ≤ j < i − 1, so you don’t have to search through all of them again. The only two possibilities for h′[i] are to use the same best slope as for h′[i − 1], or to use the slope at h[i]/d[i], whichever is greater.
// Loop over all building sites to compute h_prime[i]
for (int i=2; i<=n; i++) {
// ***** START OF THE BLOCK OF CODE TO REPLACE *****
6
n
// Getting started is a little tricky, since we can’t modify the outer loop.
double bestslope;
if (i==2) {
bestslope = 0.0;
} else {
bestslope = h_prime[i-1]/d[i-1]; // Previous largest slope
}
// Next check whether i-1 gives a bigger slope.
if (h[i-1]/d[i-1] > bestslope) bestslope = h[i-1]/d[i-1];
double answer = bestslope*d[i];
// ***** END OF THE BLOCK OF CODE TO REPLACE *****
h_prime[i] = answer;
}
b. [8 marks] (This second part is trickier! Keep an eye out for our hints!)
In this second part of the problem, we no longer care about seeing the landmark. Instead the focus is on increasing the height of existing buildings, but with certain constraints. Specifically, from the top of each building site i, imagine looking straight out horizontally in the direction of the landmark (to the left in the drawings), until you hit a taller building (or can see forever, if there is no taller building). For each building that we “look over” (i.e., that is between us and the taller building), we could add height up to this horizontal line without changing the view from this point. For example, in this figure:
looking left from building site 1, you can see forever, but there are no more building sites to the left, so the amount of height that can be added to existing buildings is 0; looking left from building site 2, you can see horizontally only until site 1, so again there are no building sites in between, so again, the answer is 0; however, looking left from building site 3, you can see forever, and you “look over” building sites 1 and 2, so the total amount of height that we can add to existing buildings without changing the horizontal view from site 3 is equal to (h[3] − h[1]) + (h[3] − h[2]), as shown in red in this figure:
Landmark
7
h[1]
i=123
h[3]
h[n] …
h[2]
Landmark
… n
i=123
h[3] − h[1]
h[1]
… n Because we are computing the total height that we can add to shorter buildings that lie between taller
buildings, we call the quantity that we compute the amount of height in the “valley”.
As in the preceding part of this question, the development team’s previous programmer wrote code to solve this problem, namely, for each building site i, add up (h[i] − h[j]) for all the j from i − 1 down to (but not including) the first j where h[j] > h[i]. However, once again, the previous programmer’s code is too slow, with worst-case runtime Θ(n2):
h[3] − h[2]
h[3]
h[n] …
h[2]
// This code will store the correct sum for building site i
// in array element v[i] (where v stands for “valley”)
// Loop over all building sites
for (int i=1; i<=n; i++) {
// Now start with j=i-1 and go downward, adding up height increments
// until we hit something taller than i, or we run out of building sites.
double total = 0.0;
int j = i-1;
while ((j>=1) && (h[j]<=h[i])) {
total += (h[i]-h[j]);
j--; }
v[i] = total;
}
Your job is to write pseudocode that computes the exact same values v[i] as the code above, but runs in worst-case O(n) time. Unlike the earlier part, you are permitted to make larger changes to this code.
Hint: Our solution uses the same outer loop and a similarly structured loop body. However, the inner loop is more clever to run faster.
Hint: It’s not totally obvious that the correct solution runs in O(n) time. In this part of the question, focus on getting code that computes the correct results and avoids wasted, repeated work. You’ll explain why your code runs in O(n) time in in the next part of this question.
Hint: It’s very helpful to compute an array w[] (w is for “width”), where w[i] indicates how many building sites are included in the valley that starts at i (including i itself). For example, in the picture above, w[1] = 1, w[2] = 1, and w[3] = 3. You can compute the values w[i] in the same outer loop as for v[i], or you can pre-compute w[i] in an earlier loop. As long as your overall code runs in time O(n), that’s OK.
Hint: How might you avoid repeated work? Consider this example:
h[1]
h[3] h[2]
Landmark
The code above would compute v[8] by adding up (h[8]−h[7])+(h[8]−h[6])+(h[8]−h[5])+(h[8]−h[4]). But if you’ve already computed v[1], . . . , v[7], as well as w[1], . . . , w[7], could you compute v[8] and w[8] without looking at all of h[4], . . . , h[7]?
Solution: This part was a LOT trickier, but it’s good to get a challenge! (And the hints hopefully helped a lot.)
The key insight here is that if h[i] > h[j] for some j < i, we can compute the amount of additional height possible for w[j] buildings as v[j] (which brings all those buildings up to height h[j]), plus w[j] ∗ (h[i] − h[j]) (which brings all those buildings fully up to height h[i]). And then we can “skip over” w[j] entries in the h[] array.
Once we work this out, the code is actually very similar to what we started with:
8
i=12345678
h[8]
// This code will store the correct sum for building site i
// in array element v[i] (where v stands for "valley")
// It will also correctly compute the entrees w[i], such that
// h[i-w[i]] is the first building (counting downward from i) that
// is taller than h[i].
// Loop over all building sites
for (int i=1; i<=n; i++) {
// Now start with j=i-1 and go downward, adding up height increments
// until we hit something taller than i, or we run out of building sites.
// However, we’ll use the trick to skip over buildings!
double total = 0.0;
int j = i-1;
while ((j>=1) && (h[j]<=h[i])) {
total += ( v[j] + w[j]*(h[i]-h[j]) );
j -= w[j]; }
v[i] = total;
w[i] = i-j; }
Interestingly, it’s not obvious that this runs in O(n) time. The outer loop runs Θ(n) times, but the inner loop, in the worst case, might also run Ω(n) times. But that puzzle is for the next part of this question!
c. [3 marks] Briefly explain why your answer in the preceding part runs in worst-case O(n) time. Solution: As noted earlier, it’s not obvious why this runs in O(n) time.
The outer loop runs n times exactly. Aside from the inner loop, the loop body runs in constant time, so adding that all up, we have O(n), ignoring the inner while loop.
So, what about the while loop? As noted, on any one iteration, the while loop might run Ω(n) times. For example, suppose the heights are in decreasing order, except the last height h(n) is the tallest. Then, when i = n, the inner while loop will decrement by 1, all the way down to 1, for n − 1 total iterations. However, note that in that “counterexample”, to set things up, in every preceding iteration of the outer loop, the while loop would have iterated zero times! So, can we find some way to say that the total number of times the while loop executes over all outer-loop iterations, is still linear?
The key is to note that in any iteration of the while loop, we look at some particular v[j] and w[j]. We’ll use j′ as the name for that particular value of j. But once we do so, their contributions are going to be lumped into v[i], and w[i] will be set that we never look at v[j′] or w[j′] again. Let’s use i′ to refer to the current value of i. If some future for-loop iteration has a larger value of i, it’s j can’t get to j′ before comparing to i′. If h[i′] > h[j], then the while loop stops, and j′ is not reached; otherwise, h[i′] ≤ h[j], and j′ gets skipped.
So, since each iteration of the while loop “uses up” some value of j ∈ {1, . . . , n − 1}, the while loop body can execute at most n − 1 times. This gives us the O(n) overall time bound.
9