Algorithms and Data, Summer 1 PS4 Answers
Part I
1.
a.) The greedy approach fails especially in cases where there is “thrashing” — we are continually lured back and forth between two shippers and pay the penalty each time, when it would have been better to stick with one shipper.
Suppose penalty 10 and consider the following weekly costs:
A 10 8 18 B 9 19 7
Staying with B has total cost 35, while the greedy approach has cost 9+8+7+10+10 = 44. b.)
// Let A, B, and C be the arrays of costs to ship for each week of the year. // Assume these are indexed 1 to 52. P is the penalty.
// Returns the schedule of best cost as a linked list. BestShipperSchedule(A, B, C, P):
// Let BestCost[S, i] be the cost of the best schedule that ends with
// shipper S on week i. We’ll use 1,2,3 to refer to using shipper A, B, and C, respectively. BestCost[1,1] = A[1]
BestCost[2,1] = B[1]
BestCost[3,1] = C[1]
// We’ll use BestPrev to figure out what the best overall schedule was in the end BestPrev[1,1] = BestPrev[2,1] = BestPrev[3,1] = null
// For ease of writing for loops, let’s put the costs in one big table
for w = 1 to 52
Costs[1,w] = A[w] Costs[2,w] = B[w] Costs[3,w] = C[w]
// Now we find for each week the best cost of a schedule that ends with a // particular shipper.
for w = 2 to 52
for newShipper = 1 to 3
prev_a_cost = BestCost[1,w-1] +Costs[newShipper,w] if newShipper != 1
prev_a_cost += P
prev_b_cost = BestCost[2,w-1] + Costs[newShipper,w] if newShipper != 2
prev_b_cost += P
prev_c_cost = BestCost[3,w-1] + Costs[newShipper,w] if newShipper != 3
prev_c_cost += P
if min(prev_a_cost, prev_b_cost,prev_c_cost) == prev_a_cost
BestCost[newShipper, w] = prev_a_cost
BestPrev[newShipper, w] = 1
else if min(prev_a_cost, prev_b_cost, prev_c_cost) == prev_b_cost
BestCost[newShipper, w] = prev_b_cost
BestPrev[newShipper, w] = 2 else
BestCost[newShipper, w] = prev_c_cost
BestPrev[newShipper, w] = 3
// Now we figure out the best schedule, working backwards from the best end
if min(BestCost[1,52], BestCost[2,52],BestCost[3,52]) == BestCost[1,52]) BestFinalShipper = 1
else if min(BestCost[1,52], BestCost[2,52],BestCost[3,52]) == BestCost[2,52]) BestFinalShipper = 2
else
BestFinalShipper = 3
BestSchedule = {} // empty list week = 52
BestShipper = BestFinalShipper do {
BestSchedule.push(BestShipper)
week–
BestShipper = BestPrev[BestShipper,week]
} while BestShipper != null return BestSchedule
c.) This algorithm does S work per entry of the BestCost table, plus a linear time pass at the end, so that is Ө(WS2 + W) = Ө(WS2).
2. a.)
// Given a string S[1…N], returns a list of strings in the best segmentation. WordSegmentation(S)
// We build a table of the best quality scores achievable up to letter i,
// and keep pointers back to the previous index where the last word ended. BestQuality[0] = 0
BestPrev[0] = null
for i = 1 to N
BestQuality[i] = wordQuality(S[0…i]) BestPrev[i] = 0
for j = 0 to i
// Quality of treating whole string as one word
if BestQuality[j] + wordQuality(S[j+1…i]) > BestQuality[i] BestQuality[i] = BestQuality[j] + wordQuality(S[j+1…i]) BestPrev[i] = j
// Work backwards from the end to get best words bestWordSegmentation = {} // empty list thisWordEnd = N
prevWordEnd = BestPrev[N]
while prevWordEnd != null bestWordSegmentation.push(S[prevWordEnd+1…thisWordEnd]) thisWordEnd = prevWordEnd
prevWordEnd = BestPrev[prevWordEnd] return bestWordSegmentation
b.) The recurrence is BestQuality[i] = maxj BestQuality[j] + wordQuality(S[j+1…N]), assuming we treat BestQuality[0] as 0. This works because the quality of the best segmentation is equal to the quality of its final word, plus the quality of all the words preceding it. There is no reason not to simply pick the best possible segmentation for a particular ending point of the previous words. Whatever the optimal solution does, wherever it has a word break, if it were not best to pick the highest-scoring segmentation up to that point, then we could substitute the highest- scoring word sequence up to that point and leave the rest of the sequence the same. Either that’s just as good as optimal, or we’d have a contradiction and the optimal solution wasn’t optimal after all. So the optimal solution must always consist of the best sum of final word and the score of the best segmentation up to that final word.
c.) This is an O(N2) algorithm, since we iterate up to N and do an O(N) pass with each iteration to look for possible beginning points of the final word.
3. a.) We can treat this as a network flow problem in the following way. Let the injured people be one set of vertices, and the hospitals be another set of vertices, as in bipartite matching. An edge of capacity 1 goes from an injured person to a hospital if the person can be transported to that hospital. Edges of capacity 1 go from the source to each injured person, and edges of capacity hi go from each hospital to the sink. Finding the max flow will then find the correct allocation of people to hospitals; the edges from people to hospitals that have flow should be treated as instructions to send that person to that hospital.
b.) Ford-Fulkerson’s running time of O(|E|C) is here proportional to roughly O(N2H), since the number of edges is roughly NH + N + H = O(NH) and the number of times augmenting paths can be found is limited by the number of injured people N (there are only that many paths from the source). Push-relabel’s running time of (V2E) becomes O((N+H)2NH) = O(N3H + N2H2 + NH3) which is worse in this case (since at least one of those terms is strictly worse).
4.) By linearity of expectation, every time a person goes out in a group of size gi, their expected number of times carrying the pack should increase by 1/gi. In a linear pass through the lists of people, we can calculate ceiling(Σ1/gi ) for each person, with the sum over all the days the person goes out on patrol. This is the limit on the number of times that person will carry the pack; call it Lj.
Then, we can set this problem up as a flow problem in the following way. Let the days be one set of nodes in a bipartite graph, and let the people be the other set of nodes. If a person is going out on patrol on day d, add an edge of capacity 1 from the day to that person. Add capacity 1 edges going from the source to every day, and add capacity Lj edges going from each soldier to the sink.
Now a maximum flow will, if it is possible, send a unit of flow across each edge to each day; that flow will be assigned to some soldier going out on patrol that day using an edge from the center; and no more than the fair limit will be assigned to each soldier, thanks to the Lj capacity edges going from each soldier to the sink.
Part II
1. The worst case cost of this operation is Θ(N), where N is the number of elements in the list, since the push may cause us to check every element in the list and drop them all. But for an amortized analysis, let the potential function be equal to the number of items in the list. Let D be the number of elements dropped from the list on a particular push. A push now has amortized cost equal to its true cost, 1 + D (add the element, maybe drop some others), plus the difference in the potential, which is 1-D (the element increases the potential, but we lose everything dropped). That’s an amortized cost of 1 + 1 + D – D = 2, which is a constant. So this is Θ(1) amortized cost.
2. a.) You probably should have seen spikes at increasing intervals as the ArrayList doubled in size whenever its underlying array was full. The spikes should have generally increased as the cost of copying the array got larger and larger. However, there’s also a hunt for memory that ArrayList needs to do when it resizes, and depending on the state of memory, that could cause some spikes to be larger than spikes that come later.
b.) You probably saw a kind of buzzsaw that zigzagged up diagonally. The buzzsaw sharp spikes are the resizings, which temporarily change the average. But, overall, you could probably fit a roughly straight line through the whole graph. In the long term, the overall costs of N operations is roughly proportional to N, even though the spikes for resizing are necessarily going to affect that average in the short term.