Midterm 2
posted: 9 am, 05.07.2021 due: midnight, 09.07.2021
You are not allowed to work in groups for the midterm. You can look in books and online, but
you must not discuss the exam problems with anyone. If you don’t understand a question, contact
Travis or one of the TAs for an explanation (but no hints). All problems are weighted the same
and, for problems broken down into subproblems, all subproblems are weighted the same.
As with assignments, answers the markers consider hard to read will get 0!
1. Suppose your class is on a field trip to an island with n towns on it, connected by m town-
to-town buses (which run in both directions), run by c companies. Each company’s buses
are a different colour, and there can be buses from two or more companies running between
two towns. You have a map showing which companies run buses between which towns. The
drivers have a relaxed attitude to schedules and the buses run often, so there’s no telling
which buses will be arriving and leaving next.
Your classmate Ryan has wandered off and got lost and you’re (somewhat reluctantly) trying
to find him. You’d told him which buses the class was supposed to take during the day, and
given him tickets from the appropriate companies, the same colours as the buses and stapled
together in the right order. Ryan didn’t remember which towns the class was going to visit,
however, so he always took the first bus he saw of the colour of the next ticket, tearing off
that ticket and giving it to the driver.
Design a polynomial-time dynamic-programming algorithm that, given the map of the bus
routes, Ryan’s starting point and the colours of the buses he took, calculates the probability
Ryan is in each of the n towns.
For example, if the map is as shown below on the left and Ryan started in town A and took
a red bus, a green bus, a blue bus and another green bus, then his itinerary could be any of
those show below in the center, and the probability of him being in a certain town after a
certain number of steps and of taking trips between cities is as shown in the DAG below on
the right.
A
B
C
D
A-B-C-D-B,
A-B-D-A-C,
A-B-D-C-A,
A-B-D-C-B,
A-C-A-B-C,
A-C-A-B-D,
A-C-A-D-B,
A-C-B-A-C .
A
B C
A B C D
A B C D
A B C D
1
0.5 0.5
0.5 0.5
0.25 0.25
0.250.25
0.25 0.25 0.250.25
0.125
0.125
0.125
0.125
0.25 0.25
0.125 0.125 0.375
0.0625 0.06250.0625 0.0625
0.375
0.0625 0.06250.4375 0.4375
0.375
0.375
1
The probability Ryan went first from A to B is 0.5, and the probability he went first from A
to C is 0.5. Therefore, after one trip, the probability he was in B is 0.5 and the probability
he was in C is 0.5.
The probability Ryan’s second trip took him from B to C is 0.5 times the probability he was
in B, or 0.25. The probability it took him from B to D is also 0.5 times the probability he
was in B, or 0.25. The probability it took him from C to A is 0.5 times the probability he
was in C, or 0.25. The probability it took him from C to B is 0.5 times the probability he
was in C, or 0.25. Therefore, after two trips, the probability is 0.25 he was in any particular
town.
The probability Ryan’s third trip took him from A to B is 0.5 times the probability he was
in A, or 0.125. The probability it took him from A to D is 0.5 times the probability he was
in A, or 0.125. The probability it took him from B to A is the probability he was in A, or
0.25. The probability it took him from C to D is the probability he was in C, or 0.25. The
probability it took him from D to A is 0.5 times the probability he was in D, or 0.125. The
probability it took him from D to C is 0.5 times the probability he was in D, or 0.125.
Therefore, after three trips, the probabilities Ryan was in A, B, C, D are, respectively, 0.25 +
0.125 = 0.375 (the probability his third trip took him from B to A plus the probability it
took him from D to A), 0.125, 0.125 and 0.125 + 0.25 = 0.375 (the probability his third trip
took him from A to D plus the probability it took him from C to D).
You can compute the probability of Ryan being in a particular town after four trips similarly.
Isn’t it lucky you’re on an island, so you can’t accidentally lose Ryan forever?
(Hint: first design an algorithm that computes the number of ways Ryan could have ended
up in a town, and then modify it to compute the probability.)
2. Your professor Travis told your TA Sarah that he was going to ask you to modify the solution
to the alignment question on Assignment 5, to compute an optimal alignment using only one
pass through the matrix.* In contrast, that assignment question allows filling in the matrix
and then walking back from the bottom right corner to the top left corner to compute the
alignment.
Travis claimed it was possible by keeping two more arrays, top[0..m, 0..n] and bottom[0..m, 0..n],
where top[i, j] is a pointer to a string of length at most m+n+ 1 (including the end-of-string
delimiter) containing the top line in an optimal alignment of S[1..i] to T [1..j], and bottom[i, j]
is a pointer to a string of length at most m+n+1 containing the bottom line in that alignment.
To compute top[i, j], we sprintf into an empty string either top[i− 1, j] or top[i− 1, j − 1]
or top[i, j − 1], followed by either S[i − 1] or ‘-’. To compute bottom[i, j], we sprintf into
an empty string either bottom[i− 1, j] or bottom[i− 1, j − 1] or bottom[i, j − 1], followed by
either T [j − 1] or -.
Sarah correctly pointed out that mallocing and sprintfing a string of length Ω(m + n + 1)
takes Ω(m + n + 1) time, so Travis’s solution takes cubic time. Help Travis by figuring out
how to modify his solution (shown below) to use quadratic time again.
(Hint: pointers are your friends!)
*Yes, this question is based on a true story.
2
#include
#include
#define MIN(a, b) (a < b ? a : b) #define MIN3(a, b, c) (MIN(a, b) < c ? MIN(a, b) : c) void align(char *S, char *T, int m, int n) { int A[m + 1][n + 1]; char *top[m + 1][n + 1]; char *bottom[m + 1][n + 1]; A[0][0] = 0; top[0][0] = (char *) malloc(m + n + 1); bottom[0][0] = (char *) malloc(m + n + 1); sprintf(top[0][0], ""); sprintf(bottom[0][0], ""); for (int i = 1; i <= m; i++) { A[i][0] = i; top[i][0] = (char *) malloc(m + n + 1); bottom[i][0] = (char *) malloc(m + n + 1); sprintf(top[i][0], "%s%c", top[i - 1][0], S[i - 1]); sprintf(bottom[i][0], "%s-", bottom[i - 1][0]); } for (int j = 1; j <= n; j++) { A[0][j] = j; top[0][j] = (char *) malloc(m + n + 1); bottom[0][j] = (char *) malloc(m + n + 1); sprintf(top[0][j], "%s-", top[0][j - 1]); sprintf(bottom[0][j], "%s%c", bottom[0][j - 1], T[j - 1]); } for (int i = 1; i <= m; i++) { for (int j = 1; j <= n; j++) { A[i][j] = MIN3(A[i - 1][j] + 1, A[i][j - 1] + 1, A[i - 1][j - 1] + (S[i - 1] == T[j - 1] ? 0 : 1)); top[i][j] = (char *) malloc(m + n + 1); bottom[i][j] = (char *) malloc(m + n + 1); if (A[i][j] == A[i - 1][j] + 1) { sprintf(top[i][j], "%s%c", top[i - 1][j], S[i - 1]); sprintf(bottom[i][j], "%s-", bottom[i - 1][j]); } else if (A[i][j] == A[i][j - 1] + 1) { 3 sprintf(top[i][j], "%s-", top[i][j - 1]); sprintf(bottom[i][j], "%s%c", bottom[i][j - 1], T[j - 1]); } else { sprintf(top[i][j], "%s%c", top[i - 1][j - 1], S[i - 1]); sprintf(bottom[i][j], "%s%c", bottom[i - 1][j - 1], T[j - 1]); } } } printf("%s\n%s\n", top[m][n], bottom[m][n]); return; } int main() { char *S = "AGATACATCA"; char *T = "GATTAGATACAT"; align(S, T, 10, 12); return(0); } Bonus (worth 10% of the midterm): Can you reduce the space usage to O((m + n)k), assuming you’re given the edit distance k between S and T? (Hint: banded dynamic programming and garbage collections are your friends!) 3. For the problem Longest Kind-Of Increasing Subsequence (LKOIS), we’re given a sequence S[1..n] of integers and asked to find the longest subsequence S′ of S such that S′[i− 1]− 3 ≤ S′[i] for 1 < i ≤ |S′|. Give an O(n log n) algorithm for LKOIS. 4. For the problem Partition, we’re given a set S of positive integers that sum to 2t and asked if there is a subset of S that sums to exactly t. Prove Partition is NP-complete by � showing Partition is in NP, � reducing one of the NP-complete problems we’ve seen in class to Partition. 5. Write a program that, given a list of the edges in a connected graph G on the vertices 1, . . . , n, in polynomial time outputs a Boolean formula F that is satisfiable if and only if G has a Hamiltonian path. You can assume the list of edges looks something like (1, 2) (1, 3) (4, 2) (6, 5) (5, 3) with one pair per line, and your output should consist of a single line containing copies of space, (, ), AND, OR, NOT and variables that look something like x1, x2, etc. 4 ``Clink'' versus ``BOOM'' I Divide and Conquer Colouring Graphs Assignment 1 Solution Euclid, Karatsuba, Strassen Fast Fourier Transform Assignment 2 Solutions Asymptotic Bounds Sorting Assignment 3 Solutions II Greedy Algorithms Huffman Coding Minimum Spanning Trees More Greedy Algorithms Assignment 4 Solutions Midterm 1 Solutions III Dynamic Programming Edit Distance Subset Sum and Knapsack Optimal Binary Search Trees and Longest Increasing Subsequences Assignment 5 Solutions IV NP-Completeness Reductions Assignment 6 Midterm 2