COMS20010 —
Short answer questions
1. (5 marks) (Short question.) Consider the unoptimised version of the union-find data structure on a 1000-element set, in which a sequence of n operations takes Θ(n log n) time in the worst case. Suppose this data structure is freshly-initialised, so that which each element is currently in its own (singleton) set. What is the least number of Union operations that could result in a component of the data structure having depth 4?
E. None of the above.
2. (5 marks) (Short question.) Consider the following 2-3-4 tree. What will the root contain and how many children will it have after we insert 72 into the tree?
A. The root will contain 41, 59 and will have three children.
B. The root will contain 41, 59, 99 and will have four children. C. The root will contain 59, 70 and will have three children.
D. The root will contain 41, 59, 65 and will have three children. E. The root will contain 41, 59, 70 and will have four children.
Solution: A. Remember, as covered in video 6-2, the depth can only increase when merging two elements of equal depth. To achieve depth 1 we need 1 Union operation. To achieve depth 2 we need to merge two trees of depth 2, which takes 2 · 1 + 1 = 3 Union operations. To achieve depth 3 we need to merge two trees of depth 3, which takes 2 · 1 + 1 = 7 Union operations. And likewise, to achieve depth 4 we need 2 · 7 + 1 = 15 Union operations.
Solution: E. The insert operation will take the rightmost child from the room, since 72 > 59. This will take it to (65, 70, 99), a 4-node, which it will split. This sends 70 up to the root, with left child (65) and right child (99). The insertion operation will then descent to (65) and keep going. (Note that the insertion operation does not need to split the root, since not doing so can never result in hitting a 4-node with a 4-node parent on the way down to insert 72. But I wouldn’t have listed “The root will contain 59 and have two children” as an option, either.)
Page 1 of 16
3. (5 marks) (Short question.) Mark each of the following statements true or false. (a) (1 mark) 40n ∈ o(20n).
(b) (1 mark) 40n ∈ O(20n).
(c) (1 mark) 40n ∈ Θ(20n).
(d) (1 mark) 40n ∈ Ω(20n). (e) (1 mark) 40n ∈ ω(20n).
4. (5 marks) (Short question.) Consider the bipartite graph G = (V, E) and indicated matching M below. v1 v2 v3 v4 v5 v6
Solution: A, B and C are false, D and E are true. Remember, we have 40n = 20n · 2n, and 2n ∈ ω(1), so we must have 40n ∈ ω(20n). This immediately implies 40n ∈ Ω(20n), 40n ∈/ Θ(20n), 40n ∈/ O(20n) and 40n ∈/ o(20n).
v7 v8 v9 v10 v11 v12
Which of the following is not an augmenting path? A. v1v7v2v8v3v9.
B. v9v4v11v5v10v2v7.
D. v9v4v11v6.
E. Either they are all augmenting paths, or more than one of them is not an augmenting path.
5. (5 marks) (Short question.) Which of the following logical formulae is not in conjunctive normal form (CNF)?
A. (a∨b∨c)∧(c∨¬d).
D. (a∨¬b)∧¬(c∨d).
E. Either they are all in CNF, or more than one of them is not in CNF.
Solution: B. For a path to be alternating, it must begin and end with unmatched vertices, and alternate between matching and non-matching edges. A, C and D achieve this, but B ends with a matched vertex.
Page 2 of 16
Solution: D. A CNF formula is an AND of OR clauses. Both AND and OR clauses can have length 1, as in B and C, but the clauses themselves cannot be negated as in D — only the variables they contain.
6. (5 marks) (Short question.) Consider the following circulation network.
What is the largest integer value of x such that the network has a valid circulation? A. 5.
D. There’s no valid circulation for any integer value of x. E. None of the above.
Solution: B. A valid circulation with x = 6 is pictured below. 5/5
(The process of finding a valid circulation is covered in video 9-2.) The vertex v with demand x has total inward capacity c−(v) = 6, so there is no circulation with x = 7 (as this would require too much flow into v).
7. (5 marks) (Short question.) Consider an instance of weighted interval scheduling with interval set R = [(1, 3), (2, 5), (4, 9), (0, 9), (7, 10), (8, 10), (11, 13), (9, 14), (12, 15)] and weight function w given by
w(1,3) = 6, w(2,5) = 6, w(4,9) = 4, w(0,9) = 10, w(7,10) = 3, w(8,10) = 3, w(11,13) = 8, w(9,14) = 9, w(12,15) = 4.
Page 3 of 16
What is the maximum possible weight of a compatible set? A. 19.
E. None of the above.
Solution: A. Two optimal sets are {(0, 9), (9, 14)} and {(1, 3), (4, 9), (9, 14)}. In more detail, writing OPT(i) for the weight of an optimal solution starting from the ith interval sorted in ascending order of finish time, is
OPT(1) = 19, OPT(2) = 19, OPT(3) = 19, OPT(4) = 19, OPT(5) = 12, OPT(6) = 12, OPT(7) = 9, OPT(8) = 9, OPT(9) = 4.
8. (5 marks) (Short question.) Mark the following five statements true or false. (a) (1 mark) In all directed graphs G = (V, E), we have |E| = v∈V d+(v). (b) (1 mark) All forests are connected.
(c) (1 mark) All induced subgraphs of trees are trees.
(d) (1 mark) Any tree with at least two vertices has at least two leaves. (e) (1 mark) Any tree with n vertices has n − 1 edges.
Solution: True, false, false, true, true. Statement 1 is the directed form of the handshaking lemma. A forest is any graph without cycles — an example of a disconnected forest is V = [2], E = ∅. An induced subgraph of a tree need not be connected — for example, taking V = [3], E = {{1,2},{2,3}} and X = {1,3}, then G = (V,E) is a tree, but G[X] is not connected. Any tree with at least two vertices does have at least two leaves, and this was covered in lectures. Any tree with n vertices has exactly n − 1 leaves by the fundamental lemma of trees.
9. (5 marks) (Short question.) Consider the graph below.
Page 4 of 16
Are each of the following sets vertex covers or not? (a) (1 mark) {vi : i ∈ [10]};
(b) (1mark) {v1,v3,v7,v10}.
(c) (1 mark) {vi : i ∈ [5]};
(d) (1mark) {v1,v3,v4,v7,v8,v10}. (e) (1mark) {v2,v4,v5,v6,v7,v8}.
Solution: Yes, no, no, no, yes. A vertex cover is a set of vertices intersecting every edge in the graph. (b) does not intersect e.g. {v4, v5}. (c) does not intersect e.g. {v7, v10}. (d) does not intersect {v7, v9}.
10. (5 marks) (Short question.) We define directed and undirected graphs G1, G2, G3 and G4 as in the picture below.
G1 G2 G3 G4
Page 5 of 16
Mark each statement true or false.
(a) (1 mark) The in-degree of v in G1 is 3.
(b) (1 mark) G1 is weakly connected.
(c) (1 mark) G2 and G3 are isomorphic.
(d) (1 mark) Some induced subgraph of G3 is isomorphic to G4. (e) (1 mark) G2 and G4 each contain an Euler walk.
11. (5 marks) (Medium question.) Consider the effects of deleting 4 from the 2-3-4 tree below.
1 3 5 7 9 11 13 15
After deleting 4:
(a) (1 mark) How many 2-nodes are in the tree?
(b) (1 mark) How many 3-nodes are in the tree?
(c) (1 mark) How many 4-nodes are in the tree?
(d) (1 mark) How many nodes are in the bottom layer of the tree?
(e) (1 mark) How many total fuse, split and transfer operations occurred in the process of deleting 4?
This question is multiple choice. The available options for each part are 0, 1, 2, 3, 7, 8, 9, and “none of the above”.
Solution: True, true, true, false, true. These all follow immediately from the definitions of the terms.
Solution: 8, 3, 0, 7, 3. The deletion algorithm traverses first to 4, then to the predecessor 3 of 4, fusing two 2-nodes encountered on the way (including the root). It then deletes 3, triggering another fuse operation, and overwrites 4 with 3. The resulting 2-3-4 tree is shown below.
1 2 5 7 9 11 13 15
Page 6 of 16
12. (5 marks) (Medium question.) Suppose Dijkstra’s algorithm is run on the following graph starting at vertex i. List the vertices in the order the algorithm performs final processing on them (i.e. the order in which their distance from i is finalised).
c3b2a 133 d e5f 7642
A. i,h,b,g,e,a,d,f,c B. i,h,f,e,b,a,g,d,c C. i,h,f,e,a,g,b,c,d D. i,h,g,e,a,b,f,d,c E. i,h,f,e,a,g,b,d,c
Solution: E. See Video 4-4 for a refresher on how Dijkstra’s algorithm works. The shortest-path tree is given below.
c3b2a 133 d e5f 7642
13. (10 marks) (Medium question.) Suppose that x, y, and z are strings. We say that z is a “shuffle” of x and y if z can be obtained by mixing the characters from x and y in a way that preserves the left-to-right ordering of the characters from x and the characters from y. For example, “LOhamLbuCrgerAT” is a shuffle of “hamburger” and “LOLCAT”. Given three strings x, y, and z, you wish to tell whether or not z is a shuffle of x and y.
(a) (8 marks) Fill in the blanks in the incomplete dynamic programming algorithm below. The return value of Shuffled(x, y, z) should be True if z is a shuffle of x and y, and False if it is not.
Page 7 of 16
1 2 3 4 5 6
9 10 11 12 13
Let n ← length(x). Let m ← length(y). Let r ← length(z). if n=0then
If y == z return , else return elseifm=0then
If x == z return , else return
Letx− ←x[1,…,n−1]. Lety− ←y[1,…,m−1]. Letz− ←z[1,…,r−1]. Return
((z[r−1] == x[n−1])∧Shuffled( once or not at all.
,z1))∨((z[r−1] == y[m−1])∧Shuffled( ,
Each blank should contain True, False, x, y, z, x1, y1 or z1. Some terms may appear more than
Solution: In order: True, False, True, False, x−, y, x, y−. We break the problem down as a sequence of choices: if z is a shuffle of x and y, then which of x and y does the last character of z come from? The base case, covered by lines 5–9, is that any word z is a shuffle of z and the empty string. x−, y− and z− are x, y and z with their last characters removed, respectively. z can be expressed as a shuffle of x and y whose last character comes from x if and only if the last characters of z and x match, and z− is a shuffle of x− and y. Likewise, z can be expressed as a shuffle of x and y whose last character comes from y if and only if the last characters of z and y match, and z− is a shuffle of x and y−. This gives rise to the recurrence relation of lines 9–13.
(b) (2 marks) What is the running time of Shuffled, if properly memoised? Choose the strongest bound that applies.
D. O(mnr).
E. None of the above.
Solution: C. The only possible recursive calls are to initial segments of x, y and z. Moreover, the number of characters missing from the end of z must be precisely equal to the number of characters missing from the end of x plus the number of characters missing from the end of y — so the z argument is determined by the x argument and the y argument. There are O(n) possible x arguments, and O(m) possibly y arguments, for a total of O(mn) possible recursive calls. The non-recursive parts of each call take O(1) time, so the overall time taken is O(mn). (Note that this is not the same as O(n2) — for example, if n = 1 and m is very large.)
14. (5 marks) (Medium question.) Mark the following five statements true or false.
(a) (1 mark) The problem of finding a minimum vertex cover in a given input graph is in NP.
(b) (1 mark) Every Co-NP-complete problem is the complement of an NP-complete problem (under Karp reductions).
(c) (1 mark) There is a Karp reduction from every problem in NP to 3-SAT.
Page 8 of 16
(d) (1 mark) Every decision problem with a polynomial-time algorithm is in NP.
(e) (1 mark) If a problem is NP-complete under Karp reductions, then it is NP-complete under Cook reductions.
Solution: False, true, true, true, true. Finding a minimum vertex cover isn’t a decision prob- lem, so it can’t be in NP. If X is a Co-NP-complete problem, then for any problem Y in Co-NP, there is a Karp reduction fY,X mapping instances of Y to instances of X with the same answer in polynomial time. This same Karp reduction also maps instances of Y to instances of X with the same answer, so since every problem in NP is the complement of a problem in Co-NP, X must be NP-complete. 3-SAT is NP-complete as proved in the course, so every problem in NP Karp-reduces to it. The fourth statement says that P ⊆ NP, which is proved in the course. The fifth statement follows from the fact that if X ≤K Y , then X ≤C Y — the Cook reduction simply applies the Karp reduction to the given instance of X, applies the oracle to the result, and returns the output.
Long answer questions
15. (10 marks) (a) (5 marks) (Short question.) Give an example of an undirected graph with a closed Euler walk but no Hamilton cycle. You do not need to justify your answer.
(b) (5 marks) (Short question.) Give an example of an undirected graph with a Hamilton cycle but no closed Euler walk. You do not need to justify your answer.
16. (5 marks) (Short question.) The Bacon number of an actor is defined as follows. has a Bacon number of zero. The Bacon number of another actor is the shortest chain of co-stars linking them to . For example, was in Change of Habit with , and was in JFK with , so has a Bacon number of at most 2. Elvis was never in any movies with Kevin directly, so he has a Bacon number of exactly 2. Given API access to IMDB, briefly summarise an efficient algorithm to output the Bacon number of a given actor by reducing to a problem solved in the course. You do not need to prove your algorithm works.
Solution: There are many possible answers, but one of them would be the graph G = (V, E) with V = [5], E = {{1, 2}, {2, 3}, {3, 1}, {3, 4}, {4, 5}, {5, 3}}, i.e.:
Here 123451 is a closed Euler walk, but the only cycles in the graph are 1231 and 3453.
Solution: There are many possible answers, but one of them would be the graph G = (V, E) with V = [4], E = {{1, 2}, {2, 3}, {3, 4}, {4, 1}, {1, 3}}, i.e.:
Here 12341 is a Hamilton cycle, but there is no closed Euler walk since 1 and 3 have odd degree.
Page 9 of 16
Solution: Let A be the set of all actors listed on IMDB. Let
E = {i, j : actors i and j have appeared in a movie together}.
Then the Bacon number of an actor is precisely their distance from in the graph (A, E), which can be queried using the IMDB API. The problem can therefore be solved with O(|A|2) API queries using breadth-first search.
17. (10 marks) (Short question.) You are running a cattery. You have a large number of cats, and you wish to feed them a diet which meets all their nutritional needs while spending as little money as possible. You are given a list of store-brand foods and supplements, along with their prices and nutritional content per kilo. Express this as a linear programming problem by filling in the blanks below. You may assume all nutritional needs are lower bounds (e.g. “needs at least 200g protein per day”).
The linear programming problem will be of the form
c1x1 +···+ctxt →min,
A⃗x ≥ ⃗b, ⃗x ≥ ⃗0 .
Here x1,…,xt represent , c1,…,ct are given by , the rows of A are given by , and ⃗b is given by . This problem an optimal solution.
Each blank will be filled by one of the following:
the costs of each food or supplement per kilo;
the nutritional content of each food per kilo;
the amounts of each food to give the cats in kilos per day; the total amount of each nutrient the cats require per day; always has;
may or may not have;
never has.
Solution: The completed paragraph is given by: The linear programming problem will be of the form
c1x1 +···+ctxt →min, A⃗x ≥ ⃗b,
Here x1,…,xt represent the amounts of each food to give the cats in kilos per day, c1,…,ct are given by the costs of each food or supplement per kilo, the rows of A are given by the nutritional content of each food per kilo, and ⃗b is given by the total amount of each nutrient the cats require per day. This problem always has an optimal solution.
To see the last part is true, note that as explained in video 8-2, the only way a problem can fail to have any solutions is if either the objective function is unbounded or if there are no feasible solutions. Here each ci is implicitly non-negative (no food can ever have a negative cost), so the objective function is bounded below by zero. Moreover, we can satisfy every constraint by taking x1 , . . . , xt large enough, so there is at least one feasible solution. (Of course, since we don’t currently allow any upper bounds on e.g. caloric intake, this may result in the cats becoming spherical…)
Page 10 of 16
18. (5 marks) (Short question.) Give a list of augmenting paths from s to t in the below flow network, with respect to the given flow.
5/10 b 2/3 e 2/4 st
Solution: sbet, sbedt, sbdt, sadt, sadbet. The residual graph is as shown below. 2
(a) (10 marks) (Medium question.) Consider the following greedy algorithm to find a large independent set in a graph.
Algorithm: GreedyIS(G, k)
Input : A graph G.
Output: An independent set of G. begin
Let output ← ∅. foreach v ∈ V (G) do
if output ∪ {v} is an independent set then output ← output ∪ {v}.
Return output.
Page 11 of 16
Fix an integer ∆ > 0. If G has n vertices and maximum degree ∆, then how large an independent set is GreedyIS guaranteed to output? Your answer should include both an upper bound and a lower bound, and a brief justification of each.
(b) (5 marks) (Medium question.) Now consider the following refinement of GreedyIS. Algorithm: BetterGreedyIS(G, k)
Solution: n/(∆ + 1). A vertex is added to output if there is no vertex already adjacent to it in output already. Hence each vertex v that GreedyIS adds to output prevents at most d(v) ≤ ∆ further vertices from being added to output. Thus |output| + ∆ · |output| ≥ n. Rearranging, we obtain |output| ≥ n/(∆ + 1) as required. Conversely, if we take G to be a union of cliques of size ∆ + 1, then any independent set can contain at most one vertex from each clique, so a maximum independent set contains only n/(∆ + 1) vertices.
Input : A graph G.
Output: An independent set of G. begin
SortV(G)inincreasingorderofdegree. WriteV(G)={v1,…,vn},withd(v1)≤···≤d(vn). Let output ← ∅.
for i = 1 to n do
if output ∪ {vi} is an independent set then
output ← output ∪ {vi}. Return output.
Prove that BetterGreedyIS may still output an independent set of size O(1) even if G contains an independent set of size Ω(n).
Solution: There are many ways of doing this — here’s one. Consider an input graph G = (V, E) of the following form. Let t be an arbitrary (large) integer. We have V = {a, b} ∪ A ∪ B, where A and B are t-vertex sets. We form E by joining a to every vertex in A, b to every vertex in B, and every vertex in A to every vertex in B. Then d(a) = d(b) = t, and d(v) = t+1 for all v ∈ A ∪ B. Thus BetterGreedyIS will pick first a, then b, then be unable to pick any other vertices, returning a set of size 2 ∈ O(1). Meanwhile, A is an independent set of size t = (|V | − 2)/2 ∈ Ω(|V |).
20. (5 marks) (Medium question.) Let (G, w) be a connected weighted graph with non-negative weights, and let T be result of running Kruskal’s algorithm on G. Let w′(x, y) = log(w(x, y) + 1) for all edges {x, y} ∈ E(G). Is T a minimum spanning tree of (G, w′)? Briefly explain your answer.
21. (5 marks) (Medium question.) The Travelling Salesman Problem (TSP) is defined as follows. We are given a list of n cities and, for each unordered pair {i,j} of distinct cities, the cost c(i,j) of travelling between i and j. (These costs can be arbitrary positive integers.) We are also given an integer k. We must output Yes if there is a way to travel to each city exactly once, then return back to the starting point, with total cost at most k. We