MIDTERM Solutions
Fundamental Algorithms, Fall 2011, Professor Yap (TA: Ziyao Wei)
Thu Oct 13, 2011 MIDTERM SOLUTION
INSTRUCTIONS:
• This a closed book exam, no calculators, but you may refer to a prepared 8”x11” sheet of notes.
• There are 9 questions: 7 short ones, 2 regular ones.
• Write all answers on the provided spaces in the Exam Sheets. Use the the back of the Exam Sheets for rough work.
SHORT QUESTIONS (70 Points)
(Q1) (10 Points) Asymptotics
(a) Suppose f > g infinitely often (i.o.). Please express this using our asymptotic notations (like dominance, etc).
(b) Please restate the condition f ̸≺≺ g (f is not super-dominated by g) using the “infinitely often” terminology.
(Q2) (10 Points) Orders of Growth
No proofs. Just order these 7 functions from smallest to largest Θ-order:
nlg5 n, n−lgn, lg2 n, n3/2 −10n, (√n)!, 2lgn, (lgn)n
(Q3) (10 Points) Master Theorem
Say whether T1(n) ≺≺ T2(n) or T1(n) ≻≻ T2(n) where
T1(n) = 8T1(n/4) + n1.5, T2(n) = 6T2(n/3) + n2.
Briefly justify using Master Theorem.
SOLUTION (a) It is NOT the case that f ≤ g (ev.).
(b) There exists C > 0 such that f ≥ Cg infinitely often.
Comments: This is obtained by negating the definitions of ≼ and ≺≺, and using de Morgan’s Law. A general rule of thumb is that the negation of an “eventuality” statement is an ‘infinitely often” statement.
SOLUTION
n−lgn ≺lg2n≺2lgn(=n)≺nlg5n≺n3/2 −10n≺(√n)!≺(lgn)n
n lg n) which is super dominated by lg((lg n)n )) = n lg lg n. Note that lg((√n)!) = Θ(√
Comments:
⃝c Chee Yap Page 1
October 18, 2011
SOLUTION Watershed constant α1 = log4 8 = log4 16 − log4 2 = 2 − 0.5 = 1.5. Using Master Theorem, this is CASE(0), and so T1(n) = Θ(n1.5 log n).
Note that T2(n) = Ω(n2). From this, we conclude that T1 ≺≺ T2.
Comments: Although unnecessary for this question, suppose you want to determine T2 up to Θ-order. The watershed constant in this case is α2 = log3 6 < log3 9 = 2. This is CASE(+), and so T2(n) = Θ(n2). We need not check regularity condition for this CASE(+) because it is automatic for polynomial driving functions.
(Q4) (10 Points) AVL Deletion
Construct an AVL tree with 12 nodes such that, by deleting one node, you will cause two double-rotations during rebalancing.
Draw three AVL trees: original one, and after each double-rotation. Partial credit if they are only single rotations instead of double-rotation.
HINT: It is not necessary to assign keys to the nodes: just show the shape of the tree, and label some nodes in order to clarify the deletion and rotations. Solution Figure:
del(x)
rot2(y)
rot2(z)
z
y
z
y
x y
xx
Figure 1: Delete x
SOLUTION This is very similar the the question in Hw4. You can make the two rebalancing acts to be single or double-rotations.
Refer to Figure 1. The idea is simple: using 12 nodes, you can construct a min-size tree of height 4. By definition, min-size trees have balance of +1 or −1 at every node. But depending on how you choose the balance of each node, you can get two single rotations, one single and one double rotation, or two double-rotations.
Comments:
(Q5) (10 Points) Comparison Model of Sorting
Let S(27) denote the minimal height of a tree program to sort 27 elements. Describe how you would go about computing this number if you only have a pocket calculator. Mention any pitfalls, numerical errors, etc.
SOLUTION This is discussed in hw1 solution. We know that S(27) = ⌈lg(27!)⌉, because the ITB is tight for n ≤ 29. To compute this number using a pocket calculator, you should use Stirling’s approximation from Lecture II. OK if you forget what it is, but you need to take logarithms to base e and base 2. What you must not do is to compute the factorial first, because you do not have enough precision to compute this factorial. The formula is
lg(n!) = n lg(n/e) + 1 lg(2πn) + lg(e)/(12n + δ) 2
for some 0 ≤ δ < 1. But in practice, it will turn out that it does not matter what δ is, as it will not affect the rounded up value of lg(n!). Of course, this not a proof. Even if there is uncertainty because of roundoff errors, the computed will be off by at most 1.
Comments:
⃝c Chee Yap Page 2 October 18, 2011
(Q6) (10 Points) Duplicate Keys in BST
Consider search trees with duplicate keys.
(a) Draw all the possible BST’ s with keys 1, 3, 3, 3, 4, assuming the root is 3.
(b) Suppose we want to design AVL trees with duplicate keys. Say why and how the BST property must be
modified.
Solution Figure:
333 131314
343 433
(a) (b) (c)
Figure 2: Duplicate Key
SOLUTION (a) There are three possibilities as shown in Figure 2.
(b) The BST property says that LeftTree < Root ≤ RightTree. Imagine an AVL tree with the keys 3, 3, 3. Then the BST property requires the tree to be (((3)3)3) (a right-path). But this violates the AVL balance property. To modify this, we must allow LeftTree ≤ Root ≤ RightT ree.
Comments:
(Q7) (10 Points) Median and Real Induction
Recall our median algorithm where we first find the medians of groups of size 5. Let T (n) be the worst case number of comparisons made by our algorithm when the input has n elements. We see that T (n) = T (n/5) + T(7n/10) + Cn, for some constant C > 0. In this question, we are interested in constant factors, not just asymptotics.
(a) Determine the value of C in this algorithm. For this purpose, use the fact that we can find the median of five elements with 7 comparisons.
(b) Using Real Induction, show that T (n) ≤ K n (ev.). Determine the smallest possible value of K as a function of C. State an explicit constant for K based on your answer in part (a). NOTE: As usual, you need not show the Real Basis in your induction.
SOLUTION (a) There are 7n/5 comparisons to find the medians of the groups of 5. Recursively,
we compute m, which is the median of the medians. Next, from each group, there are 2 elements
that must be compared to the m. Since there are n/5 groups, this comes to 2n/5 comparisons.
This proves that C = (7/5) + (2/5) = 9/5.
(b)LetT(n)≤K(n/5)+K(7n/10)+Cn=Kn(1+7 +C/K)≤Kn,provided1+ 7 9 510 5
10 +C/K≤1,or10 +C/K≤1,orK≥10C.UsingtheconstantC=9/5inpart(a),we conclude K = 18.
Comments:
REGULAR QUESTIONS (40 Points)
(Q8) (20 Points) Solving Recurrences
Recall that if μ(h) is the minimum size of an AVL tree of height h, then it satisfies the recurrence inequality μ(h) > 2μ(h − 2). Please go through the following steps (as in our proof of the Master Theorem) in order to ”solve it”.
(a) Simplify this into a recurrence for solving.
⃝c Chee Yap Page 3 October 18, 2011
(b) Convert the recurrence in (a) into Standard Form. HINT: first take logarithms. (c) Solve the Standard Form recurrence.
(d) Plug back to solve the original recurrence in (a).
SOLUTION (a) To simplify, we replace inequality by equality: μ(h) = 2μ(h − 2). (b)Usingthehint,wetakelogstogetlgμ(h)=1+lgμ(h−2). Thusm(h)=1+m(h−2) where we define m(h):= lg μ(h).
This is still not standard form, so we need another transformation (similar to what you see in the proof of the Master Theorem). Let n:=h/2 and define T (n) = m(2n). This yields
T (n) = m(2n) = m(2n − 2) + 1 = T (n − 1) + 1.
Thus we obtain the Standard Form T(n) = T(n − 1) + 1.
(c) Solving this one is trivial, T (n) = n by DIC.
(d) Thus m(h) = T (h/2) = h/2. Plugging back, μ(h) = 2m(h) = 2h/2
Comments: Note that we provided a direct solution in the text. It is instructive to see how this is a disguised Standard Form.
⃝c Chee Yap Page 4 October 18, 2011
(Q9) (20 Points) Merging in the Tape Model
In homework 1, we described a procedure called TapeMerge(T1,T2,T0) to merge items in T1 and in T2 into T0, assuming that the items in T1, T2 are already sorted. Let us explore what happens in case T1 and T2 have many runs. Recall the procedure Distribute(T0, T1, T2) to distribute the runs in T0 into T1 and T2 (alternately). For this question, assume that Distribute(T0, T1, T2) returns the number k ≥ 1 of runs that was originally in T0. Consider the following algorithm:
Clearly, using T apeM erge in this way is an abuse of our original intention. But if this algorithm halts, it would have successfully sorted the items in the original tape T0. Our goal is to prove that it will, in fact, halt.
HINT: Consider the runs that appear in T0 after TapeMerge. When does a new run appear in T0? Can you bound the number of such events?
Input: Items in T0. Initially T1, T2 empty. k ← Distribute(T0, T1, T2).
while (k > 1)
TapeMerge(T1,T2,T0)
k ← Distribute(T0, T1, T2)
SOLUTION CLAIM: Suppose T1 has r ≥ 1 runs and T2 has s ≥ 1 runs, then after the merge,
T0 willhaver+s−1runs.
Suppose the runs in T1 are initially R1, . . . , Rr) and in T2 are (S1, . . . , Ss). Consider the first
moment that either R1 or S1 are completely scanned by our algorithm. Wlog, suppose R1 has
been output to the tape T . Let S = S′ S′′ such that at the moment that R is output, only S′ 011111
has been output. Therefore, the tape contents of T0 is the merger of S1′ and R1. Let T0 = (M1)
where M1 is the merger of S1′ and T1. Note that at this moment, T1 = (R2,…,Rr) and
T =(S′′,S,…,S). 212s
Wehavetwocases,r=1orr>1. Weconsiderr=1astheBASECASE.Thereisasubtle difference between the base case and the general case.
CASE r = 1: In this case T1 becomes empty, and Merge will output the rest of the contents of T2, namely,
T =(S′′,S,…,S). 212s
Thus T = (M ,S′′,S ,…,S ). But every item in S′′ is larger than the last item in M , and 0112s11
hence (M1, S′′) actually constitute a single run. This proves that T0 only has s runs in total, and
this is less than r + s = 1 + s.
CASE r > 1: The algorithm with merge the contents in the rest of tapes T1 and T2, namely
T = (R ,…,R ) and T = (S′′,S ,…,S ). By induction, this output will have r + s − 2 12r212s
runs,
(M2, M3, . . . , Mr+s−2). This proves that tape T0 has r + s − 1 runs,
T0 = (M1,M2,M3,…,Mr+s−2).
Comments: ThisproofshowsthatthebehaviourofTapeMergeisquiteremarkable:itdoesnot reduce the number of runs until we reach the EOT of either T1 or T2. In the worst case, the above sorting algorithm may need n − 1 phases. It is not very efficient (in terms of phases) but at least it halts.
⃝c Chee Yap Page 5 October 18, 2011