
Algorithms and Data, Summer 1, 2016 Problem Set 3 Answers
Part I
1.) Θ(N²)
2.) Θ(N log N)
3.) Θ(log² N)
4.) Θ(N²)
5.) Θ(log N)
Part II
1.) The idea here is that any object that is a majority of the whole array must also be a majority of at least one of the two halves. The following recursive algorithm returns the correct answer as M.
// Given a one-indexed array of objects A, returns (M, O, C)
// M: whether a majority exists
// O: a sample of the majority object
// C: a count of the majority object in this array
Majority?(A[1…N]):
    If N == 1:
        Return (true, A[1], 1)
    else:
        M1, O1, C1 = Majority?(A[1…floor(N/2)])
        M2, O2, C2 = Majority?(A[floor(N/2)+1…N])
        if M1 and M2 and O1 == O2:
            Return (true, O1, C1 + C2)
        else:
            if M2:
                for each object O in A[1…floor(N/2)]:
                    if O == O2: C2++
            if M1:
                for each object O in A[floor(N/2)+1…N]:
                    if O == O1: C1++
            if C1 > N/2:
                return (true, O1, C1)
            if C2 > N/2:
                return (true, O2, C2)
            return (false, null, 0)
This makes just one pass of comparisons over all the objects at each level of the recursion, for a recurrence of T(N) = 2T(N/2) + O(N), the "mergesort recurrence," which is O(N log N).
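For concreteness, here is a runnable Python sketch of the same divide-and-conquer idea; the function and variable names are our own, not part of the assignment.

# Divide-and-conquer majority: returns (found, obj, count) for A[lo:hi].
def majority(A, lo=0, hi=None):
    if hi is None:
        hi = len(A)
    n = hi - lo
    if n == 1:
        return (True, A[lo], 1)
    mid = lo + n // 2
    m1, o1, c1 = majority(A, lo, mid)
    m2, o2, c2 = majority(A, mid, hi)
    if m1 and m2 and o1 == o2:
        return (True, o1, c1 + c2)
    # Count each half's candidate in the opposite half.
    if m2:
        c2 += sum(1 for x in A[lo:mid] if x == o2)
    if m1:
        c1 += sum(1 for x in A[mid:hi] if x == o1)
    if m1 and c1 > n / 2:
        return (True, o1, c1)
    if m2 and c2 > n / 2:
        return (True, o2, c2)
    return (False, None, 0)

# Example: 'a' appears 5 times out of 7, so it is a majority.
print(majority(list("aabacaa")))   # (True, 'a', 5)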
2.) Here, we can take the biggest run to be the best of the biggest run in the left half, the biggest run in the right half, and the biggest run that crosses between the halves.

// Returns the value of the best run contained anywhere within the array (W), the value of the
// best run starting from the left edge (L), and the value of the best run that touches the
// right edge (R), along with the indices where each should start and/or end (l, wl, wr, r),
// and the product P of everything in the array.
// Assume WLOG that the percentages are expressed as numbers to multiply by, so +10% = 1.1
BestRun(A[1…N]):
    If N == 1:
        if A[1] < 1:
            // Don't include this element in future runs; P is still the true product
            return 1, 1, 1, -1, -1, -1, -1, A[1]
        else:
            return A[1], A[1], A[1], 1, 1, 1, 1, A[1]
    L1, W1, R1, l1, wl1, wr1, r1, P1 = BestRun(A[1…floor(N/2)])
    L2, W2, R2, l2, wl2, wr2, r2, P2 = BestRun(A[floor(N/2)+1…N])
    if P1·L2 > L1:
        L = P1·L2; l = l2
    else:
        L = L1; l = l1
    if R1·P2 > R2:
        R = R1·P2; r = r1
    else:
        R = R2; r = r2
    P = P1·P2
    Compare W1, W2, and R1·L2; set W to the largest, and set (wl, wr) to (wl1, wr1),
    (wl2, wr2), or (r1, l2) as appropriate.
    Return L, W, R, the indices, and P.
At the top level, W is the desired value.
This is actually even better than O(N log N), since the combine step does only constant work: T(N) = 2T(N/2) + O(1) = O(N). But if you did a linear pass to find the best L and R in each recursion, that would be O(N log N), and that's acceptable for this problem.
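Here is a minimal runnable Python sketch of the same combine logic, tracking only the run values and not the indices to keep it short (names are ours; it assumes the input is a list of multipliers such as 1.1 for +10%):

# Divide and conquer over multipliers A[lo:hi]; returns (L, W, R, P):
# L = best product of a prefix run, W = best product of any run,
# R = best product of a suffix run, P = product of the whole slice.
def best_run(A, lo=0, hi=None):
    if hi is None:
        hi = len(A)
    n = hi - lo
    if n == 1:
        x = A[lo]
        if x < 1:
            return (1, 1, 1, x)    # skip losing days; 1 = empty run
        return (x, x, x, x)
    mid = lo + n // 2
    L1, W1, R1, P1 = best_run(A, lo, mid)
    L2, W2, R2, P2 = best_run(A, mid, hi)
    L = max(L1, P1 * L2)           # best prefix may cross into the right half
    R = max(R2, R1 * P2)           # best suffix may cross into the left half
    W = max(W1, W2, R1 * L2)       # best run may straddle the boundary
    return (L, W, R, P1 * P2)

# Example: the best run is 1.2 * 1.3 ≈ 1.56 (skipping the 0.5 day).
print(best_run([1.1, 0.5, 1.2, 1.3, 0.9])[1])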
A divide-and-conquer solution that uses absolute stock values and looks for the best A[j]/A[i] is also possible.
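Sketching that alternative too, since it is arguably simpler: each recursive call returns its half's minimum and maximum absolute price along with its best ratio (Python; names are ours):

# Alternative divide and conquer on absolute prices: returns
# (min, max, best) where best is the largest A[j]/A[i] with i <= j.
def best_ratio(A, lo=0, hi=None):
    if hi is None:
        hi = len(A)
    if hi - lo == 1:
        return (A[lo], A[lo], 1.0)   # ratio 1 = buy and sell the same day
    mid = lo + (hi - lo) // 2
    lo1, hi1, best1 = best_ratio(A, lo, mid)
    lo2, hi2, best2 = best_ratio(A, mid, hi)
    # The best trade is in the left half, in the right half, or buys at
    # the left half's minimum and sells at the right half's maximum.
    best = max(best1, best2, hi2 / lo1)
    return (min(lo1, lo2), max(hi1, hi2), best)

# Example: buy at 80, sell at 120, for a ratio of 1.5.
print(best_ratio([100, 80, 95, 120, 110])[2])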
3.) a.) The waiting time for N fair bits is N times the expected wait for one fair bit. The probability of getting a fair bit from a pair of tosses is 2p(1-p), so the expected number of pairs we toss for one fair bit is 1/(2p(1-p)). Each pair is two tosses, so the expected number of tosses per fair bit is 1/(p(1-p)). The expected number of tosses until we hit N fair bits is then N/(p(1-p)). For example, with p = 0.9 that is 1/(0.9 · 0.1) ≈ 11.1 tosses per fair bit.
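A quick Monte Carlo check of that expectation, as a sketch in Python (the function name and parameters are ours):

import random

# Simulate the pairing scheme: toss pairs until a 01 or 10 appears,
# and count how many individual tosses it takes per fair bit.
def tosses_per_fair_bit(p, trials=100_000):
    total = 0
    for _ in range(trials):
        while True:
            a = random.random() < p
            b = random.random() < p
            total += 2
            if a != b:
                break
    return total / trials

# Should be close to 1 / (p * (1 - p)); for p = 0.9 that is about 11.1.
print(tosses_per_fair_bit(0.9))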

b.)
Extract-more-bits(UnfairBits[1…N]): // Returns fair bits and some leftover unfair bits
    If N == 1:
        return empty list, UnfairBits // With just one unfair bit, pass it up
    FairBits1, UnfairBits1 = Extract-more-bits(UnfairBits[1…floor(N/2)])
    FairBits2, UnfairBits2 = Extract-more-bits(UnfairBits[floor(N/2)+1…N])
    NewFairBits = concatenate(FairBits1, FairBits2)
    OldUnfairBits = concatenate(UnfairBits1, UnfairBits2)
    NewUnfairBits = empty
    for i = 1; i + 1 <= |OldUnfairBits|; i += 2:
        if OldUnfairBits[i] == OldUnfairBits[i+1]:
            NewUnfairBits.Add(OldUnfairBits[i])
        else: // 01 or 10: emit a fair bit
            NewFairBits.Add(OldUnfairBits[i])
    return NewFairBits, NewUnfairBits
This works because every time we find a pair of unfair bits that are the same, we can treat the pair as a single flip of an even more biased coin for the next level up. However, we have to be careful to avoid comparing "flips" whose probabilities differ, so this discards any excess unfair bit at the current level.
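A runnable Python sketch of this extractor, assuming the input arrives as a list of 0s and 1s (names are ours):

import random

# Recursively extract fair bits from a list of i.i.d. biased bits.
# Returns (fair_bits, leftover_bits); leftovers at the same level share
# the same bias, so they can safely be paired again one level up.
def extract_more_bits(bits):
    if len(bits) == 1:
        return [], bits                        # a single bit just gets passed up
    mid = len(bits) // 2
    fair1, unfair1 = extract_more_bits(bits[:mid])
    fair2, unfair2 = extract_more_bits(bits[mid:])
    fair = fair1 + fair2
    old_unfair = unfair1 + unfair2
    new_unfair = []
    for i in range(0, len(old_unfair) - 1, 2):
        if old_unfair[i] == old_unfair[i + 1]:
            new_unfair.append(old_unfair[i])   # 00 or 11: still biased
        else:
            fair.append(old_unfair[i])         # 01 or 10: a fair bit
    return fair, new_unfair                    # any odd leftover bit is discarded

flips = [1 if random.random() < 0.9 else 0 for _ in range(1024)]
fair, _ = extract_more_bits(flips)
print(len(fair), fair[:10])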
4.) The number of swaps that happened by accident bounds the number of swap operations that Insertion Sort needs to perform in order to make things right again. In the worst case, each random swap needs to be undone by Insertion Sort. (In the best case, a swap undoes another swap.) The expected number of random swaps is kp. Insertion Sort also does at least one comparison per element to check that things are still in order, so the expected running time is O(n + kp).

Part III
Values omitted here, but easy enough to check on your end.
1.) The most likely scenario here is that the zipped single line is bigger than the original file. If we are using DEFLATE, the most likely reason is that the zip utility needed to include the "codebook" for the symbols it was using, and the codebook is actually bigger than the file itself. (I'll be a little surprised if anyone got a smaller file; at best, a good zip utility might check whether it's helping matters. Mine doesn't.)
2.) A reasonable run-length encoding might encode the number of repetitions as a 32-bit integer, followed by the repeated string, with a newline indicating the end of the string. For this file, that would be 32 bits + 8 bits × 4 characters = 64 bits. That's probably rather less than what your zip utility produced.
3.) It's unlikely the codebook includes entries this big, because the values we see are increasing too quickly for that to be the case. If all the codewords were roughly the same size, we should be able to double the size of the zipped file for only a constant number more bits, because at every stage there is already a codeword that encodes more or less the entire file up to that point.

Part IV
[values omitted]
For our experiments, we found that Bucketsort was faster than a non-optimized Quicksort, but not faster than in-place Quicksort. This is because memory allocation tends to be expensive and creates large constants in running times. Your implementation of Bucketsort probably also uses ArrayLists, which can incur some overhead in accessing elements. The grab-the-first and median-of-three optimizations are minor: grab-the-first at least does no harm when the values are all random anyway, and median-of-three has more of an effect when the values are nonrandom and we're trying to avoid particularly bad pivots.
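To make those two pivot rules concrete, here is a minimal Python sketch (names are ours, not from the assignment):

# Two common Quicksort pivot choices over A[lo..hi] (inclusive indices).

def grab_the_first(A, lo, hi):
    # Fine on random data; degrades to O(n^2) on already-sorted input.
    return lo

def median_of_three(A, lo, hi):
    # Index of the median of the first, middle, and last elements;
    # guards against sorted or reverse-sorted input.
    mid = (lo + hi) // 2
    trio = sorted([(A[lo], lo), (A[mid], mid), (A[hi], hi)])
    return trio[1][1]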