
Algorithms homework related to exam 3

Hw7

1. Recall from your data structures course that a queue supports two operations, enqueue and dequeue,
and a stack supports two operations, push and pop.

(a) Show how to simulate a queue using two stacks.

(b) What is the worst-case time complexity for the two queue operations, in your simulated queue
from part (a)?

(c) What is the amortized time complexity of your queue operations, assuming you start with an
empty structure?

If you didn’t solve this problem and want to keep working on it, here’s a hint:

Draw the stacks back-to-back, horizontally, with the open ends at the extreme left and right. The
queue process will move elements from left to right in general; into one stack, then transfer to
the other, then out of the other. Maintain the invariant that at all times the left-to-right (or
right-to-left) order of items represents queue order.

Answer:

Visualize the stacks set up back-to-back. For instance, let them be drawn horizontally, with one
stack L open to the left, and the other stack R open to the right. Let R handle enqueuing, and
let L handle dequeuing. If it weren’t for the “closed” ends of the stacks, items would enter on the
right and emerge on the left. A problem only occurs when L is empty and we need to dequeue.
Suppose some set of items has been enqueued in R, and that L is empty. We can incrementally
pop everything and push into L, which sets up the correct order for these items to be dequeued
whenever necessary. In other words, instead of the standard enqueue-dequeue phases when using
a regular queue, items go through an enqueue-transition-dequeue process with two stacks. This
transition phase has to be charged to either pushing or popping. Notice that we cannot simply
push a new item into R and send it over to L immediately, if L is non-empty. This would violate
the queue order. So, one way or another, either pushing or popping will have a worst-case time of
O(n).

Now we will see how to get constant amortized time complexity per operation. While L is not
empty, just enqueue and dequeue in constant time, as required. As soon as L becomes empty,
transfer everything from R over to L, using a pop and a push for each element. This may cost
O(n) time, but we can also count it as O(1) per item moved. Every item will be involved in a
transition precisely once. When we complete a transition, R is empty, which is great; it is ready to
accept new items. On the other side, L is now ready to go through several constant-time dequeues.
So the higher the cost of any particular transition, the more we will wait until another transition
occurs. In other words, such a transition will occur rarely.
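In Python, with lists as stacks (append is push, pop is pop), the scheme can be sketched as follows; the class and method names here are my own:

```python
class TwoStackQueue:
    """Queue simulated with two stacks, as described above.

    R (here self.r) receives enqueued items; L (self.l) holds items in
    dequeue order. Items are transferred from R to L lazily, only when
    L runs empty, so each item is moved exactly once."""

    def __init__(self):
        self.r = []  # enqueue side
        self.l = []  # dequeue side

    def enqueue(self, x):
        self.r.append(x)  # one push: O(1) worst case

    def dequeue(self):
        if not self.l:
            # Transition phase: pop everything from R, push onto L.
            # May cost O(n), but O(1) amortized per item moved.
            while self.r:
                self.l.append(self.r.pop())
        if not self.l:
            raise IndexError("dequeue from empty queue")
        return self.l.pop()
```

Each element is pushed and popped at most twice (once per stack), which is another way to see the constant amortized bound.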

You can use the accounting method to charge 3 units of time for every enqueue; 1 of those units
will pay for pushing into R, and the other two get stored along with the element in R. Eventually
they will pay for its transfer over to L, whenever that happens. Dequeueing (popping from L)
just gets charged 1 unit. Our maximum amortized cost per operation is 3.

For the potential method, let Φ equal twice the number of items in R. Thus Φ is always non-
negative, and it starts out as zero. Whenever we enqueue (push into R), ΔΦ = 2, so the amortized
cost is 3. When we dequeue, the pop from L generates ΔΦ = 0, so the amortized cost is just 1 if
L was not empty. But if L was empty, the transition of k items from R to L creates ΔΦ = −2k,
which perfectly offsets the actual transition cost. Thus the contribution of this expensive step to
the amortized cost is zero. Our maximum amortized cost per operation is 3.

2. An extended queue supports the two standard queue operations, but has one extra operation,
called Q-pop, through which we access and remove the item that has been enqueued most recently.
Show how to simulate an extended queue, using three stacks, such that the amortized cost of each
extended queue operation is constant.
You can only use stack operations, and you should not duplicate (copy) elements.

Hint: Let the third stack be temporary. Before and after each operation it should be empty.

Answer:

Setup:

With two stacks we can't handle all three operations of an extended queue in sub-linear amortized
time per operation. Alternating dequeue and Q-pop would require moving everything back and
forth each time. But with a third stack, C, we can operate as follows.
Enqueueing is handled by pushing onto stack A. We will always handle Q-pop by standard pop
from A. This means that we will make sure that if the data structure is non-empty, then the
last-inserted element is at the top of A. So far, only A is needed and both operations take O(1)
worst-case time, so things get interesting only if we need to dequeue.

Setup, part 2:

Now suppose that we need to dequeue, when B is empty and A contains k items. We pop half
of the items from A and push them onto C. (I'll assume k is even here; handling odd k is just a
detail.) Then we pop the other half of A and push onto B. So far the cost is 2k. Finally we pop
all of C and push back onto A. So the total cost is 3k. The effect is that the bottom half of A is
transferred to B, and the top half of A just slides down within A. This means that we can keep
enqueuing in constant time (push A), we can dequeue in constant time (pop B), and we can Q-pop
in constant time (pop A), until A or B become empty again. If B becomes empty and we need to
dequeue, we transfer half of A over to B as just described. But if A becomes empty and we need
to Q-pop, we transfer half of B over to A in a similar way. The cost is always 3 times the number
of elements, so of course in the worst case this is linear.
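A sketch of this three-stack scheme in Python (lists as stacks; the names are my own). The rebalancing step parks the top half of the non-empty stack in C, moves the bottom half across, and slides the parked half back:

```python
class ExtendedQueue:
    def __init__(self):
        self.a = []  # top of A = most recently enqueued item
        self.b = []  # top of B = oldest item still in the queue
        self.c = []  # temporary stack; empty between operations

    def _rebalance(self, src, dst):
        # Park the top half of src in C, move the rest (at least one
        # item when src is non-empty) to dst, then slide the parked
        # half back into src. Total cost is at most 3 * len(src).
        park = len(src) // 2
        for _ in range(park):
            self.c.append(src.pop())
        while src:
            dst.append(src.pop())
        while self.c:
            src.append(self.c.pop())

    def enqueue(self, x):
        self.a.append(x)

    def dequeue(self):
        if not self.b:
            self._rebalance(self.a, self.b)
        return self.b.pop()

    def qpop(self):
        if not self.a:
            self._rebalance(self.b, self.a)
        return self.a.pop()
```

Note that `_rebalance` leaves the oldest remaining item on top of B and the newest on top of A, which is exactly the invariant the analysis relies on.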

Intuition for amortization:

The linear cost of the first transfer can be charged to the linear number of enqueues that must
have preceded it. In general when A or B become empty, we use C as an intermediate to split the
remaining elements among A and B again. The linear transfer cost in either scenario can always be
“shared” by the linear number of O(1)-time operations that must have preceded. In other words,
after every transfer we know that there will be a proportional number of operations until we need
another transfer.

Accounting method, focusing on the first transfer:

Suppose that we pretend that every enqueue costs 7 units instead of 1. One unit pays for the push.
We store 6 units with every item in A. (Why 6? That is based on the calculation that follows in
this paragraph; we could have used a variable and figured this out algebraically.) If we Q-pop,
we use 1 of the item’s 6 units and burn the other 5. Now suppose that A contains k items and
we need to make a transfer to B, because we want to dequeue. We have calculated the total true
cost of this transfer to be 3k. We are left with k/2 items in A, and we want them to maintain the
invariant of each having 6 saved units. So essentially the k/2 items that are no longer in A must
contribute their savings to pay for the entire transfer. They initially had 3k units tied to them, so
that’s enough. But we will need to address the fact that the items in B do not have any credits.

Accounting method, more generally:

Now that we have the same number of items in A and B, we generalize our approach. Our invariant
will be that the stack containing more elements holds all the savings, specifically 6 units per item
in that stack. If both stacks contain the same number of items, arbitrarily use one to hold all the
savings. So for example if we want to Q-pop and B is about to have one more item than A, we
move all of the savings from A to B, including the 6 units belonging to the item that is about to
be Q-popped. Note that transferring units of savings is free: it is not an algorithmic operation; we
are just analyzing. Now if we keep Q-popping while A has fewer items, we can burn the savings of
what is Q-popped, because every item in B already has 6 units saved. In fact for every enqueue we
don’t need to save anything (until A reaches the size of B, of course). Similarly for every dequeue
that occurs while B is larger, we can just burn the savings. But if a dequeue will make B have one
item less than A, we move all of the saved units back over to A.

The result of this generalized approach is that whenever one stack becomes empty, the other one
has 6 units stored per item, and that is exactly enough to pay for the cost of a transfer, while
maintaining the generalized invariant. Thus every operation has an amortized cost of at most 7.

Potential method, unsuccessfully:

First, consider what changes a lot when we have an expensive operation (i.e., a transfer). Clearly,
one answer is the number of items in A (or B). But we need that change to be negative as well,
when there is an expensive operation. If we use the number of items in A, this will change a lot
but positively, after a transfer triggered by a Q-pop (i.e., A was empty beforehand). Similarly if
we use the number of items in B, this will change a lot, positively, after a transfer triggered by a
dequeue. This means we need to find something else that changes a lot.

Potential method, correctly:

For a transfer to be triggered, either A or B must have been empty. After the transfer, A and B
hold the same number of items. So the absolute value of the difference between the sizes of A and
B changes a lot, and in fact negatively. Accordingly we will define Φ = 3 · |A − B|, where A and B
denote the stack sizes. The factor of 3
was determined by recalling that the true cost of a transfer is 3k if the non-empty stack contains
k items. So Φ(i−1) = 3k, and Φ(i) = 0, which means ΔΦ = −3k. Thus the amortized cost of any
transfer is zero: our main objective is accomplished, that is, the expensive operation has a small
amortized cost. Now we check that there are no undesired side-effects: for any enqueue, dequeue
or Q-pop, the true cost is 1, and ΔΦ = ±3, so the amortized cost is at most 4. The sign depends
on the relative sizes of A and B. We conclude that the amortized cost of any operation is at most
4.
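As a sanity check of the bound (not part of the proof), one can count elementary stack pushes and pops while driving the structure with random operations; starting from empty, the total should never exceed 4 per operation. A sketch, with my own helper names:

```python
import random

def total_stack_cost(num_ops, seed=1):
    # Drive the three-stack extended queue with random operations and
    # count every elementary push/pop, to check empirically that the
    # total cost stays within 4 units per operation.
    rng = random.Random(seed)
    a, b, c = [], [], []
    cost = 0

    def move(src, dst, k):
        nonlocal cost
        for _ in range(k):  # each element moved = one pop + one push
            dst.append(src.pop())
            cost += 2

    def rebalance(src, dst):
        park = len(src) // 2
        move(src, c, park)        # park top half in C
        move(src, dst, len(src))  # bottom half crosses over
        move(c, src, len(c))      # parked half slides back

    for _ in range(num_ops):
        op = rng.choice(["enqueue", "dequeue", "qpop"])
        if op == "enqueue":
            a.append(rng.random())
            cost += 1
        elif op == "dequeue" and (a or b):
            if not b:
                rebalance(a, b)
            b.pop()
            cost += 1
        elif op == "qpop" and (a or b):
            if not a:
                rebalance(b, a)
            a.pop()
            cost += 1
    return cost
```

Running this for any number of operations, the returned total is at most 4 times the number of operations, consistent with the potential argument above.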

Back to accounting:

The potential method result seems better than what we got with the accounting method, but in
fact the accounting strategy can be modified to get an amortized cost of at most 4 as well. The
invariant can change as follows: suppose that one stack contains c more items than the other.
Then that stack must store 3c units. That means that when we begin by enqueuing (push into
A), we pretend that the cost is 4 instead of 1. The first transfer from A to B will use up all of
our savings (specifically 3k, as described earlier). Then if we enqueue again we pretend that the
cost is 4. If instead we Q-pop, we pretend that the cost is 4 and we save 3 units in B instead of A,
because B would now have one more item than A. Similarly if we were to dequeue when A and
B hold the same number of items, we pretend that the cost of popping from B is 4, and save 3
units in A. Any time that we remove an item from the stack containing more items, we can burn
3 units to preserve our nice invariant. It is now clear that when one stack ends up empty, the
other one has enough to pay for a transfer, so the amortized cost of the transfer is zero and we
reset the total savings to zero as well. In conclusion, the highest amortized cost of any operation is 4.

Final note: This extended queue structure, which also has a stack operation, is called a Quack.

Hw8

1. You are given two strings that each have length n. You must find the LCS itself, not only its length.
You are allowed polynomial time to do this but you must only use linear space. Describe how to
do this.

Answer:

Run the standard algorithm that we use to get the length of the LCS, using the equivalent of
two rows of space. Then trace back as much as possible within the last two rows, and output
that part of the solution for the LCS itself (i.e., output the suffix of a solution). When we require
information that has been deleted, recycle the storage space and start all over again (from the
top) to retrieve the previous two rows. Each time we start over, we spend O(n²) time, so overall
the complexity is O(n³).
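The two-row length computation that this reconstruction repeatedly reruns can be sketched as follows (a standard textbook routine, not specific to this answer):

```python
def lcs_length(s, t):
    # Standard LCS dynamic program keeping only two rows:
    # prev is the row for s[:i-1], curr is the row for s[:i].
    # Time O(len(s) * len(t)), space O(len(t)).
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        curr = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[-1]
```

The restart-based reconstruction described above wraps this in an outer loop: trace back through the two retained rows, then rerun the DP on the remaining prefix to regenerate the rows that were discarded.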

2. Let S be a string with n characters, indexed 1 to n. Let X be a sorted array with m integers
within range 1 to n−1, inclusive. The integers in X correspond to indices in S after which we
must insert a space. For example, if S = ALGOISSOCOOL is to be split into 4 words of length
4, 2, 2, 4 respectively, then X = [4, 6, 8].

Given S and X, the spaces must be inserted one at a time, in any order. The order matters
because it costs k to insert a space anywhere on a substring of size k. When a space is inserted,
we get two substrings that we eventually process disjointly. In the example above if you first split
at position 6, the cost will be 12. Then on each side the cost of the next split will be 6. On the
other hand if you were to first split at position 4, the cost would be 12 but then you would pay 8
for the next split, etc.

(a) Derive a recursive formula that returns the minimum cost of splitting a string as defined above.
Briefly explain why the formula is correct.

(b) How many distinct subproblems might you need to solve during recursion?

(c) How fast can you determine what the minimum cost of splitting a given string is? Your time
complexity should be polynomial.

Answer:

(a) When we choose to use an element from X, we insert a space, which partitions both S and
X into two parts each, and thus we obtain two subproblems with the same description as what
we were originally given. We always pay n for the first split, no matter where it is. There are
m possible scenarios, one per initial split position. For each of these, we add the cost of the two
resulting subproblems, and add the cost n. We take the min over all scenarios. So we could try
something like the following:

Cost([1 . . . n]) = n + min{Cost([1 . . . k]) + Cost([k+1 . . . n])}
where the min{} is taken over all values k in X.
This would generalize as follows:

Cost([p . . . q]) = (q − p + 1) + min{Cost([p . . . k]) + Cost([k+1 . . . q])}
where min{} is taken over all values k in X that fall within p . . . q−1, i.e., valid split points.
But there’s an issue that needs to be considered. Given a range [p . . . q], how do we figure out
which split positions in X to consider in our function? In other words how do we find “all values
k” that are valid? The parameters p and q are indices of S, but in X they are values, not indices,
so we can’t just instantly jump to the appropriate position in X and start listing valid split points

(between p and q). We could use a binary search for p and q in X, costing O(log m) time. Rather
than do this, we could instead work primarily with indices of X instead of indices of S.

This means that we should have a recursive function of two variables, representing the range of
split points that we are currently considering, rather than considering a subset of S. If we know
that we are currently considering all split points between positions i and j in X, this implies that
the part of S that we're considering begins at index X[i−1] + 1, or at S[1] if X[i−1] isn't defined.
Similarly, the part of S that we're implicitly considering ends at X[j+1], or at S[n] if X[j+1] isn't
defined. So the length of the current part of S that we care about can be found using these
calculations, and we could call that Length(i, j). Our function could look something like this:

Cost([i . . . j]) = Length(i, j) + min{Cost([i . . . k−1]) + Cost([k+1 . . . j])}
where min{} is taken over all values k such that i ≤ k ≤ j. Note that if i = j then we don't need
to recurse; we can just calculate the length.

(b) With the first recursive formulation, it may seem that we would have to solve Cost(p . . . q) for
all 1 ≤ p < q ≤ n, meaning Θ(n²) subproblems, but in fact the only values of p and q that we would
ever deal with are determined by where we might split. So in fact there are only Θ(m²) distinct
subproblems. Clearly this is also the case for the second formulation.

(c) We will use a dynamic programming table of size Θ(m²). For the second formulation, each cell
is computed by taking the min of O(m) pre-computed subproblems of smaller size. Thus we get
O(m³) time complexity overall to compute Cost(1, m). For the first recursive function that is based
on indices in S, as described in (a) we would get an extra logarithmic factor per cell, to figure out
what range of X to consider. But that's actually not the bottleneck of the total time complexity,
because we're already spending linear time per cell.