代写 data structure compiler =============================================================================

=============================================================================
CSC 263 Lecture Summary for Week 7 Fall 2019
=============================================================================

READING: Chapter 17.
SELF-TEST: Exercises 17.1-2.

——————
Amortized Analysis
——————

– Rental car analogy
– Ski slope (gondola) analogy

Often, we perform _sequences_ of operations on data structures and time complexity for processing the entire sequence is important.

Definitions:

m-sequence = a sequence of m operations

Worst-case sequence complexity: Max total time over all m-sequences

Worst-case sequence complexity <= m * worst-case complexity of an operation But WCSC can be smaller than m * worst-case single operation! Definition: Over all m-sequences, worst-case sequence complexity amortized sequence complexity = ------------------------------. m Amortized complexity represents average worst-case cost of each operation. CAREFUL: *different* from average-case time complexity! - NO probability this time - average over the m operations, not over the possible inputs Amortized analyses can make more sense than plain worst-case analysis - Can give tighter bound - Better captures "overall" behaviour: A symbol table in a compiler is used to keep track of information about variables in the program being compiled: we care about the time taken to process the entire program, i.e., the entire sequence of variables, and not about the time taken for each individual variable. Three methods for amortized analyses: - aggregate - accounting - potential method is assigned as reading Example: Binary Counter = sequence of k bits (k fixed) with single operation: INCREMENT -- add 1 to integer represented by counter. The "cost" (i.e., running time) of one INCREMENT operation is equal to number of bits that change during INCREMENT. For example, if k=5, Initial counter: 00000 (value = 0) after INCREMENT: 00001 (value = 1) cost = 1 after INCREMENT: 00010 (value = 2) cost = 2 after INCREMENT: 00011 (value = 3) cost = 1 after INCREMENT: 00100 (value = 4) cost = 3 after INCREMENT: 00101 (value = 5) cost = 1 ... after INCREMENT: 11101 (value = 29) cost = 1 after INCREMENT: 11110 (value = 30) cost = 2 after INCREMENT: 11111 (value = 31) cost = 1 after INCREMENT: 00000 (value = 0) cost = 5 ... Aggregate Method ================ Compute worst-case sequence complexity of m-sequence and divide by m. For binary counter, amortized cost of sequence of m INCREMENT operations, starting with value 0, is computed as follows: consider the sequence of operations. bit number changes total number of changes 0 every operation n 1 every 2 operations floor(m/2) 2 every 4 operations floor(m/4) ... ... ... i every 2^i operations floor(m/2^i) ... ... ... k-1 every 2^{k-1} operations floor(m/2^{k-1}) Total number of bit-flips in m-sequence = sum over bits (# times bit i flips) - don't need to sum over all k bits: i = 0,...,min{k,floor(lg n)} - floor(lg n) because n increments won't touch more significant bits Total number of bit-flips in m-sequence = [lg n] [lg n] [lg n] oo <= SUM [n/2^i] <= SUM n/2^i = n SUM 1/2^i <= n SUM 1/2^i <= 2n i=0 i=0 i=0 i=0 ∴ amortized cost <= 2n/n = 2 for each operation. Accounting Method ================= Each operation is assigned: - "cost" (*actual* complexity of the individual operation) - "charge" (*estimated amortized* of the individual operation) Goal: total COST for m-sequence <=. total CHARGE for m-sequence ∴ can bound the actual total cost by the estimated charge The approach: Use a CHARGE that is easy to work with, (e.g., constant charge) CHARGE = money budgeted for operation COST = money consumed by operation When COST < CHARGE, we have a surplus (credit) - leave credit on element of data structure When CHARGE < COST, we have a deficit - cover deficit using previously stored credit !!! must prove that credit will always be available when needed !!! => total CHARGE always covers total COST
=> total CHARGE bounds total COST

Since total CHARGE is easy to compute, we found an easy way to compute amortized complexity, total CHARGE / m

Revisit binary counter:

Accounting method (see below for general description): during one
INCREMENT operation may “reset” multiple bits (1 -> 0) but…

Observation: INSERT “sets” (0 -> 1) exactly one bit

Observation: Any “reset” (1 -> 0) was preceeded corresponding “set”
– set and reset come in pairs

Idea: CHARGE $2 per INCREMENT operation
– pay for set-reset pair up front ($1 + $1 = $2)
– INCREMENT begins exactly one set-reset pair, ∴ $2 covers the pair
– the “set” costs $1, leaves $1 “reset” credit on the 1-bit
∴ doesn’t matter how many “resets” occur per operation,
they were previously paid for!

Let’s do it formally:

Credit invariant:
At any step during the sequence, each bit of the counter that is
equal to 1 will have a $1 credit.

Proof by induction:
– Initially, counter is 0 and no credit: invariant trivially true.
– Assume invariant true and INCREMENT is performed.
– Cost of resets is paid for by stored credits
– Cost of setting paid for $2 charge, leaving $1 credit stored on new “1”
No other bit changes so invariant still holds

total cost <= total charge = 2m ∴ amortized cost per INSERT <= 2m/m = 2 -------------- Dynamic Arrays -------------- - Consider data structure: array of fixed size and two operations: . APPEND (store element in first free position); . DELETE (remove element in last occupied position). (Note: standard stack implementation using an array.) - Main advantage? Constant-time element access. Main disadvantage? Fixed size. - Load factor alpha (% fullness, like hash table) - Idea: if alpha=1, APPEND allocates twice the size, copies, then appends. - Cost model: $1 for cost of assignment operation - Bad worst case for single APPEND: Theta(n) (when alpha=1) But worst case happens only on n = power of 2 Direct calculation of amortized cost: n floor(log n) worst case total <= sum 1 + sum 2^i <= 3n i=1 i=1 initial copying assignment array amortized average cost = 3n/n = 3 Accounting Method: 1. Goal: analyze amortized complexity of a sequence of m APPEND operations, starting with array of size 0. Intuition: charge $2 for each APPEND $1 for initial assignment $1 credit to save for copying Let's look at how this works out in a hypothetical situation where we have an empty array of size 2: --------------- Initially: | | | --------------- APPEND(x1): |x1($1)| | charge $2: $1 to store x1, $1 credit --------------- APPEND(x2): |x1($1)|x2($1)| charge $2: $1 to store x2, $1 credit ----------------------------- APPEND(x3): |x1($0)|x2($0)|x3($1)| | ----------------------------- use $1 credit to copy x1 and x2, and charge $2 for x3 ----------------------------- APPEND(x4): |x1($0)|x2($0)|x3($1)|x4($1)| ----------------------------- charge $2 for x4 ----------------------------------------- APPEND(x5): |x1($0)|x2($0)|x3($1)|x4($1)| | | | | ----------------------------------------- Problem: run out of "credits" to copy over every old element! More precisely, only elements in second half of array (added since last size increase) have credit. Solution: store enough credit in the second half to cover entire array. 2. Charge $3 for each APPEND. Credit invariant: "Each element in the SECOND HALF of the array has a credit of $2." Base case: array size = 0: invariant vacuously true. Inductive step: assume invariant holds and consider next APPEND operation: - If array size does not change, pay $1 to store new element and keep $2 credit with the new element. Since new elements only added in second half of the array, credit invariant is maintained. - If array size must grow, this means array is full. Then, number of elements in first half = number of elements in second half. By credit invariant, total credit = $2 * number of elements in second half = total number of elements. Sufficient credirt to copy every element. New array is half full. Invariant is maintained. Hence, total credit never negative (because number of elements in second half of array never negative). This means total cost >= total charge,
i.e., amortized cost per operation is <= 3m/m = 3. - What about DELETE operations? Charge each DELETE $1 to pay for element "removal". Since no copying of elements happens during DELETE, analysis above still holds. 3. What if we delete many elements from that array? Could have a large array with a small number of elements, wasting memory. Modify DELETE operation so that it "shrinks" the array when "too empty." Q: Shrink when half-full? A: No! We could keep expanding-shrinking-expanding-shrinking-etc with alternating APPENDs and DELETEs Solution: shrinks array when less than 1/4 full. Charge $2 for each DELETE: $1 for elementary deletion, and $1 for future copying. No matter how many elements we start with, we must delete *at least* 1/2 of the elements before shrinking, ∴ stored credits cover copying to contracted array. Furthermore, contracted array is half full, therefore credit invariant holds: "Every element in the second half of the array has $2 of credit and if the array is K less than half full, there are at least K credits." In other words, if array is exactly half full, there are no credits in the first quarter. But for each element deleted from the first half, one more dollar of credit is added to an element in the first quarter. By the time the array is less than 1/4 full, each element in the first quarter has enough credit to pay for the cost of copying.

Related Posts