=============================================================================
CSC 263 Lecture Summary for Week 7 Fall 2019
=============================================================================
READING: Chapter 17.
SELF-TEST: Exercises 17.1-2.
——————
Amortized Analysis
——————
– Rental car analogy
– Ski slope (gondola) analogy
Often, we perform _sequences_ of operations on data structures and time complexity for processing the entire sequence is important.
Definitions:
m-sequence = a sequence of m operations
Worst-case sequence complexity: Max total time over all m-sequences
Worst-case sequence complexity <= m * worst-case complexity of an operation
But WCSC can be smaller than m * worst-case single operation!
Definition: Over all m-sequences,
worst-case sequence complexity
amortized sequence complexity = ------------------------------.
m
Amortized complexity represents average worst-case cost of each operation.
CAREFUL: *different* from average-case time complexity!
- NO probability this time
- average over the m operations, not over the possible inputs
Amortized analyses can make more sense than plain worst-case analysis
- Can give tighter bound
- Better captures "overall" behaviour: A symbol table in a compiler is used to keep track of information about variables in the program being compiled: we care about the time taken to process the entire program, i.e., the entire sequence of variables, and not about the time taken for each individual variable.
Three methods for amortized analyses:
- aggregate
- accounting
- potential method is assigned as reading
Example:
Binary Counter = sequence of k bits (k fixed) with single operation:
INCREMENT -- add 1 to integer represented by counter. The "cost" (i.e.,
running time) of one INCREMENT operation is equal to number of bits that
change during INCREMENT. For example, if k=5,
Initial counter: 00000 (value = 0)
after INCREMENT: 00001 (value = 1) cost = 1
after INCREMENT: 00010 (value = 2) cost = 2
after INCREMENT: 00011 (value = 3) cost = 1
after INCREMENT: 00100 (value = 4) cost = 3
after INCREMENT: 00101 (value = 5) cost = 1
...
after INCREMENT: 11101 (value = 29) cost = 1
after INCREMENT: 11110 (value = 30) cost = 2
after INCREMENT: 11111 (value = 31) cost = 1
after INCREMENT: 00000 (value = 0) cost = 5
...
Aggregate Method
================
Compute worst-case sequence complexity of m-sequence and divide by m.
For binary counter, amortized cost of sequence of m INCREMENT operations, starting with value 0, is computed as follows: consider the sequence of operations.
bit number changes total number of changes
0 every operation n
1 every 2 operations floor(m/2)
2 every 4 operations floor(m/4)
... ... ...
i every 2^i operations floor(m/2^i)
... ... ...
k-1 every 2^{k-1} operations floor(m/2^{k-1})
Total number of bit-flips in m-sequence = sum over bits (# times bit i flips)
- don't need to sum over all k bits: i = 0,...,min{k,floor(lg n)}
- floor(lg n) because n increments won't touch more significant bits
Total number of bit-flips in m-sequence =
[lg n] [lg n] [lg n] oo
<= SUM [n/2^i] <= SUM n/2^i = n SUM 1/2^i <= n SUM 1/2^i <= 2n
i=0 i=0 i=0 i=0
∴ amortized cost <= 2n/n = 2 for each operation.
Accounting Method
=================
Each operation is assigned:
- "cost" (*actual* complexity of the individual operation)
- "charge" (*estimated amortized* of the individual operation)
Goal:
total COST for m-sequence <=. total CHARGE for m-sequence
∴ can bound the actual total cost by the estimated charge
The approach:
Use a CHARGE that is easy to work with, (e.g., constant charge)
CHARGE = money budgeted for operation
COST = money consumed by operation
When COST < CHARGE, we have a surplus (credit)
- leave credit on element of data structure
When CHARGE < COST, we have a deficit
- cover deficit using previously stored credit
!!! must prove that credit will always be available when needed !!!
=> total CHARGE always covers total COST
=> total CHARGE bounds total COST
Since total CHARGE is easy to compute, we found an easy way to compute amortized complexity, total CHARGE / m
Revisit binary counter:
Accounting method (see below for general description): during one
INCREMENT operation may “reset” multiple bits (1 -> 0) but…
Observation: INSERT “sets” (0 -> 1) exactly one bit
Observation: Any “reset” (1 -> 0) was preceeded corresponding “set”
– set and reset come in pairs
Idea: CHARGE $2 per INCREMENT operation
– pay for set-reset pair up front ($1 + $1 = $2)
– INCREMENT begins exactly one set-reset pair, ∴ $2 covers the pair
– the “set” costs $1, leaves $1 “reset” credit on the 1-bit
∴ doesn’t matter how many “resets” occur per operation,
they were previously paid for!
Let’s do it formally:
Credit invariant:
At any step during the sequence, each bit of the counter that is
equal to 1 will have a $1 credit.
Proof by induction:
– Initially, counter is 0 and no credit: invariant trivially true.
– Assume invariant true and INCREMENT is performed.
– Cost of resets is paid for by stored credits
– Cost of setting paid for $2 charge, leaving $1 credit stored on new “1”
No other bit changes so invariant still holds
total cost <= total charge = 2m
∴ amortized cost per INSERT <= 2m/m = 2
--------------
Dynamic Arrays
--------------
- Consider data structure: array of fixed size and two operations:
. APPEND (store element in first free position);
. DELETE (remove element in last occupied position).
(Note: standard stack implementation using an array.)
- Main advantage? Constant-time element access.
Main disadvantage? Fixed size.
- Load factor alpha (% fullness, like hash table)
- Idea: if alpha=1, APPEND allocates twice the size, copies, then appends.
- Cost model: $1 for cost of assignment operation
- Bad worst case for single APPEND: Theta(n) (when alpha=1)
But worst case happens only on n = power of 2
Direct calculation of amortized cost:
n floor(log n)
worst case total <= sum 1 + sum 2^i <= 3n
i=1 i=1
initial copying
assignment array
amortized average cost = 3n/n = 3
Accounting Method:
1. Goal: analyze amortized complexity of a sequence of m APPEND operations,
starting with array of size 0. Intuition: charge $2 for each APPEND
$1 for initial assignment
$1 credit to save for copying
Let's look at how this works out in a hypothetical situation where we
have an empty array of size 2:
---------------
Initially: | | |
---------------
APPEND(x1): |x1($1)| | charge $2: $1 to store x1, $1 credit
---------------
APPEND(x2): |x1($1)|x2($1)| charge $2: $1 to store x2, $1 credit
-----------------------------
APPEND(x3): |x1($0)|x2($0)|x3($1)| |
-----------------------------
use $1 credit to copy x1 and x2, and charge $2 for x3
-----------------------------
APPEND(x4): |x1($0)|x2($0)|x3($1)|x4($1)|
-----------------------------
charge $2 for x4
-----------------------------------------
APPEND(x5): |x1($0)|x2($0)|x3($1)|x4($1)| | | | |
-----------------------------------------
Problem: run out of "credits" to copy over every old element! More
precisely, only elements in second half of array (added since last size
increase) have credit.
Solution: store enough credit in the second half to cover entire array.
2. Charge $3 for each APPEND.
Credit invariant:
"Each element in the SECOND HALF of the array has a credit of $2."
Base case: array size = 0: invariant vacuously true.
Inductive step: assume invariant holds and consider next APPEND operation:
- If array size does not change, pay $1 to store new element and keep
$2 credit with the new element. Since new elements only added in
second half of the array, credit invariant is maintained.
- If array size must grow, this means array is full. Then, number
of elements in first half = number of elements in second half. By
credit invariant, total credit = $2 * number of elements in second
half = total number of elements. Sufficient credirt to copy every
element. New array is half full. Invariant is maintained.
Hence, total credit never negative (because number of elements in second
half of array never negative). This means total cost >= total charge,
i.e., amortized cost per operation is <= 3m/m = 3.
- What about DELETE operations? Charge each DELETE $1 to pay for element
"removal". Since no copying of elements happens during DELETE, analysis
above still holds.
3. What if we delete many elements from that array? Could have a large array
with a small number of elements, wasting memory.
Modify DELETE operation so that it "shrinks" the array when "too empty."
Q: Shrink when half-full?
A: No! We could keep expanding-shrinking-expanding-shrinking-etc
with alternating APPENDs and DELETEs
Solution: shrinks array when less than 1/4 full. Charge $2 for each DELETE: $1 for elementary deletion, and $1 for future copying. No matter how many elements we start with, we must delete *at least* 1/2 of the elements before shrinking, ∴ stored credits cover copying to contracted array. Furthermore, contracted array is half full, therefore credit invariant holds:
"Every element in the second half of the array has $2 of credit and
if the array is K less than half full, there are at least K credits."
In other words, if array is exactly half full, there are no credits in
the first quarter. But for each element deleted from the first half, one
more dollar of credit is added to an element in the first quarter. By the
time the array is less than 1/4 full, each element in the first quarter
has enough credit to pay for the cost of copying.