
EECS 4101/5101
Advanced
Data Structures
Prof. Andy Mirzaian

COURSE THEMES
Amortized Analysis
Self Adjusting Data Structures
Competitive On-Line Algorithms
Algorithmic Applications
2

COURSE TOPICS
Phase I: Data Structures
Dictionaries
Priority Queues
Disjoint Set Union

Phase II: Algorithmics
Computational Geometry
Approximation Algorithms
3

INTRODUCTION
Amortization
Self Adjustment
Competitiveness
References:

[CLRS] chapter 17
Lecture Note 1 (and parts of LN11)

4

Amortization
5

Data Structure Analysis Methods
What is the total computational time to execute
a sequence of operations on a data structure?
– These operations have correlated effects on the DS

Worst-Case Analysis:
– worst-case time per operation × number of operations
– ignores the correlated effects on the DS
– this upper-bound is most often too pessimistic

Expected-Case Analysis:
– needs probability distribution
– probabilistic assumptions may not correspond to reality
– does not give a worst-case upper bound

Amortized Analysis:
– simple and robust
– correlated effects are taken into account
– gives tighter upper-bound
– is used to design self adjusting data structures
6

Amortized Analysis
amortized time = average time of an operation over a
worst-case sequence of operations.

a sequence of n operations s = (op1 , op2 , … , opn )
on the data structure (starting from some given initial state)
actual time costs per op in the sequence: c1 , c2 , … , cn
total time over the sequence: c1 + c2 + … + cn
worst-case total: T(n) = max_s (c1 + c2 + … + cn)
amortized time is (a good upper bound on) T(n)/n

Three Methods of Amortized Analysis:
– Aggregate Method: T(n)/n
– Accounting Method: a banker’s point of view
– Potential Function Method: a physicist’s point of view
7

Two running D.S. examples
Stack with Multipop:
– A sequence of n intermixed Push(x,S), Pop(S), Multipop(S)
operations on an initially empty stack S
– Multipop example: while S ≠ ∅ and Top(S) < 0 do Pop(S)
– time cost unit = O(1) per Push and Pop
– worst-case actual cost of a single Multipop = O(n)

B-bit Binary Counter A = AB-1 … A2A1A0, initially all 0 bits
– procedure Increment(A):
      i ← 0
      while i < B and Ai = 1 do { Ai ← 0; i ← i + 1 }
      if i < B then Ai ← 1
– time cost unit = O(1) per bit flip
– worst-case actual cost of a single Increment = O(B)
8

Accounting Method
Banker’s point of view: charge each operation opi an amortized cost ĉi.
ĉi – ci > 0: credit “stored” in specific parts of the DS
ĉi – ci < 0: debit used up from stored credits
Must have: [Total Stored Credit] = [Aggregate Amortized Cost] – [Aggregate Actual Cost] ≥ 0

This implies that the aggregate amortized time is an upper bound on
aggregate actual time even for the worst-case sequence of operations.

CREDIT INVARIANT states the credit/debit status of the DS at any state.
12
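As a concrete companion to the two running examples, here is a small Python sketch (the class name `MultipopStack` and its `cost` field are illustrative additions, not from the slides) that counts the actual cost of a sequence of stack operations:

```python
# Hypothetical instrumented stack: unit cost per Push and per Pop,
# Multipop pops while the top element is negative (the slides' example).

class MultipopStack:
    def __init__(self):
        self.items = []
        self.cost = 0              # total actual cost so far

    def push(self, x):
        self.items.append(x)
        self.cost += 1

    def pop(self):
        self.cost += 1
        return self.items.pop()

    def multipop(self):
        # keep popping while the stack is nonempty and its top is negative
        while self.items and self.items[-1] < 0:
            self.pop()

s = MultipopStack()
for x in [3, -1, -2, 5, -4]:
    s.push(x)
s.multipop()        # pops -4, then stops because the new top (5) is >= 0
```

Even though one Multipop can cost O(n), each item is popped at most once after being pushed, so any sequence of n operations costs at most 2n units in the aggregate.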

Accounting Method: Stack with Multipop
Amortized costs:
Push $2 Pay $1 for push, store $1 with new item
Pop $0 Use stored $1 to pop the item
Multipop $0 Use the stored $1’s for each item popped
[Figure: a stack holding items A–J, with $1 of credit stored on each item.]
Credit Invariant: $1 stored with each stack item.
13

Accounting Method: B. C. + Increment
C := actual cost of an Increment (could be as high as B):
C – 1 1-to-0 bit flips, but only one 0-to-1 bit flip.

   X X . . . X 0 1 1 . . . 1 1
        ↓ Increment (C bit flips)
   X X . . . X 1 0 0 . . . 0 0

Many 1-to-0 bits to flip. Expensive. They should have stored credits.

Credit Invariant: $1 stored with each 1-bit, $0 with each 0-bit.
For the C-1 1-to-0 bit flips, their stored $1’s will pay for their flips.
For the one 0-to-1 bit flip: pay $1 for the flip and store another $1 at the new 1-bit to maintain the Credit Invariant.
Amortized cost of an Increment = $2
14
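The Increment procedure and its flip count can be simulated directly. This sketch (the bit-list representation and the `increment` helper are assumptions for illustration) confirms that n Increments perform fewer than 2n flips in total, matching the $2 amortized cost:

```python
# Counter stored as a list of bits, A[0] = least significant bit.

def increment(A):
    """Perform one Increment on bit-list A; return the number of bit flips."""
    flips = 0
    i = 0
    while i < len(A) and A[i] == 1:
        A[i] = 0               # a 1-to-0 flip, paid by its stored $1
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1               # the single 0-to-1 flip
        flips += 1
    return flips

B = 8
A = [0] * B                    # B-bit counter, initially all zeros
total = sum(increment(A) for _ in range(100))
# Aggregate actual cost of n Increments is < 2n (amortized $2 each).
assert total < 2 * 100
```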

Potential Function Method
Physicist’s point of view.
Potential energy = aggregate of prepaid credits “stored/assigned” to
specific parts of the data structure.
Potential energy can be released to pay for future operations.

D0 = initial data structure
Di = D.S. after the ith operation opi, i=1..n
ci = actual cost of opi, i=1..n
opi : Di-1 → Di (actual cost ci)

Potential Function Φ : D → ℝ
   (D = set of all possible states of the DS, ℝ = reals)
Φ(D) = potential of the data structure at state D. Memoryless!
Defining an effective potential function is the key to the analysis.

ĉi = amortized cost of opi, i=1..n
ĉi = ci + Φ(Di) – Φ(Di-1)
ΔΦ = Φ(Di) – Φ(Di-1) = change in potential (positive or negative) caused by opi
(for short: ĉ = c + ΔΦ)
15

Potential Function Method
ĉi = ci + Φ(Di) – Φ(Di-1)

Summing over the whole sequence, the potential terms telescope:
Σi=1..n ĉi = Σi=1..n ci + Φ(Dn) – Φ(D0)

Regular Potential: Φ(D0) = 0, and Φ(D) ≥ 0 for all D ∈ D.
Then Σi ci ≤ Σi ĉi : aggregate amortized cost upper-bounds aggregate actual cost.
[Figure: actual cost ci fluctuates with i; amortized cost ĉi smooths the fluctuations.]

16

Potential Method: Stack with Multipop
Φ(S) = |S| (regular potential)

Amortized cost per operation:

Push: ĉ = c + ΔΦ = 1 + 1 = 2

Pop: ĉ = c + ΔΦ = 1 – 1 = 0

Multipop: ĉ = c + ΔΦ = c – c = 0

So, in all cases ĉ = O(1).

Therefore, a sequence of n operations on an
initially empty stack will cost at most O(n) time.

17
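The potential argument can be checked empirically. The following harness (hypothetical, with a Pop on an empty stack treated as a free no-op) computes ĉ = c + ΔΦ for every operation of a random sequence, using Φ(S) = |S|:

```python
import random

random.seed(1)                     # fixed seed so the run is reproducible
stack, total_actual, total_amortized = [], 0, 0
for _ in range(1000):
    phi_before = len(stack)        # Φ(S) = |S|
    op = random.choice(["push", "pop", "multipop"])
    if op == "push":
        stack.append(random.randint(-5, 5))
        c = 1
    elif op == "pop" and stack:
        stack.pop()
        c = 1
    elif op == "multipop":
        c = 0
        while stack and stack[-1] < 0:
            stack.pop()
            c += 1
    else:
        c = 0                      # Pop on empty stack: no-op (an assumption)
    amortized = c + len(stack) - phi_before    # ĉ = c + ΔΦ
    assert amortized <= 2          # Push: 2; Pop and Multipop: 0
    total_actual += c
    total_amortized += amortized

# Φ is regular (Φ(D0)=0, Φ ≥ 0), so aggregate actual ≤ aggregate amortized.
assert total_actual <= total_amortized
```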

Potential Method: B. C. + Increment
Φ(A) = Σi Ai = # of 1-bits in A (regular potential)

Amortized cost per Increment:

ĉ = c + ΔΦ = c + [–(c–1) + 1] = 2
   (–(c–1) for the 1-to-0 bit flips, +1 for the single 0-to-1 bit flip)

Therefore, a sequence of n Increments on an
initially 0 counter will cost at most O(n) time.

18
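A similar check for the counter, with Φ(A) = number of 1-bits: every single Increment has amortized cost exactly 2, since ΔΦ = –(c–1) + 1 as long as the counter does not overflow. (The helper below is an illustrative assumption, not code from the slides.)

```python
def increment_with_cost(A):
    """One Increment on bit-list A (A[0] = LSB); return # of bit flips."""
    flips, i = 0, 0
    while i < len(A) and A[i] == 1:
        A[i] = 0                  # 1-to-0 flips
        flips += 1
        i += 1
    if i < len(A):
        A[i] = 1                  # the single 0-to-1 flip
        flips += 1
    return flips

A = [0] * 16                      # 16-bit counter, initially all zeros
for _ in range(500):              # 500 < 2**16, so no overflow occurs
    phi_before = sum(A)           # Φ(A) = number of 1-bits
    c = increment_with_cost(A)
    assert c + sum(A) - phi_before == 2    # ĉ = c + ΔΦ = 2, every time
```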

Self Adjustment
19

Self Adjusting Data Structures
Worst-case efficient data structures:
Example: many variants of balanced search trees such as AVL trees.
Aim to keep the worst-case time per operation low.
Enforce explicit structural balance, and maintain extra balance information.
Comparatively need more memory and computational overhead.
Good for real-time applications.

Self-adjusting data structures:
Efficient in amortized sense only.
Leave D.S. in arbitrary state, but update it in a simple uniform way, aiming to keep cost of future operations low.
Much simpler than their worst-case efficient cousins. No explicit balance info.
Adjust to the pattern of usage; on some sequences of operations they are comparatively more efficient in the aggregate.
Tend to make more local changes, even during accesses, as well as updates.
This could be bad for persistent data structures.
Bad for real-time applications, since worst-case cost of an op could be high.
Good for batch-mode applications.
We will study a number of important self-adjusting D.S. such as dynamic tables, lists, dictionaries, priority queues, disjoint set union.
discussed next

20

Dynamic Tables
s(T) = size of table T

n(T) = # of items currently in T [0 ≤ n(T) ≤ s(T)]

α(T) = n(T)/s(T) load factor [0 ≤ α(T) ≤ 1]

For good memory utilization we typically want to keep at least a certain fraction of the table full, e.g., ½ ≤ α(T) ≤ 1.
Table T:

s(T)

n(T)

[e.g., hash table, stack, heap, …]
21

Dynamic Tables
Insert/Delete: add to or remove a specified item from the table.

Case 1) Load factor remains within the allowable range.
Actual Cost O(1) time.

Case 2) Otherwise, need table Expansion / Contraction:
dynamically allocate a new table of larger/smaller size, and move all existing items from the old table to the new one,
and insert/delete the specified item.
Load factor in new table must remain in the allowable range.
Actual cost O(n(T)).

Self-Adjustment: Is there a table expansion/contraction policy with
amortized O(1) cost per insert/delete?
Table T:

s(T)

n(T)

[e.g., hash table, stack, heap, …]
22

Semi-Dynamic Table: Insert only
Need table expansion only.
Before expansion: Told, s = s(Told), n = n(Told), α = α(Told)
After expansion: Tnew, s’ = s(Tnew), n’ = n(Tnew), α’ = α(Tnew)

Expansion Policy: when α = 1 before an insertion,
double table size: s(Tnew) = 2 s(Told)

Right after expansion but before the new item is inserted:
s’ = 2s, α’ = n/s’ = n/2s = α/2 = ½.
Load factor range maintained: ½ ≤ α ≤ 1
Insertion cost: 1 without expansion
n+1 with expansion
Worst-case cost of an insertion = O(n)
What is the amortized cost of an insertion? We will derive it by
– Aggregate Method
– Accounting Method
– Potential Function Method
23
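The doubling policy is easy to simulate before analyzing it. This `DoublingTable` sketch (an illustrative assumption: one cost unit per item moved or placed) tracks the aggregate cost of n insertions:

```python
# Insert-only dynamic table: doubles its size when full.

class DoublingTable:
    def __init__(self):
        self.size = 0        # s(T); becomes 1 on the first insertion
        self.n = 0           # n(T)
        self.cost = 0        # aggregate actual cost

    def insert(self, x):
        if self.n == self.size:             # α = 1: expand
            self.size = max(1, 2 * self.size)
            self.cost += self.n             # move all existing items
        self.n += 1
        self.cost += 1                      # place the new item

T = DoublingTable()
for i in range(1000):
    T.insert(i)
assert T.cost < 3 * 1000     # amortized cost < 3 per insertion
assert T.n / T.size > 0.5    # load factor stays above 1/2
```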

S.-D. Table: Aggregate Method
With the first insertion s(T) = 1 = 2^0. Afterwards it is doubled with each expansion.

Cost of the ith insertion:
   ci = i   if i – 1 = 2^j for some integer j ≥ 0 (expansion: move the i – 1 old items, then insert)
   ci = 1   otherwise

T(n) = Σi=1..n ci ≤ n + Σj=0..⌊log n⌋ 2^j < n + 2n = 3n

Amortized insertion cost = T(n)/n < 3n/n = 3 = O(1).
24

S.-D. Table: Accounting Method
Amortized cost: charge $3 for each insertion:
   $1 to insert the new item
   $2 to store with the new item
Credit Invariant:
   $2 stored with each item not previously moved (from an older table)
   $0 stored with each item previously moved
[Figure: expansion from Told to Tnew; previously moved items carry $0 each, the others $2 each.]
25

S.-D. Table: Potential Fn Method
Φ(T) = 2n(T) – s(T) if α(T) ≥ ½, and 0 otherwise (regular potential)
[Figure: Φ versus n, a line of slope ΔΦ/Δn = 2.]
Amortized cost of Insert:
Case 1) α < 1 (no expansion): ĉ = c + ΔΦ = 1 + 2 = 3.
Case 2) α = 1 (w/ expansion): n’ = n + 1 = s + 1
   ĉ = c + ΔΦ = c + [Φafter – Φbefore] = [s + 1] + [2 – s] = 3.
In all cases ĉ = O(1).
26

Dynamic Table: Insert & Delete
First Try: ½ ≤ α ≤ 1
   when Insert: Expand: s’ = 2s
   when Delete: Contract: s’ = s/2
Does this strategy yield O(1) amortized cost per insert/delete? No.
α is nearly at the other critical extreme after an expansion/contraction.
Expensive sequence (assume n is a power of 2): n/2 – 1 Inserts, followed by
n/2 + 1 operations alternating I, I, D, D, … , I, I, D, D, each of the latter
triggering a Θ(n) expansion or contraction.
T(n) = Θ(n²). Amortized cost per Insert/Delete = T(n)/n = Θ(n).
27

Dynamic Table: Insert & Delete
Next Try: ¼ ≤ α ≤ 1
   when Insert: Expand: s’ = 2s
   when Delete: Contract: s’ = s/2
Does this strategy yield O(1) amortized cost per insert/delete? Yes.
α = ½ just after an expansion/contraction (before the insert/delete).
This is a constant fraction away from both extreme boundaries.
See exercise on how to figure out good expansion/contraction policies to handle other pre-specified ranges for α.
28

Dynamic Table: Insert & Delete
Potential function:
   Φ(T) = 2n(T) – s(T)   if α(T) ≥ ½   (slope +2)
   Φ(T) = s(T)/2 – n(T)  if ¼ ≤ α(T) < ½   (slope –1)
[Figure: Φ versus n; Φ = s/4 at n = s/4, Φ = 0 at n = s/2, Φ = s at n = s.]
29

Dynamic Table: Insert & Delete
Amortized cost for Insert:
case 1) ¼ ≤ α < α’ ≤ ½: ĉ = c + ΔΦ = 1 – 1 = 0
case 2) α < ½ < α’: ĉ = c + ΔΦ ≤ 1 + 2 = 3
case 3) ½ ≤ α < 1: ĉ = c + ΔΦ = 1 + 2 = 3
case 4) α = 1 (needs expansion): ĉ = c + ΔΦ = [n + 1] + [2 – n] = 3
Summary: In all cases ĉ = O(1).
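A sketch of the final policy (expand to 2s when α = 1 on Insert, contract to s/2 when a Delete pushes α below ¼; the class and field names are illustrative assumptions), run on the alternating pattern that cost Θ(n²) under the first policy:

```python
# Dynamic table with both expansion and contraction, keeping 1/4 <= α <= 1.

class DynamicTable:
    def __init__(self):
        self.size, self.n, self.cost = 0, 0, 0

    def insert(self):
        if self.n == self.size:                        # α = 1: expand
            self.size = max(1, 2 * self.size)
            self.cost += self.n                        # move existing items
        self.n += 1
        self.cost += 1

    def delete(self):
        self.n -= 1
        self.cost += 1
        if self.size > 1 and self.n < self.size // 4:  # α < 1/4: contract
            self.size //= 2
            self.cost += self.n                        # move remaining items

T = DynamicTable()
ops = 0
for _ in range(512):        # the formerly expensive pattern: I, I, D, D, ...
    T.insert(); T.insert(); T.delete(); T.delete()
    ops += 4
assert T.cost <= 3 * ops    # amortized O(1), unlike the (1/2..1) policy
```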
30

Dynamic Table: Insert & Delete
Amortized cost for Delete (same potential Φ as above):
case 1) ½ ≤ α’ < α: ĉ = c + ΔΦ = 1 – 2 = –1
case 2) α’ < ½ ≤ α: ĉ = c + ΔΦ ≤ 1 + 1 = 2
case 3) ¼ < α’ < α < ½: ĉ = c + ΔΦ = 1 + 1 = 2
case 4) α’ < ¼ (needs contraction): n’ = n – 1 = s/4 – 1
   ĉ = c + ΔΦ = [n] + [1 – n] = 1
Summary: In all cases ĉ = O(1).
31

Exercise (Δ-flat):
(a) Given a set A of n real numbers and a Δ > 0, compute the largest Δ-flat subset of A.
(b) Given an array A[1..n] of n elements and a Δ > 0, compute the longest Δ-flat contiguous
subarray of A.
[Note: in part (a) any subset is a potential candidate solution, but in part (b) only those subsets that correspond to contiguous subarrays are allowed.
Hint: for part (b) use an incremental algorithm and a variation of stack with multipop.]
50
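For part (b), assuming “Δ-flat” means max – min ≤ Δ (the definition does not survive in the text above, so treat this as an assumption), one candidate is a sliding window maintained with two monotonic deques. As in the stack-with-Multipop analysis, each element enters and leaves each deque at most once, so the total work is O(n):

```python
from collections import deque

def longest_flat_subarray(A, delta):
    """Length of the longest contiguous subarray with max - min <= delta."""
    maxq, minq = deque(), deque()   # indices; maxq values decreasing, minq increasing
    left = best = 0
    for right, x in enumerate(A):
        # amortized O(1): each index is pushed and popped at most once per deque
        while maxq and A[maxq[-1]] <= x:
            maxq.pop()
        maxq.append(right)
        while minq and A[minq[-1]] >= x:
            minq.pop()
        minq.append(right)
        while A[maxq[0]] - A[minq[0]] > delta:   # shrink until the window is flat
            left += 1
            if maxq[0] < left:
                maxq.popleft()
            if minq[0] < left:
                minq.popleft()
        best = max(best, right - left + 1)
    return best

print(longest_flat_subarray([1, 3, 2, 7, 8, 6, 4], 2))   # 3 (e.g. [1, 3, 2])
```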

Competitive On-line Linked-List Correctness Verification:
We are given the head pointer to a read-only linked list. Each node of this list consists of a single field, namely, a pointer to the next node on the list. For the list to have correct structure, each node should point to its successor, except for the last node, which points to nil, as in figure (a) below. However, due to a possible structural error, the list may look like that of figure (b), where a node points to one of the previous nodes (possibly itself). From that node on, we no longer have access to the rest of the nodes on the list. The task is to verify the structural correctness of the list.

The off-line algorithm knows n, the number of accessible nodes on the list, so it can easily verify the list’s structural correctness: starting from the head pointer, follow the list nodes for n steps and see if you reach a nil pointer. However, the on-line algorithm has no advance knowledge of the length of the list.

Design a competitive in-place on-line algorithm to test the structural correctness of the list. That is, your algorithm should take time linear in the number of accessible nodes, and work in-place, i.e., use only O(1) additional space. So, your algorithm may use a few local scalar variables, but not such things as additional lists or arrays. Also, the nodes on the list do not have any additional fields that you may use for some kind of marking, and your algorithm should not attempt to alter the structure of the list, not even temporarily.

(a) Head
(b) Head
nil
51
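One classical candidate for this exercise (offered as a suggestion, not the course’s official solution) is Floyd’s tortoise-and-hare traversal: it is read-only, uses O(1) extra space, and stops within a linear number of steps whether the list ends in nil or loops back:

```python
# Each node has a single field, `next`, matching the problem statement.

class Node:
    def __init__(self):
        self.next = None

def list_is_correct(head):
    """True iff following `next` from head eventually reaches nil (no cycle)."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next            # tortoise: one step
        fast = fast.next.next       # hare: two steps
        if slow is fast:            # the hare lapped the tortoise: a cycle
            return False
    return True                     # reached nil: structure is correct

# Correct list of 5 nodes, as in figure (a):
nodes = [Node() for _ in range(5)]
for a, b in zip(nodes, nodes[1:]):
    a.next = b
assert list_is_correct(nodes[0])

nodes[4].next = nodes[1]            # structural error, as in figure (b)
assert not list_is_correct(nodes[0])
```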

END
52
