Dynamic Data Structures
Dr Timothy Kimber
January 2018
Introduction
Dynamic Data Structures
Having efficient data structures is crucial for successful algorithms.
The problems seen so far involved fixed length lists
In most languages we have a simple way to implement this efficiently
— arrays
Our algorithms assumed some sort of array type was available
Other problems require dynamic data structures such as
Lists, Stacks and Queues
Sets and Dictionaries
These are designed to hold variable, essentially unlimited amounts of data.
Algorithms (580) Dynamic Data Structures January 2018 2 / 47
Lists
Ordered Data Structures
A list is an ordered collection of {nodes, items, elements}.
The key property of a list is the ordering of the nodes
A list might support operations such as
push adds an element to the end of the list
pop removes the last element of the list
shift removes the first element of the list
unshift adds an element to the front of the list
insert adds an element at a given position
remove removes the element at a given position
iterate returns the items in order
Plus sorting, searching, copying, joining, splitting …
The most appropriate implementation depends on which operations
are needed.
Algorithms (580) Dynamic Data Structures January 2018 3 / 47
Stacks
Stacks
A stack is a last-in first-out (LIFO) list.
Stacks support only
push for adding elements
pop for removing elements
Stacks are usually pictured as a vertical (stacked!) structure
Stacks support recursive algorithms including fundamental operations
such as calling subprocedures and evaluating arithmetic expressions
Algorithms (580) Dynamic Data Structures January 2018 4 / 47
Stacks
Stacks
Question
How would you implement a stack?
Must be able to add “unlimited” objects
Push and Pop must implement LIFO behaviour
Algorithms (580) Dynamic Data Structures January 2018 5 / 47
Stacks Stack Implementation
Stack Implementation
Could use array or linked-list as storage model
Both very simple implementation
Array has fixed capacity
Linked list has higher overheads
Can make array dynamic (variable size)
Integer sp points to the top of the stack
Update sp within Push and Pop
Algorithms (580) Dynamic Data Structures January 2018 6 / 47
Stacks Stack Implementation
Dynamic Array-Based Stack
Push(Input: stack s[0, . . . , k − 1], object x)
if s.sp == k // array is full
s’ = new stack of size 2k
for i = 0 to k – 1
s’[i] = s[i]
s = s’
s[s.sp] = x
s.sp = s.sp + 1
push increases the capacity of the stack if it is full
Algorithms (580) Dynamic Data Structures January 2018 7 / 47
Stacks Stack Implementation
Dynamic Array-Based Stack
Pop(Input: stack s[0, . . . , k − 1])
if s.sp < k/2 // array is less than half full
s’ = new stack of size k/2
for i = 0 to k/2 - 1
s’[i] = s[i]
s = s’
s.sp = s.sp - 1
return s[s.sp]
pop decreases the capacity if it is too big
a full implementation should have a minimum size
Algorithms (580) Dynamic Data Structures January 2018 8 / 47
Stacks Performance
Performance of Push
Question
Given a stack containing N objects, what is the worst case time
complexity of push?
Assume: time to insert (copy, add) one object to array is c
Assume: initial capacity is 4
The time taken for pushing objects is:
c , c , c , c , c + 4c , c , ...
So:
Worst time to push a single object is Nc
So T (N) = O(N)
Want to reflect fact that most pushes are not O(N)
Algorithms (580) Dynamic Data Structures January 2018 9 / 47
Stacks Performance
Performance of Push
Revised Question
Given an empty stack, what is the worst case time to push N objects?
Assume: initial capacity is 4
Assume: time to insert (copy, add) one object to array is c
The time taken for the each push is still:
c , c , c , c , c + 4c , c , ...
For N pushes the worst single push is Nc
T (N) = O(N2)
However, this is a big overestimate
Algorithms (580) Dynamic Data Structures January 2018 10 / 47
Stacks Performance
Performance of Push
The time for N pushes is at most
T (N) = Nc + (4c + 8c + · · ·+ (N/2)c + Nc)
where
the first Nc is the cost of writing N elements
the rest is for copying to new arrays
The time for copying is 2Nc − 4c (see notes on solving series), so
T (N) ≤ 3Nc − 4c
T (N) = O(N)
Algorithms (580) Dynamic Data Structures January 2018 11 / 47
Stacks Amortised Analysis
Amortisation
The time for N pushes is O(N), so:
A single push is effectively a constant time operation
More correctly: push is amortised Θ(1)
NOT the same as Θ(1)
Amortisation
Related to accountancy method used to defer large costs
Amortised analysis considers a sequence of operations
Cost of individual ops is “amortised” across the sequence
Unlike accountancy, must never be in debt
Algorithms (580) Dynamic Data Structures January 2018 12 / 47
Stacks Amortised Analysis
Amortised Analysis
Rather than calculating cost of full sequence of N steps we can
Pick a representative subsequence
Subsequence is some “cycle” that repeats
Pick an amortised cost for operations
Show that paying amortised cost covers all costs (never in debt)
Exercise
Find a representative cycle (subsequence) of pushes into the stack and
show that the amortised cost of 3c covers all costs.
Algorithms (580) Dynamic Data Structures January 2018 13 / 47
Stacks Amortised Analysis
Amortised Analysis
Start after any expensive push
Up to and including next expensive one
Cheap pushes each put 2c in the bank
Have enough to cover extra costs when next expensive push occurs
(If you started with a copy you are immediately over budget)
Algorithms (580) Dynamic Data Structures January 2018 14 / 47
Stacks Amortised Analysis
Amortised Analysis
Argument only works because array is initially empty and size is doubled
Say we have N objects on stack after a copy
Before next copy we always push N more
This is how cost is covered
Multiplying by any factor will do - will affect amortisation constant
Algorithms (580) Dynamic Data Structures January 2018 15 / 47
Queues
Queues
A queue is also a list, but the next object removed is either:
The earliest one added (FIFO Queue)
The one with highest priority (Priority Queue)
Questions
How could you implement a priority queue (PQ)?
Given a PQ following your design that contains N objects, what
would be the worst case time to add a new object? (Each object has
a key attribute that determines its priority.)
Algorithms (580) Dynamic Data Structures January 2018 16 / 47
Priority Queues Design
Priority Queue Design
If we maintain a total ordering of the queue:
It will take O(N) time to add new elements
Can search a sorted array quickly but have to shift existing objects
Finding position in a linked list is O(N)
remove items
9, 8, 7, 6, 3 Max Priority Queue
add items
6, 9, 7, 3, 8
Do not actually need total ordering.
Queue does not support indexed access
Just want to find object with highest priority
Algorithms (580) Dynamic Data Structures January 2018 17 / 47
Priority Queues Design
Priority Queue Design
Solution is to divide and conquer the data
Key Property: Maintain order within each branch
Highest (or lowest) key will be at the root
Behaves like lots of mini queues
Algorithms (580) Dynamic Data Structures January 2018 18 / 47
Priority Queues Design
Priority Queue Design
Solution is to divide and conquer the data
Question
A new object could go in any branch. (Do you agree?) So, where should it
go? Why?
Algorithms (580) Dynamic Data Structures January 2018 19 / 47
Priority Queues Heaps
Heap: a Tree in an Array
We want to know where the “end” of the tree is:
Build a tree within an array
Track end using “stack pointer”
Navigate by indices
Leaving a[0] blank means:
parent of a[n] is a[n/2]
children of a[n] are a[2*n] and a[2*n+1]
Exercise
How should a new object be added to a max binary heap? (i.e. the
greatest key should be at the root).
Algorithms (580) Dynamic Data Structures January 2018 20 / 47
Priority Queues Heap Operations
Binary Heap Operations
Adding an object to the heap
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 21 / 47
Priority Queues Heap Operations
Binary Heap Operations
Adding an object to the heap
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 21 / 47
Priority Queues Heap Operations
Binary Heap Operations
Adding an object to the heap
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 21 / 47
Priority Queues Heap Operations
Binary Heap Operations
Adding an object to the heap
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 21 / 47
Priority Queues Heap Operations
Heap: a Tree in an Array
Track end using “stack pointer”
Leaving a[0] blank means:
parent of a[n] is a[n/2]
children of a[n] are a[2*n] and a[2*n+1]
Exercise
How should the object with the greatest key be removed from a max
binary heap?
Algorithms (580) Dynamic Data Structures January 2018 22 / 47
Priority Queues Heap Operations
Binary Heap Operations
To remove the object at the root
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 23 / 47
Priority Queues Heap Operations
Binary Heap Operations
To remove the object at the root
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 23 / 47
Priority Queues Heap Operations
Binary Heap Operations
To remove the object at the root
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 23 / 47
Priority Queues Heap Operations
Binary Heap Operations
To remove the object at the root
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 23 / 47
Priority Queues Heap Operations
Binary Heap Operations
To remove the object at the root
Restore the “shape”
Then restore the order
Algorithms (580) Dynamic Data Structures January 2018 23 / 47
Priority Queues Heap Operations
Binary Heap Performance
Question
Given a heap containing N objects, what is the time complexity for adding
or removing one object?
Algorithms (580) Dynamic Data Structures January 2018 24 / 47
Priority Queues Heap Operations
Binary Heap Performance
Both operations are O(log2N)
Height of the heap is Θ(log2N)
Each operation confined to one branch
Algorithms (580) Dynamic Data Structures January 2018 25 / 47
Priority Queues Heapsort
Heapsort
Heaps also provide us with the Heapsort algorithm (JWJ Williams, 1964)
Heapsort (given a list L)
Create an empty heap H
Remove each element of L and add it to H
Remove each element of H and add it to L
HALT
What could be simpler?!
Performance is again Θ(Nlog2N)
Can also be implemented in place by setting up list and heap
partitions within a single array
Algorithms (580) Dynamic Data Structures January 2018 26 / 47