Randomised Algorithms
1
COMP9024: Data Structures and
Algorithms
Randomized Algorithms
2
Contents
Randomized Algorithm
Quick Selection
Skip Lists
Randomized Algorithm (1/2)
A randomized algorithm is an algorithm that employs
a degree of randomness as part of its logic.
The algorithm typically uses uniformly random bits as an
auxiliary input to guide its behaviour, in the hope of
achieving good performance in the average case over all
possible choices of random bits.
The performance of a randomized algorithm is a
random variable determined by the random bits.
The worst-case performance is typically bad with a very
small probability but the average performance can be good.
Two categories: Las Vegas algorithm and Monte
Carlo algorithm
3
Randomized Algorithm (2/2)
Las Vegas algorithm
A Las Vegas algorithm is a randomized algorithm that
always gives correct results
Monte Carlo algorithm
A Monte Carlo algorithm is a randomized algorithm whose
output may be incorrect with a certain (typically small)
probability
4
An Example (1/5)
5
Given an unsorted list where half of the elements have
a key k1 and the other half have a key k2, find an
element in the list with key k1.
An Example (2/5)
6
Algorithm findKey(L, k1)
Input: list L, key k1
Output: an element in L with key k1
{
repeat
randomly select e∈L;
until key(e)=k1;
return e;
}
Las Vegas algorithm
An Example (3/5)
7
Probability of success: 1
The number of iterations varies and can be
arbitrarily large, but the expected number
of iterations is:
lim
𝑛𝑛→∞
∑𝑖𝑖=0
𝑛𝑛 𝑖𝑖
2𝑖𝑖
= 2
The expected time complexity is O(1)
Analysis:
An Example (4/5)
8
Monte Carlo algorithm
Algorithm findKey(L, k1)
Input: list L, key k1
Output: an element in L with key k1
{
i=0;
repeat
randomly select e∈L;
i++;
until key(e)=k1 or i=m;
return e;
}
An Example (5/5)
9
Analysis:
After k iterations, the probability of finding an
element with key k1 is 1 − 1
2
𝑘𝑘
Time complexity is O(1) but its does not
guarantee success
Quick Selection
10
11
12
13
14
15
Worst-Case Time complexity (1/3)
What is the worst-case?
Each time the smallest key or the largest key is
selected, and thus only one element (the one with the
smallest key or the one with the largest key) is
excluded on each iteration
The worst-case time complexity is
O(n)+O(n-1)+…+O(2)+O(1)
=O((n(n+1)/2))=O(n2)
16
Worst-Case Time complexity (2/3)
What is the probability of the worst-case?
Let Ɛₖ be the event that we pick the largest or
smallest element when there are k elements left.
Let event Ɛ be the worst-case. We have:
Ɛ=∏𝑖𝑖=1
𝑖𝑖=𝑛𝑛 𝜀𝜀𝑖𝑖
What is P(Ɛ)=P(∏𝑖𝑖=1
𝑖𝑖=𝑛𝑛 𝜀𝜀𝑖𝑖)?
Since all Ɛi’s are independent, this simplifies to
P(Ɛ)=P(∏𝑖𝑖=1
𝑖𝑖=𝑛𝑛 𝜀𝜀𝑖𝑖)=∏𝑖𝑖=1
𝑖𝑖=𝑛𝑛 𝑃𝑃(𝜀𝜀𝑖𝑖)
17
Worst-Case Time complexity (3/3)
P(Ɛ1 ) = 1.
If i>1, then P(Ɛi)=
2
𝑖𝑖
. Thus
P(Ɛ)=∏𝑖𝑖=1
𝑖𝑖=𝑛𝑛 𝑃𝑃(𝜀𝜀𝑖𝑖)=∏𝑖𝑖=2
𝑖𝑖=𝑛𝑛 2
𝑖𝑖
= 2
𝑛𝑛−1
𝑛𝑛!
if n = 31, then 2n-1<1010 and n! ≈ 8 × 1033
P(Ɛ)<1/1022. This is extremely unlikely!
18
19
Expected Running Time
(1/2)
Consider a recursive call of quick-select on a sequence of size s
Good call: the sizes of L and G are each less than 3s/4
Bad call: one of L and G has size greater than or equal to 3s/4
A call is good with probability 1/2
1/2 of the possible pivots cause good calls:
7 9 7 1 → 1
7 2 9 4 3 7 6 1 9
2 4 3 1 7 2 9 4 3 7 61
7 2 9 4 3 7 6 1
Good call Bad call
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Good pivotsBad pivots Bad pivots
20
Probabilistic Fact #1: The expected number of coin tosses required in
order to get one head is two
Probabilistic Fact #2: Expectation is a linear function:
E(X + Y ) = E(X ) + E(Y )
E(cX ) = cE(X )
Let T(n) denote the expected running time of quick-select.
By Fact #2,
T(n) < T(3n/4) + bn*(expected # of calls for a good call)
By Fact #1,
T(n) < T(3n/4) + 2bn
That is, T(n) is a geometric series:
T(n) < 2bn + 2b(3/4)n + 2b(3/4)2n + 2b(3/4)3n + …
So T(n) is O(n).
We can solve the selection problem in O(n) expected
time.
Expected Running Time
(2/2)
21
Skip Lists
+∞−∞
S0
S1
S2
S3
+∞−∞ 10 362315
+∞−∞ 15
+∞−∞ 2315
22
What is a Skip List
A skip list for a set S of distinct (key, element) items is a series of
lists S0, S1 , … , Sh such that
Each list Si contains the special keys +∞ and −∞
List S0 contains the keys of S in nondecreasing order
Each list is a subsequence of the previous one, i.e.,
S0 ⊇ S1 ⊇ … ⊇ Sh
List Sh contains only the two special keys
We show how to use a skip list to implement the dictionary ADT
56 64 78 +∞31 34 44−∞ 12 23 26
+∞−∞
+∞31−∞
64 +∞31 34−∞ 23
S0
S1
S2
S3
23
Search
We search for a key x in a a skip list as follows:
We start at the first position of the top list
At the current position p, we compare x with y ← key(next(p))
x = y: we return element(next(p))
x > y: we “scan forward”
x < y: we “drop down”
If we try to drop down past the bottom list, we return null
Example: search for 78
+∞−∞
S0
S1
S2
S3
+∞31−∞
64 +∞31 34−∞ 23
56 64 78 +∞31 34 44−∞ 12 23 26
24
To insert an entry (x, o) into a skip list, we use a randomized
algorithm:
We repeatedly toss a coin until we get a tail, and we denote with i
the number of times the coin came up with heads
If i ≥ h, we add to the skip list new lists Sh+1, … , Si +1, each
containing only the two special keys
We search for x in the skip list and find the positions p0, p1 , …, pi
of the items with largest key less than x in each list S0, S1, … , Si
For j ← 0, …, i, we insert item (x, o) into list Sj after position pj
Example: insert key 15, with i = 2
Insertion
+∞−∞ 10 36
+∞−∞
23
23 +∞−∞
S0
S1
S2
+∞−∞
S0
S1
S2
S3
+∞−∞ 10 362315
+∞−∞ 15
+∞−∞ 2315
p0
p1
p2
25
Deletion
To remove an entry with key x from a skip list, we proceed as
follows:
We search for x in the skip list and find the positions p0, p1 , …, pi
of the items with key x, where position pj is in list Sj
We remove positions p0, p1 , …, pi from the lists S0, S1, … , Si
We remove all but one list containing only the two special keys
Example: remove key 34
−∞ +∞4512
−∞ +∞
23
23−∞ +∞
S0
S1
S2
−∞ +∞
S0
S1
S2
S3
−∞ +∞4512 23 34
−∞ +∞34
−∞ +∞23 34
p0
p1
p2
26
Implementation
We can implement a skip list
with quad-nodes
A quad-node stores:
entry
link to the node prev
link to the node next
link to the node below
link to the node above
Also, we define special keys
PLUS_INF and MINUS_INF,
and we modify the key
comparator to handle them
x
quad-node
27
Space Usage
The space used by a skip list
depends on the random bits
used by each invocation of the
insertion algorithm
We use the following two basic
probabilistic facts:
Fact 1: The probability of getting i
consecutive heads when
flipping a coin is 1/2i
Fact 2: If each of n entries is
present in a set with
probability p, the expected size
of the set is np
Consider a skip list with n
entries
By Fact 1, we insert an entry
in list Si with probability 1/2i
By Fact 2, the expected size
of list Si is n/2i
The expected number of
nodes used by the skip list is
nn
n h
i
i
h
i
i 22
1
2 00
<= ∑∑
==
Thus, the expected space
usage of a skip list with n
items is O(n)
28
Height
The running time of the
search and insertion
algorithms is affected by the
height h of the skip list
We show that with high
probability, a skip list with n
items has height O(log n)
We use the following
additional probabilistic fact:
Fact 3: If each of n events has
probability p, the probability
that at least one event
occurs is at most np
Consider a skip list with n
entries
By Fact 1, we insert an entry
in list Si with probability 1/2i
By Fact 3, the probability that
list Si has at least one item is
at most n/2i
By picking i = 3log n, we have
that the probability that S3log n
has at least one entry is
at most
n/23log n = n/n3 = 1/n2
Thus a skip list with n entries
has height at most 3log n with
probability at least 1 − 1/n2
29
Search and Update Times
The search time in a skip list
is proportional to
the number of drop-down
steps, plus
the number of scan-forward
steps
The drop-down steps are
bounded by the height of the
skip list and thus are O(log n)
with high probability
To analyze the scan-forward
steps, we use yet another
probabilistic fact:
Fact 4: The expected number of
coin tosses required in order
to get a tail is 2
When we scan forward in a
list, the destination key does
not belong to a higher list
A scan-forward step is
associated with a former coin
toss that gave a tail
By Fact 4, in each list the
expected number of scan-
forward steps is 2
Thus, the expected number of
scan-forward steps is O(log n)
We conclude that a search in a
skip list takes O(log n)
expected time
The analysis of insertion and
deletion gives similar results
30
Summary
Randomized algorithm
Las Vegas algorithm
Monte Carlo algorithm
Randomized selection algorithm
Skip lists
COMP9024: Data Structures and Algorithms
Contents
Randomized Algorithm (1/2)
Randomized Algorithm (2/2)
An Example (1/5)
An Example (2/5)
An Example (3/5)
An Example (4/5)
An Example (5/5)
Quick Selection
Slide Number 11
Slide Number 12
Slide Number 13
Slide Number 14
Slide Number 15
Worst-Case Time complexity (1/3)
Worst-Case Time complexity (2/3)
Worst-Case Time complexity (3/3)
Expected Running Time (1/2)
Expected Running Time (2/2)
Skip Lists
What is a Skip List
Search
Insertion
Deletion
Implementation
Space Usage
Height
Search and Update Times
Summary