Recursive Version
Data Structures and Algorithms
(DSA)
Lecture 10_2: Binary Search Trees
(Average Case Analysis)
Data Structures and Algorithms
Overview
• Binary Search Trees
– Searching
• Best Case / Worst Case Analysis
• Average Case Analysis
Data Structures and Algorithms
Runtime Analysis
Analyzing the runtime, we may have different
perspectives:
• Worst case analysis (done so far, default case)
• Best case analysis
• Average case analysis
Notation:
i: problem instance
T(i): runtime on i
In: set of instances of size n.
Data Structures and Algorithms
Worst Case Analysis
Data Structures and Algorithms
Find for binary search trees:
• Tree may degenerate to a list
• Tree has height n-1
• We want to find the element of height n-1
Example:
Best Case Analysis
Data Structures and Algorithms
• The element is at the root of the tree
Find for binary search trees:
Example:
Analysis for height of a binary search tree:
• Worst case:
• Best case:
Data Structures and Algorithms
Average Case Analysis
Data Structures and Algorithms
Average the runtime over all possible inputs
Average time for find (degenerated
tree)
Assume: T is generated to a list and consists of
elements 1, 2, …, n.
Input for find: Element
Average time to find an element in T chosen
uniformly at random is
Data Structures and Algorithms
Average time for find (balanced tree)
Assume:
• T is perfectly balanced.
• n = 2k -1 elements.
Observation:
• There are 2i elements at depth i, 0 ≤ i ≤ k-1.
• Time to find element at depth i is i+1.
Data Structures and Algorithms
Time to find an element in T chosen uniformly at
random is
Data Structures and Algorithms
Data Structures and Algorithms
Data Structures and Algorithms
Use geometric series:
We get:
Theorem
Theorem: The average time to find an element
in a perfectly balanced binary tree with n = 2k-1
elements is
Data Structures and Algorithms
Average Case for random insertion
• Assume that the items to be inserted are in
random order.
• We may be lucky and the tree has small depth
(does not degenerate to a list)
Question:
• What is the average time to find an element in
such a tree?
Data Structures and Algorithms
Permutations of n elements
Assume that we have a set of n elements
Consider all permutations of these elements
There are n! permutations.
Example: Set {1, 2, 3}
Permutations:
(1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), (3,2,1)
Data Structures and Algorithms
Analysis
In our analysis:
• we average over the different permutations for
building the binary search tree.
• all queries for the elements.
Formally, we consider “double expected value” with
respect to:
• the order of elements inserted
• the element we query
Data Structures and Algorithms
Cost of a search tree
c(v): number of nodes on the path from the root
to v.
Cost of a tree T:
Average search cost of a tree T:
Data Structures and Algorithms
Cost of a tree
Data Structures and Algorithms
7
3 13
2 11
17
19
5
18 22
1
2 2
3 3 3 3
4 4 4
Cost of the tree C(T) = 1+2+2+3+3+3+3+4+4+4=29
Cost of a node
Average search time for T: C(T) / n = 29 / 10 = 2.9
n=10
Average costs of a tree
Let E(n) be the average cost of tree with n
elements.
Recursion:
Data Structures and Algorithms
Recursive Formula
Data Structures and Algorithms
Each element i is with equal
probability the root
i-1 elements go into
the left subtree
n-i elements go into
the right subtree
Root lies on every
path to a node
Solve Recursion
• Recursive Formula seems to be complicated.
• Is it worth the effort?
Reasons for doing that:
• Result is interesting
• Math tricks can often be used
• Similar analysis gives average case results for
the Quicksort algorithm.
Data Structures and Algorithms
Solving Recursion
contains E(0), E(1), …., E(n-1).
First step:
• Get a recursive formula for E(n) that only
depends on E(n-1).
Data Structures and Algorithms
Consider
This implies that E(n-2), …, E(1) get the same
factor and cancel out, i. e.
Data Structures and Algorithms
Data Structures and Algorithms
Data Structures and Algorithms
Divide by n(n+1)
Consider:
Data Structures and Algorithms
Use:
Then we get:
Data Structures and Algorithms
Data Structures and Algorithms
Harmonic sum
Remember:
Data Structures and Algorithms
Average Cost for Find
Data Structures and Algorithms
Average cost for find after random insertion:
Using:
we get
Theorem
Theorem: The insertion of n randomly chosen
elements leads to a Binary Search Tree whose
expected time for a successful find operation is
Data Structures and Algorithms
Runtimes for Binary Search Tree
Find, insert, remove:
Worst case:
Best case:
Average case:
Data Structures and Algorithms
Aim: Time O(log n) in the worst case