ITI 1121. Introduction to Computing II – subtitle
ITI 1121. Introduction to Computing II
Binary search trees: methods
by
Marcel Turcotte
Version March 27, 2020
Preamble
Preamble
Overview
Overview
Binary search trees: methods
We discuss the properties of trees: full binary tree, complete trees, maximum depth of a
complete tree of size n. Finally, we implement methods for adding and removing an
element to and from a binary search tree.
General objective:
This week, you will be able to design and modify computer programs based on the
concept of a binary search tree.
1 49
Preamble
Learning objectives
Learning objectives
Discuss the efficiency of recursive tree processing in Java, especially in relation to
memory consumption.
Modify the implementation of a binary search tree to add a new method.
Readings:
Pages 263, 265-268, 288-293 of E. Koffman and P. Wolfgang.
2 49
Preamble
Plan
Plan
1 Preamble
2 Summary
3 Definitions
4 add
5 remove
6 Prologue
3 49
Summary
Summary
A tree is a hierarchical data structure
A tree can be implemented using linked nodes
Iterative and recursive processing:
Iterative: A method that follows one and only one path in the tree can easily
be implemented using an iterative method.
Recursive: A method that must traverse more than one subtree for the same
node is usually implemented more easily using a recursive method.
4 49
Definition
A binary search tree is a binary tree where each node satisfies the following two
properties:
All the nodes in its left subtree have smaller values than this node’s or its left
subtree is empty;
All the nodes of its right subtree have larger values than this node’s or its right
subtree is empty.
5
8
15
9 29
11
5 49
Implementation of binary research trees
pub l i c c l a s s BinarySearchTree
p r i v a t e s t a t i c c l a s s Node
p r i v a t e T v a l u e ;
p r i v a t e Node
p r i v a t e Node
}
p r i v a t e Node
}
6 49
Memory diagram
8
root
Node
BinarySearchTree
rightleft
value
Comparable
7 49
8
9 15
115
root Node
BinarySearchTree
Definitions
Definitions
Full binary tree
Full binary tree
5
8
15
12 292 7
A binary tree is said to be full if all its nodes have exactly two children except for the
leaves.
9 49
Definitions
Complete tree
Complete tree
A binary tree of depth d is complete if all its nodes at depths less than d − 1 (so in the
interval [0, 1 . . . d − 2]) have exactly two children.
5
8
15
12 292 7
1 4 5013
Is this tree complete?
Yes, the depth of the tree is d =3, all the nodes at depths 0 and 1 (≤ d − 2) have
exactly two children. Nodes at depth 2 have 0, 1 or 2 children. All nodes at depth 3 are
leaves.
10 49
Complete tree
Is this tree complete?
5
8
15
9 29
11
No, the depth of the tree is d =3, node 5 at depth 1 (≤ d − 2) does not have two
children.
11 49
Definitions
Maximum depth
Relationship between the depth and the
number of nodes
5
8
15
12 292 7
1 4 5013
A complete binary tree of depth d has from 2d to 2d+1 − 1 nodes;
The depth of a complete binary tree of size n is blog2 nc.
12 49
Relationship between the depth and the
number of nodes
. . .
13 49
Discussion
What relationship exists between the efficiency of the methods and the topology of
the tree (complete or not).
14 49
5
8
15
12 292 7
2
5
7
8
12
15
29
5
8
15
12 292 7
2
5
7
8
12
15
29
Observations
When searching, each comparison eliminates a subtree;
The maximum number of nodes visited depends on the depth of the tree;
Thus, complete trees are advantageous (since the depth of the tree is blog2 nc) ∗.
∗In the extreme case where the tree is completely unbalanced, one would have to traverse n − 1 links.
17 49
n blog2 nc
10 3
100 6
1,000 9
10,000 13
100,000 16
1,000,000 19
10,000,000 23
100,000,000 26
1,000,000,000 29
10,000,000,000 33
100,000,000,000 36
1,000,000,000,000 39
10,000,000,000,000 43
100,000,000,000,000 46
1,000,000,000,000,000 49
Prefixes of the International System of Units
Prefix n blog2 nc
mega 106 19
giga 109 29
tera 1012 39
peta 1015 49
exa 1018 59
zetta 1021 69
yotta 1024 79
Consult How much data is generated each day? by Jeff Desjardins in World
Economic Forum on 17 April 2019.
19 49
https://www.weforum.org/agenda/2019/04/how-much-data-is-generated-each-day-cf4bddf29f/
https://www.weforum.org/agenda/2019/04/how-much-data-is-generated-each-day-cf4bddf29f/
Experiment
Machine: 64 cores = Intel(R) Xeon(R) CPU E7-4870 v2 2.30GHz
RAM = 512 Giga-bytes
Operating system = Linux
Set-up: Average time over 5 runs calling add on a tree containing
1,000,000,000 (109) elements
5,765 nonoseconds, 5.765 microseconds, 0.005765 milliseconds
> java -Xmx256g TestBST 1000000000
Building the tree takes 3.1348975 hours.
20 49
add
boolean add(E element)
Exercise. Starting from an empty tree, add one by one the following elements: «Lion»,
«Fox», «Rat», «Cat», «Pig», «Dog», «Tiger».
What conclusions do you draw?
21 49
In order to add an element, you must find the place to insert it.
Which method is used to find an element?
It’s the methodcontains.
What are the changes to be made?
pub l i c boolean c o n t a i n s (E e lement ) {
boolean found = f a l s e ;
Node
whi le ( ! found && c u r r e n t != nu l l ) {
i n t t e s t = e lement . compareTo ( c u r r e n t . v a l u e ) ;
i f ( t e s t == 0) {
found = t rue ;
} e l s e i f ( t e s t < 0) {
c u r r e n t = c u r r e n t . l e f t ;
} e l s e {
c u r r e n t = c u r r e n t . r i g h t ;
}
}
re tu rn found ;
}
boolean add(E element)
Is there a special case to deal with?
Operations involving a change to the variable root are special cases, as are changes to
the variable head for a linked list.
i f ( c u r r e n t == nu l l ) {
r o o t = new Node
}
23 49
Else.
boolean done = f a l s e ;
whi le ( ! done ) {
i n t t e s t = e lement . compareTo ( c u r r e n t . v a l u e ) ;
i f ( t e s t == 0) {
done = t rue ;
} e l s e i f ( t e s t < 0) {
i f ( c u r r e n t . l e f t == nu l l ) {
c u r r e n t . l e f t = new Node
done = t rue ;
} e l s e {
c u r r e n t = c u r r e n t . l e f t ;
}
} e l s e {
i f ( c u r r e n t . r i g h t == nu l l ) {
c u r r e n t . r i g h t = new Node
done = t rue ;
} e l s e {
c u r r e n t = c u r r e n t . r i g h t ;
}
}
}
boolean add(E element)
One always replaces a null value with a new node;
The existing structure of the tree is unchanged;
The topology of the tree depends largely on the order in which the elements are
inserted.
25 49
remove
boolean remove(E element)
Removals will inevitably result in structural changes.
Explore different strategies using the tree on the next page.
Remove each of the 12 nodes, one by one.
26 49
4
6
9
7 112 5
1 3 128 10
boolean remove(E element)
Consider some specific cases:
Remove the leftmost node.
How many sub-cases are there and what are they?
There are two subcases:
The node doesn’t have any subtrees;
Node 1 of the subtree 6 is an example;
What do we do? parent.left = null;
The node has a right subtree;
Node 7 of the subtree 9 is an example;
What do we do? parent.left = “right subtree”;
The node cannot have a left subtree, otherwise it is not the leftmost node!
28 49
Case 1: removing a leaf
Before:
8
115
root
After:
8
11
root
29 49
Case 1: removing a leaf
Before:
8
115
root
After:
8
5
root
30 49
Case 1: Removing a leaf
Before:
8
root
After:
root
31 49
Case 2: t.remove(new Integer(34))
Before:
15
23
34
After:
15
23
32 49
Case 3: t.remove(new Integer(34))
Before:
15
41
34
After:
15
41
33 49
Case 4: t.remove(new Integer(6))
Before:
4
6
9
7 112 5
1 3 128 10
34 49
Case 4: t.remove(new Integer(6))
After:
4
7
9
8 112 5
1 3 1210
35 49
Node
// pre−c o n d i t i o n :
i f ( e l ement == nu l l ) {
throw new N u l l P o i n t e r E x c e p t i o n ;
}
i f ( r o o t == nu l l ) {
throw new NoSuchElementExcept ion ( ) ;
}
36 49
Node
Replacing the node at the root of the tree is a special case.
i f ( e l ement . compareTo ( r o o t . v a l u e ) == 0) {
r o o t = removeTopMost ( r o o t ) ;
}
37 49
Node
element is not at the root of the tree
} e l s e {
Node
i f ( e l ement . compareTo ( r o o t . v a l u e ) < 0) {
c u r r e n t = r o o t . l e f t ;
} e l s e {
c u r r e n t = r o o t . r i g h t ;
}
// . . .
38 49
// . . .
whi le ( c u r r e n t != nu l l ) {
i n t t e s t = e lement . compareTo ( c u r r e n t . v a l u e ) ;
i f ( t e s t == 0) {
i f ( c u r r e n t == pa r en t . l e f t ) {
pa r en t . l e f t = removeTopMost ( c u r r e n t ) ;
} e l s e {
pa r en t . r i g h t = removeTopMost ( c u r r e n t ) ;
}
c u r r e n t = nu l l ; // s t o p p i n g c r i t e r i a
} e l s e {
pa r en t = c u r r e n t ;
i f ( t e s t < 0) {
c u r r e n t = pa r en t . l e f t ;
} e l s e {
c u r r e n t = pa r en t . r i g h t ;
}
}
}
Node
current)
p r i v a t e Node
Node
i f ( c u r r e n t . l e f t == nu l l ) {
top = c u r r e n t . r i g h t ;
} e l s e i f ( c u r r e n t . r i g h t == nu l l ) {
top = c u r r e n t . l e f t ;
} e l s e {
c u r r e n t . v a l u e = getLe f tMos t ( c u r r e n t . r i g h t ) ;
c u r r e n t . r i g h t = removeLeftMost ( c u r r e n t . r i g h t ) ;
top = c u r r e n t ;
}
re tu rn top ;
}
40 49
E getLeftMost(Node
p r i v a t e E getLe f tMos t ( Node
i f ( c u r r e n t == nu l l ) {
throw new N u l l P o i n t e r E x c e p t i o n ( ) ;
}
i f ( c u r r e n t . l e f t == nu l l ) {
re tu rn c u r r e n t . v a l u e ;
}
re tu rn ge tLe f tMos t ( c u r r e n t . l e f t ) ;
}
41 49
Node
current)
p r i v a t e Node
i f ( c u r r e n t . l e f t == nu l l ) {
re tu rn c u r r e n t . r i g h t ;
}
Node
c u r r e n t = c u r r e n t . l e f t ;
whi le ( c u r r e n t . l e f t != nu l l ) {
pa r en t = c u r r e n t ;
c u r r e n t = c u r r e n t . l e f t ;
}
pa r en t . l e f t = c u r r e n t . r i g h t ;
re tu rn top ;
}
42 49
Recursive implementation
pub l i c vo id remove (E e lement ) {
// pre−c o n d i t i o n :
i f ( e l ement == nu l l ) {
throw new N u l l P o i n t e r E x c e p t i o n ( ) ;
}
r o o t = remove ( root , e l ement ) ;
}
43 49
p r i v a t e Node
Node
i n t t e s t = e lement . compareTo ( c u r r e n t . v a l u e ) ;
i f ( t e s t == 0) {
i f ( c u r r e n t . l e f t == n u l l ) {
r e s u l t = c u r r e n t . r i g h t ;
} e l s e i f ( c u r r e n t . r i g h t == n u l l ) {
r e s u l t = c u r r e n t . l e f t ;
} e l s e {
c u r r e n t . v a l u e = getLe f tMos t ( c u r r e n t . r i g h t ) ;
c u r r e n t . r i g h t = remove ( c u r r e n t . r i g h t , c u r r e n t . v a l u e ) ;
}
} e l s e i f ( t e s t < 0) {
c u r r e n t . l e f t = remove ( c u r r e n t . l e f t , e l ement ) ;
} e l s e {
c u r r e n t . r i g h t = remove ( c u r r e n t . r i g h t , e l ement ) ;
}
r e t u r n r e s u l t ;
}
Observations
There is a very wide variety of trees, including the self-balancing trees (AVL,
Red-Black, B).
A general tree is a tree whose nodes may have more than two children.
45 49
Summary
The binary search tree is a binary tree such that all the keys in the left subtree are
smaller than that of the current node and all the keys in the right subtree are larger;
Such a data structure allows for efficient searches.
46 49
Prologue
Summary
The topology of the tree depends on the order in which the elements are added.
If the tree is complete and contains n elements, you’ll have to follow at most
blog2 nc links to find the element you are looking for.
47 49
References I
E. B. Koffman and Wolfgang P. A. T.
Data Structures: Abstraction and Design Using Java.
John Wiley & Sons, 3e edition, 2016.
48 49
Marcel Turcotte
Marcel.
School of Electrical Engineering and Computer Science (EECS)
University of Ottawa
49 / 49
Marcel.
Preamble
Overview
Learning objectives
Plan
Summary
Definitions
Full binary tree
Complete tree
Maximum depth
add
remove
Prologue