The University of 1
COMMONWEALTH OF AUSTRALIA Copyright Regulations 1969
This material has been reproduced and communicated to you by or on behalf of the University of Sydney pursuant to Part VB of the Copyright Act 1968 (the Act). The material in this communication may be subject to copyright under the Act. Any further copying or communication of this material by you may be the subject of copyright protection under the Act.
Do not remove this notice.
Copyright By PowCoder代写 加微信 powcoder
Data structures and Algorithms
Binary Search Trees [GT 3.1-2] [GT 4.2]
Dr. André van Renssen School of Computer Science
Some content is taken from material provided by the textbook publisher Wiley.
The University of 2
Binary Search Trees (BST)
A binary search tree is a binary tree storing keys (or key-value pairs) satisfying the following BST property
For any node v in the tree and any node u in the left subtree of v and any node w in the right subtree of v,
key(u) < key(v) < key(w)
Note that an inorder traversal
of a binary search tree visits the keys in increasing order.
The University of Sydney © Goodrich, Tamassia, 3
BST Implementation
To simplify the presentation of our algorithms, we only store keys (or key-value pairs) at internal nodes
External nodes do not store items (and with careful coding, can be omitted, using null to refer to such)
The University of Sydney © Goodrich, Tamassia, 4
Searching with a Binary Search Tree
To search for a key k, we trace a downward path starting at the root
To decide whether to go left or right, we compare the key of the current node v with k
If we reach an external node, this means that the key is not in the data structure
def search(k, v)
if v.isExternal() then
# unsuccessful search
if k = key(v) then
# successful search
else if k < key(v) then
# recurse on left subtree
return search(k, v.left)
#thatis k>key(v)
# recurse on right subtree return search(k, v.right)
The University of Sydney © Goodrich, Tamassia, 5
Example: Find 22
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
9 26 11 24
The University of 6
Example: Find 23
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
9 26 11 24
The University of 7
Analysis of Binary Tree Searching
Runs in O(h) time, where h is the height of the tree } worstcaseish=n-1
} bestcaseis h≤log2 n
Time per level
The University of Sydney
© Goodrich, Tamassia, 8
Total time: O(h)
To perform operation put(k, o), we search for key k (using search)
Ifkisfoundinthetree,replacethe corresponding value by o
If k is not found, let w be the external node reached by the search. We replace w with an internal node holding (k, o)
The University of 9
Example: Insert 23
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Search for 23
9 26 11 24
The University of 10
To perform operation remove(k), we search for key k (using search) to find the node w holding k
We distinguish between two cases – whasoneexternalchild
– whastwointernalchildren
If k is not in the tree we can either throw an exception or do nothing depending on the ADT specs
The University of 11
Deletion Case 1
Suppose that the node w we want to remove has an external child, which we call z.
To remove w we
– removewandzfromthetree
– promotetheotherchildofwto take w’s place
This preserves the BST property
The University of Sydney © Goodrich, Tamassia, 12
Example: Delete 24
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Search for 24
9 26 11 24
The University of 13
Example: Delete 24
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Remove z and w
22 z 29 31
The University of 14
Example: Delete 24
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
The University of 15
Example: Delete 24
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Promote remaining child to take spot of w
The University of 16
Deletion : Case 2
Suppose that the node w we want to remove has two internal children.
To remove w we
– find the internal node y following w in an inorder traversal (i.e., y has the smallest key among the right subtree under w)
Example: remove(3)
– we copy the entry from y into node w
– we remove node y and its left child z, which must be external, using previous case
This preserves the BST property
The University of Sydney © Goodrich, Tamassia, 17
Deletion algorithm
def remove(k)
w ← search(k, root) if w.isExternal() then
# key not found
return null
else if w has at least one external child external z then
promote the other child of w to take w’s place remove w
# y is leftmost internal node in the right subtree of w
y ← immediate successor of w
replace contents of w with entry from y remove(y)
The University of Sydney © Goodrich, Tamassia, 18
Complexity
Consider a map with n items implemented by means of a binary search tree of height h:
– the space used is O(n)
– get, put and remove take O(h) time
The height h can be n in the worst case and log n in the best case.
Therefore the best one can hope is that tree operations take O(log n) time but in general we can only guarantee O(n). But the former can be achieved with better insertion routines.
The University of 19
Duplicate key values in BST
Our definition says that keys are in strict increasing order
key(left descendant) < key(node) < key(right descendant)
This means that with this definition duplicate key values are not allowed (as needed when implementing Map)
However, it is possible to change it to allow duplicates. But that means additional complexity in the BST implementation:
• Allowing left descendants to be equal to the parent
key(left descendant) ≤ key(node) < key(right descendant)
• Using a list to store duplicates
The University of Sydney © Goodrich, Tamassia, 20
Range Queries
A range query is defined by two values k1 and k2. We are to findallkeyskstoredinTsuchthatk1 ≤k≤k2
E.g., find all cars on eBay priced between 10K and 15K.
The algorithm is a restricted version of inorder traversal. When
at node v:
– if key(v) < k1: Recursively search right subtree
– if k1 ≤ key(v) ≤ k2: Recursively search left subtree, add v to range output, search right subtree
– if k2 < key(v): Recursively search left subtree
The University of 21
Pseudo-code
def range_search(T, k1, k2) output ← [] range(T.root, k1, k2)
def range(v, k1, k2) if v is external then
return null
if key(v) > k2 then
range(v.left, k1, k2)
else if key(v) < k1 then
range(v.right, k1, k2)
range(v.left, k1, k2)
output.append(v)
range(v.right, k1, k2)
The University of 22
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
The University of 23
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Start search
The University of 24
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31} Q=[8,28]
Is 12 in Q?
9 26 11 24
Yes, so 12 will be in the output and we continue the search in both subtrees
The University of 25
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 9 in Q?
Yes, so 9 will be in the output and we continue the search in both subtrees
The University of 26
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Yes, continue search in right subtree.
The University of 27
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Yes, continue search in right subtree.
2 222931 20
The University of 28
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is node a leaf?
Yes, return Æ. 30
9 26 11 24
The University of 29
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 11 in Q?
Yes, return 11 and continue the search in both subtrees.
The University of 30
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is node a leaf ( ́2)? Yes, return Æ.
9 26 11 24
The University of 31
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 26 in Q?
Yes, return 26 and continue the search in both subtrees.
The University of 32
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 24 in Q?
Yes, return 24 and continue the search in both subtrees.
The University of 33
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 22 in Q?
Yes, return 22 and continue the search in both subtrees.
9 26 11 24
The University of 34
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is 20 in Q?
Yes, return 20 and continue the search in both subtrees.
9 26 11 24
The University of 35
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is node a leaf ( ́4)? Yes, return Æ.
9 26 11 24
The University of 36
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Yes, continue the search in left subtree.
9 26 3 1124
The University of 37
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Yes, continue the search in left subtree.
2522 31 20
The University of 38
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
Is node a leaf?
Yes, return Æ. 30
9 26 11 24
The University of 39
Range queries
S={2,3,5,9,11,12,20,22,24,26,29,30,31}
9 26 11 24
The University of 40
Performance
Let P1 and P2 be the binary search paths to k1 and k2 We say a node v is a:
– boundary node if v in P1 or P2
– inside node if key(v) in [k1, k2] but not in P1 or P2
– outside node if key(v) not in [k1, k2] but not in P1 or P2
The algorithm only visits boundary and inside nodes and
- |inside nodes| ≤ |output|
- |boundary node| ≤ 2 * tree height
Therefore, since we only spend O(1) time per node we visit, the total running time of range search is O(|output| + tree height)
The University of 41
Maintaining a balanced BST
We have seen operations on BSTs that take O(height) time to run. Unfortunately, the standard insertion implementation can lead to a tree with height n-1 (e.g., if we insert in sorted order)
In the rest of today’s lecture we will cover much more sophisticated algorithms that maintain a BST with height O(log n) at all times by rebalancing the tree with simple local transformations
This directly translates into O(log n) performance for searching
The University of 42
Rank-balanced Trees
A family of balanced BST implementations that use the idea of keeping a “rank” for every node, where r(v) acts as a proxy measure of the size of the subtree rooted at v
Rank-balanced trees aim to reduce the discrepancy between the ranks of the left and right subtrees:
– AVL Trees (now)
– Red- (book)
The University of 43
AVL Tree Definition
AVL trees are rank-balanced trees, where r(v) is its height
of the subtree rooted at v
Balance constraint: The ranks of the two children of every internal node differ by at most 1.
The University of 44
Height of an AVL Tree
Fact: The height of an AVL tree storing n keys is O(log n).
Proof (by induction):
– LetN(h)betheminimumnumberofkeysofanAVLtreeofheighth.
– WeeasilyseethatN(1)=1andN(2)=2
– ClearlyN(h)>N(h-1)foranyh≥2
– Forh>2,thesmallestAVLtreeofheighthcontainstherootnode, one AVL subtree of height h-1 and another of height at least h-2:
N(h) ≥ 1 + N(h-1) + N(h-2) > 2 N(h-2)
– Byinductionwecanshowthatforheven N(h) ≥ 2 h/2
– Takinglogarithms:h<2logN(h)
– ThustheheightofanAVLtreeisO(logn)
The University of 45
Insertion in AVL trees
Suppose we are to insert a key k into our tree:
1. If k is in the tree, search for k ends at node holding k
There is nothing to do so tree structure does not change
2. If k is not in the tree, search for k ends at external node w.
Make this be a new internal node containing key k
3. The new tree has BST property, but it may not have AVL
balance property at some ancestor of w since
– some ancestors of w may have increased their height by 1
– every node that is not an ancestor of w hasn’t changed its height
4. We use rotations to re-arrange tree to re-establish AVL
property, while keeping BST property
The University of 46
Re-establishing AVL property
– Let w be location of newly inserted node
– Let z be lowest ancestor of w, whose children heights differ by 2 – Let y be the child of z that is ancestor of w (taller child of z)
– Let x be child of y that is ancestor of w
32 50 88 48 62
before inserting 54
The University of 47
after initial insertion
Re-establishing AVL property
If tree does not have
AVL property, do a trinode restructure at x, y, z
It can be argued that tree
has AVL property after operation
The University of Sydney
T0 T1 T3 Page 48
Augmenting BST with a height attribute
But how do we know the height of each node? If we had to compute this from scratch it would take O(n) time
Therefore, we need to have this pre-computed and update the height value after each insertion and rebalancing operation:
– After we create a node w, we should set its height to be 1, and then update the height of its ancestors.
– After we rotate (z, y, x) we should update their height and that of their ancestors.
Thus, we can maintain the height only using O(h) work per insert
The University of 49
Improving Balance: Trinode Restructuring
Let x, y, z be nodes such that x is a child of y and y is a child of z. Let a, b, c be the inorder listing of x, y, z
Perform the rotations so as to make b the topmost node of the three.
Single rotation around b
T0 T1 T2 T3
The University of 50
Improving Balance: Trinode Restructuring
Let x, y, z be nodes such that x is a child of y and y is a child of z. Let a, b, c be the inorder listing of x, y, z
Perform the rotations so as to make b the topmost node of the three.
Double rotation around c and b
T0 T1 T2 T3
The University of 51
Pseudo-code
The algorithm for doing a trinode restructuring, which is used, possibly repeatedly, to restore balance after an insertion or deletion.
def restructure(x):
input A node x of a binary search tree T that has both a parent y and a
grandparent z
output Tree T after a trinode restructuring (which corresponds to a single or double rotation) involving nodes x, y, and z
1. Let (a,b,c) be the left-to-right (inorder) listing of the nodes x, y, and z, and let (T0,T1,T2,T3) be the left-to-right (inorder) listing of the four subtrees of x, y, and z that are not rooted at x, y, and z.
2. Replace the subtree with a new subtree rooted at b. 3.LetabetheleftchildofbandletT0 andT1 betheleftandright
subtrees of a
4.LetcbetheleftchildofbandletT2 andT3 betheleftandright subtrees of c
5. Recalculate the heights of a, b, and c from the corresponding values stored at their children
6. return b The University of 52
Trinode Restructuring (when done by Single Rotation)
Single Rotations:
single rotation
T3 T0 T1 T2
single rotation
The University of 53
T0T2 T2T1T0 T1
Trinode Restructuring (when done by Double Rotation)
Double rotations:
double rotation
c = z double rotation
The University of 54
T T0 TT2TT 3T2 310
Performance
Assume we are given a reference to the node x where we are performing a trinode restructure and that the binary search tree is represented using nodes and pointers to parent, left and right children
A single or double rotation takes O(1) time, because it involves updating O(1) pointers.
single rotation
double rotation
T2 Page 55
The University of 0 T1 T2
Removal in AVL trees
Suppose we are to remove a key k from our tree:
1. If k is not in the tree, search for k ends at external node
There is nothing to do so tree structure does not change
2. If k is in the tree, search for k performs usual BST removal leading to removing a node with an external child and promoting its other child, which we call w
3. The new tree has BST property, but it may not have AVL balance property at some ancestor of w since
– some ancestors of w may have decreased their height by 1
– every node that is not an ancestor of w hasn’t changed its heights
4. We use rotations to rearrange tree and re-establish AVL property, while keeping BST property
The University of 56
Re-establishing AVL property
– Let w be the parent of deleted node
– Let z be lowest ancestor of w, whose children heights differ by 2
– Let y be the child of z with larger height (y is not an ancestor of w)
– Let x be child of y with larger height
17 62 w17 62
32 50 78 48 54 88
50 78 48 54 88
The University of Sydney
before deletion of 32
after initial deletion Page 57
Re-establishing AVL property
– If tree does not have AVL property, do a trinode restructure at x, y, z
– This restores the AVL property at z but it may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached
w17 62y 50 78
The University of 58
AVL Tree Performance
Suppose we have an AVL tree sto
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com