Topic 6(1): Quicksort, Sorting Lower Bound, BST
Quicksort and Analysis (CLRS ch 7.1-2, 7.4)
Sorting Lower Bounds (CLRS ch 8.1)
Binary Search Trees (BST): a review (CLRS ch 12.1-3)
QuickSort: Another sorting meets divide-and-conquer
The ideas:
Pick one key (pivot), compare it to all others.
Rearrange A to be: [elements ≤ pivot, pivot, elements > pivot]
Recursively sort subarrays before and after the pivot. Pseudocode:
procedure QuickSort(A, p, r)
  ∗∗ sorts the subarray A[p..r]
  if (p < r) then
    q ← Partition(A, p, r)
    ∗∗ Partition returns q such that
    ∗∗ (i) A[q] = pivot,
    ∗∗ (ii) all elements ≤ pivot appear in A[p..(q − 1)],
    ∗∗ (iii) all elements > pivot appear in A[(q + 1)..r]
    QuickSort(A, p, q − 1)
    QuickSort(A, q + 1, r)
How will you prove QuickSort’s correctness?
By induction on n = #elements in A = r − p + 1
Your base cases are n = 0 and n = 1. (Why?)
The induction step is easy if we know Partition() is correct
Partition(A, p, r):
procedure Partition(A, p, r)
  ∗∗ the last element, A[r], is picked as the pivot key of this partition
  pivot ← A[r]
  i ← p − 1   ∗∗ i is the location of the last element known to be ≤ pivot
  for (j from p to r − 1) do
    if (A[j] ≤ pivot) then
      i ← i + 1
      exchange A[i] ↔ A[j]
  exchange A[i + 1] ↔ A[r]
  return i + 1
Example: A[1..8] = {3,1,7,6,4,8,2,5}, p = 1, r = 8, pivot = A[8] = 5
Array after each step (pivot = 5):
initial:       3 1 7 6 4 8 2 5
j=1 (3 ≤ 5):   3 1 7 6 4 8 2 5   (i = 1)
j=2 (1 ≤ 5):   3 1 7 6 4 8 2 5   (i = 2)
j=3 (7 > 5):   3 1 7 6 4 8 2 5
j=4 (6 > 5):   3 1 7 6 4 8 2 5
j=5 (4 ≤ 5):   3 1 4 6 7 8 2 5   (i = 3)
j=6 (8 > 5):   3 1 4 6 7 8 2 5
j=7 (2 ≤ 5):   3 1 4 2 7 8 6 5   (i = 4)
final swap:    3 1 4 2 5 8 6 7   (A[5] ↔ A[8]; return q = 5)
i = 4, j = 8 (for-loop ended)
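The pseudocode above translates directly into Python; the sketch below uses 0-based indices, so the example array occupies a[0..7] and the returned pivot position 4 corresponds to q = 5 in the 1-based slides.

```python
def partition(a, p, r):
    # Lomuto partition: the last element a[r] is the pivot.
    pivot = a[r]
    i = p - 1  # index of the last element known to be <= pivot
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]  # put the pivot between the two parts
    return i + 1

def quicksort(a, p, r):
    # sorts the subarray a[p..r] in place
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
```

On the slides' example a = [3, 1, 7, 6, 4, 8, 2, 5], partition(a, 0, 7) returns 4 and leaves [3, 1, 4, 2, 5, 8, 6, 7], matching the trace above.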
Partition(A, p, r):
The pivot happens to be A[r].
Works by traversing the keys of the array from p to r − 1. j indicates the current element we are considering.
i is the index of the last known element which is ≤ pivot.
The invariant:
A[p..i] contains keys ≤ pivot
A[(i + 1)..(j − 1)] contains keys > pivot
A[j..(r − 1)] contains keys yet to be compared to pivot
A[r] is the pivot
A[j] is the current key
If A[j] > pivot — no need to change i or exchange keys; the invariant is maintained.
If A[j] ≤ pivot, exchange A[j] with the first larger-than-pivot element (A[i + 1]) and increment i, maintaining the fact that A[p..j] consists of two consecutive subarrays: elements ≤ pivot followed by elements > pivot.
At the end, exchange A[r] ↔ A[i + 1] such that:
A[p..i] contains keys ≤ pivot
A[i + 1] contains pivot
A[(i + 2)..r] contains keys > pivot
QuickSort Analysis
Why we study QuickSort and its analysis:
very efficient, widely used
divide-and-conquer
huge literature
a model for analysis of algorithms
a terrific example of the usefulness of randomization
Observations:
(Again) key comparison is the dominant operation
Counting KC — only need to know (at each call) the rank
of the split key
QuickSort recursion tree:
root = pivot, left and right children = trees for the first and second recursive calls
An example:
[Figure: QuickSort recursion tree, with pivot keys at the internal nodes and the keys 02 05 07 12 at the leaves]
More observations:
In the resulting recursion tree, at each node
(all keys in left subtree) ≤ (key in this node) < (all keys in right subtree)
QuickSort recursion tree −→ binary search tree
QuickSort Running Time
Like before - dominated by #KC.
The pivot is compared with every other key: (n − 1) KC
Recurrence:
T(n) = 0,  if n ≤ 1
T(n) = T(n1) + T(n − 1 − n1) + (n − 1),  if n ≥ 2
where 0 ≤ n1 ≤ n − 1 is the size of the left subarray
This raises the question: how do we estimate what n1 is going to be?
There is no single answer.
QuickSort WC running time:
Recurrence:
T(n) = 0,  if n ≤ 1
T(n) = T(n1) + T(n − 1 − n1) + (n − 1),  if n ≥ 2
Notice that when both subarrays are non-empty, then #KC in next level is
(n1 −1)+(n−1−n1 −1) = (n−3)
But when one subarray is empty then #KC in next level is (n − 2).
WC recurrence:
T (n) = T (0) + T (n − 1) + (n − 1) = T (n − 1) + (n − 1),
Solving the recurrence — Master Theorem doesn’t apply
T(n) = T(n−1)+(n−1)=T(n−2)+(n−2)+(n−1) = ...
= T(1) + 1 + 2 + ... + (n − 1) = n(n − 1)/2
So, T(n) ∈ Θ(n²)
What is a worst-case instance for QuickSort?
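One candidate is an already-sorted array: with the last element as pivot, every partition leaves the right subarray empty. A small counting sketch (quicksort_kc is an illustrative helper, not from the slides) confirms the n(n − 1)/2 total:

```python
def quicksort_kc(a, p, r):
    # QuickSort that returns the number of key comparisons (KC) it makes
    if p >= r:
        return 0
    pivot = a[r]
    i = p - 1
    kc = 0
    for j in range(p, r):
        kc += 1  # one KC per loop iteration: a[j] vs pivot
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    q = i + 1
    return kc + quicksort_kc(a, p, q - 1) + quicksort_kc(a, q + 1, r)
```

On a sorted array of n = 50 keys this returns 50 · 49 / 2 = 1225, i.e. the Θ(n²) worst case.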
QuickSort Almost-WC running time:
Recurrence:
T(n) = 0,  if n ≤ 1
T(n) = T(n1) + T(n − 1 − n1) + (n − 1),  if n ≥ 2
Let’s try an almost-WC situation: at every step, we find a pivot for which n1 = n − 2.
Almost-WC recurrence:
T(n) = T(1) + T(n − 2) + (n − 1) = T(n − 2) + (n − 1),
Solving the recurrence —
T(n) = T(n − 2) + (n − 1) = T(n − 4) + (n − 3) + (n − 1) = ...
= T(1) + 1 + 3 + ... + (n − 3) + (n − 1)
Clearly T(n) ≤ (n/2) · (n − 1) ∈ O(n²), and also T(n) ≥ (n/4) · (n/2) ∈ Ω(n²). So, T(n) ∈ Θ(n²).
QuickSort BC running time:
Recurrence:
T(n) = 0,  if n ≤ 1
T(n) = T(n1) + T(n − 1 − n1) + (n − 1),  if n ≥ 2
When both subarrays are non-empty, we are saving 1 KC ...
Best case: each partition is a bipartition!
Saving as many KC as possible at every level ...
The recursion tree is as short as possible ...
Recurrence:
T(n) = 2 × T((n − 1)/2) + (n − 1)
Solving the recurrence — the Master Theorem applies to T(n) = 2T(n/2) + (n − 1), which
solves to Θ(n log n).
Question: What is the best-case array for the case of {1, 2, ..., 7}?
QuickSort Almost-BC running time:
Let’s assume that at each round we get an approximate bipartition. Namely, each split is (3/4)n and (1/4)n, resulting in the recurrence
T(n) = T(3n/4) + T(n/4) + cn
where, for simplicity, we just write cn to mean linear time in n
We may even consider a more extreme case where the split is (9/10)n and (1/10)n, resulting in the recurrence
T(n) = T(9n/10) + T(n/10) + cn
In both cases, the running time is O(n log n), and in fact Θ(n log n) considering time Ω(n log n) for BC.
In fact, for any split of constant proportionality, the running time remains Θ(n log n) (see CLRS pp. 175-176).
QuickSort Average Case running time:
The recurrence for running time is:
T(n) = 0,  if n = 0, 1
T(n) = T(n1) + T(n − 1 − n1) + (n − 1),  if n ≥ 2
Average case: always ask “average over what input distribution?”
Here, we assume each possible input equiprobable, i.e. uniform distribution.
Key observation: equiprobable inputs imply that, for each key, every rank among the keys seen so far is equiprobable.
So, n1 can be 0, 1, 2, ..., n − 2, n − 1, each with probability 1/n
Solving T(n):
As Pr[n1 = i] = 1/n for every i, we get
T(0) = 0, T(1) = 0,
T(2) = (2 − 1) + (1/2)(T(0) + T(2 − 1 − 0)) + (1/2)(T(1) + T(2 − 1 − 1)),
T(n) = (n − 1) + (1/n)(T(0) + T(n − 1))
     + (1/n)(T(1) + T(n − 2))
     + ...
     + (1/n)(T(n − 2) + T(1))
     + (1/n)(T(n − 1) + T(0))
     = (n − 1) + (2/n) Σ_{i=0}^{n−1} T(i).
The Master Theorem does NOT apply here, but it can be shown that
T(n) ≤ 2(n + 1)[H(n + 1) − 1], where the Harmonic number
H(n) = Σ_{i=1}^{n} 1/i ≈ ln n + γ, with γ ≈ 0.577.
(The details are omitted here, but see, e.g., https://en.wikipedia.org/w/index.php?title=Quicksort&action=edit&section=13)
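The bound can be sanity-checked numerically by unrolling the recurrence exactly (a sketch; avg_case_kc and harmonic are illustrative names, not from the slides):

```python
from math import fsum

def avg_case_kc(nmax):
    # exact values of T(n) = (n - 1) + (2/n) * sum_{i=0}^{n-1} T(i),
    # with T(0) = T(1) = 0
    T = [0.0, 0.0]
    for n in range(2, nmax + 1):
        T.append((n - 1) + (2.0 / n) * fsum(T))  # fsum(T) = T(0) + ... + T(n-1)
    return T

def harmonic(n):
    # H(n) = 1 + 1/2 + ... + 1/n
    return fsum(1.0 / i for i in range(1, n + 1))
```

For every n up to 300 the computed T(n) stays below 2(n + 1)[H(n + 1) − 1], consistent with the stated bound.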
Sorting Algorithms: Running Time Comparison
Algorithm         | WC           | AC           | BC
InsertionSort     | Θ(n²)        | Θ(n²)        | Θ(n)
MergeSort         | Θ(n log n)   | Θ(n log n)   | Θ(n log n)
HeapSort          | Θ(n log n)   | Θ(n log n)   | Θ(n log n)
QuickSort         | Θ(n²)        | Θ(n log n)   | Θ(n log n)
Random QuickSort  | Θ(n log n)∗  | Θ(n log n)∗  | Θ(n log n)
What is “∗”?
QuickSort Improvement and space requirement:
QuickSort is considered an in-place sorting algorithm:
extra space required at each recursive call is only constant.
whereas in MergeSort, at each recursive call up to Θ(n) extra
space is required.
To improve the algorithm we use a random pivot
Difference between a Randomized Algorithm and AC Analysis
AC analysis means we make an assumption on the input
No guarantee that the assumption holds
Input is chosen once: on average we might have a good running time, but once the input is given our running time is determined.
A randomized algorithm works for any input (WC analysis)
Randomness in the coins we toss (not in the input) so we control the distribution of the coin toss
We can always start the algorithm anew if it takes too long; or run it multiple times in parallel
Randomized QuickSort
We invoke RandomPartition rather than Partition
Pseudocode
procedure RandomPartition(A, p, r)
  i ← uniformly chosen random integer in {p, ..., r}
  exchange A[i] ↔ A[r]
  return Partition(A, p, r)
Q: How do we analyze the #KC of Random QuickSort? Depending on which pivots we pick, we may end up comparing any two elements. And of course, there is a chance we pick the worst pivot in every iteration ...
A: We analyze the expected WC #KC (As always, proportional to Expected WC running time)
Conclusion: the expected WC running time for random quicksort is Θ(n log n). (We ignore the details in this course.)
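In Python, the same idea reads as follows (a sketch with 0-based indices; partition is the Lomuto procedure from earlier in these notes):

```python
import random

def partition(a, p, r):
    # Lomuto partition with a[r] as the pivot
    pivot = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= pivot:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def random_partition(a, p, r):
    # move a uniformly chosen element into the pivot slot a[r], then partition
    k = random.randint(p, r)  # inclusive on both ends
    a[k], a[r] = a[r], a[k]
    return partition(a, p, r)

def random_quicksort(a, p, r):
    if p < r:
        q = random_partition(a, p, r)
        random_quicksort(a, p, q - 1)
        random_quicksort(a, q + 1, r)
```

Note that an already-sorted input is no longer a worst case: the quadratic behavior now depends only on the algorithm's coin tosses, not on the input.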
Sorting lower bound:
So far: we looked at BC runtime — for lower-bound purposes. It serves as a lower bound for a specific algorithm.
E.g., “Even in the best case, my algorithm makes ... KC.” So this is a lower bound of the form
∃ algorithm A s.t. ∀ input I, runtime(A(I)) ≥ ....
We now give a lower bound for the problem of sorting.
A lower bound on any algorithm for sorting - even those not invented yet. This is a lower bound of the form
∀ algorithm A ∃ input I, runtime(A(I)) ≥ ....
Q: Can we derive a lower bound of the form
∀ algorithm A and ∀ input I, runtime(A(I)) ≥ ....?
A: Not a very informative bound, since for every input I0 we can always “massage” any algorithm into an algorithm that first checks for I0.
If (input= I0) return solution(I0) else ... (run original algorithm)
Two useful trees in algorithm analysis:
1. Recursion tree
node ←→ recursive call
describes algorithm execution for one particular input by showing all
calls made
one algorithm execution ←→ all nodes (a tree)
useful in analysis: sum the numbers of operations over all nodes
[Figure: recursion-tree example, with the keys 02 05 07 12 at the leaves]
Two useful trees in algorithm analysis:
2. Decision tree
node ←→ algorithm decision
describes algorithm execution for all possible inputs by showing all
possible algorithm decisions
one algorithm execution ←→ one root-to-leaf path
useful in analysis: sum the numbers of operations over the nodes on one root-to-leaf path
Sorting lower bound:
Consider comparison-based sorting algorithms. These algorithms map to decision trees in which every internal node (a key comparison) has exactly 2 children.
Binary tree facts:
Suppose there are t leaves and k levels. Then, t ≤ 2^(k−1).
So, log t ≤ (k − 1)
Equivalently, k ≥ 1 + log t
— a binary tree with t leaves has at least (1 + log t) levels.
Comparison-based sorting algorithm facts:
Look at its Decision Tree. It’s a binary tree.
It should contain every possible output: every permutation of the
positions {1,2,...,n}.
So, it contains at least n! leaves ...
Equivalently, it has at least 1 + log(n!) levels.
The longest root-to-leaf path has length at least log(n!).
So in the worst case, the algorithm makes at least log(n!) KC, and
log(n!) ∈ Θ(n log n)
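The two-sided estimate log(n!) ∈ Θ(n log n) is easy to sanity-check numerically: the upper bound follows since each of the n factors is ≤ n, the lower bound since the top n/2 factors are each ≥ n/2. A sketch (log2_factorial is an illustrative helper):

```python
import math

def log2_factorial(n):
    # log2(n!) via the log-gamma function: ln(n!) = lgamma(n + 1)
    return math.lgamma(n + 1) / math.log(2)
```

For example, log2(8!) ≈ 15.3, which lies between (8/2) · log2(8/2) = 8 and 8 · log2(8) = 24.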
Now let us turn to some data structures and study their role in the design of algorithms:
Binary Search Trees (BST): a review (CLRS ch 12.1-3)
Balanced Binary Search Trees: AVL Trees
Hash Tables (CLRS ch 11.1-3)
Binary Search Trees
Definition: A rooted tree is a data-structure defined recursively as:
The empty rooted tree, nil
A special node, the root, which has children — pointers to distinct and
unique rooted trees.
Unique: a non-nil tree cannot be pointed to more than once
A tree is implemented like a doubly linked list: a node contains pointers to its child nodes and to its parent (nil if non-existent), and a node holds a key (possibly with other satellite data); if needed there is also an attribute T.height. (Review CLRS ch 10.2 for linked lists.)
A node u is a descendant of node v (equivalently, node v is an ancestor of u) if there is a path from v to u that uses only child pointers (equivalently, a path from u to v that uses only parent pointers).
A leaf is a rooted tree with no children (all children are nil)
A binary rooted tree is a rooted tree in which all nodes have at most two
children, left and right.
The length (= # edges) of the longest path from the root of the tree to a
leaf is called the height of the tree.
The i-th layer in a tree is the set of all nodes whose distance (= # edges) to the root is precisely i.
Binary Search Trees
Definition: A binary rooted-tree is called a binary search tree if for every node v we have the following two properties
1. All keys stored in v’s left subtree are ≤ v.key
2. All keys stored in v’s right subtree are > v.key
On the same set of nodes/keys, there could be many binary trees of different heights.
Like in the case of heaps: n ≤ 2^(height+1) − 1 ...
... but (as opposed to heaps) it is wrong to assume 2^height ≈ n
In fact, in the worst case, height ≈ n
Binary Search Trees
In a BST, finding a key x is rather simple:
procedure Find(T, x)
  if (T = nil) then
    return nil
  if (T.root.key = x) then
    return T   ∗∗ alternatively, return T.root
  else if (x < T.root.key) then
    return Find(T.root.left, x)
  else
    return Find(T.root.right, x)
Runtime for a tree of height h is
T(h) = O(1),  if h = 0
T(h) = O(1) + T(h − 1),  o/w
I.e., the runtime is O(h).
Note: While we use T to denote a BST and T.root to refer to the root node of T, CLRS uses a root node to represent a tree, e.g., on p.290, the procedure TREE-SEARCH(x, k) is to find key k in the BST whose root node is x.
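A minimal Python sketch of Find, following the CLRS convention of representing a (sub)tree by its root node; the Node class and find are illustrative names, not from the slides:

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def find(node, x):
    # returns the node holding key x, or None (the pseudocode's nil)
    if node is None:
        return None
    if node.key == x:
        return node
    if x < node.key:
        return find(node.left, x)
    return find(node.right, x)
```

For example, on the tree rooted at Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14))), searching for 6 descends left then right and succeeds, while searching for 7 falls off a nil pointer and returns None.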
Binary Search Trees: Find Min
Of particular importance is finding the min/max in a tree.
procedure FindMin(T)
  ∗∗ precondition: T isn’t nil
  if (T.root.left = nil) then
    return T   ∗∗ alternatively, return T.root
  else
    return FindMin(T.root.left)
How do we prove correctness of this code?
Clearly, by induction. But on what?
Answer: induction on hL = the height of the left subtree.
Base case: hL = 0. This means that there are no left descendants. So, by
definition, all keys in the right subtree are greater than T.root.key so by
returning T.root we return the element with the smallest key.
Induction step: Fix any hL ≥ 0. Assuming that for any tree with left-height hL FindMin() returns the min key of this tree, we show FindMin() returns
the minimum of any tree whose left-height is hL + 1.
Let T be any tree whose left-subtree’s height is hL + 1 > 0. So FindMin(T) returns FindMin(T.root.left), and by IH we return the smallest of all the keys that are ≤ T.root.key. All other keys in the tree are either T.root.key or the keys stored in the right subtree which are greater than T.root.key. Hence, the minimum of the left subtree is indeed the minimum of the whole tree.
What’s the code for FindMax(T)?
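FindMin keeps walking left; FindMax is the mirror image, walking right. A sketch (iterative for brevity, same O(h) cost; the Node class is the same illustrative one as before):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def find_min(node):
    # precondition: node is not None; the leftmost node holds the minimum
    while node.left is not None:
        node = node.left
    return node

def find_max(node):
    # precondition: node is not None; the rightmost node holds the maximum
    while node.right is not None:
        node = node.right
    return node
```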
Binary Search Tree: Insert
In a BST, inserting a key x is done at the leaf:
procedure Insert(T, x)
  Tnew ← a new tree
  Tnew.root.key ← x
  Tnew.root.left ← nil
  Tnew.root.right ← nil
  if (T = nil) then
    ∗∗ if this is the first item in the tree
    replace T with Tnew
  else
    InsertTree(T, Tnew)
procedure InsertTree(T, Tnew)
  ∗∗ precondition: T isn’t nil
  if (Tnew.root.key ≤ T.root.key) then
    if (T.root.left = nil) then
      T.root.left ← Tnew
      Tnew.root.parent ← T
    else
      InsertTree(T.root.left, Tnew)
  else   ∗∗ i.e., Tnew.root.key > T.root.key
    if (T.root.right = nil) then
      T.root.right ← Tnew
      Tnew.root.parent ← T
    else
      InsertTree(T.root.right, Tnew)
Insert() takes O(height of T).
The correctness statement: after invoking Insert(T, x), the resulting tree is a BST that contains all previous keys and x.
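A compact Python sketch of insertion. It simplifies the slides' InsertTree by returning the subtree root instead of maintaining parent pointers; duplicates go left, matching the ≤ rule, and is_bst is an illustrative checker for the correctness statement:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    # returns the root of the subtree after inserting key at a leaf position
    if root is None:
        return Node(key)
    if key <= root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def is_bst(node, lo=None, hi=None):
    # checks: keys in the left subtree are <= node.key, right subtree keys are > node.key
    if node is None:
        return True
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key > hi:
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)
```

Inserting 8, 3, 10, 1, 6, 14 and then a duplicate 6 yields a valid BST in which the second 6 sits in the left subtree of the first.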
Binary Search Tree: Delete
How to do deletion of a node x?
Easy case: x is a leaf – remove it. Done.
How to delete a node with only a single child?
What about deleting a node with two non-nil children?
Find a different node y that can replace x — delete y and put y instead of x.
Which node y shall we look for? The largest in the left subtree (or the smallest in the right subtree; both work), and delete y.
Binary Search Tree: Delete
An example of the simple deletion case:
An example of the more complex deletion case:
Binary Search Tree: Delete
(WC) Runtime Analysis?
Naïve first attempt: write the runtime as a recursive relation, T(h) = O(1) + O(h) + T(l), where l = the depth of the max node (with T(0) = O(1)).
But note: invoking the recursive call DeleteRoot() on the largest element in the left-subtree means that this node must not have a right child!
Hence, the recursive call is invoked at most once.
Thus, T(h) = O(1) + O(h) + O(1) = O(h).
Exercise: Depict the tree after each instruction:
(Delete(x) refers to DeleteRoot(Find(x)))
Insert(1), Insert(2), Insert(3), Insert(5), Insert(4), Delete(1), Delete(5), Delete(3), Insert(1), Insert(5), Delete(2)
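The deletion rule above (replace a two-child node's key with the largest key in its left subtree, then delete that node; being rightmost, it has no right child, so the easy case applies) can be sketched in Python. The helper names are illustrative and the sketch assumes distinct keys, as in the exercise; running the exercise's sequence leaves exactly the keys 1, 4, 5:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key <= root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def delete(root, key):
    # returns the subtree root after removing one node holding key (if present);
    # assumes distinct keys
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:      # zero or one child: splice the node out
        return root.right
    elif root.right is None:
        return root.left
    else:                        # two children: use the in-order predecessor
        pred = root.left
        while pred.right is not None:
            pred = pred.right
        root.key = pred.key
        root.left = delete(root.left, pred.key)  # easy case: pred has no right child
    return root

def keys_in_order(root):
    if root is None:
        return []
    return keys_in_order(root.left) + [root.key] + keys_in_order(root.right)
```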
Binary Search Tree: Outputting the Sorted Sequence
How to output all keys held in the tree from smallest to largest? In-order traversal:
procedure In-Order(T)
  if (T ̸= nil) then
    In-Order(T.root.left)
    Print(T.root.key)
    In-Order(T.root.right)
Runtime: T(n) = O(1) + T(nL) + T(nR), where nL, nR denote the number of nodes in the left/right subtree respectively. (Of course, T(0) = O(1).)
Solves to T (n) = O(n).
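In Python this is natural to write as a generator, so keys stream out in sorted order without building a list (the hand-built Node tree is illustrative):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def in_order(node):
    # left subtree, then the node's key, then right subtree;
    # visits each node exactly once, so O(n) total
    if node is not None:
        yield from in_order(node.left)
        yield node.key
        yield from in_order(node.right)
```

On the tree rooted at Node(8, Node(3, Node(1), Node(6)), Node(10, None, Node(14))), the traversal yields 1, 3, 6, 8, 10, 14, i.e. the keys from smallest to largest.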