Lecture 11
Quicksort
EECS 281: Data Structures & Algorithms
Quicksort Basics
Data Structures & Algorithms
Two Problems with Simple Sorts
• They might compare every pair of elements
– Each comparison yields only one piece of information
– Contrast with binary search: its first comparison eliminates n/2 candidates
• They often move elements one place
at a time (bubble and insertion)
– Even if the element is “far” out of place
– Contrast with selection sort: moves each element directly to its final place
• Faster sorts attack these two problems
Quicksort: Background
• ‘Easy’ to implement
• Works well with a variety of input data
• In-place sort (no extra memory for data)
• Additional memory for stack frames
Quicksort: Divide and Conquer
• Base case:
– Arrays of length 0 or 1 are trivially sorted
• Inductive step:
– Guess an element elt to partition the array
– Form array of [LHS] elt [RHS] (divide)
• ∀ x ∈ LHS, x <= elt
• ∀ y ∈ RHS, y >= elt
– Recursively sort [LHS] and [RHS] (conquer)
Quicksort with Simple Partition
void quicksort(Item a[], size_t left, size_t right) {
  if (left + 1 >= right)
    return;
  size_t pivot = partition(a, left, right);
  quicksort(a, left, pivot);
  quicksort(a, pivot + 1, right);
} // quicksort()
• Range is [left, right)
• If base case, return
• Else divide (partition and find pivot)
and conquer (recursively quicksort)
How to Form [LHS] elt [RHS]?
• Divide and conquer algorithm
– Ideal division: equal-sized LHS, RHS
• Ideal division is the median
– How does one find the median?
• Simple alternative: just pick any element (see the sketch after this list)
– (a) array is random
– (b) otherwise
– Not guaranteed to be a good pick
– Quality can be averaged over such choices
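As a concrete illustration of "just pick any element" (this sketch is not from the slides), one can pick a random index and swap that element into the last slot, then reuse the simple last-element partition() shown on the next slide. The name randomizedPartition is illustrative, and Item is the element type assumed throughout these slides.

#include <cstddef>   // size_t
#include <cstdlib>   // rand
#include <utility>   // std::swap

size_t partition(Item a[], size_t left, size_t right);  // the simple partition that follows

// Illustrative sketch: choose a (roughly) uniformly random element of
// [left, right) and make it the pivot by swapping it into the last position.
size_t randomizedPartition(Item a[], size_t left, size_t right) {
  size_t r = left + static_cast<size_t>(rand()) % (right - left);  // requires right > left
  std::swap(a[r], a[right - 1]);      // random element becomes the pivot
  return partition(a, left, right);   // partition exactly as before
}

On a random array any fixed choice behaves the same on average; randomizing the choice extends that average-case behavior to arbitrary inputs.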
Simple Partition
size_t partition(Item a[], size_t left, size_t right) {
  size_t pivot = --right;
  while (true) {
    while (a[left] < a[pivot])
      ++left;
    while (left < right && a[right - 1] >= a[pivot])
      --right;
    if (left >= right)
      break;
    swap(a[left], a[right - 1]);
  } // while
  swap(a[left], a[pivot]);
  return left;
} // partition()
• Choose last item as pivot
• Scan…
– from left for >= pivot
– from right for < pivot
• Swap left & right pairs
and continue scan until
left & right cross
• Move pivot to ‘middle’
Example: pick-the-last
{2 9 3 4 7 5 8 6}
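A small test driver (not part of the slides) for this example; it assumes Item is int and that it is compiled together with the quicksort() and partition() shown above.

#include <cstddef>
#include <iostream>
#include <utility>
using Item = int;                 // assumed element type for this example
using std::swap;

size_t partition(Item a[], size_t left, size_t right);  // as defined above
void quicksort(Item a[], size_t left, size_t right);    // as defined above

int main() {
  Item a[] = {2, 9, 3, 4, 7, 5, 8, 6};
  size_t n = sizeof(a) / sizeof(a[0]);
  // The first partition call picks 6 as the pivot and rearranges the
  // array to {2 5 3 4 6 9 8 7}, returning pivot index 4.
  quicksort(a, 0, n);
  for (size_t i = 0; i < n; ++i)
    std::cout << a[i] << ' ';     // prints: 2 3 4 5 6 7 8 9
  std::cout << '\n';
  return 0;
}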
Another Partition
size_t partition(Item a[], size_t left, size_t right) {
  size_t pivot = left + (right - left) / 2;  // pivot is middle
  swap(a[pivot], a[--right]);                // swap with right
  pivot = right;                             // pivot is right

  while (true) {
    while (a[left] < a[pivot])
      ++left;
    while (left < right && a[right - 1] >= a[pivot])
      --right;
    if (left >= right)
      break;
    swap(a[left], a[right - 1]);
  } // while
  swap(a[left], a[pivot]);
  return left;
} // partition()
• Choose middle item as pivot
• Swap it with the right end
• Repeat as before
Example: worst-case (min/max)
{1 2 3 4 5}
Quicksort Basics
Data Structures & Algorithms
Quicksort Analysis & Performance
Data Structures & Algorithms
Time Analysis
• Cost of partitioning n elements: O(n)
• Worst case: pivot always leaves one side empty
– T(n) = n + T(n - 1) + T(0)
– T(n) = n + T(n - 1) + c [since T(0) is O(1)]
– T(n) ≈ n²/2 ⇒ O(n²) [via substitution; expanded below]
• Best case: pivot divides elements equally
– T(n) = n + T(n / 2) + T(n / 2)
– T(n) = n + 2T(n / 2) = n + 2(n / 2) + 4(n / 4) + … + n·O(1)
– T(n) ≈ n log n ⇒ O(n log n) [master theorem or substitution]
• Average case: tricky
– About 2n ln n ≈ 1.39 n log n comparisons on average ⇒ O(n log n)
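Spelling out the substitution behind the worst- and best-case bounds above (a sketch of the algebra, with c standing for the constant cost of a base case):

\[
T(n) = n + T(n-1) + c = n + (n-1) + T(n-2) + 2c = \cdots = \sum_{k=1}^{n} k + cn + T(0) \approx \frac{n^2}{2} \;\Rightarrow\; O(n^2)
\]

\[
T(n) = n + 2T(n/2) = n + n + 4T(n/4) = \cdots = n \log_2 n + n\,T(1) \;\Rightarrow\; O(n \log n)
\]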
Memory Analysis
• Requires stack space for recursive calls
• The first recursive call is NOT tail recursive, so it requires a new stack frame
• The second recursive call IS tail recursive, so it can reuse the current stack frame
• When pivoting goes terribly:
– O(n) stack frames if the split is (n - 1), pivot, (0)
– O(n) stack frames if the split is (0), pivot, (n - 1)
Sort Smaller Region First
void quicksort(Item a[], size_t left, size_t right) {
  if (left + 1 >= right)
    return;
  size_t pivot = partition(a, left, right);
  if (pivot - left < right - pivot) {
    quicksort(a, left, pivot);
    quicksort(a, pivot + 1, right);
  } else {
    quicksort(a, pivot + 1, right);
    quicksort(a, left, pivot);
  } // else
} // quicksort()
• Worst memory requirement?
• Both sides equal: O(log n)
Quicksort: Pros and Cons
Advantages
• On average, n log n time to sort n items
• Short O(n) inner loop
• Efficient memory usage
• Thoroughly analyzed and understood
Disadvantages
• Worst case, n² time to sort n items
• Not stable, and incredibly difficult to make stable
• Partitioning is fragile (simple mistakes will either segfault or not sort properly)
Improving Splits
• Key to performance: a “good” split
– Any single choice could always be the worst one
– Too expensive to actually compute the best one (the median)
• Rather than compute the median, sample it
– Simple way: pick three elements and take their median
– Very likely to give you better performance
• Sampling is a very powerful technique!
Median Sampling: Fixed
• Find the median of the first, middle, and last elements
• That median becomes the pivot
• Runtime: O(1)
[Example array figure omitted]
Median Sampling: Random
• Find the median of three (five, seven, …) random elements
• That median becomes the pivot
• Runtime: O(1)
[Example array figure omitted]
Other Improvements
• Divide and conquer
– Most sorts are of “small” regions
– Lots of recursive calls to small regions
• Reduce the cost of sorting small regions
– Insertion sort is faster than quicksort on small arrays
– Bail out of quicksort when size < k
• For some small, fixed k, usually around 16 or 32
– Insertion sort each small sub-array
Summary: Quicksort
• On average, O(n log n)-time sort
• Efficiency based upon selection of pivot
– Randomly choose middle or last key in partition
– Sample three keys
– Other creative methods
• Other methods of tuning
– Use another sort when the partition is ‘small’ (a combined sketch appears after the self-study questions)
– Three-way partition
Quicksort Analysis & Performance
Data Structures & Algorithms
Sorting Summary
Data Structures & Algorithms
Sorting Algorithms: Time
• Bubble sort, Insertion sort, Selection sort: elementary sorts, worst-case O(n²)
• Heapsort: heap-based sort, O(n log n) worst-case
• Quicksort: divide-and-conquer; average case O(n log n), depending on pivot selection
Sorting Algorithms: Memory
• Bubble sort, Insertion sort, Selection sort, Heapsort: in-place sorts, O(1) extra memory
• Quicksort? In-place for the data, but needs stack space for recursion (O(log n) if the smaller side is sorted first)
Sorting Algorithms: Stability
A sorting algorithm is stable if output data having the same key values remain in the same relative locations after the sort
• Bubble sort ✓
• Insertion sort ✓
• Selection sort ✘
• Heapsort ✘
• Quicksort ✘
Introsort
• Introspective Sort
– Introspection means to think about oneself
• Used by g++ and Visual Studio
Algorithm introsort(a[], n):
  if (n is small)
    insertionSort()
  else if (quicksort recursion depth is large)
    heapsort()
  else
    quicksort()
Sorting Summary
Data Structures & Algorithms
Questions for Self-study
• Illustrate a worst-case input for quicksort
• Explain why the best-case runtime for quicksort is not linear
– Give two ways to make it linear (why is this not done in practice?)
• Normally, pivot selection takes O(1) time; what will happen to quicksort’s complexity if pivot selection takes O(n) time?
• Improve quicksort with O(n)-time median selection
– Must limit median selection to linear time in all cases
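Pulling together the improvements described above (sorting the smaller side first, median-of-three pivot sampling, and an insertion-sort cutoff for small regions), here is a hedged sketch. It assumes the Item type and the last-element partition() from these slides; the helper names and the cutoff constant are illustrative rather than taken from the lecture.

#include <cstddef>   // size_t
#include <utility>   // std::swap
using std::swap;

size_t partition(Item a[], size_t left, size_t right);  // simple partition from these slides

const size_t CUTOFF = 16;  // "some small, fixed k"

// Finish a small range [left, right) with insertion sort.
void insertionSort(Item a[], size_t left, size_t right) {
  for (size_t i = left + 1; i < right; ++i)
    for (size_t j = i; j > left && a[j] < a[j - 1]; --j)
      swap(a[j], a[j - 1]);
}

// Put the median of a[left], a[mid], a[right - 1] into the last slot,
// then reuse the last-element partition().
size_t medianOfThreePartition(Item a[], size_t left, size_t right) {
  size_t mid = left + (right - left) / 2;
  size_t last = right - 1;
  if (a[mid] < a[left])  swap(a[mid], a[left]);
  if (a[last] < a[left]) swap(a[last], a[left]);
  if (a[last] < a[mid])  swap(a[last], a[mid]);
  swap(a[mid], a[last]);  // the median of the three is now the pivot
  return partition(a, left, right);
}

void hybridQuicksort(Item a[], size_t left, size_t right) {
  while (right - left > CUTOFF) {
    size_t pivot = medianOfThreePartition(a, left, right);
    // Recurse on the smaller side, loop on the larger: O(log n) stack frames.
    if (pivot - left < right - pivot) {
      hybridQuicksort(a, left, pivot);
      left = pivot + 1;
    } else {
      hybridQuicksort(a, pivot + 1, right);
      right = pivot;
    }
  }
  insertionSort(a, left, right);  // bail out to insertion sort on the small region
}

Adding a check on the recursion depth that falls back to heapsort would turn this into roughly the introsort scheme sketched above.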