Counting-Sort (A, B, k)
• (a): Array A and the auxiliary array C after line 5
• (b):Arrayafterline7
• (c) ~ (e): The change values of array B and auxiliary array C after iteration, 1, 2, and 3.
• (f): The final sorted output array B.
Algorithm Analysis
• The overall time is O(n+k). When we have k=O(n), the worst case is O(n).
§ for-loop of lines 2-3 takes time O(k) § for-loop of lines 4-5 takes time O(n) § for-loop of lines 6-7 takes time O(k) § for-loop of lines 8-10 takes time O(n)
• Stable, but not in place.
• No comparisons made: it uses actual values of the
elements to index into an array.
Counting Sort Animation
Counting Sort
• Cool! Why don’t we always use counting sort?
• Because it depends on range k of elements
• Could we use counting sort to sort 32 bit integers? Why or why not?
§ Answer: no, k too large (232 = 4,294,967,296)
• 16-bit?
§ Probably not.
• 8-bit?
§ Maybe, depending on n.
• 4-bit?
§ Probably, (unless n is really small).
Radix Sort
• How did IBM get rich originally?
• Answer: punched card readers for census tabulation in early 1900’s.
§ In particular, a card sorter that could sort cards into different bins
o Each column can be punched in 12 places
o Decimal digits use 10 places
§ Problem: only one column can be sorted on at a time
IBM Punched Card
Radix Sort
• Used to sort on card-sorters:
• Do a stable sort on each column, one column at a time. • The human operator is part of the algorithm!
• Key idea: sort on the “least significant digit” first and on the remaining digits in sequential order. The sorting method used to sort each digit must be “stable”.
§ If we start with the “most significant digit”, we’ll need extra storage. (Why? Try it!)
§ Problem: lots of intermediate piles of cards
Sorting Animation
Radix Sort using Counting Sort
Input
After sorting on LSD
After sorting on middle digit
After sorting on MSD
356
392
446
495
532
631
928
392 631
356 392
446 532
928 Þ 495 Þ
928
631
532
446
356
392
495
Þ
631 532 495
356
446
928
An Example
Radix-Sort(A, d)
RadixSort(A, d)
1. fori¬1tod
2. do use a stable sort to sort array A on digit i
Correctness of Radix Sort
• By induction on the number of digits sorted. • Assume that radix sort works for d – 1 digits. • Show that it works for d digits.
• Radix sort of d digits o radix sort of the low-order d – 1 digits followed by a sort on digit d .
Algorithm Analysis
• Each pass over n d-digit numbers then takes time Q(n+k). (Assuming counting sort is used for each pass.)
• There are d passes, so the total time for radix sort is Q(d (n+k)).
• When d is a constant and k = O(n), radix sort runs in linear time.
• Radix sort, if uses counting sort as the intermediate stable sort, does not sort in place.
§ If primary memory storage is an issue, quicksort or other sorting methods may be preferable.
Correctness of Radix Sort
By induction hypothesis, the sort of the low-order d – 1 digits works, so just before the sort on digit d , the elements are in order according to their low-order d – 1 digits. The sort on digit d will order the elements by their dth digit.
Consider two elements, a and b, with dth digits ad and bd:
• If ad < bd , the sort will place a before b, since a < b regardless of the low-order digits.
• If ad > bd , the sort will place a after b, since a > b regardless of the low-order digits.
• If ad = bd , the sort will leave a and b in the same order, since the sort is stable. But that order is already correct, since the correct order of is determined by the low-order digits when their dth digits are equal.
Radix Sort
• Problem: sort 1 million 64-bit numbers
§ If treat as four-digit radix 216 numbers
§ Can sort in just four passes with radix sort!
§ Running time: 4( 1 million + 216 ) ≈ 4 million operatons
• Compares well with typical O(n lg n) comparison sort
§ Requires approx lg n = 20 operations per number being
sorted
§ Total running time ≈ 20 million operatons
• So why would we ever use anything but radix sort?
An Example
• Sort N numbers, each with k bits • E.g,input{4,1,0,10,5,6,1,8}
Another Example
• English words sorting:
Radix Sort
• In general, radix sort based on counting sort is § Fast
§ Asymptotically fast (i.e., O(n)) § Simple to code
§ A good choice
• To think about: Can radix sort be used on floating-point numbers?
Summary: Radix Sort
• Radix sort:
§ Assumption: input has d digits ranging from 0 to k
§ Basic idea:
o Sort elements by digit starting with least significant o Use a stable sort (like counting sort) for each stage
§ Each pass over n numbers with d digits takes time O(n+k), so total time O(dn+dk)
o When d is constant and k=O(n), takes O(n) time §Fast! Stable!Simple!
§ Doesn’t sort in place
Bucket Sort
• Bucket sort
§ Assumption: input is n reals from [0, 1)
§ Basic idea:
o Create n linked lists (buckets) to divide interval [0,1)
into equal subintervals of size 1/n
o Add each input element to appropriate bucket and sort
buckets with insertion sort
§Uniform input distributionàO(1) bucket size
o Therefore the expected total time is O(n)
§ These ideas will return when we study hash tables
An Example
Bucket-Sort (A) Input: A[1..n], where 0 £ A[i] < 1 for all i. Auxiliary array: B[0..n – 1] of linked lists, each list initially empty. BucketSort(A) n ¬ A.length Let B[0..n – 1] be a new array for i ¬ 0 to n – 1 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. return the concatenated lists make B[i] an empty list for i ¬ 0 to n do insert A[i] into list B[ ënA[i]û ] fori¬0ton–1 do sort list B[i] with insertion sort concatenate the lists B[i]s together in order Bucket Sort Animation1 Bucket Sort Animation 2 Correctness of BucketSort • Consider A[i], A[j]. Assume w.o.l.o.g, A[i] £ A[j]. • Then, ën ́ A[i]û £ ën ́ A[j]û. • So, A[i] is placed into the same bucket as A[j] or into a bucket with a lower index. § If same bucket, insertion sort fixes up. § If earlier bucket, concatenation of lists fixes up. Analysis • Relies on no bucket getting too many values. • All lines except insertion sorting in line 8 take O(n) altogether in the worst case. • Intuitively, if each bucket gets a constant number of elements, it takes O(1) time to sort each bucket Þ O(n) sort time for all buckets. • We “expect” each bucket to have few elements, since the average is 1 element per bucket. • But we need to do a careful analysis. Analysis – Contd. • RV ni = no. of elements placed in bucket B[i]. • Insertion sort runs in quadratic time. Hence, time for bucket sort is: n-1 T (n) = Q(n) + åO(n2 ) i i=0 Taking expectations of both sides and using linearity of expectation, we have (bylinearityof expectation) é n-1 ù E[T(n)]= EêQ(n)+åO(n2)ú i ë i=0 û n-1 =Q(n)+åE[O(n2)] i i=0 = Q(n) + ån-1 O(E[n2 ]) (E[aX ] = aE[X ]) (8.1) i i=0 Analysis – Contd. • Claim: E[ni2] = 2 – 1/n. • Proof: • Define indicator random variables. § Xij = I{A[j] falls in bucket i} § Pr{A[j] falls in bucket i} = 1/n. n § ni = (8.2) åX j=1 ij éæ n ö2ù X ÷ú å i êè øú E[n2]=Eêç ë j=1 û énnù = E êë å å X X úû ij ij ik j=1 k=1 Analysis – Contd. éå ù =Eên X2+ XXú åå ê j =1 1£ j £ n 1£ k £ n ú ij ë j1k û ij ik =ån E[X2]+å åE[X X ] ,bylinearityof expectation. j=1 1£ j£n1£k£n j1k (8.3) ij ij ik Analysis – Contd. E[X2]=02 ×Pr{A[j]doesn'tfallinbucketi}+ ij 12 ×Pr{A[j]fallsinbucketi} = 0×æ1- 1 ö +1× 1 çè n÷ø n =1 n E[XijXik]for j1k: Since j 1 k , X ij and X ik are independent random variables. Þ E[Xij Xik ] = E[Xij ]E[Xik ] =1×1=1 nn n2 (8.3) is hence, n11 E[n2]=å +åå Analysis – Contd. i =1+ n-1 n =2-1. n j=1n 1£j£n1£k£nn2 k1j =n×1+n(n-1)× 1 n n2 Substituting (8.2) in (8.1), we have, E[T (n)] = Q(n) + ån-1 O(2 -1/ n) i=0 = Q(n) + O(n) = Q(n)