
Sorting (Chapters 7, 8, 9 in the textbook)
cosc 336 1

Sorting by Comparison
SORTING PROBLEM: Given an array A[1], A[2], …, A[n] of numbers, arrange them in increasing order.
We look at comparison-based sorting algorithms. They work by comparing A[i] with A[j] for various indices i and j.
1. Simple: SelectionSort, BubbleSort
2. Good worst case: MergeSort, HeapSort
3. Lower bound for comparison-based sorting algorithms
4. Good average case: QuickSort
cosc 336 2

Selection Sort Idea
• Are the first 2 elements sorted? If not, swap them.
• Are the first 3 elements sorted? If not, move the 3rd element to the left by a series of swaps.
• Are the first 4 elements sorted? If not, move the 4th element to the left by a series of swaps.
– etc.
cosc 336 3

Selection Sort
procedure SelectionSort (A[1..n])
for ( i = 2 to n ) {
    j = i;
    while ( j > 1 && A[j] < A[j-1] ) {
        swap( A[j], A[j-1] );
        j--;
    }
}
Suppose A is initially sorted? Suppose A is reverse sorted?
cosc 336 4

Bubble Sort
cosc 336 6

Why Selection (or Bubble) Sort Is Slow
• Inversion: a pair (i, j) such that i < j but A[i] > A[j]
• An array of size n can have Θ(n²) inversions
– the average number of inversions in a random array is n(n-1)/4
• Selection/Bubble Sort only swaps adjacent elements
– each swap removes only 1 inversion!
cosc 336 7
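The pseudocode on the slide above can be sketched in runnable form; this is a 0-indexed Python translation (the function name `adjacent_swap_sort` is ours, not the textbook's):

```python
def adjacent_swap_sort(a):
    """Sort the list in place by repeatedly swapping adjacent
    out-of-order elements, as in the slide's pseudocode."""
    for i in range(1, len(a)):
        j = i
        # Move a[j] left until it is no smaller than its left neighbor.
        while j > 0 and a[j] < a[j - 1]:
            a[j], a[j - 1] = a[j - 1], a[j]
            j -= 1
    return a
```

On an already-sorted array the inner while loop never runs (linear time); on a reverse-sorted array every pair is an inversion and every inversion costs one swap (quadratic time).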

GOOD SORTING ALGORITHMS
We have already discussed:
• MergeSort.
Runs in O(n log n) worst case.
• HeapSort.
Runs in O(n log n) worst case.
cosc 336 8

Lower bound for comparison-based sorting algorithms
cosc 336 9

Decision tree to sort list A, B, C
[Figure: a decision tree whose internal nodes compare two of A, B, C and whose leaves are the 3! = 6 possible orderings. A comparison-based sort of n elements must have at least n! leaves, so its height — the worst-case number of comparisons — is at least log2(n!) = Ω(n log n).]

QuickSort partition
The pivot is x = A[r]. The cursor j moves right one step at a time.
If the cursor j "discovers" an element ≤ x, then this element
is swapped with the front element of the window, effectively moving the window right one step; if it discovers an element > x, then the window simply becomes one unit longer.
cosc 336 16
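The window-and-cursor partition described above can be sketched in Python (0-indexed, Lomuto-style; the name `partition` matches the later slides, the rest is our translation):

```python
def partition(a, p, r):
    """Partition a[p..r] around pivot x = a[r].
    The cursor j scans right; elements <= x are swapped to the
    front of the 'window' of elements known to be > x.
    Returns the pivot's final index."""
    x = a[r]
    i = p - 1                    # end of the region of elements <= x
    for j in range(p, r):
        if a[j] <= x:            # window moves right one step
            i += 1
            a[i], a[j] = a[j], a[i]
        # else: the window simply grows by one
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1
```

After the call, everything left of the returned index is ≤ the pivot and everything right of it is > the pivot.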

Analyzing QuickSort
• Picking pivot: constant time
• Partitioning: linear time
• Recursion: time for sorting left partition (say of size i) + time for right (size N-i-1)
T(1) = b
T(N) = T(i) + T(N-i-1) + cN
where i is the number of elements smaller than the pivot
cosc 336 17
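The recurrence above maps directly onto the code: one linear-time partition (the cN term) plus two recursive calls of sizes i and N-i-1. A minimal self-contained Python sketch (last-element pivot; names are illustrative):

```python
def quicksort(a, p=0, r=None):
    """QuickSort sketch: constant-time pivot choice, linear-time
    partition, then recurse on both sides of the pivot."""
    if r is None:
        r = len(a) - 1
    if p < r:
        x = a[r]                      # pivot: last element
        i = p - 1
        for j in range(p, r):         # partition: linear time
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        q = i + 1
        quicksort(a, p, q - 1)        # left partition (size i)
        quicksort(a, q + 1, r)        # right partition (size N-i-1)
    return a
```
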

QuickSort – Worst case
Pivot is always the smallest (or largest) element: T(n) = T(i) + T(n-i-1) + cn
Worst case is when i = 0 or i = n-1. In such a case:
T(n) = T(n-1) + cn
     = T(n-2) + c(n-1) + cn
     = T(1) + c(2 + 3 + … + n) = O(n²)
cosc 336 18

Dealing with Slow QuickSorts
• Randomly choose pivot
– Good theoretically and practically, but call to random number generator can be expensive
• Pick pivot cleverly
– The “median-of-3” rule takes the median of the first, middle, and last elements. Also works well.
cosc 336 19

QuickSort – Best Case
cosc 336 20

QuickSort – Average Case
cosc 336 21

Quicksort – analysis 2
cosc 336 22

Summary for comparison-based sorting algorithms
• Sorting algorithms that only compare adjacent elements are Ω(n²) worst case – but may be O(n) best case
• HeapSort and MergeSort – Θ(n log n) both best and worst case
• QuickSort – Θ(n²) worst case but Θ(n log n) best and average case
• Any comparison-based sorting algorithm is Ω(n log n) worst case
cosc 336 23

Medians and Order Statistics
• i-th order statistic: i-th smallest element
• n elements: median is
– n odd: (n+1)/2
– n even: n/2 or n/2+1
• Assume distinct numbers.
• SELECTION PROBLEM:
• Input: an array A with n numbers, and 1 <= i <= n
• Output: the element x of A larger than exactly i-1 elements of A (in other words: the i-th smallest number).
cosc 336 24

Solutions
• EASY SOLUTION: O(n log n) time based on ...
• We’ll see:
1. a randomized algorithm with O(n) time on average;
2. a deterministic algorithm with O(n) time in the worst case.
cosc 336 25

Minimum and Maximum
• How many comparisons?
– At most n-1.
– Examine each element and keep track of the smallest one:
• Comparison based
• Each element must be compared
• Each must lose once (except the winner).
• What about simultaneous min and max?
cosc 336 26

Min & Max
• Can do with 2n-2 comparisons.
• Can do better:
– Form pairs of elements
– Compare the elements in each pair
– For a pair (ai, ai+1), assume ai < ai+1; then
• Compare (min, ai) and (ai+1, max)
• 3 comparisons for each pair.
cosc 336 27

Average-Time Median Selection
• Divide-and-conquer (prune-and-search).
• Randomized: behavior determined by the output of a random number generator.
• Based on QuickSort:
– Partition the input array recursively, but
– work only on one side!
cosc 336 28

Randomized Selection
• QuickSort(A,p,r):
If p < r then
    q = partition(A,p,r)
    QuickSort(A,p,q)
    QuickSort(A,q+1,r)
– First call: QuickSort(A,1,n)
– After partition(A,p,r):
• A[i] ≤ A[j] for all p ≤ i ≤ q and q+1 ≤ j ≤ r.

Select: deterministic, worst-case O(n)
1. Divide the n elements into groups of 5.
2. Find the median of each group.
3. Use Select recursively to find the median x of the group medians.
4. Partition the input array around x.
5. Use Select recursively on one side of the partition.
cosc 336 33
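Randomized Selection — the prune-and-search idea above (partition, then keep only one side) can be sketched in Python; `randomized_select` is our illustrative name for the slide's algorithm:

```python
import random

def randomized_select(a, i):
    """Return the i-th smallest element of a (1 <= i <= len(a)).
    Like QuickSort, but after partitioning recurse into ONE side
    only -- O(n) expected time."""
    a = list(a)                          # work on a copy
    lo, hi, target = 0, len(a) - 1, i - 1
    while lo < hi:
        r = random.randint(lo, hi)       # random pivot choice
        a[r], a[hi] = a[hi], a[r]
        x, k = a[hi], lo - 1
        for j in range(lo, hi):          # linear-time partition
            if a[j] <= x:
                k += 1
                a[k], a[j] = a[j], a[k]
        a[k + 1], a[hi] = a[hi], a[k + 1]
        q = k + 1                        # pivot's final position
        if target == q:
            return a[q]
        if target < q:
            hi = q - 1                   # keep only the left side
        else:
            lo = q + 1                   # keep only the right side
    return a[lo]
```
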

Analysis
• Find a lower bound on the number of elements greater than x.
– At least half of the medians found in step 2 are greater than x. Then,
– at least half of the groups contribute 3 elements that are greater than x, except:
• the last group (if it has fewer than 5 elements);
• x’s own group.
– Discarding those two groups:
• the number of elements greater than x is ≥ 3((n/5)/2 - 2) = 3n/10 - 6.
• Similarly, the number of elements smaller than x is ≥ 3n/10 - 6.
• Then, in the worst case, Select is called recursively in Step 5 on at most 7n/10 + 6 elements (upper bound).
cosc 336 34
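The five steps of Select can be sketched in Python. This is a list-based sketch under the slides' assumption of distinct numbers (the 3n/10 - 6 bound above guarantees each recursive call sees at most about 7n/10 + 6 elements, hence O(n) worst case); names are illustrative:

```python
def select(a, i):
    """Deterministic i-th smallest (1-indexed), median-of-medians."""
    a = list(a)
    if len(a) <= 10:                       # small base case: just sort
        return sorted(a)[i - 1]
    # Steps 1-2: medians of groups of 5 (last group may be smaller).
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Step 3: pivot x = median of the medians, found recursively.
    x = select(medians, (len(medians) + 1) // 2)
    # Step 4: partition around x.
    lt = [v for v in a if v < x]
    gt = [v for v in a if v > x]
    # Step 5: recurse on one side only.
    if i <= len(lt):
        return select(lt, i)
    if i > len(a) - len(gt):
        return select(gt, i - (len(a) - len(gt)))
    return x
```
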

Analysis (cont.)
cosc 336 35

Sorting in Linear Time
cosc 336 36

Counting sort
• Assumptions:
– n records
– Each record contains keys and data
– All keys are in the range 1 to k
• Space
– The unsorted list is stored in A; the sorted list will be stored in an additional array B
– Uses an additional array C of size k
cosc 336 37

Counting sort
• Main idea:
1. For each key value i, i = 1,…,k, count the number of times the key occurs in the unsorted input array A. Store the results in an auxiliary array C.
2. Use these counts to compute the offsets. Offset-i is used to calculate the location where the record with key value i will be stored in the sorted output list B. The offset-i value holds the location of the last record with key value i.
• When would you use counting sort?
• How much memory is needed?
cosc 336 38

Counting Sort
Counting-Sort( A, B, k )
Input: A[1..n], A[j] ∈ {1, 2, …, k}
Output: B[1..n], sorted
Uses C[1..k], auxiliary storage
1. for i ← 1 to k
2.     do C[i] ← 0
3. for j ← 1 to length[A]
4.     do C[A[j]] ← C[A[j]] + 1
5. for i ← 2 to k
6.     do C[i] ← C[i] + C[i-1]
7. for j ← length[A] downto 1
8.     do B[C[A[j]]] ← A[j]
9.        C[A[j]] ← C[A[j]] - 1
Analysis:
cosc 336 39
Adapted from Cormen, Leiserson, Rivest
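The pseudocode above translates directly into Python; this sketch keeps C 1-indexed (C[0] unused) and uses a 0-indexed output list:

```python
def counting_sort(A, k):
    """Counting sort of A, whose keys are in 1..k, following the
    numbered pseudocode: histogram, prefix sums, then a stable
    right-to-left placement pass."""
    n = len(A)
    C = [0] * (k + 1)                 # C[1..k]; C[0] unused
    for key in A:                     # lines 3-4: count each key
        C[key] += 1
    for i in range(2, k + 1):         # lines 5-6: prefix sums = offsets
        C[i] += C[i - 1]
    B = [0] * n
    for j in range(n - 1, -1, -1):    # lines 7-9: right-to-left keeps
        B[C[A[j]] - 1] = A[j]         # equal keys in original order
        C[A[j]] -= 1
    return B
```

Running it on the slide's example A = [4, 1, 3, 4, 3, 4] with k = 4 reproduces the traced result.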

Counting-Sort( A, B, k ) trace, with k = 4, length = 6:
A = [4, 1, 3, 4, 3, 4]   (indices 1..6)
C after lines 1-2: [0, 0, 0, 0]
C after lines 3-4: [1, 0, 2, 3]   (counts)
C after lines 5-6: [1, 1, 3, 6]   (prefix sums)
1. for i ← 1 to k
2.     do C[i] ← 0
3. for j ← 1 to length[A]
4.     do C[A[j]] ← C[A[j]] + 1
5. for i ← 2 to k
6.     do C[i] ← C[i] + C[i-1]
cosc 336 40

A = [4, 1, 3, 4, 3, 4]   (indices 1..6), C = [1, 1, 3, 6]
7. for j ← length[A] downto 1
8.     do B[C[A[j]]] ← A[j]
9.        C[A[j]] ← C[A[j]] - 1
B = [1, 3, 3, 4, 4, 4]   ( <-1-> <- - 3 - -> <--- 4 ---> )
cosc 336 41

Counting sort
Original list A (11 records, key and name, keys 1..4):
3 Clinton, 4 Smith, 1 Xu, 2 Adams, 3 Dunn, 4 Yi, 2 Baum, 1 Fu, 3 Gold, 1 Land, 1 Lu
Final counts C: [4, 2, 3, 2]
“Offsets” (prefix sums): [4, 6, 9, 11]
Sorted list B (stable — equal keys keep their original order):
1 Xu, 1 Fu, 1 Land, 1 Lu, 2 Adams, 2 Baum, 3 Clinton, 3 Dunn, 3 Gold, 4 Smith, 4 Yi
cosc 336 42

Analysis:
• O(k + n) time
– What if k = O(n)?
• But sorting takes Ω(n lg n) ????
• Requires k + n extra storage.
• This is a stable sort: it preserves the original order of equal keys.
• Clearly no good for sorting 32-bit values.
cosc 336 43

Bucket sort
• Keys are distributed uniformly in interval [0, 1)
• The records are distributed into n buckets
• The buckets are sorted using one of the well known sorts
• Finally the buckets are combined
cosc 336 44
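The four steps above can be sketched in Python (keys assumed uniform in [0, 1); `bucket_sort` is our illustrative name):

```python
def bucket_sort(keys):
    """Bucket sort: distribute n keys from [0, 1) into n buckets,
    sort each bucket, and concatenate the buckets in order."""
    n = len(keys)
    buckets = [[] for _ in range(n)]
    for x in keys:
        buckets[int(n * x)].append(x)   # bucket i holds [i/n, (i+1)/n)
    out = []
    for b in buckets:
        out.extend(sorted(b))           # expected bucket size is 1
    return out
```

Any simple sort works inside a bucket, since under the uniformity assumption the expected bucket size is constant.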

Bucket sort
Keys: .78 .17 .39 .26 .72 .94 .21 .12 .23 .68 (n = 10 buckets 0..9; bucket i holds [i/10, (i+1)/10))
Step 1 distribute:
bucket 1: .17 .12 | bucket 2: .26 .21 .23 | bucket 3: .39 | bucket 6: .68 | bucket 7: .78 .72 | bucket 9: .94
Step 2 sort each bucket:
bucket 1: .12 .17 | bucket 2: .21 .23 .26 | bucket 3: .39 | bucket 6: .68 | bucket 7: .72 .78 | bucket 9: .94
Step 3 combine: .12 .17 .21 .23 .26 .39 .68 .72 .78 .94
cosc 336 45

Bucket Sort: Analysis
• p = 1/n: the probability that a key goes to bucket i.
• Expected size of a bucket is np = n · (1/n) = 1.
• The expected time to sort one bucket is Θ(1).
• Overall expected time is Θ(n).
cosc 336 46


Radix sort
Main idea
– Break the key into a “digit” representation:
key = i_d, i_{d-1}, …, i_2, i_1
– a “digit” can be a number in any base, a character, etc.
Radix sort:
for i = 1 to d
    sort “digit” i using a stable sort
Analysis: Θ(d × (stable sort time)), where d is the number of “digits”
cosc 336 47
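The loop above, with counting sort as the stable digit sort, can be sketched in Python for nonnegative integers (names are illustrative):

```python
def radix_sort(nums, base=10):
    """Radix sort sketch: one stable counting-sort pass per 'digit',
    least significant digit first."""
    if not nums:
        return []
    out = list(nums)
    exp = 1
    while max(nums) // exp > 0:          # one pass per digit, d passes total
        C = [0] * base
        for v in out:                    # count digit occurrences
            C[(v // exp) % base] += 1
        for i in range(1, base):         # prefix sums = offsets
            C[i] += C[i - 1]
        B = [0] * len(out)
        for v in reversed(out):          # right-to-left: stable
            d = (v // exp) % base
            B[C[d] - 1] = v
            C[d] -= 1
        out, exp = B, exp * base
    return out
```

Stability of each pass is essential: it preserves the ordering established by the earlier, less significant digits.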

Radix sort
cosc 336 48

Radix sort – with decimal digits
Input list:                 910 321 572 294 178 139 326 368
After sorting on ones:      910 321 572 294 326 178 368 139
After sorting on tens:      910 321 326 139 368 572 178 294
After sorting on hundreds,
Sorted list:                139 178 294 321 326 368 572 910
cosc 336 49

Radix sort
with an unstable digit sort
Input list: 13, 17. On the tens digit both keys are equal to 1; since the digit sort is unstable it may output 17, 13 — the list is not sorted.
cosc 336 50

Is Quicksort stable?
Records shown as Key/Data: 48/1, 55/2, 51/3
[Figure: after the partition of 0 to 2 the array becomes 48 55 51; after the partition of 1 to 2 the records are ordered by key, but records whose keys compare equal in a digit pass can change relative order.]
• Note that the data is sorted by key.
• Since the sort is unstable, it cannot be used for radix sort.
Linear Sorts 51

Is Heapsort stable?
Records shown as Key/Data: 51/1, 55/2
[Figure: the complete binary tree is a max-heap; after the root swap the array is 55 51 → 51 55, and records with equal keys can change relative order.]
• Note that the data is sorted by key.
• Since the sort is unstable, it cannot be used for radix sort.
Linear Sorts 52

Example: Sort 1 million 64-bit numbers.
We could use an in-place comparison sort, which would run in Θ(n lg n) in the average case:
lg 1,000,000 ≈ 20 passes over the data.
Instead, we can treat a 64-bit number as a 4-digit, radix-2^16 number. So d = 4, k = 2^16, n = 1,000,000:
64-bit number = d3·(2^16)^3 + d2·(2^16)^2 + d1·(2^16)^1 + d0·(2^16)^0
(each digit d3, d2, d1, d0 is 16 bits)
Radix sort then needs only 4 passes over the data.
cosc 336 53
Adapted from Cormen, Leiserson, Rivest
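The 4-pass radix-2^16 scheme can be sketched in Python, assuming the inputs are unsigned 64-bit integers (the function name and structure are ours):

```python
def radix_sort_u64(nums):
    """Sort unsigned 64-bit integers as d = 4 base-2**16 'digits':
    four stable counting-sort passes with k = 2**16 buckets each,
    versus ~20 passes for a comparison sort at n = 10**6."""
    MASK = (1 << 16) - 1
    out = list(nums)
    for pass_no in range(4):             # digits d0 .. d3
        shift = 16 * pass_no
        C = [0] * (1 << 16)
        for v in out:                    # count 16-bit digit values
            C[(v >> shift) & MASK] += 1
        for i in range(1, 1 << 16):      # prefix sums = offsets
            C[i] += C[i - 1]
        B = [0] * len(out)
        for v in reversed(out):          # right-to-left keeps it stable
            d = (v >> shift) & MASK
            B[C[d] - 1] = v
            C[d] -= 1
        out = B
    return out
```
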