
Sorting

Dr Timothy Kimber

January 2018

The Sorting Problem

Sorting data is one of the most thoroughly explored computing
problems.

Problem (Sort)

Input: a sequence A of values 〈a1, a2, . . . , aN〉
Output: a permutation (reordering) 〈a′1, a′2, . . . , a′N〉 of A
such that a′1 ≤ a′2 ≤ · · · ≤ a′N
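For a concrete instance of this specification, using Python's built-in sorted as a stand-in for any correct sorting algorithm:

    A = [5, 2, 4, 6, 1, 3]
    result = sorted(A)                    # 〈1, 2, 3, 4, 5, 6〉
    assert result == [1, 2, 3, 4, 5, 6]   # non-decreasing
    assert sorted(result) == sorted(A)    # a permutation of A: same elements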

Sorting is an important problem. It is part of the solution to many
other problems.

Understanding the complexity of sorting algorithms helps design good
solutions to these other problems.

Incremental Sorting

The incremental approach of Simple Search can be applied to sorting.

Proceed from left

Gradually grow a sorted region (note loop invariant)

EXERCISE

Invent an incremental sorting algorithm.


There are two options:

1. Add the next element a[i] to the sorted region

2. Add the lowest element outside the sorted region

Option 1 leads to the Insertion Sort algorithm

Insertion Sort

Insertion Sort divides a into a sorted part, initially just a[0], and the
remaining unsorted part

Elements from the unsorted part are then inserted into the sorted part


Insertion Sort(Input: sequence a = [a0, . . . , aN−1])

For i = 1 to N-1 // initialise invariant: a[0..i-1] is sorted

    next = a[i] // save. a[i] overwritten later

    j = i

    While j > 0 and next < a[j-1] // use invariant: a[0..i-1] is sorted
        a[j] = a[j-1]
        j = j - 1
    EndWhile
    a[j] = next
EndFor

The sorted region can be initialised to contain a[0]

Do not need to compare next with all of a[0,..,i-1] (it is sorted)
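A direct Python rendering of this pseudocode, as a reference sketch (0-indexed, sorts in place; the function name is mine, not from the slides):

    def insertion_sort(a):
        """Sort the list a in place: a[0..i-1] is sorted at the start of each pass."""
        for i in range(1, len(a)):
            next_val = a[i]              # save: a[i] may be overwritten below
            j = i
            while j > 0 and next_val < a[j - 1]:
                a[j] = a[j - 1]          # shift larger elements one place right
                j -= 1
            a[j] = next_val              # insert into its correct slot
        return a

    assert insertion_sort([5, 2, 4, 6, 1, 3]) == [1, 2, 3, 4, 5, 6]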
Time Complexity

What is the worst case input? What is the best case input?

What is the time complexity in the best and worst cases?

Worst Case

Running time of Insertion Sort has two dimensions:

Number of insertions

'Size' of each insertion

Informally: both dimensions are Θ(N), so T(N) = Θ(N^2).

More formally, the total number of iterations of the inner loop is

t(N) = 1 + 2 + · · · + (N − 1) = Σ_{i=1}^{N−1} i = ((N − 1 + 1) × (N − 1)) / 2 = (N^2 − N) / 2

So the worst case time complexity is aN^2 + bN + c = Θ(N^2).

Best Case

In the best case t(N) = N − 1, so T(N) = aN + b = Θ(N).

Other Properties

Insertion Sort works in place (space complexity is Θ(1))

Assuming randomised input, the average case is T(N) = Θ(N^2)

For any input, T(N) = O(N^2)

Divide and Conquer

Will a divide and conquer approach work?

Divide into subproblems

Solve the subproblems

Combine into an overall solution

EXERCISE

Design a combining algorithm.

Combining Sorted Sequences

Merge (Input: array a, indices l, m and r, where r > m ≥ l)

left = a[l,...,m-1]
right = a[m,...,r-1]
i = j = 0, k = l
while k < r
    if i >= (m-l) or (j < (r-m) and right[j] < left[i]) // take from right
        a[k] = right[j]
        j = j + 1
    else
        a[k] = left[i]
        i = i + 1
    end
    k = k + 1
end

The procedure takes Θ(N) time for N total elements.
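A Python version of Merge, together with the obvious recursive driver that the next slides analyse (a sketch; the function names are mine, and the copies of left and right mirror the pseudocode above):

    def merge(a, l, m, r):
        """Merge the sorted runs a[l:m] and a[m:r] back into a."""
        left, right = a[l:m], a[m:r]
        i = j = 0
        k = l
        while k < r:
            # take from right if left is exhausted, or right's head is smaller
            if i >= len(left) or (j < len(right) and right[j] < left[i]):
                a[k] = right[j]
                j += 1
            else:
                a[k] = left[i]
                i += 1
            k += 1

    def merge_sort(a, l=0, r=None):
        """Sort a[l:r] by halving, sorting recursively, and merging."""
        if r is None:
            r = len(a)
        if r - l < 2:          # 0 or 1 elements: already sorted
            return
        m = (l + r) // 2
        merge_sort(a, l, m)
        merge_sort(a, m, r)
        merge(a, l, m, r)

    data = [5, 2, 4, 6, 1, 3]
    merge_sort(data)
    assert data == [1, 2, 3, 4, 5, 6]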

Predicted Performance

Time to combine subproblem solutions is Θ(N)

EXERCISE

What is the worst case time complexity of the divide and conquer algorithm?

Write the recurrence (assume N = 2^a, so no floors)

Solve it using the master theorem

Time Complexity

The proposed algorithm divides the problem (in constant time) into 2 subproblems of size N/2, solves both, and combines the solutions in Θ(N) time, so the time complexity is:

T(N) = Θ(1), if N = 1
T(N) = 2T(N/2) + Θ(N), if N > 1

Here a = 2 and b = 2, so N^(log_b a) = N^(log_2 2) = N^1 = N, and therefore

f(N) = Θ(N) = Θ(N^(log_b a) × (log_2 N)^0)

and Case 2 of the master theorem, with k = 0, applies:

T(N) = Θ(N^(log_b a) × (log_2 N)^1) = Θ(N log_2 N)
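The algebra can be sanity-checked numerically (a sketch, taking both Θ terms in the recurrence to be exactly N and 1):

    def T(n):
        """T(1) = 1; T(n) = 2*T(n/2) + n, for n a power of two."""
        return 1 if n == 1 else 2 * T(n // 2) + n

    for a in (4, 8, 12, 16):
        n = 2 ** a
        assert T(n) == n * a + n   # i.e. T(N) = N*log_2(N) + N, which is Θ(N log N)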

The divide and conquer algorithm is faster than Insertion Sort. Surprised?
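As a rough empirical check of this claim (a sketch reusing the insertion_sort and merge_sort functions defined above; absolute timings vary by machine):

    import random
    import timeit

    data = [random.random() for _ in range(10_000)]
    t_ins = timeit.timeit(lambda: insertion_sort(data[:]), number=1)
    t_mrg = timeit.timeit(lambda: merge_sort(data[:]), number=1)
    print(f"insertion sort: {t_ins:.3f}s  merge sort: {t_mrg:.3f}s")
    # Expect merge sort to win comfortably at this size; on already sorted
    # input the comparison flips, as the Merge Sort properties below note.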


Alternative informal view of time complexity: recursion tree

Each level of the tree contributes cN

There are log_2 N + 1 levels, so the total work is cN(log_2 N + 1) = Θ(N log_2 N)

MergeSort

You have invented Mergesort

MergeSort (Input: array a, index l, index r)

if r - l < 2
    return
m = (l + r) / 2
MergeSort(a, l, m)
MergeSort(a, m, r)
Merge(a, l, m, r)
return

The sorting appears to happen in place, but the list is copied during Merge

What is the best case?

Properties of Merge Sort

Space complexity is Θ(N)

Time complexity is Θ(N log_2 N)

Faster than Insertion Sort for large, unsorted lists

Slower than Insertion Sort if the list is already sorted

Slower than Insertion Sort for small N

Alternative Divide and Conquer

Merge Sort divides the data in half and sorts the halves

Alternative approach: do a quick (rough) sort into two groups

a_p is called the pivot, and the division ensures that ∀a ∈ A_low (a < a_p) and ∀a ∈ A_high (a ≥ a_p)

This leaves the subproblems of sorting A_low and A_high

No combining is needed

Quicksort

This procedure sorts the array a[l, ..., r − 1]

Quicksort (Input: array a, index l, index r)

if r < l + 2 // 0 or 1 elements to sort
    return
p = Partition(a, l, r)
Quicksort(a, l, p)
Quicksort(a, p + 1, r)
return

The Quicksort divide step is called partitioning

The Partition procedure must return the final index of the pivot

The base case must work for an empty array

Suggested Partition Design

Use the last element as the pivot

Maintain two subarrays which grow:

Elements before j must be less than the pivot

Elements j,...,i-1 must be equal to or greater than the pivot

Elements i,... are unseen so far

Tutorial Exercise

Write the Partition procedure.

Lomuto Partitioning

[Diagram slide in the original deck.]

Partition

This procedure partitions the array a[l, ..., r − 1]

Partition (Input: array a, index l, index r)

i = j = l // both partitions are empty
p = r - 1
while i < p
    if a[i] < a[p]
        swap(a, i, j)
        j = j + 1
    i = i + 1
swap(a, p, j) // move the pivot into its final position
return j

The time complexity is Θ(N) (where N = r − l)

Quicksort Performance

Question: What is the worst case time complexity of Quicksort, and why?

Quicksort Worst Case

The given partition procedure:

will only remove one element from sorted or reverse-sorted data

will only remove one element from data with many duplicates

This leads to incremental execution resembling Insertion Sort:

N levels of recursion

N − i elements to partition at level i

So the worst case time complexity is Θ(N^2)

Quicksort Best Case

Fewest levels of recursion when the partitioning is balanced

Subproblems are no larger than N/2

Same bound as Mergesort: Θ(N log_2 N)

As the recursion tree suggests, the constants are smaller than Mergesort's

With rather unbalanced partitioning, performance is still O(N log_2 N)

Any constant dividing factor will introduce a log N term, e.g. 9 to 1 (Cormen p. 176)
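Pulling the preceding pseudocode together, here is a runnable Python sketch of Quicksort with Lomuto partitioning (function names are mine; index conventions as above):

    def partition(a, l, r):
        """Lomuto partition of a[l:r] around pivot a[r-1]; returns the pivot's final index."""
        i = j = l                     # a[l:j] < pivot, a[j:i] >= pivot
        p = r - 1
        while i < p:
            if a[i] < a[p]:
                a[i], a[j] = a[j], a[i]
                j += 1
            i += 1
        a[p], a[j] = a[j], a[p]       # move the pivot into its final position
        return j

    def quicksort(a, l=0, r=None):
        """Sort a[l:r] in place."""
        if r is None:
            r = len(a)
        if r < l + 2:                 # 0 or 1 elements to sort
            return
        p = partition(a, l, r)
        quicksort(a, l, p)
        quicksort(a, p + 1, r)

    data = [5, 2, 4, 6, 1, 3]
    quicksort(data)
    assert data == [1, 2, 3, 4, 5, 6]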
Randomised Quicksort

If the partition procedure is altered to choose the pivot at random, then the worst case behaviour becomes a matter of chance for all inputs.

Partition (Input: array a, index l, index r)

x = random(l, r) // random integer in [l, r-1]
swap(a, x, r - 1)
... as before

Assuming N distinct values:

The probability of choosing the worst pivot in every call is 1/N!

This becomes vanishingly small as N increases

Randomised Quicksort is the algorithm of choice if N is more than ∼ 10

Expected Performance

It is possible to give an average case time complexity for Randomised Quicksort:

Assume N distinct values again

Randomisation means all inputs are equally likely

Time depends on the number of comparisons performed

The probability of comparing a[i] with a[j] is determined by their rank by value (see books for the full explanation)

The average number of comparisons is the sum of these probabilities over all i, j

The average case complexity is Θ(N log_2 N)

This is called the expected running time of a randomised algorithm

Partition Variations

There are many ways to implement partitioning, e.g.

Hoare partitioning

Partitions grow inwards from the two ends

Handles duplicates better

Three-way partitioning

Includes a region for values equal to the pivot

Handles duplicates better

Median of 3

Choose the pivot as the median of three random elements

Better balance between subproblems
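A minimal sketch of the randomised pivot choice described above (assuming the partition and quicksort functions from the earlier sketch are in scope):

    import random

    def randomized_partition(a, l, r):
        """Swap a random element into the pivot slot, then partition as before."""
        x = random.randrange(l, r)        # random index in [l, r-1]
        a[x], a[r - 1] = a[r - 1], a[x]
        return partition(a, l, r)         # Lomuto partition from the sketch above

Substituting randomized_partition for partition inside quicksort makes the Θ(N^2) worst case a matter of chance rather than a property of particular inputs such as sorted data.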