CS计算机代考程序代写 data structure algorithm 10_Elementary_Sorts

10_Elementary_Sorts

Lecture 10
Elementary Sorting Methods

EECS 281: Data Structures & Algorithms

Sorting Overview

Data Structures & Algorithms

Computational Task & Solutions
Sort records in a sequence* by keys,

with respect to an operator/functor
bool operator<(const T&, const T&) const Elementary sorts • Bubble Sort • Selection Sort • Insertion Sort • Counting sort (int, few different keys) High-Performance Sorts • Quicksort • Merge Sort • Heapsort • Radix Sort (int, char*, …) 3*Sort keys in ascending order, unless directed otherwise Elementary ≠ Useless Elementary algorithms for sorting… • Illustrate how to approach a problem • Illustrate how to compare multiple solutions • Illustrate program optimization • Are sometimes “good enough” • Often combined with sophisticated sorts 4 Which Containers Can Be Sorted? Array vs. Linked List Data Representation • Both will be considered, begin with arrays • Some basic tasks are better suited for arrays • Some basic tasks can be adapted to linked lists 5 Accessing Containers • STL sorting is done using iterators #include

vector a{4, 2, 5, 1, 3};

std::sort(begin(a), end(a)); // sorts [left, right)

• In this lecture, we use indices and arrays
int a[] = {4, 2, 5, 1, 3};

bubble(a, 0, 5); // sorts [left, right)

• Exercise at home: rewrite using iterators

6

How Size Affects Sorting
Internal Sort: “I can see all elements”
• Sequence to be sorted fits into memory
• O(1)-time random access is available
Indirect Sort: “I can see all indices”
• Reorder indices of items rather than items

(important when copying is expensive)
External Sort: “Too many elements”
• Items to be sorted are on disk
• Items are accessed sequentially or in blocks

7

Exercise
• Write swap()

– Swap items A and B
• Return type?
• Write a templated version

8

Desirable Qualities of Algorithms
• Asymptotic complexity, number of

comparisons (and/or swaps)
– Worst-case
– Average-case

• Memory efficiency considerations
– Sort in place with O(1) extra memory
– Sort in place with recursion (consider stack)
– Sort in place, but have pointers or indices (n items

need an additional n ptrs or indices)
– Some sorts need W(n) extra memory

10

Desirable Qualities of Algorithms
• Stability

– Definition: preservation of relative order
of items with duplicate keys

– Elementary sorts tend to be stable (not all)
– Complex sorts are often not stable
– Important for sorting on two or more keys

(sort alphabetically, then by SSN)

11

Sorting Algorithms: Stability
A sorting algorithm is stable if data with the same
key remains in the same relative locations

Example Input: (already sorted by first name)
{“Sheen, Charlie”, “Berry, Halle”, “Liu, Lucy”, “Sheen, Martin”}

Sort by Last Name

Example Stable Output: (two “Sheen” stay in relative locations)
{“Berry, Halle”, “Liu, Lucy”, “Sheen, Charlie”, “Sheen, Martin”}

Example Unstable Output:
{“Berry, Halle”, “Liu, Lucy”, “Sheen, Martin”, “Sheen, Charlie”}

12

Types of Algorithms
• Non-Adaptive Sort

– Sequence of operations performed is independent of
order of data

– Usually simpler to implement

• Adaptive Sort
– Performs different sequences of operations

depending on outcomes of comparisons

Worst-case versus best-case complexity

13

Sorting Overview

Data Structures & Algorithms

Bubble Sort

Data Structures & Algorithms

Bubble Sort
1 void bubble(Item a[], size_t left, size_t right) {

2 for (size_t i = left; i < right - 1; ++i) 3 for (size_t j = right - 1; j > i; –j)

4 if (a[j] < a[j - 1]) 5 swap(a[j - 1], a[j]); 6 } // bubble() • Find minimum item on first pass by comparing adjacent items, and swap item all the way to left • Find second smallest item on second pass • Repeat until sorted • Non-adaptive 16 Example input: {4, 2, 5, 1, 3} pass1: {1, 4, 2, 5, 3} // Smallest item first pass2: {1, 2, 4, 3, 5} // …second smallest pass3: {1, 2, 3, 4, 5} // … pass4: {1, 2, 3, 4, 5} // No change, why pass? 17 1 void bubble(Item a[], size_t left, size_t right) { 2 for (size_t i = left; i < right - 1; ++i) 3 for (size_t j = right - 1; j > i; –j)

4 if (a[j] < a[j - 1]) 5 swap(a[j - 1], a[j]); 6 } // bubble() Bubble Sort: Pop Quiz • Currently non-adaptive • How can bubble sort be fine-tuned to minimize work? i.e., make it adaptive. • Hint: think about what happens if we bubble sort {1, 2, 3, 4, 5} 18 Adaptive Bubble Sort 1 void bubble(Item a[], size_t left, size_t right) { 2 for (size_t i = left; i < right - 1; ++i) { 3 bool swapped = false; 4 for (size_t j = right - 1; j > i; –j) {

5 if (a[j] < a[j - 1]) { 6 swapped = true; 7 swap(a[j - 1], a[j]); 8 } // if 9 } // for j 10 if (!swapped) 11 break; 12 } // for i 13 } // bubble() 19 Bubble Sort Analysis • Non-adaptive version – ~n2/2 comparisons and swaps in worst and average cases – ~n2/2 comparisons, 0 swaps in best case • Adaptive version – ~n2/2 comparisons and swaps in worst and average cases – ~n comparisons in best case – ~n swaps (as few as 0) in best case 20 Bubble Sort • Advantages – Simple to implement and understand – Completes some ‘pre-sorting’ while searching for smallest key (adjacent pairs get swapped) – Stable – Adaptive version may finish quickly if the input array is almost sorted • Disadvantages – O(n2) time • n2/2 comparisons and n2/2 swaps – Slower than high-performance sorts 21 Bubble Sort Data Structures & Algorithms Selection Sort Data Structures & Algorithms Selection Sort 1 void selection(Item a[], size_t left, size_t right) { 2 for (size_t i = left; i < right - 1; ++i) { 3 size_t minIndex = i; 4 for (size_t j = i + 1; j < right; ++j) 5 if (a[j] < a[minIndex]) 6 minIndex = j; 7 swap(a[i], a[minIndex]); 8 } // for 9 } // selection() • Find smallest element in array, swap with first position • Find second smallest element in array, swap with second position • Repeat until sorted • Non-adaptive 24 Example input: {4, 2, 5, 1, 3} pass1: {1, 2, 5, 4, 3} pass2: {1, 2, 5, 4, 3} pass3: {1, 2, 3, 4, 5} pass4: {1, 2, 3, 4, 5} 25 1 void selection(Item a[], size_t left, size_t right) { 2 for (size_t i = left; i < right - 1; ++i) { 3 size_t minIndex = i; 4 for (size_t j = i + 1; j < right; ++j) 5 if (a[j] < a[minIndex]) 6 minIndex = j; 7 swap(a[i], a[minIndex]); 8 } // for 9 } // selection() Adaptive Selection Sort • Extra comparison checks if item is already in position • Eliminates unnecessary (costly) swaps 1 void selection(Item a[], size_t left, size_t right) { 2 for (size_t i = left; i < right - 1; ++i) { 3 size_t minIndex = i; 4 for (size_t j = i + 1; j < right; ++j) 5 if (a[j] < a[minIndex]) 6 minIndex = j; 7 if (minIndex != i) 8 swap(a[i], a[minIndex]); 9 } // for 10 } // selection() 26 Selection Sort Analysis • Non-adaptive version – ~n2/2 comparisons – n - 1 swaps in best, average, worst case – Run time is insensitive to input • Adaptive version – (n2 - n)/2 + (n - 1) comparisons – n - 1 swaps worst case – 0 swaps best case – Run time is sensitive to input 27 Selection Sort Advantages • Minimal copying of items – Good choice when objects are large and copying is expensive • Fairly efficient for small n 28 Selection Sort Disadvantages • Q(n2) time • Run time only slightly dependent upon pre-order – The smallest key on one pass tells nothing about the smallest key on subsequent passes – Runs about the same on • Already sorted array • Array with all keys equal • Randomly arranged array • Sort is not stable 29 Selection Sort Data Structures & Algorithms Summary/Preview • Sorting algorithms useful to – Satisfy direct requests to sort something – Do so quickly • Two sorts (Bubble and Selection) – Asymptotically “slow” (O(n2)) – Minimal opportunity for tuning • Next: Insertion Sort – Also asymptotically “slow” (O(n2)) – More opportunities for tuning 31 Insertion Sort Data Structures & Algorithms Insertion Sort 1 void insertion(Item a[], size_t left, size_t right) { 2 for (size_t i = left + 1; i < right; ++i) 3 for (size_t j = i; j > left; –j)

4 if (a[j] < a[j - 1]) 5 swap(a[j - 1], a[j]); 6 } // insertion() • Divide items into two groups: sorted and unsorted • Move items one at a time from unsorted to sorted, inserting each into its proper place • Sorted group grows, unsorted group shrinks • Repeat until sorted • Non-adaptive 33 Insertion Sort Example input: {4, 2, 5, 1, 3} pass1: {2, 4, 5, 1, 3} pass2: {2, 4, 5, 1, 3} pass3: {1, 2, 4, 5, 3} pass4: {1, 2, 3, 4, 5} 34 1 void insertion(Item a[], size_t left, size_t right) { 2 for (size_t i = left + 1; i < right; ++i) 3 for (size_t j = i; j > left; –j)

4 if (a[j] < a[j - 1]) 5 swap(a[j - 1], a[j]); 6 } // insertion() Insertion Sort Analysis • Non-adaptive version – ~n2/2 comparisons in worst case – ~n2/2 swaps worst case – ~n2/4 swaps average case – Run time only slightly sensitive to input (Comparisons always the same, but swaps can change) 35 Improving Insertion Sort • General purpose elementary sorts involve comparisons and swaps • Best method of improving sort efficiency is to use a more efficient algorithm • Next best method is tighten the inner loop (comparison and swap) 36 Insertion Sort: First Improvement 1 void insertion(Item a[], size_t left, size_t right) { 2 for (size_t i = left + 1; i < right; ++i) { 3 Item v = a[i]; size_t j = i; 4 for (size_t j = i; j > left; –j) {
5 if (a[j] < [j - 1]) 6 swap(a[j - 1], a[j]); 7 if (v < a[j - 1]) 8 a[j] = a[j - 1]; 9 else 10 break; 11 } // for j 12 a[j] = v; 13 } // for i 14 } // insertion() Move vs. Swap 37 1 void insertion(Item a[], size_t left, size_t right) { 2 for (size_t i = left + 1; i < right; ++i) { 3 Item v = a[i]; size_t j = i; 4 while ((j > left) && (v < a[j - 1])) { 5 if (v < a[j - 1]) a[j] = a[j - 1]; 6 --j; 7 } // while 8 a[j] = v; 9 } // for i 10 } // insertion() Insertion Sort: Second Improvement Replace for with while Remove the break 38 Insertion Sort So Far • The j > left test is often performed, but seldom false
• Eliminate the need for frequent testing with a sentinel

1 void insertion(Item a[], size_t left, size_t right) {
2 for (size_t i = left + 1; i < right; ++i) { 3 Item v = a[i]; size_t j = i; 4 while ((j > left) && (v < a[j - 1])) { 5 a[j] = a[j - 1]; 6 --j; 7 } // while 8 a[j] = v; 9 } // for i 10 } // insertion() 39 Insertion Sort: Third Improvement 1 void insertion(Item a[], size_t left, size_t right) { 2 for (size_t i = right - 1; i > left; –i) // find min item

3 if (a[i] < a[i - 1]) // put in a[left] 4 swap(a[i - 1], a[i]); // as sentinel 5 6 for (size_t i = left + 2; i < right; ++i) { 7 Item v = a[i]; size_t j = i; // v is new item 8 while ((j > left) && v < a[j - 1]) { // v in wrong spot 9 a[j] = a[j - 1]; // copy/overwrite 10 --j; 11 } // while 12 a[j] = v; 13 } // for i 14 } // insertion() j > left is often performed,
but seldom false

Eliminate frequent tests with
a sentinel

40

Insertion Sort: Adaptive/Example
1 void insertion(Item a[], size_t left, size_t right) {

2 for (size_t i = right – 1; i > left; –i) // find min item

3 if (a[i] < a[i - 1]) // put in a[left] 4 swap(a[i - 1], a[i]); // as sentinel 5 6 for (size_t i = left + 2; i < right; ++i) { 7 Item v = a[i]; size_t j = i; // v is new item 8 while (v < a[j - 1]) { // v in wrong spot 9 a[j] = a[j - 1]; // copy/overwrite 10 --j; 11 } // while 12 a[j] = v; 13 } // for i 14 } // insertion() a = { 4 2 5 1 3 } input a = { 1 4 2 5 3 } sentinel a = { 1 2 4 5 3 } pass 1 a = { 1 2 4 5 3 } pass 2 a = { 1 2 3 4 5 } pass 3 41 Adaptive Insertion Sort • Comparisons – ~n2/2 in worst case – ~n2/4 in average case – ~n in best case • Copies – ~n2/2 in worst case – ~n2/4 in average case – ~n in best case • Run time sensitive to input 42 Insertion Sort Comparison Advantages of Adaptive Insertion Sort • More efficient data manipulation – Use move instead, 1/3 of the work of swap • Fewer comparisons/copies – Used only when key is smaller than key of item being inserted • Find sentinel key – Sentinel key is a smallest key in range – Adds overhead of explicit first pass to find sentinel 43 Insertion Sort Analysis • Advantages – Run time depends upon initial order of keys in input – Stable – Algorithm is tunable – Best sort for small values of n • Disadvantages – Still 𝛩 𝑛! 44 Insertion Sort Data Structures & Algorithms Sorting Performance Data Structures & Algorithms Performance Characteristics • Run time proportional to – Number of comparisons – Number of swaps – Sometimes both comparisons and swaps • Elementary sorts are all quadratic time (O(n2)) worst and average – Assuming inputs are uniformly distributed – Better average on pre-sorted inputs 47 Performance Analysis Inversion: A pair of keys that are out of order For each item, can determine the number of inversions (number of items to left that are greater) For each input, can determine total number of inversions • Gives sense of “presorted-ness” • Some sorts work better on presorted data 48 Performance Analysis Property: Insertion and Bubble sort use linear number of comparisons and swaps for data with a constant upper limit to the number of inversions for each element (i.e. almost sorted: elements are not very far out of place) Interpretation: Insertion sort and bubble sort are efficient on partially sorted data when each item is close to its “final” position 49 Performance Analysis Property: Insertion sort uses a linear number of comparisons and swaps for data with a constant number of elements having more than a constant number of inversions (i.e. a small number of elements are very far from their final location) Interpretation: Insertion sort is efficient on sorted data when a new item is added to the sorted data 50 Performance Analysis Property: Selection sort runs in linear time for data with large items and small keys (i.e. sort 1MB/person medical records in order by a single integer: their social security number) Interpretation: Selection sort is expensive in terms of comparisons, but cheap in terms of swaps; thus it is O(n) in the number of swaps 51 Summary: Elementary Sorts • O(n2)-time, in-place sorts – Bubble Sort – Selection Sort – Insertion Sort • Non-adaptive • Adaptive • Counting sort is O(n)-time, but not in-place • Some sorts execute more quickly in special cases Highly recommended: http://www.sorting-algorithms.com/ Click on one to watch it go! 52 http://www.sorting-algorithms.com/ Sorting Performance Data Structures & Algorithms Counting Sort Data Structures & Algorithms Counting Sort For sorting a limited number of different keys Example: D F B G A F C B F D D F G • First pass: count the number of items that match each key (don’t copy data, just count it) Example: 1 2 1 3 0 4 2 • Second pass: compute offset where each key would start in sorted order Example: 0 1 3 4 7 7 11 (last used slot: 12) • Third pass: copy records into output container, grouped by key 55 D F B G A F C B F D D F GD F B G A F C B F D D F G Counting Sort 56 0 A B B C D D D F F F F G G 0 0 0 0 0 01 1 1 1 1 12 2 2 23 34 1) Count items and sum into buckets 2) Calculate offsets from counts 3) Copy from input to output When Counting Sort Can Be Used • ONLY usable when the number of keys is limited (and small) – For example: • Number of possible birthdays annually = 366 • Number of possible class standings = 4 • INT_MAX is not “limited” – Changes with architecture – For 64 bit computers, 264 8-byte integers = 144 quintillion bytes = 160 million 10TB hard drives 57 Example Code 1 void counting_sort(vector &students) {
2 vector work_copy(students.size());
3 vector counts(MAX);
4
5 for (auto &s : students) // Pass 1
6 ++counts[s.classStanding()];
7 for (size_t i = 1; i < MAX; ++i) // Pass 2 8 counts[i] += counts[i - 1]; 9 for (auto &s : students) { // Pass 3 10 work_copy[counts[s.classStanding()] - 1] = s; 11 --counts[s.classStanding()]; 12 } // for 13 swap(students, work_copy); 14 } // counting_sort() 58 Counting Sort Analysis • A “distribution sort”, where items are not compared to each other • Time complexity is linear in the number of items and buckets 𝑂 𝑛 + 𝑘 • Space complexity is 𝑂(𝑛 + 𝑘) • Stable • Similar to and often used in Radix Sort 59 Counting Sort Data Structures & Algorithms Using std::sort() Data Structures & Algorithms std::sort() Parameters • Expects: – Two iterators [begin, end) – A natural sort order (contained data type supports operator <) • If no natural sort order, use the overloaded version with three parameters: – Two iterators [begin, end) – A functor 62 std::sort() Functor • The function object determines sort order • Assume parameters are a, b – For ascending sort, return true when a < b – For descending sort, use the same functor but reverse the arguments, b < a 63 Using std::sort() Data Structures & Algorithms