
Query Evaluation 2

• It is sometimes necessary or useful to sort data as it is scanned (read)
▪ To satisfy a query with an ORDER BY clause
▪ Or because an algorithm requires sorted input
▪ Such as projection or some join algorithms
• There are a number of ways in which a sort scan can be performed
▪ Main memory sorting - but only if R fits in main memory, that is, if B(R) < M
▪ B+ tree index
▪ Multi-way mergesort
• The cost to scan a file is B(R); if B(R) < M the cost to sort a file is also B(R)
• Sorting a collection of records that fit within main memory can be performed efficiently
▪ There are a number of sorting algorithms that run in O(n log2 n) time
▪ That is, with n log2 n comparisons, e.g., Mergesort, Quicksort
• Many DB tables are too large to fit into main memory at one time
▪ So they cannot simply be read into main memory and sorted
▪ The focus in external sorting is to reduce the number of disk I/Os
▪ As it is with optimization in general

• Consider the Merge sort algorithm
▪ Input sub-arrays are repeatedly halved
▪ Until they contain only one element
• Sub-arrays are then merged into sorted sub-arrays by repeated merge operations
▪ Merging two sorted sub-arrays can be performed in O(n), for O(n log2 n) overall

    mergesort(arr, start, end)
        if (start < end)           // at least two elements
            mid = (start + end) / 2
            mergesort(arr, start, mid)
            mergesort(arr, mid+1, end)
            merge(arr, start, mid, mid+1, end)

• Convert main memory merge sort to work on disk data
• Initial step - read 2 pages of data from the file
▪ Sort them and write them to disk
▪ This results in B/2 sorted "runs" of size 2
• Merge the first two sorted runs of size 2
▪ Read the first page of each of the first two runs into input pages
▪ Merge to a single output page, and write it out when full
▪ When all records in an input page have been merged, read in the second page of that run
▪ Repeat for each pair of runs of size 2
▪ There are now B/4 sorted runs of size 4
• Repeatedly merge runs until the file is sorted
(Note: sorting only two pages at a time does not make much sense, but is included for illustration)

[Figure: a file of disk pages, three records per page; after the first sort pass it consists of B/2 sorted runs, each of two pages]
• Read in the first page of each of the first two sorted runs
▪ Leaving a third page free as an output buffer
• Records from the input pages are merged into the output buffer
• Once the output buffer is full, its contents are written out to disk, to form the first page of the first sorted run of length 4
• When all of the records from one of the input pages have been processed, the next page of that sorted run is read into the input page
• And the process continues - using only three main memory frames!

• Assume that B = 2^k
▪ After the first pass there are 2^(k-1) sorted runs, each two pages in size
▪ After the second pass there are 2^(k-2) sorted runs, of length 4
▪ After the kth pass there is one sorted run of length B
• The number of passes is therefore ⌈log2 B⌉
• In each pass all the pages of the file are read and written (B pages read and B pages written), for a total cost of ⌈log2 B⌉ * 2B
▪ Note that only 3 frames of main memory are required!
▪ Also note that main memory costs are ignored
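To make the pass structure concrete, here is a minimal Python sketch of the naive two-way external merge sort (an illustration, not from the original slides; names such as PAGE_SIZE and merge_runs are my own). The file is modelled as a list of pages of PAGE_SIZE records; a real implementation would read and write pages to disk one at a time, holding only three frames in memory.

    PAGE_SIZE = 3  # records per page, as in the slide diagrams

    def first_pass(pages):
        # Sort each pair of pages, producing B/2 sorted runs of 2 pages each
        runs = []
        for i in range(0, len(pages), 2):
            records = sorted(r for page in pages[i:i+2] for r in page)
            runs.append([records[j:j+PAGE_SIZE]
                         for j in range(0, len(records), PAGE_SIZE)])
        return runs

    def merge_runs(run1, run2):
        # Two-pointer merge of two sorted runs; a real implementation streams
        # this page by page through two input buffers and one output buffer
        a = [r for page in run1 for r in page]
        b = [r for page in run2 for r in page]
        merged, i, j = [], 0, 0
        while i < len(a) and j < len(b):
            if a[i] <= b[j]:
                merged.append(a[i]); i += 1
            else:
                merged.append(b[j]); j += 1
        merged += a[i:] + b[j:]
        return [merged[k:k+PAGE_SIZE] for k in range(0, len(merged), PAGE_SIZE)]

    def two_way_external_sort(pages):
        runs = first_pass(pages)
        while len(runs) > 1:  # each merge pass roughly halves the number of runs
            runs = [merge_runs(runs[i], runs[i+1]) if i + 1 < len(runs) else runs[i]
                    for i in range(0, len(runs), 2)]
        return runs[0] if runs else []

For example, two_way_external_sort([[11, 15, 23], [3, 64, 87], [6, 10, 14], [41, 55, 76]]) sorts the four-page file in ⌈log2 4⌉ = 2 passes over the data, matching the cost formula above.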
• The algorithm can be improved in two ways
• In the first stage of the naive process pairs of pages are read into main memory, sorted, and written out
▪ Resulting in B/2 runs of size 2
• To make effective use of the M main memory pages available, read M pages at a time and sort them
▪ After the first pass there will be B/M sorted runs, each of length M
▪ This reduces the number of subsequent merge passes

• In the merge passes, perform an (M-1)-way merge
▪ M-1 input pages, one for each of M-1 sorted runs, and
▪ 1 page for an output buffer
• The first items in each of the M-1 input partitions are compared to determine the smallest
• Each merge pass merges M-1 runs
▪ The runs were of size M after the first pass, so after the first merge pass the runs are of size (M-1)*M
▪ This results in fewer merge passes, and less disk I/O

• The initial pass produces B/M sorted runs of size M
• Each merge pass reduces the number of runs by a factor of M-1
▪ The number of merge passes is ⌈logM-1(B/M)⌉
• Each pass requires that the entire file is read and then written
▪ Total cost is therefore 2B(⌈logM-1(B/M)⌉ + 1), counting the first pass
• M is typically relatively large, so this reduction over two-way merge is considerable

For B = 1,000,000:

    M        ⌈log2 B⌉    ⌈logM-1(B/M)⌉ + 1
    3           20              20
    5           20              10
    200         20               3
    2,000       20               2

Even a large file can usually be sorted in two passes (a cost of 4B I/Os to sort and write out) assuming a reasonable size for M; for example, with M = 2,000, ⌈log1999(1,000,000/2,000)⌉ + 1 = ⌈log1999(500)⌉ + 1 = 2

• In the first pass of external mergesort B/M sorted runs of size M are produced
▪ A larger initial run size means fewer merge steps
• Replacement sort increases the initial run size
▪ To 2 * M on average
• The algorithm uses its M buffer pages as follows
▪ M-2 pages to sort the file - the current set
▪ One page for input
▪ One page for output
• First the current set is filled, then sorted
[Figure: disk pages flowing through the current set, input buffer, and output buffer in main memory]
• Once the current set is sorted, the next page of the file is read into the input buffer
• The smallest record from the current set and input buffer is put in the output buffer
• The first element of the current set is now free, and is replaced with the first record from the input buffer
• This process is repeated until the output buffer is full and all the values in the input buffer are in the current set
• When the output buffer is full
▪ The next page of the file is read into the input buffer
▪ And the output buffer is written to disk as the first page of the first sorted run
▪ As the process continues the current set is periodically re-sorted
• When a value is read that is less than the values in the output buffer, the current set must be written out
▪ The process then starts again to generate the next run
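Replacement sort is usually implemented with a min-heap in place of explicit re-sorting of the current set. The following Python sketch is a common heap-based formulation of the idea above, not the slides' exact page-level algorithm; replacement_selection and capacity are my names. It generates the initial runs from a stream of records, where capacity plays the role of the M-2 page current set.

    import heapq

    def replacement_selection(records, capacity):
        it = iter(records)
        current = []                       # the current set, kept as a min-heap
        for r in it:
            current.append(r)
            if len(current) == capacity:
                break
        heapq.heapify(current)
        run, next_run = [], []             # next_run: records too small for this run
        for r in it:
            smallest = heapq.heappop(current)
            run.append(smallest)           # stands in for the output buffer
            if r >= smallest:
                heapq.heappush(current, r) # r may still join the current run
            else:
                next_run.append(r)         # r must wait for the next run
            if not current:                # current set exhausted: the run ends
                yield run
                run, current, next_run = [], next_run, []
                heapq.heapify(current)
        run.extend(sorted(current))        # leftovers complete the final runs
        if run:
            yield run
        if next_run:
            yield sorted(next_run)

With the records 10, 43, 23, 1, 64, 87, 35, 50, 19, 41, 5, 86 and a current set of 4 records, this yields the runs [1, 10, 23, 35, 43, 50, 64, 87] and [5, 19, 41, 86]; the first run is 2 * 4 records long, illustrating the doubled average run size.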
• In practice it may be more efficient to make the input and output buffers larger than one page
▪ This reduces the number of runs that can be merged at one time, so may increase the number of passes required
▪ But it allows a sequence of pages to be read or written to the buffers, decreasing the actual access time per page
• We have also ignored CPU costs
▪ If double buffering is used, the CPU can process one part of a run while the next is being loaded into main memory
▪ Double buffering also reduces the amount of main memory available for the sort

• Primary B+ tree index
▪ The index can be used to find the first page, but
▪ Note that the file is already sorted!
• Secondary B+ tree index
▪ Leaves point to data records that are not in sort order
▪ In the worst case, each data entry could point to a different page from its adjacent entries
▪ Retrieving the records in order requires reading all of the index leaf pages, plus one disk read for each record!
▪ In practice an external sort is likely to be much more efficient than using a secondary index
• Secondary indices are not useful for retrieving large selections
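That last claim is easy to quantify with the cost formulas above. A rough I/O count in Python; the concrete numbers (100 records per page, 200,000 index leaf pages) are illustrative assumptions, not from the slides:

    from math import ceil, log

    def external_sort_io(B, M):
        # one initial sort pass plus ceil(logM-1(B/M)) merge passes, 2B I/Os each
        merge_passes = max(0, ceil(log(B / M, M - 1)))
        return 2 * B * (1 + merge_passes)

    def secondary_index_io(leaf_pages, num_records):
        # worst case from the slide: every leaf page, then one page read per record
        return leaf_pages + num_records

    # B = 1,000,000 pages, 100 records per page, M = 2,000 buffer pages
    print(external_sort_io(1_000_000, 2_000))        # 4,000,000 I/Os (two passes)
    print(secondary_index_io(200_000, 100_000_000))  # 100,200,000 I/Os, worst case

Even under these generous assumptions the secondary index costs well over an order of magnitude more I/Os than sorting the file externally, which is why secondary indices are not useful for retrieving large selections.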