程序代写代做代考 algorithm data structure CS 112 – Data Structures

CS 112 – Data Structures
Sesh Venugopal
Sequential Search Average Case

return false;
}
Sesh Venugopal
Sequential Search on a list: Array or Linked List
// on an array of integers
for (int i=0; i < arr.length; i++) { if (target == arr[i]) { return true; // on a linked list, Node has int data for (Node ptr=front; ptr != null; ptr=ptr.next) { if (target == ptr.data) { return true; return false; Big O running time is the same for either version: Success: Worst case: O(n), Best case: O(1) Failure: O(n) CS 112: Sequential Search Average Case 2 } }} Sequential Search on a list: Average Count for Success Basic Operation: Comparing target against list (array or linked list) item What is the average number of comparisons for success? A quick calculation would be to average out the best case (1 comparison) and the worst case (n comparisons), so (n+1)/2 Sesh Venugopal CS 112: Sequential Search Average Case 3 Sequential Search on a list: Average Count for Success A more general approach to compute average, like you would do to find average of a bunch of n numbers, is to add up the numbers and and divide by n (e.g. your average score on 3 exams) In the case of sequential search, the numbers to be averaged are the comparisons for matching at various positions. If the elements of the list have positions numbered 1 thru n, and it takes Ci comparisons to match at the i-th position then the average number of comparisons over all items is: Sesh Venugopal CS 112: Sequential Search Average Case 4 Sequential Search on a list: Average Count for Success It takes 1 comparison to match against the 1st item (C1), 2 comparisons to match against the second item (C2), etc. So the average number of comparisons for success is: (1+2+3+...+n)/n = n*(n+1)/2/n = (n+1)/2 Sesh Venugopal CS 112: Sequential Search Average Case 5 Average Count for Success: The Assumption of Equal Likelihood This formula we are using for the average assumes that all matches are equally likely. For instance, if you have a list of 10 items, and 100 matches are made on the list, then under this assumption, each item would be matched 10 times (i.e. same number of matches for all items) So we can rethink the formula as a weighted average: C1*1/n+C2*1/n+...+Cn*1/n=1/n*(C1 +C2 +...+Cn) Each item’s comparison count for match is weighted with its likelihood or probability of being matched – which is 1/n since all items are equally likely to be matched Sesh Venugopal CS 112: Sequential Search Average Case 6 Average Count for Success: General Formula, Random Likelihoods In practice, it is absurd to believe that all items will be equally likely to be matched on. More generally, there will be biases toward a few items—they will be a lot more likely to be matched on than others. (Think of Google search trends, with its day-to-day fluctuations.) So, in practice, we will need to assign a different probability of match for each item, say Pi for the i-th item . Which generalizes the weighted comparisons formula for average to: C1*P1 + C2*P2 + ... + Cn*Pn Each item’s comparison count for match is weighted with its likelihood or probability of being matched – but the probabilities are all different, in general (of course the sum of all probabilities equals 1) Sesh Venugopal CS 112: Sequential Search Average Case 7 Average Count for Success: General Formula, Random Likelihoods Here’s an example: Comparisons Probabilities for match Using the weighted average formula gives us the following for average number of comparisons: Note that the value 3 (1/7) is twice as likely to matched as 12 (1/14), and 25 (4/7) is 4 times as likely to be matched as 3. Sesh Venugopal CS 112: Sequential Search Average Case 8 Rearranging items based on search pattern Comparisons 1234567 3 12 86 17 15 25 6 Probabilities for match 1111141 -- -- -- -- -- -- -- 7 142114217 21 Since the value 25 is 4 times as likely to be matched on as value 3, shouldn’t 25 be all the way up front so that the search time (number of comparisons) is reduced on average? And similarly for all other items: the more likely they are to be matched on, the more toward the front they should be, to reduce the number of comparisons to get to match on them Rearranging the items in decreasing order of their match probabilities results in the minimum average number of comparisons. Sesh Venugopal CS 112: Sequential Search Average Case 9 Rearranging items based on search pattern Rearrange in descending probabilities Comparisons 1234567 1234567 25 3 12 17 15 86 6 3 12 86 17 15 25 6 Probabilities for match 1111141 -- -- -- -- -- -- -- 7 142114217 21 4111111 -- -- -- -- -- -- -- 7 7 1414212121 Average number of comparisons for success AFTER rearrangement: 1*4/7 + 2*1/7 + 3*1/14 + 4*1/14 + 5*1/21 + 6*1/21 + 7*1/21 = 2.2 The average after rearrangement is less than half the original average of 4.7 Sesh Venugopal CS 112: Sequential Search Average Case 10 Adaptive Shuffling Rearranging ALL items in one shot, knowing the spread of probabilities of matches, is only possible after the spread has attained long-term stability (will change very little over time). But this kind of long-term stability is not very likely to occur in practice – there is usually lots of fluctuation in short periods of time. There is an alternative: shuffle ”adaptively” by moving an item, as soon as it is matched, toward the front (the expectation is that it is quite likely to be matched again very soon) – moving toward front will reduce comparisons for match There are two possible kinds of adaptive moves. One option is to move the just-matched item one spot toward the front. Swap 15 with 17 3 12 86 17 15 25 6 3 12 86 15 17 25 6 Sesh Venugopal CS 112: Sequential Search Average Case 11 Adaptive Shuffling Another, more aggressive option, is to move the just-matched item all the way to the first spot Swap 15 with 3 But it would now push the previous 1st item (e.g. 3) backward, thereby undoing (perhaps drastically) the previous move that brought that item (3) to the 1st spot. 3 12 86 17 15 25 6 15 12 86 17 3 25 6 Sesh Venugopal CS 112: Sequential Search Average Case 12 Adaptive Shuffling A better way to bring the just-matched item to the 1st spot is shuffle all preceding items one step back (like in insertion sort!) But this would be much more expensive, O(n) time in the worst case However, it would only take O(1) time in a linked list 15 3 12 86 17 25 6 3 12 86 17 15 25 6 Sesh Venugopal CS 112: Sequential Search Average Case 13 General Formula, Random Likelihoods: Widely Applicable The general formula for weighted average given probabilities for match can be applied to ANY search algorithm on ANY data structure C1*P1 + C2*P2 + ... + Cn*Pn The actual number of comparisons (Ci’s) for match would depend on the algorithm and the data structure For example, this formula can be applied for binary search on a sorted array – you will need to compute the number of comparisons to find a match at each array position (the middle would be 1 comparison), then multiply by the associated probabilities for match. Sesh Venugopal CS 112: Sequential Search Average Case 14