Asymptotic Complexity
Comparison: Two Different Sequences
A linked list
An array with some spare capacity (to not worry about resizing)
[diagram: a linked list of three nodes, each with a data field and a next pointer to the following node]
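To make the two layouts concrete, here is a minimal runnable Python sketch. The class names (Cons, LinkedList, Array) are ours, invented for illustration; they mirror the .head, .data, .next, .elems, and .len fields that the pseudocode in this deck relies on.

class Cons:
    # One link of a singly linked list: a data field plus a next pointer.
    def __init__(self, data, next=None):
        self.data = data
        self.next = next

class LinkedList:
    # A list is just a reference to its first link (None when empty).
    def __init__(self):
        self.head = None

class Array:
    # A fixed-capacity backing store, so pushes need not resize.
    def __init__(self, capacity):
        self.elems = [None] * capacity  # storage slots
        self.len = 0                    # how many slots are in use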
How long would it take to…
• Get or set the nth element?
• Add an element to the front?
• Add an element to the back?
• Determine whether x is in the sequence?
• …
Different representations → different costs for operations.
Today let's make this precise.
Getting the nth element
(Please forgive the lack of error checking.)
def list_nth(lst, n):
let link = lst.head
for i in range(n):
link = link.next
return link.data
def array_nth(array, n):
return array.elems[n]
The loop in list_nth repeats n times. array_nth has no loop.
Adding an element to the front
def list_push_front(lst, val):
lst.head = cons(val, lst.head)
def array_push_front(array, val):
    # need to move all the existing elements back…
    # let's assume we have enough spare capacity
    let i = array.len
    while i > 0:
        array.elems[i] = array.elems[i - 1]
        i = i - 1
    array.elems[0] = val
    array.len = array.len + 1
list_push_front has no loop (and no loop hidden in cons). The loop in array_push_front repeats array.len times.
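As a runnable cross-check (our own Python rendering; SimpleNamespace stands in for the course's record types), here are both push_front operations side by side:

from types import SimpleNamespace

def list_push_front(lst, val):
    # One allocation, no traversal: constant work however long the list is.
    lst.head = SimpleNamespace(data=val, next=lst.head)

def array_push_front(array, val):
    # Shift every existing element right one slot: work grows with array.len.
    assert array.len < len(array.elems), "assumes spare capacity"
    i = array.len
    while i > 0:
        array.elems[i] = array.elems[i - 1]
        i -= 1
    array.elems[0] = val
    array.len += 1

lst = SimpleNamespace(head=None)
array = SimpleNamespace(elems=[None] * 8, len=0)
for v in (3, 2, 1):
    list_push_front(lst, v)
    array_push_front(array, v)
print(lst.head.data, array.elems[:array.len])  # prints: 1 [1, 2, 3]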
Breaking down list_nth
def list_nth(lst, n):
let link = lst.head
for i in range(n):
link = link.next
return link.data
T_thing(n) = time to do thing on an input of size n

T_list_nth(n) = T_get.head + T_setup_for + n × T_get.next + n × T_assign_link + n × T_next_for + T_get.data

Let c1 = T_get.head + T_setup_for + T_get.data
Let c2 = T_get.next + T_assign_link + T_next_for

T_list_nth(n) = c1 + c2 × n
Operation time comparison
Operation      Linked list      Array
nth            c1 + c2 × n      d1
push_front     e1               f1 + f2 × n

If n appears in the formula, the time grows as the data structure gets larger.
If not, the cost is always the same!

No matter what the values of c1, c2, and d1 are, if n gets large enough then c1 + c2 × n will be larger than d1.
That is not true when comparing c1 + c2 × n to f1 + f2 × n.
Complexity classes
c1 + c2 × n and f1 + f2 × n feel very similar.
In both cases, if n doubles, their second term doubles.
(And for a large enough n, c1 and f1 are too small to matter.)
So they are both essentially proportional to n.
• We will call this O(n) (AKA order of n, AKA linear).

d1 and e1 also feel very similar.
In both cases, regardless of how big n is, they are constant.
• We will call this O(1) (AKA order of 1, AKA constant).
Another example: Searching a sorted array
We have a sorted array of n numbers.
We want to check if a given number is in there.
def search (numbers, target):
I can think of two ways to do this. Can you?
• Look at each element one at a time (linear search)
• Oh, it's sorted. Binary search!
Linear search
Look at each element in turn, check if it’s the one we want.
def linear_search (numbers, target):
for x in numbers:
if x == target:
return True
return False
We don’t know the position of our target, or if it’s there at all!
But in the worst case, we look at all n numbers.
T_linear_search(n) = T_setup_for + n × (T_== + T_next_for)
T_linear_search(n) = c1 + c2 × n
T_linear_search(n) = O(n)
When analyzing the cost of operations, we’re usually interested in the cost in the worst case. (I’ll let you know otherwise.)
Binary search
• Suppose we’re looking up 75
• Initially, the element could be anywhere in the array
• Compare with the middle element
• Each iteration, possible range is cut in half
• When we’re down to one element
Either we found what we're looking for, or it's not in the array at all!
• https://www.cs.usfca.edu/~galles/ visualization/Search.html
Binary search
def binary_search (numbers, target):
def helper (low, high):
if low > high: return False
let mid = (low + high) // 2
if numbers[mid] == target: return True
elif numbers[mid] < target:
return helper(mid+1, high)
else: # numbers[mid] > target:
return helper(low, mid-1)
return helper(0, numbers.len()-1)
We rely on sortedness to cut out half the numbers at each step.
Key idea: each recursive call is like doing binary search on an array half the size!
Binary search
def binary_search (numbers, target):
def helper (low, high):
if low > high: return False
let mid = (low + high) // 2
if numbers[mid] == target: return True
elif numbers[mid] < target:
return helper(mid+1, high)
else: # numbers[mid] > target:
return helper(low, mid-1)
return helper(0, numbers.len()-1)
We only do one of the recursive calls.
T_binary_search(n) = T_> + T_mid + T_array_access + T_== + T_< + T_add/sub + T_binary_search(n/2)
T_binary_search(n) = c1 + T_binary_search(n/2)
Binary search
def binary_search (numbers, target):
def helper (low, high):
if low > high: return False
let mid = (low + high) // 2
if numbers[mid] == target: return True
elif numbers[mid] < target:
return helper(mid+1, high)
else: # numbers[mid] > target:
return helper(low, mid-1)
return helper(0, numbers.len()-1)
In the worst case, we get down to the empty range and don’t find our target, and return False.
T_binary_search(1) = T_>
T_binary_search(1) = c2
Binary search
Putting all this together.
T_binary_search(n) = c1 + T_binary_search(n/2)
                   = c1 + c1 + T_binary_search(n/4)
                   = c1 + c1 + c1 + T_binary_search(n/8)
                   …
                   = c1 + c1 + … + c1 + T_binary_search(1)
                   = c1 × log2 n + T_binary_search(1)
                   = c1 × log2 n + c2
QUIZ: How many times do we divide by 2 before reaching 1? log2 n
So binary_search is O(log2 n) (aka logarithmic)
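A quick empirical sanity check (an iterative Python variant with our own probe counter, not the course code): search a sorted array for an absent element, the worst case, and count the probes.

def binary_search_steps(numbers, target):
    # Iterative binary search that counts how many midpoints it inspects.
    low, high, steps = 0, len(numbers) - 1, 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        if numbers[mid] == target:
            return True, steps
        elif numbers[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return False, steps

for k in (10, 16, 20):
    n = 2 ** k
    _, steps = binary_search_steps(list(range(n)), -1)  # -1 is never present
    print(f"n = 2^{k}: {steps} probes (log2 n = {k})")

Multiplying n by 64 (from 2^10 to 2^16) adds only six probes.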
Linear search versus binary search
Linear search takes O(n). Binary search takes O(log2 n).
What does this mean concretely?

n                1 000 000 × log2 n
16               4 000 000
32               5 000 000
64               6 000 000
65 536           16 000 000
1 048 576        20 000 000
33 554 432       25 000 000
4 294 967 296    32 000 000

Even as n gets larger, binary search remains pretty fast.
You can make the c1 constant as large as you want (here, one million)…
…but as n gets larger, binary search still wins eventually: at n = 4 294 967 296, linear search does over a hundred times more work.
Linear search versus binary search
[plot: red = 1 000 000 × log2 n, blue = n; the blue (linear) curve eventually climbs far above the red one]
Defining big-O

If f is a function, then O(f) is the set of functions that "grow no faster than" f.

"g grows no faster than f" means there exist some c and m such that for all n > m, g(n) ≤ c × f(n).

• Intuitively: on large enough inputs (n > m), g stays below f, up to a change of constants (c).
• When we say, e.g., O(n), we're implicitly referring to the function f(n) = n.
• This definition implies that a function in O(log2 n) is also in O(n). That's true, but it conveys less information than saying it's O(log2 n).
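A worked instance of the definition (the choice of constants is ours): the running time c1 + c2 × n from earlier is in O(n). Take f(n) = n, c = c1 + c2, and m = 1. Then for all n > 1,
c1 + c2 × n ≤ c1 × n + c2 × n = (c1 + c2) × n = c × f(n),
so c1 + c2 × n indeed grows no faster than n.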
Another definition
f ≪ g means f ∈ O(g) but g ∉ O(f).
For example:
log2 n ≪ n because
• log2 n ∈ O(n)
• n ∉ O(log2 n)
Big-O equalities
There are a bunch of rules we can apply to simplify complexity expressions (some of which we used already):
• O(f(n) + c) = O(f(n))
  Equivalently: f(n) + c ∈ O(f(n)).
  In other words: adding constants doesn't matter.
• O(c × f(n)) = O(f(n))
  In other words: multiplying by constants doesn't matter.
• O(log_k f(n)) = O(log_j f(n))
  In other words: log bases don't matter. (So we'll stop bothering with them.)
• O(f(n) + g(n)) = O(f(n)) if g ≪ f
  In other words: adding smaller things doesn't matter. (This implies the first rule.)
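These rules compose. For example (an expression of our own choosing, not from the slides):

O(5n² + 3n log n + 42)
= O(5n² + 3n log n)    (adding constants doesn't matter)
= O(5n²)               (n log n ≪ n², and adding smaller things doesn't matter)
= O(n²)                (multiplying by constants doesn't matter)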
Another example: Sorting
Sorting is one of the classic problems in computer science.
Simplest formulation: given a list of numbers, return a list with the same elements, sorted in increasing order.
Many different algorithms to solve that problem. Some more efficient than others. We’ll look at two.
• Selection sort: build a new list, starting from the largest element. Then add the next largest to the front, and so on.
• Merge sort: split the list into two sublists of equal length. Sort each one, then merge the two sorted lists together, preserving the ordering.
The same site has visualizations for various sorting algorithms:
https://www.cs.usfca.edu/~galles/visualization/ComparisonSort.html
Selection sort
def selection_sort (unsorted):
let sorted = None
while unsorted is not None:
let largest = find_largest(unsorted)
unsorted = remove(largest, unsorted)
sorted = cons(largest, sorted)
return sorted
find_largest and remove are what you would expect. See
bonus slides for code and analysis.
We loop until the list is empty, removing one element each time.
QUIZ: How many iterations? n
T_selection_sort(n) = n × (T_is_not + T_find_largest(n) + T_remove(n) + T_cons)
Selection sort
T_selection_sort(n) = n × (T_is_not + T_find_largest(n) + T_remove(n) + T_cons)
                    = n × (c1 + T_find_largest(n) + T_remove(n))
                    = n × (c1 + O(n) + O(n))
                    = n × (c1 + 2 × O(n))
                    = n × (c1 + O(n))
                    = c1 × n + O(n²)
                    = O(n²)    since n² ≫ n ≫ c1
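A rough empirical check (our own Python rendering on built-in lists rather than cons cells; we count only the elements scanned while finding the largest):

import random

def selection_sort(unsorted, stats):
    # Repeatedly pull out the largest remaining element, as in the slides.
    out = []
    while unsorted:
        stats[0] += len(unsorted)   # find_largest scans everything remaining
        largest = max(unsorted)
        unsorted.remove(largest)    # remove is another linear pass
        out.insert(0, largest)      # cons onto the front of the result
    return out

for n in (100, 200, 400):
    stats = [0]
    selection_sort([random.random() for _ in range(n)], stats)
    print(f"n = {n}: {stats[0]} elements scanned, n^2 / 2 = {n * n // 2}")

Doubling n roughly quadruples the scan count: the signature of O(n²).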
Merge sort
(Vector notation for simplicity; works even better on linked lists!)
Step 1: Split into two sub-lists (even indices and odd indices)
Step 2: Recursively sort each sub-list
Step 3: Merge the two, picking the smallest element from either at each step
Merge sort
Step 1. Split into two sub-lists
Step 2. Recursively sort each sub-list
Step 3. Merge the two
# : List[Number] -> List[Number]
def merge_sort(lst):
if lst is None or lst.next is None: return lst
let o = merge_sort(odds(lst))
let e = merge_sort(evens(lst))
return merge(o, e)
T_merge_sort(1) = T_is + T_.next
T_merge_sort(n) = T_is + T_.next + T_is + T_odds(n) + T_merge_sort(n/2) + T_evens(n) + T_merge_sort(n/2) + T_merge(n)
Merge sort
(Each of odds, evens, and merge is O(n), so we lump all the per-call work together into a single n.)

T_merge_sort(1) = c1
T_merge_sort(n) = n + 2 × T_merge_sort(n/2)
                = n + 2 × (n/2 + 2 × T_merge_sort(n/4))
                = n + 2 × n/2 + 4 × T_merge_sort(n/4)
                = n + 2 × n/2 + 4 × n/4 + 8 × T_merge_sort(n/8)
                = n + 2 × n/2 + 4 × n/4 + … + n × c1
                = n + n + n + … + n × c1
                = n × log2 n + c1 × n
                = O(n log n)
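The same kind of empirical check for merge sort (our own Python version on built-in lists; we count every element the merge step touches):

import random

def merge_sort(lst, stats):
    # Lists of length 0 or 1 are already sorted.
    if len(lst) <= 1:
        return lst
    mid = len(lst) // 2
    left = merge_sort(lst[:mid], stats)
    right = merge_sort(lst[mid:], stats)
    # Merge the two sorted halves, counting each element placed.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
        stats[0] += 1
    stats[0] += (len(left) - i) + (len(right) - j)  # leftovers copied over
    return merged + left[i:] + right[j:]

for k in (10, 14, 18):
    n = 2 ** k
    stats = [0]
    merge_sort([random.random() for _ in range(n)], stats)
    print(f"n = 2^{k}: {stats[0]} elements merged, n × log2 n = {n * k}")

For n = 2^k the two columns match exactly: each of the log2 n levels of merging handles all n elements.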
Selection sort versus merge sort
Selection sort takes O(n²). Merge sort takes O(n log n).

n                      n²              1 000 000 × n log n
16                     256             64 000 000
32                     1 024           160 000 000
64                     4 096           384 000 000
256                    65 536          2 048 000 000
2^20 (≈ 10^6)          1.1 × 10^12     2.1 × 10^13
2^25 (≈ 3.4 × 10^7)    1.1 × 10^15     8.4 × 10^14

n log n grows faster than n, but much slower than n².
Selection sort versus merge sort
[plot: red = 1 000 000 × n log n, black = n², green = n log n, blue = n; the black n² curve eventually overtakes even the red one]
Big-O inequalities
For constants j, k where j < k:
1 ≪ log n ≪ n^j ≪ n^k ≪ n^k log n ≪ j^n ≪ k^n ≪ n^n

constants are less than logs…
…are less than polynomials…
…are less than higher-degree polynomials…
…are less than poly-log (for the same k)…
…are less than exponentials…
…are less than higher-base exponentials…
…are less than this horror.
In practice, anything worse than n³ is intractable, and j^n is a non-starter.
The classes above are the most common; we will see others.
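For a sense of scale (our own back-of-the-envelope numbers, not from the slides): at n = 64, log2 n = 6, n² = 4 096, and n³ = 262 144, but 2^n ≈ 1.8 × 10^19. At a billion operations per second, the exponential alone would take over 500 years.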
Next time: The Dictionary ADT
Selection sort helpers (1/2)
def find_largest(lst):
if lst is None:
error('empty list has no largest element')
let best_so_far = lst.data
while lst is not None:
if lst.data > best_so_far:
best_so_far = lst.data
lst = lst.next
return best_so_far
Traverses the entire list; the largest element could be anywhere! Thus, O(n).
Selection sort helpers (2/2)
# returns a new list, but could also do an
# in-place version
def remove(elt, lst):
    if lst is None:
        error('element not found: %d', elt)
    elif lst.data == elt:
        return lst.next
    else:
        return cons(lst.data,
                    remove(elt, lst.next))
In the worst case, we’re removing the last element of the list. So we may need to traverse the whole list.
Thus, O(n).
Merge sort helpers (1/2)
Split a list into two sublists, one with the elements with even indices, the other with odd indices.
# 0th element, 2nd, 4th, etc.
def evens(lst):
if lst is None: return None
else: return cons(lst.data, odds(lst.next))
# 1st element, 3rd, 5th, etc.
def odds(lst):
if lst is None: return None
else: return evens(lst.next)
The mutual recursion may obscure matters a bit, but in the end, calling either helper traverses the entire list.
They are both O(n).
Merge sort helpers (2/2)
Merge two sorted lists into one, larger sorted list.
# : List[Number] List[Number] -> List[Number]
def merge(l1, l2):
    if l1 is None:
        return l2
    elif l2 is None:
        return l1
    elif l1.data < l2.data:
        return cons(l1.data, merge(l1.next, l2))
    else:
        return cons(l2.data, merge(l1, l2.next))

In merge sort, the inputs are both lists of length n/2.
merge traverses both lists all the way to the end, for a total length of n.
Thus merge is O(n).