代写 C algorithm html graph COMP1730/COMP6730 Programming for Scientists

COMP1730/COMP6730 Programming for Scientists
Algorithm and problem complexity

Algorithm complexity
* The time (memory) consumed by an algorithm: – Counting “elementary operations” (not 𝜇s).
– Expressed as a function of the size of its
arguments.
– In the worst case.
* Complexity describes scaling behaviour: How much does runtime grow if the size of the arguments grow by a certain factor?
– Understanding algorithm complexity is important when (but only when) dealing with large problems.

Big-O notation
* O(f(n))means roughly “a function that grows at the rate of f (n), for large enough n”.
* For example,
– n2 + 2n is O(n2) – 100n is O(n)
– 1012 is O(1).
(Image by Lexing Xie)

Example
* Find the greatest element ≤ x in an unsorted sequence of n elements. (For simplicity, assume some element ≤ x is in the sequence.)
* Two approaches:
a) Search through the sequence; or
b) First sort the sequence, then find the greatest
element ≤ x in a sorted sequence.

Searching an unsorted sequence
def unsorted find(x, ulist): best = min(ulist)
for elem in ulist:
if elem == x:
return elem
elif elem <= x: if elem > best:
best = elem
return best

Analysis
* Elementary operation: comparison. – Can be arbitrarily complex.
* If we’re lucky, ulist[0] == x. * Worst case?
– ulist = [0, 1, 2, …, x – 1]
– Compare each element with x and current
value of best
* What about min(ulist)?
* f(n)=2n,soO(n)

Measured runtime

Searching a sorted sequence
def sorted find(x, slist): if slist[-1] <= x: return slist[-1] lower = 0 upper = len(slist) - 1 while (upper - lower) > 1:
middle = (lower + upper) // 2
if slist[middle] <= x: lower = middle else: upper = middle return slist[lower] Analysis * Loop invariant: slist[lower] <= x and x < slist[upper]. * How many iterations of the loop? - Initially, upper - lower = n − 1. - The difference is halved in every iteration. - Can halve it at most log2(n) times before it becomes 1. * f (n) = log2(n) + 1, so O(log(n)). Measured runtime Problem complexity * The complexity of a problem is the time (memory) that any algorithm must use, in the worst case, to solve the problem, as a function of the size of the arguments. * The hierarchy theorem: For any computable function f (n) there is a problem that requires time greater than f (n). (Analogous result for memory.) How fast can you sort? * Any sorting algorithm that uses only pair-wise comparisons needs n log(n) comparisons in the worst case. 1,2,3 1,3,2 2,1,3 log2(n!) 2,3,1 3,1,2 3,2,1 n! * log2(n!) ≥ n log(n) for large enough n. Measured runtime (list.sort) Points of comparison * Algorithm (a): O(n) * Algorithm (b): n log(n) + log(n) = O(n log(n)) Unsorted find Sorted find Sorting n = 64k n = 128k n = 512k 0.108 s 0.00002 s 0.013 s 0.026 s 0.000017s 0.000018s 0.07 s 0.18 s Rate of growth * Algorithm uses T (n) time on input of size n. * If we double the size of the input, by what factor does the runtime increase? T (2n)/2T (n) Caution * “Premature optimisation is the root of all evil in programming.” – C.A.R. Hoare * Remember: Scaling behaviour becomes important when (and only when) problems become large, or when they need to be solved a many times. NP-Completeness Example * The subset sum problem: Given n integers w1,...,wn, is there a subset of them that sums to exactly C? (Also known as the “(exact) knapsack problem”: ⇒ w0=5 w1=2 w2=9 w3=1 C=16.) def subset sum(w, C): if len(w) == 0: return C == 0 # including w[0] if w[0] <= C: if subset sum(w[1:], C - w[0]): return True # excluding w[0] if subset sum(w[1:], C): return True return False Analysis * Count recursive function calls (no loops, so every call does a constant max amount of work). * Assume argument size (n) is number of weights. * Worst case? - If the answer is False and C is less than but close to ∑︀i wi , almost every call makes two recursive calls. * f(n+1)=2f(n),f(0)=1meansthatf(n)=2n. Finding vs. checking an answer * Sorting a list vs. checking if it’s already sorted * Finding a subset of w1,...,wn that sums to C vs. checking if a sum is equal to C O(n log(n)) O(n) O(2n) O(n) NP-complete problems * A problem is in NP iff there is an answer- checking algorithm that runs in polynomial time (O(nc), c constant). * NP stands for Non-deterministic Polynomial time. * A problem is NP-complete if it’s in NP and at least as hard as every other problem in NP. * We think there is no polynomial time algorithm for solving NP-complete problems, but we don’t know. There are many NP-complete problems... * Most populous intractable problem class. - Solving a system of integer linear equations. - The Knapsack problem. * http://www.nada.kth.se/ ̃viggo/ wwwcompendium/wwwcompendium.html lists over 700 NP-complete optimisation problems. Why Complexity is (Sometimes) a Good Thing Cryptographic Characters Encrypt Decrypt Ciphertext Alice Bob Eve * Eve can intercept the ciphertext, but without knowing Ksecret can’t read the message. * Alice and Bob must agree on Ksecret. Message Ksecret Ksecret Message Public Key Cryptography Encrypt Decrypt Ciphertext Alice Bob Eve * Kpublic can only be used to encrypt. * Decrypting with Kprivate is easy, but decrypting without knowing Kprivate is (NP-)hard. Message Kprivate Kpublic Message Kpublic Example: Proof of Identity * Alice is chatting with “Bob” on-line, but wants to be sure it’s really Bob. 1. Alice picks a random number N and sends C = Encrypt(Kpublic, N) to “Bob”. 2. Bob quickly computes N = Decrypt(Kprivate, C) and sends N back to Alice. Repeat 1–2 many times to make sure “Bob” didn’t make a lucky guess. Succeeding every time proves he knows Kprivate, which we assume only Bob does.