Computer Language
Chapter 7: Time Complexity
Last Modified 4/2/20
Complexity
A decidable problem is computationally solvable. But what resources are needed to solve the problem?
How much time will it require?
How much memory will it require?
In this chapter we study time complexity
Chapter 8 covers the space complexity of a problem
Space corresponds to memory
We do not cover space complexity (this topic is rarely covered in introductory theory courses)
Basics of time complexity theory
Introduce a method for measuring the time needed to solve a problem
Show how to classify problems according to the amount of time required
Show that certain classes of problems require enormous amounts of time
We will see how to determine if we have this type of problem
Chapter 7.1
Measuring Complexity
An Example of Measuring Time
Take the language A = {0^k 1^k | k ≥ 0}. A is clearly decidable:
we can write a program to recognize it
How much time does a single-tape Turing machine, M1, need to decide A?
Do you recall the main steps involved?
Turing Machine M1
M1 = On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1
2. Repeat if both 0s and 1s remain on the tape:
3. Scan across the tape, crossing off a single 0 and a single 1
4. If 0s still remain after all the 1s have been crossed off, or if 1s still remain after all the 0s have been crossed off, reject. Otherwise, if neither 0s nor 1s remain, accept.
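To make these stages concrete, here is a minimal Python sketch of M1's crossing-off strategy, using a list as a stand-in for the tape (the function name and the 'x' cross-off marker are illustrative choices, not from the text):

```python
def m1_decides_A(w: str) -> bool:
    """Simulate M1's logic for deciding A = {0^k 1^k | k >= 0}."""
    # Stage 1: over the alphabet {0,1}, any 0 to the right of a 1
    # implies an adjacent "10" somewhere in the string.
    if "10" in w:
        return False
    tape = list(w)
    # Stages 2-3: repeatedly scan, crossing off one 0 and one 1.
    while "0" in tape and "1" in tape:
        tape[tape.index("0")] = "x"
        tape[tape.index("1")] = "x"
    # Stage 4: accept iff neither 0s nor 1s remain.
    return "0" not in tape and "1" not in tape
```

For example, m1_decides_A("000111") returns True, while m1_decides_A("00011") returns False.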
Time Analysis of a TM
The number of steps that an algorithm/TM requires may depend on several parameters
If the input is a graph, then the number of steps may depend on:
Number of nodes, number of edges, maximum degree of the graph
For this example, what does it depend on?
The length of the input string
But possibly also on the value of the input string itself, not just its length (e.g., an input such as 0^10,000 1^10,000)
For simplicity, we compute the running time of an algorithm as a function of the length of the string that represents the input
In worst-case analysis, we consider the longest running time of all inputs of a particular length (that is all we care about here)
In average-case analysis, we consider the average of all running times of inputs of a particular length
Running Time/Time Complexity
Definition: Running time or time complexity:
If M is a TM that halts on all inputs, the running time of M is the function f: N → N, where f(n) is the maximum number of steps that M uses on any input of length n.
We say that M runs in time f(n) and that M is an f(n) time Turing machine.
Custom is to use n to represent the input length
Big-O and Small-O Notation
We usually just estimate the running times. In asymptotic analysis, we focus on the algorithm's behavior when the input is large
We consider only the highest order term of the running time complexity expression
The high order term will dominate low order terms when the input is sufficiently large
We also discard constant coefficients, for convenience, since we are only estimating
Examples of using Big-O Notation
If the time complexity is given by f(n) = 6n^3 + 2n^2 + 20n + 45,
then f(n) = O(?)
An answer, using what was just said, is f(n) = O(n^3)
Definition of Big-O
Let f and g be functions f, g: N → R+. Say that f(n) = O(g(n)) if positive integers c and n0 exist such that for every integer n ≥ n0
f(n) ≤ c g(n)
When f(n) = O(g(n)) we say that g(n) is an asymptotic upper bound for f(n), to emphasize that we are suppressing constant factors
Intuitively, f(n) = O(g(n)) means that f is less than or equal to g if we disregard differences up to a constant factor.
From earlier, we said if the time complexity is f(n) = 6n^3 + 2n^2 + 20n + 45, then
f(n) = O(n^3)
Is f(n) always at most c·g(n) once n is large enough? That is, is f(n) ≤ c·g(n) for all n ≥ n0?
Try n = 0: we get 45 ≤ c·0 = 0, which fails (hence the need for n0)
Try n = 10: we get 6000 + 200 + 200 + 45 ≤ c·1000, i.e., 6445 ≤ 1000c. Yes, if c is 7 or more.
So, with c = 7, the inequality holds whenever n ≥ n0 = 10
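A quick numeric spot-check of these constants in Python (it tests only a finite range, so it illustrates the bound rather than proving it):

```python
def f(n):
    return 6 * n**3 + 2 * n**2 + 20 * n + 45

# Verify f(n) <= c * n^3 for c = 7 and all n from n0 = 10 up to 10,000.
c, n0 = 7, 10
assert all(f(n) <= c * n**3 for n in range(n0, 10_001))
print("f(n) <= 7 * n^3 holds for 10 <= n <= 10,000")
```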
Example continued
What if we have:
f(n) = 6n^3 + 2n^2 + 20n + 45, then is f(n) = O(n^2)?
Let’s pick n = 10
6445 ≤ 100c
True if c = 70. But then what if n is bigger, such as n = 100? Then we get 6,022,045 ≤ 10,000c
Now c has to be at least 603
So, this fails, and hence f(n) is not O(n^2).
Note that you must fix c and n0. You cannot keep changing c. If you started with a larger c, you would just need a bigger n; eventually the inequality would always fail.
If you are familiar with limits (from calculus), then you should already understand this
Another way to look at this
Imagine graphing y1 = x^2 and y2 = x^3. Clearly y2 is bigger than y1 whenever x > 1.
However, we could change y1 so that it equals c·x^2, and then y1 could be bigger than y2 for some values of x
However, once x gets sufficiently large, for any fixed value of c, y2 will be bigger than y1
Example Continued
Note that since f(n) = O(n^3), the bound would trivially hold for O(n^4), O(n^5), and O(2^n) as well
The big-O notation does not require that the upper bound be “tight” (i.e., as small as possible)
From the perspective of a computer scientist who is interested in characterizing the performance of an algorithm, a tight bound is better
A Brief Digression on Logarithms
As it turns out, log_2 n, log_10 n, and log_x n differ from each other only by a constant factor
So if we use logarithms inside the O(·), as in O(log n), we can ignore the base of the logarithm
Note that log n grows much more slowly than a polynomial like n or n^2. An exponential like 2^n or 5^n or 7^(2n) grows much faster than a polynomial.
Recall from “basic” math that a logarithm is essentially the opposite of an exponential
Example: log_2 16 = 4 since 2^4 = 16. In general, log_2 2^n = n
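A small Python check of the constant-factor claim (the ratio of log_2 n to log_10 n is the same constant for every n):

```python
import math

# log_2 n and log_10 n differ only by the constant factor
# log_2 10 = 3.3219..., so inside O(log n) the base is irrelevant.
for n in (10, 1_000, 1_000_000):
    print(n, math.log2(n) / math.log10(n))  # always about 3.3219
```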
Small-O Notation
Big-O notation says that one function is asymptotically no more than another
Think of it as ≤.
Small-O notation says that one function is asymptotically less than another
Think of it as <
Examples: Come up with small-o
If f(n) = n, then f(n) = o(?): f(n) = o(n log n) or o(n^2)
If f(n) = n^2, then f(n) = o(?): f(n) = o(n^3)
A small-o bound is always also a big-O bound (but not vice versa), just as x < y implies x ≤ y (while x ≤ y does not imply x < y)
Formal Definition of Small-O
Let f and g be functions f, g: N → R+. Say that f(n) = o(g(n)) if:
lim (n → ∞) f(n)/g(n) = 0
Equivalently, f(n) = o(g(n)) means that for any real number c > 0, a number n0 exists such that f(n) < c·g(n) for all n ≥ n0.
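As a numeric illustration of the limit definition, here is a short Python check that n = o(n log n):

```python
import math

# The ratio f(n)/g(n) = n / (n log2 n) = 1 / log2(n) shrinks toward 0,
# so f(n) = n is asymptotically strictly less than g(n) = n log n.
for n in (10, 10**3, 10**6, 10**9):
    print(n, n / (n * math.log2(n)))
```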
Now Back to Analyzing Algorithms
Take the language A = {0^k 1^k | k ≥ 0} again. M1 = On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1
2. Repeat if both 0s and 1s remain on the tape:
3. Scan across the tape, crossing off a single 0 and a single 1
4. If 0s still remain after all the 1s have been crossed off, or if 1s still remain after all the 0s have been crossed off, reject. Otherwise, if neither 0s nor 1s remain, accept.
How to Analyze M1
To analyze M1, consider each of the four stages separately
How many steps does each take?
Analysis of M1
M1 = On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1
2. Repeat if both 0s and 1s remain on the tape:
3. Scan across the tape, crossing off a single 0 and a single 1
4. If 0s still remain after all the 1s have been crossed off, or if 1s still remain after all the 0s have been crossed off, reject. Otherwise, if neither 0s nor 1s remain, accept.
Stage 1 takes: ? steps → n steps, so O(n)
Stage 3 takes: ? steps → n steps, so O(n)
Stage 4 takes: ? steps → n steps, so O(n)
How many times does the loop in stage 2 cause stage 3 to repeat? n/2 times, so O(n) repetitions
What is the running time of the entire algorithm? O(n) + (n/2)·O(n) + O(n) = O(n^2/2) = O(n^2)
Time Complexity Class
Let t: N → R+ be a function. We define the time complexity class, TIME(t(n)), to be the collection of all languages that are decidable by an O(t(n)) time Turing machine.
Given that language A was decided by M1, which runs in O(n^2) time, we can say that A ∈ TIME(n^2) and that TIME(n^2) contains all languages that can be decided in O(n^2) time.
Is There a Tighter Bound for A?
Is there a machine that will decide A asymptotically more quickly?
That is, is A in TIME(t(n)) for some t(n) = o(n^2)?
If it is, then there should be a tighter bound
The book gives an algorithm on page 252 that runs in O(n log n), by crossing off every other 0 and every other 1 on each pass (first checking that the total number of remaining 0s and 1s is even)
You need to look at the details to see why this works, but it is not hard to see why it is O(n log n).
It is similar to the M1 we discussed, except that the loop is not repeated n/2 times. How many times is it repeated?
Answer: log_2 n times (but as we said before, the base does not matter, so log n). So, that yields n log n instead of n^2
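For concreteness, here is a Python sketch of that cross-off-every-other strategy (the function name and the 'x' marker are illustrative; the parity check follows the book's description). Each pass halves the surviving 0s and 1s, so there are about log_2 n passes of O(n) work each:

```python
def m2_decides_A(w: str) -> bool:
    """O(n log n)-style strategy for A = {0^k 1^k | k >= 0}."""
    if "10" in w:                              # a 0 to the right of a 1
        return False
    tape = list(w)
    while "0" in tape and "1" in tape:
        # Reject if the total number of surviving 0s and 1s is odd.
        if (tape.count("0") + tape.count("1")) % 2 == 1:
            return False
        # Cross off every other 0 and every other 1, starting with the first.
        for symbol in "01":
            seen = 0
            for i, ch in enumerate(tape):
                if ch == symbol:
                    if seen % 2 == 0:
                        tape[i] = "x"
                    seen += 1
    return "0" not in tape and "1" not in tape
```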
Can we Recognize A even Faster?
If you had to program this, how long would it take (don’t worry about using a Turing machine)?
I can do it in O(n) time by counting the number of 0s and then subtracting each 1 from that sum as we see it
Can you do it in O(n) with a Turing machine? Not with a single-tape Turing machine
But you can with a 2-tape Turing machine. How?
A TM M3 to Recognize A in O(n)
M3 = On input string w:
1. Scan across the tape and reject if a 0 is found to the right of a 1.
2. Scan across the 0s on tape 1 until the first 1 is seen. As each 0 is seen, copy a 0 to tape 2 (unary counting)
3. Scan across the 1s on tape 1 until the end of the input. For each 1 read on tape 1, cross off a 0 on tape 2. If all 0s are crossed off before all 1s are read, reject.
4. If all the 0s have now been crossed off, accept. If any 0s remain, reject.
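A minimal Python sketch of M3's two-tape strategy, with a list standing in for the unary counter on tape 2 (the names and list representation are illustrative):

```python
def m3_decides_A(w: str) -> bool:
    """Two-tape O(n) strategy: tape 2 holds a unary count of the 0s."""
    if "10" in w:                  # stage 1: a 0 to the right of a 1
        return False
    tape2 = []
    i = 0
    while i < len(w) and w[i] == "0":
        tape2.append("0")          # stage 2: copy one 0 per 0 read
        i += 1
    while i < len(w):              # only 1s remain past this point
        if not tape2:
            return False           # more 1s than 0s
        tape2.pop()                # stage 3: cross off one 0 per 1 read
        i += 1
    return not tape2               # stage 4: accept iff counter exhausted
```

Each loop makes a single left-to-right pass over its tape.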
What is the running time?
Stage 1: O(n); stage 2: O(n); stage 3: O(n); stage 4: O(n). So the total is O(n). Is it possible to do even better?
No, since just reading the input already takes n steps. We could only do better for a problem where it is not necessary to look at all of the input, which isn't the case here.
Implications
On a 1-tape TM we first came up with an O(n^2) algorithm and then an O(n log n) algorithm
So, the algorithm you choose helps to decide the time complexity
Hopefully no surprise there
As it turns out, O(n log n) is optimal for a 1-tape TM
On a 2-tape TM, we came up with an O(n) algorithm. So, the computational model does matter
A 2-tape and a 1-tape TM are of equivalent computational power with respect to what can be computed, but they have different time complexities for the same problem
How can we ever say anything interesting if everything is so model dependent?
Recall that with respect to computability, TMs are incredibly robust in that almost all variants are computationally equivalent. That is one of the things that makes TMs a good model for studying computability
A Solution
One solution is to come up with a measure of complexity that does not vary based on the model of computation
That necessarily means that the measure cannot be very fine-grained
We will briefly study the relationships between the models with respect to time complexity
Then we can try to come up with a measure that is not impacted by the differences in the models
Complexity Relationships between Models
We will consider three models:
Single-tape Turing machine
Multi-tape Turing machine
Nondeterministic Turing machine
Complexity Relationship between Single and Multi-tape TM
Theorem:
Let t(n) be a function, where t(n) ≥ n. Then every t(n) time multitape Turing machine has an equivalent O(t^2(n)) time single-tape TM
Proof Idea:
We previously showed how to convert any multi-tape TM into a single-tape TM that simulates it. We just need to analyze the time complexity of the simulation.
We show that simulating each step of the multitape TM uses at most O(t(n)) steps on the single-tape machine. Hence the total time used is O(t^2(n)) steps.
Proof of Complexity Relationship
Review of simulation:
The single tape of machine S must represent all of the tapes of multi-tape machine M
S does this by storing their contents consecutively; the position of each tape head is encoded with a special symbol
Initially S puts its tape into the format that represents the multi-tape machine and then simulates each step of M
A single step of the multi-tape machine must then be simulated
Simulation of Multi-tape step
To simulate one step:
S scans its tape to determine the contents under each tape head
S then makes a second pass to update the tape contents and move the tape heads
If a tape head moves right into previously unused space (for the corresponding tape in M), then S must allocate more space for that tape. It does this by shifting the rest of the contents one cell to the right
Analysis of this step:
For each step in M, S makes two passes over active portion of the tape.
First pass gets info and second carries it out.
The shifting of the data, when necessary, is equivalent to moving over the active portion of the tape
So, it is equivalent to three passes over the active contents of the tape, which asymptotically costs the same as one pass. So, O(length of active contents).
We now determine the length of the active contents
Length of Active Contents on Single-Tape TM
Must determine upper bound on length of active tape
Why an upper bound?
Answer: we are doing a worst-case analysis
The active length of S's tape equals the sum of the active lengths of the k tapes being simulated
What is the maximum length of one such active tape?
Answer: t(n), since a head can make at most t(n) moves to the right in t(n) steps
Thus a single scan of the active portion takes O(t(n)) steps. Why not O(k · t(n)), since there are k tapes in the multitape machine M?
Answer: k is a constant, so it drops out. Putting it all together:
O(n) steps to set up the tape initially, and then O(t(n)) steps to simulate each of the t(n) steps of M.
This yields O(n) + O(t^2(n)) = O(t^2(n)), given that t(n) ≥ n. Proof done!
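For intuition, here is an illustrative Python rendering of the single-tape encoding the proof relies on: the k tape contents are stored consecutively, separated by '#', with a dot marking the cell under each simulated head (the exact marker symbols here are an assumption for illustration):

```python
def encode_tapes(tapes, heads):
    """Encode k tapes on one tape: contents separated by '#',
    with a trailing dot marking the symbol under each head."""
    parts = []
    for tape, h in zip(tapes, heads):
        cells = [c + ("." if i == h else "") for i, c in enumerate(tape)]
        parts.append("".join(cells))
    return "#" + "#".join(parts) + "#"

# Two tapes, heads at positions 2 and 1:
print(encode_tapes(["0011", "00"], [2, 1]))  # -> '#001.1#00.#'
```

Since each simulated tape has active length at most t(n) and k is fixed, one scan of this encoding costs O(k · t(n)) = O(t(n)) steps.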
Complexity Relationship between Single- tape DTM and NDTM
Definition:
Let N be a nondeterministic Turing machine that is a decider. The running time of N is:
The function f: N → N, where f(n) is the maximum number of steps that N uses on any branch of its computation on any input of length n
This doesn't correspond to a real-world computer
Except maybe a quantum computer?*
Rather, it is a mathematical definition that helps to characterize the complexity of an important class of computational problems
As I said before, non-determinism is like always guessing correctly (as if we had an oracle). Given this, the running time result makes sense.
* This is more related to this course than you might think. Lately many new methods for computation have arisen in research, such as quantum and DNA computing. These really do 1) show that computing is not about “computers” and 2) that it is useful to study different models of computation. Quantum computing is radically different.
Proof of Complexity Relationship
Theorem:
Let t(n) be a function, where t(n) ≥ n. Then every t(n) time nondeterministic single-tape Turing machine has an equivalent 2^O(t(n)) time deterministic single-tape TM
Note that 2^O(t(n)) is exponential, which means it grows very fast
Exponential problems considered “computationally intractable”
Definition of NTM Running Time
Let N be a Nondeterministic Turing Machine that is a decider. The running time is the maximum number of steps that N takes on any branch of its computation on any input of length n.
Essentially the running time assumes that we always guess correctly and execute only one branch of the tree. The worst case analysis means that we assume that correct branch may be the longest one.
This is based on the proof associated with Theorem 3.16 (one of the few in Chapter 3 that we did not do)
Construct a deterministic TM D that simulates N by searching N’s nondeterministic computation tree
Finds the path that ends in an accept state.
On an input of length n, the longest branch of the computation tree has length at most t(n)
If each step has at most b nondeterministic choices, then the number of leaves in the tree is at most b^t(n)
Explore it breadth first
The total number of nodes in the tree is less than twice the number of leaves (basic discrete math), so it is bounded by O(b^t(n))
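That "basic discrete math" fact can be spot-checked numerically: a complete b-ary tree of depth t has (b^(t+1) - 1)/(b - 1) nodes, which is less than twice its b^t leaves whenever b ≥ 2:

```python
def total_nodes(b, t):
    # Nodes in a complete b-ary tree of depth t: geometric series sum.
    return (b ** (t + 1) - 1) // (b - 1)

for b, t in [(2, 10), (3, 10), (5, 8)]:
    leaves = b ** t
    assert total_nodes(b, t) < 2 * leaves
    print(b, t, total_nodes(b, t), leaves)
```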
Proof continued
The way the computation tree is searched in the original proof is very inefficient, but it does not impact the final result, so we use it
It starts at the root node each time
So, it visits up to b^t(n) nodes and must travel O(t(n)) steps to reach each one.
This yields b^t(n) · O(t(n)) steps = 2^O(t(n))
Note that b is a constant ≥ 2, and for running-time purposes the base can be taken as 2, since b^t(n) = 2^(t(n) log_2 b) = 2^O(t(n)).
The O(t(n)) factor does not change the overall result, since the exponential dominates the polynomial
Even if we traversed the tree intelligently, we would still have to visit O(b^t(n)) nodes
The simulation that we did not cover involves 3 tapes. But going from 3 tapes to 1 tape at worst squares the complexity, which has no impact on the exponential, since (2^O(t(n)))^2 = 2^O(t(n)).
Chapter 7.2
The Class P
The Class P
The two theorems we just proved showed an important distinction
The difference between a single and multi-tape TM is at most a square, or polynomial, difference
Moving to a nondeterministic TM yields an exponential difference
A non-deterministic TM is not a valid real-world model
So we can perhaps focus on polynomial time complexity
Polynomial Time
From the perspective of time complexity, polynomial differences are considered small and exponential ones large
Exponential functions do grow incredibly fast, and they grow much faster than polynomial functions
However, some polynomials grow much faster than others, and in an algorithms course you would be crazy to suggest they are equivalent
O(n log n) sorting is much better than O(n^2); O(n) and O(n^3) are radically different
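A short Python table makes the gap concrete (the sample values of n are arbitrary):

```python
import math

# Representative running times: the polynomials differ sharply among
# themselves, but the exponential dwarfs all of them.
for n in (10, 100, 1000):
    print(f"n={n:5d}  n log n={n * math.log2(n):10.0f}  "
          f"n^2={n**2:10d}  n^3={n**3:12d}  2^n={2.0**n:.2e}")
```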
As we shall see, there are some good reasons for nonetheless assuming polynomial time equivalence
Background
Exponential time algorithms arise when we solve problems by exhaustively searching a space of possible solutions using brute force search
Polynomial time algorithms require something other than brute force (hence one distinction)
All reasonable computational models are polynomial-time equivalent
So if we view all polynomial complexity algorithms as equivalent then the specific computational model does not matter
The Definition of the Class P
Definition:
P is the class of languages that are decidable in polynomial time on a deterministic single-tape Turing machine. In other words, P = ∪k TIME(n^k).