EECS 3101 York University Instructor: Andy Mirzaian
MACHINE MODEL AND TIMING ANALYSIS NOTATION
Introduction
This course has two major goals.
(1) To teach certain fundamental combinatorial (as opposed to numerical) algorithms.
(2) To teach general techniques for the design and analysis of algorithms.
The first question to address is “What is analysis of algorithms?”. We are interested in analyzing the complexity of algorithms. Complexity has many aspects:
• “conceptual complexity”, i.e. how difficult it is to write and verify the correctness of the algorithm in the first place. We do not have any way of quantifying the complexity of an algorithm in this sense. All other things being equal, we would like to design the simplest and easiest-to-understand algorithm possible, but we’ll not formally talk about these notions.
• The length of the algorithm. This sense of the complexity of an algorithm can be quantified and is a useful measure in other contexts; it is related to Kolmogorov complexity. It should not be confused with the “conceptual complexity” of an algorithm. We don’t consider this measure of complexity in this course.
• How long does it take the algorithm to run. This is known as the time complexity of the algorithm. This measure, specified in fuller detail below, is the primary focus of the complexity analysis done in this course.
• How much storage does the algorithm require. This is known as the space complexity of the algorithm. This measure is also a concern in the complexity analysis done in this course, although perhaps not as central as time complexity.
Machine Model and Time Complexity of Algorithms
Let N denote the set of non-negative integers, i.e. {0, 1, …}, and let R denote the set of real numbers. We use N+ and R+, respectively, to denote the positive integers and the positive reals. Finally, R* = R+ ∪ {0} is used to denote the non-negative reals. The worst-case time complexity of an algorithm is expressed as a function
T:N→R*
where T(n) is the maximum number of “steps” in any execution of the algorithm on inputs of “size” n. For example, T(n) = 3 ⋅ n^2 means that on inputs of size n the algorithm requires up to 3 ⋅ n^2 steps. To make this more precise, we must clarify what we mean by input “size” and “step”.
(a) A general definition of the input size to an algorithm is the number of bits in the input. This is not a very convenient measure for most of the problems we deal with in this course: for the problems considered here there is usually another measure of the input size which is more natural. When we are dealing with algorithms which solve problems of searching and sorting, the measure of input size we adopt is the number of elements which
are to be searched through or sorted. When we are dealing with algorithms which solve graph problems, the measure of input size we adopt is expressed in terms of the number of edges in the graph and the number of nodes in the graph. In this case, the worst-case time complexity of the algorithm is expressed as a function of both of these measures of the input size.
(b) A “step” of the algorithm can be defined precisely if we fix a particular machine on which the algorithm is to be run. We do not want to restrict ourselves to any particular machine, so we will be a bit vague as to exactly what a step is (in most cases, when we analyze the time complexity of a particular algorithm to solve a particular problem, we will specify in more detail exactly what the time complexity measures): basically, a step is anything we can reasonably expect a computer to do in a fixed amount of time. Typical examples are:
• Performing an arithmetic operation (+, −, ÷, ×).
• Comparing two numbers.
• Logic: and, or, not.
• Assigning a value to a variable.
• Following a pointer or indexing into an array.
There is a problem with assuming that these steps only take a fixed amount of time, e.g. an arithmetic operation takes constant time only if the numbers involved are of bounded size.
Uniform Cost Criterion: Charge one unit of time for each step.
Logarithmic Cost Criterion: Charge log α † units of time for a step, where α is the maximum number involved in the operation. The rationale for this cost criterion is that the time to perform a simple step is proportional to the length of the numbers involved (i.e. the number of bits), which is about log α. Even this isn’t always quite true; consider, for example, the multiplication of two numbers!
† In this course, unless otherwise stated, all logarithms are base two. Also, lg denotes log₂.
The uniform cost criterion makes the analysis simpler but fails to take into account the size of the numbers, so it is a bit unrealistically optimistic. The logarithmic cost criterion is more realistic. In this course, unless otherwise specified, we stick to the uniform cost criterion.
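To make the difference concrete, here is a small Python sketch (an informal illustration only; the function name doubling_costs and the exact charging scheme are illustrative choices, not part of the model above). It computes 2^n by n successive doublings and tallies the cost of the multiplications under both criteria: the uniform cost comes out to n, while the logarithmic cost grows roughly like n^2/2, since the i-th doubling involves an i-bit number.

    # Informal illustration: cost of computing 2^n by repeated doubling,
    # charging only the n multiplications (loop overhead is ignored).
    def doubling_costs(n):
        p = 1
        uniform_cost = 0       # uniform criterion: one unit per multiplication
        logarithmic_cost = 0   # logarithmic criterion: roughly log of the numbers involved
        for _ in range(n):
            logarithmic_cost += p.bit_length()
            p = p * 2
            uniform_cost += 1
        return uniform_cost, logarithmic_cost

    for n in (10, 100, 1000):
        print(n, doubling_costs(n))   # 10 -> (10, 55), 100 -> (100, 5050), 1000 -> (1000, 500500)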
Recall that T(n) is the maximum number of steps required for inputs of size n, i.e. the worst-case time complexity of the algorithm. Another useful measure of time complexity is average-case time complexity. The average-case time complexity of an algorithm, T(n), is the expected number of steps required for inputs of size n. For this to be a well-defined concept, we must define a probability space on the set of inputs of size n for each value of n. The main advantage of measuring average-case time complexity instead of worst-case time complexity is that it gives a more realistic idea of what to expect when the algorithm is run on a problem of size n, provided that it is possible to determine a probability distribution on the set of inputs which reflects the true probability distribution on inputs of size n. The disadvantages of measuring average-case time complexity instead of worst-case time complexity are listed below (a small linear-search illustration of the two measures follows the list):
• The probability distribution on inputs depends on the application where the algorithm is to be used, and is usually not even known.
• Even when the probability distribution is known, average-case time complexity analysis is usually much more difficult than the worst-case time complexity analysis for the same algorithm.
• In some real-time applications, it is critical that on each input the algorithm is guaranteed to finish within a particular time bound. The worst-case time complexity analysis is useful to see if the criterion is met, whereas average-case time complexity analysis is not at all useful for this purpose.
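As a small illustration of the difference between the two measures, consider linear search. The Python sketch below (a toy, assuming the uniform cost criterion, counting only comparisons, and assuming the key is present and equally likely to be in any of the n positions) exhibits a worst case of n comparisons versus an average of (n + 1)/2.

    # Linear search: compare the key with a[0], a[1], ... until it is found.
    def linear_search_comparisons(a, key):
        comparisons = 0
        for x in a:
            comparisons += 1
            if x == key:
                break
        return comparisons

    n = 1000
    a = list(range(n))
    worst = linear_search_comparisons(a, n - 1)                    # key in the last slot: n comparisons
    average = sum(linear_search_comparisons(a, k) for k in a) / n  # (n + 1) / 2 under the uniform distribution
    print(worst, average)   # 1000 500.5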
There is another time complexity measure known as the amortized time complexity, which is studied in the COSC 4101/5101 course. In this course we concentrate mostly on worst-case time complexity, but we’ll see some examples of average-case analysis.
Notations for asymptotic growth rate
We are often only interested in obtaining a rough estimate of an algorithm’s time complexity. Since, in the interest of generality, we measure time in somewhat vaguely defined “steps”, in most cases there is little point in spending too much effort to obtain a precise expression for this number of steps. We are happy with a reasonable approximation.
A style of time complexity analysis that is very common in algorithm analysis is the so-called asymptotic analysis. Except on rare occasions, this is the type of analysis we’ll work with in this course (you’ll be happy to know it’s the easiest!). In asymptotic analysis we cavalierly drop low order terms and multiplicative constants from the time complexity function T(n). For example, if the function is 3 ⋅ n^2 + 7 ⋅ log n, we are only interested in the “n^2” part: 7 ⋅ log n is a low order term and 3 is “just” a multiplicative constant. Some special mathematical notation has been developed to express asymptotic analysis conveniently, as explained below.
“Big-Oh” Notation
To express asymptotic upper bounds on running times of algorithms, we introduce the “big-oh” notation defined as follows:
Consider functions f:N→R* and g:N→R*. We say g = O(f) iff ∃ constants c > 0, n₀ ≥ 0 such that ∀ n ≥ n₀, g(n) ≤ c ⋅ f(n).
Informally g(n) = O(f(n)) (pronounced gee is big oh of ef, or simply, gee is oh of ef) says that g grows no faster than f, to within a constant factor. In words, g(n) = O(f(n)) if for all sufficiently large n (i.e., ∀ n ≥ n₀) g(n) is bounded from above by f(n), possibly multiplied by a positive constant. We say f(n) is an asymptotic upper bound for g(n). Figure 1 shows in terms of a chart what it means for g(n) to be O(f(n)).
Example 1: f(n) = 3 ⋅ n^2 + 4 ⋅ n^(3/2) is O(n^2). This is because 3 ⋅ n^2 + 4 ⋅ n^(3/2) ≤ 3 ⋅ n^2 + 4 ⋅ n^2 ≤ 7 ⋅ n^2. Thus, pick n₀ = 0 and c = 7. For all n ≥ n₀, f(n) ≤ c ⋅ n^2.
Example 2: f(n) = (n + 5)^2 is O(n^2). This is because (n + 5)^2 = n^2 + 10 ⋅ n + 25. Check (with elementary algebra) that for all n ≥ 7, n^2 + 10 ⋅ n + 25 ≤ 3 ⋅ n^2. Thus, pick n₀ = 7 and c = 3. For all n ≥ n₀, f(n) ≤ c ⋅ n^2.
Figure 1: g(n) is O(f(n)). (The chart plots g(n), f(n), and c ⋅ f(n) against n; beyond n₀, g(n) stays below c ⋅ f(n).)
Why is this useful? In many cases obtaining the exact time complexity function T(n) of an algorithm is a very laborious task and yields a messy function. However, it is often relatively easy to find a much simpler function f(n) that is an asymptotic upper bound for T(n) and prove that
T(n) is O(f(n)) and leave it at that (without bothering to find an exact expression for T(n)). Of course, we want to find the best asymptotic upper bound on T(n) possible. For example, if T(n) = 3 ⋅ n^2 + 4 ⋅ n^(3/2) + 5 ⋅ log n + 16, then T(n) is O(n^2). But T(n) is also O(n^3), O(n^4), …, O(n^100), … To say that T(n) is O(n^100) is true but quite irrelevant. We want to find a simple asymptotic upper bound that is as small as possible. f(n) is as small as possible for T(n) if T(n) is O(f(n)) and for all functions g(n) such that T(n) is O(g(n)), it is also the case that f(n) is O(g(n)). In words, f(n) is as small an asymptotic upper bound as possible for T(n) if f(n) is an asymptotic upper bound for T(n) and all other functions which are asymptotic upper bounds for T(n) are also asymptotic upper bounds for f(n). For the T(n) given above, f(n) = n^2 is as small an asymptotic upper bound as possible, and so is g(n) = 53 ⋅ n^2 + 16 ⋅ n. We much prefer to state f(n) as an asymptotic upper bound on T(n) rather than g(n) because f(n) is simpler. On the other hand, if h(n) = n^2 log n, then T(n) is O(h(n)), but h(n) is not as small an asymptotic upper bound as possible (try to prove this!).
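The definition can also be explored experimentally. The following Python sketch (an informal aid only; a finite-range check can refute a claimed pair of constants but never proves a bound, and the name holds_up_to is illustrative) tests whether given constants c and n0 witness g(n) ≤ c ⋅ f(n) over a range of n. It confirms the constants chosen in Example 2 and suggests, in line with the remark about h(n) above, that no modest constant bounds n^2 log n by a multiple of n^2.

    import math

    # Does g(n) <= c * f(n) hold for all n with n0 <= n <= limit?
    def holds_up_to(g, f, c, n0, limit=10**5):
        return all(g(n) <= c * f(n) for n in range(max(n0, 1), limit + 1))

    # Example 2: (n + 5)^2 <= 3 * n^2 for all n >= 7.
    print(holds_up_to(lambda n: (n + 5)**2, lambda n: n**2, c=3, n0=7))            # True

    # n^2 * lg n vs. c * n^2: for any fixed c the inequality fails once lg n > c.
    print(holds_up_to(lambda n: n**2 * math.log2(n), lambda n: n**2, c=10, n0=2))  # False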
“Big-Omega” Notation
There is a similar notation for asymptotic lower bounds, the “big-omega” notation.
Consider functions f:N→R* and g:N→R*. We say g = Ω(f) iff ∃ constants c > 0, n₀ ≥ 0 such that ∀ n ≥ n₀, g(n) ≥ c ⋅ f(n).
If g = Ω(f) we say g is Ω(f). In words, g(n) is Ω(f(n)) if for all sufficiently large n, g(n) is bounded from below by f(n), possibly multiplied by a positive constant. We say f(n) is an asymptotic lower bound for g(n). Analogous to the case for asymptotic upper bounds, we are interested in finding a simple asymptotic lower bound which is as large as possible.
Notice the symmetry between “big-oh” and “big-omega”: if g(n) is O( f (n)) then f (n) is Ω(g(n)).
Definition: The tight asymptotic bound: g = Θ( f ) if and only if g is both O( f ) and Ω( f ).
Thus, if g(n) is Θ(f(n)) then asymptotically g(n) and f(n) grow at the same rate within a positive multiplicative constant. In terms of this notation our previous concept simplifies to: f(n) is an asymptotic upper bound (lower bound) on T(n) which is as small as possible (as large as possible) iff T(n) is Θ(f(n)).
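For instance, for the function T(n) = 3 ⋅ n^2 + 4 ⋅ n^(3/2) + 5 ⋅ log n + 16 considered above, T(n) is Θ(n^2): on one hand T(n) ≤ 3 ⋅ n^2 + 4 ⋅ n^2 + 5 ⋅ n^2 + 16 ⋅ n^2 = 28 ⋅ n^2 for all n ≥ 1, so T(n) is O(n^2); on the other hand T(n) ≥ 3 ⋅ n^2 ≥ n^2 for all n ≥ 1, so T(n) is Ω(n^2).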
We can also extend these notations so that they include functions that may be negative or undefined on some finite initial segment of N. It is only necessary that there be a constant k ≥ 0 above which the function is always defined and nonnegative. For example, we can talk about O(n / log n) without worrying that this function is not defined when n = 0 or 1.
Some Examples:
n = O(n^2)
n^2 = Ω(n)
3n + 1 = Θ(n)
log₂ n = Θ(log₃ n)
log n = O(n)
log n ≠ Θ(n)
n = Θ(n)
n log n = Ω(n)
lg n = O(n^ε), for any constant real ε > 0 no matter how small.
n^k = O(n^l) for all constants k ≤ l
n^k = O(2^n) for all constants k > 0
2n^2 + 3n = O(n^2)
Σ_{i=1}^{n} log i = Θ(n log n)
The proof of the last result follows from the following two observations:
Σ_{i=1}^{n} log i ≤ Σ_{i=1}^{n} log n = n log n, and
Σ_{i=1}^{n} log i ≥ Σ_{i=n/2}^{n} log i ≥ Σ_{i=n/2}^{n} log(n/2) ≥ n(log n − 1)/2 ≥ 0.25 n log n, ∀ n ≥ 4.
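These bounds can also be sanity-checked numerically; the short Python sketch below (an informal check only, it proves nothing, and the helper name lg_sum is illustrative) compares Σ_{i=1}^{n} log i with 0.25 ⋅ n log n and n log n for a few values of n ≥ 4.

    import math

    # Compare sum_{i=1}^{n} lg(i) with the lower and upper bounds above.
    def lg_sum(n):
        return sum(math.log2(i) for i in range(1, n + 1))

    for n in (4, 16, 256, 4096):
        s = lg_sum(n)
        print(n, 0.25 * n * math.log2(n) <= s <= n * math.log2(n))   # True for each n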
The sets O, Ω, and Θ satisfy a lot of nice properties. Here are some of them. Try to prove them.
(1) g(n) = O(f(n)) if and only if f(n) = Ω(g(n)).
(2) g(n) = Θ(f(n)) if and only if f(n) = O(g(n)) and g(n) = O(f(n)).
(3) f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
(4) If a > 0 and b are constants and f(n) ≥ 1, then a ⋅ f(n) + b = Θ(f(n)).
(5) Transitivity: If f(n) = O(g(n)) and g(n) = O(h(n)), then f(n) = O(h(n)). Transitivity holds for Θ and Ω as well.
(6) f(n) + g(n) = Θ(max{f(n), g(n)}). This is called the rule of sum I.
(7) If f₁(n) = O(g₁(n)) and f₂(n) = O(g₂(n)), then
    (7a) f₁(n) + f₂(n) = O(g₁(n) + g₂(n)) (the rule of sum II), and
    (7b) f₁(n) ⋅ f₂(n) = O(g₁(n) ⋅ g₂(n)) (the rule of product).
Facts (6) and (7) are often used in the time complexity analysis of algorithms. Facts (6) and (7a) are useful for finding an asymptotic upper bound on the time complexity of an algorithm which is composed of the execution of one algorithm followed by the execution of another algorithm. Fact (7b) is useful for determining the time complexity of nested loops.
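For example, in the Python sketch below (a generic illustration, not an algorithm from these notes), fact (7b) bounds the doubly nested loop by O(n) ⋅ O(n) = O(n^2), and facts (6)/(7a) bound the two phases run one after the other by O(n + n^2) = O(n^2).

    def two_phase(a):
        n = len(a)

        total = 0
        for x in a:                 # Phase 1: Theta(n) steps
            total += x

        pairs = 0
        for i in range(n):          # Phase 2: outer loop, n iterations ...
            for j in range(n):      # ... times n iterations of O(1) work -> O(n^2) by (7b)
                if a[i] == a[j]:
                    pairs += 1

        return total, pairs         # Phase 1 followed by Phase 2: O(n + n^2) = O(n^2) by (6)/(7a)

    print(two_phase([1, 2, 2, 3]))  # (8, 6)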
In the analysis of an algorithm, each of these rules can be applied only a constant number of times for the analysis to be valid.
Proofs of some of the above Facts:
(5) Transitivity:
f(n) = O(g(n)) and g(n) = O(h(n)) imply there exist constants c₁ > 0, c₂ > 0, n₁ ≥ 0, n₂ ≥ 0 such that for all n ≥ n₁, f(n) ≤ c₁ g(n), and for all n ≥ n₂, g(n) ≤ c₂ h(n). These imply that f(n) ≤ c h(n) for all n ≥ n₀, where c = c₁c₂ and n₀ = max{n₁, n₂}.
(6) Rule of Sum I:
f(n) + g(n) = Θ(max{f(n), g(n)}) is implied by the fact that
[ f (n) + g(n)] / 2 ≤ max { f (n) , g(n)} ≤ [ f (n) + g(n)]. (Note that f and g by assumption are nonnegative functions.)
(7b) Rule of Product:
From the premise we have f₁(n) ≤ c₁ g₁(n) for all n ≥ n₁, and f₂(n) ≤ c₂ g₂(n) for all n ≥ n₂. These imply that f₁(n) f₂(n) ≤ c g₁(n) g₂(n) for all n ≥ n₀, where c = c₁c₂ and n₀ = max{n₁, n₂}.
Example of a bad analysis:
Let g(n) = 2 for all n ∈ N. Thus, g(n) is O(1). If we applied rule (7b) n times with respect to g(n) to analyze an algorithm on inputs of size n, we might (erroneously) conclude that Π_{i=1}^{n} g(n) is O(1). It is easy to see that Π_{i=1}^{n} g(n) = 2^n, and thus it is obviously not O(1). What went wrong is that we are applying rule (7b) a non-constant number of times in the analysis of the algorithm.
Other Asymptotic Notations
Little oh: g(n) = o(f(n)) means g(n) = O(f(n)), but the upper bound f(n) is growing much faster than g(n) asymptotically. That is, for any constant c > 0 (no matter how small it is), there exists a constant n₀ such that for all n ≥ n₀, we have g(n) < c ⋅ f(n).
Little ω: g(n) = ω(f(n)) if f(n) = o(g(n)). That is, asymptotically f(n) grows much slower than g(n). Equivalently, for any constant c > 0 (no matter how large it is), there exists a constant n₀ such that for all n ≥ n₀, we have g(n) > c ⋅ f(n).
Asymptotics by Limits:
The relationship between the rates of growth of certain functions can be obtained by using the following facts. Suppose lim_{n→∞} f(n)/g(n) = L. If this limit exists, then we infer the following:
If L = 0, then f(n) = o(g(n)).
If 0 < L < ∞, then f(n) = Θ(g(n)).
If L = ∞, then f(n) = ω(g(n)).
Example 3: n = o(n^2) and n^1.3 = ω(n log n).
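If a computer algebra system is at hand, such limits can also be evaluated mechanically. The Python sketch below assumes the sympy package is available and simply recomputes the two limits of Example 3.

    from sympy import symbols, limit, log, oo, Rational

    n = symbols('n', positive=True)

    print(limit(n / n**2, n, oo))                             # 0,  so n = o(n^2)
    print(limit(n**Rational(13, 10) / (n * log(n)), n, oo))   # oo, so n^1.3 = omega(n log n)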
Example 4: f(n) = o(g(n)) is not equivalent to [ f(n) = O(g(n)) and f(n) ≠ Θ(g(n)) ].
To see this, consider the following example: g(n) = n, and f(n) = n for even values of n, and f(n) = 1 for odd values of n. We see that f(n) ≤ g(n) for all n ≥ 1. Hence, f(n) = O(g(n)). We also see f(n) ≠ Θ(g(n)). However, f(n) ≠ o(g(n)), since it is not true that for every constant c > 0 there is an n₀ such that f(n) < c ⋅ g(n) for all n ≥ n₀: for any c < 1 and any even n, f(n) = n > c ⋅ n = c ⋅ g(n).