CE204
Data Structures and Algorithms
Part 4
Analysing Running Times 1
If there are several possible algorithms for solving a problem we may wish to determine which is the most efficient. One way to do this would be to implement all of the algorithms and perform experiments. This may, however, take a lot of time since some algorithms may perform better than others on particular data but worse on different data, so it may be necessary to perform a large number of experiments.
It would be preferable to be able to estimate the running time of the algorithms by using mathematical analysis.
Analysing Running Times 2
We cannot expect to be able to calculate precise running times using simple mathematical analysis, since to do so would require knowledge of the code produced by the compiler and timing details for the hardware. However we may be able to obtain useful estimates.
Typically the type of conclusion we might be able to reach is that if a program takes a particular time for data of size n it will take about twice as long for data of size 2n. Another algorithm might take about four times as long.
Analysing Running Times 3
Consider two algorithms A and B which perform the same task. Suppose A takes 1000n milliseconds (i.e. n seconds) for data of size n and B takes 5n^2 milliseconds. Then when n is 10, A will take 10 seconds and B will take 0.5 seconds, and when n is 100, A will take 100 seconds and B will take 50 seconds. However when n is 1000, A will take about 17 minutes but B will take about 83 minutes, and when n is 10000 A will take about 3 hours but B will take nearly 6 days. As n becomes larger the difference will become greater.
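The following small program (an illustrative sketch, not part of the course material) reproduces these figures:

    public class CompareTimes {
        public static void main(String[] args) {
            for (long n : new long[] {10, 100, 1000, 10000}) {
                long a = 1000 * n;   // algorithm A: 1000n milliseconds
                long b = 5 * n * n;  // algorithm B: 5n^2 milliseconds
                System.out.printf("n=%d: A=%d ms, B=%d ms%n", n, a, b);
            }
        }
    }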
Analysing Running Times 4
Although B is more efficient when n is small, A is much more efficient when n is large. Furthermore, the difference when n is small is not very important since the time taken to produce and load the data is likely to be longer than the time taken for the algorithm.
The difference between n and n^2 is hence far more significant than the difference between the constants 5 and 1000. The knowledge that A takes time proportional to n whereas B takes time proportional to n^2 is much more useful than knowing the multiples involved.
Analysing Running Times 5
The precise times taken by an algorithm are unlikely to be expressible as a simple expression – the times for different data of the same size may differ, so a more realistic statement might be that A takes between 0.7n+3 seconds and 1.5n+12 seconds. Although we can then no longer say that algorithm A takes time proportional to n, we can still say that it takes time approximately proportional to n. From this statement we can still conclude that if we double the size of the data the time taken is likely to be about twice as long.
Analysing Running Times 6
In order to formally argue about running times using such terminology we need to specify precisely what is meant by “approximately proportional to f(n)”.
We could define this by saying that T(n) is approximately proportional to f(n) if there exist some constants C and D such that T(n) ≥ C×f(n) and T(n) ≤ D×f(n) for all values of n.
Using this definition the running time for algorithm A from the previous slide is indeed approximately proportional to n, since 0.7n+3 ≥ 0.7n and 1.5n+12 ≤ 13.5n (as long as n is at least 1, which is reasonable to assume since we would not run a program with data of size less than 1).
Analysing Running Times 7
Some algorithms involving searching, when it is not necessary to examine all of the data, have running time approximately proportional to log_2 n. For such algorithms the definition on the previous slide cannot be satisfied, since it is impossible to find a value for D which works when n is 1 (log_2 1 is 0, so we would have to find a D such that the time is less than or equal to D×0).
Hence the formal definition of a meaning for “approximately proportional to” ignores small values of n.
The “Big O” Notation 1
The big O notation is a formalisation of the “approximately proportional to” definition that allows us to ignore small values of n. It considers just an upper bound, so it effectively means “at most approximately proportional to”.
T(n) is said to be O(f(n)) if there exist constants C and N such that T(n) ≤ C×f(n) for all values of n greater than or equal to N.
There are usually several choices for C and N: 1.5n+12 is O(n) since 1.5n+12 ≤ 13.5n for all values of n greater than or equal to 1, and also since 1.5n+12 ≤ 2n for all values of n greater than or equal to 24.
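A quick check of the second choice of constants:

\[
1.5n + 12 \le 2n \;\Longleftrightarrow\; 12 \le 0.5n \;\Longleftrightarrow\; n \ge 24 .
\]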
The “Big O” Notation 2
Upper bounds are usually more important than lower bounds since we are usually more concerned about the worst-case performance of an algorithm than the best-case performance. Consider for example a sorting algorithm: this may take time proportional to n if the list or array happens to be already sorted but this case is not typical – indeed it can be proved that it is impossible to produce a general-purpose sorting algorithm whose average-case running time is better than O(n log_2 n).
Logarithms and “Big O”
The logarithm to base k of a number n (written as log_k n) is the value p such that k^p is equal to n. For example, log_2 16 is 4 since 2^4 is 16, and log_10 1000 is 3 since 10^3 is 1000.
It can be shown that log_a n = log_a b × log_b n, so changing the base of a logarithm involves multiplying by a factor that is independent of n. Hence if T(n) is O(log_a n) we can show, by choosing a different value for C, that T(n) is also O(log_b n). Consequently, when big O values involve logarithms it is not necessary to specify the base; we can simply say that T(n) is O(log n).
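The identity can be used directly in the definition of big O: if T(n) ≤ C×log_a n for all n ≥ N, then

\[
T(n) \le C \log_a n = (C \log_a b) \log_b n \quad \text{for all } n \ge N,
\]

so the constant C' = C × log_a b witnesses that T(n) is O(log_b n).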
Some Properties of “Big O” 1
If T_1(n) is O(f(n)) and T_2(n) is O(g(n)), the following properties can all be proved:
k × T_1(n) is O(f(n)) (where k is a positive constant, i.e. not dependent on n)
T_1(n) + T_2(n) is O(max(f(n), g(n)))
T_1(n) × T_2(n) is O(f(n) × g(n))
Additionally it is easy to conclude from the definition of big O that f(n) is O(f(n)) – simply choose both C and N to be 1.
Some Properties of “Big O” 2
From the properties on the previous slide we can conclude that n^k is O(n^k) and hence an^k is also O(n^k).
From the second property on the previous slide we can conclude that, if T_1(n) is O(n^j) and T_2(n) is O(n^k) where j ≤ k, then T_1(n) + T_2(n) is O(n^k); hence any polynomial of degree k is O(n^k).
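As a worked instance of these properties (the particular polynomial is an illustrative choice):

\[
3n^2 + 5n + 7 \;\le\; 3n^2 + 5n^2 + 7n^2 \;=\; 15n^2 \quad \text{for all } n \ge 1,
\]

so taking C = 15 and N = 1 in the definition shows that 3n^2 + 5n + 7 is O(n^2).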
Estimating Running Times of Java Methods 2
For a while loop of the form while (e) S (where e is a simple expression) we can use the same formula – the time for a loop is at most about m×T_S, where T_S is the time for the loop body S – but m will be the maximum number of times the loop body is executed; this may, however, be difficult to estimate.
Estimating Running Times of Java Methods 3
Using the second property of big O given earlier we can conclude that if T_1 is O(f_1(n)) and T_2 is O(f_2(n)) (where T_1 and T_2 are the times for the two branches of an if statement) then the time for the if statement is O(max(f_1(n), f_2(n))).
If m is O(f(n)) and T_S is O(g(n)) we can conclude that the maximum time for the for or while loop is O(f(n)×g(n)).
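For example, in the following method (an illustrative sketch, not taken from the course) the inner loop body is O(1) and each loop is executed at most n times, so by the formula the method as a whole is O(n)×O(n), i.e. O(n^2):

    // Counts pairs of equal elements in an array of size n.
    static int equalPairs(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)          // outer loop: n iterations
            for (int j = i + 1; j < a.length; j++)  // inner loop: at most n iterations
                if (a[i] == a[j])                   // O(1) body
                    count++;
        return count;
    }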
Estimating Running Times of Java Methods 4
We wish to apply the formula on the previous slide to the following for loop, which initialises an array a of size n:

    for (int i = 0; i < n; i++)
        a[i] = 0;  // the value assigned does not affect the analysis

Here the loop body takes constant time, so T_S is O(1), and it is executed exactly n times, so m is n, which is O(n); the formula therefore gives a running time of O(n×1), i.e. O(n), for the loop.
Estimating Running Times of Recursive Methods 2
The equation T(n) = T_1 + T(n-1) is known as a recurrence relation. In order to be able to estimate the time for factorial(n) we need to be able to solve this equation.
We will present known solutions for some commonly-occurring recurrence relations.
If T(n) ≤ T_1 + T(n-1) for all n>1:
If T_1 is O(1) then T(n) is O(n)
If T_1 is O(n) then T(n) is O(n^2)
If T_1 is O(n^2) then T(n) is O(n^3)
We can hence conclude that the time for factorial(n) is O(n).
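The factorial method itself appears on a slide not reproduced here; presumably it is the standard recursive version, along these lines:

    // Each call does O(1) work (a test and a multiplication) plus one recursive
    // call on n-1, giving T(n) = T_1 + T(n-1) with T_1 of O(1), and hence O(n).
    // (The long result overflows for n > 20, which does not affect the analysis.)
    static long factorial(int n) {
        if (n <= 1) return 1;
        return n * factorial(n - 1);
    }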
Estimating Running Times of Recursive Methods 3
If T(n) ≤ T_1 + k×T(n-1) for all n>1 (where k is a constant integer greater than 1):
If T_1 is O(1) or O(n) then T(n) is O(k^n)
If T(n) ≤ T_1 + T(n/2) for all n>1:
If T_1 is O(1) then T(n) is O(log n)
If T_1 is O(n) then T(n) is O(n)
If T(n) ≤ T_1 + 2T(n/2) for all n>1:
If T_1 is O(1) then T(n) is O(n)
If T_1 is O(n) then T(n) is O(n log n)
(These results also hold if, when n is odd, the recursive calls take times T(n/2) and T(n/2+1).)
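As an illustration of the T(n) ≤ T_1 + T(n/2) case, consider a recursive binary search (an illustrative sketch, not taken from the course); merge sort similarly fits the T_1 + 2T(n/2) case with T_1 of O(n), giving O(n log n):

    // Searches a sorted array between indices lo and hi. Each call does O(1)
    // work and then recurses on about half the range, so T(n) ≤ T_1 + T(n/2+1)
    // with T_1 of O(1), and the running time is O(log n).
    static boolean contains(int[] a, int lo, int hi, int key) {
        if (lo > hi) return false;      // empty range: key absent
        int mid = lo + (hi - lo) / 2;   // written this way to avoid overflow
        if (a[mid] == key) return true;
        if (a[mid] < key)
            return contains(a, mid + 1, hi, key);   // search right half
        else
            return contains(a, lo, mid - 1, key);   // search left half
    }

A call for the whole array takes the form contains(a, 0, a.length - 1, key).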
The “Big Ω” Notation
If we want to specify a lower bound for running times (instead of an upper bound) we can use the big omega notation instead of the big O notation.
T(n) is said to be Ω(f(n)) if there exist constants C and N such that T(n) ≥ C×f(n) for all values of n greater than or equal to N.
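For example, the time for algorithm A considered earlier, which lies between 0.7n+3 and 1.5n+12 seconds, is Ω(n) as well as O(n):

\[
T(n) \ge 0.7n + 3 \ge 0.7n \quad \text{for all } n \ge 1,
\]

so C = 0.7 and N = 1 satisfy the definition.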