Analyzing algorithms
Predict how your algorithm performs in practice
Criteria
Running time
Space usage
Cache I/O
Disk I/O
Network bandwidth
Power consumption
Lines of code
Time complexity
What are good algorithms?
Tale about Gauss
A teacher asked the class to add together all the numbers from 1 to 100, assuming that this task would occupy them for quite a while. He was shocked when young Gauss, after a few seconds' thought, wrote down the answer 5050. (source: https://nrich.maths.org/2478)
Other students:
for (int i = 1; i <= n; i++)
    sum += i;
time complexity: O(n)
Gauss:
sum = (1+n)*n/2;
time complexity: O(1)
What is a computational model?
What resources do we care about?
What operations are we allowed to do?
What is the cost of each operation?
Computational Model
Cost measure
To estimate the time needed for an algorithm
Computational model
Algorithm Cost bound
Random Access Machine (RAM) model
We have an arbitrarily large memory
Do arithmetic calculations
Read/write to a memory location given the address
Every operation takes unit time
Random Access Machine (RAM) model
We have an arbitrarily large memory
Do arithmetic calculations
Read/write to a memory location given the address
Every operation takes unit time
for (int i = 1; i <= n; i++) sum += i;
3n + 2 operations?
sum = (1+n)*n/2;
3 operations?
Random Access Machine (RAM) model
Every operation, including memory access, arithmetic operations, etc.,
takes unit time
We only care about the order of the cost, e.g., we omit:
- Lower-order terms
- Constants
We care about how fast a function grows for large n: Asymptotic Analysis
Would you rather have a million dollars or one cent on day one, doubled every day for a month?
A. I want a million dollars immediately!!
B. I want to have one cent in the first
day, doubled every day for a month!
Actually, even in February, choice B gets you more than 1 million dollars!
Your total reward on day n is 2^n - 1 cents, i.e., about 2^n / 100 dollars
Is it larger than 10^6 dollars? It will be for large enough n!
Growth of Functions
The growth of the function is what interests us (large data)
For the cost function f(n), we care about the case when n is large enough
When n is small, the cost is small anyway. We care about the analysis when the running time can be long
The constant factors and lower-order terms matter less and less as n grows
When n is large enough, 2^n (grows exponentially) will be much larger than c (a constant)
The ruler liked chess very much, and wanted to give the inventor a prize. He could have anything he wished, the ruler said. The inventor requested that
- One grain of wheat be put on the first square
- Two grains of wheat on the second square
- Four grains of wheat on the third square
In general, the number of grains of wheat on a square is always double that of the previous square. The ruler laughed it off as an insignificant prize for such an achievement and granted the wish. Later the court treasurers reported that the entire grain reserve of the kingdom was not sufficient to fulfill the request!
It grows exponentially! The total is 2^64 - 1 = 18,446,744,073,709,551,615 grains.
This is more wheat than exists in the entire world; in fact, it would fill a building 25 miles long, 25 miles wide, and 1000 feet tall.
Asymptotic notation
for (int i = 1; i <= n; i++) sum += i;
3n + 2 operations? O(n) operations
OK, then what does big-O mean?
sum = (1+n)*n/2;
3 operations? O(1) operations
Asymptotic notations
Big-O: asymptotically smaller than or equal to
3n + 2 is O(n)
3n + 2 is O(n^2)
3n + 2 is also O(2^n)
What happens if we want to say other cases?
Big-Omega (at least) and big-Theta (exactly) notations
Asymptotic analysis
We usually omit constant factors and lower-order terms
Popular Classes of Functions
Constant: O(1)
Logarithmic: O(log n)
Poly-logarithmic: O(log^k n)
Sublinear: O(n^c) for a constant 0 < c < 1, e.g., O(sqrt(n))
Linear: O(n)
Super-linear: O(n log n)
Quadratic: O(n^2)
Polynomial: O(n^k) (k is a constant)
Exponential: O(2^n)
Compare two functions
Commonly-used functions
For large enough n, we have (asymptotically):
1 < log n < sqrt(n) < n < n log n < n^2 < n^k < 2^n
Analyze running time
Random Access Machine (RAM) model
Every operation, including memory access, arithmetic operations, etc., takes unit time
We only care about the order of the cost, e.g., we omit:
- Constants
- Lower-order terms
We usually consider worst-case running time for general input
In some cases, we also analyze bounds with probabilistic guarantees (e.g., average running time for randomized algorithms)
Case study: selection sort
Find the smallest element in the array
Move it to the front
For the rest of the elements, find the smallest element and repeat
for (int i = 0; i < n; i++) {
    for (int j = i+1; j < n; j++) {
        if (a[j] < a[i]) swap(a[i], a[j]);
    }
}
Why don't we care about constants & lower-order terms?
Why do we only care about the order (main term)?
Quicksort: O(n log n) time in expectation
Selection sort: O(n^2) time
Running time for n = 10,000,000: selection sort takes about 20 minutes
How large a problem you can solve?
Assume your computer can do each operation in 0.1 microseconds (10^7 operations per second)
Complexity
Max problem size (n)
120,000,000
7,200,000,000
418,961,604
Analyzing algorithms
Given an algorithm
and a computational model that specifies how the cost is defined,
we can derive a cost bound: whatever the input is, our algorithm finishes within this bound
Cost bound
Algorithm
Computational Model
Cost measure
Lower bound analysis
Given a problem, what is the smallest cost we need to pay?
We know algorithms with a certain running time
Do better algorithms exist? If not, why not?
Example: given a sorted array A and an element x, find the rank of x in A
We know binary search needs O(log n) time
Does a better algorithm exist? If not, why not?
Example about lower bound proof: Finding the max of an array
Input: an array of distinct elements
Model: we only care about comparisons: each comparison is a unit cost
At least how many comparisons do you need to find the max of the array?
Idea 1: always compare to the champion
max = a[1];
for (i = 2 to n) {
    if (a[i] > max) max = a[i];
}
n - 1 comparisons
Example about lower bound proof: Finding the max of an array
Idea 1: always compare to the champion
max = a[1];
for (i = 2 to n) {
    if (a[i] > max) max = a[i];
}
n - 1 comparisons
Idea 2: single-elimination tournament?
Round 1: 3?1  6?5  8?2  7?4
Round 2: 3?6  8?7
Round 3: 6?8
n - 1 comparisons
Finding the max of an array
Can we use fewer than n - 1 comparisons to find the max of an array?
Principle:
If an element is never compared to others, we cannot guarantee we find the max
If an element loses any comparison, it cannot be the max
If an element never loses a comparison, it is a candidate
Classify the elements:
- free (never compared: still need to verify!)
- bad (lost some comparison: rule it out!)
- top (compared but never lost: a candidate!)
Initial status: 0 top, n free, 0 bad
Final status: 1 top, 0 free, n - 1 bad
Only when we reach this final status have we found the max
Finding the max of an array lower bound proof
Preliminary: element statuses
- top (compared, never lost)
- bad (lost some comparison)
- free (never compared)
Initial status: 0 top, n free, 0 bad
Final status: 1 top, 0 free, n - 1 bad
What happens when we compare two elements?

Comparison       Outcome          Δ#top  Δ#bad  Δ#free
top vs. top      (loser drops)     -1     +1      =
top vs. bad      top wins           =      =      =
                 bad wins          -1     +1      =
top vs. free     top wins           =     +1     -1
                 free wins        -1+1    +1     -1
bad vs. bad                         =      =      =
bad vs. free     bad wins           =     +1     -1
                 free wins         +1      =     -1
free vs. free    (winner rises)    +1     +1     -2

Each comparison increases #bad by at most 1!
=> We need at least n - 1 comparisons to reach the final status (n - 1 bad)
Finding the max of an array lower bound proof
To get an optimal algorithm, we must guarantee that every comparison increases #bad by exactly 1
Only compare top to top, or
free to free, or
top to free
(each such comparison is guaranteed to increase #bad by 1)
Algorithm:
While there are at least two free or top elements, compare two of them
Stop when there is only one top left
Comparison       Outcome          Δ#top  Δ#bad  Δ#free
top vs. top      (loser drops)     -1     +1      =
top vs. bad      top wins           =      =      =
                 bad wins          -1     +1      =
top vs. free     top wins           =     +1     -1
                 free wins        -1+1    +1     -1
bad vs. bad                         =      =      =
bad vs. free     bad wins           =     +1     -1
                 free wins         +1      =     -1
free vs. free    (winner rises)    +1     +1     -2

Each comparison increases #bad by at most 1!
=> We need at least n - 1 comparisons to reach the final status (n - 1 bad)
Example about lower bound proof: Finding the max of an array
Idea 1: always compare to the champion
max = a[1];
for (i = 2 to n) {
    if (a[i] > max) max = a[i];
}
Every comparison is top vs. free!
n - 1 comparisons
Idea 2: single-elimination tournament?
Round 1: 3?1  6?5  8?2  7?4
Round 2: 3?6  8?7
Round 3: 6?8
First level: free vs. free
Other levels: top vs. top
n - 1 comparisons
Upper bound vs. lower bound
An upper bound of f(n) on the cost of a problem means there exists an algorithm that takes at most f(n) steps on any input of size n
So, given this problem, we can just run this algorithm to get an answer
A cost of f(n) is guaranteed to be sufficient
A lower bound of g(n) means that for any algorithm, there exists an input of size n on which the algorithm takes at least g(n) steps
Whatever algorithm you use, you cannot do better than g(n)!
If an algorithm's cost matches the lower bound, it is the best you can do: an optimal algorithm!
Upper bound vs. lower bound
Finding the max of an array
Lower bound: n - 1 comparisons (we just proved that)
If you use fewer than n - 1 comparisons, you cannot guarantee the answer
Upper bound: the compare-to-champion algorithm does exactly n - 1 comparisons
Upper bound meets lower bound!
Compare-to-champion is an optimal algorithm (w.r.t. #comparisons) to find the max
Final notes
HW1 out
Don't forget to finish the entrance exam!!
Random Access Machine (RAM) model
We have an arbitrarily large memory
Do arithmetic calculations
Read/write to a memory location given the address
Every operation takes unit time
Is the RAM model perfect?
Is RAM model perfect?
Loop 1: for (int i = 0; i < n; i++) A[i] = A[i] + 1;
Loop 2: for (int i = 0; i < n; i++) A[i] = (long long)(i)*4323 % n + 1;
Loop 3: for (int i = 0; i < n; i++) A[i] = A[(long long)(i)*4323 % n] + 1;
How long would the other two for-loops take?
A. 0.14s  B. 0.2s  C. 0.6s  D. 1.0s  E. 3.0s  F. 5.0s
Image from ithare.com: http://ithare.com/infographics-operation-costs-in-cpu-clock-cycles/
Random Access Machine (RAM) model
We have an arbitrarily large memory
Do arithmetic calculations
Read/write to a memory location given the address
Every operation takes unit time
Is the RAM model perfect?
Do we have other models?
Accessing memory is expensive
But we have a cache in our machine!
Accessing data in the cache is free
Cost measure: the number of transfers between cache and memory
Asymmetric RAM model
Writes to memory are more expensive than reads
Pointer machine model
You cannot randomly access memory by address (i.e., contiguous memory is not assumed)
You can access a memory location only through a pointer
You cannot find the i-th element of an array via a[i]
Parallel computational models
Based on the RAM:
Work: count the total number of operations (however, this does not account for parallelism)
Span: count the longest chain of operations we have to execute sequentially
Other measurements
Space complexity?
Programming complexity?
Why do we care about models?
Help you understand the cost from different aspects
Help you simplify the estimation of the cost
You can never calculate the exact time needed for an algorithm
But models focusing on different measurements help you understand the algorithm in a simplified manner
In many cases, the estimation is rather accurate. E.g., with a reasonable implementation, quicksort will be faster than selection sort.