COSC 1285/2123
COSC 1285/2123 Algorithms and Analysis
Workshop 3
Algorithmic Analysis
Tutorial Exercises
Objective
Students who complete this tutorial should:
� Understand the fundamentals of algorithm analysis.
� Be able to evaluate recursive and non-recursive algorithms.
2.3.1 Compute the following sum (extension):
n+1∑
i=3
1
Answer: Use the common summation formulas and rules listed in Appendix A (book). You may
need to perform some simple algebraic operations before applying them.
n+1∑
i=3
1 = (n + 1)− 3 + 1 = n− 1
2.3.4 Consider ALGORITHM 1.
Algorithm 1 Mystery(n)
// Input: a non-negative integer n
S = 0
for i = 1 to n do
S = S + i ∗ i
end for
return S
a. What does this algorithm compute?
b. What is its basic operation?
c. How many times is the basic operation executed?
d. What is the efficiency class of this algorithm?
e. Suggest an improvement or a better algorithm altogether and indicate its efficiency class. If
you cannot do it, try to prove that in fact it cannot be done.
Answer:
©2021 RMIT University 1
COSC 1285/2123
1. Tracing the algorithm to get its output for a few small values of n should help if you need it.
Computes S(n) =
∑n
i=1 i
2.
2. Multiplication (or, if multiplication and addition are assumed to take the same amount of
time, either of the two).
3. C(n) =
∑n
i=1 1 = n.
4. C(n) = n ∈ Θ(n).
5. Use the formula S(n) =
∑n
i=1 i
2 =
n(n+1)(2n+1)
6
to compute the sum in Θ(1) (constant) time.
2.4.1 Solve the following recurrence relations:
a. x(n) = x(n− 1) + 1, for n > 0, x(0) = 0.
Answer: Use backward substitution.
x(n) = x(n− 1) + 1
= [x(n− 2) + 1] + 1 = x(n− 2) + 2
= [x(n− 3) + 1] + 2 = x(n− 3) + 3
= . . .
= x(n− i) + i
= . . .
= x(0) + n = 0 + n = n
2.4.4 Consider ALGORITHM 2.
Algorithm 2 Q(n)
// Input: a positive integer n
if n == 1 then
return 1
else
return Q(n− 1) + 2 ∗ n− 1
end if
a. Set up a recurrence relation for this algorithm’s return values. Make a guess on what the
algorithm computes.
b. Set up a recurrence relation for the number of multiplications made by the algorithm.
Answer:
1. Note that you are asked here about a recurrence of the algorithm’s values, not about a re-
currence for the number of times its operation is executed. Just follow the pseudocode to set
it up. We use backward substitution to solve this (note, this is considered a more difficult
©2021 RMIT University 2
COSC 1285/2123
problem, but interesting to apply backward substitution to figure out what the algorithm is
computing.).
Q(n) = Q(n− 1) + 2 ∗ n− 1
= [Q(n− 2) + 2 ∗ (n− 1)− 1] + 2 ∗ n− 1 = Q(n− 2) + 4 ∗ n− 4
= [Q(n− 3) + 2 ∗ (n− 2)− 1] + 4 ∗ n− 4 = Q(n− 3) + 6 ∗ n− 9
= [Q(n− 4) + 2 ∗ (n− 3)− 1] + 6 ∗ n− 9 = Q(n− 4) + 8 ∗ n− 16
= . . .
= Q(n− i) + 2 ∗ i ∗ n− i2
= . . .
= Q(1) + 2 ∗ (n− 1) ∗ n− (n− 1)2 = 1 + 2 ∗ n2 − 2n− n2 + 2n− 1 = n2
2. This question is very similar to one we have already discussed.
M(n) = M(n − 1) + 1 for n > 1, M(1) = 0. Solving it by backward substitution yields
M(n) = n− 1.
©2021 RMIT University 3
COSC 1285/2123
Practical Exercises
Objective
� Learn how to benchmark algorithms.
Benchmarking
Evaluation in computer science can be conducted theoretically or empirically. In this lab, we learn
how to apply empirical analysis to a class of similar algorithms. The efficiency of an algorithm
is often measured as the running time. Here we investigate two simple ways to benchmark the
performance of a algorithm: the unix time command and the nanotime() method call.
Bubblesort
In this part of the lab we will compare the running time of two different BubbleSort implementations
for different input sets. Although we have not covered BubbleSort, the main purpose of the lab is
to explore how to empirically evaluate algorithms, so we can treat them as blackbox algorithms (for
now). The exercise for this part of the lab is to benchmark the run time of the two given implemen-
tations Bubblesort1.java and Bubblesort2.java using the given sample data sets test1.in and
test2.in. First, download and compile the two files Bubblesort1 and Bubblesort2 and perform
a run time analysis on them using the input sets test1.in and test2.in. You are encouraged to
test both implementations and compare their results. After you run your experiments you should
be able to answer the following questions:
� Which of the two implementations is faster?
� Is it enough to perform an experiment once? Try running an experiment multiple times and
compare the results?
� Is the time method accurate enough for your experiment?
� How do the two algorithms perform for different input sizes?
Download the code and compile both programs using the following commands:
javac BubbleSortUtils.java BubbleSort1.java
javac BubbleSortUtils.java BubbleSort2.java
Simple Run Time Evaluation
The easiest way to measure the running time of an algorithm is to use the unix time command
on the program that implements the algorithm (you will need access to a machine that has that
command, e.g., the Core Teaching servers). For example, invoking the time command results in the
following output:
$ time java BubbleSort1 < test1.in
real 12.0 user 11.0 sys 1.0 ©2021 RMIT University 4 COSC 1285/2123 Note that the format of the timing information may be different depending on the machine you � The elapsed (real) time between invocation of bubblesort1 � The user CPU time (the sum of the tms utime and tms cutime values in a struct tms (see � The system CPU time (the sum of the tms stime and tms cstime values in a struct tms (see Which of the values would you use in your experiments and why? Discuss with your lab demon- Accurate Run Time Evaluation Using the time command gives you an easy way to quickly evaluate the run time of a program � You are not evaluating the run time of your algorithm but the program that implements it. � The accuracy of the run time measurements are given in seconds. This is not very accurate A more accurate way of measuring the run time of an algorithm is to use the public static long nanoTime(); It returns the current time in nanoseconds precision. See javadocs for more details. ©2021 RMIT University 5 COSC 1285/2123 Consider the following example that measures the time required to count all prime numbers pub l i c c l a s s CountPrimes pub l i c s t a t i c i n t countPrimes ( i n t n) { // loop from n to 2 , and try each subsequent i as a prime boolean np = f a l s e ; where: � start of range for values: Start of integer range to generate values from (inclusive), must be 0 � end of range for values: End of integer range to generate values from (inclusive), must be 0 or � number of values to generate: Number of values to generate. � with — without: using sampling with (without) replacement. As an example, the following command generates 20 integers in the range 2 to 100, using sam- java DataGenerator 2 100 20 without ©2021 RMIT University 8
are using, but still to do with timing. The three different time values returned by the program time
are (use the man pages to find out more details of what they mean):
man -s2 times)
man -s2 times)
strator!
without the need to modify the actual source code. There are however multiple downsides to this
method:
This includes parts of your program like file I/O, program start up or memory allocation that
you actually don’t want to benchmark.
in the area of run time analysis. Evaluating a fast algorithm on small input set will be very
difficult using the time command.
System.nanoTime() method call:
smaller than n:
{
i n t count = 0 ;
f o r ( i n t i = n ; i > 1 ; i−−) {
f o r ( i n t j = 2 ; j < i ; j++) {
// t e s t i f prime
i f ( i % j == 0) {
np = true ;
break ;
}
// increment prime count
i f ( ! np ) {
count++;
}
}
}
re turn count ;
}
pub l i c s t a t i c void main ( St r ing [ ] a rgs ) {
// s e t to 40000 , can be anything you l i k e
i n t n = 40000 ;
long startTime = System . nanoTime ( ) ;
CountPrimes . countPrimes (n ) ;
long endTime = System . nanoTime ( ) ;
double estimatedTime = ( ( double ) ( endTime − startTime ) ) / Math . pow(10 , 9 ) ;
System . out . p r i n t l n (” time taken = ” + estimatedTime + ” sec ” ) ;
}
}
Figure 1: Sample time measurement using the nanoTime() system call.
©2021 RMIT University 6
COSC 1285/2123
The above code sample shows that you simply encapsulate your algorithm within two nanoTime()
calls to measure its run time in nanoseconds. It is therefore easy to exclude the parts of your program
like file I/O, program start up or memory allocation that you actually don’t want to benchmark.
Modify both bubblesort implementations according to the sample code above and
measure the running time using nanoTime().
Considering the Environment
Execution times on a specific machine normally depend upon details of the machine and on the
specific data used. Timings may vary from data set to data set and from machine to machine, so
experiments from one machine and one data set may not be very helpful in general. It is therefore
important to carefully consider the environment in which you benchmark your algorithm. You
will be running your experiments on the core teaching servers. These machines are used by many
students at a time which in turn can affect the outcome of your experiments. It is common practice
to run multiple times and use the average mean of these results during the evaluation of your
experiments.
Data Set Generation
The performance of algorithms is also dependent on the data that it is applied on. Therefore when
evaluating an algorithm, it is also desirable to consider test scenarios that the algorithm could face
and test with representative data samples of these scenarios.
Sometimes it is not possible to obtain real data that covers the important data scenarios. Hence,
it is useful to be able to generate data to test scenarios that we don’t have real data for. In this part
of the lab, we will study two simple ways to generate testing data.
There are many factors to consider, including what values to generate, how many to generate,
do we allow duplicate values and how likely we generate a value. To demonstrate how to generate
datasets, we generate data for the two Bubblesort algorithms. We consider the simple case of
generating (non-negative, or 0 or greater) integers over a range of values that are equally likely to
appear in our data samples. This can be implemented as sampling from a uniform distribution.
We consider two subcases, one where we allow duplicates, one where we do not. In the first
subcase (duplicates are allowed), this is sampling (from a uniform distribution) with replacement.
Without going into details, imagine we have a ball for each integer in the range. We place these
balls into a bag, mix them, then draw/sample one ball. We note down the integer represented by
the ball, then drop the ball back in (leading to the name “with replacement”). We mix them again,
and draw again until we drawn the number of integers (balls) we need.
In the second subcase (only unique values, no duplicates allowed), this is sampling (from a
uniform distribution) without replacement. Using the ball and integer analogy again, this time,
when we draw/sample a ball, we note down the integer of the ball but do not drop the ball back in.
We repeat this process until we have drawn the number of integers (balls) we need. Since the balls
are not put back into the bag, then it the integer it represents can’t be drawn again, so duplicates
cannot happen.
Consider the file DataGenerator.java. We have implemented sampling without replacement.
Your task is to implement sampling with replacement to generate data generation that allows du-
plicates. Complete the implementation for method sampleWithReplacement(). As a hint, consider
java.util.Random.nextInt().
©2021 RMIT University 7
COSC 1285/2123
To run the program, type:
java DataGenerator
or greater.
greater.
pling without replacement (no duplicates):