SIT323 Cloud Application Development, Trimester 1, 2021
Programming Task 1 – Validation and Testing
Due Date
Sunday 8:00 PM, May 2, 2021
Introduction
Programming Tasks 1 and 2 are parts of the same project. They require you to design, develop and test software for a task allocation problem in which a parallel program is partitioned into a set of tasks, and these tasks must be allocated to a set of processors such that the amount of energy consumed is minimised (see class notes of week 1).
1. In brief, Programming Task 1 focuses on designing, developing and testing a new program such that it is able to:
• Validate/invalidate the two data files related to configuration and task allocations.
• Validate/invalidate allocations that are described in the allocations file.
• Compute and display the amount of time taken to run the parallel program.
• Compute and display the amount of energy consumed by each allocation.
• Display each allocation, i.e., indicate the tasks allocated to each processor.
• Compute and display the maximum amount of RAM required by tasks on each
processor, and the amount of RAM available to each processor.
• Compute and display the maximum download speed required by tasks on each
processor, and the amount of download speed available to each processor.
• Compute and display the maximum upload speed required by tasks on each
processor, and the amount of upload speed available to each processor.
You will be provided with valid input data files, but also invalid input data files. These
files can be found in the ZIP file called Programming Task 1 – Data Files.zip.
2. In brief, Programming Task 2 requires you to focus on:
• Designing, developing and testing a cloud solution.
• Loading data from a configuration file to compute one or more allocations of tasks
such that the amount of energy consumed is valid and is the lowest that you can achieve. Note, there might be more than one allocation where the consumed energy is the same and the lowest.
• Optimisation techniques.
• Displaying these allocations and related data such as the amount of energy
consumed, and the amount of runtime.
• Ensuring that these new allocations are valid.
Objectives
The main objectives for Programming Task 1 are:
1. Design and develop a new program to work with the two input data files. Their formats are described in the following sections:
• Task Allocation file format (.taff)
• Configuration file format (.cff)
2. Design and develop many unit tests.
3. Code to conventions and standards.
Testing Requirements – Unit Tests
Your software solution for Programming Task 1 requires testing. You must design and develop several unit tests for the following functionality.
1. Determining whether the amount of RAM required by a task is less than or equal to the amount of RAM associated with a processor.
2. Determining whether the amount of download speed required by a task is less than or equal to the amount of download speed provided by a processor.
3. Determining whether the amount of upload speed required by a task is less than or equal to the amount of upload speed provided by a processor.
4. Determining whether the contents of a TAFF file conform to the TAFF format.
5. Determining whether the contents of a CFF file conform to the CFF format.
6. For an invalid TAFF file, determine whether errors in that TAFF file are detected
by your code.
7. For an invalid CFF file, determine whether errors in that CFF file are detected by
your code.
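As an illustration of items 1 to 3, a minimal sketch of such unit tests is shown here in Python with the standard unittest module. The helper name is_within_capacity is hypothetical, and the language and test framework of your own solution may differ.

```python
import unittest


def is_within_capacity(required: float, available: float) -> bool:
    """Return True when a task's requirement fits a processor's capacity.

    Hypothetical helper: the same comparison serves RAM, download speed
    and upload speed.
    """
    return required <= available


class CapacityTests(unittest.TestCase):
    def test_requirement_below_capacity_is_accepted(self):
        self.assertTrue(is_within_capacity(2, 4))

    def test_requirement_equal_to_capacity_is_accepted(self):
        self.assertTrue(is_within_capacity(4, 4))

    def test_requirement_above_capacity_is_rejected(self):
        self.assertFalse(is_within_capacity(8, 4))
```

Tests of this kind can be run with "python -m unittest". Note the boundary case: a requirement equal to the capacity is accepted, because the comparison is "less than or equal to".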
Task Allocations file format (.taff)
Your software solution for Programming Task 1 must use data from a text file that contains zero or more task allocations (see “PT1 – Test1.taff” as an example).
1. Lines containing zero or more white spaces only are allowed.
2. Lines containing a comment or data are permitted to commence with 0 or more white spaces.
3. A line containing a comment is allowed. The symbol // will indicate the start of a comment, the end of line will represent the end of a comment. Some valid comment lines are as follows.
// This is valid.
// Creation date: 28/2/2021
// Leading white spaces are valid too.
Mixing data and a comment on one line is not allowed. For example, the following line is invalid:
FILENAME=”PT1 – Test1.cff” // Configuration filename.
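Rules 1 to 3 can be sketched as a small line classifier. This is only an illustration in Python: the function name classify_line and its four return labels are assumptions, and the check for a mixed line is a simplification that would also reject a quoted value containing //.

```python
def classify_line(line: str) -> str:
    """Classify one line of a TAFF or CFF file.

    Returns "blank", "comment", "data", or "invalid" (data mixed with a
    comment). The labels are hypothetical, chosen for this sketch only.
    """
    stripped = line.strip()          # leading white space is permitted
    if not stripped:
        return "blank"               # zero or more white spaces only
    if stripped.startswith("//"):
        return "comment"             # comment runs to the end of the line
    if "//" in stripped:
        return "invalid"             # data followed by a comment
    return "data"


print(classify_line("   // Leading white spaces are valid too."))  # comment
print(classify_line("DURATION=20 // The program must not exceed this limit."))  # invalid
```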
4. There will be a section that references a configuration file. This section starts with the keyword CONFIGURATION-DATA on a line by itself, and ends with the keyword END-CONFIGURATION-DATA on a line by itself.
Between these two keywords will be a line containing the name of a configuration file. It commences with the keyword FILENAME, followed by an equals symbol, and ends with the filename that is delimited by double quotes. For example.
CONFIGURATION-DATA
FILENAME=”config.cff”
END-CONFIGURATION-DATA
5. There will be a section that contains data about allocations. This section starts with the keyword ALLOCATIONS on a line by itself, and ends with the keyword END-ALLOCATIONS on a line by itself.
There will be a line indicating the number of allocations. It commences with the keyword COUNT, followed by an equals symbol, and the number of allocations in this section.
There will be a line indicating the number of tasks in each allocation. It commences with the keyword TASKS, followed by an equals symbol, and the number of tasks.
There will be a line indicating the number of processors in each allocation. It commences with the keyword PROCESSORS, followed by an equals symbol, and the number of processors.
There will be 0 or more sub-sections of allocation data, one sub-section for each allocation. Each sub-section starts with the keyword ALLOCATION on a line by itself, and ends with the keyword END-ALLOCATION on a line by itself. Details of an ALLOCATION sub-section are given below.
For example, the following indicates 2 allocations, 5 tasks, and 3 processors.
ALLOCATIONS
COUNT=2
TASKS=5
PROCESSORS=3
ALLOCATION
ID=0
MAP=1,1,0,0,0;0,0,0,1,0;0,0,1,0,1
END-ALLOCATION
ALLOCATION
ID=1
MAP=1,0,0,1,0;0,1,0,0,0;0,0,1,0,1
END-ALLOCATION
END-ALLOCATIONS
6. There will be a sub-section of data for each allocation. In general, each allocation has an ID and a MAP that represents the allocation of 0 or more tasks to each processor.
For each ALLOCATION sub-section, there will be two lines of data. The first line commences with the keyword ID, followed by an equals symbol, and ends with an ID number. The second line commences with the keyword MAP, followed by an equals symbol, and ends with data to specify an allocation.
This map data contains semicolon separated sections, one section per processor. Each of these sections contains comma separated 1s and 0s such as 1,0,1,0,0 where values in the mth section represent an allocation of 0 or more tasks onto the mth processor.
• 1 in the nth position of the mth section indicates the nth task is assigned to the mth processor.
• 0 in the nth position of the mth section indicates the nth task is not assigned to the mth processor.
The following ALLOCATION example represents one allocation with an ID of 0, and a map. This map represents 5 tasks, and 3 processors. Based on this map data:
• Processor 0 has been allocated tasks 0 and 1.
• Processor 1 has been allocated task 3.
• Processor 2 has been allocated tasks 2 and 4.
ALLOCATION
ID=0
MAP=1,1,0,0,0;0,0,0,1,0;0,0,1,0,1
END-ALLOCATION
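The reading of MAP data described above can be sketched as follows. This is a minimal Python illustration, not a required design: the function names parse_allocation_map and tasks_per_processor are hypothetical, and tasks and processors are assumed to be numbered from 0 as in the example.

```python
def parse_allocation_map(map_text: str) -> list[list[int]]:
    """Split MAP data into one row of 0/1 flags per processor."""
    rows = []
    for section in map_text.split(";"):          # one section per processor
        row = [int(value) for value in section.split(",")]
        if any(flag not in (0, 1) for flag in row):
            raise ValueError(f"MAP values must be 0 or 1: {section!r}")
        rows.append(row)
    return rows


def tasks_per_processor(rows: list[list[int]]) -> list[list[int]]:
    """For each processor, list the numbers of the tasks allocated to it."""
    return [[task for task, flag in enumerate(row) if flag == 1] for row in rows]


rows = parse_allocation_map("1,1,0,0,0;0,0,0,1,0;0,0,1,0,1")
print(tasks_per_processor(rows))  # [[0, 1], [3], [2, 4]]
```

The printed result matches the example: processor 0 holds tasks 0 and 1, processor 1 holds task 3, and processor 2 holds tasks 2 and 4.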
Configuration file format (.cff)
Your software solution for Programming Task 1 must use data from a CFF file that contains configuration data (see “PT1 – Test1.cff” as an example).
1. Lines containing zero or more white spaces only are allowed.
2. Lines containing a comment or data are permitted to commence with 0 or more white spaces.
3. A line containing a comment is allowed. The symbol // will indicate the start of a comment, the end of line will represent the end of a comment. Some valid comment lines are as follows.
// This is valid.
// Creation date: 28/2/2021
// Leading white spaces are valid too.
Mixing data and a comment on one line is not allowed. For example, the following two lines are invalid.
LOGFILE // The name of the default log file.
DURATION=20 // The program must not exceed this limit.
4. There will be a section that references a log file. This section starts with the keyword LOGFILE on a line by itself, and ends with the keyword END-LOGFILE on a line by itself.
Between these two keywords is a line containing the default name of a log file. It commences with the keyword DEFAULT, followed by an equals symbol, and ends with the filename that is delimited by double quotes. For example.
LOGFILE
DEFAULT=”log.txt”
END-LOGFILE
5. There will be a section of minimum and maximum limits for the number of tasks, the number of processors, processor frequencies, amounts of RAM, download speeds, and upload speeds. This section starts with the keyword LIMITS on a line by itself, and ends with the keyword END-LIMITS on a line by itself.
Each line in this section commences with a keyword, followed by an equals symbol, and a value. For example:
LIMITS
MINIMUM-TASKS=1
MAXIMUM-TASKS=100
MINIMUM-PROCESSORS=1
MAXIMUM-PROCESSORS=500
MINIMUM-PROCESSOR-FREQUENCIES=1.37
MAXIMUM-PROCESSOR-FREQUENCIES=10.0
MINIMUM-RAM=1
MAXIMUM-RAM=64
MINIMUM-DOWNLOAD=1
MAXIMUM-DOWNLOAD=1000
MINIMUM-UPLOAD=1
MAXIMUM-UPLOAD=1000
END-LIMITS
This means that we cannot have a program partitioned into more than 100 tasks, we cannot use more than 500 processors, we cannot use a processor that has a frequency of more than 10 GHz, and we cannot allocate more than 64 GB of RAM to a processor, etc.
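A limit check of this kind can be sketched as below. The function names parse_limits and is_within_limits are hypothetical, and treating both the minimum and the maximum as inclusive bounds is an assumption based on the wording "cannot have more than".

```python
def parse_limits(lines: list[str]) -> dict[str, float]:
    """Read the KEYWORD=value lines found between LIMITS and END-LIMITS."""
    limits = {}
    for line in lines:
        keyword, _, value = line.strip().partition("=")
        limits[keyword] = float(value)
    return limits


def is_within_limits(value: float, name: str, limits: dict[str, float]) -> bool:
    """Check a value against MINIMUM-<name> and MAXIMUM-<name> (inclusive)."""
    return limits[f"MINIMUM-{name}"] <= value <= limits[f"MAXIMUM-{name}"]


limits = parse_limits(["MINIMUM-TASKS=1", "MAXIMUM-TASKS=100",
                       "MINIMUM-RAM=1", "MAXIMUM-RAM=64"])
print(is_within_limits(5, "TASKS", limits))   # True
print(is_within_limits(128, "RAM", limits))   # False
```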
6. There will be a section containing data related to the parallel program: the maximum duration of that program, the number of tasks into which this program must be partitioned, and the number of processors available to run the tasks.
This section starts with the keyword PROGRAM on a line by itself, and ends with the keyword END-PROGRAM on a line by itself. Each line in this section commences with a keyword, followed by an equals symbol, and a value. For example:
PROGRAM
DURATION=3.0
TASKS=5
PROCESSORS=3
END-PROGRAM
This means that the parallel program must complete within 3 seconds. It must be partitioned into 5 tasks. Each task must be allocated to one of the 3 processors.
7. There will be a section that contains data about tasks. This section starts with the keyword TASKS on a line by itself, and ends with the keyword END-TASKS on a line by itself.
There will be 1 or more sub-sections of task data, one sub-section for each task. Each sub-section starts with the keyword TASK on a line by itself, and ends with the keyword END-TASK on a line by itself. Details of a TASK sub-section are given below.
For example, the following indicates 2 tasks.
TASKS
TASK
ID=0
// Several lines of task data go here, see below.
END-TASK
TASK
ID=1
// Several lines of task data go here, see below.
END-TASK
END-TASKS
8. There will be a sub-section of data for each task. In general, each task has an ID, a runtime that was measured on a processor running at a particular frequency, requires an amount of RAM, requires an amount of download speed, and requires an amount of upload speed.
Each line in a task sub-section commences with a keyword, followed by an equals symbol, and a value. For example:
TASK
ID=0
RUNTIME=1.0
REFERENCE-FREQUENCY=2.2
RAM=2
DOWNLOAD=100
UPLOAD=10
END-TASK
NOTE – this example indicates that this task took 1 second to complete while on a processor running at 2.2 GHz. If this task was allocated to a faster processor, it would complete in a shorter time. It would take 0.5 seconds to complete while running on a 4.4 GHz processor.
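The note above implies that runtime is inversely proportional to processor frequency. A minimal sketch of that scaling, with a hypothetical function name scaled_runtime, is:

```python
def scaled_runtime(runtime: float, reference_frequency: float,
                   processor_frequency: float) -> float:
    """Runtime of a task on a processor, scaled from its reference frequency.

    Assumes runtime is inversely proportional to frequency, as the note
    describes (frequencies in GHz, runtime in seconds).
    """
    return runtime * reference_frequency / processor_frequency


print(scaled_runtime(1.0, 2.2, 4.4))  # 0.5
```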
9. There will be a section that contains data about processors. This section starts with the keyword PROCESSORS on a line by itself, and ends with the keyword END-PROCESSORS on a line by itself.
There will be 1 or more sub-sections of processor data, one sub-section for each processor. Each sub-section starts with the keyword PROCESSOR on a line by itself, and ends with the keyword END-PROCESSOR on a line by itself. Details of a PROCESSOR sub-section are given below.
For example, the following indicates 2 processors.
PROCESSORS
PROCESSOR
ID=0
// Several lines of processor data go here, see below.
END-PROCESSOR
PROCESSOR
ID=1
// Several lines of processor data go here, see below.
END-PROCESSOR
END-PROCESSORS
10. There will be a sub-section of data for each processor. In general, each processor has an ID, a type, a frequency (GHz), an allocated amount of RAM (GB), a download speed capability (Gbps), and an upload speed capability (Gbps).
Each line in a processor sub-section commences with a keyword, followed by an equals symbol, and a value. For example:
PROCESSOR
ID=0
TYPE=”Intel i5”
FREQUENCY=1.8
RAM=4
DOWNLOAD=300
UPLOAD=50
END-PROCESSOR
NOTE – Consider a task that takes 10 seconds to complete while on a processor running at 2 GHz. If that task was allocated to this slower processor running at 1.8 GHz, it would take longer to complete. In fact it would take 11.111 seconds to complete (10 * 2 / 1.8).
11. There will be a section that contains data about the types of processors. This section starts with the keyword PROCESSOR-TYPES on a line by itself, and ends with the keyword END-PROCESSOR-TYPES on a line by itself.
There will be 1 or more sub-sections of processor type data, one sub-section for each kind of processor. Each sub-section starts with the keyword PROCESSOR-TYPE on a line by itself, and ends with the keyword END-PROCESSOR-TYPE on a line by itself. Details of a PROCESSOR-TYPE sub-section are given below.
For example, the following indicates 2 kinds of processors.
PROCESSOR-TYPES
PROCESSOR-TYPE
// Several lines of type information go here.
END-PROCESSOR-TYPE
PROCESSOR-TYPE
// Several lines of type information go here.
END-PROCESSOR-TYPE
END-PROCESSOR-TYPES
12. There will be a sub-section of data for each type of processor. In general, each type of processor has a name, and a set of three coefficient values for a quadratic formula.
Each line in a processor type sub-section commences with a keyword, followed by an equals symbol, and a value. For example:
PROCESSOR-TYPE
NAME=”Intel i5”
C2=10
C1=-25
C0=25
END-PROCESSOR-TYPE
This means that the “Intel i5” processor has an energy (per second) consumption function of:
energy per second = 10f² – 25f + 25
This means that a task that runs for 2.5 seconds on a 3.3 GHz Intel i5 processor will consume the following amount of energy.
(10*3.3² – 25*3.3 + 25) * 2.5
= (108.9 – 82.5 + 25) * 2.5
= 51.4 * 2.5
= 128.5
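The calculation above can be expressed as a short function. This is a sketch only; the function name task_energy and its parameter names are hypothetical, with c2, c1 and c0 taken from the PROCESSOR-TYPE sub-section.

```python
def task_energy(c2: float, c1: float, c0: float,
                frequency: float, runtime: float) -> float:
    """Energy used by one task: quadratic energy-per-second times runtime."""
    energy_per_second = c2 * frequency ** 2 + c1 * frequency + c0
    return energy_per_second * runtime


# The worked example: 2.5 seconds on a 3.3 GHz Intel i5.
print(round(task_energy(10, -25, 25, 3.3, 2.5), 3))  # 128.5
```

The result is rounded for display because floating-point arithmetic may leave a tiny error in the last digits.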
13. There will be a section containing energy costs for local communications. That is, a task sends data to another task, and both tasks are running on the same processor. This section commences with the keyword LOCAL-COMMUNICATION on a line by itself, contains a map of energy values, and ends with the keyword END-LOCAL-COMMUNICATION on a line by itself.
The map data commences with the keyword MAP, followed by an equals symbol, and ends with data to specify the energy costs for local communication.
This map data contains semicolon separated sections, one section per task. Each of these sections contains comma separated numeric values such as 0,0.2,0,0.33,1.7 where values in the mth section represent how much energy is required by the mth task to locally communicate with the other tasks.
• A value, say E, in the nth position of the mth section indicates that the mth task requires E amount of energy to locally communicate with the nth task.
• A zero value in the nth position of the mth section indicates that the mth task does not locally communicate with the nth task.
For example, the following map data indicates the amount of energy required for:
• Task 0 to locally send data to Task 1, 3 or 4.
• Task 1 to locally send data to Task 2.
• Task 2 to locally send data to Task 3 or 4.
• Task 3 to locally send data to Task 0 or 4.
• Task 4 to locally send data to Task 0.
LOCAL-COMMUNICATION
MAP=0,4,0,1,1;0,0,5,0,0;0,0,0,5,2;7,0,0,0,5;2,0,0,0,0
END-LOCAL-COMMUNICATION
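A communication map can be read in much the same way as an allocation map, except that the values are numeric energies rather than 0s and 1s. The sketch below uses hypothetical function names parse_energy_map and receivers, and does no validation of the input.

```python
def parse_energy_map(map_text: str) -> list[list[float]]:
    """Split MAP data into one row of energy values per sending task."""
    return [[float(value) for value in section.split(",")]
            for section in map_text.split(";")]


def receivers(energy_map: list[list[float]]) -> dict[int, list[int]]:
    """For each sending task, list the tasks it communicates with."""
    return {sender: [task for task, energy in enumerate(row) if energy != 0]
            for sender, row in enumerate(energy_map)}


local = parse_energy_map("0,4,0,1,1;0,0,5,0,0;0,0,0,5,2;7,0,0,0,5;2,0,0,0,0")
print(receivers(local))
# {0: [1, 3, 4], 1: [2], 2: [3, 4], 3: [0, 4], 4: [0]}
```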
14. There will be a section containing energy costs for remote communications. That is, a task sends data to another task, and both tasks are running on different processors. This section commences with the keyword REMOTE-COMMUNICATION on a line by itself, contains a map of energy values, and ends with the keyword END-REMOTE-COMMUNICATION on a line by itself.
The map data commences with the keyword MAP, followed by an equals symbol, and ends with data to specify the energy costs for remote communication.
This map data contains semicolon separated sections, one section per task. Each of these sections contains comma separated numeric values such as 0,0.2,0,0.33,1.7 where values in the mth section represent how much energy is required by the mth task to remotely communicate with the other tasks.
• A value, say E, in the nth position of the mth section indicates that the mth task requires E amount of energy to remotely communicate with the nth task.
• A zero value in the nth position of the mth section indicates that the mth task does not remotely communicate with the nth task.
For example, the following map data indicates the amount of energy required for:
• Task 0 to remotely send data to Task 1, 2, 3 or 4.
• Task 1 to remotely send data to Task 2.
• Task 2 to remotely send data to Task 3.
• Task 3 to remotely send data to Task 4.
• Task 4 to remotely send data to Task 0 or 1.
REMOTE-COMMUNICATION
MAP=0,1,1,1,1;0,0,5,0,0;0,0,0,5,0;0,0,0,0,5;3,2,0,0,0
END-REMOTE-COMMUNICATION
Validating files and allocation
1. Your software solution for Programming Task 1 must be able to validate both text input data files (taff and cff) with respect to file format.
Invalid aspects of these files are displayed to a user via the GUI, and appended to the log file.
2. Your software solution for Programming Task 1 must be able to validate allocations too, but only if both input data files are valid. Consequently, it is possible to have perfectly valid input files, but an allocation is invalid.
For example, an allocation is invalid if the accumulated runtime of tasks (from any of the processors) exceeds the overall program runtime.
As another example, the following allocation is invalid because task 0 has been allocated to two processors (processor 0 and processor 2).
ALLOCATION
ID=0
MAP=1,1,0,0,0;0,0,1,1,0;1,0,0,0,1
END-ALLOCATION
Invalid aspects of an allocation should also be displayed on the GUI, and appended to the log file.
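The second example can be detected with a check such as the following sketch. The function name allocation_errors and the message wording are hypothetical. Treating a task that is allocated to no processor as an error is an assumption, based on the requirement that each task must be allocated to one of the processors. The rows are assumed to have equal length.

```python
def allocation_errors(rows: list[list[int]]) -> list[str]:
    """Report every task that is not allocated to exactly one processor.

    rows[p][t] is 1 when task t is allocated to processor p.
    """
    errors = []
    task_count = len(rows[0]) if rows else 0
    for task in range(task_count):
        processors = [p for p, row in enumerate(rows) if row[task] == 1]
        if len(processors) != 1:
            errors.append(f"Task {task} is allocated to {len(processors)} "
                          f"processors: {processors}")
    return errors


invalid = [[1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [1, 0, 0, 0, 1]]
print(allocation_errors(invalid))
# ['Task 0 is allocated to 2 processors: [0, 2]']
```

Each message in the returned list could then be shown on the GUI and appended to the log file.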
Displaying an allocation and related values
1. Whether or not the input files and allocations are valid, the existing program attempts to display valid or invalid allocations on the GUI. For example, an invalid allocation of 0s and 1s in the above section could still be displayed.
Displaying a valid allocation on your GUI might look like:
Allocation ID = 0, Runtime = 1.23, Energy = 234.56
1,1,0,0,0 0,0,1,1,0 0,0,0,0,1
2/2 GB 4/8 GB 4/16 GB
200/300 Gbps 180/220 Gbps 330/380 Gbps
40/50 Gbps 20/30 Gbps 60/75 Gbps
2. The existing program computes and displays the program runtime on the GUI. An appropriate value is still displayed for an invalid allocation.
For each processor P, RUNTIME(P) is the sum of runtimes of each allocated task on that processor.
The program runtime is the maximum of those RUNTIME(P) values.
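A minimal sketch of this computation follows. The function name program_runtime and the data layout are hypothetical, the sample values are invented for illustration, and each task's runtime is assumed to be scaled by reference frequency divided by processor frequency as in the CFF notes.

```python
def program_runtime(rows, tasks, frequencies):
    """Maximum over processors of the summed runtimes of their tasks.

    rows[p][t] is 1 when task t runs on processor p.
    tasks[t] is (runtime, reference_frequency); frequencies[p] is in GHz.
    """
    per_processor = []
    for p, row in enumerate(rows):
        # RUNTIME(P): sum of the scaled runtimes of the tasks on processor p.
        total = sum(tasks[t][0] * tasks[t][1] / frequencies[p]
                    for t, flag in enumerate(row) if flag == 1)
        per_processor.append(total)
    return max(per_processor, default=0.0)


rows = [[1, 1, 0], [0, 0, 1]]                  # hypothetical allocation
tasks = [(1.0, 2.0), (0.5, 2.0), (2.0, 2.0)]   # hypothetical task data
frequencies = [2.0, 4.0]                       # hypothetical processors
print(program_runtime(rows, tasks, frequencies))  # 1.5
```

Here processor 0 runs for 1.0 + 0.5 = 1.5 seconds and processor 1 for 1.0 second, so the program runtime is 1.5 seconds.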
3. The existing program computes the program energy, i.e., the amount of energy consumed by a valid allocation, and displays it on the GUI. An appropriate value is still displayed for an invalid allocation.
The program energy is the sum of all energies to:
• run each allocated task on a particular processor,
• locally communicate, and
• remotely communicate.
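One possible reading of this sum is sketched below. It is an interpretation, not a definitive method: for every ordered pair of tasks, the local cost is assumed to apply when both tasks are on the same processor and the remote cost otherwise. The function names, the processor_of list and all sample values are hypothetical.

```python
def communication_energy(processor_of, local, remote):
    """Sum the local or remote energy for every ordered pair of tasks.

    processor_of[t] is the processor running task t; local[m][n] and
    remote[m][n] are the energy costs for task m to send data to task n.
    """
    total = 0.0
    for m, row in enumerate(local):
        for n in range(len(row)):
            same_processor = processor_of[m] == processor_of[n]
            total += local[m][n] if same_processor else remote[m][n]
    return total


def program_energy(task_energies, processor_of, local, remote):
    """Energy to run every task plus all communication energy."""
    return sum(task_energies) + communication_energy(processor_of, local, remote)


processor_of = [0, 0, 1]                       # hypothetical allocation
local = [[0, 4, 1], [0, 0, 5], [2, 0, 0]]      # hypothetical local costs
remote = [[0, 1, 1], [0, 0, 5], [3, 0, 0]]     # hypothetical remote costs
print(program_energy([10.0, 20.0, 30.0], processor_of, local, remote))  # 73.0
```

In this example the communication energy is 4 (task 0 to task 1, local) + 1 + 5 + 3 (remote) = 13, added to 60 for running the tasks.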
4. For each processor, compute and display the amount of RAM required by the allocated tasks (this is a maximum, not a sum), and display the amount of RAM available to that processor.
5. For each processor, compute and display the amount of download speed required by the allocated tasks (this is a maximum, not a sum), and display the amount of download capability of that processor.
6. For each processor, compute and display the amount of upload speed required by the allocated tasks (this is a maximum, not a sum), and display the amount of upload capability of that processor.
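Items 4 to 6 share the same shape: take the maximum requirement among the tasks on a processor and show it beside the processor's capacity. The sketch below uses a hypothetical function name required_maximum and invented task and processor values chosen to reproduce the RAM row of the sample display above.

```python
def required_maximum(row: list[int], requirements: list[float]) -> float:
    """Largest requirement among the tasks allocated to one processor.

    This is a maximum, not a sum; a processor with no tasks needs 0.
    """
    allocated = [requirements[t] for t, flag in enumerate(row) if flag == 1]
    return max(allocated, default=0)


task_ram = [2, 1, 4, 3, 4]      # hypothetical RAM (GB) needed by tasks 0 to 4
rows = [[1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [0, 0, 0, 0, 1]]
processor_ram = [2, 8, 16]      # hypothetical RAM (GB) of each processor
print("  ".join(f"{required_maximum(row, task_ram)}/{available} GB"
                for row, available in zip(rows, processor_ram)))
# 2/2 GB  4/8 GB  4/16 GB
```

The same function can be reused for download and upload speeds by passing a different requirements list.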
Data files
To help you design and develop your unit tests and software, you will be provided with several pairs of CFF and TAFF files based on the above formats. These files can be found in the ZIP file called Programming Task 1 – Data Files.zip.