COMP3100 Assignment 2: Semester 1 – 2020
This document details the submission method, the time limit, and the questions, which are similar to those in your final exam (especially Question 4).
Assessment Weighting
The assessment is out of 100 marks and is worth 10% of your final grade.
What to submit
Submit 1 (one) PDF file containing your answers to all questions.
Filename format: COMP3100_studentname_studentnumber.pdf
Where to submit
Assignment 2 question submission box (in the assessments section on iLearn).
This is a TurnItIn submission box which will check for collaboration / plagiarism.
When to submit
Before the deadline
You may submit as many times as you like before the deadline. Date/Time (Sydney time): 23:59, 7 June 2020
Type of assessment
Individual assessment.
You may refer to the material on iLearn during the exam. This must be your own work – cite any sources used from outside of the unit.
Instructions:
• This assignment has 4 questions. The first three questions are worth 20 marks each and the last question is worth 40 marks, for a total of 100 marks.
• Include your name and student number in the document.
• Each question must be answered in the SAME file.
• Start your answer for each question on a NEW page.
• Number your answers (Answer for Q1, Answer for Q2, Answer for Q3, Answer for Q4).
Each answer should be around one page. The format requirements are listed below:
• single-spaced pages
• 13 point Palatino or Times typeface
• This means around 500 words as a guide.
Remember that expressing ideas clearly and succinctly is more important than quantity. You may consult other materials (lecture slides, notes, web searches, etc.) to help you remember specific details, but do not copy and paste. You must provide the summaries in your own words (overly long answers will raise suspicion).
The aim is to show that you have sufficient knowledge of these topics to quickly provide a high-level explanation to the management of the hypothetical company for which you are working. In reality, you might spend several days on each topic to produce polished writing. Given the short period (in the exam), your writing should express the ideas as clearly as possible. It is your opportunity to show us your knowledge of the area. References are required for external resources.
Taiji Cloud Scheduler
Instructions
“Google owns and operates data centers all over the world, helping to keep the internet humming 24/7. Learn how our relentless focus on innovation has made our data centers some of the most high-performing, secure, reliable, and efficient data centers in the world” [1].
Figure 1: Google Data Center [1].
“Google cloud scheduler is a fully managed enterprise-grade cron job scheduler. It allows you to schedule virtually any job, including batch, big data jobs, cloud infrastructure operations, and more. You can automate everything, including retries in case of failure to reduce manual toil and intervention. Cloud Scheduler even acts as a single pane of glass, allowing you to manage all your automation tasks from one place.” [2]
Taiji Cloud is a start-up company that wants to compete with Google in the data centre market. The most important thing for the company is to come up with a competitive job scheduling algorithm to gain market share from current unicorn companies like Google.
The CTO of the start-up knows you have been working on job scheduling algorithms in COMP3100 and asks you to explain the three algorithms from stage 2 in detail (Questions 1-3). The first three questions are based on Taiji Cloud's existing datacenter, which has multiple resource types. These resources are used by the jobs submitted to the datacenter. The datacenter follows three policies, FirstFit, BestFit, and WorstFit, to assign jobs to resources. The jobs are listed in Table 1 and the resources in Table 2.
Table 1: Jobs
Job Id  Arrival Time  Estimated Execution Time (Time Slot)  Required CPU Cores  Required Memory  Required Disk Space
0       2             5                                     3                   2                2
1       3             7                                     2                   3                5
2       5             4                                     4                   1                3
3       6             8                                     3                   2                4
4       10            9                                     1                   5                2
5       23            3                                     2                   3                3
6       34            10                                    2                   3                2
7       43            12                                    3                   2                2
8       44            3                                     4                   4                1
9       56            15                                    3                   1                3
10      57            5                                     2                   2                4
11      89            1                                     1                   4                4
12      90            8                                     2                   3                3
13      94            7                                     2                   2                2
14      95            2                                     4                   3                5
15      95            6                                     4                   3                4
Table 2: Servers
Resource Type  Bootup Time  Time Slot Rate  CPU Cores  Memory  Disk Space  Limit
SYD            10           $1              2          10      15          3
MEL            15           $2              4          15      20          4
GLC            5            $3              8          20      25          5
Assumptions:
• Each server is in exactly one of the following states: Booting (BOT), Idle (IDL), or Active (ATV).
• A server's available time is the time from which it can RUN a job, that is, when it is in the Active (ATV) state.
• The cost is calculated based on the number of time slots a job uses. If a server has no jobs at a given time, its cost for that time is zero.
• When multiple jobs run on the same server, the cost is determined by the longest estimated execution time among those jobs.
• The datacenter considers the following metrics for performance evaluation:
Table 3: Metrics
Metric           Description
Turnaround Time  The amount of time taken to complete a job. In other words, a job's turnaround time is the difference between its completion time and its arrival time.
Cost             The cost incurred by the jobs submitted to each server with respect to the time slot rate.
Utilization      How loaded the server is in terms of CPU, memory, and disk.
* Please note that in this assignment, the current utilization or cost is considered for an active server. Each server type has its own cost.
** For the reason (Why), please mention the main steps and briefly state the reason.
For each metric, you must consider the last completion time of all jobs, that is, the time at which no jobs remain to be executed (e.g., the simulation end time in ds-server). Hence, the utilization period of each server runs from the server ready time to the last completion time, including both active and idle time.
The second row of Table 4 is completed for job 0 under the FirstFit algorithm.
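As a rough illustration of the timing quantities used in the metrics, the Python sketch below computes a server's ready time and a job's turnaround time. The function names are hypothetical, and the sketch assumes a job starts as soon as its server finishes booting and runs for exactly its estimated execution time; that simplification is an assumption, not something stated in the stage 2 description.

```python
def server_ready_time(request_time: int, bootup_time: int) -> int:
    """Time at which a server that starts booting at request_time can run jobs."""
    return request_time + bootup_time


def turnaround_time(arrival: int, completion: int) -> int:
    """Turnaround time as defined in Table 3: completion time minus arrival time."""
    return completion - arrival


# Job 0 (Table 1) arrives at time 2 and is placed on a MEL server (bootup time 15).
ready = server_ready_time(2, 15)  # matches the S_ready entry 2+15 = 17 in Table 4

# Assumption: the job starts at `ready` and runs for its estimated 5 time slots.
completion = ready + 5

print(ready, completion, turnaround_time(2, completion))
```

Under these assumptions the utilization period of that server would start at time 17 and end at the last completion time over all jobs.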
Table 4: Scheduling decisions
J_id  J_avt  S_type  S_id  S_state  S_ready    Why**        Res      J_finished  J_id_waiting  Cost*  Utilization*
0     2      MEL     0     BOT      2+15 = 17  Steps 3,4,5  4,15,20  N/A         N/A           $0     0%,0%,0%
Table 4 parameters have the following meanings:
J_id: Job Id
J_avt: Job Arrival Time
S_type: Server Type
S_id: Server Id
S_state: Server State
S_ready: Server available time
Why: the reason for the scheduling decisions
Res: Current server resource; core, memory, disk
J_finished: Finished jobs on this server
J_id_waiting: Waiting jobs on this server
Cost: Accumulated server cost for each type over time
Utilization: Current utilization of resources
The CTO also asks you to write an investigative report for the fourth question. Each part of that question includes instructions on what you should write about as a starting point.
References:
[1] https://www.google.com.au/about/datacenters/
[2] https://cloud.google.com/scheduler
Exam questions are over the page
Q1 FirstFit algorithm (20 marks)
Following the FirstFit algorithm described in the stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 when all jobs are completed, including: completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed. Please focus on the logic behind each algorithm.***
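For orientation only, the Python sketch below shows a generic textbook first-fit selection over the server types in Table 2. It is not the stage 2 algorithm itself: it assumes servers are scanned in the order Table 2 lists them, and it ignores server state, the per-type limit, and resources already in use. The names `Server`, `fits`, and `first_fit` are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Server(NamedTuple):
    kind: str
    cores: int
    memory: int
    disk: int


# Capacities copied from Table 2, in table order.
SERVERS = [Server("SYD", 2, 10, 15), Server("MEL", 4, 15, 20), Server("GLC", 8, 20, 25)]


def fits(server: Server, cores: int, memory: int, disk: int) -> bool:
    """True if the server's full capacity covers the job's requirements."""
    return server.cores >= cores and server.memory >= memory and server.disk >= disk


def first_fit(servers: Sequence[Server], cores: int, memory: int, disk: int) -> Optional[Server]:
    """Return the first server, in list order, that can hold the job."""
    for server in servers:
        if fits(server, cores, memory, disk):
            return server
    return None


# Job 0 needs 3 cores, 2 memory, 2 disk: SYD has only 2 cores, so MEL is chosen.
print(first_fit(SERVERS, 3, 2, 2).kind)
```

This agrees with the completed row of Table 4, where job 0 is placed on a MEL server.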
Q2 BestFit algorithm (20 marks)
Following the BestFit algorithm described in the stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 when all jobs are completed, including: completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed. Please focus on the logic behind each algorithm.***
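For orientation only, a generic best-fit selection can be sketched in Python as follows. The fitness measure used here (leftover CPU cores after placing the job) and the tie-breaking rule (earliest server in table order) are assumptions made for illustration; the stage 2 project description remains the authority on the actual rule. The sketch also ignores server state, the per-type limit, and resources already in use, and `best_fit` is a hypothetical name.

```python
# (type, cores, memory, disk) copied from Table 2, in table order.
SERVERS = [("SYD", 2, 10, 15), ("MEL", 4, 15, 20), ("GLC", 8, 20, 25)]


def best_fit(servers, cores, memory, disk):
    """Among servers that can hold the job, pick the one with the fewest spare cores."""
    candidates = [s for s in servers if s[1] >= cores and s[2] >= memory and s[3] >= disk]
    # min() returns the earliest candidate when several share the smallest fitness.
    return min(candidates, key=lambda s: s[1] - cores, default=None)


# Job 0 needs 3 cores, 2 memory, 2 disk: MEL leaves 1 spare core, GLC leaves 5.
print(best_fit(SERVERS, 3, 2, 2)[0])
```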
Q3 WorstFit algorithm (20 marks)
Following the WorstFit algorithm described in the stage 2 project description, show how the jobs in Table 1 are allocated to the resources in Table 2.
• You MUST use Table 4 as the template to calculate the average metrics in Table 3 when all jobs are completed, including: completion time, average CPU/MEM/DISK utilization per server type, cost per server type, and average turnaround time of jobs.
***Your understanding of the job allocation and metrics will be assessed. Please focus on the logic behind each algorithm.***
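For orientation only, a generic worst-fit selection can be sketched in Python as follows. As an assumption for illustration, fitness is measured as leftover CPU cores and ties go to the earliest server in table order; the stage 2 project description remains the authority on the actual rule. The sketch ignores server state, the per-type limit, and resources already in use, and `worst_fit` is a hypothetical name.

```python
# (type, cores, memory, disk) copied from Table 2, in table order.
SERVERS = [("SYD", 2, 10, 15), ("MEL", 4, 15, 20), ("GLC", 8, 20, 25)]


def worst_fit(servers, cores, memory, disk):
    """Among servers that can hold the job, pick the one with the most spare cores."""
    candidates = [s for s in servers if s[1] >= cores and s[2] >= memory and s[3] >= disk]
    # max() returns the earliest candidate when several share the largest fitness.
    return max(candidates, key=lambda s: s[1] - cores, default=None)


# Job 0 needs 3 cores, 2 memory, 2 disk: GLC leaves 5 spare cores, MEL only 1.
print(worst_fit(SERVERS, 3, 2, 2)[0])
```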
Q4 Job Scheduler, Time Synchronisation, Fault Tolerance and Transparency (40 marks)
How is distribution transparency related to such a job scheduler application?
Summarise what distribution transparency is, and give an example of each transparency principle based on the prototype job scheduler application.
What time synchronisation mechanism would you recommend?
Summarise why time synchronisation is required for this job scheduler application. State your recommended time synchronisation mechanism and, more importantly, justify it.
What fault-tolerance mechanism would you recommend?
Summarise the specific fault model you are targeting for this prototype system, list the different fault-handling mechanisms, and state your recommended fault-handling mechanism. More importantly, justify your recommendation.