Final Exam
Please make sure that you always use notations consistent with lecture notes. Different notations will not be accepted. The deadline for final assessment is: SAT 8th , May 5:00 pm
Please do not wait till the last minute to submit and doubt check it is successful.
Question 1 (18 marks)
(a) (10 marks) An entertainment company operates many cinemas and hires you to design a small database the following requirements:
⚫ Each cinema is uniquely identified by its cinema ID. For each cinema, we also record these attributes: name, manager’s name, contact number, address. The Address is composed of a suburb and a street. Each cinema must have multiple halls.
⚫ Each hall is uniquely identified by its ID. We also record the hall name and its category. Each hall belongs to exact one cinema and has multiple seats.
⚫ Each seat is uniquely identified by its ID and its hall ID. The number of seats in each hall is needed.
⚫ Each hall must have multiple sessions. Each session is identified by its ID and is hold by exact one hall. We also record the start time, end time of each session. A session only plays exact one movie. Each movie in this database must be played by at least one session.
⚫ Each movie is identified by its id. We also record its category, languages and rating. Each movie can have multiple available languages.
⚫ A movie should be directed by more than one director. Each director can direct zero or more movies.
⚫ Each director is identified by his or her director ID. We also record his or her first name, last name, nationality and birth.
⚫ Each actor in the database acts in more than one movie. There should be at least one actor in each movie.
⚫ Each actor is identified by the actor ID. We also record first name, last name, gender and birth.
Draw an ER diagram to represent this scenario and clearly state the assumptions you make if any. Note: please use the drawing conventions in the lecture notes.
(b) (8 marks) Translate the ER diagram of the above question into a relational model. Note: please use the drawing conventions in the lecture notes.
Question 2 (18 marks)
(18 marks) Consider the relational schema R(A, B, C, D, E, G, H, I, J) and a set of functional dependencies F = {AB → C,C → DI,D → AEG, EG → H,GH → DI}. Note that A, B, C, D, E, G, H, I, J are attributes.
1. Find a minimal cover Fm for F. (3 marks)
2. Compute total number of the super keys of R with respect to F that are not
candidate keys. (4 marks)
3. Regarding F, is the decomposition R1 = {ABCJ}, R2 = {CDEGHI} of R
lossless-join? Please justify your answer. (5 marks)
4. Determine the highest normal form of R. If R is not in the BCNF, decompose R
into BCNF and indicate a key in each of the result tables by underlining the attributes. (6 marks)
Question 3 (20 marks)
(16 marks) Consider the following relational schema for COVID safe check-in records:
• Person (pID, pName, age, gender),
• POI (iID, iName, address, suburb, state, category),
• CheckIn (pID, iID, sTime, eTime, date)
Below are detailed descriptions for the fields in each schema:
• Person: For each person, we record its pID, name, age and gender. pID is the primary key.
• POI: For each point of interest, we record its iID, name, address, suburb, state and category. iID is the primary key.
• CheckIn: For each check-in, we record the iID of POI, the pID of person, the date and the duration of the stay (begin time and end time). The combination of pID and iID is the primary key.
Note: All the attributes except for the primary keys may not be unique. Use relational algebra to answer the following questions:
1. List the iID of pois that are only visited by senior (age=65) female persons. (2 marks)
2. List the pID of persons who have been in a same poi with the person whose pID=’233’ at a same time between ‘Apr 1, 2020’ and ‘Apr 7, 2020’. (4 marks)
3. List the pID of persons who have visited all the pois categorised as ‘supermarket’ in the suburb ‘Sydney’ in ‘NSW’. (*Note: In this question you are not allowed to use division) (6 marks)
4. Find the pID of the oldest persons that have visited the ‘Town Hall’ in the suburb ‘Redfern’ in ‘NSW’ on ‘Jan 1, 2021’. (Hint: You may use A.b to refer the attribute b in the relation A and may need immediate relations to solve this question. E.g. R1 ← π (pID) (Person).) (8 marks)
Question 4 (12 marks)
(a) (6 marks) Consider the schedule below. Here, R(·) and W(·) stand for ‘Read’ and ‘Write’, respectively. T1, T2, T3 and T4 represent four transactions and ti represents a time slot.
1. Give the precedence graph of this schedule. (3 marks)
2. Is this schedule conflict serializable? If your answer is “yes”, please providing
the equivalent serial schedule. If your answer is “no”, convert it to a conflict serializable schedule with the minimum change in the scheduling while preserving the given ordering of operations in each transaction. (e.g. For T1, the change of moving W(Y) from t6 to t9 is 3.) (3 marks)
(b) (6 marks) Consider the lock request sequence given below. RL(·) and WL(·) stand for “read lock” and “write lock”, respectively. T1,T2,T3 and T4 represent four transactions.
1. Give the wait-for graph for the given lock requests. (3 marks)
2. Determine whether there exists a deadlock in the lock requests and justify your
answer. (3 marks)
Time
t1
t2
t3
t4
t5
t6
t7
t8
t9
R(X)
W(Y)
R(Y)
W(Y)
W(Z)
R(X)
R(Z)
R(Y)
W(X)
W(Y)
Time
t1
t2
t3
t4
t5
t6
t7
t8
WL(C)
WL(B)
WL(A)
WL(C)
RL(B)
RL(A)
RL(B)
WL(C)
Question 5 (14 marks)
(a) (6 marks) Consider the following relational schema, SQL query and additional information:
• Employee (eID, dept, salary, hobby),
• Department (dID, dName, floor, phone),
• Finance (dID, budget, sales, expenses)
Select dName, budget
From Employee e, Department d, Finance f Where e.dept = d.dID and d.dID = f.dID and d.floor = 2 and e.salary >= 119000 and e.hobby = ‘golf’;
• Unclustered B+ tree indexes exist on Employee.dept, Employee.salary, Department.dID and Finance.dID (each leaf page contains up to 100 entries).
• The employee’s salaries are between 40,000 and 120,000, and they enjoy 200 different hobbies.
• The company owns two floors.
• There are totally 50,000 employees and 5,000 departments (each with a
corresponding financial information) in the database.
• Statistics indicate that the tuples are uniformed distributed.
Please answer the following questions:
1. Identify a relational algebra expression that is equivalent to the given SQL query. (1 marks)
2. For each of the expression’s base relations (Employee, Department and Finance), estimate the number of tuples that would be initially selected from each relation before any join processing begins. Justify your answer. (3 marks)
3. Based on your answer above, which of the join orders is expected to have the least estimated cost? Justify your answer. (2 marks)
(b) (8 marks) Consider three buffer replacement policies: ‘Least Recently Used’, ‘Most Recently Used’ and ‘First In First Out’. Please answer the following questions:
1. Construct an example that ’First In First Out’ buffer replacement policy performs worse than the other two buffer replacement policies. (4 marks)
2. Construct an example that ’Least Recently Used’ buffer replacement policy performs the worst, ’First In First Out’ performs moderately, and ’Most Recently Used’ performs the best. (4 marks)
Question 6 (18 marks)
(a) (6 marks) Consider the Linear Hashing index shown below. Assume that a bucket split occurs whenever an overflow page is created. h0(x) takes the rightmost 2 bits of key x as the hash value, and h1(x) takes the rightmost 3 bits of key x as the hash value.
1. What can you say about the last entry that was inserted into the index? (1 mark)
2. What is the smallest key more than 20 whose insertion will cause a split? Please
justify your answer (2 marks)
3. Show the index after inserting entries with hash values 15, 38 and 200 (plot the final
hash table and update the ‘Next’ pointer, if needed). (3 marks)
(b) (12 marks) Consider the following B+ tree:
Note: All key values must be positive integers.
1. How many additional records are required to be added to increase its height? Give the minimum possible number and justify your answer. (3 marks)
2. By inserting data entries into the original tree, what is the minimum possible value of the last key (marked red) in the root without changing its height? Give the minimum possible value and justify your answer. (3 marks)
3. Draw the B+ tree after adding the data entries with keys 6, 8, 13 sequentially into the original tree. (3 marks)
4. Draw the B+ tree after deleting the data entries with keys 56, 60, 95 sequentially from the original tree. (3 marks)
Final Exam Submission
We accept electronic submissions only. Please submit your assignments as follows:
• Students must submit an electronic copy of their answers to the above questions to
the course website in Moodle.
• Only .docx or .pdf file is accepted. The file name should be final_studentID.doc or
final_studentID.pdf (e.g., final_z5100000.doc or final_z5100000.pdf).
Note:
1. For any problems in submissions, please email to comp9311unsw@gmail.com
2. All submissions will be checked for plagiarism.
3. We do not accept e-mail submissions.
Warning: Before submission, please keep a copy in your university account or other reliable cloud servers (such as dropbox or google drive). If you are not sure how, please have a look at taggi. Usually, the submission should be successful. In case it fails, we do not accept backups from your own computers as the modification time can be edited.
Late Submission Penalty
0 mark.