1. Due to a recent upgrade to Central train Station, authorities decided that all
information related to the station should be organized using a DBMS, and
you have been hired to design the database. Your task is to draw an ER
diagram based on the following information given to you by the station
authorities:
6 marks
• Every train has a registration number, name, and each train is of a
specific model.
• The station accommodates a number of train models, and each
model is identified by a model number and has a capacity and a
weight.
• A number of technicians work at the station. You need to store the
name, Social Security Number (SSN), address, phone number, and
salary of each technician.
• Each technician is an expert on one or more train model(s), and his
or her expertise may overlap with that of other technicians.
• Traffic controllers of the station must have an annual medical
examination. For each traffic controller, you must store the date of
the most recent exam.
• All station employees (e.g technicians and traffic controllers) belong
to a union. You must store the union membership number of each
employee. Each employee is uniquely identified by a social security
number (SSN).
2. This question is based on the following E-R diagram shown in Figure 1 which
describes the information kept by a real estate agent.
Figure 1
a) Translate the E-R diagram into a relational model using the following textual
notation. Each relation should be written in the form:
Name(attribute, attribute, …)
PK=(attribute list), FK=(attribute list->parent relation)
An ‘attribute list’ is one or more, comma-separated attribute names. Each
relation must have a primary key (PK) defined. A relation can have zero or
more FKs specified.
For example: Enrolment(studentId, courseId, mark)
PK=(studentId, courseId), FK=(studentId->Student, courseId->Course)
Do not write this relational model in SQL DDL syntax, use the above
convention to describe your model.
4 marks
b) Write SQL CREATE TABLE statements for the Customer and Residence
relations from your relational model, as well as any associated relationship
relations between these two relations. You should show all relevant
attributes, types and key constrains for these relations. You should include
foreign key integrity constrains, like ON DELETE clauses where suitable, as
well as any necessary NOT NULL or UNIQUE constraints.
3 marks
3. Consider the database consisting of the following tables:
Person(driving_license_number, name, address)
Cars(car_registration_number,model, year, driving_license_number)
Accident(report_number, date, location)
InvolvedIn(report_number, car_registration_number)
Write down the SQL expressions for the following queries:
a) Find the car registration numbers that were involved in the highest number
of accidents
3 marks
b) Find the name of the persons who own more than 2 cars 3 marks
c) Find the name of the persons whose cars were involved in accidents
between 1st of January 2021 and 31st January 2021
3 marks
4. Suppose you are given a relation A with four attributes P, Q, R, S and the
following sets of FDs:
R → S
R → P
Q → R
a) Identify the candidate key(s) for A. 2 marks
b) Identify the highest normal form that A satisfies (1NF, 2NF, 3NF, or BCNF). 2 marks
c) If A is not in BCNF, decompose it into a set of BCNF relations that preserve
the dependencies.
3 marks
5. Consider the database consisting of the following tables
Movie(Movie_Title, Movie_Year, Length, Studio_Id)
ActedIn(Movie_Title, Movie_Year, Actor_id)
Actors(Actor_Id, Actor_Name, Address, Gender, Birthdate)
Studio(Studio_Id, Studio_Name, Address)
In case you cannot type the usual RA Greek letters easily, you should use the
following convention shown in Figure 2. Operator parameters should be
enclosed in square brackets.
Figure 2
πtitle ( σpoints = 6 (courses)) can be written as P[title] ( S[points = 6] (courses))
Write down the Relation Algebra (RA) expressions for the following queries:
a) Find the title of movies in which both actors ‘Jackie Chan’ and ‘Tom Cruise’
has acted.
2.5 marks
b) Find the name of all the actors who appear in movies made by “Warner
Bros” studio in 2020.
2.5 marks
6. Determine whether each of the following schedules are conflict serializable
or not; justify your answer. If a schedule is conflict serializable, please also
give a conflict equivalent serial schedule.
a) R1(x), W1(x), R2(x), R3(z), W1(y), R2(y), W3(y), R4(x), R1(z) 3 marks
b) R2(x), R1(y), W2(y), R3(z), W3(x), R1(z), R1(x) 3 marks
7. Consider a Company database that provides no concurrency control. The
database contains an Employee table defined as Employee(EmployeeId,
name, joining_date, salary) which contains these rows before the execution
of the following SQL statements:
EmployeeId Name Joining_date Salary
E101 Ross 2021-04-20 11000
E102 Joey 2021-03-05 8200
E103 Chandler 2021-01-01 12150
Three applications are executing concurrently over this database, two
general transactions (T1 and T2) and the periodic store-wide price-update
transaction (T3). Their operations are interleaved in the following execution
schedule:
T1 SELECT salary FROM employee WHERE employeeId = ‘E101’
T2 SELECT salary FROM employee WHERE employeeId = ‘E102’
T3 UPDATE employee SET salary = salary * 1.01
T1 UPDATE employee SET salary = 13000 WHERE employeeId = ‘E101’
T2 UPDATE employee SET salary = 15000 WHERE employeeId = ‘E103’
a) What are the salaries of all 3 employees after these operations have
completed?
1 mark
b) Which of the ACID properties are not being maintained in this non-
transactional system?
1 mark
Having realised their error, the company has turned on full SQL transaction
support on their database (ISOLATION LEVEL SERIALIZABLE) using strict 2-
phase row-level locking and turned their applications into transactions by
wrapping the applications within BEGIN and COMMIT statements. The
database is reset to the values shown above in the table, and the three
applications are re-run, with T1 and T2 again completing their initial SELECTs
before T3 attempts its UPDATE, so the start of the execution schedule is the
same as before.
c) Write down the revised execution schedule that will have actually occurred
as a result of the database enforcing serializable isolation.
2 marks
8. You have been brought in as a consultant to improve the performance of a
system dealing with the tracking of Shipment at the dock. You focus on
database operations involving the relation
Shipment(shipmentId, invoice, weight, container, date, company_id)
Each row in this relation takes up 16 bytes, including the 4-byte primary key
shipmentId. There are 200,000 rows in the table, and the table has been
defined with a target load factor of 75%, so there is about 25% free space in
each table page. The database uses 2048-byte pages for storing both data
and indexes, with the first 196 bytes of each page reserved for header data.
Records and index entries cannot span pages.
a) Estimate the space, in bytes, required to store just the relation data,
excluding any indexes. State any assumptions you make.
3 marks
Your analysis of the time taken for various transactions shows that many of
the problematic ones are doing searches over container, such as the
following:
SELECT COUNT(*)
FROM Shipment
WHERE container= ‘C-101’ AND weight < 20;
b) What access path would the database have to be using to process such
queries. Estimate the number of I/Os needed to process each such query,
ignoring any effects of buffering.
1 mark
c) What changes would you recommend fixing this issue, and what effect
would you expect it to have on the performance of such queries?
2 marks
----------------------------- End of Exam Questions---------------------------------------