Monday 4 May 2020, 14:30 BST
(24 hour open online assessment – Indicative duration 1.5 hours)
DEGREES OF MSc, MSci, MEng, BEng, BSc,MA and MA (Social Sciences)
Database Theory and Applications M
COMPSCI 5076
(Answer All 6 Questions)
This examination paper is worth a total of 60 marks
Summer Diet 1 Continued Overleaf/
Part A: Relational Modelling & Normalisation [20 Marks]
Question 1.
a) Can a Foreign Key be NULL? Explain your answer and provide an example. [3]
b) Consider the attributes X and Y of a relation, such that X Y and Y X, i.e., X
determines Y, and Y determines X. What can we assume for these attributes? Explain
your answer and provide an example. [3]
Question 2.
Consider the relation R(A, B, C, D, E) with the following FDs:
FD1: A → C; FD2: C → D; FD3: D → E; FD4: B → D
a) Which are the possible candidate keys in the relation R? Explain your answer. [2]
b) Which is the maximum Normal Form (NF) of the relation R? Explain your answer. [2]
c) Decompose the relation R into a set of relations in Boyce-Codd NF (BCNF). Describe your
methodology at each stage of decomposition. [3]
Question 3.
Assume a relation R(A, B, C, D, E) where {A,B,C} is the (composite) primary key. Consider
also the FDs: FD1: {A,B} → E; FD2: C → D;
a) Which is the maximum Normal Form (NF) of the relation R? Explain your answer.
[2]
b) Decompose the relation R into a set of relations in 2NF. Describe your methodology of
decomposition. [2]
Question 4.
Assume a relation R(A, B, C) where B is the primary key. Consider also the FD: C → A;
Decompose the relation R into a set of relations in BCNF, if relation R is not already in BCNF.
For each relation in BCNF, define the primary and foreign keys, if exist. [3]
Summer Diet 2 Continued Overleaf/
Part B: SQL [20 Marks]
Question 5. Consider the following relational schema:
Author(AuthorID, Name)
Authoring(ARTID, AID)
Article(ArticleID, PublicationYear, Title)
where the attribute AID is a foreign key in Authoring relation referencing to AuthorID in
Author relation, and the attribute ARTID is a foreign key in Authoring referencing to ArticleID
in Article relation. The primary key is underlined in each relation.
a) Write a SQL query that shows the publication years of the articles written by author ‘John’.
[2]
b) Write a SQL query that shows the number of articles written by each author. [2]
c) Write a SQL query that shows the names of the authors who have written more than 100
articles. [2]
d) Write a SQL query that checks if there exist authors who have not written any article. If so,
the query returns their names. [4]
e) For each author who has written more than 50 articles, show how many of these articles
have been published since 2016. [5]
f) Which is the expected result of the following query? Explain your answer. [5]
SELECT * FROM ARTICLE WHERE ARTICLEID NOT IN (NULL, 1, 2)
Summer Diet 3 Continued Overleaf/
Part C: Relational Algebra & Heuristic Optimization [20 Marks]
Question 6.
Assume the following relational schema:
EMPLOYEE(SSN, ENAME, DNO, SUPER_SSN)
DEPARTMENT(DNUMBER, DNAME, MGR_SSN)
The SSN (Social Security Number) is the primary key in the relation EMPLOYEE.
The SUPER_SSN (supervisor’s SSN) is a foreign key in the relation Employee referencing
to the SSN attribute. The SUPER_SSN indicates the SSN of the supervisor of an employee.
A supervisor does not have any supervisor, i.e., SUPER_SSN is NULL for a supervisor
tuple.
A supervisee has only one supervisor.
The DNO (department number) is a foreign key referencing to the DNUMBER attribute in
the relation Department. The DNO indicates the department where an employee is working.
The DNUMBER is the primary key in the relation Department.
The MGR_SSN in relation Department is a foreign key referencing to the SSN, which
indicates the social security number of the manager of a department.
Each department has only one manager. Some employees are managers of departments.
The attributes ENAME and DNAME correspond to the first name of an employee and the
name of a department, respectively.
There are only four departments in the database with names: ‘DepA1’, ‘DepA2’, ‘DepB1’,
‘DepC1’.
There are 100 employees in the database: 70 supervisee and 30 supervisors. Moreover, there
are 20 employees whose first name starts with ‘E’, e.g., ‘Eric’, ‘Edward’, ‘Ethan’.
Consider the following SQL query:
SELECT E.ENAME, D1.DNAME, S.ENAME. D2.DNAME
FROM EMPLOYEE E, EMPLOYEE S, DEPARTMENT D1, DEPARTMENT D2
WHERE E.SUPER_SSN = S.SSN
AND S.SSN = D2.MGR_SSN
AND E.DNO = D1.DNUMBER
AND D1.DNAME LIKE ‘DepA%’
AND E.DNAME LIKE ‘E%’
Summer Diet 4 /END
a) What does the SQL query return? Explain your answer. [2]
b) You are asked to find an optimal query execution process for the above-mentioned SQL
query based on heuristic optimization. Describe the optimal sequence of the relational
algebra operators (e.g., project, select, join) over the relations by applying heuristic
optimization rules. Explain your answer and the heuristic rules you used. [10]
c) For each relational algebra operator you used in your answer to Question 6.(b), provide the
expected number of tuples of the corresponding intermediate result. [8]