CS计算机代考程序代写 SQL Example Exam Paper DTA(M)

Example Exam Paper DTA(M)

Part A: Relational Modelling

Question 1.

(a) Provide a formal definition of the Super Key and the Primary Key in a relation R;

Answer: Slide 16/Lecture: Relational Model

(b) Provide an example for the super key and for the primary key.

Answer: Slide 16/Lecture: Relational Model

Question 2.

(a) Provide a formal definition of the Functional Dependency: X  Y in a relation R(X,Y);

Answer: Slide 17/Lecture: Functional Dependency Theory & Normalization Theory: Part I

(b) Provide an example of a functional dependency.

Answer: Slide 18/Lecture: Functional Dependency Theory & Normalization Theory: Part I

(c) Provide a formal definition of the Partial Functional Dependency in a relation R;

Answer: Slide 32/Lecture: Functional Dependency Theory & Normalization Theory: Part I

(d) Provide an example of a partial functional dependency.

Answer: Slide 32/Lecture: Functional Dependency Theory & Normalization Theory: Part I

Question 3.

Assume the relation R(A, B, C, D) with four attributes {A, B, C, D}. Consider the set of the Functional

Dependencies (FDs), which are the only dependencies that hold true in the relation R:

FD1: C → D

FD2: C → A

FD3: B → C

(a) Identify the candidate key(s) for the relation R and explain your answer.

Key is B; B functionally determines C, and transitively determines attributes D and A.

(b) Identify the highest Normal Form (NF) that relation R satisfies, i.e., one of: 1NF, 2NF, 3NF, or BCNF

and explain your answer.

R is in 2NF, since no partial FDs hold true (key B is singleton); R is not in 3NF due to the transitive FDs

of the non-prime D and A from the candidate key B via the non-prime attribute C.

(c) If relation R is not in BCNF, decompose R into a set of BCNF relations. Explain your

methodology.

FDs: C → D and C → A both cause violations of BCNF;

 Apply BCNF Theorem for C → D

 R\{D} and {C} U {D} thus: R1(A, B, C) and R2(C, D)

R1 is not in BCNF due to C → A; R2 is in BCNF

 R1\{A} and {C} U {A}, thus: R3(B, C) and R4(C, A)

Merge R2 with R4 (optional): R3(B, C) and R5(C, D, A)

Part B: SQL

Question 4.

Consider the following relational schema:

EMPLOYEE(SSN, NAME, SALARY, SUPERVISOR_SSN, PID)

PROJECT(PID, NAME, MANAGER_SSN)

Description: EMPLOYEE is a table of employees uniquely identified by their SSN, working at a project

with unique code: PID and having an employee as supervisor, identified by the SUPERVISOR_SSN.

PROJECT is a table storing information about projects uniquely identified by a PID, and having one

manager, who is an employee identified by MANAGER_SSN.

(a) For each relation, identify the Primary Key and Foreign Keys (if any) and explain your answer.

Employee: PK is SSN and FK is PID referencing to PID attribute in Project relation.

Project: PK is PID and FK is MANAGER_SSN referencing to SSN in Employee relation.

(b) Write a SQL query that shows the name(s) of the employee(s) who earn more than £40000.

SELECT E.NAME

FROM EMPLOYEE E

WHERE E.SALARY > 40000

(c) For those employees, who work in the projects with PID =1 or PID = 2, retrieve the names of their

supervisors.

SELECT S.NAME

FROM EMPLOYEE S, EMPLOYEE E

WHERE E.SUPERVISOR_SSN = S.SSN AND E.PID IN (1, 2)

(d) Find the number of employees working per project.

SELECT E.PID, COUNT(*)

FROM EMPLOYEE E

GROUP BY E.PID

(e) Find the minimum and the maximum salaries per project.

SELECT E.PID, MAX(E.SALARY), MIN(E.SALARY)

FROM EMPLOYEE E

GROUP BY E.PID

(f) For those projects with more than 10 employees, show their average salary.

SELECT E.PID, COUNT(*), AVG(E.SALARY)

FROM EMPLOYEE E

GROUP BY E.PID

HAVING COUNT(*) > 10

(g) Check if there exists a project whose manager is supervisor.

SELECT P.PID

FROM PROJECT P, EMPLOYEE S

WHERE S.SSN = P.MANAGER_SSN AND

EXISTS (SELECT * FROM EMPLOYEE E WHERE S.SSN = E.SUPERVISOR_SSN)

Part C: Relational Algebra and Heuristic Optimization

Question 5.

Consider the relational schema in Question 4 (repeated here for convenience):

EMPLOYEE(SSN, NAME, SALARY, SUPERVISOR_SSN, PID)

PROJECT(PID, NAME, MANAGER_SSN)

(a) Write the relational algebra expression of the following SQL query:

SELECT E.NAME, P.NAME

FROM EMPLOYEE E, PROJECT P

WHERE P.PID = E.PID AND E.SALARY > 50000

E.NAME, P.NAME(σE.SALARY > 5000 (PROJECT P.PID = E.PID EMPLOYEE)

(b) Provide the Canonical Tree and an Optimal Tree of the SQL query in Question 5(a). Explain why

your solution is optimal.

Let us provide Only the Optimal Tree; the Canonical Tree is trivial and can be any non-optimal tree

derived directly from the query.

(c) Provide the Optimal Tree of the following SQL query and explain why your solution is optimal:

SELECT S.NAME

FROM EMPLOYEE S

WHERE S.PID IN (1, 2) AND S.SSN IN (SELECT E.SSN

FROM EMPLOYEE E

WHERE E.SALARY < 50000) Let us provide Only the Optimal Tree; the Canonical Tree is trivial and can be any non-optimal tree derived directly from the query. (d) Assume that there are 100 employees and 10 projects. (1). Optimize the following SQL query: SELECT E.NAME, P.NAME FROM EMPLOYEE E, PROJECT P WHERE P.MANAGER_SSN = E.SSN The query joins two relations with their PK-FK. Since one project has only one manager and we retrieve the information for each manager, then the only optimization is on projecting on the attributes [SSN, E.Name, MANAGER_SSN, PName} before the joining and then only on the E.Name and P.Name after the joining. (2). Estimate the number of tuples retrieved. Given that each department has a unique manager, we expect to retrieve 10 tuples.