CS计算机代考程序代写 Functional Dependencies database DTA(M) Database Theory & Applications

DTA(M) Database Theory & Applications

Lab 3 Tasks: Solutions

Task 1: Operational Anomalies Solutions

1.1: Update anomaly: To update the address of a student (e.g., SID = 41), who occurs twice or more

than twice in the relation, we need to update the Address column in all the corresponding rows, i.e.,

it requires a full scan of the table. Otherwise, data will become inconsistent.

1.2: Delete anomaly: if student with SID = 42 temporarily drops their uniquely enrolled subject, then

we either put a NULL value in the corresponding subject column or delete that row with all student

information (name, address).

1.3: Insert anomaly: Assume a new student admission with a new SID (SID = 44), name and address,

but the student has not yet chosen any subject. In this case, we insert NULL values. On the other

hand, if the new student to be inserted is enrolled in 5 subjects, then we need to insert 5 tuples

duplicating their name and address for each subject!

1.4: The relation is not in the 1NF. Using the 1NF, data redundancy increases, as there will be many

columns with same data in multiple rows. In our example, the first row (Chris, 15, {‘Biology’,

‘Maths’}) should be replicated as many times as the number of the subjects Chris has selected, i.e.,

two times. One for Biology and one for Maths.

1.5: The NF of the relation Tournament Winners is in 2NF since there is no partial dependency of any

non-prime attribute to the primary key. The FDs are:

FD1: {Tournament, Year} → {Winner, Winner Date of Birth} //by definition of the primary key

FD2: Winner → Winner Date of Birth

The composite key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify

a row. That is, {Tournament, Year} is a candidate key for the table.

Now, the FD2 is not related to the PK. It does associate two non-prime attributes. Hence, based on

the definition of the 2NF, all non-prime attributes (winner and winner birth date) are fully

dependent on the PK, which is trivially derived from the FD1.

Note: The full or partial FD relates the dependency of a non-prime attribute to the PK only. Thus, to

assess the 2NF of a relation, we check the partial/full FDs of the non-prime attributes with the prime

attribute only (and not between non-prime attributes!)

DTA(M) Database Theory & Applications

Even though the attributes Winner and Winner Date of Birth are fully determined by the whole PK =

{Tournament, Year}, we observe that the Winner & Winner Date of Birth combinations result to

redundancy on multiple tuples. This leads to an update anomaly: if updates on e.g., winner date of

birth are not carried out consistently, then e.g., a specific winner could be shown as having two

different dates of birth.

Note: Let’s discuss a bit more: Here, the non-prime attribute Winner Date of Birth is transitively

dependent on the PK {Tournament, Year} via the non-prime attribute Winner. That is the relation is

in 2NF and not in 3NF, as we will be discussing later in the course .

Task 2: Normalisation to 2NF Solutions

2.1: The NF of the relation ORDER is 1NF and not it 2N since the Product_Description attribute is

partially dependent on the PK via the prime-key attribute Product_ID. That is, the Order_No does

not contribute in the determination of a Product_Description value. To normalize this relation in a

set of 2NF relations, we create two relations deriving from the inherent FD. Hence, we obtain the

relations:

• Product(Product_ID, Product_Description)

• OrderProduct(Order_No, Product_ID)

In both relations, the FDs therein correspond to fully functional dependencies over their non-prime

attributes. Also, Product_ID in the relation OrderProduct is FK referencing to the Product_ID

attribute (PK) of relation Product.

2.2: The relation Student is not in 2NF since there are partial dependencies of the non-prime

attributes with the PK. We need to split the relation into 2NF-relations such that each of the partial

FDs will be corresponding to full FDs in the new relations. That is, we create the relations:

• Student(Student_ID, Name) with full FD

• Student_Course(Student_ID, Course_ID) with trivial full DF (only a composite PK)

• Course(Course_ID, Units) with full FD.

Moreover, Student_ID in relation Student_Course is FK referencing to Student_ID PK in relation

Student; Course_ID in relation Student_Course is FK referencing to Course_ID PK in relation Course.

DTA(M) Database Theory & Applications

2.3: We identify the following unique attributes: StudentNo and CourseNo. Now, there are partial

dependencies of the non-prime attributes with these unique attributes, i.e.,

FD1: StudentNo → StudentName

FD2: CourseNo → CourseName

and for the Grade attribute it holds true that:

FD3: {StudentNo, CourseNo} → Grade

Hence, we split the original relation into a set of 2NF relations based on the abovementioned FDs ,

i.e., we obtain that:

• Student (StudentNo, StudentName)

• CourseGrade (StudentNo, CourseNo, Grade)

• Course (CourseNo, CourseName)

Moreover, StudentNo in relation CourseGrade is FK referencing to StudentNo PK in relation Student;

CourseNo in relation CourseGrade is FK referencing to CourseNo PK in relation Course.

2.4: Since the values of the attribute A are unique (a1, a2, a3, etc.), it follows from the FD definition

that:

A → B; A → C; A → D; A → E

It also follows that: A →{B, C} (or any other subset of {A, B, C, D, E}).This can be summarized as A →

{B, C, D, E}, i.e., from our observation, A is a potential primary key.

DTA(M) Database Theory & Applications

Since the values of E are always the same (all e1), it follows that:

A → E; B → E; C → E; D → E

Moreover, combinations of {B, C} are unique, therefore {B, C} → {A, D,

E}. And, combinations of {B, D} are unique, therefore {B, D} → {A, C, E}.

In addition, if C values match, so do D values, i.e., C → D. Moreover, D

values don’t determine C values. Based on the above reasoning, A is

the primary key being the minimal super key.