DTA(M) Database Theory & Applications
Lab 3 Tasks: Solutions
Task 1: Operational Anomalies Solutions
1.1: Update anomaly: To update the address of a student (e.g., SID = 41), who occurs twice or more
than twice in the relation, we need to update the Address column in all the corresponding rows, i.e.,
it requires a full scan of the table. Otherwise, data will become inconsistent.
1.2: Delete anomaly: if student with SID = 42 temporarily drops their uniquely enrolled subject, then
we either put a NULL value in the corresponding subject column or delete that row with all student
information (name, address).
1.3: Insert anomaly: Assume a new student admission with a new SID (SID = 44), name and address,
but the student has not yet chosen any subject. In this case, we insert NULL values. On the other
hand, if the new student to be inserted is enrolled in 5 subjects, then we need to insert 5 tuples
duplicating their name and address for each subject!
1.4: The relation is not in the 1NF. Using the 1NF, data redundancy increases, as there will be many
columns with same data in multiple rows. In our example, the first row (Chris, 15, {‘Biology’,
‘Maths’}) should be replicated as many times as the number of the subjects Chris has selected, i.e.,
two times. One for Biology and one for Maths.
1.5: The NF of the relation Tournament Winners is in 2NF since there is no partial dependency of any
non-prime attribute to the primary key. The FDs are:
FD1: {Tournament, Year} → {Winner, Winner Date of Birth} //by definition of the primary key
FD2: Winner → Winner Date of Birth
The composite key {Tournament, Year} is a minimal set of attributes guaranteed to uniquely identify
a row. That is, {Tournament, Year} is a candidate key for the table.
Now, the FD2 is not related to the PK. It does associate two non-prime attributes. Hence, based on
the definition of the 2NF, all non-prime attributes (winner and winner birth date) are fully
dependent on the PK, which is trivially derived from the FD1.
Note: The full or partial FD relates the dependency of a non-prime attribute to the PK only. Thus, to
assess the 2NF of a relation, we check the partial/full FDs of the non-prime attributes with the prime
attribute only (and not between non-prime attributes!)
DTA(M) Database Theory & Applications
Even though the attributes Winner and Winner Date of Birth are fully determined by the whole PK =
{Tournament, Year}, we observe that the Winner & Winner Date of Birth combinations result to
redundancy on multiple tuples. This leads to an update anomaly: if updates on e.g., winner date of
birth are not carried out consistently, then e.g., a specific winner could be shown as having two
different dates of birth.
Note: Let’s discuss a bit more: Here, the non-prime attribute Winner Date of Birth is transitively
dependent on the PK {Tournament, Year} via the non-prime attribute Winner. That is the relation is
in 2NF and not in 3NF, as we will be discussing later in the course .
Task 2: Normalisation to 2NF Solutions
2.1: The NF of the relation ORDER is 1NF and not it 2N since the Product_Description attribute is
partially dependent on the PK via the prime-key attribute Product_ID. That is, the Order_No does
not contribute in the determination of a Product_Description value. To normalize this relation in a
set of 2NF relations, we create two relations deriving from the inherent FD. Hence, we obtain the
relations:
• Product(Product_ID, Product_Description)
• OrderProduct(Order_No, Product_ID)
In both relations, the FDs therein correspond to fully functional dependencies over their non-prime
attributes. Also, Product_ID in the relation OrderProduct is FK referencing to the Product_ID
attribute (PK) of relation Product.
2.2: The relation Student is not in 2NF since there are partial dependencies of the non-prime
attributes with the PK. We need to split the relation into 2NF-relations such that each of the partial
FDs will be corresponding to full FDs in the new relations. That is, we create the relations:
• Student(Student_ID, Name) with full FD
• Student_Course(Student_ID, Course_ID) with trivial full DF (only a composite PK)
• Course(Course_ID, Units) with full FD.
Moreover, Student_ID in relation Student_Course is FK referencing to Student_ID PK in relation
Student; Course_ID in relation Student_Course is FK referencing to Course_ID PK in relation Course.
DTA(M) Database Theory & Applications
2.3: We identify the following unique attributes: StudentNo and CourseNo. Now, there are partial
dependencies of the non-prime attributes with these unique attributes, i.e.,
FD1: StudentNo → StudentName
FD2: CourseNo → CourseName
and for the Grade attribute it holds true that:
FD3: {StudentNo, CourseNo} → Grade
Hence, we split the original relation into a set of 2NF relations based on the abovementioned FDs ,
i.e., we obtain that:
• Student (StudentNo, StudentName)
• CourseGrade (StudentNo, CourseNo, Grade)
• Course (CourseNo, CourseName)
Moreover, StudentNo in relation CourseGrade is FK referencing to StudentNo PK in relation Student;
CourseNo in relation CourseGrade is FK referencing to CourseNo PK in relation Course.
2.4: Since the values of the attribute A are unique (a1, a2, a3, etc.), it follows from the FD definition
that:
A → B; A → C; A → D; A → E
It also follows that: A →{B, C} (or any other subset of {A, B, C, D, E}).This can be summarized as A →
{B, C, D, E}, i.e., from our observation, A is a potential primary key.
DTA(M) Database Theory & Applications
Since the values of E are always the same (all e1), it follows that:
A → E; B → E; C → E; D → E
Moreover, combinations of {B, C} are unique, therefore {B, C} → {A, D,
E}. And, combinations of {B, D} are unique, therefore {B, D} → {A, C, E}.
In addition, if C values match, so do D values, i.e., C → D. Moreover, D
values don’t determine C values. Based on the above reasoning, A is
the primary key being the minimal super key.