RMIT Classification: Trusted
ISYS3412 Practical Database Concepts Normalisation Examples
May 2020
1 Second Normal Form (2NF) 1.1 Example1
Consider the WorksOn relation, taken from the poorly designed Company database.
WorksON (ESSN, PNo, PName, PLocation, DNo, Hours)
The following functional dependencies are derived from the business rules and information extracted from the company operation specifications.
FD1: ESSN, PNo à Hours FD2: PNo à PName
FD3: PNo à PLocation FD4: PNo à DNo
Hours attribute is fully functional dependent on the primary key. Refer FD1. However, the other non-primary-key attributes are partially dependent. They are only dependent on PNo, which only a part of the primary key. So, this relation is not in 2NF. Decompose it.
To make it to 2NF, we remove the “offending” attributes, along with a copy of their determinant attribute (PNo) and form a new relation, say Project. This determinant attribute (PNo) becomes the primary key of the new relation.
The result is:
WorksON (ESSN, PNo*, Hours)
Project (PNo, PName, PLocation, DNo)
Also note that the PNo attribute in the old relation is now acting as a foreign key referencing to the primary key of the new relation.
1.2 Example2
Consider the relation RouteAllocation relation given below.
RouteAllocation (Rego, RouteNo, StartTerminus, EndTerminus, StartTime, EndTime, Remarks)
Unclassified
You can assume the business rules given in Assignment 1 hold here. So, based on these facts, the following functional dependencies exist:
FD1: Rego, RouteNo à StartTime FD2: Rego, RouteNo à EndTime FD3: Rego, RouteNo à Remarks FD4: RouteNo à StartTerminus
School/Department/Area
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 Page 1 of 7
RMIT Classification: Trusted
FD5: RouteNo à EndTerminus
As such, based on FD1, FD2, FD3 we can see StartTime, EndTime, and Remarks attributes are fully functional dependent on the primary key (
So, we must decompose this relation into two by removing the partially dependent attributes, along with a copy of their determinant (RouteNo) attribute. This will result in:
RouteAllocation (Rego, RouteNo*, StartTerminus, EndTerminus, StartTime, EndTime, Remarks)
Route (RouteNo, StartTerminus, EndTerminus)
When we do this decomposition, RouteNo attribute in the old relation becomes a foreign key,
referencing to the primary key of the new relation.
2 Third Normal Form (3NF) 2.1 Example1
Consider the Employee relation given below.
Employee (SSN, FName, MInit, LName, DoB, Position, Salary)
It was stated that each position in this fictitious company has a designated salary. So, based on these facts, the following functional dependencies exist:
FD1: SSN à FName
FD2: SSN à MInit
FD3: SSN à LName
FD4: SSN à DoB
FD5: SSN à Position FD6: Position à Salary
Left side of FD6 is not a primary key attribute. So, the Employee relation is not in 3NF.
Recompose the Employee relation, by removing the Salary attribute (right-hand side of FD6) along with a copy of its determinant (left-hand side of FD6) the Position attribute. Make them into a new relation, where the determinant attribute Position becomes the primary key of the new relation.
So, we have:
Employee (SSN, FName, MInit, LName, DoB, Position*) Wages (Position, Salary)
The Position attribute in the original relation becomes a foreign key referencing to Position attribute of the new Wages relation.
2.2 Example2
Consider the following relation used to store allocation of busses to routes. Assume the business rules discussed in assignment 1 (Task 2) holds here. In addition to those business rules, it was stated that all busses of a specific brand, model and year have the same service interval.
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 School/Department/Area Page 2 of 7
RMIT Classification: Trusted
Bus (Rego, Brand, Model, YearMade, noSeats, ServiceInterval, LastServiceDate)
• • •
Is this relation in 1NF? – Yes, no multi-values attributes
Is this relation in 2NF? — Yes, it is 1NF and primary key is single-attribute PK. Is this relation in 3NF?
Based on the above business rules and the facts, the following functional dependencies hold among the attributes in this relation.
FD1: Rego à Brand
FD2: Rego à Model
FD3: Rego à YearMade
FD4: Rego à NoSeats
FD5: Brand, Model, YearMade à ServiceInterval FD6: Rego à LastServiceDate
Left side of the FD5 comprises of three attributes, but they are not primary key attributes. So, Bus relation is not in the third normal form. i.e. one of the non-primary-key attributes (ServiceInterval) is transitively dependent, not directly dependent, on the primary key.
Decompose it, by removing the ServiceInterval attribute, along with copies of its determinants (Brand, Model, and YearMade).
The result of this decomposition is as follows:
Bus (Rego, Brand*, Model*, YearMade*, noSeats, LastServiceDate) ServiceInfo (Brand, Model, YearMade, ServiceInterval)
3 Another Exercise
Consider the following Invoice relation instance shown below and answer the questions.
1. Discuss the data redundancy in the Customers relation. Use the relation instance given to show the insertion, deletion and update anomalies incurred by the data redundancy. Note that insertion can be seen as a special type of update.
Answer:
There are a number of redundant data in this relation instance, as a result of poor design. Such redundancies will lead into update anomalies. A few of such anomalies are listed below:
Invoice
Invoice No
CustName
Customer Addr
Cust Suburb
Cust Postcode
ProdName
Manufacturer
CountryofOrigin
Qty
116
Di Hunter
High Street
Preston
3072
Medium Cattle Trough
Rocky Concrete
Australia
2
Plenty Road
3757
Large Cattle Trough
Ironman Steels
China
3
117
Glads Gladdies
Whittlesea
Large Cattle Trough
Ironman Steels
China
4
Garden Gnome
Rocky Concrete
Australia
2
Bicycle Stand
Rocky Concrete
Australia
3
a.
School/Department/Area
Each product is made by one manufacturer, however, this information is listed on each invoice (for example, Large Cattle Trough appears on both invoice 116 and 117. This can lead into (1) insertion anomaly (how can we store manufacturer information, if no customer had bought an item yet?), (2) modification anomaly (e.g.
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 Page 3 of 7
RMIT Classification: Trusted
in invoice 116, manufacturer is listed as Ironman Steels, while invoice 117 lists it as Ironman Steelworks), and (3) deletion anomaly (e.g. all bicycle stands were returned and deleted from the relation, and as a result, all product information related to bicycle stand is now lost).
b. Suburbs have designated postcodes. It is not required to record them along with each invoice.
c. Assuming each manufacturer imports their products from one country, the CountryofOrigin can be stored separately.
2. List all likely functional dependencies (FDs). Do not include trivial FDs.
Answer:
A FD like InvNo à InvNo is considered as a trivial functional dependency. While it is a true functional dependency we cannot learn anything new from it. It is not required to list such trivial functional dependencies.
The non-trivial FDs are:
FD1: InvNo à CustName
FD2: CustName à CustAddr
FD3: CustName à CustSuburb
FD4: CustSuburb à CustPostcode
FD5: ProdName à Manufacturer
FD6: Manufacturer à CountryOfOrigin
FD7: InvNo, ProdName à Qty
3. What are the candidate keys for the Invoice relation?
Answer:
By exploring the above functional dependencies, attributes CustName, CustAddr, CustPostcode, CustSuburb, Manufacturer, CountryOfOrigin, and Qty are dependent on some other attributes, and they are fully or partially, directly, or transitively depend on other attributes. However, InvNo and ProdName are not depend on any attribute (and, they do not depend each other, too). So, based on these observations, we can determine that
4. What the highest normal form is this relation in?
Answer:
This relation is unnormalized. We have a repeating group of attributes (ProdName, Manufacturer, CountryOfOrigin, and Qty).
5. Explain the reasons why it does not meet the requirements of the next normal form.
Answer: See above.
6. Decompose the relation into a set of 3NF relations.
Answer:
Start with:
Invoice (InvNo, CustName, CustAddr, CustSuburb, CustPostcode,
{ProdName, Manufacturer, CountryOfOrigin, Qty})
{} denotes the repeating group of attributes.
Remove that group of attributes, along with a copy of the original primary key (InvNo).
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020
School/Department/Area Page 4 of 7
RMIT Classification: Trusted
By doing so, we arrive at two relations, say Invoice and InvoiceDetails that are 1NF.
Invoice (InvNo, CustName, CustAddr, CustSuburb, CustPostcode)
InvoiceDetails (InvNo, ProdName, Manufacturer, CountryOfOrigin, Qty)
————————————————— Next, we must test these relations for 2NF.
Invoice relation is in 2NF, too.
Why? It has a single-attribute primary key. So, without further tests, we can deduce that
Invoice relation is in 2NF.
InvoiceDetails relation needs to be tested for 2NF.
Among its non-primary-key attributes, are all full functional dependent? Are there any that are partially dependent?
Consider the following FDs.
FD5: ProdName à Manufacturer
FD6: Manufacturer à CountryOfOrigin
Based on the above, it is clear that Manufacturer and CountryOfOrigin are partial dependent on the primary key (
The other attribute (Qty) is fully functional dependent on the whole of the primary key. Refer FD7.
So, InvoiceDetails relation is not in 2NF.
Create two relations, by removing Manufacturer and CountryOfOrigin, along with a copy of their determinant (ProdName).
The result is:
InvoiceDetails (InvNo, ProdName*, Qty)
Product (ProdName, Manufacturer, CountryOfOrigin)
At the end of 2NF step, we have three relations:
Invoice (InvNo, CustName, CustAddr, CustSuburb, CustPostcode) InvoiceDetails (InvNo, ProdName*, Qty)
Product (ProdName, Manufacturer, CountryOfOrigin)
—————————————————
Each of these are required to be checked for 3NF.
Checking Invoice relation.
FDs:
FD1: InvNo à CustName
FD2: CustName à CustAddr
FD3: CustName à CustSuburb
FD4: CustSuburb à CustPostcode
Based on these FDs we can determine that attributes CustAddr, CustSuburb, CustPostcode are transitively dependent on InvNo.
In fact, there are two levels of transitive dependency in the case of CustPostcode attribute:
InvNo à CustName à CustSuburb à CustPostcode
So, the decomposition can be done in two steps.
School/Department/Area
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 Page 5 of 7
RMIT Classification: Trusted
Firstly, decompose by taking out CustAddr, CustSuburb, CustPostcode attributes from the Invoice relation and store them in a new relation (say, Customer) along with a
copy of their determinant (CustName).
Invoice (InvNo, CustName*)
Customer (CustName, CustAddr, CustSuburb, CustPostcode)
Secondly, remove CustPostcode and store them in a separate relation (say, Suburbs), along with its determinant (CustSuburb)
This will result in:
Customer (CustName, CustAddr, CustSuburb*) Suburb (Suburb, Postcode)
————————————————— Checking InvoiceDetails relation.
InvoiceDetails (InvNo, ProdName*, Qty)
This relation has only one non-primary-key attribute. So, there is no possibility of any transitive dependency to exist!
So, it is in 3NF.
————————————————— Checking Product relation.
Product (ProdName, Manufacturer, CountryOfOrigin) Consider the following FDs.
FD5: ProdName à Manufacturer
FD6: Manufacturer à CountryOfOrigin
CountryOfOrigin is transitively dependent on the primary key, ProdName. So, that has to be removed and store in a separate relation.
This will result in:
Product (ProdName, Manufacturer*) CountryOfOrigin (Manufacturer, CountryOfOrigin)
————————————–
The final schema:
Invoice (InvNo, CustName*)
InvoiceDetails (InvNo, ProdName*, Qty) Customer (CustName, CustAddr, CustSuburb*) Suburb (Suburb, Postcode)
Product (ProdName, Manufacturer*) CountryOfOrigin (Manufacturer, CountryOfOrigin)
The following Database instances demonstrate the database after each normalisation stage.
Original:
School/Department/Area
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 Page 6 of 7
RMIT Classification: Trusted
After removing repeating group by copying non-repeating values to make it into 1NF, but this is not an acceptable solution, as it adds redundant data.
Correct way to make it to 1NF is to decompose it by removing the repeating group of attributes.
After 2NF:
After 3NF:
School/Department/Area
Document: Normalisation Exercises.docx Author: Santha Sumanasekara Save Date: 09/05/2020 Page 7 of 7