Boyce-Codd normal form
Boyce-Codd Normal Form
Jianjun Chen
Previous Lecture
In the previous lecture we demonstrated how 2NF and 3NF disallow partial and transitive dependencies on the primary key of a relation, respectively.
Relations that have these types of dependencies may suffer from the update anomalies
In This Lecture
More normalisation
Lossless decomposition
Boyce-Codd normal form (BCNF)
Higher normal forms
Denormalisation
For more information
Connolly and Begg Chapter 15
Lossless Decomposition
To normalise a relation, we used projections
If R(A,B,C) satisfies A→B then we can project it on A,B and A,C without losing information
Lossless decomposition:
R = πAB(R) ⋈ πAC(R)
where πAB(R) is projection of R on AB and ⋈ is natural join.
Example of Lossless Decomposition
Assume that a module is taught by one lecturer
When is Decomposition not Lossless: No FD
Question:
Is the normalisation process from 1NF to 3NF lossless or lossy?
If we break a FD A→B by splitting them into two separate tables {A} and {B}, will it be lossy or lossless?
Answers:
1: Not always lossless, we will see that later.
2. It’s always lossy.
7
Boyce-Codd Normal Form
The Stream Relation
Consider a relation, Stream, which stores information about times for various streams of courses (Online teaching)
Assume:
Each course has several streams
Only one stream (of any course at all) takes place at any given time
Each (student, course) pair is assigned to a single stream for it. That is, (student, course) is the primary key.
9
The Stream Relation
What are the functional dependencies?
FDs in the Stream Relation
Stream has the following non-trivial FDs
{Student, Course} → {Time}
{Student, Time} → {Course}
{Time} → {Course}
Trivial FD: (A, B) → B
B is a subset of (A, B).
non-trivial FD: A → B
B is not a subset of A
Is Stream table in 3NF?
Yes, Stream table is in 3NF.
11
Anomalies in Stream
INSERT anomalies:
You can’t add an empty stream (a stream with no students)
Modification anomalies:
Moving the 12:00 class to 9:00 means changing two rows
DELETE anomalies
Deleting Rebecca removes a stream
Boyce-Codd Normal Form
“A relation is in BCNF, if and only if, every determinant is a candidate key.”
The difference between 3NF and BCNF is that for a functional dependency A → B, The process of normalising to 3NF does not care whether the determinant A is a candidate key or not.
whereas BCNF insists that for this dependency to remain in a relation, A must be a candidate key.
13
Stream and BCNF
Stream is not in BCNF as the FD {Time} → {Course} is non-trivial and {Time} is not a candidate key
Conversion to BCNF
{Student, Course} → {Time}
{Student, Time} → {Course}
{Time} → {Course}
BCNF & Decomposition Properties
In relation {A, B, C} with primary key (A, B), we have functional dependency (A, B) → C.
If we also have FD C → B, and C cannot be considered as a candidate key in the relation, then this relation is in 3NF but not in BCNF. Stream table shows that such a relation has update anomalies.
We have FD (A, B) → C → B, which looks like a transitive dependency, but (A, B) → B is a trivial FD.
That’s the reason why this table is in 3NF.
BCNF & Decomposition Properties
To convert {A, B, C} into BCNF, split it into {A, B} and {C, B}
But we lose FD (A, B) → C. We can no longer reproduce such relationship.
As a result, BCNF in this case is lossy.
Decomposition Properties
Lossless: Data should not be lost or created when splitting relations up
Dependency preservation: It is desirable that FDs are preserved when splitting relations up
Normalisation to BCNF may not preserve all dependencies.
As a result, BCNF is NOT always necessary.
Depends on whether anomalies are acceptable or not.
Another Example
What if (student, time) is the primary key?
FD1: {Student, Course} → {Time}
FD2: {Student, Time} → {Course}
FD3: {Time} → {Course}
In FD2, Course is partially dependent on the primary key. The table is not in 2NF.
Splitting the table to {Time, Course} and {Student, Time} results in a lossy decomposition.
FD {Student, Course} → {Time} is broken
You can check the result by yourself.
Normalisation to 3NF is NOT always lossless and dependency preserving
Higher Normal Forms
Violation of BCNF is quite rare.
The potential to violate BCNF may occur in a relation that:
contains two (or more) composite candidate keys;
the candidate keys overlap, that is have at least one attribute in common.
Higher Normal Forms
BCNF is as far as we can go with FDs
Higher normal forms are based on other sorts of dependency
Fourth normal form removes multi-valued dependencies
Fifth normal form removes join dependencies
Denormalisation
Denormalisation
Normalisation
Removes data redundancy
Solves INSERT, UPDATE, and DELETE anomalies
This makes it easier to maintain the information in the database in a consistent state
However
It leads to more tables in the database
Often these need to be joined back together, which is expensive to do
So sometimes (not often) it is worth ‘denormalising’
Denormalisation
You might want to denormalise if
Database speeds are unacceptable (not just a bit slow)
There are going to be very few INSERTs, UPDATEs, or DELETEs
There are going to be lots of SELECTs that involve the joining of tables
Denormalisation gives some of the data check work to clients.
24
/docProps/thumbnail.jpeg