Functional Dependencies
Functional Dependencies
Functional Dependencies
P.J. Mc.Brien
Imperial College London
P.J. Mc.Brien (Imperial College London) Functional Dependencies 1 / 41
Functional Dependencies Problems in Schemas
What is wrong with this schema?
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
SELECT cash
FROM bank data
WHERE so r t c od e=67
cash
34005.00
34005.00
34005.00
34005.00
34005.00
P.J. Mc.Brien (Imperial College London) Functional Dependencies 2 / 41
Functional Dependencies Problems in Schemas
What is wrong with this schema?
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
SELECT DISTINCT cash
FROM bank data
WHERE so r t c od e=67
cash
34005.00
P.J. Mc.Brien (Imperial College London) Functional Dependencies 2 / 41
Functional Dependencies Problems in Schemas
What is wrong with this schema?
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
SELECT DISTINCT r a t e
FROM bank data
WHERE account=107
rate
null
P.J. Mc.Brien (Imperial College London) Functional Dependencies 2 / 41
Functional Dependencies Problems in Schemas
Problems with Updates on Redundant Data
INSERT INTO bank data
VALUES (100 ,67 , ’ Strand ’ , 33005 .00 , ’ d e p o s i t ’ , ’ McBrien , P . ’ , n u l l ,
1017 ,−1000.00 , ’ 1999−01−21 ’ )
UPDATE bank data
SET r a t e =1.00
WHERE mid=1007
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. 1.00 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
100 67 Strand 33005.00 deposit McBrien, P. null 1017 -1000.00 1999-01-21
SELECT DISTINCT cash
FROM bank data
WHERE so r t c od e=67
cash
34005.00
33005.00
P.J. Mc.Brien (Imperial College London) Functional Dependencies 3 / 41
Functional Dependencies Problems in Schemas
Problems with Updates on Redundant Data
INSERT INTO bank data
VALUES (100 ,67 , ’ Strand ’ , 33005 .00 , ’ d e p o s i t ’ , ’ McBrien , P . ’ , n u l l ,
1017 ,−1000.00 , ’ 1999−01−21 ’ )
UPDATE bank data
SET r a t e =1.00
WHERE mid=1007
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. 1.00 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
100 67 Strand 33005.00 deposit McBrien, P. null 1017 -1000.00 1999-01-21
SELECT DISTINCT r a t e
FROM bank data
WHERE account=107
rate
null
1.00
P.J. Mc.Brien (Imperial College London) Functional Dependencies 3 / 41
Functional Dependencies Problems in Schemas
How do you know what is redundant?
Functional Dependency
A functional dependency (fd) X → Y states that if the values of attributes X
agree in two tuples, then so must the values in Y .
Using an FD to find a value
If the FD no → rate holds then x in the table below must always take the value 5.25,
but y and z may take any value.
bank data
no mid rate
101 1001 5.25
101 1008 x
119 1009 y
z 1010 5.25
P.J. Mc.Brien (Imperial College London) Functional Dependencies 4 / 41
Functional Dependencies Problems in Schemas
Quiz 1: FDs that hold in bank data
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
Which set of FDs below do not hold for the data?
A
no → rate
no → bname
B
no → type
bname → no
C
no → type
mid → bname
D
amount → rate
bname → sortcode
P.J. Mc.Brien (Imperial College London) Functional Dependencies 5 / 41
Functional Dependencies Problems in Schemas
Quiz 2: Deriving FDs from other FDs
sortcode → bname
no → sortcode
no → cname
no → rate
mid → no
Given the FDs above, which FD below might not hold?
A
no → bname
B
no,sortcode → cname,sortcode
C
amount,tdate → amount
D
amount,tdate → mid
P.J. Mc.Brien (Imperial College London) Functional Dependencies 6 / 41
Functional Dependencies Armstrong’s Axioms
Armstrong’s Axioms
X,Y and Z are sets of attributes, and XY is a shorthand for X ∪ Y
Reflexivity
Y ⊆ X |= X→Y
Such an FD is called a trivial FD
Applying reflexivity
If amount,tdate are attributes
By reflexivity
amount ⊆ amount, tdate |= amount, tdate → amount
tdate ⊆ amount, tdate |= amount, tdate → tdate
P.J. Mc.Brien (Imperial College London) Functional Dependencies 7 / 41
Functional Dependencies Armstrong’s Axioms
Armstrong’s Axioms
X,Y and Z are sets of attributes, and XY is a shorthand for X ∪ Y
Augmentation
X → Y |= XZ → Y Z
Applying augmentation
If no,cname,sortcode are attributes and no → cname
By augmentation
no → cname |= no, sortcode → cname, sortcode
P.J. Mc.Brien (Imperial College London) Functional Dependencies 7 / 41
Functional Dependencies Armstrong’s Axioms
Armstrong’s Axioms
X,Y and Z are sets of attributes, and XY is a shorthand for X ∪ Y
Transitivity
X → Y, Y → Z |= X → Z
Applying transitivity
If no → sortcode and sortcode → bname
By transitivity
no → sortcode, sortcode → bname |= no → bname
P.J. Mc.Brien (Imperial College London) Functional Dependencies 7 / 41
Functional Dependencies Deriving Rules from Armstrong’s Axioms
Union Rule
Armstrong’s Axioms
Reflexivity: Y ⊆ X |= X → Y
Augmentation: X → Y |= XZ → Y Z
Transitivity: X → Y, Y → Z |= X → Z
Union Rule
If X → Y,X → Z
By augmentation
X → Y |= XZ → Y Z
X → Z |= X → XZ
By transitivity
X → XZ,XZ → Y Z |= X → Y Z
If X → Y Z
By reflexivity
Y Z |= Y Z → Y, Y Z → Z
By transitivity
X → Y Z, Y Z → Y |= X → Y
X → Y Z, Y Z → Z |= X → Z
∴ X → Y,X → Z ≡ X → Y Z
Note that the union rules means that we can restrict ourselves to FD sets
containing just one attribute on the RHS of each FD without loosing
expressiveness
P.J. Mc.Brien (Imperial College London) Functional Dependencies 8 / 41
Functional Dependencies Deriving FDs
Quiz 3: Deriving FDs from other FDs
Given a set S = {A → BC,CD → E,C → F,E → F} of FDs
Which set of FDs below follows from S?
A
A → BF,A → CF,A → ABCF
B
A → BD,A → CF,A → ABCF
C
A → BD,A → BF,A → ABCF
D
A → BD,A → BF,A → CF
P.J. Mc.Brien (Imperial College London) Functional Dependencies 9 / 41
Functional Dependencies Deriving FDs
Pseudotransitivity Rule
Armstrong’s Axioms
Reflexivity: Y ⊆ X |= X → Y
Augmentation: X → Y |= XZ → Y Z
Transitivity: X → Y, Y → Z |= X → Z
Pseudotransitivity Rule
If X → Y,WY → Z
By augmentation
X → Y |= WX → WY
By transitivity
WX → WY,WY → Z |= WX → Z
∴ X → Y,WY → Z |= WX → Z
P.J. Mc.Brien (Imperial College London) Functional Dependencies 10 / 41
Functional Dependencies Deriving FDs
Decomposition Rule
Armstrong’s Axioms
Reflexivity: Y ⊆ X |= X → Y
Augmentation: X → Y |= XZ → Y Z
Transitivity: X → Y, Y → Z |= X → Z
Decomposition Rule
If X → Y,Z ⊆ Y
By reflexivity
Z ⊆ Y |= Y → Z
By transitivity
X → Y, Y → Z |= X → Z
∴ X → Y,Z ⊆ Y |= X → Z
P.J. Mc.Brien (Imperial College London) Functional Dependencies 11 / 41
Functional Dependencies FDs and Keys
FDs and Keys
Super-keys and minimal keys
If a set of attributes X in relation R functionally determines all the other
attributes of R, then X must be a super-key of R
If it is not possible to remove any attribute from X to form X ′, and X ′
functionally determine all attributes, then X is a minimal key of R
Determining keys of a relation
Suppose branch(sortcode, bname, cash) has the FD set
{sortcode → bname,bname → sortcode, bname → cash}
1 {sortcode, bname} is a super-key since {sortcode, bname} → cash
2 However, {sortcode, bname} is not a minimal key, since sortcode → {bname, cash}
and bname → {sortcode, cash}
3 sortcode and bname are both minimal keys of branch
P.J. Mc.Brien (Imperial College London) Functional Dependencies 12 / 41
Functional Dependencies FDs and Keys
Quiz 4: Deriving minimal keys from FDs
Suppose the relation R(A,B,C,D, E) has functional dependencies
S = {A → E,B → AC,C → D,E → D}
Which of the following is a minimal key?
A
A
B
AB
C
BC
D
B
P.J. Mc.Brien (Imperial College London) Functional Dependencies 13 / 41
Functional Dependencies FDs and Keys
Quiz 5: Keys and FDs
Suppose the relation R(A,B,C,D, E) has minimal keys AC and BC
Which FD does not necessarily hold?
A
ABC → DE
B
AC → BDE
C
AB → DE
D
BC → DE
P.J. Mc.Brien (Imperial College London) Functional Dependencies 14 / 41
Functional Dependencies Closure
Closure of a set of attributes with a set of FDs
Closure X+ of a set of attributes X with FDs S
1 Set X+ := X
2 Starting with X+ apply each FD Xs → Y in S where Xs ⊆ X
+ but Y is not
already in X+, to find determined attributes Y
3 X
+ := X+ ∪ Y
4 If Y not empty goto (2)
5 Return X+
Closure of attributes
Relation R(A,B,C,D, E, F ) has FD set S = {A → BC,CD → E,C → F,E → F}
To compute A+
Start with A+ = A, just A → BC matches, so Y = BC
A
+ = ABC, just C → F matches, so Y = F
A
+ = ABCF , no FDs apply, so we have the result
P.J. Mc.Brien (Imperial College London) Functional Dependencies 15 / 41
Functional Dependencies Closure
Closure of a set of attributes with a set of FDs
Closure X+ of a set of attributes X with FDs S
1 Set X+ := X
2 Starting with X+ apply each FD Xs → Y in S where Xs ⊆ X
+ but Y is not
already in X+, to find determined attributes Y
3 X
+ := X+ ∪ Y
4 If Y not empty goto (2)
5 Return X+
Closure of a set of attributes
Relation R(A,B,C,D, E, F ) has FD set S = {A → BC,CD → E,C → F,E → F}
To compute AD+
Start with AD+ = AD, just A → BC matches, so Y = BC
AD
+ = ABCD, CD → E,C → F matches, so Y = EF
AD
+ = ABCDEF , no FDs apply, so we have the result
P.J. Mc.Brien (Imperial College London) Functional Dependencies 15 / 41
Functional Dependencies Closure
Quiz 6: Closure of Attribute Sets
Given a relation R(A,B,C,D, E, F ) and FD set
S = {A → BC,C → D,BA → E,BD → F,EF → B,BE → ABC}
Which closure of attributes of S does not cover R?
A
A
+
B
BC
+
C
BE
+
D
EF
+
P.J. Mc.Brien (Imperial College London) Functional Dependencies 16 / 41
Functional Dependencies Closure
Closure of a set of Functional Dependencies
Closure of the FD Set
The closure S+ of a set of FDs S is the set of all FDs that can be infered from S
Two sets of FDs S, T are equivalent if S+ = T+
For speed, we can ignore
trivial FDs (e.g. ignore A → A)
LHS that are not minimal (e.g. ignore AB → C if A →)
flatten all FDs to have just one attribute in RHS (e.g. consider A → CD as A → C
and ArightarrowD)
Apart from calculating equivalence, do not normally need to compute closure
Equivalent FDs
S = {A → B,A → C,B → A,B → D}
T = {A → B,A → C,A → D,B → A}
S
+ = T+ = {A → B,A → C,A → D,B → A,B → C,B → D}
∴ S ≡ T
P.J. Mc.Brien (Imperial College London) Functional Dependencies 17 / 41
Functional Dependencies Minimal Cover
Minimal cover of a set of FDs
Minimal cover S
c
of S
A minimal cover Sc of FD set S has the properties that:
All the FDs in S can be derived from Sc (i.e. S
+ = S+c )
It is not possible to form a new set S′c by deleting an FD from Sc or deleting an
attribute from an FD in Sc, and S
′
c can still derive all the FDs in S
In general, a set of FDs may have more than one minimal cover
Deriving a minimal cover
Suppose S = {A → B,BC → A,A → C,B → C}
1 Since B → C
BC → A ⇒ B → A
Leaves S′ = {A → B,B → A,A → C,B → C}
2a Since A → B,B → C |= A → C
A → C ⇒ ∅
Leaves Sc = {A → B,B → A,B → C}
2b Since B → A,A → C |= B → C
B → C ⇒ ∅
Leaves Sc = {A → B,B → A,A → C}
P.J. Mc.Brien (Imperial College London) Functional Dependencies 18 / 41
Functional Dependencies Minimal Cover
Quiz 7: Minimal Cover of a Set of FDs
Given an FD set S = {A → BC,C → D,BA → E,BD → F,EF → B,BE → ABC}
Which is a minimal cover of S?
A
A → BC,C → D,BA → E,BD → F,EF → B,BE → ABC
B
A → BC,C → D,BA → E,BD → F,EF → B,BE → A
C
A → BCE,C → D,BD → F,EF → B,BE → A
D
A → BC,C → D,B → E,B → F,EF → B,BE → A
P.J. Mc.Brien (Imperial College London) Functional Dependencies 19 / 41
Functional Dependencies Minimal Cover
Worksheet: Minimal Cover
R(A,B,C,D, E,F, G,H)
S = {AB → DEH,BEF → A,FGH → C,D → EG,EG → BF,F → BH}
1 Rewrite S to an equivalent set of FDs which only have a single attribute on the
RHS of each FD.
2 Consider each FD X → A, and for each B ∈ X, consider if X → B from the
other FDs. If so, replace X → A by (X −B) → A in S.
3 Consider each FD X → A, and compute X+ without using X → A. If A ⊆ X+,
delete X → A since it is rundundant. This will give a minimal cover Sc of S.
4 Justify what are the minimal candidate keys of R constrained by Sc
P.J. Mc.Brien (Imperial College London) Functional Dependencies 20 / 41
Functional Dependencies Minimal Cover
Worksheet: Minimal Cover (Step 3)
1 AB+ = ABDEHGFC
Try removing AB → D: find AB+ = ABEH, so can’t remove.
Try removing AB → E: find AB+ = ABDHEGFC, so remove it from S′′ to get S′′′
Try removing AB → H: find AB+ = ABDEGFHC, so remove it from S′′′ to get
S′′′′ = {AB → D,EF → A, FG → C,D → E,D → G,EG → B,EG → F,F → B, F →
H}
2 EF+ = EFABHDGC
Try removing EF → A: find EF+ = EFBH, so can’t remove.
3 FG+ = FGCBH
Try removing FG → C: find FG+ = FGBH, so can’t remove.
4 D+ = DEGBFHAC
Try removing D → E: find D+ = DG, so can’t remove.
Try removing D → G: find D+ = DE, so can’t remove.
5 EG+ = EGBFHADC
Try removing EG → B: find EG+ = EGFBHADC, so remove it from S′′′′ to get S′′′′′
Try removing EG → F : find EG+ = EG, so can’t remove.
6 F+ = FBH
Try removing F → B: find F+ = FH, so can’t remove.
Try removing F → H: find F+ = FB, so can’t remove.
Thus S′′′′′ is a minimal cover
Sc = {AB → D,EF → A,FG → C,D → EG,EG → F, F → BH}
P.J. Mc.Brien (Imperial College London) Functional Dependencies 21 / 41
Normalisation
Normalisation
P.J. Mc.Brien
Imperial College London
P.J. Mc.Brien (Imperial College London) Normalisation 22 / 41
Normalisation Need for Normalisation
Using FDs to Formalise Problems in Schemas
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
Formalise the intuition of redundancy by the statements of FDs
mid → {tdate, amount,no},
no → {type, cname, rate, sortcode},
{cname, type} → no,
sortcode → {bname, cash}
bname → sortcode
1st Normal Form (1NF)
Every attribute depends on the key
P.J. Mc.Brien (Imperial College London) Normalisation 23 / 41
Normalisation Need for Normalisation
Quiz 8: 1st Normal Form
bank data
no sortcode bname cash type cname rate? mid amount tdate
100 67 Strand 34005.00 current McBrien, P. null 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. null 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. null 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. null 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. null 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18
mid → {tdate, amount,no},
no → {type, cname, rate, sortcode},
{cname, type} → no,
sortcode → {bname, cash}
bname → sortcode
Is bank data in 1st Normal form?
True False
P.J. Mc.Brien (Imperial College London) Normalisation 24 / 41
Normalisation Need for Normalisation
Prime and Non-Prime Attributes
Prime Attribute
An attribute A of relation R is prime if there is some minimal candidate key X of R
such that A ⊆ X
Any other attribute B ∈ Attrs(R) is non-prime
Prime and non-prime attributes of bank data
bank data(no,sortcode,bname,cash,type,cname,rate,mid,amount,tdate)
Has FDs mid → {tdate, amount, no}, no → {type, cname, rate, sortcode},
{cname, type} → no, sortcode → {bname, cash}, bname → sortcode
Then
1 the only minimal candidate key is mid
2 the only prime attribute is mid
3 non-prime attributes are no,sortcode,bname,cash,type,cname,rate,amount,tdate
P.J. Mc.Brien (Imperial College London) Normalisation 25 / 41
Normalisation 3NF
3rd Normal Form (3NF)
3rd Normal Form (3NF)
For every non-trivial FD X → A on R, either
1 X is a super-key
2 A is prime
Every non-key attribute depends on the key, the whole key and nothing but the key
Failure of bank data to meet 3NF
bank data(no,sortcode,bname,cash,type,cname,rate,mid,amount,tdate)
Has the following FDs where the LHS is not a super-key:
no → {type, cname, rate, sortcode}, {cname, type} → no,
sortcode → {bname, cash}, bname → sortcode
Each of the above FD causes the relation not to meet 3NF since the RHS
contains non-prime attributes
P.J. Mc.Brien (Imperial College London) Normalisation 26 / 41
Normalisation 3NF
Quiz 9: Prime and nonprime attributes
Given a relation R(A,B,C,D, E, F ) and an FD set
A → BCE,C → D,BD → F,EF → B,BE → A
What are the nonprime attributes?
A
DEF
B
BC
C
CDF
D
CD
P.J. Mc.Brien (Imperial College London) Normalisation 27 / 41
Normalisation 3NF
Quiz 10: 3rd Normal Form
Given a relation R(A,B,C,D, E, F ) and an FD set
A → BCE,C → D,BD → F,EF → B,BE → A
Which decomposition is not in 3NF?
A
R1(B,D, F ), R2(A,B,C,D,E)
B
R1(A,B,C,E, F ), R2(C,D)
C
R1(A,B,C,E, F ), R2(C,D), R3(B,D, F )
D
R1(B,E, F ), R2(A,C,E), R3(C,D)
P.J. Mc.Brien (Imperial College London) Normalisation 28 / 41
Normalisation Lossless Join
Lossless-join decomposition of relations
Lossless-join decomposition of a Relation
A lossless-join decomposition of a relation R with respect to FDs S into relations
R1, . . . , Rn has the properties that:
Attrs(R1) ∪ . . . ∪ Attrs(Rn) = Attrs(R)
For all possible extents of R satisfying S, πAttrs(R1) R ⋊⋉ . . . ⋊⋉ πAttrs(Rn) R = R
Lossless-join decomposition of bank data
bank data(no,sortcode,bname,cash,type,cname,rate,mid,amount,tdate)
Has FDs mid → {tdate, amount, no}, no → {type, cname, rate, sortcode},
{cname, type} → no, sortcode → {bname, cash}, bname → sortcode
Decomposing bank data into
branch = πsortcode,bname,cash bank data
account = πno,type,cname,rate,sortcode bank data
movement = πmid,amount,no,tdate bank data
satisfies the lossless-join decomposition property
P.J. Mc.Brien (Imperial College London) Normalisation 29 / 41
Normalisation Lossless Join
Problems if not a lossless-join decomposition
If a decomposition of R into R1, . . . , Rn is not lossless, then some tuples spread over
R1, . . . , Rn can result in phantom tuples appearing
R(A,B,C,D), S = {A → B,B → CD}
R
A B C D
1 1 2 6
2 2 3 4
3 3 3 5
R1
A B C
1 1 2
2 2 3
3 3 3
R2
C D
2 6
3 4
3 5
R1 ⋊⋉ R2
A B C D
1 1 2 6
2 2 3 4
3 3 3 5
2 2 3 5
3 3 3 4
P.J. Mc.Brien (Imperial College London) Normalisation 30 / 41
Normalisation Lossless Join
Quiz 11: Lossless join decomposition
Given a relation R(A,B,C,D, E, F ) and an FD set
A → BCE,C → D,BD → F,EF → B,BE → A
Which is not a lossless-join decomposition of R?
A
R1(B,D, F ), R2(A,B,C,D,E)
B
R1(A,B,C,E, F ), R2(C,D)
C
R1(A,B,C,E, F ), R2(C,D), R3(B,D, F )
D
R1(B,E, F ), R2(A,C,E), R3(C,D)
P.J. Mc.Brien (Imperial College London) Normalisation 31 / 41
Normalisation Lossless Join
Worksheet: Lossless Join Decomposition
1 R(A,B,C,D,E) has the FDs S = {AB → C,C → DE,E → A}.
Which of the following are lossless join decompositions?
1 R1(A,B, C), R2(C,D,E)
2 R1(A,B, C), R2(C,D), R3(D,E)
2 Derive a lossless join decomposition into three relations of R(A,B,C,D, E, F )
with FDs S = {AB → CD,C → E,A → F}.
3 Derive a lossless join decomposition into three relations of R(A,B,C,D, E, F )
with FDs S = {AB → CD,C → E,F → A}.
P.J. Mc.Brien (Imperial College London) Normalisation 32 / 41
Normalisation Lossless Join
Generating 3NF
Generating 3NF
1 Given R and a set of FDs S, find an FD X → A that causes R to violate 3NF
(i.e. for which A is not a prime attribute and X is not a superkey).
2 Decompose R into Ra(Attr(R)− A) and Rb(XA) (Note because the two
relations share X and X → A this is lossless)
3 Project the S onto the new relations, and repeat the process from (1)
Note that step (2) ensures that the decomposition is lossless since joining Ra with Rb
will share X, and X → A
Canonical Example of 3NF Decomposition
Suppose R(A,B,C) has FD set S = {A → B,B → C}
The only key is A, and so B → C violates 3NF (since B is not a superkey and C
is nonprime).
Decomposing R into R1(A,B) and R2(B,C) results in two 3NF relations.
P.J. Mc.Brien (Imperial College London) Normalisation 33 / 41
Normalisation Lossless Join
Example: Decomposing bank data into 3NF
Bank Database as a Single Relation
bank data(no,sortcode,bname,cash,type,cname,rate,mid,amount,tdate)
S = {mid → {tdate, amount,no}, no → {type, cname, rate, sortcode},
{cname, type} → no, sortcode → {bname, cash}, bname → sortcode}
Since sortcode → {bname, cash} and sortcode is not superkey and bname, cash
nonprime, we should decompose bank data into
1 branch(sortcode, bname, cash) with FDs sortcode → {bname, cash},
bname → sortcode
2 bank data′(no, sortcode, type, cname, rate,mid, amount, tdate) with FDs
mid → {tdate, amount,no}, no → {type, cname, rate, sortcode},
{cname, type} → no
branch is in 3NF, but no → {type, cname, rate, sortcode} makes bank data′ violate
3NF, so we should decompose bank data′ into:
3 account(no, type, cname, rate, sortcode) with FDs
no → {type, cname, rate, sortcode}, {cname, type} → no
4 movement(mid.amount, no, tdate) with FD mid → {tdate, amount,no}
The relations branch, account, and movement are all in 3NF
P.J. Mc.Brien (Imperial College London) Normalisation 34 / 41
Normalisation Preserving FDs
Preserving FDs during decomposition
FD preserving decomposition
A lossless decomposition of R with FDs S into Ra and Rb preserves functional
dependencies S if the projection of S+ onto Ra and Rb is equivalent to S
FD preserving decomposition
Suppose R(ABC) with S = {A → B,B → C,C → A} is decomposed into Ra(AB)
and Rb(BC).
S
+ = {A → B,A → C,B → A,B → C,C → A,C → B}
The projection of S+ onto Ra gives S
+
a = {A → B,B → A}
The projection of S+ onto Rb gives S
+
b
= {B → C,C → B}
Note that the union Su of the two subsets of S
+ (i.e. Su = S
+
a ∪ S
+
b
) has the
property that S+u = S
+, and hence the decomposition preserves functional
dependencies.
3NF
There is always possible to decompose a relation into 3NF in a manner that preserves
functional dependencies. Thus any good 3NF decomposition of a relation must also
preserve functional dependencies.
P.J. Mc.Brien (Imperial College London) Normalisation 35 / 41
Normalisation Preserving FDs
Quiz 12: Preserving FDs during Decomposition
Given a relation R(A,B,C,D, E, F ) and an FD set
A → BCE,C → D,BD → F,EF → B,BE → A
Which decomposition preserves FDs?
A
R1(B,D, F ), R2(A,B,C,D,E)
B
R1(A,B,C,E, F ), R2(C,D)
C
R1(A,B,C,E, F ), R2(C,D), R3(B,D, F )
D
R1(B,E, F ), R2(A,C,E), R3(C,D)
P.J. Mc.Brien (Imperial College London) Normalisation 36 / 41
Normalisation Preserving FDs
Preserving FDs, lossless join, and 3NF
Given a relation R(A,B,C,D, E, F ) and an FD set
A → BCE,C → D,BD → F,EF → B,BE → A
Decomposition lossless join 3NF Preserves FDs
R1(B,D, F ), R2(A,B,C,D, E) ✓ ✗ ✗
R1(A,B,C,E, F ), R2(C,D) ✓ ✓ ✗
R1(A,B,C,E, F ), R2(C,D), R3(B,D, F ) ✓ ✓ ✓
R1(B,E, F ), R2(A,C,E), R3(C,D) ✗ ✓ ✗
Decomposing to 3NF
Since it is always possible to decompose a relation into a 3NF form that is both a
lossless join decomposition, and preserves FDs, you should always do so.
P.J. Mc.Brien (Imperial College London) Normalisation 37 / 41
Normalisation Preserving FDs
Quiz 13: Preserving FDs during Decomposition to 3NF
Suppose the relation R(A,B,C,D, E) has functional dependencies
S = {AC → DBE,BC → DE,B → A,E → D} (and hence has minimal keys AC
and BC)
Which is a lossless join decomposition to 3NF that preserves FDs?
A
Ra(B,C,E), Rb(A,B,C), Rc(D,E)
B
Ra(A,B,C), Rb(A,C,D,E)
C
Ra(A,C,D), Rb(A,C,E), Rc(A,B)
D
Ra(A,C,E), Rb(B,D,E)
P.J. Mc.Brien (Imperial College London) Normalisation 38 / 41
Normalisation BCNF
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF)
For every non-trivial FD X → A on R, X is a super-key.
Every attribute depends on the key, the whole key and nothing but the key
BCNF schema
branch(sortcode, bname, cash) with FDs sortcode → {bname, cash}, bname → sortcode
is in BCNF since sortcode and bname are both candidate keys
account(no, type, cname, rate, sortcode) with FDs no → {type, cname, rate, sortcode},
{cname, type} → no is in BCNF since no and cname, type are both candidate keys
movement(mid.amount, no, tdate) with FD mid → {tdate, amount, no} is in BCNF
since mid is key
P.J. Mc.Brien (Imperial College London) Normalisation 39 / 41
Normalisation BCNF
Decomposition of Relations into BCNF
Generating BCNF
1 Given R and a set of FDs S, find an FD X → A that causes R to violate BCNF
(i.e. for which X is not a superkey).
2 Decompose R into Ra(Attr(R)− A) and Rb(XA) (Note because the two
relations share X and X → A this is lossless)
3 Project the S onto the new relations, and repeat the process from (1)
Difference between 3NF and BCNF
Suppose the relation address(no, street, town, county, postcode) has FDs
{no, street, town, county} → postcode, postcode → {street, town, county},
The relation is in 3NF (alternative keys no, street, town, county and no, postcode).
The relation is not in BCNF since postcode → {street, town, county} has a
non-superkey as the determinant
Decompose the relation address on postcode → {street, town, county} to:
postcode(postcode, street, town, county)
streetnumber(no, postcode)
Note FD {no, street, town, county} → postcode cannot be projected over the
relations.
P.J. Mc.Brien (Imperial College London) Normalisation 40 / 41
Normalisation BCNF
Worksheet: Normal Forms
Sc = {AB → D,EF → A,FG → C,D → EG,EG → F, F → BH}
1 Decompose the relation into 3NF
2 Decompose the relation into BCNF
3 Determine if your decompositions in (1) and (2) preserve FDs, and if they do
not, suggest how to amend you schema to preserve FDs.
P.J. Mc.Brien (Imperial College London) Normalisation 41 / 41
Functional Dependencies
Problems in Schemas
Armstrong’s Axioms
Deriving Rules from Armstrong’s Axioms
Deriving FDs
FDs and Keys
Closure
Minimal Cover
Normalisation
Need for Normalisation
3NF
Lossless Join
Preserving FDs
BCNF