5. Functional Dependencies
Housekeeping
Help, friendship & plagiarism Labs/tutorials
Tutor feedback survey is open on Wattle
Weekly pre-lab quizzes
Lab activities (5%): attending 5 labs + participating 5 pre-lab quizzes
Finding a group partner
Chartered accountants of Australia and New Zealand (Sean Lara)
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 1
5. Functional Dependencies
5. Functional Dependencies
We will study:
Database design quality
Data inconsistency Data redundancy Update anomalies
Functional dependencies Armstrong’s inference rules
Reflexive rule Augmentation rule Transitive rule
Inferred and implied functional dependencies Finding keys
Minimal cover (the hard part!)
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 2
5. Functional Dependencies
Alice’s Question
Foundations of Databases, S. Abiteboul, R. Hull, V. Vianu, Addison-Wesley, 1995.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 3
5. Functional Dependencies
Alice’s Question
Alice: Your model reduces the most interesting information to something flat and boring.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 4
5. Functional Dependencies
Alice:
Vittorio: Sergio:
Riccardo:
Your model reduces the most interesting information to something flat and boring.
You’re right, and this causes a lot of problems.
Designing the schema for a complex application is tough, and it is easy to make mistakes when updat- ing a database.
Also, the system knows so little about the data that it is hard to obtain good performance.
Alice’s Question
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 5
5. Functional Dependencies
Alice:
Vittorio: Sergio:
Riccardo:
Alice:
Vittorio:
Your model reduces the most interesting information to something flat and boring.
You’re right, and this causes a lot of problems.
Designing the schema for a complex application is tough, and it is easy to make mistakes when updat- ing a database.
Also, the system knows so little about the data that it is hard to obtain good performance.
Are you telling me that the model is bad? No, wait, we are going to fix it!
Alice’s Question
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 6
5. Functional Dependencies
Database Design Quality
A fundamental question in database design:
What constitutes a “well-designed” database schema?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 7
5. Functional Dependencies
Database Design Quality
A fundamental question in database design:
What constitutes a “well-designed” database schema?
We have learnt that:
A database design often starts with building an EER model.
An EER model can then be translated to a relational database schema.
However, such an EER model may not be “perfect”. Instead, it is common to have many different EER models for the same application.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 8
Product (ID, description, quantity)
Manufacturer (Website, name) Order (No, items, quantity)
5. Functional Dependencies
Database Design Quality – Examples (1)1
Page 1 of 1
Page 1 of 1
1
http://dblab.cs.toronto.edu/ miller/amalgam/
Page 1 of 1
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 9
file:///C:/Users/qing/Desktop/image001.gif 14/08/2015 http://dblab.cs.toronto.edu/~miller/amalgam/index_files/image002.gif 14/08/2015
5. Functional Dependencies Customer (ID, name, email, address)
Weak entities: 2 Delivery (tracking No, charges)
Database Design Quality – Examples (2)
Payment method (invoice no, amount)
The EER diagram does not show “all orders are delivered from a warehouse owned by
2
Previous COMP2400/6240 students’ solutions for an EER modelling question
the company”.
Question 2
Product ID
Address
Email
Name
Internet M
Product 1
Delivery
Customer ID
Customer 11
Shop
Quantity Description Manufacturer
Product 1
Delivery
Cost
Bank Transfer
M
M
Product ID Quantity
Description Manufacturer
Website of Manufacturer
Website of Manufacturer M
Paypal
Cost
Credit Card
Paypal
Cash Credit
Card
Bank Transfer
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 10
5. Functional Dependencies
Database Design Quality
Some desirable properties of a “well-designed” database schema
Completeness
Has all relevant information been captured?
Redundancy freeness
Hasthttep://dbloab.ucs.btorloinnto.gedu/o~mfillerre/amlealgvama/indetx_ifinlesf/iomargme003a.gitfionbeen1a4/0v8/o20i1d5ed(ifpossible)? Consistent understanding
Is the meaning of all relevant information consistent?
Performance
Can the database schema lead to the good performance for given tasks?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 11
5. Functional Dependencies
Design Guidelines
Some informal guidelines for determining the quality of (relational) database designs:
Reducing the NULL values;
Does not apply Unknown
Known but absent
Reducing the redundant information;
Making sure that the meaning of attributes is clear; Disallowing the possibility of generating spurious tuples; …
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 12
5. Functional Dependencies
Motivating Example
Suppose that we want to store the enrolment information (i.e., course no, semester and unit) of students (i.e., name, student id and date of birth) in a relational database.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 13
5. Functional Dependencies
Motivating Example
Suppose that we want to store the enrolment information (i.e., course no, semester and unit) of students (i.e., name, student id and date of birth) in a relational database.
Is the design of the relation ENROLMENT good?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 14
5. Functional Dependencies
Motivating Example – Data Inconsistency
Any inconsistency problems with these tuples?
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 15
5. Functional Dependencies
Motivating Example – Data Inconsistency
Any inconsistency problems with these tuples?
The same student has different DoBs. This seems unreasonable.
There are different units for the same course in the same semester.
That should not happen.
The different students have the same ID. This is unacceptable.
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 16
5. Functional Dependencies
Motivating Example – Data Redundancy
Any redundancy problems with these tuples?
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 17
5. Functional Dependencies
Motivating Example – Data Redundancy
Any redundancy problems with these tuples?
There exists redundant information about students.
There exists redundant information about courses.
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 18
5. Functional Dependencies
Motivating Example – Update Anomalies
What could happen to update operations (e.g., insert, delete and update)?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 19
5. Functional Dependencies
Motivating Example – Update Anomalies
What could happen to update operations (e.g., insert, delete and update)?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Insertion anomalies: If inserting a new course COMP3000, then … Deletion anomalies: If deleting the enrolled course COMP2400 of
Fran, then …
Modification anomalies: If changing the DoB of Tom, then …
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 20
5. Functional Dependencies
Database Design Issues
We have seen the following database design issues so far:
Data inconsistency Data redundancy Update anomalies
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1989
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123456
11/09/1987
COMP2400
2009 S2
8
Can we avoid these issues when designing a database? How about using Null values?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 21
5. Functional Dependencies
Database Design Issues – Motivating Example
We may fix those database design issues through breaking a relation into smaller relations.
For example, each tuple in ENROLMENT represents three different facts:
Information about students Information about courses Course enrolment of students
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 22
5. Functional Dependencies
Database Design Issues – Motivating Example
⇓
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
STUDENT
Name
StudentID
DoB
ENROL
StudentID
CourseNo
Semester
123456
COMP2400
2010 S2
123456
COMP8740
2011 S2
123458
COMP2400
2009 S2
123458
COMP8740
2011 S2
123457
COMP2400
2009 S2
Tom
123456
25/01/1988
Michael
123458
21/04/1985
Fran
123457
CourseNo
COMP2400
11/09/1987
COURSE
Unit
6
COMP8740
12
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 23
5. Functional Dependencies
3 1971.
Codd and Functional Dependencies
Functional dependencies (FDs) were introduced by Codd in 1971 3 Edgar F. Codd of IBM Research (1923-2003) invented the relational data
model for data management in 1970.
He received the ACM Turing Award in 1981 for his contributions on the
theoretical foundations of relational databases:
Functional dependencies Normalization
– Boyce–Codd Normal Form (BCNF)
Query languages
– Relational Calculus – Relational Algebra
Further Normalization of the Data Base Relational Model. E. F. Codd, IBM Research Report, San Jose, California,
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 24
5. Functional Dependencies
Why Functional Dependencies?
We need some formal way of analysing whether a database schema is well-designed, or why one is better than another.
FDs are developed to define the goodness and badness of (relational) database design in a formal way.
Top down: start with a relation schema and FDs, and produce smaller relation schemas in certain normal form (called normalisation).
Bottom up: start with attributes and FDs, and produce relation schemas (not popular in practice).
FDs tell us “relationship between and among attributes”!
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 25
5. Functional Dependencies
Motivating Example
We have two FDs on ENROLMENT:
StudentID functionally determines Name and DoB, i.e.,
{StudentID} → {Name, DoB}
CourseNo functionally determines Unit, i.e., {CourseNo} → {Unit}
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 26
5. Functional Dependencies
Functional Dependencies – Informal Description
A FD says that, within a relation, the values of some attributes determine the values of other attributes.
How many legs does a kangaroo have?
How many legs does a pig have?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 27
5. Functional Dependencies
Functional Dependencies – Informal Description
A FD says that, within a relation, the values of some attributes determine the values of other attributes.
Animal → Legs
kangaroo 2 pig 4
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 28
5. Functional Dependencies
5. Functional Dependencies
Functional Dependencies – Informal Description
A FD says that, within a relation, the values of some attributes determine the values of other attributes.
Animal → Legs kangaroo 2
pig 4
If attributes A, B, C determine attributes D, E, then we write {A,B,C} → {D,E}
This means, if two tuples have the same values for A, B and C, then they must also have the same values for D and E.
A, B and C are the determinant, while D and E are the dependent.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 29
5. Functional Dependencies
What is “Functional” about Functional Dependencies?
The notion of functional dependency is very close to the notion of function. A (total) function f : X → Y describes a relationship between two sets X
and Y such that each element of X is mapped to a unique element of Y .
What is “Functional” about Functional Dependencies?
Add another slide (Optional): Before slide 14, we might say that the concept of functional What is “Functional” about Functional Dependencies?
Add another slide (Optional): Before slide 14, we might say that the concept of functional dependencies stems from the uniqueness property of the definition of function
dependencies stems from the uniqueness property of the definition of function
Recall that a relation f:X -> Y is a function if any two values at the first position of the tuples of
the relation f are different. In other words if x1=x2 then f(x1) = f(x2)
Recall that a relation f:X -> Y is a function if any two values at the first position of the tuples of the relation f are different. In other words if x1=x2 then f(x1) = f(x2)
A (total) function f : X → Y describes a relationship between two sets X Exercise: Which of the following diagrams satisfy the uniqueness property?
A (total) function f : X → Y describes a relationship between two sets X Exercise: Which of the following diagrams satisfy the uniqueness property?
and Y such that each element of X is mapped to a unique element of Y . Which one of them satisfy uniqueness property?
and Y such that each element of X is mapped to a unique element of Y . Which one of them satisfy uniqueness property?
Exercise: which of them represent a function? 1515
Exercise: which of them represent a function? 1515
Function
Functional
XY
Determinant Dependant
Domain
Codomain
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 30
dependency
5. Functional Dependencies
2626 3737
2626 3737
151 262 373
5 6 7
151 262 373
5 6 7
Slide 14: Consider adding in the second dot point: In other words if t1[X] = t2[X] then we must
have t1[Y] = t2[Y]. The following is also equivalent to the following:
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 31
Add another slide
Answer: The ones at the bottom.
Slide 14: Consider adding in the second dot point: In other words if t1[X] = t2[X] then we must
have t1[Y] = t2[Y]. The following is also equivalent to the following:
X is said to be the determinant and Y as the dependent of the functional dependent.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016
32
Add another slide
X is said to be the determinant and Y as the dependent of the functional dependent.
5. Functional Dependencies
Formal Definition
Let R be a relation schema.
A FD on R is an expression X → Y with attribute sets X,Y ⊆ R. A relation r(R) satisfies X → Y on R if, for any two tuples
t1,t2 ∈ r(R), whenever the tuples t1 and t2 coincide on values of X, they also coincide on values of Y .
A FD is trivial if it can always be satisfied, e.g., {A,B,C} → {C}
{A,B,C} → {A,B}
Syntactical convention: (1) Instead of {A, B, C}, we may use ABC. (2) A, B, . . . for individual attributes and X , Y , . . . for sets of attributes.
t1[X] = t2[X] ⇓
t1[Y] = t2[Y]
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 33
5. Functional Dependencies
Exercise – Functional Dependencies
A functional dependency specifies a constraint on the relation schema that must hold at all times.
Consider the following relations with attributes {A,B,C,D,E}. Do they satisfy: (1) AB → E; (2) ABC → D; (3) D → E?
r1(R)
A
B
C
D
E
1
1
1
9
9
1
1
1
8
9
r3(R)
A
B
C
D
E
1
1
1
9
9
1
1
2
9
8
r2(R)
A
B
C
D
E
1
1
1
9
9
1
2
2
8
9
r4(R)
A
B
C
D
E
1
1
1
9
9
1
2
2
9
8
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 34
5. Functional Dependencies
Exercise – Functional Dependencies
Consider the following relations with attributes {A,B,C,D,E}. Do they satisfy: (1) AB → E; (2) ABC → D; (3) D → E?
r1(R)
A
B
C
D
E
1
1
1
9
9
1
1
1
8
9
r3(R)
A
B
C
D
E
1
1
1
9
9
1
1
2
9
8
r2(R)
A
B
C
D
E
1
1
1
9
9
1
2
2
8
9
r4(R)
A
B
C
D
E
1
1
1
9
9
1
2
2
9
8
Answer:
r1(R) r2(R) r3(R) r4(R)
yes
yes
no
yes
no
yes
yes
yes
yes
yes
no
no
(1) AB → E (2) ABC → D (3) D → E
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 35
5. Functional Dependencies
How to Identify FDs?
In real-life applications, we often use the following approaches:
(1) Analyze sample data
Useful when application users are unavailable for consultation and/or
the document is incomplete.
(2) Analyze data requirements
Can be provided in the form of discussion with application users and/or data requirement specifications.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 36
5. Functional Dependencies
(1) Identifying FDs – Sample Data
Can you find some FDs on ENROLMENT based on the sample data?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 37
5. Functional Dependencies
(1) Identifying FDs – Sample Data
Can you find some FDs on ENROLMENT based on the sample data?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
We may have:
{StudentID} → {Name, DoB};
{CourseNo} → {Unit};
{StudentID, CourseNo, Semester} → {Name, DoB, Unit}; {Name} → {StudentID};
{DoB} → {StudentID};
……
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 38
5. Functional Dependencies
(1) Identifying FDs – Sample Data
Can you find some FDs on ENROLMENT based on the sample data?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
We may have:
{StudentID} → {Name, DoB};
{CourseNo} → {Unit};
{StudentID, CourseNo, Semester} → {Name, DoB, Unit}; {Name} → {StudentID};
{DoB} → {StudentID};
……
Limitations: Sample data needs to be a true representation of all possible values that the database may hold.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 39
5. Functional Dependencies
(1) Identifying FDs – Sample Data
Can you find some FDs on ENROLMENT based on the sample data?
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Tom
123470
02/01/1989
COMP2400
2013 S2
6
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Peter
123480
21/04/1985
COMP2600
2013 S2
6
Fran
123457
11/09/1987
COMP2400
2009 S2
6
We may have:
{StudentID} → {Name, DoB};
{CourseNo} → {Unit};
{StudentID, CourseNo, Semester} → {Name, DoB, Unit}; {Name} → {StudentID} ×;
{DoB} → {StudentID} ×;
……
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 40
5. Functional Dependencies
5. Functional Dependencies
(2) Identifying FDs – Data Requirements
Consider the following relation schema: RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} .
Data requirements:
1 Each customer can be uniquely identified by his or her customer ID.
2 A customer cannot rent two or more properties from the same date.
3 Two different customers cannot share the same lease
4 A customer cannot rent the same property more than once.
5 Each property can be uniquely identified by its owner.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 41
5. Functional Dependencies
Inference???
To design a good database, we need to consider all possible FDs. Example:
If {StudentID} → {ProjectNo} and {ProjectNo} → {Supervisor} , we can infer {StudentID} → {Supervisor} .
(i.e., each student works on one project and each project has one supervisor, then each student must have one project supervisor)
Can we systematically infer all possible FDs?
(2) Identifying FDs – Data Requirements
Consider the following relation schema: RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} .
Data requirements:
1 Each customer can be uniquely identified by his or her customer ID. {CustID} → {CustName}
2 A customer cannot rent two or more properties from the same date. {CustID, DateStart} → {PropertyNo}
3 Two different customers cannot share the same lease. {PropertyNo, DateStart} → {CustID}
4 A customer cannot rent the same property more than once. {PropertyNo, CustID} → {DateStart}
5 Each property can be uniquely identified by its owner. {Owner} → {PropertyNo}
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 42
5. Functional Dependencies
Armstrong’s Inference Rules4
The Armstrong’s inference rules consist of the following three rules:
Reflexive rule: Augmentation rule:
Transitive rule:
XY → Y
{X → Y } |= XZ → YZ
{X → Y , Y → Z } |= X → Z
We use the notation Σ |= X → Y to denote that X → Y is inferred from the set Σ of functional dependencies.
4
William Ward Armstrong: Dependency Structures of Data Base Relationships, page 580-583. IFIP Congress, 1974.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 43
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 44
5. Functional Dependencies
Rule 1 – Reflexive Rule
If two tuples which agree on X and Y , then they also agree on Y , i.e., XY → Y.
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Example: {StudentID, CourseNo, Semester} → {CourseNo, Semester}, where
X={StudentID}; Y={CourseNo, Semester}.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 45
5. Functional Dependencies
Rule 2 – Augmentation Rule
If two tuples agree on X and Y, then they agree on X,Z and Y,Z, i.e., {X → Y} |= XZ → YZ.
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Example: {{StudentID} → {Name, DoB}} |={StudentID, CourseNo} → {Name, DoB, CourseNo}, where
X={StudentID}; Y={Name, DoB}; Z={CourseNo}.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 46
5. Functional Dependencies
Rule 3 – Transitive Rule
If X determines Y and Y determines Z , then X determines Z , i.e., { X → Y , Y → Z } |= X → Z .
ENROLMENT
Name
StudentID
DoB
CourseNo
Semester
Unit
Tom
123456
25/01/1988
COMP2400
2010 S2
6
Tom
123456
25/01/1988
COMP8740
2011 S2
12
Michael
123458
21/04/1985
COMP2400
2009 S2
6
Michael
123458
21/04/1985
COMP8740
2011 S2
12
Fran
123457
11/09/1987
COMP2400
2009 S2
6
Example: {{StudentID, CourseNo} → {CourseNo}, {CourseNo} → {Unit}} |= {StudentID, CourseNo} → {Unit}, where
X={StudentID, CourseNo}; Y={CourseNo};
Z={Unit}.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 47
5. Functional Dependencies
Inference Rules
If each student works on one project and each project has one supervisor, does each student have one project supervisor?
{{StudentID}→{ProjectNo}, {ProjectNo}→{Supervisor}}
|= {StudentID}→{Supervisor}
This can be proven by using the Transitive rule:
{ X → Y , Y → Z } |= X → Z
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 48
5. Functional Dependencies
Armstrong’s Inference Rules
Two questions:
Are all the FDs inferred using the Armstrong’s inference rules correct?
soundness
Can we use the Armstrong’s inference rules to infer all possible FDs?
completeness
Theorem (W. W. Armstrong, 1974)
The Armstrong’s inference rules are both sound and complete.
Implied FDs
vs
Inferred FDs
Pa
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 49
5. Functional Dependencies
Additional Inference Rules
The following rules can be derived from the Armstrong inference rules:
Union rule:
{X → Y , X → Z } |= X → YZ Decomposition rule:
{X →YZ}|={X →Y,X →Z} Pseudo Transitivity rule:
{X →Y,YZ →W}|=XZ →W
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 50
ge 1 of 1
5. Functional Dependencies
Common Questions from Students
http://blog.writeathome.com/wp-content/uploads/2012/03/Imply_Infer-e13315805169… 14 Can we use the following rules to infer FDs, i.e., are they correct?
(1) {X→Y}|=XZ→YZ
(2) {XZ→YZ}|=X→Y
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 51
5. Functional Dependencies
Common Questions from Students
Can we use the following rules to infer FDs, i.e., are they correct?
(1) {X→Y}|=XZ→YZ
Yes, using the Augmentation rule.
(2) {XZ→YZ}|=X→Y
No. See the counter-example below:
X
Y
Z
a a
b c
c d
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 52
/08/2015
pendencies. A (functional) dependency is derivable from Σ using if and only if there exists
5. Functional Dependencies
a derivation tree over and Σ with root ψ (notation: Σ ⊢ ψ). The syntactic hull Σ+ of Σu
Σ+ = {ψ | Σ ⊢ ψ}. Implied Functional Dependencies ⊔⊓
In most cases we deal with a fixed set of axioms and rules such as the Armstrong axioms. So, we normally drop the index .
Computing Σ+ using the Armstrong’s inference rules is not efficient.
Implied Functional Dependencies
We write Σ+ for all possible FDs implied by Σ.
Σ+canbecomputedusingtheArmstrong’sinferencerules. Example7.2.E5x.aTmakpelet:hCeorenlsaitdieonrascrheelamtiaonRs=chAemBaCRDE=a{nAd,BΣ,C=,D{A,EB}→anCdDa,sBet→ofEFD,DsE→ A}. Using theΣAr=m{stArBon→g aCxiDom,Bs→forEfu,nDcEtio→naAl }d.epUesnindgentchiesAwrmesotbrtoanignrtuhlesfo,lwloewhinagvedethrievation
tree:
following, which shows DB → A ∈ Σ+.
DB → B
B → E
DB → E
DB → BDE BDE → DE
DB → DE DE → A DB → A
∑
∑ ∑+
Armstrong’s rules
Why can we compute Σ+ using the Armstrong’s inference rules? Nonetheless, computing Σ+ using the Armstrong’s inference rules is not
efficient.∑1 ∑1+ = Comp2400/Comp6240 – Relational Databases, Semester 2, 2016
∑2 ∑2+
53
It follows that (DB → A) ∈ Σ+. ⊔⊓ How can we do it better?
Exa
of derivations using this set of inference rules and axioms. We first prove the union rule, i.e., {X → Y,X → Z} ⊢ X → YZ. The following derivation tree shows that this derivation is indeed valid.
5. Functional Dependencies
5. Functional Dependencies
In order to explain the derivation steps in detail we start on top of the tree. The node XY → X is an instantiation of the reflexivity axiom (X ⊆ XY ) and is therefore allowed to occur as a leaf.
Implied Functional Dependencies
Implied Functional Dependencies – Example
LetΣbeasetofFDs.Then,todecide whetherornotΣ|=X→Wholds, we need to
1 Compute the set of all attributes that are dependent on X , which is called the closure of X under Σ and is denoted by X+.
The node X → Z belongs to Σ. Applying the transitivity rule gives XY → Z ∈ Σ+. From there
we derive XY → XY Z ∈ Σ+ using the extension rule. But XY Z → Y Z is the reflexivity axiom
again and transitivity leaves us with XY → Y Z. The extension rule provides X → XY ∈ Σ+ Consider a relation schema R = {A,B,C,D,E,F}, a set of FDs
Σ = {AC → B,B → CD,C → E,AF → B} on R. Decide whetherornotΣ|=AC→EDholds.
2 Σ|=X→WholdsiffW⊆X+. Algorithm5
X+ :=X;
repeat until no more change on X +
X+
Pushing out
X
5
international edition]
foreachY →Z ∈ΣwithY ⊆X+, add all the attributes in Z to X+, i.e., replaceX+ byX+ ∪Z.
See Algorithm 15.1 in [Elmasri and Navathe, 6th edition], or Algorithm 1 on Page 555 in [Elmasri and Navathe,
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 55
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 56
nder is the set of all (functional) dependencies that are derivable from Σ using , i.e., 5. Functional Dependencies
mple 7.2.6. Let denote the Armstrong Axioms for FDs. We will show some examples Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 54
5. Functional Dependencies
5. Functional Dependencies
Implied Functional Dependencies – Example
Consider a relation schema R = {A,B,C,D,E,F}, a set of FDs Σ = {AC → B,B → CD,C → E,AF → B} on R.
Decide whether or not Σ |= AC → ED holds . 1 We first build the closure of AC:
Implied Functional Dependencies – Example
Consider a relation schema R = {A,B,C,D,E,F}, a set of FDs Σ = {AC → B,B → CD,C → E,AF → B} on R.
Decide whether or not Σ |= AC → ED holds . 1 We first build the closure of AC:
(AC)+ ⊇ AC ⊇ACB
⊇ ACBD ⊇ ACBDE = ACBDE
by the reflexive rule usingAC →B using B → CD using C → E
(AC)+ ⊇ AC ⊇ACB
⊇ ACBD ⊇ ACBDE = ACBDE
by the reflexive rule usingAC →B using B → CD using C → E
2 Then we check that ED ⊆ (AC)+. Hence Σ |= AC → ED.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 57
5. Functional Dependencies
Exercise – Implied Functional Dependencies
Consider a relation schema R = {A, B, C, D, E} and a set of functional dependencies Σ = {A → C,B → C,CD → E} on R.
Decide whether or not
Σ |= AD → CE holds Σ |= BD → AC holds
2 Then we check that ED ⊆ (AC)+. Hence Σ |= AC → ED. Can you quickly tell whether or not Σ |= AC → EF holds?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 58
5. Functional Dependencies
Exercise – Implied Functional Dependencies
Consider a relation schema R = {A, B, C, D, E} and a set of functional dependencies Σ = {A → C,B → C,CD → E} on R.
Decide whether or not
Σ |= AD → CE holds Σ |= BD → AC holds
We build the closure for the set of attributes and check:
1 (AD)+ = (ACD)+ = (ACDE)+ = ACDE and CE ⊆ (AD)+, hence
Σ |= AD → CE.
2 (BD)+ = (BCD)+ = (BCDE)+ = BCDE and AC ̸⊆ (BD)+, hence
Σ ̸|= BD → AC.
1 2
1
2
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 59
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 60
Armstrong’s rules + ∑ ∑
5. Functional Dependencies ∑
Equivalence of Functional Dependencies
5. Functional Dependencies
Σ1 and Σ2 are equivalent if Σ+1 = Σ+2 .
∑1 ∑1+ =
Example:
LetΣ1 ={X →Y,Y →Z}andΣ2 ={X →Y,Y →Z,X →Z}. Wehave: Σ1 ̸=Σ2 butΣ+1 =Σ+2 ={X →Y,Y →Z,X →Z}.
Hence, Σ1 and Σ2 are equivalent. Questions:
1 Is it possible that Σ+1 = Σ+2 but Σ1 ̸= Σ2?
2 Is it possible that Σ+1 ̸= Σ+2 but Σ1 = Σ2?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 61
We will need keys for defining the normal forms later on.
A subset of the attributes of a relation schema R is a superkey if it
uniquely determines all attributes of R.
A superkey K is called a key if no proper subset of K is a superkey.
That is, if you take any of the attributes out of K , then there is not enough to uniquely identify tuples.
Candidate keys are keys, and the primary key is chosen from them.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 62
A Bunch of Keys
∑2 ∑2+
5. Functional Dependencies
5. Functional Dependencies
A Bunch of Keys
Exercises – A Bunch of Keys
Candidate keys (keys)
Candidate keys (keys)
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 63
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 64
Superkeys
Some confusing questions:
(1) Is a key a candidate key?
(2) Is a primary key a candidate key? (3) Is a superkey a key?
(4) Is a key a superkey?
(5) Is a candidate key a superkey?
The Primary key
Superkeys
The Primary key
5. Functional Dependencies
Finding Keys
Given a set Σ of FDs on a relation R, the question is:
How can we find all the (candidate) keys of R?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 65
5. Functional Dependencies
Finding Keys
Fact:AkeyK ofRalwaysdefinesaFDK →R.
Algorithm6 :
Input: a set Σ of FDs on R. Output: a set of keys of R.
compute the closure X+ for every subset X of the relation R if X+ = R, then X is a superkey.
ifnopropersubsetY ofX withY+ =R,thenX isakey.
A prime attribute is an attribute occurring in a key, and a non-prime attribute is an attribute that is not a prime attribute.
6
Navathe, international edition] to finding all keys of R
It extends Algorithm 15.2(a) in [Elmasri and Navathe, 6th edition], or Algorithm 2(a) on Page 558 in [Elmasri and
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 66
5. Functional Dependencies
Finding Keys – Example
Consider ENROLMENT and the following FDs:
{StudentID} → {Name};
{StudentID, CourseNo, Semester} → {ConfirmedBy, Office}; {ConfirmedBy} → {Office}.
What are the keys, superkeys and prime attributes of ENROLMENT?
ENROLMENT
Name
StudentID
CourseNo
Semester
ConfirmedBy
Office
Tom
123456
COMP2400
2010 S2
Jane
R301
Mike
123458
COMP2400
2008 S2
Linda
R203
Mike
123458
COMP2600
2008 S2
Linda
R203
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 67
5. Functional Dependencies
Finding Keys – Example
Consider ENROLMENT and the following FDs:
{StudentID} → {Name};
{StudentID, CourseNo, Semester} → {ConfirmedBy, Office}; {ConfirmedBy} → {Office}.
We have
{StudentID, CourseNo, Semester} is the only key.
Every superset of {StudentID, CourseNo, Semester} is a superkey. StudentID, CourseNo and Semester are the prime attributes.
ENROLMENT
Name
StudentID
CourseNo
Semester
ConfirmedBy
Office
Tom
123456
COMP2400
2010 S2
Jane
R301
Mike
123458
COMP2400
2008 S2
Linda
R203
Mike
123458
COMP2600
2008 S2
Linda
R203
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 68
5. Functional Dependencies
5. Functional Dependencies
Exercise – Finding Keys
Exercise – Finding Keys
Consider a relation schema R = {A, B, C, D} and a set of functional dependencies Σ = {AB → C, AC → D}.
1 List all the keys and superkeys of R.
2 Find all the prime attributes of R.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016
5. Functional Dependencies
Exercise – Finding Keys
Checking all possible combinations of the attributes is too tedious! Example: Still consider a relation schema R = {A, B, C, D} and
Σ = {AB → C, AC → D}. List all the keys of R. Some tricks:
If an attribute never appears in the dependent of any FD, this attribute must be be part of each key.
If an attribute never appears in the determinant of any FD but appears in the dependent of any FD, this attribute must not be part of each key.
69
Consider a relation schema R = {A, B, C, D} and a set of functional dependencies Σ = {AB → C, AC → D}.
1 List all the keys and superkeys of R.
2 Find all the prime attributes of R.
Solution:
1 We compute the closures for all possible combinations of the attributes in R:
(A)+ =A,(B)+ =B,(C)+ =C,(D)+ =D;
(AB)+ = ABCD, (AC)+ = ACD, (AD)+ = AD, (BC)+ = BC, (BD)+ = BD, (CD)+ = CD
(ABC)+ = ABCD, (ABD)+ = ABCD, (ACD)+ = ACD, (BCD)+ = BCD
2 Hence, we have
AB is the only key of R.
AB, ABC, ABD and ABCD are the superkeys of R. A and B are the prime attributes of R.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 70
5. Functional Dependencies
Minimal Cover – The Hard Part!
LetΣbeasetofFDs. AminimalcoverΣm ofΣisasetofFDssuchthat
1 Σm is equivalent to Σ, i.e., Σ+m = Σ+, and
2 Dependent: each FD in Σm has only a single attribute on its right hand side;
3 Determinant: each FD has as few attributes on the left hand side as possible;
4 If we remove any FD from Σm, then Σm is not equivalent to Σ.
You can obtain a minimal cover for a given set of FDs by just doing these
three things (Steps 2, 3, 4) (as in the algorithm in the next slide).
A minimal cover can tell us what depends on what, the keys, the prime and
thus non-prime attributes, which will be needed for normalisation.
If a set X of attributes is a key, then any proper superset of X must not be a key.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 71
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 72
5. Functional Dependencies
5. Functional Dependencies
Minimal Cover – The Hard Part!
Minimal Cover
Algorithm7 :
Input: a set Σ of FDs on R.
Output: a minimal cover Σm for Σ on R. Step 1: Start with Σm = Σ
Step 2: Replace each FD X → {A1,…,Ak} in Σm with
X → A1 , . . . , X → Ak (i.e., the right hand side is single attribute).
Step 3: For each FD X → A in Σm, check each attribute B of X: if we can replace X → A with (X − B) → A in Σm, and still
remain equivalence of Σm and Σ, then we replace it. Step 4: For each remaining FD X → A in Σm:
if we can remove X → A and Σm is still equivalent to Σ, then we remove it.
Theorem:
The previous algorithm is non deterministic, and cannot find all minimal covers of a set of functional dependencies Σ.
Examples: Consider the following set of functional dependencies: Σ = {A → BC,B → C,B → A,C → AB}
Σ has two different minimal covers: Σ1 ={A→B,B→C,C→A}
Σ2 ={A→C,C→B,B→A}
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 74
5. Functional Dependencies
Minimal Cover – Examples
1 The set {A → B, B → C, A → C} can be reduced to {A → B, B → C}, because we can recover {A → C} by the transitive rule.
2 The set {AB → C, B → A} can be reduced to {B → C, B → A}, because {AB → C} can be derived from the original set.
The minimal cover of a set of functional dependencies Σ always exists but is not necessarily unique.
7
international edition]
See Algorithm 15.2 in [Elmasri and Navathe, 6th edition], or Algorithm 2 on Page 557 in [Elmasri and Navathe,
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 73
5. Functional Dependencies
Minimal Cover – Examples
1 The set {A → B, B → C, A → C} can be reduced to {A → B, B → C}, because we can recover {A → C} by the transitive rule.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 75
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 76
5. Functional Dependencies
5. Functional Dependencies
Minimal Cover – Examples
Weekly Review
1 The set {A → B, B → C, A → C} can be reduced to {A → B, B → C}, because we can recover {A → C} by the transitive rule.
2 The set {AB → C, B → A} can be reduced to {B → C, B → A}, because {AB → C} can be derived from the original set.
3 Given the set of FDs Σ = {B → A, D → A, AB → D}, we can compute the minimal cover of Σ as follows:
Step 2: check whether all the FDs in Σ have only one attribute on the right hand side (look good);
Step 3: determine if AB → D has any redundant attribute on the left hand side (AB → D can be replaced by B → D);
Step 4: look for a redundant FD in {B → A, D → A, B → D} (B → A is redundant);
Therefore, the minimal cover of Σ is {D → A, B → D}.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 77
5. Functional Dependencies
Exercises – Keys and Minimal Cover
Consider RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} andthe following set Σ of FDs:
{CustID} → {CustName} {PropertyNo, StartDate} → {CustID} {PropertyNo, CustID} → {StartDate} {CustID, StartDate} → {PropertyNo} {Owner} → {PropertyNo}
Questions:
1 What are the keys of RENTAL?
2 What is a minimal cover of Σ?
Some questions
What are the common issues that may occur in a poorly designed database?
What is a functional dependency?
Why do we need functional dependencies for database design?
How can we identify functional dependencies in real-life applications? What are the Armstrong inference rules?
What is the difference between implied FDs and inferred FDs?
How can we find all keys using functional dependencies?
What is a minimal cover?
Some homework
See the next slide (optional).
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 78
5. Functional Dependencies
Exercises – Keys and Minimal Cover
Consider RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} andits FDs in the abbreviated form as
R={C, N, P, D, O}, and
Σ={C →N,PD→C,CP→D,CD→P,O→P}
What are the keys of RENTAL?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 79
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 80
5. Functional Dependencies
Exercises – Keys and Minimal Cover
Consider RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} andits FDs in the abbreviated form as
R={C, N, P, D, O}, and
Σ={C →N,PD→C,CP→D,CD→P,O→P}
What are the keys of RENTAL?
Solution: Check (X)+ for every subset of {C, N, P, D, O}.
(C)+ =CN,(P)+ =P,(N)+ =N,(D)+ =D,(O)+ =OP
(CP)+ = (CD)+ = CPND, (OD)+ = CPNDO (ignore all supersets of
OD and why?), . . .
(CO)+ = CPNDO (ignore all supersets of CO), . . .
Thus, {CustID, Owner} and {Owner, DateStart} are the keys.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 81
5. Functional Dependencies
Exercises – Keys and Minimal Cover
Consider RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} andits FDs in the abbreviated form as
R={C, N, P, D, O}, and
Σ={C →N,PD→C,CP→D,CD→P,O→P}
What is a minimal cover of Σ?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 82
5. Functional Dependencies
Exercises – Keys and Minimal Cover
Consider RENTAL={CustID,CustName,PropertyNo,DateStart,Owner} andits FDs in the abbreviated form as
R={C, N, P, D, O}, and
Σ={C →N,PD→C,CP→D,CD→P,O→P}
What is a minimal cover of Σ? Solution:
Step 2: check that all the FDs in Σ have only one attribute on the right hand side.
Step 3: determine if there are any redundant attributes on the left hand side (no FDs in Σ need to be changed);
Step 4: check redundant FDs (no FDs in Σ are redundant). Thus, Σ is a minimal cover itself.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 83