程序代写代做代考 C Functional Dependencies ER database 1

1
4/5/20
Practical Database Concepts
Lecture 9 and 10: Relational Database Design Santha Sumanasekara
May 2020
Where we are at?
2
1

3
4/5/20
Why Good Designs Matter?
Overview of the lecture
vRelational Database Design
Ø Why Good Designs Matter? Ø Top-down approach
Ø Bottom-up approach
vNormalisation
4
2

5
4/5/20
Example
Example
6
3

4/5/20
A good design
Employee (SSN, Fname, Minit, Lname, BDATE, Address, Sex, Position*, SuperSSN*, Dno*)
Department (Dnumber, Dname, MGRSSN*, MGRStartDate) DeptLocations (Dnumber*, Dlocation)
Project (Pnumber, Pname, Plocation, Dno*)
WorksOn (ESSN*, Pno*, Hours)
Dependent (ESSN*, DependentName, Sex, Bdate, Relationship)
Wages (Position, Salary)
7
A poor design
Employee (SSN, Fname, Minit, Lname, BDATE, Address, Sex, Position, Salary, SuperSSN*, Dno*)
Department (Dnumber, Dname, MGRSSN*, MGRStartDate, Location1, Location2, Location3)
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours)
Dependent (ESSN*, DependentName, Sex, Bdate, Relationship)
Dept Location
now is a repeating group
Project information embedded within WorksOn relation
8
4

4/5/20
Qualities of a Good Database Design
ØIn Database design, the main objective is to create accurate representation of data
Ø Reflects real-world structure of the problem Ø Can represent all expected data over time Ø Avoids redundant storage of data
Ø Provides efficient access to data
Ø Supports the maintenance of data integrity Ø Clean, consistent, and easy to understand!
9
How would you know whether if it is a Good Design?
Ø A good design would minimise (eliminate) “Update Anomalies”
ØInsertion Anomaly ØDeletion Anomaly ØModification Anomaly
What is an “Update Anomaly”?
10
5

11
4/5/20
How would you know whether if it is a Good Design?
If you discover the presence of these anomalies, you know it is a poor design!
Is this a really poor design?
Employee (SSN, Fname, Minit, Lname, BDATE, Address, Sex, Position, Salary, SuperSSN*, Dno*)
Department (Dnumber, Dname, MGRSSN*, MGRStartDate, Location1, Location2, Location3)
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours)
Dependent (ESSN*, DependentName, Sex, Bdate, Relationship)
12
6

13
4/5/20
Insertion Anomaly
ØConsider this poorly designed schema:
ØProject information (such as name location etc) are now stored redundantly along with each worksOn record.
ØTwo records can have different project info (for the same project
ØAnother problem: How do you enter details of a project with no employees assigned yet?
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours)
Insertion Anomaly
14
7

15
4/5/20
Deletion Anomaly
ØConsider this poorly designed schema:
ØRamesh and Franklin are working on ProductX project. Let’s say, they were removed from the project, but, we still are required to maintain information about the ProductX project.
Ø How?
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours)
Deletion Anomaly
16
8

4/5/20
Modification Anomaly
ØConsider this poorly designed schema:
ØLet’s suppose “ProjectX” project moved from Melbourne to Sydney. In our instance, there are 2 employees work on this project. So, two tuples are to be updated.
ØWhat will happen if we updated one of them and one of them accidentally omitted?
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours)
17
Modification Anomaly
18
9

4/5/20
Solution: Normalisation
19
Normalisation
Don’t get confused with “relationships” in ER model. They are between entities. These functional dependencies are among attributes
ØNormalisation is a database design technique, which begins by examining the relationships (called functional dependencies) between attributes.
ØNormalisation Process would rearrange attributes in relations (say, decompose one relation into two, merge two relations into one, etc) in order to achieve a better database design.
ØTwo main approaches for using normalisation Øas a validation technique;
Øas a decomposition tool
20
10

21
4/5/20
Normalisation – Top-down
ØNormalisation can be used as a validation technique ØEach of the relations built in the mapping process is
checked (validated) against a set of rules.
ØCan be done on any large or complex database design
22
11

23
4/5/20
Normalisation – Bottom-up
ØNormalisation can be used as a decomposition tool
ØThe use of this approach is to be determined by the size, extent, and complexity of the database – not suitable for large and complex database designs.
24
12

25
4/5/20
Normalisation is a stepwise refinement
ØOften executed as a series of steps. Each step corresponds to a specific normal form, which has known properties.
ØNormal forms we consider: first normal form (1NF), second normal form (2NF), and, third normal form (3NF).
ØOther normal forms: Boyce-Codd normal form (BCNF), 4NF, 5NF
ØEach normal form improves a relation schema
ØGoverned by functional dependencies among attributes.
26
13

27
4/5/20
Normalisation is a stepwise refinement
Functional Dependencies
ØNormalisation Process is entirely guided by Functional Dependencies
28
14

29
4/5/20
Functional Dependencies
ØFunctional dependencies describe relationships between attributes.
ØBased on semantics of the attributes in a relation.
ØThe semantics indicate how attributes relate to one another.
Functional Dependencies
ØA functional dependency is denoted as AàB
when attribute B is functionally dependent on attribute A.
Ø Each value of A is associated with exactly one value of B.
ØE.g. If SSN and Salary are two attributes in Employee relation, you would say, Salary is functionally dependent on SSN.
SSN à Salary
30
15

31
4/5/20
Functional Dependencies – An Example
If we know a SSN, we can determine that person’s Salary
Each person has only one salary, so if we know Alicia’s SSN, that will determine her salary.
Functional Dependencies – An Example
By knowing a Salary, we CANNOT definitively determine who receives that amount of Salary.
Opposite is not true. Salary 25K is associated with Alicia, Joyce and Ahmed, So, SSN is NOT functionally dependent on Salary.
32
16

33
4/5/20
Functional Dependencies – Another Example
ØFunctional Dependencies must hold for all time.
ØDepends on semantics, not data
ØBased on sample data, the following functional dependencies appear to hold
SSN à Fname
Fname à SSN
Ø However, based on semantics of data, the only valid
FD is:
SSN à FName
What if we have two Johns?
Functional Dependencies and Primary Key
ØThe primary key (or, candidate keys, for that matter) are also determined by functional dependencies.
ØIdentify all functional dependencies
ØSeparate all attributes that are determined by some attributes (i.e. attributes that are on right hand sides of FDs)
ØAny attributes left out in the previous step form the primary key.
34
17

4/5/20
Functional Dependencies and Primary Key
Example 1:
Ø Department (Dno, Dname, BranchNo, BranchMgr)
FD1: DnoàDname
FD2: BranchNoàBranchMgr
Ø Dname and BranchMgr attributed are determined by other
attibutes
Ø Dno and BranchNo attributes are not determined by others.
Ø So, is the primary key
Ø Do not rely on common sense, only rely on FDs!
Common sense would say Dno should be the primary key of th Department relation. Not true in this case!
35
Functional Dependencies and Primary Key
Example 2:
Product (prodcode, name, weight, price, manufacturer) If:
FD1: prod codeàname, weight, manufacturer, price What is the primary key?
If:
FD1: prod codeàname, manufacturer FD2: prodcode, weightàprice
What is the primary key?
36
18

37
4/5/20
Normal Forms
First Normal Form (1NF)
ØGeneral Definition: A relation is in 1NF when the intersection of each tuple and attribute (i.e. tuple component) contains one and only one value.
38
19

39
4/5/20
First Normal Form (1NF)
Consider the following example (from poor design)
More than one value for this attribute
First Normal Form (1NF)
One solution is to have multiple attributes to hold them.
This does not meet 1NF guidelines – No Repeating values or Repeating Groups
What will happen if a new branch is opened?
40
20

41
4/5/20
First Normal Form (1NF)
Another way to handle this situation is to add multiple rows (3 in this case) to store Research Department information, while maintaining 1NF requirements.
.
Decompose it!
However, this is not an acceptable solutions, as it adds redundant data, and violates our first principles on good designs.
First Normal Form (1NF)
The decomposition will result in two relations, with a new relation holding locations, each location on its own tuple.
Now both relations are in 1NF.
42
21

4/5/20
First Normal Form (1NF)
ØA more formal Definition: A relation is in 1NF when all non-primary key attributes are functional dependent on primary key attributes.
ØNote that when we have multi-valued attributes, those attributes are not functionally dependent on the primary key.
ØFD1: DeptNoàAddress
does not hold if there are multiple addresses, as such Department relation is not in 1NF.
43
1NF – Is it good enough?
Consider the following relation:
WorksOn (ESSN*, Pno*, Pname, Plocation, Dno*, Hours) It is in 1NF.
Still redundant data
44
22

45
4/5/20
1NF – Is it good enough?
There are a two points to make:
Ø The primary key is a composite primary key
Ø Some attributes (such as Pname, Plocation and Dno) are only determined by one component of the composite primary key.
So, even if the relation is in1 NF, it is not acceptable!
1NF – Is it good enough?
Ø Three attributes (Pname, Plocation, Dno) are only dependent on part of the Primary key (Pno).
Ø Decompose.
46
23

4/5/20
1NF – Is it good enough?
What we must do is to take away the three attributes (Pname, Plocation and Dno) and store them in a new relation along with the Pno attribute which determine the first three attributes.
PnoàPname, Plocation, Dno
47
1NF – Is it good enough?
There is no redundancy in this new relation. Each row represents a project and only one row per project.
Pno is now a foreign key referencing to the primary key (Pnumber) of the new relation
48
24

49
4/5/20
S1NecFo–ndIsNitogromoadl Feonromug(2hN? F)
What we did here is to bring our relations in to Second Normal Form (2NF).
Second Normal Form (2NF)
Ø Only matters when a relation has a composite primary key.
Ø (If a 1NF relation has a simple (one- attribute) primary key, it is always in 2NF, too.)
and
Ø Some attributes are only functional dependent on part of the primary key.
50
25

51
52
Second Normal Form (2NF)
Full Functional Dependency
Ø Only applicable when there are more than one attribute on the left-hand side of the FD.
Ø If ALL of the attributes are essential to hold the FD, it is said to be a full functional dependency.
ESSN, PNoàHours
So, this is a full functional dependency!
If we remove either
ESSN or PNo the FD doesn’t hold
4/5/20
26

53
4/5/20
Full Functional Dependency
Ø Only applicable when there are more than one attribute on the left-hand side of the FD.
Ø If ALL of the attributes are essential to hold the FD, it is said to be a full functional dependency.
ESSN, PNoàPLocation
While this FD is true, ESSN is redundent here. Even if you remove it, FD still holds. So, this is a Partial Functional Dependency.
Full Functional Dependency
Ø Only applicable when there are more than one attribute on the left-hand side of the FD.
Ø If ALL of the attributes are essential to hold the FD, it is said to be a full functional dependency.
ESSN, PNoàHours, PLocation
In fact, there are two FDs here.
Exercise: Is this a full or partial FD?
54
27

55
4/5/20
Full Functional Dependency
Ø Only applicable when there are more than one attribute on the left-hand side of the FD.
Ø If ALL of the attributes are essential to hold the FD, it is said to be a full functional dependency.
ESSN, PNoàHours ESSN, PNoàPlocation
This is a Full FD
This is a Partial FD
Second Normal Form (2NF)
Ø 2NF is only matters if the primary key is a composite primary key.
Ø If any 1NF relation with a simple primary key will definitely be 2NF too.
56
28

57
4/5/20
Second Normal Form (2NF)
Ø Consider following relation (from poor design)
WorksOn (ESSN, Pno, Pname, Plocation, Dno, Hours)
Based on the semantics of data, FD1: PnoàPname
FD2: PnoàPlocation
FD3: PnoàDno
FD4: ESSN, PnoàHours
The left-hand side of FD1, FD2, and FD3 does not have the whole Primary key.
That means, we have non-primary key attributes (Pname, Plocation, DNo) which are not fully functionally dependent on the primary key.
Second Normal Form (2NF)
Ø Three attributes (Pname, Plocation, Dno) are partially dependent on Primary key.
Ø Decompose.
58
29

59
4/5/20
Second Normal Form (2NF)
Second Normal Form (2NF)
Ø Exercise.
Ø Consider the following relation used to store
allocation of busses to routes. Assume the business rules discussed in assignment 1 (Task 2).
Ø Is this relation in 1NF?
Ø Is this relation in 2NF?
Ø If not decompose it to relations that are in 2NF.
RouteAllocation (Rego, RouteNo, StartTerminus,
EndTerminus, StartTime, EndTime,
Remarks)
60
30

61
4/5/20
2NF – Is it good enough?
Ø Consider the following 2NF relation.
Ø Employee (SSN, Fname, …, Position, Salary) Ø It is in 2NF.
Why?
Still redundant data!
2NF – Is it good enough?
There is an important point to make.
Ø Based on business rules, each position has a
designated salary.
Ø So, “Salary” attribute is functionally dependent
on another non-primary-key attribute,
“Position”.
Ø If non-primary-key attributes dependent on
other non-primary-key attributes that can lead into data redundancy.
62
31

63
4/5/20
2NF – Is it good enough?
FD1: SSNàPosition FD2: PositionàSalary
What we have to do in this case is to remove the attributes that are transitively dependent along with a copy of their determinant attribute and store in a new relation.
Salary is transitively dependent on the primary key
2NF – Is it good enough?
64
32

65
4/5/20
2TNhiFrd–NIsoirtmgaoloFdoermno(u3gNhF?)
What we did here is to decompose a 2NF relation in to two Third Normal Form (3NF) relations.
Third Normal Form (3NF)
Ø Only matters when a relation has FD s where left-hand side attribute(s) is/are not the primary key attribute(s)
Ø In simple terms, if there are any FDs which have non-primary-key attributes in the left- hand side, the relation is not in 3NF.
Ø In our previous example, “Position” is not a primary key attribute.
66
33

67
68
Third Normal Form (3NF)
Transitive Functional Dependency
Ø Transitive dependency describes a condition where A, B, and C are attributes of a relation such that if A → B and B → C, then C is transitively dependent on A via B.
SSN à Position
Position à Salary
From the above two we can derive: SSN à Salary.
This is a transitive dependency.
4/5/20
34

69
4/5/20
Third Normal Form (3NF)
Ø Consider following relation (from poor design)
Employee (SSN*, Fname, … Position, Salary, …) Based on the semantics of data,
SSN à Position
Position à Salary
This is a transitive functional dependency
That means, we have a non-primary key attribute (Salary) is being determined by another non-primary key attribute (Position).
Third Normal Form (3NF)
70
35

71
4/5/20
Third Normal Form (3N
F)
Salary is transitively dependent.
Third Normal Form (3NF)
72
36

73
4/5/20
Third Normal Form (3NF)
Ø Exercise.
Ø Consider the following relation used to store
allocation of busses to routes. Assume the business rules discussed in assignment 1 (Task 2).
Ø Assume all busses of a specific brand, model and year have the same service interval.
Ø Is this relation in 1NF?
Ø Is this relation in 2NF?
Ø Is this relation in 3NF?
Ø If not decompose it to relations that are in 3NF.
Bus (Rego, Brand, Model, YearMade, noSeats,
ServiceInterval, LastServiceDate)
74
37

4/5/20
Next Week
vNormalisation – doing it on real examples.
75
38