Relations, Keys, & Normalization
Relations, Keys, & Normalization
Lecture 11
Relational Model Review
What is a relation?
Rows contain data about an entity
Columns contain data about attributes of the entity
Cells of the table hold a single value
All entries in a column are of the same kind
Each column has a unique name
The order of the rows and columns is unimportant
No two rows may be identical
Keys
A key is one or more columns of a relation that is used to identify a row
A key can be unique or non-unique
In Employee relation Emp_No vs Department
Composite Key
Primary Key
Candidate Key
Surrogate Key
Foreign Key
Keys
An attribute that is a key of one or more relations other than the one in which it appears
Is a foreign key
Foreign keys are one or more fields in a dependent file that reference the primary key in a parent file
Surrogate Keys
A column with a unique, DBMS assigned identifier that has been added to a table to be the primary key
The unique values of the surrogate key are assigned by the DBMS each time a row is added and the values never change
PROPERTY(Street,City,Prov,Pcode,OwnerID)
PROPERTY(PropertyID,Street,City,Prov,Pcode,OwnerID)
Surrogate keys are short, numeric and never change
Ideal as a primary key
Example (Primary Key)
Example (Hockey Awards)
Award Player
Name Pnumber Position Team Year
Best Defense Joe Wall
Sy Stopp
Pete Puck
Joe Wall 17
7
22
17 Left Defense
Right Defense
Left Defense
Left Wing Toronto
Detroit
Montreal
Toronto 1999
2000
2001
2002
Most Valuable Sam Scores
Wayne Gret
Joe Wall 18
99
17 Center
Center
Left Wing Chicago
New York
Toronto 1999
2000
2002
Example (Primary Key)
Step 1: fill all attributes
Step 2: look for any column that has no value that is used in more than one row.
We are looking for a column for which its value is UNIQUE.
We first check to see if we can have a single attribute as the primary key.
Award Player
Name Pnumber Position Team Year
Best Defense
Best Defense
Best Defense
Best Defense Joe Wall
Sy Stopp
Pete Puck
Joe Wall 17
7
22
17 Left Defense
Right Defense
Left Defense
Left Wing Toronto
Detroit
Montreal
Toronto 1999
2000
2001
2002
Most Valuable
Most Valuable
Most Valuable Sam Scores
Wayne Gret
Joe Wall 18
99
17 Center
Center
Left Wing Chicago
New York
Toronto 1999
2000
2002
Example (Primary Key)
Step 3: Look for any pairs of columns which when concatenated produce a unique value.
Team + Year…………
Position + Team……..
Position + Year………
PNumber + Position….
PNumber + Team…….
PNumber + Year……..
PlayerName + PNumber …
PlayerName + Position……
PlayerName + Team………
PlayerName + Year……….
Award + PlayerName……..
Award + PNumber…………
Award + Position………….
Award + Team…………….
Award + Year……………..
Example (Primary Key)
Concatenated primary key:
So Award + Year will be selected as the PK.
and the relation will be:
(Award, Year, PlayerName, PNumber, Position, Team)
Normalization Review
Normal Forms
1NF
2NF
3NF
Why is this done?
Example (Normalization/Userview1)
CLASSLIST
CLASSLIST in UNF
CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName, {StudentNumber, StudentName} ]
A relation is in 1st normal form when the primary key determines a single value of each attribute for all attributes in the relation (i.e. the relation contains no repeating groups)
Two ways to get to 1NF – how did we do it last week?
SubjectCode Section InstNo InstName SubjectName StudentNo StudentName
DBS201 A 122
Russ Pangborn
Intro to DB
111111111
222222222 Terry Adams
Jack Chan
DBS201 B 323 Bill Gates Intro to DB 121212121
323233232 Frank Brown
Mary Wong
RPG544 A 122 Russ Pangborn RPGIV 444444444
143211222 Wendy Clark
Peter Lind
Example (Normalization/Userview1)
Add to key of unnormalized relation to ensure primary key identifies 1 and only 1 value of each attribute in the relation
CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName, StudentNumber, StudentName ]
CLASSLIST [ SubjectCode, SectionCode, StudentNumber, InstructorNo, InstructorName, SubjectName, StudentName ]
SubjectCode Section InstNo InstName SubjectName StudentNo StudentName
DBS201 A 122 Russ Pangborn Intro to DB 111111111 Terry Adams
DBS201 A 122 Russ Pangborn Intro to DB 222222222 Jack Chan
DBS201 B 323 Bill Gates Intro to DB 121212121 Frank Brown
DBS201 B 323 Bill Gates Intro to DB 323233232 Mary Wong
RPG544 A 122 Russ Pangborn RPGIV 444444444 Wendy Clark
RPG544 A 122 Russ Pangborn RPGIV 143211222 Peter Lind
CLASSLIST
1NF (Method 2)
Restate the original un-normalized relation without the repeating group
CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName ]
Create a new relation consisting of key of original relation and attributes within repeating group and add to key to ensure uniqueness
CLASSLISTSTUDENT[ SubjectCodeFK1, SectionCodeFK1,StudentNumber, StudentName]
2nd Normal Form
A 1NF relation is in 2NF when the entire primary key is needed to determine the value of each non-key attribute
(i.e. relation has the partial dependencies – attributes whose values can be determined by knowing only one part of the key)
1NF to 2NF
1NF:
CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName, SubjectName ]
Relation CLASSLIST contains the partial dependency
SubjectCode -> SubjectName
CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber, StudentName ]
contains the partial dependency
StudentNumber-> StudentName
2NF
Create new relation(s) consisting of part of the primary key and all attributes whose values are determined by this part of the primary key:
SUBJECT [SubjectCode, SubjectName ]
STUDENT [StudentNumber, StudentName ]
Restate original relation(s) without partially dependent attributes:
CLASSLISTSTUDENT [ SubjectCodeFK1, SectionCodeFK1, StudentNumberFK2 ]
CLASSLIST [ SubjectCode, SectionCode, InstructorNo, InstructorName ]
3NF
A 2NF relation is in 3NF when
the primary key and nothing but the primary key can be used to determine the value of each non-key attribute
(i.e. relation has transitive dependencies – attributes whose values can be determined by knowing something other than the key)
2NF to 3NF
2NF Relations:
CLASSLISTSTUDENT [ SubjectCodeFK1, SectionCodeFK1, StudentNumberFK2 ]
CLASSLIST [ SubjectCodeFK, SectionCode, InstructorNo, InstructorName ]
SUBJECT [SubjectCode, SubjectName ]
STUDENT [StudentNumber, StudentName ]
In CLASSLIST the Instructor Name is determined by InstructorNo so create the new relation:
INSTRUCTOR [InstructorNo, InstructorName ]
Remove the transitive dependency
CLASSLIST [ SubjectCodeFK, SectionCode, InstructorNoFK ]
3NF Relations
Set of 3NF Relations for the Class List Userview:
CLASSLIST [ SubjectCode, SectionCode, InstructorNo ]
CLASSLISTSTUDENT [ SubjectCode, SectionCode, StudentNumber ]
SUBJECT [SubjectCode, SubjectName ]
STUDENT [StudentNumber, StudentName ]
INSTRUCTOR [InstructorNo, InstructorName ]
1NF/2NF/3NF
An unnormalized userview will always result in one or more relations in 1NF.
Each 1NF relation will result in one or more 2NF relations.
Each 2NF relation will result in one or more 3NF relations.
You can never lose (i.e. not include) an attribute – it must always be found in one of the relations at each step .
You can never lose a relation.
Normalize all User views
Normalization process is applied to each remaining user view (eg grade sheet, timetable request, …)
A set of 3NF relations is produced for each user view.
Then 3NF relations from each user view are then integrated to form one complete set of relations for the application.
Example (Normalization/Userview2)
Comparing this user view with the CLASSLIST userview:
Similar attributes – Subject Code, Section, …
Different attributes – Grade, Semester, Year
Missing attribute – Instructor Number
SubjectCode Section Subject Name InstName Grade Sem Year StudentNo StudentName
DBS201 A Database Design Russ Pangborn A S 2016 111111111 Terry Adams
RPG544 A Advanced RPG Justin Trudeau B+ S 2016 111111111 Terry Adams
SYS333 B Systems Analysis Donald Trump C+ F 2016 111111111 Terry Adams
DBT544 B DB2 Steve Harper A F 2016 111111111 Terry Adams
MAP525 A Mobile Devices Russ Pangborn A W 2017 111111111 Terry Adams
MCL544 A Assembler Cito Gaston A W 2017 111111111 Terry Adams
GradeSheet
Example (Normalization/Userview2)
Normalizing GradeSheet userview to 3NF will produce similar relations like
STUDENT
INSTRUCTOR
and different relations like
GRADES
After the user views are in 3NF a merge would be done
DBDL (Database Designation Language)
Consider the following scenario:
Each dentist’s office has a unique identifier for insurance companies. There is a mailing address for the office as well as the name of the head dentist. There are many patients and each patient has a unique identifier number.
DBDL (Example 1)
Write the DBDL for the given description:
List Attributes
OfficeNo
MailAddress
HeadDentist
PatientNo
PatientName
Each dentist’s office has a unique identifier for insurance companies. There is a mailing address for the office as well as the name of the head dentist. There are many patients and each patient has a unique identifier number.
DBDL (Example 1)
Select Primary Key (unique identifier for each row)
OfficeNo, MailAddress, HeadDentist,PatientNo, PatientName
Show mulit-valued dependencies
OfficeNo,MailAddress,HeadDentist,(PatientNo, PatientName)
Give the relation a name
DentistsOffice [OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName) ]
We call this 0NF or UNF (Unnormalized Form) because there is a multi-valued dependency.
UNF to 1NF (Example 1)
UNF:
DentistsOffice [OfficeNo, MailAddress, HeadDentist, (PatientNo, PatientName) ]
Select the Primary Key for the multi-valued dependency.
Create a two-part primary key by concatenating the original PK with the PK of the multi-valued dependency:
1NF:
DentistsOffice [OfficeNo, PatientNo MailAddress, HeadDentist, PatientName ]
1NF to 2NF (Example 1)
1NF:
DentistsOffice [OfficeNo, PatientNo MailAddress, HeadDentist, PatientName ]
Look for partial dependencies
MailAddress is dependent on OfficeNo
OfficeNo -> MailAddress
PatientName is dependent on PatientNo
PatientNo -> PatientName
2NF: Create new relations
OfficePatient[OfficeNo,PatientNo)
DentistsOffice[ OfficeNo, MailAddress,HeadDentist]
Patient[PatientNo,PatientName]
Already in 3NF
UNF to 1NF (Example 2)
UNF
[Purchase#, date, (item#, quantity, unit_ price, tax code)]
1NF
Item# is the Primary Key of the multi-valued dependency.
[ Purchase#, item#, date, quantity, unit_ price, tax code]
Purchases at Shoppers Drug Mart-1111 Young Street Toronto are identified by a unique purchase # and a date on the bill. There can be several items and the purchase must record the item #, the quantity, the unit price, a tax code for each item, and the total price.
1NF to 2NF (Example 2)
2NF
PurchaseItem[ Purchase#, item#, quantity]
Purchase[ Purchase#, date ]
Item[ item#, unit_ price, tax code]
Purchases at Shoppers Drug Mart-1111 Young Street Toronto are identified by a unique purchase # and a date on the bill. There can be several items and the purchase must record the item #, the quantity, the unit price, a tax code for each item, and the total price.
/docProps/thumbnail.jpeg