程序代写代做代考 database Functional Dependencies ER flex finance Slide 1

Slide 1

INFO20003 Database Systems

Lecture 16

Normalization

Semester 2 2018, Week 8

Dr Renata Borovica-Gajic

INFO20003 Database Systems © University of Melbourne 2018 2

Administration

• Academic integrity:
https://app.lms.unimelb.edu.au/bbcswebdav/institution/Engineering/MSE_academic_integrity/

Academic%20Integrity%20for%20MSE%20LMS.html

• I would love to get your feedback (only 2 min):

– https://www.surveymonkey.com/r/NCPZPWK

“Cheating – copying from or providing to another student or students an answer or answers to any assessment task

or essay:

• First year undergraduate student – failure of subject – Mark 0, Grade N;

• Any other student, second or subsequent offence – suspension or termination of enrolment – Mark 0, Grade N;
• Second or subsequent offence – termination of enrolment – Mark 0, Grade N. “

https://app.lms.unimelb.edu.au/bbcswebdav/institution/Engineering/MSE_academic_integrity/Academic Integrity for MSE LMS.html

WHY ARE DATABASES COOL ?
Featuring people from industry

Part 1

WHY ARE DATABASES COOL ?
Featuring people from industry

SNOWFLAKE COMPUTING, USA

INFO20003 Database Systems © University of Melbourne 2018 5

Learning Objectives

• By the end of this lecture, you should be able to:

– Define normalization

– Explain and identify database anomalies

– Define and identify functional dependencies

– Normalize relations to:
• 1st Normal Form (1NF)

• 2nd Normal Form (2NF)

• 3rd Normal Form (3NF)

INFO20003 Database Systems © University of Melbourne 2018 6

Motivation for normalization

• What happens if we don’t normalize?

INFO20003 Database Systems © University of Melbourne 2018 7

What’s wrong with the organization of

data in this table?
Student
ID#

Student
Name

Campus
Address

Degree Phone Subject
ID

Subject
Title

Lecturer
Name

A121 Joy Egbert 166 Grattan Street B.Com. 555-7771 ACC101 Accounting Davern

A121 Joy Egbert 166 Grattan Street B.Com. 555-7771 ECO101 Economics Smyth

A121 Joy Egbert 166 Grattan Street B.Com. 555-7771 ECO104 Quant. M. Collier

A121 Joy Egbert 166 Grattan Street B.Com. 555-7771 FIN101 Finance. James

A121 Joy Egbert 166 Grattan Street B.Com. 555-7771 ACC103 Processes Wise

A123 Larry Mueller 302 Royal Parade B.Com. 555-1235 ACC101 Accounting Davern

A123 Larry Mueller 302 Royal Parade B.Com. 555-1235 ECO101 Economics Smyth

A123 Larry Mueller 302 Royal Parade B.Com. 555-1235 ECO104 Quant. M. Collier

A123 Larry Mueller 302 Royal Parade B.Com. 555-1235 FIN101 Finance. James

A124 Mike Guon 224 Swanston St. B.Eco. 555-2214 ACC101 Accounting Davern

A124 Mike Guon 224 Swanston St. B.Eco. 555-2214 ECO101 Economics Smyth

Lecturer
Office

T240C

T240F

T240D

T240D

T240E

T240C

T240F

T240D

T240D

T240C

T240F

Lecturer
Phone

8344-1846

8344-1868

8344-5716

8344-5275

8344-5309

8344-1846

8344-1868

8344-5716

8344-5275

8344-1846

8344-1868

Sem.

1-11

1-11

1-11

1-11

1-11

1-11

1-11

1-11

1-11

1-11

1-11

Grade

H1

H2B

H2B

H2A

H3

H1

H2B

H2A

H3

H2A

H2A

A124 Mike Guon 224 Swanston St. B.Eco. 555-2214 ECO104 Quant. M. Collier

A124 Mike Guon 224 Swanston St. B.Eco. 555-2214 ACC103 Processes Wise

A126 Jackie Judson 85 Barry Street B.Eco. 555-1245 ACC101 Accounting Davern

A126 Jackie Judson 85 Barry Street B.Eco. 555-1245 ECO101 Economics Smyth

A126 Jackie Judson 85 Barry Street B.Eco. 555-1245 ECO104 Quant. M. Collier

A126 Jackie Judson 85 Barry Street B.Eco. 555-1245 ACC103 Processes Wise

… … … … … … … …

T240D

T240E

T240C

T240F

T240D

T240E

8344-5716

8344-5309

8344-1846

8344-1868

8344-5716

8344-5309

1-11

1-11

1-11

1-11

1-11

1-11

H2B

H2B

H1

H2B

H2B

H2A

INFO20003 Database Systems © University of Melbourne 2018 9

Anomalies in Denormalized Data:

• Consider the following
denormalized table (relation) :

• Insertion Anomaly: A new course cannot be added until at least one

student has enrolled (which comes first student or course?)

• Deletion Anomaly: If student 425 withdraws, we lose all record of course

C400 and its fee!

• Update Anomaly: If the fee for course C200 changes, we have to change

it in multiple records (rows), else the data will be inconsistent.

Student-ID Course-ID Fee

130 C200 75

200 C300 100

250 C200 75

425 C400 150

500 C300 100

575 C500 50

… … …

INFO20003 Database Systems © University of Melbourne 2018 10

Normalisation

A relation is normalized if

all determinants are

candidate keys

• A technique used to remove undesired redundancy from

databases (Break one large table into several smaller

tables).

How do we normalise?

INFO20003 Database Systems © University of Melbourne 2018 11

Invoice example

INFO20003 Database Systems © University of Melbourne 2018 12

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Product

ID

Product

Name

Unit

Price

Quantity Amount Sub

Total

INV0012 14-Aug-

09

John /

Synex

128

Juanita

Ave…

Charles

Wooten

COD PSV880.

006

AMD

Athlon

X2DC

580 6 3480 9463

PSV880.

037

PDC

E5300

645 4 2580

LC.V890.

002

LG 8.5”

LCD

230 10 2300

HPQ754.

071

HP

LaserJet

5200

1103 1 1103

INV0013 15-Aug-

09

Mary /

ThisCo

123 Smith

Street…

Charles

Wooten

COD HP

Q754.071

HP

LaserJet

5200

1103 2 2206 3356

LCV890.

002

LG 8.5”

LCD

230 5 1150

This is not relational model

INFO20003 Database Systems © University of Melbourne 2018 13

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Product Name Unit Price Quantity Amount

PSV880.006 AMD Athlon

X2DC

580 6 3480

PSV880.037 PDC E5300 645 4 2580

LC.V890.002 LG 8.5” LCD 230 10 2300

HPQ754.071 HP LaserJet

5200

1103 1 1103

HPQ754.071 HP LaserJet

5200

1103 2 2206

LCV890.002 LG 8.5” LCD 230 5 1150

Break into two

But…

How do we connect?

INFO20003 Database Systems © University of Melbourne 2018 14

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Product

Name

Unit Price Quantity Amount Invoice

Number

PSV880.006 AMD Athlon

X2DC

580 6 3480 INV0012

PSV880.037 PDC E5300 645 4 2580 INV0012

LC.V890.00

2

LG 8.5” LCD 230 10 2300 INV0012

HPQ754.07

1

HP LaserJet

5200

1103 1 1103 INV0012

HPQ754.07

1

HP LaserJet

5200

1103 2 2206 INV0013

LCV890.002 LG 8.5” LCD 230 5 1150 INV0013

Add FK

INFO20003 Database Systems © University of Melbourne 2018 15

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Product

Name

Unit Price Quantity Amount Invoice

Number

PSV880.006 AMD Athlon

X2DC

580 6 3480 INV0012

PSV880.037 PDC E5300 645 4 2580 INV0012

LC.V890.00

2

LG 8.5” LCD 230 10 2300 INV0012

HPQ754.07

1

HP LaserJet

5200

1103 1 1103 INV0012

HPQ754.07

1

HP LaserJet

5200

1103 2 2206 INV0013

LCV890.002 LG 8.5” LCD 230 5 1150 INV0013

This is about

product

This is about

Order (invoice)

INFO20003 Database Systems © University of Melbourne 2018 16

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Quantity Amount Invoice

Number

PSV880.006 6 3480 INV0012

PSV880.037 4 2580 INV0012

LC.V890.00

2

10 2300 INV0012

HPQ754.07

1

1 1103 INV0012

HPQ754.07

1

2 2206 INV0013

LCV890.00 5 1150 INV0013

Break into two

Product ID Product

Name

Unit Price

PSV880.006 AMD Athlon

X2DC

580

PSV880.037 PDC E5300 645

LC.V890.00

2

LG 8.5” LCD 230

HPQ754.07

1

HP LaserJet

5200

1103

INFO20003 Database Systems © University of Melbourne 2018 17

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Quantity Amount Invoice

Number

PSV880.006 6 3480 INV0012

PSV880.037 4 2580 INV0012

LC.V890.00

2

10 2300 INV0012

HPQ754.07

1

1 1103 INV0012

HPQ754.07

1

2 2206 INV0013

LCV890.00 5 1150 INV0013

Product ID Product

Name

Unit Price

PSV880.006 AMD Athlon

X2DC

580

PSV880.037 PDC E5300 645

LC.V890.00

2

LG 8.5” LCD 230

HPQ754.07

1

HP LaserJet

5200

1103

What about amount?

INFO20003 Database Systems © University of Melbourne 2018 18

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 John / Synex 128 Juanita

Ave…

Charles

Wooten

COD 9463 0 780.70 0

INV0013 15-Aug-09 Mary / ThisCo 123 Smith

Street…

Charles

Wooten

COD 3356 0 100 0

Product ID Quantity Invoice

Number

PSV880.006 6 INV0012

PSV880.037 4 INV0012

LC.V890.00

2

10 INV0012

HPQ754.07

1

1 INV0012

HPQ754.07

1

2 INV0013

LCV890.00 5 INV0013

Product ID Product

Name

Unit Price

PSV880.006 AMD Athlon

X2DC

580

PSV880.037 PDC E5300 645

LC.V890.00

2

LG 8.5” LCD 230

HPQ754.07

1

HP LaserJet

5200

1103

Could be derived

What about sales person?

INFO20003 Database Systems © University of Melbourne 2018 19

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

Name

Customer

Address

Sales

Person ID

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-

09

John /

Synex

128 Juanita

Ave…

1 COD 9463 0 780.70 0

INV0013 15-Aug-

09

Mary /

ThisCo

123 Smith

Street…

1 COD 3356 0 100 0

Product ID Quantity Invoice

Number

PSV880.006 6 INV0012

PSV880.037 4 INV0012

LC.V890.00

2

10 INV0012

HPQ754.07

1

1 INV0012

HPQ754.07

1

2 INV0013

LCV890.00 5 INV0013

Product ID Product

Name

Unit Price

PSV880.006 AMD Athlon

X2DC

580

PSV880.037 PDC E5300 645

LC.V890.00

2

LG 8.5” LCD 230

HPQ754.07

1

HP LaserJet

5200

1103

Sales

Person ID

Sales

Person

1 Charles

Wooten

What about customer?

INFO20003 Database Systems © University of Melbourne 2018 20

Invoice example – Spreadsheet Format

Invoice

Number

Date Customer

ID

Sales Person

ID

Terms Sub Total Discount Sales Tax Shipping

INV0012 14-Aug-09 1 1 COD 9463 0 780.70 0

INV0013 15-Aug-09 2 1 COD 3356 0 100 0

Product ID Quantity Invoice

Number

PSV880.006 6 INV0012

PSV880.037 4 INV0012

LC.V890.00

2

10 INV0012

HPQ754.07

1

1 INV0012

HPQ754.07

1

2 INV0013

LCV890.00 5 INV0013

Product ID Product

Name

Unit Price

PSV880.006 AMD Athlon

X2DC

580

PSV880.037 PDC E5300 645

LC.V890.00

2

LG 8.5” LCD 230

HPQ754.07

1

HP LaserJet

5200

1103

Sales

Person ID

Sales

Person

1 Charles

Wooten

Customer

ID

Customer

Name

Customer

Address

1 John /

Synex

128 Juanita

Ave…

2 Mary /

ThisCo

123 Smith

Street…

INFO20003 Database Systems © University of Melbourne 2018 21

Normalized Relations and ER Diagram

– –

• We can name the relations now
– Customer (CustomerNumber, CustomerName, CustomerAddress)

– Clerk (ClerkNumber, ClerkName)

– Product (ProductNumber, ProductDescription)

– Invoice (InvoiceNumber, Date, CustomerNumber, ClerkNumber)

– InvoiceLineItem (InvoiceNumber, ProductNumber, UniPrice, Quantity)

INFO20003 Database Systems © University of Melbourne 2018 22

• Now let’s go back to theoretical concepts…

INFO20003 Database Systems © University of Melbourne 2018 23

Functional Dependency

• A functional dependency concerns values of

attributes in a relation

• A set of attributes X determines another set of

attributes Y if each value of X is associated with only

one value of Y
– Written X → Y

• X determines Y (If I know X then I also know Y)

• Emp# Emp-name

• Emp# Salary

INFO20003 Database Systems © University of Melbourne 2018 24

Functional Dependency: Definitions

– –

• Determinants (X,Y Z)
– the attribute(s) on the left hand side of the arrow

• Key and Non-Key attributes
– each attribute is either part of the primary key or it is not

• Partial functional dependency (Y Z)
– a functional dependency of one or more non-key attributes

upon part (but not all) of the primary key

• Transitive dependency (Z D)
– a functional dependency between 2 (or more) non-key

attributes

A(X, Y, Z, D)

INFO20003 Database Systems © University of Melbourne 2018 25

Armstrong’s Axioms

Functional dependencies can be identified using

Armstrong’s Axioms

1. Reflexivity:

2. Augmentation:

3. Transitivity:

𝐴 = 𝑋1, 𝑋2, … , 𝑋𝑛 𝑎𝑛𝑑 𝐵 = (𝑌1, 𝑌2, … , 𝑌𝑛)

𝐴 → 𝐵 ⟹ AC → 𝐵𝐶

𝐴 → 𝐵 and B → 𝐶 ⟹ 𝐴 → 𝐶

𝐵 ⊆ 𝐴 ⟹ 𝐴 → 𝐵

Example: Student_ID, name -> name

Example: Student_ID -> name => Student_ID, surname ->name, surname

Example: ID -> birthdate and birthdate -> age then ID ->age

INFO20003 Database Systems © University of Melbourne 2018 26

Steps in Normalisation

First Normal
Form

Second
Normal Form

Third Normal
Form

Keep atomic data/Remove repeating groups

Remove partial dependencies

Remove transitive dependencies

INFO20003 Database Systems © University of Melbourne 2018 27

First Normal Form

Remove Repeating Groups

• repeating groups of attributes cannot be represented in a
flat, two dimensional table

• removing cells with multiple values (keep atomic data)

• Order-Item (Order#, Customer#, (Item#, Desc, Qty))

• Order-Item (Order#, Item#, Desc, Qty)

• Order (Order#, Customer#)

Example: Order-Item (Order#, Customer#, (Item#, Desc, Qty))

Set of values

Break them into two

Use PK/FK to connect

INFO20003 Database Systems © University of Melbourne 2018 28

Second Normal Form

Remove Partial Dependencies

a non-key attribute cannot be identified

by part of a composite key

• Order-Item (Order#, Item#, Desc, Qty)

Example: Order-Item (Order#, Item#, Desc, Qty)

Partial dependency

Order-Item (Order#, Item#, Qty)

Item (Item#, Desc)

INFO20003 Database Systems © University of Melbourne 2018 29

Partial Dependency Anomalies

Order# Item# Desc Qty

27

28

28

873

402

873

nut

bolt

nut

2

1

10

Order-Item (Order#, Item#, Desc, Qty)

30 495 washer 50

• UPDATE change item desc in many places

• DELETE data for last item lost when last order for
that item is deleted

• INSERT cannot add new item until it is ordered

INFO20003 Database Systems © University of Melbourne 2018 30

Solution to these Anomalies

Order# Item# Qty

27

28

28

873

402

873

2

1

10

Order-Item

30 495 50

Item# Desc

873

402

nut

bolt

495 washer

delete last

order for item,

but item

remains

add new item at any time

change item

description in one

place

Item

INFO20003 Database Systems © University of Melbourne 2018 31

Third Normal Form

Remove Transitive Dependencies

a non-key attribute cannot be identified

by another non-key attribute

• Employee (Emp#, Ename, Dept#, Dname)

• Employee (Emp#, Ename, Dept#)

• Department( Dept#, Dname)

Example: Employee (Emp#, Ename, Dept#, Dname)

Transitive dependency

INFO20003 Database Systems © University of Melbourne 2018 32

Transitive Dependency Anomalies

Emp# Ename Dept# Dname

10

20

25

Smith

Jones

Smith

D5

D7

D7

MIS

Finance

Finance

Example: Employee (Emp#, Ename, Dept#, Dname)

30 Black D8 Sales

• UPDATE change dept name in many places

• DELETE data for dept lost when last employee
for that dept is deleted

• INSERT cannot add new dept until an
employee is allocated to it

INFO20003 Database Systems © University of Melbourne 2018 33

Solution to these Anomalies

delete last emp in

dept, but dept

remainsEmp# Ename Dept#

10

20

25

Smith

Jones

Smith

D5

D7

D7

Employee

30 Black D8

Dept# Dname

D5

D7

MIS

Finance

D8 Sales

add new dept at any time

change dept

name in one

place

INFO20003 Database Systems © University of Melbourne 2018 37

Summary of Normalisation

First Normal
Form

Second
Normal Form

Third Normal
Form

Remove repeating groups

Remove partial dependencies

Remove transitive dependencies

INFO20003 Database Systems © University of Melbourne 2018 38

Normalisation vs Denormalization

Normalisation:

– Normalised relations contains a minimum amount of

redundancy and allow users to insert, modify, and delete

rows in tables without errors or inconsistencies

(anomalies)

Denormalization:
– The pay-off: query speed.

– The price: extra work on updates to keep redundant data

consistent.

• Denormalization may be used to improve performance of

time-critical operations.

INFO20003 Database Systems © University of Melbourne 2018 39

What’s Examinable?

– –

• Normalisation Process

• Anomalies

• Functional dependencies

• Denormalisation

INFO20003 Database Systems © University of Melbourne 2018 40

Next Lecture

– –

• Database Admin