程序代写代做代考 database ER algorithm Functional Dependencies SQL Microsoft PowerPoint – 11- DatabaseDesign_SchemaRefinement

Microsoft PowerPoint – 11- DatabaseDesign_SchemaRefinement

© 2018 A. Alawini & A. Parameswaran

Database Design:
Functional

Dependencies
Abdu Alawini

University of Illinois at Urbana-Champaign

CS411: Database Systems

October 8, 2018

1

© 2018 A. Alawini & A. Parameswaran

Announcements

•HW 2 is due Oct 15 (23:59)
•Project Track 1 – Stage 2
And
•Project Track 2 – Stage 1
are due on Wednesday (10/10)
•Midterm Oct, 29th in class (11-12:15)
•Final  Dec, 12 in class (11-12:15)

2

© 2018 A. Alawini & A. Parameswaran

Construct an E-R diagram for a car-insurance company
whose customers own one or more cars each. Each car
has associated with it zero to any number of recorded
accidents.

ERD Exercise

3

© 2018 A. Alawini & A. Parameswaran

ERD Exercise Student Solution

4

© 2018 A. Alawini & A. Parameswaran

Various notations for “one-to-many”

5

0..1 1..*

zero..one one..many

maximum cardinalities only minimum and maximum
cardinalities

1 m

manyone

1 *

0-1 1+

0

© 2018 A. Alawini & A. Parameswaran

Various notations for “many-to-many”

6

1..* 1..*

one..many one..many

maximum cardinalities only minimum and maximum
cardinalities

manymany

1+ 1+

* *

m n

© 2018 A. Alawini & A. Parameswaran

•Conceptual design: (ER & UML Models are used
for this.)
• What are the entities and relationships we need?

•Logical design:
• Transform ER design to Relational Schema

•Schema Refinement: (Normalization)
• Check relational schema for redundancies and

related anomalies.

•Physical Database Design and Tuning:
• Consider typical workloads; (sometimes) modify the

database design; select file types and indexes.

Overview of Database Design

We’re here

7

© 2018 A. Alawini & A. Parameswaran

Agenda

•How do we obtain a good design?
•Functional Dependencies and Keys
•Rules about Functional Dependencies

• Splitting/Combination
• Trivial Dependencies
• Attribute Closure
• FD Closure

8

© 2018 A. Alawini & A. Parameswaran

Motivation

• We have designed ER diagram, and translated it into a relational db
schema R = set of R1, R2, … Now what?

• We can do the following
• implement R in SQL
• start using it

• However, R may not be well-designed, thus causing us a lot of problems
• OR: people may start without an ER diagram, and you need to reformat

the schema R

• Either way you may need to improve the schema

9

© 2018 A. Alawini & A. Parameswaran

Q: Is this a good design?

10 Green 123-321-99 (201) 555-1234

10 Green 123-321-99 (206) 572-4312

431 Purple 909-438-44 (908) 464-0028

Individuals with several phones:

Address SSN Phone Number

10

© 2018 A. Alawini & A. Parameswaran

Potential Problems

• Redundancy
• Update anomalies

• maybe we’ll update the address of the person with phone number ‘(206)
572-4312’ to something other than ‘10 Green’. Then there will be two
addresses for that person.

• Deletion anomalies
• delete the phone number of a person; if not careful then the address can

also disappear with it.

Address SSN Phone Number

10 Green 123-321-99 (201) 555-1234

10 Green 123-321-99 (206) 572-4312

431 Purple 909-438-44 (908) 464-0028

11

© 2018 A. Alawini & A. Parameswaran

Better Designs Exist

SSN Address

123-321-99 10 Green
909-438-44 431 Purple

SSN Phone Number

123-321-99 (201) 555-1234
123-321-99 (206) 572-4312

909-438-44 (908) 464-0028

Break the relation into two:

Unfortunately, this is not something you will detect even
if you did principled ER design and translation

12

© 2018 A. Alawini & A. Parameswaran

How do We Obtain a Good Design?

•Start with the original db schema R
• From ER translation or otherwise

•Transform it until we get a good design R*
• Some desirable properties for R*

• must preserve the information of R
• must have minimal amount of redundancy
• must be “dependency preserving”

• (we’ll come to this later)
• must also give good query performance

13

© 2018 A. Alawini & A. Parameswaran

How do We Obtain a Good Design?

14

© 2018 A. Alawini & A. Parameswaran

Normal Forms

•DB gurus have developed many “normal forms”
•These are basically schemas obeying certain rules

• Converting a schema that doesn’t obey rules to one that does is
called “normalization”

• This typically involves some kind of decomposition into smaller
tables, just like we saw earlier.

• (the opposite: grouping tables together, is called
“denormalization”)

15

© 2018 A. Alawini & A. Parameswaran

Normal Forms

•DB gurus have developed many “normal forms”
•Most important ones

• Boyce-Codd, 3rd, and 4th normal forms

•If R* is in one of these forms, then R* is guaranteed to
achieve certain good properties

• e.g., if R* is in Boyce-Codd NF, it is guaranteed to not have certain
types of redundancies

•DB gurus have also developed algorithms to transform R
into R* in these normal forms

16

© 2018 A. Alawini & A. Parameswaran

Normal Forms (cont.)

•There are also trade-offs among normal forms

•Thus, our goal is to:
• learn these forms
• transform R into R* in one of these forms
• carefully evaluate the trade-offs

•To understand these normal forms we’ll need to understand
certain types of constraints
• functional dependencies and keys

17

© 2018 A. Alawini & A. Parameswaran

Functional Dependencies
and Keys

18

© 2018 A. Alawini & A. Parameswaran

Functional Dependencies

• A form of constraint (hence, part of the schema)
• Finding them is part of the database design
• Used heavily in schema refinement
• Holds for ALL instances!
Definition:

If two tuples agree on the attributes

A , A , … A
1 2 n

then they must also agree on the attributes

B , B , … B 1 2 m

Formally:

A , A , … A
1 2 n

B , B , … B 1 2 m

Where have we seen
this before?

19

© 2018 A. Alawini & A. Parameswaran

Examples

•EmpID Name, Phone, Position
•Position Phone
•but Phone Position: why?

EmpID Name Phone Position
E0045 Smith 1234 Clerk
E1847 John 9876 Salesrep
E1111 Smith 9876 Salesrep
E9999 Mary 1234 Lawyer

20

© 2018 A. Alawini & A. Parameswaran

What a FD actually means

• Knowing FD: A  B holds in R(A, B, C) means that
• For ALL valid instances R(A, B, C):

• A determines B
• Or, if two tuples share A, then they share the same B

• This is the property of the “world”
• Conversely, if: A / B, then there is no guarantee that the “A

determines B” property holds in a given instance (though it
might).
• Trivially, it holds when you have only one tuple.

21

© 2018 A. Alawini & A. Parameswaran

More examples

Product: name, manufacturer price

Person: ssn name, age

Company: name stock price, president

22

© 2018 A. Alawini & A. Parameswaran

Q: From this, can you conclude phone  SSN?
SSN Phone Number

123-321-99 (201) 555-1234
123-321-99 (206) 572-4312

909-438-44 (908) 464-0028

909-438-44 (212) 555-4000

 No, you cannot. Like we discussed, this is a property of
the world, not of the current data

 In general, you cannot conclude from a given instance of
a table that an FD is true. An FD is not an observation
made from a table’s current tuples, it is an assertion that
must always be respected.

 You can however check if a given FD is violated by the
table instance.

23

© 2018 A. Alawini & A. Parameswaran

Keys are a type of FD

• Key of a relation R is a set of attributes that
• functionally determines all attributes of R
• none of its subsets determines all attributes of R

• There could be many keys of a relation
• Student (UIN, email, dept, age)
• UIN  UIN, email, dept, age
• email  UIN, email, dept, age

• Superkey
• “Superset” of key
• a set of attributes that contains a key
• Any examples for student?

24

© 2018 A. Alawini & A. Parameswaran

Many many FDs…
• MovieInfo (name, year, actor, director, studio)

• Same movie can be remade multiple years, but a name, year pair
uniquely determines a movie

• A movie has a single director/studio but many actors
• Name, year  director, studio
• Name, year  director; Name, year  studio
• Name, year / actor

• A director works only with a single studio
• Director  studio

• An actor works on a given movie only once (never for remakes), but
may work for many movies in a year

• Actor, name  year; actor, year / name
25

© 2018 A. Alawini & A. Parameswaran

Many many FDs…
•MovieInfo (name, year, actor, director, studio)

• Name, year  director, studio
• Name, year  director
• Name, year  studio
• Director  studio
• Actor, name  year
• …
• Actor, name, year  director, studio
• Director, actor, name  studio, year
• Director, name, year  studio
• Studio, actor, name  year
• …

Any missing
FDs?

26

© 2018 A. Alawini & A. Parameswaran

Rules about Functional
Dependencies

27

© 2018 A. Alawini & A. Parameswaran

Outline of Rules

• Two examples of rules
• Splitting/Combination
• Trivial Dependencies

• Attribute Closure
• Algorithm
• Uses

• FD Closure
• A complete set of rules:

• Armstrong’s axioms
• Algorithm

28

© 2018 A. Alawini & A. Parameswaran

The Splitting/Combining Rule

•A1A2…An  B1B2…Bm

•Equivalent to:
A1A2…An B1;

A1A2…An B2;

A1A2…An Bm

•Can replace one for the other.

29

© 2018 A. Alawini & A. Parameswaran

Trivial Functional Dependencies

•A1A2…An A1

•In general,
A1A2…An B1B2…Bm

if {B1B2…Bm} {A1A2…An}

Example: name, UIN  UIN

Why does this make sense?

30

© 2018 A. Alawini & A. Parameswaran

Closure of a Set of Attributes

Given a set of attributes {A1, …, An} and a set of FDs S.

Problem: find all attributes B such that:

for all relations that satisfy S, they also satisfy:

A1, …, An  B

The closure of {A1, …, An}, denoted {A1, …, An} ,

is the set of all such attributes B

We will discuss the motivations for attribute closures soon

+

31

© 2018 A. Alawini & A. Parameswaran

Algorithm to Compute Closure
Split the FDs in S so that every FD has a single attribute on the
right. (Simplify the FDs)

Start with X={A1A2…An}.

Repeat until X doesn’t change do:

If (B1B2…Bm  C) is in S,

such that B1,B2,…Bm are in X and C is not in X:

add C to X.

// X is now the correct value of {A1A2…An}
+

Why does this algorithm converge?

32

© 2018 A. Alawini & A. Parameswaran

Example

A B C

A D E

B D

A F B

Closure of {A,B}:

Closure of {A, F}:

• Set of attributes A,B,C,D,E,F.

• Functional Dependencies:

33

© 2018 A. Alawini & A. Parameswaran

Example

A B C

A D E

B D

A F B

Closure of {A,B}: X = {A, B, C, D, E}

Closure of {A, F}: X = {A, F, B, D, C, E}

• Set of attributes A,B,C,D,E,F.

• Functional Dependencies:

34

© 2018 A. Alawini & A. Parameswaran

1. Decompose all FDs in F
sid -> name, crn -> cid, crn -> subj

2. We start with with the attribute crn in the initial
closure = {crn}

3. We look for all FDs that has crn, the 2nd FD (crn -> cid) has crn
closure = cid U closure = {crn, cid}

4. We look for all FDs that has cid or crn, the 3rd FD (crn -> subj) has
crn

closure = subj U closure = {crn, cid, subj}

Attribute Closure Exercise

35

Given:
Student_info(sid, name, crn, subj, cid, grade), and
F={sid ->name, crn -> cid, subj}, find A+ for A = {crn}

© 2018 A. Alawini & A. Parameswaran

Is this algorithm correct?

•Yes. See Text (Section 3.2.5) for proof.

•Two parts of proof:
• Anything determined to be part of {S}+ deserves to be there:

• soundness
• There’s nothing missing:

• completeness

36

© 2018 A. Alawini & A. Parameswaran

Uses for Attribute Closure

•Use 1: To test if X is a superkey
• How?
• compute X+, and check if X+ contains all attrs of R

•Use 2: To check if X Y holds
• How?
• by checking if Y is contained in X+

37

© 2018 A. Alawini & A. Parameswaran

An exercise

•Show that each of the following are not valid rules about
FD’s, by giving example relations that satisfy the given
FDs (following the “If”), but not the FD that allegedly
follows (after the “then”).

(1) If A –> B then B –> A

(2) If AB –> C and A –> C then B –> C.

•(1) A = SSN, B = Name
•(2) A = SSN, B = Phone, C = Name

38