程序代写代做代考 database AI SQL Week 2 – Relational Data Model

Week 2 – Relational Data Model
FIT2094 Database

MONASH
INFORMATION
TECHNOLOGY

2

Overview
▪ Relational Model
▪ Relational Algebra

3

The Relational Model
▪ Introduced by CODD in 1970 – the fundamental basis for relational DBMS’s
▪ Basic structure is the mathematical concept of a RELATION mapped to the

‘concept’ of a table (tabular representation of relation)
– Relation – abstract object
– Table – pictorial representation
– Storage structure – “real thing” – eg. isam file

▪ Relational Model Terminology
– DOMAIN – set of atomic (indivisible) values

• specify
– name
– data type
– data format

▪ Examples:
– customer_number domain – 5 character string of the form xxxdd
– name domain – 20 character string
– address domain – 30 character string containing street, town & postcode
– credit_limit domain – money in the range $1,000 to $99,999

4

A Relation
▪ A relation consists of two parts

– heading
– body

▪ Relation Heading
– Also called Relational Schema consists of a fixed set of attributes

• R (A1,A2,…….An)
– R = relation name, Ai = attribute i

– Each attribute corresponds to one underlying domain:
• Customer relation heading:

– CUSTOMER (custno, custname, custadd, credlimit)
» dom(custno) = customer_number
» dom(custname) = name
» dom(custadd) = address
» dom(credlimit) = credit_limit

credlimitcustaddcustnamecustno

5

Relation Body
▪ Relation Body

– Also called Relation Instance (state)
• r(R) = {t1, t2, t3, …., tm}
• consists of a time-varying set of n-tuples

– Relation R consists of tuples t1, t2, t3 .. tm
– m = number of tuples = relation cardinality

• each n-tuple is an ordered list of n values
• t = < v1, v2, ....., vn>

– n = number of values in tuple (no of attributes) = relation degree
– In the tabular representation:

• Relation heading ⇨ column headings
• Relation body ⇨ set of data rows

credlimitcustaddcustnamecustno

SMITHSMI13 Wide Rd, Clayton, 3168 2000

JON44 JONES Narrow St, Clayton, 3168 10000

BRO23 BROWN Here Rd, Clayton, 3168 10000

6

Relation Properties
▪ No duplicate tuples

– by definition sets do not contain duplicate elements
• hence tuples are unique

▪ Tuples are unordered within a relation
– by definition sets are not ordered

• hence tuples can only be accessed by content
▪ No ordering of attributes within a tuple

– by definition sets are not ordered

7

Relation Properties cont’d
▪ Tuple values are atomic – cannot be divided

• EMPLOYEE(eid, ename, departno, dependants)
– not allowed: dependants (depname, depage)

multivalued
– hence no multivalued (repeating) attributes allowed,

called the first normal form rule
▪ COMPARE with tabular representation

• normally nothing to prevent duplicate rows
• rows are ordered
• columns are ordered

– tables and relations are not the same ‘thing’

9

surname firstname degree DOB

Black Sam BBIS 02-02-1996

Brown Jane BITS 01-01-1995

Chen Chan BITS 09-02-1996

Grey Maria BCS 15-12-1995

Indigo Jose BITS 28-10-1995

Black Jet BCS 13-05-1996

Chen Maria BITS 31-08-1995

▪ Functional Dependency:
– A set of attributes X functionally determines an attribute Y if, and only if,

for each X value, there is exactly one Y value in the relation. It is denoted
as X → Y.

▪ For example, given the data above:
– firstname, surname → degree

• but
– firstname → degree does not hold
– What about: degree → firstname, surname?

11

Relational Keys
▪ A candidate key K of a relation R is an attribute or set of

attributes which exhibits the following properties:
– No two tuples of R have the same value for K

(Uniqueness property)
– No proper subset of K has the uniqueness property

(Minimality or Irreducibility property)
▪ One candidate key is chosen to be the primary key of a

relation. Remaining candidate keys are termed alternate
keys.

▪ A superkey is an attribute or set of attributes which only
exhibits the uniqueness property

16

stu_no surname firstname degree DOB

1111 Black Sam BBIS 02-02-1996

1112 Brown Jane BITS 01-01-1995

1113 Chen Chan BITS 09-02-1996

1114 Grey Maria BCS 15-12-1995

1115 Indigo Jose BITS 28-10-1995

1116 Black Jet BCS 13-05-1996

1117 Chen Maria BBIS 31-08-1995

▪ A primary key must be chosen considering the data that may be added to the table in the
future

– Names, dates of birth etc are rarely unique and as such are not a good option
– PK should be free of ‘extra’ semantic meaning, preferably single attribute, preferably

numeric (see Table 5.3 Coronel & Morris)
– Natural vs Surrogate

Selection of a Primary key

17

18

Writing Relations
▪ Relations may be represented using the following

notation:
– relation_name(attribute1, attribute2,…)

▪ The primary key is underlined.

▪ Example:
– staff(staffid, surname, initials, address, phone)

19

Relational Database
▪ A relational database is a collection of normalised

relations.
▪ Normalisation is part of the design phase of the database

and will be discussed in a later lecture.

Example relational database:

order (order_id, orderdate,)
order-line (order_id, product_id, quantity)
product (product_id, description, unit_price)

20

Foreign Key (FK)
▪ An attribute/s in a table that exists in the same, or

another table as a Primary Key.
▪ Referential Integrity

– A Foreign Key value must either match the primary
key in another table or be NULL.

▪ The pairing of PK and FK creates relationships (logical
connections) between tables. Hence the abstraction
away from the underlying storage model.

23

Data Integrity
▪ Entity integrity

– Primary key values must be unique
– Primary key value must not be NULL.

▪ Referential integrity
– The values of FK must either match a value of the

PK in the related relation or be NULL.
▪ Column/Domain integrity

– All values in a given column must come from the
same domain (the same data type and range).

26

Relational DMLs
▪ Relational Calculus
▪ Relational Algebra
▪ Transform Oriented Languages (e.g. SQL)
▪ Graphical Languages
▪ Exhibit the “closure” property – queries on relations

produce relations

27

Relational Calculus
▪ Based on mathematical logic.
▪ Non-procedural.
▪ Primarily of theoretical importance.
▪ May be used as a yardstick for measuring the power of

other relational languages (“relational completeness”).
▪ Operators may be applied to any number of relations.

28

RELATIONAL ALGEBRA

Manipulation of relational data

29

Relational Algebra
▪ Relationally complete.
▪ Procedural.
▪ Operators only apply to at most two relations at a time.
▪ 8 basic operations:

– single relation: selection, projection
– cartesian product, join
– union
– intersection
– difference
– division

30

Relational Operation PROJECT

31

Relational Operation SELECT

32

Relational Operation Multiple Actions

1

2

33

SQL vs Relational Algebra in the Database

35

JOIN
▪ Join operator used to combine data from two or more

relations, based on a common attribute or attributes.
▪ Different types:

– theta-join
– equi-join
– natural join
– outer join

36

THETA JOIN (Generalised join)

(Relation_1) ⨝F (Relation_2)

– F is a predicate (i.e. truth-valued function) which is
of the form Relation_1.ai θ Relation2.bi

– θ is one of the standard arithmetic comparison
operators, i.e. <, ≤, =, ≥, >

– Most commonly, θ is equals (=)

39

NATURAL JOIN
ID Name

1 Alice

2 Bob

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

STUDENT MARK

STUDENT.
ID

Name MARK.ID Subj Marks

1 Alice 1 1004 95

1 Alice 2 1045 55

1 Alice 1 1045 90

2 Bob 1 1004 95

2 Bob 2 1045 55

2 Bob 1 1045 90

Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =)

40

NATURAL JOIN

STUDENT.I
D

Name MARK.ID Subj Marks

1 Alice 1 1004 95

1 Alice 1 1045 90

2 Bob 2 1045 55

Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =)
Step 3: delete duplicate columns (project away)

ID Name

1 Alice

2 Bob

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

STUDENT MARK

41

NATURAL JOIN

ID Name Subj Marks

1 Alice 1004 95

1 Alice 1045 90

2 Bob 1045 55

A natural join of STUDENT and MARK

Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =)
Step 3: delete duplicate columns (project away)

ID Name

1 Alice

2 Bob

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

STUDENT MARK

44

OUTER JOIN

ID Name Subj Marks

1 Alice 1004 95

1 Alice 1045 90

2 Bob 1045 55

A natural join of STUDENT and MARK

No information for Chris and the student with ID 4

ID Name

1 Alice

2 Bob

3 Chris

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

4 1004 100

STUDENT MARK

45

FULL OUTER JOIN

ID Name Subj Marks

1 Alice 1004 95

1 Alice 1045 90

2 Bob 1045 55

3 Chris Null Null

4 Null 1004 100

A full outer join of STUDENT and MARK

Get (incomplete) information of both Chris and student with ID 4

ID Name

1 Alice

2 Bob

3 Chris

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

4 1004 100

STUDENT MARK

46

LEFT OUTER JOIN

ID Name Subj Marks

1 Alice 1004 95

1 Alice 1045 90

2 Bob 1045 55

3 Chris Null Null

A left outer join of STUDENT and MARK

Get (incomplete) information of only Chris

ID Name

1 Alice

2 Bob

3 Chris

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

4 1004 100

STUDENT MARK

47

RIGHT OUTER JOIN

ID Name Subj Marks

1 Alice 1004 95

1 Alice 1045 90

2 Bob 1045 55

4 Null 1004 100

A right outer join of STUDENT and MARK

Get (incomplete) information of the student with ID 4

ID Name

1 Alice

2 Bob

3 Chris

ID Subj Marks

1 1004 95

2 1045 55

1 1045 90

4 1004 100

STUDENT MARK