代写代考 FIT2094 – FIT3171 Databases

INFORMATION TECHNOLOGY
FIT2094 – FIT3171 Databases
Week 3 – The Relational Database Model
Workshop S1 2022

Copyright By PowCoder代写 加微信 powcoder

Once we have a conceptual model, it is time to move to the second stage and map this to a logical model
For our unit this will involve mapping to the Relational Model in preparation for implementation in a RDBMS. First before we consider this mapping it is necessary to have a clear understanding of the Relation Model and it use:
▪ Relational Model
▪ RelationalAlgebra

Early Database Models
▪ Hierarchical (1970’s eg. IBM Information Management System (IMS))
– 1:M relationships, a tree of linked records, child has only one parent ▪ Network (1970’s eg. Integrated Data Store IDS, basis for the CODASYL
– child may have multiple parents
▪ Both Navigational – move around in data via embedded links (pointers) Network:

The Relational Model
▪ Introduced by CODD in 1970 – the fundamental basis for the relational DBMS
▪ Basic structure is the mathematical concept of a RELATION mapped to the
‘concept’ of a table (tabular representation of relation)
– Relation – abstract object
– Table – pictorial representation
– Storage structure – “real thing” – eg. isam file of 1’s and 0’s
▪ Relational Model Terminology
– DOMAIN – set of atomic (indivisible) values
– Examples (name, data type, data format):
• customer_number domain – 5 character string of the form xxxdd
• name domain – 20 character string
• address domain – 30 character string containing street, town & postcode • credit_limit domain – money in the range $1,000 to $99,999

A Relation
▪ A relation consists of two parts – heading
▪ Relation Heading
– Also called Relational Schema consists of a fixed set of attributes
• R (A1,A2,…….An)
– R = relation name, Ai = attribute i
– Each attribute corresponds to one underlying domain: • Customer relation heading:
– CUSTOMER (custno, custname, custadd, custcredlimit)
» dom(custno) = customer_number » dom(custname) = name
» dom(custadd) = address
» dom(custcredlimit) = credit_limit
custcredlimit

Relation Body
▪ Relation Body
– Also called Relation Instance (state of the relation at any point in time)
• r(R) = {t1, t2, t3, …., tm}
• consists of a time-varying set of n-tuples
– Relation R consists of tuples t1, t2, t3 .. tm
– m = number of tuples = relation cardinality • each n-tuple is an ordered list of n values
• t = < v1, v2, ....., vn>
– n = number of values in tuple (no of attributes) = relation degree – In the tabular representation:
• Relation heading ⇨ column headings
• Relation body ⇨ set of data rows
custcredlimit
Wide Rd, Clayton, 3168
Narrow St, Clayton, 3168
SMI13 JON44 BRO23
Here Rd, Clayton, 3168

Relation Properties
▪ No duplicate tuples
– by definition sets do not contain duplicate elements
• hence tuples must be unique
▪ Tuples are unordered within a relation – by definition sets are not ordered
• hence tuples can only be accessed by content
▪ No ordering of attributes within a tuple – by definition sets are not ordered

Relation Properties cont’d
▪ Tuple values are atomic – cannot be divided
• EMPLOYEE (eid, ename, departno, dependants)
– not allowed: dependants (depname, depage) multivalued
– hence no multivalued (repeating) attributes allowed, called the first normal form rule
▪ COMPARE with tabular representation
• normally nothing to prevent duplicate rows • rows are ordered
• columns are ordered
– tables and relations are not the same ‘thing’

Functional Dependency
▪ Functional Dependency:
– A set of attributes A functionally determines an attribute B if, and only if, for
each A value, there is exactly one value of B in the relation. It is denoted as A → B (A determines B, or B depends on A)
• orderno → orderdate
• prodno → proddesc
• orderno, prodno → qtyordered

Relational Model Keys
▪ A superkey of a relation R is an attribute or set of attributes which exhibits only the uniqueness property
– No two tuples of R have the same value for the superkey (Uniqueness property)
– t1[superkey] ≠ t2[superkey] Many possible superkeys
▪ A candidate key CK of a relation R is an attribute or set of attributes which
exhibits the following properties:
– Uniqueness property (as above), and
– No proper subset of CK has the uniqueness property
Potentially many possible candidate keys
(Minimality or Irreducibility property) ie. a minimal superkey
▪ One candidate key is chosen to be the primary key (PK) of a relation. Remaining
candidate keys are termed alternate keys (AK).
Only ONE primary key (may be composed of many attributes – a composite primary key)

Free text answer in Flux (after Group discussion): Q1. List all the super keys for:
– treat this as only a small sample of the data
– the attributes are fixed.

Selection of a Primary key
▪ A primary key must be chosen considering the data that may be added to the table in the future
– Names, dates of birth etc are rarely unique and as such are not a good option
– PK should be free of ‘extra’ semantic meaning and security compliant, preferably a single attribute, preferably numeric (see Table 5.3 Coronel & Morris)
– Natural vs Surrogate primary key
• ENROLMENT (unitcode, student_id, enrol_sem, enrol_year,
enrol_mark, enrol_grade) – Superkey
– Issues with PK?

Null in the Relational Model Implementation
▪ NULL is a concept created and implemented by SQL, does not exist in classical relational algebra
▪ NULL is NOT a value – is a representation of the fact that there is NO VALUE ▪ Reasons for a NULL:
– Columns Number_of_payments, Total_payments
– Column Average_payment_made
– If Number_of_payments = 0 => Average undefined
VALUE NOT APPLICABLE –
• EMP relation – empno, deptno, salary, commission
– commission only applies to staff in sales dept VALUE UNKNOWN –
• Joe’s salary is NULL, Joe’s salary is currently unknown VALUE DOES NOT EXIST –
• Tax File Number – is applicable to all employees but Joe may not have a number at this time
VALUE UNDEFINED –
• Certain items explicitly undefined eg. divide by zero

Writing Relations
▪ Relations may be represented using the following notation: – RELATION_NAME (attribute1, attribute2,…)
▪ Relation_name must not be pluralised (is a set name) ▪ The primary key is underlined
▪ Example:
– STAFF (staff_id, staff_surname, staff_initials, staff_address, staff_phone)

Q2. A well designed relational database ( a database based on the relational model) has:
A. No redundant data
B. Minimal redundant data
C. A large amount of redundant data
D. A level of redundancy based on the vendors
implementation

Relational Database
▪ A relational database is a collection of normalised relations. ▪ Normalisation is part of the design phase of the database
and will be discussed in a later lecture. Example relational database:
ORDER (order_id, order_date)
ORDER_LINE (order_id, prod_id, ol_quantity) PRODUCT (prod_id, prod_desc, prod_unitprice)

Foreign Key (FK) – Implementation
▪ FK: An attribute/s in a relation that exists in the same, or another relation as a Primary Key.
▪ Referential Integrity
– A Foreign Key value must either match the full primary key in
a relation or be NULL.
▪ The pairing of PK and FK creates relationships (logical connections) between tables when implemented in a RDBMS. Hence the abstraction away from the underlying storage model.

Q4. Business rules:
Runners may form a team, the runner who registers the team is recorded as the team leader. Each team can have up to 5 members (runners).
Identify the FK(s):
TEAM(team_id, team_name, team_leader) RUNNER(runner_id, runner_name, team_id)
A. team_leader in TEAM
B. team_id in TEAM
C. runner_id in RUNNER
D. team_id in RUNNER

Data Integrity – Implementation
▪ Entity integrity
– Primary key value must not be NULL.
• No duplicate tuple property then ensures that each primary key must be unique
• Implemented in the RDBMS via a unique index on the PK ▪ Referential integrity
– The values of FK must either match a value of a full PK in the related relation or be NULL.
▪ Column/Domain integrity
– All values in a given column must come from the same domain (the same data type and range).

Q5. The following set of relations:
HOSPITAL (hosp_id, hosp_name, hosp_phone) DOCTOR (hosp_id, dr_id, dr_name, dr_mobile) PATIENT (pat_id, pat_name, pat_dob, dr_id)
A. have no integrity issues
B. violate entity integrity
C. violate referential integrity
D. violate column/domain integrity
Multiple responses allowed

Relational DMLs
▪ Relational Calculus
▪ Relational Algebra
▪ Transform Oriented Languages (e.g. SQL)
▪ Graphical Languages
▪ Exhibit the “closure” property – queries on relations produce relations

Relational Calculus
▪ Based on mathematical logic.
▪ Non-procedural.
▪ Primarily of theoretical importance.
▪ May be used as a yardstick for measuring the power of other relational languages (“relational completeness”).
▪ Operators may be applied to any number of relations.

RELATIONAL ALGEBRA
Manipulation of relational data

Relational Algebra
▪ Relationally complete.
▪ Procedural.
▪ Operators only apply to at most two relations at a time. ▪ 8 basic operations:
single relation: selection, projection
two relations:
• cartesian product, join
• intersection
• difference
• division
▪ Standard RA/pure form has no concept of NULL (Databases units use standard RA)

Relational Operation PROJECT
PRDETAIL (project_code, project_manager, project_bid_price)
Show all project manager:
How many tuples? How many attributes?

Relational Operation SELECT
PRDETAIL (project_code, project_manager, project_bid_price)
Show details of project 25-5A:
How many tuples? How many attributes?

Relational Operation Multiple Actions
PRDETAIL (project_code, project_manager, project_bid_price) 2
Show the project manager of project 25-5A
How many tuples? How many attributes?

SQL vs Relational Algebra in the Database

Q5. Relational Algebra select and project
The following relations represent a karate dojo member training attendance:
SENSEI (sensei_id, sensei_name)
TRAINING_SCHEDULE (training_day, training_time, group_id, sensei_id) ATTENDANCE (training_day, training_time, member_id, attendance_date) MEMBER (member_id, member_name, member_dob, member_belt, group_id) GROUP (group_id, group_name, group_age_range)
A. Primary keys are underlined
B. A karate member falls into one of the age level groups: Tiny Tiger (for 4-7 year old), Young Dragon
(for 8-14 years old), or Adult (for 14+ years old) and owns a certain color of belt (e.g. white, green,
brown or black)
C. Sensei (Karate teachers) are scheduled to train an age level group of karate members in a particular
day and time (e.g. Nakamura trains Tiny Tiger members every Tuesday 5pm)
D. A karate member may attend more than one training schedule of their age level group in a given
Write the relational algebra for the following query (your answer must show an understanding of query
efficiency):
(1) Show the name and dob of all black belt members.

▪ Join operator used to combine data from two or more relations, based on a common attribute or attributes.
▪ Different types: – theta-join
– equi-join
– natural join

THETA JOIN (Generalised join)
(Relation_1) ⨝F (Relation_2)
– F is a predicate (i.e. truth-valued function) which is of the
form Relation_1.ai θ Relation2.bi
• CUSTOMER.cust_no θ ORDER.cust_no
– θ is one of the standard arithmetic comparison operators, <, ≤, =, ≥, >
– Most commonly, θ is equals (=), but can be any of the operators
• EMPLOYEE.emp_sal > SALARYSCALE.step_5

NATURAL JOIN
STUDENT MARK
studid studname studid unitcode mark
1Alice 1100495 2Bob 2104555 1 1045 90
Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =)
Result at Step 2 is an Equijoin (STUDENT ⨝(STUDENT.studid = MARK.studid) MARK)
STUDENT.studid studname
1 Alice 1 Alice
2 Bob 2 ARK.studid unitcode mark
1 1004 95
2 1045 55
1 1045 90
1 1004 95
2 1045 55
1 1045 90

Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =) Step 3: delete duplicate columns (project away)
Result at Step 3 is a Natural Join
NATURAL JOIN
STUDENT MARK
studid studname studid unitcode mark
1Alice 1100495 2Bob 2104555
STUDENT.studid studname
MARK.studid unitcode
2 Bob 2 1045 55
1 1004 1 1045

NATURAL JOIN
STUDENT MARK
studid studname studid unitcode mark
1 Alice ⨝1 1004 95 2Bob 2104555
Step 1: STUDENT X MARK
Step 2: delete rows where IDs do not match (select =) Step 3: delete duplicate columns (project away)
studid studname unitcode
1 Alice 1004
1 Alice 1045
2 Bob 1045
A natural join of STUDENT and MARK

Q6. Which of the following statements returns a natural join of the two relations on the agent code (agent_cd and agent_code)?
A. σ agent_cd = agent_code (CUSTOMER X AGENT)
B. π cus_code, cus_lname, cus_zip, agent_code, agent_phone (σ agent_cd = agent_code (CUSTOMER X
C. σ agent_cd = agent_code (π cus_code, cus_lname, cus_zip, agent_code, agent_phone (CUSTOMER X AGENT))
D. All of the above
E. None of the above

UNION, INTERSECT, DIFFERENCE
UNION (STOREA ∪ STOREB)
product_id
product_name
LG Nano91 75″ 4K
TCL P725 65″ 4K UHD
Sony X85J 75″ Bravia
product_id
product_name
LG Nano91 75″ 4K
TCL P725 65″ 4K UHD
Sony X85J 75″ Bravia
LG C1 48″ Self Lit OLED 4K
INTERSECT (STOREA ⋂ STOREB)
DIFFERENCE (STOREA – STOREB)
product_id
product_name
LG Nano91 75″ 4K
TCL P725 65″ 4K UHD
product_id
product_name
LG Nano91 75″ 4K
TCL P725 65″ 4K UHD
LG C1 48″ Self Lit OLED 4K
product_id
product_name
Sony X85J 75″ Bravia
Union compatible relations required

Q7. Relational Algebra
The following relations represent a karate dojo member training attendance:
SENSEI (sensei_id, sensei_name)
TRAINING_SCHEDULE (training_day, training_time, group_id, sensei_id) ATTENDANCE (training_day, training_time, member_id, attendance_date) MEMBER (member_id, member_name, member_dob, member_belt, group_id) GROUP (group_id, group_name, group_age_range)
A. Primary keys are underlined
B. A karate member falls into one of the age level groups: Tiny Tiger (for 4-7 year old), Young Dragon (for
8-14 years old), or Adult (for 14+ years old) and owns a certain color of belt (e.g. white, green, brown or
C. Sensei (Karate teachers) are scheduled to train an age level group of karate members in a particular day
and time (e.g. Nakamura trains Tiny Tiger members every Tuesday 5pm)
D. A karate member may attend more than one training schedule of their age level group in a given week.
Write the relational algebra for the following query (your answer must show an understanding of query efficiency):
(2) Show the name, belt colour and attendance dates of the member with an id of 12345

(2) Show the name, belt colour and attendance dates of the member with an id of 12345
R2 = 𝝿 member_name, member_belt, attendance_date (𝝈 member_id = 12345 ( MEMBER ⨝ ATTENDANCE))
– this is the CANONICAL QUERY – not technically incorrect, but very inefficient, say member 12345 has only attended once in say 1000 tuples in ATTENDANCE. The join between MEMBER and ATTENDANCE yields, in such a scenario, 1000 tuples, 999 of which are unnecessary.
Your solution must demonstrate an understanding of efficiency:
A2a = 𝝿 member_id, attendance_date (𝝈 member_id = 12345 ATTENDANCE)
A2b = 𝝿 member_id, member_name, member_belt (𝝈 member_id = 12345 MEMBER)
R2 = 𝝿 member_name, member_belt, attendance_date ( A2a ⨝ A2b)

Q8. Relational Algebra POST WORKSHOP TASK – answer available Sunday 5PM
The following relations represent a karate dojo member training attendance:
SENSEI (sensei_id, sensei_name)
TRAINING_SCHEDULE (training_day, training_time, group_id, sensei_id) ATTENDANCE (training_day, training_time, member_id, attendance_date) MEMBER (member_id, member_name, member_dob, member_belt, group_id) GROUP (group_id, group_name, group_age_range)
A. Primary keys are underlined
B. A karate member falls into one of the age level groups: Tiny Tiger (for 4-7 year old), Young Dragon (for 8-14
years old), or Adult (for 14+ years old) and owns a certain color of belt (e.g. white, green, brown or black)
C. Sensei (Karate teachers) are scheduled to train an age level group of karate members in a particular day
and time (e.g. Nakamura trains Tiny Tiger members every Tuesday 5pm)
D. A karate member may attend more than one training schedule of their age level group in a given week.
Write the relational algebra for the following query (your answer must show an understanding of query efficiency):
(3) Show the id, name and age level group name of members who were absent (did not attend any training) between 01-03-2021 and 31-03-2021 (inclusive).

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com