Database Systems Infrastructure
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
.
This week, we continue to look into ERD.
This week we will be looking at relational table, and next week, we will be looking at normalisation of database tables. So, that is the red area.
INFS5710 Week 1
Extended Entity Relationship Model (EERM)
Result of adding more semantic constructs to the original entity relationship (ER) model
EER diagram (EERD): Uses the EER model
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Extended ER diagram or EERD are looking at more details at the modelling design.
Primary Keys and Foreign Keys
Consider the following tables pertaining to your course enrolment:
Students
SID PK
Email
Title
Surname
First Name
DOB
Home Address
Courses
Course ID PK
Course Name
Class Enrolment
SID PK, FK
Course ID PK, FK
Class ID PK
Term PK
Delivery Mode
Instructor
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
INFS5710 Week 2
Different Input
What if the Students table also stores students’ high school results, undergraduate degree, and other degree or diploma, or even their professional memberships?
For high school results, in NSW, one has HSC results and ATAR score. How about in China, India, Indonesia, and UK? They all are different, so how do you keep the records?
So how about professional memberships? For example, an accountant has a CPA, whereas a plumber need a plumbing certificate (or may be none).
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
.
In NSW, you are going to have HSC results and ATAR score.
How about in China, India, Indonesia, and UK, they all are different, so how do you keep the records?
Entity Supertypes and Subtypes
Entity supertype: Generic entity type related to one or more entity subtypes
Contains common characteristics
Entity subtype: Contains unique characteristics of each entity subtype
Criteria to determine the usage
There must be different, identifiable kinds of the entity in the user’s environment
The different kinds of instances should each have one or more attributes that are unique to that kind of instance
EMPLOYEE
PILOT
MECHANIC
ACCOUNTANT
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As we said in the last slide, Employee entity is the Supertype because it contains common attributes or characteristics for pilot, mechanic, and accountant, which are the subtypes.
Specialization Hierarchy
Depicts arrangement of higher-level entity supertypes and lower-level entity subtypes
Relationships are described in terms of “is-a” relationships
Subtype exists within the context of a supertype
Every subtype has one supertype to which it is directly related
Supertype can have many subtypes
An employee “is a” pilot, mechanic, or accountant.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Entity supertypes and subtypes are organised into a specialization hierarchy with the high-level entity supertypes, which are the parent entities, and the lower-entity subtypes are the child entities.
The relationships between the entities are described in terms of is-a relationships. So, in our example, an employee is a pilot, mechanic, or accountant.
A subtype must exist within the context of a supertype and must directly associate to one supertype.
A supertype can have many subtypes.
Specialization Hierarchy
Provides the means to:
Support attribute inheritance
Define a special supertype attribute known as the subtype discriminator
Define disjoint/overlapping constraints and complete/partial constraints
Can an employee be a pilot AND a mechanic?
e.g., EMP_TYPE (pilot, mechanic or accountant)
Must an employee be a pilot, a mechanic, OR an accountant?
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
It supports attribute inheritance, which means all attributes are inherited by the entity subtypes.
So, how do you differentiate the different subtypes? The way to do is to have a special subtype attribute called subtype discriminator.
In our example, emp_type attribute is the subtype discriminator.
Figure 5.2 – Specialization Hierarchy
disjoint – e.g., a pilot cannot be a mechanic
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
If you look if you look here between employee entity and dependant entity, the relationship is “has”.
See emp_type attribute in employee entity is the subtype discriminator. So, depending on the value of emp_type attribute, it will branch out differently.
If you look at the category symbol, this is Disjoint, and emp_type is the subtype discriminator. What it means here, for example, is a pilot cannot be a mechanic!
Inheritance
Enables an entity subtype to inherit attributes and relationships of the supertype
All entity subtypes inherit their primary key attribute from their supertype
At the implementation level, supertype and its subtype(s) maintain a 1:1 relationship
Entity subtypes inherit all relationships in which supertype entity participates
Lower-level subtypes inherit all attributes and relationships from its upper-level supertypes
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide
Subtype Discriminator
Attribute in the supertype entity that determines to which entity subtype the supertype occurrence is related
Default comparison condition is the equality comparison
“is-a” is “=“
EMP_TYPE is an attribute of the supertype
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide
Disjoint and Overlapping Constraints
Disjoint subtypes: Contain a unique subset of the supertype entity set
Known as nonoverlapping subtypes
Implementation is based on the value of the subtype discriminator attribute in the supertype
Overlapping subtypes: Contain nonunique subsets of the supertype entity set
Implementation requires the use of one discriminator attribute for each subtype
P
M
A
P
M
A
(next slide)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Disjoint subtypes mean the subtypes are independent and not overlapped. For example, P, M and A are not overlapped.
On the other hand, overlapping subtypes are when subtypes entities are overlapped with each other.
See next slide as an example.
overlapping:
a student can also be an employee
Total completeness (Everyone must be an employee, a student or both)
Partial completeness
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
If we look at person entity, a student can be an employee, for example you might be working at a university as well as studying.
For Total Completeness, a person can either be an employee, or a student, or both.
That is why you have two attributes, P_IS_EMP and P_IS_STU, one for student and the other for employee. In this University, we used to have a SID for staff and a ZID for student, but with the same number. This is an example off overlapping. But now we only have ZID!
Partial completeness means that not every supertype occurrence is a member of a subtype; some subtype occurrences may not be members of any subtypes
On the other hand, total completeness means that every super subtype occurrence must be a member of at least one subtype. In this example, an employee might not be an administrator or a professor. You might be only a cleaner
Remember the question of an accountant of having a pilot license, depending on the business rules, you can create something like overlapping subtypes. The accountant might be a pilot but spend most of his time doing numbers rather than flying.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
For example of Disjoint versus Overlapping! For example, the books of Harry Potters!
Table 5.1 – Discriminator Attributes with Overlapping Subtypes
How about N / N?
Partial completeness
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
How about N / N?
This is always partial completeness.
Completeness Constraint
Specifies whether each supertype occurrence must also be a member of at least one subtype
Types
Partial completeness: Not every supertype occurrence is a member of a subtype
Total completeness: Every supertype occurrence must be a member of any
STUDENT
EMPLOYEE
PERSON / EMPLOYEE
ADMINISTRATOR
PROFESSOR
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
Table 5.2 – Specialization Hierarchy Constraint Scenarios
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
overlapping:
a student can also be an employee
Every person must be either an employee, a student, or both
It is possible that someone is not certain to be an administrator and/or a professor.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
Remember you as an employee might be a cleaner, and not an administrator and/or a professor.
Primary Keys
Single attribute or a combination of attributes, which uniquely identifies each entity instance
Guarantees entity integrity
Works with foreign keys to implement relationships
Composite key
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
Remember, we said last week, your Zid is a primary key.
Natural Keys or Natural Identifier
Real-world identifier used to uniquely identify real-world objects
Familiar to end users and forms part of their day-to-day business vocabulary
Also known as natural identifier
Used as the primary key of the entity being modeled
common-sense ones
Invoice number to identify invoices
Credit card numbers to identify credit cards
table
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
Desirable Primary Key Characteristics
Embedded semantic meaning
(name, marital status may change)
Remember a PK can be someone’s foreign key,
multiple attributes make it hard to link tables.
Using SSN as a SID is a bad idea.
To avoid typing errors
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
So, how do you select a primary key?
It is best to generate an automatic number with last digit a check digit. A check digit means you have a formula to calculate the check digit based on the digits in the numbers.
For example, the bar code, the last number is a check digit – see https://www.gs1au.org/resources/check-digit-calculator. Once upon a time, I wrote a program to calculate the check-digit, and there are rules of how to calculate the check-digit.
This is better than generating the next number – why?
Non intelligent
No change over time
Preferably single-attribute
Preferably numeric
Security-compliant
Use of Composite Primary Keys
Identifiers of composite entities
Each primary key combination is allowed once in M:N relationship
Identifiers of weak entities
Weak entity has a strong identifying relationship with the parent entity
Remember the so-called “composite, associative, or bridge” entity.
both entities share PK
Existence-dependent
(foreign key cannot be null)
and both entities share PK
Weak identifying relationship
Determinant
Dependent
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
So, when do we create and use composite primary keys? This is when you have many to many relationships, and you want to make it unique, generate composite primary keys is your solution.
Use of Composite Primary Keys
When used as identifiers of weak entities, represent a real-world object that is:
Existence-dependent on another real-world object
Represented in the data model as two separate entities in a strong identifying relationship
DEPENDENT(emp_num, dpnt_name)
It shares PK from parent entity
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
For example… as stated in the slide…
Surrogate Primary Keys
Primary key used to simplify the identification of entity instances are useful when:
There is no natural key
Selected candidate key has embedded semantic contents or is too long
Require ensuring that the candidate key of entity in question performs properly
Use unique index and not null constraints
(e.g., composite PK)
(see an example on the next slide)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
This is best to explain as an example, so see next slide.
Table 5.4 – Data Used to Keep Track of Events
These are required fields to uniquely identify an event. It is too cumbersome to use them all as a foreign key. Programmer may like to assign an “event_id” for each event, not known to the user, but only to the programmer for convenience, which is a surrogate PK.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
If you look at the example, you can see to make this event unique, you have to create a Composite key containing date, time start, and room.
It might be too cumbersome to use them all as a foreign key.
You can create an id called event_id composites these attributes. This is called a surrogate primary key. It is not for programmer convenience but also performance of the database.
However, if you use a surrogate primary key, you still need to check for uniqueness of other attributes. You might have a unique key to ensure these attributes are unique! So you might have a primary key and a unique key – this is more to do with designing a physical database.
Design Case 1: Implementing 1:1 Relationships
Foreign keys work with primary keys to properly implement relationships in relational model
Rule
Put primary key of the parent entity on the dependent entity as foreign key
Options for selecting and placing the foreign key:
Place a foreign key in both entities
Place a foreign key in one of the entities
Duplicate data
P F
P F
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Design Case 2: Maintaining History of Time-Variant Data
Time-variant data: Data whose values change over time and for which a history of the data changes must be retained
Requires creating a new entity in a 1:M relationship with the original entity
New entity contains the new value, date of the change, and other pertinent attribute
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.8 – Maintaining Salary History
Each employee has 0 to multiple
salary changes.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.9 – Maintaining Manager History
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.10 – Maintaining Job History
strong relationship,
sharing PK
weak relationship,
not sharing PK
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Design Case 3: Fan Traps
Design trap: Occurs when a relationship is improperly or incompletely identified
Represented in a way not consistent with the real world
Fan trap: Occurs when one entity is in two 1:M relationships to other entities
Produces an association among other entities not expressed in the model
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.11 – Incorrect ERD with Fan Trap Problem
Is Jordan in BlurStars
or Phanters?
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.12 – Corrected ERD After Removal of the Fan Trap
Which division is
Jordan in?
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Design Case 4:
Redundant Relationships
Occur when there are multiple relationship paths between related entities
Need to remain consistent across the model
Help simplify the design
Design trade-off
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
Figure 5.13 – A Redundant Relationship
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide and read the textbook for explanations.
This is a common error – if you know which division a player plays in by joining to table entity and division entity.
Oracle
Oracle
Database
Flat Files
Machine Learning
Prepared by Vincent Pang, Feb. 2021 Note: In-Memory Database (e.g. SAP Hana) is an alternative data model not shown here
ETL (Data
Cleansing)
Data (DW to BD, or
vice versa or both)
RelationalDatabaseBig DataDataWarehouseNormalisationReporting (Business Intelligence and Visualisation) and Business Analysis (End Users)(De-Normalised)External Data (e.g. Excel)Hadoop Distributed File System (HDFS) and MapReduceEntity Relationship Model (ERM)Unstructured Data(Social Media)Structured Data (Internet of Things (IOT))HadoopSQLSQLSpark and NoSQL (and other tools)Data StreamingData Streaming(“Not Normalised”)External Data (e.g. Excel)Star Schema
/docProps/thumbnail.jpeg