Poll — Designing database application involves:
a) Design of Database Schema
b) Design of programs that access and update data
c) Design of security scheme to control access to data
d) All of the above
Design Phases: How to Design a Database Schema?
• Characterize the data needs of
the prospective users.
• Choose a data model
• Moving from abstract data model
to the implementation:
• Logical Design phase
• Physical Design phase
Design Phases (cont.)
5
Choosing a Data Model:
Network Models More natural view of the
world
Capacity for data
independence-limited
Relational Model High degree of data
independence
May lose important
semantic information
Entity Relationship Model High degree of data
independence
More natural view of the
world
Design Phases (cont.)
6
Moving from abstract data model to the implementation
– Two phases:
● Logical Design – Deciding on the database schema. Database
design requires that we find a “good” collection of relation
schemas.
○ Business decision – What attributes should we record in the
database?
○ Computer Science decision – What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
● Physical Design – Deciding on the physical structure of the database
Design Phases (cont.)
7
Moving from abstract data model to the implementation
– Two phases:
● Logical Design – Deciding on the database schema. Database
design requires that we find a “good” collection of relation
schemas.
○ Business decision – What attributes should we record in the
database?
○ Computer Science decision – What relation schemas should we
have and how should the attributes be distributed among the
various relation schemas?
● Physical Design – Deciding on the physical structure of the database
Design Approaches
8
● Entity Relationship Model (covered in this module)
○ Models an enterprise as a collection of entities
and relationships
■ Entity: a “thing” or “object” in the
enterprise that is distinguishable
from other objects
● Described by a set of attributes
■ Relationship: an association among
several entities
○ Represented diagrammatically by an entity-
relationship diagram
● Normalization Theory (Module 6)
○ Formalize what designs are bad, and test for
them
Major Pitfalls in designing a database schema?
•Redundancy: a bad design may result in repeat
information.
Redundant representation of information may lead to
data inconsistency among the various copies of
information
•Incompleteness: a bad design may make certain aspects of
the enterprise difficult or impossible to model.
Outline of the ER Model
ER Model – Database Modeling
11
● The ER data model was developed to facilitate database design by
allowing specification of an enterprise schema that represents the
overall logical structure of a database.
● The ER model is very useful in mapping the meanings and
interactions of real-world enterprises onto a conceptual schema.
Because of this usefulness, many database-design tools draw on
concepts from the ERmodel.
● The ER model also has an associated diagrammatic
representation, the ER diagram, which can express the overall
logical structure of a database graphically.
ER Model – Database Modeling
12
● The ER data model employs three
basic concepts:
○ entity sets
○ relationship sets
○ attributes.
If libraries chose ER Model?!!
Entity Sets
13
● An entity is an object that exists and is distinguishable from other
objects.
○ Example: specific person, company, event, plant
● An entity set is a set of entities of the same type that share the
same properties.
○ Example: set of all persons, companies, trees, holidays
● An entity is represented by a set of attributes; i.e., descriptive
properties possessed by all members of an entity set.
○ Example:
instructor = (ID, name, street, city, salary)
course= (course_id, title, credits)
● A subset of the attributes form a primary key of the entity set; i.e.,
uniquely identifying each member of the set.
Entity Sets – instructor and student
14
instructor
76766 Crick
45565 Katz
10101 Srinivasan
98345 Kim
76543 Singh
22222 Einstein
98988 Tanaka
12345 Shankar
00128 Zhang
76543 Brown
76653 Aoi
23121 Chavez
44553 Peltier
student
Relationship Sets
15
● A relationship is an association among several entities
Example:
44553 (Peltier)
student entity
advisor
relationship set
22222 (Einstein)
instructor entity
● A relationship set is a mathematical relation among n 2 entities,
each taken from entity sets
{(e
1
, e
2
, … e
n
) | e
1
∈ E
1
, e
2
∈ E
2
, …, e
n
∈ E
n
}
where (e
1
, e
2
, …, e
n
) is a relationship
● Example:
(44553, 22222) ∈ advisor
Why Relationships?
Entity Set A
Entity Set B
Relationship Set advisor
instructor
student
76766 Crick 98988 Tanaka
45565 Katz 12345 Shankar
10101 Srinivasan 00128 Zhang
98345 Kim 76543 Brown
76543 Singh 76653 Aoi
22222 Einstein 23121 Chavez
44553 Peltier
17
11
Relationship Sets (Cont.)
● An attribute can also be associated with a relationship set.
● For instance, the advisor relationship set between entity sets
instructor and student may have the attribute date which tracks
when the student started being associated with the advisor
instructor
76766 Crick
45565 Katz
10101 Srinivasan
98345 Kim
76543 Singh
22222 Einstein
98988 Tanaka
12345 Shankar
00128 Zhang
76543 Brown
76653 Aoi
23121 Chavez
44553 Peltier
student
3 May 2008
10 June 2007
12 June 2006
6 June 2009
30 June 2007
31 May 2007
4 May 2006
Degree of a Relationship Set
19
● Binary relationship
○ involve two entity sets (or degree two).
○ most relationship sets in a database system are binary.
● Relationships between more than two entity sets are rare. Most
relationships are binary.
○ Example: students work on research projects under the guidance
of an instructor.
○ relationship proj_guide is a ternary relationship between
instructor, student, and project
Mapping Cardinality Constraints
20
● Express the number of entities to which another entity can be
associated via a relationship set.
● Most useful in describing binary relationship sets.
● For a binary relationship set the mapping cardinality must be one of
the following types:
○ One to one
○ One to many
○ Many to one
○ Many to many
Mapping Cardinalities
One to one One to many
Note: Some elements in A and B may not be mapped to any elements in
the other set
21
Mapping Cardinalities (cont.)
Many to one Many to many
Note: Some elements in A and B may not be mapped to any
elements in the other set
22
Redundant Attributes
23
● Suppose we have entity sets:
○ instructor, with attributes: ID, name, dept_name, salary
○ department, with attributes: dept_name, building, budget
● We model the fact that each instructor has an associated
department using a relationship set inst_dept.
● The attribute dept_name appears in both entity sets. Since it is the
primary key for the entity set department, it replicates information
present in the relationship and is therefore redundant in the entity
set instructor and needs to be removed.
● BUT: when converting back to tables, in some cases the attribute
gets reintroduced.
Weak Entity Sets
24
● Consider a section entity, which is uniquely identified by a course_id,
semester, year, and sec_id.
● Clearly, section entities are related to course entities. Suppose we
create a relationship set sec_course between entity sets section and
course.
● Note that the information in sec_course is redundant, since section
already has an attribute course_id, which identifies the course with
which the section is related.
● One option to deal with this redundancy is to get rid of the
relationship sec_course; however, by doing so the relationship
between section and course becomes implicit in an attribute, which
is not desirable.
Weak Entity Sets (Cont.)
25
● An alternative way to deal with this redundancy is to not store the
attribute course_id in the section entity and to only store the
remaining attributes section_id, year, and semester. However, the
entity set section then does not have enough attributes to identify a
particular section entity uniquely; although each section entity is
distinct, sections for different courses may share the same
section_id, year, and semester.
● To deal with this problem, we treat the relationship sec_course as a
special relationship that provides extra information, in this case, the
course_id, required to identify section entities uniquely.
Weak Entity Sets (Cont.)
26
● The notion of weak entity set formalizes the above intuition. A weak
entity set is one whose existence is dependent on another entity,
called its identifying entity; instead of associating a primary key with
a weak entity, we use the identifying entity, along with extra
attributes called discriminator to uniquely identify a weak entity. An
entity set that is not a weak entity set is termed a strong entity set.
Weak Entity Sets (Cont.)
27
● Every weak entity must be associated with an identifying entity; that
is, the weak entity set is said to be existence dependent on the
identifying entity set. The identifying entity set is said to own the
weak entity set that it identifies. The relationship associating the
weak entity set with the identifying entity set is called the identifying
relationship.
● Note that the relational schema we eventually create from the entity
set section does have the attribute course_id, for reasons that will
become clear later, even though we have dropped the attribute
course_id from the entity set section.
E-R Diagrams
Entity Sets
● Entities can be represented graphically as follows:
○ Rectangles represent entity sets.
○ Attributes listed inside entity rectangle
○ Underline indicates primary key attributes
29
Relationship Sets
● Diamonds represent relationship sets.
30
Relationship Sets with Attributes
31
Roles
● Entity sets of a relationship need not be distinct
○ Each occurrence of an entity set plays a “role” in the
relationship
● The labels “course_id” and “prereq_id” are called roles.
32
Cardinality Constraints
● We express cardinality constraints by drawing either a directed line
(→), signifying “one,” or an undirected line (—), signifying “many,”
between the relationship set and the entity set.
● One-to-one relationship between an instructor and a student:
○ A student is associated with at most one instructor via the
relationship advisor
○ A student is associated with at most one department via
stud_dept
33
One-to-Many Relationship
● One-to-many relationship between an instructor and a student
○ an instructor is associated with several (including 0) students
via advisor
○ a student is associated with at most one instructor via advisor
34
Many-to-One Relationships
● In a many-to-one relationship between an instructor and a student,
○ an instructor is associated with at most one student via advisor,
○ and a student is associated with several (including 0)
instructors via advisor
35
Many-to-Many Relationship
● An instructor is associated with several (possibly 0) students via
advisor
● A student is associated with several (possibly 0) instructors via
advisor
36
Total and Partial Participation
● Total participation (indicated by double line): every entity in the
entity set participates in at least one relationship in the relationship
set
○ Participation of student in advisor relation is total
■ every student must have an associated instructor
● Partial participation: some entities may not participate in any
relationship in the relationship set
○ Example: participation of instructor in advisor ispartial
37
Notation for Expressing More Complex Constraints
● A line may have an associated minimum and maximum cardinality,
shown in the form l..h, where l is the minimum and h the maximum
cardinality
○ A minimum value of 1 indicates total participation.
○ Amaximum value of 1 indicates that the entity participates in
at most one relationship
○ A maximum value of * indicates no limit.
Instructor can advise 0 or more students. A student must have 1
advisor; cannot have multiple advisors
38
Expressing Weak Entity Sets
● In E-R diagrams, a weak entity set is depicted via a double
rectangle.
● We underline the discriminator of a weak entity set with a dashed
line.
● The relationship set connecting the weak entity set to the
identifying strong entity set is depicted by a double diamond.
● Primary key for section – (course_id, sec_id, semester, year)
39
Classwork
Construct an E-R diagram for a car insurance company whose
customers own one or more cars each. Each car has associated
with it zero to any number of recorded accidents. Each insurance
policy covers one or more cars and has one or more premium
payments associated with it. Each payment is for a particular period
of time, and has an associated due date, and the date when the
payment was received.
Reduction to Relational
Schemas
Reduction to Relational Schemas
42
● Entity sets and relationship sets can be expressed uniformly as
relation schemas that represent the contents of the database.
● A database which conforms to an E-R diagram can be
represented by a collection of schemas (that is, tables).
● For each entity set and relationship set there is a unique
schema that is assigned the name of the corresponding entity
set or relationship set.
● Each schema has a number of columns (generally corresponding
to attributes), which have unique names.
Representing Entity Sets
● A strong entity set reduces to a schema with the same attributes
student (ID, name, tot_cred)
● A weak entity set becomes a table that includes a column for the
primary key of the identifying strong entity set
section (course_id, sec_id, sem, year)
43
Representing Relationship Sets
● A many-to-many relationship set is represented as a schema with
attributes for the primary keys of the two participating entity sets,
and any descriptive attributes of the relationship set.
● Example: schema for relationship set advisor
advisor = (s_id, i_id)
44
Classwork:
Design a database for a worldwide package delivery company
(e.g., DHL or FedEx). The database must be able to keep track of
customers who ship items and customers who receive items; some
customers may do both. Each package must be identifiable and
trackable, so the database must be able to store the location of the
package and its history of locations. Locations include trucks,
planes, airports, and warehouses.
Your design should include an E-R diagram, a set of relational
schemas, and a list of constraints, including primary-key and
foreign-key constraints.
Non-binary Relationship Sets
● Most relationship sets are binary
● There are occasions when it is more convenient to represent
relationships as non-binary.
● E-R Diagram with a Ternary Relationship:
46
Cardinality Constraints on Ternary Relationship
47
● We allow at most one arrow out of a ternary (or greater degree)
relationship to indicate a cardinality constraint
● For example, an arrow from proj_guide to instructor indicates each
student has at most one guide for a project
● If there is more than one arrow, there are two ways of defining the
meaning.
Cardinality Constraints on Ternary Relationship (Cont.)
48
● For example, a ternary relationship R between A, B and C with
arrows to B and C could mean
○ Each A entity is associated with a unique entity from B and C or
○ Each pair of entities from (A, B) is associated with a unique C
entity, and each pair (A, C) is associated with a unique B
● Each alternative has been used in different formalisms
● To avoid confusion we outlaw more than one arrow
Design Issues
Entities vs. Attributes
● Use of entity sets vs. attributes
● Use of phone as an entity allows extra information about phone
numbers (plus multiple phone numbers)
50
Binary Vs. Non-Binary Relationships
51
● Although it is possible to replace any non-binary (n-ary, for n > 2)
relationship set by a number of distinct binary relationship sets, a
n-ary relationship set shows more clearly that several entities
participate in a single relationship.
● Some relationships that appear to be non-binary may be better
represented using binary relationships
○ For example, a ternary relationship parents, relating a child to
his/her father and mother, is best replaced by two binary
relationships, father and mother
■ Using two binary relationships allows partial information
(e.g., only mother being known)
○ But there are some relationships that are naturally non-binary
■ Example: proj_guide
E-R Design Decisions
52
● The use of an attribute or entity set to represent an object.
● Whether a real-world concept is best expressed by an entity set or a
relationship set.
● The use of a ternary relationship versus a pair of binary
relationships.
● The use of a strong or weak entity set.
● The use of specialization/generalization – contributes to modularity
in the design.
● The use of aggregation – can treat the aggregate entity set as a
single unit without concern for the details of its internal structure.
So, what defines a good design?
Proactively understand
Data Management needs
and Plan ahead.
Acknowledgements
54
● Some content adapted or modified from the course assigned
textbook and its relevant resources by:
Silberschatz, A., Korth, H. F., and Sudarshan, S. (2019). Database
System Concepts, 7th Edition. McGraw-Hill.
●