CS3402 Database Systems
Teaching Staff’s Information
Lecturers:
Dr. LIAO Jing (week1-6)
Dr. CHEUNG Man Hon Michael (week 8-13)
Email: jingliao@cityu.edu.hk Phone: 3442-8619
Office: YEUNG-G6429
Teaching Assistants:
Email:mhcheung@cityu.edu.hk Phone: 3442-7324
Office: YEUNG-Y6412
Mr. MENG Xiangyi xianmeng5-c@my.cityu.edu.hk
Ms. HUANG Jialu
Ms. YE Shuquan
Mr. WAN Ziyu
Mr. ZHANG Jingbo jbzhang6-c@my.cityu.edu.hk Mr. Wang Can wang355-c@my.cityu.edu.hk Mr. Du Hao haodu8-c@my.cityu.edu.hk
jialhuang8-c@my.cityu.edu.hk shuquanye2-c@my.cityu.edu.hk
ziyuwan2-c@my.cityu.edu.hk
Course Overview
Course Format: Regular lectures
Class timetable:
Tuesday 10:00am – 11:50am lecture
Tutorial and lab classes One-hour tutorial
Pattern: 4 times lab+8 times normal tutorial
Assessment
Coursework — 40% :
Mid-term (week 7, openbook)– 25% Homework assignments (3 times)– 15%
Final examination (open-book)– 60% Get 30 out of 100 to pass
Course Materials
Text books
“Fundamentals of Database Systems”, 6th edition (or later), by R. Elmasri, S.B. Navathe, Addison-Wesley.
“Database System Concepts”, 5th edition (or later), by A. Silberschatz, H. Korth, S. Sudarshan, McGraw-Hill Companies Inc.
Notations may vary in different books. Please stick to the ones used in this lecture notes!
Database is everywhere
You cannot avoid it and it’s everywhere!
You can say it actually makes the current society and
your life work!
Banking/Credit card /Social Security Info…
Online shopping/booking…
Motivating Example
Data: in Computer system, Data is any sort of information which is stored in computer memory.
Student ID
Student name
Course
Course Name
Grade
50000000
50000000
50000001
50000002
Peter Wong Peter Wong Mary Tsui
Bob Lee
CS3402
CS2302 CS3402
CS3402
Database Systems
Data Structures Database Systems
Database Systems
B+ A
A- B
This way of storing data good? How to improve?
Introduction to DB Systems
Motivations
File-processing Systems
permanent records stored in various files
application programs written to extract & add records Disadvantages of traditional file-processing systems
data redundancy & inconsistency difficulty in accessing data
data isolation & different data formats concurrent access anomalies security problem
integrity problem
Introduction to DB Systems
What is a Database (DB)?
A non-redundant, persistent collection of logically-related records/files that are structured to support various processing and retrieval needs.
Database Management System (DBMS)
A set of software programs for creating, storing, updating, and
accessing the data of a DB.
E.g.: Oracle, Mysql,Oceanbase
Database Management System
Database Systems: Design, Implementation, & Management: Rob & Coronel
– manages interaction between end users and database
15
Database Management System
Difference between DBMS & other programming systems
the ability to manage persistent data
primary goal of DBMS: to provide an environment that is convenient, efficient, and robust to use in retrieving & storing data
Other DBMS capabilities
data modeling
high-level languages to define, access and manipulate data transaction management & concurrency control
access control
recovery
Database System
Database System
an integrated system of hardware, software, people, procedures, and data
that define and regulate the collection, storage, management,
17
and use of data within a database environment
Hardware
Software
– OS
– DBMS
– Applications
People
Procedures
Data
DB System Architecture
Naïve user Interactive Appl. Prog’er DBA user
Application interfaces
Application programs
(SQL) query
DB schema
Application programs object code
DQL processor
Database manager
DML compiler
DDL compiler
DBMS
File manager
DB
Data files disk storage
Data dict.
DB System Architecture
Database Users Naive Users:
Running application programs Interactive Users:
Using query languages Application Programmers
Writing embedded DML in a host language
DB System Architecture
Database Administrator (DBA)
DBA is the person who has central control over the DB Main functions of DBA:
schema definition
storage structure and access method definition schema and physical organization modification granting of authorization for data access integrity constraint specification
DB System Architecture
Data Query Language (DQL)
– a language used to make queries in databases
– e.g. search records with giving conditions (sex=“Female”)
Data Manipulation Language (DML)
– a language that enables users to manipulate data
– e.g. insert or delete records
Data Definition Language (DDL)
– a language for defining DB schema
– e.g., create, modify, and remove database objects such as tables, indexes, and users.
DB System Architecture
DB Manager
interface between stored data and application programs/queries translate conceptual level commands into physical level ones responsible for
access control concurrency control backup & recovery integrity
File Manager
allocation of space operations on files
Data Abstraction: 3-level architecture
Data Abstraction
Data Abstraction
Abstract view of the data
simplify interaction with the system
hide details of how data is stored and manipulated Levels of abstraction
Physical/internal level: internal schema uses a physical data model and describes the complete details of data storage and access paths for the database.
Conceptual level: conceptual schema describes the structure of the whole database for a community of users and hides the details of physical storage structures.
View/external level: external schema describes the part of the database that a particular user group is interested in and hides the rest from that group.
Data Independence
the ability to modify a schema definition in one level without affecting a schema in the next higher level
there are two kinds (a result of the 3-level architecture): physical data independence
— the ability to modify the physical schema without altering the conceptual schema and thus, without causing the application programs to be rewritten
logical data independence
— the ability to modify the conceptual schema without causing the application programs to be rewritten
Data Abstraction: 3-level architecture
Data Models
Data Model (conceptual level)
A collection of conceptual tools for describing data, data
relationships, operations, and consistency constraints the “core” of a database
Course Objectives
ER model: characterize relationships among entities
Relational model: transform from ER diagram to tables
SQL: language for writing queries
Relational Algebra: logical way to represent queries
Normal Forms: how to design good tables
File Organization: provide file level structure to speed up query
Query Optimization: transform queries into more efficient ones
Transactions and Concurrency Control: handle concurrent operations and guarantee correctness of the database
Course Objectives
ER model: characterize relationships among entities
Relational model: transform from ER diagram to tables
SQL: language for writing queries
Relational Algebra: logical way to represent queries
Normal Forms: how to design good tables
File Organization: provide file level structure to speed up query
Query Optimization: transform queries into more efficient ones
Transactions and Concurrency Control: handle concurrent operations and guarantee correctness of the database
The Entity-Relationship Model
Preliminaries
Proposed by P. Chen in 1976
Direct, easy-to-understand graphical notation
Translates readily to relational schema for database design
Ideas/requirem E/R Relational Relational ents design schema database
Three basic concepts: Entity, Attribute, Relationship
ER Model Concepts
Entity
a distinguishable object with an independent
existence
Example: John Chan, CityU, HSBC, … Entity Set
a set of entities of the same type
Example: Student, University, Bank, …
ER Model Concepts
Attribute(Property) — a piece of information describing an entity
Example: Name, ID, Address, Sex are attributes of a student entity
Each attribute can take a value from a domain Example: Name Character String,
ID Integer, …
Formally, an attribute A is a function which maps
from an entity set E into a domain D: A: E D
Types of Attributes
Simple
Each entity has a single atomic value for the attribute. For
example, SSN or Sex, name…
Composite
The attribute may be composed of several components. For
example:
Address(Flat, Block, Street, City, State, Country)
Composition may form a hierarchy where some components are themselves composite
Multi-valued
An entity may have multiple values for that attribute. For
example, Color of a CAR or PreviousDegrees of a STUDENT Denoted as {Color} or {PreviousDegrees}
E.g., {{BSc, 1990}, {MSc, 1993}, {PhD, 1998}}
Example of a composite attribute
Types of Attributes
In general, composite and multi-valued attributes may be nested to any number of levels
For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}
Multiple PreviousDegrees values can exist
Each has four subcomponent attributes: College, Year, Degree, Field
Attributes can form the keys
Keys: to distinguish individual entities or relationships superkey — a set of one or more attributes which, taken
together, identify uniquely an entity in an entity set Example: {student ID, Name} identify a student
candidate key — minimal set of attributes which can identify uniquely an entity in an entity set
a superkey for which no proper subset is a superkey Example: student ID identify a student,
but Name is not a candidate key (WHY?) Minimal does not need to be a single attribute.
primary key — a candidate key chosen by the DB designer to identify an entity or relationship
ER Model Concepts
Relationship — an association among several entities Example: Patrick and Eva are friends
Patrick is taking cs3450
a relationship can carry attributes: properties of the relationship
Example: Patrick takes cs2450 with a grade of B+
Relationship Set — a set of relationships of the same type (same
attribute,same participating entity, same constraints)
Example:
taking
John mary may
cs3450 cs2578 ee4532
Formally, a relationship R is a subset of:
{ (e1, e2, …, ek) | e1 E1, e2 E2, …, ek Ek) }
Constraints on relationship
Cardinality COURSE LECTURER
One-to- one(1:1)
COURSE TA
One-to-many (1:N)
STUDENTS COURSE
Many-to- many (M:N)
Participation: whether every entity in the entity set participates in the relationship set: total v.s. partial
ER Model Diagram
ER Diagram Rectangles:
Oval:
Diamonds: Lines:
R
Entity Sets Attributes Relationship Sets
Attributes to Entity/Relationship Sets or, Entity Sets to Relationship Sets
mRn 1Rn
1
R
1
Example COMPANY Database
We need to create a database schema (definition) based on the following (simplified) application requirements of the COMPANY Database:
The company is organized into DEPARTMENTs
Each DEPARTMENT has a unique name, unique number and
an EMPLOYEE who manages the department
We keep track of the start date of the department manager. A
department may have several locations
Each DEPARTMENT controls/has a number of PROJECTs
Each project has a unique name, unique number and is located at a single location
Example COMPANY Database
The database will store each EMPLOYEE’s social security number (ssn), name(first name, last name and middle init), address, salary, sex, and birthdate
Each employee works for one department but may work on several projects
The DB will keep track of the number of hours per week that an employee currently works on each project
It is required to keep track of the direct supervisor of each employee
Each employee may have a number of DEPENDENTs
For each dependent, the DB keeps a record of name, sex, birthdate, and relationship to the employee
Initial Conceptual Design of Entity Sets for the COMPANY Database Schema
Based on the requirements, we can identify four initial entity sets in the COMPANY database:
DEPARTMENT PROJECT
EMPLOYEE
DEPENDENT
The initial attributes shown are derived from the requirements description:
Initial Design of Entity Sets: EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT
Refining the initial design by introducing relationships
Find relationships relating two or more distinct entities/entity types with a specific meaning
By examining the requirements, six relationship types are identified: EMPLOYEE works on PROJECT
EMPLOYEE works for DEPARTMENT
EMPLOYEE manages DEPARTMENT
EMPLOYEE supervises EMPLOYEE DEPENDENT depends on EMPLOYEE DEPARTMENT controls PROJECT
ER DIAGRAM – Relationship Types are: WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF
Relationship instances of the WORKS_FOR N:1 relationship between EMPLOYEE and DEPARTMENT
N-to-1
Many employees in the same department
Relationship instances of the M:N WORKS_ON relationship between EMPLOYEE and PROJECT
M-to-n
Many employee work on many projects
Many-to-one (N:1) Relationship
Select anyone in EMPLOYEE, one can be identified in DEPARTMENT
One-to-one is special case of Many-to-one
Many-to-many (M:N) Relationship
Many-to-one is a special case of a many-to-many
ER Model Diagram
Weak Entity Set
an entity set that does NOT have enough attributes to form a primary/candidate key
Acct. no balance
trans. no date
amount
account
transaction
Role Indicators
Emp. name
log
Phone#
manager worke
employee
Works-for
Summary of ER-Diagram Notation
Symbol
Meaning
ENTITY TYPE
WEAK ENTITY TYPE
RELATIONSHIP TYPE
IDENTIFYING RELATIONSHIP TYPE
ATTRIBUTE
KEY ATTRIBUTE
MULTIVALUED ATTRIBUTE
COMPOSITE ATTRIBUTE
DERIVED ATTRIBUTE
TOTAL PARTICIPATION OF E2 IN R
CARDINALITY RATIO 1:N FOR E1:E2 IN R
STRUCTURAL CONSTRAINT (min, max) ON PARTICIPATION OF E IN R
E1
1
R
(min,max)
R
R N
E2
E1
E2
E
Summary
Concepts of databse system, DBMS, data abstraction, Data model.
The ER Model is regarded as the 1st “conceptual/semantic” model centered around relationships, not attributes
It combines successfully the best features of the previous data models
simple and easy to understand
can be mapped to tables (relational model) in a straightforward
manner
(to be studied in the coming lecture series)