1
3/3/20
Practical Database Concepts
Lecture 1: Introduction Santha Sumanasekara March 2020
Data Everywhere – Amazon.com
– Half a billion products on sale
– Its website serves one billion pages a day on average, or 10000 pages
a second. Peaks can be much higher.
– To serve those billions of web pages, it will retrieve trillions of database objects from its backend database.
– A proprietary NoSQL database as the backend.
2
1
3
3/3/20
Data Everywhere
Data Everywhere – Facebook
– One billion queries per second
– Petabytes of data
– Almost 2.5 billion active users by 2019! What’s the total population on earth?
– A cluster of mySQL database at the backend
4
2
5
3/3/20
Facebook
Data Everywhere
– Storing large volumes of data in a methodical way is essential
– Traditional storage methods – flat files, spreadsheets – quickly became insufficient and limited in many ways
– This course is all about exploring how best manage your data.
– Start with comprehend core issues with data management.
6
3
7
3/3/20
A data management problem
– Scenario
– You are a video rental owner.
– Customers rent DVD/BluRay copies of movies.
– There may be several copies of each movie.
– There are a large number of customers and a handful of salespersons.
– Needs:
– Which DVDs were rented out?
– Are any videos overdue?
– When will a copy of a movie become available?
Solution 1: A simple file
Advantages:
• Easy to edit
• Simple to add a record
• Simple to delete a record
8
4
9
3/3/20
Solution 2: A Spreadsheet
Advantages:
• Readily available applications
• Simple to add a record (delete may not be straight-forward)
• Easy to do simple calculations.
Complications: Queries
– Does not address the query needs.
– Query 01: What movies has Joe Citizen rented?
– Execute: Search for the term “ Joe Citizen”.
– Query 02: What are the movies recommended for customers who borrowed “Lion King”?
– Execute: ??
– Requirements:
– Robust, sophisticated query language
– Clear separation between data organisation (schema) and data.
Database Concepts
• Schema
• DML (Data Manipulation Lang.) • SQL
10
5
11
3/3/20
Complications: Integrity
– Lacks Data integrity and consistency.
– The clerk, Ben, misspells a value:
– Customer: Joe Citizon, Rented: Elf, Due 05/03/2004
– Ends up having Joe Citizen and Joe Citizon as two customers.
– Requirements:
– Enforce constraints to permit only valid data to be input.
Database Concepts
• Integrity Constraints • Types
Complications: Updates – Add/delete/update fields in every record.
– Modify “customer” to “firstname” and “surname” – Customer: Joe Citizen, …, … , Loc: Boxhill
To:
– …, … , Loc: Boxhill
– Requirements:
– Ability to manipulate the way data is organised.
First: Joe, Surname: Citizen,
Database Concepts
• DDL (Data Definition
Language
12
6
13
3/3/20
Ben, the first check-out clerk at the video store
Introduction 13
Sarah, the second Check- out clerk at video store
14
7
15
3/3/20
Complications: Multi-user
–
Two clerks edit rented.txt file at the same time.
1. Ben starts to edit rented.txt, reads it into memory.
2. Sarahstartstoeditrented.txt
3. Ben adds a record and saves rented.txt to disk.
4. Sarahsavesrented.txttodisk.
5. Ben’saddedrecorddisappears!
– Requirements:
– Must support multiple readers and writers
– Must have a control over concurrent updates made by multiple writers.
Database Concepts
• Serialisability
• Concurrency Control
Complications: Crashes – Crash during update may lead to inconsistent state.
– Ben makes about 250 of 500 edits needed to fix Citizon to Citizen.
– He saves the changes
– He makes remaining edits as needed.
– Before he saves it, his favourite operating system crashes!
– Requirements:
– Must update on “all or none” basis.
– Implemented by commit or rollback if necessary.
Database Concepts • Transactions
• Commit
• Rollback
• Recovery
16
8
17
3/3/20
Complications: Security
– Clerks Ben and Sarah should not see the credit card details of
customers.
– Requirements:
– Ability to control who has access to what information
Database Concepts • Security
• Views
Complications: Efficiency
– One giant company takes over a number of small video stores
– rented.txt gets huge (gigabytes of data)
– Slow to edit.
– Slow to query for customer information.
– Requirements:
– New data structures to improve query performance.
– System automatically modifies query evaluation to improve the speed.
– Ability of system to scale to handle huge datasets.
Database Concepts • Index
• Query Optimisation
• Database Tuning
18
9
19
3/3/20
Solution?
What is a Database Management System??
• Chapter One of LinkedIn Learning tutorial (Learning Relational Databases) has a succinct explanation.
• https://www.linkedin.com/learning/learning-relational-databases- 2/welcome
• Let’s listen
20
10
21
3/3/20
Which database products? – A large number of Database products available in the market –
commercial, and open-source.
– Refer the comparison of relational database management systems at
– http://en.wikipedia.org/wiki/Comparison_of_relational_database_mana gement_systems
Which database products? – Evolution timeline at
– http://upload.wikimedia.org/wikipedia/commons/2/22/RDBMS_timeline .svg
22
11
23
24
3/3/20
12
25
26
3/3/20
13
27
3/3/20
A new approach – noSQL
– A NoSQL (originally referring to “non SQL” or ”Not Only Relational” database provides a mechanism for storage and retrieval of data which is modeled in means other than tables.
– a surge of popularity in the early twenty-first century, triggered by the needs of Web 2.0 companies such as Facebook, Google and Amazon.com.
– https://en.wikipedia.org/wiki/NoSQL
A new approach – noSQL – More informative videos about noSQL
– https://www.youtube.com/watch?v=uD3p_rZPBUQ An Introduction To NoSQL Databases
– https://www.youtube.com/watch?v=ASiU89Gl0F0 NoSQL distilled
28
14
29
30
3/3/20
Database System used in this
course
– We use SQLite – SQLite is a very light-weight single user database
– SQLite is the most widely deployed database in the world, including iOS, Android and other major projects (https://sqlite.org/famous.html)
– We use SQLite Studio (https://sqlitestudio.pl) as a GUI-based Interactive Development Environment
– Both SQLite and SQLite Studio are open- source projects.
15
31
3/3/20
SQLite Studio
What will you learn in this course?
32
16
33
3/3/20
What will you learn in this course?
– Database Design
– Building Databases
– Using Databases – CRUD
– Create – Read
– Update – Delete
What will you be assessed on?
– Two assignments – 20% and 30% – Exam – 50%
34
17
35
3/3/20
Feedback!
Planning for next week
– Pre-reading (preparation)
– Revise Section 1 – 3 of the lynda.com tutorial – Revise Week 1 lecture notes
– Laptop
– Next week, we explore relational database concepts in more details, some class activities/ exercises during the lecture.
– A short quiz – Please bring in your laptop
– Learn more about assessments
– Complete Week 2 Tute/Lab sessions
36
18