and Review
RDBMS review
Memory hierarchy
John Edgar 2
1.1
A table consists of rows and columns
▪ Rows represent records
▪ Records must be unique
▪ One or more attributes form a primary key
▪ Records are also referred to as tuples
▪ Each column represents an attribute
▪ Columns must have unique names within the table
A table is an instance of a schema
cid   first   last     email         …
123   bob     smith    bob@sfu.ca
456   kate    larson   kate@sfu.ca
789   ida     chan     ida@sfu.ca
A schema is defined in the DB
▪ Created either by running a CREATE TABLE statement or using the DBMS GUI
▪ Schemas are part of the DB metadata
▪ Stored in the system catalog
A schema is associated with one or more constraints
▪ Primary key
▪ Foreign key(s)
▪ Other constraints
Each column is associated with a domain, i.e. a type
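As a minimal sketch of the points above, the following uses Python's built-in sqlite3 module to define a schema with a primary key, a foreign key and another constraint, and then reads the table names back from SQLite's catalog table (sqlite_master). The column types and the account table are assumptions made for illustration; they are not part of the original example.

```python
import sqlite3

# In-memory SQLite database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Schema for the customer table; the column types (domains) are assumed.
conn.execute("""
    CREATE TABLE customer (
        cid   INTEGER PRIMARY KEY,
        first TEXT NOT NULL,
        last  TEXT NOT NULL,
        email TEXT UNIQUE
    )
""")
# Hypothetical second table, added to show a foreign key and a CHECK constraint.
conn.execute("""
    CREATE TABLE account (
        aid     INTEGER PRIMARY KEY,
        cid     INTEGER NOT NULL REFERENCES customer(cid),
        balance REAL CHECK (balance >= 0)
    )
""")

conn.execute("INSERT INTO customer VALUES (123, 'bob', 'smith', 'bob@sfu.ca')")

def violates_constraint(sql):
    """Return True if the DBMS rejects the statement because of a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# Same primary key as an existing record: rejected.
duplicate_key = violates_constraint(
    "INSERT INTO customer VALUES (123, 'kate', 'larson', 'kate@sfu.ca')")
# Refers to a customer that does not exist: rejected.
dangling_reference = violates_constraint(
    "INSERT INTO account VALUES (1, 999, 10.0)")

# The schemas themselves are stored as metadata in the catalog.
catalog = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(duplicate_key, dangling_reference, catalog)
```

Other DBMSs expose their catalog differently, so the sqlite_master query is specific to SQLite.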
An application uses a conceptual view when interacting with a DB
▪ In a relational DB it uses the relational model
▪ And interacts through SQL
The DBMS maps the conceptual view to the physical view
▪ Data is stored in main memory or secondary storage
▪ And the DBMS is responsible for accessing data from storage devices
There are two major alternatives for the interface between DBMS and application
Embedded
▪ The application accesses the DB directly via API function calls
Tiered client-server
▪ The application makes a connection with the DBMS via ODBC, JDBC etc.
▪ This may entail connecting to a server that connects to the DB server
The DBMS is linked to the application at compile time
▪ They share the same address space
Embedded DBs are often used in mobile systems
SQLite is a widely used embedded DBMS
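A short sketch of embedded access, assuming Python's built-in sqlite3 module as the API: the DBMS code runs inside the application's own process, so there is no server to connect to. The table layout follows the sample customer rows earlier in this section; the column types are assumed.

```python
import sqlite3

# The DBMS runs inside this process: no server and no network connection.
conn = sqlite3.connect(":memory:")  # ":memory:" keeps the example self-contained
conn.execute(
    "CREATE TABLE customer (cid INTEGER PRIMARY KEY, first TEXT, last TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [(123, "bob", "smith", "bob@sfu.ca"),
     (456, "kate", "larson", "kate@sfu.ca"),
     (789, "ida", "chan", "ida@sfu.ca")],
)
conn.commit()

# A query is an ordinary function call made by the application.
emails = [row[0] for row in conn.execute(
    "SELECT email FROM customer WHERE cid > ? ORDER BY cid", (200,))]
print(emails)
conn.close()
```

Passing a file name instead of ":memory:" would store the DB in a single file owned by the application.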
The application and DBMS reside on separate machines
▪ And communicate through a network
There are many different possible tiered architectures
▪ With different numbers of tiers
This type of architecture is common and used with most enterprise DBMSs
1.2
Ideally memory should be
▪ Unlimited capacity
▪ High bandwidth
▪ Instantaneous access
▪ Persistent
▪ Reliable (never fail)
▪ Free
Unfortunately …
Of course this is not reality
Instead we have trade-offs between these qualities
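Memory hierarchy: access time in ns, with an equivalent distance in km (1 ns scaled to 1 km)

level               type           access time (ns)   distance analogy
registers           volatile       1
L1 cache            volatile       1
L2 cache            volatile       3
L3 cache            volatile       15
main memory         volatile       50
persistent memory   non-volatile   100
flash (SSD)         non-volatile   20,000             Vancouver to Cape Town
disk (HDD)          non-volatile   5,000,000          6.5 round trips to the moon, a tenth of the way to Mars
tape                non-volatile   10,000,000,000     33 round trips to the sun

The figure also uses "a few blocks" and "a long commute to work" as analogies for the faster levels
Caches are built from SRAM and main memory from DRAM
Cost and speed both decrease from registers to tape
Cost (and speed) change over time
▪ Cost of all memory types has decreased
▪ For newer technologies such as SSD cost per MB has substantially decreased
The comparison ignores bandwidth, which is generally higher at the faster levels of the hierarchy
General interest: latency comparisons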
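The distance analogies can be checked with a little arithmetic. The sketch below assumes the scaling of 1 ns to 1 km, reads the access times for main memory, SSD, HDD and tape from the comparison above, and uses approximate average Earth-moon and Earth-sun distances that are not part of the original figure.

```python
# Assumed astronomical distances (approximate averages), in km.
EARTH_MOON_KM = 384_400
EARTH_SUN_KM = 149_600_000

# Access times in ns as read from the comparison; at 1 ns -> 1 km
# the same numbers are distances in km.
ACCESS_TIME_NS = {
    "main memory": 50,
    "flash (SSD)": 20_000,
    "disk (HDD)": 5_000_000,
    "tape": 10_000_000_000,
}

def round_trips(distance_km, one_way_km):
    """Number of there-and-back journeys that fit in distance_km."""
    return distance_km / (2 * one_way_km)

moon_trips = round_trips(ACCESS_TIME_NS["disk (HDD)"], EARTH_MOON_KM)
sun_trips = round_trips(ACCESS_TIME_NS["tape"], EARTH_SUN_KM)
hdd_vs_memory = ACCESS_TIME_NS["disk (HDD)"] // ACCESS_TIME_NS["main memory"]

print(f"HDD access  ~ {moon_trips:.1f} round trips to the moon")
print(f"tape access ~ {sun_trips:.1f} round trips to the sun")
print(f"one HDD access takes as long as {hdd_vs_memory:,} main memory accesses")
```

The last line is the figure that matters for a DBMS: with these numbers a single disk access costs about as much as a hundred thousand main memory accesses.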
Database data must be stored in persistent storage
But must be operated on in main memory
▪ And ultimately in registers
Transfer of data between main memory and storage is very time-consuming
▪ And should be managed carefully
▪ Through buffer management
Figure: registers, cache, memory, storage; transfers between cache and memory are slow, and transfers between memory and storage are very slow
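To make buffer management concrete, here is a minimal sketch of a buffer pool that keeps a fixed number of pages in main memory and counts how often it has to go to storage. The least-recently-used (LRU) replacement policy, the BufferPool class and the read_page function are assumptions chosen for illustration; real DBMSs use a variety of policies and must also write modified pages back to storage, which this sketch omits.

```python
from collections import OrderedDict

class BufferPool:
    """Keeps up to `capacity` pages in memory, evicting the least recently used."""

    def __init__(self, capacity, read_page):
        self.capacity = capacity
        self.read_page = read_page      # function that fetches a page from storage
        self.frames = OrderedDict()     # page_id -> page contents, oldest first
        self.storage_reads = 0

    def get(self, page_id):
        if page_id in self.frames:
            self.frames.move_to_end(page_id)    # hit: mark as most recently used
            return self.frames[page_id]
        if len(self.frames) >= self.capacity:
            self.frames.popitem(last=False)     # evict the least recently used page
        self.storage_reads += 1                 # miss: slow transfer from storage
        self.frames[page_id] = self.read_page(page_id)
        return self.frames[page_id]

# Hypothetical "storage": page contents are generated on demand.
pool = BufferPool(capacity=2, read_page=lambda pid: f"contents of page {pid}")
for pid in [1, 2, 1, 3, 1, 2]:
    pool.get(pid)
print(pool.storage_reads)
```

Six page requests cause fewer than six storage reads because repeated requests for a page that is still buffered are served from memory.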
There are two main types of secondary memory
Hard disk drives (HDDs)
▪ The most widely used secondary memory device
▪ Cheap
▪ Relatively unreliable
▪ Much slower than primary memory
Solid state drives (SSDs)
▪ Faster but more expensive than HDDs
▪ Use of SSDs in databases is increasing
Offline storage for database archives
▪ Tertiary storage should have large capacity and low cost
Examples of tertiary storage devices include
▪ Optical drives – CDs and DVDs
▪ Magnetic tape
▪ A very old storage medium that is still used
▪ Tape jukeboxes store catalogued banks of tapes
Persistent memory is non-volatile RAM and is also known as
▪ NVM – non-volatile memory
▪ NVRAM – non-volatile RAM
▪ SCM – storage class memory
Characteristics
▪ Byte-addressable
▪ Persistent
There are different types
▪ Varied speed, capacity and cost
NVDIMM-N
non-volatile dual in-line memory module
▪ DRAM paired with flash with a battery
▪ Similar performance to DRAM
▪ Small capacity and relatively expensive
NVDIMM-F
▪ Flash storage using a DRAM bus
▪ Slower than DRAM and closer to flash performance
▪ Large capacity and cheap
Other technologies
▪ Intel 3D XPoint – Optane DC PM, released in 2019
▪ Large capacity, performance in between DRAM and flash
▪ Other technologies are in development
Main memory is assumed to be much smaller than persistent storage
▪ Transaction processing occurs in main memory
▪ The DB resides in storage
▪ Data must be transferred between main memory and storage
Performance is primarily determined by storage access speed
▪ DBs were traditionally stored on HDDs
▪ And are transitioning to SSDs
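Why storage speed dominates can be seen with a simplified model (an assumption for illustration, not taken from the slides): if a fraction h of accesses is served from main memory and the rest goes to storage, the average access time is h × t_memory + (1 − h) × t_storage. The access times below are the main memory, SSD and HDD values from the latency comparison earlier in this section.

```python
# Access times in ns, taken from the latency comparison earlier in this section.
MAIN_MEMORY_NS = 50
SSD_NS = 20_000
HDD_NS = 5_000_000

def average_access_ns(hit_ratio, storage_ns, memory_ns=MAIN_MEMORY_NS):
    """Weighted average of memory hits and storage misses (simplified model)."""
    return hit_ratio * memory_ns + (1 - hit_ratio) * storage_ns

for hit_ratio in (0.90, 0.99, 0.999):
    hdd = average_access_ns(hit_ratio, HDD_NS)
    ssd = average_access_ns(hit_ratio, SSD_NS)
    print(f"hit ratio {hit_ratio}: HDD {hdd:,.0f} ns, SSD {ssd:,.0f} ns")
```

Under this model, even when 99% of accesses hit main memory, the average with an HDD is about 50,000 ns, roughly a thousand times slower than main memory alone.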
The reduced price of DRAM allows for main memory large enough to store the entire DB
▪ Or at least the working set
Implications
▪ No IO during execution of transactions
▪ Changes must still be made persistent
▪ But this can be performed in the background
▪ Violates many assumptions of classic DBMS
Use of persistent memory is also possible
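As a rough illustration only, the sketch below keeps a SQLite DB entirely in main memory and later copies it to a file using sqlite3.Connection.backup. Copying a snapshot is a stand-in chosen for brevity; main-memory DBMSs typically rely on logging and checkpointing instead, and the file location here is hypothetical.

```python
import os
import sqlite3
import tempfile

# The working copy of the DB lives entirely in main memory.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE customer (cid INTEGER PRIMARY KEY, email TEXT)")
mem.execute("INSERT INTO customer VALUES (123, 'bob@sfu.ca')")
mem.commit()   # the transaction completes without any storage IO

# Later, copy the in-memory state to a file so that it survives a restart.
path = os.path.join(tempfile.mkdtemp(), "snapshot.db")   # hypothetical location
disk = sqlite3.connect(path)
mem.backup(disk)   # sqlite3.Connection.backup, available since Python 3.7
disk.close()
mem.close()

# Reopen the file to confirm the data was made persistent.
check = sqlite3.connect(path)
persisted = check.execute("SELECT cid, email FROM customer").fetchall()
check.close()
print(persisted)
```

Anything committed after the last copy would be lost in a crash, which is why real systems record changes in a log rather than relying on periodic snapshots.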