程序代写代做代考 database ER SQL Databases: Introduction

Databases: Introduction

Introduction Data

Databases: Introduction

P.J. McBrien

Imperial College London

P.J. McBrien (Imperial College London) Databases: Introduction 1 / 23

Introduction Data Models

Databases are Computer Stores of Data!

Tiny Bank Ltd Customer: McBrien, P.
Strand Branch Current Acc: 10000100
Sortcode: 55-66-67
Trans Amount Date
1000 2300.00 5/1/1999
1002 -223.45 8/1/1999
1006 10.23 15/1/1999

Tiny Bank Ltd Customer: McBrien, P.
Strand Branch Deposit Acc: 10000101
Sortcode: 55-66-67
Trans Amount Date
1001 4000.00 5/1/1999
1008 1230.00 15/1/1999

Tiny Bank Ltd Customer: Boyd, M.
Goodge St Branch Current Acc: 10000103
Sortcode: 55-66-34
Trans Amount Date
1005 145.50 12/1/1999

Tiny Bank Ltd Customer: Poulovassilis, A.
Wimbledon BranchCurrent Acc: 10000107
Sortcode: 55-66-56
Trans Amount Date
1004 -100.00 11/1/1999
1007 345.56 15/1/1999

Tiny Bank Ltd Customer: Poulovassilis, A.
Wimbledon BranchDeposit Acc: 10000119
Sortcode: 55-66-56
Trans Amount Date
1009 5600.00 18/1/1999

Tiny Bank Ltd Customer: Bailey, J.
Wimbledon BranchCurrent Acc: 10000125
Sortcode: 55-66-56
Trans Amount Date
No transactions this month

Deposit Rates
AccountRate
101 5.25
119 5.50

P.J. McBrien (Imperial College London) Databases: Introduction 2 / 23

Introduction Data Models

Relational Data Model

Relational Data Model

Roughly: storing data in tables

bank data
no sortcode bname cash type cname rate? mid amount tdate

100 67 Strand 34005.00 current McBrien, P. 1000 2300.00 1999-01-05
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1001 4000.00 1999-01-05
100 67 Strand 34005.00 current McBrien, P. 1002 -223.45 1999-01-08
107 56 Wimbledon 84340.45 current Poulovassilis, A. 1004 -100.00 1999-01-11
103 34 Goodge St 6900.67 current Boyd, M. 1005 145.50 1999-01-12
100 67 Strand 34005.00 current McBrien, P. 1006 10.23 1999-01-15
107 56 Wimbledon 84340.45 current Poulovassilis, A. 1007 345.56 1999-01-15
101 67 Strand 34005.00 deposit McBrien, P. 5.25 1008 1230.00 1999-01-15
119 56 Wimbledon 84340.45 deposit Poulovassilis, A. 5.50 1009 5600.00 1999-01-18

P.J. McBrien (Imperial College London) Databases: Introduction 3 / 23

Introduction Data Models

Database Design: ER Modelling

branchbname

cash

sortcode

account

rate?

cname

type

no

movement

amount

tdate

mid

has
0:N

1:1

holds

1:1

0:N

P.J. McBrien (Imperial College London) Databases: Introduction 4 / 23

Introduction Data Models

Structured Data: Relational Model

branch
sortcode bname cash

56 ’Wimbledon’ 94340.45
34 ’Goodge St’ 8900.67
67 ’Strand’ 34005.00

movement
mid no amount tdate
1000 100 2300.00 5/1/1999
1001 101 4000.00 5/1/1999
1002 100 -223.45 8/1/1999
1004 107 -100.00 11/1/1999
1005 103 145.50 12/1/1999
1006 100 10.23 15/1/1999
1007 107 345.56 15/1/1999
1008 101 1230.00 15/1/1999
1009 119 5600.00 18/1/1999

account
no type cname rate? sortcode

100 ’current’ ’McBrien, P.’ NULL 67
101 ’deposit’ ’McBrien, P.’ 5.25 67
103 ’current’ ’Boyd, M.’ NULL 34
107 ’current’ ’Poulovassilis, A.’ NULL 56
119 ’deposit’ ’Poulovassilis, A.’ 5.50 56
125 ’current’ ’Bailey, J.’ NULL 56

key branch(sortcode)
key branch(bname)
key movement(mid)
key account(no)

movement(no)
fk
⇒ account(no)

account(sortcode)
fk
⇒ branch(sortcode)

P.J. McBrien (Imperial College London) Databases: Introduction 5 / 23

Introduction Data Models

Data Model: CSV

branch.csv
sortcode,bname,cash
56,”Wimbledon”,94340.45
34,”Goodge St”, 8900.67
67,”Strand”,34005.00

account.csv
no,type,cname,rate,sortcode
100,”current”,”McBrien, P.”,,67
101,”deposit”,”McBrien, P.”,5.25,67
103,”current”,”Boyd, M.”,,34
107,”current”,”Poulovassilis, A.”,,56
119,”deposit”,”Poulovassilis, A.”,5.50,56
125,”current”,”Bailey, J.”,,56

movement.csv
mid,no,amount,tdate
1000,100,2300.00,5/1/1999
1001,101,4000.00,5/1/1999
1002,100,-223.45,8/1/1999
1004,107,-100.00,11/1/1999
1005,103,145.50,12/1/1999
1006,100,10.23,15/1/1999
1007,107,345.56,15/1/1999
1008,101,1230.00,15/1/1999
1009,119,5600.00,18/1/1999

P.J. McBrien (Imperial College London) Databases: Introduction 6 / 23

Introduction Data Models

Semistructured Data: XML

〈bank〉
〈branch sortcode=“67” bname=“Strand” cash=“34005.00”〉

〈account no=“100” type=“current” cname=“McBrien, P.”〉
〈movement mid=“1000” amount=“2300.00” tdate=“5/1/1999”/〉
〈movement mid=“1002” amount=“-223.45” tdate=“8/1/1999”/〉
〈movement mid=“1006” amount=“10.23” tdate=“15/1/1999”/〉

〈/account〉
〈account no=“101” type=“deposit” cname=“McBrien, P.” rate=“5.25”〉

〈movement mid=“1001” amount=“4000.00” tdate=“5/1/1999”/〉
〈movement mid=“1008” amount=“1230.00” tdate=“15/1/1999”/〉

〈/account〉
〈/branch〉

〈/bank〉

P.J. McBrien (Imperial College London) Databases: Introduction 7 / 23

Introduction Data Models

SQL DDL: Implementation of the Relational Model

CREATE TABLE branch
( sortcode INTEGER NOT NULL,
bname VARCHAR(20) NOT NULL,
cash DECIMAL(10,2) NOT NULL,
CONSTRAINT branch pk PRIMARY KEY (sortcode)

)

CREATE UNIQUE INDEX branch bname idx
ON branch(bname)

CREATE TABLE account
( no INTEGER NOT NULL,
type CHAR(8) NOT NULL,
cname VARCHAR(20) NOT NULL,
rate DECIMAL(4,2) NULL,
sortcode INTEGER NOT NULL,
CONSTRAINT account pk

PRIMARY KEY (no),
CONSTRAINT account fk

FOREIGN KEY (sortcode) REFERENCES branch
)

CREATE INDEX account type idx ON account(type)

CREATE TABLE movement
( mid INTEGER NOT NULL,
no INTEGER NOT NULL,
amount DECIMAL(10,2) NOT NULL,
tdate DATETIME NOT NULL,
CONSTRAINT movement pk

PRIMARY KEY (mid),
CONSTRAINT movement fk

FOREIGN KEY (no) REFERENCES account
)

P.J. McBrien (Imperial College London) Databases: Introduction 8 / 23

Introduction Data Models

SQL DML: Implementation of the Relational Algebra

Basic SQL SELECT statements

SELECT no , cname , r a t e
FROM account
WHERE type=’ d e p o s i t ’

SQL Joins

SELECT bname , no , r a t e
FROM branch JOIN account USING ( so r t code )
WHERE type=’ d e p o s i t ’

Same as

SELECT bname , no , r a t e
FROM account JOIN branch ON branch . s o r t c ode=account . s o r t c ode
WHERE type=’ d e p o s i t ’

Same as

SELECT bname , no , r a t e
FROM account , branch
WHERE branch . s o r t c ode=account . s o r t c ode
AND type=’ d e p o s i t ’

P.J. McBrien (Imperial College London) Databases: Introduction 9 / 23

Introduction Data Models

RDBMS Products

Product SQL Language Company
DB2 SQL PL IBM
Oracle PL/SQL Oracle
Sybase Transact-SQL SAP
SQLServer Transact-SQL Microsoft
PostgreSQL PL/pgSQL Open Source
MySQL MySQL Open Source (Oracle)

All partially implement ANSI SQL

P.J. McBrien (Imperial College London) Databases: Introduction 10 / 23

Transactions

Transactions

BEGIN TRANSACTION
UPDATE branch
SET cash=cash-10000.00
WHERE sortcode=56

UPDATE branch
SET cash=cash+10000.00
WHERE sortcode=34

COMMIT TRANSACTION

database management systems

(DBMS) implements indivisible tasks
called transactions

The ACID Properties

Atomicity all or nothing

Consistency consistent before → consistent after

Isolation independent of any other transaction

Durability completed transaction are durable

P.J. McBrien (Imperial College London) Databases: Introduction 11 / 23

Transactions

Transaction Properties: Atomicity

BEGIN TRANSACTION
UPDATE branch
SET cash=cash −10000.00
WHERE so r t c od e=56

CRASH

Failure to maintain Atomicity

Suppose that the system crashes half way through processing a cash transfer, and the
first part of the transfer has been written to disc

The database on disc is left in an inconsistent state: the sum of cash should be
£137,246.12 but only £127,246.12 recorded

A DBMS implementing Atomicity of transactions would on restart undo the
change to branch 56

P.J. McBrien (Imperial College London) Databases: Introduction 12 / 23

Transactions

Transaction Properties: Consistency

BEGIN TRANSACTION
DELETE FROM branch
WHERE so r t c od e=56

INSERT INTO account
VALUES (100 , ’ Smith , J ’ , ’ d e p o s i t ’ , 5 . 0 0 , 34 )

END TRANSACTION

Failure to maintain Consistency

Suppose that a user deletes branch with sortcode 56, and inserts a desposit account
number 100 for John Smith at branch sortcode 34

The database is left in an inconsistent state for two reasons

it has three accounts recorded for a branch that appears not to exist, and
it has two records for account number 100, with different details for the account

A DBMS implementing Consistency of transactions would forbid both of these
changes to the database

P.J. McBrien (Imperial College London) Databases: Introduction 13 / 23

Transactions

Transaction Properties: Isolation

BEGIN TRANSACTION
UPDATE branch
SET cash=cash −10000.00
WHERE so r t c od e=56

UPDATE branch
SET cash=cash +10000.00
WHERE so r t c od e=34

END TRANSACTION

BEGIN TRANSACTION

SELECT SUM( cash ) AS n e t c a sh
FROM branch

END TRANSACTION

Failure to maintain Isolation

Suppose that the system sums the cash in the bank in one transaction, half way
through processing a cash transfer in another transaction

The result of the summation of cash in the bank erroneously reports
£127,246.12, whereas the movement of cash always leaves a total of £137,246.12

A DBMS implementing Isolation of transactions ensures that transactions
always report results based on the values of committed transactions

P.J. McBrien (Imperial College London) Databases: Introduction 14 / 23

Transactions

Transaction Properties: Durability

BEGIN TRANSACTION
UPDATE branch
SET cash=cash −10000.00
WHERE so r t c od e=56

UPDATE branch
SET cash=cash+10000.00
WHERE so r t c od e=34

END TRANSACTION

CRASH

Failure to maintain Durability

Suppose that the system crashes after informing the user that it has committed the
transfer of cash, but has not yet written to disc the update to branch 34

The database on disc is left in an inconsistent state, with £10,000 ‘missing’

A DBMS implementing Durability of transactions would on restart complete
the change to branch 34 (or alternatively never inform a user of commitment
with writing the results to disc).

P.J. McBrien (Imperial College London) Databases: Introduction 15 / 23

DBMS Architecture

DBMS Architecture

log

disc

data manager

buffer
manager

memory

recovery manager

scheduler

transaction manager

read

write❄✻
read

write

read
write❄

flush
fetch ❄

read
write

begin

abort
commit❄

execute

result
reject

delay

Transaction Manager

BEGIN TRANSACTION
UPDATE branch
SET cash=cash −10000.00
WHERE so r t c od e=56

UPDATE branch
SET cash=cash +10000.00
WHERE so r t c od e=34

END TRANSACTION

r1[b56], w1[b56], r1[b34], w1[b34]

P.J. McBrien (Imperial College London) Databases: Introduction 16 / 23

DBMS Architecture

DBMS Architecture

log

disc

data manager

buffer
manager

memory

recovery manager

scheduler

transaction manager

read

write❄✻
read

write

read
write❄

flush
fetch ❄

read
write

begin

abort
commit❄

execute

result
reject

delay

Transaction Manager

BEGIN TRANSACTION
SELECT SUM( cash ) AS n e t c a sh
FROM branch

END TRANSACTION

r2[b56], r2[b34], r2[b67]

P.J. McBrien (Imperial College London) Databases: Introduction 16 / 23

DBMS Architecture

DBMS Architecture

log

disc

data manager

buffer
manager

memory

recovery manager

scheduler

transaction manager

read

write❄✻
read

write

read
write❄

flush
fetch ❄

read
write

begin

abort
commit❄

execute

result
reject

delay

Scheduler

execute(r1[b56], w1[b56], r1[b34], w1[b34])
execute(r2[b56], r2[b34], r2[b67])

b1, r1[b56], w1[b56], b2, r2[b56]
r1[b34], w1[b34], c1, r2[b34], r2[b67], c2

P.J. McBrien (Imperial College London) Databases: Introduction 16 / 23

DBMS Architecture

DBMS Architecture

log

disc

data manager

buffer
manager

memory

recovery manager

scheduler

transaction manager

read

write❄✻
read

write

read
write❄

flush
fetch ❄

read
write

begin

abort
commit❄

execute

result
reject

delay

Data Manager

b1, r1[b56], w1[b56], b2,
r2[b56]
r1[b34], w1[b34],
c1, r2[b34],
r2[b67], c2

Say P1 = [b34], P2 = [b56, b67]
Mr(P2), Br(P2), Dr(P2),Mw(P2), Lw(P2),
Mr(P2),
Mr(P1), Br(P1), Dr(P1),Mw(P1), Lw(P1),
Lw(c1),Mr(P1),
Mr(P2), Lw(c2), Dw(P1), Dw(P2)

P.J. McBrien (Imperial College London) Databases: Introduction 16 / 23

ANSI/SPARC

ANSI/SPARC Model

DB →

ր

ց

external
schema1

. . . external
scheman

conceptual
schema

internal
schema

❑ ✕

transform

filter

ANSI/SPARC model views three levels of abstractions

schema means structure of the database

P.J. McBrien (Imperial College London) Databases: Introduction 17 / 23

ANSI/SPARC

ANSI/SPARC Model (Internal Schema)

external
schema1

. . . external
scheman

conceptual
schema

internal
schema

❑ ✕

2k page size

B-tree index

Strings end with char 0

Describes the physical layout of data

P.J. McBrien (Imperial College London) Databases: Introduction 18 / 23

ANSI/SPARC

ANSI/SPARC Model (Conceptual Schema)

external
schema1

. . . external
scheman

conceptual
schema

internal
schema

❑ ✕

CREATE TABLE branch
( sortcode INTEGER,

bname VARCHAR(20),
cash DECIMAL(10,2)

)

SELECT sortcode
FROM branch
WHERE cash<0 defined in data definition language (DDL) queried using data manipulation language (DML) controlled by database administrator (DBA) P.J. McBrien (Imperial College London) Databases: Introduction 19 / 23 ANSI/SPARC ANSI/SPARC Model (External Schema) external schema1 . . . external scheman conceptual schema internal schema ✻ ❑ ✕ ✲ CREATE VIEW bust SELECT sortcode FROM branch WHERE cash<0 Define a schema for a particular user/application P.J. McBrien (Imperial College London) Databases: Introduction 20 / 23 Course Format Course Format Schedule Three hours combined lectures/tutorials per week, running into week 10 Coursework that helps you prepare for the exam May Exam Books Several good text books on the market. Some that will also cover material in more advanced courses are: Fundamentals of Database Systems, 6th Ed, Elmasri and Navathe, Addison Wesley Database Systems: The Complete Book, 2nd Ed, Garcia-Molina, Ullman and Widom, Pearson Database Systems, 5th Ed, Connolly and Begg, Addison Wesley P.J. McBrien (Imperial College London) Databases: Introduction 21 / 23 Course Format Course Resources Course Web Site http://www.doc.ic.ac.uk/~pjm/db/ Lecture slides Example Databases Postgres SQL Server Resources CATe course work handout and submission Piazza discussion forum email course email list If you are not on Level 2 on CATe, nothing works! P.J. McBrien (Imperial College London) Databases: Introduction 22 / 23 http://www.doc.ic.ac.uk/~pjm/db/ Course Format Course Content Conceptual Layer: Relational Algebra SQL Datalog Conceptual Layer: Relational Data Model Properties of a ‘good’ schema: keys and normalisation Database design using ER models Physical Layer: Transaction Processing Serialisability Recovery and Checkpointing P.J. McBrien (Imperial College London) Databases: Introduction 23 / 23 Introduction Data Data Models Transactions DBMS Architecture ANSI/SPARC Course Format