2021/4/28 Relational Operations
Relational Operations
DBMS Architecture (revisited) Relational Operations
Cost Models
Query Types
>>
COMP9315 21T1 ♢ Relational Operations ♢ [0/11]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
1/13
2021/4/28 Relational Operations
❖ DBMS Architecture (revisited) Implementation of relational operations in DBMS:
∧ >>
COMP9315 21T1 ♢ Relational Operations ♢ [1/11]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
2/13
2021/4/28 Relational Operations
<< ∧ >>
❖ Relational Operations
DBMS core = relational engine, with implementations of
selection, projection, join, setoperations scanning, sorting, grouping, aggregation, …
In this part of the course:
examine methods for implementing each operation develop cost models for each implementation characterise when each method is most effective
Terminology reminder:
tuple = collection of data values under some schema ≅ record
page = block = collection of tuples + management data = i/o unit
relation = table ≅ le = collection of tuples COMP9315 21T1 ♢ Relational Operations ♢ [2/11]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
3/13
2021/4/28 Relational Operations
<< ∧ >>
❖ Relational Operations (cont)
In order to implement relational operations the low-levels
of the system provides:
Relation openRel(db,name)
get handle on relation name in database db Page request_page(rel,pid)
get page pid from relation rel, return buffer containing page
Record get_record(buf,tid)
return record tid from page buf Tuple mkTuple(rel,rec)
convert record rec to a tuple, based on rel schema COMP9315 21T1 ♢ Relational Operations ♢ [3/11]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
4/13
2021/4/28 Relational Operations
❖ Relational Operations (cont) Example of using low-level functions
// scan a relation Emps
Page p; // current page
Tuple t; // current tuple
Relation r = relOpen(db,”Emps”);
for (int i = 0; i < nPages(r); i++) {
p = request_page(rel,i);
for (int j = 0; j < nRecs(p); j++)
t = mkTuple(r, get_record(p,j));
... process tuple t ...
} }
COMP9315 21T1 ♢ Relational Operations ♢ [4/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
5/13
2021/4/28 Relational Operations
❖ Relational Operations (cont) Two “dimensions of variation”:
whichrelationaloperation (e.g.Sel,Proj,Join,Sort,…) whichaccess-method (e.g. lestruct:heap,indexed,hashed,…)
Each query method involves an operator and a le structure:
e.g. primary-key selection on hashed le
e.g. primary-key selection on indexed le
e.g. join on ordered heap les (sort-merge join)
e.g. join on hashed les (hash join)
e.g. two-dimensional range query on R-tree indexed le
We are interested in cost of query methods (and insert/delete operations)
COMP9315 21T1 ♢ Relational Operations ♢ [5/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
6/13
2021/4/28 Relational Operations
❖ Relational Operations (cont) SQL vs DBMS engine
select … from R where C
nd relevant tuples (satisfying C) in le(s) of R
insert into R values(…)
place new tuple in some page of a le of R
delete from R where C
nd relevant tuples and “remove” from le(s) of R
update R set … where C
nd relevant tuples in le(s) of R and “change” them
COMP9315 21T1 ♢ Relational Operations ♢ [6/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
7/13
2021/4/28 Relational Operations
❖ Cost Models
An important aspect of this course is analysis of cost of various query methods
Cost can be measured in terms of
Time Cost: total time taken to execute method, or Page Cost: number of pages read and/or written
Primary assumptions in our cost models:
memory (RAM) is “small”, fast, byte-at-a-time disk storage is very large, slow, page-at-a-time
COMP9315 21T1 ♢ Relational Operations ♢ [7/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
8/13
2021/4/28 Relational Operations
❖ Cost Models (cont)
Since time cost is affected by many factors
speed of i/o devices (fast/slow disk, SSD) load on machine
we do not consider time cost in our analyses.
For comparing methods, page cost is better
identi es workload imposed by method BUT is clearly affected by buffering
Estimating costs with multiple concurrent ops and buffering is dif cult!!
Addtional assumption: every page request leads to some i/o
COMP9315 21T1 ♢ Relational Operations ♢ [8/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
9/13
2021/4/28 Relational Operations
❖ Cost Models (cont)
In developing cost models, we also assume:
a relation is a set of r tuples, with average tuple size R bytes
the tuples are stored in b data pages on disk
each page has size B bytes and contains up to c tuples
the tuples which answer query q are contained in bq pages
data is transferred disk↔memory in whole pages cost of disk↔memory transfer Tr/w is very high
COMP9315 21T1 ♢ Relational Operations ♢ [9/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
10/13
2021/4/28 Relational Operations
❖ Cost Models (cont)
Our cost models are “rough” (based on assumptions)
But do give an O(x) feel for how expensive operations are.
Example “rough” estimation: how many piano tuners in Sydney?
Sydney has ≅ 4 000 000 people
Average household size ≅ 3 ∴ 1 300 000 households
Let’s say that 1 in 10 households owns a piano
Therefore there are ≅ 130 000 pianos
Say people get their piano tuned every 2 years (on average) Say a tuner can do 2/day, 250 working-days/year
Therefore 1 tuner can do 500 pianos per year
Therefore Sydney would need ≅ 130000/2/500 = 130 tuners
Actual number of tuners in Yellow Pages = 120
Example borrowed from Alan Fekete at Sydney University.
COMP9315 21T1 ♢ Relational Operations ♢ [10/11]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
11/13
2021/4/28 Relational Operations
❖ Query Types
Type SQL
Scan select * from R
Proj select x,y from R
Sort select * from R order by x
Sel1 select * from R where id = k
Seln select * from R where a = k
Join1 select * from R,S where R.id = S.r
RelAlg a.k.a. R – Proj[x,y]R – Sort[x]R ord
Sel[id=k]R one Sel[a=k]R –
R Join[id=r] S –
<< ∧
Different query classes exhibit different query processing behaviours.
COMP9315 21T1 ♢ Relational Operations ♢ [11/11]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
12/13
2021/4/28 Relational Operations
Produced: 27 Feb 2021
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/relops/slides.html
13/13