>>
Relational Operations
• DBMS Architecture (revisited)
• Relational Operations
• Cost Models
• Query Types
COMP9315 21T1 ♢ Relational Operations ♢ [0/11]
∧ >>
❖ DBMS Architecture (revisited)
Implementation of relational operations in DBMS:

COMP9315 21T1 ♢ Relational Operations ♢ [1/11]
<< ∧ >>
❖ Relational Operations
DBMS core = relational engine, with implementations of
• selection, projection, join, set operations
• scanning, sorting, grouping, aggregation, …
In this part of the course:
• examine methods for implementing each operation
• develop cost models for each implementation
• characterise when each method is most effective
Terminology reminder:
• tuple = collection of data values under some schema ≅ record
• page = block = collection of tuples + management data = i/o unit
• relation = table ≅ file = collection of tuples
COMP9315 21T1 ♢ Relational Operations ♢ [2/11]
<< ∧ >>
❖ Relational Operations (cont)
In order to implement relational operations the low-levels of the system provides:
• Relation openRel(db,name)
◦ get handle on relation name in database db
• Page request_page(rel,pid)
◦ get page pid from relation rel, return buffer containing page
• Record get_record(buf,tid)
◦ return record tid from page buf
• Tuple mkTuple(rel,rec)
◦ convert record rec to a tuple, based on rel schema
COMP9315 21T1 ♢ Relational Operations ♢ [3/11]
<< ∧ >>
❖ Relational Operations (cont)
Example of using low-level functions
// scan a relation Emps
Page p; // current page
Tuple t; // current tuple
Relation r = relOpen(db,”Emps”);
for (int i = 0; i < nPages(r); i++) {
p = request_page(rel,i);
for (int j = 0; j < nRecs(p); j++)
t = mkTuple(r, get_record(p,j));
... process tuple t ...
}
}
COMP9315 21T1 ♢ Relational Operations ♢ [4/11]
<< ∧ >>
❖ Relational Operations (cont)
Two “dimensions of variation”:
• which relational operation (e.g. Sel, Proj, Join, Sort, …)
• which access-method (e.g. file struct: heap, indexed, hashed, …)
Each query method involves an operator and a file structure:
• e.g. primary-key selection on hashed file
• e.g. primary-key selection on indexed file
• e.g. join on ordered heap files (sort-merge join)
• e.g. join on hashed files (hash join)
• e.g. two-dimensional range query on R-tree indexed file
We are interested in cost of query methods (and insert/delete operations)
COMP9315 21T1 ♢ Relational Operations ♢ [5/11]
<< ∧ >>
❖ Relational Operations (cont)
SQL vs DBMS engine
• select … from R where C
◦ find relevant tuples (satisfying C) in file(s) of R
• insert into R values(…)
◦ place new tuple in some page of a file of R
• delete from R where C
◦ find relevant tuples and “remove” from file(s) of R
• update R set … where C
◦ find relevant tuples in file(s) of R and “change” them
COMP9315 21T1 ♢ Relational Operations ♢ [6/11]
<< ∧ >>
❖ Cost Models
An important aspect of this course is
• analysis of cost of various query methods
Cost can be measured in terms of
• Time Cost: total time taken to execute method, or
• Page Cost: number of pages read and/or written
Primary assumptions in our cost models:
• memory (RAM) is “small”, fast, byte-at-a-time
• disk storage is very large, slow, page-at-a-time
COMP9315 21T1 ♢ Relational Operations ♢ [7/11]
<< ∧ >>
❖ Cost Models (cont)
Since time cost is affected by many factors
• speed of i/o devices (fast/slow disk, SSD)
• load on machine
we do not consider time cost in our analyses.
For comparing methods, page cost is better
• identifies workload imposed by method
• BUT is clearly affected by buffering
Estimating costs with multiple concurrent ops and buffering is difficult!!
Addtional assumption: every page request leads to some i/o
COMP9315 21T1 ♢ Relational Operations ♢ [8/11]
<< ∧ >>
❖ Cost Models (cont)
In developing cost models, we also assume:
• a relation is a set of r tuples, with average tuple size R bytes
• the tuples are stored in b data pages on disk
• each page has size B bytes and contains up to c tuples
• the tuples which answer query q are contained in bq pages
• data is transferred disk↔memory in whole pages
• cost of disk↔memory transfer Tr/w is very high

COMP9315 21T1 ♢ Relational Operations ♢ [9/11]
<< ∧ >>
❖ Cost Models (cont)
Our cost models are “rough” (based on assumptions)
But do give an O(x) feel for how expensive operations are.
Example “rough” estimation: how many piano tuners in Sydney?
• Sydney has ≅ 4 000 000 people
• Average household size ≅ 3 ∴ 1 300 000 households
• Let’s say that 1 in 10 households owns a piano
• Therefore there are ≅ 130 000 pianos
• Say people get their piano tuned every 2 years (on average)
• Say a tuner can do 2/day, 250 working-days/year
• Therefore 1 tuner can do 500 pianos per year
• Therefore Sydney would need ≅ 130000/2/500 = 130 tuners
Actual number of tuners in Yellow Pages = 120
Example borrowed from Alan Fekete at Sydney University.
COMP9315 21T1 ♢ Relational Operations ♢ [10/11]
<< ∧
❖ Query Types
Type
SQL
RelAlg
a.k.a.
Scan
select * from R
R
-
Proj
select x,y from R
Proj[x,y]R
-
Sort
select * from R
order by x
Sort[x]R
ord
Sel1
select * from R
where id = k
Sel[id=k]R
one
Seln
select * from R
where a = k
Sel[a=k]R
-
Join1
select * from R,S
where R.id = S.r
R Join[id=r] S
-
Different query classes exhibit different query processing behaviours.
COMP9315 21T1 ♢ Relational Operations ♢ [11/11]
Produced: 27 Feb 2021