CS计算机代考程序代写 SQL database >>

>>
Relational Operations

• DBMS Architecture (revisited)
• Relational Operations
• Cost Models
• Query Types
COMP9315 21T1 ♢ Relational Operations ♢ [0/11]
∧ >>
❖ DBMS Architecture (revisited)

Implementation of relational operations in DBMS:

COMP9315 21T1 ♢ Relational Operations ♢ [1/11]
<< ∧ >>
❖ Relational Operations

DBMS core = relational engine, with implementations of
• selection,   projection,   join,   set operations
• scanning,   sorting,   grouping,   aggregation,   …
In this part of the course:
• examine methods for implementing each operation
• develop cost models for each implementation
• characterise when each method is most effective
Terminology reminder:
• tuple = collection of data values under some schema ≅ record
• page = block = collection of tuples + management data = i/o unit
• relation = table ≅ file = collection of tuples
COMP9315 21T1 ♢ Relational Operations ♢ [2/11]
<< ∧ >>
❖ Relational Operations (cont)

In order to implement relational operations the low-levels of the system provides:
• Relation openRel(db,name)
◦ get handle on relation name in database db
• Page request_page(rel,pid)
◦ get page pid from relation rel, return buffer containing page
• Record get_record(buf,tid)
◦ return record tid from page buf
• Tuple mkTuple(rel,rec)
◦ convert record rec to a tuple, based on rel schema
COMP9315 21T1 ♢ Relational Operations ♢ [3/11]
<< ∧ >>
❖ Relational Operations (cont)

Example of using low-level functions

// scan a relation Emps
Page p; // current page
Tuple t; // current tuple
Relation r = relOpen(db,”Emps”);
for (int i = 0; i < nPages(r); i++) { p = request_page(rel,i); for (int j = 0; j < nRecs(p); j++) t = mkTuple(r, get_record(p,j)); ... process tuple t ... } } COMP9315 21T1 ♢ Relational Operations ♢ [4/11] << ∧ >>
❖ Relational Operations (cont)

Two “dimensions of variation”:
• which relational operation   (e.g. Sel, Proj, Join, Sort, …)
• which access-method   (e.g. file struct: heap, indexed, hashed, …)
Each query method involves an operator and a file structure:
• e.g. primary-key selection on hashed file
• e.g. primary-key selection on indexed file
• e.g. join on ordered heap files (sort-merge join)
• e.g. join on hashed files (hash join)
• e.g. two-dimensional range query on R-tree indexed file
We are interested in cost  of query methods (and insert/delete operations)
COMP9315 21T1 ♢ Relational Operations ♢ [5/11]
<< ∧ >>
❖ Relational Operations (cont)

SQL vs DBMS engine
• select … from R where C
◦ find relevant tuples (satisfying C) in file(s) of R
• insert into R values(…)
◦ place new tuple in some page of a file of R
• delete from R where C
◦ find relevant tuples and “remove” from file(s) of R
• update R set … where C
◦ find relevant tuples in file(s) of R and “change” them
COMP9315 21T1 ♢ Relational Operations ♢ [6/11]
<< ∧ >>
❖ Cost Models

An important aspect of this course is
• analysis of cost of various query methods
Cost can be measured in terms of
• Time Cost: total time taken to execute method, or
• Page Cost: number of pages read and/or written
Primary assumptions in our cost models:
• memory (RAM) is “small”, fast, byte-at-a-time
• disk storage is very large, slow, page-at-a-time
COMP9315 21T1 ♢ Relational Operations ♢ [7/11]
<< ∧ >>
❖ Cost Models (cont)

Since time cost is affected by many factors
• speed of i/o devices (fast/slow disk, SSD)
• load on machine
we do not consider time cost in our analyses.
For comparing methods, page cost is better
• identifies workload imposed by method
• BUT is clearly affected by buffering
Estimating costs with multiple concurrent ops and buffering is difficult!!
Addtional assumption: every page request leads to some i/o
COMP9315 21T1 ♢ Relational Operations ♢ [8/11]
<< ∧ >>
❖ Cost Models (cont)

In developing cost models, we also assume:
• a relation is a set of r tuples, with average tuple size R bytes
• the tuples are stored in b data pages on disk
• each page has size B bytes and contains up to c tuples
• the tuples which answer query q are contained in bq pages
• data is transferred disk↔memory in whole pages
• cost of disk↔memory transfer Tr/w is very high

COMP9315 21T1 ♢ Relational Operations ♢ [9/11]
<< ∧ >>
❖ Cost Models (cont)

Our cost models are “rough” (based on assumptions)
But do give an O(x) feel for how expensive operations are.
Example “rough” estimation: how many piano tuners in Sydney?
• Sydney has ≅ 4 000 000 people
• Average household size ≅ 3 ∴ 1 300 000 households
• Let’s say that 1 in 10 households owns a piano
• Therefore there are ≅ 130 000 pianos
• Say people get their piano tuned every 2 years (on average)
• Say a tuner can do 2/day, 250 working-days/year
• Therefore 1 tuner can do 500 pianos per year
• Therefore Sydney would need ≅ 130000/2/500 = 130 tuners
Actual number of tuners in Yellow Pages = 120
Example borrowed from Alan Fekete at Sydney University.
COMP9315 21T1 ♢ Relational Operations ♢ [10/11]
<< ∧ ❖ Query Types Type SQL RelAlg a.k.a. Scan select * from R R - Proj select x,y from R Proj[x,y]R - Sort select * from R order by x Sort[x]R ord Sel1 select * from R where id = k Sel[id=k]R one Seln select * from R where a = k Sel[a=k]R - Join1 select * from R,S where R.id = S.r R Join[id=r] S - Different query classes exhibit different query processing behaviours. COMP9315 21T1 ♢ Relational Operations ♢ [11/11] Produced: 27 Feb 2021