CS计算机代考程序代写 SQL database Relational Operations

Relational Operations

>>
Relational Operations

DBMS Architecture (revisited)

Relational Operations

Cost Models

Query Types

COMP9315 21T1 ♢ Relational Operations ♢ [0/11]

∧ >>
❖ DBMS Architecture (revisited)

Implementation of relational operations in DBMS:

COMP9315 21T1 ♢ Relational Operations ♢ [1/11]

<< ∧ >>
❖ Relational Operations

DBMS core = relational engine, with implementations of

selection,   projection,   join,   set operations

scanning,   sorting,   grouping,   aggregation,   …

In this part of the course:
examine methods for implementing each operation

develop cost models for each implementation

characterise when each method is most effective

Terminology reminder:
tuple = collection of data values under some schema ≅ record

page = block = collection of tuples + management data = i/o unit

relation = table ≅ file = collection of tuples

COMP9315 21T1 ♢ Relational Operations ♢ [2/11]

<< ∧ >>
❖ Relational Operations (cont)

In order to implement relational operations the low-levels of the system provides:

Relation openRel(db,name)
get handle on relation name in database db

Page request_page(rel,pid)
get page pid from relation rel, return buffer containing page

Record get_record(buf,tid)
return record tid from page buf

Tuple mkTuple(rel,rec)
convert record rec to a tuple, based on rel schema

COMP9315 21T1 ♢ Relational Operations ♢ [3/11]

<< ∧ >>
❖ Relational Operations (cont)

Example of using low-level functions

// scan a relation Emps
Page p; // current page
Tuple t; // current tuple
Relation r = relOpen(db,”Emps”);
for (int i = 0; i < nPages(r); i++) { p = request_page(rel,i); for (int j = 0; j < nRecs(p); j++) t = mkTuple(r, get_record(p,j)); ... process tuple t ... } } COMP9315 21T1 ♢ Relational Operations ♢ [4/11] << ∧ >>
❖ Relational Operations (cont)

Two “dimensions of variation”:

which relational operation   (e.g. Sel, Proj, Join, Sort, …)

which access-method   (e.g. file struct: heap, indexed, hashed, …)

Each query method involves an operator and a file structure:
e.g. primary-key selection on hashed file

e.g. primary-key selection on indexed file

e.g. join on ordered heap files (sort-merge join)

e.g. join on hashed files (hash join)

e.g. two-dimensional range query on R-tree indexed file

We are interested in cost  of query methods (and insert/delete operations)

COMP9315 21T1 ♢ Relational Operations ♢ [5/11]

<< ∧ >>
❖ Relational Operations (cont)

SQL vs DBMS engine

select … from R where C
find relevant tuples (satisfying C) in file(s) of R

insert into R values(…)
place new tuple in some page of a file of R

delete from R where C
find relevant tuples and “remove” from file(s) of R

update R set … where C
find relevant tuples in file(s) of R and “change” them

COMP9315 21T1 ♢ Relational Operations ♢ [6/11]

<< ∧ >>
❖ Cost Models

An important aspect of this course is

analysis of cost of various query methods

Cost can be measured in terms of
Time Cost: total time taken to execute method, or

Page Cost: number of pages read and/or written

Primary assumptions in our cost models:
memory (RAM) is “small”, fast, byte-at-a-time

disk storage is very large, slow, page-at-a-time

COMP9315 21T1 ♢ Relational Operations ♢ [7/11]

<< ∧ >>
❖ Cost Models (cont)

Since time cost is affected by many factors

speed of i/o devices (fast/slow disk, SSD)

load on machine

we do not consider time cost in our analyses.

For comparing methods, page cost is better

identifies workload imposed by method

BUT is clearly affected by buffering

Estimating costs with multiple concurrent ops and buffering is difficult!!

Addtional assumption: every page request leads to some i/o

COMP9315 21T1 ♢ Relational Operations ♢ [8/11]

<< ∧ >>
❖ Cost Models (cont)

In developing cost models, we also assume:

a relation is a set of r tuples, with average tuple size R bytes

the tuples are stored in b data pages on disk

each page has size B bytes and contains up to c tuples

the tuples which answer query q are contained in bq pages

data is transferred disk↔memory in whole pages

cost of disk↔memory transfer Tr/w is very high

COMP9315 21T1 ♢ Relational Operations ♢ [9/11]

<< ∧ >>
❖ Cost Models (cont)

Our cost models are “rough” (based on assumptions)

But do give an O(x) feel for how expensive operations are.

Example “rough” estimation: how many piano tuners in Sydney?

Sydney has ≅ 4 000 000 people

Average household size ≅ 3 ∴ 1 300 000 households

Let’s say that 1 in 10 households owns a piano

Therefore there are ≅ 130 000 pianos

Say people get their piano tuned every 2 years (on average)

Say a tuner can do 2/day, 250 working-days/year

Therefore 1 tuner can do 500 pianos per year

Therefore Sydney would need ≅ 130000/2/500 = 130 tuners

Actual number of tuners in Yellow Pages = 120

Example borrowed from Alan Fekete at Sydney University.

COMP9315 21T1 ♢ Relational Operations ♢ [10/11]

<< ∧ ❖ Query Types Type SQL RelAlg a.k.a. Scan select * from R R - Proj select x,y from R Proj[x,y]R - Sort select * from R order by x Sort[x]R ord Sel1 select * from R where id = k Sel[id=k]R one Seln select * from R where a = k Sel[a=k]R - Join1 select * from R,S where R.id = S.r R Join[id=r] S - Different query classes exhibit different query processing behaviours. COMP9315 21T1 ♢ Relational Operations ♢ [11/11] Produced: 27 Feb 2021