CS计算机代考程序代写 database data structure SQL 2021/4/28 Scanning

2021/4/28 Scanning
Scanning
Scanning
Selection via Scanning
Iterators
Example Query
next_tuple() Function Relation Copying
Scanning in PostgreSQL Scanning in other File Structures
>>
COMP9315 21T1 ♢ Scanning ♢ [0/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
1/18

2021/4/28 Scanning
❖ Scanning
Consider executing the query:
select * from Rel;
where the relation has a le structure like:
This would done by a simple scan of all records/tuples.
∧ >>
COMP9315 21T1 ♢ Scanning ♢ [1/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
2/18

2021/4/28 Scanning
❖ Scanning (cont)
Abstract view of how the scan might be implemented:
for each tuple T in relation Rel {
add tuple T to result set
}
Operational view:
for each page P in file of relation Rel {
for each tuple T in page P {
add tuple T to result set
}
}
Cost = read every data page once = b
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [2/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
3/18

2021/4/28 Scanning
❖ Scanning (cont)
Consider a le with overow pages, e.g.
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [3/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
4/18

2021/4/28 Scanning
❖ Scanning (cont)
In this case, the implementation changes to:
for each page P in data file of relation Rel {
for each tuple t in page P {
add tuple t to result set
}
for each overflow page V of page P {
for each tuple t in page V {
add tuple t to result set
}}}
Cost: read each data page and each overow page once Cost=b+bOv
where bOv = total number of overow pages
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [4/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
5/18

2021/4/28 Scanning
❖ Selection via Scanning
Consider a one query like:
select * from Employee where id = 762288;
In an unordered le, search for matching tuple requires:
<< ∧ >>
Guaranteed at most one answer; but could be in any page.
COMP9315 21T1 ♢ Scanning ♢ [5/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
6/18

2021/4/28 Scanning
❖ Selection via Scanning (cont) Overview of scan process:
for each page P in relation Employee {
for each tuple t in page P {
if (t.id == 762288) return t }}
Cost analysis for one searching in unordered le
best case: read one page, nd tuple
worst case: read all b pages, nd in last (or don’t nd) average case: read half of the pages (b/2)
Page Costs: Costavg = b/2 Costmin = 1 Costmax = b
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [6/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
7/18

2021/4/28 Scanning
❖ Iterators
Access methods typically involve iterators, e.g.
Scan s = start_scan(Relation r, …)
commence a scan of relation r
Scan may include condition to implement WHERE-clause Scan holds data on progress through le (e.g. current page)
Tuple next_tuple(Scan s)
return Tuple immediately following last accessed one returns NULL if no more Tuples left in the relation
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [7/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
8/18

2021/4/28 Scanning
❖ Example Query
Example: simple scan of a table …
select name from Employee
implemented as:
DB db = openDatabase(“myDB”);
Relation r = openRelation(db,”Employee”,READ); Scan s = start_scan(r);
Tuple t; // current tuple
while ((t = next_tuple(s)) != NULL) {
char *name = getStrField(t,2);
printf(“%s\n”, name);
}
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [8/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
9/18

2021/4/28 Scanning
❖ next_tuple() Function
Consider the following possible Scan data structure
typedef ScanData *Scan;
typedef struct {
Relation rel;
Page *page; // Page buffer int curPID; // current pid int curTID; // current tid
} ScanData;
Assume tuples are indexed 0..nTuples(p)-1 Assume pages are indexed 0..nPages(rel)-1
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [9/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
10/18

2021/4/28 Scanning
❖ next_tuple() Function (cont) Implementation of Tuple next_tuple(Scan) function
Tuple next_tuple(Scan s)
{
if (s->curTID >= nTuples(s->page)-1) {
// get a new page; exhausted current page
s->curPID++;
if (s->curPID >= nPages(s->rel))
return NULL;
else {
s->page = get_page(s->rel, s->curPID);
s->curTID = -1;
}
}
s->curTID++;
return get_tuple(s->rel, s->page, s->curTID);
}
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [10/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
11/18

2021/4/28 Scanning
❖ Relation Copying
Consider an SQL statement like:
create table T as (select * from S);
Effectively, copies data from one table to a new table. Process:
make empty relation T
s = start scan of S
while (t = next_tuple(s)) {
insert tuple t into relation T
}
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [11/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
12/18

2021/4/28 Scanning
❖ Relation Copying (cont)
It is possible that T is smaller than S
may be unused free space in S where tuples were removed if T is built by simple append, will be compact
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [12/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
13/18

2021/4/28 Scanning
❖ Relation Copying (cont)
In terms of existing relation/page/tuple operations:
Relation in; // relation handle (incl. files)
Relation out; // relation handle (incl. files)
int ipid,opid,tid; // page and record indexes
Record rec; // current record (tuple)
Page ibuf,obuf; // input/output file buffers
in = openRelation(“S”, READ);
out = openRelation(“T”, NEW|WRITE);
clear(obuf); opid = 0;
for (ipid = 0; ipid < nPages(in); ipid++) { ibuf = get_page(in, ipid); for (tid = 0; tid < nTuples(ibuf); tid++) { rec = get_record(ibuf, tid); if (!hasSpace(obuf,rec)) { put_page(out, opid++, obuf); clear(obuf); } insert_record(obuf,rec); }} if (nTuples(obuf) > 0) put_page(out, opid, obuf);
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [13/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
14/18

2021/4/28 Scanning
❖ Scanning in PostgreSQL
Scanning dened in: backend/access/heap/heapam.c Implements iterator data/operations:
HeapScanDesc … struct containing iteration state scan = heap_beginscan(rel,…,nkeys,keys) tup = heap_getnext(scan, direction) heap_endscan(scan) … frees up scan struct
res = HeapKeyTest(tuple,…,nkeys,keys)
… performs ScanKeys tests on tuple … is it a result tuple?
<< ∧ >>
COMP9315 21T1 ♢ Scanning ♢ [14/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
15/18

2021/4/28 Scanning
❖ Scanning in PostgreSQL (cont) typedef HeapScanDescData *HeapScanDesc;
typedef struct HeapScanDescData
{
// scan parameters
<< ∧ >>
Relation
Snapshot
int
ScanKey

rs_rd; // heap relation descriptor rs_snapshot; // snapshot … tuple visibility rs_nkeys; // number of scan keys
rs_key; // array of scan key descriptors
// state set up at initscan time
PageNumber rs_npages; // number of pages to scan PageNumber rs_startpage; // page # to start at

// scan current state, initally set to invalid
HeapTupleData
PageNumber
Buffer
rs_ctup;
rs_cpage;
rs_cbuf;
// current tuple in scan
// current page # in scan
// current buffer in scan

} HeapScanDescData;
COMP9315 21T1 ♢ Scanning ♢ [15/16]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html
16/18

2021/4/28 Scanning
❖ Scanning in other File Structures
Above examples are for heap les
simple, unordered, maybe indexed, no hashing
Other access le structures in PostgreSQL: btree, hash, gist, gin
each implements:
startscan, getnext, endscan
insert, delete (update=delete+insert) other le-specic operators
<< ∧ COMP9315 21T1 ♢ Scanning ♢ [16/16] https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html 17/18 2021/4/28 Scanning Produced: 7 Mar 2021 https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/scanning/slides.html 18/18