
Buffer Pool

Buffer Pool

Page Replacement Policies

Effect of Buffer Management

COMP9315 21T1 ♢ Buffer Pool ♢ [0/16]

❖ Buffer Pool


Aim of buffer pool:

hold pages read from database files, for possible re-use

Used by:
access methods which read/write data pages

e.g. sequential scan, indexed retrieval, hashing

Uses:
file manager functions to access data files

Note: we use the terms page and block interchangeably

❖ Buffer Pool (cont)

Buffer pool operations:   (both take single PageID argument)

request_page(pid),   release_page(pid), …

To some extent …
request_page() replaces getBlock()

release_page() replaces putBlock()

Buffer pool data structures:

Page frames[NBUFS]

FrameData directory[NBUFS]

Page is byte[BUFSIZE]

❖ Buffer Pool (cont)

For each frame, we need to know:   (FrameData)

which Page it contains, or whether empty/free

whether it has been modified since loading (dirty bit)

how many transactions are currently using it (pin count)

time-stamp for most recent access (assists with replacement)

Pages are referenced by PageID …
PageID = BufferTag = (rnode, forkNum, blockNum)
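
The per-frame bookkeeping listed above can be pictured as a pair of C structs. This is a minimal sketch, not PostgreSQL's actual definitions: the field names, the `in_use` flag, the `find_frame` helper, and the sizes `NBUFS = 8` and `BUFSIZE = 8192` are assumptions chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NBUFS   8      /* assumed number of frames in the pool */
#define BUFSIZE 8192   /* assumed page size in bytes */

/* PageID modelled on the (rnode, forkNum, blockNum) triple */
typedef struct {
    uint32_t rnode;     /* which relation file */
    uint32_t forkNum;   /* which fork of that relation */
    uint32_t blockNum;  /* which block within the fork */
} PageID;

/* One directory entry per frame */
typedef struct {
    PageID   page;        /* page held in this frame (valid only if in_use) */
    bool     in_use;      /* false means the frame is empty/free */
    bool     dirty_bit;   /* modified since loading? */
    int      pin_count;   /* number of current users of the frame */
    uint64_t last_access; /* logical time-stamp of most recent access */
} FrameData;

typedef unsigned char Page[BUFSIZE];

Page      frames[NBUFS];     /* the page contents */
FrameData directory[NBUFS];  /* bookkeeping, zero-initialised: all frames free */

static bool same_page(PageID a, PageID b)
{
    return a.rnode == b.rnode && a.forkNum == b.forkNum
        && a.blockNum == b.blockNum;
}

/* Return the index of the frame holding pid, or -1 if pid is not in the pool. */
int find_frame(PageID pid)
{
    for (int i = 0; i < NBUFS; i++)
        if (directory[i].in_use && same_page(directory[i].page, pid))
            return i;
    return -1;
}
```

The linear search in `find_frame` keeps the sketch short; a real buffer manager would more plausibly use a hash table keyed on the PageID.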

❖ Buffer Pool (cont)

How scans are performed without Buffer Pool:

Buffer buf;
int N = numberOfBlocks(Rel);
for (i = 0; i < N; i++) {
    pageID = makePageID(db,Rel,i);
    getBlock(pageID, buf);
    for (j = 0; j < nTuples(buf); j++)
        process(buf, j);
}

Requires N page reads.

If we read it again, N page reads.

❖ Buffer Pool (cont)

How scans are performed with Buffer Pool:

Buffer buf;
int N = numberOfBlocks(Rel);
for (i = 0; i < N; i++) {
    pageID = makePageID(db,Rel,i);
    bufID = request_page(pageID);
    buf = frames[bufID];
    for (j = 0; j < nTuples(buf); j++)
        process(buf, j);
    release_page(pageID);
}

Requires N page reads on the first pass.

If we read it again, 0 ≤ page reads ≤ N

❖ Buffer Pool (cont)

Implementation of request_page()

int request_page(PageID pid)
{
   if (pid in Pool)
      bufID = index for pid in Pool
   else {
      if (no free frames in Pool)
         evict a page (free a frame)
      bufID = allocate free frame
      directory[bufID].page = pid
      directory[bufID].pin_count = 0
      directory[bufID].dirty_bit = 0
   }
   directory[bufID].pin_count++
   return bufID
}

❖ Buffer Pool (cont)

The release_page(pid) operation:

Decrement pin count for specified page

Note: no effect on disk or buffer contents until replacement required

The mark_page(pid) operation:

Set dirty bit on for specified page

Note: doesn’t actually write to disk; indicates that page changed

The flush_page(pid) operation:

Write the specified page to disk (using write_page)

Note: not generally used by higher levels of DBMS
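
The three operations above might be sketched in C as follows. This is an illustration under stated assumptions rather than the course's implementation: PageID is simplified to an `int`, `write_page` is a stand-in that only counts writes, `find_frame` and `disk_writes` are hypothetical helpers, and clearing the dirty bit after a flush is an assumption the text does not state.

```c
#include <assert.h>
#include <stdbool.h>

#define NBUFS 4              /* assumed pool size */

typedef int PageID;          /* simplified: the text uses a triple */

typedef struct {
    PageID page;
    bool   in_use;
    bool   dirty_bit;
    int    pin_count;
} FrameData;

FrameData directory[NBUFS];
int disk_writes = 0;         /* counts calls to the write_page stand-in */

/* Stand-in for the file manager's write_page: records the write only. */
static void write_page(PageID pid)
{
    (void) pid;
    disk_writes++;
}

/* Hypothetical helper: frame index holding pid, or -1 if absent. */
static int find_frame(PageID pid)
{
    for (int i = 0; i < NBUFS; i++)
        if (directory[i].in_use && directory[i].page == pid)
            return i;
    return -1;
}

/* Decrement the pin count; nothing is written or evicted here. */
void release_page(PageID pid)
{
    int b = find_frame(pid);
    assert(b >= 0 && directory[b].pin_count > 0);
    directory[b].pin_count--;
}

/* Record that the page changed; the disk is not touched. */
void mark_page(PageID pid)
{
    int b = find_frame(pid);
    assert(b >= 0);
    directory[b].dirty_bit = true;
}

/* Write the page out now (assumption: the frame is clean afterwards). */
void flush_page(PageID pid)
{
    int b = find_frame(pid);
    assert(b >= 0);
    write_page(pid);
    directory[b].dirty_bit = false;
}
```

In this sketch, calling `mark_page` and then `release_page` on a pinned page leaves `disk_writes` at zero; only `flush_page` (or a later eviction) would cause a write.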

❖ Buffer Pool (cont)

Evicting a page …

find frame(s) preferably  satisfying
pin count = 0   (i.e. nobody using it)

dirty bit = 0   (not modified)

if selected frame was modified, flush frame to disk

flag directory entry as “frame empty”

If multiple frames can potentially be released
need a policy to decide which is best choice
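
The eviction steps above could look roughly like this in C. It is a sketch under assumptions: pinned frames are never chosen (the text only says "preferably"), an unpinned clean frame is preferred over an unpinned dirty one, ties go to the lowest frame index instead of a real replacement policy, and `write_page`, `disk_writes`, and `evict_page` are hypothetical names.

```c
#include <stdbool.h>

#define NBUFS 4              /* assumed pool size */

typedef int PageID;          /* simplified: the text uses a triple */

typedef struct {
    PageID page;
    bool   in_use;
    bool   dirty_bit;
    int    pin_count;
} FrameData;

FrameData directory[NBUFS];
int disk_writes = 0;         /* counts calls to the write_page stand-in */

/* Stand-in for the file manager's write_page: records the write only. */
static void write_page(PageID pid)
{
    (void) pid;
    disk_writes++;
}

/* Free one frame and return its index, or -1 if every frame is pinned. */
int evict_page(void)
{
    int victim = -1;
    for (int i = 0; i < NBUFS; i++) {
        if (!directory[i].in_use || directory[i].pin_count > 0)
            continue;                    /* empty or still being used */
        if (!directory[i].dirty_bit) {
            victim = i;                  /* unpinned and clean: best choice */
            break;
        }
        if (victim < 0)
            victim = i;                  /* remember first unpinned dirty frame */
    }
    if (victim < 0)
        return -1;
    if (directory[victim].dirty_bit) {   /* modified: flush before reuse */
        write_page(directory[victim].page);
        directory[victim].dirty_bit = false;
    }
    directory[victim].in_use = false;    /* flag directory entry as empty */
    return victim;
}
```

Preferring clean frames avoids a disk write at eviction time, but it is only one heuristic; a replacement policy such as LRU would normally decide among the unpinned candidates.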

❖ Page Replacement Policies

Several schemes are commonly in use:

Least Recently Used (LRU)

Most Recently Used (MRU)

First in First Out (FIFO)

Random

LRU / MRU require knowledge of when pages were last accessed
how to keep track of “last access” time?

base on request/release ops or on real  page usage?

❖ Page Replacement Policies (cont)

Cost benefit from buffer pool (with n frames) is determined by:

number of available frames (more ⇒ better)

replacement strategy vs page access pattern

Example (a): sequential scan of a file with b pages, LRU or MRU, n ≥ b

First scan costs b reads; subsequent scans are “free”.

Example (b): sequential scan, MRU, n < b

First scan costs b reads; subsequent scans cost b - n reads.

Example (c): sequential scan, LRU, n < b

All scans cost b reads; known as sequential flooding.

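
The sequential-scan examples can be checked with a small simulation. The sketch below is illustrative only: pages are plain integers 0..b-1, pin counts are ignored because each page is released before the next is requested, "last access" is a logical counter updated on every request, and the names `Pool`, `pool_request`, and `scan_reads` are hypothetical.

```c
#define MAXFRAMES 64         /* assumed upper bound on pool size */

typedef enum { LRU, MRU } Policy;

typedef struct {
    int    page[MAXFRAMES];        /* page held by each frame, -1 if free */
    long   last_access[MAXFRAMES]; /* logical time of latest request */
    int    nframes;                /* n: frames in use by this pool */
    long   clock;                  /* logical clock, bumped per request */
    Policy policy;
} Pool;

void pool_init(Pool *p, int nframes, Policy policy)
{
    p->nframes = nframes;
    p->clock = 0;
    p->policy = policy;
    for (int i = 0; i < nframes; i++) {
        p->page[i] = -1;
        p->last_access[i] = 0;
    }
}

/* Request one page; returns 1 if it had to be read from disk, 0 on a hit. */
int pool_request(Pool *p, int pageno)
{
    for (int i = 0; i < p->nframes; i++) {
        if (p->page[i] == pageno) {          /* already in the pool */
            p->last_access[i] = ++p->clock;
            return 0;
        }
    }
    int victim = -1;
    for (int i = 0; i < p->nframes && victim < 0; i++)
        if (p->page[i] == -1)
            victim = i;                      /* use a free frame if any */
    if (victim < 0) {                        /* otherwise apply the policy */
        victim = 0;
        for (int i = 1; i < p->nframes; i++) {
            int older = p->last_access[i] < p->last_access[victim];
            if ((p->policy == LRU && older) || (p->policy == MRU && !older))
                victim = i;
        }
    }
    p->page[victim] = pageno;
    p->last_access[victim] = ++p->clock;
    return 1;
}

/* Scan pages 0..b-1 once and return the number of disk reads needed. */
int scan_reads(Pool *p, int b)
{
    int reads = 0;
    for (int pageno = 0; pageno < b; pageno++)
        reads += pool_request(p, pageno);
    return reads;
}
```

With b = 5 and n = 3, two consecutive calls to `scan_reads` give 5 then 2 reads under MRU (b then b - n) and 5 then 5 under LRU (sequential flooding). With n = 5 the second scan needs no reads under either policy. Counts for later MRU scans depend on exactly how time-stamps are maintained, so b - n is best read as the idealised figure.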
❖ Effect of Buffer Management

Consider a query to find customers who are also employees:

select c.name
from Customer c, Employee e
where c.ssn = e.ssn;

This might be implemented inside the DBMS via nested loops:

for each tuple t1 in Customer {
for each tuple t2 in Employee {
if (t1.ssn == t2.ssn)
append (t1.name) to result set
}
}

❖ Effect of Buffer Management (cont)

In terms of page-level operations, the algorithm looks like:

Rel rC = openRelation("Customer");
Rel rE = openRelation("Employee");
for (int i = 0; i < nPages(rC); i++) {
    PageID pid1 = makePageID(db,rC,i);
    Page p1 = request_page(pid1);
    for (int j = 0; j < nPages(rE); j++) {
        PageID pid2 = makePageID(db,rE,j);
        Page p2 = request_page(pid2);
        // compare all pairs of tuples from p1,p2
        // construct solution set from matching pairs
        release_page(pid2);
    }
    release_page(pid1);
}

❖ Effect of Buffer Management (cont)

Costs depend on relative size of tables, #buffers (n), replacement strategy

Requests: each rC page requested once, each rE page requested nPages(rC) times

If nPages(rC)+nPages(rE) ≤ n
read each page exactly once, holding all pages in buffer pool

If nPages(rE) ≤ n-1, and LRU replacement
read each page exactly once, hold rE in pool while reading each rC

If n == 2   (worst case)
read each page every time it's requested

Produced: 22 Feb 2021
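
The three cost cases for the nested-loop join can be reproduced with a small read-counting simulation. This is a sketch under assumptions, not the course's code: pages are plain integers (rC pages are 0..nC-1, rE pages are nC..nC+nE-1), replacement is LRU over unpinned frames using a logical request counter, and `Pool`, `join_reads`, and the simplified `request_page`/`release_page` signatures are hypothetical.

```c
#define MAXFRAMES 64         /* assumed upper bound on pool size */

typedef struct {
    int  page[MAXFRAMES];        /* page held by each frame, -1 if free */
    int  pin_count[MAXFRAMES];   /* current users of each frame */
    long last_access[MAXFRAMES]; /* logical time of latest request */
    int  nframes;
    long clock;
    long reads;                  /* disk reads performed so far */
} Pool;

void pool_init(Pool *p, int nframes)
{
    p->nframes = nframes;
    p->clock = 0;
    p->reads = 0;
    for (int i = 0; i < nframes; i++) {
        p->page[i] = -1;
        p->pin_count[i] = 0;
        p->last_access[i] = 0;
    }
}

/* Pin pid in the pool; returns its frame, or -1 if every frame is pinned. */
int request_page(Pool *p, int pid)
{
    for (int i = 0; i < p->nframes; i++) {
        if (p->page[i] == pid) {             /* hit: no disk read */
            p->pin_count[i]++;
            p->last_access[i] = ++p->clock;
            return i;
        }
    }
    int victim = -1;
    for (int i = 0; i < p->nframes && victim < 0; i++)
        if (p->page[i] == -1)
            victim = i;                      /* free frame available */
    if (victim < 0) {                        /* LRU among unpinned frames */
        for (int i = 0; i < p->nframes; i++)
            if (p->pin_count[i] == 0 &&
                (victim < 0 || p->last_access[i] < p->last_access[victim]))
                victim = i;
    }
    if (victim < 0)
        return -1;
    p->reads++;                              /* miss: read page from disk */
    p->page[victim] = pid;
    p->pin_count[victim] = 1;
    p->last_access[victim] = ++p->clock;
    return victim;
}

void release_page(Pool *p, int pid)
{
    for (int i = 0; i < p->nframes; i++) {
        if (p->page[i] == pid && p->pin_count[i] > 0) {
            p->pin_count[i]--;
            return;
        }
    }
}

/* Page reads for the nested-loop join with n frames (n >= 2 assumed). */
long join_reads(int n, int nC, int nE)
{
    Pool p;
    pool_init(&p, n);
    for (int i = 0; i < nC; i++) {
        request_page(&p, i);                 /* outer page stays pinned */
        for (int j = 0; j < nE; j++) {
            request_page(&p, nC + j);
            release_page(&p, nC + j);
        }
        release_page(&p, i);
    }
    return p.reads;
}
```

For nC = 5 and nE = 3, `join_reads(8, 5, 3)` and `join_reads(4, 5, 3)` both give 8 reads (each page read once), while `join_reads(2, 5, 3)` gives 20 reads, which is nC + nC × nE: every request turns into a disk read.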