Buffer Pool
>>
Buffer Pool
Buffer Pool
Page Replacement Policies
Effect of Buffer Management
COMP9315 21T1 ♢ Buffer Pool ♢ [0/16]
∧ >>
❖ Buffer Pool
COMP9315 21T1 ♢ Buffer Pool ♢ [1/16]
<< ∧ >>
❖ Buffer Pool (cont)
Aim of buffer pool:
hold pages read from database files, for possible re-use
Used by:
access methods which read/write data pages
e.g. sequential scan, indexed retrieval, hashing
Uses:
file manager functions to access data files
Note: we use the terms page and block interchangably
COMP9315 21T1 ♢ Buffer Pool ♢ [2/16]
<< ∧ >>
❖ Buffer Pool (cont)
COMP9315 21T1 ♢ Buffer Pool ♢ [3/16]
<< ∧ >>
❖ Buffer Pool (cont)
Buffer pool operations: (both take single PageID argument)
request_page(pid), release_page(pid), …
To some extent …
request_page() replaces getBlock()
release_page() replaces putBlock()
Buffer pool data structures:
Page frames[NBUFS]
FrameData directory[NBUFS]
Page is byte[BUFSIZE]
COMP9315 21T1 ♢ Buffer Pool ♢ [4/16]
<< ∧ >>
❖ Buffer Pool (cont)
COMP9315 21T1 ♢ Buffer Pool ♢ [5/16]
<< ∧ >>
❖ Buffer Pool (cont)
For each frame, we need to know: (FrameData)
which Page it contains, or whether empty/free
whether it has been modified since loading (dirty bit)
how many transactions are currently using it (pin count)
time-stamp for most recent access (assists with replacement)
Pages are referenced by PageID …
PageID = BufferTag = (rnode, forkNum, blockNum)
COMP9315 21T1 ♢ Buffer Pool ♢ [6/16]
<< ∧ >>
❖ Buffer Pool (cont)
How scans are performed without Buffer Pool:
Buffer buf;
int N = numberOfBlocks(Rel);
for (i = 0; i < N; i++) {
pageID = makePageID(db,Rel,i);
getBlock(pageID, buf);
for (j = 0; j < nTuples(buf); j++)
process(buf, j)
}
Requires N page reads.
If we read it again, N page reads.
COMP9315 21T1 ♢ Buffer Pool ♢ [7/16]
<< ∧ >>
❖ Buffer Pool (cont)
How scans are performed with Buffer Pool:
Buffer buf;
int N = numberOfBlocks(Rel);
for (i = 0; i < N; i++) {
pageID = makePageID(db,Rel,i);
bufID = request_page(pageID);
buf = frames[bufID]
for (j = 0; j < nTuples(buf); j++)
process(buf, j)
release_page(pageID);
}
Requires N page reads on the first pass.
If we read it again, 0 ≤ page reads ≤ N
COMP9315 21T1 ♢ Buffer Pool ♢ [8/16]
<< ∧ >>
❖ Buffer Pool (cont)
Implementation of request_page()
int request_page(PageID pid)
{
if (pid in Pool)
bufID = index for pid in Pool
else {
if (no free frames in Pool)
evict a page (free a frame)
bufID = allocate free frame
directory[bufID].page = pid
directory[bufID].pin_count = 0
directory[bufID].dirty_bit = 0
}
directory[bufID].pin_count++
return bufID
}
COMP9315 21T1 ♢ Buffer Pool ♢ [9/16]
<< ∧ >>
❖ Buffer Pool (cont)
The release_page(pid) operation:
Decrement pin count for specified page
Note: no effect on disk or buffer contents until replacement required
The mark_page(pid) operation:
Set dirty bit on for specified page
Note: doesn’t actually write to disk; indicates that page changed
The flush_page(pid) operation:
Write the specified page to disk (using write_page)
Note: not generally used by higher levels of DBMS
COMP9315 21T1 ♢ Buffer Pool ♢ [10/16]
<< ∧ >>
❖ Buffer Pool (cont)
Evicting a page …
find frame(s) preferably satisfying
pin count = 0 (i.e. nobody using it)
dirty bit = 0 (not modified)
if selected frame was modified, flush frame to disk
flag directory entry as “frame empty”
If multiple frames can potentially be released
need a policy to decide which is best choice
COMP9315 21T1 ♢ Buffer Pool ♢ [11/16]
<< ∧ >>
❖ Page Replacement Policies
Several schemes are commonly in use:
Least Recently Used (LRU)
Most Recently Used (MRU)
First in First Out (FIFO)
Random
LRU / MRU require knowledge of when pages were last accessed
how to keep track of “last access” time?
base on request/release ops or on real page usage?
COMP9315 21T1 ♢ Buffer Pool ♢ [12/16]
<< ∧ >>
❖ Page Replacement Policies (cont)
Cost benefit from buffer pool (with n frames) is determined by:
number of available frames (more ⇒ better)
replacement strategy vs page access pattern
Example (a): sequential scan, LRU or MRU, n ≥ b
First scan costs b reads; subsequent scans are “free”.
Example (b): sequential scan, MRU, n < b
First scan costs b reads; subsequent scans cost b - n reads.
Example (c): sequential scan, LRU, n < b
All scans cost b reads; known as sequential flooding.
COMP9315 21T1 ♢ Buffer Pool ♢ [13/16]
<< ∧ >>
❖ Effect of Buffer Management
Consider a query to find customers who are also employees:
select c.name
from Customer c, Employee e
where c.ssn = e.ssn;
This might be implemented inside the DBMS via nested loops:
for each tuple t1 in Customer {
for each tuple t2 in Employee {
if (t1.ssn == t2.ssn)
append (t1.name) to result set
}
}
COMP9315 21T1 ♢ Buffer Pool ♢ [14/16]
<< ∧ >>
❖ Effect of Buffer Management (cont)
In terms of page-level operations, the algorithm looks like:
Rel rC = openRelation(“Customer”);
Rel rE = openRelation(“Employee”);
for (int i = 0; i < nPages(rC); i++) {
PageID pid1 = makePageID(db,rC,i);
Page p1 = request_page(pid1);
for (int j = 0; j < nPages(rE); j++) {
PageID pid2 = makePageID(db,rE,j);
Page p2 = request_page(pid2);
// compare all pairs of tuples from p1,p2
// construct solution set from matching pairs
release_page(pid2);
}
release_page(pid1);
}
COMP9315 21T1 ♢ Buffer Pool ♢ [15/16]
<< ∧
❖ Effect of Buffer Management (cont)
Costs depend on relative size of tables, #buffers (n), replacement strategy
Requests: each rC page requested once, each rE page requested rC times
If nPages(rC)+nPages(rE) ≤ n
read each page exactly once, holding all pages in buffer pool
If nPages(rE) ≤ n-1, and LRU replacement
read each page exactly once, hold rE in pool while reading each rC
If n == 2 (worst case)
read each page every time it's requested
COMP9315 21T1 ♢ Buffer Pool ♢ [16/16]
Produced: 22 Feb 2021