2021/4/28 Indexing
Indexing
Indexing
Indexes
Dense Primary Index
Sparse Primary Index Selection with Primary Index Insertion with Primary Index Deletion with Primary Index Clustering Index
Secondary Index
Multi-level Indexes
Select with Multi-level Index
>>
COMP9315 21T1 ♢ Indexing ♢ [0/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
1/17
2021/4/28 Indexing
❖ Indexing
An index is a le of (keyVal,tupleID) pairs, e.g.
∧ >>
COMP9315 21T1 ♢ Indexing ♢ [1/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
2/17
2021/4/28 Indexing
❖ Indexes
A 1-d index is based on the value of a single attribute A. Some possible properties of A:
maybeusedtosortdata le (ormaybesortedonsomeother eld)
valuesmaybeunique (ortheremaybemultipleinstances)
Taxonomy of index types, based on properties of index attribute:
<< ∧ >>
primary clustering secondary
index on unique eld, may be sorted on A index on non-unique eld, le sorted on A le not sorted on A
A given table may have indexes on several attributes.
COMP9315 21T1 ♢ Indexing ♢ [2/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
3/17
2021/4/28 Indexing
<< ∧ >>
❖ Indexes (cont)
Indexes themselves may be structured in several ways:
dense sparse single-level multi-level
Index lehastotalipages (wheretypicallyi≪b) Index lehaspagecapacityci (wheretypicallyci≫c) Dense index: i = ceil( r/ci ) Sparse index: i = ceil( b/ci )
COMP9315 21T1 ♢ Indexing ♢ [3/15]
every tuple is referenced by an entry in the index le only some tuples are referenced by index le entries tuples are accessed directly from the index le
may need to access several index pages to reach tuple
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
4/17
2021/4/28 Indexing
<< ∧ >>
❖ Dense Primary Index
Data le unsorted; one index entry for each tuple
COMP9315 21T1 ♢ Indexing ♢ [4/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
5/17
2021/4/28 Indexing
❖ Sparse Primary Index
Data le sorted; one index entry for each page
<< ∧ >>
COMP9315 21T1 ♢ Indexing ♢ [5/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
6/17
2021/4/28 Indexing
❖ Selection with Primary Index For one queries:
ix = binary search index for entry with key K
if nothing found { return NotFound }
b = getPage(pageOf(ix.tid))
t = getTuple(b,offsetOf(ix.tid))
— may require reading overflow pages
return t
Worst case: read log2i index pages + read 1+Ov data pages. Thus, Costone,prim = log2 i + 1 + Ov
Assume: index pages are same size as data pages ⇒ same reading cost
COMP9315 21T1 ♢ Indexing ♢ [6/15]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
7/17
2021/4/28 Indexing
<< ∧ >> ❖ Selection with Primary Index (cont)
For range queries on primary key:
use index search to nd lower bound
read index sequentially until reach upper bound accumulate set of buckets to be examined examine each bucket in turn to check for matches
For pmr queries involving primary key: search as if performing one query.
For queries not involving primary key, index gives no help.
COMP9315 21T1 ♢ Indexing ♢ [7/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
8/17
2021/4/28 Indexing
<< ∧ >> ❖ Selection with Primary Index (cont)
Method for range queries (when data le is not sorted)
// e.g. select * from R where a between lo and hi
pages = {} results = {}
ixPage = findIndexPage(R.ixf,lo)
while (ixTup = getNextIndexTuple(R.ixf)) {
if (ixTup.key > hi) break;
pages = pages ∪ pageOf(ixTup.tid) }
foreach pid in pages {
// scan data page plus ovflow chain
while (buf = getPage(R.datf,pid)) {
foreach tuple T in buf {
if (lo<=T.a && T.a<=hi) results = results ∪ T
}}}
COMP9315 21T1 ♢ Indexing ♢ [8/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
9/17
2021/4/28 Indexing
❖ Insertion with Primary Index Overview:
tid = insert tuple into page P at position p
find location for new entry in index file
insert new index entry (k,tid) into index file
Problem: order of index entries must be maintained
need to avoid over ow pages in index (but see later) so, reorganise index le by moving entries up
Reorganisation requires, on average, read/write half of index le:
Costinsert,prim = (log2i)r + i/2.(1r+1w) + (1+Ov)r + (1+δ)w COMP9315 21T1 ♢ Indexing ♢ [9/15]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
10/17
2021/4/28 Indexing
❖ Deletion with Primary Index Overview:
find tuple using index
mark tuple as deleted
delete index entry for tuple
If we delete index entries by marking … Costdelete,prim = (log2 i)r + (1 + Ov)r + 1w + 1w
If we delete index entry by index le reorganisation … Costdelete,prim = (log2 i)r + (1 + Ov)r + i/2.(1r+1w) + 1w
COMP9315 21T1 ♢ Indexing ♢ [10/15]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
11/17
2021/4/28 Indexing
<< ∧ >>
❖ Clustering Index
Data le sorted; one index entry for each key value
Cost penalty: maintaining both index and data le as sorted (Note: can’t mark index entry for value X until all X tuples are deleted)
COMP9315 21T1 ♢ Indexing ♢ [11/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
12/17
2021/4/28 Indexing
<< ∧ >>
❖ Secondary Index
Data le not sorted; want one index entry for each key value
Costpmr = (log2iix1 + aix2 + bq.(1 + Ov))
COMP9315 21T1 ♢ Indexing ♢ [12/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
13/17
2021/4/28 Indexing
<< ∧ >>
❖ Multi-level Indexes
Secondary Index used two index les to speed up search
by keeping the initial index search relatively quick Ix1 small (depends on number of unique key values)
Ix2 larger (depends on amount of repetition of keys) typically, bIx1 ≪ bIx2 ≪ b
Could improve further by
making Ix1 sparse, since Ix2 is guaranteed to be ordered in this case, bIx1 = ceil( bIx2 / ci )
if Ix1 becomes too large, add Ix3 and make Ix2 sparse if data le ordered on key, could make Ix3 sparse
Ultimately, reduce top-level of index hierarchy to one page.
COMP9315 21T1 ♢ Indexing ♢ [13/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
14/17
2021/4/28 Indexing
❖ Multi-level Indexes (cont) Example data le with three-levels of index:
<< ∧ >>
Assume: not primary key, c = 20, ci = 3
Inreality,morelikely c=100, ci=1000
COMP9315 21T1 ♢ Indexing ♢ [14/15]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
15/17
2021/4/28 Indexing
❖ Select with Multi-level Index For one query on indexed key eld:
xpid = top level index page
for level = 1 to d {
read index entry xpid
search index page for J’th entry
where index[J].key <= K < index[J+1].key
if (J == -1) { return NotFound }
xpid = index[J].page
}
pid = xpid // pid is data page index search page pid and its overflow pages
Costone,mli = (d + 1 + Ov)r
(Note that d = ceil( logci r ) and ci is large because index entries are small)
COMP9315 21T1 ♢ Indexing ♢ [15/15]
<< ∧
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
16/17
2021/4/28 Indexing
Produced: 13 Mar 2021
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/indexing/slides.html
17/17