2021/4/28 Linear Hashing
Linear Hashing
Linear Hashing
Selection with Lin.Hashing
File Expansion with Lin.Hashing Insertion with Lin.Hashing Splitting
Insertion Cost
Deletion with Lin.Hashing
>>
COMP9315 21T1 ♢ Linear Hashing ♢ [0/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
1/15
2021/4/28 Linear Hashing
❖ Linear Hashing File organisation:
le of primary data pages
le of over ow data pages
registers called split pointer (sp) and depth (d)
∧ >>
COMP9315 21T1 ♢ Linear Hashing ♢ [1/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
2/15
2021/4/28 Linear Hashing
<< ∧ >>
❖ Linear Hashing (cont)
Linear Hashing uses a systematic method of growing data
le …
hashfunction”adapts”tochangingaddressrange (viasp andd)
systematic splitting controls length of over ow chains
Advantage: does not require auxiliary storage for a directory
Disadvantage:requiresover owpages (don’tsplitonfullpages)
COMP9315 21T1 ♢ Linear Hashing ♢ [2/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
3/15
2021/4/28 Linear Hashing
<< ∧ >>
❖ Linear Hashing (cont)
File grows linearly (one page at a time, at regular intervals).
Has “phases” of expansion; over each phase, b doubles.
COMP9315 21T1 ♢ Linear Hashing ♢ [3/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
4/15
2021/4/28 Linear Hashing
❖ Selection with Lin.Hashing
If b=2d, the le behaves exactly like standard hashing. Use d bits of hash to compute page address.
// select * from R where k = val
h = hash(val);
P = bits(d,h); // lower-order bits for each tuple t in page P
and its overflow pages {
if (t.k == val) return t;
}
Average Costone = 1+Ov
COMP9315 21T1 ♢ Linear Hashing ♢ [4/13]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
5/15
2021/4/28 Linear Hashing
<< ∧ >>
❖ Selection with Lin.Hashing (cont)
If b != 2d, treat different parts of the le differently.
Parts A and C are treated as if part of a le of size 2d+1.
Part B is treated as if part of a le of size 2d.
Part D does not yet exist (tuples in B may eventually move into it).
COMP9315 21T1 ♢ Linear Hashing ♢ [5/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
6/15
2021/4/28 Linear Hashing
❖ Selection with Lin.Hashing (cont) Modi ed search algorithm:
// select * from R where k = val
h = hash(val);
pid = bits(d,h);
if (pid < sp) { pid = bits(d+1,h); }
P = getPage(f, pid)
for each tuple t in page P
and its overflow pages {
if (t.k == val) return R;
}
COMP9315 21T1 ♢ Linear Hashing ♢ [6/13]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
7/15
2021/4/28 Linear Hashing
<< ∧ >> ❖ File Expansion with Lin.Hashing
COMP9315 21T1 ♢ Linear Hashing ♢ [7/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
8/15
2021/4/28 Linear Hashing
❖ Insertion with Lin.Hashing Abstract view:
pid = bits(d,hash(val));
if (pid < sp) pid = bits(d+1,hash(val)); // bucket P = page P + its overflow pages
P = getPage(f,pid)
for each page Q in bucket P {
if (space in Q) {
insert tuple into Q
break
} }
if (no insertion) {
add new ovflow page to bucket P
insert tuple into new page
}
if (need to split) {
partition tuples from bucket sp
into buckets sp and sp+2^d
sp++;
if (sp == 2^d) { d++; sp = 0; }
}
COMP9315 21T1 ♢ Linear Hashing ♢ [8/13]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
9/15
2021/4/28 Linear Hashing
❖ Splitting
How to decide that we “need to split”? Two approaches to triggering a split:
split every time a tuple is inserted into full page
split when load factor reaches threshold (every k inserts)
Note: always split page sp, even if not full or “current” Systematic splitting like this …
eventually reduces length of every over ow chain helps to maintain short average over ow chain length
COMP9315 21T1 ♢ Linear Hashing ♢ [9/13]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
10/15
2021/4/28 Linear Hashing
❖ Splitting (cont)
Splitting process for page sp=01:
<< ∧ >>
COMP9315 21T1 ♢ Linear Hashing ♢ [10/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
11/15
2021/4/28 Linear Hashing
❖ Splitting (cont) Splitting algorithm:
// partition tuples between two buckets
newp = sp + 2^d; oldp = sp;
for all tuples t in P[oldp] and its overflows {
p = bits(d+1,hash(t.k));
if (p == newp)
add tuple t to bucket[newp]
else
add tuple t to bucket[oldp]
}
sp++;
if (sp == 2^d) { d++; sp = 0; }
COMP9315 21T1 ♢ Linear Hashing ♢ [11/13]
<< ∧ >>
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
12/15
2021/4/28 Linear Hashing
<< ∧ >>
❖ Insertion Cost
If no split required, cost same as for standard hashing:
Costinsert: Best: 1r + 1w, Avg: (1+Ov)r + 1w, Worst: (1+max(Ov))r + 2w Ifsplitoccurs,incurCostinsert pluscostofsplitting:
readpagesp (plusallofitsover owpages) writepagesp (anditsnewover owpages)
writepagesp+2d (anditsnewover owpages) On average, Costsplit = (1+Ov)r + (2+Ov)w
COMP9315 21T1 ♢ Linear Hashing ♢ [12/13]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
13/15
2021/4/28 Linear Hashing
❖ Deletion with Lin.Hashing Deletion is similar to ordinary static hash le.
But might wish to contract le when enough tuples removed.
Rationale: r shrinks, b stays large ⇒ wasted space. Method:
remove last bucket in data le (contracts linearly).
merge tuples from bucket with its buddy page (using d-1 hash bits)
COMP9315 21T1 ♢ Linear Hashing ♢ [13/13]
<< ∧
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
14/15
2021/4/28 Linear Hashing
Produced: 10 Mar 2021
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/linear-hashing/slides.html
15/15