CS计算机代考程序代写 2021/4/28 Hashing in PostgreSQL

2021/4/28 Hashing in PostgreSQL
Hashing in PostgreSQL
Hashing in PostgreSQL PostgreSQL Hash Function Hash Files in PostgreSQL
>>
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [0/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
1/9

2021/4/28 Hashing in PostgreSQL
∧ >>
❖ Hashing in PostgreSQL
PostgreSQL uses linear hashing on tables which have
been:
create index Ix on R using hash (k);
Hash le implementation: backend/access/hash
hashfunc.c … a family of hash functions hashinsert.c … insert, with overows hashpage.c … utilities + splitting hashsearch.c … iterator for hash les
Detailed info in src/backend/access/hash/README Based on “A New Hashing Package for Unix”, Margo Seltzer, Winter
Usenix 1991
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [1/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
2/9

2021/4/28 Hashing in PostgreSQL
<< ∧ >>
❖ PostgreSQL Hash Function PostgreSQL generic hash function (simplied):
Datum hash_any(unsigned char *k, int keylen) {
uint32 a, b, c, len, *ka = (uint32 *)k;
/* Set up the internal state */
len = keylen;
a = b = c = 0x9e3779b9+len+3923095;
/* handle most of the key */
while (len >= 12) {
a += ka[0]; b += ka[1]; c += ka[2]; mix(a, b, c);
ka+=3; len-=12;
}
… collect data from remaining bytes into a,b,c …
mix(a, b, c);
return UInt32GetDatum(c);
}
See backend/access/hash/hashfunc.c for details (incl mix())
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [2/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
3/9

2021/4/28 Hashing in PostgreSQL
<< ∧ >> ❖ PostgreSQL Hash Function (cont)
hash_any() gives hash value as 32-bit quantity (uint32).
Typically invoked from a type-specic function, e.g.
Datum
hashint4(PG_FUNCTION_ARGS)
{
return hash_uint32(PG_GETARG_INT32(0));
}
where hash_uint32() is a faster version of hash_any() Hash value is “wrapped” as a Datum
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [3/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
4/9

2021/4/28 Hashing in PostgreSQL
<< ∧ >> ❖ PostgreSQL Hash Function (cont)
Implementation of hash → page ID
Bucket
_hash_hashkey2bucket(uint32 hashkey, uint32 maxbucket,
uint32 highmask, uint32 lowmask)
Bucket bucket;
bucket = hashkey & highmask;
if (bucket > maxbucket)
bucket = bucket & lowmask;
return bucket;
}
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [4/7]
{
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
5/9

2021/4/28 Hashing in PostgreSQL
<< ∧ >>
❖ Hash Files in PostgreSQL PostgreSQL uses different le organisation …
has a single le containing header, main and overow pages
has groups of main pages of size 2n
in between groups, arbitrary number of overow pages
maintains collection of group pointers in header page
each group pointer indicates start of main page group
If overow pages become empty, add to free list and re- use.
Confusingly, PostgreSQL calls “group pointers” as “split pointers”
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [5/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
6/9

2021/4/28 Hashing in PostgreSQL
<< ∧ >> ❖ Hash Files in PostgreSQL (cont)
PostgreSQL hash le structure:
COMP9315 21T1 ♢ PostgreSQL Hashing ♢ [6/7]
https://cgi.cse.unsw.edu.au/~cs9315/21T1/lectures/pg-hashing/slides.html
7/9

2021/4/28 Hashing in PostgreSQL
<< ∧ ❖ Hash Files in PostgreSQL (cont) Approximate method for converting bucket # to page address: // which page is primary page of bucket uint bucket_to_page(headerp, B) { uint *splits = headerp->hashm_spares;
uint chunk, base, offset, lg2(uint);
chunk = (B<2) ? 0 : lg2(B+1)-1; base = splits[chunk]; offset = (B<2) ? B : B-(1<