
Chapter 10:
Tree-Structured Indexes (with a Focus on B+ Trees)
Ed Knorr, Slides from Fall 2020 Based on:
• Sections 10.1, 10.3-10.8.4 of Ramakrishnan & Gehrke (textbook)—i.e., basically all of Chapter 10, but skip the parts about ISAM indexes.
• Sections 8.1, 8.2, and 8.2.1 – Read or re-read this part of the textbook if you want more background information about indexing. Highly recommended! The rest of Chapter 8 may be useful to you when we reach other areas of the course.
• Some of this chapter may be a review from CPSC 221 (although they sometimes teach AVL trees, instead). Regardless, B+ trees are very important external data structures used by RDBMSs!
1

Some Learning Goals
 List the 3 ways or “alternatives” of representing data entries k* having key k, in a general index.
 Justify the use of indexes in database systems.
 Explain how a tree-structured index supports both range and point queries.
 Build a B+ tree for a given set of data. Show how to insert and delete keys in a B+ tree.
 Analyze the complexity of: (a) searching, (b) inserting into, and (c) deleting from, a B+ tree.
 Explain why a B+ tree needs sibling pointers for its leaf pages.
 Explain why B+ trees tend to be very shallow, even after storing many millions of data entries. Equivalently, provide arguments for why B+ trees can store large numbers of data entries in their pages.
 Explain the pros and cons of allowing prefix key compression in an index.
 Given a set of data, build a B+ tree using bulk loading.
2

Some Learning Goals (cont.)
 Provide several advantages of bulk loading compared to using a large number of SQL INSERT statements.
 Estimate the number of I/Os required to traverse the records of a large file (whose records have key k) using B+ trees for: (a) the clustered case, and (b) the unclustered case. Justify any assumptions that are needed.
 Using a diagram, show the principal difference between a clustered and an unclustered index.
 Provide examples of when clustering doesn’t help performance, and may actually hinder the performance of certain kinds of queries.
3

Motivation
 On the one hand, we want to keep as much data as possible about an entity (e.g., shipping record, telephone transaction, customer record, student record), but on the other hand, we want good performance.
 We can have both, via indexing!
• e.g., Searching 100 million records in real time
 Good performance is a major DBA goal.
 Note that SQL queries perform “exact” matches in indexes; so, spelling and typing matter (unlike Google’s forgiving search engine).
• You can do point and range queries in SQL.
 Two major categories of indexes:
  Tree-structured indexes
  Hash index structures
4

Index Entries and Data Entries
 Let us denote an index entry as an internal ⟨search key value, page id⟩ pair that directs traffic inside a tree index.
 Let us denote a data entry, written k*, as an entry stored in the leaf pages of a tree index.
 (From Chapter 8) For any index, there are 3 alternatives for storing data entries k*, whose search key value is k:
Alt. 1: The whole data record with search key value k
Alt. 2: A ⟨k, rid⟩ pair, where rid is the record id of a data record with search key value k
Alt. 3: A ⟨k, rid-list⟩ pair, where rid-list is a list of record ids of data records with search key value k
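As a purely illustrative picture of the three alternatives (a minimal Python sketch; the field names, key values, and rid representation are assumptions, not the textbook's notation):

```python
from collections import namedtuple

# A record id (rid) locates a record on disk: (page number, slot within the page).
Rid = namedtuple("Rid", ["page_no", "slot_no"])

# Alternative 1: the data entry k* IS the whole data record with search key k.
alt1 = {"sid": 53666, "name": "Jones", "gpa": 3.4}          # key k = sid

# Alternative 2: the data entry is a <k, rid> pair.
alt2 = (53666, Rid(page_no=417, slot_no=3))

# Alternative 3: the data entry is a <k, list-of-rids> pair (handy with duplicates).
alt3 = (3.4, [Rid(417, 3), Rid(502, 1), Rid(502, 7)])
```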
5

Introduction
 Tree-structured indexing techniques support both range searches and equality searches.
 B+ tree: A dynamic, balanced tree structure
  Adjusts gracefully for insertions and deletions
 Skew is generally not an issue.
6

Range Searches in a Sorted File
 “Find all students with GPA > 3.0”
 Consider a flat file (sometimes called a heap file, but “heap” is
not the same as a CPSC 221 data structures heap).
 If the data is sorted on GPA in this file, then: (1) we can use binary search to find one student with that GPA, and then (2) we can sequentially scan to find the others (if there are duplicates).
 The “cost” of binary search can be high on a large file.
• Do you know where the start and end of the range of pages is? Keep in mind that they may be non-contiguous.
• Do you know where the middle of a range of pages is?
• Is there an even distribution of numbers of records on each page?
• How is data skew handled?
• How many seeks do we do?
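As an in-memory analogy of steps (1) and (2) above, ignoring pages, seeks, and I/O, a range query over sorted data is one binary search to find the start of the range followed by a sequential scan. A minimal sketch with made-up sample data:

```python
from bisect import bisect_right

def range_query_sorted(records, low_gpa):
    """records: list of (gpa, student_name) tuples, sorted by gpa (ascending).
    Returns all records with gpa > low_gpa: one binary search for the start of
    the range, then a sequential scan to the end of the 'file'."""
    gpas = [gpa for gpa, _ in records]      # the sorted search keys
    start = bisect_right(gpas, low_gpa)     # index of the first record with gpa > low_gpa
    return records[start:]                  # sequential scan of the tail

# Example (hypothetical data):
recs = [(2.8, "Lee"), (3.0, "Singh"), (3.0, "Chan"), (3.4, "Jones"), (3.9, "Wu")]
print(range_query_sorted(recs, 3.0))        # [(3.4, 'Jones'), (3.9, 'Wu')]
```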
7

Range Searches in a Sorted File (cont.)
 “Find all students with GPA > 3.0”
SELECT STUDENT_NAME
FROM GRADE G, STUDENT S
WHERE G.STUDENT_ID = S.STUDENT_ID AND G.GPA > 3.0;
 Simple idea: Create an ‘index’ file:
[Figure: a small index file with entries k1, k2, …, kM, each directing the search to the appropriate page (Page 1, Page 2, …, Page N) of the data file.]
 Now, we can do a faster search using a smaller (index) file!
8

B+ Tree:
Most Widely-Used Index in RDBMSs
 (Unique) Search, Insert, or Delete is O(log_F N), where F = fanout and N = # leaf pages (or roughly the total # of leaf + internal pages). N (approx.) = # recs / (avg. # recs per page)
 We want to keep the tree height-balanced. Why?
 Minimum 50% occupancy (except for root). Each node contains d ≤ m ≤ 2d entries. d is called the order of the tree, and m is the # of keys in a particular node.
 Supports equality and range searches efficiently.
[Figure: index entries occupy the internal levels of the tree; data entries occupy the leaf level (the "sequence set").]
9

Example: B+ Tree
 Search always begins at the root, and key comparisons direct it to a leaf.
 Search for: 5*, 15*, all data entries ≥ 24*, etc.
[Figure: root with keys 13, 17, 24, 30; leaf pages: 2* 3* 5* 7* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*.]
 Based on the search for 15*, we know it is not in the tree!
10
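A minimal sketch of this root-to-leaf descent, using a simplified node layout (separator keys plus child pointers in internal nodes, just the data-entry keys in leaves). The Node class and the tie-breaking convention (equal keys go right) are illustrative assumptions, not the textbook's pseudocode:

```python
from bisect import bisect_right

class Node:
    """Simplified B+ tree node: internal nodes hold separator keys and child
    pointers; leaf nodes hold the keys of their data entries."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children        # None for a leaf

def search(root, k):
    """Descend from the root to the leaf that would contain key k,
    then check whether k is really there (e.g., 15 is not)."""
    node = root
    while node.children is not None:          # still at an internal node
        i = bisect_right(node.keys, k)        # follow the subtree with keys[i-1] <= k < keys[i]
        node = node.children[i]
    return node if k in node.keys else None   # None means k is not in the tree
```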

B+ Tree Statistics, in Practice
 Some stats:
  Typical order = 100
  Typical fill-factor = 67%
  Average fanout = 133
 Typical maximum capacities:
  Height 4: 133^4 = 312,900,721 leaf pages (not records)
  Height 3: 133^3 = 2,352,637 leaf pages
 For a busy index, we can often keep the top levels in the buffer pool. Why?
  Level 0 = 1 page = 4 KB
  Level 1 = 133 pages ≈ 0.5 MB
  Level 2 = 17,689 pages ≈ 69 MB
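A quick back-of-the-envelope check of these numbers, taking the slide's average fanout of 133 and assuming 4 KB pages:

```python
fanout = 133                       # average fanout, from this slide
page_kb = 4                        # assumed page size in KB

for level in range(5):
    pages = fanout ** level
    print(f"Level {level}: {pages:>11,} pages ≈ {pages * page_kb / 1024:,.1f} MB")
# Level 0 is the root (1 page, 4 KB); Level 2 has 17,689 pages (about 69 MB);
# Level 4 has 133**4 = 312,900,721 pages, matching the height-4 leaf count above.
```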
11

DB2 Metadata for B+ Tree Indexes
 For a clustering index: NEAROFFPOSF and FAROFFPOSF indicate how many rows are away from their ideal data page (recall our study of relocated rows, in the previous unit).
 LEAFDIST indicates 100x the average number of physical pages between successive leaf pages when doing a sequential index scan.
 CLUSTERRATIOF: 100% means the index and data ordering match perfectly. Less than 100% means that some of the data pages do not follow the index order.
 As CLUSTERRATIOF drops, think about reorganizing the data.
 For more information, and many other kinds of index metadata, see the SYSIBM.SYSINDEX* DB2 catalog tables, such as SYSIBM.SYSINDEXES and SYSIBM.SYSINDEXPART.
12

Indexes Used by Commercial RDBMSs (Other indexes may exist for extended features.)
DBMS                                 Primary Index Structures         Secondary Index Structures (dense)
IBM DB2                              B+ Tree (dense)                  B+ Tree
Oracle                               B+ Tree (dense), Hash (dense)    B+ Tree, Bitmap
SQL Server                           B+ Tree (sparse)                 B+ Tree
Sybase (purchased by SAP in 2010)    B+ Tree (sparse)                 B+ Tree
13

Inserting a Data Entry into a B+ Tree
 Find correct leaf L
 Put data entry onto L
 If L has enough space, done!
 Else, must split L (into L and a new node L2)
  • Redistribute entries evenly, copy up middle key
  • Insert index entry pointing to L2 into parent of L
 This can happen recursively:
 To split index node, redistribute entries evenly, but
push up middle key. (Contrast with leaf splits.)
 Splits “grow” tree; only a root split increases the
height
 Tree growth: gets wider or one level taller at top
14

Inserting 8* into the Example B+ Tree from Slide 10
 Observe how the minimum occupancy is guaranteed in both leaf and index page splits.
 Note the difference between copy-up and move-up. Be sure you understand the reasons for this.
[Figure: the leaf 2* 3* 5* 7* splits into 2* 3* and 5* 7* 8*; the entry to be inserted in the parent node is 5. (Note that 5 is copied up and continues to appear in the leaf.) The index node then splits into nodes with keys 5, 13 and 24, 30; the entry to be inserted in the parent node is 17. (Note that 17 is moved up and only appears once in the index. Contrast this with a leaf split.)]
15

Example B+ Tree After Inserting 8*
[Figure: new root with key 17; its two children have keys 5, 13 and 24, 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 19* 20* 22* | 24* 27* 29* | 33* 34* 38* 39*.]
Note that the root was split, leading to an increase in height.
In this example, we can avoid splitting by re-distributing entries; however, this is usually not done in practice.
16
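A minimal sketch of the leaf-level case of the insertion procedure from Slide 14 (insert into the leaf, split if it overflows, copy up the middle key), reusing the simplified Node class from the earlier search sketch; recursive splitting of internal nodes, where the middle key is pushed up instead, is only noted in a comment:

```python
from bisect import insort

def insert_into_leaf(leaf, k, d):
    """Insert key k into leaf (a Node with at most 2d keys and a .next sibling pointer).
    Returns (new_right_leaf, key_to_copy_up) if the leaf split, else (None, None)."""
    insort(leaf.keys, k)                       # keep the leaf's keys sorted
    if len(leaf.keys) <= 2 * d:
        return None, None                      # enough space: done
    # Split: left keeps d keys, the new right leaf gets the remaining d + 1.
    right = Node(keys=leaf.keys[d:])
    leaf.keys = leaf.keys[:d]
    right.next = getattr(leaf, "next", None)   # maintain the leaf sibling chain
    leaf.next = right
    # Copy up: the middle key stays in the right leaf AND goes into the parent.
    # (If the parent overflows it splits too, but there the middle key is
    # PUSHED up, i.e., it does not remain in either half.)
    return right, right.keys[0]
```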

Deleting a Data Entry from a B+ Tree
 Start at root, find leaf L where entry belongs
 Remove the entry
 If L is at least half-full, done!
 If L has only d-1 entries,
• Try to re-distribute, borrowing from sibling (adjacent node with same parent as L)
• If re-distribution fails, merge L and sibling
 If merge occurred, must delete entry (pointing to L
or sibling) from parent of L
 Merge could propagate to root, decreasing height
17
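Similarly, a minimal sketch of the leaf-level deletion cases (remove the key, then either borrow from the right sibling or merge with it). Fixing up the parent's separator key, borrowing from a left sibling, and recursive merging of internal nodes are left to comments; the node layout is the simplified one from the earlier sketches:

```python
def delete_from_leaf(leaf, right_sibling, k, d):
    """Delete key k from leaf (min d keys in any non-root node).
    Returns 'ok', 'redistributed', or 'merged' so the caller can fix the parent."""
    leaf.keys.remove(k)
    if len(leaf.keys) >= d:
        return "ok"                                  # still at least half full
    if right_sibling is not None and len(right_sibling.keys) > d:
        # Re-distribute: borrow the sibling's smallest key; the caller must copy
        # the sibling's new smallest key up into the parent (cf. Slide 18).
        leaf.keys.append(right_sibling.keys.pop(0))
        return "redistributed"
    if right_sibling is not None:
        # Merge: pull over all of the sibling's keys; the caller must delete the
        # index entry pointing to the sibling, which may propagate up (cf. Slide 19).
        leaf.keys.extend(right_sibling.keys)
        leaf.next = getattr(right_sibling, "next", None)
        return "merged"
    return "underflow"                               # no right sibling: use the left one instead
```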

Tree after Inserting 8* (see Slide 16) and then Deleting 19* and 20* …
 Deleting 19* is easy.
 Deleting 20* is done with re-distribution. Notice how the middle key is copied up.
[Figure: root with key 17; its left child has keys 5, 13 and its right child has keys 27, 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 22* 24* | 27* 29* | 33* 34* 38* 39*.]
18

… and then Deleting 24*
 Must merge.
 Observe 'toss' of index entry (on right), and 'pull down' of index entry (below).
[Figure (right): after 24* is deleted, the leaf merges with its sibling to form 22* 27* 29*, next to 33* 34* 38* 39*; the affected index node is left with only the key 30.]
[Figure (below): after the index nodes are merged, the tree has a single index node (the root) with keys 5, 13, 17, 30 and leaves 2* 3* | 5* 7* 8* | 14* 16* | 22* 27* 29* | 33* 34* 38* 39*.]
19

Example of Non-Leaf Re-Distribution
 Tree is shown below during deletion of 24*.
 Where was "24" prior to the deletion process?
 In contrast to previous example, can re-distribute entry from left child of root to right child.
[Figure: root with key 22; its left child has keys 5, 13, 17, 20 and its right child has key 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*.]
20

After Re-Distribution
 Intuitively, entries are re-distributed by 'pushing through' the splitting entry in the parent node.
 It suffices to re-distribute index entry with key 20. We've re-distributed 17, as well, for illustration.
[Figure: root with key 17; its left child has keys 5, 13 and its right child has keys 20, 22, 30; leaves: 2* 3* | 5* 7* 8* | 14* 16* | 17* 18* | 20* 21* | 22* 27* 29* | 33* 34* 38* 39*.]
21

An Optimization:
Prefix Key Compression
 Increase fan-out for greater efficiency.
 Key values in index entries only ’direct traffic’;
but often, we can compress them.
 e.g., If we have adjacent index entries with search key values Danielle Miller, David Smith and Deepak Murthy, we can abbreviate David Smith to Dav
• Is this correct? What if there is a data entry Davey Jones? (Can only compress David Smith to Davi)
• In general, while compressing, must leave each index entry greater than every key value (in any subtree) to its left
 Insert/delete must be suitably modified
22
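A minimal sketch of the compression idea: choose the shortest prefix of the right key that is still strictly greater than the largest key to its left, so 'David Smith' compresses to 'Dav' next to 'Danielle Miller' but only to 'Davi' once 'Davey Jones' exists, as in the example above. This illustrates the rule on this slide, not any particular product's algorithm:

```python
def shortest_separator(largest_left_key: str, right_key: str) -> str:
    """Shortest prefix of right_key that is still strictly greater than
    largest_left_key, so it can serve as a compressed index entry."""
    for length in range(1, len(right_key) + 1):
        prefix = right_key[:length]
        if prefix > largest_left_key:
            return prefix
    return right_key      # fallback; never reached if the keys are properly ordered

print(shortest_separator("Danielle Miller", "David Smith"))   # 'Dav'
print(shortest_separator("Davey Jones", "David Smith"))       # 'Davi'
```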

Another Optimization: Bulk Loading
 If we have a large collection of records, and we want to create a B+ tree on some field, doing so by repeatedly inserting records is very slow. Why?
 Bulk Loading can be done much more efficiently
 Initialization: Sort all data entries, insert pointer to
first (leaf) page in a new (root) page
[Figure: a new, empty root page points to the first of the sorted pages of data entries (not yet in the B+ tree): 3* 4* | 6* 9* | 10* 11* | 12* 13* | 20* 22* | 23* 31* | 35* 36* | 38* 41* | 44*.]
23

Bulk Loading Example
[Figure: root with key 6, pointing to the leaf pages 3* 4* and 6* 9*; the remaining sorted pages (10* 11* | 12* 13* | 20* 22* | 23* 31* | 35* 36* | 38* 41* | 44*) are not yet in the B+ tree.]
24

Bulk Loading Example (cont.)
[Figure: root with keys 6, 10, pointing to the leaf pages 3* 4*, 6* 9*, and 10* 11*; the remaining sorted pages are not yet in the B+ tree.]
25

Bulk Loading Example (cont.)
[Figure: the root now contains key 10, with two index-node children containing keys 6 and 12; leaf pages up to 12* 13* are in the tree, and the remaining sorted pages are not yet in the B+ tree.]
26

Bulk Loading Example (cont.)
[Figure: root with key 10; its children contain keys 6 and 12, 20; leaf pages up to 20* 22* are in the tree, and the remaining sorted pages are not yet in the B+ tree.]
27

Bulk Loading Example (cont.)
[Figure: root with keys 10, 20; its three index-node children contain keys 6; 12; and 23; leaf pages up to 23* 31* are in the tree, and the remaining sorted pages are not yet in the B+ tree.]
28

Bulk Loading (cont.)
 Index entries for leaf pages are always entered into the right-most index page just above the leaf level. When a node fills up, it splits. (The split may go up the right-most path to the root.)
 This is much faster than repeated inserts, especially when one considers locking!
[Figure (top): root with keys 10, 20; its children contain keys 6; 12; and 23, 35; leaf pages up to 35* 36* are in the tree, and the remaining data entry pages are not yet in the B+ tree.]
[Figure (bottom): after the split propagates up the right-most path, the root contains key 20; its children contain keys 10 and 35; the nodes below them contain keys 6; 12; 23; and 38; the leaf level is 3* 4* | 6* 9* | 10* 11* | 12* 13* | 20* 22* | 23* 31* | 35* 36* | 38* 41* | 44*, with the last data entry page(s) not yet in the B+ tree.]
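A minimal sketch of the bottom-up idea, assuming the data entries are already sorted: pack them into leaf pages, link the leaves, then repeatedly build the next index level using the smallest key reachable under each child as the separator. The Node layout, the single capacity parameter, and ignoring fill factors and minimum occupancy in the last node of each level are simplifying assumptions for illustration:

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                # separator keys (internal) or data-entry keys (leaf)
        self.children = children        # None for a leaf
        self.next = None                # right-sibling pointer (used for leaves)

def smallest_key(node):
    """Smallest data-entry key reachable in node's subtree."""
    while node.children is not None:
        node = node.children[0]
    return node.keys[0]

def bulk_load(sorted_keys, capacity):
    """Build a B+ tree bottom-up from sorted data-entry keys (capacity = max keys per node)."""
    # 1. Pack the sorted data entries into leaf pages, left to right, and link the leaves.
    leaves = [Node(sorted_keys[i:i + capacity]) for i in range(0, len(sorted_keys), capacity)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # 2. Build index levels until a single root remains.
    level = leaves
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), capacity + 1):
            group = level[i:i + capacity + 1]               # up to capacity+1 children per parent
            seps = [smallest_key(c) for c in group[1:]]     # one separator per extra child
            parents.append(Node(seps, children=group))      # (a real loader would rebalance
        level = parents                                     #  an underfull last node)
    return level[0]

root = bulk_load([3, 4, 6, 9, 10, 11, 12, 13, 20, 22, 23, 31, 35, 36, 38, 41, 44], capacity=2)
```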
29

Summary of Bulk Loading
 Non-bulk approach: Multiple SQL INSERT statements
  Slow
 Does not give sequential storage of leaves, unless we sort the keys before issuing the INSERT statements, and pre-allocate space for the leaf pages
 Lots of locking overhead
• May wish to use the LOCK TABLE statement in SQL
• Indexes require locking, too!
• Locking part of an index locks the corresponding data rows (and that may be a lot of rows) because rows cannot be inserted, updated, or deleted during this time.
 Lots of log records can be generated when loading a table via INSERT statements!
30

Summary of Bulk Loading (cont.)
 Bulk Loading
 Has advantages for concurrency control and logging
 Fewer I/Os during building
 Leaves will be stored sequentially (and will be linked, of course)
 Can control “fill factor” on pages, and even leave blank pages periodically
• In DB2, PCTFREE and FREEPAGE are options on the CREATE TABLESPACE statement and the CREATE INDEX statement.
31

Summary of Bulk Loading (cont.)
 Bulk Loading Steps during a Reorg:
 Backup the old table (i.e., do a full “image copy”).
 Unload (export) the records from the table.
 Sort keys/records by clustering order (for the clustering index and the initial data load).
 Delete the old table and its indexes.
 Allocate the new table and the new index files, perhaps with
more space and different PCTFREE and FREEPAGE parameters.
 Create the indexes (via CREATE INDEX) before loading.
 Turn off logging for the new table.
 Bulk load (populate) the table, and bulk-load/build the indexes.
 Backup the new table (e.g., DB2 will not let users use the table without running a full image copy beforehand … why not?)
 Run RUNSTATS and rebind the existing query plans.
 Turn logging back on.
 Make the table available to users.
32

Summary of ’Order’
 Different authors may have slightly different definitions of “order”.
 A useful rule/phrase to remember is: “non-root nodes are at least half full”.
 Variable-sized records and search keys mean different nodes may contain different numbers of entries.
 ** To simplify some of our calculations:
 We’ll assume that the maximum number of entries in an internal page and
a leaf page are the same (for Alternative 2 indexes).
 We’ll accept either an odd or an even number of maximum entries on a leaf/internal B+ tree page for our calculations (unless told otherwise).
 We’ll ignore “overhead” bytes on each page (e.g., sibling pointers and housekeeping bytes, which differ for different kinds of pages).
 Think back to Chapter 9c on page and record layouts, or to the prefix key compression slide, or even to Alt. 2 vs. Alt. 3—to gain an appreciation for these simplifications to our calculations (otherwise, we’d be overwhelmed).
33
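For example, a rough fanout calculation under the simplifications above; the page size and entry sizes are assumed values for illustration, not numbers from the textbook:

```python
page_bytes = 4096   # assumed page size (4 KB)
key_bytes  = 8      # assumed search key size
ptr_bytes  = 8      # assumed page-id / rid size

# Alternative 2, ignoring per-page overhead (sibling pointers, housekeeping bytes),
# and treating internal and leaf pages as having the same maximum, per this slide:
max_entries = page_bytes // (key_bytes + ptr_bytes)   # 256 entries per page
order_d     = max_entries // 2                        # "non-root nodes are at least half full"
print(max_entries, order_d)                           # 256 128
```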

Summary of ’Order’ (cont.)
 Some indexes may have a lot of duplicates, and can be handled via:
 Overflow pages
 Alternative 3 indexes
Regular Alt. 2 situation, with a tweak to the B+ tree algorithm so that records in the left child are ≤ instead of just < (and records in the right child remain as ≥)
 Concatenating the key and the rid, and then treating this combination as a unique key (Alternative 2)
 Even with fixed-length fields, multiple records with the same search key value (duplicates) can lead to variable-sized data entries in Alt. 3.
34

Summary of the Rest of the Chapter
 Tree-structured indexes are very efficient for range searches and for equality searches.
 We didn't cover ISAM indexes; however, they are similar in spirit to B+ tree indexes for searching.
  There is a nice coverage of ISAM indexes on pages 341-344 of the textbook. You won't be tested on them, but it'll strengthen your understanding of indexing.
 ISAM: Only leaf pages can be modified; the internal nodes do not change (they simply direct traffic).
 ISAM indexes are mostly static, and are well-suited to dictionaries or data that doesn't change frequently.
 Overflow pages are used when the target leaf page is full.
  • Leads to potentially long overflow chains
  • Overflow chains can degrade performance
35

Summary (cont.)
 A B+ tree is a dynamic index structure.
 Inserts/deletes leave the tree height-balanced.
 High fanout (F) means depth is rarely more than 3 or 4.
 O(log_F N) cost for:
  • Unique searches
  • Insertions
  • Deletions
 Almost always better than maintaining a sorted file
 In practice, nodes have about 67% occupancy, on average.
 Usually preferred over ISAM; little reason to use ISAM today
 Adjusts to growth gracefully
 Key compression increases fanout, reduces height
36

Summary (cont.)
 Bulk loading can be much faster than repeated INSERTs for a table, especially a large one.
 B+ trees are the most widely-used index in DBMSs because of their versatility.
  • One of the most optimized components of a DBMS!
 Indexing principles are very useful to know... even if you don't plan to work with databases in the future!
37