程序代写代做代考 algorithm chain cache data structure PowerPoint Presentation

PowerPoint Presentation

Tree Indexes
R & G – Chapter 10

1

Reminder on Heap Files
Two access APIs:
fetch by recordId (pageId, slotId)
scan (starting from some page)

Page 1
Page 2
Page 3
Page 4
Page 5
Page 6

Header Page
DIRECTORY

2

Wouldn’t it be nice…
…if we could look things up by value?
Toward a Declarative access API
But … efficiency?
“If you don’t find it in the index, look very carefully through the entire catalog. ”
—Sears, Roebuck, and Co., Consumers’ Guide, 1897

If you don’t find it in the index, look very carefully through the entire catalog.
–Sears, Roebuck, and Co., Consumers’ Guide, 1897

3

We’ve seen this before
Data structures … in RAM:
Search trees (Binary, AVL, Red-Black, …)
Hash tables
Needed: disk-based data structures
“paginated”: made up of disk pages!

Index
An index is data structure that enables fast lookup and modification of
data entries by search key
Lookup: may support many different operations
Equality, 1-d range, 2-d region, …
Search Key: any subset of columns in the relation
Do not need to be unique
—e.g. (firstname) or (firstname, lastname)

5

Index Part 2
An index is data structure that enables fast lookup and modification of
data entries by search key
Data Entries: items stored in the index
Assume for today: a pair (k, recordId) …
Pointers to records in Heap Files!
Easy to generalize later
Modification: want to support fast insert and delete
Many Types of indexes exist: B+-Tree, Hash, R-Tree, GiST, …

Simple Idea?
Step 1: Sort heap file & leave some space
Pages physically stored in logical order (sequential access)
Do we need “next” pointers to link pages?
No. Pages are physically sorted in logical order
Step 2: Build the index data structure over this…
Why not just use binary search in this heap file?
Fan-out of 2  deep tree  lots of I/Os
Examine entire records just to read key during search

1, 2, _
3, 4, _
5, 6, _
7, 8, _
9, 10, _

3, 4, 5
1, 2, 7
8, 6, 9
10, _, _
Input
Heap File

8

Build a high fan-out search tree
Start simple: Sorted (key, page id) file
No record data
Binary search in the key file. Better!
Forgot: Need to break across pages!

1, 2, _
3, 4, _
5, 6, _
7, 8, _
9, 10, _

1

3

5

7

9

9

Build a high fan-out search tree
Start simple: Sorted (key, page id) file
No record data
Binary search in the key file. Better!
Forgot: Need to break across pages!

1, 2, _
3, 4, _
5, 6, _
7, 8, _
9, 10, _

1

3

5

7

9

10

Build a high fan-out search tree Part 2
Start simple: Sorted (key, page id) file
No record data
Binary search in the key file. Better!
Complexity?

1, 2, _
3, 4, _
5, 6, _
7, 8, _
9, 10, _

1

3

5

7

9

Build a high fan-out search tree Part 3
Start simple: Sorted (key, page id) file
No record data
Binary search in the key file. Better!
Complexity: Still binary search, just a constant factor smaller input

1, 2, _
3, 4, _
5, 6, _
7, 8, _
9, 10, _

1

3

5

7

9

Build a high fan-out search tree Part 4
Recursively “index” key file
Key Invariant:
Node […, (KL, PL), (KR, PR), … ]  All tuples in range KL <= K < KR are in tree PL 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 1 3 5 7 9 1 7 Search a high fan-out search tree Searching for 5? Binary Search each node (page) starting at root Follow pointers to next level of search tree Complexity? O(logF(#Pages)) 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 1 3 5 7 9 1 7 14 Left Key Optimization? Optimization Do we need the left most key? 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 1 3 5 7 9 1 7 1 7 1 Build a high fan-out search tree Disk Layout? All in a single file, Data Pages first. 1, 2, _ 3, 4, _ 9,10,_ Data Pages 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 3 5 9 7 Index Pages 16 Status Check Some design goals: Fast sequential scan? High Fan-out? Support insertion? … 1, 2, _ 3, 4, _ 9,10,_ Data Pages Indexed File 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 3 5 9 7 Index Pages ISAM Indexed Sequential Access Method (Early IBM Indexing Technology) 17 Insert 11, Before 3, 4, _ 9,10,_ Data Pages 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, _ 3 5 9 7 1, 2, _ Index Pages Insert 11, After 3, 4, _ 9,10,11 Data Pages 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, 11 3 5 9 7 1, 2, _ Index Pages Find location Place in data page Re-sort page … Insert 12? Find location Place in data page Add overflow page if necessary … Overflow Pages 12, _, _ 1, 2, _ 3, 4, _ 5, 6, _ 7, 8, _ 9, 10, 11 3 5 9 7 12, _, _ 1, 2, _ 3, 4, _ 9,10,11 Data Pages Index Pages Recap: ISAM Data entries in sorted heap file High fan-out static tree index Fast search + good locality Assuming nothing changes Insert into overflow pages A Note of Caution ISAM is an old-fashioned idea Introduced by IBM in 1960s B+ trees are usually better, as we’ll see Though not always ( we’ll come back to this) But, it’s a good place to start Simpler than B+ tree, many of the same ideas Upshot Don’t brag about ISAM on your resume Do understand ISAM, and tradeoffs with B+ trees B+-Tree Enter the B+ Tree Similar to ISAM Same interior node structure pairs with same key invariant
Same search routine as before
Dynamic Tree Index
Always Balanced
Support efficient insertion & deletion
Grows at root not leaves!
“+”? B-tree that stores data entries in leaves only

Example of a B+ Tree
Occupancy Invariant
Each interior node is at least partially full:
d <= #entries <= 2d d: order of the tree (max fan-out = 2d + 1) Data pages at bottom need not be stored in logical order Next and prev pointers 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Root Node Page 1 Page 4 Page 6 Page 2 Page 3 Page 5 Page 7 Page 8 Page 9 26 Sanity Check What is the value of d? 2 What about the root? The root is special Why not in sequential order? Data pages allocated dynamically 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Root Node Page 1 Page 4 Page 6 Page 2 Page 3 Page 5 Page 7 Page 8 Page 9 27 B+ Trees and Scale How big is a height 1 B+ tree d = 2  Fan-out? Fan-out = 2d + 1 = 5 Height 1: 5 x 4 = 20 Records Root Node B+ Trees and Scale Part 2 Root Node How big is a height 3 B+ tree d = 2  Fan-out? Fan-out = 2d + 1 = 5 Height 3: 53 x 4= 500 Records B+ Trees in Practice Typical order: 1600. Typical fill-factor: 67%. average fan-out = 2144 (assuming 128 Kbytes pages at 40Bytes per record) At typical capacities Height 1: 21442 = 4,596,736 records Height 2: 21443 = 9,855,401,984 records Roughly 40 bytes per record 2144 * 128 kilobytes = 274.43 MB 2144 * 2144 * 128 kilobytes = 588.38 GB of pages 30 Searching the B+ Tree 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Root Node Page 1 Page 4 Page 6 Page 2 Page 3 Page 5 Page 7 Page 8 Page 9 Same as ISAM Find key = 27 Find split on each node (Binary Search) Follow pointer to next node Roughly 40 bytes per record 2144 * 128 kilobytes = 274.43 MB 2144 * 2144 * 128 kilobytes = 588.38 GB of pages 32 Searching the B+ Tree: Find 27 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Root Node Page 1 Page 4 Page 6 Page 2 Page 3 Page 5 Page 7 Page 8 Page 9 Same as ISAM Find key = 27 Find split on each node (Binary Search) Follow pointer to next node 27* Roughly 40 bytes per record 2144 * 128 kilobytes = 274.43 MB 2144 * 2144 * 128 kilobytes = 588.38 GB of pages 33 Searching the B+ Tree: Fetch Data 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 22* 24* 27* 29* 33* 34* 38* 39* Root Node Page 1 Page 4 Page 6 Page 2 Page 3 Page 5 Page 7 Page 8 Page 9 27* (20, Tim) (7, Dan) (5, Kay) (3, Jim) (27, Joe) (34, Kit) (1, Kim) (42, Hal) Page 1 Page 2 Page 3 Page 4 PageId, Record Id Roughly 40 bytes per record 2144 * 128 kilobytes = 274.43 MB 2144 * 2144 * 128 kilobytes = 588.38 GB of pages 34 Inserting 25* into a B+ Tree Part 1 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 25* Find the correct leaf Inserting 25* into a B+ Tree Part 2 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 25* 33* 34* 38* 39* Root Node Find the correct leaf If there is room in the leaf just add the entry Inserting 25* into a B+ Tree Part 3 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 25* 29* 33* 34* 38* 39* Root Node Find the correct leaf If there is room in the leaf just add the entry Sort the leaf page by key Inserting 8* into a B+ Tree: Find Leaf 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 8* Find the correct leaf 39 Inserting 8* into a B+ Tree: Insert 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 8* Find the correct leaf Split leaf if there is not enough room Inserting 8* into a B+ Tree: Split Leaf Find the correct leaf Split leaf if there is not enough room Redistribute entries evenly 13 17 24 30 2* 3* 5* 7* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 8* 41 5* 7* 8* Inserting 8* into a B+ Tree: Split Leaf, cont 13 17 24 30 5* 7* 8* 2* 3* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 2* 3* Find the correct leaf Split leaf if there is not enough room Redistribute entries evenly Fix next/prev pointers 42 Inserting 8* into a B+ Tree: Fix Pointers 13 17 24 30 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 2* 3* 5* 7* 8* Find the correct leaf Split leaf if there is not enough room Redistribute entries evenly Fix next/prev pointers 43 Inserting 8* into a B+ Tree: Mid-Flight 13 17 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node Something is still wrong! I am an orphan! 44 Inserting 8* into a B+ Tree: Copy Middle Key 13 17 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node Copy up from leaf the middle key No room in parent? Recursively split index nodes 5 13 Inserting 8* into a B+ Tree: Split Parent, Part 1 17 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* 5 Copy up from leaf the middle key No room in parent? Recursively split index nodes Redistribute the rightmost d keys 30 24 17 Inserting 8* into a B+ Tree: Split Parent, Part 2 13 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Copy up from leaf the middle key No room in parent? Recursively split index nodes Redistribute the rightmost d keys 5 30 24 17 47 Inserting 8* into a B+ Tree: Root Grows Up 13 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node Push up from interior node the middle key Now the last key on left No room in parent? Recursively split index nodes Redistribute the rightmost d keys 5 17 24 30 13 Inserting 8* into a B+ Tree: Root Grows Up, Pt 2 Recursively split index nodes Redistribute right d keys Push up middle key 17 24 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 17 17 17 5 30 49 Inserting 8* into a B+ Tree: Root Grows Up, Pt 3 Recursively split index nodes Redistribute right d keys Push up middle key 17 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 5 Copy up vs Push up! Notice: The leaf entry (5) was copied up The index entry (17) was pushed up 17 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node 5 Inserting 8* into a B+ Tree: Final Check invariants Key Invariant: Node[…, (KL, PL), …]  KL<= K for all K in PL Sub-tree Occupancy Invariant: d <= # entries <= 2d 17 5 13 24 30 2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39* Root Node B+ Tree Insert: Algorithm Sketch Find the correct leaf L. Put data entry onto L. If L has enough space, done! Else, must split L (into L and a new node L2) Redistribute entries evenly, copy up middle key Insert index entry pointing to L2 into parent of L. B+ Tree Insert: Algorithm Sketch Part 2 Step 2 can happen recursively To split index node, redistribute entries evenly, but push up middle key. (Contrast with leaf splits) Splits “grow” tree; root split increases height. Tree growth: gets wider or one level taller at top. Before and After Observations Notice that the root was split to increase the height Grow from the root not the leaves All paths from root to leaves are equal lengths Does the occupancy invariant hold? Yes! All nodes (except root) are at least half full Proof? After Before 56 Splitting a Leaf Start with full leaf (2d) entries (let d = 2) Add a 2d + 1 entry (8*) Split into leaves with (d, d+1) entries Copy key up to parent Why copy key and not push key up to parent? d entries d+1 entries 2* 3* 4* 5* 8* 2* 3* 5* 7* 8* 5 57 Splitting an Inner Node Start with full interior node (2d) entries: (let d = 2) Add a 2d + 1 entry Split into nodes with (d, d+1) entries Push key up to parent d entries d+1 entries 13 17 24 30 5 13 24 30 5 17 Splitting an Inner Node Pt 2 Start with full interior node (2d) entries: (let d = 2) Add a 2d + 1 entry Split into nodes with (d, d) entries Push key up to parent 13 17 24 30 5 d entries d entries 5 13 24 30 17 Splitting an Inner Node Pt 3 Start with full interior node (2d) entries: (let d = 2) Add a 2d + 1 entry Split into nodes with (d, d) entries Push key up to parent 13 17 24 30 5 d entries d entries 5 13 24 30 17 Why push not copy? Routing key not needed in child Occupancy invariant holds after split Nice Animation Online Great animation online of B+ Trees One small difference to note Upon deletion of leftmost value in a node, it updates the parent index entry Incurs unnecessary extra writes B+-Tree Deletion We will skip deletion In practice, occupancy invariant often not enforced Just delete leaf entries and leave space If new inserts come, great This is common If page becomes completely empty, can delete Parent may become underfull That’s OK too Guarantees still attractive: logF(max size of tree) BULK LOADING B+-TREES Bulk Loading of B+ Tree Part 1 Suppose we want to build an index on a large table Would it be efficient to just call insert repeatedly No … Why not? Random Order: CLZARNDXEKFWIUB. Order 2. Try it: Interactive demo Bulk Loading of B+ Tree Part 2 L D F R W A* B* C* D* E* F* I* K* L* N* R* U* W* X* Z* Constantly need to search from root Leaves and internal nodes mostly half-empty Modifying random pages: poor cache efficiency Bulk Loading of B+ Tree Part 2 Constantly need to search from leaf Leaves and nodes are mostly half full Modifying random pages -> poor cache efficiency

L

D
E

A
B
C

F
I
K

L
N

R
U

W
X
Z

D
F

R
W

16
Smarter Bulk Loading a B+ Tree

4

7

10

13

4

7

10

13

1*
2*
3*

4*
5*
6*

7*
8*
9*

10*
11*
12*

13*
14*
15*

16*
17*
18*

Sort the input records by key:
1*, 2*, 3*, 4*, …
We’ll learn a good disk-based sort algorithm soon!
Fill leaf pages to some fill factor (e.g. ¾)
Updating parent until full

68

Smarter Bulk Loading a B+ Tree Part 2

Sort the input records by key:
1*, 2*, 3*, 4*, …
Fill leaf pages to some fill factor (e.g. ¾)
Update parent until full
Then split parent (50/50) and copy to sibling

13

4

7

10

1*
2*
3*

4*
5*
6*

7*
8*
9*

10*
11*
12*

13*
14*
15*

16*
17*
18*

16

69

Smarter Bulk Loading a B+ Tree Part 3

Lower left part of the tree is never touched again
Occupancy invariant maintained

13

4

7

10

1*
2*
3*

4*
5*
6*

7*
8*
9*

10*
11*
12*

13*
14*
15*

16*
17*
18*

16

Never Touched Again

70

19

22

19*
20*
21*

22*
23*
24*

13

4

7

10

1*
2*
3*

4*
5*
6*

7*
8*
9*

10*
11*
12*

13*
14*
15*

16*
17*
18*

16

Smarter Bulk Loading a B+ Tree Part 4

Sort the input records by key:
1*, 2*, 3*, 4*, …
Fill leaf pages to some fill factor (e.g. ¾)
Update parent until full
Then split parent

Summary of Bulk Loading
Option 1: Multiple inserts
Slow
Does not give sequential storage of leaves
Option 2: Bulk Loading
Fewer I/Os during build. (Why?)
Leaves will be stored sequentially (and linked, of course)
Can control “fill factor” on pages.

Summary
ISAM is a static structure
Only leaf pages modified; overflow pages needed
Overflow chains can degrade performance unless size of data set and data distribution stay constant
B+ Tree is a dynamic structure
Inserts/deletes leave tree height-balanced; logFN cost
High fanout (F) means depth rarely more than 3 or 4.
Almost always better than maintaining a sorted file.
Typically, 67% occupancy on average
Usually preferable to ISAM; adjusts to growth gracefully.

Summary Cont.
Bulk loading can be much faster than repeated inserts for creating a B+ tree on a large data set.
B+ tree widely used because of its versatility
One of the most optimized components of a DBMS.
Concurrent Updates
In-memory efficiency

Graphic Components

5

13

17

33*
34*
38*
39*
Root Node

13

33*
5

17

5 13 24 30

2* 3* 5* 7* 8* 14* 16* 19* 20* 24* 27* 29* 33* 34* 38* 39*

Root Node