程序代写代做代考 Excel data structure algorithm hp0

hp0

HIERARCHICAL REPRESENTATIONS OF

POINT DATA

Hanan Samet
Computer Science Department and

Institute for Advanced Computer Studies and
Center for Automation Research

University of Maryland

College Park, MD 20742 USA
e-mail: hjs@umiacs.umd.edu

Copyright © 1998 Hanan Samet

These notes may not be reproduced by any means (mechanical or
electronic or any other) without the express written permission of Hanan
Samet

hp1
PRELIMINARIES

• File ≡ collection of records (N)
• Each record contains several attributes or keys (k)

Queries:
1. Point query
2. Range query (includes partial match)

3. Boolean query ≡ combine 1 and 2 with AND, OR, NOT

Search methods

1. Organize data to be stored
• boundaries of regions in the search space are

determined by the data
• e.g., binary search tree

2. Organize the embedding space from which the data is
drawn
• region boundaries in the search space are fixed
• e.g., address computation methods such as digital

searching
3. Hybrid

• use 1 for some attributes and 2 for others

Extreme solution:
• Bitmap representation where one bit is reserved for

every possible record in the multidimensional point
space whether or not it is present

• Problems:
1. large number of attributes
2. continuous data (non-discrete)

Copyright © 1998 by Hanan Samet

hp2

SIMPLE NON-HIERARCHICAL DATA STRUCTURES

1. Sequential list

NAME X Y

Chicago 35 42
Mobile 52 10
Toronto 62 77
Buffalo 82 65
Denver 5 45
Omaha 27 35
Atlanta 85 15
Miami 90 5

2. Inverted List

X Y

Denver Miami
Omaha Mobile
Chicago Atlanta
Mobile Omaha
Toronto Chicago
Buffalo Denver
Atlanta Buffalo
Miami Toronto

• 2 sorted lists

• Data is pointers

• Enables pruning the search with respect to one key

Copyright © 1998 by Hanan Samet

hp3

GRID METHOD

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

• Divide space into squares of width equal to the search
region

• Each cell contains a list of all points within it

• Assume L∞ distance metric (i.e., chessboard)

• Assume C = uniform distribution of points per cell

• Average search time for k-dimensional space is O(F•2k)

F = number of records found = C since query region has
the width of a cell

2k = number of cells examined

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

4
g

hp4

(82,65)
Buffalo

Buffalo

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

4
g

hp4

(82,65)
Buffalo

Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp4

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

4
g

hp4

(82,65)
Buffalo

Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp4

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp4

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

4
g

hp4

(82,65)
Buffalo

Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp4

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp4

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

7
v

hp4

(85,15)
Atlanta

Atlanta

Copyright © 1998 by Hanan Samet

POINT QUADTREE (Finkel/Bentley)
1 hp4
b

• Marriage between a uniform grid and a binary search tree

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp4

(52,10)
Mobile

Mobile

Copyright © 1998 by Hanan Samet

3
z

hp4

(62,77)
Toronto

Toronto

Copyright © 1998 by Hanan Samet

4
g

hp4

(82,65)
Buffalo

Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp4

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp4

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

7
v

hp4

(85,15)
Atlanta

Atlanta

Copyright © 1998 by Hanan Samet

8
z

hp4

(90,5)
Miami

Miami

Copyright © 1998 by Hanan Samet

hp5

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp5

1. B is the new root

2
r

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp5

2. D is the new root

2
r

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp5

3. G is the new root

2
r

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp52
r

4. I is the new root

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp53
z

• Optimal solution is to use H as the new root since the
shaded region is empty

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

hp53
z

• Optimal solution is to use H as the new root since the
shaded region is empty

4
g

• Problems:

1. must search for H

2. a node such as H may possibly not exist as is the
case if node J is present

J

PROBLEM OF DELETION IN POINT QUADTREES

1
b

A

I
G

B
C

H

E

D

F

• Delete node A

• Conventional algorithm takes one son as the new root
and reinserts the remaining subtrees

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

4
v

hp6

• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of

its bordering axes than any other candidate node
which is on the same side of those axes

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

4
v

hp6

• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of

its bordering axes than any other candidate node
which is on the same side of those axes

Copyright © 1998 by Hanan Samet

5
g

hp6

A

B

E
D

C

condition does not always hold

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

4
v

hp6

• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of

its bordering axes than any other candidate node
which is on the same side of those axes

Copyright © 1998 by Hanan Samet

5
g

hp6

A

B

E
D

C

condition does not always hold

Copyright © 1998 by Hanan Samet

6
z

hp6

A

B

ED

C

more than one node satisfies it

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

4
v

hp6

• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of

its bordering axes than any other candidate node
which is on the same side of those axes

Copyright © 1998 by Hanan Samet

5
g

hp6

A

B

E
D

C

condition does not always hold

Copyright © 1998 by Hanan Samet

6
z

hp6

A

B

ED

C

more than one node satisfies it

Copyright © 1998 by Hanan Samet

7
r

hp6

Lx

Ly

dx

dy

2. Use the L1 metric to break ties or deadlocks
• L1 metric is the sum of the displacements from the

bordering x and y axes
• rationale: area of shaded region is Lx•dy+Ly•dx-dx•dy

which can be approximated by 2dx•(Lx+Ly) assuming
dx=dy and dx being very small

Copyright © 1998 by Hanan Samet

MECHANICS OF DELETION IN POINT QUADTREES

• Ideally, want to replace deleted
node (A) with a node (B) that
leaves an empty shaded region

hp61
b

B

A

Copyright © 1998 by Hanan Samet

2
z

hp6

A

B

C

D A

D

C

B

• Involves search and instead settle on a set of
candidate nodes obtained using a method analogous
to searching a binary search tree

Copyright © 1998 by Hanan Samet

3
r

hp6

D is the closest node

Copyright © 1998 by Hanan Samet

4
v

hp6

• Set of candidates = “closest” node in each quadrant
1. choose the candidate node that is closer to each of

its bordering axes than any other candidate node
which is on the same side of those axes

Copyright © 1998 by Hanan Samet

5
g

hp6

A

B

E
D

C

condition does not always hold

Copyright © 1998 by Hanan Samet

6
z

hp6

A

B

ED

C

more than one node satisfies it

Copyright © 1998 by Hanan Samet

7
r

hp6

Lx

Ly

dx

dy

2. Use the L1 metric to break ties or deadlocks
• L1 metric is the sum of the displacements from the

bordering x and y axes
• rationale: area of shaded region is Lx•dy+Ly•dx-dx•dy

which can be approximated by 2dx•(Lx+Ly) assuming
dx=dy and dx being very small

Copyright © 1998 by Hanan Samet

8
v

hp6

• at most one of the remaining candidates will be in
the shaded region

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

2
z

hp7

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

2
z

hp7

Copyright © 1998 by Hanan Samet

3
v

hp7

1. no reinsertion in opposite quadrant (SW)

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

2
z

hp7

Copyright © 1998 by Hanan Samet

3
v

hp7

1. no reinsertion in opposite quadrant (SW)

Copyright © 1998 by Hanan Samet

4
r

hp7

2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

2
z

hp7

Copyright © 1998 by Hanan Samet

3
v

hp7

1. no reinsertion in opposite quadrant (SW)

Copyright © 1998 by Hanan Samet

4
r

hp7

2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant

Copyright © 1998 by Hanan Samet

5
g

hp7

3. NEWROOT: apply to quadrant containing replacement
node (NE)
• same subquadrant (NE): no reinsertion
• adjacent subquadrants (NW,SE): ADJ
• opposite subquadrant (SW): NEWROOT

Copyright © 1998 by Hanan Samet

ALGORITHM FOR DELETION IN POINT QUADTREES

1 hp7
b

A

N

M

L

JK D

B E

C

H
F

G

I

• Select a node satisfying the “closest” criteria to the
deleted node A to serve as the new root (B in the NE
quadrant

Copyright © 1998 by Hanan Samet

2
z

hp7

Copyright © 1998 by Hanan Samet

3
v

hp7

1. no reinsertion in opposite quadrant (SW)

Copyright © 1998 by Hanan Samet

4
r

hp7

2. ADJ: apply to adjacent quadrants (NW,SE):
if root remains in the quadrant (J) then
• no reinsertion in 2 subquadrants (NW,NE)
• apply ADJ to remaining 2 subquadrants (SW,SE)
else reinsert entire quadrant

Copyright © 1998 by Hanan Samet

5
g

hp7

3. NEWROOT: apply to quadrant containing replacement
node (NE)
• same subquadrant (NE): no reinsertion
• adjacent subquadrants (NW,SE): ADJ
• opposite subquadrant (SW): NEWROOT

Copyright © 1998 by Hanan Samet

6
b

hp7

• Comparison with conventional reinsertion algorithm
1. 5/6 reduction in number of nodes requiring

reinsertion (2/3 if pick a candidate at random)
2. number of comparisons during reinsertion was

observed to be ~log4n vs. a much larger factor
3. (average total path length)/(optimal total path length)

was observed to be constant vs. an increase
Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp85
v

(6,5)
Buffalo

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp85
v

(6,5)
Buffalo

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp86
z

(0,3)
Denver

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp85
v

(6,5)
Buffalo

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp86
z

(0,3)
Denver

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp87
g

(2,2)
Omaha

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp85
v

(6,5)
Buffalo

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp86
z

(0,3)
Denver

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp87
g

(2,2)
Omaha

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp88
r

(6,1)
Atlanta

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8
MX QUADTREE (Hunter)

• Points are like BLACK pixels in a region quadtree
• Useful for raster to vector conversion
• Empty cells are merged to form larger empty cells
• Only good for discrete data
• Good for sparse matrix applications
• Assume that the point is associated with the lower

left corner of each cell
• Ex: assume an 8 x 8 array
divide coordinate values by 12.5

1
b

(0,8) (8,8)

(8,0)(0,0)

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp82
r

(2,3)
Chicago

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp83
z

(4,0)
Mobile

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp84
g

(4,6)
Toronto

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp85
v

(6,5)
Buffalo

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp86
z

(0,3)
Denver

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp87
g

(2,2)
Omaha

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp88
r

(6,1)
Atlanta

Copyright © 1998 by Hanan Samet

hp8hp8hp8hp8hp8hp8hp8hp8hp8

(7,0)
Miami

9
v

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

4
g

hp9

(82,65)
Buffalo

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

4
g

hp9

(82,65)
Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp9

(5,45)
Denver

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

4
g

hp9

(82,65)
Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp9

(5,45)
Denver

Copyright © 1998 by Hanan Samet

6
g

hp9

(27,35)
Omaha

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

4
g

hp9

(82,65)
Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp9

(5,45)
Denver

Copyright © 1998 by Hanan Samet

6
g

hp9

(27,35)
Omaha

Copyright © 1998 by Hanan Samet

7
z

hp9

(85,15)
Atlanta

Copyright © 1998 by Hanan Samet

PR QUADTREE (Orenstein)
1 hp9
b

Regular decomposition point representation

Decomposition occurs whenever a block contains more
than one point
Useful when the domain of data points is not discrete
but finite

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Maximum level of decomposition depends on the
minimum separation between two points

• if two points are very close, then decomposition can be
very deep

• can be overcome by viewing blocks as buckets with
capacity c and only decomposing the block when it
contains more than c points

1.

2.

3.

4.

Ex: c = 1

Copyright © 1998 by Hanan Samet

2
r

hp9

(52,10)
Mobile

Copyright © 1998 by Hanan Samet

3
z

hp9

(62,77)
Toronto

Copyright © 1998 by Hanan Samet

4
g

hp9

(82,65)
Buffalo

Copyright © 1998 by Hanan Samet

5
v

hp9

(5,45)
Denver

Copyright © 1998 by Hanan Samet

6
g

hp9

(27,35)
Omaha

Copyright © 1998 by Hanan Samet

7
z

hp9

(85,15)
Atlanta

Copyright © 1998 by Hanan Samet

8
r

hp9

(90,5)
Miami

Copyright © 1998 by Hanan Samet

REGION SEARCH
1 hp10
b

Use of quadtree results in pruning the search space

Ex: Find all points within radius r of point A

A

r

Copyright © 1998 by Hanan Samet

REGION SEARCH
1 hp10
b

Use of quadtree results in pruning the search space

Ex: Find all points within radius r of point A

A

r

Copyright © 1998 by Hanan Samet

hp10

If a quadrant subdivision point p lies in a region l, then
search the quadrants of p specified by l

1. SE 6. NE 11. All but SW
2. SE, SW 7. NE, NW 12. All but SE
3. SW 8. NW 13. All
4. SE, NE 9. All but NW
5. SW, NW 10. All but NE

1 2 3
9 10

13

1211

4
5

876

2
r

Copyright © 1998 by Hanan Samet

REGION SEARCH
1 hp10
b

Use of quadtree results in pruning the search space

Ex: Find all points within radius r of point A

A

r

Copyright © 1998 by Hanan Samet

hp10

If a quadrant subdivision point p lies in a region l, then
search the quadrants of p specified by l

1. SE 6. NE 11. All but SW
2. SE, SW 7. NE, NW 12. All but SE
3. SW 8. NW 13. All
4. SE, NE 9. All but NW
5. SW, NW 10. All but NE

1 2 3
9 10

13

1211

4
5

876

2
r

Copyright © 1998 by Hanan Samet

hp103
z

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp113
z

2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp113
z

2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3

Copyright © 1998 by Hanan Samet

hp114
g

3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp113
z

2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3

Copyright © 1998 by Hanan Samet

hp114
g

3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A

Copyright © 1998 by Hanan Samet

hp115
v

4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp113
z

2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3

Copyright © 1998 by Hanan Samet

hp114
g

3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A

Copyright © 1998 by Hanan Samet

hp115
v

4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A

Copyright © 1998 by Hanan Samet

hp116
z

5. examine block 11 as the distance from P to the southern
border of 1 is shorter than the distance from P to A;
however, reject F as it is further from P than A

Copyright © 1998 by Hanan Samet

hp11
FINDING THE NEAREST OBJECT

• Ex: find the nearest object to P

1
b

P

12 8 7 6

13 9 1 4 5

2 3

10 11

D

E C

F

A

B

• Assume PR quadtree for points (i.e., at most one point
per block)
• Search neighbors of block 1 in counterclockwise order
• Points are sorted with respect to the space they occupy
which enables pruning the search space
• Algorithm:

Copyright © 1998 by Hanan Samet

hp112
r

1. start at block 2 and compute distance to P from A

Copyright © 1998 by Hanan Samet

hp113
z

2. ignore block 3 whether or not it is empty as A is closer
to P than any point in 3

Copyright © 1998 by Hanan Samet

hp114
g

3. examine block 4 as distance to SW corner is shorter
than the distance from P to A; however, reject B as it is
further from P than A

Copyright © 1998 by Hanan Samet

hp115
v

4. ignore blocks 6, 7, 8, 9, and 10 as the minimum
distance to them from P is greater than the distance
from P to A

Copyright © 1998 by Hanan Samet

hp116
z

5. examine block 11 as the distance from P to the southern
border of 1 is shorter than the distance from P to A;
however, reject F as it is further from P than A

Copyright © 1998 by Hanan Samet

hp117
r

• If F was moved, a better order would have started with
block 11, the southern neighbor of 1, as it is closest

new F

Copyright © 1998 by Hanan Samet

hp12
COMPARISON OF POINT, MX, AND PR QUADTREES

FEATURE MX PR POINT

Regular
decomposition

Yes Yes No

Type of nodes Data Stored in
leaf nodes and
non-leaf nodes
are for control

Data stored in leaf
nodes and non-
leaf nodes are for
control

Data stored in
leaf nodes and
non-leaf nodes

Shape of tree
depends on order
of inserting nodes

No No Yes

Deletion Simple but may
have to collapse
WHITE nodes

Simple but may
have to collapse
WHITE nodes

Complex

Size of space
represented

Finite Finite Unbounded

Type of data
represented

Discrete Continuous Continuous

Shape of space
represented

Square Rectangle Unbounded

Stores
coordinates

No Yes Yes

Depth of tree (d);
assume M points

All nodes are at
the same depth,
n for a 2n by 2n
region

For square region
with side length L
and minimum
separation S
between two
points,
log4(M-1) ≤ d
and
d≤log2((L/S)2.5)

log4(3M) ≤ d
and d ≤ M-1

Copyright © 1998 by Hanan Samet

APPLICATION OF THE MX QUADTREE (Hunter)

• Represent the boundary as a sequence of BLACK
pixels in a region quadtree

• Useful for a simple digitized polygon (i.e., non-
intersecting edges)

• Three types of nodes

1. interior – treat like WHITE nodes

2. exterior – treat like WHITE nodes

3. boundary – the edge of the polygon passes
through them and treated like BLACK nodes

• Disadvantages

1. a thickness is associated with the line segments

2. no more than 4 lines can meet at a point

hp131
b

Copyright © 1998 by Hanan Samet

APPLICATION OF THE MX QUADTREE (Hunter)

• Represent the boundary as a sequence of BLACK
pixels in a region quadtree

• Useful for a simple digitized polygon (i.e., non-
intersecting edges)

• Three types of nodes

1. interior – treat like WHITE nodes

2. exterior – treat like WHITE nodes

3. boundary – the edge of the polygon passes
through them and treated like BLACK nodes

• Disadvantages

1. a thickness is associated with the line segments

2. no more than 4 lines can meet at a point

hp131
b

Copyright © 1998 by Hanan Samet

hp132
r

Copyright © 1998 by Hanan Samet

hp14
MX-CIF QUADTREE (Kedem)

1
b

Collections of small rectangles for VLSI applications

Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets

1.

2.

3.

1

2

3

4
5

6

9

87

10 11

12

D {11}

F {12}E {3,4,5}

B {1}

A {2,6,7,8,9,10}

C {}

A

B
C

E

D

F

Copyright © 1998 by Hanan Samet

hp14
MX-CIF QUADTREE (Kedem)

1
b

Collections of small rectangles for VLSI applications

Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets

1.

2.

3.

1

2

3

4
5

6

9

87

10 11

12

D {11}

F {12}E {3,4,5}

B {1}

A {2,6,7,8,9,10}

C {}

A

B
C

E

D

F

Copyright © 1998 by Hanan Samet

hp142
r

Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point

4.

Copyright © 1998 by Hanan Samet

hp14
MX-CIF QUADTREE (Kedem)

1
b

Collections of small rectangles for VLSI applications

Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets

1.

2.

3.

1

2

3

4
5

6

9

87

10 11

12

D {11}

F {12}E {3,4,5}

B {1}

A {2,6,7,8,9,10}

C {}

A

B
C

E

D

F

Copyright © 1998 by Hanan Samet

hp142
r

Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point

4.

Copyright © 1998 by Hanan Samet

hp14

one for y-axis

Binary tree for y-
axis through A

Y1

Y2
10
Y4

2

Y5

Y3

6
Y7

8

Y6

3
g

Copyright © 1998 by Hanan Samet

hp14
MX-CIF QUADTREE (Kedem)

1
b

Collections of small rectangles for VLSI applications

Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets

1.

2.

3.

1

2

3

4
5

6

9

87

10 11

12

D {11}

F {12}E {3,4,5}

B {1}

A {2,6,7,8,9,10}

C {}

A

B
C

E

D

F

Copyright © 1998 by Hanan Samet

hp142
r

Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point

4.

Copyright © 1998 by Hanan Samet

hp14

one for y-axis

Binary tree for y-
axis through A

Y1

Y2
10
Y4

2

Y5

Y3

6
Y7

8

Y6

3
g

Copyright © 1998 by Hanan Samet

hp14

if a rectangle intersects both x and y axes, then
associate it with the y axis

4
v

Copyright © 1998 by Hanan Samet

hp14
MX-CIF QUADTREE (Kedem)

1
b

Collections of small rectangles for VLSI applications

Each rectangle is associated with its minimum
enclosing quadtree block
Like hashing: quadtree blocks serve as hash buckets

1.

2.

3.

1

2

3

4
5

6

9

87

10 11

12

D {11}

F {12}E {3,4,5}

B {1}

A {2,6,7,8,9,10}

C {}

A

B
C

E

D

F

Copyright © 1998 by Hanan Samet

hp142
r

Collision = more than one rectangle in a block
resolve by using two one-dimensional MX-CIF trees to
store the rectangle intersecting the lines passing
through each subdivision point

4.

Copyright © 1998 by Hanan Samet

hp14

one for y-axis

Binary tree for y-
axis through A

Y1

Y2
10
Y4

2

Y5

Y3

6
Y7

8

Y6

3
g

Copyright © 1998 by Hanan Samet

hp14

if a rectangle intersects both x and y axes, then
associate it with the y axis

4
v

Copyright © 1998 by Hanan Samet

hp145
z

one for x-axis

Binary tree for x-
axis through A

X1

X3
9

X5

7

X4

X2

X6

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

4
g

hp15

(82,65)
Buffalo

Buffalo

x test

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

4
g

hp15

(82,65)
Buffalo

Buffalo

x test

Copyright © 1998 by Hanan Samet

5
v

hp15

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

4
g

hp15

(82,65)
Buffalo

Buffalo

x test

Copyright © 1998 by Hanan Samet

5
v

hp15

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp15

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

4
g

hp15

(82,65)
Buffalo

Buffalo

x test

Copyright © 1998 by Hanan Samet

5
v

hp15

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp15

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

7
r

hp15

(85,15)
Atlanta

Atlanta

y test

Copyright © 1998 by Hanan Samet

K-D TREE (Bentley)

• Test one attribute at a time instead of all simultaneously
as in the point quadtree

• Usually cycle through all the attributes

• Shape of the tree depends on the order in which the
data is encountered

1 hp15
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

2
r

hp15

(52,10)
Mobile

Mobile

x test

Copyright © 1998 by Hanan Samet

3
z

hp15

(62,77)
Toronto

Toronto

y test

Copyright © 1998 by Hanan Samet

4
g

hp15

(82,65)
Buffalo

Buffalo

x test

Copyright © 1998 by Hanan Samet

5
v

hp15

(5,45)
Denver

Denver

Copyright © 1998 by Hanan Samet

6
g

hp15

(27,35)
Omaha

Omaha

Copyright © 1998 by Hanan Samet

7
r

hp15

(85,15)
Atlanta

Atlanta

y test

Copyright © 1998 by Hanan Samet

8
v

hp15

(90,5)
Miami

Miami

Copyright © 1998 by Hanan Samet

hp16

K-D TREE DELETION

• Similar to deletion in binary search trees

• Assume branch to HISON if test value is ≥ root value

• Assume root discriminates on the x coordinate and
subsequent alternation with the y coordinate value

• Algorithm:

1. replace root by node in HISON with the minimum x
coordinate value

2. repeat the process for the position of the replacement
node using the x or y coordinate values depending on
whether the replacement node is an x or a y
discriminator

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

3
z

hp17

min y

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

3
z

hp17

min y

Copyright © 1998 by Hanan Samet

4
g

hp17

min y

G (25,30)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

I (27,35)

H (29,40)

J (28,45)

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

3
z

hp17

min y

Copyright © 1998 by Hanan Samet

4
g

hp17

min y

G (25,30)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

I (27,35)

H (29,40)

J (28,45)

Copyright © 1998 by Hanan Samet

5
v

hp17

• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

3
z

hp17

min y

Copyright © 1998 by Hanan Samet

4
g

hp17

min y

G (25,30)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

I (27,35)

H (29,40)

J (28,45)

Copyright © 1998 by Hanan Samet

5
v

hp17

• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)

Copyright © 1998 by Hanan Samet

6
r

hp17

• What if the HISON of the root is empty?

A

T

Copyright © 1998 by Hanan Samet

EXAMPLE OF K-D TREE DELETION

Ex:

hp171
b

J (28,45)

A (20,20)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

G (25,30)

H (29,40)

I (27,35)

x

y

x

y

x

y

x

Copyright © 1998 by Hanan Samet

2
r

hp17

min x

Copyright © 1998 by Hanan Samet

3
z

hp17

min y

Copyright © 1998 by Hanan Samet

4
g

hp17

min y

G (25,30)

B (10,30) D (40,50)

C (10,20) E (30,40) F (35,60)

I (27,35)

H (29,40)

J (28,45)

Copyright © 1998 by Hanan Samet

5
v

hp17

• Analogy with binary search trees implies that we can
replace the root by the node in LOSON with the max x
coordinate value. However, if there is more than one
node with the same max value, then after replacement
there would be a node in LOSON which is not strictly less
than the new root (e.g., nodes B and C)

Copyright © 1998 by Hanan Samet

6
r

hp17

• What if the HISON of the root is empty?

A

T

Copyright © 1998 by Hanan Samet

7
z

hp17

1. replace root node A with the node B in LOSON having
the minimum x coordinate value and set the HISON
pointer of A to be the old LOSON pointer of A

2. repeat process for the position of the replacement node

B

T’
B

B is the node with
the minimum x
coordinate value
and replaces A

Copyright © 1998 by Hanan Samet

K-D TREE SEARCH

• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined

hp181
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

K-D TREE SEARCH

• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined

hp181
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
r

hp18

• Ex: find all points within 10 of (20,30)

(20,30)

Copyright © 1998 by Hanan Samet

K-D TREE SEARCH

• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined

hp181
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
r

hp18

• Ex: find all points within 10 of (20,30)

(20,30)

Copyright © 1998 by Hanan Samet

3
z

hp18

1. search the region entirely to the left of Chicago (x=35)

Copyright © 1998 by Hanan Samet

K-D TREE SEARCH

• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined

hp181
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
r

hp18

• Ex: find all points within 10 of (20,30)

(20,30)

Copyright © 1998 by Hanan Samet

3
z

hp18

1. search the region entirely to the left of Chicago (x=35)

Copyright © 1998 by Hanan Samet

4
g

hp18

2. search the region entirely below Denver (y=45)

Copyright © 1998 by Hanan Samet

K-D TREE SEARCH

• Search space can be pruned by testing if the search
region is completely contained in one of the partitions of
the node as now only one subtree must be examined

hp181
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto (82,65)

Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
r

hp18

• Ex: find all points within 10 of (20,30)

(20,30)

Copyright © 1998 by Hanan Samet

3
z

hp18

1. search the region entirely to the left of Chicago (x=35)

Copyright © 1998 by Hanan Samet

4
g

hp18

2. search the region entirely below Denver (y=45)

Copyright © 1998 by Hanan Samet

5
v

hp18

3. search yields Omaha

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

5
v

hp19

(82,65)
Buffalo

BuffaloToronto

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

5
v

hp19

(82,65)
Buffalo

BuffaloToronto

Copyright © 1998 by Hanan Samet

6
z

hp19

(5,45)
Denver

Chicago
Denver

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

5
v

hp19

(82,65)
Buffalo

BuffaloToronto

Copyright © 1998 by Hanan Samet

6
z

hp19

(5,45)
Denver

Chicago
Denver

Copyright © 1998 by Hanan Samet

7
g

hp19

(27,35)
Omaha

ChicagoOmaha

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

5
v

hp19

(82,65)
Buffalo

BuffaloToronto

Copyright © 1998 by Hanan Samet

6
z

hp19

(5,45)
Denver

Chicago
Denver

Copyright © 1998 by Hanan Samet

7
g

hp19

(27,35)
Omaha

ChicagoOmaha

Copyright © 1998 by Hanan Samet

8
r

hp19

(85,15)
Atlanta

Mobile
Atlanta

Copyright © 1998 by Hanan Samet

PR K-D TREE (Knowlton)

• A region contains at most one data point

• Analogous to EXCELL with bucket size of 1

1 hp19
b

(0,100) (100,100)

(100,0)(0,0)

Copyright © 1998 by Hanan Samet

2
r

hp19

(35,42)
Chicago

Chicago

Copyright © 1998 by Hanan Samet

3
z

hp19

(52,10)
Mobile

MobileChicago

Copyright © 1998 by Hanan Samet

4
g

hp19

(62,77)
Toronto

Mobile Toronto

Copyright © 1998 by Hanan Samet

5
v

hp19

(82,65)
Buffalo

BuffaloToronto

Copyright © 1998 by Hanan Samet

6
z

hp19

(5,45)
Denver

Chicago
Denver

Copyright © 1998 by Hanan Samet

7
g

hp19

(27,35)
Omaha

ChicagoOmaha

Copyright © 1998 by Hanan Samet

8
r

hp19

(85,15)
Atlanta

Mobile
Atlanta

Copyright © 1998 by Hanan Samet

(90,5)
Miami

9
v

hp19

MiamiAtlanta

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

4
z

hp20

16,X

OmahaDenver

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

4
z

hp20

16,X

OmahaDenver

Copyright © 1998 by Hanan Samet

5
g

hp20

26,Y

ChicagoMobile

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

4
z

hp20

16,X

OmahaDenver

Copyright © 1998 by Hanan Samet

5
g

hp20

26,Y

ChicagoMobile

Copyright © 1998 by Hanan Samet

6
g

hp20

40,Y

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

4
z

hp20

16,X

OmahaDenver

Copyright © 1998 by Hanan Samet

5
g

hp20

26,Y

ChicagoMobile

Copyright © 1998 by Hanan Samet

6
g

hp20

40,Y

Copyright © 1998 by Hanan Samet

AtlantaMiami

7
v

hp20

10,Y10,Y

Copyright © 1998 by Hanan Samet

ADAPTIVE K-D TREE

• Data is only stored in terminal nodes
• An interior node contains the median of the set as the
discriminator
• The discriminator key is the one for which the spread of
the values of the key is a maximum

hp201
b

(0,100) (100,100)

(100,0)(0,0)

(5,45)
Denver (35,42)

Chicago

(27,35)
Omaha

(52,10)
Mobile

(62,77)
Toronto

(82,65)
Buffalo

(85,15)
Atlanta

(90,5)
Miami

Copyright © 1998 by Hanan Samet

2
v

hp20

57,X

Copyright © 1998 by Hanan Samet

3
r

hp20

31,X

Copyright © 1998 by Hanan Samet

4
z

hp20

16,X

OmahaDenver

Copyright © 1998 by Hanan Samet

5
g

hp20

26,Y

ChicagoMobile

Copyright © 1998 by Hanan Samet

6
g

hp20

40,Y

Copyright © 1998 by Hanan Samet

AtlantaMiami

7
v

hp20

10,Y10,Y

Copyright © 1998 by Hanan Samet

8
r

hp20

72,X

BuffaloToronto

Copyright © 1998 by Hanan Samet

hp21
GRID FILE (Nievergelt,Hinterberger,Sevcik)

• Two level grid for storing points

• Uses a grid directory (a 2d array of grid blocks) on disk
that contains the address of the bucket (i.e., page) that
contains the data associated with the grid block

• Linear scales (a pair of 1d arrays) in core that access the
grid block in the grid directory (on disk) that is associated
with a particular point thereby enabling the
decomposition of the space to be arbitrary

• Guarantees access to any record with two disk
operations — one for each level of the grid

1. access the grid block

2. access the bucket

• Each bucket has finite capacity

• Partition upon overflow

1. bucket partition — overflowing bucket is associated
with more than one grid block

2. grid partition — overflowing bucket is associated with
just one grid block

• Splitting policies

1. split at midpoint and cycle through attributes

2. adaptive

• increases granularity of frequently queried attributes

• favors some attributes over others

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

3
r

hp22

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

C

y=70

A

70

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

3
r

hp22

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

C

y=70

A

70

Copyright © 1998 by Hanan Samet

4
v

hp22

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

3
r

hp22

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

C

y=70

A

70

Copyright © 1998 by Hanan Samet

4
v

hp22

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
g

hp22

(27,35)
Omaha

5. Insert Omaha causing a grid partition yielding bucket D

y=43

43

D

D

C

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

3
r

hp22

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

C

y=70

A

70

Copyright © 1998 by Hanan Samet

4
v

hp22

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
g

hp22

(27,35)
Omaha

5. Insert Omaha causing a grid partition yielding bucket D

y=43

43

D

D

C

Copyright © 1998 by Hanan Samet

6
r

hp22

(85,15)
Atlanta

6. Insert Atlanta causing a bucket partition yielding bucket E

E

Copyright © 1998 by Hanan Samet

GRID FILE EXAMPLE

• Assume bucket size = 2

hp221
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

Linear scales

1. Initially Chicago and Mobile in bucket A

x:
0 100

y:
0 100

A

Copyright © 1998 by Hanan Samet

2
z

hp22

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

45

x=45

B

Copyright © 1998 by Hanan Samet

3
r

hp22

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

C

y=70

A

70

Copyright © 1998 by Hanan Samet

4
v

hp22

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
g

hp22

(27,35)
Omaha

5. Insert Omaha causing a grid partition yielding bucket D

y=43

43

D

D

C

Copyright © 1998 by Hanan Samet

6
r

hp22

(85,15)
Atlanta

6. Insert Atlanta causing a bucket partition yielding bucket E

E

Copyright © 1998 by Hanan Samet

7. Insert Miami causing a grid partition yielding bucket F

7
v

hp22

(90,5)
Miami

B

x=70

E

F

70

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

4
r

hp23

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

4
r

hp23

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
v

hp23

(27,35)
Omaha

5. Insert Omaha; bucket A overflows; split A yielding bucket D

D

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

4
r

hp23

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
v

hp23

(27,35)
Omaha

5. Insert Omaha; bucket A overflows; split A yielding bucket D

D

Copyright © 1998 by Hanan Samet

6
g

hp23

6. Bucket A is still too full, so perform a grid partition

D C

B

E

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

4
r

hp23

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
v

hp23

(27,35)
Omaha

5. Insert Omaha; bucket A overflows; split A yielding bucket D

D

Copyright © 1998 by Hanan Samet

6
g

hp23

6. Bucket A is still too full, so perform a grid partition

D C

B

E

Copyright © 1998 by Hanan Samet

7
z

hp23

(85,15)
Atlanta

7. Insert Atlanta causing no change

Copyright © 1998 by Hanan Samet

EXCELL (Tamminen)

• Uses regular decomposition
• Like grid file, guarantees access to any record with two

disk operations
• Differentiated from grid file by absence of linear scales

which enable decomposition of space to be arbitrary
• Grid partition results in doubling the size of grid directory
• Ex: bucket size = 2

hp231
b

(0,100) (100,100)

(100,0)(0,0)

(35,42)
Chicago

(52,10)
Mobile

1. Initially, Chicago and Mobile in bucket A

A

Copyright © 1998 by Hanan Samet

2
r

hp23

(62,77)
Toronto

2. Insert Toronto causing a grid partition yielding bucket B

B

Copyright © 1998 by Hanan Samet

3
z

hp23

(82,65)
Buffalo

3. Insert Buffalo causing a grid partition yielding bucket C

CA

Copyright © 1998 by Hanan Samet

4
r

hp23

(5,45)
Denver

4. Insert Denver causing no change

Copyright © 1998 by Hanan Samet

5
v

hp23

(27,35)
Omaha

5. Insert Omaha; bucket A overflows; split A yielding bucket D

D

Copyright © 1998 by Hanan Samet

6
g

hp23

6. Bucket A is still too full, so perform a grid partition

D C

B

E

Copyright © 1998 by Hanan Samet

7
z

hp23

(85,15)
Atlanta

7. Insert Atlanta causing no change

Copyright © 1998 by Hanan Samet

8
v

hp23

(90,5)
Miami

8. Insert Miami causing a bucket partition yielding bucket F

F

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

4
g

hp24

Grid
Method

2: all attributes
(fixed number
of cells)

E

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

4
g

hp24

Grid
Method

2: all attributes
(fixed number
of cells)

E

Copyright © 1998 by Hanan Samet

5
v

hp24

Grid
File

3: permit the
number of
cells to vary

Excell

E D

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

4
g

hp24

Grid
Method

2: all attributes
(fixed number
of cells)

E

Copyright © 1998 by Hanan Samet

5
v

hp24

Grid
File

3: permit the
number of
cells to vary

Excell

E D

Copyright © 1998 by Hanan Samet

6
r

hp24

Point
Quadtree

4: only partition
the
overflowing
cellsD

PR
Quadtree

H

MX
Quadtree

E

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

4
g

hp24

Grid
Method

2: all attributes
(fixed number
of cells)

E

Copyright © 1998 by Hanan Samet

5
v

hp24

Grid
File

3: permit the
number of
cells to vary

Excell

E D

Copyright © 1998 by Hanan Samet

6
r

hp24

Point
Quadtree

4: only partition
the
overflowing
cellsD

PR
Quadtree

H

MX
Quadtree

E

Copyright © 1998 by Hanan Samet

7
z

hp24

K-d tree

5: only partition
the
overflowing
cells into 2
cellsD

PR
K-d tree

H

Copyright © 1998 by Hanan Samet

SUMMARY

• Data structures can be grouped:
N = No data organization

D = Organize data to be
stored (e.g., binary
search tree)

E = Organize the embedding
space from which the
data is drawn (e.g.,
digital searching)

H = Hybrid (combines at
least two of N, D, and E)

hp241
b

Copyright © 1998 by Hanan Samet

2
r

hp24

Sequential
List

0: no
organization

N

Copyright © 1998 by Hanan Samet

3
z

hp24

Inverted
List

1: one attribute

H

Copyright © 1998 by Hanan Samet

4
g

hp24

Grid
Method

2: all attributes
(fixed number
of cells)

E

Copyright © 1998 by Hanan Samet

5
v

hp24

Grid
File

3: permit the
number of
cells to vary

Excell

E D

Copyright © 1998 by Hanan Samet

6
r

hp24

Point
Quadtree

4: only partition
the
overflowing
cellsD

PR
Quadtree

H

MX
Quadtree

E

Copyright © 1998 by Hanan Samet

7
z

hp24

K-d tree

5: only partition
the
overflowing
cells into 2
cellsD

PR
K-d tree

H

Copyright © 1998 by Hanan Samet

8
g

hp24

Adaptive
K-d tree

6: partition using
attribute
having
greatest
spread across
the cell

D

Copyright © 1998 by Hanan Samet

hp25

DRAWBACKS OF MOST HASHING METHODS

• Require rehashing all the data when the hash table
becomes too full

• Goal: only move a few records
• Solutions:

1. Knott
• use a trie in the form of a binary tree

2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)

• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets

3. Linear hashing (Litwin)
• provides for linear growth in the number of

buckets (i.e., the hash table grows at a rate of one
bucket at a time)

• does not make use of a directory

1
b

A

B C

D

Copyright © 1998 by Hanan Samet

hp25

DRAWBACKS OF MOST HASHING METHODS

• Require rehashing all the data when the hash table
becomes too full

• Goal: only move a few records
• Solutions:

1. Knott
• use a trie in the form of a binary tree

2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)

• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets

3. Linear hashing (Litwin)
• provides for linear growth in the number of

buckets (i.e., the hash table grows at a rate of one
bucket at a time)

• does not make use of a directory

1
b

A

B C

D

Copyright © 1998 by Hanan Samet

hp252
r

• bucket overflow is solved by splitting
• drawback: accessing a bucket at

level m requires m operations
E F

Copyright © 1998 by Hanan Samet

hp25

DRAWBACKS OF MOST HASHING METHODS

• Require rehashing all the data when the hash table
becomes too full

• Goal: only move a few records
• Solutions:

1. Knott
• use a trie in the form of a binary tree

2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)

• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets

3. Linear hashing (Litwin)
• provides for linear growth in the number of

buckets (i.e., the hash table grows at a rate of one
bucket at a time)

• does not make use of a directory

1
b

A

B C

D

Copyright © 1998 by Hanan Samet

hp252
r

• bucket overflow is solved by splitting
• drawback: accessing a bucket at

level m requires m operations
E F

Copyright © 1998 by Hanan Samet

hp253
z

• implemented as a directory of pointers to buckets
• accessing a bucket requires 1 operation

A A A A B C D D

Copyright © 1998 by Hanan Samet

hp25

DRAWBACKS OF MOST HASHING METHODS

• Require rehashing all the data when the hash table
becomes too full

• Goal: only move a few records
• Solutions:

1. Knott
• use a trie in the form of a binary tree

2. extendible hashing
(Fagin, Nievergelt, Pippenger, Strong)

• like a trie except that all buckets are at same level
• buckets are accessed by use of a directory
• directory elements are NOT the same as buckets

3. Linear hashing (Litwin)
• provides for linear growth in the number of

buckets (i.e., the hash table grows at a rate of one
bucket at a time)

• does not make use of a directory

1
b

A

B C

D

Copyright © 1998 by Hanan Samet

hp252
r

• bucket overflow is solved by splitting
• drawback: accessing a bucket at

level m requires m operations
E F

Copyright © 1998 by Hanan Samet

hp253
z

• implemented as a directory of pointers to buckets
• accessing a bucket requires 1 operation

A A A A B C D D

Copyright © 1998 by Hanan Samet

hp254
g

• bucket overflow may cause doubling the directory
• e.g., EXCELL

A A A A A A A A B B E F D D D D

Copyright © 1998 by Hanan Samet

hp26
MECHANICS OF LINEAR HASHING

• Assume a file with m buckets

• Use two hashing functions hi(k) = f(k) mod 2i+1 for i = n
and i = n+1

1. compute hn+1(k) = x and use the result if x < m 2. otherwise use hn(k) • Such a file is said to be at level n,n+1 • There exist primary and overflow buckets • When a record hashes to a full primary bucket, then it is inserted into an overflow bucket corresponding to the primary bucket • τ : storage utilization factor τ = number of records in file divided by the total of available slots in primary and overflow buckets • When τ > a given load α, then one of the buckets is split

• When bucket i is split, it and its overflow bucket’s records
are rehashed using hn+1 and distributed into buckets i
and i + 2n as is appropriate

Copyright © 1998 by Hanan Samet

hp27

LINEAR HASHING INSERTION ALGORITHM

• Let s denote the identity of the next bucket to be split and
cycles from 0 to 2n-1

• Insertion algorithm

1. compute bucket address i for record r

2. insert r in bucket i

3. if τ > α, then split bucket i creating bucket i + 2n and
reinsert in buckets i and i + 2n

4. if s = 2n, then all buckets have been split

• increment n

• reset s to 0

5. if buckets i or i+1 overflow, then allocate an overflow
bucket

6. if rehashing causes some overflow buckets to be
reclaimed, repeat steps 3-5

• Notes

1. a bucket split need not necessarily occur when a
record hashes to a full bucket, nor does the bucket
being split need to be full

2. key principle is that eventually every bucket will be
split and ideally all overflow buckets will be emptied
and reclaimed

3. if the storage utilization gets too low, then buckets
should be reclaimed

Copyright © 1998 by Hanan Samet

hp28

BIT INTERLEAVING HASHING FUNCTIONS

• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Example with y being more significant than x

City x y f(z)=z div 12.5 Bit Interleaved
x y Value

Chicago 35 42 2 3 14

Mobile 52 10 4 0 16

Toronto 62 77 4 6 56

Buffalo 82 65 6 5 54

Denver 5 45 0 3 10

Omaha 27 35 2 2 12

Atlanta 85 15 6 1 22

Miami 90 5 7 0 21

1
b

Copyright © 1998 by Hanan Samet

hp28

BIT INTERLEAVING HASHING FUNCTIONS

• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Example with y being more significant than x

City x y f(z)=z div 12.5 Bit Interleaved
x y Value

Chicago 35 42 2 3 14

Mobile 52 10 4 0 16

Toronto 62 77 4 6 56

Buffalo 82 65 6 5 54

Denver 5 45 0 3 10

Omaha 27 35 2 2 12

Atlanta 85 15 6 1 22

Miami 90 5 7 0 21

1
b

Copyright © 1998 by Hanan Samet

hp282
r

0 0 1

1 1 0

y

x

• Ex: Atlanta (6,1)

Copyright © 1998 by Hanan Samet

hp28

BIT INTERLEAVING HASHING FUNCTIONS

• Bit interleaving takes one bit from the binary
representation of the x coordinate value and one bit
from the binary representation of the y coordinate
value and alternates them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Example with y being more significant than x

City x y f(z)=z div 12.5 Bit Interleaved
x y Value

Chicago 35 42 2 3 14

Mobile 52 10 4 0 16

Toronto 62 77 4 6 56

Buffalo 82 65 6 5 54

Denver 5 45 0 3 10

Omaha 27 35 2 2 12

Atlanta 85 15 6 1 22

Miami 90 5 7 0 21

1
b

Copyright © 1998 by Hanan Samet

hp282
r

0 0 1

1 1 0

y

x

• Ex: Atlanta (6,1)

Copyright © 1998 by Hanan Samet

hp283
z

0 1 0 1 1 0 = 22

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp295
v

Denver

• Insert Denver (10) into bucket 2 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp295
v

Denver

• Insert Denver (10) into bucket 2 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp296
g

Omaha

• Insert Omaha (12) into bucket 0 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp295
v

Denver

• Insert Denver (10) into bucket 2 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp296
g

Omaha

• Insert Omaha (12) into bucket 0 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp297
r

Atlanta

• Insert Atlanta (22) into bucket 2’s overflow area

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp295
v

Denver

• Insert Denver (10) into bucket 2 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp296
g

Omaha

• Insert Omaha (12) into bucket 0 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp297
r

Atlanta

• Insert Atlanta (22) into bucket 2’s overflow area

Copyright © 1998 by Hanan Samet

hp298
z

Omaha

4

• Insert Miami (21) into bucket 1:
τ=0.67 and split bucket 0 creating bucket 4; move Omaha to
bucket 4

Miami

Copyright © 1998 by Hanan Samet

hp29

EXAMPLE OF LINEAR HASHING

• Assume primary and overflow bucket capacity is 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 0 exists

1
b

0

Copyright © 1998 by Hanan Samet

hp292
r

• Insert Chicago (14) and Mobile (16):
τ=1 and split bucket 0 creating bucket 1

Chicago

0 1

Mobile

Copyright © 1998 by Hanan Samet

hp29

• Insert Toronto (56) into bucket 0:
τ=0.75 and split bucket 0 creating bucket 2; move
Chicago to bucket 2

3
z

Toronto
Chicago

2

Copyright © 1998 by Hanan Samet

hp29

• Insert Buffalo (54) into bucket 2:
τ=0.67 and split bucket 1 creating bucket 3

4
g

Buffalo

0 3

Copyright © 1998 by Hanan Samet

hp295
v

Denver

• Insert Denver (10) into bucket 2 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp296
g

Omaha

• Insert Omaha (12) into bucket 0 which causes it to overflow

Copyright © 1998 by Hanan Samet

hp297
r

Atlanta

• Insert Atlanta (22) into bucket 2’s overflow area

Copyright © 1998 by Hanan Samet

hp298
z

Omaha

4

• Insert Miami (21) into bucket 1:
τ=0.67 and split bucket 0 creating bucket 4; move Omaha to
bucket 4

Miami

Copyright © 1998 by Hanan Samet

hp299
v

Miami

5

• Reclaim the overflow area of bucket 0:
τ=0.67 again and split bucket 1 creating bucket 5; move
Miami to bucket 5

Copyright © 1998 by Hanan Samet

hp30

ORDER PRESERVING LINEAR HASHING (OPLH)

• Problem: the hash function hn(k) = k mod 2n implies that
all records in a given bucket agree in the least
n significant bits

1. OK for random access
2. unacceptable for sequential access as each record will

be in a different bucket
• Solution: use the hash function

hn(k) = (reverse(k)) mod 2n

1. tests the n most significant bits
2. all records in a bucket are within a given range

City x y f(z)=z mod 12.5 Bit Interleaved Value
x y x most sig y most sig

Chicago
35 42 2 3 44 28

Mobile 52 10 4 0 1 2
Toronto 62 77 4 6 11 7
Buffalo 82 65 6 5 39 27
Denver 5 45 0 3 40 20
Omaha 27 35 2 2 12 12
Atlanta 85 15 6 1 37 26
Miami 90 5 7 0 21 42

• Shortcomings
1. records may not be scattered too well

• overflow is much more common than with traditional
hashing methods

• random access is slower since several overflow
buckets may have to be examined

2. creates a large number of sparsely filled buckets
• sequential access may be slower as may have to

examine many empty buckets

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp314
g

(82,65)
Buffalo3

• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp314
g

(82,65)
Buffalo3

• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3

Copyright © 1998 by Hanan Samet

hp315
v

(5,45)
Denver

• Insert Denver (20) into bucket 0

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp314
g

(82,65)
Buffalo3

• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3

Copyright © 1998 by Hanan Samet

hp315
v

(5,45)
Denver

• Insert Denver (20) into bucket 0

Copyright © 1998 by Hanan Samet

hp316
g

(27,35)
Omaha4

• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp314
g

(82,65)
Buffalo3

• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3

Copyright © 1998 by Hanan Samet

hp315
v

(5,45)
Denver

• Insert Denver (20) into bucket 0

Copyright © 1998 by Hanan Samet

hp316
g

(27,35)
Omaha4

• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4

Copyright © 1998 by Hanan Samet

hp317
v

(85,15)
Atlanta

• Insert Atlanta (26) into bucket 2

Copyright © 1998 by Hanan Samet

hp31

EXAMPLE OF ORDER PRESERVING LINEAR HASHING
• Assume primary and overflow bucket capacity is 2
• A bucket is split whenever τ≥α=0.66
• Initially, only bucket 0 exists

(0,100) (100,100)

(100,0)(0,0)

0

1
b

Copyright © 1998 by Hanan Samet

hp312
r

(52,10)
Mobile

(35,42)
Chicago

1

• Insert Chicago (28) and Mobile (2): τ=1 and split bucket
0 creating bucket 1

Copyright © 1998 by Hanan Samet

hp313
z

(62,77)
Toronto

2

• Insert Toronto (7) into bucket 1: τ=0.75 and split bucket
0 creating bucket 2; move Mobile to bucket 2

Copyright © 1998 by Hanan Samet

hp314
g

(82,65)
Buffalo3

• Insert Buffalo (27) into 1: τ=0.67, split bucket 1 creating
bucket 3; move Toronto and Buffalo to bucket 3

Copyright © 1998 by Hanan Samet

hp315
v

(5,45)
Denver

• Insert Denver (20) into bucket 0

Copyright © 1998 by Hanan Samet

hp316
g

(27,35)
Omaha4

• Insert Omaha (12) into 0: τ=0.75 and split bucket 0 creating
bucket 4; move Denver and Chicago to bucket 4

Copyright © 1998 by Hanan Samet

hp317
v

(85,15)
Atlanta

• Insert Atlanta (26) into bucket 2

Copyright © 1998 by Hanan Samet

hp318
z

(90,5)
Miami

5

• Insert Miami (42) into bucket 2: τ=0.67 and split bucket 1
creating bucket 5; move Miami to bucket 2’s overflow area

Copyright © 1998 by Hanan Samet

hp32

COMPARISON OF OPLH WITH EXCELL

OPLH EXCELL
1. Implicit directory

• one directory element
per primary bucket

• all buckets stored at
one of two levels

• overflow buckets

1. Explicit directory
• set of primary buckets

2. Reverse bit interleaving 2. Reverse bit interleaving
3. Bucket overflow

• allocate at most two
additional buckets (one
for the bucket that has
been split and one for
the overflowing bucket)

3. Bucket overflow
• triggers bucket split or

directory doubling

4.Retrieval of a record
requires examining primary
and overflow buckets

4.Retrieve any record with
two disk accesses

• Summary: order preserving linear hashing (OPLH) yields
a more gradual growth in the size of the
directory at the expense of the loss of the
guarantee of retrieval of any record with just
two disk accesses

Copyright © 1998 by Hanan Samet

hp33
EXAMPLES OF COMPARISON OF OPLH WITH EXCELL
1. Reversed bit interleaving with the y coordinate value as

the most significant (i.e., y is split first)
OPLH: EXCELL:

• 6 primary buckets • 7 buckets
• 2 overflow buckets • 16 directory elements

2. Reversed bit interleaving with the x coordinate value as
the most significant (i.e., x is split first)
OPLH: EXCELL:

• 7 primary buckets • 6 buckets
• no overflow buckets • 8 directory elements

(52,10)
Mobile

(35,42)
Chicago

(62,77)
Toronto

(82,65)
Buffalo

(5,45)
Denver

(27,35)
Omaha (85,15)

Atlanta

(90,5)
Miami

(52,10)
Mobile

(5,45)
Denver

(85,15)
Atlanta

(27,35)
Omaha

(0,100) (100,100)

(100,0)(0,0)

0

1

2

3

4

5

(0,100) (100,100)

(100,0)(0,0)

A

(52,10)
Mobile

(35,42)
Chicago

B

(62,77)
Toronto

C

(82,65)
Buffalo

(5,45)
Denver

(27,35)
OmahaD

(85,15)
Atlanta

(90,5)
Miami

B B B B

B BB

A G

E
F F

(0,100) (100,100)

(100,0)(0,0)

A

(52,10)
Mobile

(35,42)
Chicago

D

(62,77)
Toronto

B

(82,65)
Buffalo

(5,45)
Denver

(27,35)
Omaha

(85,15)
Atlanta

(90,5)
Miami

D CC

E F

(0,100) (100,100)

(100,0)(0,0)

0

2

1

3

4

6

5

(35,42)
Chicago

(62,77)
Toronto

(82,65)
Buffalo

(90,5)
Miami

Copyright © 1998 by Hanan Samet

hp341
bSPIRAL HASHING (Martin)

• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the

probability of the occurence of overflow
2. all buckets that are candidates for a split have the

same probability of overflowing

• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses

1. records are distributed in the active buckets in an
uneven manner

2. split the bucket with the highest probability of
overflowing

• When a bucket s is split

1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used

• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split

• Ex: assume that initially there are d – 1 active buckets
starting at address 1

• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)

1d = 2:

Copyright © 1998 by Hanan Samet

hp341
bSPIRAL HASHING (Martin)

• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the

probability of the occurence of overflow
2. all buckets that are candidates for a split have the

same probability of overflowing

• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses

1. records are distributed in the active buckets in an
uneven manner

2. split the bucket with the highest probability of
overflowing

• When a bucket s is split

1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used

• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split

• Ex: assume that initially there are d – 1 active buckets
starting at address 1

• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)

1d = 2:

Copyright © 1998 by Hanan Samet

hp342
r

2 3

Copyright © 1998 by Hanan Samet

hp341
bSPIRAL HASHING (Martin)

• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the

probability of the occurence of overflow
2. all buckets that are candidates for a split have the

same probability of overflowing

• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses

1. records are distributed in the active buckets in an
uneven manner

2. split the bucket with the highest probability of
overflowing

• When a bucket s is split

1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used

• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split

• Ex: assume that initially there are d – 1 active buckets
starting at address 1

• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)

1d = 2:

Copyright © 1998 by Hanan Samet

hp342
r

2 3

Copyright © 1998 by Hanan Samet

hp343
z

3 4 5

Copyright © 1998 by Hanan Samet

hp341
bSPIRAL HASHING (Martin)

• Drawbacks of linear hashing:
1. order in which buckets are split is unrelated to the

probability of the occurence of overflow
2. all buckets that are candidates for a split have the

same probability of overflowing

• Central idea is the existence of an ever-changing (and
growing) address space of active bucket addresses

1. records are distributed in the active buckets in an
uneven manner

2. split the bucket with the highest probability of
overflowing

• When a bucket s is split

1. create d new buckets
2. rehash the contents of s into the d new buckets
3. bucket s is no longer used

• If [s,t ] are the active buckets, then [s + 1,t + d ] are
the active buckets after the split

• Ex: assume that initially there are d – 1 active buckets
starting at address 1

• After s bucket splits, there are (s + 1)•(d + 1) active
buckets starting at address s + 1 (prove by induction)

1d = 2:

Copyright © 1998 by Hanan Samet

hp342
r

2 3

Copyright © 1998 by Hanan Samet

hp343
z

3 4 5

Copyright © 1998 by Hanan Samet

hp344
g

4 5 6 7

Copyright © 1998 by Hanan Samet

hp35

SPIRAL HASHING FUNCTION

• Assume that initially there are d −1 active buckets starting
at address 1

• Key idea is the behavior of the function y = dx

1. dx+1 − dx = dx • (d −1)
2. bucket s has just been split
3. let dx = s +1 = address of first active bucket
4. implies dx+1 − dx = (s +1) • (d −1) = number of active buckets
5. last active bucket is at address dx+1 −1
6. [dx,dx+1 ) is range of active buckets

• Use two hashing functions
1. h(k) maps key k uniformly into [0,1) which is the range of the

difference in exponent values of addresses iny = dx+1 − dx

2. y(k) maps h(k) into an address in [s+1,(s+1)+(s+1)•(d −1))

• y(k) = dx(k) with x(k) in range [logd (s+1),logd (s+1)+1)
• before split, active buckets lay in range [logd s,logd s+1)
• want to make sure that all key values previously in

bucket s − i.e., x(k) in [logds,logd (s+1)) are rehashed
into one of the new buckets with an x(k) value in
[logds+1,logd (s+1)+1)

• leave other key values in [logd (s+1),logd s +1) unchanged
• difficult to choose x(k)

a. x(k) = logd (s+1) + h(k)
• drawback: must rehash all keys when a bucket is split

b. x(k) = logd (s+1) − h(k)  + h(k)
• guarantees that if k is hashed into bucket b (≥ s + 1)

then it continues to hash there until bucket b is split
• implies that x(k) is a number in the range
[logd (s +1),logd (s+1) +1) whose fractional part is h(k)

Copyright © 1998 by Hanan Samet

hp36

BEHAVIOR OF THE SPIRAL HASHING FUNCTION

• Ex: d = 2

1
b

y

15
14
13
12
11
10
9
8
7
6
5
4
3
2
1

y = 2x

x
Bucket
Number

Relative
Load

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0

0.1

0.2

0.5

1.0

The function y = 2x Relative load of buckets 1-15

Bucket address Hash interval Relative load

1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932

Copyright © 1998 by Hanan Samet

hp36

BEHAVIOR OF THE SPIRAL HASHING FUNCTION

• Ex: d = 2

1
b

y

15
14
13
12
11
10
9
8
7
6
5
4
3
2
1

y = 2x

x
Bucket
Number

Relative
Load

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0

0.1

0.2

0.5

1.0

The function y = 2x Relative load of buckets 1-15

Bucket address Hash interval Relative load

1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932

Copyright © 1998 by Hanan Samet

hp362
r

• Split bucket 3

Copyright © 1998 by Hanan Samet

hp36

BEHAVIOR OF THE SPIRAL HASHING FUNCTION

• Ex: d = 2

1
b

y

15
14
13
12
11
10
9
8
7
6
5
4
3
2
1

y = 2x

x
Bucket
Number

Relative
Load

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
0.0

0.1

0.2

0.5

1.0

The function y = 2x Relative load of buckets 1-15

Bucket address Hash interval Relative load

1 [0.0000,1.0000) 1.0000
2 [0.0000,0.5849) 0.5849
3 [0.5849,1.0000) 0.4151
4 [0.0000,0.3219) 0.3219
5 [0.3219,0.5849) 0.2630
6 [0.5849,0.8073) 0.2224
7 [0.8073,1.0000) 0.1927
8 [0.0000,0.1699) 0.1699
9 [0.1699,0.3219) 0.1520
10 [0.3219,0.4594) 0.1375
11 [0.4594,0.5849) 0.1255
12 [0.5849,0.7004) 0.1155
13 [0.7004,0.8073) 0.1069
14 [0.8073,0.9068) 0.0995
15 [0.9068,1.0000) 0.0932

Copyright © 1998 by Hanan Samet

hp362
r

• Split bucket 3

Copyright © 1998 by Hanan Samet

hp363
z

• Yields buckets 6 and 7
Copyright © 1998 by Hanan Samet

hp37

TABLES FOR EXAMPLE OF SPIRAL HASHING

• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH

• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64

Chicago 35 42 2 3 14 .21875

Mobile 52 10 4 0 16 .25

Toronto 62 77 4 6 56 .875

Buffalo 82 65 6 5 54 .84375

Denver 5 45 0 3 10 .15625

Omaha 27 35 2 2 12 .1875

Atlanta 85 15 6 1 22 .34375

Miami 90 5 7 0 21 .328125

1
b

Copyright © 1998 by Hanan Samet

hp37

TABLES FOR EXAMPLE OF SPIRAL HASHING

• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH

• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64

Chicago 35 42 2 3 14 .21875

Mobile 52 10 4 0 16 .25

Toronto 62 77 4 6 56 .875

Buffalo 82 65 6 5 54 .84375

Denver 5 45 0 3 10 .15625

Omaha 27 35 2 2 12 .1875

Atlanta 85 15 6 1 22 .34375

Miami 90 5 7 0 21 .328125

1
b

Copyright © 1998 by Hanan Samet

hp372
r

0 0 1

1 1 0

y

x

• Ex: Atlanta (6,1)

Copyright © 1998 by Hanan Samet

hp37

TABLES FOR EXAMPLE OF SPIRAL HASHING

• Use bit interleaving to form a value k – i.e., take one bit
from the binary representation of the x coordinate
value and one bit from the binary representation of the
y coordinate value and alternate them

• Use city coordinate values and divide by 12.5 so that
each coordinate value requires just three binary digits

• Use h(k) = k/64 which has same effect as reverse bit
interleaving and behavior is analogous to OPLH

• Example with y being more significant than x
City x y f(z)=z div 12.5
x y k/64

Chicago 35 42 2 3 14 .21875

Mobile 52 10 4 0 16 .25

Toronto 62 77 4 6 56 .875

Buffalo 82 65 6 5 54 .84375

Denver 5 45 0 3 10 .15625

Omaha 27 35 2 2 12 .1875

Atlanta 85 15 6 1 22 .34375

Miami 90 5 7 0 21 .328125

1
b

Copyright © 1998 by Hanan Samet

hp372
r

0 0 1

1 1 0

y

x

• Ex: Atlanta (6,1)

Copyright © 1998 by Hanan Samet

hp373
z

0 1 0 1 1 0 = 22

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4

3
z

Toronto

5

4

5
4 5

TOR CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4

3
z

Toronto

5

4

5
4 5

TOR CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7

4
g

Buffalo

6 7

6 7

TOR
BUF

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4

3
z

Toronto

5

4

5
4 5

TOR CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7

4
g

Buffalo

6 7

6 7

TOR
BUF

Copyright © 1998 by Hanan Samet

hp385
g

• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area

Omaha

Denver
DEN
OMA

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4

3
z

Toronto

5

4

5
4 5

TOR CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7

4
g

Buffalo

6 7

6 7

TOR
BUF

Copyright © 1998 by Hanan Samet

hp385
g

• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area

Omaha

Denver
DEN
OMA

Copyright © 1998 by Hanan Samet

hp386
v

• Insert Atlanta (.34) into bucket 5: τ=0.7 and split bucket 4
creating buckets 8 and 9; Denver is moved to bucket 8,
while Chicago, Mobile, and Omaha are moved to bucket 9
which causes it to overflow

Atlanta
9

8

9

8 9

CHI
MOB

DEN

OMA

ATL

Copyright © 1998 by Hanan Samet

hp38

MECHANICS OF EXAMPLE OF SPIRAL HASHING

• Assume d = 2 and primary and overflow bucket capacity of 2

• A bucket is split whenever τ ≥ α = 0.66
• Initially, only bucket 1 exists

1
b

1

1

Copyright © 1998 by Hanan Samet

hp38

• Insert Chicago (.22) and Mobile (.25): τ=1 and split bucket 1
creating buckets 2 and 3; Chicago and Mobile are moved to
bucket 2

2
r

Mobile

Chicago

3

1 2

2 3

CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Toronto (.87) into bucket 3: τ=0.75 and split bucket 2
creating buckets 3 and 4; Chicago and Mobile are moved to
bucket 4

3
z

Toronto

5

4

5
4 5

TOR CHI
MOB

Copyright © 1998 by Hanan Samet

hp38

• Insert Buffalo (.84) into bucket 3: τ=0.67 and split bucket 3
creating buckets 6 and 7; Toronto and Buffalo are moved to
bucket 7

4
g

Buffalo

6 7

6 7

TOR
BUF

Copyright © 1998 by Hanan Samet

hp385
g

• Insert Denver (.16) and Omaha (.19) into bucket 4’s overflow
area

Omaha

Denver
DEN
OMA

Copyright © 1998 by Hanan Samet

hp386
v

• Insert Atlanta (.34) into bucket 5: τ=0.7 and split bucket 4
creating buckets 8 and 9; Denver is moved to bucket 8,
while Chicago, Mobile, and Omaha are moved to bucket 9
which causes it to overflow

Atlanta
9

8

9

8 9

CHI
MOB

DEN

OMA

ATL

Copyright © 1998 by Hanan Samet

hp387
v

• Insert Miami (.33) into bucket 5: τ=0.67 and split bucket 5
creating buckets 10 and 11; Move Atlanta and Miami to
bucket 10

10 11

ATL
MIA

Miami

1110
11

Copyright © 1998 by Hanan Samet

hp39

COMPARISON OF SPIRAL AND LINEAR HASHING

• Main advantage of spiral hashing over linear hashing is
that the bucket being split is the one most likely to
overflow

• Disadvantages of spiral hashing:

1. the buckets that have been split are not reused

• overcome by using a mapping between logical and
physical addresses

2. expensive to calculate function y = dx

• overcome by use of an approximation

Copyright © 1998 by Hanan Samet

hp40
WHY SPIRAL?

• Can rewrite y = dx  as ρ = e jθ  using polar coordinates
which yields the equation of a spiral

• Ex: ρ = e (ln 2)•θ/2π

1
b

-10 -5 5 10 15

-10

-5

5

10

3

6

12

14

7
15

13

5

10

11

9

• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π

• Mechanics of a bucket split
1. let first active bucket be at a = e j•b  (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active

bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b

4. buckets a + 1 through g are now the active buckets
• Ex: d = 2

8
16

Copyright © 1998 by Hanan Samet

hp40
WHY SPIRAL?

• Can rewrite y = dx  as ρ = e jθ  using polar coordinates
which yields the equation of a spiral

• Ex: ρ = e (ln 2)•θ/2π

1
b

-10 -5 5 10 15

-10

-5

5

10

3

6

12

14

7
15

13

5

10

11

9

• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π

• Mechanics of a bucket split
1. let first active bucket be at a = e j•b  (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active

bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b

4. buckets a + 1 through g are now the active buckets
• Ex: d = 2

8
16

Copyright © 1998 by Hanan Samet

hp402
r

1. active buckets 6 through 11

Copyright © 1998 by Hanan Samet

hp40
WHY SPIRAL?

• Can rewrite y = dx  as ρ = e jθ  using polar coordinates
which yields the equation of a spiral

• Ex: ρ = e (ln 2)•θ/2π

1
b

-10 -5 5 10 15

-10

-5

5

10

3

6

12

14

7
15

13

5

10

11

9

• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π

• Mechanics of a bucket split
1. let first active bucket be at a = e j•b  (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active

bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b

4. buckets a + 1 through g are now the active buckets
• Ex: d = 2

8
16

Copyright © 1998 by Hanan Samet

hp402
r

1. active buckets 6 through 11

Copyright © 1998 by Hanan Samet

hp403
z

2. split bucket 6 to yield buckets 12 and 13

Copyright © 1998 by Hanan Samet

hp40
WHY SPIRAL?

• Can rewrite y = dx  as ρ = e jθ  using polar coordinates
which yields the equation of a spiral

• Ex: ρ = e (ln 2)•θ/2π

1
b

-10 -5 5 10 15

-10

-5

5

10

3

6

12

14

7
15

13

5

10

11

9

• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π

• Mechanics of a bucket split
1. let first active bucket be at a = e j•b  (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active

bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b

4. buckets a + 1 through g are now the active buckets
• Ex: d = 2

8
16

Copyright © 1998 by Hanan Samet

hp402
r

1. active buckets 6 through 11

Copyright © 1998 by Hanan Samet

hp403
z

2. split bucket 6 to yield buckets 12 and 13

Copyright © 1998 by Hanan Samet

hp404
g

3. active buckets 7 through 13

Copyright © 1998 by Hanan Samet

hp40
WHY SPIRAL?

• Can rewrite y = dx  as ρ = e jθ  using polar coordinates
which yields the equation of a spiral

• Ex: ρ = e (ln 2)•θ/2π

1
b

-10 -5 5 10 15

-10

-5

5

10

3

6

12

14

7
15

13

5

10

11

9

• Polar coordinates mean that the active buckets are always
within one complete arc of the spiral − i.e., θ = 2π

• Mechanics of a bucket split
1. let first active bucket be at a = e j•b  (i.e., θ = b )
2. last active bucket is at c = ej•(b+2π) − 1
3. bucket split means that the contents of the active

bucket at ρ = a (i.e., θ = b) are distributed into buckets
c + 1 through g where g = ej•(b+2π+φ) and φ is the
solution of a + 1 = ej•(b+φ) − i.e., φ = (ln (a + 1))/j − b

4. buckets a + 1 through g are now the active buckets
• Ex: d = 2

8
16

Copyright © 1998 by Hanan Samet

hp402
r

1. active buckets 6 through 11

Copyright © 1998 by Hanan Samet

hp403
z

2. split bucket 6 to yield buckets 12 and 13

Copyright © 1998 by Hanan Samet

hp404
g

3. active buckets 7 through 13

Copyright © 1998 by Hanan Samet

hp405
v

• Observations
1. insead of h(k) uniformly mapping key k into [0,1), use

hθ(k) to uniformly map k into [0,2π)
2. length of arc has constant value between successive

integer values of ρ
Copyright © 1998 by Hanan Samet