Query Evaluation
• Processing of SQL queries
– Parser translates into internal query representation
• The parser parses the string and checks for syntax errors
• Checks existence of relations and attributes
• Replaces views by their definitions
– The query optimizer translates the query into an efficient execution plan
• Many different execution plans exist;
• Choosing an efficient execution plan is crucial
– The plan executor executes the execution plan and delivers data
Query Evaluation
Query decomposition
– Queries are composed of a few basic operators
• Selection
• Projection
• Order by
• Group by
• Intersection
• …
– Several alternative algorithms for implementing each relational operator
– Not a single “best” algorithm; efficiency of each implementation depends on factors like size of the relation, existing indexes, size of buffer pool etc.
Basic terminology
• Access Path
– The method used to retrieve a set of tuples from a relation
• Basic: file scan
• Index plus matching selection condition (the index can be used to retrieve only the tuples that satisfy the condition)
• Partitioning and others
Cost model
• In order to compare different alternatives we must estimate how expensive a specific execution is
• Input for cost model:
– the query,
– database statistics (e.g., distribution of values, etc.),
– resource availability and costs: buffer size, I/O delay, disk bandwidth, CPU costs, network bandwidth, …
• Our cost model
– Number of I/O = pages retrieved from disk
• Assumption that the root and all intermediate nodes of the B+ tree are in main memory: only leaf pages may not be in main memory!!!
• A page P might be retrieved several times if many pages are accessed between two accesses of P
Basic Query Processing Operations
• Selection: 𝜎
– Scan (without an index)
– Use an index involving the attributes of the selection
• Projection: 𝜋
• Join: ⋈
– Nested loop join
– Sort-merge join
– Many others…
• Grouping and duplicate elimination
Concatenation of Operators
• Execution tree:
– Leaf nodes are the base relations
– Inner nodes are operators
– Tuples from base relations (leaves) flow into the parent operator node(s)
– Output of an operator node flows into the parent node
– Output of the root flows as result to the client
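The pull-based execution described above (tuples flow from the leaves to the root) can be sketched with Python generators. This is only an illustration of the iterator idea; the relation contents and attribute names below are made up, not the lecture's example data.

```python
# Minimal sketch of an execution tree in the iterator (pull-based) style:
# each operator is a generator that pulls tuples from its child.

def scan(relation):
    """Leaf node: tuples of a base relation flow upward."""
    for tup in relation:
        yield tup

def select(child, predicate):
    """Inner node: pass on only tuples satisfying the predicate."""
    for tup in child:
        if predicate(tup):
            yield tup

def project(child, attrs):
    """Inner node: keep only the requested attributes."""
    for tup in child:
        yield {a: tup[a] for a in attrs}

# Hypothetical base relation (not the lecture's example data).
users = [{"uid": 1, "uname": "Ann", "age": 9},
         {"uid": 2, "uname": "Bob", "age": 12}]

# Root of the tree: project(select(scan(users))); its output flows to the client.
plan = project(select(scan(users), lambda t: t["age"] < 10), ["uid", "uname"])
print(list(plan))   # [{'uid': 1, 'uname': 'Ann'}]
```

Each operator only produces tuples when its parent asks for them, which is what allows an operator to pipeline its output without materializing intermediate results.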
Execution Plan
SELECT P.cid, count(*), AVG(S.rating)
FROM Skaters S, Participates P
WHERE P.sid = S.sid AND S.age < 10
GROUP BY P.cid
HAVING AVG(S.rating) > 5
(Execution tree, bottom-up: Skaters → Selection 𝜎(age < 10) (= R1); Index Nested Loop Join R1.sid = P.sid with Participates; Project 𝜋(cid, rating); Grouping on cid with count and average; Selection avg(rating) > 5.)
Schema for Examples
Users (uid: int, uname: string, experience: int, age: int)
GroupMembers (uid: int, gid: int, stars: int)
• Users (U):
– 40,000 tuples (denoted as CARD(U))
– Around 80 tuples per page
– 500 pages (denoted as UserPages)
– An index on uid has around 170 leaf pages
– An index on uname has around 300 leaf pages
• GroupMembers (GM):
– 100,000 tuples (denoted as CARD(GM))
– Around 100 tuples per page
– Total of 1000 pages (denoted as GroupMemberPages)
• Database statistics tables
– Keep track of cardinality of tables, number of pages, which indexes exist, how big an index is, etc.
– Keep track of the domain of values and their rough distribution
• E.g., rating values: 1.....10, uniform distribution
Selectivity / Reduction Factor
• Reduction Factor of a condition is defined as
– Red(𝜎condition(R)) = | 𝜎condition(R) | / |R|
– Red(𝜎experience=5(Users)) = |𝜎experience=5(Users)| / |Users|
= e.g., 10,000 / 40,000 = 0.25 (assuming 10,000 users have an experience of 5)
• If not known, DBMS makes simple assumptions
– Red(𝜎experience=5(R)) = 1 / |different experience levels| = 1/10 = 0.1
• Uniform distribution assumed
– Red(𝜎age<=16(R)) = (16 - min(age) + 1) / (max(age) - min(age) + 1) = (16 - 12 + 1) / (61 - 12 + 1) = 5/50 = 0.1
– Red(𝜎experience=5 AND age<=16(R)) = ? (with the independence assumption: 0.1 * 0.1 = 0.01)
• Result sizes: number of input tuples * reduction factor
• How to know number of different values, how many of a certain value, max, min...
– through indices, heuristics, separate statistics (histograms)
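A minimal sketch of these estimation rules (uniform distribution for equality, domain fraction for a range, independence for AND), using the numbers from the Users example above:

```python
# Rough selectivity (reduction factor) estimates under the usual assumptions:
# equality -> 1 / (number of distinct values), range -> fraction of the domain,
# conjunction -> product of the individual factors (independence assumption).

def red_equality(num_distinct_values):
    return 1.0 / num_distinct_values

def red_range_leq(value, min_val, max_val):
    return (value - min_val + 1) / (max_val - min_val + 1)

def red_and(*factors):
    prod = 1.0
    for f in factors:
        prod *= f
    return prod

card_users = 40_000
r_exp = red_equality(10)              # experience = 5, 10 distinct levels -> 0.1
r_age = red_range_leq(16, 12, 61)     # age <= 16, domain 12..61 -> 0.1
r_both = red_and(r_exp, r_age)        # independence -> 0.01

print(r_exp, r_age, r_both)
print("estimated result size:", card_users * r_both)   # ~400 tuples
```

With Red = 0.01 the estimated result size is 40,000 * 0.01 = 400 tuples, which is exactly the "number of input tuples * reduction factor" rule above.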
Simple Selections
SELECT * FROM Users WHERE uname LIKE ‘B%’
SELECT * FROM Users WHERE uid = 123
• General form: 𝜎 R.attr op value (R)
• No index:
– Search on arbitrary attribute(s): scan the relation (cost is UserPages = 500)
– Search on primary key attribute: scan on average half of U (cost is UserPages/2 = 250)
• Index on selection attributes:
– Use index to find qualifying data entries, then retrieve corresponding data records
– I/O: number of leaf pages + certain number of data pages that have matching tuples
– Now how much is that??
• Depends on number of qualifying tuples
• Depends on type of index
In case of Clustered B+ Tree
• Path from root to the left-most leaf lq with a qualifying data entry:
– Inner pages are in memory
– One I/O to retrieve the lq page
• Retrieve page of first qualifying tuple: 1 I/O
• Retrieve all following pages as long as search criteria is fulfilled
• Each data page only retrieved once
• # data pages = # matching tuples / tuples per page
– E.g., if 20% of tuples qualify then 20% of data pages are retrieved
(Figure: B+ tree with inner index pages directing the search, data entries in the leaves (the “sequence set”), and data records on the data pages.)
Clustered Index for our Example
Clustered: matching tuples are in adjacent data pages:
– # data pages = number of matching tuples / number of tuples per page
– Example 1: SELECT * FROM Users WHERE uid = 123
• 1 tuple matches because primary key (1 data page)
• I/O cost: 1 index leaf page + 1 data page
– Example 2: SELECT * FROM Users WHERE uname LIKE ‘B%’
• System estimates the number of matching tuples, e.g., around 100 tuples match
• Since the index is clustered they are all on a few pages (2 data pages)
• I/O cost: 1 index leaf page + 2 data pages
– Example 3: SELECT * FROM Users WHERE uname < ‘F’
• Estimate is that around 10,000 tuples match
• i.e., 25% of the data, clustered on 25% of the pages (125 data pages)
• Cost: 1 leaf page + 125 data pages
– Note: some systems might retrieve all rids through the index.
– In this case, example 3 will read 25% of the leaf pages
In case of Unclustered B+ Tree
• Path to first qualifying leaf:
– One I/O to retrieve lq page
• Retrieve page of first qualifying
tuple: 1 I/O
• Retrieve page of second qualifying tuple
• Retrieve next leaf page with
qualifying tuple
• Retrieve page of next qualifying tuple ....
• Sometimes page might have been retrieved previously
– Mightstillbeinmainmemory – Mighthavebeenreplacedagain
(Figure: B+ tree with inner index pages directing the search, data entries on the leaves (the “sequence set”), and data records on data pages that are not in index order.)
Might result in one I/O per data record!
Using an Unclustered Index for Selections
• Non clustered, simple strategy:
– get one page after the other:
– Worst case #data pages = #matching tuples
– Example 1:SELECT * FROM Users WHERE uid = 123
• Same as before
– Example 2: SELECT * FROM Users WHERE uname LIKE ‘B%’
• Estimated that around 100 tuples match
• In the worst case they are on 100 different data pages
• But even if two are on the same page, when the second record is retrieved the page might no longer be in main memory
• Cost: 1 index leaf page + 100 data pages (some pages are retrieved twice)
Using an Unclustered Index for Selections
– Example 3:SELECT * FROM Users WHERE uname < ‘F’
• Estimated that 10000 tuples match = 25%
• Likely that nearly every data page has a matching tuple! (maybe 10 don’t???)
• But as we retrieve tuple by tuple, every retrieval might lead to I/O as page might have been purged from main memory in between
• Cost: 75 leaf pages + 10,000 data pages (most pages will be retrieved several times)
• 75 leaves = 25% of all leaf pages
• A simple scan is faster!!
– Lesson learned:
• Indices are usually only useful with very small reduction factors
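A rough cost comparison illustrating this lesson, using the simplified I/O formulas from the previous slides (full scan, clustered index, naive unclustered index) and the Users numbers; the 300 leaf pages are the uname index from the running example:

```python
import math

USER_PAGES = 500      # data pages of Users
TUPLES     = 40_000   # CARD(Users)
PER_PAGE   = 80       # tuples per page
LEAF_PAGES = 300      # leaf pages of the uname index

def cost_scan():
    return USER_PAGES

def cost_clustered(red):
    matching = red * TUPLES
    # 1 leaf page to find the start, then the adjacent data pages.
    return 1 + math.ceil(matching / PER_PAGE)

def cost_unclustered_naive(red):
    matching = red * TUPLES
    leaf = max(1, math.ceil(red * LEAF_PAGES))   # matching leaf pages
    return leaf + math.ceil(matching)            # worst case: one I/O per matching tuple

for red in (0.001, 0.01, 0.25):
    print(red, cost_scan(), cost_clustered(red), cost_unclustered_naive(red))
# At red = 0.25 the naive unclustered strategy (75 + 10,000 I/Os) is far worse
# than a scan (500 I/Os), while the clustered index (1 + 125 I/Os) still wins.
```

Only for very small reduction factors does the unclustered index beat the 500-page scan.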
Using an Unclustered Index with Sorting for Selections
• Example 3: SELECT * FROM Users WHERE uname < ‘F’
– Determine all leaf pages that have matching entries (75 leaf pages)
– sort matching data entries (rid=pid,slot-id) in leaf-pages by page-id
– Only fast if the 75 leaf pages with matching entries fit in main memory
– Retrieve each page only once and get all matching tuples
– # data pages = # data pages that have at least one matching tuple
• Worst case is the total # of data pages
– For Example 3:
• Around 10,000 tuples match
• If they are distributed over 490 pages
– Cost: 75 leaf pages + 490 pages (worse than a scan)
• If they are distributed over 300 pages
– Cost: 75 leaf pages + 300 pages (better than a scan)
– Note: sorting is expensive if the leaf pages do not fit in main memory
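A minimal sketch of this rid-sorting strategy, assuming rids are (page id, slot) pairs collected from the index leaves; the page layout and the read_page helper are illustrative assumptions, not the actual storage format:

```python
from itertools import groupby

def fetch_with_sorted_rids(matching_rids, read_page):
    """matching_rids: iterable of (page_id, slot) pairs from the index leaves.
    read_page(page_id) performs one I/O and returns the page's tuples."""
    result, ios = [], 0
    # Sort by page id so every qualifying data page is read exactly once.
    for page_id, group in groupby(sorted(matching_rids), key=lambda rid: rid[0]):
        page = read_page(page_id)          # one I/O per distinct page
        ios += 1
        result.extend(page[slot] for _, slot in group)
    return result, ios

# Toy usage: 6 matching rids spread over 3 pages -> only 3 page reads.
pages = {7: ["a", "b"], 3: ["c", "d"], 9: ["e", "f"]}
rids = [(7, 0), (3, 1), (9, 0), (7, 1), (3, 0), (9, 1)]
print(fetch_with_sorted_rids(rids, lambda pid: pages[pid]))
```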
Selections on 2 or more attributes
• A=100 AND B=50 (pages = 500)
– No index: full scan (cost 500 pages)
– Index on A attribute:
• Get all tuples for A=100 through index, check value of B
• Cost same as the query for A=100
– 2indexes;oneforeachattribute
• Find rids where A = 100 through A index
• Find rids where B = 50 through B index
• Build intersection of rids
• Retrieve from data pages all tuples with rids in that intersection
• In some cases A’s index alone is enough (e.g., when A is unique or has a small reduction factor)
– 1 composite index on both attributes, e.g., for A=100 AND B<50 (Red(B<50) = 0.5)
• Red(B<50) = 0.5 is not a small reduction factor, so the B part of the index is not much use
• (A sketch of the rid-intersection strategy follows below.)
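A sketch of the two-index (rid intersection) strategy for a conjunctive selection; the rid lists and the read_tuple helper are hypothetical stand-ins for whatever the index lookups actually return:

```python
def conjunctive_selection(rids_from_index_a, rids_from_index_b, read_tuple):
    """Two-index strategy for A=100 AND B=50:
    look up rids in each index, intersect, then fetch only those tuples."""
    candidate_rids = set(rids_from_index_a) & set(rids_from_index_b)
    # Only tuples whose rid appears in both rid sets can satisfy the conjunction.
    return [read_tuple(rid) for rid in sorted(candidate_rids)]

# Toy usage with made-up rids: only rid 12 satisfies both conditions.
a_rids = [3, 12, 40]       # rids where A = 100 (from A's index)
b_rids = [12, 25]          # rids where B = 50 (from B's index)
table = {3: "t3", 12: "t12", 25: "t25", 40: "t40"}
print(conjunctive_selection(a_rids, b_rids, table.get))   # ['t12']
```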
Selections on 2 or more attributes
• A=100 OR B=50 (pages = 500)
– No index: full scan (cost 500 pages)
– Index on A attribute:
• Not useful on its own (the tuples satisfying B=50 would still require a full scan)
– 2 indexes; one for each attribute
• Find rids where A = 100 through A index
• Find rids where B = 50 through B index
• Build union of rids
• Retrieve from data pages all tuples with rids in that union
– 1 index with both attributes (A,B)
• Have to read all the leaf pages anyways.
External Sorting Example Setup
Represented in the following as 5 pages with two tuples each, e.g.:
Page 1: (123, Dora, 2, 13), (132, Bug, 8, 60)
Page 2: (267, Sakura, 7, 15), (111, Cyphon, 8, 35)
…
Summary of example:
• 5 pages
• Each with two tuples
External Sorting
FROM Users ORDER BY experience
Summary of example: • 5 pages
• Each with two tuples
What if only 3 buffer frames available? That is, data pages do not
fit into main memory
External Sorting - Pass 0
• In example: 5 pages, 3 buffer frames
• In general: N pages, B buffer frames
• Bring B pages into the buffer
• Sort with any main-memory sort
• Write out to disk to a temporary file; this is called a run of B pages
External Sorting - Pass 0
Run 1 of pass 0
Run 2 of pass 0
• In example: ⌈5/3⌉ = 2 runs (round up!)
• In total ⌈N/B⌉ runs
External Sorting Pass 1:
• ⌈N/B⌉ input frames, one output frame
• Merge-sort the runs of Pass 0
• Hopefully B-1 ≥ ⌈N/B⌉
• Otherwise Pass 2…
• Sometimes a Pass 2 is needed:
– Pass 0 created more runs than there are main memory buffers
– Therefore Pass 1 produces more than one run
• Take the first B-1 runs from pass 0 and merge them to one bigger Pass 1 run
• Then take the next B-1 runs from pass 0 and merge then to one bigger Pass 1 run
– Pass 2 takes the runs of Pass 1 and merges them
• If there are fewer than B Pass 1 runs, then Pass 2 is the final pass
• Otherwise Pass 3…
• Number of passes: 1 + ⌈ log_(B-1) ⌈N/B⌉ ⌉
• Cost = 2N * (# of passes), since every pass reads and writes each page once
(The slides tabulate the number of passes and the resulting sort cost for various values of N and B; a small helper to reproduce these numbers follows below.)
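A small helper that reproduces the pass and cost counting above (Pass 0 produces ⌈N/B⌉ runs, every later pass merges B-1 runs, and each pass reads and writes every page):

```python
import math

def num_passes(n_pages, b_frames):
    """Pass 0 plus however many (B-1)-way merge passes are needed."""
    runs = math.ceil(n_pages / b_frames)              # runs after Pass 0
    passes = 1
    while runs > 1:
        runs = math.ceil(runs / (b_frames - 1))       # each pass merges B-1 runs
        passes += 1
    return passes

def sort_io_cost(n_pages, b_frames):
    # Every pass reads and writes each page once -> 2 * N * (# of passes).
    return 2 * n_pages * num_passes(n_pages, b_frames)

print(num_passes(5, 3), sort_io_cost(5, 3))           # 2 passes, 20 I/Os
print(num_passes(1000, 100), sort_io_cost(1000, 100)) # 2 passes, 4000 I/Os
```

For the 5-page, 3-frame example this gives 2 passes and 20 I/Os; with 1000 pages and 100 buffer frames (the setting used later for sort-merge join) it also gives 2 passes.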
Sort together with other operators
• I/O Costs:
– SELECT uname, experience FROM Users ORDER BY experience
– If everything fits into main memory (only Pass 0 needed):
• Read the data pages once (# pages)
• Sort and pipeline the result into the next operator: 𝜋(uname, experience)
– Pass 0 + Pass 1 needed:
• Pass 0: read # pages, write # pages (have to write temporary results!)
• Pass 1: read # pages, merge and pipeline the result into the next operator
• Total: 3 * # pages
– Pass 0 + Pass 1 + Pass 2 needed:
• Total: 5 * # pages
(Plan: scan Users, external sort on experience, then an on-the-fly projection 𝜋(uid, experience).)
Blocked I/O:
Sort in real life
– use more output pages and flush to consecutive blocks on disk
– Might lead to more I/O but each I/O is cheaper
Other optimizations:
– When writing to disk in Pass 0 (if needed):
• Do the projection onto the necessary attributes → each tuple is smaller → fewer pages
• That is, projection is pushed to the lowest level possible
Equality Joins
FROM Users U, GroupMembers GM
WHERE U.uid = GM.uid
Join Cardinality Estimation
• | Users ⋈ GroupMembers | = ?
– Join attribute is primary key for Users
– Each GroupMember tuple matches exactly with one Users tuple
– Result: |GroupMembers|
• | Users X GroupMembers | = ?
– Result: |Users| * |GroupMembers|
– Cross product is always the product of individual relation sizes
• For other joins it is more difficult to estimate
Cardinality Estimation
• | Users ⋈ 𝜎(stars > 3)(GroupMembers) | = ?
– Result: | 𝜎(stars > 3)(GroupMembers) |
– Assuming 1-5 stars and 100,000 members: 40,000
• | 𝜎(experience > 5)(Users) ⋈ GroupMembers | = ?
– Assume 1-10 experience levels, 40,000 users and uniform distribution for experience
– Red(𝜎(experience > 5)(Users)) = 1/2
– Result: 1/2 * |GroupMembers|
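These estimates follow two rules of thumb: a key/foreign-key join keeps each referencing tuple at most once, and a prior selection scales the result by its reduction factor. The arithmetic, written out:

```python
# Cardinality estimates for Users (uid = primary key) joined with GroupMembers.
CARD_USERS, CARD_GM = 40_000, 100_000

# uid is Users' key, so each GroupMembers tuple joins with exactly one user:
est_join = CARD_GM                           # |Users join GM| = 100,000

# Selection on GroupMembers first (stars > 3, 5 uniform star values -> red = 2/5):
est_stars = (2 / 5) * CARD_GM                # 40,000 surviving tuples, all still join

# Selection on Users first (experience > 5, 10 uniform levels -> red = 1/2):
# half of the users remain, so on average half of the GM tuples find a partner.
est_exp = (1 / 2) * CARD_GM                  # 50,000

# Cross product, for comparison:
est_cross = CARD_USERS * CARD_GM             # 4,000,000,000

print(est_join, est_stars, est_exp, est_cross)
```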
Simple Nested Loop Join
For each tuple in the outer relation Users U we scan the entire inner relation GroupMembers GM.
foreach tuple u in U do
foreach tuple g in GM do
if u.uid == g.uid then add to result
• Cost: UserPages + |Users| * GroupMemberPages = 500 + 40,000*1000 !
• NOT GOOD
• We need page-oriented algorithm!
Page Nested Loop Join
• For each page pu of Users U, get each page pg of GroupMembers GM
– write out matching pairs ⟨u, g⟩, where u is in pu and g is in pg.
For each page pu of Users U
for each page pg of GroupMembers GM
for each tuple u in pu do for each tuple g in pg do
if u.uid == g.uid then add to result
Cost: UserPages + UserPages*GroupPages = 500 + 500*1000 = 500,500
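A minimal runnable version of the page-oriented nested loop join, treating each relation as a list of pages (lists of tuples); the tuple layout and the join-attribute positions are simplifying assumptions for illustration:

```python
def page_nested_loop_join(outer_pages, inner_pages, outer_key, inner_key):
    """For each outer page, read every inner page once and join the tuples in memory."""
    result = []
    for pu in outer_pages:            # one scan of the outer relation
        for pg in inner_pages:        # full scan of the inner relation per outer PAGE
            for u in pu:
                for g in pg:
                    if u[outer_key] == g[inner_key]:
                        result.append((u, g))
    return result

# Toy pages: Users as (uid, uname), GroupMembers as (uid, gid).
users_pages = [[(1, "Ann"), (2, "Bob")], [(3, "Cy")]]
gm_pages    = [[(1, 10), (3, 20)], [(2, 10)]]
print(page_nested_loop_join(users_pages, gm_pages, 0, 0))
# Page reads: len(users_pages) + len(users_pages) * len(gm_pages),
# i.e., exactly the UserPages + UserPages * GroupPages formula above.
```

Counting page reads instead of tuple reads is what turns the 40,000 * 1000 term of the simple nested loop join into 500 * 1000.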
Block Nested Loop Join
• For each block of pages bpu of Users U, get each page pg of GroupMembers GM
– write out matching pairs ⟨u, g⟩, where u is in bpu and g is in pg.
• block of pages bpu and one page of GM must fit in main memory
– For each block of pages bpu
• Load block into main memory
• Get first page from GM
– Do all the matching between users in bpu and group members in first page
• Get second page from GM (into the same frame the first one was in before)
– Do all the matching between users in bpu and group members in the second page
• Get the last page from GM (again into the frame reserved for GM)
• Cost: UserPages + UserPages / |bpu| * GroupMemberPages
Block Nested Loop
Block Nested Loop Join
• Examples depending on available main memory:
• 51 Buffer Frames:
– 500 + 500/50 * 1000 = 500 + 10,000
• 501 Buffer Frames
– 500 + 500/500 * 1000 = 500 + 1000
– Special case: outer relation fits into main memory!!
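The buffer-frame examples above follow directly from the block nested loop cost formula; this sketch just redoes the arithmetic, assuming one frame is reserved for the current GroupMembers page and the remaining frames hold the outer block:

```python
import math

USER_PAGES, GM_PAGES = 500, 1000

def block_nlj_cost(buffer_frames):
    """OuterPages + (number of outer blocks) * InnerPages,
    with the block size taken as the frames left over for the outer relation."""
    block = buffer_frames - 1                    # one frame kept for the inner page
    return USER_PAGES + math.ceil(USER_PAGES / block) * GM_PAGES

print(block_nlj_cost(51))    # 500 + 10 * 1000 = 10,500
print(block_nlj_cost(501))   # 500 + 1 * 1000  =  1,500  (outer fits in memory)
```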
Index Nested Loops Join
• For each tuple in the outer relation Users U we find the matching tuples in GroupMembers GM through an index
– Condition: GM must have an index on the join attribute
foreach tuple u in U do
find all matching tuples g in GM through index
then add all to result
• Index MUST be on the inner relation (in this case GM).
• Cost: OuterPages + CARD(OuterRelation) * cost of finding
matching tuples in inner relation
• In example of previous page:
– Index on uid on GM is clustered:
• 500 + 40,000 * (1 leaf page + 1 data page)
– Index on uid on GM is not clustered:
• 500 + 40,000 * (1 leaf page + 2.5 data pages) (on average 2.5 matching GM tuples per user)
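The index nested loop cost formula written out for both index types, and for the reversed ordering discussed on the next slide; the per-probe costs of 1 leaf page plus 1 or 2.5 data pages are the assumptions stated above:

```python
# Cost: OuterPages + CARD(Outer) * (cost of probing the inner relation's index).
USER_PAGES, CARD_USERS = 500, 40_000
GM_PAGES,   CARD_GM    = 1000, 100_000

def index_nlj_cost(outer_pages, outer_card, probe_cost):
    return outer_pages + outer_card * probe_cost

# Users outer, index on GM.uid (on average 2.5 matching GM tuples per user):
print(index_nlj_cost(USER_PAGES, CARD_USERS, 1 + 1))    # clustered:   80,500
print(index_nlj_cost(USER_PAGES, CARD_USERS, 1 + 2.5))  # unclustered: 140,500

# GM outer, index on Users.uid (uid is the key, so exactly one match per probe):
print(index_nlj_cost(GM_PAGES, CARD_GM, 1 + 1))         # 201,000
```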
Index Nested Loops Join
• Switch inner and outer if the index is on uid of Users
• Note: uid is the primary key in Users
– Only one tuple matches!
foreach tuple g in GM do
find the one matching tuple u in U through index
then add ⟨g, u⟩ to result
Cost: 1000 + 100,000 * (1 leaf page + 1 data page)
Block Nested Loop vs. Index
• For Block Nested Loop (if the outer relation fits in main memory):
– OuterPages + InnerPages
• Index Nested Loop:
– OuterPages + Card(Outer) * (cost of retrieving the matching Inner tuples)
• Index Nested Loop wins if:
– InnerPages > Card(Outer) * (cost of retrieving the matching Inner tuples)
– E.g., if Outer is the result of a selection that only selected very few tuples
• 𝜎(age > 60)(Users) ⋈ GroupMembers
(Plan: scan Users, Selection 𝜎(age > 60), result R1 pipelined into an Index Nested Loop Join R1.uid = GM.uid with GroupMembers.)
Sort-Merge Join
• Sort U and GM on the join column, then scan them to do a “merge” (on join col.), and output result tuples.
• Assume the scan cursors currently point to U tuple u and GM tuple g. Advance the scan cursor of U until u.uid >= g.uid and then advance the scan cursor of GM so that g.uid >= u.uid. Do this until u.uid = g.uid.
• At this point, all U tuples with the same value in uid (current U group) and all GM tuples with the same value in uid (current GM group) match; output ⟨u, g⟩ for all pairs of such tuples.
– Then resume scanning U and GM.
• U is scanned once; each GM group is scanned once per matching U tuple. (Multiple scans of a GM group are likely to find the needed pages in the buffer.)
Example of Sort-Merge Join
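The worked example on this slide is a figure in the original deck; as a substitute, here is a minimal sketch of the merge phase on two inputs already sorted on uid, following the cursor-advancing description above (it assumes uid is the first field of each tuple):

```python
def merge_join(users_sorted, gm_sorted):
    """Merge phase of sort-merge join; both inputs sorted on uid (first field)."""
    result, i, j = [], 0, 0
    while i < len(users_sorted) and j < len(gm_sorted):
        u_uid, g_uid = users_sorted[i][0], gm_sorted[j][0]
        if u_uid < g_uid:
            i += 1                                    # advance U cursor
        elif u_uid > g_uid:
            j += 1                                    # advance GM cursor
        else:
            # Current U group and current GM group match: output all pairs.
            j_start = j
            while j < len(gm_sorted) and gm_sorted[j][0] == u_uid:
                j += 1
            while i < len(users_sorted) and users_sorted[i][0] == u_uid:
                result.extend((users_sorted[i], g) for g in gm_sorted[j_start:j])
                i += 1
    return result

users = [(1, "Ann"), (2, "Bob"), (4, "Cy")]           # sorted on uid
gms   = [(1, 10), (2, 10), (2, 20), (5, 30)]          # sorted on uid
print(merge_join(users, gms))
# [((1, 'Ann'), (1, 10)), ((2, 'Bob'), (2, 10)), ((2, 'Bob'), (2, 20))]
```

The sorting itself would be done beforehand with the external sort from the previous slides.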
Cost of Sort-Merge Join
• Relations are already sorted:
– UserPages + GroupMemberPages = 500 + 1000
• Relations need to be sorted – simple way:
– Sort the relations and write the sorted relations to temporary stable storage
– Read in the sorted relations and merge
– Costs: assuming 100 buffer pages
• Both Users and GroupMembers can be sorted in 2 passes (Pass 0 and 1): 4 * UserPages + 4 * GroupPages
• Final merge: 500 + 1000 (= UserPages + GroupPages)
• Total: 5 * UserPages + 5 * GroupPages
Cost of Sort-Merge Join
• Relations need to be sorted – use pipelining to combine last sort pass and join
– Sorting performs Pass 0, reading and writing each of the relations: 2 * UserPages + 2 * GroupPages
– Pass 1 reads the data and performs the merge of the runs and the join merge in pipeline fashion (ignoring the cost of writing the final result): 1 * UserPages + 1 * GroupPages
– Total: 3 * UserPages + 3 * GroupPages