程序代写代做代考 algorithm database Query Optimization 1

Query Optimization 1

 Parsing Queries
 Relational algebra review
 Relational algebra equivalencies  Estimating relation size
 Cost based plan selection
 Join order
John Edgar 2

 Generating equivalent logical plans and their physical plans is known as query optimization
 Selecting a good query plan entails decisions
▪ Which equivalent plan?
▪ Which algorithm for plan operations?
▪ How is data passed from one operation to the next?  These choices depend on database metadata
▪ Size of relations
▪ Number and frequency of attributes ▪ Indexing and data file organization
John Edgar 3

query
Parser Preprocessor
select … from … where …
semantic checking
Rewriter
Logical plan
 ( (…))1
 ( (…))
 ( (…))2
 ( (…))3
… …
x RS…
Logical plan generator
John Edgar
4
 ( (…))2

 Parsing
▪ Construct a parse tree for a query
▪ Translate SQL to a relational algebra tree
 Generate equivalent logical query plans
▪ Convert the parse tree to a query plan in relational algebra ▪ Transform the plan into more efficient equivalents
 Generate a physical plan
▪ Select algorithms for each of the operators in the query
▪ Including details about how tables are to be accessed or sorted
John Edgar 5

1.1

 Selection ()
▪ salary > 50000(Employee) – removes rows
 Projection ()
▪ sin, salary(Employee) – removes columns
 Set Operations
▪ Union () – all rows from both tables
▪ Intersection () – rows in common between tables
▪ Set Difference (−) – rows in LH table not in RH table
▪ Cartesian Product () – combines all rows in both tables ▪ Division () – not usually implemented in SQL
 Joins (⋈) Often used to combine two tables that relate to each other ▪ Cartesian product followed by join selection
John Edgar 7

Doctor
sin
fName
lName
speciality
office
555
Tom
Baker
Cardiology
168
123
William
Hartnell
GP
743
499
Jon
Pertwee
Oncology
291
674
David
Tennant
Neurology
445
Doctor  Patient Error! – not union compatible
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
sin fName
lName
555
Tom
Baker
499
Jon
Pertwee
John Edgar
8
sin,fName,lName(Doctor)  sin,fName,lName(Patient)

Doctor
sin
fName
lName
speciality
office
555
Tom
Baker
Cardiology
168
123
William
Hartnell
GP
743
499
Jon
Pertwee
Oncology
291
674
David
Tennant
Neurology
445
sin,fName,lName(Doctor)  sin,fName,lName(Patient)
sin fName lName
555
Tom
Baker
123
William
Hartnell
499
Jon
Pertwee
674
David
Tennant
321
Lalla
Ward
674
Billie
Piper
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
John Edgar 9

Doctor
sin
fName
lName
speciality
office
555
Tom
Baker
Cardiology
168
123
William
Hartnell
GP
743
499
Jon
Pertwee
Oncology
291
674
David
Tennant
Neurology
445
sin,fName,lName(Doctor) − sin,fName,lName(Patient)
sin fName lName
123
William
Hartnell
674
David
Tennant
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
John Edgar 10

Doctor
sin
fName
lName
speciality
office
555
Tom
Baker
Cardiology
168
123
William
Hartnell
GP
743
499
Jon
Pertwee
Oncology
291
674
David
Tennant
Neurology
445
(1)
(2)
(3)
speciality
office msp
(6)
(7) (8)
age
555
Tom
Baker
Cardiology
168
34456
555
Tom
Baker
20/01/1934
555
Tom
Baker
Cardiology
168
77321
321
Lalla
Ward
28/06/1951
555
Tom
Baker
Cardiology
168
11387
499
Jon
Pertwee
07/07/1919
555
Tom
Baker
Cardiology
168
12121
674
Billie
Piper
22/09/1982
123
William
Hartnell
GP
743
34456
555
Tom
Baker
20/01/1934
123
William
Hartnell
GP
743
77321
321
Lalla
Ward
28/06/1951
123
William
Hartnell
GP
743
11387
499
Jon
Pertwee
07/07/1919
123
William
Hartnell
GP
743
12121
674
Billie
Piper
22/09/1982
499
Jon
Pertwee
Oncology
291
34456
555
Tom
Baker
20/01/1934
499
Jon
Pertwee
Oncology
291
77321
321
Lalla
Ward
28/06/1951
499
Jon
Pertwee
Oncology
291
11387
499
Jon
Pertwee
07/07/1919
499
Jon
Pertwee
Oncology
291
12121
674
Billie
Piper
22/09/1982
674
David
Tennant
Neurology
445
34456
555
Tom
Baker
20/01/1934
674
David
Tennant
Neurology
445
77321
321
Lalla
Ward
28/06/1951
674
David
Tennant
Neurology
445
11387
499
Jon
Pertwee
07/07/1919
674
David
Tennant
Neurology
445
12121
674
Billie
Piper
22/09/1982
Doctor  Patient
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
John Edgar 11

fName,lName,description(Patient.msp = Operation.msp  dob.year  1920(Patient  Operation))
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
(1) sin fName lName dob
opID description
date (9)
34456
555
Tom
Baker
20/01/1934
12
appendectomy
01-01-05
34456
34456
555
Tom
Baker
20/01/1934
13
vasectomy
02-01-05
11387
34456
555
Tom
Baker
20/01/1934
14
appendectomy
03-01-05
34456
34456
555
Tom
Baker
20/01/1934
15
kidney transplant
05-01-05
34456
77321
321
Lalla
Ward
28/06/1951
12
appendectomy
01-01-05
34456
77321
321
Lalla
Ward
28/06/1951
13
vasectomy
02-01-05
11387
77321
321
Lalla
Ward
28/06/1951
14
appendectomy
03-01-05
34456
77321
321
Lalla
Ward
28/06/1951
15
kidney transplant
05-01-05
34456
11387
499
Jon
Pertwee
07/07/1919
12
appendectomy
01-01-05
34456
11387 499 Jon Pertwee 07/07/1919 13 vasectomy 02-01-05 11387
11387
499
Jon
Pertwee
07/07/1919
14
appendectomy
03-01-05
34456
11387
499
Jon
Pertwee
07/07/1919
15
kidney transplant
05-01-05
34456
12121
674
Billie
Piper
22/09/1982
12
appendectomy
01-01-05
34456
12121
674
Billie
Piper
22/09/1982
13
vasectomy
02-01-05
11387
12121
674
Billie
Piper
22/09/1982
14
appendectomy
03-01-05
34456
12121
674
Billie
Piper
22/09/1982
15
kidney transplant
05-01-05
34456
Operation
opID
description
date
msp
12
appendectomy
01-01-05
34456
13
vasectomy
02-01-05
11387
14
appendectomy
03-01-05
34456
15
kidney transplant
05-01-05
34456
fName
lName description
Jon
Pertwee
vasectomy
John Edgar 12

fName,lName,description(Patient.msp = Operation.msp(dob.year  1920(Patient)  Operation))
Patient
msp
sin
fName
lName
dob
34456
555
Tom
Baker
20/01/1934
77321
321
Lalla
Ward
28/06/1951
11387
499
Jon
Pertwee
07/07/1919
12121
674
Billie
Piper
22/09/1982
msp
sin fName lName dob
11387
499
Jon
Pertwee
07/07/1919
(1) sin
fName lName age opID description
date (9)
11387
499
Jon
Pertwee
07/07/1919
12
appendectomy
01-01-05
34456
11387
499
Jon
Pertwee
07/07/1919
13
vasectomy
02-01-05
11387
11387
499
Jon
Pertwee
07/07/1919
14
appendectomy
03-01-05
34456
11387
499
Jon
Pertwee
07/07/1919
15
kidney transplant
05-01-05
34456
Operation
opID
description
date
msp
12
appendectomy
01-01-05
34456
13
vasectomy
02-01-05
11387
14
appendectomy
03-01-05
34456
15
kidney transplant
05-01-05
34456
(1) sin fName
lName
age opID
description
date (9)
11387
499
Jon
Pertwee
07/07/1919
13
vasectomy
02-01-05
11387
fName
lName description
Jon
Pertwee
vasectomy
John Edgar 13

1.2

 The parser takes an SQL query and converts it to a parse tree
 A parse tree is a tree whose nodes are ▪ Atoms – keywords, attribute names,
1000000
= ▪ Syntactic categories – families of balance
relations, constants, operators
query subparts such as a query or a condition
 An atom is a node with no children
▪ If a node is a syntactic category it is described by one
of the rules of the grammar
John Edgar 15

 We will look at a simplified version of SQL ▪ … very simplified …
 The grammar only has rules for
▪ Queries, select, from and where clauses
▪ Rules for select, from and where are also simplified
 We will give examples of how the grammar can be used to convert queries to parse trees
John Edgar 16

 The syntactic category represents SQL queries
 Just one rule for queries
::= SELECT FROM WHERE
▪ The symbol ::= means “can be expressed as”
▪ The query rule omits GROUP BY, HAVING and
(many) other optional clauses
John Edgar 17

 Select Lists
▪ Comma separated list of attributes
▪ Single attributes, or
▪ An attribute, a comma and
a select list
▪ No expressions, aliases and aggregations
::= , ::=
 From Lists
▪ Comma separated list of
relations
▪ No joins, sub-queries or tuple variables
::= , ::= < Relation>
John Edgar
18

 This abbreviated set of rules does not include ▪ OR, NOT and EXISTS
▪ Comparisons not on equality or LIKE
▪ Parentheses
▪ …
::= AND
::= IN
::= =
::= LIKE
John Edgar 19

 There are three base syntactic categories
, and
▪ These categories are not defined by rules but by which atoms they can contain
 An can be any string of characters that identifies a legal attribute
 A can be any string of characters that identifies a legal relation
John Edgar 20

 Consider two relations
▪ Account = {accID, balance, ownerID}
▪ Transaction = {transID, amount, date, trans_accID}  And a query
SELECT trans_accID, amount FROM Transaction
WHERE trans_accID IN(
SELECT accID
FROM Account
WHERE balance = 1000000)
John Edgar 21

SELECT trans_accID, amount FROM Transaction WHERE trans_accID IN( SELECT accID
FROM Account WHERE balance = 1000000)

SELECT FROM
, trans_accID

Transaction
WHERE
IN trans_accID
amount
SELECT

accID
FROM
Account

= 1000000

WHERE
John Edgar
22
balance

 Consider the same two relations
▪ Account = {accID, balance, ownerSIN}
▪ Transaction = {transID, amount, date, trans_accID}  And a query that is equivalent to the query in
the previous example
SELECT trans_accID, amount
FROM Transaction, Account
WHERE balance = 1000000 AND trans_accID = accID
John Edgar 23

SELECT trans_accID, amount
FROM Transaction, Account
WHERE balance = 1000000 AND trans_accID = accID

SELECT FROM

WHERE

trans_accID
,
amount

Transaction
,
Account

AND
= balance
1000000
= trans_accID
accID
John Edgar
24

 The pre-processor has two main tasks
 Relations that are virtual views are replaced by a parse
tree that describes the view
 Names in the query are checked for validity
▪ Each relation name in the FROM clause
▪ Attributes
▪ In a relation in a FROM clause of the query
▪ All attributes must be in the correct scope
▪ Check types
▪ Attribute types must be appropriate for their uses
▪ Operands must be appropriate and compatible types
Semantic checking
John Edgar 25