程序代写代做代考 database SQL algorithm Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Query Processing and Optimization
Overview
Measures of Query Cost
Selection Operation
Sort Operation
Join Operation
nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join
Other Operations
Evaluation of Expressions
Transformation of Relational Expressions
Statistics Estimation
Choices of Evaluation Plans

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

Basic Steps in Query Processing (Cont.)
Parsing and translation
Parser checks syntax, verifies relations
Translator translates the query (e.g., SQL) into its internal form (e.g., relational algebra)
A SQL query can be translated into several relational algebra expressions.
E.g., select salary from instructor where salary < 75000;  salary75000(salary(instructor)), or salary(salary75000(instructor)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Basic Steps in Query Processing (Cont.) Each relational algebra operation can be evaluated using one of several different algorithms E.g., can search every tuple in instructor to find tuples with salary < 75000, or can use an index on salary to find instructors with salary < 75000 Correspondingly, a relational algebra expression can be evaluated in many ways. Annotated expression specifying detailed evaluation strategy is called an evaluation plan (or execution plan). Evaluation The query execution engine takes a query evaluation plan, executes that plan, and returns the answers to the query. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Basic Steps in Query Processing (Cont.) Query Optimization: Amongst all equivalent evaluation plans, choose the one with lowest cost. Cost is estimated using statistical information from the database catalog e.g. number of tuples in each relation, size of tuples, etc. In this lecture, we study how to measure query costs algorithms for evaluating relational algebra operations how to combine algorithms for individual operations in order to evaluate a complete expression how to optimize queries, that is, how to find an evaluation plan with lowest estimated cost ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Measures of Query Cost Cost can be measured based on response time, i.e., total elapsed time for answering query Many factors contribute to time cost disk access, CPU time, … Typically disk access is the predominant cost, and is also relatively easy to estimate. Real systems do take CPU cost into account Disk cost can be estimated as number of block transfers from disk and number of seeks tT – time to transfer one block tS – time for one seek Cost for b block transfers plus S seeks b * tT + S * tS ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Measures of Query Cost (Cont.) We do not include cost of writing output to disk in our cost formulae The output of an operation may be sent to the next operation without being written to disk. Worst case estimates No data is initially in buffer, i.e., data must be read from disk initially Only the minimum amount of memory needed for the operation is available Several algorithms can reduce disk I/O by using extra buffer space Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during execution ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Selection Operation File scan Algorithm A1 (linear search). Scan each file block (stored contiguously) and test all records to see whether they satisfy the selection condition. Cost estimate = tS + br * tT br denotes number of blocks in relation r If selection is on a key attribute, can stop on finding record average cost = tS + (br /2)* tT  Linear search can be applied regardless of selection condition, or ordering of records in the file, or availability of indices ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Selections Using Indices Index scan – search algorithms that use an index (B+-tree) selection condition must be on search-key of index. A2 (primary index, equality on key). Retrieve a single record that satisfies the corresponding equality condition Cost = (hi + 1) * (tT + tS) hi = height of the B+-tree index traverse the height of the tree + one I/O to fetch the record; each requires a seek and a block transfer A3 (primary index, equality on nonkey). Retrieve multiple records on consecutive blocks. Cost = hi * (tT + tS) + tS + b * tT b = number of blocks containing matching records ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Selections Using Indices (Cont.) A4 (secondary index, equality). Retrieve a single record if the search-key is a candidate key Cost = (hi + 1) * (tT + tS) Retrieve multiple records if search-key is not a candidate key each of n matching records may be on a different block Cost = (hi + n) * (tT + tS) Can be very expensive! ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Selections Involving Comparisons Can implement selections of the form AV (r) or A  V(r) by using a linear file scan, or by using indices in the following ways: A5 (primary index, comparison). (Relation is sorted on A) For A  V(r), use index to find first tuple  V and scan relation sequentially from there (cost estimate is identical to A3) For AV (r), use file scan till first tuple > V; do not use index
A6 (secondary index, comparison).
For A  V(r), use index to find first index entry  V and scan leaf index nodes sequentially from there, to find pointers to records (cost estimate is identical to A4, equality on nonkey)
For AV (r), just scan leaf index nodes finding pointers to records, till first entry > V
In either case, retrieve records that are pointed to
requires an I/O for each record
linear file scan may be cheaper

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

Implementation of Complex Selections
Conjunction: 1 2. . . n(r)
A7 (conjunctive selection using one index).
Select a combination of i and algorithms A1 through A6 that results in the least cost for i (r).
Test other conditions on tuple after fetching it into memory buffer.
A8 (conjunctive selection using composite index).
Use appropriate composite (multiple-key) index if available.
A9 (conjunctive selection by intersection of identifiers).
Requires indices with record pointers.
Use corresponding index for each condition, and take intersection of all the obtained sets of record pointers.
Then fetch records from file.
If some conditions do not have appropriate indices, apply test in memory.

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

14

Algorithms for Complex Selections
Disjunction:1 2 . . . n (r).
A10 (disjunctive selection by union of identifiers).
Applicable if all conditions have available indices.
Otherwise use linear scan.
Use corresponding index for each condition, and take union of all the obtained sets of record pointers.
Then fetch records from file.
Negation: (r)
Use linear scan on file.
If very few records satisfy , and an index is applicable to 
Find satisfying records using index and fetch from file.

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

15

Query Processing and Optimization
Overview
Measures of Query Cost
Selection Operation
Sort Operation
Join Operation
nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join
Other Operations
Evaluation of Expressions
Transformation of Relational Expressions
Statistics Estimation
Choices of Evaluation Plans

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

Sorting
Sorting is important because
SQL queries can specify that the output be sorted, and,
for query processing, several of the relational operations, such as joins, can be implemented efficiently if the input relations are first sorted.
For relations that fit in memory, techniques like quicksort can be used. For relations that don’t fit in memory, external
sort-merge is a good choice.

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

External Sort-Merge
Create sorted runs. Let i be 0 initially.
Repeatedly do the following till the end of the relation:
(a) Read M blocks of relation into memory
(b) Sort the in-memory blocks
(c) Write sorted data to run Ri; increment i.
Let the final value of i be N
Merge the runs (next slide)…..
Let M denote memory size (in pages=blocks).

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

External Sort-Merge (Cont.)
Merge the runs (N-way merge). We assume (for now) that N < M. Use N blocks of memory to buffer input runs, and 1 block to buffer output. Read the first block of each run into its buffer page repeat Select the first record (in sort order) among all buffer pages Write the record to the output buffer. If the output buffer is full, write it to disk. Delete the record from its input buffer page. If the buffer page becomes empty then read the next block (if any) of the run into the buffer. until all input buffer pages are empty ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts External Sort-Merge (Cont.) If N  M, several merge passes are required. In each pass, contiguous groups of M - 1 runs are merged to get a single run for the next pass. A pass reduces the number of runs by a factor of M -1, and creates runs longer by the same factor. E.g. If M=11, and there are 90 runs, one pass reduces the number of runs to 9, each 10 times the size of the initial runs Repeated passes are performed till all runs have been merged into one. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Example: External Sorting Using Sort-Merge memory size: 3 blocks block size: one tuple For merging two input blocks one output block ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts External Merge Sort (Cont.) Cost analysis: Initial number of runs: br / M Using bb buffer blocks per run during merge, i.e., read/write bb blocks at a time, can merge M / bb  - 1 runs in one pass Total number of merge passes required: logM / bb–1 (br / M). Block transfers for initial run creation and in each pass is 2br for final pass, we don’t count write cost Reminder: ignore final write cost Thus total number of block transfers for external sorting: br ( 2 log M / bb–1(br / M) + 1) What is the number of block transfers in the example shown on the last slide? Seeks: next slide ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts External Merge Sort (Cont.) Cost of seeks During run generation: one seek to read each run and one seek to write each run 2 br / M During the merge phase Need 2 br / bb seeks for each merge pass except the final one which does not require a write Total number of seeks: 2 br / M + br / bb (2 log M / bb–1(br / M) -1) What is the cost of seeks in the example shown on the last slide? ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Join Operation Several different algorithms to implement joins Nested-loop join Block nested-loop join Indexed nested-loop join Merge-join Hash-join Choice based on cost estimate Examples use the following information Number of records (n) of student: 5,000 takes: 10,000 Number of blocks (b) of student: 100 takes: 400 ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Nested-Loop Join To compute the theta join r ⨝ s for each tuple tr in r do begin for each tuple ts in s do begin test pair (tr, ts) to see if they satisfy the join condition  if they do, add tr • ts to the result. end end r is called the outer relation and s the inner relation of the join.  Requires no indices and can be used with any kind of join condition.  Expensive since it examines every pair of tuples in the two relations. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Nested-Loop Join (Cont.) In the worst case, if there is enough memory only to hold one block of each relation, we have to perform a complete scan on s for each record in r and the estimated cost is (nr  bs + br) block transfers + (nr + br) seeks where n: no. of tuples, b: no. of blocks If the smaller relation fits entirely in memory, use that as the inner relation. Reduces cost to br + bs block transfers and 2 seeks Example Assume worst case memory availability, cost estimate is with student as outer relation: 5000  400 + 100 = 2,000,100 block transfers, 5000 + 100 = 5100 seeks with takes as the outer relation 10000  100 + 400 = 1,000,400 block transfers and 10,400 seeks If smaller relation (student) fits entirely in memory, the cost estimate will be 500 block transfers and 2 seeks Block nested-loops algorithm (next slide) is preferable. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Block Nested-Loop Join Variant of nested-loop join in which every block of inner relation is paired with every block of outer relation. for each block Br of r do begin for each block Bs of s do begin for each tuple tr in Br do begin for each tuple ts in Bs do begin Check if (tr, ts) satisfy the join condition if they do, add tr • ts to the result. end end end end ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Block Nested-Loop Join (Cont.) Worst case estimate: br  bs + br block transfers + 2  br seeks Each block in the inner relation s is read once for each block in the outer relation More efficient to use the smaller relation as the outer relation Best case: br + bs block transfers + 2 seeks. Example Assuming worst case memory availability, cost estimate is with student as outer relation: 100  400 + 100 = 40,100 block transfers, 2  100 = 200 seeks The best-case cost remains the same. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Indexed Nested-Loop Join Index lookups can replace file scans if an index is available on the inner relation’s join attribute. For each tuple tr in the outer relation r, use the index to look up tuples in s that satisfy the join condition with tuple tr. Worst case: buffer has space for only one block of r and one block of the index. Cost of the join: br (tT + tS) + nr  c where c is the cost of traversing index and fetching all matching s tuples for one tuple of r c can be estimated as cost of a single selection on s using the join condition. If indices are available on join attributes of both r and s, use the relation with fewer tuples as the outer relation. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Example of Nested-Loop Join Costs Compute student takes, with student as the outer relation. Let takes have a primary B+-tree index on the attribute ID, which contains 20 entries in each index node. Since takes has 10,000 tuples, the height of the tree is 4, and one more access is needed to find the actual data student has 5000 tuples Cost of block nested loops join 100 * 400 + 100 = 40,100 block transfers, 2 * 100 = 200 seeks Cost of indexed nested loops join 100 + 5000 * 5 = 25,100 block transfers and seeks. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Merge-Join Sort both relations on their join attribute (if not already sorted on the join attributes). Merge the sorted relations to join them Similar to the merge stage of the sort-merge algorithm. A group of tuples of one relation with the same join-attribute values is read into a set Skip tuples of another relation with the join-attribute values smaller than the current join-attribute value of those tuples in the set Join every tuple in the set with the tuples of another relation with the same join-attribute values Detailed algorithm in book ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Merge-Join A group of tuples of one relation with the same value on the join attributes is read into a set before matching. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Merge-Join (Cont.) Can be used only for equi-joins and natural joins Each block needs to be read only once Thus the cost of merge join is (assuming bb buffer blocks are allocated to each relation): br + bs block transfers + br / bb + bs / bb seeks + the cost of sorting if relations are unsorted. Example Assuming worst case memory availability and the relations are already sorted on the join attribute, cost estimate 400 + 100 = 500 block transfers, 400 + 100 = 500 seeks (assuming bb =1) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Hash-Join Applicable for equi-joins and natural joins. A hash function h is used to partition tuples of both relations into sets that have the same hash value on the join attributes. h maps JoinAttrs values to {0, 1, ..., n-1}, where JoinAttrs denotes the common attributes of r and s used in the natural join. r0, r1, . . ., rn-1 denote partitions of r tuples Each tuple tr  r is put in partition ri where i = h(tr [JoinAttrs]). s0, s1,. . ., sn-1 denote partitions of s tuples Each tuple ts  s is put in partition si where i = h(ts [JoinAttrs]). ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Hash-Join (Cont.) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Hash-Join (Cont.) An r tuple and an s tuple that satisfy the join condition will have the same value for the join attributes. If that value is hashed to some value i, the r tuple has to be in ri and the s tuple in si. As a result, r tuples in ri need only to be compared with s tuples in si Need not be compared with s tuples in any other partition. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Hash-Join Algorithm 1. Partition the relation s using hash function h on the join attributes. When partitioning a relation, one block of memory is reserved as the output buffer for each partition. 2. Partition r similarly. 3. For each i: // perform indexed nested-loop join on each partition (a) Load si into memory and build an in-memory hash index on it using the join attribute. This hash index uses a different hash function than the earlier one h. (b) Read the tuples in ri from the disk block by block. For each tuple tr, locate each matching tuple ts in si using the in-memory hash index. Output the concatenation of their attributes. The hash-join of r and s is computed as follows. Relation s is called the build input and r is called the probe input. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Hash-Join Algorithm (Cont.) The value n and the hash function h is chosen such that each si should fit in memory for build and probe. The probe relation partitions need not fit in memory Typically, M > bs/n  n > bs/M
Note that n < M for partitioning ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Cost of Hash-Join Cost of hash join is for partitioning, a complete reading and a subsequent writing back of both relations: 2(br + bs) block transfers + 2(br / bb + bs / bb) seeks for build and probe, a reading of each of the partitions once: (br + bs) block transfers + 2  n seeks total: 3(br + bs) block transfers + 2(br / bb + bs / bb) + 2  n seeks If the entire build input can be kept in main memory, no partitioning is required Cost estimate goes down to br + bs block transfers + 2 seeks ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Example of Cost of Hash-Join Assume that memory size is 22 blocks bstudent = 100 and btakes = 400. student is to be used as build input. Partition it into five partitions, each of size 20 blocks. Similarly, partition takes into five partitions, each of size 80. Therefore total cost, assuming 3 blocks are allocated to the input and each of the 5 outputs during partitioning (bb=3): 3(100 + 400) = 1500 block transfers + 2( 100/3 + 400/3) + 25 = 346 seeks takes student ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Complex Joins Join with a conjunctive condition: r ⨝ 1  2...   n s Either use nested loops/block nested loops, or Compute the result of one of the simpler joins r ⨝ i s final result comprises those tuples in the intermediate result that satisfy the remaining conditions 1  . . .  i –1  i +1  . . .  n Join with a disjunctive condition r ⨝ 1  2 ...  n s Either use nested loops/block nested loops, or Compute as the union of the records in individual joins r ⨝ i s: (r ⨝ 1 s)  (r ⨝ 2 s)  . . .  (r ⨝ n s) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 42 Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Other Operations: Duplicate elimination Duplicate elimination can be implemented via hashing or sorting. On sorting, duplicates will come adjacent to each other, and all but one copy can be deleted In external sort merge, duplicates can be deleted during run generation and at intermediate merge steps. On hashing, duplicates will come into the same bucket relation is partitioned on the basis of a hash function on the whole tuple read in each partition, construct an in-memory hash index and insert a tuple only if it is not already present, otherwise, discard it. Worst-case cost estimate is the same as that for sorting or hash join. Because of the relatively high cost, SQL requires an explicit request to remove duplicates. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Other Operations : Projection and Aggregation Projection: perform projection on each tuple followed by duplicate elimination. Aggregation can be implemented in a manner similar to duplicate elimination, but based on the grouping attributes. Sorting or hashing can be used to bring tuples in the same group together, and then the aggregate functions can be applied on each group. The cost estimate is the same as that of duplicate elimination. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Other Operations : Set Operations Set operations (,  and ): can either use variant of merge-join after sorting, or variant of hash-join. E.g., Set operations using hashing: Partition both relations using the same hash function Process each partition i as follows. Using a different hashing function, build an in-memory hash index on ri. Process si as follows (next slide) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Other Operations : Set Operations (Cont.) Process si as follows r  s: Add tuples in si to the hash index if they are not already in it. At end of si, add the tuples in the hash index to the result. r  s: Output tuples in si to the result if they are already there in the hash index r – s: For each tuple in si, if it is there in the hash index, delete it from the index. At end of si, add remaining tuples in the hash index to the result. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Evaluation of Expressions So far: we have seen algorithms for individual operations Alternatives for evaluating an entire expression Materialization evaluate one operation at a time in an appropriate order materialize (store) the result of each evaluation in a temporary relation for subsequent use Pipelining evaluate several operations simultaneously in a pipeline pass on the results of one operation to the next, without the need to store a temporary relation ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Materialization Materialized evaluation: evaluate one operation at a time, starting at the lowest-level. Use intermediate results materialized into temporary relations to evaluate next-level operations. E.g., the following operator tree computes and stores then computes and stores its join with instructor, and finally computes the projection on name. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Materialization (Cont.)  Materialized evaluation is always applicable  Cost of writing results to disk and reading them back can be quite high Our cost formulas for operations ignore cost of writing results to disk, so Overall cost = Sum of costs of individual operations + cost of writing intermediate results to disk ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Pipelining Pipelined evaluation : evaluate several operations simultaneously, passing the results of one operation on to the next. E.g., in previous operator tree, don’t store result of instead, pass tuples directly to the join. Similarly, don’t store result of join, pass tuples directly to projection.  Much cheaper than materialization: no need to store a temporary relation to disk. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Pipelining (Cont.)  However, pipelining may not always be possible. Some blocking operations, e.g., sort, may not be able to output any results until all tuples from their inputs have been examined. Other operations, such as join, are not inherently blocking but specific algorithms may be blocking. Hash-join requires both its inputs to be fully partitioned before it outputs any tuples. Indexed nested-loop join can output result tuples as it gets tuples for the outer relation (pipelined on its outer relation). Merge join can be pipelined if both inputs are sorted on the join attribute and the join condition is an equi-join. For pipelining to be effective, use evaluation algorithms that generate output tuples even as tuples are received for inputs to the operation. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Blocking Operations Blocking operations: cannot generate any output until all inputs are consumed E.g. sorting, aggregation, … But, some blocking operators can often consume inputs from a pipeline, or produce outputs to a pipeline Such operations actually execute in stages and blocking actually happens between two stages of the operation. For sort: run generation and merge For hash join: partitioning and build-probe ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Pipeline Stages Pipeline stages: All operations in a stage run concurrently A stage can start only after preceding stages have completed execution Example: Hash-join is a blocking operation since it requires both its inputs to be fully partitioned before it outputs any tuples. The partitioning step for each input can be pipelined with its input The build-probe step can be pipelined with its output The build-probe step can start only after partitioning has been completed on both inputs ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Optimization Reminder: Query Optimization: Amongst all equivalent evaluation plans, choose the one with lowest cost. Annotated expression specifying detailed evaluation strategy is called an evaluation plan (or execution plan). Alternative ways of evaluating a given query Different algorithms for each operation Equivalent expressions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Optimization Example: Find the names of all instructors in the Music department together with the course title of all the courses that the instructors teach. instructor(ID, name, dept_name, salary) teaches(ID, course_id, sec_id, semester, year) course(course_id, title, dept_name, credits) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Optimization (Cont.) An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Optimization (Cont.) Cost difference between evaluation plans for a query can be enormous E.g. seconds vs. days in some cases Steps in cost-based query optimization Generate logically equivalent expressions using equivalence rules Annotate resultant expressions to get alternative query plans Choose the cheapest plan based on estimated cost Estimation of plan cost based on: Statistical information about relations. Examples: no. of tuples, index depths Cost formulae for algorithms, computed using statistics Statistics estimation for intermediate results to compute cost of complex expressions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Transformation of Relational Expressions Two relational algebra expressions are said to be equivalent if the two expressions generate the same set of tuples on every legal database instance Note: order of tuples is irrelevant An equivalence rule says that expressions of two forms are equivalent Can replace expression of first form by second, or vice versa However, the rules do not say that one is better than the other. Notation:  : predicate L : list of attributes E : relation or relational-algebra expression ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Equivalence Rules Conjunctive selection operations can be deconstructed into a sequence of individual selections. σ1  2 (E) ≡ σ1 (σ2 (E)) Selection operations are commutative. σ1(σ2(E)) ≡ σ2 (σ1(E)) Only the last in a sequence of projection operations is needed, the others can be omitted.  L1( L2(…( Ln(E))…)) ≡  L1(E) where L1 ⊆ L2 … ⊆ Ln Selections can be combined with Cartesian products and theta joins. σ (E1 x E2) ≡ E1 ⨝  E2 σ 1 (E1 ⨝2 E2) ≡ E1 ⨝ 1∧2 E2 ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 62 Equivalence Rules (Cont.) Theta-join operations (and natural joins) are commutative. E1 ⨝ E2 ≡ E2 ⨝ E1 6. (a) Natural join operations are associative: (E1 ⨝ E2) ⨝ E3 ≡ E1 ⨝ (E2 ⨝ E3) (b) Theta joins are associative in the following manner: (E1 ⨝ 1 E2) ⨝ 2  3 E3 ≡ E1 ⨝1  3 (E2 ⨝ 2 E3) where 2 involves attributes from only E2 and E3. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 63 Equivalence Rules (Cont.) The selection operation distributes over the theta join operation under the following two conditions: (a) When all the attributes in 0 involve only the attributes of one of the expressions (E1) being joined. 0 E1 ⨝ E2) ≡ (0(E1)) ⨝ E2 (b) When 1 involves only the attributes of E1 and 2 involves only the attributes of E2. 1  2 E1 ⨝ E2) ≡ (1(E1)) ⨝ (2(E2)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 64 Equivalence Rules (Cont.) 8. The projection operation distributes over the theta join operation as follows: (a) Let L1 and L2 be sets of attributes from E1 and E2, respectively, if  involves only attributes from L1  L2:  L1  L2(E1 ⨝ E2) ≡  L1(E1) ⨝  L2(E2) (b) Consider a join E1 ⨝ E2. Let L1 and L2 be sets of attributes from E1 and E2, respectively. Let L3 be attributes of E1 that are involved in join condition , but are not in L1  L2, and let L4 be attributes of E2 that are involved in join condition , but are not in L1  L2.  L1  L2(E1 ⨝ E2) ≡  L1  L2( L1  L3(E1) ⨝  L2  L4(E2)) Similar equivalences hold for outerjoin operations: ⟕, ⟖, and ⟗ ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Equivalence Rules (Cont.) The set operations union and intersection are commutative E1  E2 ≡ E2  E1 E1  E2 ≡ E2  E1 (set difference is not commutative). 10. Set union and intersection are associative. (E1  E2)  E3 ≡ E1  (E2  E3) (E1  E2)  E3 ≡ E1  (E2  E3) The selection operation distributes over ,  and –. a.  (E1  E2) ≡  (E1)  (E2) b.  (E1  E2) ≡  (E1)  (E2) c.  (E1 – E2) ≡  (E1) – (E2) d.  (E1  E2) ≡ (E1)  E2 e.  (E1 – E2) ≡ (E1) – E2 ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Equivalence Rules (Cont.) 12. The projection operation distributes over union L(E1  E2) ≡ (L(E1))  (L(E2)) 13. Selection distributes over aggregation as below (G𝛾A(E)) ≡ G𝛾A((E)) provided  only involves attributes in G ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Transformation Example: Pushing Selections Query: Find the names of all instructors in the Music department, along with the titles of the courses that they teach name, title(dept_name= “Music” (instructor ⨝ (teaches ⨝ course_id, title (course)))) Transformation using rule 7a. name, title((dept_name= “Music”(instructor)) ⨝ (teaches ⨝ course_id, title (course))) Performing the selection as early as possible reduces the size of the relation to be joined. instructor(ID, name, dept_name, salary) teaches(ID, course_id, sec_id, semester, year) course(course_id, title, dept_name, credits) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Example with Multiple Transformations Query: Find the names of all instructors in the Music department who have taught a course in 2009, along with the titles of the courses that they taught name, title(dept_name= “Music”year = 2009 (instructor ⨝ (teaches ⨝ course_id, title (course)))) Transformation using join associativity (Rule 6a): name, title(dept_name= “Music”year = 2009 ((instructor ⨝ teaches) ⨝ course_id, title (course))) Second form provides an opportunity to apply the “perform selections early” rule (Rule 7b), resulting in the subexpression dept_name = “Music” (instructor) ⨝  year = 2009 (teaches) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Multiple Transformations (Cont.) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Transformation Example: Pushing Projections Consider: name, title((dept_name= “Music” (instructor) ⨝ teaches) ⨝ course_id, title (course)) When we compute (dept_name = “Music” (instructor) ⨝ teaches) we obtain a relation whose schema is: (ID, name, dept_name, salary, course_id, sec_id, semester, year) Push projections using equivalence rule 8b; eliminate unneeded attributes from intermediate results to get: name, title((name, course_id ((dept_name= “Music” (instructor)) ⨝ teaches)) ⨝ course_id, title (course)) Performing the projection as early as possible reduces the size of the relation to be joined. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Join Ordering Example A good join ordering is important for reducing the size of temporary results. For all relations r1, r2, and r3, (r1 ⨝ r2) ⨝ r3 = r1 ⨝ (r2 ⨝ r3 ) (Join Associativity)(Rule 6a) If r2 ⨝ r3 is quite large and r1 ⨝ r2 is small, we choose (r1 ⨝ r2) ⨝ r3 so that we compute and store a smaller temporary relation. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Join Ordering Example (Cont.) Consider the expression name, title((dept_name= “Music” (instructor)) ⨝ teaches ⨝ course_id, title (course)) Could compute teaches ⨝ course_id, title (course) first but the result is likely to be a large relation because it contains one tuple for every course. Only a small fraction of the university’s instructors are likely to be from the Music department, it is better to compute dept_name= “Music” (instructor) ⨝ teaches first. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Statistics Estimation Reminder: Cost of each operator needs statistics of input relations The database-system catalog stores the following statistical information about database relations. nr : number of tuples in a relation r. br : number of blocks containing tuples of r. lr : size of a tuple of r. fr : blocking factor of r, i.e., the number of tuples of r that fit into one block. V(A, r) : number of distinct values that appear in r for attribute A; same as the size of A(r). If tuples of r are stored together physically in a file, then: Others: heights of B+-tree indices, no. of leaf pages in the indices However, inputs can be results of sub-expressions Need to estimate statistics of expression results ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Statistics Estimation (Cont.) A query-evaluation plan that has the lowest estimated execution cost may NOT have the lowest actual execution cost because the estimates are based on assumptions that may not hold exactly. However, real-world experience has shown that the plans with the lowest estimated costs usually have actual execution costs that are either the lowest, or are close to the lowest. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Selection Size Estimation A=v(r) Assume uniform distribution (though may not be correct) nr / V(A,r) : number of records that will satisfy the selection Equality condition on a key attribute: size estimate = 1 AV(r) (case of A  V(r) is symmetric) Let c denote the estimated number of tuples satisfying the condition. If min(A,r) and max(A,r) are available in catalog c = 0 if v < min(A,r) c = nr if v  max(A,r) c = , otherwise If the value v is not available, c is assumed to be nr / 2. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Size Estimation of Complex Selections The selectivity of a condition i is the probability that a tuple in the relation r satisfies i . If si is the number of satisfying tuples in r, the selectivity of i is given by si /nr. Conjunction: 1 2. . .  n (r). Assuming independence, estimate of tuples in the result is: Disjunction:1 2 . . .  n (r). Estimated number of tuples: Negation: (r). Estimated number of tuples: nr – size((r)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 78 Estimation of the Size of Joins The Cartesian product r x s contains nr  ns tuples; each tuple occupies lr + ls bytes. If R  S = , then r ⨝ s is the same as r x s. If R  S is a key for R, then a tuple of s will join with at most one tuple from r therefore, the number of tuples in r ⨝ s is no greater than the number of tuples in s. If R  S in S is a foreign key in S referencing R, then the number of tuples in r ⨝ s is exactly the same as the number of tuples in s. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Estimation of the Size of Joins (Cont.) If R  S = {A} is not a key for R or S. If we assume that every tuple t in R produces tuples in R ⨝ S, the number of tuples in R ⨝ S is estimated to be: If the reverse is true, the estimate obtained will be: The lower of these two estimates is probably the more accurate one. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Join Operation Example Running example: student ⨝ takes Catalog information for join examples: nstudent = 5,000. fstudent = 50, which implies that bstudent =5000/50 = 100. ntakes = 10000. ftakes = 25, which implies that btakes = 10000/25 = 400. V(ID, student) = 5000 (primary key!) V(ID, takes) = 2500, which implies that on average, each student who has taken a course has taken 4 courses. Attribute ID in takes is a foreign key referencing student. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Join Operation Example (Cont.) Example: student takes, ID in takes is a foreign key referencing student hence, the result has exactly ntakes tuples, which is 10000 Example: compute the size estimates for student ⨝ takes without using information about foreign keys: V(ID, takes) = 2500, and V(ID, student) = 5000 The two estimates are 5000 * 10000/2500 = 20,000 and 5000 * 10000/5000 = 10000 We choose the lower estimate, which in this case, is the same as our earlier computation using foreign keys. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Size Estimation for Other Operations Projection: estimated size of A(r) = V(A,r), duplicates eliminated Aggregation: estimated size of G𝛾F(r) = V(G,r), one tuple in G𝛾F(r) for each distinct value of G. Set operations For unions/intersections of selections on the same relation: rewrite and use size estimate for selections E.g. 1 (r)  2 (r) can be rewritten as 12 (r) For operations on different relations: estimated size of r  s = size of r + size of s. estimated size of r  s = minimum size of r and size of s. estimated size of r – s = r. All the three estimates may be quite inaccurate, but provide upper bounds on the sizes. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Estimation of Number of Distinct Values Selections:  (r) If  forces A to take a specified value: V(A, (r)) = 1. e.g., A = 3 If  forces A to take on one of a specified set of values: V(A, (r)) = number of specified values. e.g., (A = 1  A = 3  A = 4) If the selection condition  is of the form A op v, where op is a comparison operator: estimated V(A, (r)) = V(A, r) * s where s is the selectivity of the selection. In all the other cases: use approximate estimate of min(V(A,r), n (r) ) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 84 Estimation of Distinct Values (Cont.) Joins: r ⨝ s If all attributes in A are from r estimated V(A, r ⨝ s) = min (V(A, r), n r ⨝ s) If A contains attributes A1 from r and A2 from s, then estimated V(A, r ⨝ s) = min(V(A1, r)*V(A2 – A1, s), V(A1 – A2, r)*V(A2, s), nr ⨝ s) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 85 Estimation of Distinct Values (Cont.) Estimation of distinct values are straightforward for projections. They are the same in A (r) as in r. The same holds for grouping attributes of aggregation. For aggregated values For min(A) and max(A), the number of distinct values can be estimated as min(V(A,r), V(G,r)) where G denotes grouping attributes For other aggregates, assume all values are distinct, and use V(G,r) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 86 Query Processing and Optimization Overview Measures of Query Cost Selection Operation Sort Operation Join Operation nested-loop, block nested-loop, indexed nested-loop, merge-join and hash join Other Operations Evaluation of Expressions Transformation of Relational Expressions Statistics Estimation Choices of Evaluation Plans ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Choice of Evaluation Plans Choosing the cheapest algorithm for each operation independently may not yield best overall algorithm. Must consider the interaction of evaluation techniques when choosing evaluation plans merge-join may be costlier than hash-join, but may provide a sorted output which reduces the cost for an outer level aggregation. nested-loop join may provide opportunity for pipelining Practical query optimizers incorporate elements of the following two broad approaches: 1. Search all the plans and choose the best plan in a cost-based fashion. 2. Uses heuristics to choose a plan (at the potential risk of not finding the optimal plan). ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Cost-Based Optimization Consider finding the best join-order for r1 ⨝ r2 ⨝ . . . ⨝ rn. There are (2(n – 1))!/(n – 1)! different join orders for above expression. n = 3, the number is 12 n = 7, the number is 665280 n = 10, the number is greater than 176 billion! No need to generate all the join orders. Using dynamic programming, the least-cost join order for any subset of {r1, r2, . . . rn} can be computed only once and stored for future use. Example: Find the best join order of the form (r1 ⨝ r2 ⨝ r3) ⨝ r4 ⨝ r5 There are 12 different join orders for computing r1 ⨝ r2 ⨝ r3 and 12 orders for computing the join of this result with r4 and r5. We first find the best join order for the subset of relations {r1, r2, r3}, we can use that order for further joins with r4 and r5, and ignore all more expensive join orders of r1 ⨝ r2 ⨝ r3. Thus, only 12 + 12 choices, instead of 144 join orders are examined. Read pseudo-code in book for details. Cost-based optimization is expensive, but worthwhile for queries on large datasets (typical queries have small n, generally < 10) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Heuristic Optimization Cost-based optimization is expensive. Systems may use heuristics to reduce the number of choices that must be made in a cost-based fashion. Heuristic optimization transforms the query-tree by using a set of rules that typically (but not in all cases) improve execution performance: Perform selection early (reduces the number of tuples) Perform projection early (reduces the number of attributes) Perform most restrictive selection and join operations (i.e. with smallest result size) before other similar operations. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Heuristic Optimization (Cont.) Many optimizers consider only left-deep join orders (instead of all join orders). Right-hand-side input for each join is a relation, not the result of an intermediate join  Convenient for pipelined evaluation since the right operand is a stored relation  Reduces optimization complexity For n=3, there are 6 ways. Some versions of Oracle, consider n evaluation plans in a left-deep join order for an n-way join Starting from each of the n relations Repeatedly pick “best” relation to join next on the basis of a ranking of the available access paths ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Heuristic Optimization (Cont.) Search for optimal plan is terminated when the optimization cost budget is exceeded and the best plan found so far is returned Optimizers usually first apply cheap heuristics to find a plan Then, start a full cost-based optimization with a budget based on the heuristically chosen plan Plan caching reuses previous optimal query plan if query is resubmitted (with different constants in query). Example: a query to find the courses for which a student has registered in a university application. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Concluding Remarks Real-life observations The difference in execution time between a good plan and a bad one may be huge. The added cost of cost-based query optimization is usually more than offset by the saving at query-execution time, which is dominated by slow disk accesses. The achieved saving is magnified in applications that run on a regular basis, where a query can be optimized once, and the selected query plan can be used many times. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts ) ( " Watson " department building = s ú ú ú ú ù ê ê ê ê é = r f r n r b ) , min( ) , max( ) , min( . r A r A r A v n r - - n r n r n sss n ∗∗∗ ∗ . . . 21 n r n r n sss n *** * . . . 21 ⎟⎟ ⎠ ⎞ ⎜⎜ ⎝ ⎛ −∗∗−∗−−∗ )1(...)1()1(1 21 r n rr r n s n s n s n ÷ ÷ ø ö ç ç è æ -**-*--* )1(...)1()1(1 21 r n rr r n s n s n s n ) , ( s A V n n s r * ) , ( r A V n n s r * /docProps/thumbnail.jpeg