程序代写代做代考 database SQL AI Chapter 5: Other Relational Languages

Chapter 5: Other Relational Languages

A Big Picture:
Basic Steps in Query Processing

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

A Big Picture:
Basic Steps in Query Processing (Cont.)
A SQL query can be translated into several relational algebra expressions (internal form).
E.g., select salary from instructor where salary < 75000;  salary75000(salary(instructor)), or salary(salary75000(instructor)) The relational algebra representation of a query specifies (partially) how to evaluate a query. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Relational Algebra Database System Concepts ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 3 Relational Algebra A quick review of relational database Six basic operators select:  project:  union:  set difference: – Cartesian product: x rename:  Additional operations that simplify common queries set intersection join assignment outer join Extended relational algebra operations generalized projection aggregate functions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 4 Example of a Instructor Relation attributes (or columns) tuples (or rows) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 5 Attribute The set of allowed values for each attribute is called the domain of the attribute Attribute values are (normally) required to be atomic; that is, indivisible The special value null is a member of every domain, indicated that the value is “unknown” ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 6 Relation Schema and Instance A1, A2, …, An are attributes R = (A1, A2, …, An ) is a relation schema Example: instructor = (ID, name, dept_name, salary) Formally, given sets D1, D2, …. Dn, a relation r is a subset of D1 x D2 x … x Dn where Di is the domain of attribute Ai. Thus, a relation is a set of n-tuples (a1, a2, …, an) where each ai  Di The current values (relation instance) of a relation are specified by a table An element t of r is a tuple, represented by a row in a table ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 7 Relations are Unordered Order of tuples is irrelevant (tuples may be stored in an arbitrary order) Example: instructor relation with unordered tuples ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 8 Database A database consists of multiple relations Information about a university is broken up into parts instructor student advisor Bad design: univ (instructor-ID, name, dept_name, salary, student_ID, ..) results in repetition of information (e.g., two students have the same instructor) the need for null values (e.g., represent an student with no advisor) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 9 Keys Let K  R K is a superkey of R if values for K are sufficient to identify a unique tuple of each possible relation r(R) Example: {ID} and {ID,name} are both superkeys of instructor. Superkey K is a candidate key if K is minimal Example: {ID} is a candidate key for Instructor One of the candidate keys is selected to be the primary key. Foreign key constraint: Value in one relation must appear in another Referencing relation Referenced relation Example – dept_name in instructor is a foreign key from instructor referencing department ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 10 Schema Diagram for University Database ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Relational Algebra A quick review of relational database Six basic operators select:  project:  union:  set difference: – Cartesian product: x rename:  Additional operations that simplify common queries set intersection join assignment outer join Extended relational algebra operations generalized projection aggregate functions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 12 Select Operation The select operation selects tuples that satisfy a given predicate. Notation:  p(r) p is called the selection predicate Example: select those tuples of the instructor relation where the instructor is in the “Physics” department. Query:  dept_name=“Physics” (instructor) Result: ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 13 Select Operation (Cont.) We allow comparisons using =, , >, , <,  in the selection predicate. We can combine several predicates into a larger predicate by using the connectives:  (and),  (or),  (not) Example: Find the instructors in Physics with a salary greater $90,000, we write:  dept_name=“Physics”  salary > 90,000 (instructor)
Then select predicate may include comparisons between two attributes.
Example, find all departments whose name is the same as their building name:
 dept_name=building (department)

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

14

Project Operation
A unary operation that returns its argument relation, with certain attributes left out.
Notation:
 A1,A2,A3 ….Ak (r)
where A1, A2 are attribute names and r is a relation name.
The result is defined as the relation of k columns obtained by erasing the columns that are not listed
Duplicate rows removed from result, since relations are sets

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

15

Project Operation (Cont.)
Example: eliminate the dept_name attribute of instructor
Query:
ID, name, salary (instructor)
Result:

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

16

Composition of Relational Operations
The result of a relational-algebra operation is relation and therefore relational-algebra operations can be composed together into a relational-algebra expression.
Consider the query — Find the names of all instructors in the Physics department.
name( dept_name =“Physics” (instructor))
Instead of giving the name of a relation as the argument of the projection operation, we give an expression that evaluates to a relation.

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

17

Union Operation
The union operation allows us to combine two relations
Notation: r  s
Output the union of tuples from the two input relations
For r  s to be valid.
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: 2nd column
of r deals with the same type of values as does the 2nd
column of s)

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

18

Union Operation – Example
Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010 semester, or in both
Query:
course_id ( semester=“Fall” Λ year=2009 (section)) 
course_id ( semester=“Spring” Λ year=2010 (section))
Result:

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

19

Set Difference Operation
The set-difference operation allows us to find tuples that are in one relation but are not in another.
Notation: r – s
Produce a relation containing those tuples in r but not in s
Set differences must be taken between compatible relations.
r and s must have the same arity
attribute domains of r and s must be compatible

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

20

Set Difference Operation – Example
Example: to find all courses taught in the Fall 2009 semester, but not in the Spring 2010 semester
Query:
course_id ( semester=“Fall” Λ year=2009 (section)) −
course_id ( semester=“Spring” Λ year=2010 (section))
Result:

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

21

Cartesian-Product Operation
The Cartesian-product operation (denoted by X) allows us to combine information from any two relations.
Notation: r x s
Output all pairs of rows from the two input relations (regardless of whether or not they have the same values on common attributes)
Assume that attributes of r(R) and s(S) are disjoint. (That is, R  S = ).
If attributes of r(R) and s(S) are not disjoint, then renaming must be used.

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

22

Cartesian-Product Operation – Example
Example: the Cartesian product of the relations instructor and teaches is written as:
instructor X teaches
We construct a tuple of the result out of each possible pair of tuples: one from the instructor relation and one from the teaches relation (see next slide)
Since the instructor ID appears in both relations we distinguish between these attribute by attaching to the attribute the name of the relation from which the attribute originally came.
instructor.ID
teaches.ID

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

23

The instructor X teaches table

instructor
teaches

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

24

Example Queries
Find the names of all instructors in the Physics department, along with the course_id of all courses they have taught

Query 1
name,course_id (instructor.ID=teaches.ID (
 dept_name=“Physics” (instructor x teaches)))

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

25

Example Queries
There is often more than one way to write a query in relational algebra
The following two queries are equivalent, i.e., they give the same result
Query 1
name,course_id ( instructor.ID=teaches.ID (
 dept_name=“Physics” (instructor x teaches)))

Query 2
name,course_id (instructor.ID=teaches.ID (
 dept_name=“Physics” (instructor) x teaches))

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

26

Rename Operation
Allows us to name, and therefore to refer to, the results of relational-algebra expressions.
Allows us to refer to a relation by more than one name.
The expression
x (E)
returns the result of expression E under the name X
If a relational-algebra expression E has arity n, then
x(A1,A2, .. An) (E)
returns the result of expression E under the name X, and with the
attributes renamed to A1 , A2 , …., An .

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

27

Example Query
Find the highest salary in the university
Step 1: find instructor salaries that are less than some other instructor salary (i.e. not the highest)
using a copy of instructor under a new name d
instructor.salary ( instructor.salary < d.salary (instructor x d (instructor))) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Example Query Find the highest salary in the university Step 2: Find the highest salary salary (instructor) – instructor.salary ( instructor.salary < d.salary (instructor x d (instructor))) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Formal Definition A basic expression in the relational algebra consists of either one of the following: A relation in the database A constant relation Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: E1  E2 E1 – E2 E1 x E2 p(E1), p is a predicate on attributes in E1 s(E1), S is a list consisting of some of the attributes in E1 x(E1), x is the new name for the result of E1 ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 30 Relational Algebra A quick review of relational database Six basic operators select:  project:  union:  set difference: – Cartesian product: x rename:  Additional operations that simplify common queries set intersection Join assignment outer join Extended relational algebra operations generalized projection aggregate functions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 31 Additional Operations We define additional operations that do not add any power to the relational algebra, but that simplify common queries. Set intersection Join Assignment Outer join ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 32 Set-Intersection Operation The set-intersection operation allows us to find tuples that are in both the input relations. Notation: r  s Assume: r, s have the same arity attributes of r and s are compatible Note: r  s = r – (r – s) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 33 Set-Intersection Operation – Example Example: to find all courses taught in both the Fall 2009 and the Spring 2010 semesters Query: course_id ( semester=“Fall” Λ year=2009 (section))  course_id ( semester=“Spring” Λ year=2010 (section)) Result: ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 34 Join Operation The Cartesian-Product instructor X teaches associates every tuple of instructor with every tuple of teaches. Most of the resulting rows have information about instructors who did NOT teach a particular course. To get only those tuples of “instructor X teaches “ that pertain to instructors and the courses that they taught, we write:  instructor.id = teaches.id (instructor x teaches)) We get only those tuples of “instructor X teaches” that pertain to instructors and the courses that they taught. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 35 Join Operation (Cont.) The join operation allows us to combine a select operation and a Cartesian-Product operation into a single operation. Consider relations r(R) and s(S) Let “theta” be a predicate on attributes in the schema R “union” S. The join operation r s is defined as follows: Thus  instructor.id = teaches.id (instructor x teaches)) Can equivalently be written as instructor Instructor.id = teaches.id teaches. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 36 Join Operation (Cont.) The join operation without predicate is called natural join. Notation: r s Output pairs of rows from the two input relations that have the same value on all attributes that have the same name Let r and s be relations on schemas R and S respectively. Then, r s is a relation on schema R  S obtained as follows: Consider each pair of tuples tr from r and ts from s. If tr and ts have the same value on each of the attributes in R  S, add a tuple t to the result, where t has the same value as tr on r t has the same value as ts on s ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 37 Join Operation – Example Find the names of all instructors in the Comp. Sci. department together with the course titles of all the courses that the instructors teach  name, title ( dept_name=“Comp. Sci.” (instructor teaches course)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Assignment Operation It is convenient at times to write a relational-algebra expression by assigning parts of it to temporary relation variables. The assignment operation is denoted by  and works like assignment in a programming language. Example: Find all instructor in the “Physics” and Music department. Physics   dept_name=“Physics” (instructor) Music   dept_name=“Music” (instructor) Physics  Music With the assignment operation, a query can be written as a sequential program consisting of a series of assignments followed by an expression whose value is displayed as a result of the query. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 39 Outer Join An extension of the join operation that avoids loss of information. Computes the join and then adds tuples from one relation that does not match tuples in the other relation to the result of the join. Uses null value to signify that the value is unknown or does not exist Three forms of outer join: left outer join right outer join full outer join ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 40 Outer Join – Example Relation course Relation prereq Observe that CS-315 is missing in prereq and CS-347 is missing in course ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 41 Outer Join – Example Left Outer Join course ⟕ prereq Join course prereq ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 42 Outer Join – Example Right Outer Join course ⟖ prereq Full Outer Join course ⟗ prereq ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 43 Outer Join using Joins Outer join can be expressed using basic operations e.g. r ⟕ s can be written as (r s)  (r – R(r s)) x {(null, …, null)} where the constant relation {(null, …, null)} is on the schema S - R ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Relational Algebra A quick review of relational database Six basic operators select:  project:  union:  set difference: – Cartesian product: x rename:  Additional operations that simplify common queries set intersection join assignment outer join Extended relational algebra operations generalized projection aggregate functions ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 45 Generalized Projection Extends the projection operation by allowing arithmetic functions to be used in the projection list. E is any relational-algebra expression Each of F1, F2, …, Fn are arithmetic expressions involving constants and attributes in the schema of E. Given relation instructor(ID, name, dept_name, salary) where salary is annual salary, get the same information but with monthly salary ID, name, dept_name, salary/12 (instructor) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 46 Aggregate Functions and Operations Aggregate function takes a collection of values and returns a single value as a result. avg: average value min: minimum value max: maximum value sum: sum of values count: number of values Aggregate operation in relational algebra  E is any relational-algebra expression G1, G2 …, Gn is a list of attributes on which to group (can be empty) Each Fi is an aggregate function Each Ai is an attribute name ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 47 Aggregate Operation – Example Find the average salary in each department For convenience, we permit renaming as part of aggregate operation ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 48 Multiset Relational Algebra Pure relational algebra removes all duplicates e.g. after projection Multiset relational algebra retains duplicates, to match SQL semantics SQL duplicate retention was initially for efficiency, but is now a feature Multiset relational algebra defined as follows selection: has as many duplicates of a tuple as in the input, if the tuple satisfies the selection  (r1): If there are c1 copies of tuple t1 in r1, and t1 satisfies selection ,, then there are c1 copies of t1 in  (r1). projection: one tuple per input tuple, even if it is a duplicate A (r1): For each copy of tuple t1 in r1, there is a copy of tuple A (t1) in A (r1) where A (t1) denotes the projection of the single tuple t1. ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts Multiset Relational Algebra (Cont.) Multiset relational algebra defined as follows Cartesian product: r1 x r2 : If there are c1 copies of tuple t1 in r1 and c2 copies of tuple t2 in r2, there are c1 x c2 copies of the tuple t1. t2 in r1 x r2 Other operators similarly defined union: m + n copies intersection: min(m, n) copies difference: min(0, m – n) copies Example: Suppose multiset relations r1 (A, B) and r2 (C) are as follows: r1 = {(1, a) (2,a)} r2 = {(2), (3), (3)} Then B(r1) would be {(a), (a)}, while B(r1) x r2 would be {(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)} ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts SQL and Multiset Relational Algebra select A1, A2, ..., An from r1, r2, ..., rm where P is equivalent to the following expression in multiset relational algebra The select clause corresponds to the projection operation The from clause corresponds to the Cartesian product operation The where clause corresponds to the selection operation ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts SQL and Multiset Relational Algebra- Examples Find courses that ran in Fall 2009 or in Spring 2010 (select course_id from section where semester = ‘Fall’ and year = 2009) union (select course_id from section where semester = ‘Spring’ and year = 2010) is equivalent to course_id ( semester=“Fall” Λ year=2009 (section))  course_id ( semester=“Spring” Λ year=2010 (section)) Find courses that ran in Fall 2009 and in Spring 2010 (select course_id from section where semester = ‘Fall’ and year = 2009) intersect (select course_id from section where semester = ‘Spring’ and year = 2010) is equivalent to course_id ( semester=“Fall” Λ year=2009 (section))  course_id ( semester=“Spring” Λ year=2010 (section)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 52 SQL and Multiset Relational Algebra - Examples Find courses that ran in Fall 2009 but not in Spring 2010 (select course_id from section where semester = ‘Fall’ and year = 2009) except (select course_id from section where semester = ‘Spring’ and year = 2010) is equivalent to course_id ( semester=“Fall” Λ year=2009 (section)) − course_id ( semester=“Spring” Λ year=2010 (section)) ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 53 SQL and Multiset Relational Algebra (Cont.) Set operations union, intersect, and except automatically eliminate duplicates To retain all duplicates, use the corresponding multiset versions union all, intersect all and except all. Suppose a tuple occurs m times in r and n times in s, then, it occurs: m + n times in r union all s min(m,n) times in r intersect all s max(0, m – n) times in r except all s ©Silberschatz, Korth and Sudarshan ‹#› Database System Concepts 54 SQL and Multiset Relational Algebra (Cont.) select A1, A2, sum(A3) from r1, r2, ..., rm where P group by A1, A2 having count (A4) > 2
is equivalent to the following expression in multiset relational algebra

Find the average salary of instructors in each department
select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name;
is equivalent to

©Silberschatz, Korth and Sudarshan
‹#›
Database System Concepts

)
(

)
(
,
),
(
),
(
,
,
,
2
2
1
1
2
1
E
n
n
n
A
F
A
F
A
F
G
G
G
K
K

))
(
(
2
1
,
,
,
2
1
m
P
A
A
A
r
r
r
n
´
´
´
Õ
K
K
s

/docProps/thumbnail.jpeg