CS计算机代考程序代写 database SQL AI CS3402

CS3402
1
CS3402 : Chapter 7 Relational Algebra
Semester B, 2020/2021

Relational Algebra
 Relational algebra: a formal language for the relational model
 The operations in relational algebra enable a user to specify basic
retrieval requests (or queries)
 Relational algebra consists of a set of operations on relations to
generate relations
 The result of an operation is a new relation
They can be further manipulated using operations
 A sequence of relational algebra operations forms a relational
algebra expression
CS3402 2

Importance of Relational Algebra
 Relational algebra provides a formal foundation for relational model  It is used as a basis for implementing and optimizing queries in
query processing and optimization
 Its concepts are incorporated into SQL standard language for
CS3402
3
relational database management systems
The internal modules of most commercial RDBMS are based on relational algebra

Relational Algebra Overview
 Relational algebra consists of several groups of operations Unary Relational Operations
CS3402
4
SELECT (symbol: σ (sigma)) PROJECT (symbol: π (pi)) RENAME (symbol: ρ (rho))
Binary Relational Operations
JOIN (several variations of JOIN exist) DIVISION
Relational algebra operations from set theory
UNION ( ∪ ), INTERSECTION ( ∩ ), DIFFERENCE (or MINUS, – ) CARTESIAN PRODUCT ( x )
Additional Relational Operations
OUTER JOINS, OUTER UNION
AGGREGATE FUNCTIONS (These compute summary of information: for example, SUM, COUNT, AVG, MIN, MAX)

Database State for COMPANY
 All examples discussed below refer to the COMPANY database shown here
CS3402
Slide 8- 5
5

The following query results refer to this database state
CS3402 6

Unary Relational Operations: SELECT
 The SELECT operation (denoted by σ (sigma)) is used to select a subset of the tuples from a relation based on a selection condition.
The selection condition acts as a filter
Keeps only those tuples that satisfy the qualifying condition
Horizontal partitioning
Tuples satisfying the condition are selected whereas the other tuples are discarded (filtered out)
 The general form of the select operation is: σ (R)
CS3402
7

Unary Relational Operations: SELECT
 Example 1:
Select the EMPLOYEE tuples whose department number is 4:
Equivalent to: SELECT *
FROM EMPLOYEE WHERE DNO=4
CS3402
8
σ DNO = 4 (EMPLOYEE)

Unary Relational Operations: SELECT
 Example 2:
Select the employee tuples whose department number is 4 and salary is greater than $25,000 or department number is 5 and salary is greater than $30,000:
Equivalent to: SELECT *
FROM EMPLOYEE
WHERE (Dno=4 AND Salary>25,000) OR (Dno=5 AND Salary>30,000)
σ (Dno =4 AND Salary>25,000 ) OR (Dno=5 AND Salary> 30 ,000)(EMPLOYEE)
CS3402 9

Unary Relational Operations: SELECT
 SELECT Operation Properties
The SELECT operation σ (R) produces a relation
S that has the same schema (same attributes) as R SELECT σ is commutative:
σ (σ < condition2> (R)) = σ (σ < condition1> (R)) Thus, a cascade (sequence) of SELECT operations may be
CS3402
10
applied in any order:
σ (R)) = σ ( R)))

Unary Relational Operations: SELECT
 SELECT Operation Properties:
A cascade of SELECT operations may be replaced by a single
selection with a conjunction (and) of all the conditions: σ(σ< cond2> (σ(R)) = σ AND < cond2> AND < cond3>(R)))
The number of tuples in the result of a SELECT is less than (or
equal to) the number of tuples in the input relation R
The fraction of tuples selected by a selection condition is called
the selectivity of the condition
CS3402 11

Unary Relational Operations: PROJECT
 PROJECT Operation is denoted by π (pi)
 This operation keeps certain attributes from a relation and discards
the other attributes
 PROJECT creates a vertical partitioning
The list of specified attributes is kept in each tuple The other attributes in each tuple are discarded
 The general form of the project operation is: π(R)
π (pi) is the symbol used to represent the project operation  is the desired list of attributes from relation R
CS3402 12

Unary Relational Operations: PROJECT
 The project operation removes any duplicate tuples
This is because the result of the project operation must be a set of
tuples
Mathematical sets do not allow duplicate elements  Example: πSex,Salary(EMPLOYEE)
CS3402 Duplicated tuples 13

Unary Relational Operations: PROJECT
 PROJECT Operation Properties
CS3402
14
The number of tuples in the result of projection π(R) is always less (duplicates are removed) or equal (unique values) to the number of tuples in R
If the list of attributes includes a key of R, then the number of tuples in the result of PROJECT is equal to the number of tuples in R
PROJECT is not commutative
π (R) ) ≠ π (R) )
π (R) ) = π (R) as long as contains
the attributes in Example:
List1 = LNAME, FNAME
List2 = LNAME, FNAME, SALARY

Single expression versus sequence of relational operations
 We may want to apply several relational algebra operations one after the other
Either we can write the operations as a single relational algebra expression by nesting the operations, or
We can apply one operation at a time and create intermediate result relations
 In the latter case, we must give names (rename) to the relations that hold the intermediate results
CS3402 15

Single expression versus sequence of relational operations
 Example: To retrieve the first name, last name, and salary of all employees who work in department number 5, we must apply a select and a project operation
 We can write a single relational algebra expression as follows: πFNAME, LNAME, SALARY(σ DNO=5(EMPLOYEE))
 OR We can explicitly show the sequence of operations, giving a name to each intermediate relation:
DEP5_EMPS ← σ DNO=5(EMPLOYEE) TEMP ← π FNAME, LNAME, SALARY (DEP5_EMPS)
Note: the ← symbol is an assignment operator
CS3402 16

Unary Relational Operations: RENAME
 The RENAME operator is denoted by ρ (rho)
 The RENAME operation ρ can be used to rename either the relation
CS3402
17
name or the attribute names, or both: ρS(R) changes:
the relation name only to S ρ(B1, B2, …, Bn)(R) changes:
the attribute names only to B1, B2, …..Bn ρS (B1, B2, …, Bn )(R) changes both:
the relation name to S, and
the attribute names to B1, B2, …..Bn

Unary Relational Operations: RENAME
 For convenience, we also use a shorthand for renaming attributes in an intermediate relation:
If we write: TEMPS ← π FNAME, LNAME, SALARY (DEP5_EMPS)
• Note: TEMPS has the same attribute names as DEP5_EMPS
If we write: ρ R (First_name , Last_name , Salary)(TEMPS)
• The 3 attributes of TEMPS are renamed to First_name ,
CS3402
18
Last_name and Salary, respectively; and • R is the name of the result relation.

Example of applying multiple operations and RENAME
CS3402
Slide 8- 19
19

Relational Algebra Operations from Set Theory
 Type compatibility of operands is required for the binary set operation UNION ∪, INTERSECTION ∩, and SET DIFFERENCE –
 R1(A1, A2, …, An) and R2(B1, B2, …, Bn) are type compatible if:
they have the same number of attributes, and
the domains of corresponding attributes are type compatible (i.e. dom(Ai)=dom(Bi) for i=1, 2, …, n)
 The resulting relation for R1∪R2 (also for R1∩R2, or R1–R2) has the same attribute names as the first operand relation R1
CS3402 20

Relational Algebra Operations from Set Theory: UNION
 UNION Operation
Binary operation, denoted by ∪
The result of R ∪ S, is a relation that includes all tuples that are either in R or in S or in both R and S
Duplicate tuples are eliminated
The two operand relations R and S must be “type compatible”
CS3402 21

Relational Algebra Operations from Set Theory: UNION
 Example:
CS3402
22
To retrieve the social security numbers of all employees who either work in department 5 (RESULT1 below) or directly supervise an employee who works in department 5 (RESULT2 below)
We can use the UNION operation as follows: DEP5_EMPS ← σDNO=5 (EMPLOYEE)
RESULT1 ← π SSN(DEP5_EMPS) RESULT2(SSN) ← πSUPERSSN(DEP5_EMPS) RESULT ← RESULT1 ∪ RESULT2
The union operation produces the tuples that are in either RESULT1 or RESULT2 or both

Figure 8.3 Result of the UNION operation RESULT ← RESULT1 ∪ RESULT2.
CS3402
23
RESULT ← RESULT1 ∪ RESULT2

Relational Algebra Operations from Set Theory: INTERSECTION
 INTERSECTION is denoted by ∩
 The result of the operation R ∩ S, is a relation that includes all tuples
that are in both R and S
The attribute names in the result will be the same as the attribute names in R
 The two operand relations R and S must be “type compatible”
CS3402 24

Relational Algebra Operations from Set Theory: SET DIFFERENCE
 SET DIFFERENCE (also called MINUS or EXCEPT) is denoted by –  The result of R – S, is a relation that includes all tuples that are in R
but not in S
The attribute names in the result will be the same as the attribute names in R
 The two operand relations R and S must be “type compatible”  R∩ S=(R∪S)–(R–S))–(S–R)
CS3402 25

Example to illustrate the result of UNION
CS3402 26

Example to illustrate the result of INTERSECT
CS3402 27

Example to illustrate the result of SET DIFFERENCE
CS3402 28

Example to illustrate the result of DIFFERENCE
CS3402 29

Properties of UNION, INTERSECT, and DIFFERENCE
 Notice that both union and intersection are commutative operations; that is
R ∪ S = S ∪ R, and R ∩ S = S ∩ R
 Both union and intersection can be treated as n-ary operations applicable to any number of relations as both are associative operations; that is
R ∪ (S ∪ T) = (R ∪ S) ∪ T R ∩ (S ∩ T) = (R ∩ S) ∩ T
 The minus operation is not commutative R – S ≠ S – R
CS3402 30

Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT
 CARTESIAN (or CROSS) PRODUCT Operation
This operation is used to combine tuples from two relations in a
combinatorial fashion
Denoted by R(A1, A2, . . ., An) x S(B1, B2, . . ., Bm) Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order
The resulting relation state has one tuple for each combination
of tuples – one from R and one from S
Hence, if R has nR tuples (denoted as |R| = nR ), and S has nS
tuples, then R x S will have nR * nS tuples
The two operands do NOT have to be “type compatible”
CS3402 31

Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT
 Generally, CROSS PRODUCT is not a meaningful operation Some relations do not exist in the mini-world
Can become meaningful when followed by other operations
 Example (not meaningful):
(1) FEMALE_EMPS ← σ SEX=’F’(EMPLOYEE)
(2) EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS) (3) EMP_DEPENDENTS ← EMPNAMES x DEPENDENT
 EMP_DEPENDENTS will contain every combination of EMPNAMES and DEPENDENT
whether or not they are actually related
CS3402 32

Figure 8.5 The CARTESIAN PRODUCT (CROSS PRODUCT) operation
CS3402 33

Figure 8.5 The CARTESIAN PRODUCT (CROSS PRODUCT) operation
CS3402 34

Figure 8.5 The CARTESIAN PRODUCT (CROSS PRODUCT) operation
CS3402 35

Relational Algebra Operations from Set Theory: CARTESIAN PRODUCT
 To keep only combinations where the DEPENDENT is related to the EMPLOYEE, we add a SELECT operation as follows
 Example (meaningful):
(1) FEMALE_EMPS ← σ SEX=’F’(EMPLOYEE)
(2) EMPNAMES ← π FNAME, LNAME, SSN (FEMALE_EMPS) (3) EMP_DEPENDENTS ← EMPNAMES x DEPENDENT (4) ACTUAL_DEPS ← σ SSN=ESSN(EMP_DEPENDENTS)
(5) RESULT ← π FNAME, LNAME, DEPENDENT_NAME (ACTUAL_DEPS)
 RESULT will now contain the name of female employees and their
dependents
CS3402 36

The CARTESIAN PRODUCT (CROSS PRODUCT) operation
CS3402 37

Binary Relational Operations: JOIN
 JOIN Operation (denoted by )
CS3402
38
The sequence of CARTESIAN PRODECT followed by SELECT is used quite commonly to identify and select related tuples from two relations
A special operation, called JOIN combines this sequence into a single operation
The general form of a join operation on two relations R(A1, A2, . . ., An) and S(B1, B2, . . ., Bm) is:
R S
R and S can be any relations that result from general relational algebra expressions

Binary Relational Operations: JOIN
 Example: Suppose that we want to retrieve the name of the manager of each department
 To get the manager’s name, we need to combine each DEPARTMENT tuple with the EMPLOYEE tuple whose SSN value matches the MGRSSN value in the department tuple.
DEPT_MGR ← DEPARTMENT MGRSSN=SSN EMPLOYEE
CS3402
39
MGRSSN=SSN is the join condition: Combines each department record with the employee who manages the department

Figure 8.6 Result of the JOIN operation
DEPT_MGR ← DEPARTMENT Mgr_ssn=SsnEMPLOYEE
CS3402
40

Some properties of JOIN
 Consider the following JOIN operation: R(A1, A2, . . ., An)
S(B1, B2, . . ., Bm)
CS3402
41
R.Ai=S.Bj
Result is a relation Q with degree n + m attributes:
Q(A1, A2, . . ., An, B1, B2, . . ., Bm), in that order
The resulting relation state has one tuple for each combination of tuples: r from R and s from S, but only if they satisfy the join condition r[Ai]=s[Bj]
Hence, if R has nR tuples, and S has nS tuples, then the join result will generally have less than nR x nS tuples

Theta-join
 The general case of JOIN operation is called a Theta-join: R S
 A general join condition is of the form
AND AND … AND
 Each is of the form Ai θ Bj,
Ai is an attribute of R
Bj is an attribute of S
Ai and Bj have the same domain
θ is one of the comparison operators {=, <, ≤, >, ≥, ≠}
CS3402 42

EQUIJOIN
 The most common use of join involves join conditions with equality comparisons only
 Such a join, where the only comparison operator used is =, is called an EQUIJOIN
In the result of an EQUIJOIN we always have one or more pairs of attributes (whose names need not be identical) that have identical values in every tuple
The JOIN seen in the previous example was an EQUIJOIN
DEPT_MGR ← DEPARTMENT Mgr_ssn=SsnEMPLOYEE CS3402
43

NATURAL JOIN Operation
 NATURAL JOIN: denoted by * was created to get rid of the superfluous attributes in an EQUIJOIN condition
One of each pair of attributes with identical values is superfluous
 The standard definition of natural join requires that the two join attributes, or each pair of corresponding join attributes, have the same name in both relations.
 E.g. Q ← R(A,B,C,D) * S(C,D,E)
The implicit join condition includes each pair of attributes with
CS3402
44
the same name, “AND”ed together: R.C=S.C AND R.D=S.D Result keeps only one attribute of each such pair:
Q(A,B,C,D,E)

NATURAL JOIN
 Example: Combine each PROJECT tuple with the DEPARTMENT.
 We first rename the Dnumber attribute of DEPARTMENT to Dnum, so that it has the same name as the Dnum attribute in PROJECT, and then we apply NATURAL JOIN.
DEPT ← ρ(Dname, Dnum, Mgr_ssn, Mgr_start_date)(DEPARTMENT) PROJ_DEPT ← PROJECT * DEPT
 The attribute Dnum is called the join attribute for NATURAL JOIN, because it is the attribute with the same name in both relations.
 Only one join attribute value is kept.
CS3402 45

Example of NATURAL JOIN operation
PROJ_DEPT ← PROJECT * DEPT
DEPT_LOCS ← DEPARTMENT * DEPT_LOCATIONS
CS3402 46

Binary Relational Operations: DIVISION
 Q. How to retrieve the Social Security numbers of employees who work on all the projects that ‘John Smith’ works on?
 First, retrieve the list of project numbers that ‘John Smith’ works on in the intermediate relation SMITH_PNOS:
SMITH ← σ Fname=‘John’ AND Lname=‘Smith’ (EMPLOYEE) SMITH_PNOS ← π Pno(WORKS_ON Essn=SsnSMITH)
 Next, we create a relation that includes a tuple for all employees:
SSN_PNOS ← π Essn, Pno(WORKS_ON)
 Finally, apply the DIVISION operation to the two relations, which
gives the desired employees’ Social Security numbers:
CS3402 SSNS(Ssn) ← SSN_PNOS ÷ SMITH_PNOS 47

Binary Relational Operations: DIVISION
CS3402 48

Binary Relational Operations: DIVISION
 DIVISION Operation
The division operation is applied to two relations
R(Z) ÷ S(X), where X subset Z
Let Y = Z – X (and hence Z = X ∪ Y); that is, let Y be the set of attributes of R that are not attributes of S
The result of DIVISION is a relation T(Y) that includes a tuple t if tuples tR appear in R with tR [Y] = t, and with
tR [X] = ts for every tuple ts in S
For a tuple t to appear in the result T of the DIVISION, the
values in t must appear in R in combination with every tuple in S
CS3402 49

Example of DIVISION
SSNS(Ssn) ← SSN_PNOS ÷SMITH_PNOS
•R= SSN_PNOS, S= SMITH_PNOS, T= SSNS.
Z={Essn,Pno}, Y = {Essn}, X ={Pno}
•E.g., 123456789 is in SSNS because tuples tR appear in R with tR[Y]=123456789, and with tR[X]= ts for every tuple ts (i.e., 1 and 2) in S
48
CS3402 50

Complete Set of Relational Operations
 The set of operations including SELECT σ, PROJECT π , UNION ∪, DIFFERENCE − , RENAME ρ, and CARTESIAN PRODUCT X is called a complete set because any other relational algebra expression can be expressed by a combination of these six operations
 For example:
R ∩ S = (R ∪ S ) – ((R − S) ∪ (S − R))
CS3402
51
R S = σ (R X S)

Table 8.1 Operations of Relational Algebra
CS3402 52

Table 8.1 Operations of Relational Algebra
CS3402 53

References
 7e
 Ch. 8
CS3402 54