Relational Algebra (Part 2)
Summary of Relational Operators
Operator Notation Meaning
Selection σϕ(R) choose rows
Projection πA1,…,An(R) choose columns
Union R1 ∪ R2
set operationsIntersection R1 ∩ R2
Difference R1 − R2
Cartesian product R1 × R2
combine tablesJoin R1 ��ϕ R2
Natural-join R1 �� R2
Renaming
ρR� (A1,…,An)(R)
rename relation and attributesρR� (R)
ρ(A1,…,An)(R)
A Complete Set of Relational Operators
The following six operators constitute a complete set:
selection σ;
projection π;
renaming ρ;
union ∪;
difference –;
Cartesian product ×.
A Complete Set of Relational Operators
Six operators (i.e., selection σ, projection π, renaming ρ, union ∪,
difference – and Cartesian product ×) constitute a complete set.
It means that the other RA operators like intersection and join are not
necessary and can be expressed by these six operators.
join: R1 ��ϕ R2 = σϕ(R1 × R2)
intersection: R1 ∩ R2 = R1 − (R1 − R2)
Hence, intersection and join do not increase the expressive power of RA.
Nonetheless it is important to include intersection and join because they
are convenient to use and commonly applied in database applications.
Relational Algebra Queries
The output of each RA operation is a relation, which can be used again as
the input for another RA operation.
RA operations can be nested to arbitrary depth for expressing complex
queries, as in arithmetic.
Parentheses and precedence rules define the order of evaluation:
from highest to lowest: {σ,π, ρ}, {×, ��}, {∩}, {∪,−}
Operators with the same precedence are evaluated from left to right.
Use brackets if you are not sure.
A query in RA is a sequence of RA operations and each RA operation takes
one or two relations as its input and produces one relation as its output.
Different from SQL, RA considers relations as sets (not multisets as in
SQL). Hence, relations produced by an RA operation have no duplicate
tuples.
Hints for Writing RA Queries
1 Firstly, identify which relations need to be involved, while ignoring the rest.
2 Then break the answer down by considering intermediate relations, i.e.,
queries may be expressed as a sequence of assignment statements.
Example: R := πHTeam,GTeam(σHScore=1(ρ(HTeam,HScore,GScore,GTeam)(SOCCER)))
Use good names for intermediate relations;
Keep track of attributes you have at each step.
3 When combining relations, check attribute names and make sure that:
attributes that should match are to match.
attributes that shouldn’t match are not to match.
4 When using set operations, make sure that two relations of an operation
have the same type (i.e., type compatibility).
RA Queries – Exercises (Self Join)
Given the following relation schema:
STUDENT={StudentID, Name, DoB}
Query 1: Find pairs of students who have the same birthday. Show their
names.
STUDENT
StudentID Name DoB
457 Lisa 18-Oct-1993
458 Mike 16-May-1990
459 Peter 18-Oct-1993
RA Queries – Exercises (Self Join)
Given the following relation schema:
STUDENT={StudentID, Name, DoB}
Query 1: Find pairs of students who have the same birthday. Show their
names.
πR1.Name,R2.Name(σR1.StudentID
Natural-Join: R1 �� R2 corresponds to
SELECT DISTINCT * FROM R1 NATURAL JOIN R2;
Outer joins are not considered in the traditional relational algebra, as well as
aggregation.
Summary
RA is a procedural query language defined in the relational model.
An RA query itself suggests a procedure for constructing the result (i.e.,
implement the query).
RA is not used as a query language by users.
RA is used for the internal representation and processing of SQL
queries in relational DBMSs, which is a basis of query optimisation
techniques.
Thus, to understand how SQL queries are processed and how they can be
optimised, we first need to understand relational algebra.