CS计算机代考程序代写 SQL Java Relational Algebra

Relational Algebra

Relational Algebra
Jianjun Chen

Contents
Basic operators in relational algebra.
Cartesian product and Joins.
Aggregations, division and grouping.

Relational Algebra
DBMS: Software that is designed to enable users and programs to store, retrieve and update data.
We need a language to describe these operations.
The language is called Structured Query Language (SQL)

What is relational algebra?
Theoretical foundation of (part of) SQL.

Relational Algebra
“Find all universities with > 20000 students”
Relational Algebra:

SQL:
SELECT uName FROM University WHERE University.Enrollment > 20000

Relational Algebra and SQL are declarative (Not procedural like C/Java)
No need to specify the steps of data processing.

Relational Algebra: Operator Properties
Like functions in Java and C, operators in relational algebra requires operands and returns results.
One of the operands must be a relation.
The “return result” is a temporary relation (called View) that has been processed.

The returned relation can be then processed by another operator.
This property is called closure: the ability that allows expressions to be nested. (Textbook: Chapter 5.1)

Relational Algebra VS SQL
Relational algebra is set-based, that means duplicate tuples in results are always eliminated.

Duplicates are kept in SQL results.

Relational Algebra: Basic Operators

Selection: Sigma (σ)
Usage:
The Selection operation works on a single relation R and defines a relation that contains only those tuples of R that satisfy the specified condition (predicate).
List all staffs with a salary greater than £10,000

Useful when a column name appears in two tables
Predicate also supports logical operators ∧ (AND), ∨ (OR) and ~ (NOT).

Predicate: a function with parameters that either returns a true or false.

Projection: PI (𝛑)
Usage:
The Projection operation works on a single relation R and defines a relation that contains a vertical subset of R, extracting the values of specified attributes and eliminating duplicates.

Produce a list of salaries for all staff, showing only the staffNo, fName, lName, and salary details.

Union:
Usage:
The union of two relations and defines a relation that contains all the tuples of , or , or both and , duplicate tuples being eliminated. and must be union-compatible.

List all cities where there is either a branch office or a property for rent

Union Compatible
UNION-compatible means that the numbers of attributes must be the same and their corresponding data types also match

Union compatible:
A: (First_name (char), Last_name(char), Date_of_Birth(date))
B: (FName(char), LName(char), DOB(date))
Both table have 3 attributes and of same date type.

Not union compatible:
A: (First_name (char), Last_name(char), Date_of_Birth(date))
B: (FName(char), LName(char), PhoneNumber(number))

A: (First_name (char), Date_of_Birth(date), Last_name(char))
B: (FName(char), LName(char), DOB(date))

Set Difference: Minus –
Usage:
The Set difference operation defines a relation consisting of the tuples that are in relation R, but not in S. R and S must be union-compatible.

List all cities where there is a branch office but no properties for rent.

Intersection ∩
Usage:
The Intersection operation defines a relation consisting of the set of all tuples that are in both R and S. R and S must be union-compatible.
Same as:

List all cities where there is both a branch office and at least one property for rent.

Cartesian Product and Joins

Cartesian Product
Usage:
The Cartesian product operation defines a relation that is the concatenation of every tuple of relation R with every tuple of relation S.

E.g:

The result is presented in the next slide.

Client[0] +
Client[0] +
Client[0] +
Client[0] +
Client[0] +
Client[1] +
Client[1] +
Client[1] +
Client[1] +
Client[1] +
Client[2] +
Client[2] +
Client[2] +
Viewing[0]
Viewing[1]
Viewing[2]
Viewing[3]
Viewing[4]
Viewing[0]
Viewing[1]
Viewing[2]
Viewing[3]
Viewing[4]
Viewing[0]
Viewing[1]
Viewing[2]

Cartesian Product
The result of the previous operation is not very meaningful.
But based on that, we can filter out rows and obtain some useful information.

For example, how to list the names and comments of all clients who have viewed a property for rent?
Hint: use selection with a predicate

We want a result like this

Theta Join
Usage:
The Theta join defines a relation that contains tuples satisfying the predicate F from . The predicate F is of the form “” where may be one of the comparison operators (<, ≤, >, ≥, =, ≠).
If contains only equality (=), it is instead called Equijoin.
The previous example can be simply represented by a Theta Join:

Theta Join
List the names and comments of all clients who have viewed a property for rent:

Same result as the previous one 😉

Natural Join
Usage:
The Natural join is an Equijoin of the two relations R and S over all common attributes x.
One occurrence of each common attribute is eliminated from the result. (see clientNo in the next slide)

List the names and comments of all clients who have viewed a property for rent.

The extra clientNo is removed.

See theta join for comparison

Natural Join
The natural Join works based on attributes names and their types (domain)

If two tables have multiple attributes with the same names and data types, then all these attributes will be selected. For example:

A: (FName (char), LName(char), Date_of_Birth(date))

B: (FName(char), LName(char), PhoneNumber(number))

Natural Join : A Question
What should the natural join of these two tables look like?

FName LName Date of Birth
John Smith 2000-1-1
Lily Adam 2001-1-11

FName LName Phone No
Jason Smith 111
Lily Adam 222

Left Outer Join:
Usage: R ⟕ S
The (left) Outer join is a join in which tuples from R that do not have matching values in the common attributes of S are also included in the result relation. Missing values in the second relation are set to null.
“Left (Outer numbers) right”.
Like natural join, but use Nulls for missing values in table S.

Produce a status report on property viewings.

PropertyForRent[0]
PropertyForRent[1]
Found two matches:
No matches found, add Null

Left Outer Join: R ⟕ S
Q: What if “Viewing” has more “PropertyNo” than “PropertyForRent”?
A: Check the steps in the previous slide.
A: Or check this Wikipedia page: can you work it out by yourself? 

Right Outer Join and full outer join also exist.
Right outer join (⟕): Just the opposite of left outer Join.
Full outer join (⟗): Find it out!

Rename Operator: rho
or
The Rename operation provides a new name S for the expression E, and optionally names the attributes as .
Why rename operator?
E.g. Natural Join relies on the attribute name to work properly.
Some times the attributes of two different tables have different names, but they actually refer to the same kind of information.

A: (First_name (char), Last_name(char), Date_of_Birth(date))
B: (FName(char), LName(char), DOB(date))

Rename Operator: rho
or
changes “University” table to “uni”
changes “Branch” table to “b” without changing its tuples.

col1 col2 col3 col4
Tuples Stay The Same

Division, Aggregation, Grouping

Division:
Usage:
Division is used when we wish to express queries with “all”:
“Which persons have a bank account at ALL the banks in the country?”
“Which students are registered on ALL the courses”

Assume relation R is defined over the attribute set A and relation S is defined over the attribute set B such that B ⊆ A (B is a subset of A).

B: (propertyNo)
A: (ClientNo, propertyNo)
Identify all clients who have viewed all properties with three rooms:

Division:
Formal definition as in our textbook:
Assume relation R is defined over the attribute set A and relation S is defined over the attribute set B such that B ⊆ A (B is a subset of A).

Let C = A − B, that is, C is the set of attributes of R that are not attributes of S (clientNo). The Division operation defines a relation over the attributes C that consists of the set of tuples from R that match the combination of every tuple (pg4, pg36) in S.

Aggregate:
Usage:
Applies the aggregate function list, , to the relation R to define a relation over the aggregate list. contains one or more (, ) pairs.
The symbol is “capital I” (according the the system symbol table).
Aggregate functions:
COUNT: returns the number of values in the associated attribute.
SUM: returns the sum of the values in the associated attribute.
AVG: returns the average of the values in the associated attribute.
MIN: returns the smallest value in the associated attribute.
MAX: returns the largest value in the associated attribute.

How many properties cost more than £350 per month to rent?

Find the minimum, maximum, and average staff salary

Grouping
Usage: GAℑAL(R)
Groups the tuples of relation R by the grouping attributes, GA.
And then applies the aggregate function list AL to define a new relation. AL contains one or more (, ) pairs.
The resulting relation contains the grouping attributes, GA, along with the results of each of the aggregate functions.
Simplified: find aggregation result for each different value in GA.

Grouping
Find the number of staff working in each branch and the sum of their salaries.

Extended Reading
Find out the information about the full outer join.
Can you describe how full outer join works?

Use pseudo code to describe the process of all operators covered in this lecture.

Read our textbook, chapter 5.1

/docProps/thumbnail.jpeg

Related Posts