Database Fundamentals
Database Fundamentals
SQL Joins
1
SQL – WHERE
Logical conditions often used to filter results:
Use these to test values and combine tests with AND/OR as required
2
Condition Description
= Equals
<> Not equal
< Less than
> Greater than
<= Less than or equal to
>= Greater than or equal to
LIKE Partial matches (string comparison)
Use ‘%’ for zero or more unknown characters: LIKE ‘%a%’ returns all words containing ‘a’
NOT LIKE Not like the partial matches
IS NULL Test an attribute value is empty (ie, Null)
IS NOT NULL Test an attribute has a value (ie, not null)
SQL Commands Overview – SELECT x,y,z FROM
SQL Queries return a new relations that are composed of the attributes used in the query
StudID Name
1 Mei
2 Phil
StudID Name
2 Phil
SELECT * FROM Students WHERE Name = ‘Phil’
StudID Name
1 Mei
2 Phil
Students
SELECT Name FROM Students
Name
Mei
Phil
Results
Students
Results
When you execute a SQL query over relations in the database you get back another relation.
The advantage of this is that we can take the values of this new relation and run them as part of another query
For example we could then query TableC and TableB for different information
3
SQL Commands Overview – SELECT x,y,z FROM
SQL Queries return a new relations that are composed of the attributes used in the query
Relations can be combined using the key word JOIN to create a new relation
This relation can have a combination of the attributes from the tables used in the query
StudID Name
001 Pete
002 Jane
Stud Book
001 Databases
002 Programming
StudID Stud Name Book
001 001 Pete Databases
002 002 Jane Programming
SELECT * FROM TableA JOIN TableB ON StudID = Stud
TableA
TableB
Results
When you execute a SQL query over relations in the database you get back another relation.
The advantag of this is that we can take the values of this new relation and run them as part of another query
For example we could then query TableC and TableB for different information
4
SQL – Table JOINS
Turning data into information
SQL – Sample Database
6
FirstName Surname Dept Office Salary City
Mary Brown Administration 10 45 London
Charles White Production 20 36 Toulouse
Gus Green Administration 20 40 Oxford
Jackson Neri Distribution 16 45 Dover
Charles Brown Planning 14 80 London
Laurence Chen Planning 7 73 Worthing
Pauline Bradshaw Administration 75 40 Brighton
Alice Jackson Production 75 46 Toulouse
Employees
DeptName Address City
Administration Bond Street London
Production Rue Victor Hugo Toulouse
Distribution Pond Road Brighton
Planning Bond Street London
Research Sunset Street San Jose
Departments
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SQL – Sample Database
CREATE TABLE Departments(
DeptName varchar(50) PRIMARY KEY ,
Address varchar(50) NULL,
City varchar(50) NULL
);
CREATE TABLE dbo.Employees(
FirstName varchar(50) NOT NULL,
Surname varchar(50) NOT NULL,
Dept varchar(50) REFERENCES Departments(DeptName),
Office varchar(50) NULL,
Salary int NULL,
City varchar(50) NULL,
PRIMARY KEY (FirstName, Surname)
);
INSERT INTO Departments VALUES
(‘Administration’, ‘Bond Street’, ‘London’),
(‘Distribution’, ‘Pond Road’, ‘Brighton’),
(‘Planning’, ‘Bond Street’, ‘London’),
(‘Production’, ‘Rue Victor Hugo’, ‘Toulouse’),
(‘Research’, ‘Sunset Street’, ‘San Jose’);
INSERT INTO Employees VALUES
(‘Mary’, ‘Brown’, ‘Administration’, ’10’, 45, ‘London’),
(‘Charles’, ‘White’, ‘Production’, ’20’, 36, ‘Toulouse’),
(‘Gus’, ‘Green’, ‘Administration’, ’20’, 40, ‘Oxford’),
(‘Jackson’, ‘Neri’, ‘Distribution’, ’16’, 45, ‘Dover’),
(‘Charles’, ‘Brown’, ‘Planning’, ’14’, 80, ‘London’),
(‘Laurence’, ‘Chen’, ‘Planning’, ‘7’, 73, ‘Worthing’),
(‘Pauline’, ‘Bradshaw’, ‘Administration’, ’75’, 40, ‘Brighton’),
(‘Alice’, ‘Jackson’, ‘Production’, ’75’, 46, ‘Toulouse’);
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SQL – FROM Table Variables
Used to improve readability of the query or rename a relation
Useful where the relation is used more than once to distinguish between different instances of the relation
8
SELECT e.firstname, d.City
FROM Departments AS d JOIN Employees AS e
ON d.DeptName = e.Dept
WHERE Surname=’Brown’
SELECT Employee.firstname, Department.City
FROM Departments JOIN Employees
ON Departments.DeptName = Employees.Dept
WHERE Surname=’Brown’
Explicitly stating the table attribute being used
Explicitly stating the table attribute being used with abbreviations
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SELECT e.firstname, d.City
FROM Departments d JOIN Employees e
ON d.DeptName = e.Dept
WHERE Surname=’Brown’
SQL Query – SELECT (Database Setup)
CREATE TABLE Employees(
Enum char(3),
Name varchar(100) NOT NULL,
Age int NULL,
Salary int NULL,
CONSTRAINT employeePk PRIMARY KEY (Enum)
);
CREATE TABLE Supervision(
Sid char(3),
Eid char(3),
CONSTRAINT supervisorFk FOREIGN KEY (Sid) REFERENCES Employees(Enum),
CONSTRAINT employeeFk FOREIGN KEY (Eid) REFERENCES Employees(Enum),
CONSTRAINT supervisorFk PRIMARY KEY (Sid, Eid)
);
SQL Query – SELECT (Database Setup)
Enum Name Age Salary
101 Mary Smith 34 40
103 Mary Bianchi 23 35
104 Luigi Neri 38 61
105 Nico Bini 44 38
210 Marco Celli 49 60
231 Siro Bisi 50 60
252 Nico Bini 44 70
301 Steve Smith 34 70
375 Mary Smith 50 65
Employees
Sid Eid
210 101
210 103
210 104
231 105
301 210
301 231
375 252
Supervision
Employees(Enum, Name, Age, Salary)
PK: Enum
Supervision(Sid, Eid)
PK: (Sid, Eid)
FK: Sid ~> Employees(Enum)
FK: Eid ~> Employees(Enum)
Logical Schema
SQL Query – SELECT (Database Setup)
INSERT INTO Employees VALUES
(‘101’, ‘Mary Smith’, 34, 40),
(‘103’, ‘Mary Bianchi’, 23, 35),
(‘104’, ‘Luigi Neri’, 38, 61),
(‘105’, ‘Nico Bini’, 44, 38),
(‘210’, ‘Marco Celli’, 49, 60),
(‘231’, ‘Siro Bisi’, 50, 60),
(‘252’, ‘Nico Bini’, 44, 70),
(‘301’, ‘Steve Smith’, 34, 70),
(‘375’, ‘Mary Smith’, 50, 65);
INSERT INTO Supervision VALUES
(210, 101),
(210, 103),
(210, 104),
(231, 105),
(301, 210),
(301, 231),
(375, 252);
Example 1 – Basic SELECT
Think: Asking a single table for an answer.
Find the number, the name and the age of each employee earning more than 40 thousand
Put required attributes in the SELECT clause:
the query asks for only 3 columns, so
SELECT Enum, Name, Age
Put required tables to get the attributes in the FROM clause:
It needs only one table to answer:
SELECT Enum, Name, Age
FROM Employees
Put filters in the WHERE clause:
WHERE salary > 40, so:
Enum Name Age
104 Luigi Neri 38
210 Marco Celli 49
231 Siro Bisi 50
252 Nico Bini 44
301 Steve Smith 34
375 Mary Smith 50
When Query is executed:
SELECT Enum, Name, Age
FROM Employees
WHERE salary > 40
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
12
Example 2 – Table JOIN
Think: Putting Two Tables Together to get a combined answer
Find the number of the supervisor of each employee earning more than 40 thousand
If two or more tables are required, they need to be joined to produce an intermediate table that contains ALL the necessary detail. The filters and the attributes to be displayed are are derived from this more “detailed” table.
two tables are needed: Employees and Supervision (to get Sid)
join them: Eid REFERENCES Enum of an Employee, so
Employees JOIN Supervision ON Enum = Eid
the schema and the tuples of the intermediate table after join are:
Intermediate Result:
SELECT Employees.*, Sid
FROM Employees
JOIN Supervision
ON Enum = Eid
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
13
Example 2 – Table JOIN
Find the number of the supervisor of each employee earning more than 40 thousand
Now the filters and the selection list are applied to the temporary table
Output: need only Sid
Filter: (Salary of employee) > 40
Final query:
SELECT Sid
FROM Employees
JOIN Supervision
ON Enum = Eid
WHERE Salary > 40
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
Example 3 – Table Name Alias
An Alias changes the name of an attribute of the results to make the meaning of query clearer
Given:
If E represents an employee record and SV represents a supervision record, then E.Enum means the employee number and the above query becomes:
SELECT Sid
FROM Employees JOIN Supervision
ON Enum = Eid
WHERE Salary > 40
SELECT SV.Sid
FROM Employees E JOIN Supervision SV
ON E.Enum = SV.Eid
WHERE E.salary > 40
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
Example 3 – Table Name Alias
Find the name and salary of the supervisor of each employee earning more than 40 thousand
Required tables,
three tables are needed: “Employees E” (for employee salary) and “Supervision SV” and “Employees S” (for supervisor info)
join them: SV.Eid refererences Enum of an employee, and SV.Sid refererences Enum of the supervisor so:
Employees E JOIN Supervision SV
ON E.Enum = SV.Eid JOIN Employees S
ON SV.Sid = S.Enum
the schema of the intermediate table after the first join:
E JOIN SV (Enum, Name, Age, Salary, Sid, Eid)
the schema of the intermediate table after the 2nd join:
E JOIN SV JOIN S(E.Enum, E.Name, E.Age, E.Salary, Sid, Eid, S.Enum, S.Name, S.Age, S.Salary)
Output attributes: S.Name, S.Salary
Filters: E.salary > 40
Final query (see next page)
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
Example 3 – Table Name Alias
Find the name and salary of the supervisor of each employee earning more than 40 thousand
Final query:
Output:
SELECT S.Name, S.Salary
FROM Employees E JOIN Supervision SV
ON E.Enum = SV.Eid JOIN Employees S
ON SV.Sid = S.Enum
WHERE E.salary > 40
“SELECT Name, Salary” does not work, why?
Q: Find the name and salary of the supervisor of each employee earning more than 40 thousand
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
Example 4 – Table Name Alias
Find each employee earning more than his/her respective supervisor, showing the number, the name and the salary of the employee and the number, the name and the salary of his/her supervisor
Query
E.Enum E.Name E.Salary S.Enum S.Name S.Salary
104 Luigi Neri 61 210 Marco Celli 60
252 Nico Bini 70 375 Mary Smith 65
SELECT E.Enum, E.Name, E.Salary, S.Enum, S.Name, S.Salary
FROM Employees E JOIN Supervision SV
ON E.Enum = SV.Eid JOIN Employees S
ON S.Enum = SV.Sid
WHERE E.salary > S.Salary
Output:
Relational Schema:
Employees(Enum, Name, Age, Salary)
Supervision(Sid, Eid)
SQL Query – Cartesian product
This operator is basic for evaluating many other operators
It combines two tables and gives every possible combinations of the tuples of the two tables
The Cartesian product of r and s following is given table z
SQL:
s
A B C
1 2 3
1 2 4
D E
a b
a c
r
z
A B C D E
1 2 3 a b
1 2 3 a c
1 2 4 a b
1 2 4 a c
Meaningful criteria in the WHERE clauses turn the output of a Cartesian Product into something Useful!
SELECT * FROM r, s
SELECT * FROM r, s WHERE 1 = 1
SELECT * FROM r JOIN s ON 1 = 1
SELECT * FROM r INNER JOIN s ON 1 = 1
19
SQL Query – JOIN Query
The meaning of the query:
Build the Cartesian product of all tables in the FROM clause
Consider only the rows that satisfy the conditions in the WHERE clause
Evaluate the attribute list for each of the rows returned
The whole point is to recombine data distributed across multiple tables to give a more “complete” picture
A normalised database stores data across many different tables (space efficient and to avoid anomalies) that must be recombined when retrieving information
20
SQL Query – JOIN Query
Example:
Display a table that contains all the information of every employee and all the information of the employee’s department.
The requested information is stored in two tables: Departments and Employee
Employees are related to departments via the Department name they work in.
The two tables need to be ‘combined’ to form a new wider result table
Joining tables involves combining rows of data from each table based on a join condition
A Join condition is a comparison between a set of one or more columns that have contain the same or related information.
In the application, the Dept of Employees have the same meaning as DeptName of Departments. So the condition is Departments.DeptName = Employees.Dept
DeptName and Dept are called the join attributes.
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
21
SQL Query – JOIN Query Mechanism
In a normalised database design data is stored across multiple relations and connected by foreign key relationships
To re-create the information captured in the data relations must be merged together based on conditions that specify which columns in one relation match the same data in tuples of the other
This ordinarily follows matching table2 FK table1 PK attributes
In our sample database design:
Departments.DeptName = Employees.Dept
DeptName and Dept are called the join attributes
SELECT * FROM Employees
JOIN Departments
ON Employees.Dept = Departments.DeptName
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SQL Query – JOIN Query Mechanism
The SQL query – this type of join is also called an INNER JOIN.
If an employee tuple does not match any department tuple on the join attributes, the employee tuple is said dangling.
Dangling tuples are dropped from the output
SELECT * FROM Departments, Employees WHERE Employees.Dept = Departments.DeptName
SELECT * FROM Departments JOIN Employees ON Employees.Dept = Departments.DeptName
SELECT * FROM Departments INNER JOIN Employees ON Employees.Dept = Departments.DeptName
The query is evaluated following the algorithm below:
for each employee x in Employees
for each department y in Departments
if x.Dept == y.Deptname then
output x + y tuple
end if
end for y
end for x
23
SQL Query – JOIN Query Mechanism
Joins between tables can also make use renaming to improve readability of longer/large queries:
24
SELECT * FROM Employees
JOIN Departments
ON Employees.Dept = Departments.DeptName
SELECT * FROM Employees AS e
JOIN Departments AS d
ON e.Dept = d.DeptName
SELECT * FROM Employees e
JOIN Departments d
ON e.Dept = d.DeptName
SQL Query – JOIN Query output
The output of the SQL query on previous slide is (we assume that City in Departments is renamed to Dcity):
FirstName Surname Dept Office Salary City DeptName Address Dcity
Mary Brown Administration 10 45 London Administration Bond Street London
Charles White Production 20 36 Toulouse Production Rue Victor Hugo Toulouse
Gus Green Administration 20 40 Oxford Administration Bond Street London
Jackson Neri Distribution 16 45 Dover Distribution Pond Road Brighton
Charles Brown Planning 14 80 London Planning Bond Street London
Laurence Chen Planning 7 73 Worthing Planning Bond Street London
Pauline Bradshaw Administration 75 40 Brighton Administration Bond Street London
Alice Jackson Production 75 46 Toulouse Production Rue Victor Hugo Toulouse
How to decide the number of columns and the column names?
How to decide the number of tuples in the output?
Why some department tuples are duplicated?
What is the rule behind?
What happened to the Research department?
What is the rule behind?
why did we need the assumption about City?
25
SQL Query – JOIN Query with selection and projection
For each employee with Surname Brown, list the firstname and the address of his/her department
The information required are in two tables, name is in Employees and address of the department is in Departments.
The join condition is (same as a previous example) DeptName=Dept
The SQL query
Compare this with
SELECT firstname, address
FROM Departments JOIN Employees
ON DeptName=Dept
WHERE Surname = ‘Brown’
SELECT firstname, address
FROM Departments JOIN Employees
ON DeptName=Dept
26
SQL Query – JOIN Query duplicate column names
For each employee with Surname Brown, list his/her firstname, the city where he/she lives and the city in which he/she works
Incorrect SQL query
The correct query
Use table name to distinguish duplicate column names
SELECT firstname, city, city
FROM Departments JOIN Employees ON DeptName=Dept
WHERE Surname=’Brown’
Ambiguous
SELECT firstname, Employees.city, Departments.city
FROM Departments JOIN Employees ON DeptName=Dept
WHERE Surname=’Brown’
Explicit!
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
27
SQL Query – JOIN Query duplicate column names
What if the output has ambiguous attribute names?
Example:
28
SELECT firstname, City, City
FROM Departments JOIN Employees
ON DeptName = Dept
WHERE Surname=’Brown’
SELECT firstname, Departments.City, Employees.City
FROM Departments JOIN Employees
ON DeptName = Dept
WHERE Surname=’Brown’
SELECT firstname, d.City, e.City
FROM Departments d JOIN Employees e
ON d.DeptName = e.Dept
WHERE Surname=’Brown’
OR
SQL Query – JOIN Query table alias
Use table alias to simplify a query.
The following queries are equivalent:
SELECT firstname, Employees.city, Departments.city
FROM Departments JOIN Employees ON DeptName=Dept
WHERE Surname=’Brown’
SELECT firstname, E.city, D.city
FROM Departments AS D JOIN Employees AS E ON DeptName=Dept
WHERE Surname=’Brown’
SELECT firstname, E.city, D.city
FROM Departments D JOIN Employees E ON DeptName=Dept
WHERE Surname=’Brown’
Note that “firstname” is unambiguous hence it does not need to be qualified with the table from which it comes from
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
29
JOIN Query – a different but a more common way
Find the names of the employees and the cities in which they work:
The following two queries are equivalent
SELECT FirstName, Surname, D.City
FROM Employees E, Departments D
WHERE E.Dept = D.DeptName;
SELECT FirstName, Surname, D.City
FROM Employees JOIN Departments D
ON Dept = DeptName;
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
30
SQL Query – JOIN Query Multiple Matches
Find the head of department for each employee
31
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
sales Mori
production
production Brown
Pink
Departments
SELECT * FROM Departments D JOIN Employees E
ON D.Dept = E.Dept
JOINS on attributes that are not unique may result in multiple matching records
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SQL Query – JOIN Query Dangling Tuples
32
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
production
purchasing Brown
Pink
Departments
Dangling tuples do not end up in the final result set of INNER JOINs
SELECT * FROM Departments D JOIN Employees E
ON D.Dept = E.Dept
Find the head of department for each employee
Departments(DeptName, Address, City)
Employee(FirstName, Surname, Dept, Office, Salary, City)
FK(Dept) -> Departments(DeptName)
SQL Query – JOIN Query Dangling Tuples
33
Employee Dept
Smith sales
Black production
White production
Dept Head
marketing
purchasing Brown
Pink
In a join, all tuples may become dangling tuples and the JOIN outputs no tuples
SELECT * FROM Departments D JOIN Employees E
ON D.Dept = E.Dept
Find the head of department for each employee
Employees
Departments
Departments(DeptName, Head)
Employees(Employee, Dept)
FK(Dept) -> Departments(Dept)
JOIN Query – Multiple attributes in join condition
Code Date Officer Dept Reg
143256 25/10/1992 567 75 5694FR
987554 26/10/1992 456 75 5694FR
987557 26/10/1992 456 75 6544XY
630876 15/10/1992 456 47 6544XY
Offences
Code Date Officer Dept Reg Owner …
143256 25/10/1992 567 75 5694FR Latour Hortense
987554 26/10/1992 456 75 5694FR Latour Hortense
987557 26/10/1992 456 75 6544XY Cordon Edouard
630876 15/10/1992 456 47 6544XY Mimault Bernard
SELECT Code, Date, Officer, O.Dept, O.Reg, Owner
FROM Offences O JOIN Cars C
ON O.Dept = C.Dept AND O.Reg = C.Reg
Reg Dept Owner …
5694FR 75 Latour Hortense
6544XY 75 Cordon Edouard
6544XY 47 Mimault Bernard
7122HT 75 Cordon Edouard
Cars
Outer Joins
JOIN Query – OUTER JOIN
An OUTER JOIN is a variant of the join that keeps the dangling tuples in the result
Achieved by padding missing values with NULL where the tuples do not match in both relations
Three variants
LEFT OUTER JOIN
only dangling tuples of the left of the join statement are padded with NULL values
LEFT JOIN | LEFT OUTER JOIN
RIGHT OUTER JOIN
only dangling tuples of the right of the join statement are padded with NULL values
RIGHT JOIN | RIGHT OUTER JOIN
FULL OUTER JOIN
dangling tuples of both sides of the join statement are padded with NULL values
FULL JOIN | FULL OUTER JOIN
JOIN Query – OUTER JOIN
INNER JOIN
only matching tuples between tables are retained
JOIN | INNER JOIN
A B
A1 B1
A2 B2
A4 B4
A C
A1 C1
A2 C2
A3 C3
SELECT t1.a, t1.b, t2.c FROM
Table1 t1 INNER JOIN Table2 t2
ON t1.a = t2.a
Table1
Table2
A B C
A1 B1 C1
A2 B2 C2
JOIN Query – OUTER JOIN
LEFT OUTER JOIN
only dangling tuples of the left of the join statement are padded with NULL values
LEFT JOIN | LEFT OUTER JOIN
A B
A1 B1
A2 B2
A4 B4
A C
A1 C1
A2 C2
A3 C3
SELECT t1.a, t1.b, t2.c FROM
Table1 t1 LEFT OUTER JOIN Table2 t2
ON t1.a = t2.a
Table1
Table2
A B C
A1 B1 C1
A2 B2 C2
A4 B4 NULL
JOIN Query – OUTER JOIN
RIGHT OUTER JOIN
only dangling tuples of the right of the join statement are padded with NULL values
RIGHT JOIN | RIGHT OUTER JOIN
A B
A1 B1
A2 B2
A4 B4
A C
A1 C1
A2 C2
A3 C3
Table1
Table2
A B C
A1 B1 C1
A2 B2 C2
NULL NULL C3
SELECT t1.a, t1.b, t2.c FROM
Table1 t1 RIGHT OUTER JOIN Table2 t2
ON t1.a = t2.a
Would = A3 if SELECT t2.a, t1.b, t2.c
JOIN Query – OUTER JOIN
FULL OUTER JOIN
dangling tuples of both sides of the join statement are padded with NULL values
FULL JOIN | FULL OUTER JOIN
A B
A1 B1
A2 B2
A4 B4
A C
A1 C1
A2 C2
A3 C3
Table1
Table2
A B C
A1 B1 C1
A2 B2 C2
NULL NULL C3
A4 B4 NULL
SELECT t1.a, t1.b, t2.c FROM
Table1 t1 FULL OUTER JOIN Table2 t2
ON t1.a = t2.a
Would = A3 if SELECT t2.a, t1.b, t2.c
JOIN Query – OUTER JOIN examples
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
production Mori
purchasing Brown
Departments
Employee Dept Head
Smith sales NULL
Black production Mori
White production Mori
Employee Dept Head
Black production Mori
White production Mori
NULL purchasing Brown
Employee Dept Head
Smith sales NULL
Black production Mori
White production Mori
NULL purchasing Brown
SELECT * FROM
Employees E RIGHT OUTER JOIN Department D
ON D.Dept = E.Dept
SELECT * FROM
Employees E FULL OUTER JOIN Department D
ON D.Dept = E.Dept
SELECT * FROM
Employees E LEFT OUTER JOIN Department D
ON D.Dept = E.Dept
Departments(DeptName, Head)
Employees(Employee, Dept)
FK(Dept) -> Departments(Dept)
41
More outer Join Examples based on
FirstName Surname DriverID
Mary Brown VR2030030Y
Charles White PZ1012436B
Marco Neri AP4544442R
Drivers
Automobiles
CarRegNo Make Model DriverID
ABC123 BMW 323 VR2030030Y
DEF456 BMW Z3 VR2030030Y
GHI789 Lancia Delta PZ10124436B
BBB421 BMW 316 MI2020030U
INSERT INTO Drivers (FirstName, Surname, DriverID)
VALUES
(‘Mary’, ‘Brown’, ‘VR2030030Y’),
(‘Charles’, ‘White’, ‘PZ1012436B’),
(‘Marco’, ‘Neri’, ‘AP4544442R’);
INSERT INTO Automobiles(CarRegNo, Make, Model, DriverID)
VALUES
(‘ABC123′,’BMW’,’323′,’VR2030030Y’),
(‘DEF456′,’BMW’,’Z3′,’VR2030030Y’),
(‘GHI789′,’Lancia’,’Delta’,’PZ10124436B’),
(‘BBB421′,’BMW’,’316′,’MI2020030U’);
CREATE TABLE Automobiles(
CarRegNo varchar(50) PRIMARY KEY,
Make varchar(50),
Model varchar(50),
DriverID varchar(50)
);
CREATE TABLE Drivers(
DriverID varchar(50) PRIMARY KEY,
FirstName varchar(50),
Surname varchar(50)
);
Drivers ( FirstName, Surname, DriverID )
Automobiles ( CarRegNo, Make, Model, DriverID )
42
SQL JOINS – LEFT OUTER JOINS
LEFT OUTER JOIN
Retains all dangling tuples from the table on the left of the join statement
43
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
production
purchasing Brown
Pink
Departments
SELECT * FROM Employees E
LEFT OUTER JOIN Departments D
ON D.Dept = E.Dept
Departments(DeptName, Head)
Employees(Employee, Dept)
FK(Dept) -> Departments(Dept)
JOIN Query – LEFT OUTER JOIN
LEFT OUTER JOIN
Find the drivers with their cars, including the drivers without cars:
SELECT FirstName, Surname, D.DriverID,
CarRegNo, Make, Model
FROM Drivers D LEFT OUTER JOIN Automobiles A
ON D.DriverID = A.DriverID
Drivers ( FirstName, Surname, DriverID )
Automobiles ( CarRegNo, Make, Model, DriverID )
44
SQL JOINS – RIGHT OUTER JOINS
RIGHT OUTER JOIN
Retains all dangling tuples from the table on the right of the join statement
45
SELECT * FROM Employees E
RIGHT OUTER JOIN Departments D
ON D.Dept = E.Dept
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
production
purchasing Brown
Pink
Departments
Departments(DeptName, Head)
Employees(Employee, Dept)
FK(Dept) -> Departments(Dept)
SQL JOINS – RIGHT OUTER JOINS
RIGHT OUTER JOIN
Find the details all cars and their drivers including cars without drivers:
SELECT FirstName, Surname, D.DriverID,
CarRegNo, Make, Model
FROM Drivers D RIGHTOUTER JOIN Automobiles A
ON D.DriverID = A.DriverID
Drivers ( FirstName, Surname, DriverID )
Automobiles ( CarRegNo, Make, Model, DriverID )
46
SQL JOINS – FULL OUTER JOINS
FULL OUTER JOIN
Retains all dangling tuples from both tables either side of the join statement
47
SELECT * FROM Employees E
FULL OUTER JOIN Departments D
ON D.Dept = E.Dept
Employee Dept
Smith sales
Black production
White production
Employees
Dept Head
production
purchasing Brown
Pink
Departments
Departments(DeptName, Head)
Employees(Employee, Dept)
FK(Dept) -> Departments(Dept)
SQL JOINS – FULL OUTER JOINS
FULL OUTER JOIN
Find the drivers with their cars, including drivers without cars, and cars without drivers:
SELECT FirstName, Surname, D.DriverID, CarRegNo, Make, Model
FROM Drivers D FULL OUTER JOIN Automobiles A
ON D.DriverID = A.DriverID
Drivers ( FirstName, Surname, DriverID )
Automobiles ( CarRegNo, Make, Model, DriverID )
48
40 Previous Slides Can Be Summarized As
49
http://sqlwithmanoj.com/tag/cross-join-2/
Nested Queries
Nested Queries
WHERE clause conditions can also
Compare an attribute expression with the result of another SQL query
IN – true if the attribute value exists in the results returned by the sub query
NOT IN – true if the attribute value being compared does NOT exist in the results returned by the sub query
ScalarValue Operator < ANY | ALL > SelectSQL
ANY – true if at least one row returned by SelectSQL satisfies the comparison. This is default if not specified
ALL – true if all rows returned by SelectSQL satisfy the comparison
Use the existential quantifier on an SQL query
EXISTS – true if sub query returns a result
NOT EXISTS
The query appearing in the where clause is called a nested query
SQL – Sub Queries (IN)
Sub Queries are nested SELECT statements within the condition of another query
Sub Queries are used to provide a list of values that can be tested/used in a condition statement
Equivalent to
52
SELECT * FROM Employees WHERE Dept IN (
SELECT DeptName FROM Departments
WHERE City <> ‘London’
);
SELECT E.* FROM Employees E JOIN Departments D
ON E.Dept = D.DeptName
WHERE D.City <> ‘London’
= ANY by default
Simple nested query example 1
Find the employees who work in departments in London
Equivalent to
.
SELECT FirstName, Surname
FROM Employees
WHERE Dept IN (
SELECT DeptName
FROM Departments
WHERE City = ‘London’
)
SELECT e.FirstName, e.Surname
FROM Employees e INNER JOIN Departments d
ON e.Dept = d.DeptName
WHERE d.City = ‘London’
Simple nested query example 2
Find the employees in Planning with the same first name as someone in Production
with a nested query
without a nested query
.
SELECT FirstName, Surname FROM Employees
WHERE Dept = ‘Planning’
AND FirstName IN (
SELECT FirstName
FROM Employees
WHERE Dept = ‘Production’
)
SELECT e1.FirstName, e1.Surname
FROM Employees e1 INNER JOIN Employees e2
ON e1.FirstName = e2.FirstName
WHERE e1.Dept = ‘Planning’
AND e2.Dept = ‘Production’
Nested query with negation
Find the departments in which there is no-one named Brown
Or using set difference
.
SELECT DeptName
FROM Departments
WHERE DeptName NOT IN (
SELECT Dept
FROM Employees
WHERE Surname = ‘Brown’
)
< > any
SELECT DeptName
FROM Departments
EXCEPT
SELECT Dept AS DeptName
FROM Employees
WHERE Surname = ‘Brown’
SQL – Sub Queries (NOT IN)
Sub Queries are nested SELECT statements within the condition of another query
Sub Queries are used to provide a list of values that can be tested/used in a condition statement
56
SELECT * FROM Employees WHERE Dept NOT IN (
SELECT DeptName FROM Departments
WHERE City <> ‘London’
);
SELECT E.* FROM Employees E JOIN Departments D
ON E.Dept = D.DeptName
WHERE D.City = ‘London’
Nested Query with max and min
Queries using the aggregate operators max and min can be expressed with nested queries
Find the department of the employee earning the highest salary
using MAX
not using max
SELECT Dept
FROM Employees
WHERE Salary IN (
SELECT MAX(Salary) FROM Employees
)
SELECT Dept
FROM Employees
WHERE Salary >= ALL (
SELECT Salary FROM Employees
)
What does all this mean?
There are many equivalent ways to write the same query
Nested query using EXISTS
A nested query can use variables of the external query
(‘transfer of bindings’)
Semantics: the nested query is evaluated for each row of the external query
Find students with the same name as another student
SELECT *
FROM Student s
WHERE EXISTS (
SELECT *
FROM Student s2
WHERE s.Name = s2.Name
AND s.StudentID <> s2.StudentID
)
The main difference between EXISTS and In is that IN does a direct match between the column(s) specified before the IN keyword and the values returned by the subquery. This means if a subquery returns Null, the entire IN query becomes Null – which requires “IS NULL or IS NOT NULL” operators to work efficiently.
By contrast, EXISTS doesn’t check for a column by column match between tuples, it just tests if a row was returned from the nested/inner query. This means EXISTS will return either True/False resulting in a more efficient comparison (no nulls are returned!).
59
SQL – Sub Queries (EXISTS)
EXISTS
Tests if results are returned by a sub query rather than if a particular value exists in the sub query
EXISTS = sub query is not empty
NOT EXISTS = sub query is empty
60
SELECT * FROM Employees E
WHERE EXISTS (
SELECT * FROM Departments D
WHERE D.City = E.City
);
Departments
SQL – Sub Queries (NOT EXISTS)
EXISTS
Tests if results are returned by a sub query rather than if a particular value exists in the sub query
EXISTS = sub query is not empty
NOT EXISTS = sub query is empty
61
SELECT * FROM Employees E
WHERE NOT EXISTS (
SELECT * FROM Departments D
WHERE D.City = E.City
);
Departments
SQL – Sub Queries (NOT EXISTS)
EXISTS
Tests if results are returned by a sub query rather than if a particular value exists in the sub query
EXISTS = sub query is not empty
NOT EXISTS = sub query is empty
62
SELECT * FROM Employees E1
WHERE NOT EXISTS (
SELECT * FROM Employees E2
WHERE E2.Salary > E1.Salary
)
What if we want to find the employee with the highest salary.
So we will change the where clause to be comparing two employees salaries.
The way this query works is that I want to find all Employees such that no other employees salary is higher
If there were two people on 80k then two results would be returned.
62
Nested queries using NOT EXISTS
Find the names of all courses with no students enrolled
SELECT CourseName
FROM Course c
WHERE NOT EXISTS (
SELECT *
FROM Enrolment e
WHERE c.CourseID = e.CourseID
)
SQL – Sub Queries (ALL)
Example: Find the employee with the highest salary
64
SELECT * FROM Employees E1
WHERE E1.Salary >= ALL (
SELECT E2.Salary FROM Employees E2
)
SELECT E1.* FROM Employees E1, Employees E2
WHERE E1.Salary >= E2.Salary;
>
>
>
…
What if we want to find the employee with the highest salary.
So we will change the where clause to be comparing two employees salaries.
The way this query works is that I want to find all Employees such that no other employees salary is higher
If there were two people on 80k then two results would be returned.
This is an example of a query that cannot be done using table joins. If we try using a join then instead of ending up with the single employee with the highest salary we will end up finding all employees such that there is some other employee with a lower salary. In other words, we will end up with a list of all employees that don’t have the lowest salary.
64
SQL – Sub Queries in SELECT
A Sub-SELECT query returns a value to be used as a result in the main query
This is different from the previous examples where the nested query is used to help generate/condition the results of the outer query
EXAMPLE: Find each employees salary as a % of the max salary for their given department
65
SELECT E.*, E.Salary / (
SELECT MAX(salary) FROM Employees E2
WHERE E2.Dept = E.Dept)* 100) AS PercentMax
FROM Employees E;
SELECT E.*, CONVERT(Decimal(6,2),
E.Salary / (
SELECT MAX(salary) FROM Employees E2
WHERE E2.Dept = E.Dept)* 100) AS PercentMax
FROM Employees E;
One thing that can easliy trip you up when working with calculated columns in MS-SQL is that it is poor at implicitly converting one data type to another. Instead it relys on you converting one data type to another. So in this case the Salaries are held as integers. When we do the first calculation instead of getting 78% etc for the various columns we get 0.
The reason for this is that the int data type of the salary field is passed into the calculation. Now because E.Salary and Max(Salary) are both integers, MS-SQL is going to calculate the result as an integer before converting it to a percentage.
65
SQL – Sub Queries in SELECT
A Sub-SELECT query returns a value to be used as a result in the outer query
This is different from the previous examples where the nested query is used to help generate/condition the results of the outer query
Note:
Because the SELECT query is part of another SELECT Clause it must only return ONE result – because it is being used to fill in a single result
66
SELECT E.*, CONVERT(Decimal(6,2),
E.Salary / (
SELECT (salary) FROM Employees E2
WHERE E2.Dept = E.Dept)* 100) AS PercentMax
FROM Employees E;
SQL – Sub Queries in FROM
A Sub-FROM query returns a relation that can be used to help generate the results in the outer query
Example: Find all Employees that live in the same capital city:
67
SELECT E.* FROM Employees E
JOIN ( SELECT E2.* FROM Employees E2 ) AS MySubQuery
ON E.Dept = MySubQuery.Dept
WHERE E.FirstName <> MySubQuery.FirstName
AND E.Surname <> MySubQuery.Surname
AND E.City = MySubQuery.City;
SELECT * FROM (
SELECT E1.* FROM Employees E1 JOIN Employees E2
ON E1.Dept = E2.Dept
WHERE E1.FirstName <> E2.FirstName
AND E1.Surname <> E2.Surname
AND E1.City = E2.City
) AS MySubQuery
Comments on nested queries
The use of nested queries may produce ‘less declarative’ queries, but they can be more readable
Complex queries can become very difficult to understand
The use of variables must respect visibility rules
a variable can only be used within the query where it is defined, or within a query that is nested in the query where it is defined
/docProps/thumbnail.jpeg