3. Structured Query Language (SQL)
3. Structured Query Language (SQL)
We will study:
A brief history
2 Data Definition Language (DDL)
3 Data Manipulation Language (DML)
4 Basic SQL queries
Select – From – Where – Group By – Order By
5 Advanced SQL queries
Set operations Join operations Subqueries
6 Additional features of SQL
7 Limitations of SQL
1
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 1
3. Structured Query Language (SQL)
SQL (Structured Query Language)
SQL provides an interface to relational database systems, including: Data Definition Language (DDL);
Data Manipulation Language (DML);
Query Language (QL).
Initially developed at IBM, as one of the first commercial languages for Edgar F. Codd’s relational model.
History:
1986 – SQL was standardised by ANSI and ISO ( SQL-86).
1989 – SQL was revised ( SQL-89).
1992 – SQL was strengthened and much expanded ( SQL-92).
1999 – SQL was expanded and divided into a core specification plus optional specialised packages ( SQL:1999).
2003 – SQL was further expanded, e.g., XML support ( SQL:2003).
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 2
3. Structured Query Language (SQL)
SQL with Sets and Multisets
Unlike the relational model that is based on sets, SQL is based on multisets. It means that SQL allows a relation to have duplicate tuples.
Suppose that we have the following tables. Does the database tell us which Peter has enrolled in both COMP1130 and COMP2400?
Name
CourseNo
Semester
Tom
COMP2400
2016 S2
Peter
COMP1130
2016 S1
Peter
COMP1130
2016 S1
Peter
COMP2400
2016 S2
Name
DoB
Email
Tom
25/01/1988
tom@gmail.com
Peter
23/05/1993
peter@gmail.com
Peter
20/02/1991
peter@hotmail.com
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 3
3. Structured Query Language (SQL)
SQL with Sets and Multisets
Unlike the relational model that is based on sets, SQL is based on multisets. It means that SQL allows a relation to have duplicate tuples.
Suppose that we have the following relations. Does the database tell us which Peter has enrolled in both COMP1130 and COMP2400?
Answer: No.
Name
CourseNo
Semester
Tom
COMP2400
2016 S2
Peter
COMP1130
2016 S1
Peter
COMP1130
2016 S1
Peter
COMP2400
2016 S2
Name
DoB
Email
Tom
25/01/1988
tom@gmail.com
Peter
23/05/1993
peter@gmail.com
Peter
20/02/1991
peter@hotmail.com
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 4
3. Structured Query Language (SQL)
SQL with Sets and Multisets
We can constrain relations in SQL to be sets by finding some attribute(s) (e.g., StudentID) whose values can uniquely identify each tuple, and specifying them as the PRIMARY KEY.
StudentID
u234567
Name
CourseNo
COMP2400
Semester
u123456
Tom
2016 S2
Peter
COMP1130
2016 S1
StudentID
u234568
Peter
COMP1130
COMP2400
2016 S1
u234567
Peter
2016 S2
Name
DoB
Email
u123456
Tom
25/01/1988
tom@gmail.com
u234567
Peter
23/05/1993
peter@gmail.com
u234568
Peter
20/02/1991
peter@hotmail.com
SQL also provides the DISTINCT option to remove duplicates in a query result (i.e., convert multisets into sets).
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 5
3. Structured Query Language (SQL)
SQL Data Definition Language I/II
The CREATE TABLE statement is used to create a new relation schema by specifying its name, its attributes and, optionally, its constraints.
CREATE TABLE table_name
(attribute_name data_type [attribute constraints],
…,
attribute_name data_type [attribute constraints],
[table constraints]);
For each attribute in a relation, we specify its name, its type and, optionally, a constraint specific to the attribute (i.e., attribute constraint).
attribute_name data_type [attribute_constraint]
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 6
3. Structured Query Language (SQL)
Create Table – Example
CREATE TABLE Student
(StudentID INT not null PRIMARY KEY,
Name VARCHAR(50) not null, DoB Date,
Email VARCHAR(100) UNIQUE);
CREATE TABLE Course
(No INT not null PRIMARY KEY,
Cname VARCHAR(50) not null,
Unit SMALLINT not null Default 6);
CREATE TABLE Enrol
(StudentID INT not null CHECK StudentID>0,
CourseNo INT not null,
Semester VARCHAR(50) not null,
Status VARCHAR(50),
PRIMARY KEY(StudentID, CourseNo)
FOREIGN KEY(StudentID) REFERENCES Student(StudentID), FOREIGN KEY(CourseNo) REFERENCES Course(No));
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 7
3. Structured Query Language (SQL)
Numeric types:
SQL Data Types – I
INTEGER, INT and SMALLINT provide domains of integer numbers of various sizes.
FLOAT, REAL and DOUBLE PRECISION provide floating point numbers of various precision.
NUMERIC(i,j) and DECIMAL(i,j) provide fixed point numbers with parameters precision i and scale j:
– precision for the total number of digits;
– scale for the number of digits following the decimal point.
String types:
CHAR(n) allows character strings of fixed length, where n is the number of characters.
VARCHAR(n) allows character strings of varying length, where n is the maximum number of characters.
BIT(n) allows bit strings of fixed length, where n is the number of bits. BIT VARYING(n) allows bit strings of varying length, where n is the maximum number of bits.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 8
3. Structured Query Language (SQL)
SQL Data Types – II
Date and time types:
DATE provides date values (year, month, day).
TIME provides time values (hour, minute, second).
TIMESTAMP includes the DATE and TIME fields, plus a minimum of six
positions for seconds and an optional WITH TIME ZONE qualifier. INTERVAL specifies a relative value that can be used to increment or
decrement a value of a date, time or timestamp. Boolean type: has the values of TRUE or FALSE.
The CREATE DOMAIN statement is used to create a domain that is essentially a specific data type.
CREATE DOMAIN domain_name AS data_type
[default expression][constraint,…,constraint];
Example: CREATE DOMAIN ssn type AS CHAR(9);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 9
3. Structured Query Language (SQL)
SQL Constraints
The following constraints can be specified in SQL.
NOT NULL: DEFAULT: CHECK: UNIQUE:
PRIMARY KEY: INDEX: FOREIGN KEY:
specify that NULL is not allowed for an attribute.
set a default value for an attribute.
limit the values taken from the domain of an attribute.
ensure that uniqueness of the values for an attribute or a set of attribute in a table.
uniquely identify each tuple in a table.
provides accelerated access to the rows of table. enforce referential integrity between two tables.
Note: PRIMARY KEY and UNIQUE automatically imply NOT NULL.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 10
3. Structured Query Language (SQL)
Not Null, Default and Check – Example
If we don’t want to have missing and unknown data, we can specify NOT NULL for attributes to forbid NULL values.
Unit of any new tuple in COURSE is set to 6 if no explicit value is provided. CHECK for StudentID excludes the student IDs such as 0, -1 and -33.
CREATE TABLE Course
(No INT not null PRIMARY KEY,
Cname VARCHAR(50) not null,
Unit SMALLINT not null Default 6);
CREATE TABLE Enrol
(StudentID INT not null CHECK StudentID>0,
CourseNo INT not null, Semester VARCHAR(50) not null, Status VARCHAR(50),
…);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 11
3. Structured Query Language (SQL)
Unique and Primary Key – Example
If a primary key contains only one attribute, PRIMARY KEY can be defined as an attribute constraint (e.g., in COURSE); otherwise it is defined as a table constraint (e.g., in ENROL).
PRIMARY KEY specifies a key while UNIQUE specifies additional keys.
CREATE TABLE Course
(No INT not null PRIMARY KEY,
Cname VARCHAR(50) not null,
Unit SMALLINT not null Default 6);
CREATE TABLE Enrol
(StudentID INT not null CHECK StudentID>0,
CourseNo INT not null,
Semester VARCHAR(50) not null, Status VARCHAR(50),
PRIMARY KEY(StudentID, CourseNo) …);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 12
3. Structured Query Language (SQL)
Foreign Key – Example
Can we define ENROL before STUDENT and COURSE?
CREATE TABLE Enrol
(StudentID INT not null CHECK StudentID>0,
CourseNo INT not null,
Semester VARCHAR(50) not null,
Status VARCHAR(50),
PRIMARY KEY(StudentID, CourseNo)
FOREIGN KEY(StudentID) REFERENCES Student(StudentID), FOREIGN KEY(CourseNo) REFERENCES Course(No));
CREATE TABLE Student
(StudentID INT not null PRIMARY KEY,
…);
CREATE TABLE Course
(No INT not null PRIMARY KEY,
…);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 13
3. Structured Query Language (SQL)
Foreign Key – Example
Can we define ENROL before STUDENT and COURSE?
Answer: No. ENROL has the foreign keys that reference STUDENT and COURSE.
CREATE TABLE Enrol
(StudentID INT not null CHECK StudentID>0,
CourseNo INT not null,
Semester VARCHAR(50) not null,
Status VARCHAR(50),
PRIMARY KEY(StudentID, CourseNo)
FOREIGN KEY(StudentID) REFERENCES Student(StudentID), FOREIGN KEY(CourseNo) REFERENCES Course(No);
CREATE TABLE Student
(StudentID INT not null PRIMARY KEY,
…);
CREATE TABLE Course
(No INT not null PRIMARY KEY,
…);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 14
3. Structured Query Language (SQL)
Index – Example
Indexes are used for fast retrieval based on columns other than the primary key.
CREATE TABLE Customer
(CustomerID INT not null CHECK CustomerID>0,
Name VARCHAR(50) not null, DOB DATE not null,
Address VARCHAR(80),
Phone INT CHECK Phone>0,
PRIMARY KEY(CustomerID);
CREATE UNIQUE INDEX index1 ON Customer (CustomerID); CREATE INDEX index2 ON Customer (Name, DOB);
CREATE UNIQUE INDEX index3 ON Customer (Phone);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 15
3. Structured Query Language (SQL)
SQL Data Definition Language II/II
The ALTER TABLE statement is used to modify an existing relation schema, including:
changing the name of a table;
adding or dropping an attribute; changing the definition of an attribute; adding or dropping table constraints.
ALTER TABLE table_name ADD attribute_name data_type;
The DROP TABLE statement is used to remove an existing relation schema from a database schema.
DROP TABLE table_name;
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 16
3. Structured Query Language (SQL)
Alter and Drop Table – Examples
Rename the table ENROL to CS ENROL:
ALTER TABLE Enrol RENAME TO CS Enrol; Add an attribute EnrolDate into the table ENROL:
ALTER TABLE Enrol ADD EnrolDate Date; Modify the attribute Status in the table ENROL:
ALTER TABLE Enrol MODIFY Status VARCHAR(50) not null; Drop the attribute Status in the table ENROL:
ALTER TABLE Enrol DROP COLUMN Status; Drop the table ENROL:
DROP TABLE Enrol;
Question: Can we change the primary key or an index of an existing table?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 17
3. Structured Query Language (SQL)
Alter and Drop Table – Examples
Question: Can we change the primary key or an index of an existing table? Answer: Yes, we can. But we need to first drop the existing one and
then add the new one.
Example: (the syntax may vary in different DBMSs)
Drop the existing primary key of the table STUDENT:
ALTER TABLE Student DROP PRIMARY KEY; Add the new primary key in the table STUDENT:
ALTER TABLE Student ADD PRIMARY KEY (Email);
Drop an existing index of the table CUSTOMER:
ALTER TABLE Customer DROP INDEX index1;
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 18
3. Structured Query Language (SQL)
SQL Data Manipulation Language
The INSERT statement is used to add tuples into a relation.
INSERT INTO table_name
[(attribute_name,…,attribute_name)]
VALUES (value,…,value),…,(value,…,value);
The UPDATE statement is used to modify attribute values of one or more selected tuples.
UPDATE table_name
SET attribute_name = value,…,attribute_name = value
[WHERE selection_condition];
The DELETE statement is used to remove tuples from a relation.
DELETE FROM table_name
[WHERE selection_condition];
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 19
3. Structured Query Language (SQL)
Insert – Examples
The following three ways of inserting tuples into the relation STUDENT are equivalent.
INSERT INTO Student
VALUES (456, ‘Tom’, 25/01/1988, ‘tom@gmail.com’),
(458, ‘Peter’, 20/02/1991, ‘peter@hotmail.com’);
INSERT INTO Student(Name, StudentID, DoB, Email) VALUES (‘Tom’, 456, 25/01/1988, ‘tom@gmail.com’),
(‘Peter’, 458, 20/02/1991, ‘peter@hotmail.com’);
INSERT INTO Student
VALUES (456, ‘Tom’, 25/01/1988, ‘tom@gmail.com’);
INSERT INTO Student
VALUES (458, ‘Peter’, 20/02/1991, ‘peter@hotmail.com’);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 20
3. Structured Query Language (SQL)
Insert – Primary Key Violation
Suppose that we have the relation STUDENT with the primary key on StudentID:
What would happen if we try to recycle Tom’s StudentID?
INSERT INTO Student(StudentID, Name, DoB, Email)
VALUES (456, ‘Smith’, 27/08/1989, ‘smith@gmail.com’);
DBMSs will not allow two tuples with the same primary key value in STUDENT.
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
…
…
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 21
3. Structured Query Language (SQL)
Insert – Foreign Key Violation
Consider the relations STUDENT, and ENROL with the foreign key [StudentID]⊆ STUDENT[StudentID].
If we only have the above three tuples in STUDENT, can we add the following tuple into ENROL?
INSERT INTO Enrol(StudentID, CourseNo, Semester, Status) VALUES (460, ‘COMP2400’, ‘2016 S2’, ‘active’);
Again, DBMSs will not allow a tuple in ENROL which has a student ID not appearing in any tuples of STUDENT due to the foreign key [StudentID]⊆ STUDENT[StudentID] on ENROL.
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
459
Fran
11/09/1987
frankk@gmail.com
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 22
3. Structured Query Language (SQL)
Update and Delete – Examples
If we want to change Tom’s email and name stored in the relation STUDENT, then we use
UPDATE Student
SET Name=‘Tom Lee’, Email=‘tom.lee@yahoo.com’ WHERE StudentID=456;
If we want to delete Tom’s information from the relation STUDENT, then we use
DELETE FROM Student WHERE StudentID=456;
We can delete all the tuples in the relation STUDENT by using
DELETE FROM Student;
Question: What is the difference between the above statement and the
following one?
DROP Table Student;
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 23
3. Structured Query Language (SQL)
Update and Delete – Examples
If we want to change Tom’s email and name stored in the relation STUDENT, then we use
UPDATE Student
SET Name=‘Tom Lee’, Email=‘tom.lee@yahoo.com’ WHERE StudentID=456;
If we want to delete Tom’s information from the relation STUDENT, then we use
DELETE FROM Student WHERE StudentID=456;
We can delete all the tuples in the relation STUDENT by using
DELETE FROM Student;
Question: What is the difference between the above statement and the
following one?
DROP Table Student;
Answer: The table STUDENT (empty) exists after the first statement, but would disappear if applying the second one.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 24
3. Structured Query Language (SQL)
Update and Delete – Referential Actions
Referential actions specify what happens in case of deleting or updating referenced tuples (via foreign key constraints).
SQL offers four possibilities:
NO ACTION (default) will throw an error if one tries to delete a row (or update the primary key value) referenced.
CASCADE will force the referencing tuples to be deleted (or updated with new primary key value).
SET NULL will force the corresponding values in the referencing tuples to be set to a null value (i.e., unknown).
SET DEFAULT will force the corresponding values in the referencing tuples to be set to a specified default value.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 25
3. Structured Query Language (SQL)
Referential Actions – Examples
Consider the following foreign key defined on ENROL:
FOREIGN KEY(StudentID) REFERENCES STUDENT(StudentID) ON DELETE NO ACTION
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP1130
2016 S1
active
25/02/2016
458
COMP1130
2016 S1
active
25/02/2016
456
COMP2400
2016 S2
active
09/03/2016
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
The deletion of a student who has enrolled at least one course will throw out an error concerning the foreign key.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 26
3. Structured Query Language (SQL)
Referential Actions – Examples
Consider the following foreign key defined on ENROL:
FOREIGN KEY(StudentID) REFERENCES STUDENT(StudentID) ON DELETE CASCADE
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP1130
2016 S1
active
25/02/2016
458
COMP1130
2016 S1
active
25/02/2016
456
COMP2400
2016 S2
active
09/03/2016
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
Deleting a student in STUDENT will also delete all of his enrolled courses in ENROL. We would have ENROL below after deleting the student 456.
StudentID
CourseNo
Semester
Status
EnrolDate
458
COMP1130
2016 S1
active
25/02/2016
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 27
3. Structured Query Language (SQL)
Simple SQL Queries
SQL provides the SELECT statement for retrieving data from a database. The SELECT statement has the following basic form:
SELECT attribute_list
FROM table_list
[WHERE condition]
[GROUP BY attribute_list [HAVING group_condition]]
[ORDER BY attribute_list];
Note:
Only SELECT and FROM are mandatory. The symbol ∗ means all the attributes.
Attribute names may be qualified with the table name (required, if attribute-names are not unique).
Attribute and table names can be given an alias.
DISTINCT is used for removing duplicate tuples in the query result.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 28
3. Structured Query Language (SQL)
Simple SQL Queries – Select Clause
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP2600
2016 S2
active
25/02/2016
458
COMP1130
2016 S1
active
25/02/2016
456
COMP2400
2016 S2
active
09/03/2016
SELECT * FROM Enrol;
SELECT Enrol.StudentID, Semester FROM Enrol;
SELECT e.StudentID as SID, e.Semester FROM Enrol e; SELECT DISTINCT StudentID, Semester FROM Enrol;
StudentID
Semester
456
2016 S2
458
2016 S1
456
2016 S2
SID
Semester
456
2016 S2
458
2016 S1
456
2016 S2
StudentID
Semester
456
2016 S2
458
2016 S1
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 29
3. Structured Query Language (SQL)
Simple SQL Queries – Where Clause
Unspecified WHERE-clause means no condition.
all tuples of a relation in the FROM-clause are selected.
if multiple relations are specified in the FROM-clause without join conditions, the Cartesian product of tuples is selected (be careful).
The condition in the WHERE-clause can be simple or complicated. SELECT * FROM Student;
SELECT * FROM Student, Course;
SELECT * FROM Student WHERE StudentID BETWEEN 100 AND 500;
SELECT * FROM Student WHERE Email is NOT NULL;
SELECT * FROM Student WHERE Email like ‘%@gmail.com’;
Question: Assume that we have 1000 tuples in STUDENT and 100 tuples in COURSE. How many tuples we will have in the results of the first two queries?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 30
3. Structured Query Language (SQL)
Simple SQL Queries – Where Clause
Unspecified WHERE-clause means no condition.
all tuples of a relation in the FROM-clause are selected.
if multiple relations are specified in the FROM-clause without join conditions, the Cartesian product of tuples is selected (be careful).
The condition in the WHERE-clause can be simple or complicated. SELECT * FROM Student;
SELECT * FROM Student, Course;
SELECT * FROM Student WHERE StudentID BETWEEN 100 AND 500;
SELECT * FROM Student WHERE Email is NOT NULL;
SELECT * FROM Student WHERE Email like ‘%@gmail.com’;
Question: Assume that we have 1000 tuples in STUDENT and 100 tuples in COURSE. How many tuples we will have in the results of the first two queries?
Answer: 1st query result: 1000 tuples; 2nd query result: 100000 tuples.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 31
3. Structured Query Language (SQL)
Simple SQL Queries – Group By Clause
GROUP BY attribute list groups tuples for each value combination in the attribute list.
Aggregate functions can be applied to aggregate a group of attribute values into a single value, e.g.,
COUNT returns the total number of argument values AVG returns the average of argument values
MIN returns the minimum value of the arguments MAX returns the maximum value of the arguments SUM returns the sum of the argument values
We can use HAVING condition to add the condition on the groups.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 32
3. Structured Query Language (SQL)
Simple SQL Queries – Group By Clause
List each course offered in Semester 2 2016 together with the number of students who have enrolled in the course
SELECT e.CourseNo, COUNT(*) AS NumberOfStudents FROM Enrol e
WHERE e.Semester = ‘2016 S2’
GROUP BY e.CourseNo;
The query result may look like:
We may also add HAVING NumberOfStudents >= 120 into the above query in order to exclude the courses that have less than 120 students enrolled from the result.
CourseNo
NumberOfStudents
COMP2400
120
COMP2600
100
COMP1130
150
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 33
3. Structured Query Language (SQL)
Simple SQL Queries – Having Clause
List each course offered in Semester 2 2016 together with the number of students who have enrolled in the course
SELECT e.CourseNo, COUNT(*) AS NumberOfStudents FROM Enrol e
WHERE e.Semester = ‘2016 S2’
GROUP BY e.CourseNo
HAVING COUNT(*)>= 120;
The query result may look like:
CourseNo
NumberOfStudents
COMP2400
120
COMP1130
150
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 34
3. Structured Query Language (SQL)
Simple SQL Queries – Order By Clause
The ORDER BY clause allows us to sort the tuples in a query result. ASC indicates ascending order (default).
DESC indicates descending order. We can sort the previous result by
SELECT e.CourseNo, COUNT(*) AS NumberOfStudents FROM Enrol e
WHERE e.Semester = ‘2016 S2’
GROUP BY e.CourseNo
ORDER BY NumberOfStudents DESC;
This would return all tuples sorted by the number of enrolled students in descending order.
CourseNo
NumberOfStudents
COMP1130
150
COMP2400
120
COMP2600
100
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 35
3. Structured Query Language (SQL)
Advanced SQL Queries – Set Operations
SQL incorporates several set operations: UNION (set union), and sometimes MINUS (set difference) and INTERSECT (set intersection).
Set operations result in return of a relation of tuples (no duplicates).
Set operations apply only to relations that have the same attributes appearing in the same order, e.g., list all students who have either a gmail or hotmail email account.
(SELECT * FROM Student WHERE Email like ‘%@gmail.com’) UNION
(SELECT * FROM Student WHERE Email like ‘%@hotmail.com’); Question: Can we have the following query?
(SELECT * FROM Student) UNION
(SELECT * FROM Course);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 36
3. Structured Query Language (SQL)
Advanced SQL Queries – Set Operations
SQL incorporates several set operations: UNION (set union), and sometimes MINUS (set difference) and INTERSECT (set intersection).
Set operations result in return of a relation of tuples (no duplicates).
Set operations apply only to relations that have the same attributes appearing in the same order, e.g., list all students who have either a gmail or hotmail email account.
(SELECT * FROM Student WHERE Email like ‘%@gmail.com’) UNION
(SELECT * FROM Student WHERE Email like ‘%@hotmail.com’); Question: Can we have the following query?
(SELECT * FROM Student) UNION
(SELECT * FROM Course);
Answer: No. STUDENT and COURSE have different attributes.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 37
3. Structured Query Language (SQL)
Advanced SQL Queries – Join Operations
When we want to retrieve data from more than one relations, we often need to use join operations.
Consider the following queries, which both need a join operation between two relations:
List the names of all courses which have been enrolled by at least one student.
List all students, and their enrolled courses if any.
List all students who have enrolled and their courses.
Types of join operations: Inner Join (default type) and Outer Join.
STUDENT
StudentID
Name
DoB
Email
COURSE
No
Cname
Unit
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 38
3. Structured Query Language (SQL)
Advanced SQL Queries – Inner Join
Inner Join: tuples are included in the result only if there is at least one matching in both relations.
For the query “list the names of all courses which have been enrolled by at least one student”, we use:
SELECT c.Cname
FROM Course c INNER JOIN Enrol e ON c.No=e.CourseNo
No
COURSE
Cname
Unit
COMP2400
Relational Databases
6
COMP3900
Advanced Database Concepts
6
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP1130
COMP1130
2016 S1
active
25/02/2016
458
2016 S1
active
25/02/2016
456
COMP2400
2016 S2
active
09/03/2016
Result:
Cname
Relational Databases
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 39
3. Structured Query Language (SQL)
Advanced SQL Queries – Outer Join
Outer Join: all tuples are included in the result, even if there are no matches in the relations.
For the query “list all students, and their enrolled courses if any”, we can use either of the following statements:
SELECT s.*, e.CourseNo, e.Semester FROM Student s LEFT JOIN Enrol e
ON s.StudentID=e.StudentID
SELECT s.*, e.CourseNo, e.Semester FROM Enrol e RIGHT JOIN Student s
ON e.StudentID=s.StudentID
If we have 1000 tuples in STUDENT, then the query result should contain at least 1000 tuples (one tuple in STUDENT may occur multiple times) with the following attributes:
StudentID
Name
DoB
Email
CourseNo
Semester
…
…
…
…
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 40
3. Structured Query Language (SQL)
Advanced SQL Queries – Outer Join
SELECT s.*, e.CourseNo, e.Semester FROM Student s LEFT JOIN Enrol e
ON s.StudentID=e.StudentID
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP1130
2016 S1
active
25/02/2016
457
COMP1130
2016 S1
active
25/02/2016
456
COMP2400
2016 S2
active
09/03/2016
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
Result:
StudentID
Name
DoB
Email
CourseNo
Semester
456
Tom
25/01/1988
tom@gmail.com
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
COMP2400
2016 S2
458
Peter
20/02/1991
peter@hotmail.com
null
null
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 41
3. Structured Query Language (SQL)
Advanced SQL Queries – Left Outer Join
Left Outer Join: A left outer join retains all rows of the left table regardless of whether there is a row that matches on the right table.
SELECT *
FROM Student s LEFT JOIN Enrol1 e
ON s.StudentID=e.StudentID
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
StudentID
Name
DoB
Email
StudentID
CourseNo
Semester
456
Tom
25/01/1988
tom@gmail.com
456
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
456
COMP2400
2016 S2
458
Peter
20/02/1991
peter@hotmail.com
null
null
null
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 42
3. Structured Query Language (SQL)
Advanced SQL Queries – Outer Join
Right Outer Join: A right outer join retains all rows of the right table regardless of whether there is a row that matches on the left table.
SELECT *
FROM Student s RIGHT JOIN Enrol1 e
ON s.StudentID=e.StudentID
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
StudentID
Name
DoB
Email
StudentID
CourseNo
Semester
456
Tom
25/01/1988
tom@gmail.com
456
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
456
COMP2400
2016 S2
null
null
null
null
457
COMP1130
2016 S1
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 43
3. Structured Query Language (SQL)
Advanced SQL Queries – Outer Join
Left Outer Join:
SELECT *
FROM Enrol1 e LEFT JOIN Student s
ON e.StudentID=s.StudentID
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
StudentID
CourseNo
Semester
StudentID
Name
DoB
Email
456
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
457
COMP1130
2016 S1
null
null
null
null
456
COMP2400
2016 S2
456
Tom
25/01/1988
tom@gmail.com
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 44
3. Structured Query Language (SQL)
Advanced SQL Queries – Outer Join
Right Outer Join:
SELECT *
FROM Enrol1 e RIGHT JOIN Student s
ON e.StudentID=s.StudentID
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
StudentID
CourseNo
Semester
StudentID
Name
DoB
Email
456
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
null
null
null
458
Peter
20/02/1991
peter@hotmail.com
456
COMP2400
2016 S2
456
Tom
25/01/1988
tom@gmail.com
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 45
3. Structured Query Language (SQL)
Advanced SQL Queries – Natural Join
Inner Join: An inner join retains all the data of the two tables for only the matched rows, with duplication
SELECT *
FROM Student s INNER JOIN Enrol1 e
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
Result:
StudentID
Name
DoB
Email
StudentID
CourseNo
Semester
456
Tom
25/01/1988
tom@gmail.com
456
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
456
COMP2400
2016 S2
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 46
3. Structured Query Language (SQL)
Advanced SQL Queries – Natural Join
Natural Join: A natural join retains all the data of the two tables for only the matched rows, without duplication
SELECT *
FROM Student s NATURAL JOIN Enrol1 e
ENROL1
StudentID
CourseNo
Semester
456
COMP1130
2016 S1
457
COMP1130
2016 S1
456
COMP2400
2016 S2
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
Result:
StudentID
Name
DoB
Email
CourseNo
Semester
456
Tom
25/01/1988
tom@gmail.com
COMP1130
2016 S1
456
Tom
25/01/1988
tom@gmail.com
COMP2400
2016 S2
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 47
3. Structured Query Language (SQL)
Advanced SQL Queries – Natural Join
Natural Join: one kind of inner join, in which two relations are joined implicitly by comparing all attributes of the same names in both relations.
For the query “list all students who have enrolled and their courses”, we use:
SELECT * FROM Student NATURAL JOIN Enrol
ENROL
StudentID
CourseNo
Semester
Status
EnrolDate
456
COMP1130
2016 S1
active
25/02/2016
457
COMP1130
2016 S1
active
25/02/2016
STUDENT
StudentID
Name
DoB
Email
456
Tom
25/01/1988
tom@gmail.com
458
Peter
20/02/1991
peter@hotmail.com
Result:(STUDENT.StudentID=ENROL.StudentID is used in the query)
StudentID
Name
DoB
Email
CourseNo
Semester
Status
EnrolDate
456
Tom
25/01/1988
tom@gmail.com
COMP1130
2016 S1
active
25/02/2016
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 48
3. Structured Query Language (SQL)
Advanced SQL Queries – Subqueries
Subqueries are just queries that are used where a relation is required. Subqueries can be specified within the FROM-clause (usually in conjunction
with aliases and renaming) to create inline view (exist only for the query) Subqueries can also be specified within the WHERE-clause, e.g.,
IN subquery tests if tuple occurs in the result of the subquery EXISTS subquery tests whether the subquery results in non-empty
relation
UNIQUE subquery tests if the result of the subquery contains no duplicates
using ALL, SOME or ANY before a subquery makes subqueries usable in comparison formulae
in all these cases the condition involving the subquery can be negated using a preceding NOT
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 49
3. Structured Query Language (SQL)
Subqueries – In
Recall that, for the query “list all students who have enrolled and their courses”, we have:
SELECT *
FROM Student NATURAL JOIN Enrol
Now if we want to query: “list all students who have enrolled in a course that has less than 10 students enrolled and the CourseNo of these courses”, we have
SELECT s.*,e1.CourseNo
FROM Student s NATURAL JOIN Enrol e1
WHERE e1.CourseNo IN
(SELECT e2.CourseNo
FROM Enrol e2
GROUP BY e2.CourseNo HAVING COUNT(*)<10)
Comp2400/Comp6240 - Relational Databases, Semester 2, 2016 50
3. Structured Query Language (SQL)
Subqueries – Exists
For the query: “list all students who have enrolled in at least one course”, we have
SELECT s.*
FROM Student s
WHERE EXISTS (SELECT *
FROM Enrol e
WHERE s.StudentID=e.StudentID)
For the query: “list all students who have not enrolled in any course”, we have
SELECT s.*
FROM Student s
WHERE NOT EXISTS (SELECT *
FROM Enrol e
WHERE s.StudentID=e.StudentID)
Comp2400/Comp6240 - Relational Databases, Semester 2, 2016 51
3. Structured Query Language (SQL)
Subqueries – More Complicated
For the query: “list the courses that have the largest number of students enrolled in Semester 2 2016”, we have
SELECT e.CourseNo,
FROM (SELECT e1.CourseNo, COUNT(*) AS NoOfStudents
FROM Enrol e1
WHERE e1.Semester = ‘2016 S2’ GROUP BY e1.CourseNo) e
WHERE e.NoOfStudents =
(SELECT MAX(e2.NoOfStudents)
FROM (SELECT e1.CourseNo, COUNT(*) AS NoOfStudents FROM Enrol e1
WHERE e1.Semester = ‘2016 S2’
GROUP BY e1.CourseNo) e2);
Comp2400/Comp6240 - Relational Databases, Semester 2, 2016 52
3. Structured Query Language (SQL)
Subqueries – More Complicated
For the query: “list all the courses that have more students enrolled than at least one other course in Semester 2 2016”, we have
SELECT e.CourseNo,
FROM (SELECT e1.CourseNo, COUNT(*) AS NoOfStudents
FROM Enrol e1
WHERE e1.Semester = ‘2016 S2’ GROUP BY e1.CourseNo) e
WHERE e.NoOfStudents > ANY
(SELECT e2.NoOfStudents
FROM (SELECT e1.CourseNo, COUNT(*) AS NoOfStudents FROM Enrol e1
WHERE e1.Semester = ‘2016 S2’
GROUP BY e1.CourseNo) e2);
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 53
3. Structured Query Language (SQL)
Views in SQL
A view in SQL is a virtual table that is derived from other tables in the same database or previously defined views.
There is a limitation on possible update operations which can be performed on view but there is no limitation on querying a view.
How to Create Views?
Suppoe we already have tables STUDENT(StudentID, Name, DoB, Email) and ENROL(StudentID, CourseNo, Semester, Status, EnrolDate). Then we can create a view ENROL1 as follows:
CREATE VIEW ENROL1
AS SELECT e.StudentID, Name, CourseNo, EnrolDate
FROM STUDENT s, ENROL e
WHERE s.StudentID=e.StudentID;
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 54
3. Structured Query Language (SQL)
Additional Features of SQL
SQL provides some further features for: specifying assertions, triggers and views;
CREATE ASSERTION name CHECK . . . allows to define integrity constraints outside the definition of relation schemas;
CREATE TRIGGER . . . specifies the type of action to be taken when certain conditions are satisfied;
transaction control commands (study in Week 10);
specifying privileges to users (study in Week 10);
constructs for writing programs in a programming language style; specifying physical database design and file structures;
features from object-oriented models.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 55
3. Structured Query Language (SQL)
Limitations of SQL
Reachability queries:
Query 1: Find pairs of cities (S, D) such that one can fly from S to D
with at most one stop.
Query 2: Find pairs of cities (S, D) such that one can fly from S to D with at most two stops.
FLIGHTS
Source
Destination
Singapore
Tokyo
Tokyo
Joburg
Joburg
Sydney
…
…
Question: Can we use SQL to express “Find pairs of cities (S, D) such that one can fly from S to D”?
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 56
3. Structured Query Language (SQL)
Limitations of SQL
Query 1: Find pairs of cities (S, D) such that one can fly from S to D with at most one stop.
Query 2: Find pairs of cities (S, D) such that one can fly from S to D with at most two stops.
ANSWER
Source
Destination
Singapore
Joburg
Tokyo
Sydney
…
…
ANSWER
Source
Destination
Singapore
Sydney
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 57
3. Structured Query Language (SQL)
Limitations of SQL
Query 1: Find pairs of cities (S, D) such that one can fly from S to D with at most one stop.
SELECT F1.Source, F2.Destination
FROM Flights F1, Flights F2
WHERE F1.Destination = F2.Source
Query 2: Find pairs of cities (S, D) such that one can fly from S to D with at most two stops.
SELECT F1.Source, F3.Destination
FROM Flights F1, Flights F2, Flights F3
WHERE F1.Destination = F2.Source
AND F2.Destination = F3.Source
ANSWER
Source
Destination
Singapore
Joburg
Tokyo
Sydney
…
…
ANSWER
Source
Destination
Singapore
Sydney
…
…
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 58
3. Structured Query Language (SQL)
Limitations of SQL
Reachability queries:
Query 1: Find pairs of cities (S, D) such that one can fly from S to D
with at most one stop.
Query 2: Find pairs of cities (S, D) such that one can fly from S to D with at most two stops.
FLIGHTS
Source
Destination
Singapore
Tokyo
Tokyo
Joburg
Joburg
Sydney
…
…
Question: Can we use SQL to express “Find pairs of cities (S, D) such that one can fly from S to D”?
Answer: No. We cannot use SQL to express that.
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 59
3. Structured Query Language (SQL)
Summary: SQL in Practice
The core SQL syntax is implemented by
various commercial DBMSs, such as: IBM DB2, Oracle Database Server, Microsoft SQL Server, Sybase, Informix
various open source DBMS, such as: MySQL, PostgreSQL.
None of them is 100 percent compliant to the ANSI/ISO standard
actual implementation of the SQL standard are different often support SQL with proprietary extensions
For database applications, we often need to run SQL statements from a host programming language.
SQL is good for processing data in a relational database, but not so good for complex computational tasks.
SQL’99 can do much more than SQL’92, but usually operations in programming language still lack.
SQL offers two flavours of communicating with a programming language: embedded SQL and dynamic SQL
Comp2400/Comp6240 – Relational Databases, Semester 2, 2016 60