程序代写 Part II: Advanced Queries

Part II: Advanced Queries

Aggregation
Significant extension of relational algebra

Copyright By PowCoder代写 加微信 powcoder

“Count the number of tuples in Skaters”
SELECT COUNT(*)
FROM skaters
“Count how many different ratings?”
SELECT COUNT(DISTINCT rating)
FROM skaters
Result is a relation with only one tuple

Aggregation
• Syntax:COUNT, SUM, AVG, MAX, MIN apply to single attribute/column.
• Additionally, COUNT(*)
• “What is the average age of skaters
with rating 7?”
SELECT AVG(age)
FROM skaters
WHERE rating = 7
• What is the average age of skaters with rating 7, and how many are
SELECT AVG(age), COUNT(*)
FROM skaters
WHERE rating = 7

Aggregation
“Give the names of the skaters with the highest rankings”
SELECT sname
FROM skaters
WHERE rating = (SELECT MAX(rating)
FROM skaters)
(Note also the = in the where clause.We can use direct comparison (=, <, ...) when it is assured that the relation resulting from the subquery has only one tuple.) “Give the name of the sailor that is the first in the alphabet” SELECT min(sname) FROM Skaters S1 • So far, we have applied aggregate operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples. • Example: “Find the average age of the skaters in each rating level” – In general, we don’t know how many rating levels exist, and what the rating values for these levels are. – Suppose, we know that the rating levels go from 1 to 10; then we can write 10 queries that look like this: For i = 1, 2, ... 10 SELECT AVG(age) FROM skaters WHERE rating = i Grouping does this with one query SELECT AVG(age), MIN(age) FROM skaters GROUP BY rating Queries with GROUP BY SELECT target-list FROM relation list WHERE qualification GROUP BY grouping list • A group is defined as a set of tuples that have the same value for all attributes in the grouping list • One answer tuple is generated per group. • The target-list contains aggregation terms and/or attributes • Allowed attributes: – Subset of the grouping list – Since each answer tuple corresponds to one group, we can only depict attributes, for which all tuples in the group have the same value • Example: SELECT rating, MIN(age) FROM skaters GROUP BY rating Queries with GROUP BY SELECT target-list FROM relation list WHERE qualification GROUP BY grouping list (ORDER BY...) SELECT rating, MIN(age) FROM Skaters GROUP BY rating ORDER BY rating Evaluation SELECT target-list FROM relationlist (WHERE qualification) GROUP BY grouping list (ORDER BY attributes from target-list) • ConversiontoRelationalAlgebra – Compute the cross-product of relations in FROM clause, consider only tuples that fulfill the qualification in WHERE clause, project on fields that are needed (in SELECT or GROUP BY) – Partition the remaining tuples into groups by the value of attributes in grouping-list – Return all attributes in the SELECT clause (must also be in the group list) plus the calculated aggregation terms per group. – Return in order if requested 9 SELECT lists with aggregation If any aggregation is used, then each element in the attribute list of the SELECT clause must either be aggregated or appear in a group-by clause SELECT rating, MIN(age) FROM Skater GROUP BY rating Wrong way to find the name of the oldest skaters SELECT sname, MAX(age) FROM Skaters Correct way to find the names of the oldest skaters SELECT sname, age FROM skaters WHERE age = (SELECT MAX(age) FROM skaters) HAVING CLAUSE • HAVINGclausesareselectionsongroups,justasWHEREclausesare selections on tuples Example: “For each rating, find the minimum age of the skaters with this rating. Only consider rating levels with at least two skaters SELECT rating, MIN(age) FROM Skaters GROUP BY rating HAVING COUNT(*) >= 2
Example 2: “For each rating > 5, find the average age of the skaters with this
rating. Only consider rating levels where there are at least two skaters
SELECT rating, AVG(age)
FROM Skaters
WHERE rating > 5
GROUP BY rating
HAVING COUNT(*) >= 2

SELECT rating, avg(age)
FROM Skaters
WHERE rating > 5
GROUP BY rating
HAVING COUNT(*) >= 2
Select upon WHERE and project to necessary attributes
Partition by GROUP and check whether they fulfill HAVING

SELECT rating, age, count(*)
FROM Skaters
WHERE rating > 5
GROUP BY rating, age
HAVING COUNT(*) >= 2

SELECT rating, avg(age)
FROM Skaters
WHERE rating > 5
GROUP BY rating
HAVING COUNT(*) >= 2
Evaluation
SELECT target-list
FROM relation list
WHERE qualification
GROUP BY grouping list
HAVING group-qualification
Conversion to Relational Algebra
– Compute the cross-product of relations in FROM clause, consider only tuples that fulfill the
qualification in WHERE clause, project on fields that are needed (in SELECT or GROUP BY)
– NOTE: the WHERE clause can contain any attributes of the relations in the relation list
SELECT rating, MIN(age)
FROM Skaters
WHERE sname LIKE ‘A%’ GROUP BY rating HAVING COUNT(*) >= 2
– Partition the remaining tuples into groups by the value of attributes in grouping-list
– For each group, the group qualification is then applied selecting only those groups that fulfill the qualification. Expressions in group-qualification must have a single value per group. Hence, for each attribute in the group qualification, either
• the attribute also appears in the grouping list
• or it is argument of an aggregation

Example II
• For each local competition, find the number of participants
SELECT C.cid, COUNT (*) AS scount FROM Competition C, Participates P WHERE C.cid=P.cid AND c.type=‘local’ GROUP BY C.cid
• Grouping over a join of two relations.

Example III
• Findthoseratingsforwhichtheaverageageistheminimum over all ratings
• Aggregate operations cannot be nested! WRONG
SELECT S.rating
FROM Skaters S
WHERE S.age =(SELECT MIN (AVG (S2.age))
• Useviews
FROM Skaters S2)

A view is just an unmaterialized relation: we store a definition rather than a set of tuples.
CREATE VIEW ActiveSkaters (sid,sname)
AS SELECT DISTINCT S.sid, S.sname
FROM Skaters S, Participates P
WHERE S.sid = P.sid
❑ Viewscanbeusedtopresentnecessaryinformation(orasummary), while hiding details in underlying relation(s).
✩ Given ActiveSkaters, we know the names of the skaters who have participated competition but not the age of the skaters (may be uninteresting for the users of ActiveSkaters).

• Use a view as intermediate relation (rename in rel.algebra)
• Find those ratings for which the average age is the
minimum over all ratings
CREATE VIEW Temp (rating, avgage)
AS SELECT rating, AVG (age) AS avgage
FROM skaters
GROUP BY rating)
SELECT rating, avgage
WHERE avgage = (SELECT MIN (avgage)
FROM Temp)

Views (contd)
• Views can be treated as if they were materialized relations
• The system translates a SELECT on a view into SELECTS on
the materialized relations
• Modifications are problematic
• Views can be dropped using the DROP VIEW command
– How to handle DROP TABLE if there’s a view on the table?
• DROPTABLEcommandhasoptionstolettheuserspecifiythis.

NULL Values
Meaning of a NULL value
– Unknown/missing
– Inapplicable(e.g.,no spouse’􏰁sname)
Comparing NULLs to values
– E.g.,howtoevaluateconditionrating>7iftuplehasaNULLinrating?
– WhenwecompareaNULLvalueandanyothervalue(includingNULL) using a comparison operator like > or =, the result is “􏰂unknown􏰃”.
– IfwewanttocheckwhetheravalueisNULL,SQLprovidesthespecial comparison operator IS NULL
Arithmetic Operations (*, +, etc):
– WhenatleastoneoperandhasaNULLvalue(theotheroperandscanhave any value including NULL) then the result is NULL (consequence 0*NULL=NULL !)
– WecannotuseNULLasanoperand(e.g.,rating 7;
WITH A(…) AS
(SELECT …)
, B(…) AS
(SELECT …)
SELECT …;
Possible to have multiple aliases defined
Can be any complex SQL Select query, with joins, aggregations, etc., including with other tables.

Derived Tables
SELECT sid, sname, cid
SELECT S.sid, S.sname, S.age, P.rank, P.cid
FROM skaters S
INNER JOIN participates P
ON S.sid = P.sid
Can be any complex SQL Select query, with joins, aggregations, etc.
Functions like the concept of a view for the scope of this SQL statement
WHERE age > 7;
The outer query can be any complex SQL Select query, with joins, aggregations, etc., including with other tables.
SELECT …
FROM (SELECT …)A
,(SELECT …)B
Possible to have multiple aliases defined

Levels of Abstraction
❑ Single conceptual (logical) schema defines logical
✩ Conceptual database design
❑ Physical schema describes the files and indexes
✩ Physical database design
❑ Different views describe how users see the data
(also referred to as external schema)
✩ generated on demand from the real data
❑ Physical data independence: the conceptual schema protects from changes in the physical structure of data
❑ Logical data independence: external schema protects from changes in conceptual schema of data
Conceptual Schema
Physical Schema

DB Modifications: insert/delete/update
Insert values for all attributes in the order attributes were declared or values for only some attributes
– INSERT INTO Skaters VALUES (68,’Jacky’,10, 10)
– INSERT INTO Skaters (sid,name) VALUES (68, ‘Jacky’)
Insert the result of a query
– ActiveSkaters(sid,name)
– INSERT INTO ActiveSkaters (
SELECT Skaters.sid Skaters.name
FROM Skaters, Participates
WHERE Skaters.sid = Participates.sid)

DB Modifications: insert/delete/update
• Deletesomeoralltuplesofarelation
– DELETE FROM Competitions WHERE cid = 103 – DELETE FROM Competitions
• Update some of the attributes of some of the tuples – UPDATE Skaters
SET ranking = 10, age = age + 1
WHERE name = ‘debby’ OR name = ‘lilly’
• SQL2 semantics: all conditions in a modification statement must be evaluated by the system BEFORE any modifications occur.

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com