程序代写 RELATIONAL ALGEBRA

RELATIONAL ALGEBRA
CS 564- Fall 2021
ACKs: , Jignesh Patel, An

WHAT IS THIS LECTURE ABOUT?
• RelationalAlgebra
– query language for relations
• BasicOperations
– selection, projection
– difference, union
– cross-product, renaming
• DerivedOperations
– join, natural join, equi-join, division
CS 564 [Fall 2021] – Paris Koutris 2

RELATIONAL QUERY LANGUAGES
• allowthemanipulationandretrievalofdatafroma database
• twotypesofquerylanguages:
– Declarative: describe what a user wants, rather
than how to compute it
• Tuple Relational Calculus
• Domain Relational Calculus
– Procedural: operational, useful for representing execution plans
• Relational Algebra
CS 564 [Fall 2021] – Paris Koutris 3

WHAT IS RELATIONAL ALGEBRA? • algebra:mathematicalsystemconsistingof
– operands: variables or values from which new values can be constructed
– operators: symbols denoting procedures that construct new values from given values
• relationalalgebra:analgebrawhoseoperands are relations or variables that represent relations
– operators do the most common things that we need to do with relations in a database
– can be used as a query language for relations
CS 564 [Fall 2021] – Paris Koutris 4

RELATIONAL ALGEBRA: PRELIM
• Query:
– Input: relational instances
– Output: relational instances
– specified using the schemas
• mayproducedifferentresultsfordifferentinstances
• theschemaoftheresultisfixed
• therearetwotypesofnotationforattributes: – positional (e.g. 2, 4)
– named-field (e.g. C.name, Person.SSN)
CS 564 [Fall 2021] – Paris Koutris 5

RELATIONAL ALGEBRA: PRELIM
• Basicoperations:
– Selection {𝜎}: selects a subset of rows
– Projection {𝜋}: deletes columns
– Cross-product {×}: combines two relations
– Set-difference {−}
– Union {⋃}
• When the relations have named fields: – Renaming {𝜌}
• Additionaloperations:
– Intersection, join, division
CS 564 [Fall 2021] – Paris Koutris 6

KEEP IN MIND!
• SQLusesmultisets,howeverinRelationalAlgebra we will consider relations as sets
• Wewillconsiderthenamedperspective,where every attribute must have a unique name
The attribute order in a relation does not matter!
CS 564 [Fall 2021] – Paris Koutris 7

BASIC OPERATIONS
CS 564 [Fall 2021] – Paris Koutris 8

SELECTION
Notation: 𝜎!(𝑅)
• CisaconditionthatreferstotheattributesofR
• outputstherowsofRthatsatisfyC
• outputschema:sameasinputschema
Example
• 𝜎!”#$%& 𝑃𝑒𝑟𝑠𝑜𝑛
• 𝜎!”#$%& !'( !”#)%*(𝑃𝑒𝑟𝑠𝑜𝑛)
• 𝜎!”#$%& !'( ‘!+#,“.!/01”(𝑃𝑒𝑟𝑠𝑜𝑛)
CS 564 [Fall 2021] – Paris Koutris 9
SELECT *
FROM Person WHERE age > 24 ;

SELECTION: EXAMPLE Person
SSN
name
age
phoneNumber
934729837
Paris
24
608-374-8422
934729837
Paris
24
603-534-8399
123123645
John
30
608-321-1163
384475687
Arun
25
206-473-8221
𝜎!”#$%& 𝑃𝑒𝑟𝑠𝑜𝑛
SSN
name
age
phoneNumber
123123645
John
30
608-321-1163
384475687
Arun
25
206-473-8221
CS 564 [Fall 2021] – Paris Koutris 10

PROJECTION
Notation: 𝜋”!,””,…,”# (𝑅)
• outputsonlythecolumns𝐴!,𝐴”,…,𝐴# • removesanyduplicatetuples
• outputschema:𝑅(𝐴!,𝐴”,…,𝐴#)
Example
• 𝜋334,!”# 𝑃𝑒𝑟𝑠𝑜𝑛
• 𝜋334,678’#49+:#/,!”# (𝑃𝑒𝑟𝑠𝑜𝑛)
SELECT DISTINCT SSN,age FROM Person ;
CS 564 [Fall 2021] – Paris Koutris 11

PROJECTION: EXAMPLE Person
SSN
name
age
phoneNumber
934729837
Paris
24
608-374-8422
934729837
Paris
24
603-534-8399
123123645
John
30
608-321-1163
384475687
Arun
20
206-473-8221
𝜋334,’!+# 𝑃𝑒𝑟𝑠𝑜𝑛
SSN
name
934729837
Paris
123123645
John
384475687
S 564 [Fall 2021] – Paris Koutris 12

RA OPERATORS ARE COMPOSITIONAL
SELECT DISTINCT SSN,age FROM Person
WHERE age > 24 ;
Two logically equivalent expressions in RA:
• 𝜋334,!”# 𝜎!”#$%& 𝑃𝑒𝑟𝑠𝑜𝑛 • 𝜎!”#$%&(𝜋334,!”# 𝑃𝑒𝑟𝑠𝑜𝑛 )
CS 564 [Fall 2021] – Paris Koutris 13

UNION
Notation: 𝑅% ∪ 𝑅&
• outputs all tuples in 𝑅!or 𝑅”
• bothrelationsmusthavethesameschema! • outputschema:sameasinput
∪=
A
B
a1
b1
a2
b1
a2
b2
a3
b1
a4
b4
A
B
a1
b1
a2
b1
a2
b2
A
B
a1
b1
a3
b1
a4
b4
CS 564 [Fall 2021] – Paris Koutris 14

DIFFERENCE
Notation: 𝑅% − 𝑅&
• outputsalltuplesin𝑅!andnotin𝑅”
• bothrelationsmusthavethesameschema! • outputschema:sameasinput
−=
A
B
a1
b1
a2
b1
a2
b2
A
B
a1
b1
a3
b1
a4
b4
A
B
a2
b1
a2
b2
CS 564 [Fall 2021] – Paris Koutris 15

CROSS-PRODUCT
Notation: 𝑅%× 𝑅&
• matcheseachtuplesin𝑅!witheachtuplein𝑅”
• input schema: 𝑅! 𝐴!,𝐴”,…,𝐴# , 𝑅” 𝐵!,𝐵”,…,𝐵$ • output schema: 𝑅 𝐴!,…,𝐴#,𝐵!,…,𝐵$
Example
• 𝑃𝑒𝑟𝑠𝑜𝑛×𝐷𝑒𝑝𝑎𝑟𝑡𝑚𝑒𝑛𝑡
CS 564 [Fall 2021] – Paris Koutris 16
SELECT *
FROM Person,Department;

CROSS-PRODUCT: EXAMPLE
Person
Dependent
SSN
name
934729837
Paris
123123645
John
depSSN
depname
934729837
Helen
934729837
Bob
𝑃𝑒𝑟𝑠𝑜𝑛 × 𝐷𝑒𝑝𝑒𝑛𝑑𝑒𝑛𝑡
SSN
name
depSSN
depname
934729837
Paris
934729837
Helen
123123645
John
934729837
Bob
934729837
Paris
934729837
Bob
123123645
John
934729837
S 564 [Fall 2021] – Paris Koutris 17

RENAMING
Notation: 𝜌”!,””,…,”#(𝑅)
• doesnotchangetheinstance,onlytheschema! • inputschema:𝑅 𝐵!,𝐵”,…,𝐵#
• outputschema:𝑅𝐴!,…,𝐴#
Why is it necessary?
named perspective: when joining relations, we need to distinguish between attributes with the same name!
CS 564 [Fall 2021] – Paris Koutris 18

RENAMING: EXAMPLE Person
Dependent
SSN
name
934729837
Paris
123123645
SN
name
934729837
Helen
934729837
Bob
𝑃𝑒𝑟𝑠𝑜𝑛 × 𝜌(#6334,(#6′!+# (𝐷𝑒𝑝𝑒𝑛𝑑𝑒𝑛𝑡)
SSN
name
depSSN
depname
934729837
Paris
934729837
Helen
123123645
John
934729837
Bob
934729837
Paris
934729837
Bob
123123645
John
934729837
S 564 [Fall 2021] – Paris Koutris 19

DERIVED OPERATIONS
CS 564 [Fall 2021] – Paris Koutris 20

INTERSECTION
Notation: 𝑅% ∩ 𝑅&
• outputs all tuples in 𝑅!and 𝑅”
• outputschema:sameasinput
• canbeexpressedas:𝑅!−(𝑅!−𝑅”)
RS

SELECT R.A, R.B FROM R,S
WHERE R.A = S.A AND R.B = S.B;
A
B
a1
b1
a2
b1
a2
b2
A
B
a1
b1
a3
b1
a4
b4
A
B
a1
b1
=
CS 564 [Fall 2021] – Paris Koutris 21

JOIN (THETA JOIN)
Notation: 𝑅%⋈’ 𝑅& = 𝜎'(𝑅%× 𝑅&)
• cross-productfollowedbyaselection
• θcanbeanyboolean-valuedcondition
• mighthavelesstuplesthanthecross-product!
SELECT * FROM R1, R2 WHERE θ;
CS 564 [Fall 2021] – Paris Koutris 22

THETA JOIN: EXAMPLE
Person
Dependent
SSN
name
age
934729837
Paris
26
123123645
John
22
dSSN
dname
dage
934729837
Helen
23
934729837
Bob
28
𝑃𝑒𝑟𝑠𝑜𝑛 ⋈.#/18′.!”#$<#6#'(#'=.(!"# 𝐷𝑒𝑝𝑒𝑛𝑑𝑒𝑛𝑡 SSN name age dSSN dname dage 934729837 Paris 26 934729837 Helen 23 CS 564 [Fall 2021] - Paris Koutris 23 EQUI-JOIN Notation: 𝑅%⋈' 𝑅& • specialcaseofjoinwheretheconditionθcontains only equalities between attributes • outputschema:sameasthecross-product Examplefor𝑅 𝐴,𝐵 ,𝑆(𝐶,𝐷) • 𝑅 ⋈%&' 𝑆 • outputschema:𝑇(𝐴,𝐵,𝐶,𝐷) SELECT * FROM R, S WHERE R.B = S.C; CS 564 [Fall 2021] - Paris Koutris 24 NATURAL JOIN Notation: 𝑅! ⋈ 𝑅" • equi-join on all the common fields • the output schema has one copy of each common attribute Person Dependent SSN name age 934729837 Paris 26 123123645 John 22 SSN dname 934729837 Helen 934729837 Bob 𝑃𝑒𝑟𝑠𝑜𝑛 ⋈ 𝐷𝑒𝑝𝑒𝑛𝑑𝑒𝑛𝑡 SELECT SSN,name,age,dname FROM Person P, Department D WHERE P.SSN = D.SSN ; SSN name age dname 934729837 Paris 26 Helen 934729837 Paris 26 S 564 [Fall 2021] - Paris Koutris 25 NATURAL JOIN Natural Join 𝑅 ⋈ 𝑆 • Input schema: 𝑅 𝐴,𝐵,𝐶,𝐷 ,𝑆 𝐴,𝐶,𝐸 – Output schema: T(A, B, C, D, E) • Input schema: 𝑅 𝐴,𝐵,𝐶 ,𝑆 𝐷,𝐸 – Output schema: T(A, B, C, D, E) • Input schema: 𝑅 𝐴,𝐵,𝐶 ,𝑆 𝐴,𝐵,𝐶 – Output schema? T(A, B, C,) CS 564 [Fall 2021] - Paris Koutris 26 SEMI-JOIN Notation: 𝑅%⋉ 𝑅& • naturaljoinfollowedbyprojectiononthe attributes of R1 Example: • 𝑅 𝐴,𝐵,𝐶 ,𝑆(𝐵,𝐷) • 𝑅 ⋉ 𝑆 = 𝜋(,%,' 𝑅 ⋈ 𝑆 • outputschema:T(A,B,C) SELECT A,B,C FROM R, S WHERE R.B = S.B ; CS 564 [Fall 2021] - Paris Koutris 27 DIVISION Notation: 𝑅%/𝑅& • suppose𝑅!(𝐴,𝐵)and 𝑅" 𝐵 • theoutputcontainsallvaluesasuchthatforevery tuple (b) in R2, tuple (a, b) is in R1 • outputschema:R(A) CS 564 [Fall 2021] - Paris Koutris 28 DIVISION: EXAMPLE A B1 B2 A B a1 b1 a1 b2 a1 b3 a2 b1 B b2 b3 b1 B b1 A a1 A a1 a2 𝑨 /𝑩𝟏 𝑨 /𝑩𝟐 CS 564 [Fall 2021] - Paris Koutris 29 EXTENDING RELATIONAL ALGEBRA CS 564 [Fall 2021] - Paris Koutris 30 GROUP BY AGGREGATE • ispartoftheso-calledextendedRA • helpsustocomputecounts,sums,min,max,... Examples • Whatistheaverageageofthecustomers? • HowmanypeopleboughtaniPad? CS 564 [Fall 2021] - Paris Koutris 31 GROUP BY AGGREGATE Notation: 𝛾(,")) * (𝑅) • groupbytheattributesinX • aggregatetheattributeinY – SUM, COUNT, AVG (average), MIN, MAX • Outputschema:X+anextra(numerical)attribute CS 564 [Fall 2021] - Paris Koutris 32 EXAMPLE Person SSN name age 934729837 Paris 24 123123645 John 30 384475687 Arun 21 𝑃𝑒𝑟𝑠𝑜𝑛 AVG(age) 25 SELECT AVG(age) FROM Person ; CS 564 [Fall 2021] - Paris Koutris 33 EXAMPLE Person SSN name age phoneNumber 934729837 Paris 24 608-374-8422 934729837 Paris 24 603-534-8399 123123645 John 30 608-321-1163 384475687 Arun 21 206-473-8221 𝛾334,EFG4H(678'#49+:#/) 𝑃𝑒𝑟𝑠𝑜𝑛 SELECT SSN, COUNT(phoneNumber) FROM Person GROUP BY SSN; SSN COUNT(phoneNumber) 934729837 2 123123645 1 384475687 1 CS 564 [Fall 2021] - Paris Koutris 34 CONSTRUCTING RA QUERIES CS 564 [Fall 2021] - Paris Koutris 35 COMBINING RA OPERATORS • Wecanbuildmorecomplexqueriesbycombining RA operators together e.g. standard algebra: 𝑥 + 1 • Thereare3differentnotations: – sequence of assignment statements – expressions with operators – expression trees ∗ 𝑦 − 𝑧" CS 564 [Fall 2021] - Paris Koutris 36 COMBINING RA OPERATORS Input schema: 𝑅 𝐵, 𝐶 , 𝑆(𝐴, 𝐵) • expressionswithoperators 𝑅 ⋈𝑆) 𝜋@ ⋈ • sequenceofassignmentstatements 𝑅J = 𝜎E,I 𝑅 𝑅JJ = 𝑅′ ⋈ 𝑆 S 𝑅JJJ = 𝜎E,I • expressiontrees R CS 564 [Fall 2021] - Paris Koutris 37 EXPRESSIVE POWER OF RA • RAcannotexpresstransitiveclosure! Edges Transitive closure computes all pairs of nodes connected by a directed path From To a b b c a d c d CS 564 [Fall 2021] - Paris Koutris 38