程序代写代做代考 database module4

module4

Database Management, GMU, Dr. Brodsky, Module 4 1

Relational Algebra

Module 4
Professor Alex Brodsky

Database Systems

Database Management, GMU, Dr. Brodsky, Module 4 2

Relational Query Languages

❖ Query languages: Allow manipulation and retrieval
of data from a database.

❖ Relational model supports simple, powerful QLs:
– Strong formal foundation based on logic.
– Allows for much optimization.

❖ Query Languages != programming languages!
– QLs not expected to be “Turing complete”.
– QLs not intended to be used for complex calculations.
– QLs support easy, efficient access to large data sets.

Database Management, GMU, Dr. Brodsky, Module 4 3

Formal Relational Query Languages

Two mathematical Query Languages form the
basis for “real” languages (e.g. SQL), and for
implementation:

❶ Relational Algebra: More operational, very
useful for representing execution plans.

❷ Relational Calculus: Lets users describe what
they want, rather than how to compute it.
(Non-operational, declarative.)

☛ Understanding Algebra is key to understanding SQL,
☛ and query processing!

Database Management, GMU, Dr. Brodsky, Module 4 4

Algebra Preliminaries

❖ A query is applied to relation instances, and the
result of a query is also a relation instance.
– Schemas of input relations for a query are fixed (but

query will run regardless of instance!)
– The schema for the result of a given query is also

fixed! Determined by definition of query language
constructs.

Database Management, GMU, Dr. Brodsky, Module 4 5

Example Instances

R1

S1

S2

❖ “Sailors” and “Reserves”
relations for our examples.

Database Management, GMU, Dr. Brodsky, Module 4 6

Algebra Operations

❖ Look what we want to get from the following
table:

Database Management, GMU, Dr. Brodsky, Module 4 7

Projection

p
sname rating

S
,

( )2

page S( )2

❖ Deletes attributes that are not in
projection list.

❖ Schema of result contains exactly
the fields in the projection list,
with the same names that they
had in the (only) input relation.

❖ Projection operator has to
eliminate duplicates! (Why??)
– Note: real systems typically

don’t do duplicate elimination
unless the user explicitly asks
for it. (Why not?)

Database Management, GMU, Dr. Brodsky, Module 4 8

Selection

=
>

)2(
8
S

rating
s

❖ Selects rows that satisfy
selection condition.

❖ No duplicates in result!
(Why?)

❖ Schema of result
identical to schema of
(only) input relation.

S2

Database Management, GMU, Dr. Brodsky, Module 4 9

Composition of Operations

❖ Result relation can be the input for another relational
algebra operation! (Operator composition.)

=
>

))2(
8

(, Sratingratingsname
sp

S2

Database Management, GMU, Dr. Brodsky, Module 4 10

What do we want to get from two
relations?

R1
S1

What about: Who reserved boat 101?

Or: Find the name of the sailor who reserved boat 101.

Database Management, GMU, Dr. Brodsky, Module 4 11

Cross-Product
❖ Each row of S1 is paired with each row of R1.
❖ Result schema has one field per field of S1 and R1,

with field names inherited.

)1,2()1,1( RsidsidSsidsid ®´® rr

sid1 sname rating age sid2 bid day
22 dustin 7 45.0 22 101 10/10/96
22 dustin 7 45.0 58 103 11/12/96
31 lubber 8 55.5 22 101 10/10/96
31 lubber 8 55.5 58 103 11/12/96
58 rusty 10 35.0 22 101 10/10/96
58 rusty 10 35.0 58 103 11/12/96

☛ Renaming operator (because naming conflict):

Database Management, GMU, Dr. Brodsky, Module 4 12

Why does this cross product help

Query: Find the name of the sailor who reserved boat 101.

))(
101and21

(Result

)1,2()1,1(

CP
bidsidsidSname

RsidsidSsidsidCP

==
=

®´®=

sp

rr

* Note my use of “temporary” relation CP.

Database Management, GMU, Dr. Brodsky, Module 4 13

Another example

❖ Find the name of the sailor having the highest
rating.

))AllR2((Result?

)2,(AllR

´
< = >-=

S
ratingAratingSname

SratingArating
ratingA

sp

p

What’s in “Result?” ?

Does it answer our query?

Database Management, GMU, Dr. Brodsky, Module 4 14

Union, Intersection, Set-Difference

❖ All of these operations take
two input relations, which
must be union-compatible:
– Same number of fields.
– `Corresponding’ fields

have the same type.
❖ What is the schema of result?

Database Management, GMU, Dr. Brodsky, Module 4 15

Back to our query

❖ Find the name of the sailor having the highest
rating.

Tmp)2
,

( Result

))AllR2((
,

Tmp

)2,(AllR

-=

´
< = >-=

)(S
SnameSidSname

S
ratingAratingSnameSid

SratingArating
ratingA

pp

sp

p

* Why not project on Sid only for Tmp?

Database Management, GMU, Dr. Brodsky, Module 4 16

Relational Algebra (Summary)

❖ Basic operations:
– Selection ( s ) Selects a subset of rows from relation.
– Projection ( p ) Deletes unwanted columns from relation.
– Cross-product ( ´ ) Allows us to combine two relations.
– Set-difference ( – ) Tuples in reln. 1, but not in reln. 2.
– Union ( È ) Tuples in reln. 1 and in reln. 2.
– Rename ( r ) Changes names of the attributes

❖ Since each operation returns a relation, operations can be
composed! (Algebra is “closed”.)

❖ Use of temporary relations recommended.

Database Management, GMU, Dr. Brodsky, Module 4 17

Natural Join

❖ Natural-Join:

❖ Result schema similar to cross-product, but only one copy of
each field which appears in both relations.

❖ Natural Join = Cross Product + Selection on common fields
+ drop duplicate fields.

=11 RS !”

Database Management, GMU, Dr. Brodsky, Module 4 18

Query revisited using natural join

Query: Find the name of the sailor who reserved boat 101.

))1(
101

1(Result

))11(
101

(Result

R
bid

S
Sname

Or

RS
bidSname

=
=

=
=

sp

sp

!”

!”

Database Management, GMU, Dr. Brodsky, Module 4 19

Consider yet another query
❖ Find the sailor(s) who reserved all the

red boats.

sid bid day
22 101 10/10/96
22 103 10/11/96
58 102 11/12/96

R1

bid color
101 Red
102 Green
103 Red

B

Database Management, GMU, Dr. Brodsky, Module 4 20

Start an attempt

❖ who reserved what boat:

❖ All the red boats:

sid bid
22 101
22 103
58 102

== )1(
,

1 R
bidsid

S p

=
=

= ))((2 B
redcolorbid

S sp
bid
101
103

Database Management, GMU, Dr. Brodsky, Module 4 21

Division
❖ Not supported as a primitive operator, but useful for

expressing queries like:
Find sailors who have reserved all red boats.

❖ Let S1 have 2 fields, x and y; S2 have only field y:
– S1/S2 =
– i.e., S1/S2 contains all x tuples (sailors) such that for every

y tuple (redboat) in S2, there is an xy tuple in S1 (i.e, x
reserved y).

❖ In general, x and y can be any lists of fields; y is the
list of fields in S2, and xÈy is the list of fields of S1.

ïþ

ï
ý
ü

ïî

ï
í
ì )1,(2| RinyxexiststhereRinyforallx

Database Management, GMU, Dr. Brodsky, Module 4 22

Examples of Division A/B

A

B1
B2

B3

A/B1 A/B2 A/B3

Database Management, GMU, Dr. Brodsky, Module 4 23

Find names of sailors who’ve reserved boat #103

❖ Solution 1: )))Re(( (
103

Sailorsserves
bidsname

!”
=

sp

❖ Solution 2: )Re1 (
103

servesTemp
bid =

=s

SailorsTempTemp !”12=
)2(Re Tempsult snamep=

❖ Solution 3:

Database Management, GMU, Dr. Brodsky, Module 4 24

Find names of sailors who’ve reserved a red boat

❖ Information about boat color only available in
Boats; so need an extra join:

❖ A more efficient
solution:

☛ A query optimizer can find this given the first solution!

Database Management, GMU, Dr. Brodsky, Module 4 25

Find sailors who’ve reserved a red or a green boat

❖ Can identify all red or green boats, then find
sailors who’ve reserved one of these boats:

)(
””
Boats

greencolorredcolor
Tempboats

=Ú=
=s

)Re(Re SailorsservesTempboatssnamesult !”!”p=

❖ Can also define Tempboats using union! (How?)

❖ What happens if is replaced by in this query?Ú Ù

Database Management, GMU, Dr. Brodsky, Module 4 26

Find sailors who’ve reserved a red and a green boat

❖ Previous approach won’t work! Must identify
sailors who’ve reserved red boats, sailors
who’ve reserved green boats, then find the
intersection (note that sid is a key for Sailors):

)Re))(

(( servesBoats
redcolorsid

Tempred !”
=

= sp

))((Re SailorsTempgreenTempredsnamesult !ӂ=p

)Re))(

(( servesBoats
greencolorsid

Tempgreen !”
=

= sp

Database Management, GMU, Dr. Brodsky, Module 4 27

Find the names of sailors who’ve reserved all boats

❖ Uses division; schemas of the input relations
to / must be carefully chosen:

))((/))(Re
,

(Tempsids Boats
bid

serves
bidsid

pp=

)(Result SailorsTempsidssname !”p=

❖ To find sailors who’ve reserved all ‘Interlake’ boats:

))(

(/ Boats
Interlakebnamebid =

sp…..

Database Management, GMU, Dr. Brodsky, Module 4 28

Summary

❖ The relational model has rigorously defined
query languages that are simple and
powerful.

❖ Relational algebra is more operational; useful
as internal representation for query
evaluation plans.

❖ Several ways of expressing a given query; a
query optimizer should choose the most
efficient version.