程序代写代做代考 database algorithm CS44800 – Homework 1

CS44800 – Homework 1
Spring 2017

Due: Thursday, ​ February 23, 2017​, in class. Before class starts.
(There will be 10% penalty for each late day. After 4 late days, the homework will not be
accepted.)
___________________________________________________________________________

Part 1: Relational Algebra and Calculus (10 points)

Suppliers (​sid​ : integer​, ​sname​ : string, ​address​ : string)
Parts (​pid​ : integer​, ​pname​ : string, ​color​ : string)
Catalog (​sid​ : integer, ​pid​ : integer​, ​cost​ : integer)

The key fields are underlined, and the domain of each field is listed after the field name.
Therefore, ​sid is the key for Suppliers, ​pid is the key for Parts, and ​sid and ​pid are together form
the key for Catalog. The Catalog relation lists the prices charged for parts by Suppliers.

Based on the above schema, write the following queries in SQL, relational algebra, tuple
relational calculus and domain relational calculus:

1. Find the sids of suppliers who supply some red part or are at 221 Packer Ave.

2. Find pairs of sids such that the supplier with the first sid charges more for some part than
the supplier with the second sid.

3. Find the pids of parts supplied by at least two different suppliers.

4. Find the pids of the most expensive parts supplied by suppliers named Yosemite Sham.

5. Find the pids of parts supplied by every supplier at less than $200. (If any supplier either
does not supply the part or charges more than $200 for it, the part is not selected.)

___________________________________________________________________________

Part 2: Tree-Based Indexing (35 points)

1. Show the result of inserting 12, 30, 15, 29, 1, 17, 3, 13, 35, 18, 27 and 8 into an initially
empty B+ tree of order 3.

2. Show the result of deleting 15, 8 and 30 from the previous B+ tree. Show the B+
tree after each deletion.

3. Assume that you have just built a dense B+ tree index using Alternative (2) on a heap
file containing 20,000 records. The key field for this B+ tree index is a 40-byte string, and
it is a candidate key. Pointers (i.e., record ids and page ids) are exactly 10-byte values.
The size of one disk page is 1000 bytes. The index was built in a bottom-up fashion
using the bulk-loading algorithm, and the nodes at each level were filled with a fill factor
of 100%.

Please answer the following:

a. How many levels does the resulting tree have? Show your steps.

b. For each level of the tree, how many nodes are at that level? Show your steps.

4. If your database system supported both a static and a dynamic tree index (e.g, ISAM
and B+ trees), would you ever consider using the static index in preference to the
dynamic index? Justify your answer.

___________________________________________________________________________

Part 3: Hash Based Indexing (25 points)

1. Assume that you have an empty hash table where each page can store up to two
elements (i.e., page size = 2). Assume further that the initial number of buckets you have
is two. Show the result of inserting 12, 30, 15, 29, 1, 17, 3 and 13 into the hash table
using:

a. Extensible Hashing
b. Linear Hashing

2. Describe a situation where extendable hashing suffers from performance limitations.

Does linear hashing suffer from the same performance limitation you described? Justify
your answer.

State clearly any assumptions you may assume.

___________________________________________________________________________

Part 4: Disk and Files (30 points)

Consider a disk with a sector size of 512 bytes, 2000 tracks per surface, 50 sectors per track, 5
double-sided platters and average seek time of 10 msec, rotational delay time of 2 msec and
the transfer rate is 100MB per second. Suppose a block size of 1024 bytes is chosen and a file
containing 100,000 records of 100 bytes each is to be stored on a such a disk and no record is
allowed to span 2 blocks.

1. How many records fit onto a block?

2. How many blocks are required to store the entire file? If the file is arranged sequentially
on disk how many surfaces are needed?

3. How many records of 100 bytes each can be stored using this disk?

4. If pages are stored sequentially on disk, with page 1 on block 1 of track 1, what page is
stored on block 1 of track 1 on the next disk surface? How would your answer change if
the disk were capable of reading and writing from all heads in parallel?

5. What time is required to read a file containing 100,000 records of 100 bytes each
sequentially? Again, how would your answer change if the disk were capable of reading /
writing from all heads in parallel (and the data was arranged optimally)?

6. What is the time required to read a file containing 100,000 records of 100 bytes each in
a random order? To read a record, the block containing the record has to be fetched
from disk. Assume that each block request incurs the average seek time and rotational
delay.