5/18/2020 Submit Final Exam | Gradescope
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new 1/18
0/17 Questions Answered
TIME REMAINING hrs mins
Copyright By PowCoder代写 加微信 powcoder
Final Exam
Q1 Please read and sign this before the exam 0 Points
Exam Instructions – Please Read Carefully
This exam takes place on 5/18/2020 between 3 pm-7 pm EST. It is designed to take about 2.5 hours under normal circumstances. You should finish the exam by 7 pm at the latest.
For every part of the exam, you have the choice to either type the solution into a text box or upload file(s). The files could be diagrams created using other programs, or photos/scans of work you did on paper.
Please make sure to press “Submit Answer” for each subquestion from time to time, and finally press “Submit Assignment” after you have finished everything. You can always resubmit before the end time of the exam.
Also, if you have limited internet connectivity or only a cellphone, you may download the exam and put all answers on paper. In that case, you can submit by scanning the solution in using your smartphone, and sending the instructor the scan via email. Please consider using an app such as camscanner or geniusscan to make it easier for us to process these files.
Please download the entire exam in PDF or another format at the beginning of the exam, even if you plan to do the exam online. This is as a backup, in case you have trouble doing the exam online.
Please periodically check your email during the exam, for possible announcements (clarifications) from the instructor.
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Exam Rules
You may consult your class notes and your textbook. You may also use electronic versions of your class notes and the textbook. You may also use online drawling tools for ER diagrams. You may even run your queries on a database system if you feel that helps you.
You may send an email to the instructor, and you can also request a phone call or skype call from the instructor. Use email, not piazza or messaging inside NYU Classes. Email is
For emergencies, you can also call the intructor at 516 231 2538. But email is preferred,
You may not communicate or collaborate with anyone to work on the exam. You may also not use other information sources on the internet to work on the exam.
Honor Code Oath
I certify and affirm that,
(1) The work presented is 100% my own, and I have not collaborated with anyone else on this test.
(2) I have not cheated or misrepresented someone’s else’s work as my own.
(3) I have not copied or recorded the content of this test.
(4) I will not discuss, nor in any way divulge the contents of this exam, until it is returned.
(5) I understand that my failure to abide by my oath will result in all the consequences outlined in the university code of conduct
Please type your name and the current date below to acknowledge the rules and the oath.
Enter your answer here
Enter your answer here
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Q2 In which way do you prefer to do this exam?
I prefer to do this exam online, by filling in the boxes and/or uploading individual files for each problem.
I prefer to do the entire exam in handwriting. I will upload all my answers as scans or photos here. Preferably in one or a few files. Upload your scanned handwriting here, and graders will ONLY check the files uploaded here.
I prefer to do the entire exam in handwriting, but am unable to upload a copy here. I will send my solution via email by 7pm.
Please select file(s) Select file(s)
Q3 Problem 1 17 Points
Suppose we have an database modeling users and tweets in Twitter, consisting of the following four relations:
User(uid, uname, uemail, ucity, country); Tweet(tid, uid, ttext, ttime, date); Follows(uid, followeruid});
Retweet(uid, tid, rttime, rdate);
A tweet is identified by a tid. Each tweet is made by a particular user at
a certain time and date, and consists of a text of up to 140 characters (varchar(140)). Users can be followed by other users that will then receive the first user’s tweets. Users may also retweet (basically forward) tweets made by other users. This is, of course, a simplified schema, so we do not model issues such as blocking or hashtags.
Q3.1 Problem 1 (a) 3 Points
List all the foreign keys in the above schema.
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Q3.2 Problem 1 (b) 2 Points
In the above schema, can one user retweet the same tweet several times? Why or why not?
Q3.3 Problem 1 (c) 8 Points
Write SQL statements for the following queries:
(i) For each user, list her uid and name, the number of tweets she has sent, and the number of followers she has.
(ii) For each tweet, list the tid, the uid of the user that wrote it, and the number of distinct users who received the tweet. (Users receive a tweet if they either follow the user who wrote it, or follow another user that retweeted it.)
(iii) For each country, list the city with the most tweets sent by people in that city.
Enter your answer here
Enter your answer here
Enter your answer here
Enter your answer here
file chosen
Enter your answer here
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
(iv) List the uid and name of any user who has at least 1000 followers, but where less than ten of the followers live in the United States.
You can upload file(s) here for Problem 1(c) if you prefer to upload screenshot/scan of handwriting instead of the text box.
Please select file(s) Select file(s)
Q3.4 Problem 1 (d) 4 Points
Write statements in Relational Algebra for the first and second query. (i) For each user, list her uid and name, the number of tweets she has sent, and the number of followers she has.
(ii) For each tweet, list the tid, the uid of the user that wrote it, and the number of distinct users who received the tweet. (Users receive a tweet if they either follow the user who wrote it, or follow another user that retweeted it.)
You can upload file(s) here for Problem 1(d ) if you prefer to upload screenshot/scan of handwriting instead of text box.
Enter your answer here
Enter your answer here
Enter your answer here
Please select file(s)
Q4 Problem 2 14 Points
Select file(s)
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Suppose you have a database modeling an online auction site somewhat similar to eBay, given by the
following schema:
USER (uid, uname, city, state)
ITEMTYPE (iid, itemname, description)
AUCTION (seller, iid, starttime, endtime, minbid, condition)
Note: seller references uid in USER, and iid references iid in ITEMTYPE
BID (bidder, seller, iid, starttime, bidtime, bidprice)
Note: bidder references uid in USER, and (seller, iid, starttime)
references AUCTION
In this schema, users can sell items at auction, and can also place bids on auctions by other users. An auction has a start time, when bidding begins, and an end time (both of type datetime). The auction is won by the highest bid made before the end time, provided the bid is higher than the minimum bid. The description of the item that is being sold in the auction is kept in a separate table ITEMTYPE, because several items of the same type (say, a certain book or model of sneakers) may be sold in different auctions by the same or different users, and this way we avoid storing the same information repeatedly. (Note that the condition of the item, such as new , like new , slightly damaged etc., is kept in the AUCTION table since the same type of item may come in different conditions.) Finally, a bid by a user is for a particular auction, and contains the time of the bid and the bid price (which should be higher than the minimum bid and all previous bids for this item, to have a chance to be successful, but this is not enforced by the schema).
SELECT seller
FROM AUCTION
WHERE minbid > 100 AND condition = “NEW”
SELECT U.uid, U.uname
FROM USER U, AUCTION A JOIN BID B
WHERE U.uid = AUCTION.seller AND B.bidprice > 10000 AND
U.state = TX
In the following, assume 100 million users, 20 million itemtypes, 100 million auctions, and 1 billion bids. 10% of auctions have minbid > 100,
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
20% of auctions are for new items, 2% of users are in Texas, and 0.01% of bid prices are > 10000. Assume independence between attributes otherwise.
For simplicity, assume that all records are of size 100 bytes, and any IDs and timestamps are 16 bytes. Any index entries are of size 20 bytes, and any index structures have a node size of 4 KB and a height of 4 (the root, the leaf level, and two levels in between). There are 200 MB of main memory available for processing the query, and you are given a disk with 100 MB/s transfer rate and 10ms latency (seek time plus average rotational latency).
Q4.1 Problem 2 (a) 6 Points
Consider the first query, and estimate its running time under the scan (latency transfer-rate) model of disk performance, for the following three cases:
(i) There are no index structures available.
(ii) There is a dense clustered B+-tree index on the condition attribute in the Auction table that is used for the selection.
(iii) There is an unclustered B+-tree index on the minbid attribute in the Auction table that is used for the selection.
You can upload file(s) here for Problem 2(a) if you prefer to upload screenshot/scan of handwriting instead of text box.
Please select file(s)
Enter your answer here
Enter your answer here
Enter your answer here
file chosen
Select file(s)
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Q4.2 Problem 2 (b) 8 Points
Now consider the second query.
(i) Draw an optimized query evaluation plan for this query, assuming no index structures are available. For each join, identify the join algorithm that is being used by the plan.
Please select file(s) Select file(s)
(ii) Estimate the running time of the query evaluation plan under the
latency transfer-rate model of disk performance.
(iii) Suppose you can create two index structure to make this query faster. Which index structures would you choose and why? How would this change the query evaluation plan, in terms of which methods are used for selection and for joins?
Please select file(s) Select file(s)
You can upload file(s) here for Problem 2(b) if you prefer to upload screenshot/scan of handwriting instead of text box.
Enter your answer here
Enter your answer here
Enter your answer here
Please select file(s)
Q5 Problem 3 12 Points
Select file(s)
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
In each of the following questions, choose the correct answer for each statement.
Q5.1 Problem 3 (a) 3 Points
• Index-based joins are not efficient when both relations are very large.
True False
• It is always fastest to use an index-based join when an index on the join attribute is available.
True False
• Sort-based joins are the most commonly executed joins in database systems.
True False
• Blocked Nested-Loop joins require an index structure on the join attribute.
True False
• Clustered and unclustered B+-trees result in the same running time when used for index-based joins.
True False
Q5.2 Problem 3 (b) 3 Points
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Which of the following statements about concurrency control and serializability are true, and which are false?
• Every view-serializable schedule is also conflict serializable.
True False
• Every view-serializable schedule can be generated by a 2PL scheme.
True False
• Every schedule generated by 2PL is conflict-serializable.
True False
• Every conflict-serializable schedule is serializable.
True False
• 2PL avoids cascading aborts.
True False
Q5.3 Problem 3 (c) 3 Points
Which of the following statements about database design theory are true, and which are false?
• Every schema can be transformed into 3NF.
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
True False
• Every schema can be transformed into BCNF, but the resulting schema may sometimes not be dependency-preserving.
True False
• Given a functional dependency that violates BCNF, it is sometimes impossible to find a lossless join decomposition.
True False
• Given a set of functional dependencies F, it is always possible to find a canonical cover of F.
True False
• Any schema that is in BCNF is also in 3NF.
True False
Q5.4 Problem 3 (d) 3 Points
Which of the following statements about index structures are true, and which are false?
• Hash indexes are usually good for inequality (range) selections.
True False
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
• Clustered B + -trees are usually better than unclustered B + -trees for inequality selections.
True False
• Using an index for an equality selection is always better than scanning the table.
True False
• For equality selections, clustered and unclustered B + -trees always have the same performance.
True False
• Composite indexes combine ideas from hash indexes and B + -trees.
True False
Q6 Problem 4 10 Points
Consider the two schedules shown below. Schedule 1
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Schedule 2
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Q6.1 Problem 4 (a) 3 Points
For each schedule, draw the conflict graphs!
(i) Schedule 1
Type your answer in the box and/or upload an image for this question
Please select file(s) Select file(s)
(ii) Schedule 2
Type your answer in the box and/or upload an image for this question
Enter your answer here
Enter your answer here
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Please select file(s)
Q6.2 Problem 4 (b) 3 Points
For each schedule, decide if it is conflict-serializable. (i) Schedule 1
(ii) Schedule 2
You can upload file(s) here for Problem 4(b) if you prefer to upload screenshot/scan of handwriting instead of text box.
Please select file(s) Select file(s)
Q6.3 Problem 4 (c) 4 Points
Are the schedules guaranteed to avoid cascading aborts? Are the schedules recoverable? Explain!
(i) Schedule 1
(ii) Schedule 2
Enter your answer here
Enter your answer here
Enter your answer here
Enter your answer here
file chosen
Select file(s)
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
You can upload file(s) here for Problem 4(c) if you prefer to upload screenshot/scan of handwriting instead of text box.
Please select file(s) Select file(s)
Q7 Problem 5 6 Points
The Video Hut chain of home electronics stores maintains a very simple database of sales data that keeps a record of every item that was sold to a customer in one of the stores. The database consists of two relations SALES and STORES, which currently contain the following fields and records:
Q7.1 Problem 5 (a) 2 Points
Consider the following SQL query:
SELECT S.customer, sum(S.price)
FROM SALES S, STORES ST
WHERE S.storenumber = ST.storenumber AND ST.state = Texas GROUP BY S.customer
HAVING sum(S.price) > 200
Explain in one sentence what this query computes.
file chosen
seliF esoohC
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new
Submit Final Exam | Gradescope
Q7.2 Problem 5 (b) 4 Points
Now consider the following four SQL statements, executed consecutively on the above relations. Give the result after the second and after the last query. Also, describe in one sentence what each of the queries does. (You can assume that all of these statements are correct SQL statements – no tricks.)
CREATE VIEW X AS
SELECT S.storenum, S.product, avg(S.price) AS avgprice FROM SALES S
GROUP BY S.storenum, S.product
WHERE X.product = ’TV’
INSERT INTO SALES VALUES (“ ”, “TV”, 349, 1082)
WHERE X.storenum = 1082
You can upload file(s) here for Problem 5(b) if you prefer to upload screenshot/scan of handwriting instead of text box.
Please select file(s) Select file(s)
Enter your answer here
Enter your answer here
Submit Final Exam | Gradescope
https://www.gradescope.com/courses/101776/assignments/507135/submissions/new 18/18
Submit & View Submission
程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com