Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Practice Exam Quiz
Started: Nov 5 at 17:20
Quiz Instructions
INFO20003 Practice Exam
This is a practice exam which is representative of the types of questions / answer formats you might encounter in the final exam.
This practice exam has been set to have unlimited time, however in the real exam you would have 180 minutes + 15 mins for uploads. We recommend taking this quiz under timed conditions once you have revised the subject material, as it will give you a good indication of your preparedness for the exam. There is another ‘sample exam’ in the modules section which you might consider using for earlier study / practice, and then make use of this quiz when you are further into your study.
Answers will be immediately given for the questions which are automatically marked (eg multichoice), and answers to the other questions are available in the modules section of the LMS.
Note that there are several questions which require an answer to be attached as a scan/screenshot. You may scan all of the working for these questions together into one image/document if this is easier than scanning each question individually. Please upload this combined document at the top of
the exam (in question 1).
Question 1 0 pts
There are several questions which require an answer to be attached as a scan/screenshot. Instead of scanning/uploading a document for each question, you may scan all of the working for these questions together into one image/document and attach it below if you prefer.
The following questions should be included: – Modelling
– Relational Algebra [ 2 x questions ]
– Data Warehousing Modelling
Upload
Choose a File
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:1/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 2 2 pts
The inconsistent retrieval problem is primarily the result of the absence of which ACID property?
Isolation Consistency Durability Atomicity
Question 3 2 pts
A database has a table ‘customer’ with primary key ‘id’.
For each of the usage scenarios of this table, select the BEST configuration for file organisation + index:
Data frequently inserted into the table. The following query is run frequently : SELECT * FROM customer
Data rarely inserted into table. The following type of query is run frequently: SELECT * FROM customer WHERE id > 200
[ Choose ]
[ Choose ]
[ Choose ]
Data is frequently inserted into the table. The following query is run regularly: SELECT * FROM customer WHERE id = 200;
Question 4 2 pts
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:2/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 5 2 pts
Which of the following is a correct description of a distributed database?
There are several logical databases, which are physically split across many locations but all connected
There are always many database management systems (DBMS) storing the data, which are connected together
Three are separate and unconnected instances of the database in many locations
There is only one logical database, which is physically split across many locations but all connected
Which of the following is FALSE?
Pipelining is the technical term for the use of a left-deep query plan instead of a right-deep query plan
Left and right deep plans both allow for efficient use of pipelining
Pipelining is the direct ‘streaming’ in memory of the output of one operation (eg a join) as the input of another operation (eg another join) without writing the output to disk
Right-deep query plans are not evaluated because there is always an equivalent left-deep plan
Question 6 2 pts
We apply a database model (e.g., the relational model) to our database design at which point in the lifecycle?
As part of the physical -> implementation transition
As part of the logical – > physical design transition
As part of the requirements definition -> conceptual design transition
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:3/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 7 2 pts
Which of the following is the best description of a transaction (in the context of a relational database management system)?
A grouping of SQL statements which must be entirely completed or aborted Any change made to the database state
Prevents concurrent access to the database by restricting usage to a single individual at a time, thus preserving data integrity
An SQL statement, such as an UPDATE of a row, which must be durable (changes to the row are never lost)
As part of the conceptual -> logical design transition
Question 8 15 pts
A record label company wants to upgrade its albums database and it hires you to do the job. The relevant information is as follows:
They want to keep information about singers, their names (which is unique), active since, whether it’s a band or not.
For each album, the singer (band), the year it was recorded, its unique title, its cover photo, and its price must be stored.
Genres and their hierarchy are also stored. Some genres have emerged from others. For example, a genre can be the parent of 0 or many genres, and another genre can be the child of 0 or 1 genres. Each genre is identified by a unique name.
A given album may belong to more than one genre, which could be at any level in the hierarchy.
Songs of each album are to be stored. For each song, the company stores its title and duration, and the album/albums it belongs to, i.e., all songs must be in some album, but can be in many albums.
Finally, the record company keeps information about the managers. For each manager, the company keeps the name, address, the total amount of dollars owed. Managers can manage multiple artists and genres. We still store the managers even if
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:4/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
they are no longer managing genres/artists.
Draw a conceptual model using Chen’s notation for the record label company. Be sure to indicate the various attributes of each entity and relationship set; also specify the key and participation constraints for each relationship set. Be sure to write down any assumptions you make.
Remember you may upload the scan of this question below, or combine and scan all handwritten questions together (see the top of the exam, question 1)
Upload
Choose a File
Question 9 8 pts
Write the CREATE TABLE statements resulting from the following conceptual model:
Be sure to specify primary and foreign keys. You MUST specify whether the fields are NOT NULL, and whether a UNIQUE constraint is needed. Where the data type is not obvious, feel free to choose an appropriate data type.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:5/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
p 0 words >
Australia has carried out several censuses. The following diagram shows the tables corresponding to preferred means of transportation, residency location, and origins and destinations of the frequent trips.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:6/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Write a single SQL statement to correctly answer each of the following questions. DO NOT USE VIEWS or VARIABLES to answer questions. Query nesting is allowed.
Question 10 2 pts
How many different people, across all censuses, have said they live in Brunswick?
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:7/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
p 0 words >
Question 11 3 pts
List the name of the transport means and the number of people who use each, according to the 2016 census, ordered from the most to the least preferred.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:8/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
p 0 words >
Question 12 4 pts
It was found that the suburbs that are the worst affected by COVID-19 are Brunswick, Epping, and Port Melbourne. Calculate how many different people, according to the 2016 census, frequently travels from or to those suburbs.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:9/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
p 0 words >
Question 13 6 pts
Of people who prefer public transport, in which suburb do the largest number of different people live, according to the 2016 census?
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:10/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Based on the same diagram above (i.e., Australia censuses), write a relational algebra expression for each of the following questions:
Remember you may upload the scan of these questions below, or combine and scan all handwritten questions together (see top of the exam, question 1)
Question 14 2 pts
List the DOB of people whose preferred means of transportation is public transport, according to the 2016 census.
Upload
Choose a File
p 0 words >
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:11/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 15 3 pts
List the DOB of people who prefer public transport and frequently travel to ‘Brunswick’, according to the 2016 census.
Upload
Edit View Insert Format Tools Table 12pt Paragraph
Choose a File
Question 16 6 pts
Consider two relations called Customer and DocumentHeader. Imagine that the relation Customer has 250 pages and DocumentHeader has 1000 pages. Consider the following SQL statement:
SELECT *
FROM Customer LEFT JOIN DocumentHeader
ON Customer.CID = DocumentHeader.CID
WHERE Customer.CName = ‘ACME’;
There are 502 buffer pages available in memory. Both relations are stored as simple heap files. Neither relation has any indexes built on it.
Evaluate block-oriented NLJ, SMJ and HJ using the number of disk I/O’s as the cost. Provide the formulae you use to calculate your cost estimates.
Which join would be the optimizer’s choice? Consider Customer as the outer relation in all alternatives. Assume that sorting can be performed in two passes for both relations. All selections are performed on-the-fly after the join.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:12/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
p 0 words >
Question 17 2 pts
Would the cost of Sort-Merge Join have changed if we would have had an unclustered B+ tree index created on the attribute in the WHERE condition? If yes, explain how (no need to recalculate). If no, explain why.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:13/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Consider a relation called Loans that stores information about loans given by a bank. Imagine that the relation Loans consists of 800 pages and each page stores 50 tuples. Imagine that the catalog manager provides statistics about the distribution of some attributes of this relation as the following:
Type: {‘A’: 10%, ‘B’: 30%, ‘C’: 45%, ‘D’: 15%}
ApprovalYear: {2018: 10%, 2019: 60%, 2020: 30%}
Suppose that the following SQL query is executed frequently using the given relation:
SELECT Type
FROM Loans
WHERE Type IN (‘B’, ‘C’)
AND ApprovalYear > 2019;
p 0 words >
Question 18 2 pts
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:14/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Compute the estimated result size (in number of tuples) for the query, and the reduction factor of each predicate.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 19 2 pts
Compute the estimated cost of plan alternatives and state which plan is the best assuming that an unclustered B+ tree index on (ApprovalYear) is (the only index) available. Suppose there are 100 index pages.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:15/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 20 2 pts
What would happen if our query changed and became:
SELECT *
FROM Loans
WHERE 2018 < ApprovalYear
AND ApprovalYear < 2020
AND Type IN ('B', 'C');
Will this change impact the result size? If yes, calculate the new result size. If no, explain why.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:16/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Consider three relations called Account, AccountDetail, and TransactionType. Imagine that the relation Account has 1000 pages, AccountDetail has 10000 pages,
and TransactionType has 100 pages. Each page stores 50 tuples.
The AccountType attribute has values in the set {'Savings', 'CreditCard'}. There is a clustered B+ tree index on AccountType, which is 50 pages in size. Consider the following query:
SELECT *
FROM Account AS A, AccountDetail AS AD, TransactionType AS TT
WHERE A.Account_ID = AD.Account_ID AND AD.TranType_ID = TT.TranType_ID
AND A.AccountType = 'Savings';
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 21 4 pts
Compute the cost of the plan shown below. NLJ is a Page-oriented Nested Loops Join. Assume that Account_ID is the candidate key of Account, TranType_ID is the candidate
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:17/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
key of TransactionType, and 100 tuples of a resulting join between Account and AccountDetail can fit on one page.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 22 8 pts
Compute the cost of the plan shown below. Assume that SMJ can be done in 2 passes. Assume that Account_ID is the candidate key of Account, TranType_ID is the candidate
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:18/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
key of TransactionType, and 100 tuples of a resulting join between Account and AccountDetail can fit on one page.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 23 3 pts
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:19/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Would the cost of the plan presented in the previous question change if the selection on AccountType happened between HJ and SMJ, or after SMJ? Explain (intuitively) which out of the three possibilities would be the cheapest: a) the original plan presented in the previous question, b) the plan with selection happening between HJ and SMJ, or c) the plan with selection happening after SMJ as the last operation in the plan. You do not need to calculate the cost of all three alternatives, an explanation in 2-3 sentences is sufficient.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
You work in the IT department at CBA. The relation LoanSchedules below contains the amortization schedules of the loans approved by the bank:
DueDate
Status
Principal
Interest
Balance
LoanNum
ApprovalDate
Score
Client
CID
31/01/2019
paid
100
12
1007
4885
01/01/2019
B1
James B
7846564
28/02/2019
partial, past due
101
11
950
4885
01/01/2019
B1
James B
7846564
31/01/2019
paid
50
40
4561
4886
02/01/2019
A2
Ben R
8758155
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:20/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
28/02/2019
paid
52
38
4509
4886
02/01/2019
A2
Ben R
8758155
31/12/2018
paid
1550
450
10020
4710
01/12/2018
B2
James B
7846564
31/01/2019
partial, past due
1565
435
10000
4710
01/12/2018
B2
James B
7846564
28/02/2019
past due
1580
420
10000
4710
01/12/2018
B2
James B
7846564
The set {LoanNum, DueDate} is a candidate key for this relation. The following functional dependencies hold for this relation:
LoanNum, DueDate –> Status, Principal, Interest, Balance LoanNum –> CID, ApprovalDate, Score
CID –> Client
Question 24 5 pts
Could any anomalies arise in the table above? If yes, discuss which anomalies and problems may appear in this table. Provide examples.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:21/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 25 10 pts
Normalize the table above to 3rd Normal Form (3NF). For each step, explicitly identify which normal form is violated and briefly explain why. Write the normalised tables in textual format, as in:
TableName (PrimaryKey, Column, ForeignKey(FK)) AnotherTable (PrimaryKey, Column, AnotherColumn)
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:22/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 26 2 pts
Normalisation prevents the anomalies as discussed above, however, some databases are deliberately non-normalised.
Explain in 2-3 sentences why/when we might choose to not normalise a database.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:23/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 27 2 pts
You are given this relation:
A
B
C
a1
b1
c5
a2
b1
c5
a3
b2
c5
a4
b1
c5
A possible candidate key is: [ Select ] , and a functional dependency is: [ Select ]
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 28 3 pts
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:24/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
A conceptual model of the production database used by a retailer that sells a small and unchanging range of products is below:
The store estimates that it receives 1000 orders per year, and each order has an average of 3 items in it. To place an order, a customer must have a user account.
Fill in the blanks below:
The fastest-growing table in this model will be the [ Select ] table,
while the slowest-growing will be the [ Select ] table.
If a row in the “Order Item” table has an average size of 100 bytes, and the “Order” table currently has 3000 orders in it, we might estimate that the “Order Item” table is
[ Select ]
in size.
Question 29 3 pts
For the following case study, choose the best option for each of the choices related to the company’s backup:
Acme inc. is a sales company with many locations around the world, each with its own production database. Acme is currently trying to optimise its business process. To do this, it wants to be able to save snapshots of parts of its production database over time, so that these can be analysed by data analysts. The data analysts are unlikely to have access to the same version of the DBMS used by Acme, and also different Acme
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:25/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
locations often have different DBMS versions! The production databases are used very heavily during the day, but are essentially unused overnight.
Logical/Physical
Online/Offline
Full/Incremental
[ Choose ]
[ Choose ]
[ Choose ]
Question 30 6 pts
Alice and Bob are two Unimelb students who are both interested in buying a new Nintendo Switch (to deal with isolation boredom). They’ve both wound up on the
JB Hifi website, where the page lists that there is only 1 left in stock. Both Alice and Bob are trying to reserve the switch so they can go pick it up in store.
The database table might look something like this: StockReservations(StockID, StoreID(FK), ItemTypeID(FK), ReservationName) The timeline of their interactions with the site is as follows:
Time t
Bob
Alice
t =0
Clicks the JB Hifi link from google [reads available unreserved stock]
t=1
Clicks the JB Hifi link from google [reads available unreserved stock]
t=2
Clicks ‘Reserve’ [writes new ReservationName]
t=3
Clicks ‘Reserve’ [writes new
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:26/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Answer the following questions:
1. What is the name for this sort of common problem?
2. Who ends up with their name on the switch reservation after this interaction?
3. Give two possible ways to implement concurrency control to prevent this issue,
and for each solution, indicate which of Bob and Alice will wind up with the Nintendo Switch!
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
ReservationName]
A company wants to create a data warehouse that enables analysing the performance of its resellers, which is measured by the sales amount and number of products sold. The company also wants to know the net income (the difference between the sales amount and the product list price.) The manager wants to be able to filter the report by product (name, description, color, list price), product subcategory (name), and product category (name). Since the company has resellers across the country, it is important to filter by state and city as well. Of course, having a filter by the reseller (name, address) is a must. The report can also be filtered by year, semester, or quarter based on either the order date, due date, or ship date. The employee (SSN, name, DOB) who is in contact with the
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:27/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 31 5 pts
Use Kimball’s four-step dimensional design process for designing a dimensional model to support the design of this data warehouse. Explain what you achieved in each step of Kimball’s model. In particular make sure to discuss the name of the business process, identify and explain dimensions, declare the grain, and identify and explain facts.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
reseller is also an important filter.
Question 32 5 pts
Draw a star schema to support the design of this data warehouse, showing the attributes in each table. Use PK to denote primary key, PFK to denote primary foreign key, and FK to denote foreign key. You do not need to specify data types nor whether the fields are NULL/NOT NULL.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:28/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Question 33 2 pts
Why is “Data Consistency” VERY POOR in a distributed database with “Integrated Partitions”?
DO NOT USE the explanation in the table provided in the lecture slides.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Remember you may upload the scan of these questions below, or combine and scan all handwritten questions together (see top of the exam, question 1)
Upload
Choose a File
Question 34 2 pts
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:29/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Explain in 2-3 sentences how the CAP theorem is related with BASE in NoSQL databases.
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Question 35 2 pts
It seems like Schema-on-Read in NoSQL databases is awesome. However, I bet there should be disadvantages with it. Give one disadvantage and explain why in 2-3 sentences.
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:30/31
Quiz: Practice Exam Quiz 2020/11/5 下午2:27
Remember to upload your answers to modelling + relational algebra + data warehousing at the top of the exam (question 1) if you’ve chosen to scan all questions together.
END OF EXAM
Edit View Insert Format Tools Table 12pt Paragraph
p
0 words >
Saved at 14:26
Submit Quiz
https://canvas.lms.unimelb.edu.au/courses/8200/quizzes/91343/take ⻚码:31/31