程序代写代做代考 database INFO20003: Database Systems

INFO20003: Database Systems
Dr Renata Borovica-Gajic
Lecture 24
Overview, sample exam questions Part II
Week 12
INFO20003 Database Systems © University of Melbourne 1

Query Processing –formulae
• QueryProcessingCostFormulaeontheLMS
INFO20003 Database Systems © University of Melbourne 3

Query Processing 1/2
Consider relations Employees, Orders and OrderDetails. Imagine that relation Employees has 1,000 pages, relation Orders 5,000 pages, and relation OrderDetails 10,000 pages. Each page stores 100 tuples, and neither relation has any indexes built on it. Consider the following query:
SELECT *
FROM Employees as E, Orders as O, OrderDetails as OD
WHERE E.empid = O.empid AND O.orderid = OD.orderid;
Compute the cost of the plan shown below. NLJ is a Page-oriented Nested Loops Join. Assume that empid is the candidate key of Employees, orderid is the candidate key of Orders, and 100 tuples of a resulting join between Employees and Orders can fit on one page.
INFO20003 Database Systems © University of Melbourne 4

Query Processing 2/2
Consider the query presented below. Does the following equivalence class hold? Yes/No and Why?
SELECT firstname, lastname
FROM Employees NATURAL JOIN Orders NATURAL JOIN OrderDetails WHERE quantity > 5 AND freight < 100 INFO20003 Database Systems © University of Melbourne 5 Normalization The table shown below is part of an office inventory database. Identify the design problems and draw a revised table structure in 3rd Normal Form (3NF) that corrects those problems. For each step explicitly identify and discuss which normal form is violated. Item ID is the candidate key for this table. Item ID determines Description, Quan, Cost/Unit and Dept, while Dept determines Dept Name and Dept Head. Item ID Description Dept Dept Name Dept Head Quan Cost/Unit Inventory Value 4011 5 ft desk 4020 File cabinet 4005 Executive chair 4036 5 ft desk MK Marketing MK Marketing MK Marketing ENG Engineering Ahmad Rashere 5 200 1000 10 75 750 5 100 500 7 200 1400 Jane Thompson Jane Thompson Jane Thompson INFO20003 Database Systems © University of Melbourne 7 Data Warehouse You are making a data warehouse for a real estate agency. The company wants to track information about the selling of their properties. Whenever a buyer comes into the office an agent takes that person to a number of properties and deals with the buyer. This warehouse keeps information about the agents (real estate license#, first name, last name, phone #, and branch office), buyers that come in (buyer id, first name, last name, phone #, max price), and property (property#, property address, number of rooms, pool, owner id). The information managers want to be able to find is the number of times a property is viewed, sales price and commission. Sales commission is additional compensation the agent receives for exceeding expectations. The information needs to be accessible by agent, by buyer, by property and for different time (day, week, month, quarter and year). Draw a star schema to support the design of this data warehouse. INFO20003 Database Systems © University of Melbourne 8 Data Warehouse - Solution INFO20003 Database Systems © University of Melbourne 9 DW questions • Whatisandhowtoidentifybusinessprocess – What the study is talking about (short couple words description) – E.g. “weather forecasting”, “real-estate sales” • Howtochoosegrain – Look at each dimension individually and see what is the finest needed level per that dimension (measurements are at the intersection of all finest) – E.g. if monitoring rainfall and we need to be able to report hourly, daily, weekly and monthly rainfall – we need to store hourly rainfall as a measurement. We can derive the value of courser granularities such as weekly from hourly measurements, but can’t do the opposite (from weekly get hourly) INFO20003 Database Systems © University of Melbourne 10 Database admin Vine is a social media sharing service where users can host 6 second video clips within multiple categories (e.g. “Comedy”, “Science”, “Social”). Part of the database schema for the Vine service is given below. There are 15 different categories that users can share videos about and 1 million users to start with. A user posts 5 videos on average per month. Assume that the average storage requirement for the BLOB data type is 20,000 bytes. Estimate the disk space requirements only for the Video table at go-live and after one month of operation. Storage Requirement for different data types Data Type Storage requirements (bytes) INT 4 BLOB 65,535 (Max) DATETIME 8 INFO20003 Database Systems © University of Melbourne 11 Next lecture – no more THANK YOU!!!!! INFO20003 Database Systems © University of Melbourne 15