CS计算机代考程序代写 Excel SQL database ER Programmes:

Programmes:
MSc Computer Science / MSc Information Science (Data Analytics)
Module Code:
KC7013
Module Title:
Database Modelling
Distributed on:
Thursday 25th March 2021
Submission Time and Date:
Wednesday 19th May 2021 by 23.59 GMT
Date by which Work and Feedback will be returned to Students:
Wednesday 16th June 2021
Weighting
This coursework accounts for 50% of the total marks for this module
Submission of Assessment
Electronic Management of Assessment (EMA): Please note your assignment is to be submitted electronically and it will be submitted online
It is your responsibility to ensure that your assignment arrives before the submission deadline stated above. See the University policy on late submission of work.

Instructions on Assessment:

• You are expected to produce a word-processed answer to this assignment. Please use Arial font and a font size of 12. For SQL code and output, you can use courier new font, which preserves SQL format and layout.
• You are required to use the Harvard Style of referencing and citation. The “Cite them right” guide is recommended for referencing and citation (Pears and Shields, 2008) which should be followed throughout your answer especially Part 3. A good alternative is Northumbria (2019). Please do not include references to lecture notes.

Mapping to Programme Goals and Objectives

This assignment covers the following learning outcomes for the module:

Knowledge & Understanding:

• Key concepts of data warehousing.

Intellectual / Professional skills & abilities:

• Conceptual data modelling, relational database design and implementation in SQL & PL/SQL, and object-based databases.
• Design and Implementation of a data warehouse using Oracle database system.

Assessment Regulations

You are advised to read the guidance for students regarding assessment policies. They are available online here.

Late submission of work

Where coursework is submitted late without approval, after the published hand-in deadline, the following penalties will apply.

For coursework submitted up to 1 working day (24 hours) after the published hand-in deadline without approval, 10% of the total marks available for the assessment (i.e.100%) shall be deducted from the assessment mark.

Coursework submitted more than 1 working day (24 hours) after the published hand-in deadline without approval will be regarded as not having been completed. A mark of zero will be awarded for the assessment and the module will be failed, irrespective of the overall module mark.

These provisions apply to all assessments, including those assessed on a Pass/Fail basis.

The full policy can be found here.

Students must retain an electronic copy of this assignment (including ALL appendices) and it must be made available within 24hours of them requesting it be submitted.

Academic Misconduct

The Assessment Regulations for Taught Awards (ARTA) contain the Regulations and procedures applying to cheating, plagiarism and other forms of academic misconduct.

The full policy is available at here

You are reminded that plagiarism, collusion and other forms of academic misconduct as referred to in the Academic Misconduct procedure of the assessment regulations, which are taken very seriously. Assignments in which evidence of plagiarism or other forms of academic misconduct is found may receive a mark of zero.

Criteria for success:

For textual components :
80-100% – The description will excellently cover all the specific topics requested. The written work will be fluent, clearly presented and of out-standing quality.
70-79% – The description will comprehensively cover all the specific topics requested. The written work will be fluent and clearly presented and of distinctive quality.
60-69% – The student will show a very good knowledge of the specific topics, with very good presentation skills and quality.
50-59% – The student will show an above average knowledge of the specific topics, with above average presentation skills and quality.
40-49% – There will be an adequate description of a significant proportion of the topics requested. There will be no major failures in presentation clarity.
Less than 40% – There will be little or no information conveyed in an intelligible manner on the specific topics requested.

For ERD, SQL and other database technical components:
80-100% – The students will produce exceptional models (conceptual/logical/physical), and will demonstrate the use of notation/language, which have outstanding syntactic accuracy (e.g., adhering to ER modelling standards, error free SQL code), with exceptional semantic relevance (e.g., are relevant to the requirements of the particular scenario).
70-79% – The students will produce fully complete models (conceptual/ logical/ physical), and will demonstrate the use of notation/language, which have high syntactic accuracy, with high semantic relevance.
60-69% – The students will produce almost complete models (conceptual/ logical/ physical), and will demonstrate the use of notation/language, which have appropriate syntactic accuracy with reasonably well semantic relevance.
50-59% – The students will produce fairly complete models (conceptual/ logical/ physical), and will demonstrate the use of notation/language, which have adequate syntactic accuracy with reasonable semantic relevance.
40-49% – The students will produce models, and will demonstrate the use of notation/language, which have some syntactic accuracy and semantic relevance.
Less than 40% – The students will not produce sufficient models, and/or will be unable to demonstrate the use of notation/language with significant syntactic accuracy and/or significant semantic relevance.

Assessment Background and Scenario

This assessment is based on two scenarios as follows:

• An online ordering system called MyMarketPlace is detailed in Appendix 1.

• The Sales History (SH) Data Warehouse scenario. SH is a sample database schema provided by Oracle, which has been extensively used in the Oracle’s Data Warehousing Guide (Lane, 2013). The details of this scenario are provided in Appendix 2.

Assignment Questions

Part 1 (50 marks)

Using SQL and PL/SQL of the Oracle Database System (10g/11g/12c), your task is to provide an object-relational database design and implementation for the online ordering system called MyMarketPlace, the requirements of which are explained in Appendix 1.

You should then write two PL/SQL procedures:

• The first to allow the user to input information for an order and n number of order items / sub-orders (where 2 ≤ n ≤ 10) as parameters (You should make it general but if you use fixed parameters for just two order items, then you will get a maximum of half marks for this procedure).
• The second to allow the user to input a product code as a parameter, then retrieve and display name of the product and information (order and sub-order number, quantity ordered and cost charged, and the related customer) about the orders that include the given product (this will need a cursor to handle the multiple orders retrieved).

NOTE the following important points when answering Part 1:
• You must provide a conceptual/logical structure of your database (4 marks) and all the SQL code used for creating (20 marks), populating (8 marks) and manipulating your database using PL/SQL procedures (14 marks).
• You must populate your database with some sample data to test your procedures.
• You must provide the output of running your code (e.g., spool file) in electronic form (4 marks).

Part 2: Data Warehousing Tasks (50 Marks)

This part is based on the Sales History scenario as described in Appendix 2.

You must submit all the SQL queries and any other code that you wrote in answering any of the tasks / questions (e.g., the use of Explain Plan statements for the queries and their SPOOL outputs).

• Study the index definitions in sh_idx.sql. Discuss in detail (using cost-based analysis) why these indexes (at least two of them) are useful for answering queries over the SH2 and DWU versions of the database. You should not run the sh_idx.sql script at all under any circumstances as it will interfere with your cost-based analysis of query performance in Part 2 of this assignment.
(10 marks)

• Identify two new indexes and justify why they could be useful. Write the SQL code for creating these indexes under your DWU account. Give example queries with cost-based analysis for both DWU account (which will have the new indexes) and SH2 shared schema (which will NOT have any of your new indexes). Alternatively, you may choose to run the same queries on your DWU account before and after creating your proposed three indexes
(10 marks)

• Given the two materialized views (MVs) defined in sh_cremv.sql, discuss in detail why these MVs are useful for users of the SH database. You should provide detailed examples of cost based analysis, e.g., using Explain Plan for running sample queries on both SH2 and DWU to illustrate your answer. You should not run the sh_cremv.sql script at all under any circumstances as it will interfere with your cost-based analysis of query performance in Part 2 of this assignment.
(10 marks)

• Identify two new MVs based on the base tables in the SH schema under your DWU account and justify why they would be useful for the users of your data warehouse. Write the SQL code for creating these MVs. Moreover, run sample queries on both SH2 and DWU to ensure that queries running on DWU will be re-written by Oracle to use your proposed three MVs instead of the base tables used in the sample queries. Note that you must not query your MVs directly in the FROM clause; let the Oracle Query Optimizer re-write the queries and answer them using your proposed MVs.
(10 marks)

• Prior to the introduction of the aggregation function ROLLUP, there was no possibility to express an aggregation over different levels within one SQL statement without using the set operation UNION ALL. Every different aggregation level needed its own SQL aggregation expression, operating on the exact same data set n times, once for each of the n different aggregation levels. With the introduction of ROLLUP in the recent editions, Oracle provided a single SQL command for handling the aggregation over different levels within one single SQL statement, not only improving the runtime of this operation but also reducing the number of internal operations necessary and reducing the workload on the system.

• Using ROLLUP, write an SQL query over the SH schema under your DWU user involving one fact table (SALES or COSTS) and at least two dimension tables. Provide reasons why your query may be useful for users of the SH data warehouse. Provide output of successful execution of your query.
(3 marks)

• Using set operation UNION ALL (and not ROLLUP), write an SQL query that produces the same result as the query in (a) above. Provide output of successful execution of your query.
(3 marks)

• Using EXPLAIN PLAN, provide a detailed discussion analysing costs of evaluating the above queries (i.e. with and without ROLLUP).
(4 marks)

Hand-in procedure

You should hand in your answer for this assignment as a single word-processed document electronically on the Blackboard. A sample assignment template file will be uploaded on the Blackboard near the submission deadline.

Appendix 1

Online Ordering System – MyMarketPlace
• A sample MyMarketPlace invoice is provided in the appendix.
Appendix 1.

The requirements, for the MyMarketPlace system, are that the following information should be held in the database:

• Data on each order, including order number, name and contact details (email, addresses, telephones, etc) of customer and date the order was placed and order total;
• Data on each item within the order including product code, quantity ordered, its condition, unit price, and cost charged together with any postage and packing cost;
• Data on each seller who sells different items (e.g., name, profile, URL, rating, etc);
• Data on despatch and delivery of the items / sub-orders, and
• Data on each product in the product catalogue including product code, name, description, and category.
• Data required for uniqueness / identification of different entities in the system, which are required but may not be so obvious from the invoice.

A sample MyMarketPlace invoice is provided on the next page.

MarketPlace Invoice
Ordering Information:
E-mail Address:  ABC@gmail.com
Invoice Address:
abc
Newcastle upon Tyne, Tyne and Wear,
NE99 9PN, United Kingdom
Delivery Address:
abc
Newcastle upon Tyne, Tyne and Wear,
NE99 9PN, United Kingdom

Order Date: 09/Dec/2020 Order Grand Total: £219.95
Order Summary:
Your order has been divided into 3 separate sub-orders.
Sub-Order #1: Vodafone Limited

Sub-Order #:
302-6130531-6774743
Delivery Method:
Standard Delivery
Subtotal of Items:  
£106.98
Postage & Packing:  
£0.00

Total:  
£106.98

Total for this order:  
£106.98
 
Delivery estimate: 12 Dec 2020 – 16 Dec 2020
Dispatch estimate for these items: 09 Dec 2020 – 12 Dec 2020

2
“Pink Samsung Genio S3650 Mobile Phone on Vodafone PAYG”
Unit Price: £53.49, Condition: New,  Sold by: Vodafone Limited

Sub-Order #2: emartbuyuk

Sub-Order #:
302-9084298-7718724
Delivery Method:
Standard Delivery
Subtotal of Items:  
£1.99
Postage & Packing:  
£2.00

Total:  
£3.99

Total for this order:  
£3.99
 
Delivery estimate: 12 Dec 2020 – 14 Dec 2020
Dispatch estimate for these items: 09 Dec 2020 – 11 Dec 2020

1
“EMARTBUY SAMSUNG S5230 TOCCO LITE PINK FLIP CASE/COVER/POUCH”
Unit Price: £1.99, Condition: New,  Sold by: emartbuyuk

Sub-Order #3: : Prepaymania

Sub-Order #:
302-4206613-2584323
Delivery Method:
Standard Delivery
Subtotal of Items:  
£106.98
Postage & Packing:  
£2.00
 

Total:  
£108.98
Total for this order:  
£108.98

Delivery estimate: 13 Dec 2020 – 17 Dec 2020
Dispatch estimate for these items: 10 Dec 2020 – 13 Dec 2020

2
“Pink Samsung Genio S3650 Mobile Phone on Vodafone PAYG”
Unit Price: £53.49, Condition: Refurbished,  Sold by: Prepaymania

Appendix 2

The Sales History (SH) Data Warehouse

SH is a sample database schema provided by Oracle, which has been extensively used in the Oracle’s Data Warehousing Guide (Lane, 2013). The SH schema, as shown in Figure 3, consists of a big fact table, SALES, and five relatively small dimension tables: TIMES, PROMOTIONS, CHANNELS, PRODUCTS and CUSTOMERS. The additional COUNTRIES table linked to CUSTOMERS creates a simple snowflake. The model and the attributes are aimed at demonstrating data warehousing functionality like star transformation and query rewrite. They do not necessarily represent the optimal approach for this kind of data warehouse in real productive environments; and such a design would be driven more by business requirements than by the star itself. Table 1 shows the cardinalities of the tables in the SH schema.

Table
Number of Rows
Customers
50,000
Countries
19
Products
10,000
Sales
10,16,271
Costs
7,87,766
Times
1,461
Promotions
501
Channels
5
Table 1: Cardinalities of SH tables (as per SH2 implementation).

Figure 3: SH Star Schema.

Usually, a star schema includes a single fact table and few dimensional tables, however, the SH schema designer identified another useful fact table called COSTS, which is linked to the TIMES and PRODUCTS dimensional tables. Figure 3 shows two fact tables: SALES and COSTS.

Two versions of SH schema/database

• The default and optimized version of the SH database is created under the SH2 user. You have read-only access to query any of the tables of SH database by prefixing any table or view name by “SH2.”, e.g., to query the contents of CHANNELS table, use:

SELECT * FROM SH2.CHANNELS;

• An un-optimized version of the SH database has been created under your own username DWU. You will be given a special username and password for doing this part of the assignment (i.e. an Oracle account starting with DWU and followed by a number e.g. DWU1, DWU2, etc). Throughout this part, you MUST use this special username. However, it is assumed that you will not create any indexes on any of the tables in your own personal version of the SH until asked to do specifically. Moreover, neither materialized views (see sh_cremv.sql) nor dimensional hierarchies (see sh_hiera.sql) should be created in your version of the SH database.

Important Note

Whenever, you need to assess the performance of existing database structures (e.g., materialized view, index, dimensional hierarchies) you must refer to SH2 as above (using the “SH2.” prefix). Any new database structures that you need to create should be created under your username. Likewise, when you need to assess the performance of any new database structures that you create under your DWU username, you need to prefix all tables/views/materialised views accordingly, e.g., to query the contents of your CHANNELS table, use:

SELECT * FROM DWU1.CHANNELS;

References

Kannan, P.K. (2017) Oracle Database Object-Relational Developer’s Guide, 12c Release 2 (12.2). Available at: https://docs.oracle.com/en/database/oracle/oracle-database/12.2/adobj/index.html (Accessed: 17 January 2020).

Morin, L. (2017) Oracle Database PL/SQL Language Reference, 12c Release 1 (12.1). Available at: https://docs.oracle.com/database/121/LNPLS/title.htm (Accessed: 17 January 2020).

Northumbria (2020) Academic Regulations for Taught Awards. Available at: https://www.northumbria.ac.uk/about-us/university-services/academic-registry/quality-and-teaching-excellence/assessment/guidance-for-students (Accessed: 4 October 2020).

Pears, R. and  Shields, G. (2008) Cite them right: the essential referencing guide. 
Newcastle upon Tyne: Pear Tree Books. Available at: https://www.citethemrightonline.com/ (Accessed: 4 October 2020).