INDIVIDUAL ASSIGNMENT 1: DIMENSION MODELING AND PHYSICAL DESIGN (15%)
ISSUE DATE: 3RD DECEMBER 2020 (THURSDAY)
DUE DATE: 17TH DECEMBER 2020 (THURSDAY), 5:00 PM (SUBMIT IN ITALEEMC)
DELIVERABLES
Softcopies of your answers for the following:
1. Part 1 in Full Screen shots of the Employees Database with the tables in Microsoft SQL Server
and Data. (2%)
2. Part 2 in bus matrix and dimensional modeling Excel workbook (saved as matricno-
dimensional-modeling.xlsx). (6%)
3. Part 3 in SQL script file for the Data Mart star schema (saved as matricno-star-schema.sql)
and the star schema diagram. (7%)
Any file created should have your matric number in the filename. Part 1 and star schema diagram are to be in ONE PDF file but please identify them so that I will be able to differentiate the Database and Data Warehouse (Data Mart). Pack all the files into one ZIP file (named as matricno- assignment1.zip).
OBJECTIVES
This assignment will introduce the process of data warehouse design and development for a business process assigned to you. You will design the data warehouse using dimensional modeling design technique better suited for BI applications and data warehouse. You will then implement the physical model of your detailed dimensional design.
The objectives are:
• to enable you to produce dimensional design specifications by creating detailed bus matrix
and dimensional model; and
• to assist you to translate the detailed dimensional design specifications into a star-schema
implementation, including creating tables, primary and foreign keys, and database diagrams.
ASSIGNMENT REQUIREMENTS
For this assignment, you will need the following:
1. Access to Microsoft SQL Server and Microsoft SQL Server Management Studio.
2. Employees Database (Employees Database SQL script file is provided in iTaleemc).
3. Template for Dimensional Modeling Excel workbook (template-bus-matrix-dim-modeling-
workbook.xlsx is provided in iTaleemc).
4. Microsoft Excel for editing the workbook.
DATA WAREHOUSE BUSINESS REQUIREMENTS
As part of the company’s Business Intelligence initiative, the company would like to create a data mart with the following goal:
• Salary reporting. Senior management would like to be able to track salary by employee and department with the goal of establishing which employees are the top earners and which departments spent the most in salaries.
1
PART 1: EMPLOYEES DATABASE
You will use the database provided in iTaleemc. Download the script file of the database and Execute the file in Microsoft SQL Server. The database is in Transact-SQL (T-SQL) script which is used by Microsoft SQL Server. Though the SQL does not differ that much from ORACLE SQL, you are required to explore the SQL functions available in SQL server.
When completed, the Employees database will have four (4) tables. Take Full Screen shots that include 1) the Object Explorer view that shows the table names and database name, 2) server name at the top, and 3) the sample data of the four tables..
Example:
PART 2: DIMENSIONAL MODELING
For the above requirement, you are to start building the first data warehouse (data mart) for the BUSINESS PROCESS.
Connect to your SQL Server using SQL Server Management Studio and open the Employees Database you created in Part 1.
Kimball’s four (4) step modeling process walks us through setting the fact grain and identifying the useable dimensions. The bus matrix and dimensional modeling workbook (template-bus-matrix-dim- modeling-workbook.xlsx) is designed to walk you through this process. Save your work in matricno- dimensional-modeling.xlsx.
2
2. A) HIGH-LEVEL DESIGN
Use the BUS MATRIX for the following:
Step 1: Select Business Process
Name the process following the business requirements provided.
Step 2: Declare Grain
What does it mean to be a single row in the fact table?
Identify the granularity of the fact table or what each row represents.
Step 3: Choose Dimensions
From the attributes in the tables in the database and your understanding of dimensions, identify the dimensions for the data warehouse.
Step 4: Identify Facts
Identify the facts; this depends a lot on the requirements. One important thing to recognize is that not all facts appear among the source data. Some of the facts are derived by doing a “little math” on some of the source data values.
2. B) DETAILED DESIGN
Use DIMENSION TABLE and FACT TABLE TEMPLATES for the following:
For the fact and dimensions related to the assigned business process identified in Part 2. a), complete the detailed dimensional design.
The process you will follow to design a dimension or fact table is outlined in 5 Steps:
1. Create a new dimension (or fact) worksheet in the workbook for the fact and every dimension.
2. Complete the table definition part of the worksheet.
3. Complete the basic column information.
4. Complete the target table information.
5. Complete the source data information.
Step 1: Create a new dimension or fact worksheet
Make a copy of the Dimension worksheet for every dimension you identified. Right-click on it and select Rename from the menu, then type in the Dimension Name (e.g. Customers).
For the fact table, type in the Fact Name.
3
Step 2: Complete the table definition
Complete the table definition for the dimension/fact table. Please complete it as follows:
The name of table Dimension or Fact
Step 3: Complete the basic column information
Complete the basic column information following the explanation below:
Column Name→Physical name of column in the table.
Description→Explanation of the column, for documentation purposes.
Unknown Member→What should be used for an unknown value (in place of NULL). Example Values→What do sample values look like?
Step 4: Complete the target table information
Complete the table definition of your dimension/fact table in your star schema following the explanation below:
Data Type, Size→the SQL Server data type (including size and precision, where appropriate) of the attribute. SQL server data type reference can be found at https://docs.microsoft.com/en-us/sql/t- sql/data-types/data-types-transact-sql
Key→Should be blank if not a key or labeled PK = primary key or FK = foreign key.
4
FK To→When you label an attribute as FK, you need to include a dimension table and its primary key as the referencing column.
NULL→Whether or not the attribute permits null values (N = No, Y = Yes). The better design decision is to provide a default value in place of NULL.
Default Value→A value which should be stored in the event there is no value.
Step 5: Complete the source data information.
Complete the source definition of the dimension/fact table, which will assist you as you complete the ETL implementation in the next assignment, following the explanation below:
Source System→List the source system for the attribute. Derived implies the attribute is calculated. Source Table→State the table where the attribute comes from in the source system.
Source Field Name→The column or columns which supply the attribute. If the column is a calculation, specify that here (ex. OrderQty*Price).
Source Data Type→the data type (including size and precision, where appropriate) of the attribute in the source system.
PART 3: CREATION OF DATA WAREHOUSE STAR SCHEMA
The goal for Part 3 is to implement the star schema designed in Part 2 in SQL and create ONE script file – matricno-star-schema.sql – which when executed will create the schema and tables for the first data warehouse (data mart).
For this assignment, DO NOT load the star schema with data.
Your steps,
1. Create a new database which will be your data warehouse. Name it according to the process.
2. Create a new SQL query.
3. Switch to your data warehouse.
e.g. USE EmployeesDW
4. Use your completed Dimensional Modeling Excel Workbook from Part 2 to start creating your
tables in SQL.
5. There should be a DROP TABLE command before the CREATE TABLE command so that it can
be re-run without error. This is critical to the iterative nature of development, as you might
need to re-run this script after changes.
6. Remember to drop the fact table first, but create it last to take care of the foreign key
dependencies. Either that or drop the foreign keys. 5
7. Remember to execute your script against your data warehouse
8. View the star schema diagram from the Database Diagrams section of your data warehouse.
Copy the diagram to a Word or Powerpoint document. Include in the document after the diagram, screenshots of the data warehouse and tables in the Object Explorer view. Save these together with Part 1 deliverables in PDF.
6