CS代写 FIT3003 – Business Intelligence and Data Warehousing

INFORMATION TECHNOLOGY
FIT3003 – Business Intelligence and Data Warehousing
Week 3b – Average in Fact Semester 1, 2021
Developed by:

Recall – Star Schema Components
▪ There are Three main components of the Star Schema: 1. Facts
2. Dimensions 3. Attributes

Recall – Fact
▪ A Fact Table consists of key attributes from each dimension, and fact measures.
▪ A Fact Table is created by a join operation, that joins several tables from the operational database.
▪ Fact tables are created either through TempFact or directly retrieval from the tables in the operational database.
▪ The fact measure itself is an aggregated value.
➢ In the SQL command, the fact measure attribute in the Fact Table is created using an aggregate function, such as count or sum, and the group by operation.

Average in the Fact

Average in the Fact
Unit Code Unit Title Semester Student First Name
IT001 Database 1 Mirriam

Average in the Fact
Unit Code Unit Title
IT001 Database
Semester Student First Name

The operational database contains:
• 9 records of Semester one
• 7 records of Semester two

Average in the Fact
Unit Title Semester Student First Name
Database 1 Mirriam

The operational database contains:
• 9 records of Semester one
• 7 records of Semester two
• 8 records of Database Unit
(6 Semester one and 2 Semester two)

Average in the Fact
(b) Subject Dimension
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
(b) Subject Dimension
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
Average Score for the Database Unit in:
• Semester one: (81+41+74+85+87+75)/6 = 73.833
(b) Subject Dimension
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
Average Score for the Database Unit in:
• Semester one: (81+41+74+85+87+75)/6 = 73.833
• Semester two: (64+32)/2 = 48
(b) Subject Dimension
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
Average Score for the Database Unit in:
• Semester one: (81+41+74+85+87+75)/6 = 73.833
• Semester two: (64+32)/2 = 48
These are actually incorrect!
(b) Subject Dimension
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
Query: Calculate Average Score for the Database Unit: • (73.833 + 48) / 2 = 60.9165
(b) Subject Dimension
The SQL command:
select avg(Average_Score) from EnrolmentFact
where UnitCode = ’IT001’;
Unit Code Semester
Average_Score
(c) Semester Dimension
Unit Title

Average in the Fact
Query: Calculate Average Score for the Database Unit. Calculation using Fact:
• (73.833 + 48) / 2 = 60.9165
Calculation based on the Operational Database: • (81+41+74+85+87+75+64+32) / 8 =
539 / 8 = 67.375 (a) Fact
(b) Operational Database
Unit Code Semester
Average_Score
Unit Title Semester
Database 1
Student First Name Score
Mirriam 81

Database 1

Average in the Fact
Query: Calculate Average Score for the Java Unit in both Semesters. Calculation using Fact:
• (56+71.667) / 2 = 63.833
Calculation based on the Operational Database:
• (65+47+78+73+64) / 5 = 65.4 (a) Fact
(b) Operational Database
Unit Code Semester
Average_Score
Unit Title Semester
Database 1
Database 1
Student First Name Score
Mirriam 81

Mirriam 78

Average in the Fact
Query: Calculate Average Score for Semester One. Calculation using Fact:
• (73.833+56+63) / 3 = 64.287
Calculation based on the Operational Database:
• (81+41+74+85+87+75+65+47+63) / 9 = 68.667 (a) Fact
(b) Operational Database
Average_Score
Unit Title
Student First Name Score
Mirriam 81

Mirriam 78

Average in the Fact
Query: Calculate Average Score for Semester Two. Calculation using Fact:
• (48+71.667+52.5) / 3 = 57.389
Calculation based on the Operational Database:
• (64+32+78+73+64+53+53) / 7 = 59.4286 (a) Fact
(b) Operational Database
Unit Title Semester
Database 1
Student First Name Score
Mirriam 81
Database 1

Unit Code Semester
Average_Score

Mirriam 78

Average in the Fact

Average in the Fact
IT001 1 443 6
(b) Subject Dimension
Unit Title
(a) Fact Version 2
Unit Code Semester Total_Score Number_of_Students
(c) Semester Dimension

Average in the Fact
(a) Fact Version 2
Query: Calculate Average Score for the Database Unit:
• (443+96)/(6+2) = 67.375
The SQL command:
select sum(Total_Score)/ sum(Number_of_Students) as Average_Score
from EnrolmentFact2
where UnitCode = ’IT001’;
Unit Code Semester Total_Score Number_of_Students
IT001 1 443 6

Average in the Fact
▪ The problem of Average in the Fact is known as the Average of an Average problem.
➢ This problem is well known in Mathematics and Statistics.
➢ Average of an average will simply produce an incorrect average result.
▪ Hence, it is not desirable to have an average measure in the fact.
➢ Exceptional case: when the analysis ALWAYS uses all the dimensions (e.g. Determinant Dimensions).

Min & Max in the Fact
▪ If Average should not be used in the fact, how about Min or Max? ➢ Yes, we can.
➢ Because Max of Max is always a global Max, and Min of Min is always a global Min.
(b) Subject Dimension
Ucode Semester Min_Score Max_Score
IT001 1 41 87
(c) Semester Dimension
Unit Title

Min & Max in the Fact
▪ Query: Find the Maximum Score of Database Unit. • Max of {87, 64} is 87.
▪ The SQL command:
select max(Max_Score) from EnrolmentFact where UnitCode = ’IT001’;
(b) Subject Dimension
Ucode Semester Min_Score Max_Score
(c) Semester Dimension
IT001 1 41 87
Unit Title

Min & Max in the Fact
▪ Query: Find the Minimum Score of Database Unit. • Min of {41, 32} is 32.
▪ The SQL command:
select min(Min_Score) from EnrolmentFact where UnitCode = ’IT001’;
IT001 1 41 87
(b) Subject Dimension
Ucode Semester Min_Score Max_Score
(c) Semester Dimension
Unit Title

Average in the Fact – Conclusion
▪ Average in the Fact is not desirable, although technically it satisfies the two criteria of the fact (e.g. must be a numerical and aggregate value).
▪ Min and Max in the Fact can still be used, since and are valid fact measures (e.g. they are numerical and aggregated values).
▪ In general, count and sum are more common.