CS计算机代考程序代写 database chain case study PowerPoint Presentation

PowerPoint Presentation

Week 3a – Bridge Tables

Semester 2, 2021

FIT3003 – Business Intelligence
and Data Warehousing

Developed by:
Dr. Agnes Haryanto

Agnes.

MONASH
INFORMATION

TECHNOLOGY

Agenda

1. Bridge Tables

1. Product Sales Case Study

2. Truck Delivery Case Study

Using FLUX

1. Visit http://flux.qa/ on your internet enabled device

2. Log in using your Monash account (not required if
you are already logged in to Monash)

3. Click on the “+” to join audience

4. Enter the Audience Code:

• Clayton – Z9J7MT

• Malaysia – PMYBD6

5. Select FIT3003 in the Active Presentation menu

http://flux.qa/1AW6N8

Bridge Tables

Bridge Tables

▪ A bridge table is a table that links between two dimensions; and only one

of these two dimensions are linked to the fact.

➢ As a result, the star schema becomes a snowflake schema.

Bridge Tables

▪ Two reasons on why a dimension cannot be connected directly to the Fact:

a) The Fact table has a fact measure, and the dimension has a key identity. In

order to connect a dimension to the Fact, the dimension’s key identity must

contribute directly to the calculation of the fact measure. Unfortunately, this

cannot happen if the operational database does not have this data.

b) The operational database does not have this data if the relationship between

two entities in the operational database that hold the information about

dimension’s key identity and the intended fact measure is a many-many

relationship.

Bridge Tables

Case Study #1

Case Study #1 – A Product Sales Case Study

▪ A company management team would like to

analyze the statistics of its product sales

history. The analysis is needed to identify

popular products, suppliers supplying those

products, the best time to purchase more

stock, etc.

▪ A small data warehouse is to be built to keep

track of the statistics.

▪ The management is particularly interested in

analyzing the total sales (quantity * price) by

product, customer suburbs, sales time

periods (month and year), and supplier.

Case Study #1 – A Product Sales Case Study

▪ The management is particularly interested in

analyzing the total sales (quantity * price) by

product, customer suburbs, sales time

periods (month and year), and supplier.

Case Study #1 – A Product Sales Case Study

▪ The management is particularly interested in

analyzing the total sales (quantity * price) by

product, customer suburbs, sales time

periods (month and year), and supplier.

▪ Sales Star Schema

➢ Fact:

• Total Sales

➢ Dimensions:

• Product

• Customer locations/suburbs

• Time period

• Supplier

▪ Possible Two-Column Methodology Tables:

Case Study #1 – A Product Sales Case Study

ProductNo TotalSales

A1 $130,000

B2 $15,900

C3 $2,500,000

… …

TimeID TotalSales

201801 $25,000

201802 $4,700

201803 $3,500

… …

Suburb TotalSales

Caulfield $6,500

Chadstone $12,000

Clayton $1,800

… …

(a) Product point of view (b) Time point of view (c) Suburb point of view

Case Study #1 – A Product Sales Case Study

▪ Sales Star Schema

➢ Fact:

• Total Sales

➢ Dimensions:

• Product

• Customer locations/suburbs

• Time period

• Supplier

Case Study #1 – A Product Sales Case Study

Case Study #1 – A Product Sales Case Study

SupplierID TotalSales

S1 $77,000

S2 $5,700

S3 $12,500

… …

Supplier point of view

Case Study #1 – A Product Sales Case Study

SupplierID TotalSales

S1 $77,000

S2 $5,700

S3 $12,500

… …

Supplier point of view

Case Study #1 – A Product Sales Case Study

Case Study #1 – A Product Sales Case Study

Case Study #1 – A Product Sales Case Study

SupplierID TotalSales

S1 $77,000

S2 $5,700

S3 $12,500

… …

Case Study #1 – A Product Sales Case Study

Case Study #1 – A Product Sales Case Study

Bridge Table

▪ To create Time Dimension:
– create table TimeDim as

select

distinct to_char(SalesDate, ’YYYYMM’) as TimeID,

to_char(SalesDate, ’YYYY’) as Year,

to_char(SalesDate, ’MM’) as Month

from Sales;

▪ To create Customer Location Dimension:
– create table CustLocDim as

select distinct Suburb, Postcode

from Customer;

Case Study #1 – A Product Sales Case Study

▪ To create Product Dimension:
– create table ProductDim as

select distinct ProductNo, ProductName

from Product;

▪ To create Bridge Table:
– create table ProductSupplierBridge as

select *

from StockSupplier;

▪ To create Supplier Dimension:
– create table SupplierDim as

select SupplierID, Name as SupplierName

from Supplier;

Case Study #1 – A Product Sales Case Study

▪ To create Fact Table:
– create table ProductSalesFact as

Select

to_char(S.SalesDate, ’YYYYMM’) as TimeID,

P.ProductNo,

C.Suburb,

sum(SI.QtySold*P.Price) as TotalSales

from Sales S, Product P, Customer C, SalesItem SI

where S.SalesNo = SI.SalesNo

and SI.ProductNo= P.ProductNo

and C.CustomerID = S.CustomerID

group by

to_char(S.SalesDate, ’YYYYMM’), P.ProductNo, C.Suburb;

Case Study #1 – A Product Sales Case Study

Bridge Tables

Case Study #2

Case Study #2 – A Truck Delivery Case Study

▪ A trucking company is responsible for picking up goods from warehouses of a retail chain

company, and delivering the goods to individual retail stores.

▪ A truck carry goods during a single trip, which is

identified by TripID, and delivers these goods to

multiple stores. Trucks have different capacities

for both the volumes they can hold and the

weights they can carry.

▪ At the moment, a truck makes several trips each

week. An operational database is being used to

keep track the deliveries, including the scheduling

of trucks, which provide timely deliveries to

stores.

Case Study #2 – A Truck Delivery Case Study

▪ A trip may pick up goods from many

warehouses

o i.e. a many-many relationship between

Warehouse and Trip

▪ A trip uses one truck only, and a truck may

have many trips in the history

o i.e. a many-1 relationship between Trip and

Truck

▪ A trip delivers goods (e.g. TVs, fridges, etc)

potentially to several stores

o a many-many relationship between Trip and

Store, which is represented by the Destination

table

▪ Sample data in the operational database:

Case Study #2 – A Truck Delivery Case Study

WarehouseID Location

W1 Warehouse1

W2 Warehouse1

W3 Warehouse1

… …

(d) Truck Table

(b) Trip Table (c) TripFrom Table

TripID Date TotalKm TruckID

Trip1 14-Apr-2018 370 Truck1

Trip2 14-Apr-2018 570 Truck2

Trip3 14-Apr-2018 250 Truck3

Trip4 15-Jul-2018 450 Truck1

… … … …

TripID WarehouseID

Trip1 W1

Trip1 W2

Trip1 W3

Trip2 W1

Trip2 W2

… …

TruckID VolCapacity WeightCategory CostPerKm

Truck1 250 Medium $1.20

Truck2 300 Medium $1.50

Truck3 100 Small $0.80

Truck4 550 Large $2.30

Truck5 650 Large $2.50

… … … …

StoreID StoreName Address

M1 MyStore City Melbourne

M2 MyStore Chaddy Chadstone

M3 MyStore HiPoint High Point

M4 MyStore Donc Doncaster

M5 MyStore North Northland

M6 MyStore South Southland

M7 MyStore East Eastland

M8 MyStore Knox Knox

… …

TripID StoreID

Trip1 M1

Trip1 M2

Trip1 M4

Trip1 M3

Trip1 M8

Trip2 M4

Trip2 M1

Trip2 M2

… …

(a) Warehouse Table

(e) Store Table (f) Destination Table

Case Study #2 – A Truck Delivery Case Study

▪ The management of this trucking company would like to analyze the deliver cost, based on

trucks, time period, and store.

Case Study #2 – A Truck Delivery Case Study

▪ Sales Star Schema

➢ Fact:

• Total Delivery Cost

(distance * cost per kilometre)

➢ Dimensions:

• Truck

• Time period

• Store

Case Study #2 – A Truck Delivery Case Study

▪ From the Truck point of view, Truck1 has two trips (e.g. Trip1 and Trip4), with the total kilometres of

820km (370km + 450km). The cost for Truck1 is $1.20. Hence, calculating the cost for Truck1 is

straightforward. Other trucks can be calculated this way.

▪ From the Period point of view, 14-Apr-2018 has three trips (e.g. Trip1,Trip2, and Trip3). Trip1 (370km) is

delivered by Truck1 which costs $1.20/km. Trip2 and Trip 3, on the same day, can be calculated the same

way. Hence, on 14-Apr-2018, the total cost can be calculated.

▪ From the Store point of view; The cost is calculated

based on Trip, but a trip delivers goods to many

stores. Therefore, the delivery cost for each store

cannot be calculated. The delivery cost is for the trip –

not for the store.

Case Study #2 – A Truck Delivery Case Study

Solution Model 1 – Using a Bridge Table

Case Study #2 – A Truck Delivery Case Study

Solution Model 2 – add a Weight Factor attribute

A weight factor is only needed if we want to estimate

the contribution that a dimension made to the fact

Case Study #2 – A Truck Delivery Case Study

StoreID StoreName Address

M1 MyStore City Melbourne

M2 MyStore Chaddy Chadstone

M3 MyStore HiPoint High Point

M4 MyStore Donc Doncaster

M5 MyStore North Northland

M6 MyStore South Southland

M7 MyStore East Eastland

M8 MyStore Knox Knox

… …

TripID StoreID

Trip1 M1

Trip1 M2

Trip1 M4

Trip1 M3

Trip1 M8

Trip2 M4

Trip2 M1

Trip2 M2

… …

(c) Store Table(b) Bridge Table

(a) Trip Dimension Table

▪ To create Trip Dimension:
– create table TripDim2 as

select T.TripID, T.TripDate, T.TotalKm,

1.0/count(*) as WeightFactor

from Trip T, Destination D

where T.TripID = D.TripID

group by T.tripid, T.tripdate, T.totalkm;

Case Study #2 – A Truck Delivery Case Study

Case Study #2 – A Truck Delivery Case Study

Solution Model 3 – a List Aggregate version

Case Study #2 – A Truck Delivery Case Study

TripID StoreID

Trip1 M1

Trip1 M2

Trip1 M4

Trip1 M3

Trip1 M8

Trip2 M4

Trip2 M1

Trip2 M2

… …

(b) Bridge Table

(a) Trip Dimension Table

listagg (Attr1, ‘_’) within group

(order by Attr1) as ColumnName

▪ To create Trip Dimension:
– create table TripDim3 as

select T.TripID, T.TripDate, T.TotalKm,

1.0/count(D.StoreID) as WeightFactor,

listagg (D.StoreID, ’_’) within group

(order by D.StoreID) as StoreGroupList

from Trip T, Destination D

where T.TripID = D.TripID

group by T.TripID, T.TripDate, T.TotalKm;

Case Study #2 – A Truck Delivery Case Study

▪ Joining based on the StoreGroupList attribute in the Trip dimension table
and the StoreID in the Store dimension table:

– select *

from TripDim3 T, StoreDim3 S

where T.StoreGroupList like ’%’||S.StoreID||’%’;

▪ Without the StoreGroupList attribute in the Trip dimension, we need to join
three tables:

– select *

from TripDim3 T, BridgeTable3 B, StoreDim3 S

where T.TripID = B.TripID

and B.StoreID = S.StoreID;

Case Study #2 – A Truck Delivery Case Study

Bridge Tables

Summary

Bridge Tables – Summary

▪ In principal, a Bridge Table is used:

a) When it is impossible to have a dimension connected directed to the Fact table,

because simply there is no relationship between this dimension and the Fact table

(e.g. in the Product Sales case study, it is impossible to have a direct link from

SupplierDim to ProductSalesFact)

b) When an entity (which will become a dimension) has a many-many relationship with

another entity (dimension) in the E/R schema of the operational database (e.g.

Supplier and Stock has a many-many relationship).

c) When temporality aspect (data history) is maintained in the operational database

and the bridge table can be used to accommodate the dimension that has temporal

attributes (e.g. product supply history is maintained in the second snowflake schema

example).

Bridge Tables – Summary

▪ When a Bridge Table is used in the schema, there are two additional options:

a) A Weight Factor is used to estimate the contribution of a dimension in the calculation

of the fact measure. Because this is only an estimate, a weight factor is option.

b) Every snowflake schema (whether it has Weight Factor or not) can be implemented

in two ways: a List Aggregate version, and a non-List Aggregate version.