程序代写 INFS5710 IT Infra. for BA

INFS5710 IT Infra. for BA
Business Intelligence and Data Warehouses

Chapter 13

Copyright By PowCoder代写 加微信 powcoder

Business Intelligence and Data Warehouses 13-1 TO 13-7
pp. 590 – 632
After three weeks of heavy materials, this week is a bit easier. Most of the materials are on the slides! 
This week, we will be looking at Business Intelligence and Data Warehouse.

INFS5710 Week 1
Copyright © 2012, SAS Institute Inc. All rights reserved.
Database Systems Infrastructure
Entity Relationship Model (ERM)
Normalisation
Prepared by , Feb. 2021
External Data (e.g. Excel)
ETL (Data Cleansing)
Star Schema (De-Normalised)
Data Warehouse
External Data (e.g. Excel)
Data (DW to BD, or vice versa or both)
Unstructured Data (Social Media)
Data Streaming
Structured Data (Internet of Things (IOT))
Data Streaming
Flat Files
Machine Learning
Relational Database
Coming back to this slide…
So far, we have looked at Normalisation and ERD, and today we will be looking at Business Intelligence (BI) and Data Warehouse (DW).
In the 1990s, Cognos and Business Objects and a few other vendors are moving in the direction of Business Intelligence (BI), Data Warehouse (DW) is the key of consolidated the data before populated to the BI applications.
The purpose of DW is the data is cleaned and I do not have to worry about cleaning the data again. Every day or every week I only have to clean and add the current data to the database.
The data were used in BI. Reports are printed or on screen for managers to review and make decisions.
(“Not Normalised”)
Spark and NoSQL (and other tools)
Reporting (Business Intelligence and Visualisation) and Business Analysis (End Users)
Note: In-Memory Database (e.g. SAP Hana) is an alternative data model not shown here
Hadoop Distributed File System (HDFS) and MapReduce

Business Intelligence (BI)
 Comprehensive, cohesive, integrated set of tools and processes
 Captures, collects, integrates, stores, and analyzes data
 Generates and presents information to support business
decision making
 Allows a business to transform:
 Data into information
 Information into knowledge  Knowledge into wisdom
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a
certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
From the previous diagram, we have seen it has to do with the users.
We have to gather data and turn the data into information, and usually on a report on paper or screen.
The decision maker will review and make a decision on the information on a report or they will come back and ask for more information.
As stated in the slide…

Business Intelligence (BI)
 Concepts, practices, tools and techniques to help business  Understand its core capabilities
 Provide snapshots of the company situation
 Identify key opportunities to create a competitive advantage
 Provides a framework for
 Collecting and storing operational data and aggregating it
into decision support data
 Analysing decision support data and presenting generated information to end users to support business decisions
 Making business decision which generates more data
 Monitoring results to evaluate outcomes and predicting
future outcomes with a high degree of accuracy
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Data are collected and stored, and then the data will then be used to populate (i.e. added to) in DW.
As stated on the slide…

Business Analytics
Prescriptive Analytics
Predictive Analytics
Business Intelligence
Descriptive Analytics
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a
Big Data 6
certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated in the slide…
There are about four to five analytics. There are other terms of analytics are used such as exploratory analytics or explanatory analytics.
Exploratory Analysis is the first step in the data analysis process looking what are available in the data and any interesting insight. This is the same when you look at your data too. You first might look at any interesting story and do you need to get more data to get the insights you are after.
In explanatory analysis, you need data to support your insight and able to explain to others what you have done is supported by the data.
BI is usually related to Descriptive Analytics because in the 1990s and 2000s, the technologies drive what you can do!

INFS5710 Week 1
Last Year Project: LA Bike-Sharing
Entity Relationship Model (ERM)
Normalisation
Employee, Products, Relational
Prepared by , Feb. 2021
External Data (e.g. Excel)
ETL (Data Cleansing)
Star Schema (De-Normalised)
Data Warehouse
External Data (e.g. Excel)
Data (DW to BD, or vice versa or both)
Unstructured Data (Social Media)
Data Streaming
Structured Data (Internet of Things (IOT))
Data Streaming
Flat Files
Machine Learning
Customers,Database Assets
(“Not Normalised”)
Hadoop Distributed File System (HDFS) and MapReduce
Spark and NoSQL (and other tools)
Reporting (Business Intelligence and Visualisation) and Business Analysis (End Users)
Note: In-Memory Database (e.g. SAP Hana) is an alternative data model not shown here
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
Bike transactions
(real time)

©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
This is a typical BI framework, i.e. an IT infrastructure for BI. You have people, process, management and governance.
1. People are the technical people or users.
2. Processes are what you need to do get the report produced.
3. In this case, the data, external and operational, have to undergo ETL (Extract, Transform and Loading) which is to clean the data. Management refers to management of data.
• Extract is to get the data.
• Transform is to clean the data to make sure they are readable, blanks are filled and data are corrected to the correct format. For instance, you might want to consolidate “University of Wales”, “U. Wales”, “University of NSW”, or even “Unsw” etc.. To become “UNSW”.
• Loading is to load into DW, and there is usually a process of checking and evaluating the data uploaded into DW (Data store in the diagram), if they are not good or enough information, you do it again.
Data cleaning and data cleansing are two different terms used but they are the

same thing.
You can then run a query on the data and create reports, or do data analysis etc.
4. As for governance, this is to deal with how you data is to be managed. Who have access to the data, and what data is to be stored and where?
Read Table 13-2 for the BI architectural component.

©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…

A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a
are people, products, place and time.
certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
A dimension is a structure that categorizes facts and measures in order to enable users to answer business questions. Commonly used dimensions are people, products, place and time.
Use a rubric cube as an example, each side of the rubric cube is a dimension, so you can say you have a six dimensions. One dimension could be sales, another dimension could be states lot like NSW, Victoria, and so on, another dimension could be products, another dimension could be customers, and so on

Practices to Manage Data
 Master data management (MDM): Collection of
concepts, techniques, and processes for identification,
definition, and management of data elements
to provide a comprehensive and consistent definition of all data within an organization
 Governance: Method of government for controlling business health and for consistent decision making
to ensure accountability of decision-making
 Key performance indicators (KPI): Numeric or
scale-based measurements that assess company’s
effectiveness in reaching its goals
profit, profit margin, employee turnover, graduation rate
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
KPI is a popular measurement! To see how the business is going, or how a person is going with his work.

Practices to Manage Data
 Data visualization: Abstracting data to provide information in a visual format
 Enhances the user’s ability to efficiently comprehend the meaning of the data
 Techniques:
 Pie charts and bar charts
 Line graphs
 Scatter plots
 Gantt charts Project schedule
 Heat maps a representation of data in the form of a map or diagram in which data values are represented as colours.
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
In SAS EG and Studio, both have these tools. The graphs are not pretty, colours and fonts are not the best but you can see the results.

©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
You can view the data by Zip Code – this can be done INFS5700 using SAS VA or SAS Viya.

Reporting Styles of a Modern BI System
Advanced reporting
insightful information about the organization and key actionable information used to support decisions.
Monitoring and alerting
monitor decision outcome
Advanced data analytics
explanatory and predictive
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
Different types and purpose of different Reporting Styles of a modern BI System.
Monitoring could be to see how things are going, so you might have different colours such as red, amber, and green. Green means everything it’s OK. Amber means something needs attention. Red means something needs attention.

Business Intelligence Benefits
Improved decision making Integrating architecture
BI can be the integrating umbrella for a disparate mix of IT systems
Common user interface for data reporting and analysis Common data repository fosters single version of company data
Improved organizational performance
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
Please read the textbook for BI Benefits

©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
This diagram shows how we read data and report data! It changes over time due to improved in technologies and the Internet. So now, instead of a company, we can do as an individual. For instance, how many steps did you do today, last week, last month? Your patterns of getting up and going to bed, and how is your wellbeing.
Another example is how much electricity produced by the solar panels of the house during the day. It might change the pattern of how a mother cooks Instead of cooking late in the evening, she might start to cook in the early afternoon because of “free solar electricity”, and only heat up before dinner. She might instead do the washing at the early morning; she might do a bit late, say after kids are in school, because more electricity is generated.

Business Intelligence Evolution
A transaction is any action that reads from or writes to a database
(SELECT, UPDATE, CREATE)
Relational DB
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
A summary of BI Evolution.

Multidimensional DB
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…

Decision Support Data
 Decisionsupportsystem(DSS)isanarrangementofcomputerisedtools used to assist managerial decision making
 Typically has a much narrower focus and reach than a BI solution
 BIinformationtechnologyhasevolvedfromcentralizedreportingstylesto the current, mobile BI and Big Data analytics style in the span of just a few years
 The rate of technological change is not slowing down; technology advancements are accelerating the adoption of BI to new levels
 EffectivenessofBIdependsonqualityofdatagatheredatoperationallevel
 Operationaldata
 Seldomwell-suitedfordecisionsupporttasks
 Storedinrelationaldatabasewithhighlynormalisedstructures
 Optimizedtosupporttransactionsrepresentingdailyoperations
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…

Decision Support Data
 Differ from operational data in:
 Time span Operational data short time frame;
Decision support data longer  Granularity for decision support data
 Drill down: Decomposing a data to a lower level
 Roll up: Aggregating a data into a higher level  Dimensionality
High for decision support data (e.g., data requirement for multi-periods, multi-location, multi-companies)
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…

©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
You can transform operational data into decision support data by using Pivot Table in Excel. Pivot Table is a cheap and free data tools to do BI.
• Column A Year
• Column B Region
• Column C Agent
• Column D Product
• Column E Value
becomes Time remains as Region remains as Agent remains as Product becomes Sales
You then can run a report, say, Sales by product, region over the last two years (2014- 2015).

Decision Support Database Requirements
organization of data
 Database schema
 Must support complex, non-
normalized data representations
 Data must be aggregated and summarized
 Queries must be able to extract multidimensional time slices
Slice-and-dice
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…
We will talk more about data warehouse in a moment and that will fit in with this.

Decision Support Database Requirements
 Data extraction and loading (ETL)
 Allow batch and scheduled data extraction
 Support different data sources and check for inconsistent data or data validation rules
 Support advanced integration, aggregation, and classification
 Database size should support
 Very large databases (VLDBs)  Advanced storage technologies  Multiple-processor technologies
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use.
As stated on the slide…

Data Warehouse
©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or serv

程序代写 CS代考 加微信: powcoder QQ: 1823890830 Email: powcoder@163.com