QBUS6840 Predictive Analytics
Lecture 1: Introduction
Discipline of Business Analytics
University of School
Table of contents
What is QBUS6840 about?
Introduction to forecasting
Business Analytics
Businesses are increasing their investment in data infrastructure and
collecting massive amounts of data every day.
More and more companies and government organizations use
and need business analytics.
Business analytics refers to the skills, methodologies and
technologies that help extract useful knowledge from business
data, which is important for data-driven decision making.
The good news: many opportunities for you
Companies have been creating positions in data analytics and data
science in recent years.
See a recent report from the National Skills Commission:
https://www.nationalskillscommission.gov.au/reports/emerging-occupations/25-emerging-occupations/data-analytics/data-scientists
Business Analytics
An umbrella term for a set of data analysis techniques
(for example: regression, classification, clustering), each
designed for a different type of data (for example: cross-sectional
data, time series data, text data) and targeting business
applications.
This course focuses on techniques for analysing time series
data, with an emphasis on forecasting.
Time series data
A time series is a time-stamped sequence of observations on a variable. Examples:
- Weekly unit sales of a product.
- Unemployment rate in Australia each quarter.
- Daily production levels of a product.
- Average annual temperature in Sydney.
- Monthly water level in Warragamba Dam.
- 5-minute prices for CBA stock on the ASX.
Mathematically, we denote a time series as
$Y_0, Y_1, Y_2, \dots, Y_t, \dots, Y_T, Y_{T+1}, \dots$
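As a minimal sketch of this notation in Python, a time series can be stored as a pandas Series whose index holds the time stamps. The sales figures are made up for illustration, and the names `weekly_sales` and `future_dates` are hypothetical rather than taken from the unit materials.

```python
import pandas as pd

# Hypothetical weekly unit sales: made-up numbers, for illustration only.
values = [120, 132, 128, 141, 150, 147]
dates = pd.date_range(start="2023-01-02", periods=len(values), freq="W-MON")
weekly_sales = pd.Series(values, index=dates, name="unit_sales")

# Y_0, ..., Y_T are the observed values; T is the last observed time index.
T = len(weekly_sales) - 1
y_t = weekly_sales.iloc[2]   # Y_2, the third observation
y_T = weekly_sales.iloc[T]   # Y_T, the most recent observation

# Y_{T+1}, Y_{T+2}, ... are not observed yet; forecasting targets these dates.
future_dates = pd.date_range(start=dates[-1], periods=3, freq="W-MON")[1:]

print(weekly_sales)
print("Next periods to forecast:", list(future_dates.date))
```

Keeping the time stamps in the index, rather than in a separate column, is a design choice that makes slicing by date and resampling straightforward later on.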
Examples of questions we try to answer in this unit
(1) Understanding, (2) prediction/forecasting, and (3) evaluation
What is the underlying pattern in the yearly GDP time series? (1)
What is the ride-share demand in Sydney at 5pm this Friday? (2)
How can we forecast the electricity demand in NSW next year? (2)
Can we forecast the variation (called volatility) of the stock return of
an asset tomorrow? (2)
Is the forecast produced by a colleague of yours sensible/accurate? (3)
So, what is QBUS6840 about?
This unit offers a survey of popular statistical methodologies for
the analysis of business time series data.
It also provides the tools necessary to extract the information
required for specific tasks such as forecasting or quantifying
prediction uncertainty.
Emphasis will be given to business applications of predictive
analytics methods using modern software tools.
Learning objectives
By the end of the unit, students should be able to:
understand the characteristics of time series data in order to
analyse real business data of this form,
select and use an appropriate technique to predict the future
behaviour of business variables of interest,
use computational tools fluently to carry out their analysis
and generate visualisations,
identify the advantages and limitations of each method,
use Python to perform the analysis and visualise data and results,
present and write about their findings effectively,
communicate effectively with highly technical data scientists,
supervise business analytics projects.
Learning by doing and asking questions!
You become a better problem solver by solving problems. Focus
your efforts on the assignments and tutorials.
Ask questions in the live Q&A sessions, or on Ed, or during
consultation times.
Discuss the materials with your classmates/colleagues. Ask for
help. Stay in touch with your classmates and teaching staff.
Look for answers and extra readings on the Internet, especially
about programming issues.
We focus on technical materials, but remember that the profile of a
great data scientist must include common sense, effective
communication and storytelling based on data.
Introduction to Forecasting
Forecasting
- Definition/terminologies
- Types: problems, data, forecasting (quantitative/data-driven vs qualitative)
- Principles/steps
Reading: https://otexts.com/fpp2/intro.html
Introduction to Forecasting
Defining predictive modeling
Predictive modeling: the process of developing a
mathematical/statistical tool or model that generates an
accurate prediction.
Sometimes we only care about the predictions themselves. But
it is also often essential to establish and communicate the
uncertainty in the predictions, as this is often important to
account for the risk in decision making.
Prediction vs interpretation.
- Interpretation (or statistical inference) focuses on interpreting
and understanding what has happened.
- Prediction focuses on forecasting what might happen in the future.
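To make the point about communicating uncertainty concrete, here is a minimal sketch that reports an approximate interval alongside a point forecast. The data are made up, the naive method (next value equals the last observed value) is only a stand-in for whatever model is actually used, and the interval assumes roughly normal forecast errors.

```python
import numpy as np

# Made-up monthly demand figures, for illustration only.
y = np.array([102.0, 98.0, 105.0, 110.0, 107.0, 111.0, 115.0, 112.0])

# Point forecast: the naive method (next value = last observed value).
point_forecast = y[-1]

# One-step errors the naive method would have made in the past.
errors = y[1:] - y[:-1]
sigma = errors.std(ddof=1)

# Approximate 95% prediction interval, assuming roughly normal errors.
lower = point_forecast - 1.96 * sigma
upper = point_forecast + 1.96 * sigma

print(f"Point forecast: {point_forecast:.1f}")
print(f"Approx. 95% interval: [{lower:.1f}, {upper:.1f}]")
```

A decision maker who sees only the point forecast cannot tell whether the plausible range is narrow or wide; the interval is what allows the risk to be accounted for.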
Forecasting
A forecast is a prediction of what might happen in the future.
Forecasting is the process of making a forecast.
In this course, the terms prediction and forecast mean the same thing
and are used interchangeably.
Forecasting influences business and economic decision making,
planning, policy setting, etc.
Importance of forecasting
Governments need to forecast unemployment, economic growth,
expected revenues from income taxes, etc. to formulate policies.
Companies need to forecast demand, sales, consumer
preferences in strategic planning.
Banks/investors/financial analysts need to forecast financial
returns, risk or volatility.
University administrators need to forecast enrollments to plan
for facilities and for faculty recruitment.
Retail stores need to forecast demand to control inventory
levels, hire employees and provide training.
Sports organisations need to project sports performance, crowd
figures, club gear sales, revenues, etc. in the coming season.
Introduction to Forecasting
Types of problems and data
Regression problems and classification problems.
Time series data (the focus of this unit) and cross-sectional data.
Regression problems
In regression problems we want to predict a numerical outcome.
- “How many copies will this book sell?”
- “What will inflation be next month?”
- “How much will my house sell for in the current market?”
- “How much is this customer going to spend on my website today,
given that they are going to purchase something?”
- “How many tourists are going to visit NSW within the next …”
Examples of regression models/algorithms: linear regression,
penalized linear regression, partial least squares, neural
networks, regression trees, etc.
Classification Problems
Classification involves mapping your data points into a finite set
of labels or the probabilities for each label. Some examples:
- “Will someone click on this ad?” 0 or 1 (no or yes)
- “What is this news article about?” politics, sports, culture …
- “What number is this? (image recognition)” 0, 1, 2, …
- “Is this message spam?” 0 or 1
- “Is this transaction fraudulent?” 0 or 1
- “Is the customer going to leave the service?” 0 or 1
Examples of techniques for predicting labels: logistic regression,
k-nearest neighbours, naive Bayes, discriminant analysis,
classification trees, support vector machines, etc.
We will not cover classification problems in this course, but you
can study them in the Data Mining course (QBUS6810) and/or the
Machine Learning course (QBUS6850).
Cross-sectional data
Cross-sectional data are values observed “at one point” in time.
- Starting salary and WAM for graduates in 2015.
- Amazon orders in Sydney on March 1.
- Annual return on Fortune 500 company stocks in 2018.
- Votes for or against the Labor party in the 2019 federal election.
Time series plots and forecasting
Top left plot: monthly sales of a lubricant. It is challenging to forecast because…
Top right plot: monthly electricity production. It is easier to forecast because…
Bottom plot: quarterly sales of clay bricks in Australia. It is challenging to forecast because…
Types of forecasting
Qualitative (judgmental) forecasting.
Quantitative (data based) forecasting.
A combination of the two: judgmentally adjusted statistical
forecasting.
Types of forecasting
Judgmental forecasting
Expert opinion (subjective)
- often used when there is a lack of historical data
- used in conjunction with data-based forecasting
Delphi method: a popular judgmental forecasting method
- Developed in the 1950s by Helmer and Dalkey
- Assumption: forecasts from a group are generally more accurate than those
from individuals
- Stages: forming a panel, setting tasks, collecting initial expert views,
giving feedback to experts, aggregating expert views into a forecast
Subjectively extending previous patterns into the future
Quantitative Forecasting
Based on historical data
Use formal econometric or statistical forecasting methods.
- Project previous patterns into the future using a statistical model.
- Time series modelling.
- Regression modelling.
Quantitative forecasting is the focus of this course:
- time series forecasts
- regression forecasts
Time series forecasts
A class of forecasting techniques based on time series data
analysis: AR, ARMA, recurrent neural networks, etc.
Project previous patterns into the future using a formal statistical
model, e.g.
$\text{Sales}_t = f(\text{Sales}_{t-1}, \text{Sales}_{t-2}, \dots, \text{Sales}_{t-24}) + \varepsilon_t$
where $\varepsilon_t$ is a random error term.
Only concerned with forecasting, not with the reasons why the variable
changes.
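A minimal sketch of this idea, assuming that f is linear (an autoregressive model) and estimating it by ordinary least squares. The helper names `fit_ar` and `forecast_one_step` are hypothetical, the data are simulated, and a single lag is used for brevity; the 24-lag version in the formula above would correspond to `p=24`.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares fit of y_t = c + a_1*y_{t-1} + ... + a_p*y_{t-p} + e_t."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # The row for time t holds [1, y_{t-1}, ..., y_{t-p}], for t = p, ..., n-1.
    columns = [np.ones(n - p)] + [y[p - k:n - k] for k in range(1, p + 1)]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [c, a_1, ..., a_p]

def forecast_one_step(y, coef):
    """One-step-ahead forecast from the most recent p observations."""
    p = len(coef) - 1
    lags = np.asarray(y, dtype=float)[-1:-p - 1:-1]  # y_T, y_{T-1}, ...
    return float(coef[0] + coef[1:] @ lags)

# Simulated "sales" from a known AR(1) process, so the fit can be checked.
rng = np.random.default_rng(0)
n = 2000
sales = np.empty(n)
sales[0] = 2.0 / (1 - 0.7)  # start at the long-run mean
for t in range(1, n):
    sales[t] = 2.0 + 0.7 * sales[t - 1] + rng.normal()

coef = fit_ar(sales, p=1)
next_value = forecast_one_step(sales, coef)
print("Estimated [c, a_1]:", np.round(coef, 2))
print("One-step-ahead forecast:", round(next_value, 2))
```

Note that the model uses only the past values of the series itself, which is why it can forecast without explaining what drives the variable.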
Regression forecasts
A class of forecasting techniques based on regression modelling
Use a formal statistical regression model
$\text{Price}_t = f(\text{Season}_t, \text{Demand}_t, \text{GeneratorVendor}_t) + \varepsilon_t$
Can assess other quantities related to changes in a variable
Can do scenario forecasting
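Scenario forecasting with a regression model can be sketched as follows. This is an illustration under simplifying assumptions: f is taken to be linear, the data are simulated, the season is reduced to a single summer indicator, and the generator vendor variable is left out. All variable names and numbers are hypothetical.

```python
import numpy as np

# Simulated data: price depends on demand and on whether it is summer.
rng = np.random.default_rng(1)
n = 200
demand = rng.uniform(5.0, 10.0, size=n)   # hypothetical demand level
summer = rng.integers(0, 2, size=n)       # 1 = summer, 0 = otherwise
price = 20.0 + 6.0 * demand + 8.0 * summer + rng.normal(0.0, 2.0, size=n)

# Fit price = b0 + b1*demand + b2*summer by ordinary least squares.
X = np.column_stack([np.ones(n), demand, summer])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# Scenario forecasting: plug hypothetical predictor values into the model.
scenarios = {
    "mild winter day": [1.0, 6.0, 0.0],
    "summer heatwave": [1.0, 9.5, 1.0],
}
scenario_forecasts = {
    name: float(np.array(x) @ beta) for name, x in scenarios.items()
}

print("Estimated coefficients:", np.round(beta, 2))
for name, value in scenario_forecasts.items():
    print(f"{name}: forecast price {value:.1f}")
```

The estimated coefficients describe how the price changes with each predictor, which is what makes "what if" questions answerable.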
Introduction to Forecasting
The process of forecasting
1. Formulation of the business problem.
2. Gathering information.
3. Preliminary data analysis.
4. Choosing and testing models.
5. Using and evaluating a forecasting model (forecast, assess
forecasts, and make decisions!).
This can and should be an iterative process of discovery: (i)
to refine the forecasting process, and (ii) to take into account new
information, since the business environment is always changing.
Formulation of the business problem
Formulation
- The first step is to understand the business problem to be solved.
Often, raising the business question is as important as finding
the solution.
- What exactly needs forecasting? Can the variable of interest
even be forecast?
- How will forecasts be used?
Principles
- Use expert knowledge and previous studies to examine whether forecasting
is considered possible.
- Use theory to guide the search for possible explanatory factors.
- Communicate with everyone involved in data collection, decision
making, etc. to properly structure the problem definition.
Gathering information
What information is available for the forecasting problem?
What kind of data is needed? Where can the data be found
(customer database, transaction database, etc.)?
Often, it’s necessary to invest a lot of money in data collection.
Sometimes, it’s necessary to estimate and compare the costs
and benefits of each data source.
Preliminary data analysis
Each statistical data analysis technique requires its input data in a
specific form: numerical, categorical, tabular format, etc.
In the preliminary data analysis phase, the raw data might be
converted into the required form.
Clean the data: check for typos, unusual observations, etc.
Visualize the data to get some insights.
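The conversion and cleaning steps above might look like the following in pandas. This is a sketch on made-up records: the column names, the "n/a" marker and the outlier rule (more than five median absolute deviations from the median) are assumptions for illustration, not rules from the unit.

```python
import pandas as pd

# Made-up raw records, as they might arrive from a transaction database.
raw = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04",
             "2023-01-05", "2023-01-06", "2023-01-07", "2023-01-08"],
    "sales": ["100", "104", "n/a", "98", "101", "1030", "97", "103"],
})

# Convert to the required form: a datetime index and numeric values.
clean = raw.copy()
clean["date"] = pd.to_datetime(clean["date"])
clean["sales"] = pd.to_numeric(clean["sales"], errors="coerce")  # "n/a" -> NaN
clean = clean.set_index("date")

# Count missing values so they can be investigated or imputed.
n_missing = int(clean["sales"].isna().sum())

# Flag unusual observations: far from the median relative to the MAD.
median = clean["sales"].median()
mad = (clean["sales"] - median).abs().median()
unusual = clean[(clean["sales"] - median).abs() > 5 * mad]

print("Missing values:", n_missing)
print("Unusual observations:")
print(unusual)
```

Here the value 1030 is flagged, which could be a typo for 103; a flagged point should be checked against the source rather than silently deleted.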
Choose and test models
Based on the problem formulation and preliminary data analysis
steps, try to work out which predictive models are suitable for
the data you have.
There is an extensive list of candidate predictive models:
ARMA, recurrent neural networks, etc. It’s important to
select an appropriate model (or set of models).
Principles
- We must be careful not to overfit by using excessively complex
models that pick up noise in the training sample instead of
underlying predictive patterns: the Occam’s Razor principle.
- “All models are wrong, but some are useful” (George Box). No
single model or method will always be best. We should consider
a wide variety of techniques.
- Combining predictions from different models might work better
than any single model in isolation.
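One common way to put these principles into practice is to hold out the most recent observations, compare candidate methods on that hold-out set, and also try a simple average of their forecasts. The sketch below uses simulated data and three simple benchmark methods (mean, naive and drift) as stand-ins for the models covered later; the split sizes and names are assumptions for illustration.

```python
import numpy as np

# Simulated series: linear trend plus noise.
rng = np.random.default_rng(2)
t = np.arange(60)
y = 50.0 + 0.5 * t + rng.normal(0.0, 3.0, size=60)

# Hold out the last 12 observations; models may only see the training part.
train, test = y[:48], y[48:]
h = len(test)
steps = np.arange(1, h + 1)

# Three simple benchmark forecasts built from the training data only.
mean_fc = np.full(h, train.mean())
naive_fc = np.full(h, train[-1])
drift_fc = train[-1] + steps * (train[-1] - train[0]) / (len(train) - 1)

# Equal-weight combination of the three forecasts.
combined_fc = (mean_fc + naive_fc + drift_fc) / 3.0

def rmse(actual, forecast):
    """Root mean squared error of a forecast on the hold-out set."""
    return float(np.sqrt(np.mean((actual - forecast) ** 2)))

forecasts = {"mean": mean_fc, "naive": naive_fc,
             "drift": drift_fc, "combined": combined_fc}
scores = {name: rmse(test, fc) for name, fc in forecasts.items()}

for name, score in scores.items():
    print(f"{name:>8}: RMSE = {score:.2f}")
```

Evaluating on data the models have not seen is what guards against rewarding an overfitted model. The combined forecast need not beat the best individual method, but an equal-weight average can never be worse than the worst of them in RMSE.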
Using and evaluating a forecasting model
How will the forecasts be used?
Put the forecasts into real use in order to realize some return on
investment
Example (https://otexts.com/fpp2/case-studies.html)
The Australian federal government needed to forecast the
annual budget for the Pharmaceutical Benefits Scheme (PBS).
The PBS provides a subsidy for many pharmaceutical products
sold in Australia.
The total expenditure was around $7 billion in 2009 and had
been underestimated by nearly $1 billion in each of the two
previous years.
In order to forecast the total expenditure, it is necessary to
forecast the sales volumes of hundreds of groups of pharmaceutical products.
So we needed to find a forecasting method that allowed for
trend and seasonality if it was present, and that was robust to
sudden changes in the underlying patterns.
How might we go about defining the problem here?
Where to find time series data on the web? Australian data
Data Libraries
OZDasl – Australian Data and Story Library
ANU Social Science Data Archives
University of Databases Collection
Original Data Sources
Reserve Bank of Australia
Australian Bureau of Statistics
Penn World Tables – Australia
Time Series Data Library (Hyndman)
Datastream International
Where to find time series data on the web? US Data
Yahoo Finance Database
Data and Story Library — CMU
Bureau of Economic Analysis
Economagic – time series data
FRED – Federal Reserve Economic Data
White House – Economic Statistics Briefing Room
NBER – National Bureau of Economic Research
NBER – Marriage and Divorce
Census – Statistical Abstracts
Census Data
ICPSR – Inter-university Consortium for Political and Social Research
Panel Study of Income Dynamics
Bureau of Labor Statistics
Survey of Income and Program Participation (SIPP)
National Center for Health Statistics
Statistics in Sport (American Statistical Association)
Elements of a Good Forecast
We’re human after all!
Forecasting techniques you’re going to learn in this course are
powerful. But they are not magic! They require good data,
proper understanding and validation.
The forecaster’s creativity, business knowledge, common sense and
human judgment are essential for their success.
Happy forecasting!