

QBUS6840 Predictive Analytics


Lecture 2: Data Patterns, Graphing, Time
Series Components, and Forecast Accuracy

Discipline of Business Analytics

The University of Sydney Business School

Forecasting
- Definition/terminologies
- Problems, data, forecasting
- Quantitative/data-driven vs. qualitative methods
- Forecasting principles/steps

Table of contents

Data and plotting

Components of Time Series

Naïve Forecasting Methods

Prediction Errors and Measures

Evaluating forecast accuracy

Online textbook: Chapter 2 (https://otexts.com/fpp2/graphics.html) and Chapter 3 (https://otexts.com/fpp2/simple-methods.html)

Data carry information/knowledge: image, video, sound, text, preference, categorical, numerical, or any form you can think of.

Traditionally, data were collected to test some "hypotheses", to infer knowledge or causality:
- A pharmaceutical company conducts a study on the effectiveness of its COVID-19 vaccine candidates through lengthy clinical trials.
- A company uses face-to-face interviews to ask how much people would be willing to pay for a product.

Today, data are often collected passively thanks to digital technology and can be used for near real-time decision making:
- data-driven drug development (e.g. to identify candidate compounds)
- dynamic pricing to quickly adapt to demand and supply (e.g. Uber's surge pricing)


Plotting data is the first step in any data analysis task, including predictive analytics tasks.
- Visualize many features of the data: trend, correlation, relationships, etc.
- Gain insights into the data

The type of the data determines the type of graphing technique:
- Two popular ones: the time series plot and the scatter plot


Data Graphing: Some examples

Hans Rosling's video: 200 Countries, 200 Years, 4 Minutes

Google Charts gallery:
https://developers.google.com/chart/interactive/docs/gallery

Good graphs convey both the patterns and the randomness in the data: message + noise.


Python Plotting

Almost all types of plots can be made with matplotlib in Python.

Use "import matplotlib.pyplot as plt" to import all of the plotting functionality.

Always follow these steps (a minimal sketch is given below):

1. Prepare data: load it from a file, or process/generate it

2. Define a drawing window: size, subplots, etc. (or use the default settings via plt.plot())

3. Use the main plotting functions plot and/or scatter, etc., depending on what plots are needed

Please see tutorials 1 and 2

Many examples online, e.g. matplotlib gallery
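A minimal sketch of the three steps above, assuming a hypothetical CSV file beer.csv with a column named consumption (both names are placeholders, not files provided with the unit):

import pandas as pd
import matplotlib.pyplot as plt

# Step 1: prepare data (file name and column name are hypothetical)
data = pd.read_csv("beer.csv")
y = data["consumption"]

# Step 2: define a drawing window: figure size, one subplot
fig, ax = plt.subplots(figsize=(10, 4))

# Step 3: the main plotting call, with labels for readability
ax.plot(y, label="Monthly beer consumption")
ax.set_xlabel("Time")
ax.set_ylabel("Consumption")
ax.legend()
plt.show()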

Time Series Plots

Time Series Components

See an example in Lecture02 Example01.py

The Trend Component

Reflects the long-run growth or decline in the time series.

What type of trend? The trend may depend on the length of the observed time series, so we must extrapolate with care.

The Cyclical Component

Slow rises and falls that are not in a regularly repeating pattern; there is no fixed period.
- often related to "business cycles"
- thus, very difficult to model!

Figure: The data exhibit a cyclic pattern every 8-10 years

The Seasonal Component

Rises and falls that are in a regular repeating pattern, on a seasonal basis such as months of the year or days of the week.

There is a fixed seasonal period/frequency, denoted by M.

Example: Time series plot

Example: Seasonal Plot

See an example in Lecture02 Example02.py
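Lecture02 Example02.py is not reproduced here; the following is only a sketch of how a seasonal plot can be drawn with matplotlib, using a simulated monthly series with period M = 12:

import numpy as np
import matplotlib.pyplot as plt

# Simulated monthly series: trend + seasonality + noise (for illustration only)
rng = np.random.default_rng(0)
T, M = 48, 12
t = np.arange(T)
y = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / M) + rng.normal(0, 2, T)

# Seasonal plot: one line per year, with the months on the horizontal axis
for year in range(T // M):
    plt.plot(range(1, M + 1), y[year * M:(year + 1) * M], label=f"Year {year + 1}")
plt.xlabel("Month")
plt.ylabel("Value")
plt.title("Seasonal plot (simulated data)")
plt.legend()
plt.show()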

Difference between Seasonal and Cycle

Cycle: the rises and falls are not of a fixed frequency

Seasonal: the rises and falls are associated with the calendar, and the frequency M is unchanging (e.g. every 12 months, every 7 days).
Figure: The data exhibit both a cyclic pattern and a seasonal pattern.

The Irregular Component

Irregular fluctuations:
- follow no visualizable pattern and need statistical models to capture
- are assumed unexplainable
- might be 'unusual' events: earthquakes, accidents, hurricanes, wars, strikes
- or just random variations, e.g. noise, customer behaviour

Example: Tasty Cola sales (in hundreds of cases)

Figure annotations: the upward trend; seasonal patterns (Septembers); irregular fluctuations, visible after the seasonal component is removed.

We denote a time series of length T (or N) as

Y = {Y1, Y2, …, YT }, or Y1:T

where Yt is the observation at time point t.

The forecast of YT+h based on data Y1:T is denoted as ŶT+h|T :
- h = 1, 2, … is called the forecast horizon
- ŶT+h|T is an h-step-ahead forecast
- Sometimes, we simply write ŶT+h for ŶT+h|T

The estimation error (residual) is

et = Yt − Ŷt

The seasonal period is M, and the general seasonal index is m (taking one of the values 1, …, M).

Naïve forecasting method: Most Recent Value

ŶT+h = YT , h = 1, 2, …

This is the baseline model to which all forecast models should be compared.

Monthly beer consumption in Australia

Figure: naïve forecast, multi-step-ahead (mean error measures: −43.8, 43.8, −32.7, 32.7).

Figure: naïve forecast, one-step-ahead (mean error measures: −3.6, 22.1, −3.7, 15.9).

Seasonal Naïve Method: Most Recent Season's Value

Ŷt+h = Yt+h−M , h = 1, 2, …
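A minimal sketch of the naïve and seasonal naïve methods on a generic NumPy array Y, assuming the seasonal period M is known (the data below are toy values):

import numpy as np

def naive_forecast(Y, h):
    """Most recent value: the h-step-ahead forecast equals the last observation Y_T."""
    return np.full(h, Y[-1])

def seasonal_naive_forecast(Y, h, M):
    """Most recent season's value: Y_hat_{t+h} = Y_{t+h-M}; for h > M the last season repeats."""
    last_season = Y[-M:]
    return np.array([last_season[i % M] for i in range(h)])

Y = np.array([10.0, 12.0, 14.0, 11.0, 13.0, 15.0])   # toy data
print(naive_forecast(Y, 3))                 # [15. 15. 15.]
print(seasonal_naive_forecast(Y, 4, M=3))   # [11. 13. 15. 11.]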

Drift Method

A variant of the naïve method, allowing the forecasts to change over time. The amount of change over time, called the drift, is the average change seen in the historical data.

Given a (historical) time series

Y1:T = {Y1, Y2, . . . , YT }

the drift method defines the forecast for the time point T + h as

ŶT+h := YT + h × (1/(T − 1)) × Σ (from t = 2 to T) (Yt − Yt−1),

i.e., adding h times the average change to the most recent observation YT.

It can be proved that

ŶT+h = YT + h × (YT − Y1)/(T − 1).

This is equivalent to drawing a line between the first and last observations, and using that line to forecast for times after T.
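A minimal sketch of the drift method, checking numerically that the average-change form and the closed form based on the first and last observations give the same forecast (toy data):

import numpy as np

def drift_forecast(Y, h):
    """Drift method: Y_hat_{T+h} = Y_T + h * (Y_T - Y_1) / (T - 1)."""
    T = len(Y)
    slope = (Y[-1] - Y[0]) / (T - 1)
    return Y[-1] + h * slope

Y = np.array([50.0, 53.0, 55.0, 60.0, 58.0, 64.0])   # toy data
h = 3

# Equivalent form: add h times the average historical change to Y_T
avg_change = np.mean(np.diff(Y))
print(drift_forecast(Y, h))       # 72.4
print(Y[-1] + h * avg_change)     # 72.4 (same value)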

Google stock price forecast

The forecast error

Let Ŷt = Ŷt|t−1 be a one-step-ahead forecast of Yt, based on data Y1:t−1. The forecast error is the difference between Yt and its forecast Ŷt.
- It depends on how we define the "difference".

For numerical data, the forecast error is often defined as

et = Yt − Ŷt

For categorical data, the forecast error is measured in terms of disagreement:

et = 0 if Yt = Ŷt, and et = 1 if Yt ≠ Ŷt.

MUST have actual data to compute errors!
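A small sketch of both error definitions, with made-up actual values and one-step-ahead forecasts:

import numpy as np

# Numerical data: e_t = Y_t - Y_hat_t
Y      = np.array([25.0, 30.0, 28.0, 33.0])
Y_hat  = np.array([24.0, 31.0, 27.0, 35.0])
errors = Y - Y_hat                                  # [ 1. -1.  1. -2.]

# Categorical data: 0 if the forecast agrees with the actual class, 1 otherwise
labels     = np.array(["up", "down", "up", "up"])
labels_hat = np.array(["up", "up",   "up", "down"])
disagreement = (labels != labels_hat).astype(int)   # [0 1 0 1]

print(errors, disagreement)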

Forecast accuracy measures

A forecast method is unbiased if

E(et) = 0 ⟺ E(Yt) = E(Ŷt),

which implies that the average error is close to zero:

(1/T) Σ (from t = 1 to T) (Yt − Ŷt) ≈ 0.

Is this a good criterion to assess forecast accuracy? WHY?

Answer: Should we choose the method with the sum of errors (i.e. the average error) closest to 0? That is desirable, but what about the SIGN of the errors?

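A small numerical illustration (made-up errors) of why an average error near zero is not enough on its own: large positive and negative errors cancel in the mean, but not in the absolute values.

import numpy as np

errors = np.array([10.0, -10.0, 8.0, -8.0])   # made-up forecast errors
print(errors.mean())            # 0.0 -> the method looks "unbiased"
print(np.abs(errors).mean())    # 9.0 -> yet every forecast misses by a lot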

Accuracy measures

Mean Absolute Deviation (MAD):

MAD = (1/T) Σ (from t = 1 to T) |Yt − Ŷt|

MAD is the average distance between the actual values and the forecasts, i.e. the average absolute forecast error.

Mean Squared Error (MSE):

MSE = (1/T) Σ (from t = 1 to T) (Yt − Ŷt)²

MSE is like a forecast variance, if the forecasts are unbiased.

Root Mean Squared Error (RMSE) = the square root of the MSE.

MAD and RMSE are in the original units of the data. MSE penalises large errors more than MAD.

Accuracy measures

Mean Absolute Percentage Error (MAPE):

MAPE = (1/T) Σ (from t = 1 to T) |Yt − Ŷt| / |Yt| × 100%

Forecast errors are expressed as percentages of the actual data points, e.g. 10%. MAPE is very popular in business forecasting.

E.g., MAPE = 10% means that on average the forecast error is 10% of the data value.
- Sometimes, knowing that the MAPE is, say, 10% can be more valuable than knowing that the error is, e.g., 12 Megalitres (MAD or RMSE).

MAPE cannot be used if any Yt = 0.
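A minimal sketch computing MAD, MSE, RMSE and MAPE from actual values and forecasts (toy numbers, not the beer series):

import numpy as np

Y     = np.array([120.0, 135.0, 150.0, 140.0])   # actual values (toy data)
Y_hat = np.array([118.0, 140.0, 145.0, 150.0])   # forecasts

e = Y - Y_hat
mad  = np.mean(np.abs(e))                      # Mean Absolute Deviation
mse  = np.mean(e ** 2)                         # Mean Squared Error
rmse = np.sqrt(mse)                            # Root Mean Squared Error
mape = np.mean(np.abs(e) / np.abs(Y)) * 100    # Mean Absolute Percentage Error (%)

print(mad, mse, rmse, mape)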

Monthly beer consumption in Australia

Figure: naïve forecast, multi-step-ahead (mean error measures: −43.8, 43.8, −32.7, 32.7).

Figure: naïve forecast, one-step-ahead (mean error measures: −3.6, 22.1, −3.7, 15.9).

Which measure is best?

MAD:
- same units as Y
- does not heavily penalise a small number of large errors

MSE:
- harder to interpret
- heavily penalises large errors
- RMSE has the same units as Y

MAPE:
- measures the percentage error

All can be used simultaneously. Report the measure(s) that the decision maker/manager can BEST understand. Are one or two large errors highly UNDESIRABLE? Yes? Use MSE. No? Use MAD or MAPE.

Which measure is best?

MAD, MSE, RMSE and MAPE are all suitable for numerical data.

For categorical or direction forecasting, the

percentage agreement and/or percentage disagreement

are often used, e.g. how many elections has this model correctly forecast, compared to the total? (Ans. 3)

Training and test sets

An objective way of assessing a model's forecast accuracy is to evaluate its forecasts on a test set.

It is often not enough to look at how well a model fits the historical data. We can only determine the accuracy of forecasts by considering how well a model performs on new data.
- A model which fits the data well does not necessarily forecast well.
- A near-perfect fit can always be obtained by using a model with enough parameters.
- Over-fitting a model to data is as bad as failing to identify the systematic pattern in the data.

Occam’s Razor: Simpler methods tend to work well on average
in practice.

Training and test sets

When choosing models, it is common to use a portion of the
available data for fitting, and use the rest of the data for testing
the model. Then the testing data can be used to measure how
well the model is likely to forecast on new data.

The size of the test set is typically about 20% of the total
sample, although this value depends on how long the sample is
and how far ahead you want to forecast. The size of the test set
should ideally be at least as large as the maximum forecast
horizon required.

Time series references often call the training set the “in-sample
data” and the test set the “out-of-sample data”

Refresher: cross validation for cross-sectional data

Consider leave-one-out cross validation (a minimal sketch is given below):
- Select observation i (leave one out) as the test set, and use the remaining observations as the training set.
- Compute the error on the test observation.
- Repeat the above steps for i = 1, 2, …, T, where T is the total number of observations.
- Compute the forecast accuracy measures based on the T errors.
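A minimal sketch of leave-one-out cross validation for cross-sectional data, using the training-sample mean as a placeholder predictor:

import numpy as np

def loocv_mae(Y):
    """Leave-one-out CV: predict each observation with the mean of the others."""
    errors = []
    for i in range(len(Y)):
        train = np.delete(Y, i)           # all observations except observation i
        y_hat = train.mean()              # placeholder model: the training mean
        errors.append(abs(Y[i] - y_hat))  # error on the left-out observation
    return np.mean(errors)                # accuracy measure over all T passes

Y = np.array([4.0, 7.0, 5.0, 9.0, 6.0])   # toy cross-sectional data
print(loocv_mae(Y))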

Cross-validation for time series data

Two approaches: the sliding window and the expanding window.
- Chronological testing, since the order matters!
- Use consecutive time steps (with a fixed or variable length) for training/building a model.
- Test using the next one or few observations.
- Average over all passes: report this, or use it to compare models.

Testing: we could use one-step-ahead or h-step-ahead forecasts.


Example: using expanding window with one-step-ahead

The blue dots are training data, red dots are test data.

Example: using expanding window with four-step-ahead

The blue dots are training data, red dots are test data. h = 4.
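A minimal sketch of expanding-window cross-validation with naïve forecasts (the model and the minimum training size are placeholders):

import numpy as np

def expanding_window_cv(Y, min_train=5, h=1):
    """Expanding window: train on Y[:t], forecast the observation h steps ahead, keep chronological order."""
    errors = []
    for t in range(min_train, len(Y) - h + 1):
        train = Y[:t]                       # expanding training window
        y_hat = train[-1]                   # placeholder model: naive forecast
        errors.append(abs(Y[t + h - 1] - y_hat))
    return np.mean(errors)                  # average over all passes (here: MAD)

Y = np.array([30.0, 32.0, 31.0, 35.0, 36.0, 34.0, 38.0, 40.0, 39.0, 42.0])   # toy series
print(expanding_window_cv(Y, min_train=5, h=1))   # one-step-ahead
print(expanding_window_cv(Y, min_train=5, h=4))   # four-step-ahead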

