Predictive Analytics – Time Series Forecasting
Discipline of Business Analytics, The University of School
QBUS2820 content structure
1. Statistical and Machine Learning foundations and applications.
2. Advanced regression methods.
3. Classification methods.
4. Time series forecasting.
Readings: Chapters 1, 2 and 3 in https://otexts.com/fpp2/
Time Series Forecasting
1. Problem definition
2. Time series patterns
3. Simple forecasting methods
4. Measuring forecast accuracy
5. Random walk model
Time series
A time series is a set of observations y1, y2, . . . , yt ordered in time.
• Weekly unit sales of a product.
• Unemployment rate in Australia each quarter.
• Daily production levels of a product.
• Average annual temperature in Sydney.
• 5 minute prices for CBA stock on the ASX.
Example: visitor arrivals in Australia
Example: AUD/USD exchange rate
Example: assaults in Sydney
Forecasting
A forecast is a prediction about future events and conditions given
all current information, including historical data and knowledge of
any future events that might impact these events.
The act of making such predictions is called forecasting.
Forecasting informs business and economic decision making,
planning, government policy, etc.
• Governments need to forecast unemployment, interest rates,
expected revenues from income taxes to formulate policies.
• Retail stores need to forecast demand to control inventory
levels, hire employees and provide training.
• Banks/investors/financial analysts need to forecast financial
returns, risk or volatility, market ’timing’.
• University administrators need to forecast enrollments to plan
for facilities and for faculty recruitment.
• Sports organisations need to project sports performance,
crowd figures, club gear sales, revenues, etc. in the coming
Forecasting in business
Different problems lead to different approaches under the umbrella
of forecasting.
• Quantitative (data based) forecasting (our focus in this unit).
• Qualitative (judgmental) forecasting.
• Common approach: judgmentally adjusted statistical
forecasting.
Problem definition
Forecasting
Our objective is to predict the value of a time indexed response
variable at a future point t + h, given the observed series until the
present point t. That is, we want to predict Yt+h given
y1, y2, . . . , yt, where h is the forecast horizon.
We can extend this setting to allow for the presence of predictors
x1, x2, . . . , xt, leading to a dynamic regression problem.
Decision theory
We denote a point forecast as Ŷt+h = f(Y1:t). As before, we
assume a squared error loss function:
L(Yt+h, f(Y1:t)) = (Yt+h − f(Y1:t))²
We use the slice notation Y1:t as a compact way to write
Y1, . . . , Yt.
Point forecasting (key concept)
Using the arguments from earlier in the unit, the optimal point
forecast under the squared error loss is the conditional expectation:
f(Y1:t) = E(Yt+h|Y1:t)
Our objective is therefore to approximate the conditional
expectation of Yt+h given the historical data, possibly for multiple
values of h.
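As a brief reminder of why the conditional expectation is optimal, the expected squared error loss of any candidate forecast f, conditional on the observed history, can be split into two terms. This is a sketch of the standard argument, not a new result:

```latex
E\left[(Y_{t+h} - f)^2 \mid Y_{1:t}\right]
  = \mathrm{Var}(Y_{t+h} \mid Y_{1:t})
  + \left(E(Y_{t+h} \mid Y_{1:t}) - f\right)^2
```

The first term does not depend on f, and the second term is zero exactly when f = E(Yt+h|Y1:t), so this choice minimises the expected loss.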
Interval forecasting (key concept)
Uncertainty quantification is essential for business forecasting.
A density forecast p̂(Yt+h|y1, . . . , yt) is an estimate of the entire
conditional density p(Yt+h|y1, . . . , yt).
An interval forecast is an interval (ŷt+h,L, ŷt+h,U ) such that
P̂ (ŷt+h,L < Yt+h < ŷt+h,U ) = 1 − α.
Fan chart (key concept)
• For consecutive forecast horizons, construct prediction
intervals for different probability levels (say, 75%, 90%, and
99%) and plot them using different shades.
• The intervals typically get wider with the horizon, representing
increasing uncertainty about future values.
• Fan charts are useful tools for presenting forecasts.
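The bands that a fan chart shades can be sketched as follows. This is a minimal illustration that assumes a Gaussian forecast distribution at every horizon; the function name `fan_chart_bands` and the input numbers are hypothetical, chosen only to show intervals widening with both the probability level and the horizon.

```python
from statistics import NormalDist


def fan_chart_bands(point_forecasts, forecast_sds, levels=(0.75, 0.90, 0.99)):
    """Return {level: [(lower, upper), ...]} with one interval per horizon.

    Assumption: each future value is Gaussian with the given mean and
    standard deviation.
    """
    bands = {}
    for level in levels:
        # Two-sided critical value z such that P(-z < Z < z) = level.
        z = NormalDist().inv_cdf(0.5 + level / 2)
        bands[level] = [
            (m - z * s, m + z * s)
            for m, s in zip(point_forecasts, forecast_sds)
        ]
    return bands


# Illustrative inputs: a flat point forecast whose uncertainty grows with h.
bands = fan_chart_bands([100.0, 100.0, 100.0], [1.0, 1.5, 2.0])
for level, intervals in bands.items():
    print(level, [(round(lo, 2), round(hi, 2)) for lo, hi in intervals])
```

Plotting each level as a shaded region, darkest for the narrowest band, gives the fan shape.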
Example: fan chart
Time series patterns
Time series patterns (key concept)
We interpret a time series as
Yt = f(Tt, St, Ct, Et),
where Tt is the trend component, St is the seasonal component, Ct
is the cyclic component, and Et is an irregular or error component.
Trend. The systematic long term increase or decrease in the series.
Seasonal. A systematic change in the mean of the series due to
seasonal factors (month, day of the week, etc).
Cyclic. A cyclic pattern exists when there are medium or long run
fluctuations in the time series that are not of a fixed period.
Irregular. Short term fluctuations and noise.
Examples: time series patterns
Example: cyclic series
Time series decomposition
Time series decomposition methods are algorithms for splitting
a time series into different components, typically for purposes of
seasonal adjustment and interpretation.
In the context of forecasting, decomposition methods are useful
tools for exploratory data analysis, allowing us to visualise patterns
in the data.
We won’t cover methodology for time series decomposition in the
lecture, but students will learn how to use a package in tutorials.
Time series decomposition: visitor arrivals
Example: seasonal adjustment and trend extraction
Simple forecasting methods
Random walk
The random walk method (called the naïve method in the book)
forecasts the series using the value of the last available observation:
ŷt+h = yt
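A minimal sketch of the random walk (naïve) method in Python; the function name `naive_forecast` and the sample data are hypothetical.

```python
def naive_forecast(y, h):
    """Random walk (naive) forecasts for horizons 1..h.

    Every forecast equals the last available observation.
    """
    if not y:
        raise ValueError("need at least one observation")
    return [y[-1]] * h


sales = [112, 118, 132, 129, 121]
print(naive_forecast(sales, 3))  # [121, 121, 121]
```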
Seasonal random walk
For time series with seasonal patterns, we can extend the random
walk method by forecasting the series with the value of the last
available observation in the same season:
ŷt+h = yt+h−m (if h ≤ m),
where m is the seasonal period. For example, m = 12 and m = 4
for monthly and quarterly data respectively.
The general formula is
ŷt+h = yt+h−km, k = ⌊(h − 1)/m + 1⌋.
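The general formula can be sketched in Python as below. The function name `seasonal_naive_forecast` and the toy quarterly series are hypothetical; the only subtlety is converting the 1-indexed time subscript into a 0-indexed list position.

```python
def seasonal_naive_forecast(y, h, m):
    """Seasonal random walk forecasts for horizons 1..h with seasonal period m."""
    t = len(y)
    if t < m:
        raise ValueError("need at least one full seasonal cycle")
    forecasts = []
    for step in range(1, h + 1):
        # k = floor((step - 1) / m) + 1: number of seasons to step back.
        k = (step - 1) // m + 1
        # y_{t + step - k*m} in 1-indexed notation is y[t + step - k*m - 1].
        forecasts.append(y[t + step - k * m - 1])
    return forecasts


# Two years of toy quarterly data (m = 4).
quarterly = [10, 20, 30, 40, 11, 21, 31, 41]
print(seasonal_naive_forecast(quarterly, 6, 4))  # [11, 21, 31, 41, 11, 21]
```

For horizons beyond one seasonal cycle, the forecasts simply repeat the last observed cycle.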
Drift method
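The drift method forecasts the series as the sum of the most
recent value (as in the naïve method) and the average change over
time:
$$\hat{y}_{t+1} = y_t + \frac{1}{t-1}\sum_{i=2}^{t}(y_i - y_{i-1}) = y_t + \frac{y_t - y_1}{t-1}$$
$$\hat{y}_{t+h} = y_t + h \times \frac{y_t - y_1}{t-1}$$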
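A minimal sketch of the drift method, assuming the average change is the mean of the first differences, which telescopes to (yt − y1)/(t − 1). The function name `drift_forecast` and the sample series are hypothetical.

```python
def drift_forecast(y, h):
    """Drift forecasts for horizons 1..h.

    The slope is the average historical change, (y_t - y_1) / (t - 1).
    """
    t = len(y)
    if t < 2:
        raise ValueError("need at least two observations")
    slope = (y[-1] - y[0]) / (t - 1)
    return [y[-1] + step * slope for step in range(1, h + 1)]


series = [10, 12, 15, 16]
print(drift_forecast(series, 3))  # [18.0, 20.0, 22.0]
```

This is equivalent to drawing a straight line through the first and last observations and extending it into the future.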
• Time series forecasting is essential in many business
applications.
• Point forecast, interval forecast and density forecast.
• Time series decomposition: trend, seasonal and irregular
components.
• Simple forecast methods: naïve method and seasonal naïve method.
Measuring forecast accuracy
Let Ŷt+h|t (also denoted Ŷt+h) be the forecast of Yt+h given the
observations y1:t.
• Ŷt+1|t is often called the one-step-ahead forecast.
• Ŷt+h|t, h ≥ 2, is often called a multiple-step-ahead forecast.
We typically assume the squared error loss and compute the
out-of-sample MSE to measure forecast accuracy.
However, it is useful to be familiar with other measures that are
common in business forecasting:
• Mean absolute error.
• Percentage errors.
Mean absolute error
The mean absolute error is
MAE = mean(|yt − ŷt|).
MAE and RMSE (the square root of the MSE) are in the original units
of the data. MSE penalises large errors more than MAE.
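The two measures can be sketched in a few lines of Python. The function names and the toy numbers are hypothetical; the example includes one large error to show how RMSE reacts more strongly than MAE.

```python
from math import sqrt


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error (square root of the MSE)."""
    mse = sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)
    return sqrt(mse)


actual = [100, 110, 120, 130]
forecast = [102, 108, 120, 120]  # errors: -2, 2, 0, 10
print(mae(actual, forecast))   # 3.5
print(rmse(actual, forecast))  # about 5.196, pulled up by the single error of 10
```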
Percentage errors
• The percentage error is given by pt = 100 × ((yt − ŷt)/yt). It
has the advantage of being scale-independent.
• The most commonly used measure is mean absolute
percentage error
MAPE = mean(|pt|).
• Measures based on percentage errors have the disadvantage of
being infinite or undefined if yt = 0 for any t in the period of
interest, and having extreme values when any yt is close to zero.
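A minimal sketch of MAPE in Python; the function name `mape` and the sample values are hypothetical. Raising an error on a zero actual value is a design choice made here so that the undefined case is visible rather than silently producing infinity.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100 * total / len(actual)


actual = [100, 200, 50]
forecast = [110, 190, 55]  # absolute percentage errors: 10%, 5%, 10%
print(round(mape(actual, forecast), 2))  # 8.33
```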
Example: Quarterly Australian Beer Production
The figure shows three forecasting methods applied to the
quarterly Australian beer production using data to the end of 2005.
We compute the forecast accuracy measures for 2006-2008.
Example: Quarterly Australian Beer Production
Method                  RMSE    MAE     MAPE (%)
Mean method             38.01   33.78    8.17
Naïve method            70.91   63.91   15.88
Seasonal naïve method   12.97   11.27    2.73
It is clear from the graph that the seasonal naive method is best
for the data, although it can still be improved.
Random walk model
Random walk model (key example)
In this section, we use the random walk method to illustrate how
to obtain point and interval forecasts for multiple horizons based
on a time series model.
We assume the model
Yt = Yt−1 + εt,
where εt is i.i.d. with mean zero and constant variance σ².
Random walk model
Since Yt = Yt−1 + εt, we can use back substitution to show that
Yt+1 = Yt + εt+1
Yt+2 = Yt+1 + εt+2
= Yt + εt+1 + εt+2
Yt+h = Yt+h−1 + εt+h
= Yt + εt+1 + . . . + εt+h
Point forecast
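$$Y_{t+h} = Y_t + \sum_{i=1}^{h} \varepsilon_{t+i}$$
Therefore, we obtain the point forecast for any horizon as
$$\hat{y}_{t+h} = E(Y_{t+h} \mid y_{1:t}) = E\left(Y_t + \sum_{i=1}^{h} \varepsilon_{t+i} \,\middle|\, y_{1:t}\right) = y_t$$
The conditional variance is
$$\mathrm{Var}(Y_{t+h} \mid y_{1:t}) = \mathrm{Var}\left(y_t + \sum_{i=1}^{h} \varepsilon_{t+i} \,\middle|\, y_{1:t}\right) = h\sigma^2$$
For density forecasting, we need to make further assumptions
about the errors. If we assume that εt ∼ N(0, σ²),
$$Y_{t+h} \mid y_{1:t} \sim N(y_t, h\sigma^2)$$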
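As an informal check, we can simulate many random walk paths that start from the same last observation and look at where they end up after h steps. Under the model with Gaussian errors, the endpoints should have mean close to yt and variance close to hσ². This is a sketch only; the function name, starting value, σ and number of paths are hypothetical choices.

```python
import random


def simulate_random_walk_endpoints(y_t, h, sigma, n_paths, seed=0):
    """Simulate Y_{t+h} = y_t + eps_{t+1} + ... + eps_{t+h} with Gaussian errors."""
    rng = random.Random(seed)
    endpoints = []
    for _ in range(n_paths):
        value = y_t
        for _ in range(h):
            value += rng.gauss(0.0, sigma)
        endpoints.append(value)
    return endpoints


endpoints = simulate_random_walk_endpoints(y_t=10.0, h=5, sigma=1.0, n_paths=20000)
sim_mean = sum(endpoints) / len(endpoints)
sim_var = sum((e - sim_mean) ** 2 for e in endpoints) / len(endpoints)
print(sim_mean, sim_var)  # should be close to 10 and 5 respectively
```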
Forecast interval
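Under the Gaussian assumption,
$$Y_{t+h} \mid y_{1:t} \sim N(y_t, h\sigma^2),$$
leading to the forecast interval
$$y_t \pm z_{\alpha/2} \times \sqrt{h\hat{\sigma}^2},$$
where
$$\hat{\sigma}^2 = \frac{1}{t-1}\sum_{i=2}^{t}(y_i - y_{i-1})^2$$
and $z_{\alpha/2}$ is the appropriate critical value from the normal
distribution.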
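A minimal sketch of this plug-in forecast interval in Python. The function name `random_walk_interval` and the toy series are hypothetical, and dividing the sum of squared differences by t − 1 (the number of differences) is an assumption about the variance estimator.

```python
from math import sqrt
from statistics import NormalDist


def random_walk_interval(y, h, alpha=0.05):
    """(lower, upper) forecast interval for Y_{t+h} under a Gaussian random walk."""
    t = len(y)
    if t < 2:
        raise ValueError("need at least two observations")
    diffs = [y[i] - y[i - 1] for i in range(1, t)]
    # Assumed estimator: mean of squared first differences (divisor t - 1).
    sigma2_hat = sum(d * d for d in diffs) / (t - 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(h * sigma2_hat)
    return y[-1] - half_width, y[-1] + half_width


series = [1.0, 2.0, 1.0, 2.0, 1.0]  # squared differences are all 1
lower, upper = random_walk_interval(series, h=4)
print(round(lower, 2), round(upper, 2))  # -2.92 4.92
```

The half-width grows with the square root of h, which is what produces the widening fan shape in a fan chart.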
Example: USD/AUD exchange rate
Forecast interval
Forecast interval based on the assumption of normal errors:
• This forecast interval is based on the plug-in method, as we
replace the unknown σ2 with an estimate.
• The plug-in method is a standard approach, but you should be
aware that it ignores parameter uncertainty, leading to
prediction intervals that are too narrow.
• If the errors are not Gaussian, you should use other methods
such as the bootstrap (not in the scope of our unit).
Review questions
• What are point and interval forecasts?
• What are the four time series components?
• Which diagnostics do we use for univariate time series models,
• How do we conduct model validation for forecasting?
• How do we compute forecasts and prediction intervals for the
random walk model?
Another model: Simple exponential smoothing
• A ’rule-of-thumb’ model: Weighted average of the past, where
the further into the past, the smaller the weight.
• Simple yet effective.
• This idea is extended to a family of models (Exponential
smoothing)
The simple exponential smoothing (SES) method specifies the
forecasting rule
ŷt+1 = ℓt (forecast equation)
ℓt = αyt + (1 − α)ℓt−1 (smoothing equation)
for an initial value ℓ0 and 0 ≤ α ≤ 1.
ℓt is known as the level of the time series.
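The smoothing recursion can be sketched in Python as follows. The function name `ses_levels`, the sample data, the value of α and the initial level are hypothetical choices; in practice α and ℓ0 would need to be chosen or estimated.

```python
def ses_levels(y, alpha, level0):
    """Return the levels l_1, ..., l_t from simple exponential smoothing."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie in [0, 1]")
    levels = []
    level = level0
    for obs in y:
        # Smoothing equation: l_t = alpha * y_t + (1 - alpha) * l_{t-1}
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels


y = [10.0, 12.0, 11.0]
levels = ses_levels(y, alpha=0.5, level0=10.0)
print(levels)      # [10.0, 11.0, 11.0]
print(levels[-1])  # one-step-ahead forecast from the forecast equation: 11.0
```

With α = 1 the method reduces to the random walk forecast, while α = 0 keeps the level fixed at ℓ0.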
Exponentially weighted moving average
ℓ1 = αy1 + (1 − α)ℓ0
ℓ2 = αy2 + (1 − α)ℓ1
= αy2 + (1 − α)αy1 + (1 − α)²ℓ0
ℓ3 = αy3 + (1 − α)ℓ2
= αy3 + (1 − α)αy2 + (1 − α)²αy1 + (1 − α)³ℓ0
ℓ4 = αy4 + (1 − α)ℓ3
= αy4 + (1 − α)αy3 + (1 − α)²αy2 + (1 − α)³αy1 + (1 − α)⁴ℓ0
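Continuing the same substitution for a general t suggests the following closed form, written here as a sketch of the pattern in the expansions above:

```latex
\ell_t = \alpha \sum_{j=0}^{t-1} (1-\alpha)^j \, y_{t-j} + (1-\alpha)^t \, \ell_0
```

The weights α(1 − α)^j decay geometrically with the age j of the observation, which is why the level is called an exponentially weighted moving average. Together with the weight (1 − α)^t on ℓ0, the weights sum to one.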