
Week-12 Practical Forecasting Issues

Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman

Business Forecasting Analytics
ADM 4307 – Fall 2021

Practical Forecasting Issues

Ahmet Kandakoglu, PhD

22 November, 2021

Outline

• Models for different frequencies

• Ensuring forecasts stay within limits

• Forecast combinations

• Missing values

• Outliers

• Choosing the right forecasting technique


Models for Different Frequencies

• Models for annual data

• ETS, ARIMA, Dynamic regression

• Models for quarterly data

• ETS, ARIMA/SARIMA, Dynamic regression, Dynamic harmonic regression, STL+ETS, STL+ARIMA

• Models for monthly data

• ETS, ARIMA/SARIMA, Dynamic regression, Dynamic harmonic regression, STL+ETS, STL+ARIMA

• Models for weekly data

• ARIMA/SARIMA, Dynamic regression, Dynamic harmonic regression, STL+ETS, STL+ARIMA, TBATS

• Models for daily, hourly and other sub-daily data

• ARIMA/SARIMA, Dynamic regression, Dynamic harmonic regression, STL+ETS, STL+ARIMA, TBATS


Ensuring Forecasts Stay within Limits

• It is common to want forecasts to be positive, or to require them to be within some specified range [a, b].

• Both of these situations are relatively easy to handle using transformations.

• Positive Forecasts

• To impose a positivity constraint, simply work on the log scale, by specifying the Box-Cox parameter λ = 0.

• Forecasts Constrained to an Interval

• We can transform the data using a scaled logit transform, which maps (a, b) to the whole real line: y = log((x - a) / (b - x)), where x is on the original scale and y is the transformed data.


Ensuring Forecasts Stay within Limits

• For example, consider the real price of a dozen eggs (1900-1993; in cents):


Figure panels: forecasts constrained to lie between 50 and 400; forecasts constrained to be positive.
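The two constraints can be sketched in base R. This is only an illustration: the function names scaled_logit and inv_scaled_logit, and the price values, are assumptions made for the example and are not the real egg price series.

```r
# Scaled logit transform: maps the open interval (lower, upper) to the real line.
scaled_logit <- function(x, lower = 0, upper = 1) {
  log((x - lower) / (upper - x))
}

# Inverse transform: maps any real number back into (lower, upper).
inv_scaled_logit <- function(y, lower = 0, upper = 1) {
  (upper - lower) * exp(y) / (1 + exp(y)) + lower
}

# Hypothetical egg prices in cents (illustrative values, not the real data).
prices <- c(62, 95, 180, 310)
z <- scaled_logit(prices, lower = 50, upper = 400)

# Any value on the transformed scale, however extreme,
# back-transforms to a value strictly between 50 and 400.
back <- inv_scaled_logit(c(-5, 0, 5), lower = 50, upper = 400)

# Positivity constraint: model log(x), then exponentiate the result.
positive <- exp(c(-5, 0, 5))
```

In a fable workflow, a transform/inverse pair like this can typically be registered with fabletools::new_transformation() and then used inside the model formula, so that forecasts are back-transformed automatically.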


Forecast Combinations

• An easy way to improve forecast accuracy is to use several different methods on the same time series, and to average the resulting forecasts.

• Clemen (1989):

• The results have been virtually unanimous: combining multiple forecasts leads to increased forecast accuracy. In many cases one can make dramatic performance improvements by simply averaging the forecasts.

• While there has been considerable research on using weighted averages, or some other more complicated combination approach, using a simple average has proven hard to beat.
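A simple average can be illustrated with a toy example in base R. All numbers below are made up for illustration, and the helper rmse() is defined here rather than taken from any package.

```r
# Root mean squared error of a forecast against the observed values.
rmse <- function(forecast, actual) sqrt(mean((forecast - actual)^2))

# Hypothetical test-set values and forecasts from two methods (made-up numbers).
actual   <- c(100, 104, 109, 115)
method_a <- c( 97, 100, 104, 109)   # tends to forecast too low
method_b <- c(104, 109, 115, 122)   # tends to forecast too high

# Simple (equal-weight) combination: average the forecasts point by point.
combination <- (method_a + method_b) / 2

scores <- c(
  A = rmse(method_a, actual),
  B = rmse(method_b, actual),
  Combination = rmse(combination, actual)
)
print(round(scores, 2))
```

Here the two methods err in opposite directions, so their errors largely cancel. In general the RMSE of an equal-weight average cannot exceed that of the worse component, but it is not guaranteed to beat the best single method.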


Example – Expenditure on Eating Out

We form a combination in the mutate() function by simply taking an average of the estimated models.

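# Assumes the fpp3 package, which provides aus_retail, ETS(), ARIMA() and the pipe
library(fpp3)

auscafe <- aus_retail %>%
  filter(Industry == "Takeaway food services") %>%
  summarise(Turnover = sum(Turnover))

# Training set: data up to the end of 2013
train <- auscafe %>% filter(year(Month) <= 2013)

cafe_models <- train %>%
  model(
    ETS = ETS(Turnover),
    ARIMA = ARIMA(log(Turnover))
  ) %>%
  # Equal-weight average of the two fitted models
  mutate(Combination = (ETS + ARIMA) / 2)

cafe_fc <- cafe_models %>% forecast(h = "5 years")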


Example – Expenditure on Eating Out

Forecast combinations

cafe_fc %>%
  autoplot(auscafe %>% filter(year(Month) > 2008), level = NULL) +
  labs(y = "$ billion", title = "Australian monthly expenditure on eating out")


Example – Expenditure on Eating Out

cafe_fc %>% accuracy(auscafe) %>% arrange(RMSE)


# A tibble: 3 x 10
  .model      .type    ME  RMSE   MAE   MPE  MAPE  MASE RMSSE  ACF1
1 ARIMA       Test  -25.4  46.2  38.9 -1.77  2.65 0.949 0.890 0.786
2 Combination Test   30.6  57.4  45.1  1.87  3.02 1.10  1.10  0.814
3 ETS         Test   86.5 122.  101.   5.51  6.66 2.46  2.35  0.880


Missing Values

• Missing data can arise for many reasons.

• It is worth considering whether the missingness will induce bias in the forecasting model.

• When missing values cause errors, there are at least two ways to handle the problem.

• We could just take the section of data after the last missing value, assuming there is a long enough series of observations to produce meaningful forecasts.

• We could replace the missing values with estimates. The interpolate() function is designed for this purpose.
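The first option, keeping only the data after the last missing value, might look like this in base R. The helper after_last_na() and the series values are hypothetical and serve only as an illustration.

```r
# Hypothetical series with two missing observations (made-up numbers).
y <- c(5.1, NA, 6.8, 7.2, NA, 9.4, 9.9, 10.3)

# Keep only the observations after the last missing value.
after_last_na <- function(x) {
  if (!anyNA(x)) return(x)           # nothing missing: keep everything
  last_na <- max(which(is.na(x)))    # position of the last missing value
  tail(x, length(x) - last_na)       # empty if the final value is missing
}

y_tail <- after_last_na(y)
```

Whether the remaining section is long enough to fit a model is a judgment call that depends on the seasonal period and the method being used.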


Missing Values

• Some methods handle missing values without any problems; others do not.

• Functions that can handle missing values:

• ARIMA()

• TSLM()

• NNETAR()

• VAR()

• FASSTER()

• Functions that cannot handle missing values:

• ETS()

• STL()

• TBATS()


Example – Daily Gold Prices

We will use the gold dataset from the forecast package (install and load the package to access the dataset).

gold <- as_tsibble(gold)

gold %>% autoplot(value)


Example – Daily Gold Prices

gold_complete <- gold %>% model(ARIMA(value)) %>% interpolate(gold)

gold_complete %>% autoplot(value, colour = "red") + autolayer(gold, value)


Outliers

• Outliers are observations that are very different from the majority of the observations in the time series.

• They may be errors, or they may simply be unusual.

• None of the methods we have considered in this course will work well if there are extreme outliers in the data.

• In this case, we may wish to replace them with missing values, or with an estimate that is more consistent with the majority of the data.

• Simply replacing outliers without thinking about why they have occurred is a dangerous practice. They may provide useful information which should be taken into account when forecasting.


Example – Australia Visitors

Number of visitors to the Adelaide Hills region of South Australia.

There appears to be an unusual observation in 2002 Q4.

tourism %>%
  filter(Region == "Adelaide Hills", Purpose == "Visiting") %>%
  autoplot(Trips) +
  labs(title = "Quarterly overnight trips to Adelaide Hills", y = "Number of trips")


Example – Australia Visitors

One useful way to find outliers is to apply STL() to the series with the argument robust=TRUE. Any outliers should then show up in the remainder series.

Since the data have almost no visible seasonality, we will apply STL without a seasonal component by setting period=1.

ah_decomp <- tourism %>%
  filter(
    Region == "Adelaide Hills", Purpose == "Visiting"
  ) %>%
  # Fit a non-seasonal STL decomposition
  model(
    stl = STL(Trips ~ season(period = 1), robust = TRUE)
  ) %>%
  components()

ah_decomp %>% autoplot()


Example – Australia Visitors

In more challenging cases, a boxplot of the remainder series would be useful. A standard boxplot flags points that are more than 1.5 interquartile ranges (IQRs) from the central 50% of the data.

A stricter rule is to define outliers as those that are more than 3 IQRs from the central 50% of the data.

outliers <- ah_decomp %>%
  filter(
    remainder < quantile(remainder, 0.25) - 3 * IQR(remainder) |
      remainder > quantile(remainder, 0.75) + 3 * IQR(remainder)
  )

outliers


# A dable: 1 x 9 [1Q]
# Key:     Region, State, Purpose, .model [1]
# :        Trips = trend + remainder
  Region         State Purpose .model Quarter Trips trend remainder season_adjust
1 Adelaide Hills Sout~ Visiti~ stl    2002 Q4  81.1  11.1      70.0          81.1
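The 3-IQR rule, followed by replacing the flagged value with an estimate, can be sketched in base R on a toy series. The numbers and the flat trend are made up, and linear interpolation via approx() is used here only as a simple stand-in for the model-based interpolate() approach.

```r
# Hypothetical remainder values from a decomposition (made-up numbers);
# the fifth value plays the role of the unusual observation.
remainder <- c(-2.1, 0.4, 1.3, -0.8, 70.0, 0.9, -1.5, 2.2, -0.3, 1.1)
trend <- rep(11, length(remainder))   # hypothetical flat trend
trips <- trend + remainder            # toy series = trend + remainder

# Fences at 3 IQRs beyond the first and third quartiles of the remainder.
q1 <- unname(quantile(remainder, 0.25))
q3 <- unname(quantile(remainder, 0.75))
iqr <- IQR(remainder)
is_outlier <- remainder < q1 - 3 * iqr | remainder > q3 + 3 * iqr

# Replace flagged observations with NA, then fill them by linear
# interpolation between the neighbouring observations.
cleaned <- replace(trips, is_outlier, NA)
idx <- seq_along(cleaned)
keep <- !is.na(cleaned)
filled <- approx(idx[keep], cleaned[keep], xout = idx)$y
```

Before replacing anything, it is worth checking why the flagged observation occurred, since it may carry information that should be kept in the model.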


Choosing a Forecasting Technique

• No single technique works in every situation

• The two most important factors:

• Cost

• Accuracy

• Other factors in selecting a forecasting technique:

• Relevance and availability of historical data

• Forecasting horizon

• Time available for making the analysis

• Pattern of data

