Week-3 Time Series Decomposition

Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman

Business Forecasting Analytics
ADM 4307 – Fall 2021

Time Series Decomposition

Ahmet Kandakoglu, PhD

27 September, 2021

Outline

• Review of last lecture

• Transformations and adjustments

• Time series components

• Seasonal adjustment

• Decomposition methods

• Moving averages

• Classical Decomposition

• STL Decomposition

Fall 2021 ADM 4307 Business Forecasting Analytics 2

Review of Last Lecture

• Overview of forecasting techniques

• Time series graphics

• Time series data patterns

• Some simple forecasting methods

• Residual diagnostics

• Evaluating forecasting accuracy

• Prediction intervals

Transformations and Adjustments

• Adjusting the historical data can often lead to a simpler forecasting model

• Four kinds of adjustments:

• Calendar adjustments

• Population adjustments

• Inflation adjustments

• Mathematical transformations

• The main goal is to simplify the patterns in the historical data by removing known sources of variation or by making the pattern more consistent across the data set

• Simpler patterns usually lead to more accurate forecasts

Calendar Adjustments

• Some variation seen in seasonal data may be due to simple calendar effects. In such cases, it is usually much easier to remove the variation before fitting a forecasting model

[Figure: two time plots against years, "Monthly milk production per cow" and "Average milk production per cow per day"]

Example:

• If you are studying monthly milk production on a farm, there will be variation between the months simply because of the different number of days in each month, in addition to the seasonal variation across the year

• It is easy to remove this variation by computing average production per day in each month, rather than total production in the month.
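The per-day adjustment can be sketched in base R. The monthly totals here are hypothetical (a constant 20 units per day for 2021), chosen so that all variation in the totals comes from month length; none of the numbers come from the milk series.

```r
# Calendar adjustment sketch: monthly totals -> average per day.
# Hypothetical data: production is a constant 20 units per day in 2021.
month_start <- seq(as.Date("2021-01-01"), by = "month", length.out = 13)

# Days in each month = gap between consecutive month starts
days_in_month <- as.numeric(diff(month_start))

monthly_total <- 20 * days_in_month             # varies with month length
daily_average <- monthly_total / days_in_month  # constant after adjustment

print(data.frame(month = format(month_start[1:12], "%m"),
                 days_in_month, monthly_total, daily_average))
```

The totals range from 560 (February) to 620 (31-day months) even though nothing about daily production changed, which is exactly the variation the adjustment removes.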

Population Adjustments

• Any data that are affected by population changes can be adjusted to give per-capita data

• Consider the data per person (or per thousand or per million people) rather than the total

Example:

• If you are studying the number of hospital beds in a particular region over time, the results are much easier to interpret if you remove the effect of population changes by considering the number of beds per thousand people

• Then you can see whether there have been real increases in the number of beds, or whether the increases are entirely due to population increases

• It is possible for the total number of beds to increase, but the number of beds per thousand people to decrease (population is increasing faster than the number of hospital beds)
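A small sketch of the hospital-beds case, using made-up numbers (they are assumptions for illustration, not real data), shows how totals can rise while the per-thousand figure falls:

```r
# Per-capita adjustment sketch with hypothetical numbers (not real data).
beds       <- c(5000, 5200, 5400)            # total hospital beds per year
population <- c(1000000, 1100000, 1200000)   # people in the region

beds_per_thousand <- beds / (population / 1000)
print(round(beds_per_thousand, 2))

# Totals rise every year while the per-thousand figure falls,
# because population grows faster than the number of beds.
```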

Population Adjustments

library(fpp3)  # loads tsibble, feasts, fable and the example data sets

global_economy %>%
  filter(Country == "Australia") %>%
  autoplot(GDP / Population) +
  labs(title = "GDP per capita", y = "$US")

Inflation Adjustments

• Data which are affected by the value of money are best adjusted before modelling

Example:

• A $200,000 house this year is not the same as a $200,000 house twenty years ago. For this reason, financial time series are usually adjusted so all values are stated in dollar values from a particular year. For example, the house price data may be stated in year 2000 dollars

• To make these adjustments a price index is used. If \(z_t\) denotes the price index and \(y_t\) denotes the original house price in year \(t\), then \(x_t = y_t / z_t \times z_{2000}\) gives the adjusted house price at year 2000 dollar values

• Price indexes are often constructed by government agencies. For consumer goods, a common price index is the Consumer Price Index (or CPI)
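The price-index adjustment \(x_t = y_t / z_t \times z_{2000}\) can be sketched as follows. The prices and index values are hypothetical, picked for easy arithmetic; they are not real CPI figures.

```r
# Inflation adjustment sketch: x_t = y_t / z_t * z_2000.
# Prices and index values are hypothetical, not real CPI data.
year  <- c(2000, 2010, 2020)
price <- c(200000, 300000, 400000)   # y_t: nominal house price
index <- c(100, 125, 160)            # z_t: price index

base_index <- index[year == 2000]    # z_2000
real_price <- price / index * base_index

print(data.frame(year, price, real_price))
```

In year 2000 dollars the price rises from 200,000 to 250,000, far less than the nominal doubling.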

Inflation Adjustments

Example:

• After adjusting turnover for inflation using the CPI, we can see that Australia’s newspaper and book retailing industry has been in decline much longer than the original data suggests.

Mathematical Transformations

• If the data show variation that increases or decreases with the level of the series, then a transformation can be useful

• Denote original observations as \(y_1, \dots, y_T\) and transformed observations as \(w_1, \dots, w_T\)

• Mathematical transformations for stabilizing variation

• Square root: \(w_t = \sqrt{y_t}\)

• Cube root: \(w_t = \sqrt[3]{y_t}\)

• Logarithm: \(w_t = \log(y_t)\)

• Logarithms are useful because they are interpretable: changes in a log value are relative (or percentage) changes on the original scale

• Another useful feature of log transformations is that they constrain the forecasts to stay positive on the original scale
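Both properties of the log transformation can be checked with a short sketch. The series is hypothetical (10% growth per period) and is only meant to illustrate the two claims.

```r
# Hypothetical series growing by 10% per period
y <- c(100, 110, 121)

# Differences of logs are (approximately) relative changes on the original scale
log_change      <- diff(log(y))              # both equal log(1.1)
relative_change <- diff(y) / y[-length(y)]   # both equal 0.10

print(round(log_change, 4))
print(relative_change)

# Back-transforming any value from the log scale gives a positive number,
# so forecasts made on the log scale stay positive on the original scale
back <- exp(c(-5, 0, 5))
print(back > 0)
```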

Mathematical Transformations: An Example

food <- aus_retail %>%
  filter(Industry == "Food retailing") %>%
  summarise(Turnover = sum(Turnover))

food %>%
  autoplot(Turnover) +
  labs(y = "Turnover ($AUD)")

Mathematical Transformations: An Example

food %>%
  autoplot(sqrt(Turnover)) +
  labs(y = "Square root turnover")

Mathematical Transformations: An Example

food %>%
  autoplot(-1 / Turnover) +
  labs(y = "Inverse turnover")

Mathematical Transformations: An Example

food %>%
  autoplot(log(Turnover)) +
  labs(y = "Log turnover")

Mathematical Transformations

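• A useful family of transformations is the family of “Box-Cox transformations”, which depends on the parameter \(\lambda\) as follows:

\[ w_t = \begin{cases} \log(y_t) & \text{if } \lambda = 0 \\ \left(\operatorname{sign}(y_t)\,|y_t|^{\lambda} - 1\right)/\lambda & \text{otherwise} \end{cases} \]

• Having chosen a transformation, we need to forecast the transformed data. Then, we need to reverse the transformation (or back-transform) to obtain forecasts on the original scale. The reverse Box-Cox transformation is given by:

\[ y_t = \begin{cases} \exp(w_t) & \text{if } \lambda = 0 \\ \operatorname{sign}(\lambda w_t + 1)\,|\lambda w_t + 1|^{1/\lambda} & \text{otherwise} \end{cases} \]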
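A minimal base-R sketch of the transformation and its inverse for positive data, assuming the usual form (log when \(\lambda = 0\), \((y^{\lambda} - 1)/\lambda\) otherwise). The function names `box_cox_sketch` and `inv_box_cox_sketch` are hypothetical and are not the fabletools `box_cox()` implementation.

```r
# Box-Cox sketch for strictly positive data (hypothetical helper names).
box_cox_sketch <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

# Reverse (back-) transformation
inv_box_cox_sketch <- function(w, lambda) {
  if (lambda == 0) exp(w) else (lambda * w + 1)^(1 / lambda)
}

y    <- c(1, 10, 100, 1000)               # hypothetical observations
w    <- box_cox_sketch(y, lambda = 0.5)
back <- inv_box_cox_sketch(w, lambda = 0.5)

print(w)
print(back)   # recovers the original values

# As lambda approaches 0 the transformation approaches the natural log
print(box_cox_sketch(10, 1e-6) - log(10))
```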

Box-Cox Transformations

• The guerrero feature can be used to choose a value of lambda for you.

lambda <- food %>%
  features(Turnover, features = guerrero) %>%
  pull(lambda_guerrero)

> lambda
[1] 0.05240561

• This attempts to balance the seasonal fluctuations and random variation across the series.

• Always check the results.

• A low value of \(\lambda\) can give extremely large prediction intervals.

Box-Cox Transformations

food %>%
  autoplot(box_cox(Turnover, lambda)) +
  labs(y = "Box-Cox transformed turnover")

[Figure: panels labelled "Transformed" and "Original"]

Mathematical Transformations

Features of power transformations:

• Often no transformation is needed

• Simple transformations are easier to explain and work well enough

• Transformations can have very large effect on prediction intervals

• If some data are zero or negative, then use 𝜆 > 0

• log1p() can also be useful for data with zeros.

• Choosing logs is a simple way to force forecasts to be positive

• Transformations must be reversed to obtain forecasts on the original scale. (Handled automatically by fable.)
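For data containing zeros, a plain log fails but `log1p()` does not. A short sketch with hypothetical counts, using `expm1()` as the matching back-transformation:

```r
# Hypothetical data containing a zero
y <- c(0, 1, 9, 99)

print(log(y))        # first value is -Inf, unusable for modelling

w    <- log1p(y)     # log(1 + y), finite at zero
back <- expm1(w)     # exp(w) - 1 reverses log1p()

print(w)
print(back)
```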

Time Series Decomposition

• Many methods are based on the concept that when an underlying pattern exists, that pattern can be distinguished from randomness by smoothing (averaging) past values

• The goal is to eliminate randomness so the pattern can be projected into the future and used as the forecast

• In many situations, the pattern can be broken down (decomposed) into sub-patterns that identify each component of the time series separately

• Decomposition methods help better understand the behavior of the series, which facilitates improved forecasting accuracy

Time Series Components

• Seasonal: pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or day of the week).

• Trend: pattern exists when there is a long-term increase or decrease in the data

• Cycle: pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years).

• Most decomposition methods consider the trend and the cycle as a single component

Time Series Decomposition

• Decomposition assumes that the data are made up as follows:

data = pattern + error = f(trend-cycle, seasonality, error)

• The error term is often called the irregular or the remainder component

Decomposition Approach

The basic concept is empirical and consists of two steps:

1. Removing the trend-cycle

2. Isolating the seasonal component

Any residual is assumed to be randomness which, while it cannot be predicted, can be identified

Decomposition Models

• The general mathematical representation of the decomposition approach is:

\[ y_t = f(S_t, T_t, E_t) \]

• \(y_t\) is the actual data, \(S_t\) is the seasonal component (or index), \(T_t\) is the trend-cycle component, and \(E_t\) is the remainder component at period \(t\).

• The exact functional form depends on the method used:

Additive decomposition: \(y_t = S_t + T_t + E_t\)
Multiplicative decomposition: \(y_t = S_t \times T_t \times E_t\)

Decomposition Models

• The additive model is appropriate if the magnitude of the seasonal fluctuations, or the variation around the trend-cycle, does not vary with the level of the series

• The multiplicative model is appropriate when the variation in the seasonal pattern, or the variation around the trend-cycle, appears to be proportional to the level of the time series

• Multiplicative models are more prevalent with economic series because they have a seasonal variation which increases with the level of the series (e.g., electricity production data)

Decomposition Models

• An alternative to using a multiplicative model is to first transform the data until the variation in the series appears to be stable over time, and then use an additive model

• Logarithms turn a multiplicative relationship into an additive relationship:

\[ y_t = S_t \times T_t \times E_t \;\rightarrow\; \log y_t = \log S_t + \log T_t + \log E_t \]
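The effect of taking logs on a multiplicative series can be illustrated with a hypothetical quarterly series (the trend and seasonal indices are made up, and the error term is left out to keep the arithmetic visible):

```r
# Hypothetical quarterly series with multiplicative seasonality, no error term
years    <- 5
trend    <- 100 * 1.2^(rep(0:(years - 1), each = 4))   # level rises each year
seasonal <- rep(c(1.2, 0.9, 0.8, 1.1), times = years)  # seasonal indices
y        <- trend * seasonal
year_id  <- rep(1:years, each = 4)

# Size of the seasonal swing within each year, on both scales
swing_raw <- as.numeric(tapply(y, year_id, function(v) max(v) - min(v)))
swing_log <- as.numeric(tapply(log(y), year_id, function(v) max(v) - min(v)))

print(round(swing_raw, 1))  # grows with the level of the series
print(round(swing_log, 3))  # constant: log(1.2) - log(0.8)
```

On the original scale the swing grows with the level, which calls for a multiplicative model; on the log scale it is constant, so an additive model fits.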

Decomposition Graphics

• Decomposition plot:

• Helps visualize the decomposition procedure

• Seasonal sub-series plot:

• Helps visualize the overall seasonal pattern and how the seasonal component is changing over time

US Retail Employment

us_retail_employment <- us_employment %>%
  filter(year(Month) >= 1990, Title == "Retail Trade") %>%
  select(-Series_ID)

us_retail_employment %>%
  autoplot(Employed) +
  labs(y = "Persons (thousands)", title = "Total employment in US retail")

US Retail Employment

dcmp <- us_retail_employment %>%
  model(stl = STL(Employed))

components(dcmp)
#> # A dable: 357 x 7 [1M]
#> # Key:     .model [1]
#> # :        Employed = trend + season_year + remainder
#>    .model    Month Employed  trend season_year remainder season_adjust
#>  1 stl    1990 Jan   13256. 13288.      -33.0      0.836        13289.
#>  2 stl    1990 Feb   12966. 13269.     -258.     -44.6          13224.
#>  3 stl    1990 Mar   12938. 13250.     -290.     -22.1          13228.
#>  4 stl    1990 Apr   13012. 13231.     -220.       1.05         13232.
#>  5 stl    1990 May   13108. 13211.     -114.      11.3          13223.
#>  6 stl    1990 Jun   13183. 13192.      -24.3     15.5          13207.
#>  7 stl    1990 Jul   13170. 13172.      -23.2     21.6          13193.
#>  8 stl    1990 Aug   13160. 13151.       -9.52    17.8          13169.
#>  9 stl    1990 Sep   13113. 13131.      -39.5     22.0          13153.
#> 10 stl    1990 Oct   13185. 13110.       61.6     13.2          13124.
#> # … with 347 more rows

US Retail Employment

components(dcmp) %>% autoplot()

US Retail Employment

us_retail_employment %>%
  autoplot(Employed, color = "gray") +
  autolayer(components(dcmp), trend, color = "red") +
  labs(y = "Persons (thousands)", title = "Total employment in US retail")

US Retail Employment

components(dcmp) %>% gg_subseries(season_year)

Seasonal Adjustment

• If the seasonal component is removed from the original data, the resulting values are called the seasonally adjusted data:

• For an additive decomposition: \(y_t - S_t\)

• For a multiplicative decomposition: \(y_t / S_t\)
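The additive case can be checked against the first three rows of the STL components table printed earlier for US retail employment (values as rounded in that output). The multiplicative numbers at the end are hypothetical.

```r
# First three rows of the STL components output (Jan-Mar 1990), as printed
employed    <- c(13256, 12966, 12938)
season_year <- c(-33.0, -258, -290)

# Additive seasonal adjustment: y_t - S_t
season_adjust <- employed - season_year
print(season_adjust)   # matches the season_adjust column: 13289 13224 13228

# Multiplicative seasonal adjustment: y_t / S_t (hypothetical values)
y_mult <- c(120, 90)
s_mult <- c(1.2, 0.9)
print(y_mult / s_mult)
```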

US Retail Employment

us_retail_employment %>%
  autoplot(Employed, color = "gray") +
  autolayer(components(dcmp), season_adjust, color = "red") +
  labs(y = "Persons (thousands)", title = "Total employment in US retail")

Seasonal Adjustment

• Seasonally adjusted series contain the remainder component as well as the trend-cycle component (they are not smooth, and downturns or upturns can be misleading)

• If the purpose is to look for turning points and interpret any changes in the series, then it is better to use the trend-cycle component rather than the seasonally adjusted data

• Seasonally adjusted data are useful when seasonal variation is not of primary interest (e.g., monthly unemployment)

History of Decomposition Methods

• The classical method originated in the 1920s.

• The Census II method was introduced in 1957. It is the basis for the X-11 method and its variants (including X-12-ARIMA and X-13-ARIMA)

• The STL method was introduced in 1983

• TRAMO/SEATS was introduced in the 1990s.

National Statistics Offices

• ABS uses X-12-ARIMA

• US Census Bureau uses X-13ARIMA-SEATS

• Statistics Canada uses X-12-ARIMA

• ONS (UK) uses X-12-ARIMA

• Eurostat uses X-13ARIMA-SEATS

Simple Moving Averages

• Classical decomposition method originated in the 1920s

• The goal is to smooth the data to reduce the random variation and thus estimate the trend-cycle component

• Fundamental building block in all decomposition methods

• A moving average of order \(k\) (or \(k\)-MA), where \(k\) is an odd integer, is defined as:

\[ T_t = \frac{1}{k} \sum_{j=-m}^{m} Y_{t+j} \]

where the \(k\)-MA is the average of an observation and the \(m = (k-1)/2\) points on either side of it (\(m\) is called the half-width)

Simple Moving Averages

• The more observations included in the moving average:

• The smoother the resulting trend-cycle

• The larger the likelihood that randomness will be eliminated

• The more terms (and information) are lost in the process

• Also, long-term moving averages tend to smooth out genuine bumps or cycles that are of interest

• The m terms lost in the beginning of the data are of little consequence, but the m terms lost in the end are critical. Why?

Simple Moving Averages

• End point adjustments:

• Use a shorter length moving average

• Take an average of the points that are available

• Use a more sophisticated method
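One of these options, averaging whatever points are available, can be sketched as below. The helper name `ma_available` and the data are hypothetical; this is an illustration of the idea, not the method used by any particular package.

```r
# Centred moving average that shrinks its window near the ends,
# i.e. "take an average of the points that are available".
ma_available <- function(y, k) {
  stopifnot(k %% 2 == 1)          # order must be odd
  m <- (k - 1) / 2                # half-width
  n <- length(y)
  sapply(seq_len(n), function(t) mean(y[max(1, t - m):min(n, t + m)]))
}

y <- c(2, 4, 6, 8, 10, 12)        # hypothetical, perfectly linear data
print(ma_available(y, 5))
```

The result is 4, 5, 6, 8, 9, 10. The interior values reproduce the linear series, but the end values are pulled toward the middle because their windows are one-sided, which is why end-point estimates should be treated with caution.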

Simple Moving Averages

• Each value in the 5-MA column of the table is the average of the five observations centered on the corresponding year

• There are no values for the first two or last two years because we do not have two observations on either side

Year   Exports   5-MA
1960   12.99
1961   12.40
1962   13.94     13.46
1963   13.01     13.50
1964   14.94     13.61
1965   13.22     13.40
1966   12.93     13.25
1967   12.88     12.66
…      …         …
2010   19.84     21.21
2011   21.47     21.17
2012   21.52     20.78
2013   19.99     20.81
2014   21.08     20.37
2015   20.01     20.32
2016   19.25
2017   21.27
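The first few 5-MA values in the table can be reproduced in base R from the 1960 to 1967 exports figures as printed (rounded to two decimals). This sketch uses `stats::filter()` rather than the slider package, and because only eight years are entered, the last two positions come out as `NA` here even though the full series has values for them.

```r
# Exports (% of GDP) for 1960-1967, as printed in the table (rounded)
exports <- c(12.99, 12.40, 13.94, 13.01, 14.94, 13.22, 12.93, 12.88)

# Centred 5-MA: weight 1/5 on the observation and on two points each side
ma5 <- stats::filter(exports, rep(1 / 5, 5), sides = 2)

print(round(as.numeric(ma5), 2))
```

Positions 3 to 6 (1962 to 1965) give 13.46, 13.50, 13.61 and 13.40, matching the table.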

Simple Moving Averages

aus_exports <- global_economy %>%
  filter(Country == "Australia") %>%
  mutate(`5-MA` = slider::slide_dbl(Exports, mean,
                                    .before = 2, .after = 2, .complete = TRUE))

aus_exports %>%
  autoplot(Exports) +
  geom_line(aes(y = `5-MA`), colour = "#D55E00") +
  labs(y = "% of GDP", title = "Total Australian exports") +
  guides(colour = guide_legend(title = "series"))

Simple Moving Averages

Effect of the order of the moving average on the smoothness of the trend-cycle estimate

Classical Decomposition

• The classical decomposition method originated in the 1920s

• There are two forms of classical decomposition:

• Additive decomposition

• Multiplicative decomposition

• We assume the seasonal component is constant over time

• The values which are repeated to make up the seasonal component are known as the seasonal indices (e.g., 4 values for quarterly data, 12 for monthly data, 7 for daily data with a weekly pattern)

Additive Decomposition

1. Compute the trend-cycle component \(T_t\) using a (centered) moving average

2. Calculate the de-trended series \(Y_t - T_t = S_t + E_t\)

3. Estimate the seasonal indices by simply gathering all the de-trended values for a given season and taking the average

4. The remainder component \(E_t\) is calculated by subtracting the estimated seasonal and trend-cycle components from the original data series
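The four steps can be sketched by hand in base R on a hypothetical quarterly series (the trend, seasonal pattern and noise are made up). Two details are assumptions beyond the steps as listed: a 2x4-MA is used as the centered moving average for quarterly data, and the seasonal indices are shifted to sum to zero, which is what `stats::decompose()` does.

```r
# Hypothetical quarterly series: linear trend + fixed seasonal pattern + small wiggle
n <- 24
y <- ts(10 + 0.5 * (1:n) + rep(c(5, -3, -4, 2), n / 4) + 0.3 * sin(1:n),
        frequency = 4)

# Step 1: trend-cycle via a centred 2x4-MA (weights 1/8, 1/4, 1/4, 1/4, 1/8)
trend <- as.numeric(stats::filter(y, c(0.5, 1, 1, 1, 0.5) / 4, sides = 2))

# Step 2: de-trended series
detrended <- as.numeric(y) - trend

# Step 3: seasonal indices = average de-trended value for each quarter,
# then centred so the indices sum to zero (assumption, as in decompose())
quarter  <- as.integer(cycle(y))
indices  <- tapply(detrended, quarter, mean, na.rm = TRUE)
indices  <- indices - mean(indices)
seasonal <- as.numeric(indices[quarter])

# Step 4: remainder
remainder <- as.numeric(y) - trend - seasonal

print(round(indices, 3))
```

The seasonal component computed this way agrees with `decompose(y)$seasonal`, and the trend is missing for the first two and last two quarters.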

Additive Decomposition

us_retail_employment %>%
  model(classical_decomposition(Employed, type = "additive")) %>%
  components() %>%
  autoplot() +
  labs(title = "Classical additive decomposition of total US retail employment")

Multiplicative Decomposition

• The multiplicative procedure is similar except that ratios are taken instead of differences

• Step 2: \(R_t = Y_t / T_t = S_t T_t E_t / T_t = S_t E_t\)

• Step 4: \(E_t = Y_t / (S_t T_t)\)

• This method is often called the “ratio-to-moving average” method

Comments on Classical Decomposition

• The estimate of the trend is unavailable for the first few and last few observations. Consequently, there is also no estimate of the remainder component for the same time periods

• The trend-cycle estimate tends to over-smooth rapid rises and falls in the data

• It assumes that the seasonal component repeats from year to year. For many series, this is not a reasonable assumption. It is unable to capture seasonal changes over time

• Occasionally, the values in a small number of periods may be particularly unusual. The method is not robust to these kinds of unusual values

STL Decomposition

• STL: “Seasonal and Trend decomposition using Loess”

• Very versatile and robust

• Advantages:

• Unlike X-12-ARIMA, STL is capable of handling any type of seasonality, not only monthly and quarterly data

• The seasonal component is allowed to change over time, and the rate of change can be controlled by the user

• The smoothness of the trend-cycle can be controlled by the user

• It can be robust to outliers

• Disadvantages:

• It is less developed than X-12-ARIMA

• It does not automatically handle trading day or calendar variation

• There is only an additive version of the STL procedure

• There are no seasonal adjustment diagnostics available

STL Decomposition

us_retail_employment %>%
  model(STL(Employed ~ trend(window = 7) + season(window = "periodic"),
            robust = TRUE)) %>%
  components() %>%
  autoplot()

STL Decomposition

• The two main parameters to be chosen when using STL are the trend-cycle window trend(window = ?) and the seasonal window season(window = ?)

• The trend window (t.window in stats::stl()) controls the wiggliness of the trend component.

• The seasonal window (s.window in stats::stl()) controls the variation in the seasonal component.

• season(window = "periodic") is equivalent to an infinite window.

• These control how rapidly the trend-cycle and seasonal components can change.

• Smaller values allow for more rapid changes.

• Both trend and seasonal windows should be odd numbers; the trend window is the number of consecutive observations to be used when estimating the trend-cycle; the season window is the number of consecutive years to be used in estimating each value in the seasonal component.

• By default, the STL() function provides a convenient automated STL decomposition
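The same window settings can be tried without fpp3 through base R's `stats::stl()`, whose `t.window` and `s.window` arguments play the roles of the trend and seasonal windows. This is a sketch on the built-in `nottem` series (monthly Nottingham temperatures), not the retail employment data, and the choice `t.window = 13` is arbitrary.

```r
# STL on a built-in monthly series using base R (no fpp3 required)
fit <- stl(nottem, s.window = "periodic", t.window = 13, robust = TRUE)
parts <- fit$time.series   # columns: seasonal, trend, remainder

# STL is additive: the three components sum back to the data
recon <- rowSums(parts)
print(max(abs(recon - as.numeric(nottem))))

# With a periodic seasonal window the seasonal pattern repeats exactly each year
first_year  <- as.numeric(parts[1:12, "seasonal"])
second_year <- as.numeric(parts[13:24, "seasonal"])
print(isTRUE(all.equal(first_year, second_year)))
```

Replacing `s.window = "periodic"` with a small odd number such as 7 lets the seasonal pattern drift from year to year, and a smaller `t.window` gives a wigglier trend.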

STL Decomposition

us_retail_employment %>% model(STL(Employed)) %>% components() %>% autoplot()
