Week-2 Time Series Graphics

Some of the slides are adapted from the lecture notes provided by Prof. Antoine Saure and Prof. Rob Hyndman

Business Forecasting Analytics
ADM 4307 – Fall 2021

Time Series Graphics

Ahmet Kandakoglu, PhD

20 September, 2021

Outline

• Review of last lecture

• Overview of forecasting techniques

• Time series

• tsibble objects in R

• Time plots

• Time series patterns

• Seasonal plots

• Scatter plots

Fall 2021 ADM 4307 Business Forecasting Analytics 2

What is Forecasting?

It is the process of making predictions about the future based on past and present data.

I see that you will get a 90 in

Forecasting Analytics this semester.

Features Common to All Forecasts

• Assumes causal system

past ==> future

• Forecasts rarely perfect because of randomness

• Forecasts more accurate for

groups vs. individuals

• Forecast accuracy decreases

as time horizon increases

Approaches to Forecasting

• Qualitative: Judgmental methods

• Non-quantitative analysis of subjective inputs

• Considers “soft” information such as human factors, experience, gut instinct

• Quantitative: Analyze “hard” data

• Time series models

• Extends historical patterns of numerical data

• Associative (causal) models

• Create equations with explanatory variables to predict the future

Quantitative Forecasting

• Conditions for their application:

• Information about the past is available

• Information can be quantified in numerical data

• Some aspects of the past pattern will continue into the future (continuity

assumption)

• Two extremes:

• Intuitive or ad hoc methods (simple, based on empirical experience, and no

accuracy information)

• Formal quantitative methods based on statistical principles

Quantitative Forecasting

• Time series models:

• Prediction of the future is based on past values of a variable and/or past errors

• The goal is to determine the pattern in the historical data series and extrapolate that

pattern into the future

• Black box approach: makes no attempt to discover the factors affecting the behavior of the forecast variable

• Explanatory models:

• Assume that the variable to be forecasted shows an explanatory relationship with one or

more independent variables

• The goal is to determine the form of the relationship and use it to forecast future values of

the forecast variable

Quantitative Forecasting – An Example

Gross National Product (GNP) is a measure of a country’s economic performance

• Time series models:

• GNP_{t+1} = f(GNP_t, GNP_{t-1}, GNP_{t-2}, GNP_{t-3}, …, error)

• Explanatory model:

• GNP = f(monetary and fiscal policies, inflation, capital spending, imports, exports, error)

Time series models can often be used more easily to forecast, whereas explanatory

models can be used with greater success for policy and decision making
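
The two model forms can be sketched on simulated data. This is only an illustration under stated assumptions: the variables `advert` and `sales`, their values, and the use of `lm()` for both fits are hypothetical and are not part of the lecture material.

```r
# Hypothetical simulated data: sales driven by advertising spend plus noise.
set.seed(1)
n <- 100
advert <- rnorm(n, mean = 10, sd = 2)
sales <- 5 + 2 * advert + rnorm(n)

# Time series form: sales_t = f(sales_{t-1}, error),
# fitted here as a simple regression on the previous value.
ts_fit <- lm(sales[-1] ~ sales[-n])

# Explanatory form: sales = f(advert, error).
ex_fit <- lm(sales ~ advert)

coef(ts_fit)
coef(ex_fit)
```

The time series fit uses only the history of `sales`, while the explanatory fit estimates how `sales` responds to `advert`, which is the kind of relationship that matters for policy and decision making.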

Qualitative Forecasting

• Do not require data in the same manner as quantitative forecasting methods

• Inputs required are mainly the product of judgement and accumulated knowledge

• Used mainly to provide hints, to aid the planner, and to supplement

quantitative forecasts, rather than to provide a specific numerical forecast

• Used almost exclusively for medium- and long-term situations

• Frequently the only alternative is no forecast at all

Time Series and Cross-Sectional Data

• Time Series:

• Historical data that consists of a sequence of observations over time

• We will assume that the times of observations are equally spaced

• Monthly Australian beer production (megaliters, Ml) from January 1991–August 1995

• Cross-Sectional Data:

• All observations are from the same time

• Price ($US), mileage (mpg) and country of origin for 45 automobiles from Consumer

Reports, April 1990, pp. 235–255

tsibble Objects in R

• A tsibble allows storage and manipulation of multiple time series in R.

• It contains:

• An index: time information about the observation

• Measured variable(s): numbers of interest

• Key variable(s): optional unique identifiers for each series

• It works with tidyverse functions.
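
A minimal sketch of the three parts of a tsibble, assuming the tsibble package is installed. The data set `sales` and its values are invented for illustration.

```r
library(tsibble)

# Hypothetical data: yearly sales for two regions (values invented).
sales <- tsibble(
  year   = rep(2015:2017, times = 2),
  region = rep(c("East", "West"), each = 3),
  units  = c(10, 12, 15, 20, 19, 23),
  index  = year,    # time information
  key    = region   # identifies each series
)

index_var(sales)      # name of the index column
key_vars(sales)       # names of the key columns
measured_vars(sales)  # remaining (measured) columns
```

Because `region` is declared as the key, the object holds two separate yearly series in one table.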

tsibble Objects

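An example of a time series stored as a tsibble object in R:

library(fpp3)  # attaches tsibble, tibble, dplyr, ggplot2, feasts and the example datasets

mydata <- tsibble(year = 2012:2016, y = c(123, 39, 78, 52, 110), index = year)

or, equivalently, by converting a tibble:

mydata <- tibble(year = 2012:2016, y = c(123, 39, 78, 52, 110)) %>%
  as_tsibble(index = year)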

mydata

Year Observation

2012 123

2013 39

2014 78

2015 52

2016 110

# A tsibble: 5 x 2 [1Y]

year y

1 2012 123

2 2013 39

3 2014 78

4 2015 52

5 2016 110

tsibble Objects

• For observations that are more frequent than once per

year, we need to use a time class function on the

index.

• For example, suppose we have a monthly dataset z.

• This can be converted to a tsibble object using the

following code

z_ts <- z %>%

mutate(Month = yearmonth(Month)) %>%

as_tsibble(index = Month)

z

# A tibble: 5 x 2

Month Observation

1 2019 Jan 50

2 2019 Feb 23

3 2019 Mar 34

4 2019 Apr 30

5 2019 May 25

z_ts

# A tsibble: 5 x 2 [1M]

Month Observation

1 2019 Jan 50

2 2019 Feb 23

3 2019 Mar 34

4 2019 Apr 30

5 2019 May 25

The tsibble index

Common time index variables can be created with these functions

Frequency   Function
Annual      start:end
Quarterly   yearquarter()
Monthly     yearmonth()
Weekly      yearweek()
Daily       as_date(), ymd()
Sub-daily   as_datetime(), ymd_hms()
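
A short sketch of these index functions in use, assuming the tsibble and lubridate packages are installed. The input strings are arbitrary examples.

```r
library(tsibble)
library(lubridate)

# Each function turns text into an index of the matching frequency.
q <- yearquarter("2019 Q1")
m <- yearmonth("2019 Jan")
w <- yearweek("2019 W01")
d <- as_date("2019-01-15")
s <- as_datetime("2019-01-15 09:30:00")

class(q)
class(m)
class(w)
class(d)
class(s)
```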

Key tsibble Functions

The most common functions:

• select(): subset columns

• filter(): subset rows on conditions

• arrange(): sort results

• mutate(): create new columns

• group_by(): group data by columns

• summarize(): create summary statistics
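
A minimal sketch of several of these functions applied to a tsibble, assuming the tsibble and dplyr packages are installed. The data set `sales` and its values are invented for illustration.

```r
library(tsibble)
library(dplyr)

# Hypothetical monthly sales for two stores (values invented).
sales <- tsibble(
  month = yearmonth(rep(c("2020 Jan", "2020 Feb", "2020 Mar"), times = 2)),
  store = rep(c("A", "B"), each = 3),
  units = c(5, 7, 6, 10, 9, 12),
  index = month,
  key   = store
)

# filter() keeps rows, mutate() adds a column, select() keeps columns.
store_a <- sales %>%
  filter(store == "A") %>%
  mutate(units_k = units / 1000) %>%
  select(month, store, units_k)

# summarise() on a tsibble aggregates across keys within each time period.
totals <- sales %>%
  summarise(total_units = sum(units))

store_a
totals
```

Note that `summarise()` keeps the index, so `totals` has one row per month rather than a single overall total.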

Graphical Summaries

• The first thing to do is to visualize the data

• Graphs allow us to see basic features of the data such as patterns, unusual

observations, changes over time, and relationships between variables

• These features should be incorporated into the forecasting model

• The type of data will determine which type of graph is most appropriate

• Time plots, seasonal plots and scatterplots are routinely used in forecasting

Graphical Summaries

• Time plots

• The data are plotted over time

• Reveal trends over time, regular seasonal behavior and other systematic features of the

data

• Seasonal plots

• The data are plotted against the individual “seasons” in which the data were observed

• Enable the underlying seasonal pattern and substantial departures from the seasonal

pattern to be seen clearly

• Scatterplots

• Plot the variable that we wish to forecast against an explanatory variable

• Help us to visualize the relationship between two variables

Time Plots

# Total monthly cost of antidiabetic drugs (ATC2 code A10), in $ million
a10 <- PBS %>%
  filter(ATC2 == "A10") %>%
  select(Month, Concession, Type, Cost) %>%
  summarise(TotalC = sum(Cost)) %>%
  mutate(Cost = TotalC / 1e6)

a10 %>%
  autoplot(Cost) +
  ylab("$ million") +
  xlab("Month") +
  ggtitle("Antidiabetic drug sales")

Time Plots

melsyd_economy <- ansett %>%
  filter(Airports == "MEL-SYD", Class == "Economy") %>%
  mutate(Passengers = Passengers / 1000)

autoplot(melsyd_economy, Passengers) +
  labs(title = "Ansett airlines economy class",
       subtitle = "Melbourne-Sydney",
       y = "Passengers ('000)")

Time Plots

The time plot immediately reveals some interesting features:

• Range of the data

• Times at which peaks occur

• Relative size of the peaks compared with the rest of the series

• Randomness in the series (the data pattern is not perfect)

Time Plots

Now, your turn.

• Create plots of the following time series: aus_production, pelt,

gafa_stock, vic_elec.

• Use help() to find out about the data in each series.

• For the last plot, modify the axis labels and title.

Time Series Patterns

Horizontal pattern

• The data values fluctuate around a constant mean

• Such a series is called stationary in its mean

Seasonal pattern

• The data values are influenced by seasonal factors such as the month of the year or the day of the

week

• Seasonal series are sometimes called periodic although they do not exactly repeat themselves over

time

Cyclical pattern

• The data exhibit rises and falls that are not of a fixed period

Trend pattern

• There is a long-term increase or decrease in the data

Many data series include a combination of the preceding patterns
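
A series that combines several of these patterns can be simulated in base R. This is a sketch only: the component sizes and the series itself are invented for illustration.

```r
# Hypothetical monthly series built from three components (all values invented).
set.seed(42)
t <- 1:48                                  # four years of monthly data
trend    <- 100 + 0.8 * t                  # long-term increase
seasonal <- 10 * sin(2 * pi * t / 12)      # repeats every 12 months
noise    <- rnorm(length(t), sd = 3)       # random variation
y <- ts(trend + seasonal + noise, start = c(2018, 1), frequency = 12)

plot(y, ylab = "Demand", main = "Trend + seasonal + random variation")
```

Setting the `trend` slope to zero leaves a horizontal series with a seasonal pattern around a constant mean.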

Time Series Patterns

[Figure: demand for snowboards over Years 1 to 4, showing the actual line, the trend component, seasonal peaks (winters), and random variation]

Time Series Patterns

• Differences between seasonal and cyclic patterns:

• A seasonal pattern is of a constant length, while a cyclical pattern varies in length

• The average length of a cycle is usually longer than that of seasonality

• The magnitude of a cycle is usually more variable than that of seasonality

• The timing of peaks and troughs is predictable with seasonal data, but

unpredictable in the long term with cyclic data.

Time Series Patterns

aus_production %>%
  filter(year(Quarter) >= 1980) %>%
  autoplot(Electricity) +
  labs(y = "GWh", title = "Australian electricity production")

Time Series Patterns

aus_production %>%
  autoplot(Bricks) +
  labs(y = "million units", title = "Australian clay brick production")

Time Series Patterns

us_employment %>%
  filter(Title == "Retail Trade", year(Month) >= 1980) %>%
  autoplot(Employed / 1e3) +
  labs(y = "Million people", title = "Retail employment, USA")

Time Series Patterns

gafa_stock %>%
  filter(Symbol == "AMZN", year(Date) >= 2018) %>%
  autoplot(Close) +
  labs(y = "$US", title = "Amazon closing stock price")

Time Series Patterns

pelt %>%
  autoplot(Lynx) +
  labs(y = "Number trapped", title = "Annual Canadian Lynx Trappings")

Seasonal Plots

# a10 stores the series in the column Cost (in $ million)
a10 %>%
  gg_season(Cost, labels = "both") +
  labs(y = "$ million", title = "Seasonal plot: antidiabetic drug sales")

Seasonal Plots

• Data plotted against the individual “seasons” in which the data were observed.

(In this case a “season” is a month.)

• Something like a time plot except that the data from each season are

overlapped.

• Enables the underlying seasonal pattern to be seen more clearly, and also

allows any substantial departures from the seasonal pattern to be easily

identified.

• In R: gg_season()
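
The idea behind a seasonal plot can be sketched by hand in base R: reshape the series so that each year becomes its own line, then plot every year against the season. The quarterly series below is invented for illustration, and `matplot()` is used here only as a stand-in for `gg_season()`.

```r
# Hypothetical quarterly series (values invented for illustration).
y <- ts(c(30, 45, 60, 40,
          32, 48, 63, 41,
          35, 50, 66, 44),
        start = c(2019, 1), frequency = 4)

# Reshape so each column is one year and each row is one season (quarter).
by_year <- matrix(as.numeric(y), nrow = frequency(y),
                  dimnames = list(paste0("Q", 1:4), as.character(2019:2021)))

# Overlay one line per year against the season.
matplot(by_year, type = "l", lty = 1, xaxt = "n",
        xlab = "Quarter", ylab = "Value")
axis(1, at = 1:4, labels = rownames(by_year))
legend("topleft", legend = colnames(by_year), col = 1:3, lty = 1)
```

Lines that share the same shape indicate a stable seasonal pattern; a year whose line departs from the others stands out immediately.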

Seasonal Polar Plots

gg_season(a10, Cost, polar = TRUE) + ylab("$ million")

Seasonal Subseries Plots

# a10 stores the series in the column Cost (in $ million)
a10 %>%
  gg_subseries(Cost) +
  labs(y = "$ million", title = "Subseries plot: antidiabetic drug sales")

Seasonal Subseries Plots

• Data for each season are collected together in a time plot as separate time series.

• Enables the underlying seasonal pattern to be seen clearly, and changes in

seasonality over time to be visualized.

• In R: gg_subseries()
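
A subseries plot also marks the mean of each season with a horizontal line. Those means can be computed directly, as in this base R sketch. The quarterly series is invented for illustration.

```r
# Hypothetical quarterly series (values invented for illustration).
y <- ts(c(30, 45, 60, 40,
          32, 48, 63, 41,
          35, 50, 66, 44),
        start = c(2019, 1), frequency = 4)

# cycle() gives the season (1 to 4) of each observation.
# Averaging within each season gives the per-season mean.
season_means <- tapply(as.numeric(y), as.integer(cycle(y)), mean)
season_means
```

Comparing the four means shows the seasonal pattern, and comparing values within one season across years shows how that season has changed over time.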

Seasonal Plots

Quarterly Australian Beer Production

beer <- aus_production %>%

select(Quarter, Beer) %>%

filter(year(Quarter) >= 1992)

beer %>% autoplot(Beer)

Seasonal Plots

Quarterly Australian Beer Production

beer %>% gg_season(Beer, labels = "right")

Seasonal Plots

Quarterly Australian Beer Production

beer %>% gg_subseries(Beer)

Seasonal Plots

Now, your turn.

• Look at the quarterly tourism data for the Snowy Mountains

snowy <- tourism %>% filter(Region == "Snowy Mountains")

• Use autoplot(), gg_season() and gg_subseries() to explore the data.

• What do you learn?

Scatterplots

• A scatterplot helps us visualize the relationship between two variables.

vic_elec %>%
  filter(year(Time) == 2014) %>%
  ggplot(aes(x = Temperature, y = Demand)) +
  geom_point() +
  labs(x = "Temperature (degrees Celsius)",
       y = "Electricity demand (GW)")

• It is clear that high demand occurs when temperatures are high, due to the effect of air-conditioning. But there is also a heating effect, where demand increases for very low temperatures.

Correlation

• It is common to compute correlation coefficients to measure the strength of the linear

relationship between two variables.

• The value always lies between −1 and 1 with negative values indicating a negative

relationship and positive values indicating a positive relationship.
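
As a sketch, the coefficient can be computed from its definition, r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² Σ(y − ȳ)²), and checked against the built-in `cor()`. The data values below are invented for illustration.

```r
# Hypothetical data (values invented for illustration).
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Correlation coefficient computed from its definition.
r_manual <- sum((x - mean(x)) * (y - mean(y))) /
  sqrt(sum((x - mean(x))^2) * sum((y - mean(y))^2))

r_manual
cor(x, y)   # built-in equivalent

# A strong but non-linear (U-shaped) relationship can still give r = 0.
u <- -3:3
cor(u, u^2)
```

The second example is a reminder that the coefficient measures only linear association, so a scatterplot should be checked before relying on it.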

Autocorrelation

• Correlation measures the extent of a linear relationship between two variables.

• Autocorrelation measures the linear relationship between lagged values of a

time series.

• The autocorrelation coefficients make up the autocorrelation function or ACF.

• The autocorrelation coefficients for the beer production data can be computed

using the ACF() function.
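
The lag-k coefficient is r_k = Σ_{t=k+1..T} (y_t − ȳ)(y_{t−k} − ȳ) / Σ_{t=1..T} (y_t − ȳ)². The sketch below computes it by hand and compares it with base R's `acf()`, which is used here as a stand-in for `ACF()`. The series and the helper name `autocorr` are invented for illustration.

```r
# Hypothetical series (values invented for illustration).
y <- c(5, 7, 6, 9, 11, 10, 13, 15, 14, 17)

# Lag-k autocorrelation from its definition.
autocorr <- function(y, k) {
  n <- length(y)
  ybar <- mean(y)
  sum((y[(k + 1):n] - ybar) * (y[1:(n - k)] - ybar)) / sum((y - ybar)^2)
}

r_manual <- sapply(1:3, function(k) autocorr(y, k))

# stats::acf() returns lag 0 first, so drop it before comparing.
r_builtin <- acf(y, lag.max = 3, plot = FALSE)$acf[-1]

r_manual
r_builtin
```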

Autocorrelation

The plot is sometimes known as a correlogram.

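# Assumption: recent_production (not defined on these slides) is the data from 2000 onwards.
recent_production <- aus_production %>%
  filter(year(Quarter) >= 2000)

recent_production %>%
  ACF(Beer) %>%
  autoplot() + labs(title = "Australian beer production")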

Trend and Seasonality in ACF Plots

• When data have a trend, the autocorrelations for small lags tend to be large and positive

because observations nearby in time are also nearby in value.

• When data are seasonal, the autocorrelations will be larger for the seasonal lags (at multiples

of the seasonal period) than for other lags.

• When data are both trended and seasonal, you see a combination of these effects.

• The a10 data shows both trend and seasonality.

a10 %>%
  ACF(Cost, lag_max = 48) %>%
  autoplot() + labs(title = "Australian antidiabetic drug sales")
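
The combined effect can be reproduced on a simulated series. This is a sketch under stated assumptions: the series is invented, it has no noise, and base R's `acf()` is used as a stand-in for `ACF()`.

```r
# Hypothetical monthly series with a linear trend and a 12-month seasonal cycle.
t <- 1:120
y <- 0.2 * t + 10 * sin(2 * pi * t / 12)

r <- acf(y, lag.max = 24, plot = FALSE)$acf[-1]   # drop lag 0

r[1]    # small lag: large and positive
r[6]    # half a season away
r[11]   # just before one full season
r[12]   # one full season away: a local peak
r[13]   # just after one full season
```

The peak at lag 12 reflects the seasonal period, while the gradual decline across lags reflects the trend.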

White Noise

• Time series that show no autocorrelation are called white noise.

• White noise data is uncorrelated across time with zero mean and constant variance

# y is the simulated series defined with set.seed(30) in the next code listing.
y %>%
  ACF(wn) %>%
  autoplot() + labs(title = "White noise")

White Noise

• We expect 95% of the spikes in the ACF to lie within the bounds on a graph of the ACF (the

blue dashed lines).

• If one or more large spikes are outside these bounds, or if more than 5% of spikes are outside

these bounds, then the series is probably not white noise.

set.seed(30)
y <- tsibble(sample = 1:50, wn = rnorm(50), index = sample)
y %>% autoplot(wn) + labs(title = "White noise", y = "")
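
The bounds on an ACF plot are approximately ±1.96/√T, where T is the length of the series. The sketch below computes them for the simulated series and counts the spikes that fall outside. Base R's `acf()` is used as a stand-in for `ACF()`, and the choice of 15 lags is an assumption made for the example.

```r
set.seed(30)
wn <- rnorm(50)                         # same simulation as in the slide code
n <- length(wn)

bound <- 1.96 / sqrt(n)                 # approximate 95% limits are +/- this value
r <- acf(wn, lag.max = 15, plot = FALSE)$acf[-1]   # drop lag 0

outside <- sum(abs(r) > bound)          # number of spikes beyond the limits
share   <- outside / length(r)          # compare this with the 5% rule of thumb

bound
outside
share
```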
