Faculty of Business & Economics
Semester One 2020 Examination Period
UNIT CODES: ETF3231/ETF5231
TITLE OF PAPER: Business Forecasting
EXAM DURATION: 3.5 hours

The exam contains five sections. All sections must be answered. The exam is worth 100 marks in total.
SECTION A
Write about a quarter of a page each on any four of the following topics. (Clearly state if you agree or disagree with each statement. No marks will be given without any justification.)
1. The disadvantage of using a test set for choosing a forecasting model is that it uses only a small proportion of the data.
2. The best forecasting models adapt rapidly to changes in the trend and seasonal patterns.
3. With STL decompositions and ETS models, we always need to transform our data before estimating the components.
4. The mean of a stationary AR(3) process
$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \varepsilon_t$,
where $\varepsilon_t \sim \text{NID}(0, \sigma^2)$, is equal to $c$.
(You can write out your answer elsewhere and upload it as an image if you prefer)
5. ARIMA models are better than ETS models because there are more possible models available.
6. Regression models are not very useful for forecasting because you have to forecast all the predictors as well.
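For statement 4, the following is a sketch of the standard argument, not a model answer. It takes expectations of both sides of the AR(3) equation and uses stationarity, so that the mean is the same at every lag. The symbol $\mu$ is notation introduced here for that common mean.

```latex
\begin{align*}
E[y_t] &= c + \phi_1 E[y_{t-1}] + \phi_2 E[y_{t-2}] + \phi_3 E[y_{t-3}] + E[\varepsilon_t] \\
\mu    &= c + (\phi_1 + \phi_2 + \phi_3)\,\mu \\
\mu    &= \frac{c}{1 - \phi_1 - \phi_2 - \phi_3}
\end{align*}
```

Under this argument the mean coincides with $c$ only in special cases, for example when $\phi_1 + \phi_2 + \phi_3 = 0$ or when $c = 0$.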
END OF SECTION A
Total: 20 marks

SECTION B
Figures 1, 2 and 3 relate to the number of students (in thousands) arriving in Australia from China over the period January 2005 – February 2020.
1. Using Figures 1, 2 and 3, describe the student arrivals from China. Carefully comment on the interesting features of all three plots.
4 marks
Figure 1: Number of students arriving in Australia from China. x-axis: Month [1M], 2005 Jan to 2020 Jan; y-axis: Thousands of students.
Figure 2: Number of students arriving in Australia from China. x-axis: Month (Jan to Dec); lines labelled by year, 2005 to 2020; y-axis: Thousands of students.
Figure 3: Number of students arriving in Australia from China. One panel per month (Jan to Dec); x-axis within each panel: Month, 2005 Jan to 2020 Jan; y-axis: Thousands of students.

2. Using the code below, describe what is plotted in all panels of Figures 4 and 5. Comment on the default settings for window in trend() and season(), and the effect of robust=TRUE. Which settings would you consider appropriate in this case?
ch_edu_arrivals %>%
  model(STL(log(Count))) %>%
  components() %>%
  autoplot() +
  ggtitle("STL decomposition")
6 marks
Figure 4: STL decomposition. `log(Count)` = trend + season_year + remainder. Panels: log(Count), trend, season_year, remainder; x-axis: Month, 2005 Jan to 2020 Jan.
ch_edu_arrivals %>%
  model(STL(log(Count), robust = TRUE)) %>%
  components() %>%
  autoplot() +
  ggtitle("Robust STL decomposition")
Figure 5: Robust STL decomposition. `log(Count)` = trend + season_year + remainder. Panels: log(Count), trend, season_year, remainder; x-axis: Month, 2005 Jan to 2020 Jan.

3. You have been asked to provide forecasts for the next three years for the Chinese student arrivals series assuming that the travel bans will be lifted soon and travel will commence as normal in July 2020.
Consider applying each of the methods and models below. Comment, in a few words each, on whether each one is appropriate for forecasting the data. No marks will be given for simply guessing whether a method or a model is appropriate without justifying your choice.
Start your response by stating: suitable or not suitable.
(a) Seasonal naïve method.
(b) An STL decomposition combined with the drift method to forecast the seasonally adjusted component.
(c) An STL decomposition on the log transformed data combined with an ETS to forecast the seasonally adjusted component.
(d) Holt-Winters method with damped trend and multiplicative seasonality.
(e) ETS(A,A,A).
(f) ETS(M,A,M).
(g) ARIMA(1,12,4).
(h) ARIMA(3,2,1)(1,1,0)[12] on the log transformed data.
(i) ARIMA(3,1,1)(2,1,0)[12] on the log transformed data.
(j) Regression model with time and Fourier terms.
10 marks
END OF SECTION B
Total: 20 marks

SECTION C
The following R code and output concern models for the monthly student arrivals from China plotted in Figure 1.
fit <- ch_edu_arrivals %>%
  model(
    ets = ETS(Count),
    dcmp = decomposition_model(STL(log(Count)), ETS(season_adjust))
  )
fit %>% select(ets) %>% report()
## Series: Count
## Model: ETS(M,A,M)
##   Smoothing parameters:
##     alpha = 0.19288
##     beta  = 0.0066471
##     gamma = 0.292
##
##   Initial states:
##       l        b      s1      s2    s3      s4     s5     s6      s7      s8
##  3.4222 0.046599 0.31423 0.35995 0.467 0.27169 0.7335 2.5698 0.34985 0.28221
##       s9    s10    s11    s12
##  0.43686 1.2184 3.4074 1.5891
##
##   sigma^2:  0.0329
##
##     AIC   AICc     BIC
##  981.74 985.47 1036.21
fit %>%
  select(ets) %>%
  components() %>%
  autoplot() +
  labs(subtitle = "Components")
Figure 6: ETS(M,A,M) decomposition (subtitle: Components). Panels: Count, level, slope, season, remainder; x-axis: Month, 2005 Jan to 2020 Jan.

fit %>% select(ets) %>% gg_tsresiduals()
Figure 7: Residual diagnostics for the ETS model. Panels: .resid against Month (2005 Jan to 2020 Jan); acf against lag [1M]; histogram of .resid (count).
1. Briefly describe the fit object.
2 marks
2. Write down the estimated ETS model in full.
(You can write out your answer elsewhere and upload it as an image if you prefer).
4 marks
3. Comment on Figure 6 and how this relates to the estimated ETS model.
5 marks
4. Comment on Figure 7.
2 marks
5. Use the following output to produce forecasts and 95% prediction intervals for 1-step and 2-steps ahead. Give details for how the point forecasts can be obtained from the components. You must show your full workings. (You can write out your answer elsewhere and upload it as an image if you prefer).
3 marks
fit %>% select(ets) %>% components() %>% tail(14)
## # A dable: 14 x 7 [1M]
## # Key:     .model [1]
## # ETS(M,A,M) Decomposition: Count = (lag(level, 1) + lag(slope, 1)) *
## #   lag(season, 12) * (1 + remainder)
##    .model    Month Count level  slope season remainder
##
##  1 ets    2019 Jan 25.7   23.8 0.242   1.21    -0.185
##  2 ets    2019 Feb 66.9   23.8 0.232   2.91    -0.0587
##  3 ets    2019 Mar 24.0   23.5 0.216   1.08    -0.104
##  4 ets    2019 Apr 12.4   23.3 0.200   0.563   -0.0975
##  5 ets    2019 May  8.91  25.5 0.268   0.298    0.434
##  6 ets    2019 Jun 11.7   26.4 0.291   0.418    0.132
##  7 ets    2019 Jul 56.7   25.7 0.257   2.48    -0.192
##  8 ets    2019 Aug 21.5   26.9 0.291   0.732    0.197
##  9 ets    2019 Sep 12.4   28.2 0.324   0.404    0.183
## 10 ets    2019 Oct 17.4   28.4 0.319   0.622   -0.0225
## 11 ets    2019 Nov  8.57  27.8 0.287   0.341   -0.168
## 12 ets    2019 Dec  9.16  27.5 0.269   0.351   -0.0967
## 13 ets    2020 Jan 25.4   26.5 0.224   1.12    -0.245
## 14 ets    2020 Feb 12.3   22.4 0.0747  2.19    -0.841
fit %>% select(ets) %>% forecast(h = 2)
## # A fable: 2 x 4 [1M]
## # Key:     .model [1]
##   .model    Month      Count .mean
##
## 1 ets    2020 Mar  N(24, 19)  24.3
## 2 ets    2020 Apr N(13, 5.5)  12.7
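As a hedged illustration of how the printed point forecasts relate to the components, the base R sketch below recombines the last level and slope (2020 Feb row) with the most recent seasonal states for March and April (the 2019 Mar and 2019 Apr rows). It assumes the usual ETS(M,A,M) point forecast form (level + h × slope) × season, and it builds approximate 95% intervals from the rounded variances shown in the forecast() output. The variable names are chosen here for illustration and are not part of the exam code.

```r
# Final states, read from row 14 (2020 Feb) of the components table
level <- 22.4
slope <- 0.0747

# Latest seasonal states for the months being forecast
# (2019 Mar and 2019 Apr rows of the components table)
season <- c(Mar = 1.08, Apr = 0.563)

# Point forecasts for h = 1, 2: (level + h * slope) * season
h <- 1:2
point_fc <- (level + h * slope) * season

# Approximate 95% intervals, assuming normality and using the rounded
# variances printed by forecast(): N(24, 19) and N(13, 5.5)
fc_var <- c(Mar = 19, Apr = 5.5)
z <- qnorm(0.975)
intervals <- cbind(
  point = point_fc,
  lower = point_fc - z * sqrt(fc_var),
  upper = point_fc + z * sqrt(fc_var)
)
print(round(intervals, 2))
```

Rounded to one decimal place, the two point forecasts agree with the .mean column above (24.3 and 12.7). The intervals are only as precise as the rounded variances allow.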
6. Describe the alternative model estimated via the decomposition_model() function and comment on the forecasts plotted in Figure 8.
4 marks
fit %>%
  forecast(h = "3 years") %>%
  autoplot(ch_edu_arrivals, level = NULL) +
  ylab("Thousands of students")
Figure 8: Three-year forecasts from both models plotted with the data. x-axis: Month, 2005 Jan to 2020 Jan onwards; y-axis: Thousands of students; legend (.model): dcmp, ets.
END OF SECTION C
Total: 20 marks

SECTION D
Melbourne has a large number of sensors around the city which capture the number of pedestrians passing by. If we average the daily total pedestrians captured by each sensor, we get a measure of the pedestrian activity in the city each day.
Figures 9 and 10 relate to the daily average number of pedestrians across the City of Melbourne from 2019-01-01 until 2020-05-24.
Figure 9: Daily average number of pedestrians across Melbourne. x-axis: Date [1D], 2019-01 to 2020-05; y-axis: Thousands of persons.
Figure 10: Daily average number of pedestrians by day of the week. Panels: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday; x-axis: Date; y-axis: Thousands of persons.
1. Describe the important features of the time series.
4 marks
2. Study Figure 11 and specify a plausible ARIMA model. Justify your choices.
4 marks

average_daily_walkers %>%
  gg_tsdisplay(difference(Count, 7), plot_type = "partial")
Figure 11: Panels: difference(Count, 7) against Date; acf against lag [1D]; pacf against lag [1D].
3. Open the R file Exam2020_for_students.R provided to you in Moodle and run the first few lines to read in the Melbourne pedestrian data and create the average_daily_walkers tsibble object. Estimate the ARIMA model you have specified above. Check whether you are satisfied with the fitted model by performing some diagnostic checks of the residuals (clearly state any relevant parameters of any tests you may choose to conduct). Paste any relevant R output in the Moodle exam. (Further hints are included in the R file).
4 marks
4. A second ARIMA model is estimated using the following code. Briefly comment on the residuals and the forecasts generated from this model. Which of the two models do you prefer? Explain.
3 marks
fit_ARIMA_auto <- average_daily_walkers %>%
  model(ARIMA(Count,
    order_constraint = p + q + P + Q <= 3,
    approximation = FALSE
  ))
fit_ARIMA_auto %>% report()
## Series: Count
## Model: ARIMA(1,0,1)(0,1,1)[7]
##
## Coefficients:
##          ar1      ma1     sma1
##       0.9769  -0.6132  -0.8459
## s.e.  0.0113   0.0385   0.0305
##
## sigma^2 estimated as 1.974:  log likelihood=-887.33
## AIC=1782.7   AICc=1782.7   BIC=1799.5

fit_ARIMA_auto %>% gg_tsresiduals()
Figure 12: Residual diagnostics for the automatically selected ARIMA model. Panels: .resid against Date; acf against lag [1D]; histogram of .resid (count).
fit_ARIMA_auto %>%
  forecast(h = "28 days") %>%
  autoplot(average_daily_walkers)
Figure 13: 28-day forecasts plotted with the data. x-axis: Date, 2019-01 to 2020-07; y-axis: Count; legend (level): 80, 95.

5. Write out the estimated model from Q4 in full, first using backshift notation, then using subscript notation. Use the information below to generate forecasts for 1 and 2-steps ahead. Also generate a prediction interval for 1-step ahead. You must show all workings. (You can write out your answer elsewhere and upload it as an image if you prefer).
## # A tsibble: 14 x 3 [1D]
## Date Count .resid
##
## 1 2020-05-11 2.71 -0.190
## 2 2020-05-12 2.96 -0.289
## 3 2020-05-13 2.96 -0.0658
## 4 2020-05-14 3.20 -0.00536
## 5 2020-05-15 3.45 -0.0721
## 6 2020-05-16 3.68 1.51
## 7 2020-05-17 3.12 0.943
## 8 2020-05-18 3.22 -0.317
## 9 2020-05-19 3.17 -0.625
## 10 2020-05-20 3.26 -0.201
## 11 2020-05-21 2.83 -0.752
## 12 2020-05-22 3.61 0.000744
## 13 2020-05-23 2.98 0.464
## 14 2020-05-24 3.27 1.12
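The base R sketch below illustrates one way the forecasts could be assembled, and is not an official solution. It assumes the ARIMA(1,0,1)(0,1,1)[7] model without a constant is written as (1 − φ₁B)(1 − B⁷)y_t = (1 + θ₁B)(1 + Θ₁B⁷)ε_t, that the .resid column can stand in for the past errors, and that future errors are replaced by zero. The names phi, theta, Theta, y and e are introduced here for illustration.

```r
# Estimated coefficients and innovation variance from the report() output
phi    <- 0.9769   # ar1
theta  <- -0.6132  # ma1
Theta  <- -0.8459  # sma1
sigma2 <- 1.974

# Last 14 observations and residuals (2020-05-11 to 2020-05-24)
y <- c(2.71, 2.96, 2.96, 3.20, 3.45, 3.68, 3.12,
       3.22, 3.17, 3.26, 2.83, 3.61, 2.98, 3.27)
e <- c(-0.190, -0.289, -0.0658, -0.00536, -0.0721, 1.51, 0.943,
       -0.317, -0.625, -0.201, -0.752, 0.000744, 0.464, 1.12)
n <- length(y)  # index of the final observation, T

# Expanded (subscript) form assumed here:
# y_t = phi*y_{t-1} + y_{t-7} - phi*y_{t-8}
#       + e_t + theta*e_{t-1} + Theta*e_{t-7} + theta*Theta*e_{t-8}

# 1-step ahead: the unknown e_{T+1} is set to zero
f1 <- phi * y[n] + y[n - 6] - phi * y[n - 7] +
  theta * e[n] + Theta * e[n - 6] + theta * Theta * e[n - 7]

# 2-steps ahead: y_{T+1} is replaced by f1; e_{T+2} and e_{T+1} are set to zero
f2 <- phi * f1 + y[n - 5] - phi * y[n - 6] +
  Theta * e[n - 5] + theta * Theta * e[n - 6]

# 95% prediction interval for 1-step ahead: forecast variance equals sigma^2
pi1 <- f1 + c(-1, 1) * qnorm(0.975) * sqrt(sigma2)

print(c(f1 = f1, f2 = f2))
print(pi1)
```

Because the table values are rounded, results from this sketch may differ slightly from what forecast() would report for the fitted model.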
5 marks
END OF SECTION D
Total: 20 marks

SECTION E
You will now analyse the data from one of the sensors in Melbourne. Use the code provided in the file Exam2020_for_students.R (on Moodle) to create your series. (Each student will have a different series.)
1. Fit a regression model to the log data, with Fourier terms for the weekly seasonality and a piecewise trend having breaks at 11 March 2020 and at 1 April 2020. The following code can be used to fit this model.
fit <- myseries %>%
  model(
    TSLM(log(Count) ~ fourier(K = 3) +
      trend(knots = as.Date(c("2020-03-11", "2020-04-01"))))
  )
Write out the fitted model for the daily counts for your sensor. Interpret the trend coefficients. (You can write out your answer elsewhere and upload it as an image if you prefer)
5 marks
2. Suppose you used a training set to the end of February 2020, and a test set consisting of the remaining data, and compared the accuracy of several possible models on the test set. Would the results help you to choose an appropriate forecasting model? Explain.
2 marks
3. Why is it impossible to use time series cross-validation with this model?
2 marks
4. How could you decide where to place the knots with this model?
2 marks
5. Plot the residuals from the model using the following code
fit %>% gg_tsresiduals()
Upload the plot in Moodle. Explain how you might improve the model based on this plot.
3 marks
6. What makes the model unrealistic for forecasting more than a few weeks ahead?
2 marks
7. Suppose the government decided to remove all COVID-19 restrictions from 5 July. How would you allow for this in your forecasts?
4 marks
END OF SECTION E
Total: 20 marks
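To make the structure of the Section E regression more concrete, the base R sketch below builds, by hand, the kind of regressors that a piecewise linear trend with two knots and weekly Fourier terms with K = 3 correspond to. It is an illustration under assumptions, not the code used by TSLM(): the date range is assumed to match the pedestrian data described in Section D, and the column names are hypothetical.

```r
# Assumed daily date range (taken from the Section D description)
dates <- seq(as.Date("2019-01-01"), as.Date("2020-05-24"), by = "day")
t <- seq_along(dates)

# Knot positions expressed as time indices
knots <- as.Date(c("2020-03-11", "2020-04-01"))
tau <- as.integer(knots - dates[1]) + 1

# Piecewise linear trend: one overall trend plus a "hinge" term per knot.
# With coefficients b1, b2, b3 the slope of log(Count) is
#   b1 before the first knot, b1 + b2 between the knots, b1 + b2 + b3 after.
trend_terms <- cbind(
  trend    = t,
  trend_k1 = pmax(0, t - tau[1]),
  trend_k2 = pmax(0, t - tau[2])
)

# Fourier terms for weekly seasonality (period m = 7, K = 3 sine/cosine pairs)
m <- 7
K <- 3
fourier_terms <- do.call(cbind, lapply(seq_len(K), function(k) {
  out <- cbind(sin(2 * pi * k * t / m), cos(2 * pi * k * t / m))
  colnames(out) <- paste0(c("S", "C"), k, "_", m)  # hypothetical names
  out
}))

# Full regressor matrix (an intercept would be added by lm())
X <- cbind(trend_terms, fourier_terms)
print(head(round(X, 3)))
```

Each hinge term is zero until its knot and grows linearly afterwards, so its coefficient measures the change in slope at that date rather than the slope itself.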