Using expsmooth: worked examples
Justin Carmody, Rob J Hyndman
2023-07-18
Source:vignettes/expsmooth-vignette.Rmd
expsmooth-vignette.Rmd
Introduction
This package contains a collection of datasets that are designed to accompany the book “Forecasting with Exponential Smoothing: The State Space Approach” by Rob Hyndman, Anne B. Koehler, J. Keith Ord, and Ralph D. Snyder (Wiley, 3rd ed., 1998). The book can be purchased here.
When the expsmooth
package is loaded, the
forecast
package is also loaded, providing the functions to
fit and forecast with exponential smoothing state space models.
This vignette will replicate Section 2.8 from the book, and provide worked solutions to some of the exercises in Section 2.9. The forecast package has been updated since the book was published, and some of the resulting model estimates may be different from those in the book. Figure numbers are taken from the book.
Data sets
A graph of a time series often exhibits patterns, such as an upward
or downward movement (trend) or a pattern that repeats (seasonal
variation), that might be used to forecast future values. Chapters 1 and
2 reference four data sets that are included in the
expsmooth
package.
bonds 125 monthly US government bond yields (percent per annum) from January 1994 to May 2004.
usnetelec 55 observations of annual US net electricity generation (billion kwh) for 1949 through 2003.
ukcars 113 quarterly observations of passenger motor vehicle production in the UK (thousands of cars) for the first quarter of 1977 through the first quarter of 2005.
visitors 240 monthly observations of the number of short term overseas visitors to Australia from May 1985 to April 2005.
These time series are shown in Figure 1.1 which is reproduced below:
plot_bonds <- autoplot(bonds) +
ggtitle('(a) US 10-year Bonds Yield') +
xlab('Year') +
ylab('Percentage per Annum')
plot_usnetelec <- autoplot(usnetelec) +
ggtitle('(b) US Net Electricity Generation') +
xlab('Year') +
ylab('Billion kWh')
plot_ukcars <- autoplot(ukcars) +
ggtitle('(c) UK Passenger Vehicle Production') +
xlab('Year') +
ylab('Thousands of Cars')
plot_visitors <- autoplot(visitors) +
ggtitle('(d) Overseas Visitors to Australia') +
xlab('Year') +
ylab('Thousands of People')
gridExtra::grid.arrange(plot_bonds, plot_usnetelec, plot_ukcars, plot_visitors, nrow=2)
Model selection exercise
This part of the vignette will follow the methodology described in Section 2.8 of the book, and reproduce the results that are reported there. This also provides answers to Exercise 2.3 and Exercise 2.4.
The estimation and model selection are performed by the
ets()
function and the forecasting is done by the
forecast()
function. These are both a part of the
forecast
package. A basic introduction to using these
functions is given in Section 7.6 and Section
7.7 of “Forecasting:
Principles and Practice” by George Athanasopoulos and Rob J.
Hyndman.
The automatic forecasting process will by carried out for each of the four data sets. The process will be explained for the first data set, and the relevant results will be reported for the others.
bonds
The ets()
function is used to apply all appropriate
models (optimising parameters in each case), and then it also selects
the best model according to the AICc. The AICc is the default penalised
likelihood, but others can be specified.
fit_bonds <- ets(bonds)
summary(fit_bonds)
#> ETS(A,Ad,N)
#>
#> Call:
#> ets(y = bonds)
#>
#> Smoothing parameters:
#> alpha = 0.9999
#> beta = 0.0954
#> phi = 0.8026
#>
#> Initial states:
#> l = 5.3252
#> b = 0.5934
#>
#> sigma: 0.2428
#>
#> AIC AICc BIC
#> 256.5383 257.2502 273.5082
#>
#> Training set error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set -0.01622111 0.2378765 0.1969524 -0.3276314 3.611147 0.2440042
#> ACF1
#> Training set 0.1426552
The autoplot()
function can then be used to show the
states over time.
autoplot(fit_bonds)
The forecast()
function is used to produce point
forecasts and prediction intervals. When the summary()
function is called on the forecast object, it prints the model
information along with the point forecasts and prediction intervals.
forecast_bonds <- forecast(fit_bonds, level=80)
summary(forecast_bonds)
#>
#> Forecast method: ETS(A,Ad,N)
#>
#> Model Information:
#> ETS(A,Ad,N)
#>
#> Call:
#> ets(y = bonds)
#>
#> Smoothing parameters:
#> alpha = 0.9999
#> beta = 0.0954
#> phi = 0.8026
#>
#> Initial states:
#> l = 5.3252
#> b = 0.5934
#>
#> sigma: 0.2428
#>
#> AIC AICc BIC
#> 256.5383 257.2502 273.5082
#>
#> Error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set -0.01622111 0.2378765 0.1969524 -0.3276314 3.611147 0.2440042
#> ACF1
#> Training set 0.1426552
#>
#> Forecasts:
#> Point Forecast Lo 80 Hi 80
#> Jun 2004 4.744090 4.432953 5.055227
#> Jul 2004 4.779505 4.322348 5.236661
#> Aug 2004 4.807928 4.229691 5.386165
#> Sep 2004 4.830740 4.144568 5.516913
#> Oct 2004 4.849049 4.063834 5.634264
#> Nov 2004 4.863744 3.986280 5.741207
#> Dec 2004 4.875537 3.911379 5.839695
#> Jan 2005 4.885002 3.838871 5.931134
#> Feb 2005 4.892599 3.768607 6.016591
#> Mar 2005 4.898696 3.700482 6.096910
#> Apr 2005 4.903589 3.634409 6.172770
#> May 2005 4.907517 3.570303 6.244731
#> Jun 2005 4.910669 3.508080 6.313258
#> Jul 2005 4.913199 3.447657 6.378740
#> Aug 2005 4.915229 3.388948 6.441510
#> Sep 2005 4.916859 3.331869 6.501848
#> Oct 2005 4.918166 3.276338 6.559995
#> Nov 2005 4.919216 3.222273 6.616159
#> Dec 2005 4.920058 3.169596 6.670521
#> Jan 2006 4.920735 3.118233 6.723236
#> Feb 2006 4.921277 3.068112 6.774442
#> Mar 2006 4.921713 3.019167 6.824258
#> Apr 2006 4.922062 2.971334 6.872791
#> May 2006 4.922343 2.924552 6.920134
The autoplot()
function can then be called on the
forecast object to produce the graph shown in Figure
2.1.
autoplot(forecast_bonds) +
xlab('Year') +
ylab('Percentage per annum') +
ggtitle("US 10-year bond yields") +
labs(caption="Fig. 2.1a")
usnetelec
fit_usnetelec <- ets(usnetelec)
forecast_usnetelec <- forecast(fit_usnetelec, level=80)
summary(forecast_usnetelec)
#>
#> Forecast method: ETS(M,A,N)
#>
#> Model Information:
#> ETS(M,A,N)
#>
#> Call:
#> ets(y = usnetelec)
#>
#> Smoothing parameters:
#> alpha = 0.9999
#> beta = 0.2191
#>
#> Initial states:
#> l = 254.9338
#> b = 38.3125
#>
#> sigma: 0.0259
#>
#> AIC AICc BIC
#> 634.0437 635.2682 644.0803
#>
#> Error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set 1.162583 52.00363 36.77721 0.2629582 1.942062 0.5211014
#> ACF1
#> Training set 0.006113498
#>
#> Forecasts:
#> Point Forecast Lo 80 Hi 80
#> 2004 3900.329 3770.801 4029.857
#> 2005 3952.650 3747.279 4158.022
#> 2006 4004.972 3725.589 4284.355
#> 2007 4057.293 3701.885 4412.701
#> 2008 4109.614 3674.968 4544.259
#> 2009 4161.935 3644.367 4679.503
#> 2010 4214.256 3609.881 4818.632
#> 2011 4266.577 3571.428 4961.726
#> 2012 4318.898 3528.985 5108.812
#> 2013 4371.220 3482.552 5259.888
autoplot(forecast_usnetelec) +
xlab('Year') +
ylab('Billion kwh') +
ggtitle("US net electricity generation") +
labs(caption="Fig. 2.1b")
ukcars
fit_ukcars <- ets(ukcars)
forecast_ukcars <- forecast(fit_ukcars, level=80)
summary(forecast_ukcars)
#>
#> Forecast method: ETS(A,N,A)
#>
#> Model Information:
#> ETS(A,N,A)
#>
#> Call:
#> ets(y = ukcars)
#>
#> Smoothing parameters:
#> alpha = 0.6199
#> gamma = 1e-04
#>
#> Initial states:
#> l = 314.2568
#> s = -1.7579 -44.9601 21.1956 25.5223
#>
#> sigma: 25.9302
#>
#> AIC AICc BIC
#> 1277.752 1278.819 1296.844
#>
#> Error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set 1.313884 25.23244 20.17907 -0.1570979 6.629003 0.6576259
#> ACF1
#> Training set 0.02573334
#>
#> Forecasts:
#> Point Forecast Lo 80 Hi 80
#> 2005 Q2 427.4885 394.2576 460.7195
#> 2005 Q3 361.3329 322.2353 400.4305
#> 2005 Q4 404.5358 360.3437 448.7280
#> 2006 Q1 431.8154 383.0568 480.5741
#> 2006 Q2 427.4885 374.5571 480.4200
#> 2006 Q3 361.3329 304.5345 418.1313
#> 2006 Q4 404.5358 344.1174 464.9542
#> 2007 Q1 431.8154 367.9809 495.6500
autoplot(forecast_ukcars) +
xlab('Year') +
ylab('Thousands of cars') +
ggtitle("UK passenger motor vehicle production") +
labs(caption="Fig. 2.1c")
visitors
fit_visitors <- ets(visitors)
forecast_visitors <- forecast(fit_visitors, level=80)
summary(forecast_visitors)
#>
#> Forecast method: ETS(M,A,M)
#>
#> Model Information:
#> ETS(M,A,M)
#>
#> Call:
#> ets(y = visitors)
#>
#> Smoothing parameters:
#> alpha = 0.6146
#> beta = 2e-04
#> gamma = 0.192
#>
#> Initial states:
#> l = 92.9631
#> b = 2.2221
#> s = 0.9378 1.0666 1.0669 0.9625 1.3768 1.113
#> 1.0012 0.8219 0.9317 1.0046 0.8755 0.8413
#>
#> sigma: 0.0536
#>
#> AIC AICc BIC
#> 2603.654 2606.411 2662.825
#>
#> Error measures:
#> ME RMSE MAE MPE MAPE MASE
#> Training set -1.314437 15.89924 11.55716 -0.5970068 4.126055 0.4267949
#> ACF1
#> Training set 0.03686264
#>
#> Forecasts:
#> Point Forecast Lo 80 Hi 80
#> May 2005 361.7821 336.9468 386.6173
#> Jun 2005 396.8994 364.9446 428.8542
#> Jul 2005 495.9425 450.9283 540.9567
#> Aug 2005 428.7142 385.8826 471.5457
#> Sep 2005 424.6121 378.6605 470.5638
#> Oct 2005 472.8124 418.0207 527.6040
#> Nov 2005 495.6660 434.6881 556.6439
#> Dec 2005 610.9775 531.7211 690.2339
#> Jan 2006 462.3847 399.4790 525.2903
#> Feb 2006 511.2647 438.6401 583.8893
#> Mar 2006 501.9930 427.8139 576.1721
#> Apr 2006 441.2086 373.5980 508.8191
#> May 2006 382.3174 320.2398 444.3951
#> Jun 2006 419.3232 349.1686 489.4779
#> Jul 2006 523.8324 433.6993 613.9654
#> Aug 2006 452.7122 372.7330 532.6915
#> Sep 2006 448.2715 367.0789 529.4642
#> Oct 2006 499.0372 406.4927 591.5818
#> Nov 2006 523.0335 423.8434 622.2236
#> Dec 2006 644.5591 519.6916 769.4266
#> Jan 2007 487.6846 391.2701 584.0991
#> Feb 2007 539.1138 430.4447 647.7829
#> Mar 2007 529.2150 420.5430 637.8870
#> Apr 2007 465.0281 367.8225 562.2336
autoplot(forecast_visitors) +
xlab('Year') +
ylab('Thousands of people') +
ggtitle("Overseas visitors to Australia") +
labs(caption="Fig. 2.1d")