The Diebold-Mariano test compares the forecast accuracy of two forecast methods.
Arguments
- e1
Forecast errors from method 1.
- e2
Forecast errors from method 2.
- alternative
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter.
- h
The forecast horizon used in calculating e1 and e2.
- power
The power used in the loss function. Usually 1 or 2.
- varestimator
a character string specifying the long-run variance estimator. Options are "acf" (default) or "bartlett".
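Putting the defaults together, a call with every argument spelled out takes the form below (e1 and e2 stand for the two error series; the call itself is illustrative):

dm.test(e1, e2, alternative = "two.sided", h = 1, power = 2,
  varestimator = "acf")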
Value
A list with class "htest"
containing the following
components:
- statistic
the value of the DM-statistic.
- parameter
the forecast horizon and loss function power used in the test.
- alternative
a character string describing the alternative hypothesis.
- varestimator
a character string describing the long-run variance estimator.
- p.value
the p-value for the test.
- method
a character string with the value "Diebold-Mariano Test".
- data.name
a character vector giving the names of the two error series.
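Since the result is an ordinary "htest" list, the components above can be extracted directly. A short sketch using the same models as in the Examples below:

library(forecast)
fit1 <- ets(WWWusage)
fit2 <- auto.arima(WWWusage)
test <- dm.test(residuals(fit1), residuals(fit2), h = 1)
test$statistic  # value of the DM statistic
test$p.value    # p-value of the test
test$parameter  # forecast horizon and loss function power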
Details
This function implements the modified test proposed by Harvey, Leybourne and
Newbold (1997). The null hypothesis is that the two methods have the same
forecast accuracy. For alternative = "less", the alternative hypothesis is
that method 2 is less accurate than method 1. For alternative = "greater",
the alternative hypothesis is that method 2 is more accurate than method 1.
For alternative = "two.sided", the alternative hypothesis is that method 1
and method 2 have different levels of accuracy. The long-run variance
estimator can be either the auto-correlation estimator
(varestimator = "acf"), or the estimator based on Bartlett weights
(varestimator = "bartlett"), which ensures a positive estimate. Both
long-run variance estimators are proposed in Diebold and Mariano (1995).
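To make the construction concrete, here is a minimal sketch of the modified statistic under the "acf" estimator, written from the description above rather than taken from the package source (dm_sketch is a hypothetical helper):

# Loss differential d is averaged; its long-run variance uses the
# autocovariances of d up to lag h-1; the Harvey-Leybourne-Newbold
# factor k applies the small-sample correction for horizon h.
dm_sketch <- function(e1, e2, h = 1, power = 2) {
  d <- abs(e1)^power - abs(e2)^power
  n <- length(d)
  g <- acf(d, lag.max = h - 1, type = "covariance", plot = FALSE)$acf[, 1, 1]
  v <- (g[1] + 2 * sum(g[-1])) / n   # long-run variance of mean(d)
  k <- sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
  k * mean(d) / sqrt(v)              # compare with a t distribution, n-1 df
}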
References
Diebold, F.X. and Mariano, R.S. (1995) Comparing predictive accuracy. Journal of Business and Economic Statistics, 13, 253-263.
Harvey, D., Leybourne, S. and Newbold, P. (1997) Testing the equality of prediction mean squared errors. International Journal of Forecasting, 13(2), 281-291.
Examples
library(forecast)

# Test on in-sample one-step forecasts
f1 <- ets(WWWusage)
f2 <- auto.arima(WWWusage)
accuracy(f1)
#> ME RMSE MAE MPE MAPE MASE ACF1
#> Training set 0.2243266 3.40781 2.761668 0.2629465 2.162415 0.6102792 0.2308014
accuracy(f2)
#> ME RMSE MAE MPE MAPE MASE
#> Training set 0.3035616 3.113754 2.405275 0.2805566 1.917463 0.5315228
#> ACF1
#> Training set -0.01715517
dm.test(residuals(f1), residuals(f2), h = 1)
#>
#> Diebold-Mariano Test
#>
#> data: residuals(f1)residuals(f2)
#> DM = 1.9078, Forecast horizon = 1, Loss function power = 2, p-value =
#> 0.05932
#> alternative hypothesis: two.sided
#>
# Test on out-of-sample one-step forecasts
f1 <- ets(WWWusage[1:80])
f2 <- auto.arima(WWWusage[1:80])
f1.out <- ets(WWWusage[81:100], model = f1)
#> Model is being refit with current smoothing parameters but initial states are being re-estimated.
#> Set 'use.initial.values=TRUE' if you want to re-use existing initial values.
f2.out <- Arima(WWWusage[81:100], model = f2)
accuracy(f1.out)
#> ME RMSE MAE MPE MAPE MASE ACF1
#> Training set 0.2100836 3.24835 2.570459 0.1203497 1.352355 0.4246845 0.2287215
accuracy(f2.out)
#> ME RMSE MAE MPE MAPE MASE
#> Training set 1.081679 3.329012 2.437119 0.6810673 1.375924 0.4026544
#> ACF1
#> Training set -0.004460367
dm.test(residuals(f1.out), residuals(f2.out), h = 1)
#>
#> Diebold-Mariano Test
#>
#> data: residuals(f1.out)residuals(f2.out)
#> DM = -0.14392, Forecast horizon = 1, Loss function power = 2, p-value =
#> 0.8871
#> alternative hypothesis: two.sided
#>
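The alternative hypothesis and the variance estimator can be varied through the corresponding arguments. For example, a one-sided test with the Bartlett estimator, reusing the out-of-sample fits above (output omitted):

# Alternative: method 2 (ARIMA) is more accurate than method 1 (ETS)
dm.test(residuals(f1.out), residuals(f2.out), h = 1,
  alternative = "greater", varestimator = "bartlett")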