Visualization of complex seasonal patterns in time series

Rob J Hyndman

23 September 2022

Complex seasonalities

Complex seasonalities

Complex seasonalities

Complex seasonalities

Complex seasonal topology

Example: hourly data

Complex seasonalities

  • Multiple seasonal periods, not necessarily nested
  • Non-integer seasonality
  • Irregular seasonal topography
  • Seasonality that depends on covariates
  • Complex seasonal topology
  • How to effectively visualise the underlying seasonalities?
  • How to decompose such time series into trend and multiple season components?

Visualizing complex seasonalities

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Granularity plots

Faceted granularity plots

Faceted granularity plots

Faceted granularity plots

Faceted granularity plots

Granularities

Nested linear granularities

hhour, hour, day, week, fortnight, quarter, semester, year

Available cyclic granularities for half-hourly data

hhour/hour, hhour/day, hhour/week, hhour/fortnight, hhour/month, hhour/quarter, hhour/semester, hhour/year, hour/day, hour/week, hour/fortnight, hour/month, hour/quarter, hour/semester, hour/year, day/week, day/fortnight, day/month, day/quarter, day/semester, day/year, week/fortnight, week/month, week/quarter, week/semester, week/year, fortnight/month, fortnight/quarter, fortnight/semester, fortnight/year, month/quarter, month/semester, month/year, quarter/semester, quarter/year, semester/year

Plot options

  • raw data or distributional summary on y-axis
  • granularity on x-axis
  • optional granularity as facet

What is an interesting plot?

Single granularity plots

  • Compute Jensen-Shannon divergences between distributions q_1 and q_2: JSD(q_1,q_2) = \textstyle\frac{1}{2}D(q_1,M) + \frac{1}{2}D(q_2,M), where M = \frac{1}{2}(q_1+q_2) and D(q_1,q_2) is KL divergence.

  • Measure effectiveness of a plot as maximum JSD for that plot (adjusted for number of levels).

  • Users can be guided to view the most effective plots.

Normalization of maximum JSD

The distribution of max JSD depends on number of levels n.

Permutation approach (for small n)

  • Compute max JSD after permuting the levels.
  • Normalize by mean and standard deviation of permuted max JSD values

Modelling approach (for large n)

  • Fit a Gumbel GLM to max JSD from simulated N(0,1) data with n as covariate.
  • Standardize original data by \Phi^{-1}(), compute max JSD, and normalize by mean and standard deviation from model.

Single granularity plots

x Normalized maximum JSD
hhour/week 72.8
day/year 67.0
week/year 31.8
hhour/day 24.4
day/week 21.8
month/year 15.0
day/month -7.0
week/month -10.5

Single granularity plots

Single granularity plots

Single granularity plots

Single granularity plots

Single granularity plots

Single granularity plots

Single granularity plots

Single granularity plots

What is an interesting faceted plot?

Faceted granularity plots

  • Measure effectiveness of a plot as maximum JSD for that plot

    • weight within panel differences higher than between panel differences (weight 2:1)
    • normalization to adjust for number of levels and panels
  • Omit combinations with empty or near-empty intersections (“clashes”). e.g., day/year \times month/year

  • Omit multi-step nested granularities. e.g,. day/year, hhour/week

  • Omit facets with 20+ levels

Faceted granularity plots

Faceted granularity plots

Faceted granularity plots

References

 

  • Sayani Gupta, Rob J Hyndman, Dianne Cook and Antony Unwin (2022) Visualizing probability distributions across bivariate cyclic temporal granularities. J Computational & Graphical Statistics, 31(1), 14-25.
  • Sayani Gupta, Rob J Hyndman, Dianne Cook (2022) Detecting distributional differences between temporal granularities for exploratory time series analysis. Work in progress.

Time series decomposition for complex seasonalities

Time series decomposition for complex seasonalities

  • Multiple seasonal periods, not necessarily nested
  • Non-integer seasonality
  • Irregular seasonal topography
  • Seasonality that depends on covariates
  • Complex seasonal topology

No existing decomposition method handles all of these.

Two solutions

  1. MSTL: For multiple integer seasonal periods.
  2. STR: For all types of complex seasonality.

MSTL

 

  • Kasun Bandara, Rob J Hyndman, Christoph Bergmeir (2022) MSTL: A Seasonal-Trend Decomposition Algorithm for Time Series with Multiple Seasonal Patterns. International J Operational Research, to appear. robjhyndman.com/publications/mstl/
  • For multiple integer seasonal periods with additive components
  • Implemented in R packages forecast and fable.

MSTL

velec |>
  model(STL(Demand)) |>
  components() |>
  autoplot()

MSTL

MSTL

y_t = T_t + \sum_{i=1}^I S_t^{(i)} + R_t

y_t= observation at time t
T_t= smooth trend component
S_t^{(i)}= seasonal component i
i = 1,\dots,I
R_t= remainder component

Estimation

Components updated iteratively.

MSTL

# y: time series as vector
# periods: vector of seasonal periods in increasing order
# swindow: seasonal window values
# iterate: number of  STL iterations

seasonality <- matrix(0, nrow = length(y), ncol = length(periods))
deseas <- y
for (j in 1:iterate) {
  for (i in 1:length(periods)) {
    deseas <- deseas + seasonality[, i]
    fit <- stl(ts(deseas, frequency = periods[i]), s.window = swindow[i])
    seasonality[, i] <- fit$season
    deseas <- deseas - seasonality[, i]
  }
}
trend <- fit$trend
remainder <- deseas - trend
return(trend, seasonality, remainder)

MSTL

fable syntax

tsibble |>
  model(STL(variable ~ season(period = a, window = b) +
                       season(period = c, window = d)))



forecast syntax

vector |>
  msts(seasonal.periods = c(a, c)) |>
  mstl(s.window = c(b, d))

STR

 

  • Alex Dokumentov and Rob J Hyndman (2022) STR: Seasonal-Trend decomposition using Regression. INFORMS Journal on Data Science, 1(1), 50-62. robjhyndman.com/publications/str/

  • Implemented in R package stR.

STR

y_{t} = T_{t} + \sum_{i=1}^{I} S^{(i)}_{t} + \sum_{p=1}^P \phi_{p,t} z_{t,p} + R_{t}

T_t= smooth trend component
S_t^{(i)}= seasonal component i (possibly complex topology)
z_{p,t}= covariate with coefficient \phi_{p,t} (possibly time-varying)
R_t= remainder component

Estimation

Components estimated using penalized MLE

Smoothness via difference operators

Smooth trend obtained by requiring \Delta_2 T_t \sim \text{NID}(0,\sigma_L^2)

  • \Delta_2 = (1-B)^2 where B= backshift operator
  • \sigma_L controls smoothness

f(\bm{D}_\ell \bm{\ell}) \propto \exp\left\{-\frac{1}{2}\big\|\bm{D}_\ell \bm{\ell} / \sigma_L\big\|_{L_2}^2\right\}

  • \bm{\ell} = \langle T_{t} \rangle_{t=1}^{n}
  • \bm{D}_\ell= 2nd difference operator matrix: \bm{D}_\ell\bm{\ell} = \langle\Delta^2 T_{t}\rangle_{t=3}^n

Smooth 2D seasonal surfaces

  • m_i= number of “seasons” in S^{(i)}_{t}.
  • S^{(i)}_{k,t}= 2d season (k=1,\dots,m_i;t=1,\dots,n)
  • \sum\limits_k S^{(i)}_{k,t} = 0 for each t.

Smooth 2D seasonal surfaces

  • \bm{S}^{(i)} = [S_{k,t}^{(i)}] the ith seasonal surface matrix
  • \bm{s}_i = \text{vec}(\bm{S}_i)= the ith seasonal surface in vector form

Smoothness in time t direction:

\begin{align*} \bm{D}_{tt,i} \bm{s}_i &= \langle \Delta^2_{t} \bm{S}^{(i)}_{k,t} \rangle \sim \text{NID}(\bm{0},\sigma_{i}^2 \bm{\Sigma}_{i})\\ f(\bm{s}_i) &\propto \exp\Big\{-\frac{1}{2}\big\|\ \bm{D}_{tt,i}\bm{s}_i / \sigma_i\big\|_{L_2}^2\Big\} \end{align*}

Analogous difference matrices \bm{D}_{kk,i} and \bm{D}_{kt,i} ensure smoothness in season and time-season directions.

Gaussian remainders

  • R_{t} \sim \text{NID}(0,\sigma_R^2).
  • \bm{y} = [y_1,\dots,y_n]'= vector of observations
  • \bm{Z}=[z_{t,p}]= covariate matrix with coefficient \bm{\Phi} = [\phi_{p,t}]
  • \bm{Q}_i= matrix that extracts \langle S^{(i)}_{\kappa(t),t} \rangle_{t=1}^{n} from \bm{s}_i.
  • Residuals: \bm{r} = \bm{y} - \sum_i\bm{Q}_i\bm{s}_i -\bm{\ell} - \bm{Z}\bm{\Phi} have density f(\bm{r}) \propto \exp\Big\{-\frac{1}{2}\big\|\bm{r}/\sigma_R\big\|_{L_2}^2\Big\},

MLE for STR

Minimize wrt \bm{\Phi}, \bm{\ell} and \bm{s}_i:

\begin{align*} -\log \mathcal{L} &= \frac{1}{2\sigma_R} \Bigg\{ \Big\| \bm{y}- \sum_{i=1}^I \bm{Q}_i\bm{s}_i - \bm{\ell} - \bm{Z}\bm{\Phi} \Big\|_{L_2}^2 + \lambda_\ell\Big\|\bm{D}_\ell \bm{\ell}\Big\|_{L_2}^2 \\ & \hspace*{1cm} + \sum_{i=1}^{I}\left( \left\|\lambda_{tt,i} \bm{D}_{tt,i} \bm{s}_i \right\|_{L_2}^2 + \left\|\lambda_{st,i} \bm{D}_{st,i} \bm{s}_i \right\|_{L_2}^2 + \left\|\lambda_{ss,i} \bm{D}_{ss,i} \bm{s}_i \right\|_{L_2}^2 \right) \Bigg\} \end{align*}

Equivalent to linear model

\bm{y}_{+} = \bm{X}\bm{\beta} + \bm{\varepsilon}

  • \bm{y}_{+} = [\bm{y}',~ \bm{0}']'
  • \bm{\varepsilon} \sim N(\bm{0},\sigma_R^2\bm{I})

\bm{X} = \begin{bmatrix} \bm{Q}_1 & \dots & \bm{Q}_I & \bm{I}_n & \bm{Z} \\ \lambda_{tt,1} \bm{D}_{tt,1} & \dots & 0 & 0 & 0 \\ \lambda_{st,1} \bm{D}_{st,1} & \dots & 0 & 0 & 0 \\ \lambda_{ss,1} \bm{D}_{ss,1} & \dots & 0 & 0 & 0 \\ 0 & \ddots & 0 & 0 & 0 \\ 0 & \dots & \lambda_{tt,I} \bm{D}_{tt,I} & 0 & 0 \\ 0 & \dots & \lambda_{st,I} \bm{D}_{st,I} & 0 & 0 \\ 0 & \dots & \lambda_{ss,I} \bm{D}_{ss,I} & 0 & 0 \\ 0 & \dots & 0 & \lambda_\ell \bm{D}_{tt} & 0 \end{bmatrix}

STR

Three seasonal components, quadratic temperature regressors

STR outliers


R packages

Thanks to my collaborators

OTexts.com/fppit

For more information

Slides: robjhyndman.com/seminars/padova2022.html