Visualization of complex seasonal patterns in time series

Rob J Hyndman

23 September 2022

Complex seasonalities

Complex seasonal topology

Example: hourly data

Complex seasonalities

Multiple seasonal periods, not necessarily nested
Non-integer seasonality
Irregular seasonal topography
Seasonality that depends on covariates
Complex seasonal topology

How to effectively visualise the underlying seasonalities?
How to decompose such time series into trend and multiple season components?

Visualizing complex seasonalities

Granularity plots

Faceted granularity plots

Granularities

Nested linear granularities

hhour, hour, day, week, fortnight, quarter, semester, year

Available cyclic granularities for half-hourly data

hhour/hour, hhour/day, hhour/week, hhour/fortnight, hhour/month, hhour/quarter, hhour/semester, hhour/year, hour/day, hour/week, hour/fortnight, hour/month, hour/quarter, hour/semester, hour/year, day/week, day/fortnight, day/month, day/quarter, day/semester, day/year, week/fortnight, week/month, week/quarter, week/semester, week/year, fortnight/month, fortnight/quarter, fortnight/semester, fortnight/year, month/quarter, month/semester, month/year, quarter/semester, quarter/year, semester/year

Plot options

raw data or distributional summary on y-axis
granularity on x-axis
optional granularity as facet

What is an interesting plot?

Single granularity plots

Compute Jensen-Shannon divergences between distributions q_1 and q_2: JSD(q_1,q_2) = \textstyle\frac{1}{2}D(q_1,M) + \frac{1}{2}D(q_2,M), where M = \frac{1}{2}(q_1+q_2) and D(q_1,q_2) is KL divergence.
Measure effectiveness of a plot as maximum JSD for that plot (adjusted for number of levels).
Users can be guided to view the most effective plots.

Normalization of maximum JSD

The distribution of max JSD depends on number of levels n.

Permutation approach (for small n)

Compute max JSD after permuting the levels.
Normalize by mean and standard deviation of permuted max JSD values

Modelling approach (for large n)

Fit a Gumbel GLM to max JSD from simulated N(0,1) data with n as covariate.
Standardize original data by \Phi^{-1}(), compute max JSD, and normalize by mean and standard deviation from model.

Single granularity plots

x	Normalized maximum JSD
`hhour/week`	72.8
`day/year`	67.0
`week/year`	31.8
`hhour/day`	24.4
`day/week`	21.8
`month/year`	15.0
`day/month`	-7.0
`week/month`	-10.5

Single granularity plots

What is an interesting faceted plot?

Faceted granularity plots

Measure effectiveness of a plot as maximum JSD for that plot
- weight within panel differences higher than between panel differences (weight 2:1)
- normalization to adjust for number of levels and panels
Omit combinations with empty or near-empty intersections (“clashes”). e.g., day/year \times month/year
Omit multi-step nested granularities. e.g,. day/year, hhour/week
Omit facets with 20+ levels

Recommended faceted plots

facet	x	facet levels	x levels	Max JSD
`month/year`	`hhour/day`	12	48	123.7
`day/week`	`hhour/day`	7	48	76.6
`month/year`	`day/week`	12	7	63.3
`month/year`	`week/month`	12	5	59.7
`week/month`	`month/year`	5	12	55.9
`week/month`	`hhour/day`	5	48	51.5
`week/month`	`day/week`	5	7	47.4
`day/week`	`day/month`	7	31	44.1
`day/week`	`month/year`	7	12	37.7
`day/week`	`week/month`	7	5	23.9

Faceted granularity plots

References

Sayani Gupta, Rob J Hyndman, Dianne Cook and Antony Unwin (2022) Visualizing probability distributions across bivariate cyclic temporal granularities. J Computational & Graphical Statistics, 31(1), 14-25.
Sayani Gupta, Rob J Hyndman, Dianne Cook (2022) Detecting distributional differences between temporal granularities for exploratory time series analysis. Work in progress.

Time series decomposition for complex seasonalities

Multiple seasonal periods, not necessarily nested
Non-integer seasonality
Irregular seasonal topography
Seasonality that depends on covariates
Complex seasonal topology

No existing decomposition method handles all of these.

Two solutions

MSTL: For multiple integer seasonal periods.
STR: For all types of complex seasonality.

MSTL

Kasun Bandara, Rob J Hyndman, Christoph Bergmeir (2022) MSTL: A Seasonal-Trend Decomposition Algorithm for Time Series with Multiple Seasonal Patterns. International J Operational Research, to appear. robjhyndman.com/publications/mstl/
For multiple integer seasonal periods with additive components
Implemented in R packages forecast and fable.

MSTL

velec |>
  model(STL(Demand)) |>
  components() |>
  autoplot()

MSTL

y_t = T_t + \sum_{i=1}^I S_t^{(i)} + R_t

y_t=	observation at time t
T_t=	smooth trend component
S_t^{(i)}=	seasonal component i
	i = 1,\dots,I
R_t=	remainder component

Estimation

Components updated iteratively.

MSTL

# y: time series as vector
# periods: vector of seasonal periods in increasing order
# swindow: seasonal window values
# iterate: number of  STL iterations

seasonality <- matrix(0, nrow = length(y), ncol = length(periods))
deseas <- y
for (j in 1:iterate) {
  for (i in 1:length(periods)) {
    deseas <- deseas + seasonality[, i]
    fit <- stl(ts(deseas, frequency = periods[i]), s.window = swindow[i])
    seasonality[, i] <- fit$season
    deseas <- deseas - seasonality[, i]
  }
}
trend <- fit$trend
remainder <- deseas - trend
return(trend, seasonality, remainder)

MSTL

fable syntax

tsibble |>
  model(STL(variable ~ season(period = a, window = b) +
                       season(period = c, window = d)))

forecast syntax

vector |>
  msts(seasonal.periods = c(a, c)) |>
  mstl(s.window = c(b, d))

STR

Alex Dokumentov and Rob J Hyndman (2022) STR: Seasonal-Trend decomposition using Regression. INFORMS Journal on Data Science, 1(1), 50-62. robjhyndman.com/publications/str/
Implemented in R package stR.

STR

y_{t} = T_{t} + \sum_{i=1}^{I} S^{(i)}_{t} + \sum_{p=1}^P \phi_{p,t} z_{t,p} + R_{t}

T_t=	smooth trend component
S_t^{(i)}=	seasonal component i (possibly complex topology)
z_{p,t}=	covariate with coefficient \phi_{p,t} (possibly time-varying)
R_t=	remainder component

Estimation

Components estimated using penalized MLE

Smoothness via difference operators

Smooth trend obtained by requiring \Delta_2 T_t \sim \text{NID}(0,\sigma_L^2)

\Delta_2 = (1-B)^2 where B= backshift operator
\sigma_L controls smoothness

f(\bm{D}_\ell \bm{\ell}) \propto \exp\left\{-\frac{1}{2}\big\|\bm{D}_\ell \bm{\ell} / \sigma_L\big\|_{L_2}^2\right\}

\bm{\ell} = \langle T_{t} \rangle_{t=1}^{n}
\bm{D}_\ell= 2nd difference operator matrix: \bm{D}_\ell\bm{\ell} = \langle\Delta^2 T_{t}\rangle_{t=3}^n

Smooth 2D seasonal surfaces

m_i= number of “seasons” in S^{(i)}_{t}.
S^{(i)}_{k,t}= 2d season (k=1,\dots,m_i;t=1,\dots,n)
\sum\limits_k S^{(i)}_{k,t} = 0 for each t.

Smooth 2D seasonal surfaces

\bm{S}^{(i)} = [S_{k,t}^{(i)}] the ith seasonal surface matrix
\bm{s}_i = \text{vec}(\bm{S}_i)= the ith seasonal surface in vector form

Smoothness in time t direction:

\begin{align*} \bm{D}_{tt,i} \bm{s}_i &= \langle \Delta^2_{t} \bm{S}^{(i)}_{k,t} \rangle \sim \text{NID}(\bm{0},\sigma_{i}^2 \bm{\Sigma}_{i})\\ f(\bm{s}_i) &\propto \exp\Big\{-\frac{1}{2}\big\|\ \bm{D}_{tt,i}\bm{s}_i / \sigma_i\big\|_{L_2}^2\Big\} \end{align*}

Analogous difference matrices \bm{D}_{kk,i} and \bm{D}_{kt,i} ensure smoothness in season and time-season directions.

Gaussian remainders

R_{t} \sim \text{NID}(0,\sigma_R^2).
\bm{y} = [y_1,\dots,y_n]'= vector of observations
\bm{Z}=[z_{t,p}]= covariate matrix with coefficient \bm{\Phi} = [\phi_{p,t}]
\bm{Q}_i= matrix that extracts \langle S^{(i)}_{\kappa(t),t} \rangle_{t=1}^{n} from \bm{s}_i.
Residuals: \bm{r} = \bm{y} - \sum_i\bm{Q}_i\bm{s}_i -\bm{\ell} - \bm{Z}\bm{\Phi} have density f(\bm{r}) \propto \exp\Big\{-\frac{1}{2}\big\|\bm{r}/\sigma_R\big\|_{L_2}^2\Big\},

MLE for STR

Minimize wrt \bm{\Phi}, \bm{\ell} and \bm{s}_i:

\begin{align*} -\log \mathcal{L} &= \frac{1}{2\sigma_R} \Bigg\{ \Big\| \bm{y}- \sum_{i=1}^I \bm{Q}_i\bm{s}_i - \bm{\ell} - \bm{Z}\bm{\Phi} \Big\|_{L_2}^2 + \lambda_\ell\Big\|\bm{D}_\ell \bm{\ell}\Big\|_{L_2}^2 \\ & \hspace*{1cm} + \sum_{i=1}^{I}\left( \left\|\lambda_{tt,i} \bm{D}_{tt,i} \bm{s}_i \right\|_{L_2}^2 + \left\|\lambda_{st,i} \bm{D}_{st,i} \bm{s}_i \right\|_{L_2}^2 + \left\|\lambda_{ss,i} \bm{D}_{ss,i} \bm{s}_i \right\|_{L_2}^2 \right) \Bigg\} \end{align*}

Equivalent to linear model

\bm{y}_{+} = \bm{X}\bm{\beta} + \bm{\varepsilon}

\bm{y}_{+} = [\bm{y}',~ \bm{0}']'
\bm{\varepsilon} \sim N(\bm{0},\sigma_R^2\bm{I})

\bm{X} = \begin{bmatrix} \bm{Q}_1 & \dots & \bm{Q}_I & \bm{I}_n & \bm{Z} \\ \lambda_{tt,1} \bm{D}_{tt,1} & \dots & 0 & 0 & 0 \\ \lambda_{st,1} \bm{D}_{st,1} & \dots & 0 & 0 & 0 \\ \lambda_{ss,1} \bm{D}_{ss,1} & \dots & 0 & 0 & 0 \\ 0 & \ddots & 0 & 0 & 0 \\ 0 & \dots & \lambda_{tt,I} \bm{D}_{tt,I} & 0 & 0 \\ 0 & \dots & \lambda_{st,I} \bm{D}_{st,I} & 0 & 0 \\ 0 & \dots & \lambda_{ss,I} \bm{D}_{ss,I} & 0 & 0 \\ 0 & \dots & 0 & \lambda_\ell \bm{D}_{tt} & 0 \end{bmatrix}

STR

Three seasonal components, quadratic temperature regressors

STR outliers

R packages

Thanks to my collaborators

OTexts.com/fppit

For more information

Slides: robjhyndman.com/seminars/padova2022.html

Find me at:

robjhyndman.com
@robjhyndman
@robjhyndman
rob.hyndman@monash.edu