Compute surprisals or surprisal probabilities from a model or a data set. A surprisal is given by \(s = -\log f(y)\) where \(f\) is the density or probability mass function of the estimated or assumed distribution, and \(y\) is an observation. A surprisal probability is the probability of a surprisal at least as extreme as \(s\).

The surprisal probabilities may be computed in three different ways.

Given the same distribution that was used to compute the surprisal values. Under this option, surprisal probabilities are equal to 1 minus the coverage probability of the largest HDR that contains each value. Surprisal probabilities smaller than 1e-6 are returned as 1e-6.

Using a Generalized Pareto Distribution fitted to the most extreme surprisal values (those with probability less than

`threshold_probability`

). This option is used if`approximation = "gpd"`

. For surprisal probabilities greater than`threshold_probability`

, the value of`threshold_probability`

is returned. Under this option, the distribution is used for computing the surprisal values but not for determining their probabilities. Due to extreme value theory, the resulting probabilities should be relatively insensitive to the distribution used in computing the surprisal values.Empirically as the proportion of observations with greater surprisal values. This option is used when

`approxiation = "empirical"`

. This is also insensitive to the distribution used in computing the surprisal values.

## Usage

```
surprisals(
object,
probability = TRUE,
approximation = c("none", "gpd", "empirical"),
threshold_probability = 0.1,
...
)
# Default S3 method
surprisals(
object,
probability = TRUE,
approximation = c("none", "gpd", "empirical"),
threshold_probability = 0.1,
distribution = dist_kde(object, multiplier = 2, ...),
loo = FALSE,
...
)
```

## Arguments

- object
A model or numerical data set

- probability
Should surprisal probabilities be computed, or the surprisal values?

- approximation
Character string specifying the approximation to use in computing the surprisal probabilities. Ignored if

`probability = FALSE`

. :`none`

specifies that no approximation is to be used;`gpd`

specifies that the Generalized Pareto distribution should be used; while`empirical`

specifies that the probabilities should be estimated empirically.- threshold_probability
Probability threshold when computing the GPD approximation. This is the probability below which the GPD is fitted. Only used if

`approximation = "gpd"`

).- ...
Other arguments are passed to the appropriate method.

- distribution
A distribution object. If not provided, a kernel density estimate is computed from the data

`object`

.- loo
Should leave-one-out surprisals be computed?

## Details

If no distribution is provided, a kernel density estimate is computed. The leave-one-out surprisals (or LOO surprisals) are obtained by estimating the kernel density estimate using all other observations.

## Examples

```
# surprisals computed from bivariate data set
oldfaithful |>
filter(duration < 7000, waiting < 7000) |>
mutate(
loo_fscores = surprisals(cbind(duration, waiting), loo = TRUE)
)
#> # A tibble: 2,189 × 4
#> time duration waiting loo_fscores
#> <dttm> <dbl> <dbl> <dbl>
#> 1 2015-01-02 14:53:00 271 5040 0.0594
#> 2 2015-01-09 23:55:00 247 6060 0.670
#> 3 2015-02-07 00:49:00 203 5460 0.210
#> 4 2015-02-14 01:09:00 195 5221 0.108
#> 5 2015-02-21 01:12:00 210 5401 0.284
#> 6 2015-02-28 01:11:00 185 5520 0.0690
#> 7 2015-03-07 00:50:00 160 5281 0.0101
#> 8 2015-03-13 21:57:00 226 6000 0.405
#> 9 2015-03-13 23:37:00 190 5341 0.0886
#> 10 2015-03-20 22:26:00 102 3961 0.0905
#> # ℹ 2,179 more rows
# Univariate data
tibble(
y = c(5, rnorm(49)),
p_kde = surprisals(y, loo = TRUE),
p_normal = surprisals(y, distribution = dist_normal()),
p_zscore = 2*(1-pnorm(abs(y)))
)
#> # A tibble: 50 × 4
#> y p_kde p_normal p_zscore
#> <dbl> <dbl> <dbl> <dbl>
#> 1 5 0.00178 0.000000573 0.000000573
#> 2 -0.0199 0.893 0.984 0.984
#> 3 0.181 0.822 0.857 0.857
#> 4 0.570 0.637 0.569 0.569
#> 5 -0.899 0.564 0.369 0.369
#> 6 -2.38 0.118 0.0173 0.0173
#> 7 -0.610 0.701 0.542 0.542
#> 8 -0.517 0.747 0.605 0.605
#> 9 1.24 0.358 0.214 0.214
#> 10 -0.303 0.846 0.762 0.762
#> # ℹ 40 more rows
tibble(
y = n01$v1,
prob1 = surprisals(y, loo = TRUE),
prob2 = surprisals(y, approximation = "gpd"),
prob3 = surprisals(y, distribution = dist_normal()),
prob4 = surprisals(y, distribution = dist_normal(), approximation = "gpd")
) |>
arrange(prob1)
#> # A tibble: 1,000 × 5
#> y prob1 prob2 prob3 prob4
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 3.81 0.000675 0.000113 0.000139 0.000135
#> 2 3.06 0.00694 0.00194 0.00225 0.00261
#> 3 -3.01 0.0109 0.00362 0.00263 0.00308
#> 4 -3.00 0.0112 0.00376 0.00273 0.00320
#> 5 -2.94 0.0129 0.00456 0.00328 0.00389
#> 6 -2.89 0.0146 0.00538 0.00387 0.00461
#> 7 2.68 0.0199 0.00810 0.00746 0.00913
#> 8 2.65 0.0214 0.00887 0.00807 0.00991
#> 9 -2.60 0.0283 0.0128 0.00943 0.0116
#> 10 -2.59 0.0285 0.0130 0.00953 0.0118
#> # ℹ 990 more rows
# Bivariate data
tibble(
x = rnorm(50),
y = c(5, rnorm(49)),
prob = surprisals(cbind(x, y)),
lookout = lookout_prob(cbind(x, y))
)
#> # A tibble: 50 × 4
#> x y prob lookout
#> <dbl> <dbl> <dbl> <dbl>
#> 1 -1.19 5 0.02 0.0208
#> 2 1.30 0.427 0.3 1
#> 3 -1.45 0.578 0.36 1
#> 4 -0.657 0.918 0.72 1
#> 5 0.848 1.90 0.22 1
#> 6 0.342 -0.553 0.74 1
#> 7 -0.0558 -0.532 0.86 1
#> 8 0.957 0.583 0.54 1
#> 9 0.547 1.35 0.44 1
#> 10 1.33 1.13 0.24 1
#> # ℹ 40 more rows
```