These functions smooth a variable in a vital object. The vital object is returned with additional columns describing the smoothed variable: usually .smooth, containing the smoothed values, and .smooth_se, containing the corresponding standard errors.

Usage

smooth_spline(.data, .var, age_spacing = 1, k = -1)

smooth_mortality(.data, .var, age_spacing = 1, b = 65, power = 0.4, k = 30)

smooth_fertility(.data, .var, age_spacing = 1, lambda = 1e-10)

smooth_loess(.data, .var, age_spacing = 1, span = 0.2)

Arguments

.data

A vital object

.var

Name of the variable to smooth.

age_spacing

Spacing between ages for smoothed vital. Default is 1.

k

Number of knots to use for penalized regression spline estimate.

b

Lower age for monotonicity. Above this, the smooth curve is assumed to be monotonically increasing.

power

Power transformation for age variable before smoothing. Default is 0.4 (for mortality data).

lambda

Penalty for constrained regression spline.

span

Span for loess smooth.

Value

A vital object with added columns containing the smoothed values and their standard errors.

Details

smooth_mortality() uses penalized regression splines applied to log mortality with a monotonicity constraint above age b. The methodology is based on Wood (1994). smooth_fertility() uses weighted regression B-splines with a concavity constraint, based on He and Ng (1999). The function smooth_loess() uses locally quadratic regression, while smooth_spline() uses penalized regression splines.
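As a rough illustration of what locally quadratic regression means, the sketch below fits a degree-2 loess curve with base R's stats::loess() and reads off fitted values and standard errors on an age grid. It is not the package's implementation. The data are synthetic, the variable names (dat, grid, smoothed) are made up for this example, and the scale on which smooth_loess() fits internally is not stated here, so the sketch simply smooths a generic noisy series.

```r
# Synthetic noisy series over ages 0 to 100 (illustrative data only)
set.seed(1)
age <- 0:100
y <- -9 + 0.09 * age + rnorm(length(age), sd = 0.2)
dat <- data.frame(age = age, y = y)

# Locally quadratic regression: degree = 2.
# span = 0.2 mirrors the default span of smooth_loess().
fit <- loess(y ~ age, data = dat, span = 0.2, degree = 2)

# Evaluate the smooth on an age grid; age_spacing here is a local variable
# that plays the same role as the age_spacing argument.
age_spacing <- 1
grid <- data.frame(age = seq(min(age), max(age), by = age_spacing))
pred <- predict(fit, newdata = grid, se = TRUE)

# Collect smoothed values and standard errors, using column names that
# echo the .smooth and .smooth_se columns described above
smoothed <- data.frame(
  age = grid$age,
  .smooth = as.numeric(pred$fit),
  .smooth_se = as.numeric(pred$se.fit)
)
head(smoothed)
```

A smaller span follows the data more closely at the cost of a rougher curve, while a larger span gives a smoother curve that may miss local features.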

References

Hyndman, R.J., and Ullah, S. (2007) Robust forecasting of mortality and fertility rates: a functional data approach. Computational Statistics & Data Analysis, 51, 4942-4956. https://robjhyndman.com/publications/funcfor/

Author

Rob J Hyndman

Examples

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
aus_mortality |>
  filter(State == "Victoria", Sex == "female", Year > 2000) |>
  smooth_mortality(Mortality)
#> # A vital: 2,020 x 10 [1Y]
#> # Key:     Age x (Sex, State, Code) [101 x 1]
#>     Year   Age Sex    State   Code  Mortality Exposure Deaths .smooth .smooth_se
#>    <int> <dbl> <chr>  <chr>   <chr>     <dbl>    <dbl>  <dbl> <dbl[1>  <dbl[1d]>
#>  1  2001     0 female Victor… VIC   0.00404     29229. 118.   4.02e-3  0.000347 
#>  2  2001     1 female Victor… VIC   0.000405    29654.  12    3.90e-4  0.0000789
#>  3  2001     2 female Victor… VIC   0.000201    29832.   6    2.16e-4  0.0000404
#>  4  2001     3 female Victor… VIC   0.000134    29859.   4.01 1.54e-4  0.0000287
#>  5  2001     4 female Victor… VIC   0.000165    30328.   5.01 1.25e-4  0.0000233
#>  6  2001     5 female Victor… VIC   0.0000652   30698.   2    1.10e-4  0.0000205
#>  7  2001     6 female Victor… VIC   0.0000959   31286.   3    1.02e-4  0.0000189
#>  8  2001     7 female Victor… VIC   0.0000945   31748.   3    9.96e-5  0.0000181
#>  9  2001     8 female Victor… VIC   0.0000943   31810.   3    1.01e-4  0.0000178
#> 10  2001     9 female Victor… VIC   0.0000627   31893.   2    1.05e-4  0.0000180
#> # ℹ 2,010 more rows
aus_fertility |>
  filter(Year > 2000) |>
  smooth_fertility(Fertility)
#> # A vital: 210 x 7 [1Y]
#> # Key:     Age [35 x 1]
#>     Year   Age Fertility Exposure Births .smooth .smooth_se
#>    <int> <dbl>     <dbl>    <dbl>  <dbl>   <dbl>      <dbl>
#>  1  2001    15   0.00320   132027   423. 0.00320    0.00333
#>  2  2001    16   0.00728   133096   969. 0.00755    0.00709
#>  3  2001    17   0.0158    131433  2075. 0.0158     0.0133 
#>  4  2001    18   0.0249    133123  3313. 0.0260     0.0196 
#>  5  2001    19   0.0372    132398  4931. 0.0357     0.0239 
#>  6  2001    20   0.0435    131377  5721. 0.0435     0.0258 
#>  7  2001    21   0.0485    127985  6202. 0.0499     0.0261 
#>  8  2001    22   0.0581    126901  7373. 0.0572     0.0262 
#>  9  2001    23   0.0656    127134  8336. 0.0656     0.0263 
#> 10  2001    24   0.0749    128239  9599. 0.0749     0.0264 
#> # ℹ 200 more rows
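The remaining two functions can be called in the same way. The sketch below is an assumed usage pattern based only on the signatures in the Usage section: it applies smooth_spline() and smooth_loess() to the same fertility data. The object names and the span value of 0.3 are arbitrary choices for illustration, and no output is shown because it has not been reproduced here.

```r
library(vital)
library(dplyr)

# Same subset as in the smooth_fertility() example above
fert_recent <- aus_fertility |>
  filter(Year > 2000)

# Penalized regression spline with the default number of knots
fert_spline <- fert_recent |>
  smooth_spline(Fertility)

# Locally quadratic regression with a wider span than the default of 0.2
# (0.3 is an arbitrary value chosen for illustration)
fert_loess <- fert_recent |>
  smooth_loess(Fertility, span = 0.3)
```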