
tk_stl_diagnostics() is the data preprocessor for plot_stl_diagnostics(). It automates frequency and trend selection.

Usage

tk_stl_diagnostics(
  .data,
  .date_var,
  .value,
  .frequency = "auto",
  .trend = "auto",
  .message = TRUE
)

Arguments

.data

A tibble or data.frame with a time-based column

.date_var

A column containing either date or date-time values

.value

A column containing numeric values

.frequency

Controls the seasonal adjustment (removal of seasonality). Input can be "auto", a time-based definition (e.g. "2 weeks"), or a numeric number of observations per frequency (e.g. 10). Refer to tk_get_frequency().

.trend

Controls the trend component. For STL, .trend controls the sensitivity of the loess smoother that estimates the trend. Input can be "auto", a time-based definition (e.g. "6 weeks"), or a numeric number of observations (e.g. 180).

.message

A boolean. If TRUE, will output information related to automatic frequency and trend selection (if applicable).

Value

A tibble or data.frame with the columns observed, season, trend, remainder, and seasadj (the seasonally adjusted series)

Details

The tk_stl_diagnostics() function generates a Seasonal-Trend-Loess decomposition. The function is "tidy" in the sense that it works on data frames and is designed to work with dplyr groups.

STL Method

The STL method decomposes the time series using stats::stl(). The decomposition separates the "season" and "trend" components from the "observed" values, leaving the "remainder".
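As a rough illustration of what the decomposition produces, the sketch below calls stats::stl() directly on nottem, a monthly series that ships with base R, and arranges the result into the same column layout as the output of tk_stl_diagnostics(). It is not the timetk implementation. The s.window = "periodic" choice and the definition of seasadj as observed minus season are assumptions made for this example, although the latter agrees with the printed output in the Examples section.

```r
# Monthly temperatures shipped with base R (datasets::nottem), frequency = 12
fit <- stats::stl(nottem, s.window = "periodic")

# Matrix with one column per component: seasonal, trend, remainder
components <- fit$time.series

decomp <- data.frame(
    observed  = as.numeric(nottem),
    season    = as.numeric(components[, "seasonal"]),
    trend     = as.numeric(components[, "trend"]),
    remainder = as.numeric(components[, "remainder"])
)

# Assumption: the seasonally adjusted series is observed minus season
decomp$seasadj <- decomp$observed - decomp$season

# The three components add back up to the observed values
all.equal(decomp$observed, decomp$season + decomp$trend + decomp$remainder)
#> [1] TRUE

head(decomp)
```

The additive identity (observed = season + trend + remainder) is what makes the remainder column useful for diagnostics: anything the seasonal and trend smoothers do not capture ends up there.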

Frequency & Trend Selection

The user can control two parameters: .frequency and .trend.

  1. The .frequency parameter adjusts the "season" component that is removed from the "observed" values.

  2. The .trend parameter adjusts the trend window (t.window parameter from stl()) that is used.
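To get a feel for what the trend window does, the following sketch compares two t.window settings in stats::stl() directly, again on the base R nottem series. The window sizes (13 and 61) and the roughness() helper are hypothetical choices for this illustration only; they are not values or functions used by timetk.

```r
# Same series, two trend windows (stl() expects odd window sizes)
narrow <- stats::stl(nottem, s.window = "periodic", t.window = 13)
wide   <- stats::stl(nottem, s.window = "periodic", t.window = 61)

# Hypothetical helper: sum of squared second differences of the trend.
# Smaller values mean a smoother (less wiggly) trend line.
roughness <- function(fit) {
    trend <- as.numeric(fit$time.series[, "trend"])
    sum(diff(trend, differences = 2)^2)
}

roughness(narrow)
roughness(wide)
```

A wider window yields a smoother trend and pushes more short-term movement into the remainder, while a narrower window lets the trend follow the data more closely.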

Both .frequency and .trend accept a time-based duration (e.g. "6 weeks"), a numeric value (e.g. 180), or "auto", which selects the frequency and/or trend automatically based on the scale of the time series.
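The messages printed by the function (for example "frequency = 7 observations per 1 week") relate a time-based duration to a number of observations. The sketch below shows one simple way to think about that relationship. The obs_per_period() helper is hypothetical and is not how timetk performs the conversion; it simply divides the period length in days by the typical spacing between observations.

```r
# Hypothetical helper (not part of timetk): approximate number of observations
# that fall inside a period of `period_days` days, given the series' spacing.
obs_per_period <- function(dates, period_days) {
    step_days <- median(as.numeric(diff(dates)))  # typical gap between observations
    round(period_days / step_days)
}

daily  <- seq(as.Date("2014-07-03"), by = "day",  length.out = 100)
weekly <- seq(as.Date("1999-01-01"), by = "week", length.out = 100)

obs_per_period(daily, 7)    # one week of daily data
#> [1] 7
obs_per_period(weekly, 91)  # roughly one quarter of weekly data
#> [1] 13
```

This is why the same duration string maps to different numeric values depending on the scale of the series, and why a numeric value such as 180 means something different for daily data than for weekly data.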

Examples

library(dplyr)
library(timetk)


# ---- GROUPS & TRANSFORMATION ----
m4_daily %>%
    group_by(id) %>%
    tk_stl_diagnostics(date, box_cox_vec(value))
#> frequency = 7 observations per 1 week
#> trend = 92 observations per 3 months
#> box_cox_vec(): Using value for lambda: 1.25119350454964
#> frequency = 7 observations per 1 week
#> trend = 92 observations per 3 months
#> box_cox_vec(): Using value for lambda: 0.0882021886505848
#> frequency = 7 observations per 1 week
#> trend = 92 observations per 3 months
#> box_cox_vec(): Using value for lambda: 1.99992424816297
#> frequency = 7 observations per 1 week
#> trend = 92 observations per 3 months
#> box_cox_vec(): Using value for lambda: 0.401716085353735
#> # A tibble: 9,743 × 7
#> # Groups:   id [4]
#>    id    date       observed season  trend remainder seasadj
#>    <fct> <date>        <dbl>  <dbl>  <dbl>     <dbl>   <dbl>
#>  1 D10   2014-07-03   11303.  -8.25 10796.     516.   11311.
#>  2 D10   2014-07-04   11284.  -7.93 10790.     502.   11292.
#>  3 D10   2014-07-05   11116. -14.8  10784.     347.   11131.
#>  4 D10   2014-07-06   11117.  -6.28 10778.     346.   11124.
#>  5 D10   2014-07-07   10829.  10.7  10772.      47.0  10819.
#>  6 D10   2014-07-08   10905.  16.6  10766.     123.   10889.
#>  7 D10   2014-07-09   10915.  10.0  10760.     145.   10905.
#>  8 D10   2014-07-10   10836.  -8.25 10754.      90.5  10844.
#>  9 D10   2014-07-11   10854.  -7.93 10748.     114.   10862.
#> 10 D10   2014-07-12   10796. -14.8  10742.      69.1  10811.
#> # … with 9,733 more rows

# ---- CUSTOM TREND ----
m4_weekly %>%
    group_by(id) %>%
    tk_stl_diagnostics(date, box_cox_vec(value), .trend = "2 quarters")
#> frequency = 13 observations per 1 quarter
#> trend = 25 observations per 2 quarters
#> box_cox_vec(): Using value for lambda: -0.374719526760349
#> frequency = 13 observations per 1 quarter
#> trend = 25 observations per 2 quarters
#> box_cox_vec(): Using value for lambda: 0.0597533426736463
#> frequency = 13 observations per 1 quarter
#> trend = 26 observations per 2 quarters
#> box_cox_vec(): Using value for lambda: -0.937375922566063
#> frequency = 13 observations per 1 quarter
#> trend = 26 observations per 2 quarters
#> box_cox_vec(): Using value for lambda: -0.195493340612351
#> # A tibble: 2,295 × 7
#> # Groups:   id [4]
#>    id    date       observed     season trend  remainder seasadj
#>    <fct> <date>        <dbl>      <dbl> <dbl>      <dbl>   <dbl>
#>  1 W10   1999-01-01     2.33 -0.000172   2.40 -0.0678       2.33
#>  2 W10   1999-01-08     2.32 -0.000195   2.40 -0.0814       2.32
#>  3 W10   1999-01-15     2.40  0.0000406  2.40  0.000106     2.40
#>  4 W10   1999-01-22     2.40  0.000252   2.40 -0.000127     2.40
#>  5 W10   1999-01-29     2.40  0.000371   2.40 -0.000261     2.40
#>  6 W10   1999-02-05     2.40  0.000397   2.40 -0.000314     2.40
#>  7 W10   1999-02-12     2.40  0.000162   2.40 -0.0000956    2.40
#>  8 W10   1999-02-19     2.40 -0.0000578  2.40  0.000114     2.40
#>  9 W10   1999-02-26     2.40 -0.000171   2.40  0.000235     2.40
#> 10 W10   1999-03-05     2.40 -0.000163   2.40  0.000216     2.40
#> # … with 2,285 more rows