The tk_acf_diagnostics() function provides a simple interface to detect Autocorrelation (ACF), Partial Autocorrelation (PACF), and Cross Correlation (CCF) of Lagged Predictors in one tibble. This function powers the plot_acf_diagnostics() visualization.

Usage

tk_acf_diagnostics(.data, .date_var, .value, .ccf_vars = NULL, .lags = 1000)

Arguments

.data

A data frame or tibble with numeric features (values) in descending chronological order

.date_var

A column containing either date or date-time values

.value

A numeric column with a value to have ACF and PACF calculations performed.

.ccf_vars

Additional features to perform Lag Cross Correlations (CCFs) versus the .value. Useful for evaluating external lagged regressors.

.lags

A sequence of one or more lags to evaluate.

Value

A tibble or data.frame containing the autocorrelation, partial autocorrelation and cross correlation data.

Details

Simplified ACF, PACF, & CCF

We are often interested in all 3 of these functions. Why not get all 3 at once? Now you can!

  • ACF - Autocorrelation between a target variable and lagged versions of itself

  • PACF - Partial Autocorrelation removes the dependence of lags on other lags, highlighting key seasonalities.

  • CCF - Shows how lagged predictors can be used for prediction of a target variable.
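To make the three quantities concrete, here is a minimal sketch that assembles them into one data frame using base R's `stats::acf()`, `stats::pacf()`, and `stats::ccf()`. It illustrates the concepts only and is not necessarily how tk_acf_diagnostics() computes its result. The simulated series `x`, the predictor `z`, and the column name `CCF_z` are assumptions made for this example.

```r
set.seed(123)
n       <- 200
max_lag <- 10

# Hypothetical data: z leads x by one step, so x[t + 1] is roughly z[t]
z_full <- as.numeric(arima.sim(model = list(ar = 0.7), n = n + 1))
x <- z_full[1:n] + rnorm(n, sd = 0.1)   # target
z <- z_full[2:(n + 1)]                  # lagged predictor

# ACF covers lags 0..max_lag; PACF starts at lag 1
acf_vals  <- as.numeric(acf(x, lag.max = max_lag, plot = FALSE)$acf)
pacf_vals <- as.numeric(pacf(x, lag.max = max_lag, plot = FALSE)$acf)

# ccf(x, z) at lag k estimates cor(x[t + k], z[t]); keep non-negative lags
ccf_obj  <- ccf(x, z, lag.max = max_lag, plot = FALSE)
keep     <- as.numeric(ccf_obj$lag) >= 0
ccf_vals <- as.numeric(ccf_obj$acf)[keep]

diag_tbl <- data.frame(
  lag   = 0:max_lag,
  ACF   = acf_vals,
  PACF  = c(1, pacf_vals),  # lag-0 PACF set to 1 by convention
  CCF_z = ccf_vals
)
head(diag_tbl)
```

In this sketch the cross correlation peaks at lag 1, which is the kind of signal that suggests a lagged predictor may be useful.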

Lag Specification

Lags (.lags) can be specified in any of the following ways:

  • A time-based phrase indicating a duration (e.g. .lags = "2 months")

  • A maximum lag (e.g. .lags = 28)

  • A sequence of lags (e.g. .lags = 7:28)
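The three forms can be thought of as resolving to a single integer sequence of lags. The helper below, `resolve_lags()`, is hypothetical and written only for illustration; in particular, converting a time-based phrase by counting the observations that fall before `start + duration` is an assumption, not a description of the package's internals.

```r
# Hypothetical helper: turn a .lags-style specification into a lag sequence
resolve_lags <- function(lags, dates) {
  if (is.character(lags)) {
    # Time-based phrase, e.g. "2 months": find the cutoff date, then
    # count the observations that fall before it (assumed rule)
    cutoff  <- seq(min(dates), by = lags, length.out = 2)[2]
    max_lag <- sum(dates < cutoff)
    return(0:max_lag)
  }
  if (length(lags) == 1) {
    # A single number is treated as the maximum lag
    return(0:lags)
  }
  # Otherwise the sequence is used as given
  lags
}

dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 100)

resolve_lags(28, dates)          # lags 0 through 28
resolve_lags(7:28, dates)        # lags 7 through 28
resolve_lags("2 months", dates)  # lags 0 through the count of days before 2020-03-01
```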

Scales to Multiple Time Series with Groups

tk_acf_diagnostics() works with grouped data frames (grouped_df), meaning you can group your time series by one or more categorical columns with dplyr::group_by() and then apply tk_acf_diagnostics() to return group-wise lag diagnostics.
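Conceptually, group-wise diagnostics amount to computing the diagnostics separately for each group and stacking the results. The sketch below shows that idea in base R with `split()` and `lapply()`; it is an illustration under assumed data (the `symbol` and `value` columns are simulated here) and is not the package's implementation.

```r
set.seed(42)
max_lag <- 5

# Hypothetical data: two series stacked in long format
df <- data.frame(
  symbol = rep(c("A", "B"), each = 100),
  value  = c(cumsum(rnorm(100)), rnorm(100))  # random walk vs. white noise
)

# Compute the ACF within each group, then bind the pieces together
per_group <- lapply(split(df, df$symbol), function(g) {
  data.frame(
    symbol = g$symbol[1],
    lag    = 0:max_lag,
    ACF    = as.numeric(acf(g$value, lag.max = max_lag, plot = FALSE)$acf)
  )
})
groupwise <- do.call(rbind, per_group)
rownames(groupwise) <- NULL
groupwise
```

Each group contributes one row per lag, which matches the shape of the grouped output shown in the Examples section.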

Special Note on Dots (...)

Unlike other plotting utilities, the ... argument is NOT used for group-wise analysis. Rather, it's used for processing Cross Correlations (CCFs).

Use dplyr::group_by() for processing multiple time series groups.

Examples

library(dplyr)
library(timetk)  # provides tk_acf_diagnostics(), diff_vec(), and the FANG dataset

# ACF, PACF, & CCF in 1 Data Frame
# - Get ACF & PACF for target (adjusted)
# - Get CCF between adjusted and volume and close
FANG %>%
    filter(symbol == "FB") %>%
    tk_acf_diagnostics(date, adjusted,                # ACF & PACF
                       .ccf_vars = c(volume, close),  # CCFs
                       .lags     = 500)
#> # A tibble: 501 × 7
#>      lag   ACF      PACF CCF_volume CCF_close .white_noise_upper
#>    <dbl> <dbl>     <dbl>      <dbl>     <dbl>              <dbl>
#>  1     0 1      1            -0.447     1                 0.0630
#>  2     1 0.997  0.997        -0.444     0.997             0.0630
#>  3     2 0.994 -0.0227       -0.442     0.994             0.0630
#>  4     3 0.990  0.0101       -0.438     0.990             0.0630
#>  5     4 0.987  0.0311       -0.437     0.987             0.0630
#>  6     5 0.985  0.0180       -0.438     0.985             0.0630
#>  7     6 0.982  0.00502      -0.437     0.982             0.0630
#>  8     7 0.979  0.0171       -0.437     0.979             0.0630
#>  9     8 0.976 -0.000118     -0.436     0.976             0.0630
#> 10     9 0.974 -0.00243      -0.435     0.974             0.0630
#> # ℹ 491 more rows
#> # ℹ 1 more variable: .white_noise_lower <dbl>

# Scale with groups using group_by()
FANG %>%
    group_by(symbol) %>%
    tk_acf_diagnostics(date, adjusted,
                       .ccf_vars = c(volume, close),
                       .lags     = "3 months")
#> # A tibble: 248 × 8
#> # Groups:   symbol [4]
#>    symbol   lag   ACF      PACF CCF_volume CCF_close .white_noise_upper
#>    <chr>  <dbl> <dbl>     <dbl>      <dbl>     <dbl>              <dbl>
#>  1 FB         0 1      1            -0.447     1                 0.0630
#>  2 FB         1 0.997  0.997        -0.444     0.997             0.0630
#>  3 FB         2 0.994 -0.0227       -0.442     0.994             0.0630
#>  4 FB         3 0.990  0.0101       -0.438     0.990             0.0630
#>  5 FB         4 0.987  0.0311       -0.437     0.987             0.0630
#>  6 FB         5 0.985  0.0180       -0.438     0.985             0.0630
#>  7 FB         6 0.982  0.00502      -0.437     0.982             0.0630
#>  8 FB         7 0.979  0.0171       -0.437     0.979             0.0630
#>  9 FB         8 0.976 -0.000118     -0.436     0.976             0.0630
#> 10 FB         9 0.974 -0.00243      -0.435     0.974             0.0630
#> # ℹ 238 more rows
#> # ℹ 1 more variable: .white_noise_lower <dbl>

# Apply Transformations
FANG %>%
    group_by(symbol) %>%
    tk_acf_diagnostics(
        date, diff_vec(adjusted),  # Apply differencing transformation
        .lags = 0:500
    )
#> diff_vec(): Initial values: 257.309998
#> diff_vec(): Initial values: 28
#> diff_vec(): Initial values: 361.264351
#> diff_vec(): Initial values: 13.144286
#> # A tibble: 2,004 × 6
#> # Groups:   symbol [4]
#>    symbol   lag      ACF     PACF .white_noise_upper .white_noise_lower
#>    <chr>  <dbl>    <dbl>    <dbl>              <dbl>              <dbl>
#>  1 FB         0  1        1                   0.0630            -0.0630
#>  2 FB         1  0.0272   0.0272              0.0630            -0.0630
#>  3 FB         2 -0.0219  -0.0226              0.0630            -0.0630
#>  4 FB         3 -0.0973  -0.0962              0.0630            -0.0630
#>  5 FB         4 -0.0554  -0.0512              0.0630            -0.0630
#>  6 FB         5  0.0104   0.00896             0.0630            -0.0630
#>  7 FB         6 -0.0622  -0.0751              0.0630            -0.0630
#>  8 FB         7  0.00363 -0.00334             0.0630            -0.0630
#>  9 FB         8 -0.0168  -0.0212              0.0630            -0.0630
#> 10 FB         9  0.0300   0.0187              0.0630            -0.0630
#> # ℹ 1,994 more rows