Adds one or more lagged columns to a data frame. Also works with data grouped by dplyr::group_by().

Usage

tk_augment_lags(.data, .value, .lags = 1, .names = "auto")

tk_augment_leads(.data, .value, .lags = -1, .names = "auto")

Arguments

.data

A tibble.

.value

One or more columns to which the transformation is applied. tidyselect helpers (e.g. contains()) can be used to select multiple columns.

.lags

One or more lags to apply. A negative lag is treated as a lead.

.names

A vector of names for the new columns. Must be the same length as .lags. The default, "auto", builds each name from the column name and the lag (e.g. value_lag1).
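
As a rough illustration of the .names argument, the sketch below passes one custom name per lag instead of relying on "auto". The sales tibble and the two column names are made up for this example, and it assumes a plain character vector is accepted, as the argument description suggests.

```r
library(dplyr)
library(timetk)

# Hypothetical toy data: six monthly observations
sales <- tibble(
    date  = seq(as.Date("2020-01-01"), by = "month", length.out = 6),
    value = c(10, 20, 30, 40, 50, 60)
)

# One name per lag: two lags, so two names
lagged <- sales %>%
    tk_augment_lags(
        value,
        .lags  = c(1, 3),
        .names = c("value_prev_month", "value_prev_quarter")
    )

lagged
```

With a single input column, each entry in .names lines up with the lag at the same position in .lags.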

Value

Returns a tibble containing the original time series data with the new lag (or lead) columns appended.

Details

Lags vs Leads

A negative lag is considered a lead. The tk_augment_leads() function is identical to tk_augment_lags(), except that the automatic naming convention (.names = 'auto') names columns with negative lags as leads.
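
To make the lag/lead relationship concrete, here is a minimal base R sketch. shift_vec() is a hypothetical helper written for this illustration only. It is not part of timetk and is not how lag_vec() is implemented. It shows only that a positive lag shifts values toward later rows and a negative lag pulls later values back.

```r
# Hypothetical helper (not a timetk function): shift a vector by `lag` positions.
# Positive lag  -> values move down, NAs fill the start (a lag).
# Negative lag  -> values move up, NAs fill the end (a lead).
shift_vec <- function(x, lag = 1) {
    n <- length(x)
    k <- abs(lag)
    if (k >= n) return(rep(NA, n))          # shift longer than the vector
    if (lag > 0) {
        c(rep(NA, k), x[seq_len(n - k)])
    } else if (lag < 0) {
        c(x[seq_len(n - k) + k], rep(NA, k))
    } else {
        x                                   # lag of 0 returns the input
    }
}

x <- c(8000, 8350, 8570, 7700)

shift_vec(x, 1)    # NA 8000 8350 8570  (lag of 1)
shift_vec(x, -1)   # 8350 8570 7700 NA  (lag of -1, i.e. a lead of 1)
```

The values are the first four rows of series M1 from the examples on this page, so the results can be compared with the value_lag1 and value_lead1 columns shown there.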

Benefits

This function is designed to scale:

  • Works with grouped data created by dplyr::group_by(), computing lags within each group

  • Adds multiple lags in one call by passing a sequence to the .lags argument (e.g. .lags = 1:20)

See also

Underlying Function:

  • lag_vec() - Underlying function that powers tk_augment_lags()

Examples

library(dplyr)
library(timetk)

# Lags
m4_monthly %>%
    group_by(id) %>%
    tk_augment_lags(contains("value"), .lags = 1:20)
#> # A tibble: 1,574 × 23
#> # Groups:   id [4]
#>    id    date       value value_lag1 value_lag2 value_…¹ value…² value…³ value…⁴
#>    <fct> <date>     <dbl>      <dbl>      <dbl>    <dbl>   <dbl>   <dbl>   <dbl>
#>  1 M1    1976-06-01  8000         NA         NA       NA      NA      NA      NA
#>  2 M1    1976-07-01  8350       8000         NA       NA      NA      NA      NA
#>  3 M1    1976-08-01  8570       8350       8000       NA      NA      NA      NA
#>  4 M1    1976-09-01  7700       8570       8350     8000      NA      NA      NA
#>  5 M1    1976-10-01  7080       7700       8570     8350    8000      NA      NA
#>  6 M1    1976-11-01  6520       7080       7700     8570    8350    8000      NA
#>  7 M1    1976-12-01  6070       6520       7080     7700    8570    8350    8000
#>  8 M1    1977-01-01  6650       6070       6520     7080    7700    8570    8350
#>  9 M1    1977-02-01  6830       6650       6070     6520    7080    7700    8570
#> 10 M1    1977-03-01  5710       6830       6650     6070    6520    7080    7700
#> # … with 1,564 more rows, 14 more variables: value_lag7 <dbl>,
#> #   value_lag8 <dbl>, value_lag9 <dbl>, value_lag10 <dbl>, value_lag11 <dbl>,
#> #   value_lag12 <dbl>, value_lag13 <dbl>, value_lag14 <dbl>, value_lag15 <dbl>,
#> #   value_lag16 <dbl>, value_lag17 <dbl>, value_lag18 <dbl>, value_lag19 <dbl>,
#> #   value_lag20 <dbl>, and abbreviated variable names ¹​value_lag3, ²​value_lag4,
#> #   ³​value_lag5, ⁴​value_lag6

# Leads
m4_monthly %>%
    group_by(id) %>%
    tk_augment_leads(value, .lags = 1:-20)
#> # A tibble: 1,574 × 25
#> # Groups:   id [4]
#>    id    date       value value_lag1 value_lag0 value_…¹ value…² value…³ value…⁴
#>    <fct> <date>     <dbl>      <dbl>      <dbl>    <dbl>   <dbl>   <dbl>   <dbl>
#>  1 M1    1976-06-01  8000         NA       8000     8350    8570    7700    7080
#>  2 M1    1976-07-01  8350       8000       8350     8570    7700    7080    6520
#>  3 M1    1976-08-01  8570       8350       8570     7700    7080    6520    6070
#>  4 M1    1976-09-01  7700       8570       7700     7080    6520    6070    6650
#>  5 M1    1976-10-01  7080       7700       7080     6520    6070    6650    6830
#>  6 M1    1976-11-01  6520       7080       6520     6070    6650    6830    5710
#>  7 M1    1976-12-01  6070       6520       6070     6650    6830    5710    5260
#>  8 M1    1977-01-01  6650       6070       6650     6830    5710    5260    5470
#>  9 M1    1977-02-01  6830       6650       6830     5710    5260    5470    7870
#> 10 M1    1977-03-01  5710       6830       5710     5260    5470    7870    7360
#> # … with 1,564 more rows, 16 more variables: value_lead5 <dbl>,
#> #   value_lead6 <dbl>, value_lead7 <dbl>, value_lead8 <dbl>, value_lead9 <dbl>,
#> #   value_lead10 <dbl>, value_lead11 <dbl>, value_lead12 <dbl>,
#> #   value_lead13 <dbl>, value_lead14 <dbl>, value_lead15 <dbl>,
#> #   value_lead16 <dbl>, value_lead17 <dbl>, value_lead18 <dbl>,
#> #   value_lead19 <dbl>, value_lead20 <dbl>, and abbreviated variable names
#> #   ¹​value_lead1, ²​value_lead2, ³​value_lead3, ⁴​value_lead4