Get date features from a time-series index
Details
tk_get_timeseries_signature
decomposes the time series into commonly needed features such as
the numeric value, differences, year, month, day, day of week, day of month,
day of year, hour, minute, and second.
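As a rough illustration of what a few of these features represent, the following base R sketch derives them from a small date index with `as.POSIXlt`. It is meant for intuition only and is not how timetk computes the signature. The `features` data frame is hypothetical, and its column names merely mirror those in the example output for readability.

```r
# Base R only; this does not call timetk.
idx <- as.Date(c("2016-01-01", "2016-01-08", "2016-02-05"))
lt  <- as.POSIXlt(idx)

# Dates are stored as days since 1970-01-01, so scale to seconds
idx_sec <- as.numeric(idx) * 86400

features <- data.frame(
    index     = idx,
    index.num = idx_sec,               # seconds since 1970-01-01
    diff      = c(NA, diff(idx_sec)),  # seconds since the previous observation
    year      = lt$year + 1900L,       # POSIXlt counts years from 1900
    month     = lt$mon + 1L,           # POSIXlt months are 0-based
    mday      = lt$mday,               # day of month
    yday      = lt$yday + 1L,          # POSIXlt day of year is 0-based
    wday      = lt$wday + 1L           # assumption: 1 = Sunday, 7 = Saturday
)
features
```

The first `diff` is `NA` because the first observation has no predecessor, which matches the `NA` in the first row of the example output.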
tk_get_timeseries_summary
returns the start, end, units, scale, and a "summary" of the time-series
differences in seconds, including
the minimum, 1st quartile, median, mean, 3rd quartile, and maximum.
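The time-series
differences give the user a better picture of the index frequency
so the user can understand the level of regularity or irregularity.
A perfectly regular time series will have equal values in seconds for each metric.
However, this is often not the case: in the daily stock index in the examples, the minimum
difference is 86400 seconds (one day) while the mean is about 125096 seconds, because
non-trading days leave gaps in the index.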
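The idea of summarizing differences in seconds can be sketched in base R. The snippet below uses the first five dates from the stock example, computes the successive differences, and summarizes them. The `is_regular` helper is hypothetical and not part of timetk; it simply checks whether every difference is identical.

```r
# Base R only; this does not call timetk.
idx <- as.Date(c("2013-01-02", "2013-01-03", "2013-01-04",
                 "2013-01-07", "2013-01-08"))

# Dates are stored as days since 1970-01-01, so scale to seconds
diff_sec <- diff(as.numeric(idx)) * 86400
diff_sec
#> [1]  86400  86400 259200  86400

# Minimum, quartiles, median, mean, and maximum of the differences
summary(diff_sec)

# Hypothetical helper: an index is regular when all differences are equal
is_regular <- function(d) length(unique(d)) == 1
is_regular(diff_sec)
#> [1] FALSE
```

The 259200-second gap (three days) spans a weekend, which is enough to make this index irregular even though most observations are one day apart.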
Important Note: These functions only work with time-based indexes in datetime, date, yearmon, and yearqtr values. Regularized dates cannot be decomposed.
Examples
library(dplyr)
library(tidyquant)
library(timetk)
# Works with time-based tibbles
FB_tbl <- FANG %>% filter(symbol == "FB")
FB_idx <- tk_index(FB_tbl)
tk_get_timeseries_signature(FB_idx)
#> # A tibble: 1,008 × 29
#> index index.num diff year year.…¹ half quarter month month…² month…³
#> <date> <dbl> <dbl> <int> <int> <int> <int> <int> <int> <ord>
#> 1 2013-01-02 1.36e9 NA 2013 2013 1 1 1 0 January
#> 2 2013-01-03 1.36e9 86400 2013 2013 1 1 1 0 January
#> 3 2013-01-04 1.36e9 86400 2013 2013 1 1 1 0 January
#> 4 2013-01-07 1.36e9 259200 2013 2013 1 1 1 0 January
#> 5 2013-01-08 1.36e9 86400 2013 2013 1 1 1 0 January
#> 6 2013-01-09 1.36e9 86400 2013 2013 1 1 1 0 January
#> 7 2013-01-10 1.36e9 86400 2013 2013 1 1 1 0 January
#> 8 2013-01-11 1.36e9 86400 2013 2013 1 1 1 0 January
#> 9 2013-01-14 1.36e9 259200 2013 2013 1 1 1 0 January
#> 10 2013-01-15 1.36e9 86400 2013 2013 1 1 1 0 January
#> # … with 998 more rows, 19 more variables: day <int>, hour <int>, minute <int>,
#> # second <int>, hour12 <int>, am.pm <int>, wday <int>, wday.xts <int>,
#> # wday.lbl <ord>, mday <int>, qday <int>, yday <int>, mweek <int>,
#> # week <int>, week.iso <int>, week2 <int>, week3 <int>, week4 <int>,
#> # mday7 <int>, and abbreviated variable names ¹year.iso, ²month.xts,
#> # ³month.lbl
tk_get_timeseries_summary(FB_idx)
#> # A tibble: 1 × 12
#> n.obs start end units scale tzone diff.m…¹ diff.q1 diff.…² diff.…³
#> <int> <date> <date> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 1008 2013-01-02 2016-12-30 days day UTC 86400 86400 86400 125096.
#> # … with 2 more variables: diff.q3 <dbl>, diff.maximum <dbl>, and abbreviated
#> # variable names ¹diff.minimum, ²diff.median, ³diff.mean
# Works with dates in any periodicity
idx_weekly <- seq.Date(from = ymd("2016-01-01"), by = 'week', length.out = 6)
tk_get_timeseries_signature(idx_weekly)
#> # A tibble: 6 × 29
#> index index.num diff year year.…¹ half quarter month month…² month…³
#> <date> <dbl> <dbl> <int> <int> <int> <int> <int> <int> <ord>
#> 1 2016-01-01 1451606400 NA 2016 2015 1 1 1 0 January
#> 2 2016-01-08 1452211200 604800 2016 2016 1 1 1 0 January
#> 3 2016-01-15 1452816000 604800 2016 2016 1 1 1 0 January
#> 4 2016-01-22 1453420800 604800 2016 2016 1 1 1 0 January
#> 5 2016-01-29 1454025600 604800 2016 2016 1 1 1 0 January
#> 6 2016-02-05 1454630400 604800 2016 2016 1 1 2 1 Februa…
#> # … with 19 more variables: day <int>, hour <int>, minute <int>, second <int>,
#> # hour12 <int>, am.pm <int>, wday <int>, wday.xts <int>, wday.lbl <ord>,
#> # mday <int>, qday <int>, yday <int>, mweek <int>, week <int>,
#> # week.iso <int>, week2 <int>, week3 <int>, week4 <int>, mday7 <int>, and
#> # abbreviated variable names ¹year.iso, ²month.xts, ³month.lbl
tk_get_timeseries_summary(idx_weekly)
#> # A tibble: 1 × 12
#> n.obs start end units scale tzone diff.m…¹ diff.q1 diff.…² diff.…³
#> <int> <date> <date> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 6 2016-01-01 2016-02-05 days week UTC 604800 604800 604800 604800
#> # … with 2 more variables: diff.q3 <dbl>, diff.maximum <dbl>, and abbreviated
#> # variable names ¹diff.minimum, ²diff.median, ³diff.mean
# Works with zoo yearmon and yearqtr classes
idx_yearmon <- seq.Date(from       = ymd("2016-01-01"),
                        by         = "month",
                        length.out = 12) %>%
    as.yearmon()
tk_get_timeseries_signature(idx_yearmon)
#> # A tibble: 12 × 29
#> index index…¹ diff year year.…² half quarter month month…³ month…⁴ day
#> <yea> <dbl> <dbl> <int> <int> <int> <int> <int> <int> <ord> <int>
#> 1 Jan … 1.45e9 NA 2016 2015 1 1 1 0 January 1
#> 2 Feb … 1.45e9 2678400 2016 2016 1 1 2 1 Februa… 1
#> 3 Mar … 1.46e9 2505600 2016 2016 1 1 3 2 March 1
#> 4 Apr … 1.46e9 2678400 2016 2016 1 2 4 3 April 1
#> 5 May … 1.46e9 2592000 2016 2016 1 2 5 4 May 1
#> 6 Jun … 1.46e9 2678400 2016 2016 1 2 6 5 June 1
#> 7 Jul … 1.47e9 2592000 2016 2016 2 3 7 6 July 1
#> 8 Aug … 1.47e9 2678400 2016 2016 2 3 8 7 August 1
#> 9 Sep … 1.47e9 2678400 2016 2016 2 3 9 8 Septem… 1
#> 10 Oct … 1.48e9 2592000 2016 2016 2 4 10 9 October 1
#> 11 Nov … 1.48e9 2678400 2016 2016 2 4 11 10 Novemb… 1
#> 12 Dec … 1.48e9 2592000 2016 2016 2 4 12 11 Decemb… 1
#> # … with 18 more variables: hour <int>, minute <int>, second <int>,
#> # hour12 <int>, am.pm <int>, wday <int>, wday.xts <int>, wday.lbl <ord>,
#> # mday <int>, qday <int>, yday <int>, mweek <int>, week <int>,
#> # week.iso <int>, week2 <int>, week3 <int>, week4 <int>, mday7 <int>, and
#> # abbreviated variable names ¹index.num, ²year.iso, ³month.xts, ⁴month.lbl
tk_get_timeseries_summary(idx_yearmon)
#> # A tibble: 1 × 12
#> n.obs start end units scale tzone diff.…¹ diff.q1 diff.…² diff.…³ diff.q3
#> <int> <yearmo> <yea> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 12 Jan 2016 Dec … days month UTC 2505600 2592000 2678400 2.63e6 2678400
#> # … with 1 more variable: diff.maximum <dbl>, and abbreviated variable names
#> # ¹diff.minimum, ²diff.median, ³diff.mean