Intelligent Date & Time Sequences
Matt Dancho
2022-08-18
Source: vignettes/TK02_Time_Series_Date_Sequences.Rmd
Creating and modifying date sequences is critical to machine learning projects. We discuss:
- Making a Time Series Sequence: tk_make_timeseries()
- Making a Future Sequence: tk_make_future_timeseries()
- Holiday & Weekday/Weekend Sequences
Making a Time Series Sequence
tk_make_timeseries() improves on the seq.Date() and seq.POSIXt() functions by combining them into a single function. It intelligently handles character dates and makes logical assumptions based on user inputs.
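For comparison, here is a minimal base R sketch of the two cases that tk_make_timeseries() folds into one call. It uses only seq() with its Date and POSIXct methods. The variable names are illustrative, and this is not a description of how timetk works internally.

```r
# Base R: character input must be converted explicitly, and the
# date and date-time cases go through different seq() methods.

# Daily sequence (dispatches to seq.Date)
daily <- seq(as.Date("2011-01-01"), by = "day", length.out = 7)

# 2-second sequence (dispatches to seq.POSIXt)
every_2_sec <- seq(as.POSIXct("2016-01-01 00:00:00", tz = "UTC"),
                   by = "2 sec", length.out = 4)

print(daily)
print(every_2_sec)
```

Note that base R needs a full date string and a count of observations, whereas the examples that follow pass partial dates such as "2011" and time-based phrases such as "7 days".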
By Day
- Can use by = "day" or leave blank.
- include_endpoints = FALSE removes the last value, so your series has only 7 observations.
# Selects by day automatically
tk_make_timeseries("2011", length_out = "7 days", include_endpoints = FALSE)
## [1] "2011-01-01" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"
## [6] "2011-01-06" "2011-01-07"
By 2 Seconds
- Can use by = "2 sec" to adjust the interval width.
- include_endpoints = TRUE (the default) keeps the last value, so the series ends on the 6th second.
# Guesses by second
tk_make_timeseries("2016", by = "2 sec", length_out = "6 seconds")
## [1] "2016-01-01 00:00:00 UTC" "2016-01-01 00:00:02 UTC"
## [3] "2016-01-01 00:00:04 UTC" "2016-01-01 00:00:06 UTC"
Length Out = 1 year 6 months

- length_out = "1 year 6 months"
- Can include complex expressions like “1 year 4 months 6 days”.
tk_make_timeseries("2012-07",
                   by = "1 month",
                   length_out = "1 year 6 months",
                   include_endpoints = FALSE)
##  [1] "2012-07-01" "2012-08-01" "2012-09-01" "2012-10-01" "2012-11-01"
##  [6] "2012-12-01" "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01"
## [11] "2013-05-01" "2013-06-01" "2013-07-01" "2013-08-01" "2013-09-01"
## [16] "2013-10-01" "2013-11-01" "2013-12-01"
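The arithmetic behind this output can be sketched in base R, assuming the phrase "1 year 6 months" is read as 18 months past the start date. This only illustrates the endpoint logic and is not timetk's internal code. The variable names are made up for the example.

```r
start_date <- as.Date("2012-07-01")

# Assumption: "1 year 6 months" means 18 months after the start date.
end_date <- seq(start_date, by = "18 months", length.out = 2)[2]

# Monthly steps from start to end, endpoint included
with_endpoint <- seq(start_date, end_date, by = "1 month")

# Dropping the final value mirrors include_endpoints = FALSE
without_endpoint <- head(with_endpoint, -1)

length(with_endpoint)     # 19
length(without_endpoint)  # 18
```

Dropping the endpoint is what brings the series to 18 monthly observations, ending in December 2013 rather than January 2014.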
Go In Reverse
- To go in reverse, set end_date to the date where you want the series to end.
tk_make_timeseries(end_date = "2012-07-01",
                   by = "1 month",
                   length_out = "1 year 6 months")
##  [1] "2011-01-01" "2011-02-01" "2011-03-01" "2011-04-01" "2011-05-01"
##  [6] "2011-06-01" "2011-07-01" "2011-08-01" "2011-09-01" "2011-10-01"
## [11] "2011-11-01" "2011-12-01" "2012-01-01" "2012-02-01" "2012-03-01"
## [16] "2012-04-01" "2012-05-01" "2012-06-01" "2012-07-01"
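For readers used to base R, a rough equivalent is to step backward from the end date with a negative interval and then reverse the result. This is a sketch under the assumption that "1 year 6 months" spans 18 monthly steps. The variable names are illustrative only.

```r
end_date <- as.Date("2012-07-01")

# The end date itself plus 18 earlier months gives 19 values,
# generated newest-first because the step is negative.
backward <- seq(end_date, by = "-1 month", length.out = 19)

# Reverse so the series runs oldest to newest
reversed <- rev(backward)

range(reversed)
```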
Future Time Series Sequence
A common operation is to make a future time series sequence that mimics an existing one. This is what tk_make_future_timeseries() is for.
Suppose we have an existing time index.
idx <- tk_make_timeseries("2012", by = "3 months",
                          length_out = "2 years",
                          include_endpoints = FALSE)
idx
## [1] "2012-01-01" "2012-04-01" "2012-07-01" "2012-10-01" "2013-01-01"
## [6] "2013-04-01" "2013-07-01" "2013-10-01"
Make a Future Time Series from an Existing
We can create a future time sequence from the existing sequence using tk_make_future_timeseries().
tk_make_future_timeseries(idx, length_out = "2 years")
## [1] "2014-01-01" "2014-04-01" "2014-07-01" "2014-10-01" "2015-01-01"
## [6] "2015-04-01" "2015-07-01" "2015-10-01"
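To see what "mimicking" the existing index amounts to for this regular quarterly case, here is a minimal base R sketch. It hard-codes the 3-month interval, which is an assumption made for the example. The function above works from the index itself, so no interval is passed to it.

```r
# Rebuild the existing quarterly index with base R
idx <- seq(as.Date("2012-01-01"), by = "3 months", length.out = 8)

# Continue from the last observed date at the same interval,
# then drop that date so the future index starts one step later.
future <- seq(max(idx), by = "3 months", length.out = 9)[-1]

future
```

The manual version breaks down as soon as the index has gaps such as weekends or holidays, which is the situation covered in the next section.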
Weekends & Holidays
Make weekday sequence removing holidays
- Result is 252 days.
idx <- tk_make_weekday_sequence("2012",
                                remove_weekends = TRUE,
                                remove_holidays = TRUE, calendar = "NYSE")
tk_get_timeseries_summary(idx)
## # A tibble: 1 × 12
##   n.obs start      end        units scale tzone diff.m…¹ diff.q1 diff.…² diff.…³
##   <int> <date>     <date>     <chr> <chr> <chr>    <dbl>   <dbl>   <dbl>   <dbl>
## 1   252 2012-01-03 2012-12-31 days  day   UTC      86400   86400   86400 124953.
## # … with 2 more variables: diff.q3 <dbl>, diff.maximum <dbl>, and abbreviated
## #   variable names ¹diff.minimum, ²diff.median, ³diff.mean
## # ℹ Use `colnames()` to see all variable names
Which holidays were removed?
- NYSE trading holidays, which are days most businesses observe.
tk_make_holiday_sequence("2012", calendar = "NYSE")
## [1] "2012-01-02" "2012-01-16" "2012-02-20" "2012-04-06" "2012-05-28"
## [6] "2012-07-04" "2012-09-03" "2012-11-22" "2012-12-25"
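The 252-day count can be checked by hand with a small base R sketch: take every day in 2012, drop Saturdays and Sundays, then drop the nine holidays listed above. The holiday vector is copied from that output rather than taken from a calendar lookup, and the variable names are illustrative.

```r
all_days <- seq(as.Date("2012-01-01"), as.Date("2012-12-31"), by = "day")

# Holiday dates copied from the tk_make_holiday_sequence() output above
holidays_2012 <- as.Date(c(
  "2012-01-02", "2012-01-16", "2012-02-20", "2012-04-06", "2012-05-28",
  "2012-07-04", "2012-09-03", "2012-11-22", "2012-12-25"
))

# POSIXlt stores the day of week as 0 (Sunday) through 6 (Saturday)
day_of_week <- as.POSIXlt(all_days)$wday
is_weekday  <- day_of_week %in% 1:5

trading_days <- all_days[is_weekday & !(all_days %in% holidays_2012)]

length(trading_days)  # 252
range(trading_days)
```

2012 has 261 weekdays, and all nine holidays fall on weekdays, which leaves 252.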
Make future index removing holidays
holidays <- tk_make_holiday_sequence(
    start_date = "2013-01-01",
    end_date   = "2013-12-31",
    calendar   = "NYSE")
idx_future <- idx %>%
    tk_make_future_timeseries(length_out       = "1 year",
                              inspect_weekdays = TRUE,
                              skip_values      = holidays)
tk_get_timeseries_summary(idx_future)
## # A tibble: 1 × 12
##   n.obs start      end        units scale tzone diff.m…¹ diff.q1 diff.…² diff.…³
##   <int> <date>     <date>     <chr> <chr> <chr>    <dbl>   <dbl>   <dbl>   <dbl>
## 1   252 2013-01-02 2013-12-31 days  day   UTC      86400   86400   86400 124953.
## # … with 2 more variables: diff.q3 <dbl>, diff.maximum <dbl>, and abbreviated
## #   variable names ¹diff.minimum, ²diff.median, ³diff.mean
## # ℹ Use `colnames()` to see all variable names
Learning More
My Talk on High-Performance Time Series Forecasting
Time series is changing. Businesses now need 10,000+ time series forecasts every day.
High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning Scalable High-Performance Forecasting Strategies, then take my course. You will learn:
- Time Series Machine Learning (cutting-edge) with Modeltime - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- NEW - Deep Learning with GluonTS (Competition Winners)
- Time Series Preprocessing, Noise Reduction, & Anomaly Detection
- Feature engineering using lagged variables & external regressors
- Hyperparameter Tuning
- Time series cross-validation
- Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
- Scalable Forecasting - Forecast 1000+ time series in parallel
- and more.