Intelligent Date & Time Sequences
Matt Dancho
2024-01-04
Source: vignettes/TK02_Time_Series_Date_Sequences.Rmd
Creating and modifying date sequences is critical to machine learning projects. We discuss:
- Making a Time Series Sequence: tk_make_timeseries()
- Making a Future Sequence: tk_make_future_timeseries()
- Holiday & Weekday/Weekend Sequences
Making a Time Series Sequence
tk_make_timeseries() improves on seq.Date() and seq.POSIXt() by combining them into one function. It handles character dates intelligently and makes logical assumptions based on user inputs.
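For comparison, here is a minimal base R sketch of what seq.Date() and seq.POSIXt() require on their own: explicit conversion of character input, and a separate call for dates versus date-times. The variable names are illustrative.

```r
# Base R needs explicit Date / POSIXct conversion before sequencing.
daily <- seq.Date(from = as.Date("2011-01-01"), by = "day", length.out = 7)

# Date-times go through a different function and need a time zone.
two_sec <- seq.POSIXt(from = as.POSIXct("2016-01-01 00:00:00", tz = "UTC"),
                      by = "2 sec", length.out = 4)

daily
two_sec
```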
By Day
- Can use by = "day" or leave blank.
- include_endpoints = FALSE removes the last value so your series is only 7 observations.
library(dplyr)   # provides the %>% pipe used later in this vignette
library(timetk)
# Selects by day automatically
tk_make_timeseries("2011", length_out = "7 days", include_endpoints = FALSE)
## [1] "2011-01-01" "2011-01-02" "2011-01-03" "2011-01-04" "2011-01-05"
## [6] "2011-01-06" "2011-01-07"
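The effect of include_endpoints can be sketched in base R, assuming (as the outputs in this vignette suggest) that the length_out span runs from the start date to an end date that is included by default. This illustrates the behavior only; it is not timetk's implementation.

```r
start <- as.Date("2011-01-01")

# A 7-day span with both endpoints kept covers 8 calendar dates.
with_endpoint <- seq(start, start + 7, by = "day")

# Dropping the final value leaves exactly 7 daily observations.
without_endpoint <- head(with_endpoint, -1)

length(with_endpoint)
length(without_endpoint)
```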
By 2 Seconds
- Can use by = "2 sec" to adjust the interval width.
- include_endpoints = TRUE (the default) keeps the last value, so the series ends on the 6th second.
# Guesses by second
tk_make_timeseries("2016", by = "2 sec", length_out = "6 seconds")
## [1] "2016-01-01 00:00:00 UTC" "2016-01-01 00:00:02 UTC"
## [3] "2016-01-01 00:00:04 UTC" "2016-01-01 00:00:06 UTC"
Length Out = 1 year 6 months
- length_out = "1 year 6 months"
- Can include complex expressions like “1 year 4 months 6 days”.
tk_make_timeseries("2012-07",
                   by = "1 month",
                   length_out = "1 year 6 months",
                   include_endpoints = FALSE)
## [1] "2012-07-01" "2012-08-01" "2012-09-01" "2012-10-01" "2012-11-01"
## [6] "2012-12-01" "2013-01-01" "2013-02-01" "2013-03-01" "2013-04-01"
## [11] "2013-05-01" "2013-06-01" "2013-07-01" "2013-08-01" "2013-09-01"
## [16] "2013-10-01" "2013-11-01" "2013-12-01"
Go In Reverse
- To go in reverse, use end_date to specify where you want the series to end.
tk_make_timeseries(end_date = "2012-07-01",
                   by = "1 month",
                   length_out = "1 year 6 months")
## [1] "2011-01-01" "2011-02-01" "2011-03-01" "2011-04-01" "2011-05-01"
## [6] "2011-06-01" "2011-07-01" "2011-08-01" "2011-09-01" "2011-10-01"
## [11] "2011-11-01" "2011-12-01" "2012-01-01" "2012-02-01" "2012-03-01"
## [16] "2012-04-01" "2012-05-01" "2012-06-01" "2012-07-01"
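As a rough base R cross-check of the reversed sequence (a sketch, not how timetk computes it), one can step backward from the end date with a negative by and then reverse the result. The count of 19 is 18 monthly steps plus the included end date.

```r
end_date <- as.Date("2012-07-01")

# Step back one month at a time: 18 months plus the end date itself.
backward <- seq(end_date, by = "-1 month", length.out = 19)

# Reverse so the series runs in chronological order.
reversed_idx <- rev(backward)

range(reversed_idx)
```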
Future Time Series Sequence
A common operation is to make a future time series sequence that mimics an existing one. This is what tk_make_future_timeseries() is for.
Suppose we have an existing time index.
idx <- tk_make_timeseries("2012", by = "3 months",
                          length_out = "2 years",
                          include_endpoints = FALSE)
idx
## [1] "2012-01-01" "2012-04-01" "2012-07-01" "2012-10-01" "2013-01-01"
## [6] "2013-04-01" "2013-07-01" "2013-10-01"
Make a Future Time Series from an Existing One
We can create a future time sequence from the existing sequence using tk_make_future_timeseries().
tk_make_future_timeseries(idx, length_out = "2 years")
## [1] "2014-01-01" "2014-04-01" "2014-07-01" "2014-10-01" "2015-01-01"
## [6] "2015-04-01" "2015-07-01" "2015-10-01"
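Conceptually, extending a regular index means continuing from its last value at the same interval. A hedged base R sketch follows, with the 3-month interval supplied by hand; tk_make_future_timeseries() works this out from the index itself, and the names here are illustrative.

```r
# Rebuild the quarterly index shown above with base R.
idx_base <- seq(as.Date("2012-01-01"), by = "3 months", length.out = 8)

# Continue from the last date at the same interval, then drop the
# first element because it is already part of the existing index.
future_base <- seq(max(idx_base), by = "3 months", length.out = 9)[-1]

future_base
```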
Weekends & Holidays
Make weekday sequence removing holidays
- Result is 250 days.
idx <- tk_make_weekday_sequence("2012",
                                remove_weekends = TRUE,
                                remove_holidays = TRUE, calendar = "NYSE")
tk_get_timeseries_summary(idx)
## # A tibble: 1 × 12
## n.obs start end units scale tzone diff.minimum diff.q1 diff.median
## <int> <date> <date> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 250 2012-01-03 2012-12-31 days day UTC 86400 86400 86400
## # ℹ 3 more variables: diff.mean <dbl>, diff.q3 <dbl>, diff.maximum <dbl>
Which holidays were removed?
- NYSE trading holidays, which are days most businesses observe.
tk_make_holiday_sequence("2012", calendar = "NYSE")
## [1] "2012-01-02" "2012-01-16" "2012-02-20" "2012-04-06" "2012-05-28"
## [6] "2012-07-04" "2012-09-03" "2012-10-29" "2012-10-30" "2012-11-22"
## [11] "2012-12-25"
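The observation count in the summary above can be cross-checked with base R: count the 2012 weekdays, then remove the holiday dates just printed. This is a sketch of the arithmetic only, with the holiday vector copied from that output.

```r
days_2012 <- seq(as.Date("2012-01-01"), as.Date("2012-12-31"), by = "day")

# POSIXlt wday: 0 = Sunday, 6 = Saturday.
weekdays_2012 <- days_2012[!(as.POSIXlt(days_2012)$wday %in% c(0, 6))]

# Holiday dates copied from the tk_make_holiday_sequence() output above.
nyse_2012 <- as.Date(c("2012-01-02", "2012-01-16", "2012-02-20", "2012-04-06",
                       "2012-05-28", "2012-07-04", "2012-09-03", "2012-10-29",
                       "2012-10-30", "2012-11-22", "2012-12-25"))

trading_days <- weekdays_2012[!(weekdays_2012 %in% nyse_2012)]
length(trading_days)
```

The 261 weekdays of 2012 minus the 11 listed holidays leave 250 trading days, which matches n.obs in the summary.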
Make future index removing holidays
holidays <- tk_make_holiday_sequence(
    start_date = "2013-01-01",
    end_date   = "2013-12-31",
    calendar   = "NYSE")
idx_future <- idx %>%
    tk_make_future_timeseries(length_out       = "1 year",
                              inspect_weekdays = TRUE,
                              skip_values      = holidays)
tk_get_timeseries_summary(idx_future)
## # A tibble: 1 × 12
## n.obs start end units scale tzone diff.minimum diff.q1 diff.median
## <int> <date> <date> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 252 2013-01-02 2013-12-31 days day UTC 86400 86400 86400
## # ℹ 3 more variables: diff.mean <dbl>, diff.q3 <dbl>, diff.maximum <dbl>
Learning More
My Talk on High-Performance Time Series Forecasting
Time series is changing. Businesses now need 10,000+ time series forecasts every day.
High-Performance Forecasting Systems will save companies MILLIONS of dollars. Imagine what will happen to your career if you can provide your organization a “High-Performance Time Series Forecasting System” (HPTSF System).
I teach how to build an HPTSF System in my High-Performance Time Series Forecasting Course. If you are interested in learning Scalable High-Performance Forecasting Strategies, then take my course. You will learn:
- Time Series Machine Learning (cutting-edge) with Modeltime - 30+ Models (Prophet, ARIMA, XGBoost, Random Forest, & many more)
- NEW - Deep Learning with GluonTS (Competition Winners)
- Time Series Preprocessing, Noise Reduction, & Anomaly Detection
- Feature engineering using lagged variables & external regressors
- Hyperparameter Tuning
- Time series cross-validation
- Ensembling Multiple Machine Learning & Univariate Modeling Techniques (Competition Winner)
- Scalable Forecasting - Forecast 1000+ time series in parallel
- and more.