
Group-wise Seasonality Data Preparation
Source: R/diagnostics-tk_seasonal_diagnostics.R
tk_seasonal_diagnostics() is the preprocessor for plot_seasonal_diagnostics(). It helps by automating feature collection for time series seasonality analysis.
Arguments
- .data
A tibble or data.frame with a time-based column
- .date_var
A column containing either date or date-time values
- .value
A column containing numeric values
- .feature_set
One or more features to analyze for seasonality. Choices include:
"auto" - Automatically selects features based on the time stamps and length of the series.
"second" - Good for analyzing seasonality by second of each minute.
"minute" - Good for analyzing seasonality by minute of the hour.
"hour" - Good for analyzing seasonality by hour of the day.
"wday.lbl" - Labeled weekdays. Good for analyzing seasonality by day of the week.
"week" - Good for analyzing seasonality by week of the year.
"month.lbl" - Labeled months. Good for analyzing seasonality by month of the year.
"quarter" - Good for analyzing seasonality by quarter of the year.
"year" - Good for analyzing seasonality over multiple years.
Details
Automatic Feature Selection
Internal calculations detect the sub-range of features to include, using the following logic:
The minimum feature is selected based on the median difference between consecutive timestamps.
The maximum feature is selected such that the series spans at least 2 full periods of that feature.
Example: Hourly timestamp data that lasts more than 2 weeks will have the following features: "hour", "wday.lbl", and "week".
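The selection logic above can be sketched in base R. This is a rough illustration under stated assumptions, not timetk's internal code: the helper name sketch_auto_features(), the approximate unit lengths, and the factor-of-2 thresholds are all assumptions chosen so that the result agrees with the hourly example above.

```r
# Hypothetical helper for illustration only; it is not part of timetk.
sketch_auto_features <- function(timestamps) {
  secs <- sort(as.numeric(as.POSIXct(timestamps, tz = "UTC")))
  step <- median(diff(secs))      # typical gap between observations, in seconds
  span <- max(secs) - min(secs)   # total length of the series, in seconds

  # Approximate length of one unit of each feature, in seconds (assumed values)
  units <- c(
    second = 1, minute = 60, hour = 3600, wday.lbl = 86400, week = 604800,
    month.lbl = 2629800, quarter = 7889400, year = 31557600
  )

  # Minimum feature: the finest unit that is not much smaller than the gap
  lo <- min(which(2 * units > step), length(units))
  # Maximum feature: the coarsest unit that fits into the span at least twice
  hi <- max(which(span >= 2 * units), lo)

  names(units)[lo:hi]
}

# Hourly timestamps covering 15 days (a little more than 2 weeks)
hourly <- seq(as.POSIXct("2020-01-01", tz = "UTC"), by = "hour", length.out = 24 * 15 + 1)
sketch_auto_features(hourly)
# "hour" "wday.lbl" "week"
```

With these assumed thresholds, a monthly series spanning several years would yield "month.lbl", "quarter", and "year".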
Scalable with Grouped Data Frames
This function respects grouped data.frame and tibbles that were made with dplyr::group_by().
For grouped data, the automatic feature selection returned is a collection of all features within the sub-groups. This means extra features are returned even though they may be meaningless for some of the groups.
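As a small illustration of that behavior, the base R sketch below combines per-group feature selections into one set. The group names and their per-group selections are made-up values for illustration, and the union-based approach is an assumption about how the behavior could be modeled rather than timetk's actual implementation.

```r
# Feature choices from the .feature_set argument, ordered from finest to coarsest
all_features <- c("second", "minute", "hour", "wday.lbl", "week",
                  "month.lbl", "quarter", "year")

# Hypothetical per-group selections: the second group is too short for "week"
per_group <- list(
  long_series  = c("hour", "wday.lbl", "week"),
  short_series = c("hour", "wday.lbl")
)

# Collect every feature selected for any group, keeping the finest-to-coarsest order
combined <- Reduce(union, per_group)
combined <- combined[order(match(combined, all_features))]
combined
# "hour" "wday.lbl" "week"
```

Here "week" is returned for both groups even though it carries little information for the shorter one. Passing an explicit .feature_set avoids this when the extra features are unwanted.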
Transformations
The .value parameter respects transformations (e.g. .value = log(sales)).
Examples
if (FALSE) {
  library(dplyr)
  library(timetk)

  # ---- GROUPED EXAMPLES ----

  # Hourly Data
  m4_hourly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value)

  # Monthly Data
  m4_monthly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value)

  # ---- TRANSFORMATION ----

  # The value column is log-transformed before features are collected
  m4_weekly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, log(value))

  # ---- CUSTOM FEATURE SELECTION ----

  # Only the listed features are returned instead of the "auto" selection
  m4_hourly %>%
    group_by(id) %>%
    tk_seasonal_diagnostics(date, value, .feature_set = c("hour", "week"))
}