Skip to contents

An interactive and scalable function for visualizing time series seasonality. Plots are available in interactive plotly (default) and static ggplot2 format.

Usage

plot_seasonal_diagnostics(
  .data,
  .date_var,
  .value,
  .facet_vars = NULL,
  .feature_set = "auto",
  .geom = c("boxplot", "violin"),
  .geom_color = "#2c3e50",
  .geom_outlier_color = "#2c3e50",
  .title = "Seasonal Diagnostics",
  .x_lab = "",
  .y_lab = "",
  .interactive = TRUE
)

Arguments

.data

A tibble or data.frame with a time-based column

.date_var

A column containing either date or date-time values

.value

A column containing numeric values

.facet_vars

One or more grouping columns that broken out into ggplot2 facets. These can be selected using tidyselect() helpers (e.g contains()).

.feature_set

One or multiple selections to analyze for seasonality. Choices include:

  • "auto" - Automatically selects features based on the time stamps and length of the series.

  • "second" - Good for analyzing seasonality by second of each minute.

  • "minute" - Good for analyzing seasonality by minute of the hour

  • "hour" - Good for analyzing seasonality by hour of the day

  • "wday.lbl" - Labeled weekdays. Good for analyzing seasonality by day of the week.

  • "week" - Good for analyzing seasonality by week of the year.

  • "month.lbl" - Labeled months. Good for analyzing seasonality by month of the year.

  • "quarter" - Good for analyzing seasonality by quarter of the year

  • "year" - Good for analyzing seasonality over multiple years.

.geom

Either "boxplot" or "violin"

.geom_color

Geometry color. Line color. Use keyword: "scale_color" to change the color by the facet.

.geom_outlier_color

Color used to highlight outliers.

.title

Plot title.

.x_lab

Plot x-axis label

.y_lab

Plot y-axis label

.interactive

If TRUE, returns a plotly interactive plot. If FALSE, returns a static ggplot2 plot.

Value

A plotly or ggplot2 visualization

Details

Automatic Feature Selection

Internal calculations are performed to detect a sub-range of features to include useing the following logic:

  • The minimum feature is selected based on the median difference between consecutive timestamps

  • The maximum feature is selected based on having 2 full periods.

Example: Hourly timestamp data that lasts more than 2 weeks will have the following features: "hour", "wday.lbl", and "week".

Scalable with Grouped Data Frames

This function respects grouped data.frame and tibbles that were made with dplyr::group_by().

For grouped data, the automatic feature selection returned is a collection of all features within the sub-groups. This means extra features are returned even though they may be meaningless for some of the groups.

Transformations

The .value parameter respects transformations (e.g. .value = log(sales)).

Examples

if (FALSE) {
library(dplyr)
library(timetk)

# ---- MULTIPLE FREQUENCY ----
# Taylor 30-minute dataset from forecast package
taylor_30_min

# Visualize series
taylor_30_min %>%
    plot_time_series(date, value, .interactive = FALSE)

# Visualize seasonality
taylor_30_min %>%
    plot_seasonal_diagnostics(date, value, .interactive = FALSE)

# ---- GROUPED EXAMPLES ----
# m4 hourly dataset
m4_hourly

# Visualize series
m4_hourly %>%
    group_by(id) %>%
    plot_time_series(date, value, .facet_scales = "free", .interactive = FALSE)

# Visualize seasonality
m4_hourly %>%
    group_by(id) %>%
    plot_seasonal_diagnostics(date, value, .interactive = FALSE)

}