This is mainly a wrapper for the outlier cleaning function, tsclean(), from the forecast R package. The ts_clean_vec() function adds a period argument so that seasonality can be applied to a plain numeric (non-ts) vector.

Usage

ts_clean_vec(x, period = 1, lambda = NULL)

Arguments

x

A numeric vector.

period

A seasonal period to use during the transformation. If period = 1, seasonality is not included and supsmu() is used to fit a trend. If period > 1, a robust STL decomposition is first performed and a linear interpolation is applied to the seasonally adjusted data.

lambda

A Box Cox transformation parameter. If NULL (the default), no transformation is applied. If set to "auto", performs automated lambda selection.

Value

A numeric vector with the missing values and/or anomalies replaced by imputed values.

Details

Cleaning Outliers

  1. Non-Seasonal (period = 1): Uses stats::supsmu() to fit a trend.

  2. Seasonal (period > 1): Uses forecast::mstl() with robust = TRUE (robust STL decomposition).
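The seasonal case can be illustrated with a small sketch. It swaps in stats::stl() from base R in place of forecast::mstl(), so it is an approximation of the idea rather than the code ts_clean_vec() runs. The example series and the choice of s.window = "periodic" are assumptions made for illustration.

```r
# Illustrative series with a period-4 pattern (every 4th value is elevated).
x <- c(1, 2, 3, 8, 5, 6, 7, 12, 9, 10, 11, 16)

# Give the plain vector a seasonal frequency, then fit a robust STL decomposition.
# stats::stl() is used here as a stand-in for forecast::mstl().
fit <- stats::stl(stats::ts(x, frequency = 4), s.window = "periodic", robust = TRUE)

# Extract the seasonal component and remove it from the series.
seasonal <- as.numeric(fit$time.series[, "seasonal"])
adjusted <- x - seasonal

adjusted
```

Outlier detection and interpolation would then operate on the seasonally adjusted values, with the seasonal component added back afterwards.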

To estimate missing values and outlier replacements, linear interpolation is used on the (possibly seasonally adjusted) series. See forecast::tsoutliers() for the outlier detection method.
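The non-seasonal path can be sketched in base R as follows. This is a minimal illustration, not the forecast implementation: clean_sketch() is a hypothetical helper, and the 3 x IQR threshold on the residuals is an assumption about how outliers are flagged.

```r
# Hypothetical helper (not part of timetk or forecast): smooth, flag, interpolate.
clean_sketch <- function(x, iqr_mult = 3) {
  idx  <- seq_along(x)
  miss <- is.na(x)

  # 1. Fill missing values by linear interpolation so a trend can be fitted.
  x_filled <- stats::approx(idx[!miss], x[!miss], xout = idx, rule = 2)$y

  # 2. Fit a smooth trend with Friedman's super smoother.
  trend <- stats::supsmu(idx, x_filled)$y

  # 3. Flag residuals outside the quartiles by more than iqr_mult * IQR
  #    (assumed threshold, chosen for illustration).
  resid  <- x_filled - trend
  q      <- stats::quantile(resid, c(0.25, 0.75))
  limits <- q + iqr_mult * diff(q) * c(-1, 1)
  is_out <- resid < limits[1] | resid > limits[2]

  # 4. Replace flagged points by linear interpolation over the remaining points.
  keep <- !is_out
  stats::approx(idx[keep], x_filled[keep], xout = idx, rule = 2)$y
}

values  <- c(1, 2, 3, 4 * 2, 5, 6, 7, NA, 9, 10, 11, 12 * 2)
cleaned <- clean_sketch(values)
cleaned
```

The result has the same length as the input and contains no missing values. Which points are flagged depends on the smoother and the threshold, so the output need not match ts_clean_vec() exactly.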

Box Cox Transformation

In many circumstances, a Box Cox transformation can help, especially if the series is multiplicative, meaning the variance grows exponentially. The transformation can be automated by setting lambda = "auto", or specified directly by setting lambda to a numeric value.
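For reference, the standard Box Cox transformation and its inverse can be written in a few lines of base R. The helpers box_cox() and inv_box_cox() are hypothetical names used only for this sketch, and they assume a strictly positive series. The forecast package provides its own functions for this.

```r
# Hypothetical helpers illustrating the textbook Box Cox formula.
# lambda = 0 is the log transform; otherwise (x^lambda - 1) / lambda.
box_cox <- function(x, lambda) {
  if (lambda == 0) log(x) else (x^lambda - 1) / lambda
}

# Inverse transform, mapping values back to the original scale.
inv_box_cox <- function(y, lambda) {
  if (lambda == 0) exp(y) else (lambda * y + 1)^(1 / lambda)
}

x <- c(1, 2, 4, 8, 16, 32)

# With lambda = 0, an exponentially growing series becomes linear.
transformed <- box_cox(x, lambda = 0)
diff(transformed)

# Round trip back to the original scale.
inv_box_cox(box_cox(x, lambda = 0.5), lambda = 0.5)
```

Cleaning on the transformed scale and then inverting is what lets the outlier thresholds adapt to a series whose spread grows with its level.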

Examples

library(dplyr)
library(timetk)  # provides ts_clean_vec()


# --- VECTOR ----

values <- c(1,2,3, 4*2, 5,6,7, NA, 9,10,11, 12*2)
values
#>  [1]  1  2  3  8  5  6  7 NA  9 10 11 24

# Linear interpolation + Outlier Cleansing
ts_clean_vec(values, period = 1, lambda = NULL)
#>  [1] 1 2 3 4 5 6 7 8 9 9 9 9

# Seasonal Interpolation: set period = 4
ts_clean_vec(values, period = 4, lambda = NULL)
#>  [1]  1.00000  2.00000  3.00000  8.00000  5.00000  6.00000  7.00000 11.25703
#>  [9]  9.00000 10.00000 10.00000 14.00000

# Seasonal Interpolation with Box Cox Transformation (internal)
ts_clean_vec(values, period = 4, lambda = "auto")
#>  [1]  1.000000  2.000000  3.000000  8.444127  3.832690  6.000000  7.000000
#>  [8] 15.895521  9.000000 10.000000 11.000000 24.000000