`tk_tsfeatures()` is a tidyverse compliant wrapper for `tsfeatures::tsfeatures()`.
The function computes a matrix of time series features that describes the various time
series. It's designed for groupwise analysis using `dplyr` groups.

## Usage

```
tk_tsfeatures(
  .data,
  .date_var,
  .value,
  .period = "auto",
  .features = c("frequency", "stl_features", "entropy", "acf_features"),
  .scale = TRUE,
  .trim = FALSE,
  .trim_amount = 0.1,
  .parallel = FALSE,
  .na_action = na.pass,
  .prefix = "ts_",
  .silent = TRUE,
  ...
)
```

## Arguments

- .data
A `tibble` or `data.frame` with a time-based column

- .date_var
A column containing either date or date-time values

- .value
A column containing numeric values

- .period
The periodicity (frequency) of the time series data. Values can be provided as follows:

"auto" (default): Calculates the period using `tk_get_frequency()`.

"2 weeks": Would calculate the median number of observations in a 2-week window.

7 (numeric): Would interpret the `ts` frequency as 7 observations per cycle (common for daily data with a weekly cycle).

- .features
Passed to `features` in the underlying `tsfeatures()` function. A vector of function names that represent a feature aggregation function. Examples:

Use one of the function names from the `tsfeatures` R package, e.g. ("lumpiness", "stl_features").

Use a function name (e.g. "mean" or "median").

Create your own function and provide the function name.

- .scale
If `TRUE`, time series are scaled to mean 0 and sd 1 before features are computed.

- .trim
If `TRUE`, time series are trimmed by `.trim_amount` before features are computed. Values larger than `.trim_amount` in absolute value are set to `NA`.

- .trim_amount
Default level of trimming if `.trim = TRUE`. Default: 0.1.

- .parallel
If `TRUE`, multiple cores (or multiple sessions) will be used. This only speeds things up when there are a large number of time series.

When `.parallel = TRUE`, `multiprocess = future::multisession` is used. This can be adjusted by setting the `multiprocess` parameter. See the `tsfeatures::tsfeatures()` function for more details.

- .na_action
A function to handle missing values. Use `na.interp` to estimate missing values.

- .prefix
A prefix for the feature columns. Default: `"ts_"`.

- .silent
Whether or not to show messages and warnings.

- ...
Other arguments get passed to the feature functions.
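
As a hedged illustration of the `.period` and `.features` options described above, the sketch below passes a user-defined feature function by name alongside a `tsfeatures` feature and a base R function. The helper `my_max()` is hypothetical (it is not part of `timetk` or `tsfeatures`), and the example assumes the `walmart_sales_weekly` dataset that ships with `timetk`.

```r
library(dplyr)
library(timetk)

# Hypothetical custom feature: the maximum of the (scaled) series.
# A feature function takes a time series and returns a single value.
my_max <- function(x, na.rm = TRUE) {
  max(x, na.rm = na.rm)
}

feature_tbl <- walmart_sales_weekly %>%
  group_by(id) %>%
  tk_tsfeatures(
    .date_var = Date,
    .value    = Weekly_Sales,
    .period   = 52,  # numeric period: 52 observations per cycle
    # A tsfeatures feature, a base R function, and the custom function
    .features = c("entropy", "mean", "my_max"),
    .scale    = TRUE,
    .prefix   = "ts_"
  )

feature_tbl
```

Because the custom function is referenced by name, it needs to be defined where it can be found when `tk_tsfeatures()` runs, for example in the global environment.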

## Details

The `timetk::tk_tsfeatures()` function implements the `tsfeatures` package
for computing an aggregated feature matrix for time series, which is useful in many types of
analysis such as clustering time series.
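
To make the clustering use case concrete, here is a minimal sketch that feeds the feature matrix into `stats::kmeans()`. The choice of k-means, the three clusters, and the `walmart_sales_weekly` dataset are assumptions made for illustration; the sketch also assumes that none of the computed features are missing.

```r
library(dplyr)
library(timetk)

feature_tbl <- walmart_sales_weekly %>%
  group_by(id) %>%
  tk_tsfeatures(
    .date_var = Date,
    .value    = Weekly_Sales,
    .period   = 52,
    .features = c("frequency", "stl_features", "entropy", "acf_features"),
    .scale    = TRUE,
    .prefix   = "ts_"
  ) %>%
  ungroup()

# Keep only the numeric feature columns (this drops the `id` factor).
feature_matrix <- feature_tbl %>%
  select(where(is.numeric)) %>%
  as.matrix()

set.seed(123)  # k-means starts from random centers
clusters <- kmeans(feature_matrix, centers = 3, nstart = 25)

# Attach the cluster assignment to each series id.
cluster_tbl <- feature_tbl %>%
  select(id) %>%
  mutate(cluster = clusters$cluster)

cluster_tbl
```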

The `timetk` version ports the `tsfeatures::tsfeatures()` function to a `tidyverse`-compliant
format that uses a tidy data frame containing grouping columns (optional), a date column, and
a value column. Other columns are ignored.

It then becomes easy to summarize each time series by group-wise application of `.features`,
which are simply functions that evaluate a time series and return a single aggregated value.
(Example: "mean" would return the mean of the time series; note that values are scaled to mean 0 and sd 1 first.)

**Function Internals:**

Internally, the time series are converted to `ts` class using `tk_ts(.period)` where the
period is the frequency of the time series. Values can be provided for `.period`, which will be used
prior to conversion to `ts` class.

The function then leverages `tsfeatures::tsfeatures()` to compute the feature matrix of summarized
feature values.
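
The conversion step can be approximated by hand for a single series. This is a rough sketch of the idea, not the package's actual internal code: it resolves the period with `tk_get_frequency()` and converts the values with `tk_ts()`. The `walmart_sales_weekly` dataset and the choice of series `"1_1"` are assumptions made for illustration.

```r
library(dplyr)
library(timetk)

# Pick one series from the grouped data.
one_series <- walmart_sales_weekly %>%
  filter(id == "1_1")

# Resolve the period from the date index, as described for `.period = "auto"`.
freq <- tk_get_frequency(one_series$Date, period = "auto", message = FALSE)

# Convert the value column to a `ts` object with that frequency.
ts_obj <- tk_ts(one_series, select = Weekly_Sales, frequency = freq, silent = TRUE)

class(ts_obj)
frequency(ts_obj)
```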

## References

Rob Hyndman, Yanfei Kang, Pablo Montero-Manso, Thiyanga Talagala, Earo Wang, Yangzhuoran Yang, Mitchell O'Hara-Wild: tsfeatures R package

## Examples

```
library(dplyr)
library(timetk)

walmart_sales_weekly %>%
  group_by(id) %>%
  tk_tsfeatures(
    .date_var = Date,
    .value    = Weekly_Sales,
    .period   = 52,
    .features = c("frequency", "stl_features", "entropy", "acf_features", "mean"),
    .scale    = TRUE,
    .prefix   = "ts_"
  )
#> # A tibble: 7 × 22
#> # Groups: id [7]
#> id ts_fre…¹ ts_np…² ts_se…³ ts_tr…⁴ ts_sp…⁵ ts_li…⁶ ts_cu…⁷ ts_e_…⁸ ts_e_…⁹
#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1_1 52 1 52 6.70e-4 2.80e-5 -0.0581 0.112 0.349 0.334
#> 2 1_3 52 1 52 6.14e-2 9.87e-6 0.511 0.496 0.0581 0.0660
#> 3 1_8 52 1 52 7.56e-1 1.95e-6 6.41 3.67 0.330 0.358
#> 4 1_13 52 1 52 3.54e-1 4.75e-6 2.74 2.25 0.192 0.321
#> 5 1_38 52 1 52 4.25e-1 1.79e-5 -4.07 2.82 0.0459 0.152
#> 6 1_93 52 1 52 7.91e-1 7.54e-7 6.22 -0.684 -0.0248 0.363
#> 7 1_95 52 1 52 6.39e-1 5.67e-7 3.94 -0.377 0.0247 0.161
#> # … with 12 more variables: ts_seasonal_strength <dbl>, ts_peak <dbl>,
#> # ts_trough <dbl>, ts_entropy <dbl>, ts_x_acf1 <dbl>, ts_x_acf10 <dbl>,
#> # ts_diff1_acf1 <dbl>, ts_diff1_acf10 <dbl>, ts_diff2_acf1 <dbl>,
#> # ts_diff2_acf10 <dbl>, ts_seas_acf1 <dbl>, ts_mean <dbl>, and abbreviated
#> # variable names ¹ts_frequency, ²ts_nperiods, ³ts_seasonal_period, ⁴ts_trend,
#> # ⁵ts_spike, ⁶ts_linearity, ⁷ts_curvature, ⁸ts_e_acf1, ⁹ts_e_acf10
```