`step_box_cox` creates a *specification* of a recipe step that will transform data using a Box-Cox transformation. This function differs from `recipes::step_BoxCox` by adding multiple methods, including Guerrero lambda optimization and handling for negative data, as used in the forecast R package.

## Arguments

- recipe
A `recipe` object. The step will be added to the sequence of operations for this recipe.

- ...
One or more selector functions to choose which variables are affected by the step. See `selections()` for more details. For the `tidy` method, these are not currently used.

- method
One of "guerrero" or "loglik".

- limits
A length 2 numeric vector defining the range to compute the transformation parameter lambda.

- role
Not used by this step since no new variables are created.

- trained
A logical to indicate if the quantities for preprocessing have been estimated.

- lambdas_trained
A numeric vector of transformation values. This is `NULL` until computed by `prep()`.

- skip
A logical. Should the step be skipped when the recipe is baked by `bake.recipe()`? While all operations are baked when `prep.recipe()` is run, some operations may not be able to be conducted on new data (e.g. processing the outcome variable(s)). Care should be taken when using `skip = TRUE` as it may affect the computations for subsequent operations.

- id
A character string that is unique to this step to identify it.

- x
A `step_box_cox` object.
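As a minimal sketch of how the arguments above fit together, the following sets `method`, `limits`, and `id` explicitly. The data frame `toy` and its `sales` column are hypothetical names invented for illustration, and the limit values are arbitrary rather than recommended settings.

```r
library(recipes)
library(timetk)

set.seed(42)
# Hypothetical daily series; `toy` and `sales` are illustrative names only.
toy <- data.frame(
  date  = seq(as.Date("2020-01-01"), by = "day", length.out = 120),
  sales = exp(cumsum(rnorm(120, mean = 0.01, sd = 0.05)) + 4)
)

rec <- recipe(~ ., data = toy) %>%
  # Restrict the lambda search to [0, 1] and give the step a fixed id.
  step_box_cox(sales, method = "guerrero", limits = c(0, 1), id = "bc_sales") %>%
  prep()

# One row per transformed column, holding the estimated lambda.
lambdas <- tidy(rec, 1)

# Apply the trained transformation to the data.
transformed <- bake(rec, new_data = toy)
```

Setting `id` by hand keeps the `tidy()` output stable across runs, since the default identifier is randomly generated.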

## Value

An updated version of `recipe` with the new step added to the sequence of existing steps (if any). For the `tidy` method, a tibble with columns `terms` (the selectors or variables selected), `lambda` (the lambda estimate), and `id` (the step identifier).

## Details

The `step_box_cox()` function is designed specifically to handle time series using methods implemented in the forecast R package.

**Negative Data**

This function can be applied to data that contains negative values.
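The text does not spell out how negative values are handled. One common sign-preserving variant of the Box-Cox transformation applies the power to the absolute value and restores the sign afterwards. The base R sketch below illustrates that idea; the function names are hypothetical, and whether `step_box_cox()` uses exactly this form is an assumption.

```r
# Sign-preserving Box-Cox sketch (hypothetical helper, not the package's code).
box_cox_signed <- function(x, lambda) {
  # In this sketch a negative lambda is not defined for negative inputs.
  if (lambda < 0) x[x < 0] <- NA
  # lambda == 0 falls back to the log, which still requires positive x.
  if (lambda == 0) return(log(x))
  (sign(x) * abs(x)^lambda - 1) / lambda
}

# Matching inverse for the same sketch.
inv_box_cox_signed <- function(y, lambda) {
  if (lambda == 0) return(exp(y))
  z <- y * lambda + 1
  sign(z) * abs(z)^(1 / lambda)
}

x <- c(-4, 1, 9)
y <- box_cox_signed(x, lambda = 0.5)
x_back <- inv_box_cox_signed(y, lambda = 0.5)
```

With `lambda = 0.5` the round trip recovers the original values, including the negative one.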

**Lambda Optimization Methods**

This function uses two methods from the forecast R package for optimizing the lambda selection:

- `method = "guerrero"`: Guerrero's (1993) method is used, where lambda minimizes the coefficient of variation for subseries of x.
- `method = "loglik"`: the value of lambda is chosen to maximize the profile log likelihood of a linear model fitted to x. For non-seasonal data, a linear time trend is fitted, while for seasonal data a linear time trend with seasonal dummy variables is used.
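The two criteria can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the helper names, the block length of 2, the grid step of 0.05, and the search range `c(-1, 2)` are all choices made for this example, and the log-likelihood version covers only the non-seasonal case with strictly positive data.

```r
# Plain Box-Cox transform for strictly positive x (log when lambda is ~0).
box_cox_pos <- function(x, lambda) {
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Guerrero-style criterion: split x into consecutive blocks and measure how
# much sd / mean^(1 - lambda) varies from block to block.
guerrero_cv <- function(lambda, x, block_len = 2) {
  n_blocks <- floor(length(x) / block_len)
  x_used <- tail(x, n_blocks * block_len)  # keep the most recent full blocks
  blocks <- matrix(x_used, nrow = block_len, ncol = n_blocks)
  ratio <- apply(blocks, 2, sd) / colMeans(blocks)^(1 - lambda)
  sd(ratio) / mean(ratio)                  # coefficient of variation
}

guerrero_lambda <- function(x, limits = c(-1, 2), block_len = 2) {
  optimize(guerrero_cv, interval = limits, x = x, block_len = block_len)$minimum
}

# Profile log-likelihood: evaluate lambda on a grid, regressing the
# scaled transformed series on a linear time trend.
loglik_lambda <- function(x, limits = c(-1, 2), step = 0.05) {
  n <- length(x)
  trend <- seq_len(n)
  geo_mean <- exp(mean(log(x)))
  grid <- seq(limits[1], limits[2], by = step)
  ll <- vapply(grid, function(lambda) {
    z <- box_cox_pos(x, lambda) / geo_mean^(lambda - 1)
    rss <- sum(resid(lm(z ~ trend))^2)
    -n / 2 * log(rss)
  }, numeric(1))
  grid[which.max(ll)]
}

set.seed(123)
# Hypothetical positive series whose spread grows with its level.
x <- exp(cumsum(rnorm(200, mean = 0.01, sd = 0.05)) + 3)
lam_guerrero <- guerrero_lambda(x)
lam_loglik <- loglik_lambda(x)
```

Both helpers return a value inside the supplied `limits`, which is the role the `limits` argument plays in the step.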

## References

Guerrero, V.M. (1993) Time-series analysis supported by power transformations. *Journal of Forecasting*, **12**, 37–48.

Box, G. E. P. and Cox, D. R. (1964) An analysis of transformations. *JRSS* B, **26**, 211–246.

## See also

Time Series Analysis:

- Engineered Features: `step_timeseries_signature()`, `step_holiday_signature()`, `step_fourier()`
- Diffs & Lags: `step_diff()`, `recipes::step_lag()`
- Smoothing: `step_slidify()`, `step_smooth()`
- Variance Reduction: `step_box_cox()`
- Imputation: `step_ts_impute()`, `step_ts_clean()`
- Padding: `step_ts_pad()`

Transformations to reduce variance:

- `recipes::step_log()` - Log transformation
- `recipes::step_sqrt()` - Square-Root Power Transformation

Recipe Setup and Application:

## Examples

```
library(dplyr)
library(tidyr)
library(tidyquant)
library(recipes)
#>
#> Attaching package: ‘recipes’
#> The following object is masked from ‘package:stringr’:
#>
#>     fixed
#> The following object is masked from ‘package:stats’:
#>
#>     step
library(timetk)
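# Reshape to one column of adjusted prices per stock symbol
FANG_wide <- FANG %>%
    select(symbol, date, adjusted) %>%
    pivot_wider(names_from = symbol, values_from = adjusted)
# Estimate a Box-Cox lambda for each selected column
recipe_box_cox <- recipe(~ ., data = FANG_wide) %>%
    step_box_cox(FB, AMZN, NFLX, GOOG) %>%
    prep()
# Apply the trained transformation
recipe_box_cox %>% bake(FANG_wide)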
#> # A tibble: 1,008 × 5
#>    date          FB  AMZN  NFLX  GOOG
#>    <date>     <dbl> <dbl> <dbl> <dbl>
#>  1 2013-01-02  12.5  8.28  4.92  5.26
#>  2 2013-01-03  12.4  8.28  5.08  5.27
#>  3 2013-01-04  12.7  8.29  5.06  5.28
#>  4 2013-01-07  12.9  8.37  5.17  5.28
#>  5 2013-01-08  12.8  8.35  5.10  5.28
#>  6 2013-01-09  13.3  8.35  5.06  5.28
#>  7 2013-01-10  13.5  8.34  5.13  5.28
#>  8 2013-01-11  13.7  8.36  5.24  5.28
#>  9 2013-01-14  13.4  8.40  5.32  5.26
#> 10 2013-01-15  13.1  8.39  5.26  5.27
#> # … with 998 more rows
recipe_box_cox %>% tidy(1)
#> # A tibble: 4 × 3
#>   terms  lambda id
#>   <chr>   <dbl> <chr>
#> 1 FB     0.671  box_cox_jEnKu
#> 2 AMZN   0.135  box_cox_jEnKu
#> 3 NFLX   0.458  box_cox_jEnKu
#> 4 GOOG  -0.0388 box_cox_jEnKu
```