Standardization centers and scales a numeric feature. It is commonly used so that no single feature dominates in algorithms that require data to be on the same scale.

## Usage

standardize_vec(x, mean = NULL, sd = NULL, silent = FALSE)

standardize_inv_vec(x, mean, sd)

## Arguments

x

A numeric vector.

mean

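The mean used to standardize the data or, in standardize_inv_vec(), to invert the standardization. In standardize_vec() it defaults to NULL, in which case it is computed from x.

sd

The standard deviation used to standardize the data or, in standardize_inv_vec(), to invert the standardization. In standardize_vec() it defaults to NULL, in which case it is computed from x.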

silent

A logical. If FALSE (the default), the automatically computed mean and sd parameters are reported as a message; set to TRUE to suppress the message.

## Value

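standardize_vec() returns a numeric vector with the standardization transformation applied. standardize_inv_vec() returns a numeric vector with the transformation inverted, back on the original scale.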
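As a sketch of the arithmetic, and assuming the conventional z-score definition (the formula itself is not stated on this page), the transformation and its inverse can be written with $\mu$ for `mean` and $\sigma$ for `sd`:

$$
z_i = \frac{x_i - \mu}{\sigma}, \qquad x_i = z_i \, \sigma + \mu
$$

This agrees with the first row of the example output: with a mean of about 2261.61 and a standard deviation of about 175.60, a value of about 2076 gives (2076 - 2261.61) / 175.60 ≈ -1.06.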

## Details

Standardization vs Normalization

• Standardization refers to a transformation that rescales the data to mean 0 and standard deviation 1

• Normalization refers to a transformation that rescales the data to the min-max range (0, 1)

## See also

• Normalization/Standardization: standardize_vec(), normalize_vec()

• Box Cox Transformation: box_cox_vec()

• Lag Transformation: lag_vec()

• Differencing Transformation: diff_vec()

• Rolling Window Transformation: slidify_vec()

• Loess Smoothing Transformation: smooth_vec()

• Fourier Series: fourier_vec()

• Missing Value Imputation for Time Series: ts_impute_vec(), ts_clean_vec()
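To make the standardization versus normalization distinction from the Details section concrete, here is a minimal base R sketch. The helpers `standardize_sketch()` and `normalize_sketch()` are hypothetical names used only for illustration; they are not the package's own code, and they assume the sample standard deviation computed by `sd()` and an input without missing values.

```r
# Hypothetical helpers for illustration only (not the package implementation).

# Center on the mean and scale by the sample standard deviation.
standardize_sketch <- function(x) {
  (x - mean(x)) / sd(x)
}

# Rescale so the minimum maps to 0 and the maximum maps to 1.
normalize_sketch <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

x <- c(10, 20, 30, 40, 50)

z <- standardize_sketch(x)
mean(z)   # approximately 0
sd(z)     # approximately 1

n <- normalize_sketch(x)
range(n)  # lower bound 0, upper bound 1
```

Standardized values are unbounded and can be negative, while min-max normalized values always fall between 0 and 1.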

## Examples

library(dplyr)
library(timetk)

d10_daily <- m4_daily %>% dplyr::filter(id == "D10")

# --- VECTOR ----

value_std <- standardize_vec(d10_daily$value)
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
value     <- standardize_inv_vec(value_std,
                                 mean = 2261.60682492582,
                                 sd   = 175.603721730477)

# --- MUTATE ----

m4_daily %>%
    group_by(id) %>%
    mutate(value_std = standardize_vec(value))
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
#> Standardization Parameters
#> mean: 9243.15525375268
#> standard deviation: 4663.16194403596
#> Standardization Parameters
#> mean: 8259.78634615385
#> standard deviation: 927.592527167825
#> Standardization Parameters
#> mean: 8287.72878932316
#> standard deviation: 2456.05840988041
#> # A tibble: 9,743 × 4
#> # Groups:   id [4]
#>    id    date       value value_std
#>    <fct> <date>     <dbl>     <dbl>
#>  1 D10   2014-07-03 2076.     -1.06
#>  2 D10   2014-07-04 2073.     -1.07
#>  3 D10   2014-07-05 2049.     -1.21
#>  4 D10   2014-07-06 2049.     -1.21
#>  5 D10   2014-07-07 2006.     -1.45
#>  6 D10   2014-07-08 2018.     -1.39
#>  7 D10   2014-07-09 2019.     -1.38
#>  8 D10   2014-07-10 2007.     -1.45
#>  9 D10   2014-07-11 2010      -1.43
#> 10 D10   2014-07-12 2002.     -1.48
#> # ℹ 9,733 more rows
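The vector example above copies the mean and standard deviation from the printed message into standardize_inv_vec(). A possible alternative, sketched here under the assumption that the timetk package is installed and using a small made-up vector, is to compute the parameters once, pass them explicitly through the `mean` and `sd` arguments shown in Usage, and set `silent = TRUE` so nothing is printed.

```r
library(timetk)

# Made-up illustrative data, not taken from m4_daily.
x <- c(120, 135, 150, 110, 145)

# Compute the parameters once so they can be stored and reused later.
x_mean <- mean(x)
x_sd   <- sd(x)

# Standardize with explicit parameters; silent = TRUE suppresses the message.
x_std <- standardize_vec(x, mean = x_mean, sd = x_sd, silent = TRUE)

# Invert with the same parameters to return to the original scale.
x_inv <- standardize_inv_vec(x_std, mean = x_mean, sd = x_sd)

# Check that the round trip recovers the input up to floating-point tolerance.
all.equal(x, x_inv)
```

Keeping the parameters in variables avoids transcription errors and makes it straightforward to apply the same scaling to new data.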