
Standardization is commonly used to center and scale numeric features so that no single feature dominates in algorithms that require data to be on the same scale.


standardize_vec(x, mean = NULL, sd = NULL, silent = FALSE)

standardize_inv_vec(x, mean, sd)



x
A numeric vector.


mean
The mean used to invert the standardization.


sd
The standard deviation used to invert the standardization process.


silent
Whether or not to report the automated mean and sd parameters as a message.


Returns a numeric vector with the standardization transformation applied.
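
As a rough sketch of the arithmetic involved, the base R snippet below centers and scales a small vector and then inverts the result. It assumes the transformation is (x - mean) / sd using the sample standard deviation from stats::sd(), and that missing parameters are estimated from the input. The helper names standardize_sketch and standardize_inv_sketch are hypothetical and are not part of the package.

```r
# Hypothetical helpers that illustrate the arithmetic only;
# they are not the package's implementation.
standardize_sketch <- function(x, mean = NULL, sd = NULL) {
    # Assumption: parameters left as NULL are estimated from x itself
    if (is.null(mean)) mean <- base::mean(x, na.rm = TRUE)
    if (is.null(sd))   sd   <- stats::sd(x, na.rm = TRUE)
    (x - mean) / sd
}

standardize_inv_sketch <- function(x, mean, sd) {
    # Undo the centering and scaling with the same parameters
    x * sd + mean
}

x     <- c(2, 4, 6, 8, 10)
x_std <- standardize_sketch(x)

# Inverting requires the same mean and sd that were used going forward
x_inv <- standardize_inv_sketch(x_std, mean = mean(x), sd = sd(x))
```

The round trip only recovers the original values when the inverse receives exactly the parameters used in the forward step, which is why those parameters are worth recording.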


Standardization vs Normalization

  • Standardization refers to a transformation that rescales the data to mean 0 and standard deviation 1

  • Normalization refers to a transformation that rescales the data to the min-max range: (0, 1)
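
To make the distinction concrete, the following base R sketch applies both transformations to the same small vector. The formulas are the usual textbook definitions and are written out by hand here; the variable names are illustrative only.

```r
x <- c(10, 20, 30, 40, 50)

# Standardization: center on the mean, scale by the standard deviation
x_standardized <- (x - mean(x)) / sd(x)

# Min-max normalization: shift by the minimum, scale by the range
x_normalized <- (x - min(x)) / (max(x) - min(x))

x_normalized
#> [1] 0.00 0.25 0.50 0.75 1.00
```

The standardized values have mean 0 and standard deviation 1 but are not bounded, whereas the normalized values always run from 0 at the minimum to 1 at the maximum.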




library(dplyr)
library(timetk)

d10_daily <- m4_daily %>% dplyr::filter(id == "D10")

# --- VECTOR ----

value_std <- standardize_vec(d10_daily$value)
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
value     <- standardize_inv_vec(value_std,
                                 mean = 2261.60682492582,
                                 sd   = 175.603721730477)

# --- MUTATE ----

m4_daily %>%
    group_by(id) %>%
    mutate(value_std = standardize_vec(value))
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
#> Standardization Parameters
#> mean: 9243.15525375268
#> standard deviation: 4663.16194403596
#> Standardization Parameters
#> mean: 8259.78634615385
#> standard deviation: 927.592527167825
#> Standardization Parameters
#> mean: 8287.72878932316
#> standard deviation: 2456.05840988041
#> # A tibble: 9,743 × 4
#> # Groups:   id [4]
#>    id    date       value value_std
#>    <fct> <date>     <dbl>     <dbl>
#>  1 D10   2014-07-03 2076.     -1.06
#>  2 D10   2014-07-04 2073.     -1.07
#>  3 D10   2014-07-05 2049.     -1.21
#>  4 D10   2014-07-06 2049.     -1.21
#>  5 D10   2014-07-07 2006.     -1.45
#>  6 D10   2014-07-08 2018.     -1.39
#>  7 D10   2014-07-09 2019.     -1.38
#>  8 D10   2014-07-10 2007.     -1.45
#>  9 D10   2014-07-11 2010      -1.43
#> 10 D10   2014-07-12 2002.     -1.48
#> # ℹ 9,733 more rows