
Normalization is commonly used to rescale numeric features to a common range, which prevents any one feature from dominating in algorithms that require data to be on the same scale.

Usage

normalize_vec(x, min = NULL, max = NULL, silent = FALSE)

normalize_inv_vec(x, min, max)

Arguments

x

A numeric vector.

min

The population minimum used in the normalization. In normalize_vec(), if NULL (the default), it is computed automatically from x.

max

The population maximum used in the normalization. In normalize_vec(), if NULL (the default), it is computed automatically from x.

silent

Logical. If FALSE (the default), the automatically computed min and max parameters are reported as a message; set to TRUE to suppress the message.

Value

A numeric vector with the transformation applied.

Details

Standardization vs Normalization

  • Standardization refers to a transformation that rescales the data to mean 0 and standard deviation 1

  • Normalization refers to a transformation that rescales the data to the min-max range [0, 1]
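
The difference between the two transformations can be sketched in base R. This is a minimal illustration that assumes the conventional min-max formula, (x - min) / (max - min); the helpers minmax_sketch() and minmax_inv_sketch() are hypothetical names used only here and are not the package's implementation.

```r
# Hypothetical helpers for illustration only (not part of the package).

# Min-max normalization: maps lo to 0 and hi to 1.
# Assumption: lo/hi default to the sample min/max when not supplied.
minmax_sketch <- function(x, lo = min(x, na.rm = TRUE), hi = max(x, na.rm = TRUE)) {
  (x - lo) / (hi - lo)
}

# Inverse of the min-max normalization: needs the same lo/hi that were used forward.
minmax_inv_sketch <- function(x_norm, lo, hi) {
  x_norm * (hi - lo) + lo
}

# Standardization, for contrast: subtract the mean, divide by the standard deviation.
standardize_sketch <- function(x) {
  (x - mean(x, na.rm = TRUE)) / stats::sd(x, na.rm = TRUE)
}

x <- c(10, 15, 20)

x_norm <- minmax_sketch(x)                              # 0.0 0.5 1.0
x_back <- minmax_inv_sketch(x_norm, lo = 10, hi = 20)   # 10 15 20
x_std  <- standardize_sketch(x)                         # -1 0 1
```

The inverse only recovers the original values when it is given the same min and max used in the forward step, which is why normalize_inv_vec() requires both arguments.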

Examples

library(dplyr)
library(timetk)  # provides m4_daily, normalize_vec(), and normalize_inv_vec()

d10_daily <- m4_daily %>% dplyr::filter(id == "D10")

# --- VECTOR ----

value_norm <- normalize_vec(d10_daily$value)
#> Normalization Parameters
#> min: 1781.6
#> max: 2649.3
# Invert the transformation using the parameters reported above
value      <- normalize_inv_vec(value_norm,
                                min = 1781.6,
                                max = 2649.3)

# --- MUTATE ----

# Normalize within each group; parameters are reported once per id
m4_daily %>%
    group_by(id) %>%
    mutate(value_norm = normalize_vec(value))
#> Normalization Parameters
#> min: 1781.6
#> max: 2649.3
#> Normalization Parameters
#> min: 1734.9
#> max: 19432.5
#> Normalization Parameters
#> min: 6309.38
#> max: 9540.62
#> Normalization Parameters
#> min: 4172.1
#> max: 14954.1
#> # A tibble: 9,743 × 4
#> # Groups:   id [4]
#>    id    date       value value_norm
#>    <fct> <date>     <dbl>      <dbl>
#>  1 D10   2014-07-03 2076.      0.340
#>  2 D10   2014-07-04 2073.      0.336
#>  3 D10   2014-07-05 2049.      0.308
#>  4 D10   2014-07-06 2049.      0.308
#>  5 D10   2014-07-07 2006.      0.259
#>  6 D10   2014-07-08 2018.      0.272
#>  7 D10   2014-07-09 2019.      0.274
#>  8 D10   2014-07-10 2007.      0.260
#>  9 D10   2014-07-11 2010       0.263
#> 10 D10   2014-07-12 2002.      0.253
#> # ℹ 9,733 more rows