
Standardize to Mean 0, Standard Deviation 1 (Center & Scale)
Source: R/vec-standardize.R (standardize_vec.Rd)
Standardization is commonly used to center and scale numeric features so that no single feature dominates in algorithms that require data to be on the same scale.
Details
Standardization vs Normalization
Standardization refers to a transformation that centers and rescales the data to mean 0 and standard deviation 1
Normalization refers to a transformation that rescales the data to the min-max range of 0 to 1
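The two definitions above can be sketched in base R. This is only an illustration of the arithmetic: the helper names `standardize_manual()` and `normalize_manual()` are hypothetical, are not part of the package, and are not claimed to match how `standardize_vec()` or `normalize_vec()` are implemented.

```r
# Hypothetical helpers illustrating the definitions (not the package's code).

# Standardization: subtract the mean, divide by the standard deviation.
standardize_manual <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Normalization: subtract the minimum, divide by the min-max range.
normalize_manual <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

x <- c(10, 20, 30, 40, 50)

z <- standardize_manual(x)  # centered on 0, spread of 1
n <- normalize_manual(x)    # squeezed into the 0-to-1 range

mean(z)   # approximately 0
sd(z)     # approximately 1
range(n)  # minimum 0, maximum 1
```

Standardized values are unbounded and keep their relative distances in units of standard deviation, whereas normalized values are bounded by the observed minimum and maximum.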
See also
Normalization/Standardization: standardize_vec(), normalize_vec()
Box Cox Transformation: box_cox_vec()
Lag Transformation: lag_vec()
Differencing Transformation: diff_vec()
Rolling Window Transformation: slidify_vec()
Loess Smoothing Transformation: smooth_vec()
Fourier Series: fourier_vec()
Missing Value Imputation for Time Series: ts_impute_vec(), ts_clean_vec()
Examples
library(dplyr)
library(timetk)  # provides m4_daily, standardize_vec(), and standardize_inv_vec()
d10_daily <- m4_daily %>% dplyr::filter(id == "D10")
# --- VECTOR ----
value_std <- standardize_vec(d10_daily$value)
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
# Invert the transformation using the parameters reported above
value <- standardize_inv_vec(
  value_std,
  mean = 2261.60682492582,
  sd   = 175.603721730477
)
# --- MUTATE ----
m4_daily %>%
  group_by(id) %>%
  mutate(value_std = standardize_vec(value))
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
#> Standardization Parameters
#> mean: 9243.15525375268
#> standard deviation: 4663.16194403596
#> Standardization Parameters
#> mean: 8259.78634615385
#> standard deviation: 927.592527167825
#> Standardization Parameters
#> mean: 8287.72878932316
#> standard deviation: 2456.05840988041
#> # A tibble: 9,743 × 4
#> # Groups:   id [4]
#>    id    date       value value_std
#>    <fct> <date>     <dbl>     <dbl>
#>  1 D10   2014-07-03 2076.     -1.06
#>  2 D10   2014-07-04 2073.     -1.07
#>  3 D10   2014-07-05 2049.     -1.21
#>  4 D10   2014-07-06 2049.     -1.21
#>  5 D10   2014-07-07 2006.     -1.45
#>  6 D10   2014-07-08 2018.     -1.39
#>  7 D10   2014-07-09 2019.     -1.38
#>  8 D10   2014-07-10 2007.     -1.45
#>  9 D10   2014-07-11 2010      -1.43
#> 10 D10   2014-07-12 2002.     -1.48
#> # ℹ 9,733 more rows
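The vector example inverts the transformation by passing the reported mean and standard deviation back in. A minimal base R sketch of that round trip follows, assuming the inverse is simply "multiply by the standard deviation, then add the mean". That arithmetic is an assumption inferred from the `mean` and `sd` arguments, not a reading of the package source, and the data and variable names are illustrative only.

```r
# Base R round trip: standardize, then invert with the saved parameters.
x <- c(2076, 2073, 2049, 2006, 2018)  # illustrative values only

m <- mean(x)  # parameters that must be kept to invert later
s <- sd(x)

z      <- (x - m) / s  # forward transformation
x_back <- z * s + m    # assumed inverse transformation

# The recovered values should match the originals up to floating-point tolerance.
all.equal(x_back, x)
```

Because the inverse depends on the exact mean and standard deviation used in the forward step, grouped transformations need one saved parameter pair per group.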