Standardize to Mean 0, Standard Deviation 1 (Center & Scale)
Source: R/vec-standardize.R
Standardization is commonly used to center and scale numeric features so that no single feature dominates in algorithms that require data to be on the same scale.
Arguments
- x
A numeric vector.
- mean
The mean used to invert the standardization process.
- sd
The standard deviation used to invert the standardization process.
- silent
Whether or not to report the automated mean and sd parameters as a message.
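The role of the mean and sd arguments in inverting the transformation can be illustrated with a minimal base R sketch. It assumes the transformation is (x - mean) / sd with the sample standard deviation from sd(); the vector x and the names mu, sigma, x_std, and x_back are illustrative only, and this is not necessarily how the package implements it internally.

```r
# Base R sketch of standardization and its inverse (illustrative values only)
x <- c(10, 12, 15, 20, 23)

mu    <- mean(x)  # center
sigma <- sd(x)    # scale (sample standard deviation; an assumption)

# Forward transformation: center, then scale
x_std <- (x - mu) / sigma

# Inverse transformation: needs the same mean and sd used above
x_back <- x_std * sigma + mu

# The round trip recovers the original values up to floating-point error
stopifnot(isTRUE(all.equal(x, x_back)))
```

Because the inverse depends on the exact parameters of the forward step, it helps to store the reported mean and standard deviation rather than retype them by hand.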
Details
Standardization vs Normalization
Standardization refers to a transformation that centers and scales the values to mean 0 and standard deviation 1.
Normalization refers to a transformation that rescales the values to the min-max range of 0 to 1.
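The difference between the two transformations can be seen side by side in a small base R sketch. The formulas below are the conventional definitions (z-score and min-max scaling) and are an assumption about what each term means here; the vector x and the result names are illustrative only.

```r
# Illustrative input; any numeric vector with more than one distinct value works
x <- c(2, 4, 6, 8, 10)

# Standardization: subtract the mean, divide by the sample standard deviation
x_standardized <- (x - mean(x)) / sd(x)

# Normalization (min-max): subtract the minimum, divide by the range
x_normalized <- (x - min(x)) / (max(x) - min(x))

# Standardized values have mean 0 and standard deviation 1 (up to rounding)
mean(x_standardized)
sd(x_standardized)

# Normalized values have minimum 0 and maximum 1
range(x_normalized)
```

Standardized values are unbounded and can be negative, whereas normalized values always fall between 0 and 1.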
See also
Normalization/Standardization: standardize_vec(), normalize_vec()
Box Cox Transformation: box_cox_vec()
Lag Transformation: lag_vec()
Differencing Transformation: diff_vec()
Rolling Window Transformation: slidify_vec()
Loess Smoothing Transformation: smooth_vec()
Fourier Series: fourier_vec()
Missing Value Imputation for Time Series: ts_impute_vec(), ts_clean_vec()
Examples
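library(dplyr)
library(timetk)  # provides m4_daily, standardize_vec(), standardize_inv_vec()
# Keep a single series (id "D10") from the m4_daily dataset
d10_daily <- m4_daily %>% dplyr::filter(id == "D10")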
# --- VECTOR ----
value_std <- standardize_vec(d10_daily$value)
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
value <- standardize_inv_vec(value_std,
                             mean = 2261.60682492582,
                             sd   = 175.603721730477)
# --- MUTATE ----
m4_daily %>%
    group_by(id) %>%
    mutate(value_std = standardize_vec(value))
#> Standardization Parameters
#> mean: 2261.60682492582
#> standard deviation: 175.603721730477
#> Standardization Parameters
#> mean: 9243.15525375268
#> standard deviation: 4663.16194403596
#> Standardization Parameters
#> mean: 8259.78634615385
#> standard deviation: 927.592527167825
#> Standardization Parameters
#> mean: 8287.72878932316
#> standard deviation: 2456.05840988041
#> # A tibble: 9,743 × 4
#> # Groups:   id [4]
#>    id    date       value value_std
#>    <fct> <date>     <dbl>     <dbl>
#>  1 D10   2014-07-03 2076.     -1.06
#>  2 D10   2014-07-04 2073.     -1.07
#>  3 D10   2014-07-05 2049.     -1.21
#>  4 D10   2014-07-06 2049.     -1.21
#>  5 D10   2014-07-07 2006.     -1.45
#>  6 D10   2014-07-08 2018.     -1.39
#>  7 D10   2014-07-09 2019.     -1.38
#>  8 D10   2014-07-10 2007.     -1.45
#>  9 D10   2014-07-11 2010      -1.43
#> 10 D10   2014-07-12 2002.     -1.48
#> # ℹ 9,733 more rows