augment_ewma_volatility

augment_ewma_volatility(
    data,
    date_column,
    close_column,
    decay_factor=0.94,
    window=20,
    reduce_memory=False,
    engine='auto',
)

Calculate Exponentially Weighted Moving Average (EWMA) volatility for a financial time series.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | DataFrame or GroupBy (pandas or polars) | Input time-series data. Grouped inputs are processed per group before the indicator is appended. | required |
| date_column | str or ColumnSelector | Column name or selector containing dates or timestamps. | required |
| close_column | str, ColumnSelector, or list | Column(s) with closing prices used to calculate volatility. Must resolve to a single column. | required |
| decay_factor | float | Smoothing factor (lambda) for the EWMA, between 0 and 1. Higher values give more weight to past data and less to the most recent return. Default is 0.94 (RiskMetrics standard). | 0.94 |
| window | Union[int, Tuple[int, int], List[int]] | Size of the rolling window used to initialize the EWMA calculation. For each window value, the EWMA volatility is only computed once at least that many observations are available. Provide a single integer or multiple values (via tuple or list). Default is 20. | 20 |
| reduce_memory | bool | If True, reduces memory usage before calculation. Default is False. | False |
| engine | {"auto", "pandas", "polars", "cudf"} | Execution engine. "auto" (default) infers the backend from the input data while allowing explicit overrides. | "auto" |
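To get a feel for decay_factor: if the recursion given in the Notes is unrolled, the squared return from k periods back carries weight (1 - λ) λ^k. The sketch below computes those weights and the resulting half-life in plain Python. The helper names ewma_weights and ewma_half_life are hypothetical and not part of pytimetk.

```python
import math


def ewma_weights(decay_factor: float, n_lags: int) -> list[float]:
    """Weight on the squared return k periods back, for k = 0 .. n_lags - 1."""
    return [(1.0 - decay_factor) * decay_factor**k for k in range(n_lags)]


def ewma_half_life(decay_factor: float) -> float:
    """Number of periods after which a squared return's weight has halved."""
    return math.log(0.5) / math.log(decay_factor)


weights = ewma_weights(0.94, 3)
half_life = ewma_half_life(0.94)

print([round(w, 4) for w in weights])  # [0.06, 0.0564, 0.053]
print(round(half_life, 1))             # 11.2
```

With the default of 0.94, a daily return loses half of its influence after roughly 11 trading days; a value closer to 1 stretches that memory out.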

Returns

| Name | Type | Description |
| --- | --- | --- |
| | DataFrame | DataFrame with one added column per window value, named {close_column}_ewma_vol_{window}_{decay_factor}. Each column holds the EWMA volatility, calculated using a minimum number of periods equal to that window. |
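Judging by the column names in the example output further down (for instance close_ewma_vol_20_0.94), the added columns follow an underscore-separated pattern. A small sketch for building the expected names, for example to select the new columns afterwards. The helper ewma_column_names is hypothetical, and treating a tuple as an explicit list of window values is an assumption.

```python
def ewma_column_names(close_column, window, decay_factor):
    """Build the expected output column names (hypothetical helper).

    Assumption: a tuple or list is read as explicit window values.
    """
    windows = [window] if isinstance(window, int) else list(window)
    return [f"{close_column}_ewma_vol_{w}_{decay_factor}" for w in windows]


names = ewma_column_names("close", [20, 50], 0.94)
print(names)  # ['close_ewma_vol_20_0.94', 'close_ewma_vol_50_0.94']
```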

Notes

EWMA volatility emphasizes recent price movements and is computed recursively as:

σ²_t = (1 - λ) * r²_t + λ * σ²_{t-1}

where r_t is the log return at time t and λ is the decay_factor. Setting min_periods to the provided window value ensures that the EWMA volatility is reported only after enough observations have accumulated; earlier rows are left as missing values.
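The recursion can be traced by hand with a minimal pure-Python sketch. This is not the pytimetk implementation: seeding the variance with the first squared return, taking the square root to turn variance into volatility, and masking rows with None until window returns have been seen are all assumptions made for illustration, and the function name ewma_volatility is hypothetical.

```python
import math


def ewma_volatility(closes, decay_factor=0.94, window=20):
    """Sketch of the EWMA recursion on log returns (assumptions noted above)."""
    # Log returns r_t = ln(P_t / P_{t-1}); the first price has no return.
    returns = [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]

    vols = [None]  # no return, hence no volatility, for the first price
    variance = None
    for n_obs, r in enumerate(returns, start=1):
        if variance is None:
            variance = r * r  # assumed seed: first squared return
        else:
            variance = (1.0 - decay_factor) * r * r + decay_factor * variance
        # Report a value only once `window` returns have accumulated.
        vols.append(math.sqrt(variance) if n_obs >= window else None)
    return vols


prices = [100.0, 101.0, 99.5, 100.2, 102.0, 101.1]
vols = ewma_volatility(prices, decay_factor=0.94, window=3)
print(vols)  # three leading None values, then three positive floats
```

The leading None values play the same role as the nan / None entries at the start of the volatility columns in the examples below.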

References:

  • https://www.investopedia.com/articles/07/ewma.asp

Examples

import pandas as pd
import polars as pl
import pytimetk as tk

df = tk.load_dataset("stocks_daily", parse_dates=["date"])

df
symbol date open high low close volume adjusted
0 META 2013-01-02 27.440001 28.180000 27.420000 28.000000 69846400 28.000000
1 META 2013-01-03 27.879999 28.469999 27.590000 27.770000 63140600 27.770000
2 META 2013-01-04 28.010000 28.930000 27.830000 28.760000 72715400 28.760000
3 META 2013-01-07 28.690001 29.790001 28.650000 29.420000 83781800 29.420000
4 META 2013-01-08 29.510000 29.600000 28.860001 29.059999 45871300 29.059999
... ... ... ... ... ... ... ... ...
16189 GOOG 2023-09-15 138.800003 139.360001 137.179993 138.300003 48947600 138.300003
16190 GOOG 2023-09-18 137.630005 139.929993 137.630005 138.960007 16233600 138.960007
16191 GOOG 2023-09-19 138.250000 139.175003 137.500000 138.830002 15479100 138.830002
16192 GOOG 2023-09-20 138.830002 138.839996 134.520004 134.589996 21473500 134.589996
16193 GOOG 2023-09-21 132.389999 133.190002 131.089996 131.360001 22042700 131.360001

16194 rows × 8 columns

# EWMA Volatility - single stock (pandas)
ewma_single = (
    df
    .query("symbol == 'AAPL'")
    .augment_ewma_volatility(
        date_column="date",
        close_column="close",
        decay_factor=0.94,
        window=[20, 50],
    )
)

ewma_single.glimpse()
<class 'pandas.core.frame.DataFrame'>: 2699 rows of 10 columns
symbol:                  object            ['AAPL', 'AAPL', 'AAPL', 'AAP ...
date:                    datetime64[ns]    [Timestamp('2013-01-02 00:00: ...
open:                    float64           [19.779285430908203, 19.56714 ...
high:                    float64           [19.821428298950195, 19.63107 ...
low:                     float64           [19.343929290771484, 19.32142 ...
close:                   float64           [19.608213424682617, 19.36071 ...
volume:                  int64             [560518000, 352965200, 594333 ...
adjusted:                float64           [16.791179656982422, 16.57924 ...
close_ewma_vol_20_0.94:  float64           [nan, nan, nan, nan, nan, nan ...
close_ewma_vol_50_0.94:  float64           [nan, nan, nan, nan, nan, nan ...
# EWMA Volatility - grouped pandas engine
ewma_grouped = (
    df
    .groupby("symbol")
    .augment_ewma_volatility(
        date_column="date",
        close_column="close",
        decay_factor=0.94,
        window=[20, 50],
    )
)

ewma_grouped.glimpse()
<class 'pandas.core.frame.DataFrame'>: 16194 rows of 10 columns
symbol:                  object            ['META', 'META', 'META', 'MET ...
date:                    datetime64[ns]    [Timestamp('2013-01-02 00:00: ...
open:                    float64           [27.440000534057617, 27.87999 ...
high:                    float64           [28.18000030517578, 28.469999 ...
low:                     float64           [27.420000076293945, 27.59000 ...
close:                   float64           [28.0, 27.770000457763672, 28 ...
volume:                  int64             [69846400, 63140600, 72715400 ...
adjusted:                float64           [28.0, 27.770000457763672, 28 ...
close_ewma_vol_20_0.94:  float64           [nan, nan, nan, nan, nan, nan ...
close_ewma_vol_50_0.94:  float64           [nan, nan, nan, nan, nan, nan ...
# EWMA Volatility - polars engine
pl_single = pl.from_pandas(df.query("symbol == 'AAPL'"))
ewma_polars = (
    pl_single
    .tk.augment_ewma_volatility(
        date_column="date",
        close_column="close",
        decay_factor=0.94,
        window=[20, 50],
    )
)

ewma_polars.glimpse()
Rows: 2699
Columns: 10
$ symbol                          <str> 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL', 'AAPL'
$ date                   <datetime[ns]> 2013-01-02 00:00:00, 2013-01-03 00:00:00, 2013-01-04 00:00:00, 2013-01-07 00:00:00, 2013-01-08 00:00:00, 2013-01-09 00:00:00, 2013-01-10 00:00:00, 2013-01-11 00:00:00, 2013-01-14 00:00:00, 2013-01-15 00:00:00
$ open                            <f64> 19.779285430908203, 19.567142486572266, 19.177499771118164, 18.64285659790039, 18.90035629272461, 18.66071319580078, 18.876785278320312, 18.60714340209961, 17.952856063842773, 17.796428680419922
$ high                            <f64> 19.821428298950195, 19.63107109069824, 19.236785888671875, 18.9035701751709, 18.996070861816406, 18.750356674194336, 18.882856369018555, 18.761428833007812, 18.125, 17.82107162475586
$ low                             <f64> 19.343929290771484, 19.321428298950195, 18.77964210510254, 18.399999618530273, 18.616071701049805, 18.428213119506836, 18.41142845153809, 18.53642845153809, 17.80392837524414, 17.26357078552246
$ close                           <f64> 19.608213424682617, 19.36071395874023, 18.821428298950195, 18.71071434020996, 18.761070251464844, 18.467857360839844, 18.696786880493164, 18.582143783569336, 17.91964340209961, 17.354286193847656
$ volume                          <i64> 560518000, 352965200, 594333600, 484156400, 458707200, 407604400, 601146000, 350506800, 734207600, 876772400
$ adjusted                        <f64> 16.791179656982422, 16.579240798950195, 16.1174373626709, 16.02262306213379, 16.065746307373047, 15.814659118652344, 16.010698318481445, 15.912524223327637, 15.345203399658203, 14.86106777191162
$ close_ewma_vol_20_0.94          <f64> None, None, None, None, None, None, None, None, None, None
$ close_ewma_vol_50_0.94          <f64> None, None, None, None, None, None, None, None, None, None
# EWMA Volatility - polars grouped
pl_df_full = pl.from_pandas(df)
ewma_polars_grouped = (
    pl_df_full
    .group_by("symbol")
    .tk.augment_ewma_volatility(
        date_column="date",
        close_column="close",
        decay_factor=0.94,
        window=[20, 50],
    )
)

ewma_polars_grouped.glimpse()
Rows: 16194
Columns: 10
$ symbol                          <str> 'META', 'META', 'META', 'META', 'META', 'META', 'META', 'META', 'META', 'META'
$ date                   <datetime[ns]> 2013-01-02 00:00:00, 2013-01-03 00:00:00, 2013-01-04 00:00:00, 2013-01-07 00:00:00, 2013-01-08 00:00:00, 2013-01-09 00:00:00, 2013-01-10 00:00:00, 2013-01-11 00:00:00, 2013-01-14 00:00:00, 2013-01-15 00:00:00
$ open                            <f64> 27.440000534057617, 27.8799991607666, 28.010000228881836, 28.690000534057617, 29.510000228881836, 29.670000076293945, 30.600000381469727, 31.280000686645508, 32.08000183105469, 30.63999938964844
$ high                            <f64> 28.18000030517578, 28.469999313354492, 28.93000030517578, 29.790000915527344, 29.600000381469727, 30.600000381469727, 31.450000762939453, 31.959999084472656, 32.209999084472656, 31.709999084472656
$ low                             <f64> 27.420000076293945, 27.59000015258789, 27.829999923706055, 28.649999618530273, 28.86000061035156, 29.489999771118164, 30.280000686645508, 31.100000381469727, 30.6200008392334, 29.8799991607666
$ close                           <f64> 28.0, 27.770000457763672, 28.760000228881836, 29.420000076293945, 29.059999465942383, 30.59000015258789, 31.299999237060547, 31.719999313354492, 30.950000762939453, 30.100000381469727
$ volume                          <i64> 69846400, 63140600, 72715400, 83781800, 45871300, 104787700, 95316400, 89598000, 98892800, 173242600
$ adjusted                        <f64> 28.0, 27.770000457763672, 28.760000228881836, 29.420000076293945, 29.059999465942383, 30.59000015258789, 31.299999237060547, 31.719999313354492, 30.950000762939453, 30.100000381469727
$ close_ewma_vol_20_0.94          <f64> None, None, None, None, None, None, None, None, None, None
$ close_ewma_vol_50_0.94          <f64> None, None, None, None, None, None, None, None, None, None
# EWMA Volatility - column selectors (pandas)
from pytimetk.utils.selection import contains

selector_df = (
    df
    .augment_ewma_volatility(
        date_column=contains("dat"),
        close_column=contains("clos"),
        window=20,
    )
)

selector_df.glimpse()
<class 'pandas.core.frame.DataFrame'>: 16194 rows of 9 columns
symbol:                  object            ['META', 'META', 'META', 'MET ...
date:                    datetime64[ns]    [Timestamp('2013-01-02 00:00: ...
open:                    float64           [27.440000534057617, 27.87999 ...
high:                    float64           [28.18000030517578, 28.469999 ...
low:                     float64           [27.420000076293945, 27.59000 ...
close:                   float64           [28.0, 27.770000457763672, 28 ...
volume:                  int64             [69846400, 63140600, 72715400 ...
adjusted:                float64           [28.0, 27.770000457763672, 28 ...
close_ewma_vol_20_0.94:  float64           [nan, nan, nan, 0.77951049752 ...