augment_hilbert

augment_hilbert(
    data,
    date_column,
    value_column,
    reduce_memory=True,
    engine='auto',
)

Apply the Hilbert transform to specified columns of a DataFrame or DataFrameGroupBy object.

Signal Processing: The Hilbert transform is used in various signal processing techniques, including phase and amplitude modulation and demodulation, and in the analysis of signals with time-varying amplitude and frequency.

Parameters

| Name | Type | Description | Default |
|---|---|---|---|
| data | pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy | Input DataFrame or DataFrameGroupBy object with one or more columns of real-valued signals. | required |
| date_column | str | Name of the date or datetime column in 'data'. | required |
| value_column | str or list | Column name, or list of column names, in 'data' to which the Hilbert transform will be applied. | required |
| reduce_memory | bool | Whether to reduce the memory usage of the DataFrame by converting int and float columns to smaller dtypes and str columns to categorical. This saves memory on large data but may reduce float precision and will change str columns to categorical. | True |
| engine | str ("auto", "pandas", or "polars") | Backend to use for the computation. With "auto", the backend is inferred from the input data. Use "pandas" or "polars" to force a specific backend. | "auto" |
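
The trade-off described for reduce_memory can be illustrated with plain pandas. The sketch below is not pytimetk's implementation; it only assumes the kind of conversion the description names (float64 to float32, str to categorical) and uses a small made-up frame to show the effect on precision.

```python
import numpy as np
import pandas as pd

# Small made-up frame; column names mirror the walmart_sales_weekly example.
df = pd.DataFrame({
    "id": ["1_1", "1_1", "1_2", "1_2"],
    "Weekly_Sales": [24924.50, 46039.49, 41595.55, 19403.54],
})

# Assumed stand-in for the memory reduction step: downcast floats, categorize strings.
reduced = df.copy()
reduced["Weekly_Sales"] = reduced["Weekly_Sales"].astype(np.float32)
reduced["id"] = reduced["id"].astype("category")

# float32 keeps roughly 7 significant digits, so values shift slightly.
max_abs_error = (reduced["Weekly_Sales"].astype("float64") - df["Weekly_Sales"]).abs().max()

print(reduced.dtypes)
print(f"largest rounding error: {max_abs_error:.6f}")
```

Rounding of this kind is a plausible reason the pandas example output shows values such as 46039.488281 where the source data holds 46039.49. If full float64 precision matters, pass reduce_memory=False.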

Returns

| Name | Type | Description |
|---|---|---|
| df_hilbert | DataFrame | A new DataFrame with the original columns preserved and two columns added for each value column: {value_column}_hilbert_real (real part) and {value_column}_hilbert_imag (imaginary part). Matches the backend of the input data. |
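
The real and imaginary columns can be combined into an amplitude envelope and an instantaneous phase. The sketch below uses a hand-made stand-in for the returned frame (the numbers are invented, and the names value_envelope and value_phase are hypothetical) so that it runs without pytimetk; the column names value_hilbert_real and value_hilbert_imag follow the pattern shown in the example output.

```python
import numpy as np
import pandas as pd

# Stand-in for an augment_hilbert result; the values are made up for illustration.
df_hilbert = pd.DataFrame({
    "value": [3.0, 0.0, -3.0],
    "value_hilbert_real": [3.0, 0.0, -3.0],
    "value_hilbert_imag": [4.0, 5.0, 0.0],
})

real = df_hilbert["value_hilbert_real"]
imag = df_hilbert["value_hilbert_imag"]

# Magnitude of the complex value real + 1j * imag (amplitude envelope).
df_hilbert["value_envelope"] = np.hypot(real, imag)

# Angle of the complex value in radians, in the range (-pi, pi].
df_hilbert["value_phase"] = np.arctan2(imag, real)

print(df_hilbert)
```

For grouped data, the same two expressions apply row by row, since each row already carries the real and imaginary parts for its own group.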

Notes

The Hilbert transform is used in time series analysis primarily for:

  1. Creating Analytic Signals: Forms a complex-valued signal whose real part is the original signal and whose imaginary part is its Hilbert transform. The magnitude and phase of this signal describe the original signal's structure.

  2. Determining Instantaneous Phase/Frequency: Gives the phase and frequency at each time point, which is crucial for non-stationary signals whose properties change over time.

  3. Extracting Amplitude Envelope: Helps identify a signal's amplitude variations, which is useful in various analysis tasks.

  4. Enhancing Signal Analysis: Assists in tasks like demodulation, trend analysis, feature extraction for machine learning, and improving signal-to-noise ratio, providing a deeper understanding of underlying patterns and trends.
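
The uses listed above can be sketched on a synthetic signal. This example calls scipy.signal.hilbert directly rather than augment_hilbert, and whether pytimetk uses the same routine internally is not stated here. The sampling rate, carrier frequency, and modulation are arbitrary illustrative choices.

```python
import numpy as np
from scipy.signal import hilbert

fs = 200.0                     # sampling rate in Hz (illustrative)
t = np.arange(400) / fs        # 2 seconds, so every component completes whole cycles
carrier_hz = 20.0

# Amplitude-modulated cosine: a 20 Hz carrier with a slow 1 Hz envelope.
envelope_true = 1.0 + 0.5 * np.cos(2 * np.pi * 1.0 * t)
x = envelope_true * np.cos(2 * np.pi * carrier_hz * t)

# 1. Analytic signal: real part is x, imaginary part is its Hilbert transform.
analytic = hilbert(x)

# 3. Amplitude envelope: magnitude of the analytic signal.
envelope = np.abs(analytic)

# 2. Instantaneous phase (unwrapped) and frequency (phase difference scaled to Hz).
phase = np.unwrap(np.angle(analytic))
inst_freq = np.diff(phase) / (2 * np.pi) * fs

print(envelope[:3], inst_freq[:3])
```

Because this signal is exactly periodic over the sampled window, the recovered envelope matches the true one and the instantaneous frequency stays at the carrier frequency. Real series such as sales data are not periodic, so expect edge effects near the start and end of each series.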

Examples

# Example 1: Using the pandas engine on a pandas groupby object
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])


df_hilbert = (
    df
        .groupby('id')
        .augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
            engine = 'pandas'
        )
)

df_hilbert.head()

| | id | Store | Dept | Date | Weekly_Sales | IsHoliday | Type | Size | Temperature | Fuel_Price | MarkDown1 | MarkDown2 | MarkDown3 | MarkDown4 | MarkDown5 | CPI | Unemployment | Weekly_Sales_hilbert_real | Weekly_Sales_hilbert_imag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1_1 | 1 | 1 | 2010-02-05 | 24924.500000 | 0 | A | 151315 | 42.310001 | 2.572 | NaN | NaN | NaN | NaN | NaN | 211.096359 | 8.106 | 24924.498047 | -12764.086914 |
| 1 | 1_1 | 1 | 1 | 2010-02-12 | 46039.488281 | 1 | A | 151315 | 38.509998 | 2.548 | NaN | NaN | NaN | NaN | NaN | 211.242172 | 8.106 | 46039.488281 | -13469.210938 |
| 2 | 1_1 | 1 | 1 | 2010-02-19 | 41595.550781 | 0 | A | 151315 | 39.930000 | 2.514 | NaN | NaN | NaN | NaN | NaN | 211.289139 | 8.106 | 41595.550781 | 16686.888672 |
| 3 | 1_1 | 1 | 1 | 2010-02-26 | 19403.539062 | 0 | A | 151315 | 46.630001 | 2.561 | NaN | NaN | NaN | NaN | NaN | 211.319641 | 8.106 | 19403.535156 | 9378.224609 |
| 4 | 1_1 | 1 | 1 | 2010-03-05 | 21827.900391 | 0 | A | 151315 | 46.500000 | 2.625 | NaN | NaN | NaN | NaN | NaN | 211.350143 | 8.106 | 21827.898438 | 2552.131836 |

# Example 2: Using the polars accessor on a grouped table
import pytimetk as tk
import polars as pl


df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])
df_hilbert = (
    pl.from_pandas(df)
        .group_by('id')
        .tk.augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
        )
)

df_hilbert.head()

shape: (5, 19)

| id | Store | Dept | Date | Weekly_Sales | IsHoliday | Type | Size | Temperature | Fuel_Price | MarkDown1 | MarkDown2 | MarkDown3 | MarkDown4 | MarkDown5 | CPI | Unemployment | Weekly_Sales_hilbert_real | Weekly_Sales_hilbert_imag |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| str | i64 | i64 | datetime[ns] | f64 | bool | str | i64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 | f64 |
| "1_1" | 1 | 1 | 2010-02-05 00:00:00 | 24924.5 | false | "A" | 151315 | 42.31 | 2.572 | null | null | null | null | null | 211.096358 | 8.106 | 24924.5 | -12764.088716 |
| "1_1" | 1 | 1 | 2010-02-12 00:00:00 | 46039.49 | true | "A" | 151315 | 38.51 | 2.548 | null | null | null | null | null | 211.24217 | 8.106 | 46039.49 | -13469.209506 |
| "1_1" | 1 | 1 | 2010-02-19 00:00:00 | 41595.55 | false | "A" | 151315 | 39.93 | 2.514 | null | null | null | null | null | 211.289143 | 8.106 | 41595.55 | 16686.889086 |
| "1_1" | 1 | 1 | 2010-02-26 00:00:00 | 19403.54 | false | "A" | 151315 | 46.63 | 2.561 | null | null | null | null | null | 211.319643 | 8.106 | 19403.54 | 9378.223595 |
| "1_1" | 1 | 1 | 2010-03-05 00:00:00 | 21827.9 | false | "A" | 151315 | 46.5 | 2.625 | null | null | null | null | null | 211.350143 | 8.106 | 21827.9 | 2552.133165 |

# Example 3: Using the polars accessor on a DataFrame
import pytimetk as tk
import polars as pl


df = tk.load_dataset('taylor_30_min', parse_dates=['date'])
df_hilbert = (
    pl.from_pandas(df)
        .tk.augment_hilbert(
            date_column = 'date',
            value_column = ['value'],
        )
)

df_hilbert.head()

shape: (5, 4)

| date | value | value_hilbert_real | value_hilbert_imag |
|---|---|---|---|
| datetime[ns, UTC] | i64 | f64 | f64 |
| 2000-06-05 00:00:00 UTC | 22262 | 22262.0 | -1269.805182 |
| 2000-06-05 00:30:00 UTC | 21756 | 21756.0 | -2755.227462 |
| 2000-06-05 01:00:00 UTC | 22247 | 22247.0 | -4077.813213 |
| 2000-06-05 01:30:00 UTC | 22759 | 22759.0 | -4404.573458 |
| 2000-06-05 02:00:00 UTC | 22549 | 22549.0 | -4629.981251 |