augment_hilbert

augment_hilbert(data, date_column, value_column, reduce_memory=True, engine='pandas')

Apply the Hilbert transform to specified columns of a DataFrame or DataFrameGroupBy object.

Signal Processing: The Hilbert transform is used in various signal processing techniques, including phase and amplitude modulation and demodulation, and in the analysis of signals with time-varying amplitude and frequency.

Parameters

Name Type Description Default
data pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy Input DataFrame or DataFrameGroupBy object with one or more columns of real-valued signals. required
date_column str Name of the column in ‘data’ that contains the dates or timestamps of the signal. required
value_column str or list Name of a column, or list of column names, in ‘data’ to which the Hilbert transform will be applied. required
reduce_memory bool Whether to reduce the memory usage of the DataFrame by downcasting int and float columns to smaller dtypes and converting str columns to categorical. This saves memory on large data but may reduce float precision and will change str columns to categorical. True
engine str The engine used to compute the transform, either “pandas” or “polars”. With “polars”, the function uses the polars library internally, which can be faster than “pandas” on large datasets. 'pandas'

Returns

Type Description
pd.DataFrame A new DataFrame with two columns added for each column in value_column: {value_column}_hilbert_real (the real part) and {value_column}_hilbert_imag (the imaginary part). The original columns are preserved.
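
Because the two added columns hold the real and imaginary parts of a complex signal, the amplitude envelope and instantaneous phase can be derived from them directly. The following is a minimal sketch on a small hand-built DataFrame whose values are copied from the Example 3 output in this document. The output column names `value_envelope` and `value_phase` are hypothetical and are not produced by `augment_hilbert`.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for an augment_hilbert result (values copied from the
# Example 3 output; only the two added columns are reproduced here).
df_hilbert = pd.DataFrame({
    "value_hilbert_real": [22262.0, 21756.0, 22247.0],
    "value_hilbert_imag": [-1269.805176, -2755.227539, -4077.813232],
})

real = df_hilbert["value_hilbert_real"]
imag = df_hilbert["value_hilbert_imag"]

# Magnitude of the complex value real + i*imag (hypothetical column name).
df_hilbert["value_envelope"] = np.hypot(real, imag)

# Angle of the complex value in radians, in (-pi, pi] (hypothetical column name).
df_hilbert["value_phase"] = np.arctan2(imag, real)
```

For grouped data, the same arithmetic applies row by row, since each row already carries the real and imaginary parts computed for its group.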

Notes

The Hilbert transform is used in time series analysis primarily for:

  1. Creating Analytic Signals: Forms a complex-valued signal whose properties (magnitude and phase) provide valuable insights into the original signal’s structure.

  2. Determining Instantaneous Phase/Frequency: Gives the signal’s phase and frequency at each point in time, which is crucial for non-stationary signals whose properties change over time.

  3. Extracting Amplitude Envelope: Helps identify variations in the signal’s amplitude over time, which is useful in many analysis tasks.

  4. Enhancing Signal Analysis: Assists in tasks like demodulation, trend analysis, feature extraction for machine learning, and improving signal-to-noise ratio, providing a deeper understanding of underlying patterns and trends.
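
The first three uses above can be illustrated on a synthetic signal. The sketch below builds the analytic signal with a small FFT-based helper, then derives the amplitude envelope, instantaneous phase, and instantaneous frequency. The helper `analytic_signal` is hypothetical, written here for illustration only; it is not taken from pytimetk and may differ from what `augment_hilbert` does internally. The sampling rate, carrier frequency, and modulation are assumptions chosen so the result is easy to check.

```python
import numpy as np

def analytic_signal(x):
    """Return the complex analytic signal of a real 1-D array (illustrative helper)."""
    n = len(x)
    spectrum = np.fft.fft(x)
    # Keep the zero-frequency bin, double the positive frequencies,
    # and zero out the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0        # Nyquist bin is kept once for even n
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

# Assumed test signal: a 50 Hz carrier whose amplitude varies slowly at 2 Hz.
fs = 1000.0                          # sampling rate in Hz
t = np.arange(1000) / fs             # one second of samples
envelope_true = 1.0 + 0.5 * np.sin(2 * np.pi * 2 * t)
signal = envelope_true * np.cos(2 * np.pi * 50 * t)

analytic = analytic_signal(signal)   # real part: signal, imaginary part: its Hilbert transform

amplitude_envelope = np.abs(analytic)
instantaneous_phase = np.unwrap(np.angle(analytic))
# Phase change per sample, converted from radians to Hz.
instantaneous_frequency = np.diff(instantaneous_phase) / (2.0 * np.pi) * fs
```

In this sketch the real part of `analytic` reproduces the input and the imaginary part is its Hilbert transform, which appears to match the roles of the `_hilbert_real` and `_hilbert_imag` columns shown in the example outputs.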

Examples

# Example 1: Using Pandas Engine on a pandas groupby object
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])


df_hilbert = (
    df
        .groupby('id')
        .augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
            engine = 'pandas'
        )
)

df_hilbert.head()
id Store Dept Date Weekly_Sales IsHoliday Type Size Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI Unemployment Weekly_Sales_hilbert_real Weekly_Sales_hilbert_imag
0 1_1 1 1 2010-02-05 24924.500000 0 A 151315 42.310001 2.572 NaN NaN NaN NaN NaN 211.096359 8.106 24924.500000 -12764.087053
1 1_1 1 1 2010-02-12 46039.488281 1 A 151315 38.509998 2.548 NaN NaN NaN NaN NaN 211.242172 8.106 46039.488281 -13469.210227
2 1_1 1 1 2010-02-19 41595.550781 0 A 151315 39.930000 2.514 NaN NaN NaN NaN NaN 211.289139 8.106 41595.550781 16686.888640
3 1_1 1 1 2010-02-26 19403.539062 0 A 151315 46.630001 2.561 NaN NaN NaN NaN NaN 211.319641 8.106 19403.539062 9378.223680
4 1_1 1 1 2010-03-05 21827.900391 0 A 151315 46.500000 2.625 NaN NaN NaN NaN NaN 211.350143 8.106 21827.900391 2552.131871
# Example 2: Using Polars Engine on a pandas groupby object
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])
df_hilbert = (
    df
        .groupby('id')
        .augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
            engine = 'polars'
        )
)

df_hilbert.head()
id Store Dept Date Weekly_Sales IsHoliday Type Size Temperature Fuel_Price MarkDown1 MarkDown2 MarkDown3 MarkDown4 MarkDown5 CPI Unemployment Weekly_Sales_hilbert_real Weekly_Sales_hilbert_imag
0 1_1 1 1 2010-02-05 24924.500000 0 A 151315 42.310001 2.572 NaN NaN NaN NaN NaN 211.096359 8.106 24924.500000 -12764.087053
1 1_1 1 1 2010-02-12 46039.488281 1 A 151315 38.509998 2.548 NaN NaN NaN NaN NaN 211.242172 8.106 46039.488281 -13469.210227
2 1_1 1 1 2010-02-19 41595.550781 0 A 151315 39.930000 2.514 NaN NaN NaN NaN NaN 211.289139 8.106 41595.550781 16686.888640
3 1_1 1 1 2010-02-26 19403.539062 0 A 151315 46.630001 2.561 NaN NaN NaN NaN NaN 211.319641 8.106 19403.539062 9378.223680
4 1_1 1 1 2010-03-05 21827.900391 0 A 151315 46.500000 2.625 NaN NaN NaN NaN NaN 211.350143 8.106 21827.900391 2552.131871
# Example 3: Using Polars Engine on a pandas dataframe
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('taylor_30_min', parse_dates=['date'])
df_hilbert = (
    df
        .augment_hilbert(
            date_column = 'date',
            value_column = ['value'],
            engine = 'polars'
        )
)

df_hilbert.head()
date value value_hilbert_real value_hilbert_imag
0 2000-06-05 00:00:00+00:00 22262 22262.0 -1269.805176
1 2000-06-05 00:30:00+00:00 21756 21756.0 -2755.227539
2 2000-06-05 01:00:00+00:00 22247 22247.0 -4077.813232
3 2000-06-05 01:30:00+00:00 22759 22759.0 -4404.573242
4 2000-06-05 02:00:00+00:00 22549 22549.0 -4629.981445
# Example 4: Using Pandas Engine on a pandas dataframe
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('taylor_30_min', parse_dates=['date'])
df_hilbert_pd = (
    df
        .augment_hilbert(
            date_column = 'date',
            value_column = ['value'],
            engine = 'pandas'
        )
)

df_hilbert_pd.head()
date value value_hilbert_real value_hilbert_imag
0 2000-06-05 00:00:00+00:00 22262 22262.0 -1269.805176
1 2000-06-05 00:30:00+00:00 21756 21756.0 -2755.227539
2 2000-06-05 01:00:00+00:00 22247 22247.0 -4077.813232
3 2000-06-05 01:30:00+00:00 22759 22759.0 -4404.573242
4 2000-06-05 02:00:00+00:00 22549 22549.0 -4629.981445