augment_hilbert

augment_hilbert(
    data,
    date_column,
    value_column,
    reduce_memory=True,
    engine='pandas',
)

Apply the Hilbert transform to specified columns of a DataFrame or DataFrameGroupBy object.

Signal Processing: The Hilbert transform is used in various signal processing techniques, including phase and amplitude modulation and demodulation, and in the analysis of signals with time-varying amplitude and frequency.

Parameters

Name	Type	Description	Default
data	pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy	Input DataFrame or DataFrameGroupBy object with one or more columns of real-valued signals.	required
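date_column	str	Name of the column in `data` that contains the date or datetime values.	required
value_column	str or list	Column name, or list of column names, in `data` to which the Hilbert transform will be applied.	required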
reduce_memory	bool	If True, reduces the memory usage of the DataFrame by converting int and float columns to smaller dtypes and str columns to categorical. This saves memory on large data but may reduce the precision of float values and will change str columns to categorical.	`True`
engine	str	Engine used to compute the Hilbert transform, either `'pandas'` or `'polars'`. With `'polars'`, the function uses the `polars` library internally, which can be faster than `'pandas'` on large datasets.	`'pandas'`
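
The float-precision trade-off described for `reduce_memory` can be illustrated with NumPy alone. This is a minimal sketch, not the library's implementation: it assumes that downcasting means converting float64 to float32, and the sample values are illustrative.

```python
import numpy as np

# Illustrative values only (not taken from a run of augment_hilbert).
values = np.array([24924.5, 46039.49, 41595.55, 19403.54, 21827.9], dtype=np.float64)

# Assumption: "smaller bytes" means float64 -> float32.
downcast = values.astype(np.float32)

# Memory is halved: 8 bytes per value versus 4.
print(values.nbytes, downcast.nbytes)  # 40 20

# The cost is a small rounding error, below 1e-7 in relative terms.
rel_err = np.abs(downcast.astype(np.float64) - values) / np.abs(values)
print(rel_err.max() < 1e-7)  # True
```

If that rounding matters for downstream analysis, passing `reduce_memory=False` is the parameter to try.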

Returns

Name	Type	Description
df_hilbert	pd.DataFrame	A new DataFrame with two columns added for each value column: `{value_column}_hilbert_real` for the real part and `{value_column}_hilbert_imag` for the imaginary part. The original columns are preserved.

Notes

The Hilbert transform is used in time series analysis primarily for:

Creating Analytic Signals: Forms a complex-valued signal whose properties (magnitude and phase) provide valuable insights into the original signal’s structure.
Determining Instantaneous Phase/Frequency: Describes the signal's phase and frequency at each point in time, which is crucial for non-stationary signals whose properties change over time.
Extracting Amplitude Envelope: Helps identify the signal’s amplitude variations, useful in various analysis tasks.
Enhancing Signal Analysis: Assists in tasks like demodulation, trend analysis, feature extraction for machine learning, and improving signal-to-noise ratio, providing a deeper understanding of underlying patterns and trends.
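
The quantities listed above can all be derived from the real and imaginary parts of an analytic signal. The following is a minimal NumPy sketch under stated assumptions: `analytic_signal` is a hypothetical helper that uses the common FFT construction (it is not taken from this library's source), and the test signal and sampling rate `fs` are made up. The same arithmetic should apply to the `*_hilbert_real` and `*_hilbert_imag` columns, assuming they hold the real and imaginary parts of the analytic signal.

```python
import numpy as np

def analytic_signal(x):
    """Hypothetical helper: complex analytic signal of a real 1-D array via the FFT."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double the positive
    # frequencies, and zero out the negative frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)

fs = 100.0                                  # assumed sampling rate in Hz
t = np.arange(200) / fs                     # 2 seconds of samples
signal = 3.0 * np.cos(2 * np.pi * 5.0 * t)  # 5 Hz cosine with amplitude 3

z = analytic_signal(signal)
real, imag = z.real, z.imag                 # real part reproduces the input

# Amplitude envelope: magnitude of the analytic signal.
envelope = np.hypot(real, imag)

# Instantaneous phase (unwrapped) and instantaneous frequency in Hz.
phase = np.unwrap(np.arctan2(imag, real))
inst_freq = np.diff(phase) * fs / (2 * np.pi)
```

For this test signal the envelope stays at the cosine's amplitude and the instantaneous frequency stays at 5 Hz. Real data such as weekly sales will not be this clean, and values near the start and end of a series are typically less reliable.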

Examples

# Example 1: Using Pandas Engine on a pandas groupby object
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])


df_hilbert = (
    df
        .groupby('id')
        .augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
            engine = 'pandas'
        )
)

df_hilbert.head()

	id	Store	Dept	Date	Weekly_Sales	IsHoliday	Type	Size	Temperature	Fuel_Price	MarkDown1	MarkDown2	MarkDown3	MarkDown4	MarkDown5	CPI	Unemployment	Weekly_Sales_hilbert_real	Weekly_Sales_hilbert_imag
0	1_1	1	1	2010-02-05	24924.500000	0	A	151315	42.310001	2.572	NaN	NaN	NaN	NaN	NaN	211.096359	8.106	24924.498047	-12764.086914
1	1_1	1	1	2010-02-12	46039.488281	1	A	151315	38.509998	2.548	NaN	NaN	NaN	NaN	NaN	211.242172	8.106	46039.488281	-13469.210938
2	1_1	1	1	2010-02-19	41595.550781	0	A	151315	39.930000	2.514	NaN	NaN	NaN	NaN	NaN	211.289139	8.106	41595.550781	16686.888672
3	1_1	1	1	2010-02-26	19403.539062	0	A	151315	46.630001	2.561	NaN	NaN	NaN	NaN	NaN	211.319641	8.106	19403.535156	9378.224609
4	1_1	1	1	2010-03-05	21827.900391	0	A	151315	46.500000	2.625	NaN	NaN	NaN	NaN	NaN	211.350143	8.106	21827.898438	2552.131836

# Example 2: Using Polars Engine on a pandas groupby object
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('walmart_sales_weekly', parse_dates=['Date'])
df_hilbert = (
    df
        .groupby('id')
        .augment_hilbert(
            date_column = 'Date',
            value_column = ['Weekly_Sales'],
            engine = 'polars'
        )
)

df_hilbert.head()

	id	Store	Dept	Date	Weekly_Sales	IsHoliday	Type	Size	Temperature	Fuel_Price	MarkDown1	MarkDown2	MarkDown3	MarkDown4	MarkDown5	CPI	Unemployment	Weekly_Sales_hilbert_real	Weekly_Sales_hilbert_imag
0	1_1	1	1	2010-02-05	24924.500000	0	A	151315	42.310001	2.572	NaN	NaN	NaN	NaN	NaN	211.096359	8.106	24924.498047	-12764.086914
1	1_1	1	1	2010-02-12	46039.488281	1	A	151315	38.509998	2.548	NaN	NaN	NaN	NaN	NaN	211.242172	8.106	46039.488281	-13469.210938
2	1_1	1	1	2010-02-19	41595.550781	0	A	151315	39.930000	2.514	NaN	NaN	NaN	NaN	NaN	211.289139	8.106	41595.550781	16686.888672
3	1_1	1	1	2010-02-26	19403.539062	0	A	151315	46.630001	2.561	NaN	NaN	NaN	NaN	NaN	211.319641	8.106	19403.535156	9378.224609
4	1_1	1	1	2010-03-05	21827.900391	0	A	151315	46.500000	2.625	NaN	NaN	NaN	NaN	NaN	211.350143	8.106	21827.898438	2552.131836

# Example 3: Using Polars Engine on a pandas dataframe
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('taylor_30_min', parse_dates=['date'])
df_hilbert = (
    df
        .augment_hilbert(
            date_column = 'date',
            value_column = ['value'],
            engine = 'polars'
        )
)

df_hilbert.head()

	date	value	value_hilbert_real	value_hilbert_imag
0	2000-06-05 00:00:00+00:00	22262	22262.0	-1269.805176
1	2000-06-05 00:30:00+00:00	21756	21756.0	-2755.227539
2	2000-06-05 01:00:00+00:00	22247	22247.0	-4077.813232
3	2000-06-05 01:30:00+00:00	22759	22759.0	-4404.573242
4	2000-06-05 02:00:00+00:00	22549	22549.0	-4629.981445

# Example 4: Using Pandas Engine on a pandas dataframe
import pytimetk as tk
import pandas as pd

df = tk.load_dataset('taylor_30_min', parse_dates=['date'])
df_hilbert_pd = (
    df
        .augment_hilbert(
            date_column = 'date',
            value_column = ['value'],
            engine = 'pandas'
        )
)

df_hilbert_pd.head()

	date	value	value_hilbert_real	value_hilbert_imag
0	2000-06-05 00:00:00+00:00	22262	22262.0	-1269.805176
1	2000-06-05 00:30:00+00:00	21756	21756.0	-2755.227539
2	2000-06-05 01:00:00+00:00	22247	22247.0	-4077.813232
3	2000-06-05 01:30:00+00:00	22759	22759.0	-4404.573242
4	2000-06-05 02:00:00+00:00	22549	22549.0	-4629.981445