seasonal_diagnostics

seasonal_diagnostics(data, date_column, value_column, feature_set='auto')

Prepare seasonal feature diagnostics, akin to tk_seasonal_diagnostics in the R timetk package.

Parameters

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| data | pd.DataFrame or pd.core.groupby.generic.DataFrameGroupBy | Time series data (long format) or grouped data. | required |
| date_column | str | Name of the datetime column. | required |
| value_column | str | Numeric measure to analyse. | required |
| feature_set | str or sequence | One or more of ["second", "minute", "hour", "wday.lbl", "week", "month.lbl", "quarter", "year"]. The special value "auto" selects features based on the timestamp scale and overall history. | 'auto' |
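
To make the feature_set names more concrete, the sketch below derives a few of them with plain pandas datetime accessors and stacks them into the long layout described under Returns. It is an illustration only, not pytimetk's implementation: the helper name sketch_seasonal_features and the mapping of "wday.lbl" and "month.lbl" to day and month names are assumptions.

```python
import pandas as pd


def sketch_seasonal_features(df, date_column, value_column, features=("hour", "wday.lbl")):
    """Hypothetical helper: build a long table of seasonal features with pandas only."""
    # Assumed mapping from feature names to pandas datetime accessors.
    extractors = {
        "hour": lambda s: s.dt.hour,
        "wday.lbl": lambda s: s.dt.day_name(),
        "month.lbl": lambda s: s.dt.month_name(),
        "quarter": lambda s: s.dt.quarter,
        "year": lambda s: s.dt.year,
    }
    frames = []
    for feature in features:
        # One copy of the date/value pairs per requested feature.
        frame = df[[date_column, value_column]].copy()
        frame["seasonal_feature"] = feature
        frame["seasonal_value"] = extractors[feature](df[date_column])
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


toy = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 02:00", "2020-01-01 03:00"]
        ),
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)
long_features = sketch_seasonal_features(toy, "date", "value")
```

Each input row appears once per requested feature, so four observations and two features give eight rows.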

Returns

| Name | Type | Description |
| --- | --- | --- |
| | pd.DataFrame | Tidy data with: grouping columns (when present); date (or the supplied date_column); the original value_column; seasonal_feature (e.g. "hour"); seasonal_value (the actual categorical value for that observation). |
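
Because the result is tidy, it can be summarised with an ordinary pandas groupby. The sketch below uses a small hand-built frame that mimics the documented columns (the values are made up for illustration and do not come from pytimetk) and computes per-level statistics for each seasonal feature.

```python
import pandas as pd

# Hand-built stand-in for the tidy output; the values are illustrative only.
tidy = pd.DataFrame(
    {
        "id": ["A", "A", "B", "B"],
        "date": pd.to_datetime(
            ["2020-01-01 00:00", "2020-01-01 01:00", "2020-01-01 00:00", "2020-01-01 01:00"]
        ),
        "value": [1.0, 3.0, 5.0, 7.0],
        "seasonal_feature": ["hour"] * 4,
        "seasonal_value": [0, 1, 0, 1],
    }
)

# Summarise the measure for every level of every seasonal feature.
summary = (
    tidy.groupby(["seasonal_feature", "seasonal_value"])["value"]
    .agg(["median", "min", "max"])
    .reset_index()
)
```

Adding the grouping columns (here "id") to the groupby keys would give the same summary per series instead of across all series.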

Examples

import numpy as np
import pandas as pd
import pytimetk as tk

rng = pd.date_range("2020-01-01", periods=48, freq="h")
df = pd.DataFrame(
    {
        "id": ["A"] * 24 + ["B"] * 24,
        "date": list(rng[:24]) + list(rng[:24]),
        "value": np.random.default_rng(123).normal(size=48),
    }
)

diagnostics = tk.seasonal_diagnostics(
    data=df.groupby("id"),
    date_column="date",
    value_column="value",
    feature_set=["hour", "wday.lbl"],
)
diagnostics.head()

  id                date     value seasonal_feature seasonal_value
0  A 2020-01-01 00:00:00 -0.989121             hour              0
1  A 2020-01-01 01:00:00 -0.367787             hour              1
2  A 2020-01-01 02:00:00  1.287925             hour              2
3  A 2020-01-01 03:00:00  0.193974             hour              3
4  A 2020-01-01 04:00:00  0.920231             hour              4

from pytimetk.utils.selection import contains

selector_diagnostics = tk.seasonal_diagnostics(
    data=df,
    date_column=contains("dat"),
    value_column=contains("val"),
    feature_set=["hour"],
)
selector_diagnostics.head()

                 date     value seasonal_feature seasonal_value
0 2020-01-01 00:00:00 -0.989121             hour              0
1 2020-01-01 00:00:00  0.754770             hour              0
2 2020-01-01 01:00:00 -0.145978             hour              1
3 2020-01-01 01:00:00 -0.367787             hour              1
4 2020-01-01 02:00:00  1.287925             hour              2