PyTimeTK

Time series easier, faster, more fun.

Please ⭐ us on GitHub (it takes 2 seconds and means a lot).


1 Why pytimetk?

  • Single API, multiple engines. Every helper works on pandas and Polars (many run on NVIDIA cudf/GPU as well).
  • Productivity first. Visualization, aggregation, feature engineering, anomaly detection, and regime modeling in a couple of lines.
  • Performance obsessed. Vectorized Polars support, GPU acceleration (beta), and feature-store style caching.

2 The toolkit in one table

| Workflow | pytimetk API | Superpower | Docs |
|---|---|---|---|
| Visualization & diagnostics | plot_timeseries, plot_stl_diagnostics, plot_time_series_boxplot, theme_plotly_timetk | Build Plotly dashboards (with dropdowns) or static plotnine charts in seconds | Visualization guide |
| Time-aware aggregations | summarize_by_time, apply_by_time, pad_by_time(fillna=…) | Resample, roll up, pad + fill, or apply arbitrary functions over calendar buckets | Selectors & periods |
| Feature engineering | augment_timeseries_signature, augment_rolling, augment_wavelet, FeatureStore | Calendar signatures, GPU-ready rolling stats, wavelet power features, cacheable transforms | Feature reference |
| Anomaly workflows | anomalize, plot_anomalies, plot_anomalies_decomp, plot_anomalies_cleaned | Detect → diagnose → visualize anomalies without hand wiring | Anomaly docs |
| Finance & regimes | augment_regime_detection (✨ regime_backends extra), augment_macd, … | HMM regime detection with hmmlearn/pomegranate plus dozens of indicators | Finance module |
| Polars-native workflows | .tk accessor on pl.DataFrame, engine="polars" on heavy helpers | Keep everything inside Polars, including plotting and feature generation | Polars guide |
| Production extras (beta) | Feature Store, MLflow integration, GPU acceleration | Cache expensive features, log metadata with experiments, or flip to RAPIDS | Production docs |

3 60‑second tour

import numpy as np
import pandas as pd
import pytimetk as tk
from pytimetk.utils.selection import contains

sales = tk.load_dataset("bike_sales_sample", parse_dates=["order_date"])

# 1. Summaries with Polars speed
monthly = (
    sales.groupby("category_1")
    .summarize_by_time(
        date_column="order_date",
        value_column="total_price",
        freq="MS",
        agg_func=["sum", "mean"],
        engine="polars",
    )
)

monthly.head()

  category_1 order_date  total_price_sum  total_price_mean
0   Mountain 2011-01-01           221490       4922.000000
1   Mountain 2011-02-01           660555       4374.536424
2   Mountain 2011-03-01           358855       5882.868852
3   Mountain 2011-04-01          1075975       4890.795455
4   Mountain 2011-05-01           450440       4549.898990

# 2. Plot with dropdowns
monthly.groupby('category_1').plot_timeseries(
    date_column="order_date",
    value_column=contains("sum"),
    color_column="category_1",
    smooth=False,
    plotly_dropdown=True,
    title="Revenue by Category",
)

# 3. Fill gaps + anomalies
df = (
    sales.groupby(["category_1", "order_date"], as_index=False)
    .agg(total_price=("total_price", "sum"))
    .groupby("category_1")
    .pad_by_time(date_column="order_date", freq="1 day", fillna=0)
)

(
    df
    .groupby("category_1")
    .anomalize("order_date", "total_price")
    .groupby("category_1")
    .plot_anomalies(
        date_column="order_date",
        plotly_dropdown=True,
        plotly_dropdown_x=1.05,
        plotly_dropdown_y=1.10,
    )
)
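
To make the pad-and-fill step concrete, here is a minimal plain-Python sketch of what daily padding with a scalar fill does conceptually. It is not pytimetk's implementation: `pad_daily` is a hypothetical helper written only for illustration, and it handles a single series using nothing but the standard library.

```python
from datetime import date, timedelta


def pad_daily(observations, fill_value=0):
    """Return one (day, value) pair per calendar day between the first and
    last observed dates, using fill_value for days with no observation.

    Hypothetical helper for illustration; not part of pytimetk.
    """
    if not observations:
        return []
    start, end = min(observations), max(observations)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(day, observations.get(day, fill_value)) for day in days]


# Sparse daily sales: Jan 2 and Jan 3 have no orders.
sparse = {date(2011, 1, 1): 100, date(2011, 1, 4): 250}
padded = pad_daily(sparse, fill_value=0)

for day, value in padded:
    print(day.isoformat(), value)
```

Filling the missing days with an explicit value before anomaly detection means a day without orders is treated as zero revenue rather than as a hole in the index, which is what the `fillna=0` argument in the tour expresses.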

4 Fresh highlights

  • New data visualizations. Discover new time series plots such as time series box plots, regression plots, and seasonal and decomposition plots in our upgraded Guide 01.
  • Selectors + natural periods guide. Learn how to point at columns with contains()/starts_with() and specify periods like "2 weeks" or "45 minutes". → Guide 08
  • Polars everywhere. Dedicated Polars guide plus .tk accessor coverage for plotting, feature engineering, and gap filling.
  • GPU + Feature Store (beta). Run rolling stats using our RAPIDS cudf guide or cache/track expensive feature sets with metadata and MLflow hooks in our new Feature Store guide.
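
The selector and natural-period ideas mentioned above can be pictured with a small standard-library sketch. Everything here is an assumption made for illustration: `name_contains`, `name_starts_with`, `select`, and `parse_period` are hypothetical stand-ins, not pytimetk's selectors or its period parser, and the sketch only understands fixed-length units (seconds through weeks).

```python
import re
from datetime import timedelta

# Units this sketch understands, mapped to timedelta keyword arguments.
_UNITS = {
    "second": "seconds",
    "minute": "minutes",
    "hour": "hours",
    "day": "days",
    "week": "weeks",
}


def parse_period(text):
    """Turn a string like '2 weeks' or '45 minutes' into a timedelta.

    Hypothetical helper for illustration; not pytimetk's parser.
    """
    match = re.fullmatch(r"\s*(\d+)\s*([a-zA-Z]+?)s?\s*", text)
    if match is None or match.group(2).lower() not in _UNITS:
        raise ValueError(f"unrecognized period: {text!r}")
    count, unit = int(match.group(1)), match.group(2).lower()
    return timedelta(**{_UNITS[unit]: count})


def name_contains(substring):
    """Return a predicate that is true for names including substring."""
    return lambda name: substring in name


def name_starts_with(prefix):
    """Return a predicate that is true for names beginning with prefix."""
    return lambda name: name.startswith(prefix)


def select(columns, selector):
    """Keep the column names for which the selector predicate is true."""
    return [name for name in columns if selector(name)]


columns = ["order_date", "total_price_sum", "total_price_mean"]
print(select(columns, name_contains("sum")))        # ['total_price_sum']
print(select(columns, name_starts_with("total_")))  # both total_price columns
print(parse_period("2 weeks"))                      # 14 days, 0:00:00
print(parse_period("45 minutes"))                   # 0:45:00
```

Treating a selector as a predicate over column names keeps the calling code independent of exact column names, which is convenient when aggregation appends suffixes such as `_sum` and `_mean`.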

5 Installation

Install the latest stable version of pytimetk using pip:

pip install pytimetk

Alternatively, install the development version from GitHub:

pip install --upgrade --force-reinstall git+https://github.com/business-science/pytimetk.git
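
After installing, one way to confirm which version is active in the current environment is to query the package metadata. This is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); `installed_version` is a hypothetical helper name, not something pytimetk provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("pytimetk")
print(f"pytimetk version: {found}" if found else "pytimetk is not installed")
```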

6 Guides & docs

| Topic | Why read it? |
|---|---|
| Quick Start | Load data, plot, summarize, and build forecast-ready features in ~5 minutes. |
| Visualization Guide | Deep dive into plot_timeseries, STL diagnostics, anomaly plots, and Plotly theming. |
| Polars Guide | How to keep data in Polars while still using pytimetk plotting/feature APIs. |
| Selectors & Human Durations | Column selectors, natural-language periods, and new padding/future-frame tricks. |
| Production / GPU | Feature store beta, caching, MLflow logging, and NVIDIA RAPIDS setup. |
| API Reference | Full catalogue of helpers by module. |

7 Release notes (highlights)

pytimetk 2.4.x

  • Advanced Plotting Diagnostics: Added new APIs for visualizing time series diagnostics. These functions provide interactive plots for analyzing correlation, seasonality, and trends: plot_acf_diagnostics, plot_seasonal_diagnostics, and plot_time_series_boxplot.
  • Tidy Selectors: contains, starts_with, ends_with, matches.

pytimetk 2.2.x

  • Polars-native optimizations & memory efficiency: eliminated unnecessary conversions to pandas, keeping data in Arrow buffers for zero-copy chaining and reduced memory overhead.
  • ~7X faster execution for EWM operations (augment_ewm).

pytimetk 2.1.x

  • GPU acceleration (beta) for rolling/expanding/finance helpers via NVIDIA RAPIDS cudf (polars.LazyFrame.collect(engine="gpu") supported).
  • pad_by_time(fillna=…) scalar filling, new selectors/human-duration guide, Plotly theming helper.

pytimetk 2.0.x

  • Polars .tk accessor support landed for plotting helpers and diagnostics.
  • Feature Store beta with optional MLflow logging and on-disk caching.

8 Feature Store & Caching (Beta)

Beta

The Feature Store is currently released as a Beta capability. APIs, configuration options, and storage formats may change in upcoming releases. Please share feedback or issues so we can stabilize it quickly.

Persist expensive feature engineering steps once and reuse them everywhere. Register a transform, build it on a dataset, and reload it in any notebook or job with automatic versioning, metadata, and cache hits.

import pandas as pd
import pytimetk as tk

df = tk.load_dataset("bike_sales_sample", parse_dates=["order_date"])

store = tk.FeatureStore()

store.register(
    "sales_signature",
    lambda data: tk.augment_timeseries_signature(
        data,
        date_column="order_date",
        engine="pandas",
    ),
    default_key_columns=("order_id",),
    description="Calendar signatures for sales orders.",
)

result = store.build("sales_signature", df)
print(result.from_cache)  # False first run, True on subsequent builds
  • Supports local disk or any pyarrow filesystem (e.g., s3://, gs://) via the artifact_uri parameter, plus optional file-based locking for concurrent jobs.
  • Optional MLflow helpers capture feature versions and artifacts with your experiments for reproducible pipelines.
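
The cache-hit behavior described in this section can be pictured with a toy in-memory cache. This is a sketch under stated assumptions, not how pytimetk's FeatureStore works internally: `TinyFeatureCache` and `BuildResult` are hypothetical names, the cache lives only in memory, and the key is simply the transform name plus a hash of the input rows.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import date


@dataclass
class BuildResult:
    data: list
    from_cache: bool


class TinyFeatureCache:
    """In-memory cache keyed by transform name plus a hash of the input rows.

    Hypothetical class for illustration; not pytimetk's FeatureStore.
    """

    def __init__(self):
        self._transforms = {}
        self._cache = {}

    def register(self, name, transform):
        self._transforms[name] = transform

    def build(self, name, rows):
        # Identical inputs serialize to the same JSON, hence the same key.
        payload = json.dumps(rows, sort_keys=True, default=str).encode()
        key = (name, hashlib.sha256(payload).hexdigest())
        if key in self._cache:
            return BuildResult(self._cache[key], from_cache=True)
        result = self._transforms[name](rows)
        self._cache[key] = result
        return BuildResult(result, from_cache=False)


cache = TinyFeatureCache()
cache.register(
    "add_weekday",
    lambda rows: [{**row, "weekday": row["order_date"].weekday()} for row in rows],
)

rows = [{"order_id": 1, "order_date": date(2011, 1, 7)}]
first = cache.build("add_weekday", rows)
second = cache.build("add_weekday", rows)
print(first.from_cache, second.from_cache)  # False True
```

Keying on a hash of the input means a changed dataset produces a new entry instead of silently reusing stale features; a persistent store would additionally need to write results to disk and coordinate concurrent writers.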

9 Documentation

Get started with the pytimetk documentation

10 πŸ† More Coming Soon…

We are in the early stages of development, but the potential for pytimetk in Python is already obvious. 🐍

11 ⭐️ Star History

Star History Chart
