Speeding Up Workflows with Polars

Why Polars?

Polars shines on wide datasets and large group-by workloads thanks to its Arrow-based columnar memory model, multithreaded execution, and optional lazy query optimization. pytimetk supports Polars both through the .tk accessor and via engine="polars" on many of its heavier helpers, so you can keep your existing workflows while getting a speed boost.
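
To get a feel for the difference, here is a minimal, machine-dependent timing sketch that runs the same weekly aggregation through both engines on the pandas side. It uses the m4_daily dataset and summarize_by_time, both covered below; treat the numbers as illustrative only.

Code
import timeit

import pytimetk as tk

df = tk.load_dataset("m4_daily", parse_dates=["date"])

# Run the identical weekly mean through each engine; absolute timings
# depend on your hardware and the size of the data.
for engine in ["pandas", "polars"]:
    seconds = timeit.timeit(
        lambda: df.groupby("id").summarize_by_time(
            date_column="date",
            value_column="value",
            freq="W",
            agg_func="mean",
            engine=engine,
        ),
        number=5,
    )
    print(f"{engine}: {seconds:.3f}s for 5 runs")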

1 Setup

Code
import polars as pl
import pytimetk as tk

We’ll use the m4_daily dataset (multiple daily series). Start in pandas, then convert to Polars.

Code
m4_daily_pd = tk.load_dataset("m4_daily", parse_dates=["date"])
m4_daily_pl = pl.from_pandas(m4_daily_pd)

m4_daily_pl
shape: (9_743, 3)
id      date                 value
str     datetime[ns]         f64
"D10"   2014-07-03 00:00:00  2076.2
"D10"   2014-07-04 00:00:00  2073.4
"D10"   2014-07-05 00:00:00  2048.7
"D10"   2014-07-06 00:00:00  2048.9
"D10"   2014-07-07 00:00:00  2006.4
…
"D500"  2012-09-19 00:00:00  9418.8
"D500"  2012-09-20 00:00:00  9365.7
"D500"  2012-09-21 00:00:00  9445.9
"D500"  2012-09-22 00:00:00  9497.9
"D500"  2012-09-23 00:00:00  9545.3

2 Plotting Directly from Polars

Every Polars DataFrame gains a .tk accessor once pytimetk is imported, so you can send Polars data straight into the plotting helpers without bouncing back to pandas.

Code
single_series = m4_daily_pl.filter(pl.col("id") == "D10")

single_series.tk.plot_timeseries(
    date_column="date",
    value_column="value",
    title="Polars-powered plot_timeseries()",
)
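
You aren't limited to one series per plot. Here is a sketch that overlays all four series by color; color_column is the usual plot_timeseries parameter, and we're assuming the Polars accessor forwards it unchanged:

Code
# Overlay every series in one figure, colored by id (assumes the .tk
# accessor passes color_column through like the pandas version does)
m4_daily_pl.tk.plot_timeseries(
    date_column="date",
    value_column="value",
    color_column="id",
    title="All m4_daily series",
)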

3 Time-Based Aggregations with the Polars Engine

When you pass engine="polars", the heavy lifting happens in Polars. Because the input here is a Polars DataFrame, the result comes back as a Polars frame too; convert with .to_pandas() whenever the rest of your stack needs pandas. This is handy for weekly/monthly summaries across many groups.

Code
weekly_summary = (
    m4_daily_pl
    .group_by("id")
    .tk.summarize_by_time(
        date_column="date",
        value_column="value",
        freq="W",
        agg_func="mean",
        engine="polars",
    )
)

weekly_summary.head()
shape: (5, 3)
id     date                 value
str    datetime[ns]         f64
"D10"  2014-07-06 00:00:00  2061.8
"D10"  2014-07-13 00:00:00  2005.828571
"D10"  2014-07-20 00:00:00  1981.085714
"D10"  2014-07-27 00:00:00  1895.185714
"D10"  2014-08-03 00:00:00  1924.457143

4 Rolling Features without Leaving Polars

The same pattern applies to rolling-window computations. Here we build a trailing 7-day mean and standard deviation for each series, computed entirely with the Polars backend.

Code
rolling_features = (
    m4_daily_pl
    .group_by("id")
    .tk.augment_rolling(
        date_column="date",
        value_column="value",
        window=7,
        window_func=["mean", "std"],
        engine="polars",
    )
)

rolling_features.head()
shape: (5, 5)
id     date                 value   value_rolling_mean_win_7  value_rolling_std_win_7
str    datetime[ns]         f64     f64                       f64
"D10"  2014-07-03 00:00:00  2076.2  null                      null
"D10"  2014-07-04 00:00:00  2073.4  null                      null
"D10"  2014-07-05 00:00:00  2048.7  null                      null
"D10"  2014-07-06 00:00:00  2048.9  null                      null
"D10"  2014-07-07 00:00:00  2006.4  null                      null

If you need a pandas DataFrame afterwards, just convert:

Code
rolling_features.to_pandas().head()
    id       date   value  value_rolling_mean_win_7  value_rolling_std_win_7
0  D10 2014-07-03  2076.2                       NaN                      NaN
1  D10 2014-07-04  2073.4                       NaN                      NaN
2  D10 2014-07-05  2048.7                       NaN                      NaN
3  D10 2014-07-06  2048.9                       NaN                      NaN
4  D10 2014-07-07  2006.4                       NaN                      NaN
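
augment_rolling also handles several horizons at once: window accepts a list of sizes, producing one feature column per window. A sketch, assuming the Polars engine mirrors the pandas behavior:

Code
# Trailing 7- and 28-day means in one call, one output column per window
multi_window = (
    m4_daily_pl
    .group_by("id")
    .tk.augment_rolling(
        date_column="date",
        value_column="value",
        window=[7, 28],
        window_func="mean",
        engine="polars",
    )
)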

5 Pure Polars Pipelines

You can stay in Polars end to end:

  1. Prep data with pl.DataFrame operations.
  2. Call .tk helpers that support Polars inputs.
  3. Only convert to pandas at the final step if your next tool requires it.

This keeps the data in a columnar format for as long as possible, unlocking better cache usage and multithreading without changing the pytimetk calls you already use.
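
As a condensed sketch of that pattern, reusing only the calls shown earlier in this guide:

Code
# 1) prep with Polars expressions, 2) feature-engineer via the .tk
# accessor, 3) convert to pandas only at the final hand-off
features_pd = (
    m4_daily_pl
    .filter(pl.col("id").is_in(["D10", "D500"]))
    .group_by("id")
    .tk.augment_rolling(
        date_column="date",
        value_column="value",
        window=7,
        window_func=["mean", "std"],
        engine="polars",
    )
    .to_pandas()
)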

6 Next Steps