import numpy as np
import pandas as pd
import pytimetk as tk
from pytimetk.utils.selection import contains, starts_with, ends_with

Many pytimetk helpers accept tidy selectors for columns and human-readable durations for periods/frequencies. Mastering these inputs keeps your code expressive and resilient as schemas evolve.
We’ll use the bike sales dataset, trimmed to a few relevant columns.
| | order_date | category_1 | category_2 | total_price | quantity | model |
|---|---|---|---|---|---|---|
| 0 | 2011-01-07 | Mountain | Over Mountain | 6070 | 1 | Jekyll Carbon 2 |
| 1 | 2011-01-07 | Mountain | Over Mountain | 5970 | 1 | Trigger Carbon 2 |
| 2 | 2011-01-10 | Mountain | Trail | 2770 | 1 | Beast of the East 1 |
| 3 | 2011-01-10 | Mountain | Over Mountain | 5970 | 1 | Trigger Carbon 2 |
| 4 | 2011-01-10 | Road | Elite Road | 10660 | 1 | Supersix Evo Hi-Mod Team |
Selectors are callables (or patterns) that resolve to concrete column names at runtime. They work anywhere you see ColumnSelector in the docs—plot_timeseries, summarize_by_time, augment_*, etc.
Passing a single column name or a list of names behaves exactly as it does in pandas.
Use helpers from pytimetk.utils.selection to match columns dynamically:
| | order_date | total_price_sum | total_price_mean | quantity_sum | quantity_mean |
|---|---|---|---|---|---|
| 0 | 2011-01-01 | 483015 | 4600.142857 | 128 | 1.219048 |
| 1 | 2011-02-01 | 1162075 | 4611.408730 | 331 | 1.313492 |
| 2 | 2011-03-01 | 659975 | 5196.653543 | 174 | 1.370079 |
| 3 | 2011-04-01 | 1827140 | 4533.846154 | 542 | 1.344913 |
| 4 | 2011-05-01 | 844170 | 4097.912621 | 302 | 1.466019 |
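For comparison, a table of this shape (monthly sum and mean for each selected value column) can be built in plain pandas. This is a rough sketch on made-up toy rows, not the pytimetk call that produced the output above; the `orders` frame and the rule used to pick `value_cols` are assumptions for illustration.

```python
import pandas as pd

# Toy rows (made up); the real guide uses the bike sales dataset.
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2011-01-07", "2011-01-10", "2011-02-03", "2011-02-20"]
        ),
        "category_1": ["Mountain", "Road", "Mountain", "Road"],
        "total_price": [6070, 10660, 2770, 5970],
        "quantity": [1, 1, 2, 1],
    }
)

# Pick value columns by rule instead of hard-coding every name.
value_cols = [c for c in orders.columns if c.startswith("total_") or c == "quantity"]

# Group rows into month-start buckets and aggregate each selected column.
monthly = orders.groupby(pd.Grouper(key="order_date", freq="MS"))[value_cols].agg(
    ["sum", "mean"]
)

# Flatten the (column, aggregation) MultiIndex into names like total_price_sum.
monthly.columns = ["_".join(pair) for pair in monthly.columns]
monthly = monthly.reset_index()
print(monthly)
```

Because the columns are chosen by rule, adding another `total_*` column to the input would flow into the summary without touching the aggregation code.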
Under the hood, selectors resolve through tk.resolve_column_selection, so you can even supply regular expressions or custom callables if needed.
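To make the idea of "resolving" a selector concrete, here is a minimal pure-Python sketch. It does not reproduce pytimetk's implementation or the signature of `tk.resolve_column_selection`; the names `name_contains`, `name_matches`, and `resolve` are hypothetical stand-ins that show how a string, a list, or a callable can all be reduced to a list of concrete column names.

```python
import re
from typing import Callable, Iterable, List, Union

import pandas as pd

# Hypothetical stand-ins for illustration only; not pytimetk's API.
Selector = Union[str, Iterable[str], Callable[[List[str]], List[str]]]


def name_contains(text: str) -> Callable[[List[str]], List[str]]:
    """Return a selector that keeps column names containing `text`."""
    return lambda cols: [c for c in cols if text in c]


def name_matches(pattern: str) -> Callable[[List[str]], List[str]]:
    """Return a selector that keeps column names matching a regex."""
    rx = re.compile(pattern)
    return lambda cols: [c for c in cols if rx.search(c)]


def resolve(df: pd.DataFrame, selector: Selector) -> List[str]:
    """Turn a string, list, or callable selector into concrete column names."""
    cols = list(df.columns)
    if isinstance(selector, str):      # check str first: strings are iterable too
        picked = [selector]
    elif callable(selector):           # callables receive the available names
        picked = selector(cols)
    else:                              # any other iterable of names
        picked = list(selector)
    missing = [c for c in picked if c not in cols]
    if missing:
        raise KeyError(f"Columns not found: {missing}")
    return picked


df = pd.DataFrame(
    {
        "order_date": pd.to_datetime(["2011-01-07", "2011-01-10"]),
        "category_1": ["Mountain", "Road"],
        "category_2": ["Trail", "Elite Road"],
        "total_price": [6070, 10660],
        "quantity": [1, 1],
    }
)

print(resolve(df, "total_price"))
print(resolve(df, ["total_price", "quantity"]))
print(resolve(df, name_contains("category")))
print(resolve(df, name_matches(r"^(total_price|quantity)$")))
```

Resolution happens against whatever columns the frame has at call time, which is why selector-based code tends to survive schema changes better than hard-coded lists.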
Frequency-oriented helpers (e.g., pad_by_time, future_frame, plot_time_series_boxplot) accept pandas offsets or natural language strings. pytimetk converts the latter using tk.parse_human_duration.
0 days 00:45:00
<DateOffset: months=3>
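The two outputs above are a pandas `Timedelta` (a fixed-length span) and a pandas `DateOffset` (a calendar-aware span). The sketch below builds the same two objects directly with pandas to show why the distinction matters; it deliberately does not call `tk.parse_human_duration`, whose exact signature is not shown here.

```python
import pandas as pd

# Fixed-length spans map to Timedelta; calendar-aware spans map to DateOffset.
fixed = pd.Timedelta("45 minutes")
calendar = pd.DateOffset(months=3)

print(fixed)           # 0 days 00:45:00
print(repr(calendar))  # <DateOffset: months=3>

# A DateOffset follows the calendar: three months after Jan 31
# lands on the last valid day of April rather than a fixed number of days later.
start = pd.Timestamp("2011-01-31")
print(start + calendar)  # 2011-04-30 00:00:00
print(start + fixed)     # 2011-01-31 00:45:00
```

Months and years have no fixed length, so they cannot be represented as a `Timedelta`; that is presumably why a month-based duration comes back as an offset instead.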
pad_by_time

Ensure a continuous hourly series and fill padded rows with zeros:
| | category_1 | order_date | total_price |
|---|---|---|---|
| 0 | Mountain | 2011-01-07 00:00:00 | 12040.0 |
| 1 | Mountain | 2011-01-07 01:00:00 | 0.0 |
| 2 | Mountain | 2011-01-07 02:00:00 | 0.0 |
| 3 | Mountain | 2011-01-07 03:00:00 | 0.0 |
| 4 | Mountain | 2011-01-07 04:00:00 | 0.0 |
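The padding step can be approximated in plain pandas with a grouped resample. This is a sketch of the general technique on made-up toy rows, not pytimetk's `pad_by_time` itself; the `sales` frame and its values are assumptions for illustration.

```python
import pandas as pd

# Toy data (made up): irregular timestamps within each category.
sales = pd.DataFrame(
    {
        "category_1": ["Mountain", "Mountain", "Road", "Road"],
        "order_date": pd.to_datetime(
            [
                "2011-01-07 00:00",
                "2011-01-07 03:00",
                "2011-01-07 01:00",
                "2011-01-07 02:00",
            ]
        ),
        "total_price": [12040.0, 2770.0, 10660.0, 500.0],
    }
)

padded = (
    sales.set_index("order_date")
    .groupby("category_1")["total_price"]
    .resample("60min")  # one bucket per hour between each group's first and last row
    .sum()              # empty hourly buckets sum to 0.0, which fills the padded rows
    .reset_index()
)
print(padded)
```

Note that each group is padded only between its own first and last timestamps in this sketch; padding every group to a shared start and end would need an explicit reindex.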
future_frame

Generate 60 additional days while keeping the output separate from the historical data:
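As a rough illustration of what "60 additional days, kept separate" means, the sketch below builds a standalone frame of future dates in plain pandas. It is not the `future_frame` call itself, and the `history` frame, `horizon`, and `step` names are assumptions made for this example.

```python
import pandas as pd

# Toy history (made up): five consecutive daily observations.
history = pd.DataFrame(
    {
        "order_date": pd.date_range("2011-01-01", periods=5, freq="D"),
        "total_price": [100.0, 120.0, 90.0, 110.0, 130.0],
    }
)

horizon = 60                    # number of future rows to generate
step = pd.Timedelta("1 day")    # spacing between future timestamps
last = history["order_date"].max()

# Start one step after the last observed date so nothing overlaps the history.
future = pd.DataFrame(
    {"order_date": pd.date_range(start=last + step, periods=horizon, freq=step)}
)
future["total_price"] = float("nan")  # values are unknown until forecast

print(future.head())
print(len(history), len(future))
```

Keeping `future` as its own frame leaves `history` untouched, so the two can be concatenated later only if a downstream step needs them together.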
plot_time_series_boxplot

Mix selectors and durations to build rolling distributions over arbitrary periods:
Search the docs for ColumnSelector or “duration” to discover which helpers accept these flexible inputs.