Ray Parallelism Quickstart

How to take advantage of pytimetk’s Ray-backed helpers for faster time-series workflows.

Why Ray?

Many of pytimetk’s performance-sensitive helpers (e.g., future_frame, ts_features, and the rolling/expanding utilities) now fan work out via Ray whenever you set threads != 1. Because Ray ships as a core dependency, there is nothing extra to install; there are just two knobs to remember:

  1. Enable parallelism by passing threads=-1 (all cores) or any value > 1.
  2. Disable parallelism by leaving threads=1 (the default) if you want strictly single-threaded execution.
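The threads argument follows the familiar joblib-style convention. A minimal sketch of how such a value might be resolved to a concrete worker count (resolve_threads is a hypothetical helper for illustration, not part of pytimetk's public API):

```python
import os

def resolve_threads(threads: int) -> int:
    """Map a joblib-style threads argument to a worker count.

    threads == -1 -> use every available core
    threads == 1  -> strictly single-threaded (no Ray involved)
    threads > 1   -> exactly that many workers
    """
    if threads == -1:
        return os.cpu_count() or 1
    if threads < 1:
        raise ValueError("threads must be -1 or a positive integer")
    return threads

print(resolve_threads(1))   # the single-threaded default resolves to 1
print(resolve_threads(4))   # a fixed worker count resolves to itself
```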

To keep things predictable, Ray is initialized lazily the first time a helper actually needs it, so the common single-threaded path has zero extra overhead.

Example: ts_features with Ray

The snippet below mirrors the production behavior. Run it from any Python session (no Ray-specific bootstrapping required):

import pandas as pd
import pytimetk as tk
from tsfeatures import acf_features, hurst

# Load a small grouped dataset
df = tk.load_dataset("m4_hourly", parse_dates=["date"])

# Extract a couple of features per id using Ray workers
feature_df = (
    df
        .groupby("id", sort=False)
        .ts_features(
            date_column="date",
            value_column="value",
            features=[acf_features, hurst],
            freq=24,
            threads=-1,          # <-- spin up Ray workers (all cores)
            show_progress=True,
        )
)

print(feature_df.head())

Sample output (the Ray startup log line appears on the first parallel call):

2025-11-07 09:12:21,496 INFO worker.py:2012 -- Started a local Ray instance.
     id     hurst    x_acf1   x_acf10  diff1_acf1  diff1_acf10  diff2_acf1  \
0   H10  0.899455  0.935151  2.857201    0.181541     0.422999   -0.557066   
1  H150  0.464328  0.909548  2.548242    0.316810     0.350988   -0.422249   
2  H410  0.480160  0.803585  1.235837    0.258207     0.222276   -0.216247   
3   H50  0.890642  0.972679  3.225596    0.933112     3.583310    0.433440   

   diff2_acf10  seas_acf1  
0     0.331021   0.869954  
1     0.227795   0.711519  
2     0.219944   0.752024  
3     0.847016   0.900682  

Behind the scenes:

  • The grouped frame is chunked by id.
  • Ray initializes (if it hasn’t already) using the available CPU count.
  • Each chunk runs in parallel, and the results are stitched back together in the original order.
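The chunk-and-stitch pattern itself is independent of Ray. The stdlib sketch below illustrates it with a thread pool standing in for Ray workers (parallel_by_group is an illustrative helper, not pytimetk code): chunks are keyed by id in first-seen order, processed in parallel, and reassembled in that same order.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_by_group(rows, key, fn, max_workers=4):
    """Apply fn to each group of rows, preserving first-seen group order."""
    # 1. Chunk by key, remembering the order in which groups first appear.
    chunks = {}
    for row in rows:
        chunks.setdefault(row[key], []).append(row)
    # 2. Fan the chunks out to workers (Ray remote tasks in pytimetk).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fn, chunks.values())  # map preserves input order
    # 3. Stitch the results back together in the original group order.
    return dict(zip(chunks.keys(), results))

rows = [
    {"id": "H10", "value": 3}, {"id": "H150", "value": 5},
    {"id": "H10", "value": 7}, {"id": "H410", "value": 2},
]
totals = parallel_by_group(rows, "id", lambda chunk: sum(r["value"] for r in chunk))
print(totals)  # {'H10': 10, 'H150': 5, 'H410': 2}
```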

If you need to fall back to single-threaded mode (for example, when debugging in an environment that restricts background processes), set threads=1 and the helper will never touch Ray.

Troubleshooting Tips

  • Memory pressure? Pass a smaller threads value (e.g., threads=2) to cap the number of Ray workers.
  • Jupyter notebooks sometimes keep Ray clusters alive between runs. Call import ray; ray.shutdown() if you need to tear down the cluster manually.
  • Progress bars still work. If you prefer silence, set show_progress=False.