Changelog for Pytimetk

pytimetk 1.0.1.9000

New Features

Time Series Cross Validation (TSCV)

Integration with timebasedcv #291. New Classes:

  • TimeSeriesCV(): An enhanced version of TimeBasedSplit() that defaults to mode = "backwards", allows for maximum splits using split_limit, and adds enhanced diagnostics like glimpse() and plot()

Plotly Dropdowns

A plotly dropdown automates the group-wise analysis. Instead of facets, which are only powerful for <=9 plots at a time, a dropdown can easily visualize more plots.

  • plot_timeseries(): Gets new parameters plotly_dropdown, plotly_dropdown_x, plotly_dropdown_y #301
  • plot_anomalies(): Gets new parameters plotly_dropdown, plotly_dropdown_x, plotly_dropdown_y #301

Wide-Format Plotting

  • plot_timeseries(value_column = list(), color_column=list()): Now supports multiple columns in wide format for grouped time series data visualization. #136

pytimetk 1.0.1

Fixes:

  • tk.summarize_by_time(): AttributeError: โ€˜DataFrameโ€™ object has no attribute โ€˜groupbyโ€™ #298

pytimetk 1.0.0

Pandas and Polars Compatibility:

Upgrading to:

  • pandas >= 2.0.0
  • polars >= 1.2.0

Use pytimetk <=0.4.0 to support:

  • pandas <2.0.0
  • polars <1.0.0

Improvements:

  • Implement sort_dataframe(): This function is used internally to make sure Polars and Pandas engines perform grouped operations consistently and correctly. #286 #290
  • .augment_lags() and .augment_leads(): value_column now accepts any dtype. #295

pytimetk 0.4.0

Feature Engineering Module:

  1. augment_pct_change(): pandas and polars engines

Finance Module Updates:

  1. augment_macd(): MACD, pandas and polars engines
  2. augment_bbands(): Bollinger Bands, pandas and polars engines
  3. augment_atr(): Average True Range, pandas and polars engines
  4. augment_ppo(): Percentage Price Oscillator, pandas and polars engines
  5. augment_rsi(): Relative Strength Index, pandas and polars engines
  6. augment_qsmomentum(): Quant Science Momentum Indicator, pandas and polars engines
  7. augment_roc(): Rate of Change (ROC), pandas and polars engines

Polars Upgrades

  • Migrate to polars 0.20.7

pytimetk 0.3.0

Correlation Funnel

The R package correlationfunnel has been ported inside pytimetk:

  • binarize()
  • correlate()
  • plot_correlation_funnel()

Core:

  • filter_by_time() - Filtering with time-based strings

Feature Engineering:

  • augment_diffs() - Can now add differenced columns
  • augment_fourier() - Can now add fourier features.

Finance Module:

  • augment_cmo(): Chande Momentum Oscillator (CMO)

New Polars Backends:

  • augment_diffs()
  • augment_fourier()
  • augment_cmo()

General

  • Make memory reduction optional #275

pytimetk 0.2.1

Bugfix - Issue with augment_rolling(engine='pandas') and augment_expanding(engine='pandas') with concatinating rolled/expanded calcโ€™s to the correct group

pytimetk 0.2.0

Anomaly Detection

  • anomalize(): A new scalable function for detecting time series anomalies (outliers)
  • plot_anomalies(): A scalable visualization tool for inspecting anomalies in time series data.
  • plot_anomalies_decomp: A scalable visualization tool for inspecting the observed, seasonal, trend, and remainder decomposition, which are useful for telling you whether or not anomalies are being detected to your preference.
  • plot_anomalies_cleaned(): A scalable visualization tool for showing the before and after transformation for the cleaned vs uncleaned anomalies.

New Functions:

  • apply_by_time(): For complex apply-style aggregations by time.
  • augment_rolling_apply(): For complex rolling operations using apply-style data frame functions.
  • augment_expanding(): For expanding calculations with single-column functions (e.g. mean).
  • augment_expanding_apply(): For complex expanding operations with apply-style data frame functions.
  • augment_hilbert(): Hilbert features for signal processing.
  • augment_wavelet(): Wavelet transform features.
  • get_frequency(): Infer a pandas-like frequency. More robust than pandas.infer_freq.
  • get_seasonal_frequency(): Infer the pandas-like seasonal frequency (periodicity) for the time series.
  • get_trend_frequency(): Infer the pandas-like trend for the time series.

New Finance Module

More coming soon.

  • augment_ewm(): Exponentially weighted augmentation

Speed Improvements

Polars Engines:

  • summarize_by_time(): Gains a polars engine.
    • 3X to 10X speed improvements.
  • augment_lags() and augment_leads(): Gains a polars engine. Speed improvements increase with number of lags/leads.
    • 6.5X speed improvement with 100 lags.
  • augment_rolling(): Gains a polars engine. 10X speed improvement.
  • augment_expanding(): Gains a polars engine.
  • augment_timeseries_signature(): Gains a polars engine. 3X speed improvement.
  • augment_holiday_signature(): Gains a polars engine.

Parallel Processing and Vectorized Optimizations:

  • pad_by_time(): Complete overhaul. Uses Cartesian Product (Vectorization) to enhance the speed. 1000s of time series can now be padded in seconds.
    • Independent review: Time went from over 90 minutes to 13 seconds for a 500X speedup on 10M rows.
  • future_frame(): Complete overhaul. Uses vectorization when possible. Grouped parallel processing. Set threads = -1 to use all cores.
    • Independent Review: Time went from 11 minutes to 2.5 minutes for a 4.4X speedup on 10M rows
  • ts_features: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • ts_summary: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • anomalize: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
  • augment_rolling() and augment_rolling_apply(): Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.

Helpful Utilities:

  • parallel_apply: Mimics the pandas apply() function with concurrent futures.
  • progress_apply: Adds a progress bar to pandas apply()
  • glimpse(): Mimics tidyverse (tibble) glimpse function

New Data Sets:

  • expedia: Expedia hotel searches time series data set

3 New Applied Tutorials:

  1. Sales Analysis Tutorial
  2. Finance Analysis Tutorial
  3. Demand Forecasting Tutorial
  4. Anomaly Detection Tutorial

Final Deprecations:

  • summarize_by_time(): kind = "period". This was removed for consistency with pytimetk. โ€œtimestampโ€ is the default.
  • augment_rolling(): use_independent_variables. This is replaced by augment_rolling_apply().

pytimetk 0.1.0 (2023-10-02)

About the Initial release.

This release includes the following features:

  1. A workhorse plotting function called plot_timeseries() ๐Ÿ’ช
  2. Three (3) data wrangling functions that will simplify 90% of time series tasks ๐Ÿ™
  3. Five (5) โ€œaugmentorโ€ functions: These add hundreds of features to time series to help in predictive tasks ๐Ÿง 
  4. Two (2) time series feature summarizes: identify key aspects of your time series ๐Ÿ”
  5. Nine (9) pandas series and DatetimeIndex helpers (work more easily with these timestamp data structures) โฒ
  6. Four (4) date utility functions that fill in missing function gaps in pandas ๐Ÿผ
  7. Two (2) Visualization utilities to help you customize your visualizations and make them look MORE professional ๐Ÿ“ˆ
  8. Two (2) Pandas helpers that help clean up and understand pandas data frames with time series ๐ŸŽ‡
  9. Twelve (12) time series datasets that you can practice PyTimeTK time series analysis on ๐Ÿ”ข

The PyTimeTK website comes with:

  1. Two (2) Getting started tutorials
  2. Five (5) Guides covering common tasks
  3. Coming Soon: Applied Tutorials in Sales, Finance, Demand Forecasting, Anomaly Detection, and more.