Changelog for Pytimetk
pytimetk 1.0.1.9000
New Features
Time Series Cross Validation (TSCV)
Integration with timebasedcv. #291
New Classes:
- TimeSeriesCV(): An enhanced version of TimeBasedSplit() that defaults to mode = "backwards", allows for maximum splits using split_limit, and adds enhanced diagnostics like glimpse() and plot().
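To illustrate what a backwards mode and a split limit mean in practice, here is a minimal sketch in plain Python. It does not use the pytimetk or timebasedcv API: the function backward_splits and all of its parameters are hypothetical, and the real classes may treat window boundaries differently.

```python
from datetime import date, timedelta


def backward_splits(start, end, train_days, horizon_days, stride_days, split_limit=None):
    """Return a list of (train_start, train_end, test_start, test_end) windows.

    Hypothetical helper: windows are generated from the END of the series
    toward the start, so the most recent data is used first. Windows are
    half-open: [train_start, train_end) and [test_start, test_end).
    """
    splits = []
    test_end = end
    while True:
        test_start = test_end - timedelta(days=horizon_days)
        train_end = test_start
        train_start = train_end - timedelta(days=train_days)
        if train_start < start:
            break  # not enough history left for a full training window
        splits.append((train_start, train_end, test_start, test_end))
        if split_limit is not None and len(splits) >= split_limit:
            break  # cap the number of splits
        test_end = test_end - timedelta(days=stride_days)
    return splits


splits = backward_splits(
    start=date(2023, 1, 1),
    end=date(2024, 1, 1),
    train_days=180,
    horizon_days=30,
    stride_days=30,
    split_limit=3,
)
for train_start, train_end, test_start, test_end in splits:
    print(f"train {train_start} -> {train_end} | test {test_start} -> {test_end}")
```

Working backwards means the first split always ends at the latest observation, which is usually the window that matters most when evaluating a forecasting model.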
Plotly Dropdowns
A plotly dropdown automates group-wise analysis. Facets are practical only for 9 or fewer plots at a time, whereas a dropdown can easily visualize many more.
- plot_timeseries(): Gets new parameters plotly_dropdown, plotly_dropdown_x, plotly_dropdown_y. #301
- plot_anomalies(): Gets new parameters plotly_dropdown, plotly_dropdown_x, plotly_dropdown_y. #301
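As a rough picture of how a group dropdown can work in Plotly, the sketch below builds an updatemenus entry that shows one trace per selection. This is not the pytimetk implementation: the helper dropdown_menu is hypothetical, and mapping plotly_dropdown_x and plotly_dropdown_y to the menu's x and y position is an assumption.

```python
def dropdown_menu(group_names, x=0.0, y=1.0):
    """Build a Plotly-style `updatemenus` entry with one button per group.

    Assumes the figure holds exactly one trace per group, in the same order
    as `group_names`. Selecting a button shows that trace and hides the rest.
    """
    buttons = []
    for i, name in enumerate(group_names):
        # Visibility mask: True only for the trace belonging to this group.
        visible = [j == i for j in range(len(group_names))]
        buttons.append({
            "label": str(name),
            "method": "update",
            "args": [{"visible": visible}],
        })
    return {"type": "dropdown", "buttons": buttons, "x": x, "y": y}


menu = dropdown_menu(["store_1", "store_2", "store_3"], x=0.0, y=1.1)
print(menu["buttons"][1])
```

With a Plotly figure in hand, a menu like this would typically be attached with fig.update_layout(updatemenus=[menu]).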
Wide-Format Plotting
- plot_timeseries(value_column = list(), color_column=list()): Now supports multiple columns in wide format for grouped time series data visualization. #136
pytimetk 1.0.1
Fixes:
- tk.summarize_by_time(): AttributeError: 'DataFrame' object has no attribute 'groupby'. #298
pytimetk 1.0.0
Pandas and Polars Compatibility:
Upgrading to:
- pandas >= 2.0.0
- polars >= 1.2.0
Use pytimetk <=0.4.0 to support:
- pandas <2.0.0
- polars <1.0.0
Improvements:
- Implement sort_dataframe(): This function is used internally to make sure Polars and Pandas engines perform grouped operations consistently and correctly. #286 #290
- .augment_lags() and .augment_leads(): value_column now accepts any dtype. #295
pytimetk 0.4.0
Feature Engineering Module:
- augment_pct_change(): pandas and polars engines
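For readers unfamiliar with the calculation, percentage change over a number of periods can be sketched in plain Python as follows. The function below is illustrative only and is not the pytimetk signature; it also assumes no prior value is zero.

```python
def pct_change(values, periods=1):
    """Return (x[t] - x[t-periods]) / x[t-periods]; None where no prior value exists."""
    out = []
    for t, x in enumerate(values):
        if t < periods:
            out.append(None)  # no earlier observation to compare against
        else:
            prev = values[t - periods]
            out.append((x - prev) / prev)
    return out


prices = [100.0, 110.0, 99.0, 99.0]
changes = pct_change(prices)
print(changes)
```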
Finance Module Updates:
- augment_macd(): MACD, pandas and polars engines
- augment_bbands(): Bollinger Bands, pandas and polars engines
- augment_atr(): Average True Range, pandas and polars engines
- augment_ppo(): Percentage Price Oscillator, pandas and polars engines
- augment_rsi(): Relative Strength Index, pandas and polars engines
- augment_qsmomentum(): Quant Science Momentum Indicator, pandas and polars engines
- augment_roc(): Rate of Change (ROC), pandas and polars engines
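As one example of the indicators listed above, Bollinger Bands are a rolling mean plus and minus a multiple of the rolling standard deviation. The sketch below is a plain-Python illustration, not the pytimetk code: the function name and parameters are hypothetical, and it assumes the population standard deviation (libraries differ on this choice).

```python
from statistics import mean, pstdev


def bollinger_bands(values, window=20, num_std=2.0):
    """Return a list of (middle, upper, lower) tuples; None until the window is full.

    Assumption: population standard deviation. pandas' rolling std, for
    instance, defaults to the sample version instead.
    """
    bands = []
    for t in range(len(values)):
        if t + 1 < window:
            bands.append(None)
            continue
        win = values[t + 1 - window : t + 1]  # the `window` latest values up to t
        mid = mean(win)
        sd = pstdev(win)
        bands.append((mid, mid + num_std * sd, mid - num_std * sd))
    return bands


close = [10.0, 12.0, 11.0, 13.0, 12.0]
bands = bollinger_bands(close, window=3, num_std=2.0)
print(bands)
```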
Polars Upgrades
- Migrate to polars 0.20.7
pytimetk 0.3.0
Correlation Funnel
The R package correlationfunnel has been ported inside pytimetk:
- binarize()
- correlate()
- plot_correlation_funnel()
Core:
- filter_by_time(): Filtering with time-based strings
Feature Engineering:
- augment_diffs(): Can now add differenced columns.
- augment_fourier(): Can now add Fourier features.
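The two feature types above can be sketched in plain Python. These helpers are hypothetical and only show the underlying arithmetic; the actual pytimetk functions operate on data frames and may parameterize periods and orders differently.

```python
import math


def diffs(values, periods=1):
    """x[t] - x[t-periods]; None where no prior value exists."""
    return [None if t < periods else values[t] - values[t - periods]
            for t in range(len(values))]


def fourier_features(n_obs, period, max_order=1):
    """Return {name: list} of sin/cos columns for the time index 0..n_obs-1."""
    features = {}
    for k in range(1, max_order + 1):
        # k full cycles per `period` observations
        angle = [2.0 * math.pi * k * t / period for t in range(n_obs)]
        features[f"sin_{k}"] = [math.sin(a) for a in angle]
        features[f"cos_{k}"] = [math.cos(a) for a in angle]
    return features


differenced = diffs([1, 4, 9, 16])
fourier = fourier_features(n_obs=4, period=4, max_order=2)
print(differenced)
print(sorted(fourier))
```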
Finance Module:
- augment_cmo(): Chande Momentum Oscillator (CMO)
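The Chande Momentum Oscillator compares the sum of gains with the sum of losses over a lookback window: CMO = 100 * (gains - losses) / (gains + losses). A minimal plain-Python sketch follows; the function is illustrative rather than the pytimetk implementation, and returning 0.0 for a flat window is an assumption.

```python
def cmo(values, period=14):
    """Chande Momentum Oscillator; None until `period` changes are available."""
    out = [None] * len(values)
    # changes[i] is the move from values[i] to values[i + 1]
    changes = [values[t] - values[t - 1] for t in range(1, len(values))]
    for t in range(period, len(values)):
        win = changes[t - period : t]  # the `period` most recent changes ending at t
        gains = sum(c for c in win if c > 0)
        losses = sum(-c for c in win if c < 0)
        total = gains + losses
        # Assumption: a window with no movement is reported as 0.0
        out[t] = 0.0 if total == 0 else 100.0 * (gains - losses) / total
    return out


rising = cmo([1.0, 2.0, 3.0, 4.0], period=3)
mixed = cmo([1.0, 2.0, 1.0, 2.0, 1.0], period=4)
print(rising, mixed)
```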
New Polars Backends:
- augment_diffs()
- augment_fourier()
- augment_cmo()
General
- Make memory reduction optional #275
pytimetk 0.2.1
Bugfix: Issue with augment_rolling(engine='pandas') and augment_expanding(engine='pandas') concatenating rolled/expanded calculations to the correct group.
pytimetk 0.2.0
Anomaly Detection
- anomalize(): A new scalable function for detecting time series anomalies (outliers).
- plot_anomalies(): A scalable visualization tool for inspecting anomalies in time series data.
- plot_anomalies_decomp(): A scalable visualization tool for inspecting the observed, seasonal, trend, and remainder decomposition, which is useful for telling you whether or not anomalies are being detected to your preference.
- plot_anomalies_cleaned(): A scalable visualization tool for showing the before and after transformation for the cleaned vs uncleaned anomalies.
New Functions:
- apply_by_time(): For complex apply-style aggregations by time.
- augment_rolling_apply(): For complex rolling operations using apply-style data frame functions.
- augment_expanding(): For expanding calculations with single-column functions (e.g. mean).
- augment_expanding_apply(): For complex expanding operations with apply-style data frame functions.
- augment_hilbert(): Hilbert features for signal processing.
- augment_wavelet(): Wavelet transform features.
- get_frequency(): Infer a pandas-like frequency. More robust than pandas.infer_freq.
- get_seasonal_frequency(): Infer the pandas-like seasonal frequency (periodicity) for the time series.
- get_trend_frequency(): Infer the pandas-like trend for the time series.
New Finance Module
More coming soon.
- augment_ewm(): Exponentially weighted augmentation
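An exponentially weighted mean gives recent observations more weight than older ones. The sketch below uses the simple recursive form y[t] = alpha * x[t] + (1 - alpha) * y[t-1], which corresponds to the adjust=False variant in pandas. The function name is hypothetical, and which variant augment_ewm() uses by default is not stated here.

```python
def ewm_mean(values, alpha):
    """Recursive exponentially weighted mean, seeded with the first value."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    for x in values:
        # First observation seeds the series; later ones blend with the previous mean.
        out.append(x if not out else alpha * x + (1.0 - alpha) * out[-1])
    return out


smoothed = ewm_mean([1.0, 2.0, 3.0], alpha=0.5)
print(smoothed)
```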
Speed Improvements
Polars Engines:
- summarize_by_time(): Gains a polars engine.
  - 3X to 10X speed improvements.
- augment_lags() and augment_leads(): Gain a polars engine. Speed improvements increase with number of lags/leads.
  - 6.5X speed improvement with 100 lags.
- augment_rolling(): Gains a polars engine. 10X speed improvement.
- augment_expanding(): Gains a polars engine.
- augment_timeseries_signature(): Gains a polars engine. 3X speed improvement.
- augment_holiday_signature(): Gains a polars engine.
Parallel Processing and Vectorized Optimizations:
- pad_by_time(): Complete overhaul. Uses Cartesian Product (Vectorization) to enhance the speed. 1000s of time series can now be padded in seconds.
  - Independent review: Time went from over 90 minutes to 13 seconds for a 500X speedup on 10M rows.
- future_frame(): Complete overhaul. Uses vectorization when possible. Grouped parallel processing. Set threads = -1 to use all cores.
  - Independent review: Time went from 11 minutes to 2.5 minutes for a 4.4X speedup on 10M rows.
- ts_features: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
- ts_summary: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
- anomalize: Uses concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
- augment_rolling() and augment_rolling_apply(): Use concurrent futures to parallelize tasks. Set threads = -1 to use all cores.
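The Cartesian-product idea behind the padding overhaul can be sketched as follows: build every (group, timestamp) combination once, then look up the observed values, instead of looping over groups and reindexing each one. This is a plain-Python illustration with a hypothetical helper; it assumes daily data and pads every group to the global date range, which may differ from how pad_by_time() chooses its range.

```python
from datetime import date, timedelta
from itertools import product


def pad_by_day(rows):
    """rows: list of (group, date, value). Return rows padded to a full daily grid.

    Simplifying assumption: every group is expanded to the global min..max
    date range. Missing observations get the value None.
    """
    observed = {(g, d): v for g, d, v in rows}
    groups = sorted({g for g, _, _ in rows})
    first = min(d for _, d, _ in rows)
    last = max(d for _, d, _ in rows)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    # One (group, day) pair per combination; the dict lookup plays the role of a left join.
    return [(g, d, observed.get((g, d))) for g, d in product(groups, days)]


rows = [
    ("a", date(2024, 1, 1), 1.0),
    ("a", date(2024, 1, 3), 3.0),
    ("b", date(2024, 1, 2), 5.0),
]
padded = pad_by_day(rows)
for row in padded:
    print(row)
```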
Helpful Utilities:
- parallel_apply: Mimics the pandas apply() function with concurrent futures.
- progress_apply: Adds a progress bar to pandas apply().
- glimpse(): Mimics the tidyverse (tibble) glimpse function.
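To show the general pattern of applying a function per group with concurrent futures, here is a minimal standard-library sketch. It is not the pytimetk parallel_apply: the name parallel_apply_sketch is hypothetical, the threads = -1 handling is an assumption modeled on the notes above, and a thread pool is used for simplicity (a process pool is usually the better fit for CPU-bound work).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_apply_sketch(groups, func, threads=1):
    """groups: dict of {key: list of values}. Apply func to each group concurrently.

    Assumption: threads = -1 means "use all available cores".
    """
    max_workers = (os.cpu_count() or 1) if threads == -1 else max(1, threads)
    keys = list(groups)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with keys.
        results = list(pool.map(func, (groups[k] for k in keys)))
    return dict(zip(keys, results))


totals = parallel_apply_sketch({"a": [1, 2, 3], "b": [4, 5]}, sum, threads=-1)
print(totals)
```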
New Data Sets:
- expedia: Expedia hotel searches time series data set
3 New Applied Tutorials:
Final Deprecations:
- summarize_by_time(): kind = "period". This was removed for consistency with pytimetk. "timestamp" is the default.
- augment_rolling(): use_independent_variables. This is replaced by augment_rolling_apply().
pytimetk 0.1.0 (2023-10-02)
About the Initial release.
This release includes the following features:
- A workhorse plotting function called plot_timeseries()
- Three (3) data wrangling functions that will simplify 90% of time series tasks
- Five (5) "augmentor" functions: These add hundreds of features to time series to help in predictive tasks
- Two (2) time series feature summarizers: identify key aspects of your time series
- Nine (9) pandas series and DatetimeIndex helpers (work more easily with these timestamp data structures)
- Four (4) date utility functions that fill in missing function gaps in pandas
- Two (2) visualization utilities to help you customize your visualizations and make them look more professional
- Two (2) pandas helpers that help clean up and understand pandas data frames with time series
- Twelve (12) time series datasets that you can practice PyTimeTK time series analysis on
The PyTimeTK website comes with:
- Two (2) Getting started tutorials
- Five (5) Guides covering common tasks
- Coming Soon: Applied Tutorials in Sales, Finance, Demand Forecasting, Anomaly Detection, and more.