Adds percentage difference (percentage change) columns to pandas or polars data.
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| data | DataFrame or GroupBy (pandas or polars) | Input data to augment with percentage change columns. | required |
| date_column | str | Name of the column in the DataFrame that contains the dates. The data is sorted by this column before the percentage differences are added. | required |
| value_column | str or list | Column(s) in the DataFrame to add percentage difference values for. Either a single column name (string) or a list of column names. | required |
| periods | int or tuple or list | Periods to shift values by when percentage differencing. If it is an integer, the function will add that number of percentage difference values for each column specified in value_column. If it is a tuple, it generates percentage differences from the first to the second value (inclusive). If it is a list, it generates percentage differences based on the values in the list. | 1 |
| reduce_memory | bool | Whether to reduce the memory usage of the DataFrame by converting int and float columns to smaller byte sizes and str columns to categorical data. This reduces memory for large data but may reduce the resolution of float values and changes str columns to categorical. | False |
| engine | (auto, pandas, polars, cudf) | Execution engine. When "auto" (default), the backend is inferred from the input data type. Use "pandas", "polars", or "cudf" to force a specific backend. | "auto" |
Returns
| Name | Type | Description |
|---|---|---|
| | DataFrame | DataFrame with percentage differenced columns added, matching the backend of the input data. |
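The periods handling and the shape of the returned data can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not pytimetk's implementation: `expand_periods` and `pct_change_columns` are hypothetical helper names, an integer is assumed here to mean a single shift of that size, and the `_pctdiff_` column suffix is an assumed naming convention.

```python
import pandas as pd


def expand_periods(periods):
    """Hypothetical helper: normalize `periods` to a list of ints."""
    if isinstance(periods, int):
        # Assumption: an integer means one shift of that size.
        return [periods]
    if isinstance(periods, tuple):
        start, end = periods
        # A tuple covers every period from start to end, inclusive.
        return list(range(start, end + 1))
    # A list is used as given.
    return list(periods)


def pct_change_columns(df, date_column, value_column, periods=1):
    """Hypothetical helper: add one percentage change column per period."""
    # Sort by date first so each shift compares against earlier rows.
    out = df.sort_values(date_column).copy()
    for p in expand_periods(periods):
        # Column name pattern is an assumption made for this sketch.
        out[f"{value_column}_pctdiff_{p}"] = out[value_column].pct_change(
            periods=p, fill_method=None
        )
    return out


toy = pd.DataFrame(
    {
        "date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
        "value": [100.0, 110.0, 121.0],
    }
)

result = pct_change_columns(toy, "date", "value", periods=(1, 2))
print(result)
```

Each added column holds `value[t] / value[t - p] - 1`, so the first `p` rows of a column are missing because there is no earlier row to compare against.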
Examples
import pandas as pd
import polars as pl
import pytimetk as tk

df = tk.load_dataset('m4_daily', parse_dates=['date'])
df
| | id | date | value |
|---|---|---|---|
| 0 | D10 | 2014-07-03 | 2076.2 |
| 1 | D10 | 2014-07-04 | 2073.4 |
| 2 | D10 | 2014-07-05 | 2048.7 |
| 3 | D10 | 2014-07-06 | 2048.9 |
| 4 | D10 | 2014-07-07 | 2006.4 |
| ... | ... | ... | ... |
| 9738 | D500 | 2012-09-19 | 9418.8 |
| 9739 | D500 | 2012-09-20 | 9365.7 |
| 9740 | D500 | 2012-09-21 | 9445.9 |
| 9741 | D500 | 2012-09-22 | 9497.9 |
| 9742 | D500 | 2012-09-23 | 9545.3 |

9743 rows × 3 columns
# Example 1 - Add 7 pctdiff values for a single DataFrame object (pandas)
pctdiff_df_single = (
    df
        .query('id == "D10"')
        .augment_pct_change(
            date_column='date',
            value_column='value',
            periods=(1, 7)
        )
)
pctdiff_df_single.glimpse()
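Because `data` may also be a GroupBy object, the same calculation can be applied per group so that one series never compares against another. The following is a plain-pandas sketch of that idea on made-up toy data, not a pytimetk call; the `value_pctdiff_1` column name is an assumption.

```python
import pandas as pd

# Toy data with two series; the values are illustrative only.
toy = pd.DataFrame(
    {
        "id": ["A", "A", "B", "B"],
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"]
        ),
        "value": [100.0, 110.0, 200.0, 150.0],
    }
)

# Sort within each id by date, then compute the change inside each group.
grouped = toy.sort_values(["id", "date"]).reset_index(drop=True)
grouped["value_pctdiff_1"] = grouped.groupby("id")["value"].pct_change(
    periods=1, fill_method=None
)
print(grouped)
```

The first row of each group is missing because the shift does not cross group boundaries.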