NEWS.md
Modeltime now has a Spark Backend
NEW Vignette - Modeltime Spark Backend describing how to set up Modeltime with the Spark Backend.
If users install smooth, the following models become available:
adam_reg(): Interfaces with the ADAM forecasting algorithm in smooth.
exp_smoothing(): A new engine “smooth_es” connects to the Exponential Smoothing algorithm in smooth::es(). This algorithm has several advantages, most importantly that it can use x-regs (unlike the “ets” engine).
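A minimal sketch of selecting the new engines. It assumes the smooth package is installed. The engine name “smooth_es” comes from the notes above; “auto_adam” is assumed here to be the adam_reg() engine name and should be checked against the package documentation.

```r
library(modeltime)
library(parsnip)

# ADAM specification; the "auto_adam" engine name is an assumption
spec_adam <- adam_reg() %>%
    set_engine("auto_adam")

# Exponential smoothing routed through smooth::es()
spec_es <- exp_smoothing() %>%
    set_engine("smooth_es")

spec_es
```

Both objects are unfitted specifications; fitting them with fit() is where the smooth package is actually needed.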
extract_nested_modeltime_table(): Extracts a nested modeltime table by row id.
extract_nested_train_split and extract_nested_test_split: Changed parameter from .data to .object for consistency with other “extract” functions.
Added a new logged feature to modeltime_nested_fit() to track the attribute “metric_set”, which is needed for ensembles. Old nested modeltime objects will need to be re-run to get this new attribute.
Nested (Iterative) Forecasting is aimed at making it easier to perform forecasting that is traditionally done in a for-loop with models like ARIMA, Prophet, and Exponential Smoothing. Functionality has been added to:
extend_timeseries(), nest_timeseries(), and split_nested_timeseries().
modeltime_nested_fit(): Fits many models to nested time series data and organizes them in a “Nested Modeltime Table”. Logs Accuracy, Errors, and Test Forecasts.
control_nested_fit(): Used to control the fitting process including verbosity and parallel processing.
Logging Extractors: Functions that retrieve logged information from the initial fitting process. extract_nested_test_accuracy(), extract_nested_error_report(), and extract_nested_test_forecast().
modeltime_nested_select_best(): Selects the best model for each time series ID.
Logging Extractors: Functions that retrieve logged information from the model selection process. extract_nested_best_model_report()
modeltime_nested_refit(): Refits to the .future_data. Logs Future Forecasts.
control_nested_refit(): Used to control the re-fitting process including verbosity and parallel processing.
Logging Extractors: Functions that retrieve logged information from the re-fitting process. extract_nested_future_forecast().
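The nested workflow described above can be sketched end to end as follows. The data set (walmart_sales_weekly from timetk), the 52-week horizons, and the single Prophet workflow are illustrative assumptions, not part of these notes.

```r
library(modeltime)
library(tidymodels)
library(timetk)
library(dplyr)

# Example panel data shipped with timetk: one weekly series per id
data_tbl <- walmart_sales_weekly %>%
    select(id, date = Date, value = Weekly_Sales)

# 1. Prepare: extend each series into the future, nest by id, split train/test
nested_data_tbl <- data_tbl %>%
    extend_timeseries(.id_var = id, .date_var = date, .length_future = 52) %>%
    nest_timeseries(.id_var = id, .length_future = 52, .length_actual = 52 * 2) %>%
    split_nested_timeseries(.length_test = 52)

# 2. Define an unfitted workflow; the recipe is built from one training split
wflw_prophet <- workflow() %>%
    add_model(prophet_reg(seasonality_yearly = TRUE) %>% set_engine("prophet")) %>%
    add_recipe(recipe(value ~ date, extract_nested_train_split(nested_data_tbl)))

# 3. Fit every workflow to every series; accuracy, errors, and test forecasts are logged
nested_modeltime_tbl <- modeltime_nested_fit(
    nested_data = nested_data_tbl,
    wflw_prophet,
    control = control_nested_fit(verbose = TRUE)
)
nested_modeltime_tbl %>% extract_nested_test_accuracy()
nested_modeltime_tbl %>% extract_nested_error_report()

# 4. Keep the best model per series, judged by RMSE on the test split
best_tbl <- nested_modeltime_tbl %>%
    modeltime_nested_select_best(metric = "rmse", minimize = TRUE)

# 5. Refit on the full actual data and forecast the .future_data
refit_tbl <- best_tbl %>%
    modeltime_nested_refit(control = control_nested_refit(verbose = TRUE))

future_forecast_tbl <- refit_tbl %>% extract_nested_future_forecast()
```

With only one workflow the selection step is trivial; passing several workflows to modeltime_nested_fit() is where it becomes useful.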
extended_forecast_accuracy_metric_set(): Adds the new MAAPE metric for handling intermittent data when MAPE returns Inf.
maape(): New yardstick metric that calculates “Mean Arctangent Absolute Percentage Error” (MAAPE). Used when MAPE returns Inf, typically due to intermittent data.
modeltime_fit_workflowset(): Improved handling of Workflowset Descriptions, which now match the wflow_id.
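To see why MAAPE helps with intermittent data, here is a base R sketch of the textbook formula. The helper maape_sketch() is hypothetical and is not the package's implementation; scaling conventions (for example, reporting in percent) may differ from maape().

```r
# Hypothetical helper for illustration; not the package's implementation
maape_sketch <- function(truth, estimate) {
    # atan() maps an infinite percentage error to pi/2 instead of Inf
    mean(atan(abs((truth - estimate) / truth)))
}

# Intermittent series: the first actual value is zero
truth    <- c(0, 10, 20)
estimate <- c(2, 12, 15)

# MAPE divides by zero for the first observation, so the mean is Inf
mape_value <- mean(abs((truth - estimate) / truth)) * 100

# MAAPE stays finite because each term is bounded by pi/2
maape_value <- maape_sketch(truth, estimate)
```

Note that an observation where both the actual value and the prediction are zero yields 0/0 and would still need separate handling.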
We’ve expanded Panel Data functionality to produce model accuracy and confidence interval estimates by a Time Series ID (#114). This is useful when you have a Global Model that produces forecasts for more than one time series. You can more easily obtain grouped accuracy and confidence interval estimates.
modeltime_calibrate(): Gains an id argument that is a quoted column name. This identifies that the residuals should be tracked by a time series identifier feature that indicates the time series groups.
modeltime_accuracy(): Gains an acc_by_id argument that is TRUE/FALSE. If the data has been calibrated with id, then the user can return local model accuracy by the identifier column. The accuracy data frame will return a row for each combination of Model ID and Time Series ID.
modeltime_forecast(): Gains a conf_by_id argument that is TRUE/FALSE. If the data has been calibrated with id, then the user can return local model confidence by the identifier column. The forecast data frame will return an extra column indicating the identifier column. The confidence intervals will be adjusted based on the local time series ID variance instead of the global model variance.
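A sketch of the three arguments working together on a global model. The data set (walmart_sales_weekly from timetk), the 3-month assessment window, and the simple linear model are assumptions chosen for illustration.

```r
library(modeltime)
library(tidymodels)
library(timetk)
library(dplyr)

# Panel data: several weekly series identified by the id column
data_tbl <- walmart_sales_weekly %>%
    select(id, Date, Weekly_Sales)

splits <- data_tbl %>%
    time_series_split(date_var = Date, assess = "3 months", cumulative = TRUE)

# One global model trained across all series, with id as a predictor
model_fit_lm <- linear_reg() %>%
    set_engine("lm") %>%
    fit(Weekly_Sales ~ as.numeric(Date) + id, data = training(splits))

# Track residuals per series by naming the identifier column
calibration_tbl <- modeltime_table(model_fit_lm) %>%
    modeltime_calibrate(new_data = testing(splits), id = "id")

# One accuracy row per (model, series) combination
accuracy_by_id_tbl <- calibration_tbl %>%
    modeltime_accuracy(acc_by_id = TRUE)

# Confidence intervals based on each series' own residual variance
forecast_by_id_tbl <- calibration_tbl %>%
    modeltime_forecast(
        new_data    = testing(splits),
        actual_data = data_tbl,
        conf_by_id  = TRUE
    )
```

Dropping acc_by_id and conf_by_id returns the global estimates, which is a quick way to compare the two views.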
temporal_hierarchy(): Implements the thief package by Rob Hyndman and Nikolaos Kourentzes for “Temporal HIErarchical Forecasting”. #117
Fixed an issue in modeltime_fit_workflowset() where the workflowset (wflw_id) order was not maintained.
Parallel Processing
New Vignette: Parallel Processing
parallel_start() and parallel_stop(): Helpers for setting up multicore processing.
create_model_grid(): Helper to generate model specifications with filled-in parameters from a parameter grid (e.g. dials::grid_regular()).
control_refit() and control_fit_workflowset(): Better printing.
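A short sketch of create_model_grid(), assuming an XGBoost boost_tree() specification and a three-level learn_rate grid (both chosen only for illustration).

```r
library(modeltime)
library(tidymodels)

# Parameter grid: three learning rates from dials
grid_tbl <- grid_regular(learn_rate(), levels = 3)

# One model specification per grid row, with the parameter filled in
model_grid_tbl <- grid_tbl %>%
    create_model_grid(
        f_model_spec = boost_tree,
        engine_name  = "xgboost",
        mode         = "regression"
    )

# The generated specifications live in the .models list column
model_grid_tbl$.models
```

The resulting specifications are unfitted, so they can be added to workflows and fitted in parallel.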
Bug Fixes
Fixed an issue with cores > cores_available.
modeltime_fit_workflowset() (#85) makes it easy to convert workflow_set objects to Modeltime Tables (mdl_time_tbl). Requires a refitting process that can now be performed in parallel or in sequence.
exp_smoothing() gained 3 new tunable parameters:
smooth_level(): This is often called the “alpha” parameter used as the base level smoothing factor for exponential smoothing models.
smooth_trend(): This is often called the “beta” parameter used as the trend smoothing factor for exponential smoothing models.
smooth_seasonal(): This is often called the “gamma” parameter used as the seasonal smoothing factor for exponential smoothing models.
modeltime_refit(): Supports parallel processing. See control_refit().
modeltime_fit_workflowset(): Supports parallel processing. See control_fit_workflowset().
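A sketch of a parallel refit, assuming two worker processes and a simple linear model on the built-in m750 data. With a single model there is little to gain; the point is only to show where the control object goes.

```r
library(modeltime)
library(tidymodels)

# A simple fitted model on the training portion of the built-in split
model_fit_lm <- linear_reg() %>%
    set_engine("lm") %>%
    fit(value ~ as.numeric(date), data = training(m750_splits))

calibration_tbl <- modeltime_table(model_fit_lm) %>%
    modeltime_calibrate(new_data = testing(m750_splits))

# Start a 2-worker backend (the worker count is an arbitrary choice)
parallel_start(2)

# allow_par = TRUE lets the refit use the registered backend
refit_tbl <- calibration_tbl %>%
    modeltime_refit(
        data    = m750,
        control = control_refit(verbose = TRUE, allow_par = TRUE)
    )

# Always release the workers when done
parallel_stop()
```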
boost_tree(mtry): Mapping switched from colsample_bytree to colsample_bynode. prophet_boost() and arima_boost() have been updated to reflect this change. https://github.com/tidymodels/parsnip/pull/499
exp_smoothing() models produced in prior versions may require refitting with modeltime_refit() to upgrade their internals with the new parameters.
recursive() for ensembles. The new recursive ensemble functionality is in modeltime.ensemble >= 0.3.0.9000.
recursive() (#71) - Received a full upgrade to work with Panel Data.
modeltime::metric_tweak() for yardstick::metric_tweak(). The yardstick::metric_tweak() has a required .name argument in addition to .fn, which is needed for tuning.
Baseline algorithms (#5, #37) have been created for comparing high-performance methods with simple forecasting methods.
window_reg: Window-based methods such as mean, median, and even more complex seasonal models based on a forecasting window. The main tuning parameter is window_size.
naive_reg: NAIVE and Seasonal NAIVE (SNAIVE) Regression Models
metric_tweak() - Can modify yardstick metrics like mase(), which have seasonal parameters.
default_forecast_accuracy_metric_set() - Gets a ... parameter that allows us to add more metrics beyond the defaults.
A new function, modeltime_residuals_test(), has been added (#62, #68). Tests are implemented:
plot_modeltime_forecast() - When plotting a single point forecast, plot_modeltime_forecast() now uses geom_point() instead of geom_line(). Fixes #66.
Fixes
recursive() & modeltime_refit(): Now able to refit a recursive workflow or recursive fitted parsnip object.
New Functions
recursive(): Turn a fitted model into a recursive predictor. (#49, #50)
update_modeltime_model(): New function to update a modeltime model inside a Modeltime Table.
Breaking Changes
arima_workflow_tuned dataset.
as_modeltime_table(): New function to convert one or more fitted models stored in a list to a Modeltime Table.
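The baseline algorithms and as_modeltime_table() mentioned above combine naturally. The sketch below assumes the built-in m750 data, a 12-month window, and a 12-month seasonal period; these values are illustrative choices.

```r
library(modeltime)
library(tidymodels)

train_tbl <- training(m750_splits)

# Window baseline: forecast the mean of the last 12 observations
model_fit_window <- window_reg(window_size = 12) %>%
    set_engine("window_function", window_function = mean) %>%
    fit(value ~ date, data = train_tbl)

# NAIVE baseline: repeat the last observed value
model_fit_naive <- naive_reg() %>%
    set_engine("naive") %>%
    fit(value ~ date, data = train_tbl)

# Seasonal NAIVE baseline: repeat the last observed 12-month season
model_fit_snaive <- naive_reg(seasonal_period = 12) %>%
    set_engine("snaive") %>%
    fit(value ~ date, data = train_tbl)

# Convert a plain list of fitted models into a Modeltime Table
baseline_tbl <- as_modeltime_table(
    list(model_fit_window, model_fit_naive, model_fit_snaive)
)

# Benchmark accuracy on the held-out test split
baseline_accuracy_tbl <- baseline_tbl %>%
    modeltime_calibrate(new_data = testing(m750_splits)) %>%
    modeltime_accuracy()
```

Any high-performance model added to the same table can then be judged against these baselines.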
Bug Fixes
m750_models: Fixes error “R parsnip Error: Internal error: Unknown composition type.”
Panel Data
modeltime_forecast() upgrades:
keep_data: Gains a new argument keep_data. This is useful when the new_data and actual_data have important information needed in analyzing the forecast.
arrange_index: Gains a new argument arrange_index. By default, the forecast keeps the rows in the same order as the incoming data. Prior versions arranged Model Predictions by .index, which impacts the user's ability to match to Panel Data, which is not likely to be arranged by date. Prediction best practices are to keep the original order of the data, which will be preserved by default. To get the old behavior, simply toggle arrange_index = TRUE.
modeltime_calibrate(): Can now handle panel data.
modeltime_accuracy(): Can now handle panel data.
plot_modeltime_forecast(): Can handle panel data provided the data is grouped by an ID column prior to plotting.
Error Messaging
modeltime_calibrate(quiet = FALSE).
Compatibility
parsnip >= 0.1.4. Uses set_encodings() new parameter allow_sparse_x.
Ensembles
modeltime_refit() - Changes to improve fault tolerance and error handling / messaging when making ensembles.
Ensembles
modeltime.ensemble, a new R package designed for forecasting with ensemble models.
New Workflow Helper Functions
add_modeltime_model() - A helper function making it easy to add a fitted parsnip or workflow object to a modeltime table
pluck_modeltime_model() & pull_modeltime_model() - A helper function making it easy to extract a model from a modeltime table
Improvements
?prophet_boost
prophet_reg() can now have regressors controlled via set_engine() using the following parameters:
regressors.mode - Set to seasonality.mode by default.
regressors.prior.scale - Set to 10,000 by default.
regressors.standardize - Set to “auto” by default.
Data Sets
Modeltime now includes 4 new data sets:
m750 - M750 Time Series Dataset
m750_models - 3 Modeltime Models made on the M750 Dataset
m750_splits - An rsplit object containing Train/test splits of the M750 data
m750_training_resamples - A Time Series Cross Validation time_series_cv object made from the training(m750_splits)
Bug Fix
plot_modeltime_forecast(): Fixed an issue with “ACTUAL” data being shown at the bottom of the legend list. It should be the first item.
Forecast without Calibration/Refitting
Sometimes it’s important to make fast forecasts without calculating out-of-sample accuracy and refitting (which requires 2 rounds of model training). You can now bypass the modeltime_calibrate() and modeltime_refit() steps and jump straight into forecasting the future. Here’s an example with h = "3 years". Note that you will not get confidence intervals with this approach because calibration data is needed for this.
```r
# Make forecasts without calibration/refitting (No Confidence Intervals)
# - This assumes the models have been trained on m750
modeltime_table(
    model_fit_prophet,
    model_fit_lm
) %>%
    modeltime_forecast(
        h           = "3 years",
        actual_data = m750
    ) %>%
    plot_modeltime_forecast(.conf_interval_show = FALSE)
```
Residual Analysis & Diagnostics
Analyzing residuals is a common tool when forecasting, where residuals are .resid = .actual - .prediction. The residuals may have autocorrelation or nonzero mean, which can indicate model improvement opportunities. In addition, users may wish to inspect in-sample and out-of-sample residuals, which can display different results.
modeltime_residuals() - A new function used to extract out residual information
plot_modeltime_residuals() - Visualizes the output of modeltime_residuals(). Offers 3 plots: a time plot, an ACF plot, and a seasonality plot.
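A sketch of the residual workflow on out-of-sample data, assuming a simple linear model on the built-in m750 split. The output column name .residuals and the .type value "acf" are taken from the package documentation as I understand it; treat them as assumptions.

```r
library(modeltime)
library(tidymodels)

model_fit_lm <- linear_reg() %>%
    set_engine("lm") %>%
    fit(value ~ as.numeric(date), data = training(m750_splits))

# Out-of-sample residuals: calibrate on the test split, then extract
residuals_tbl <- modeltime_table(model_fit_lm) %>%
    modeltime_calibrate(new_data = testing(m750_splits)) %>%
    modeltime_residuals()

# Residuals are actual minus prediction (column name assumed to be .residuals)
head(residuals_tbl$.actual - residuals_tbl$.prediction)

# Static ACF plot of the residuals
residuals_plot <- residuals_tbl %>%
    plot_modeltime_residuals(.type = "acf", .interactive = FALSE)
```

Calibrating on training(m750_splits) instead gives the in-sample residuals for comparison.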
TBATS Model
Use seasonal_reg() and set engine to “tbats”.
```r
seasonal_reg(
    seasonal_period_1 = "1 day",
    seasonal_period_2 = "1 week"
) %>%
    set_engine("tbats")
```
NNETAR Model
Use nnetar_reg() and set engine to “nnetar”.
```r
model_fit_nnetar <- nnetar_reg() %>%
    set_engine("nnetar")
```
Prophet Model - Logistic Growth Support
prophet_reg() and prophet_boost():
Set growth = 'logistic' and one or more of logistic_cap and logistic_floor to valid saturation boundaries.
changepoint_num, changepoint_range, seasonality_yearly, seasonality_weekly, seasonality_daily, logistic_cap, logistic_floor
combine_modeltime_tables() - A helper function making it easy to combine multiple modeltime tables.
update_model_description() - A helper function making it easier to update model descriptions.
modeltime_refit(): When modeltime model parameters update (e.g. when Auto ARIMA changes to a new model), the Model Description now alerts the user (e.g. “UPDATE: ARIMA(0,1,1)(1,1,1)[12]”).
modeltime_calibrate(): When training data is supplied in a time window that the model has previously been trained on (e.g. training(splits)), the calibration calculation first inspects whether the “Fitted” data exists. If it exists, it returns the “Fitted” data. This helps prevent sequence-based models (e.g. ARIMA, ETS, TBATS) from displaying odd results because these algorithms can only predict sequences directly following the training window. If “Fitted” data is being used, the .type column will display “Fitted” instead of “Test”.
actual_data reconciliation strategies when recipe removes rows. Strategy attempts to fill predictors using “downup” strategy to prevent NA values from removing rows.
modeltime_accuracy(): Fix issue with new_data not recalibrating.
prophet_reg() and prophet_boost() - Can now perform logistic growth (growth = 'logistic'). The user can supply “saturation” bounds using logistic_cap and/or logistic_floor.
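A sketch of a logistic-growth Prophet model on the built-in m750 data. The cap of 12000 and floor of 0 are hypothetical saturation bounds picked for illustration, not recommended values.

```r
library(modeltime)
library(parsnip)

# Logistic growth with hypothetical saturation bounds
model_fit_logistic <- prophet_reg(
    growth         = "logistic",
    logistic_cap   = 12000,
    logistic_floor = 0
) %>%
    set_engine("prophet") %>%
    fit(value ~ date, data = m750)

model_fit_logistic
```

Supplying only one of the two bounds is also allowed, per the note above.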
seasonal_decomp() has changed to seasonal_reg() and now supports both TBATS and Seasonal Decomposition Models.
prophet_reg() & prophet_boost(): Argument changes:
num_changepoints has become changepoint_num
modeltime_forecast(): Now estimates confidence intervals using centered standard deviation. The mean is assumed to be zero and residuals deviate from mean = 0.
parsnip 0.1.2.
prophet_boost(): Set nthreads = 1 (default) to ensure parallelization is thread safe.
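One way to read the “centered standard deviation” note for modeltime_forecast() is sketched below in base R. The residuals, predictions, n - 1 denominator, and normal quantile are all assumptions made for illustration; the package's exact estimator may differ.

```r
# Hypothetical residuals and point forecasts, for illustration only
residuals   <- c(-3, 1, 4, -2, 2)
predictions <- c(100, 105, 110)
conf_level  <- 0.95

# Spread measured around zero rather than around mean(residuals)
sd_centered <- sqrt(sum(residuals^2) / (length(residuals) - 1))

# Symmetric interval from a normal quantile (an assumption)
z       <- qnorm(1 - (1 - conf_level) / 2)
conf_lo <- predictions - z * sd_centered
conf_hi <- predictions + z * sd_centered
```

Because the spread is taken around zero, any bias in the residuals widens the interval instead of being subtracted out, which is why sd_centered is never smaller than sd(residuals).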