automl_reg() generates a specification of an AutoML model
before fitting, and allows the model to be created using
different packages. Currently the only supported package is h2o.
Arguments
- mode
 A single character string for the type of model. The only possible value for this model is "regression".
Details
Other options and arguments can be set using set_engine().
The model can be created with the fit() function using one of the following engines:
H2O "h2o" (the default)
Fit Details
The following features are REQUIRED to be available in the incoming data for the fitting process.
Fit:
fit(y ~ ., data): Includes a target feature that is a function of a "date" feature.
Predict:
predict(model, new_data) where new_data contains a column named "date".
Date and Date-Time Variable
It's a requirement to have a date or date-time variable as a predictor.
The fit() interface accepts date and date-time features and handles them internally.
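Because fitting depends on a date or date-time predictor, it can help to check the data before calling fit(). The following is a minimal base R sketch of such a check. The helper name has_date_column() and the toy data frames are hypothetical and are not part of modeltime.h2o; the sketch also assumes that "date or date-time" means a column of class Date or POSIXct.

```r
# Hypothetical helper (not part of modeltime.h2o): returns TRUE when at
# least one column of `data` is a Date or a POSIXct date-time.
has_date_column <- function(data) {
    is_date <- vapply(
        data,
        function(col) inherits(col, c("Date", "POSIXct")),
        logical(1)
    )
    any(is_date)
}

# Toy data for illustration only
with_date <- data.frame(
    date  = as.Date("2020-01-01") + 0:2,
    value = c(10, 12, 11)
)
without_date <- data.frame(value = c(10, 12, 11))

has_date_column(with_date)     # a Date column is present
has_date_column(without_date)  # no date or date-time column
```

A check like this could be run on both the training data and the new_data passed to predict(), since both are expected to carry the date feature.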
Examples
if (FALSE) {
library(tidymodels)
library(modeltime.h2o)
library(h2o)
library(tidyverse)
library(timetk)
data_tbl <- walmart_sales_weekly %>%
    select(id, Date, Weekly_Sales)
            
splits <- time_series_split(
    data_tbl, 
    assess     = "3 month", 
    cumulative = TRUE
)
recipe_spec <- recipe(Weekly_Sales ~ ., data = training(splits)) %>%
    step_timeseries_signature(Date)
train_tbl <- bake(prep(recipe_spec), training(splits))
test_tbl  <- bake(prep(recipe_spec), testing(splits))
# Initialize H2O
h2o.init(
    nthreads = -1,
    ip = 'localhost',
    port = 54321
)
         
 
# ---- MODEL SPEC ----
model_spec <- automl_reg(mode = 'regression') %>%
    set_engine(
        engine                     = 'h2o',
        max_runtime_secs           = 30, 
        max_runtime_secs_per_model = 30,
        project_name               = 'project_01',
        nfolds                     = 5,
        max_models                 = 1000,
        exclude_algos              = c("DeepLearning"),
        seed                       =  786
    ) 
model_spec
# ---- TRAINING ----
# Important: Make sure the date is included as a regressor.
# This training process should take 30-40 seconds
model_fitted <- model_spec %>%
    fit(Weekly_Sales ~ ., data = train_tbl)
model_fitted
# ---- PREDICT ----
# - IMPORTANT: New data must include the date feature
predict(model_fitted, test_tbl)
# Shutdown H2O when Finished. 
# Make sure to save any work before. 
h2o.shutdown(prompt = FALSE)
}
