Interface for Generalized Additive Models (GAM)

gen_additive_mod(
  mode = "regression",
  select_features = NULL,
  adjust_deg_free = NULL
)



mode: A single character string for the type of model.


select_features: TRUE or FALSE. If TRUE, an extra penalty is added to each term so that the term can be penalized to zero. This means that the smoothing parameter estimation that is part of fitting can completely remove terms from the model. If the corresponding smoothing parameter is estimated as zero, the extra penalty has no effect. Use adjust_deg_free to increase the level of penalization.


adjust_deg_free: If select_features = TRUE, this acts as a multiplier for smoothness. Increase it beyond 1 to produce smoother models.


A parsnip model specification


Available Engines:

gam: Connects to mgcv::gam()

Parameter Mapping:

gen_additive_mod()    mgcv::gam() (default)
select_features       select (FALSE)
adjust_deg_free       gamma (1)
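
A minimal sketch of what these two settings do at the engine level, calling mgcv::gam() directly rather than going through gen_additive_mod(). The simulated data frame dat and the columns x_signal and x_noise are illustrative assumptions, not part of this package. The default smoothness selection method (GCV.Cp) is used here because gamma is documented as acting on that criterion.

```r
library(mgcv)

# Simulated data (hypothetical): y depends on x_signal only; x_noise is irrelevant.
set.seed(123)
n <- 300
dat <- data.frame(
  x_signal = runif(n),
  x_noise  = runif(n)
)
dat$y <- sin(2 * pi * dat$x_signal) + rnorm(n, sd = 0.3)

# Engine defaults: select = FALSE, gamma = 1.
fit_default <- gam(y ~ s(x_signal) + s(x_noise), data = dat)

# select = TRUE adds the extra penalty so a whole term can be shrunk to zero;
# gamma > 1 asks for smoother fits.
fit_select <- gam(y ~ s(x_signal) + s(x_noise), data = dat,
                  select = TRUE, gamma = 1.5)

# Effective degrees of freedom per smooth term, side by side.
edf_table <- cbind(
  default = summary(fit_default)$s.table[, "edf"],
  select  = summary(fit_select)$s.table[, "edf"]
)
print(edf_table)
```

With select = TRUE, the effective degrees of freedom for s(x_noise) would be expected to move toward zero, although the exact values depend on the simulated data.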

Engine Details


This engine uses mgcv::gam() and has the following parameters, which can be modified through the parsnip::set_engine() function.

## function (formula, family = gaussian(), data = list(), weights = NULL, 
##     subset = NULL, na.action, offset = NULL, method = "GCV.Cp", optimizer = c("outer", 
##         "newton"), control = list(), scale = 0, select = FALSE, knots = NULL, 
##     sp = NULL, min.sp = NULL, H = NULL, gamma = 1, fit = TRUE, paraPen = NULL, 
##     G = NULL, in.out = NULL, drop.unused.levels = TRUE, drop.intercept = NULL, 
##     discrete = FALSE, ...)

Fit Details

MGCV Formula Interface

GAMs are specified through smooth terms in the model formula, such as s() for spline smooths.

These are applied in the fit() function:

fit(value ~ s(date_mon, k = 12) + s(date_num), data = df)
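
As a rough illustration of the formula in the fit() call above, the sketch below passes the same formula shape straight to mgcv::gam(). The data frame df is simulated here and is an assumption; only the column names follow the call above.

```r
library(mgcv)

# Hypothetical monthly series: 10 years with a seasonal pattern plus a trend.
set.seed(42)
dates <- seq(as.Date("2010-01-01"), by = "month", length.out = 120)
df <- data.frame(
  date_mon = as.numeric(format(dates, "%m")),  # month of year, 1..12
  date_num = as.numeric(dates)                 # days since 1970-01-01
)
df$value <- 100 + 0.01 * (df$date_num - min(df$date_num)) +
  10 * sin(2 * pi * df$date_mon / 12) + rnorm(120, sd = 2)

# k sets the basis dimension of a smooth; s(date_num) uses the default (k = 10).
fit_formula <- gam(value ~ s(date_mon, k = 12) + s(date_num), data = df)

# Each smooth's effective degrees of freedom is bounded by k - 1
# (11 for k = 12, 9 for the default k = 10).
edf_by_term <- summary(fit_formula)$s.table[, "edf"]
print(edf_by_term)
```

Setting k = 12 for the month term gives the smooth enough flexibility to follow all twelve months; the penalty then decides how much of that flexibility is actually used.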


Examples

library(tidymodels)
library(tidyverse)
library(lubridate)
library(timetk)     # assumed: provides future_frame()
library(modeltime)  # assumed: provides the m750 dataset
# Also load the package that provides gen_additive_mod().
m750_extended <- m750 %>%
    group_by(id) %>%
    future_frame(.length_out = 24, .bind_data = TRUE) %>%
    mutate(lag_24 = lag(value, 24)) %>%
    ungroup() %>%
    mutate(date_num = as.numeric(date)) %>%
    mutate(date_month = month(date))
#> .date_var is missing. Using: date
m750_train <- m750_extended %>% drop_na()

# The filter condition is truncated in the source. is.na(value) is an assumption:
# it selects the 24 appended future rows, which have no observed value.
m750_future <- m750_extended %>% filter(is.na(value))

model_fit_gam <- gen_additive_mod(mode = "regression") %>%
    set_engine("gam", family = Gamma(link = "log"), method = "REML") %>%
    fit(value ~ s(date_month, k = 12) + s(date_num) + s(lag_24) + s(date_num, date_month),
        data = m750_train)

model_fit_gam %>% predict(m750_future, type = "numeric")
#> # A tibble: 24 x 1
#>     .pred
#>     <dbl>
#>  1  9700.
#>  2  9909.
#>  3 10559.
#>  4 11051.
#>  5 11237.
#>  6 11139.
#>  7 11188.
#>  8 11190.
#>  9 11393.
#> 10 11467.
#> # … with 14 more rows
model_fit_gam %>% predict(m750_future, type = "raw")
#>        1        2        3        4        5        6        7        8
#> 9.179900 9.201203 9.264739 9.310256 9.326935 9.318221 9.322557 9.322787
#>        9       10       11       12       13       14       15       16
#> 9.340733 9.347238 9.344812 9.322673 9.204308 9.224249 9.285007 9.331768
#>       17       18       19       20       21       22       23       24
#> 9.349209 9.342442 9.341357 9.344865 9.362822 9.366746 9.367589 9.340906
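
The two prediction outputs above appear to differ by the log link: the first "raw" value, 9.179900, is approximately log(9700), the first "numeric" prediction. That type = "raw" corresponds to mgcv's link-scale prediction is an inference from the printed numbers, not something stated in the source. The sketch below shows the same relationship with mgcv::gam() directly, on simulated data (toy and new_x are hypothetical names).

```r
library(mgcv)

# Hypothetical positive-valued data with a smooth trend on the log scale.
set.seed(1)
toy <- data.frame(x = seq(0, 1, length.out = 200))
toy_mean <- exp(9 + 0.3 * sin(2 * pi * toy$x))
toy$y <- rgamma(200, shape = 50, rate = 50 / toy_mean)

# Same engine arguments as in the example: Gamma family, log link, REML.
fit_gamma <- gam(y ~ s(x), family = Gamma(link = "log"),
                 data = toy, method = "REML")

new_x <- data.frame(x = c(0.1, 0.5, 0.9))
pred_link     <- predict(fit_gamma, newdata = new_x, type = "link")      # log scale
pred_response <- predict(fit_gamma, newdata = new_x, type = "response")  # original scale

# Back-transforming the link-scale predictions recovers the response scale.
same_scale <- all.equal(as.numeric(exp(pred_link)), as.numeric(pred_response))
print(same_scale)
```

In practice this means link-scale output has to be passed through exp() before it can be compared with observed values when a log link is used.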