Interface for Generalized Additive Models (GAM)

gen_additive_mod(
  mode = "regression",
  select_features = NULL,
  adjust_deg_free = NULL
)

Arguments

mode	A single character string for the type of model.
select_features	TRUE or FALSE. If TRUE, an extra penalty is added to each term so that it can be penalized to zero. Smoothing parameter estimation, which is part of fitting, can then remove terms from the model entirely. If the corresponding smoothing parameter is estimated as zero, the extra penalty has no effect. Use `adjust_deg_free` to increase the level of penalization.
adjust_deg_free	When `select_features = TRUE`, acts as a multiplier for smoothness. Increase this beyond 1 to produce smoother models.

Value

A parsnip model specification

Details

Available Engines:

gam: Connects to mgcv::gam()

Parameter Mapping:

modelgam	mgcv::gam
select_features	select (FALSE)
adjust_deg_free	gamma (1)
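
To illustrate the mapping above, here is a minimal sketch that calls mgcv::gam() directly with `select` and `gamma`, the arguments that `select_features` and `adjust_deg_free` translate to. The simulated data, the variable names (`x1`, `x2`, `y`) and the value 1.5 are assumptions made for illustration only; they do not come from this package.

```r
library(mgcv)

# Simulated data (illustrative assumption): y depends on x1 only, x2 is noise
set.seed(1)
n <- 300
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y <- sin(2 * pi * dat$x1) + rnorm(n, sd = 0.3)

# Defaults from the mapping table: select = FALSE, gamma = 1
fit_default <- gam(y ~ s(x1) + s(x2), data = dat)

# Equivalent of select_features = TRUE and adjust_deg_free = 1.5
fit_select <- gam(y ~ s(x1) + s(x2), data = dat, select = TRUE, gamma = 1.5)

# Effective degrees of freedom of each smooth term, for comparison
summary(fit_default)$edf
summary(fit_select)$edf
```

Comparing the effective degrees of freedom of the two fits is one way to see how much the extra penalty and the larger `gamma` shrink each term; a term whose value is close to zero has effectively been removed from the model.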

Engine Details

gam

This engine uses mgcv::gam() and has the following parameters, which can be modified through the parsnip::set_engine() function.

## function (formula, family = gaussian(), data = list(), weights = NULL, 
##     subset = NULL, na.action, offset = NULL, method = "GCV.Cp", optimizer = c("outer", 
##         "newton"), control = list(), scale = 0, select = FALSE, knots = NULL, 
##     sp = NULL, min.sp = NULL, H = NULL, gamma = 1, fit = TRUE, paraPen = NULL, 
##     G = NULL, in.out = NULL, drop.unused.levels = TRUE, drop.intercept = NULL, 
##     discrete = FALSE, ...)

Fit Details

MGCV Formula Interface

Smooth terms are specified in the model formula using mgcv functions, including:

mgcv::s(): GAM spline smooths
mgcv::te(): GAM tensor product smooths

These are used in the formula passed to the fit() function:

fit(value ~ s(date_mon, k = 12) + s(date_num), data = df)
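
As a rough sketch of how the two kinds of smooth differ, the code below fits one model with separate s() terms and one with a single te() tensor product smooth, using mgcv::gam() directly. The simulated data frame `df` and its column values are assumptions for illustration; only the column names follow the formula above.

```r
library(mgcv)

# Simulated data (illustrative assumption), with all 12 months present
set.seed(42)
n <- 480
df <- data.frame(
  date_month = rep(1:12, length.out = n),
  date_num   = runif(n, 0, 100)
)
df$value <- 10 + 2 * sin(2 * pi * df$date_month / 12) +
  0.05 * df$date_num + rnorm(n, sd = 0.5)

# Additive model: one univariate smooth per predictor
fit_s <- gam(value ~ s(date_month, k = 12) + s(date_num),
             data = df, method = "REML")

# Tensor product smooth: a joint surface over two predictors
# that are measured on different scales
fit_te <- gam(value ~ te(date_num, date_month),
              data = df, method = "REML")

# Compare the two fits
AIC(fit_s, fit_te)
```

te() is generally the safer choice when the variables in an interaction are on different scales, as a date index and a month number are, because each margin gets its own smoothing penalty.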

Examples


library(tidymodels)
#> ── Attaching packages ────────────────────────────────────── tidymodels 0.1.2 ──
#> ✓ broom     0.7.5          ✓ recipes   0.1.15    
#> ✓ dials     0.0.9.9000     ✓ rsample   0.0.9     
#> ✓ dplyr     1.0.5          ✓ tibble    3.1.0     
#> ✓ ggplot2   3.3.3          ✓ tidyr     1.1.3     
#> ✓ infer     0.5.4          ✓ tune      0.1.3     
#> ✓ modeldata 0.1.0          ✓ workflows 0.2.2     
#> ✓ purrr     0.3.4          ✓ yardstick 0.0.7     
#> ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
#> x dplyr::collapse() masks nlme::collapse()
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x recipes::step()   masks stats::step()
library(modelgam)
library(modeltime)
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
#> ✓ readr   1.4.0     ✓ forcats 0.5.1
#> ✓ stringr 1.4.0     
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> x readr::col_factor() masks scales::col_factor()
#> x dplyr::collapse()   masks nlme::collapse()
#> x purrr::discard()    masks scales::discard()
#> x dplyr::filter()     masks stats::filter()
#> x stringr::fixed()    masks recipes::fixed()
#> x dplyr::lag()        masks stats::lag()
#> x readr::spec()       masks yardstick::spec()
library(timetk)
library(lubridate)
#> 
#> Attaching package: ‘lubridate’
#> The following objects are masked from ‘package:base’:
#> 
#>     date, intersect, setdiff, union

m750_extended <- m750 %>%
    group_by(id) %>%
    future_frame(.length_out = 24, .bind_data = TRUE) %>%
    mutate(lag_24 = lag(value, 24)) %>%
    ungroup() %>%
    mutate(date_num = as.numeric(date)) %>%
    mutate(date_month = month(date))
#> .date_var is missing. Using: date

m750_train  <- m750_extended %>% drop_na()
m750_future <- m750_extended %>% filter(is.na(value))

model_fit_gam <- gen_additive_mod(mode = "regression") %>%
    set_engine("gam", family=Gamma(link="log"), method = "REML") %>%
    fit(value ~ s(date_month, k = 12) 
        + s(date_num) 
        + s(lag_24) 
        + s(date_num, date_month), 
        data = m750_train)

model_fit_gam %>% predict(m750_future, type = "numeric") 
#> # A tibble: 24 x 1
#>     .pred
#>     <dbl>
#>  1  9700.
#>  2  9909.
#>  3 10559.
#>  4 11051.
#>  5 11237.
#>  6 11139.
#>  7 11188.
#>  8 11190.
#>  9 11393.
#> 10 11467.
#> # … with 14 more rows

model_fit_gam %>% predict(m750_future, type = "raw") 
#>        1        2        3        4        5        6        7        8 
#> 9.179900 9.201203 9.264739 9.310256 9.326935 9.318221 9.322557 9.322787 
#>        9       10       11       12       13       14       15       16 
#> 9.340733 9.347238 9.344812 9.322673 9.204308 9.224249 9.285007 9.331768 
#>       17       18       19       20       21       22       23       24 
#> 9.349209 9.342442 9.341357 9.344865 9.362822 9.366746 9.367589 9.340906
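
The "raw" values above appear to be on the scale of the log link: exp(9.179900) is approximately 9700, the first "numeric" prediction. The sketch below shows the same relationship with mgcv directly, assuming that "raw" corresponds to mgcv's default `type = "link"` and "numeric" to `type = "response"`. The simulated data and variable names are illustrative assumptions.

```r
library(mgcv)

# Simulated positive response with a log-linear mean (illustrative assumption)
set.seed(123)
n <- 200
x <- runif(n, 0, 10)
mu <- exp(1 + 0.3 * sin(x))
dat <- data.frame(x = x, y = rgamma(n, shape = 20, rate = 20 / mu))

# Same family and method as the example above
fit <- gam(y ~ s(x), family = Gamma(link = "log"), data = dat, method = "REML")

new_x <- data.frame(x = c(1, 5, 9))
pred_link     <- predict(fit, newdata = new_x, type = "link")      # log scale
pred_response <- predict(fit, newdata = new_x, type = "response")  # original scale

# Back-transforming the link-scale predictions recovers the response scale
all.equal(as.numeric(exp(pred_link)), as.numeric(pred_response))
```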