Tables for engine specific params in model docs #272

juliasilge · 2020-03-24T18:26:09Z

Addresses #211. This PR implements a table in each individual model .Rd file to show the mapping from the parsnip parameters to the engine parameters.

Still TODO: the defaults for these parameters. That is a WHOLE THING (and maybe not something that can be automated), so I plan to handle it in another PR in the hopefully near future. If this piece is good as is, let's merge it.

topepo · 2020-03-24T23:15:11Z

Looks good. I agree that the default arguments should be a second stage.

topepo · 2020-03-25T01:05:43Z

For future PRs... this is a little kludgy but we could add some code to the _data files that define the methods to catalog their default values:

library(tidymodels)
#> ── Attaching packages ───────────────────────────── tidymodels 0.1.0 ──
#> ✓ broom     0.5.4          ✓ recipes   0.1.10    
#> ✓ dials     0.0.4.9000     ✓ rsample   0.0.5.9000
#> ✓ dplyr     0.8.5          ✓ tibble    2.1.3     
#> ✓ ggplot2   3.3.0          ✓ tune      0.0.1.9000
#> ✓ infer     0.5.1          ✓ workflows 0.1.0     
#> ✓ parsnip   0.0.5          ✓ yardstick 0.0.5     
#> ✓ purrr     0.3.3
#> ── Conflicts ──────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard()  masks scales::discard()
#> x dplyr::filter()   masks stats::filter()
#> x dplyr::lag()      masks stats::lag()
#> x ggplot2::margin() masks dials::margin()
#> x recipes::step()   masks stats::step()
library(rlang)
#> 
#> Attaching package: 'rlang'
#> The following objects are masked from 'package:purrr':
#> 
#>     %@%, as_function, flatten, flatten_chr, flatten_dbl, flatten_int,
#>     flatten_lgl, flatten_raw, invoke, list_along, modify, prepend,
#>     splice


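# Look up the default value of argument `arg` of function `f` in namespace `ns`
# and return it as a character vector.
get_arg <- function(ns, f, arg) {
  args <- formals(getFromNamespace(f, ns))
  args <- args %>% as.list()
  as.character(args[[arg]])
}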

# Make the defaults character because there are cases where we will write
# something by hand, e.g. glmnet::glmnet would have "lambda (all)" or similar
dt_defaults <- 
  tibble::tribble(
    ~model,         ~engine,                 ~original,  ~default,
    "decision_tree", "rpart",               "maxdepth", get_arg("rpart", "rpart.control", "maxdepth"),
    "decision_tree", "rpart",               "minsplit", get_arg("rpart", "rpart.control", "minsplit"),
    "decision_tree", "rpart",                     "cp", get_arg("rpart", "rpart.control", "cp"),
    "decision_tree",  "C5.0",               "minCases", get_arg("C50", "C5.0Control", "minCases"),
    "decision_tree", "spark",              "max_depth", get_arg("sparklyr", "ml_decision_tree", "max_depth"),
    "decision_tree", "spark", "min_instances_per_node", get_arg("sparklyr", "ml_decision_tree", "min_instances_per_node"),
  )

# emulating convert_args("decision_tree")
model_name <- "decision_tree"

envir <- get_model_env()

args <-
  ls(envir) %>%
  tibble::tibble(name = .) %>%
  dplyr::filter(grepl("args", name)) %>%
  dplyr::mutate(model = sub("_args", "", name),
                args  = purrr::map(name, ~envir[[.x]])) %>%
  dplyr::filter(grepl(model_name, model)) %>%
  tidyr::unnest(args) %>%
  dplyr::select(model:original) %>% 
  full_join(dt_defaults) %>% 
  mutate(original = paste0(original, " (", default, ")")) %>% 
  select(-default)
#> Joining, by = c("model", "engine", "original")

convert_df <- args %>%
  dplyr::select(-model) %>%
  tidyr::pivot_wider(names_from = engine, values_from = original)

convert_df %>%
  knitr::kable(col.names = paste0("**", colnames(convert_df), "**"))

| parsnip | rpart | C5.0 | spark |
|:--------|:------|:-----|:------|
| tree_depth | maxdepth (30) | NA | max_depth (5) |
| min_n | minsplit (20) | minCases (2) | min_instances_per_node (1) |
| cost_complexity | cp (0.01) | NA | NA |

Created on 2020-03-24 by the reprex package (v0.3.0)
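
One caveat with the `get_arg()` helper in the reprex: `as.character()` on a default that is itself a call (for example `c("a", "b")`) returns the pieces of the call rather than the text of the default, and an argument with no default comes back as `""`. A minimal sketch of a more defensive variant, using base R only, could look like the following. The name `get_arg_safe()` is hypothetical and not part of parsnip, and the `stats` functions are used here only because they ship with R.

```r
# Hypothetical helper (not part of parsnip): return a function argument's
# default as a single string, or NA if the argument does not exist or has
# no default.
get_arg_safe <- function(ns, f, arg) {
  args <- as.list(formals(utils::getFromNamespace(f, ns)))
  if (!arg %in% names(args)) {
    return(NA_character_)
  }
  # deparse() keeps calls such as c("a", "b") intact as text
  out <- paste(deparse(args[[arg]]), collapse = " ")
  # An argument without a default deparses to an empty string
  if (identical(out, "")) NA_character_ else out
}

get_arg_safe("stats", "sd", "na.rm")    # logical default
get_arg_safe("stats", "cor", "method")  # default is a call to c()
get_arg_safe("stats", "sd", "x")        # no default
```

A side effect of `deparse()` is that integer defaults keep their `L` suffix (for example `2L`), so the strings may need light cleanup before going into a docs table.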

github-actions · 2021-03-07T00:28:38Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

juliasilge added 2 commits March 24, 2020 11:56

Update for new tidyr to avoid warnings in tests

c922817

Tables that map parsnip params to engine params

63b6b5e

topepo merged commit 8b73b3b into tidymodels:master Mar 26, 2020

juliasilge mentioned this pull request Apr 3, 2020

[Documentation] linear_reg penalty parameter #173

Closed

juliasilge deleted the engine-specific-parameters branch June 30, 2020 20:14

github-actions bot locked and limited conversation to collaborators Mar 7, 2021
