document remaining bonsai engines, fix duplicated rows in translate

simonpcouch · simonpcouch · commit d83e78fbac5f · 2022-05-10T15:37:10.000-04:00
diff --git a/R/translate.R b/R/translate.R
@@ -166,7 +166,8 @@ deharmonize <- function(args, key) {
   merged <-
     dplyr::left_join(parsn, key, by = "parsnip") %>%
     dplyr::arrange(order)
-  # TODO correct for bad merge?
+  # Keep the first row per `order` value in case the join duplicated any.
+  merged <- merged[!duplicated(merged$order), ]
 
   names(args) <- merged$original
   args[!is.na(merged$original)]
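
The effect of the new `duplicated()` filter can be seen in a small sketch. The `parsn` and `key` tables below are hypothetical stand-ins (the real objects are not shown in this hunk), and base `merge()` plus `order()` substitutes for `dplyr::left_join()` and `dplyr::arrange()` so the example has no package dependencies.

```r
# Hypothetical inputs: two parsnip arguments, numbered in call order.
parsn <- data.frame(
  parsnip = c("tree_depth", "min_n"),
  order = 1:2,
  stringsAsFactors = FALSE
)

# Assumed key in which one parsnip name maps to two engine arguments.
key <- data.frame(
  parsnip = c("tree_depth", "min_n", "min_n"),
  original = c("maxdepth", "minsplit", "minbucket"),
  stringsAsFactors = FALSE
)

# Base-R stand-in for left_join(parsn, key, by = "parsnip") %>% arrange(order).
merged <- merge(parsn, key, by = "parsnip", all.x = TRUE)
merged <- merged[order(merged$order), ]

# The join yields one row per match, so there are more rows than arguments.
nrow(merged)

# The fix from the hunk above: keep the first row for each `order` value.
deduped <- merged[!duplicated(merged$order), ]
nrow(deduped)
```

Without the filter, `merged$original` would hold three entries for two arguments, so `names(args) <- merged$original` could no longer assign one name per argument.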
diff --git a/man/rmd/boost_tree_lightgbm.Rmd b/man/rmd/boost_tree_lightgbm.Rmd
@@ -81,3 +81,5 @@ The "Introduction to bonsai" article contains [examples](https://github.com/tidy
 ## References
 
  - [LightGBM: A Highly Efficient Gradient Boosting Decision Tree](https://papers.nips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.
diff --git a/man/rmd/boost_tree_lightgbm.md b/man/rmd/boost_tree_lightgbm.md
@@ -120,3 +120,5 @@ The "Introduction to bonsai" article contains [examples](https://github.com/tidy
 ## References
 
  - [LightGBM: A Highly Efficient Gradient Boosting Decision Tree](https://papers.nips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.
diff --git a/man/rmd/decision_tree_partykit.Rmd b/man/rmd/decision_tree_partykit.Rmd
@@ -0,0 +1,65 @@
+```{r, child = "aaa.Rmd", include = FALSE}
+```
+
+`r descr_models("decision_tree", "partykit")`
+
+## Tuning Parameters
+
+```{r partykit-param-info, echo = FALSE}
+defaults <- 
+  tibble::tibble(parsnip = c("tree_depth", "min_n"),
+                 default = c("see below", "20L"))
+
+param <-
+  decision_tree() %>%
+  set_engine("partykit") %>% 
+  set_mode("regression") %>% 
+  make_parameter_list(defaults)
+```
+
+This model has `r nrow(param)` tuning parameters:
+
+```{r partykit-param-list, echo = FALSE, results = "asis"}
+param$item
+```
+
+The `tree_depth` parameter defaults to `0`, which means that no restriction is applied to tree depth.
+
+An engine-specific parameter for this model is: 
+
+ * `mtry`: the number of predictors, selected at random, that are evaluated for splitting. The default is to use all predictors.
+
+## Translation from parsnip to the original package (regression)
+
+`r uses_extension("decision_tree", "partykit", "regression")`
+
+```{r partykit-creg}
+decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
+  set_engine("partykit") %>% 
+  set_mode("regression") %>% 
+  translate()
+```
+
+## Translation from parsnip to the original package (classification)
+
+`r uses_extension("decision_tree", "partykit", "classification")`
+
+```{r partykit-class}
+decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
+  set_engine("partykit") %>% 
+  set_mode("classification") %>% 
+  translate()
+```
+
+`parsnip::ctree_train()` is a wrapper around [partykit::ctree()] (and other functions) that makes it easier to run this model. 
+
+## Preprocessing requirements
+
+```{r child = "template-tree-split-factors.Rmd"}
+```
+
+## References
+
+ - [partykit: A Modular Toolkit for Recursive Partytioning in R](https://jmlr.org/papers/v16/hothorn15a.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.
diff --git a/man/rmd/decision_tree_partykit.md b/man/rmd/decision_tree_partykit.md
@@ -0,0 +1,89 @@
+
+
+
+For this engine, there are multiple modes: censored regression, regression, and classification.
+
+## Tuning Parameters
+
+
+
+This model has 3 tuning parameters:
+
+- `tree_depth`: Tree Depth (type: integer, default: see below)
+
+- `min_n`: Minimal Node Size (type: integer, default: 20L)
+
+- `min_n`: Minimal Node Size (type: integer, default: 20L)
+
+The `tree_depth` parameter defaults to `0`, which means that no restriction is applied to tree depth.
+
+An engine-specific parameter for this model is: 
+
+ * `mtry`: the number of predictors, selected at random, that are evaluated for splitting. The default is to use all predictors.
+
+## Translation from parsnip to the original package (regression)
+
+
+
+
+```r
+decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
+  set_engine("partykit") %>% 
+  set_mode("regression") %>% 
+  translate()
+```
+
+```
+## Decision Tree Model Specification (regression)
+## 
+## Main Arguments:
+##   tree_depth = integer(1)
+##   min_n = integer(1)
+## 
+## Computational engine: partykit 
+## 
+## Model fit template:
+## parsnip::ctree_train(formula = missing_arg(), data = missing_arg(), 
+##     weights = missing_arg(), maxdepth = integer(1), minsplit = min_rows(0L, 
+##         data))
+```
+
+## Translation from parsnip to the original package (classification)
+
+
+
+
+```r
+decision_tree(tree_depth = integer(1), min_n = integer(1)) %>% 
+  set_engine("partykit") %>% 
+  set_mode("classification") %>% 
+  translate()
+```
+
+```
+## Decision Tree Model Specification (classification)
+## 
+## Main Arguments:
+##   tree_depth = integer(1)
+##   min_n = integer(1)
+## 
+## Computational engine: partykit 
+## 
+## Model fit template:
+## parsnip::ctree_train(formula = missing_arg(), data = missing_arg(), 
+##     weights = missing_arg(), maxdepth = integer(1), minsplit = min_rows(0L, 
+##         data))
+```
+
+`parsnip::ctree_train()` is a wrapper around [partykit::ctree()] (and other functions) that makes it easier to run this model. 
+
+## Preprocessing requirements
+
+
+This engine does not require any special encoding of the predictors. Categorical predictors can be partitioned into groups of factor levels (e.g. `{a, c}` vs `{b, d}`) when splitting at a node. Dummy variables are not required for this model. 
+
+## References
+
+ - [partykit: A Modular Toolkit for Recursive Partytioning in R](https://jmlr.org/papers/v16/hothorn15a.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.
diff --git a/man/rmd/rand_forest_partykit.Rmd b/man/rmd/rand_forest_partykit.Rmd
@@ -0,0 +1,59 @@
+```{r, child = "aaa.Rmd", include = FALSE}
+```
+
+`r descr_models("rand_forest", "partykit")`
+
+## Tuning Parameters
+
+```{r partykit-param-info, echo = FALSE}
+defaults <- 
+  tibble::tibble(parsnip = c("trees", "min_n", "mtry"),
+                 default = c("500L", "20L", "5L"))
+
+param <-
+  rand_forest() %>% 
+  set_engine("partykit") %>% 
+  set_mode("regression") %>%  
+  make_parameter_list(defaults)
+```
+
+This model has `r nrow(param)` tuning parameters:
+
+```{r partykit-param-list, echo = FALSE, results = "asis"}
+param$item
+```
+
+## Translation from parsnip to the original package (regression)
+
+`r uses_extension("rand_forest", "partykit", "regression")`
+
+```{r partykit-creg}
+rand_forest() %>% 
+  set_engine("partykit") %>% 
+  set_mode("regression") %>% 
+  translate()
+```
+
+## Translation from parsnip to the original package (classification)
+
+`r uses_extension("rand_forest", "partykit", "classification")`
+
+```{r partykit-class}
+rand_forest() %>% 
+  set_engine("partykit") %>% 
+  set_mode("classification") %>% 
+  translate()
+```
+
+`parsnip::cforest_train()` is a wrapper around [partykit::cforest()] (and other functions) that makes it easier to run this model. 
+
+## Preprocessing requirements
+
+```{r child = "template-tree-split-factors.Rmd"}
+```
+
+## References
+
+ - [partykit: A Modular Toolkit for Recursive Partytioning in R](https://jmlr.org/papers/v16/hothorn15a.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.
diff --git a/man/rmd/rand_forest_partykit.md b/man/rmd/rand_forest_partykit.md
@@ -0,0 +1,75 @@
+
+
+
+For this engine, there are multiple modes: censored regression, regression, and classification.
+
+## Tuning Parameters
+
+
+
+This model has 4 tuning parameters:
+
+- `trees`: # Trees (type: integer, default: 500L)
+
+- `min_n`: Minimal Node Size (type: integer, default: 20L)
+
+- `mtry`: # Randomly Selected Predictors (type: integer, default: 5L)
+
+- `min_n`: Minimal Node Size (type: integer, default: 20L)
+
+## Translation from parsnip to the original package (regression)
+
+
+
+
+```r
+rand_forest() %>% 
+  set_engine("partykit") %>% 
+  set_mode("regression") %>% 
+  translate()
+```
+
+```
+## Random Forest Model Specification (regression)
+## 
+## Computational engine: partykit 
+## 
+## Model fit template:
+## parsnip::cforest_train(formula = missing_arg(), data = missing_arg(), 
+##     weights = missing_arg())
+```
+
+## Translation from parsnip to the original package (classification)
+
+
+
+
+```r
+rand_forest() %>% 
+  set_engine("partykit") %>% 
+  set_mode("classification") %>% 
+  translate()
+```
+
+```
+## Random Forest Model Specification (classification)
+## 
+## Computational engine: partykit 
+## 
+## Model fit template:
+## parsnip::cforest_train(formula = missing_arg(), data = missing_arg(), 
+##     weights = missing_arg())
+```
+
+`parsnip::cforest_train()` is a wrapper around [partykit::cforest()] (and other functions) that makes it easier to run this model. 
+
+## Preprocessing requirements
+
+
+This engine does not require any special encoding of the predictors. Categorical predictors can be partitioned into groups of factor levels (e.g. `{a, c}` vs `{b, d}`) when splitting at a node. Dummy variables are not required for this model. 
+
+## References
+
+ - [partykit: A Modular Toolkit for Recursive Partytioning in R](https://jmlr.org/papers/v16/hothorn15a.html)
+
+ - Kuhn, M, and K Johnson. 2013. _Applied Predictive Modeling_. Springer.