document addition of "aorsf" engine to bonsai (#1120)
* document classification and regression with aorsf
* explain classification `type = "class"` divergence with aorsf
* `update_model_info_table()`
---------
Co-authored-by: Simon P. Couch <[email protected]>
man/rmd/rand_forest_aorsf.Rmd
30 additions & 3 deletions
@@ -26,10 +26,9 @@ param$item

 Additionally, this model has one engine-specific tuning parameter:

-* `split_min_stat`: Minimum test statistic required to split a node. Default is `3.841459` for the log-rank test, which is roughly a p-value of 0.05.
+* `split_min_stat`: Minimum test statistic required to split a node. Defaults are `3.841459` for censored regression (which is roughly a p-value of 0.05) and `0` for classification and regression. For classification, this tuning parameter should be between 0 and 1, and for regression it should be greater than or equal to 0. Higher values of this parameter cause trees grown by `aorsf` to have less depth.

-
-# Translation from parsnip to the original package (censored regression)
+## Translation from parsnip to the original package (censored regression)
 Predictions of survival probability at a time exceeding the maximum observed event time are the predicted survival probability at the maximum observed time in the training data.

+The class predict method in `aorsf` uses the standard 'each tree gets one vote' approach, which is usually but not always consistent with picking the class that has the highest predicted probability. It is okay for this inconsistency to occur in `aorsf` because it is intentionally applying the traditional class prediction method for random forests, but in `tidymodels` it is preferable to embrace consistency. Thus, we opted to make predicted probability consistent with predicted class all the time by making the predicted class a function of predicted probability (see [tidymodels/bonsai#78](https://github.com/tidymodels/bonsai/pull/78)).
+
 ## References

 - Jaeger BC, Long DL, Long DM, Sims M, Szychowski JM, Min YI, Mcclure LA, Howard G, Simon N. Oblique random survival forests. Annals of applied statistics 2019 Sep; 13(3):1847-83. DOI: 10.1214/19-AOAS1261
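
Reviewer note: the paragraph on class prediction added in this file can be illustrated with a small base R sketch. The per-tree probabilities below are made up for illustration and the code does not call `aorsf` or `bonsai`; it only shows how a majority vote across trees and the largest averaged probability can disagree for the same observation.

```r
# Hypothetical per-tree class probabilities for a single observation
# (one row per tree, one column per class). These values are invented.
tree_probs <- rbind(
  c(a = 0.45, b = 0.55),
  c(a = 0.45, b = 0.55),
  c(a = 0.90, b = 0.10)
)

# Traditional approach: each tree votes for its own most likely class,
# and the class with the most votes wins.
tree_votes <- colnames(tree_probs)[apply(tree_probs, 1, which.max)]
vote_class <- names(which.max(table(tree_votes)))

# Probability-based approach: average the probabilities across trees,
# then pick the class with the largest averaged probability.
mean_probs <- colMeans(tree_probs)
prob_class <- names(which.max(mean_probs))

vote_class  # "b": two of the three trees favor class b
prob_class  # "a": the averaged probability of class a is 0.6
```

Here two trees lean slightly toward `b` while one tree strongly favors `a`, so the vote and the averaged probability point to different classes. Deriving the class from the probability, as the added paragraph describes, removes that mismatch.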
man/rmd/rand_forest_aorsf.md
53 additions & 4 deletions
@@ -1,7 +1,7 @@



-For this engine, there is a single mode: censored regression
+For this engine, there are multiple modes: censored regression, classification, and regression

 ## Tuning Parameters

@@ -17,10 +17,9 @@ This model has 3 tuning parameters:

 Additionally, this model has one engine-specific tuning parameter:

-* `split_min_stat`: Minimum test statistic required to split a node. Default is `3.841459` for the log-rank test, which is roughly a p-value of 0.05.
+* `split_min_stat`: Minimum test statistic required to split a node. Defaults are `3.841459` for censored regression (which is roughly a p-value of 0.05) and `0` for classification and regression. For classification, this tuning parameter should be between 0 and 1, and for regression it should be greater than or equal to 0. Higher values of this parameter cause trees grown by `aorsf` to have less depth.

-
-# Translation from parsnip to the original package (censored regression)
+## Translation from parsnip to the original package (censored regression)

 The **censored** extension package is required to fit this model.

@@ -43,6 +42,54 @@ rand_forest() %>%
 ## aorsf::orsf(formula = missing_arg(), data = missing_arg(), weights = missing_arg())
 ```

+## Translation from parsnip to the original package (regression)
+
+The **bonsai** extension package is required to fit this model.
+
+
+```r
+library(bonsai)
+
+rand_forest() %>%
+  set_engine("aorsf") %>%
+  set_mode("regression") %>%
+  translate()
+```
+
+```
+## Random Forest Model Specification (regression)
+##
+## Computational engine: aorsf
+##
+## Model fit template:
+## aorsf::orsf(formula = missing_arg(), data = missing_arg(), weights = missing_arg(),
+##     n_thread = 1, verbose_progress = FALSE)
+```
+
+## Translation from parsnip to the original package (classification)
+
+The **bonsai** extension package is required to fit this model.
+
+
+```r
+library(bonsai)
+
+rand_forest() %>%
+  set_engine("aorsf") %>%
+  set_mode("classification") %>%
+  translate()
+```
+
+```
+## Random Forest Model Specification (classification)
+##
+## Computational engine: aorsf
+##
+## Model fit template:
+## aorsf::orsf(formula = missing_arg(), data = missing_arg(), weights = missing_arg(),
+##     n_thread = 1, verbose_progress = FALSE)
+```
+
 ## Preprocessing requirements


@@ -59,6 +106,8 @@ The `fit()` and `fit_xy()` arguments have arguments called `case_weights` that e

 Predictions of survival probability at a time exceeding the maximum observed event time are the predicted survival probability at the maximum observed time in the training data.

+The class predict method in `aorsf` uses the standard 'each tree gets one vote' approach, which is usually but not always consistent with picking the class that has the highest predicted probability. It is okay for this inconsistency to occur in `aorsf` because it is intentionally applying the traditional class prediction method for random forests, but in `tidymodels` it is preferable to embrace consistency. Thus, we opted to make predicted probability consistent with predicted class all the time by making the predicted class a function of predicted probability (see [tidymodels/bonsai#78](https://github.com/tidymodels/bonsai/pull/78)).
+
 ## References

 - Jaeger BC, Long DL, Long DM, Sims M, Szychowski JM, Min YI, Mcclure LA, Howard G, Simon N. Oblique random survival forests. Annals of applied statistics 2019 Sep; 13(3):1847-83. DOI: 10.1214/19-AOAS1261
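
Reviewer note: the `split_min_stat` default of `3.841459` for censored regression, described in the diff as "roughly a p-value of 0.05", can be checked with base R. The sketch below assumes the log-rank statistic is compared against a chi-squared distribution with 1 degree of freedom; that reference distribution is an assumption of this note and is not stated in the diff itself.

```r
# Critical value that leaves 5% in the upper tail of a chi-squared
# distribution with 1 degree of freedom (assumed reference distribution).
crit <- qchisq(0.95, df = 1)
round(crit, 6)   # 3.841459

# Upper-tail probability implied by the documented default.
p_val <- pchisq(3.841459, df = 1, lower.tail = FALSE)
round(p_val, 4)  # 0.05
```

Under that assumption, raising `split_min_stat` above the default corresponds to demanding a smaller p-value before a node is split, which is consistent with the statement in the diff that higher values lead to shallower trees.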