|
4 | 4 | #' `predict()` can be used for all types of models and uses the
|
5 | 5 | #' "type" argument for more specificity.
|
6 | 6 | #'
|
7 |
| -#' @param object An object of class `model_fit` |
| 7 | +#' @param object An object of class `model_fit`. |
8 | 8 | #' @param new_data A rectangular data object, such as a data frame.
|
9 | 9 | #' @param type A single character value or `NULL`. Possible values
|
10 |
| -#' are "numeric", "class", "prob", "conf_int", "pred_int", "quantile", "time", |
11 |
| -#' "hazard", "survival", or "raw". When `NULL`, `predict()` will choose an |
12 |
| -#' appropriate value based on the model's mode. |
| 10 | +#' are `"numeric"`, `"class"`, `"prob"`, `"conf_int"`, `"pred_int"`, |
| 11 | +#' `"quantile"`, `"time"`, `"hazard"`, `"survival"`, or `"raw"`. When `NULL`, |
| 12 | +#' `predict()` will choose an appropriate value based on the model's mode. |
13 | 13 | #' @param opts A list of optional arguments to the underlying
|
14 | 14 | #' predict function that will be used when `type = "raw"`. The
|
15 | 15 | #' list should not include options for the model object or the
|
16 | 16 | #' new data being predicted.
|
17 |
| -#' @param ... Arguments to the underlying model's prediction |
18 |
| -#' function cannot be passed here (see `opts`). There are some |
19 |
| -#' `parsnip` related options that can be passed, depending on the |
20 |
| -#' value of `type`. Possible arguments are: |
| 17 | +#' @param ... Additional `parsnip`-related options, depending on the |
| 18 | +#' value of `type`. Arguments to the underlying model's prediction |
| 19 | +#' function cannot be passed here (use the `opts` argument instead). |
| 20 | +#' Possible arguments are: |
21 | 21 | #' \itemize{
|
22 |
| -#' \item `interval`: for `type`s of "survival" and "quantile", should |
| 22 | +#' \item `interval`: for `type` equal to `"survival"` or `"quantile"`, should |
23 | 23 | #' interval estimates be added, if available? Options are `"none"`
|
24 | 24 | #' and `"confidence"`.
|
25 |
| -#' \item `level`: for `type`s of "conf_int", "pred_int", and "survival" |
| 25 | +#' \item `level`: for `type` equal to `"conf_int"`, `"pred_int"`, or `"survival"`, |
26 | 26 | #' this is the parameter for the tail area of the intervals
|
27 | 27 | #' (e.g. confidence level for confidence intervals).
|
28 |
| -#' Default value is 0.95. |
29 |
| -#' \item `std_error`: add the standard error of fit or prediction (on |
30 |
| -#' the scale of the linear predictors) for `type`s of "conf_int" |
31 |
| -#' and "pred_int". Default value is `FALSE`. |
32 |
| -#' \item `quantile`: the quantile(s) for quantile regression |
33 |
| -#' (not implemented yet) |
34 |
| -#' \item `time`: the time(s) for hazard and survival probability estimates. |
| 28 | +#' Default value is `0.95`. |
| 29 | +#' \item `std_error`: for `type` equal to `"conf_int"` or `"pred_int"`, add |
| 30 | +#' the standard error of fit or prediction (on the scale of the |
| 31 | +#' linear predictors). Default value is `FALSE`. |
| 32 | +#' \item `quantile`: for `type` equal to `"quantile"`, the quantiles of the |
| 33 | +#' distribution. Default is `(1:9)/10`. |
| 34 | +#' \item `time`: for `type` equal to `"survival"` or `"hazard"`, the |
| 35 | +#' time points at which the survival probability or hazard is estimated. |
35 | 36 | #' }
|
36 |
| -#' @details If "type" is not supplied to `predict()`, then a choice |
37 |
| -#' is made: |
| 37 | +#' @details For `type = NULL`, `predict()` uses |
38 | 38 | #'
|
39 | 39 | #' * `type = "numeric"` for regression models,
|
40 | 40 | #' * `type = "class"` for classification, and
|
41 | 41 | #' * `type = "time"` for censored regression.
|
42 | 42 | #'
|
43 |
| -#' `predict()` is designed to provide a tidy result (see "Value" |
44 |
| -#' section below) in a tibble output format. |
45 |
| -#' |
46 | 43 | #' ## Interval predictions
|
47 | 44 | #'
|
48 | 45 | #' When using `type = "conf_int"` and `type = "pred_int"`, the options
|
|
58 | 55 | #' have the opposite sign as what the underlying model's `predict()` method
|
59 | 56 | #' produces. Set `increasing = FALSE` to suppress this behavior.
|
60 | 57 | #'
|
61 |
| -#' @return With the exception of `type = "raw"`, the results of |
62 |
| -#' `predict.model_fit()` will be a tibble as many rows in the output |
63 |
| -#' as there are rows in `new_data` and the column names will be |
64 |
| -#' predictable. |
| 58 | +#' @return With the exception of `type = "raw"`, the result of |
| 59 | +#' `predict.model_fit()` |
| 60 | +#' |
| 61 | +#' * is a tibble |
| 62 | +#' * has as many rows as there are rows in `new_data` |
| 63 | +#' * has standardized column names (see below). |
| 64 | +#' |
| 65 | +#' For `type = "numeric"`, the tibble has a `.pred` column for a single |
| 66 | +#' outcome and `.pred_Yname` columns for a multivariate outcome. |
65 | 67 | #'
|
66 |
| -#' For numeric results with a single outcome, the tibble will have |
67 |
| -#' a `.pred` column and `.pred_Yname` for multivariate results. |
| 68 | +#' For `type = "class"`, the tibble has a `.pred_class` column. |
68 | 69 | #'
|
69 |
| -#' For hard class predictions, the column is named `.pred_class` |
70 |
| -#' and, when `type = "prob"`, the columns are `.pred_classlevel`. |
| 70 | +#' For `type = "prob"`, the tibble has `.pred_classlevel` columns. |
71 | 71 | #'
|
72 |
| -#' `type = "conf_int"` and `type = "pred_int"` return tibbles with |
73 |
| -#' columns `.pred_lower` and `.pred_upper` with an attribute for |
74 |
| -#' the confidence level. In the case where intervals can be |
75 |
| -#' produces for class probabilities (or other non-scalar outputs), |
76 |
| -#' the columns will be named `.pred_lower_classlevel` and so on. |
| 72 | +#' For `type = "conf_int"` and `type = "pred_int"`, the tibble has |
| 73 | +#' `.pred_lower` and `.pred_upper` columns with an attribute for |
| 74 | +#' the confidence level. In the case where intervals can be |
| 75 | +#' produced for class probabilities (or other non-scalar outputs), |
| 76 | +#' the columns are named `.pred_lower_classlevel` and so on. |
77 | 77 | #'
|
78 |
| -#' Quantile predictions return a tibble with a column `.pred`, which is |
| 78 | +#' For `type = "quantile"`, the tibble has a `.pred` column, which is |
79 | 79 | #' a list-column. Each list element contains a tibble with columns
|
80 | 80 | #' `.pred` and `.quantile` (and perhaps other columns).
|
81 | 81 | #'
|
82 |
| -#' Using `type = "raw"` with `predict.model_fit()` will return |
83 |
| -#' the unadulterated results of the prediction function. |
| 82 | +#' For `type = "time"`, the tibble has a `.pred_time` column. |
84 | 83 | #'
|
85 |
| -#' For censored regression: |
| 84 | +#' For `type = "survival"`, the tibble has a `.pred` column, which is |
| 85 | +#' a list-column. Each list element contains a tibble with columns |
| 86 | +#' `.time` and `.pred_survival` (and perhaps other columns). |
| 87 | +#' |
| 88 | +#' For `type = "hazard"`, the tibble has a `.pred` column, which is |
| 89 | +#' a list-column. Each list element contains a tibble with columns |
| 90 | +#' `.time` and `.pred_hazard` (and perhaps other columns). |
86 | 91 | #'
|
87 |
| -#' * `type = "time"` produces a column `.pred_time`. |
88 |
| -#' * `type = "hazard"` results in a list column `.pred` containing tibbles |
89 |
| -#' with a column `.pred_hazard`. |
90 |
| -#' * `type = "survival"` results in a list column `.pred` containing tibbles |
91 |
| -#' with a `.pred_survival` column. |
| 92 | +#' Using `type = "raw"` with `predict.model_fit()` will return |
| 93 | +#' the unadulterated results of the prediction function. |
92 | 94 | #'
|
93 | 95 | #' In the case of Spark-based models, since table columns cannot
|
94 | 96 | #' contain dots, the same convention is used except 1) no dots
|
|
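The behaviour documented in this diff can be illustrated with a short sketch. It assumes the `parsnip` package is installed and uses the `"lm"` engine on the built-in `mtcars` data; the object names `lm_fit`, `new_cars`, `pred_num`, `pred_ci`, and `pred_raw` are illustrative only and do not come from the diff.

```r
library(parsnip)

# Fit a regression model (engine and data are assumptions for illustration).
lm_fit <- fit(set_engine(linear_reg(), "lm"), mpg ~ wt + hp, data = mtcars)
new_cars <- mtcars[1:3, ]

# type = NULL (the default): the model's mode is "regression",
# so predict() falls back to type = "numeric" and returns a `.pred` column.
pred_num <- predict(lm_fit, new_data = new_cars)

# Interval predictions: `level` is a parsnip-related option passed via `...`.
pred_ci <- predict(lm_fit, new_data = new_cars, type = "conf_int", level = 0.90)

# type = "raw" returns whatever the underlying predict method produces;
# engine-specific arguments would go in `opts`, not in `...`.
pred_raw <- predict(lm_fit, new_data = new_cars, type = "raw")

names(pred_num)  # ".pred"
names(pred_ci)   # ".pred_lower" ".pred_upper"
```

Apart from `type = "raw"`, each result has one row per row of `new_data`, which is what makes the outputs easy to bind column-wise to the original data.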