Improve readability of docs for predict.model_fit() #866

Merged
merged 4 commits into from
Feb 14, 2023
Changes from 3 commits
88 changes: 45 additions & 43 deletions R/predict.R
@@ -4,45 +4,42 @@
#' `predict()` can be used for all types of models and uses the
#' "type" argument for more specificity.
#'
#' @param object An object of class `model_fit`
#' @param object An object of class `model_fit`.
#' @param new_data A rectangular data object, such as a data frame.
#' @param type A single character value or `NULL`. Possible values
#' are "numeric", "class", "prob", "conf_int", "pred_int", "quantile", "time",
#' "hazard", "survival", or "raw". When `NULL`, `predict()` will choose an
#' appropriate value based on the model's mode.
#' are `"numeric"`, `"class"`, `"prob"`, `"conf_int"`, `"pred_int"`,
#' `"quantile"`, `"time"`, `"hazard"`, `"survival"`, or `"raw"`. When `NULL`,
#' `predict()` will choose an appropriate value based on the model's mode.
#' @param opts A list of optional arguments to the underlying
#' predict function that will be used when `type = "raw"`. The
#' list should not include options for the model object or the
#' new data being predicted.
#' @param ... Arguments to the underlying model's prediction
#' function cannot be passed here (see `opts`). There are some
#' `parsnip` related options that can be passed, depending on the
#' value of `type`. Possible arguments are:
#' @param ... Additional `parsnip`-related options, depending on the
#' value of `type`. Arguments to the underlying model's prediction
#' function cannot be passed here (use the `opts` argument instead).
#' Possible arguments are:
#' \itemize{
#' \item `interval`: for `type`s of "survival" and "quantile", should
#' \item `interval`: for `type` equal to `"survival"` or `"quantile"`, should
#' interval estimates be added, if available? Options are `"none"`
#' and `"confidence"`.
#' \item `level`: for `type`s of "conf_int", "pred_int", and "survival"
#' \item `level`: for `type` equal to `"conf_int"`, `"pred_int"`, or `"survival"`,
#' this is the parameter for the tail area of the intervals
#' (e.g. confidence level for confidence intervals).
#' Default value is 0.95.
#' \item `std_error`: add the standard error of fit or prediction (on
#' the scale of the linear predictors) for `type`s of "conf_int"
#' and "pred_int". Default value is `FALSE`.
#' \item `quantile`: the quantile(s) for quantile regression
#' (not implemented yet)
#' \item `time`: the time(s) for hazard and survival probability estimates.
#' Default value is `0.95`.
#' \item `std_error`: for `type` equal to `"conf_int"` or `"pred_int"`, add
#' the standard error of fit or prediction (on the scale of the
#' linear predictors). Default value is `FALSE`.
#' \item `quantile`: for `type` equal to `"quantile"`, the quantiles of the
#' distribution. Default is `(1:9)/10`.
#' \item `time`: for `type` equal to `"survival"` or `"hazard"`, the
#' time points at which the survival probability or hazard is estimated.
#' }
#' @details If "type" is not supplied to `predict()`, then a choice
#' is made:
#' @details For `type = NULL`, `predict()` uses
#'
#' * `type = "numeric"` for regression models,
#' * `type = "class"` for classification, and
#' * `type = "time"` for censored regression.
#'
#' `predict()` is designed to provide a tidy result (see "Value"
#' section below) in a tibble output format.
#'
#' ## Interval predictions
#'
#' When using `type = "conf_int"` and `type = "pred_int"`, the options
@@ -59,36 +56,41 @@
#' produces. Set `increasing = FALSE` to suppress this behavior.
#'
#' @return With the exception of `type = "raw"`, the result of
#' `predict.model_fit()` will be a tibble as many rows in the output
#' as there are rows in `new_data` and the column names will be
#' predictable.
#' `predict.model_fit()`
#'
#' * is a tibble
#' * has as many rows as there are rows in `new_data`
#' * has standardized column names, see below:
#'
#' For `type = "numeric"`, the tibble has a `.pred` column for a single
#' outcome and `.pred_Yname` columns for a multivariate outcome.
#'
#' For numeric results with a single outcome, the tibble will have
#' a `.pred` column and `.pred_Yname` for multivariate results.
#' For `type = "class"`, the tibble has a `.pred_class` column.
#'
#' For hard class predictions, the column is named `.pred_class`
#' and, when `type = "prob"`, the columns are `.pred_classlevel`.
#' For `type = "prob"`, the tibble has `.pred_classlevel` columns.
#'
#' `type = "conf_int"` and `type = "pred_int"` return tibbles with
#' columns `.pred_lower` and `.pred_upper` with an attribute for
#' the confidence level. In the case where intervals can be
#' produced for class probabilities (or other non-scalar outputs),
#' the columns will be named `.pred_lower_classlevel` and so on.
#' For `type = "conf_int"` and `type = "pred_int"`, the tibble has
#' `.pred_lower` and `.pred_upper` columns with an attribute for
#' the confidence level. In the case where intervals can be
#' produced for class probabilities (or other non-scalar outputs),
#' the columns are named `.pred_lower_classlevel` and so on.
#'
#' Quantile predictions return a tibble with a column `.pred`, which is
#' For `type = "quantile"`, the tibble has a `.pred` column, which is
#' a list-column. Each list element contains a tibble with columns
#' `.pred` and `.quantile` (and perhaps other columns).
#'
#' Using `type = "raw"` with `predict.model_fit()` will return
#' the unadulterated results of the prediction function.
#' For `type = "time"`, the tibble has a `.pred_time` column.
#'
#' For censored regression:
#' For `type = "survival"`, the tibble has a `.pred` column, which is
#' a list-column. Each list element contains a tibble with columns
#' `.time` and `.pred_survival` (and perhaps other columns).
#'
#' For `type = "hazard"`, the tibble has a `.pred` column, which is
#' a list-column. Each list element contains a tibble with columns
#' `.time` and `.pred_hazard` (and perhaps other columns).
#'
#' * `type = "time"` produces a column `.pred_time`.
#' * `type = "hazard"` results in a list column `.pred` containing tibbles
#' with a column `.pred_hazard`.
#' * `type = "survival"` results in a list column `.pred` containing tibbles
#' with a `.pred_survival` column.
#' Using `type = "raw"` with `predict.model_fit()` will return
#' the unadulterated results of the prediction function.
#'
#' In the case of Spark-based models, since table columns cannot
#' contain dots, the same convention is used except 1) no dots
6 changes: 3 additions & 3 deletions man/bart-internal.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

29 changes: 15 additions & 14 deletions man/other_predict.Rd

86 changes: 44 additions & 42 deletions man/predict.model_fit.Rd
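
The return conventions described in the revised `R/predict.R` documentation can be checked with a short sketch. It assumes the `parsnip` package is installed; the model (`linear_reg()` with its default engine), the formula, and the use of the built-in `mtcars` data are illustrative choices and not part of this PR.

```r
library(parsnip)

# Illustrative fit (an assumption for this sketch, not taken from the PR):
# a linear regression on the built-in mtcars data.
fit_obj  <- fit(linear_reg(), mpg ~ wt + hp, data = mtcars)
new_rows <- mtcars[1:3, ]

# type = NULL: for a regression model, predict() uses type = "numeric",
# so the tibble has a single `.pred` column and one row per row of new_data.
pred_default <- predict(fit_obj, new_data = new_rows)
names(pred_default)  # ".pred"
nrow(pred_default)   # 3

# type = "conf_int": `level` is one of the parsnip-related options passed
# through `...`; the tibble has `.pred_lower` and `.pred_upper` columns.
pred_ci <- predict(fit_obj, new_data = new_rows, type = "conf_int", level = 0.90)
names(pred_ci)       # ".pred_lower" ".pred_upper"
```

Arguments meant for the engine's own prediction function are not passed through `...`; per the documentation above, they go in `opts` and apply only when `type = "raw"`.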