WIP: adding new prediction types for Survnip #359

EmilHvitfeldt · 2020-08-04T05:22:56Z

This PR extends predict() to allow type = "time" and type = "survival" for risk prediction.

topepo · 2020-08-04T23:48:56Z

Looking back at my notes, I think I favor a mode of "censored regression" rather than "risk prediction". Technically, if we produce survival probabilities, those aren't really the risk.

The prediction format for survival probabilities should have nrow(input) == nrow(output). Because each input row yields one probability per requested time point, the results should be nested tibbles (even without multi_predict()).

Here's an example using flexsurv:

library(tidymodels)
#> ── Attaching packages ───────────────────────────────────────────────────────────── tidymodels 0.1.1 ──
#> ✓ broom     0.7.0          ✓ recipes   0.1.13    
#> ✓ dials     0.0.8          ✓ rsample   0.0.7     
#> ✓ dplyr     1.0.1          ✓ tibble    3.0.3     
#> ✓ ggplot2   3.3.2          ✓ tidyr     1.1.1     
#> ✓ infer     0.5.2          ✓ tune      0.1.1.9000
#> ✓ modeldata 0.0.2          ✓ workflows 0.1.2.9000
#> ✓ parsnip   0.1.3          ✓ yardstick 0.0.7.9000
#> ✓ purrr     0.3.4
#> ── Conflicts ──────────────────────────────────────────────────────────────── tidymodels_conflicts() ──
#> x purrr::discard() masks scales::discard()
#> x dplyr::filter()  masks stats::filter()
#> x dplyr::lag()     masks stats::lag()
#> x recipes::step()  masks stats::step()
library(flexsurv)
#> Loading required package: survival

data(ovarian)
fit <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")

summary(fit, ovarian[1:2,], t = c(50, 100, 150)) %>% 
  map_dfr(~ .x) %>% 
  mutate(row = rep(1:2, each = 3)) %>% 
  dplyr::select(.time = time, .pred_survival = est, row) %>%
  group_nest(row, .key = ".pred") %>% 
  select(-row)
#> # A tibble: 2 x 1
#>                .pred
#>   <list<tbl_df[,2]>>
#> 1            [3 × 2]
#> 2            [3 × 2]

Created on 2020-08-04 by the reprex package (v0.3.0)

EmilHvitfeldt · 2020-08-05T00:19:56Z

Alright! I have changed everything to "censored regression". Technically we are doing right-censored regression, so if we add left-censored regression we will need to remember to clarify that.

👍 on nested tibbles. Where would be the cleanest place to convert: in parsnip::set_pred() with the post argument, or in format_survival()? It feels like we would need a post function almost no matter what, so it might make sense to create a list of tibbles there and have format_survival() handle the full tibble.
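
A minimal sketch of the conversion being discussed, assuming the engine hands back an n x k matrix of survival probabilities (one row per observation, one column per requested time point). The helper name nest_survival() and the matrix input are assumptions for illustration; they are not part of parsnip or survnip:

```r
library(tibble)

# Hypothetical helper: turn an n x k matrix of survival probabilities into a
# one-column tibble whose list column holds one small tibble per observation.
nest_survival <- function(probs, .time) {
  stopifnot(is.matrix(probs), ncol(probs) == length(.time))
  per_row <- lapply(seq_len(nrow(probs)), function(i) {
    tibble(.time = .time, .pred_survival = probs[i, ])
  })
  tibble(.pred = per_row)
}

# Toy input: 2 observations, 3 time points (made-up numbers).
probs <- matrix(c(0.90, 0.80, 0.70,
                  0.95, 0.85, 0.75), nrow = 2, byrow = TRUE)
res <- nest_survival(probs, .time = c(50, 100, 150))
```

The result has one row per input row, which keeps nrow(input) == nrow(output); a later formatting step would only need to check or rename the outer column.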

EmilHvitfeldt · 2020-08-05T05:23:04Z

Changes to predict_survival should be done now:

library(survnip)
#> Loading required package: parsnip
library(survival)

cox_mod <-
  cox_reg() %>%
  set_engine("survival") %>%
  fit(Surv(time, status) ~ age + ph.ecog, data = lung)

pred_vals <- predict(cox_mod, new_data = lung, type = "survival", .time = 100:200)

pred_vals
#> # A tibble: 228 x 1
#>        .pred_survival
#>    <list<tbl_df[,2]>>
#>  1          [101 × 2]
#>  2          [101 × 2]
#>  3          [101 × 2]
#>  4          [101 × 2]
#>  5          [101 × 2]
#>  6          [101 × 2]
#>  7          [101 × 2]
#>  8          [101 × 2]
#>  9          [101 × 2]
#> 10          [101 × 2]
#> # … with 218 more rows

pred_vals$.pred_survival[[1]]
#> # A tibble: 101 x 2
#>    .time .pred_survival
#>    <chr>          <dbl>
#>  1 100            0.855
#>  2 101            0.855
#>  3 102            0.855
#>  4 103            0.855
#>  5 104            0.855
#>  6 105            0.850
#>  7 106            0.850
#>  8 107            0.840
#>  9 108            0.840
#> 10 109            0.840
#> # … with 91 more rows

tidyr::unnest(pred_vals, cols = c(.pred_survival))
#> # A tibble: 23,028 x 2
#>    .time .pred_survival
#>    <chr>          <dbl>
#>  1 100            0.855
#>  2 101            0.855
#>  3 102            0.855
#>  4 103            0.855
#>  5 104            0.855
#>  6 105            0.850
#>  7 106            0.850
#>  8 107            0.840
#>  9 108            0.840
#> 10 109            0.840
#> # … with 23,018 more rows

Created on 2020-08-04 by the reprex package (v0.3.0)

EmilHvitfeldt · 2020-08-06T22:18:20Z

Package related to this PR is located here: https://github.com/EmilHvitfeldt/survnip

topepo

Looks good overall; minor changes.

We might want to keep this in a branch; if we keep it in main, people will think that everything is there.

Thanks!

topepo · 2020-11-04T18:32:12Z

R/aaa_models.R

@@ -31,7 +31,8 @@ parsnip$modes <- c("regression", "classification", "unknown")
 # ------------------------------------------------------------------------------

 pred_types <-
-  c("raw", "numeric", "class", "prob", "conf_int", "pred_int", "quantile")
+  c("raw", "numeric", "class", "prob", "conf_int", "pred_int", "quantile",
+    "time", "survival", "linear_pred")


Can you add the infrastructure for a "hazard" type?
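
A rough sketch of what registering the extra type could look like, assuming type validation is a simple membership check against pred_types. The helper check_pred_type() is hypothetical and only stands in for whatever validation parsnip actually performs:

```r
# The vector from the diff above, with "hazard" appended.
pred_types <-
  c("raw", "numeric", "class", "prob", "conf_int", "pred_int", "quantile",
    "time", "survival", "linear_pred", "hazard")

# Hypothetical validator: errors on anything not listed in pred_types.
check_pred_type <- function(type) {
  ok <- is.character(type) && length(type) == 1 && type %in% pred_types
  if (!ok) {
    stop(
      "`type` should be one of: ",
      paste0("'", pred_types, "'", collapse = ", "),
      call. = FALSE
    )
  }
  invisible(type)
}

check_pred_type("hazard")
```

A matching predict_hazard() generic and formatter would presumably follow the same pattern as the "time" and "survival" types in this PR.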

topepo · 2020-11-04T18:32:52Z

R/predict.R

@@ -216,6 +223,54 @@ format_classprobs <- function(x) {
  x
 }

+format_time <- function(x) {
+  if (inherits(x, "tbl_spark"))


You can remove the Spark stuff from the survival bits.

topepo · 2020-11-04T18:33:35Z

R/predict_linear_pred.R

+#' @export predict_linear_pred.model_fit
+#' @export
+predict_linear_pred.model_fit <- function(object, new_data, ...) {
+  if (object$spec$mode != "censored regression")


I'd remove this check. We can do these types of predictions for all types of models.

topepo · 2020-11-04T18:34:45Z

R/surv_reg_data.R

-)
+#
+# set_new_model("surv_reg")
+# set_model_mode("surv_reg", "regression")


We'd have to soft-deprecate these. I'll PR into your package to change the new surv_reg() to survival_reg().

github-actions · 2021-03-06T00:31:32Z

This pull request has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.

EmilHvitfeldt added 3 commits August 1, 2020 00:44

setup predict_time.model_fit

2fbc7b0

add format_time

63e96b2

add predict_survival

a8231f8

"risk prediction" to "censored regression"

35ce9b2

EmilHvitfeldt added 2 commits August 5, 2020 23:22

allow any length of .time in predict_survival

075183c

add predict_linear_pred()

634a419

EmilHvitfeldt added 2 commits August 15, 2020 20:34

add censored regression mode

ebe0a5e

dont register surv_reg

2b6f06f

topepo requested changes Nov 4, 2020

EmilHvitfeldt added a commit to EmilHvitfeldt/parsnip that referenced this pull request Dec 3, 2020

transfer over PR tidymodels#359 due to deletion of branch

c0b78d1

EmilHvitfeldt mentioned this pull request Dec 3, 2020

adding new prediction types for Survnip #396

Merged

EmilHvitfeldt closed this Dec 3, 2020

github-actions bot locked and limited conversation to collaborators Mar 6, 2021
