Computed variable and delayed evaluation docs #5117

teunbrand · 2022-12-23T18:59:30Z

This PR aims to fix #4951

Briefly, in 'computed variables' sections there is now a pointer to the ?after_stat page, and all variable names are wrapped in after_stat(). In addition, the ?after_stat page has been restructured to more clearly articulate what to expect for each stage, as well as sprinkling around some extra, slightly more complicated examples.

yjunechoe · 2022-12-24T00:28:30Z

R/aes-evaluation.r

+#'     fun = ~ round(mean(.x), 2),
+#'     fun.max = ~ round(sd(.x), 2)


Small nitpicking only because of my love/hate relationship with the fun.* arguments in stat_summary():

May I suggest just passing a function to fun.data returning a dataframe with the desired variables-as-columns, as opposed to supplying functions to fun and fun.max that each return vectors?

So those 2 lines can be replaced with:

# possibly `\(val) ...` so the argument is not confused with the x-positional aesthetic
fun.data = \(x) round(data.frame(mean = mean(x), sd = sd(x)), 2)

Then, the columns from fun.data output can be referenced in the after_stat() for label like this, which IMO is more transparent:

after_stat(paste(mean, sd, sep = " ± "))

And it also produces a more straightforward layer data:

p_new <- ggplot(mpg, aes(class, displ)) +
  geom_violin() +
  stat_summary(
    aes(
      y = stage(displ, after_stat = 8),
      label = after_stat(paste(mean, sd, sep = " ± "))
    ),
    geom = "text",
    fun.data = \(x) round(data.frame(mean = mean(x), sd = sd(x)), 2)
  )

layer_data(last_plot(), 2)[,1:7]
#>   y       label x group mean   sd PANEL
#> 1 8 6.16 ± 0.53 1     1 6.16 0.53     1
#> 2 8 2.33 ± 0.45 2     2 2.33 0.45     1
#> 3 8 2.92 ± 0.72 3     3 2.92 0.72     1
#> 4 8 3.39 ± 0.45 4     4 3.39 0.45     1
#> 5 8 4.42 ± 0.83 5     5 4.42 0.83     1
#> 6 8  2.66 ± 1.1 6     6 2.66 1.10     1
#> 7 8 4.46 ± 1.07 7     7 4.46 1.07     1

Without overriding fun.data (or doing an ugly fun.min = ~ NULL), you get the vestigial ymin column that's there to support the default pointrange geom. Plus, the mean that's calculated from fun gets mapped to y and is then immediately overridden by stage(after_stat = 8), which may be surprising (in case one wanted to use it to map size in the after-scale of something):

p_old <- ggplot(mpg, aes(class, displ)) +
  geom_violin() +
  stat_summary(
    aes(
      y = stage(displ, after_stat = 8),
      label = after_stat(paste(y, ymax, sep = " ± "))
    ),
    geom = "text",
    fun = ~ round(mean(.x), 2),
    fun.max = ~ round(sd(.x), 2)
  )

layer_data(last_plot(), 2)[,1:7]
#>   y       label x group ymin ymax PANEL
#> 1 8 6.16 ± 0.53 1     1   NA 0.53     1
#> 2 8 2.33 ± 0.45 2     2   NA 0.45     1
#> 3 8 2.92 ± 0.72 3     3   NA 0.72     1
#> 4 8 3.39 ± 0.45 4     4   NA 0.45     1
#> 5 8 4.42 ± 0.83 5     5   NA 0.83     1
#> 6 8  2.66 ± 1.1 6     6   NA 1.10     1
#> 7 8 4.46 ± 1.07 7     7   NA 1.07     1

Lastly and relatedly, because sd is held in the ymax column in the old example, it gets used later in the retraining of the y-scales, so if you plot p_old you see it actually gets an expansion down to around 0 in the y-axis because ymax is around that value. All of this mess because stat_summary() appears as if it's geom-agnostic when it kinda isn't 😅

I may be missing obvious problems with my approach and this is possibly off-label usage of fun.data, but as long as fun.data dataframe doesn't touch standard aesthetics (like overriding/duplicating the x column or hard-coding colour before the scale sees it), this should be ok?

(And of course thanks for this great addition to the docs!)
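The y-axis expansion described in this comment can be checked without drawing the plots. The following is a rough sketch, not part of the PR: it rebuilds both variants with rlang-style lambdas and compares the trained y limits. The helper name `y_limits()` is made up for illustration, and reading limits through `layer_scales()` is an assumption about a convenient way to inspect them.

```r
library(ggplot2)

base <- ggplot(mpg, aes(class, displ)) + geom_violin()

# Old approach: the sd ends up in the ymax column, which is a y-position aesthetic
p_old <- base + stat_summary(
  aes(
    y = stage(displ, after_stat = 8),
    label = after_stat(paste(y, ymax, sep = " ± "))
  ),
  geom = "text",
  fun = ~ round(mean(.x), 2),
  fun.max = ~ round(sd(.x), 2)
)

# New approach: mean and sd are plain columns that no scale is trained on
p_new <- base + stat_summary(
  aes(
    y = stage(displ, after_stat = 8),
    label = after_stat(paste(mean, sd, sep = " ± "))
  ),
  geom = "text",
  fun.data = ~ round(data.frame(mean = mean(.x), sd = sd(.x)), 2)
)

# Hypothetical helper: limits of the trained y scale of the first panel
y_limits <- function(p) layer_scales(p)$y$get_limits()

y_limits(p_old)
y_limits(p_new)
```

If the description above holds, the lower limit for `p_old` should sit well below the lower limit for `p_new`, because the sd values in `ymax` take part in training the y scale.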

teunbrand · 2022-12-24

Thanks June, I love the suggestion. It makes it much clearer what is going on! I hadn't spotted the y-axis expansion in particular, so thanks :) The only way I'll deviate from your suggestion is that the example should also work in R versions older than 4.1.0, so I'll stick to the rlang lambda syntax instead of the base R lambda syntax for now.
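For reference, a minimal sketch of what that adaptation might look like, assuming `fun.data` accepts an rlang-style formula lambda in the same way `fun` and `fun.max` did in the original example. The exact code that ended up in the docs may differ.

```r
library(ggplot2)

p <- ggplot(mpg, aes(class, displ)) +
  geom_violin() +
  stat_summary(
    aes(
      y = stage(displ, after_stat = 8),
      label = after_stat(paste(mean, sd, sep = " ± "))
    ),
    geom = "text",
    # rlang lambda: `.x` is the vector of y values for one group
    fun.data = ~ round(data.frame(mean = mean(.x), sd = sd(.x)), 2)
  )

# Inspect the computed data of the second layer (the summary text)
head(layer_data(p, 2))
```

The formula form avoids the `\(x)` shorthand, which only parses in R 4.1.0 and later.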


yutannihilation · 2022-12-25T01:39:43Z

Looks awesome!

@hadley
Could you take a look when you have time? Since #4951 is what you proposed, probably you are the best one to review this.

hadley

Looks good! I added a few comments specifically about the style.

R/aes-evaluation.r

R/geom-dotplot.r

teunbrand · 2022-12-27T13:03:35Z

Thanks for your comments, Hadley! I've adopted them in most places, and I'll go write an rd_computed_vars() helper function to format the section better.

teunbrand · 2022-12-27T15:40:50Z

Alright, if we were to disagree about the list method (describe vs itemize), it should now be relatively easy to adjust the rd_computed_vars() wrapper instead of manually going through all of the stat functions :)

hadley · 2023-01-01T17:58:32Z

R/aes-evaluation.r

+#' aesthetics from, and three functions to control at which stage aesthetics
+#' should be evaluated.
+#'
+#' @usage # These functions can be used inside the `aes()` function


Suggested change

#' @usage # These functions can be used inside the `aes()` function

#' @usage

#' # These functions can be used inside the `aes()` function

I think this makes it easier to read and shouldn't affect the output

hadley · 2023-01-01T17:58:59Z

R/aes-evaluation.r

+#'   If you want to map the same aesthetic multiple times, e.g. map `x` to a
+#'   data column for the stat, but remap it for the geom, you can use the
+#'   `stage()` function to collect multiple mappings.


hadley · 2023-01-01T17:59:45Z

R/aes-evaluation.r

+#'   )
+#' ```
+#' @note
+#' `after_stat()` replaces the old approaches of using either `stat()`, e.g.


I think I'd put this in the @description. I assume it already has a stat alias?

hadley · 2023-01-01T18:01:38Z

R/utilities-help.r

+  header <- "@section Computed variables: "
+  intro  <- paste0(
+    "These are calculated by the 'stat' part of layers and can be accessed ",
+    "with \\link[=aes_eval]{delayed evaluation}. "


Why not use markdown here?

hadley · 2023-01-01T18:02:05Z

R/utilities-help.r

+
+  # Compose item-list
+  fmt_descr <- paste0(gsub("\n", "", descr), "}")
+  fmt_list  <- paste(fmt_items, fmt_descr, sep = "\\cr ")


Suggested change

fmt_list <- paste(fmt_items, fmt_descr, sep = "\\cr ")

fmt_list <- paste(fmt_items, fmt_descr, sep = ": ")

?

teunbrand · 2023-01-02T10:10:27Z

Thanks for your review Hadley! I've processed the latest comments.

I'm still going to make one last attempt at advocating for \cr separation of item name and description. Ideally, what I think would be best is to have the typesetting as in @param fields, where all descriptions are indented the same. I don't think this can be replicated outside @param fields though.

This is what the documentation of ?geom_boxplot looks like if we separate with :. It is visually hard to discriminate where the item name ends and the description begins. The only visual clues are the monospaced font of the item name (if we ignore the 'or's) and the colon itself. The good thing about this is that it is more compact.

Whereas if we separate item name and description with \cr, the documentation looks like the one below. I think it is visually easier to discriminate the item name from the description, and it looks a bit tidier than separating with :.

Nonetheless, I have taken your suggestion to use : in the PR (but I couldn't let this go without an honest attempt at advocating for \cr one last time).
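To make the two options concrete, here is a small sketch of how the separator changes the strings that get composed. The item names and descriptions are hypothetical stand-ins, not the helper's real inputs, and the sketch only shows the `paste()` step from the diff, not the full helper.

```r
# Hypothetical inputs; the real helper derives these from its arguments
fmt_items <- c("`after_stat(width)`", "`after_stat(ymin)`")
fmt_descr <- c("Width of boxplot.", "Lower whisker.")

# Option 1: Rd line break between item name and description
with_cr <- paste(fmt_items, fmt_descr, sep = "\\cr ")

# Option 2: colon between item name and description
with_colon <- paste(fmt_items, fmt_descr, sep = ": ")

cat(with_cr, sep = "\n")
cat(with_colon, sep = "\n")
```

With `\cr` the description starts on its own line in the rendered help page, while the colon keeps each item on a single line.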

hadley · 2023-01-02T17:08:57Z

Ok, that looks good to me!

teunbrand added 4 commits December 22, 2022 17:35

Handle filename better in ggsave()

2fcee4f

Wrap computed variables in after_stat()

d8f27d4

Expand delayed evaluation docs

1986a39

Undo mistake due to clumsy git skills

b109e9f

yjunechoe reviewed Dec 24, 2022


teunbrand added 4 commits December 24, 2022 10:13

Improve example

e707131

Improve example

e1712b9

Merge branch 'comp_var_docs' of https://github.com/teunbrand/ggplot2 …

cffbb31

…into comp_var_docs # Conflicts: # R/aes-evaluation.r

Reoxygenate

04b42de

hadley requested changes Dec 26, 2022


Update suggestions

b88f7b2

teunbrand added 3 commits December 27, 2022 16:26

Write helper function for computed variables

917dabd

Use helper

68ba639

Crosslink aes and aes_eval docs

21e27ab

teunbrand requested a review from hadley December 27, 2022 15:51

hadley approved these changes Jan 1, 2023


teunbrand added 3 commits January 2, 2023 10:46

Helper uses markdown

b30fce4

Process comments

20861e2

Roxygenate

616c027

Revert to \cr separator

7b703f4

teunbrand merged commit 3daeb33 into tidyverse:main Jan 3, 2023

teunbrand deleted the comp_var_docs branch January 3, 2023 20:36
