tidyverse · teunbrand · Jan 3, 2023 · Dec 22, 2022 · Dec 23, 2022 · Dec 23, 2022
diff --git a/R/aes-evaluation.r b/R/aes-evaluation.r
@@ -1,41 +1,114 @@
 #' Control aesthetic evaluation
 #'
-#' Most aesthetics are mapped from variables found in the data. Sometimes,
-#' however, you want to delay the mapping until later in the rendering process.
-#' ggplot2 has three stages of the data that you can map aesthetics from. The
-#' default is to map at the beginning, using the layer data provided by the
-#' user. The second stage is after the data has been transformed by the layer
-#' stat. The third and last stage is after the data has been transformed and
-#' mapped by the plot scales. The most common example of mapping from stat
-#' transformed data is the height of bars in [geom_histogram()]:
-#' the height does not come from a variable in the underlying data, but
-#' is instead mapped to the `count` computed by [stat_bin()]. An example of
-#' mapping from scaled data could be to use a desaturated version of the stroke
-#' colour for fill. If you want to map directly from the layer data you should
-#' not do anything special. In order to map from stat transformed data you
+#' @description
+#' Most [aesthetics][aes()] are mapped from variables found in the data.
+#' Sometimes, however, you want to delay the mapping until later in the
+#' rendering process. ggplot2 has three stages of the data that you can map
+#' aesthetics from, and three functions to control at which stage aesthetics
+#' should be evaluated.
+#'
+#' @description
+#' `after_stat()` replaces the old approaches of using either `stat()`, e.g.
+#' `stat(density)`, or surrounding the variable names with `..`, e.g.
+#' `..density..`.
+#'
+#' @usage
+#' # These functions can be used inside the `aes()` function
+#' # used as the `mapping` argument in layers, for example:
+#' # geom_density(mapping = aes(y = after_stat(scaled)))
+#'
+#' @param x <[`data-masking`][rlang::topic-data-mask]> An aesthetic expression
+#'   using variables calculated by the stat (`after_stat()`) or layer aesthetics
+#'   (`after_scale()`).
+#' @param start <[`data-masking`][rlang::topic-data-mask]> An aesthetic
+#'   expression using variables from the layer data.
+#' @param after_stat <[`data-masking`][rlang::topic-data-mask]> An aesthetic
+#'   expression using variables calculated by the stat.
+#' @param after_scale <[`data-masking`][rlang::topic-data-mask]> An aesthetic
+#'   expression using layer aesthetics.
+#'
+#' @details
+#' # Staging
+#' Below is an overview of the three stages of evaluation and how aesthetic
+#' evaluation can be controlled.
+#'
+#' ## Stage 1: direct input
+#' The default is to map at the beginning, using the layer data provided by
+#' the user. If you want to map directly from the layer data you should not do
+#' anything special. This is the only stage where the original layer data can
+#' be accessed.
+#'
+#' ```r
+#' # 'x' and 'y' are mapped directly
+#' ggplot(mtcars) + geom_point(aes(x = mpg, y = disp))
+#' ```
+#'
+#' ## Stage 2: after stat transformation
+#' The second stage is after the data has been transformed by the layer
+#' stat. The most common example of mapping from stat transformed data is the
+#' height of bars in [geom_histogram()]: the height does not come from a
+#' variable in the underlying data, but is instead mapped to the `count`
+#' computed by [stat_bin()]. In order to map from stat transformed data you
 #' should use the `after_stat()` function to flag that evaluation of the
 #' aesthetic mapping should be postponed until after stat transformation.
-#' Similarly, you should use `after_scale()` to flag evaluation of mapping for
-#' after data has been scaled. If you want to map the same aesthetic multiple
-#' times, e.g. map `x` to a data column for the stat, but remap it for the geom,
-#' you can use the `stage()` function to collect multiple mappings.
-#'
-#' `after_stat()` replaces the old approaches of using either `stat()` or
-#' surrounding the variable names with `..`.
-#'
-#' @note Evaluation after stat transformation will have access to the
-#' variables calculated by the stat, not the original mapped values. Evaluation
-#' after scaling will only have access to the final aesthetics of the layer
-#' (including non-mapped, default aesthetics). The original layer data can only
-#' be accessed at the first stage.
-#'
-#' @param x An aesthetic expression using variables calculated by the stat
-#'   (`after_stat()`) or layer aesthetics (`after_scale()`).
-#' @param start An aesthetic expression using variables from the layer data.
-#' @param after_stat An aesthetic expression using variables calculated by the
-#'   stat.
-#' @param after_scale An aesthetic expression using layer aesthetics.
+#' Evaluation after stat transformation will have access to the variables
+#' calculated by the stat, not the original mapped values. The 'computed
+#' variables' section in each stat lists which variables are available to
+#' access.
+#'
+#' ```r
+#' # The 'y' values for the histogram are computed by the stat
+#' ggplot(faithful, aes(x = waiting)) +
+#'   geom_histogram()
+#'
+#' # Choosing a different computed variable to display, matching up the
+#' # histogram with the density plot
+#' ggplot(faithful, aes(x = waiting)) +
+#'   geom_histogram(aes(y = after_stat(density))) +
+#'   geom_density()
+#' ```
+#'
+#' ## Stage 3: after scale transformation
+#' The third and last stage is after the data has been transformed and
+#' mapped by the plot scales. An example of mapping from scaled data could
+#' be to use a desaturated version of the stroke colour for fill. Use
+#' `after_scale()` to postpone evaluation of a mapping until after the data
+#' has been scaled. Evaluation after scaling will only have access to the
+#' final aesthetics of the layer (including non-mapped, default aesthetics).
+#'
+#' ```r
+#' # The exact colour is known after scale transformation
+#' ggplot(mpg, aes(cty, colour = factor(cyl))) +
+#'   geom_density()
 #'
+#' # We re-use colour properties for the fill without a separate fill scale
+#' ggplot(mpg, aes(cty, colour = factor(cyl))) +
+#'   geom_density(aes(fill = after_scale(alpha(colour, 0.3))))
+#' ```
+#'
+#' ## Complex staging
+#' If you want to map the same aesthetic multiple times, e.g. map `x` to a
+#' data column for the stat, but remap it for the geom, you can use the
+#' `stage()` function to collect multiple mappings.
+#'
+#' ```r
+#' # Use stage to modify the scaled fill
+#' ggplot(mpg, aes(class, hwy)) +
+#'   geom_boxplot(aes(fill = stage(class, after_scale = alpha(fill, 0.4))))
+#'
+#' # Using data for computing summary, but placing label elsewhere.
+#' # Also, we're making our own computed variable to use for the label.
+#' ggplot(mpg, aes(class, displ)) +
+#'   geom_violin() +
+#'   stat_summary(
+#'     aes(
+#'       y = stage(displ, after_stat = 8),
+#'       label = after_stat(paste(mean, "±", sd))
+#'     ),
+#'     geom = "text",
+#'     fun.data = ~ round(data.frame(mean = mean(.x), sd = sd(.x)), 2)
+#'   )
+#' ```
 #' @rdname aes_eval
 #' @name aes_eval
 #'
@@ -55,6 +128,52 @@
 #' # Use stage to modify the scaled fill
 #' ggplot(mpg, aes(class, hwy)) +
 #'   geom_boxplot(aes(fill = stage(class, after_scale = alpha(fill, 0.4))))
+#'
+#' # Making a proportional stacked density plot
+#' ggplot(mpg, aes(cty)) +
+#'   geom_density(
+#'     aes(
+#'       colour = factor(cyl),
+#'       fill = after_scale(alpha(colour, 0.3)),
+#'       y = after_stat(count / sum(n[!duplicated(group)]))
+#'     ),
+#'     position = "stack", bw = 1
+#'   ) +
+#'   geom_density(bw = 1)
+#'
+#' # Imitating a ridgeline plot
+#' ggplot(mpg, aes(cty, colour = factor(cyl))) +
+#'   geom_ribbon(
+#'     stat = "density", outline.type = "upper",
+#'     aes(
+#'       fill = after_scale(alpha(colour, 0.3)),
+#'       ymin = after_stat(group),
+#'       ymax = after_stat(group + ndensity)
+#'     )
+#'   )
+#'
+#' # Labelling a bar plot
+#' ggplot(mpg, aes(class)) +
+#'   geom_bar() +
+#'   geom_text(
+#'     aes(
+#'       y = after_stat(count + 2),
+#'       label = after_stat(count)
+#'     ),
+#'     stat = "count"
+#'   )
+#'
+#' # Labelling the upper hinge of a boxplot,
+#' # inspired by June Choe
+#' ggplot(mpg, aes(displ, class)) +
+#'   geom_boxplot(outlier.shape = NA) +
+#'   geom_text(
+#'     aes(
+#'       label = after_stat(xmax),
+#'       x = stage(displ, after_stat = xmax)
+#'     ),
+#'     stat = "boxplot", hjust = -0.5
+#'   )
 NULL
 
 #' @rdname aes_eval
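
Note on the 'Stage 3' text: one way to see what `after_scale()` produces is to build the plot and inspect the layer data. The sketch below assumes a ggplot2 version that provides `after_scale()` and uses `layer_data()`, which is not touched by this diff; the variable names `p`, `d` and `fill_matches` are illustrative only.

```r
library(ggplot2)

# Re-use the scaled colour for the fill, as in the 'Stage 3' example
p <- ggplot(mpg, aes(cty, colour = factor(cyl))) +
  geom_density(aes(fill = after_scale(alpha(colour, 0.3))))

# layer_data() returns the layer's data after stats and scales were applied
d <- layer_data(p)

# Each fill should be the matching colour at 30% opacity
fill_matches <- all(d$fill == alpha(d$colour, 0.3))
fill_matches
```

Comparing against `alpha(d$colour, 0.3)` rather than hard-coded hex strings keeps the check independent of the default palette.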

diff --git a/R/aes.r b/R/aes.r
@@ -31,6 +31,8 @@ NULL
 #'   are typically omitted because they are so common; all other aesthetics must be named.
 #' @seealso [vars()] for another quoting function designed for
 #'   faceting specifications.
+#'
+#'   [Delayed evaluation][aes_eval] for working with computed variables.
 #' @return A list with class `uneval`. Components of the list are either
 #'   quosures or constants.
 #' @export

diff --git a/R/geom-dotplot.r b/R/geom-dotplot.r
@@ -17,18 +17,17 @@
 #' to match the number of dots.
 #'
 #' @eval rd_aesthetics("geom", "dotplot")
-#' @section Computed variables:
-#' \describe{
-#'   \item{x}{center of each bin, if binaxis is "x"}
-#'   \item{y}{center of each bin, if binaxis is "x"}
-#'   \item{binwidth}{max width of each bin if method is "dotdensity";
-#'     width of each bin if method is "histodot"}
-#'   \item{count}{number of points in bin}
-#'   \item{ncount}{count, scaled to maximum of 1}
-#'   \item{density}{density of points in bin, scaled to integrate to 1,
-#'     if method is "histodot"}
-#'   \item{ndensity}{density, scaled to maximum of 1, if method is "histodot"}
-#' }
+#' @eval rd_computed_vars(
+#'   x = 'center of each bin, if `binaxis` is `"x"`.',
+#'   y = 'center of each bin, if `binaxis` is `"y"`.',
+#'   binwidth = 'maximum width of each bin if method is `"dotdensity"`;
+#'   width of each bin if method is `"histodot"`.',
+#'   count   = "number of points in bin.",
+#'   ncount  = "count, scaled to a maximum of 1.",
+#'   density = 'density of points in bin, scaled to integrate to 1, if method
+#'   is `"histodot"`.',
+#'   ndensity = 'density, scaled to a maximum of 1, if method is `"histodot"`.'
+#' )
 #'
 #' @inheritParams layer
 #' @inheritParams geom_point

diff --git a/R/stat-bin.r b/R/stat-bin.r
@@ -27,14 +27,13 @@
 #'   or left edges of bins are included in the bin.
 #' @param pad If `TRUE`, adds empty bins at either end of x. This ensures
 #'   frequency polygons touch 0. Defaults to `FALSE`.
-#' @section Computed variables:
-#' \describe{
-#'   \item{`count`}{number of points in bin}
-#'   \item{`density`}{density of points in bin, scaled to integrate to 1}
-#'   \item{`ncount`}{count, scaled to maximum of 1}
-#'   \item{`ndensity`}{density, scaled to maximum of 1}
-#'   \item{`width`}{widths of bins}
-#' }
+#' @eval rd_computed_vars(
+#'   count    = "number of points in bin.",
+#'   density  = "density of points in bin, scaled to integrate to 1.",
+#'   ncount   = "count, scaled to a maximum of 1.",
+#'   ndensity = "density, scaled to a maximum of 1.",
+#'   width    = "widths of bins."
+#' )
 #'
 #' @section Dropped variables:
 #' \describe{
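
Note on the `stat_bin()` computed variables: the relationships stated in the descriptions can be checked on built layer data. This is a sketch under the assumption that `layer_data()` keeps the computed columns alongside `xmin` and `xmax`; `bins = 30` is an arbitrary choice made only to silence the default-bins message.

```r
library(ggplot2)

# Map bar height to the computed density instead of the default count
p <- ggplot(faithful, aes(x = waiting)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30)
d <- layer_data(p)

# 'density' is scaled to integrate to 1 over the bins
bin_width <- d$xmax - d$xmin
area <- sum(d$density * bin_width)

# 'ncount' and 'ndensity' are rescaled to a maximum of 1
ncount_ok <- isTRUE(all.equal(d$ncount, d$count / max(d$count)))
ndensity_ok <- isTRUE(all.equal(d$ndensity, d$density / max(d$density)))
```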

diff --git a/R/stat-bin2d.r b/R/stat-bin2d.r
@@ -5,13 +5,12 @@
 #' @param drop if `TRUE` removes all cells with 0 counts.
 #' @export
 #' @rdname geom_bin_2d
-#' @section Computed variables:
-#' \describe{
-#'   \item{count}{number of points in bin}
-#'   \item{density}{density of points in bin, scaled to integrate to 1}
-#'   \item{ncount}{count, scaled to maximum of 1}
-#'   \item{ndensity}{density, scaled to maximum of 1}
-#' }
+#' @eval rd_computed_vars(
+#'   count    = "number of points in bin.",
+#'   density  = "density of points in bin, scaled to integrate to 1.",
+#'   ncount   = "count, scaled to a maximum of 1.",
+#'   ndensity = "density, scaled to a maximum of 1."
+#' )
 stat_bin_2d <- function(mapping = NULL, data = NULL,
                         geom = "tile", position = "identity",
                         ...,

diff --git a/R/stat-binhex.r b/R/stat-binhex.r
@@ -1,13 +1,12 @@
 #' @export
 #' @rdname geom_hex
 #' @inheritParams stat_bin_2d
-#' @section Computed variables:
-#' \describe{
-#'   \item{count}{number of points in bin}
-#'   \item{density}{density of points in bin, scaled to integrate to 1}
-#'   \item{ncount}{count, scaled to maximum of 1}
-#'   \item{ndensity}{density, scaled to maximum of 1}
-#' }
+#' @eval rd_computed_vars(
+#'   count    = "number of points in bin.",
+#'   density  = "density of points in bin, scaled to integrate to 1.",
+#'   ncount   = "count, scaled to a maximum of 1.",
+#'   ndensity = "density, scaled to a maximum of 1."
+#' )
 stat_bin_hex <- function(mapping = NULL, data = NULL,
                          geom = "hex", position = "identity",
                          ...,

diff --git a/R/stat-boxplot.r b/R/stat-boxplot.r
@@ -1,19 +1,21 @@
 #' @rdname geom_boxplot
 #' @param coef Length of the whiskers as multiple of IQR. Defaults to 1.5.
 #' @inheritParams stat_identity
-#' @section Computed variables:
-#' `stat_boxplot()` provides the following variables, some of which depend on the orientation:
-#' \describe{
-#'   \item{width}{width of boxplot}
-#'   \item{ymin *or* xmin}{lower whisker = smallest observation greater than or equal to lower hinge - 1.5 * IQR}
-#'   \item{lower *or* xlower}{lower hinge, 25% quantile}
-#'   \item{notchlower}{lower edge of notch = median - 1.58 * IQR / sqrt(n)}
-#'   \item{middle *or* xmiddle}{median, 50% quantile}
-#'   \item{notchupper}{upper edge of notch = median + 1.58 * IQR / sqrt(n)}
-#'   \item{upper *or* xupper}{upper hinge, 75% quantile}
-#'   \item{ymax *or* xmax}{upper whisker = largest observation less than or equal to upper hinge + 1.5 * IQR}
-#' }
 #' @export
+#' @eval rd_computed_vars(
+#'   .details = "`stat_boxplot()` provides the following variables, some of
+#'   which depend on the orientation:",
+#'   width = "width of boxplot.",
+#'   "ymin|xmin" = "lower whisker = smallest observation greater than or equal
+#'   to lower hinge - 1.5 * IQR.",
+#'   "lower|xlower" = "lower hinge, 25% quantile.",
+#'   notchlower = "lower edge of notch = median - 1.58 * IQR / sqrt(n).",
+#'   "middle|xmiddle" = "median, 50% quantile.",
+#'   notchupper = "upper edge of notch = median + 1.58 * IQR / sqrt(n).",
+#'   "upper|xupper" = "upper hinge, 75% quantile.",
+#'   "ymax|xmax" = "upper whisker = largest observation less than or equal to
+#'   upper hinge + 1.5 * IQR."
+#' )
 stat_boxplot <- function(mapping = NULL, data = NULL,
                          geom = "boxplot", position = "dodge2",
                          ...,
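
Note on the argument names: the definition of `rd_computed_vars()` is not part of this excerpt. Comparing removed and added lines suggests that `"a|b"` replaces the old `a *or* b` labels and `"a,b,c"` replaces comma-separated labels. The helper below is hypothetical and is not the PR's implementation; it only illustrates how such keys could be expanded, assuming that reading of the convention is right.

```r
# Hypothetical helper, NOT the rd_computed_vars() implementation from this PR
format_computed_keys <- function(keys) {
  # "a|b" is assumed to mark orientation-dependent alternatives
  keys <- gsub("|", "` *or* `", keys, fixed = TRUE)
  # "a,b,c" is assumed to mark variables that share one description
  keys <- gsub(",", "`, `", keys, fixed = TRUE)
  # Wrap the result in backticks so every variable renders as code
  paste0("`", keys, "`")
}

labels <- format_computed_keys(
  c("width", "ymin|xmin", "level_low,level_high,level_mid")
)
labels
```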

diff --git a/R/stat-contour.r b/R/stat-contour.r
@@ -3,21 +3,21 @@
 #' @export
 #' @eval rd_aesthetics("stat", "contour")
 #' @eval rd_aesthetics("stat", "contour_filled")
-#' @section Computed variables:
-#' The computed variables differ somewhat for contour lines (computed by
-#' `stat_contour()`) and contour bands (filled contours, computed by `stat_contour_filled()`).
-#' The variables `nlevel` and `piece` are available for both, whereas `level_low`, `level_high`,
-#' and `level_mid` are only available for bands. The variable `level` is a numeric or a factor
-#' depending on whether lines or bands are calculated.
-#' \describe{
-#'  \item{`level`}{Height of contour. For contour lines, this is numeric vector that
-#'    represents bin boundaries. For contour bands, this is an ordered factor that
-#'    represents bin ranges.}
-#'  \item{`level_low`, `level_high`, `level_mid`}{(contour bands only) Lower and upper
-#'    bin boundaries for each band, as well the mid point between the boundaries.}
-#'  \item{`nlevel`}{Height of contour, scaled to maximum of 1.}
-#'  \item{`piece`}{Contour piece (an integer).}
-#' }
+#' @eval rd_computed_vars(
+#'   .details = "The computed variables differ somewhat for contour lines
+#'   (computed by `stat_contour()`) and contour bands (filled contours,
+#'   computed by `stat_contour_filled()`). The variables `nlevel` and `piece`
+#'   are available for both, whereas `level_low`, `level_high`, and `level_mid`
+#'   are only available for bands. The variable `level` is a numeric or a factor
+#'   depending on whether lines or bands are calculated.",
+#'   level = "Height of contour. For contour lines, this is a numeric vector
+#'   that represents bin boundaries. For contour bands, this is an ordered
+#'   factor that represents bin ranges.",
+#'   "level_low,level_high,level_mid" = "(contour bands only) Lower and upper
+#'   bin boundaries for each band, as well as the mid point between boundaries.",
+#'   nlevel = "Height of contour, scaled to a maximum of 1.",
+#'   piece = "Contour piece (an integer)."
+#' )
 #'
 #' @section Dropped variables:
 #' \describe{
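
Note on `level`: the difference between lines and bands described in `.details` can be observed directly. The sketch below uses the `faithfuld` dataset shipped with ggplot2 and assumes `layer_data()` exposes the computed columns; `base`, `line_data` and `band_data` are illustrative names.

```r
library(ggplot2)

base <- ggplot(faithfuld, aes(waiting, eruptions, z = density))

# Contour lines: 'level' is a numeric vector of bin boundaries
line_data <- layer_data(base + geom_contour())

# Contour bands: 'level' is an ordered factor of bin ranges
band_data <- layer_data(base + geom_contour_filled())

level_is_numeric <- is.numeric(line_data$level)
level_is_ordered <- is.ordered(band_data$level)

# The band-only variables are present for filled contours
band_vars <- c("level_low", "level_high", "level_mid")
has_band_vars <- all(band_vars %in% names(band_data))
```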

diff --git a/R/stat-count.r b/R/stat-count.r
@@ -1,8 +1,7 @@
-#' @section Computed variables:
-#' \describe{
-#'   \item{count}{number of points in bin}
-#'   \item{prop}{groupwise proportion}
-#' }
+#' @eval rd_computed_vars(
+#'   count = "number of points in bin.",
+#'   prop  = "groupwise proportion."
+#' )
 #' @seealso [stat_bin()], which bins data in ranges and counts the
 #'   cases in each range. It differs from `stat_count()`, which counts the
 #'   number of cases at each `x` position (without binning into ranges).
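
Note on `prop`: because the proportion is computed per group, all bars have to share one group for the proportions to sum to 1 across `x`. A minimal sketch, assuming `group = 1` is used to place every bar in the same group:

```r
library(ggplot2)

# Put all bars in one group so 'prop' is relative to the whole dataset
p <- ggplot(mpg, aes(class)) +
  geom_bar(aes(y = after_stat(prop), group = 1))
d <- layer_data(p)

# Within the single group, proportions sum to 1 and counts to nrow(mpg)
prop_total <- sum(d$prop)
count_total <- sum(d$count)
```

Without `group = 1`, the discrete `class` variable defines one group per bar.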