Skip to content

'geom_rug()' prints a warning when 'na.rm = FALSE' #5906

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Jun 6, 2024
Merged

'geom_rug()' prints a warning when 'na.rm = FALSE' #5906

merged 9 commits into from
Jun 6, 2024

Conversation

pn317
Copy link
Contributor

@pn317 pn317 commented May 23, 2024

Fixes issue #5905.
When presented with missing values, 'geom_rug()' was not printing a warning message, contrary to the documentation.
A warning message is now printed when 'na.rm = FALSE' and suppressed when 'na.rm = TRUE', as expected.

Fixes issue #5905.
When presented with missing values, 'geom_rug()' was not printing a
warning message, contrary to the documentation.
A warning message is now printed when 'na.rm = FALSE' ad suppressed when
'na.rm = TRUE', as expected.
Copy link
Collaborator

@teunbrand teunbrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi thanks for preparing a PR to fix this!

There is a subtle problem with assigning variables to self with regards to statefulness. While it appears to work fine, like in the plot below;

devtools::load_all("~/packages/ggplot2/") # copied from your PR
#> ℹ Loading ggplot2

GeomRug$required_aes # empty
#> character(0)

ggplot(mpg, aes(displ, hwy)) +
  geom_rug()

What it actually does is to modify the global state of GeomRug to now have both x and y as required aesthetics. This will crash a subsequent plot wherein only one of the two is needed.

GeomRug$required_aes # populated
#> [1] "x" "y"

ggplot(mpg, aes(displ)) +
  geom_rug()
#> Error in `geom_rug()`:
#> ! Problem while setting up geom.
#> ℹ Error occurred in the 1st layer.
#> Caused by error in `compute_geom_1()` at ggplot2/R/ggproto.R:190:17:
#> ! `geom_rug()` requires the following missing aesthetics: y.

Created on 2024-05-23 with reprex v2.1.0

I think the solution to this will be to make a new GeomRug$handle_na() method, that computes the correct required_aes locally (without assigning it to self) and passes that on remove_missing().

@pn317
Copy link
Contributor Author

pn317 commented May 23, 2024

@teunbrand thanks for pointing that out. I had missed that when testing because I was just alternating between

ggplot(mtcars2, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_rug(sides = "b")

which (rightly) doesn't warn, because there are no NAs in wt, and

ggplot(mtcars2, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_rug(sides = "l")

which does warn (with my fix) and seems to be happy by dumb luck because I am still providing both x and y aesthetics.

I will push a new commit to adapt handle_na() instead, as you suggest.

Make local changes in 'handle_na() instead of global changes to 'GeomRug'
@teunbrand
Copy link
Collaborator

Yeah this looks great, thanks! Is there any way I could convince you to add a small test that simply checks that the warning is emitted when there is missing data?

@pn317
Copy link
Contributor Author

pn317 commented May 23, 2024

You can convince me! I'll add that shortly

Also added the issue number to NEWS
@teunbrand
Copy link
Collaborator

OK the following is my fault for not noticing in the first place, but I think there is an issue with removing rows where x or y are missing.

So with the current way things work, a missing y still renders the rug for the x that is present:

library(ggplot2)

df <- data.frame(x = 1:2, y = c(1, NA))

ggplot(df, aes(x, y)) +
  geom_rug()

However, when we start removing missing values, we also remove rows that are semi-complete. Notice how x = 2 no longer drawn with this approach.

devtools::load_all("~/packages/ggplot2") # this PR
#> ℹ Loading ggplot2

ggplot(df, aes(x, y)) +
  geom_rug()
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_rug()`).

Created on 2024-05-24 with reprex v2.1.0

So I think the correct approach is to only drop rows where x and y are missing if drawing both directions, and only the relevant aesthetic when the rug is drawn in one direction. I'd be happy to help out if this is becoming too complicated.

@pn317
Copy link
Contributor Author

pn317 commented May 29, 2024

@teunbrand -- sorry for the slow reply; I have been away from work for a bit. I'll have a go at doing something useful and shout if things are getting too complicated. Thanks for the support.

pn317 and others added 2 commits June 4, 2024 15:15
When plotting rugs in both the 'x' and 'y' direction simultaneously,
values of 'x' were being dropped when 'y' was missing, and vice versa.
A warning will be given for each axis ('x' or 'y') that contains missing
values, if 'na.rm = FALSE'.
R/geom-rug.R Outdated
paste0(sides_aes, collapse = ""),
"x" = ,
"y" = df_list[[1]],
"xy" = dplyr::union(df_list[[1]], df_list[[2]])
Copy link
Collaborator

@teunbrand teunbrand Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ggplot2 doesn't use dplyr (it is in suggests because of some cross-functionality).
I think a similar thing could be achieved by using vec_set_union() from {vctrs}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - I have updated this as suggested

paste0("Removed ", n_missing, " rows containing missing values or values outside the scale range")
)

expect_no_warning(ggplotGrob(p2))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be accompanied by a bump of the suggested {testthat} version to 3.1.5

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

pn317 added 3 commits June 6, 2024 13:46
Use 'vctrs::vec_set_union()' instead of 'dplyr::union()'
…into geom-rug-na-warning

Merge changes in NEWS.md
Copy link
Collaborator

@teunbrand teunbrand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thank you the contribution and entertaining the suggestions!
I think this should be good to go, but if you want you can add your github handle to the news bullet.

@pn317
Copy link
Contributor Author

pn317 commented Jun 6, 2024

Thanks very much. I've updated the NEWS bullet as suggested.

@teunbrand teunbrand merged commit e635262 into tidyverse:main Jun 6, 2024
11 checks passed
@teunbrand
Copy link
Collaborator

Thanks again!

@pn317 pn317 deleted the geom-rug-na-warning branch June 6, 2024 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants