Commit 97137c4

hadley and DavisVaughan authored
Rework converting vignette (#343)
* Rework converting vignette
* A little tweaking

---------

Co-authored-by: Davis Vaughan <[email protected]>
1 parent 59ee609 commit 97137c4

1 file changed

+95
-111
lines changed

vignettes/converting.Rmd

Lines changed: 95 additions & 111 deletions
@@ -22,123 +22,108 @@ In many cases there is no need to convert a package from Rcpp.
2222
If the code is already written and you don't have a very compelling need to use cpp11 I would recommend you continue to use Rcpp.
2323
However if you _do_ feel like your project will benefit from using cpp11 this vignette will provide some guidance on doing the conversion.
2424

25-
It is also a place to highlight some of the largest differences between Rcpp and cpp11.
26-
27-
## Class comparison table
28-
29-
| Rcpp | cpp11 (read-only) | cpp11 (writable) | cpp11 header |
30-
| --- | --- | --- | --- |
31-
| NumericVector | doubles | writable::doubles | <cpp11/doubles.hpp> |
32-
| NumericMatrix | doubles_matrix<> | writable::doubles_matrix<> | <cpp11/doubles.hpp> |
33-
| IntegerVector | integers | writable::integers | <cpp11/integers.hpp> |
34-
| IntegerMatrix | integers_matrix<> | writable::integers_matrix<> | <cpp11/integers.hpp> |
35-
| CharacterVector | strings | writable::strings | <cpp11/strings.hpp> |
36-
| RawVector | raws | writable::raws | <cpp11/raws.hpp> |
37-
| List | list | writable::list | <cpp11/list.hpp> |
38-
| RObject | sexp | | <cpp11/sexp.hpp> |
39-
| XPtr | | external_pointer | <cpp11/external_pointer.hpp> |
40-
| Environment | | environment | <cpp11/environment.hpp> |
41-
| Function | | function | <cpp11/function.hpp> |
42-
| Environment (namespace) | | package | <cpp11/function.hpp> |
43-
| wrap | | as_sexp | <cpp11/as.hpp> |
44-
| as | | as_cpp | <cpp11/as.hpp> |
45-
| stop | stop | | <cpp11/protect.hpp> |
46-
| checkUserInterrupt | check_user_interrupt | | <cpp11/protect.hpp> |
47-
48-
## Incomplete list of Rcpp features not included in cpp11
49-
50-
- None of [Modules](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-modules.pdf)
51-
- None of [Sugar](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-sugar.pdf)
52-
- Some parts of [Attributes](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-attributes.pdf)
53-
- No dependencies
54-
- No random number generator restoration
55-
- No support for roxygen2 comments
56-
- No interfaces
25+
## Getting started
26+
27+
1. Add cpp11 by calling `usethis::use_cpp11()`.
28+
29+
1. Start converting function by function.
30+
31+
Converting the code a bit at a time (and regularly running your tests) is the best way to do the conversion correctly and make progress. Doing a separate commit after converting each file (or possibly each function) can make finding any regressions with [git bisect](https://youtu.be/KKeucpfAuuA) much easier in the future.
32+
33+
1. Convert `#include <Rcpp.h>` to `#include <cpp11.hpp>`.
34+
1. Convert all instances of `// [[Rcpp::export]]` to `[[cpp11::register]]`.
35+
1. Grep for `Rcpp::` and replace with the equivalent cpp11 function using the cheatsheets below.
36+
37+
1. Remove Rcpp
38+
1. Remove Rcpp from the `LinkingTo` and `Imports` fields.
39+
1. Remove `@importFrom Rcpp sourceCpp`.
40+
    1. Delete `src/RcppExports.cpp` and `R/RcppExports.R`.
41+
1. Delete `src/Makevars` if it only contains `PKG_CPPFLAGS=-DSTRICT_R_HEADERS`.
42+
1. Clean out old compiled code with `pkgbuild::clean_dll()`.
43+
1. Re-document the package to update the `NAMESPACE`.
44+
45+
## Cheatsheet
46+
47+
### Vectors
48+
49+
| Rcpp | cpp11 (read-only) | cpp11 (writable) |
50+
| --- | --- | --- |
51+
| NumericVector | doubles | writable::doubles |
52+
| NumericMatrix | doubles_matrix<> | writable::doubles_matrix<> |
53+
| IntegerVector | integers | writable::integers |
54+
| IntegerMatrix | integers_matrix<> | writable::integers_matrix<> |
55+
| CharacterVector | strings | writable::strings |
56+
| RawVector | raws | writable::raws |
57+
| List | list | writable::list |
58+
| RObject | sexp | |
59+
60+
Note that each cpp11 vector class has a read-only and a writable version.
61+
The default classes, e.g. `cpp11::doubles` are *read-only* classes that do not permit modification.
62+
If you want to modify the data or create a new vector, use the writable variant.
5763

58-
## Read-only vs writable vectors
64+
Another major difference between Rcpp and cpp11 is how vectors are grown.
65+
Rcpp vectors have a `push_back()` method, but unlike `std::vector` no additional space is reserved when pushing.
66+
This makes calling `push_back()` repeatedly very expensive, as the entire vector has to be copied on each call.
67+
In contrast `cpp11` vectors grow efficiently, reserving extra space.
68+
See <https://cpp11.r-lib.org/articles/motivations.html#growing-vectors> for more details.
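The cost difference described above can be illustrated with a small counting model. This is a sketch in plain standard C++, not Rcpp or cpp11 code: `copies_exact_fit()` and `copies_doubling()` are hypothetical helpers, and the growth factor of 2 is an assumption chosen for illustration rather than the factor cpp11 actually uses.

```cpp
#include <cstddef>

// Model of appending with no spare capacity: every append allocates a
// buffer of exactly size + 1 and copies all existing elements into it.
// Returns the total number of element copies for n appends.
std::size_t copies_exact_fit(std::size_t n) {
  std::size_t size = 0;
  std::size_t copies = 0;
  for (std::size_t i = 0; i < n; ++i) {
    copies += size;  // all existing elements are copied on every append
    ++size;
  }
  return copies;
}

// Model of appending with reserved space: the buffer is only reallocated
// when it is full, and its capacity doubles each time (assumed factor).
std::size_t copies_doubling(std::size_t n) {
  std::size_t size = 0;
  std::size_t capacity = 0;
  std::size_t copies = 0;
  for (std::size_t i = 0; i < n; ++i) {
    if (size == capacity) {
      copies += size;  // existing elements move to the larger buffer
      capacity = (capacity == 0) ? 1 : capacity * 2;
    }
    ++size;
  }
  return copies;
}
```

Under this model, 1000 appends cost 499500 element copies with exact-fit growth but only 1023 with doubling, which is the quadratic versus linear behaviour the paragraph describes.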
5969

60-
The largest difference between cpp11 and Rcpp classes is that Rcpp classes modify their data in place, whereas cpp11 classes require copying the data to a writable class for modification.
70+
Rcpp also allows very flexible implicit conversions, e.g. if you pass a `REALSXP` to a function that takes an `Rcpp::IntegerVector`, it is implicitly converted to an `INTSXP`.
71+
These conversions are nice for usability, but require (implicit) duplication of the data, with the associated runtime costs.
72+
cpp11 throws an error in these cases. If you want the implicit coercions you can add a call to `as.integer()` or `as.double()` as appropriate from R when you call the function.
6173

62-
The default classes, e.g. `cpp11::doubles` are *read-only* classes that do not permit modification.
63-
If you want to modify the data you need to use the classes in the `cpp11::writable` namespace, e.g. `cpp11::writable::doubles`.
74+
### Other objects
6475

65-
In addition use the `writable` variants if you need to create a new R vector entirely in C++.
76+
| Rcpp | cpp11 |
77+
| --- | --- |
78+
| XPtr | external_pointer |
79+
| Environment | environment |
80+
| Function | function |
81+
| Environment (namespace) | package |
6682

67-
## Fewer implicit conversions
83+
### Functions
6884

69-
Rcpp also allows very flexible implicit conversions, e.g. if you pass a `REALSXP` to a function that takes a `Rcpp::IntegerVector()` it is implicitly converted to a `INTSXP`.
70-
These conversions are nice for usability, but require (implicit) duplication of the data, with the associated runtime costs.
85+
| Rcpp | cpp11 |
86+
| --- | --- |
87+
|`wrap()` | `as_sexp()` |
88+
|`as()` | `as_cpp()` |
89+
|`stop()` | `stop()` |
90+
|`checkUserInterrupt()` | `check_user_interrupt()` |
91+
|`CharacterVector::create("a", "b", "c")` | `{"a", "b", "c"}` |
7192

72-
cpp11 throws an error in these cases. If you want the implicit coercions you can add a call to `as.integer()` or `as.double()` as appropriate from R when you call the function.
93+
Note that `cpp11::stop()` and `cpp11::warning()` are thin wrappers around `Rf_error()` and `Rf_warning()`.
94+
These are simple C functions with a `printf()` API, so they do not understand C++ objects like `std::string`.
95+
Therefore you need to call `obj.c_str()` when passing string data to them.
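The `c_str()` requirement applies to any `printf()`-style C API. The sketch below uses a hypothetical stand-in, `format_message()`, which formats into a buffer instead of raising an R error so that it can be compiled without R. `describe_missing_column()` is likewise an illustrative name, not part of cpp11.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a printf-style C function: it formats the
// message into a fixed buffer and returns it rather than signalling an error.
std::string format_message(const char* fmt, ...) {
  char buf[256];
  va_list args;
  va_start(args, fmt);
  std::vsnprintf(buf, sizeof(buf), fmt, args);
  va_end(args);
  return std::string(buf);
}

std::string describe_missing_column(const std::string& name, int n_cols) {
  // "%s" expects a const char*, so pass name.c_str() rather than name:
  // handing a std::string object to a variadic C function is not valid.
  return format_message("Column '%s' not found among %d columns",
                        name.c_str(), n_cols);
}
```

With cpp11 the call has the same shape, along the lines of `cpp11::stop("Column '%s' not found", name.c_str())`.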
7396

74-
## Calling R functions from C++
97+
### R functions
7598

7699
Calling R functions from C++ is similar to using Rcpp.
77100

78101
```c++
102+
// Rcpp -----------------------------------------------
79103
Rcpp::Function as_tibble("as_tibble", Rcpp::Environment::namespace_env("tibble"));
80104
as_tibble(x, Rcpp::Named(".rows", num_rows), Rcpp::Named(".name_repair", name_repair));
81-
```
82105

83-
```c++
106+
// cpp11 -----------------------------------------------
84107
using namespace cpp11::literals; // so we can use ""_nm syntax
85108

86109
auto as_tibble = cpp11::package("tibble")["as_tibble"];
87110
as_tibble(x, ".rows"_nm = num_rows, ".name_repair"_nm = name_repair);
88111
```
89112
113+
### Unsupported Rcpp features
90114
91-
## Appending behavior
92-
93-
One major difference in Rcpp and cpp11 is how vectors are grown.
94-
Rcpp vectors have a `push_back()` method, but unlike `std::vector()` no additional space is reserved when pushing.
95-
This makes calling `push_back()` repeatably very expensive, as the entire vector has to be copied each call.
96-
97-
In contrast `cpp11` vectors grow efficiently, reserving extra space.
98-
Because of this you can do ~10,000,000 vector appends with cpp11 in approximately the same amount of time that Rcpp does 10,000, as this benchmark demonstrates.
99-
100-
```{r, message = FALSE, eval = should_run_benchmarks()}
101-
library(cpp11test)
102-
grid <- expand.grid(len = 10 ^ (0:7), pkg = "cpp11", stringsAsFactors = FALSE)
103-
grid <- rbind(
104-
grid,
105-
expand.grid(len = 10 ^ (0:4), pkg = "rcpp", stringsAsFactors = FALSE)
106-
)
107-
b_grow <- bench::press(.grid = grid,
108-
{
109-
fun = match.fun(sprintf("%sgrow_", ifelse(pkg == "cpp11", "", paste0(pkg, "_"))))
110-
bench::mark(
111-
fun(len)
112-
)
113-
}
114-
)[c("len", "pkg", "min", "mem_alloc", "n_itr", "n_gc")]
115-
saveRDS(b_grow, "growth.Rds", version = 2)
116-
```
117-
118-
```{r, echo = FALSE, dev = "svg", fig.ext = "svg", eval = capabilities("cairo")}
119-
b_grow <- readRDS("growth.Rds")
120-
library(ggplot2)
121-
ggplot(b_grow, aes(x = len, y = min, color = pkg)) +
122-
geom_point() +
123-
geom_line() +
124-
bench::scale_y_bench_time() +
125-
scale_x_log10(
126-
breaks = scales::trans_breaks("log10", function(x) 10^x),
127-
labels = scales::trans_format("log10", scales::math_format(10^.x))
128-
) +
129-
coord_fixed() +
130-
theme(panel.grid.minor = element_blank()) +
131-
labs(title = "log-log plot of vector size vs construction time", x = NULL, y = NULL)
132-
```
133-
134-
```{r, echo = FALSE}
135-
knitr::kable(b_grow)
136-
```
115+
- None of [Modules](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-modules.pdf)
116+
- None of [Sugar](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-sugar.pdf)
117+
- Some parts of [Attributes](https://CRAN.R-project.org/package=Rcpp/vignettes/Rcpp-attributes.pdf)
118+
- No dependencies
119+
- No random number generator restoration
120+
- No support for roxygen2 comments
121+
- No interfaces
137122
138-
## Random Number behavior
123+
### RNGs
139124
140-
Rcpp unconditionally includes calls to `GetRNGstate()` and `PutRNGstate()` before each wrapped function.
141-
This ensures that if any C++ code calls the R API functions `unif_rand()`, `norm_rand()`, `exp_rand()` or `R_unif_index()` the random seed state is set accordingly.
125+
Rcpp includes calls to `GetRNGstate()` and `PutRNGstate()` around the wrapped function.
126+
This ensures that if any C++ code calls the R API functions `unif_rand()`, `norm_rand()`, `exp_rand()`, or `R_unif_index()` the random seed state is set accordingly.
142127
cpp11 does _not_ do this, so you must include the calls to `GetRNGstate()` and `PutRNGstate()` _yourself_ if you use any of those functions in your C++ code.
143128
See [R-exts 6.3 - Random number generation](https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Random-numbers) for details on these functions.
144129
@@ -162,31 +147,36 @@ void foo() {
162147
}
163148
```
164149

165-
## Mechanics of converting a package from Rcpp
166-
167-
1. Add cpp11 to `LinkingTo`
168-
1. Convert all instances of `// [[Rcpp::export]]` to `[[cpp11::register]]`
169-
1. Clean and recompile the package, e.g. `pkgbuild::clean_dll()` `pkgload::load_all()`
170-
1. Run tests `devtools::test()`
171-
1. Start converting function by function
172-
- Remember you can usually inter-convert between cpp11 and Rcpp classes by going through `SEXP` if needed.
173-
- Converting the code a bit at a time (and regularly running your tests) is the best way to do the conversion correctly and make progress
174-
- Doing a separate commit after converting each file (or possibly each function) can make finding any regressions with [git bisect](https://youtu.be/KKeucpfAuuA) much easier in the future.
175150

176151
## Common issues when converting
177152

178153
### STL includes
179154

180155
Rcpp.h includes a number of STL headers automatically, notably `<string>` and `<vector>`, however the cpp11 headers generally do not. If you have errors like
181156

182-
> error: no type named 'string' in namespace 'std'
157+
```
158+
error: no type named 'string' in namespace 'std'
159+
```
183160

184161
You will need to include the appropriate STL header, in this case `<string>`.
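As a minimal sketch of the fix, a file that previously relied on `Rcpp.h` for its STL headers would include each one it uses directly. `join_names()` is a hypothetical helper used only to show the pattern.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::string is used directly, so include it directly
#include <vector>   // likewise for std::vector

// Hypothetical helper: joins the names with ", " between them.
std::string join_names(const std::vector<std::string>& names) {
  std::string out;
  for (std::size_t i = 0; i < names.size(); ++i) {
    if (i > 0) out += ", ";
    out += names[i];
  }
  return out;
}
```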
185162

163+
### Strict headers
164+
165+
If you see something like this:
166+
167+
```
168+
In file included from file.cpp:1:
169+
In file included from path/cpp11/include/cpp11.hpp:3:
170+
path/cpp11/include/cpp11/R.hpp:12:9: warning: 'STRICT_R_HEADERS' macro redefined [-Wmacro-redefined]
171+
#define STRICT_R_HEADERS
172+
```
173+
174+
Make sure to remove `PKG_CPPFLAGS=-DSTRICT_R_HEADERS` from `src/Makevars`.
175+
186176
### R API includes
187177

188178
cpp11 conflicts with macros declared by some R headers unless the macros `R_NO_REMAP` and `STRICT_R_HEADERS` are defined.
189-
If you include `cpp11/R.hpp` before any R headers these macros will be defined appropriately, otherwise you may see errors like
179+
If you include `cpp11.hpp` (or, at a minimum, `cpp11/R.hpp`) before any R headers these macros will be defined appropriately, otherwise you may see errors like
190180

191181
> R headers were included before cpp11 headers and at least one of R_NO_REMAP or STRICT_R_HEADERS was not defined.
192182
@@ -197,12 +187,6 @@ Note that transitive includes of R headers (for example, those included by `Rcpp
197187

198188
If you use typedefs for cpp11 types or define custom types you will need to define them in a `pkgname_types.hpp` file so that `cpp_register()` can include it in the generated code.
199189

200-
### `cpp11::stop()` and `cpp11::warning()` with `std::string`
201-
202-
`cpp11::stop()` and `cpp11::warning()` are thin wrappers around `Rf_stop()` and `Rf_warning()`.
203-
These are simple C functions with a `printf()` API, so do not understand C++ objects like `std::string`.
204-
Therefore you need to call `obj.c_str()` when passing character data to them.
205-
206190
### Logical vector construction
207191

208192
If you are constructing a length 1 logical vector you may need to explicitly use an `r_bool()` object in the initializer list rather than `TRUE`, `FALSE` or `NA_INTEGER`.
