Commit 6659865

committed
rollup merge of #23920: steveklabnik/gh23881
Fixes #23881
2 parents 9ab6cc9 + 8da0831 commit 6659865

3 files changed

+153
-146
lines changed

src/doc/trpl/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -42,5 +42,6 @@
4242
* [Intrinsics](intrinsics.md)
4343
* [Lang items](lang-items.md)
4444
* [Link args](link-args.md)
45+
* [Benchmark Tests](benchmark-tests.md)
4546
* [Conclusion](conclusion.md)
4647
* [Glossary](glossary.md)

src/doc/trpl/benchmark-tests.md

Lines changed: 152 additions & 0 deletions
@@ -0,0 +1,152 @@
1+
% Benchmark tests
2+
3+
Rust supports benchmark tests, which can test the performance of your
4+
code. Let's make our `src/lib.rs` look like this (comments elided):
5+
6+
```{rust,ignore}
7+
#![feature(test)]
8+
9+
extern crate test;
10+
11+
pub fn add_two(a: i32) -> i32 {
12+
    a + 2
13+
}
14+
15+
#[cfg(test)]
16+
mod tests {
17+
    use super::*;
18+
    use test::Bencher;
19+
20+
    #[test]
21+
    fn it_works() {
22+
        assert_eq!(4, add_two(2));
23+
    }
24+
25+
    #[bench]
26+
    fn bench_add_two(b: &mut Bencher) {
27+
        b.iter(|| add_two(2));
28+
    }
29+
}
30+
```
31+
32+
Note the `test` feature gate, which enables this unstable feature.
33+
34+
We've imported the `test` crate, which contains our benchmarking support.
35+
We have a new function as well, with the `bench` attribute. Unlike regular
36+
tests, which take no arguments, benchmark tests take a `&mut Bencher`. This
37+
`Bencher` provides an `iter` method, which takes a closure. This closure
38+
contains the code we'd like to benchmark.
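To make the role of the closure concrete, here is a minimal sketch of what an `iter`-style harness might do. It is not the `test` crate's implementation: `time_iterations` is a hypothetical helper, and it assumes a recent stable toolchain where `std::hint::black_box` is available, so it needs no feature gate.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

pub fn add_two(a: i32) -> i32 {
    a + 2
}

/// Hypothetical stand-in for `Bencher::iter`: calls `f` `iterations` times
/// and returns the total elapsed time together with the last result.
pub fn time_iterations<T, F>(iterations: u32, mut f: F) -> (Duration, T)
where
    F: FnMut() -> T,
{
    assert!(iterations > 0, "need at least one iteration");
    let start = Instant::now();
    // Route every result through `black_box` so the calls are treated as used.
    let mut last = black_box(f());
    for _ in 1..iterations {
        last = black_box(f());
    }
    (start.elapsed(), last)
}

fn main() {
    let (elapsed, result) = time_iterations(1_000_000, || add_two(black_box(2)));
    println!("result = {result}, about {:?} per iteration", elapsed / 1_000_000);
}
```

The real runner also calibrates the iteration count and reports a spread, which this sketch leaves out.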
39+
40+
We can run benchmark tests with `cargo bench`:
41+
42+
```bash
43+
$ cargo bench
44+
   Compiling adder v0.0.1 (file:///home/steve/tmp/adder)
45+
     Running target/release/adder-91b3e234d4ed382a
46+
47+
running 2 tests
48+
test tests::it_works ... ignored
49+
test tests::bench_add_two ... bench: 1 ns/iter (+/- 0)
50+
51+
test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured
52+
```
53+
54+
Our non-benchmark test was ignored. You may have noticed that `cargo bench`
55+
takes a bit longer than `cargo test`. This is because Rust runs our benchmark
56+
a number of times, and then takes the average. Because we're doing so little
57+
work in this example, we have a `1 ns/iter (+/- 0)`, but this would show
58+
the variance if there was one.
59+
60+
Advice on writing benchmarks:
61+
62+
63+
* Move setup code outside the `iter` loop; only put the part you want to measure inside
64+
* Make the code do "the same thing" on each iteration; do not accumulate or change state
65+
* Make the outer function idempotent too; the benchmark runner is likely to run
66+
it many times
67+
* Make the inner `iter` loop short and fast so benchmark runs are fast and the
68+
calibrator can adjust the run-length at fine resolution
69+
* Make the code in the `iter` loop do something simple, to assist in pinpointing
70+
performance improvements (or regressions)
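The first two points of this advice can be illustrated with a hand-rolled timing loop. This is only a sketch under assumptions: `sum` is a hypothetical workload, and the loop uses `std::time::Instant` and the stable `std::hint::black_box` instead of `Bencher`.

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical workload: sums a slice.
pub fn sum(values: &[u64]) -> u64 {
    values.iter().sum()
}

fn main() {
    // Setup happens once, outside the measured region.
    let values: Vec<u64> = (0..10_000).collect();

    let iterations = 1_000;
    let start = Instant::now();
    for _ in 0..iterations {
        // Only the call under test sits inside the loop, and it does the
        // same work every time because `values` is never modified.
        black_box(sum(black_box(&values)));
    }
    let elapsed = start.elapsed();
    println!("{:?} per iteration", elapsed / iterations);
}
```

Building `values` inside the loop would instead measure allocation and initialization along with the sum.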
71+
72+
## Gotcha: optimizations
73+
74+
There's another tricky part to writing benchmarks: benchmarks compiled with
75+
optimizations activated can be dramatically changed by the optimizer so that
76+
the benchmark is no longer benchmarking what one expects. For example, the
77+
compiler might recognize that some calculation has no external effects and
78+
remove it entirely.
79+
80+
```{rust,ignore}
81+
#![feature(test)]
82+
83+
extern crate test;
84+
use test::Bencher;
85+
86+
#[bench]
87+
fn bench_xor_1000_ints(b: &mut Bencher) {
88+
    b.iter(|| {
89+
        (0..1000).fold(0, |old, new| old ^ new);
90+
    });
91+
}
92+
```
93+
94+
This gives the following results:
95+
96+
```text
97+
running 1 test
98+
test bench_xor_1000_ints ... bench: 0 ns/iter (+/- 0)
99+
100+
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
101+
```
102+
103+
The benchmarking runner offers two ways to avoid this. First, the closure that
104+
the `iter` method receives can return an arbitrary value which forces the
105+
optimizer to consider the result used and ensures it cannot remove the
106+
computation entirely. This could be done for the example above by adjusting the
107+
`b.iter` call to
108+
109+
```rust
110+
# struct X;
111+
# impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
112+
b.iter(|| {
113+
    // note lack of `;` (could also use an explicit `return`).
114+
    (0..1000).fold(0, |old, new| old ^ new)
115+
});
116+
```
117+
118+
The second option is to call the generic `test::black_box` function, which
119+
is an opaque "black box" to the optimizer and so forces it to consider any
120+
argument as used.
121+
122+
```rust
123+
#![feature(test)]
124+
125+
extern crate test;
126+
127+
# fn main() {
128+
# struct X;
129+
# impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
130+
b.iter(|| {
131+
    let n = test::black_box(1000);
132+
133+
    (0..n).fold(0, |a, b| a ^ b)
134+
})
135+
# }
136+
```
137+
138+
Neither of these reads or modifies the value, and both are very cheap for small values.
139+
Larger values can be passed indirectly to reduce overhead (e.g.
140+
`black_box(&huge_struct)`).
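As a sketch of the by-reference form, the example below passes a large value through `black_box` as a reference. It assumes the stable `std::hint::black_box` in place of `test::black_box`, and `HugeStruct` and `checksum` are hypothetical names introduced only for illustration.

```rust
use std::hint::black_box;

/// Hypothetical large value (32 KiB of data).
pub struct HugeStruct {
    pub data: [u64; 4096],
}

/// XORs every element together.
pub fn checksum(h: &HugeStruct) -> u64 {
    h.data.iter().fold(0, |acc, x| acc ^ x)
}

fn main() {
    let huge = HugeStruct { data: [7; 4096] };
    // Only a reference goes through `black_box`, so the array itself is
    // not moved or copied just to hide it from the optimizer.
    let opaque: &HugeStruct = black_box(&huge);
    let result = black_box(checksum(opaque));
    println!("{result}");
}
```

Note that `std::hint::black_box` is documented as best-effort, so it discourages, but does not strictly forbid, optimizations around the value.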
141+
142+
Performing either of the above changes gives the following benchmarking results:
143+
144+
```text
145+
running 1 test
146+
test bench_xor_1000_ints ... bench: 131 ns/iter (+/- 3)
147+
148+
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
149+
```
150+
151+
However, the optimizer can still modify a test case in an undesirable manner
152+
even when using either of the above.

src/doc/trpl/testing.md

Lines changed: 0 additions & 146 deletions
@@ -430,149 +430,3 @@ documentation tests: the `_0` is generated for the module test, and `add_two_0`
430430
for the function test. These will auto increment with names like `add_two_1` as
431431
you add more examples.
432432

433-
# Benchmark tests
434-
435-
Rust also supports benchmark tests, which can test the performance of your
436-
code. Let's make our `src/lib.rs` look like this (comments elided):
437-
438-
```{rust,ignore}
439-
extern crate test;
440-
441-
pub fn add_two(a: i32) -> i32 {
442-
    a + 2
443-
}
444-
445-
#[cfg(test)]
446-
mod tests {
447-
    use super::*;
448-
    use test::Bencher;
449-
450-
    #[test]
451-
    fn it_works() {
452-
        assert_eq!(4, add_two(2));
453-
    }
454-
455-
    #[bench]
456-
    fn bench_add_two(b: &mut Bencher) {
457-
        b.iter(|| add_two(2));
458-
    }
459-
}
460-
```
461-
462-
We've imported the `test` crate, which contains our benchmarking support.
463-
We have a new function as well, with the `bench` attribute. Unlike regular
464-
tests, which take no arguments, benchmark tests take a `&mut Bencher`. This
465-
`Bencher` provides an `iter` method, which takes a closure. This closure
466-
contains the code we'd like to benchmark.
467-
468-
We can run benchmark tests with `cargo bench`:
469-
470-
```bash
471-
$ cargo bench
472-
   Compiling adder v0.0.1 (file:///home/steve/tmp/adder)
472-
     Running target/release/adder-91b3e234d4ed382a
474-
475-
running 2 tests
476-
test tests::it_works ... ignored
477-
test tests::bench_add_two ... bench: 1 ns/iter (+/- 0)
478-
479-
test result: ok. 0 passed; 0 failed; 1 ignored; 1 measured
480-
```
481-
482-
Our non-benchmark test was ignored. You may have noticed that `cargo bench`
483-
takes a bit longer than `cargo test`. This is because Rust runs our benchmark
484-
a number of times, and then takes the average. Because we're doing so little
485-
work in this example, we have a `1 ns/iter (+/- 0)`, but this would show
486-
the variance if there was one.
487-
488-
Advice on writing benchmarks:
489-
490-
491-
* Move setup code outside the `iter` loop; only put the part you want to measure inside
492-
* Make the code do "the same thing" on each iteration; do not accumulate or change state
493-
* Make the outer function idempotent too; the benchmark runner is likely to run
494-
it many times
495-
* Make the inner `iter` loop short and fast so benchmark runs are fast and the
496-
calibrator can adjust the run-length at fine resolution
497-
* Make the code in the `iter` loop do something simple, to assist in pinpointing
498-
performance improvements (or regressions)
499-
500-
## Gotcha: optimizations
501-
502-
There's another tricky part to writing benchmarks: benchmarks compiled with
503-
optimizations activated can be dramatically changed by the optimizer so that
504-
the benchmark is no longer benchmarking what one expects. For example, the
505-
compiler might recognize that some calculation has no external effects and
506-
remove it entirely.
507-
508-
```{rust,ignore}
509-
extern crate test;
510-
use test::Bencher;
511-
512-
#[bench]
513-
fn bench_xor_1000_ints(b: &mut Bencher) {
514-
    b.iter(|| {
515-
        (0..1000).fold(0, |old, new| old ^ new);
516-
    });
517-
}
518-
```
519-
520-
gives the following results
521-
522-
```text
523-
running 1 test
524-
test bench_xor_1000_ints ... bench: 0 ns/iter (+/- 0)
525-
526-
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
527-
```
528-
529-
The benchmarking runner offers two ways to avoid this. Either, the closure that
530-
the `iter` method receives can return an arbitrary value which forces the
531-
optimizer to consider the result used and ensures it cannot remove the
532-
computation entirely. This could be done for the example above by adjusting the
533-
`b.iter` call to
534-
535-
```rust
536-
# struct X;
537-
# impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
538-
b.iter(|| {
539-
    // note lack of `;` (could also use an explicit `return`).
540-
    (0..1000).fold(0, |old, new| old ^ new)
541-
});
542-
```
543-
544-
Or, the other option is to call the generic `test::black_box` function, which
545-
is an opaque "black box" to the optimizer and so forces it to consider any
546-
argument as used.
547-
548-
```rust
549-
# #![feature(test)]
550-
551-
extern crate test;
552-
553-
# fn main() {
554-
# struct X;
555-
# impl X { fn iter<T, F>(&self, _: F) where F: FnMut() -> T {} } let b = X;
556-
b.iter(|| {
557-
    let n = test::black_box(1000);
558-
559-
    (0..n).fold(0, |a, b| a ^ b)
560-
})
561-
# }
562-
```
563-
564-
Neither of these read or modify the value, and are very cheap for small values.
565-
Larger values can be passed indirectly to reduce overhead (e.g.
566-
`black_box(&huge_struct)`).
567-
568-
Performing either of the above changes gives the following benchmarking results
569-
570-
```text
571-
running 1 test
572-
test bench_xor_1000_ints ... bench: 131 ns/iter (+/- 3)
573-
574-
test result: ok. 0 passed; 0 failed; 0 ignored; 1 measured
575-
```
576-
577-
However, the optimizer can still modify a testcase in an undesirable manner
578-
even when using either of the above.
