Commit 0d9890f

Merge pull request #1231 from rylev/add-regex-1.5.5-benchmark
Add `regex-1.5.5` benchmark
2 parents a1e07c1 + 1918860 commit 0d9890f

95 files changed: 26,304 additions and 9 deletions.

ci/check-profiling.sh

Lines changed: 4 additions & 4 deletions
@@ -129,7 +129,7 @@ test -f results/eprintln-Test-helloworld-Check-Full
 test ! -s results/eprintln-Test-helloworld-Check-Full
 
 # llvm-lines. `Debug` not `Check` because it doesn't support `Check` profiles.
-# Including both `helloworld` and `regex` benchmarks, as they exercise the
+# Including both `helloworld` and `regex-1.5.5` benchmarks, as they exercise the
 # zero dependency and the greater than zero dependency cases, respectively, the
 # latter of which has broken before.
 RUST_BACKTRACE=1 RUST_LOG=raw_cargo_messages=trace,collector=debug,rust_sysroot=debug \
@@ -138,12 +138,12 @@ RUST_BACKTRACE=1 RUST_LOG=raw_cargo_messages=trace,collector=debug,rust_sysroot=
     --id Test \
     --profiles Debug \
     --cargo $bindir/cargo \
-    --include helloworld,regex \
+    --include helloworld,regex-1.5.5 \
     --scenarios Full
 test -f results/ll-Test-helloworld-Debug-Full
 grep -q "Lines.*Copies.*Function name" results/ll-Test-helloworld-Debug-Full
-test -f results/ll-Test-regex-Debug-Full
-grep -q "Lines.*Copies.*Function name" results/ll-Test-regex-Debug-Full
+test -f results/ll-Test-regex-1.5.5-Debug-Full
+grep -q "Lines.*Copies.*Function name" results/ll-Test-regex-1.5.5-Debug-Full
 
 
 #----------------------------------------------------------------------------
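
The `test -f` / `grep -q` pair in this hunk is repeated once per benchmark. As a minimal sketch only, the same two checks could be wrapped in a helper that takes the benchmark name; `check_ll_result` is a hypothetical name and does not exist in `ci/check-profiling.sh`. The result-file naming (`results/ll-Test-<benchmark>-Debug-Full`) is taken from the diff above.

```shell
#!/bin/sh
# Hypothetical helper, not part of ci/check-profiling.sh: applies the two
# llvm-lines checks from the diff to a single benchmark name.
check_ll_result() {
    bench="$1"
    file="results/ll-Test-${bench}-Debug-Full"
    # The profiler must have produced a result file for this benchmark ...
    test -f "$file" || return 1
    # ... and the file must contain the llvm-lines table header.
    grep -q "Lines.*Copies.*Function name" "$file"
}

# Example usage (assumes the profiling run has already populated results/):
#   for bench in helloworld regex-1.5.5; do
#       check_ll_result "$bench" || exit 1
#   done
```

With a helper like this, renaming a benchmark (as this commit does for `regex` to `regex-1.5.5`) would touch only the list of names rather than each check line.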

collector/README.md

Lines changed: 3 additions & 3 deletions
@@ -189,14 +189,14 @@ Finally, while most of the options you can pass to the collector are supported,
 the profilers used in the `profile_local` command are not. In Windows, the only currently supported
 profiler is the `self-profiler`.
 
-As a complete example, let's run just the `regex` benchmark in the `Debug`
+As a complete example, let's run just the `regex-1.5.5` benchmark in the `Debug`
 profile with self-profiling results available:
 
 ```pwsh
 $env:XPERF="C:\Program Files (x86)\Windows Kits\10\Windows Performance Toolkit\xperf.exe"
 $env:TRACELOG="C:\Program Files (x86)\Windows Kits\10\bin\10.0.19041.0\x64\tracelog.exe"
-.\target\release\collector.exe bench_local $env:RUST_ORIGINAL --id Original --profiles Debug --include regex --self-profile
-.\target\release\collector.exe bench_local $env:RUST_MODIFIED --id Modified --profiles Debug --include regex --self-profile
+.\target\release\collector.exe bench_local $env:RUST_ORIGINAL --id Original --profiles Debug --include regex-1.5.5 --self-profile
+.\target\release\collector.exe bench_local $env:RUST_MODIFIED --id Modified --profiles Debug --include regex-1.5.5 --self-profile
 .\target\release\site.exe .\results.db
 ```
 
collector/benchmarks/README.md

Lines changed: 2 additions & 2 deletions
@@ -37,7 +37,7 @@ They mostly consist of real-world crates.
 - **hyper-2**: A fairly large crate. Utilizes async/await, and used by
   many Rust programs.
 - **piston-image**: A modular game engine. An interesting Rust program.
-- **regex**: A regular expression parser. Used by many Rust programs.
+- **regex-1.5.5**: A regular expression parser. Used by many Rust programs.
 - **ripgrep**: A line-oriented search tool. A widely-used utility.
 - **ripgrep-13.0.0**: A line-oriented search tool. A widely-used utility.
 - **serde**: A serialization/deserialization crate. Used by many other
@@ -140,7 +140,7 @@ Rust code being written today.
 - **inflate**: An old implementation of the DEFLATE algorithm. Contains
   a very large function containing many locals and basic blocks, similar to
   `keccak` but less extreme.
-- **regex**: See above.
+- **regex**: See above. This is an older version of the crate.
 - **piston-image**: See above.
 - **style-servo**: An old version of Servo's `style` crate. A large crate, and
   one used by old versions of Firefox.
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+{
+  "git": {
+    "sha1": "d130381b150756ba7e5940efdc6ebdf47f4febc0"
+  },
+  "path_in_vcs": ""
+}
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+target
+bench-log
+.*.swp
+wiki
+tags
+examples/debug.rs
+tmp/
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+diff --git a/src/compile.rs b/src/compile.rs
+index 9db743f..ef1948e 100644
+--- a/src/compile.rs
++++ b/src/compile.rs
+@@ -137,6 +137,8 @@ impl Compiler {
+     }
+ 
+     fn compile_one(mut self, expr: &Hir) -> result::Result<Program, Error> {
++        {} // @030
++
+         // If we're compiling a forward DFA and we aren't anchored, then
+         // add a `.*?` before the first capture group.
+         // Other matching engines handle this by baking the logic into the
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+diff --git a/src/expand.rs b/src/expand.rs
+index 9bea703..3b6ae94 100644
+--- a/src/expand.rs
++++ b/src/expand.rs
+@@ -128,6 +128,7 @@ fn find_cap_ref(replacement: &[u8]) -> Option<CaptureRef<'_>> {
+ }
+ 
+ /// Returns true if and only if the given byte is allowed in a capture name.
+ fn is_valid_cap_letter(b: &u8) -> bool {
++    { }
+     match *b {
+         b'0'..=b'9' | b'a'..=b'z' | b'A'..=b'Z' | b'_' => true,
+         _ => false,
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+diff --git a/src/compile.rs b/src/compile.rs
+index 9db743f..fb812ae 100644
+--- a/src/compile.rs
++++ b/src/compile.rs
+@@ -54,6 +54,7 @@ impl Compiler {
+     ///
+     /// Various options can be set before calling `compile` on an expression.
+     pub fn new() -> Self {
++        {}
+         Compiler {
+             insts: vec![],
+             compiled: Program::new(),
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+diff --git a/src/compile.rs b/src/compile.rs
+index 9db743f..4e56c2d 100644
+--- a/src/compile.rs
++++ b/src/compile.rs
+@@ -114,6 +114,7 @@ impl Compiler {
+     /// When set, the machine returned is suitable for matching text in
+     /// reverse. In particular, all concatenations are flipped.
+     pub fn reverse(mut self, yes: bool) -> Self {
++        {}
+         self.compiled.is_reverse = yes;
+         self
+     }
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+diff --git a/src/freqs.rs b/src/freqs.rs
+index 92bafc1..6eb5799 100644
+--- a/src/freqs.rs
++++ b/src/freqs.rs
+@@ -2,7 +2,7 @@
+ // edit directly
+ 
+ pub const BYTE_FREQUENCIES: [u8; 256] = [
+-    55, // '\x00'
++    54+1, // '\x00'
+     52, // '\x01'
+     51, // '\x02'
+     50, // '\x03'
Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+diff --git a/src/backtrack.rs b/src/backtrack.rs
+index 3c06254..4b72fd4 100644
+--- a/src/backtrack.rs
++++ b/src/backtrack.rs
+@@ -82,8 +82,8 @@ impl Cache {
+ /// stack to do it.
+ #[derive(Clone, Copy, Debug)]
+ enum Job {
+-    Inst { ip: InstPtr, at: InputAt },
+     SaveRestore { slot: usize, old_pos: Option<usize> },
++    Inst { ip: InstPtr, at: InputAt },
+ }
+ 
+ impl<'a, 'm, 'r, 's, I: Input> Bounded<'a, 'm, 'r, 's, I> {
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+diff --git a/src/re_set.rs b/src/re_set.rs
+index 95c4306..eff56b0 100644
+--- a/src/re_set.rs
++++ b/src/re_set.rs
+@@ -216,6 +216,7 @@ pub struct SetMatches {
+ impl SetMatches {
+     /// Whether this set contains any matches.
+     pub fn matched_any(&self) -> bool {
++        println!("testing");
+         self.matched_any
+     }
+ 
