Commit 4d9c159

Update documentation re JIT fast path and UTF validity
1 parent f508518 commit 4d9c159

2 files changed: +18 -7 lines changed

doc/pcre2_jit_match.3

Lines changed: 11 additions & 3 deletions
@@ -1,4 +1,4 @@
-.TH PCRE2_JIT_MATCH 3 "11 February 2020" "PCRE2 10.35"
+.TH PCRE2_JIT_MATCH 3 "20 January 2023" "PCRE2 10.43"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH SYNOPSIS
@@ -20,7 +20,15 @@ This function matches a compiled regular expression that has been successfully
 processed by the JIT compiler against a given subject string, using a matching
 algorithm that is similar to Perl's. It is a "fast path" interface to JIT, and
 it bypasses some of the sanity checks that \fBpcre2_match()\fP applies.
-Its arguments are exactly the same as for
+.P
+In UTF mode, the subject string is not checked for UTF validity. Unless
+PCRE2_MATCH_INVALID_UTF was set when the pattern was compiled, passing an
+invalid UTF string results in undefined behaviour. Your program may crash or
+loop or give wrong results. In the absence of PCRE2_MATCH_INVALID_UTF you
+should only call \fBpcre2_jit_match()\fP in UTF mode if you are sure the
+subject is valid.
+.P
+The arguments for \fBpcre2_jit_match()\fP are exactly the same as for
 .\" HREF
 \fBpcre2_match()\fP,
 .\"
@@ -29,7 +37,7 @@ PCRE2_ZERO_TERMINATED is not supported.
 .P
 The supported options are PCRE2_NOTBOL, PCRE2_NOTEOL, PCRE2_NOTEMPTY,
 PCRE2_NOTEMPTY_ATSTART, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Unsupported
-options are ignored. The subject string is not checked for UTF validity.
+options are ignored.
 .P
 The return values are the same as for \fBpcre2_match()\fP plus
 PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is requested

doc/pcre2jit.3

Lines changed: 7 additions & 4 deletions
@@ -1,4 +1,4 @@
-.TH PCRE2JIT 3 "30 November 2021" "PCRE2 10.40"
+.TH PCRE2JIT 3 "20 January 2023" "PCRE2 10.43"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions (revised API)
 .SH "PCRE2 JUST-IN-TIME COMPILER SUPPORT"
@@ -419,7 +419,10 @@ number of other sanity checks are performed on the arguments. For example, if
 the subject pointer is NULL but the length is non-zero, an immediate error is
 given. Also, unless PCRE2_NO_UTF_CHECK is set, a UTF subject string is tested
 for validity. In the interests of speed, these checks do not happen on the JIT
-fast path, and if invalid data is passed, the result is undefined.
+fast path, and if invalid UTF data is passed, the result is undefined. The
+program may crash or loop or give wrong results. In the absence of
+PCRE2_MATCH_INVALID_UTF you should only call \fBpcre2_jit_match()\fP in UTF
+mode if you are sure the subject is valid.
 .P
 Bypassing the sanity checks and the \fBpcre2_match()\fP wrapping can give
 speedups of more than 10%.
@@ -445,6 +448,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 30 November 2021
-Copyright (c) 1997-2021 University of Cambridge.
+Last updated: 20 January 2023
+Copyright (c) 1997-2023 University of Cambridge.
 .fi
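
Both man pages now say that, without PCRE2_MATCH_INVALID_UTF, a caller should use the fast path in UTF mode only when the subject is known to be valid. One way a caller might establish that for UTF-8 subjects is to validate the bytes itself first. The following is a minimal sketch under that assumption: `subject_is_valid_utf8` is a hypothetical helper, not part of the PCRE2 API, and it applies the usual well-formedness rules (no truncated sequences, overlong forms, surrogates, or code points above U+10FFFF) rather than reproducing PCRE2's own check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a PCRE2 function): returns 1 if the len bytes
   at s are well-formed UTF-8, otherwise 0. */
static int subject_is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    /* Mirror the argument sanity check the fast path skips:
       a NULL subject is only acceptable with zero length. */
    assert(s != NULL || len == 0);

    while (i < len) {
        unsigned char c = s[i];
        size_t need;            /* continuation bytes expected */
        unsigned long cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }                  /* ASCII */
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;          /* stray continuation or invalid lead byte */

        if (len - i - 1 < need) return 0;                 /* truncated */

        for (size_t k = 1; k <= need; k++) {
            unsigned char cc = s[i + k];
            if ((cc & 0xC0) != 0x80) return 0;            /* not 10xxxxxx */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min) return 0;                           /* overlong form */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;       /* surrogate */
        if (cp > 0x10FFFF) return 0;                      /* out of range */

        i += need + 1;
    }
    return 1;
}
```

A caller could invoke \fBpcre2_jit_match()\fP only when this returns 1 and otherwise fall back to \fBpcre2_match()\fP, which performs its own UTF check unless PCRE2_NO_UTF_CHECK is set. Validating every subject gives back some of the speed the fast path is meant to provide, so this suits inputs that are checked once and matched many times.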
