Commit 877c84a
dustanddreams authored and brad0 committed
[Support] unsafe pointer arithmetic in llvm_regcomp()
regcomp.c uses the "start + count < end" idiom to check that there are "count" bytes available in an array of char "start" and "end" both point to.

This is fine, unless "start + count" goes beyond the last element of the array. In this case, pedantic interpretation of the C standard makes the comparison of such a pointer against "end" undefined, and optimizers from hell will happily remove as much code as possible because of this.

An example of this occurs in regcomp.c's bothcases(), which defines bracket[3], sets "next" to "bracket" and "end" to "bracket + 2". Then it invokes p_bracket(), which starts with "if (p->next + 5 < p->end)"...

Because bothcases() and p_bracket() are static functions in regcomp.c, there is a real risk of miscompilation if aggressive inlining happens.

The following diff rewrites the "start + count < end" constructs into "end - start > count". Assuming "end" and "start" are always pointing in the array (such as "bracket[3]" above), "end - start" is well-defined and can be compared without trouble.

As a bonus, MORE2() implies MORE() therefore SEETWO() can be simplified a bit.

Bug report: llvm#47993

Reviewed By: MaskRay, vitalybuka

Differential Revision: https://reviews.llvm.org/D97129
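The bothcases() scenario from the message above can be illustrated with a minimal sketch. The helper names `more_than()` and `bracket_has_six()` are hypothetical and do not appear in regcomp.c; the sketch only shows why the subtraction form stays well-defined when fewer than "count" bytes remain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of regcomp.c: returns 1 if more than
 * `count` bytes remain in [next, end). Both pointers must point into
 * (or one past the end of) the same array, so `end - next` is
 * well-defined no matter how large `count` is. */
static int more_than(const char *next, const char *end, ptrdiff_t count)
{
	assert(next <= end);
	return end - next > count;
}

/* Mirrors the setup the commit message describes for bothcases():
 * a 3-byte array with `end` set to `bracket + 2`. The contents of the
 * array are made up for this illustration. */
static int bracket_has_six(void)
{
	char bracket[3] = { 'a', ']', '\0' };
	const char *next = bracket;
	const char *end = bracket + 2;

	/* The old check was `next + 5 < end`. Here `next + 5` would point
	 * well past the end of bracket[3], so it is deliberately not
	 * evaluated. The rewritten check only computes end - next == 2. */
	return more_than(next, end, 5);
}
```

With this setup `bracket_has_six()` returns 0, because only two bytes are available and no out-of-range pointer is ever formed.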
1 parent 2b78ef0 commit 877c84a

1 file changed: +14 -12 lines changed

llvm/lib/Support/regcomp.c

Lines changed: 14 additions & 12 deletions
@@ -249,10 +249,10 @@ static char nuls[10]; /* place to point scanner in event of error */
  */
 #define PEEK() (*p->next)
 #define PEEK2() (*(p->next+1))
-#define MORE() (p->next < p->end)
-#define MORE2() (p->next+1 < p->end)
+#define MORE() (p->end - p->next > 0)
+#define MORE2() (p->end - p->next > 1)
 #define SEE(c) (MORE() && PEEK() == (c))
-#define SEETWO(a, b) (MORE() && MORE2() && PEEK() == (a) && PEEK2() == (b))
+#define SEETWO(a, b) (MORE2() && PEEK() == (a) && PEEK2() == (b))
 #define EAT(c) ((SEE(c)) ? (NEXT(), 1) : 0)
 #define EATTWO(a, b) ((SEETWO(a, b)) ? (NEXT2(), 1) : 0)
 #define NEXT() (p->next++)
@@ -800,15 +800,17 @@ p_bracket(struct parse *p)
 	int invert = 0;
 
 	/* Dept of Truly Sickening Special-Case Kludges */
-	if (p->next + 5 < p->end && strncmp(p->next, "[:<:]]", 6) == 0) {
-		EMIT(OBOW, 0);
-		NEXTn(6);
-		return;
-	}
-	if (p->next + 5 < p->end && strncmp(p->next, "[:>:]]", 6) == 0) {
-		EMIT(OEOW, 0);
-		NEXTn(6);
-		return;
+	if (p->end - p->next > 5) {
+		if (strncmp(p->next, "[:<:]]", 6) == 0) {
+			EMIT(OBOW, 0);
+			NEXTn(6);
+			return;
+		}
+		if (strncmp(p->next, "[:>:]]", 6) == 0) {
+			EMIT(OEOW, 0);
+			NEXTn(6);
+			return;
+		}
 	}
 
 	if ((cs = allocset(p)) == NULL) {
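
To see the rewritten checks in isolation, the following sketch reuses the new macro definitions against a cut-down `struct parse`. It is an illustration under stated assumptions, not code from the commit: the two-field struct with `const char *` members, the wrapper `see_two()`, and the classifier `word_boundary_kind()` are all hypothetical stand-ins for the real parser state and for the EMIT/NEXTn calls.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for regcomp.c's struct parse (assumption: only
 * the two fields needed here, declared const for the illustration). */
struct parse {
	const char *next;	/* next character to scan */
	const char *end;	/* one past the last character */
};

/* Macro definitions as they read after the commit. */
#define PEEK()	(*p->next)
#define PEEK2()	(*(p->next+1))
#define MORE()	(p->end - p->next > 0)
#define MORE2()	(p->end - p->next > 1)
#define SEETWO(a, b)	(MORE2() && PEEK() == (a) && PEEK2() == (b))

/* Hypothetical wrapper: the macros expect a variable named `p`. */
static int see_two(struct parse *p, char a, char b)
{
	assert(p->next <= p->end);
	/* MORE2() already guarantees MORE(), so one length test suffices. */
	return SEETWO(a, b);
}

/* Hypothetical classifier following the new p_bracket() shape:
 * returns 1 for "[:<:]]", 2 for "[:>:]]", 0 otherwise. A single
 * length check guards both strncmp() calls. */
static int word_boundary_kind(struct parse *p)
{
	if (p->end - p->next > 5) {
		if (strncmp(p->next, "[:<:]]", 6) == 0)
			return 1;
		if (strncmp(p->next, "[:>:]]", 6) == 0)
			return 2;
	}
	return 0;
}
```

Hoisting the length test into one outer `if` keeps the two special cases behaviorally identical to the old code while computing the remaining length only once.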
