Skip to content

[FileCheck] Avoid capturing group for {{regex}} #72136

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Nov 14, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Nov 13, 2023

For {{regex}} we don't really need a capturing group, and only add it to properly handle cases like {{foo|bar}}. This is problematic, because the use of capturing groups makes our regex implementation slower (we have to go through the "dissect" stage, which can have quadratic complexity).

Unfortunately, our regex implementation does not support non-capturing groups like (?:regex). So instead, avoid adding the group entirely if the regex doesn't contain any alternations.

This causes a slight difference in escaping behavior, where previously it was possible to write {{{{}} and get the same behavior as {{\{\{}}. This will no longer work. I don't think this is a problem, especially as we recently taught update_analyze_test_checks.py to emit {{\{\{}}, so this shouldn't get introduced in any new tests.

For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our slowest X86 test) this drops FileCheck time from 6s to 5s (the remainder is spent in a different regex issue). I expect similar speedups in other tests using a lot of {{}}.

For `{{regex}}` we don't really need a capturing group, and only
add it to properly handle cases like `{{foo|bar}}`. This is
problematic, because the use of capturing groups makes our
regex implementation slower (we have to go through the "dissect"
stage, which can have quadratic complexity).

Unfortunately, our regex implementation does not support
non-capturing groups like `(?:regex)`. So instead, avoid adding
the group entirely if the regex doesn't contain any alternations.

This causes a slight difference in escaping behavior, where
previously it was possible to write `{{{{}}` and get the same
behavior as `{{\{\{}}`. This will no longer work. I don't think
this is a problem, especially as we recently taught
update_analyze_test_checks.py to emit `{{\{\{}}`, so this
shouldn't get introduced in any new tests.

For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our
slowest X86 test) this drops FileCheck time from 6s to 5s (the
remainder is spent in a different regex issue). I expect similar
speedups in other tests using a lot of `{{}}`.
@nikic nikic requested a review from tstellar November 13, 2023 16:54
@llvmbot llvmbot added testing-tools llvm:analysis Includes value tracking, cost tables and constant folding labels Nov 13, 2023
@llvmbot
Copy link
Member

llvmbot commented Nov 13, 2023

@llvm/pr-subscribers-testing-tools

@llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Changes

For {{regex}} we don't really need a capturing group, and only add it to properly handle cases like {{foo|bar}}. This is problematic, because the use of capturing groups makes our regex implementation slower (we have to go through the "dissect" stage, which can have quadratic complexity).

Unfortunately, our regex implementation does not support non-capturing groups like (?:regex). So instead, avoid adding the group entirely if the regex doesn't contain any alternations.

This causes a slight difference in escaping behavior, where previously it was possible to write {{{{}} and get the same behavior as {{\{\{}}. This will no longer work. I don't think this is a problem, especially as we recently taught update_analyze_test_checks.py to emit {{\{\{}}, so this shouldn't get introduced in any new tests.

For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our slowest X86 test) this drops FileCheck time from 6s to 5s (the remainder is spent in a different regex issue). I expect similar speedups in other tests using a lot of {{}}.


Full diff: https://github.com/llvm/llvm-project/pull/72136.diff

2 Files Affected:

  • (modified) llvm/lib/FileCheck/FileCheck.cpp (+7-3)
  • (modified) llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll (+18-18)
diff --git a/llvm/lib/FileCheck/FileCheck.cpp b/llvm/lib/FileCheck/FileCheck.cpp
index 49fda8fb63cd36c..bef6dd69fadfdee 100644
--- a/llvm/lib/FileCheck/FileCheck.cpp
+++ b/llvm/lib/FileCheck/FileCheck.cpp
@@ -832,12 +832,16 @@ bool Pattern::parsePattern(StringRef PatternStr, StringRef Prefix,
       // capturing the result for any purpose.  This is required in case the
       // expression contains an alternation like: CHECK:  abc{{x|z}}def.  We
       // want this to turn into: "abc(x|z)def" not "abcx|zdef".
-      RegExStr += '(';
-      ++CurParen;
+      bool HasAlternation = PatternStr.contains('|');
+      if (HasAlternation) {
+        RegExStr += '(';
+        ++CurParen;
+      }
 
       if (AddRegExToRegEx(PatternStr.substr(2, End - 2), CurParen, SM))
         return true;
-      RegExStr += ')';
+      if (HasAlternation)
+        RegExStr += ')';
 
       PatternStr = PatternStr.substr(End + 2);
       continue;
diff --git a/llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll b/llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll
index 349c6bde6991262..4d5f08d41b9ceef 100644
--- a/llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll
+++ b/llvm/test/Analysis/ScalarEvolution/different-loops-recs.ll
@@ -18,17 +18,17 @@ define void @test_00() {
 ; CHECK:       %sum4 = add i32 %sum3, %phi6
 ; CHECK-NEXT:  -->  {159,+,6}<%loop2>
 ; CHECK:       %s1 = add i32 %phi1, %phi4
-; CHECK-NEXT:  -->  {{{{}}73,+,1}<nuw><nsw><%loop1>,+,1}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}73,+,1}<nuw><nsw><%loop1>,+,1}<nw><%loop2>
 ; CHECK:       %s2 = add i32 %phi5, %phi2
-; CHECK-NEXT:  -->  {{{{}}57,+,2}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}57,+,2}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
 ; CHECK:       %s3 = add i32 %sum1, %sum3
-; CHECK-NEXT:  -->  {{{{}}130,+,3}<%loop1>,+,3}<%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}130,+,3}<%loop1>,+,3}<%loop2>
 ; CHECK:       %s4 = add i32 %sum4, %sum2
-; CHECK-NEXT:  -->  {{{{}}179,+,6}<%loop1>,+,6}<%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}179,+,6}<%loop1>,+,6}<%loop2>
 ; CHECK:       %s5 = add i32 %phi3, %sum3
-; CHECK-NEXT:  -->  {{{{}}122,+,3}<nuw><nsw><%loop1>,+,3}<%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}122,+,3}<nuw><nsw><%loop1>,+,3}<%loop2>
 ; CHECK:       %s6 = add i32 %sum2, %phi6
-; CHECK-NEXT:  -->  {{{{}}63,+,6}<%loop1>,+,3}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}63,+,6}<%loop1>,+,3}<nw><%loop2>
 
 entry:
   br label %loop1
@@ -359,17 +359,17 @@ define void @test_06() {
 
 ; CHECK-LABEL: Classifying expressions for: @test_06
 ; CHECK:       %s1 = add i32 %phi1, %phi2
-; CHECK-NEXT:  -->  {{{{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
 ; CHECK:       %s2 = add i32 %phi2, %phi1
-; CHECK-NEXT:  -->  {{{{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
 ; CHECK:       %s3 = add i32 %phi1, %phi3
-; CHECK-NEXT:  -->  {{{{}}40,+,1}<nuw><nsw><%loop1>,+,3}<nw><%loop3>
+; CHECK-NEXT:  -->  {{\{\{}}40,+,1}<nuw><nsw><%loop1>,+,3}<nw><%loop3>
 ; CHECK:       %s4 = add i32 %phi3, %phi1
-; CHECK-NEXT:  -->  {{{{}}40,+,1}<nuw><nsw><%loop1>,+,3}<nw><%loop3>
+; CHECK-NEXT:  -->  {{\{\{}}40,+,1}<nuw><nsw><%loop1>,+,3}<nw><%loop3>
 ; CHECK:       %s5 = add i32 %phi2, %phi3
-; CHECK-NEXT:  -->  {{{{}}50,+,2}<nuw><nsw><%loop2>,+,3}<nw><%loop3>
+; CHECK-NEXT:  -->  {{\{\{}}50,+,2}<nuw><nsw><%loop2>,+,3}<nw><%loop3>
 ; CHECK:       %s6 = add i32 %phi3, %phi2
-; CHECK-NEXT:  -->  {{{{}}50,+,2}<nuw><nsw><%loop2>,+,3}<nw><%loop3>
+; CHECK-NEXT:  -->  {{\{\{}}50,+,2}<nuw><nsw><%loop2>,+,3}<nw><%loop3>
 
 entry:
   br label %loop1
@@ -411,17 +411,17 @@ define void @test_07() {
 
 ; CHECK-LABEL: Classifying expressions for: @test_07
 ; CHECK:       %s1 = add i32 %phi1, %phi2
-; CHECK-NEXT:  -->  {{{{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
 ; CHECK:       %s2 = add i32 %phi2, %phi1
-; CHECK-NEXT:  -->  {{{{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}30,+,1}<nuw><nsw><%loop1>,+,2}<nw><%loop2>
 ; CHECK:       %s3 = add i32 %phi1, %phi3
-; CHECK-NEXT:  -->  {{{{}}40,+,3}<nuw><nsw><%loop3>,+,1}<nw><%loop1>
+; CHECK-NEXT:  -->  {{\{\{}}40,+,3}<nuw><nsw><%loop3>,+,1}<nw><%loop1>
 ; CHECK:       %s4 = add i32 %phi3, %phi1
-; CHECK-NEXT:  -->  {{{{}}40,+,3}<nuw><nsw><%loop3>,+,1}<nw><%loop1>
+; CHECK-NEXT:  -->  {{\{\{}}40,+,3}<nuw><nsw><%loop3>,+,1}<nw><%loop1>
 ; CHECK:       %s5 = add i32 %phi2, %phi3
-; CHECK-NEXT:  -->  {{{{}}50,+,3}<nuw><nsw><%loop3>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}50,+,3}<nuw><nsw><%loop3>,+,2}<nw><%loop2>
 ; CHECK:       %s6 = add i32 %phi3, %phi2
-; CHECK-NEXT:  -->  {{{{}}50,+,3}<nuw><nsw><%loop3>,+,2}<nw><%loop2>
+; CHECK-NEXT:  -->  {{\{\{}}50,+,3}<nuw><nsw><%loop3>,+,2}<nw><%loop2>
 
 entry:
   br label %loop3

Copy link
Collaborator

@tstellar tstellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@nikic nikic merged commit a3eeef8 into llvm:main Nov 14, 2023
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
For `{{regex}}` we don't really need a capturing group, and only add it
to properly handle cases like `{{foo|bar}}`. This is problematic,
because the use of capturing groups makes our regex implementation
slower (we have to go through the "dissect" stage, which can have
quadratic complexity).

Unfortunately, our regex implementation does not support non-capturing
groups like `(?:regex)`. So instead, avoid adding the group entirely if
the regex doesn't contain any alternations.

This causes a slight difference in escaping behavior, where previously
it was possible to write `{{{{}}` and get the same behavior as
`{{\{\{}}`. This will no longer work. I don't think this is a problem,
especially as we recently taught update_analyze_test_checks.py to emit
`{{\{\{}}`, so this shouldn't get introduced in any new tests.

For CodeGen/X86/vector-interleaved-store-i16-stride-7.ll (our slowest
X86 test) this drops FileCheck time from 6s to 5s (the remainder is
spent in a different regex issue). I expect similar speedups in other
tests using a lot of `{{}}`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding testing-tools
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants