[SandboxVec] Add option -sbvec-allow-file for bisection debugging #129127

Merged 1 commit on Feb 27, 2025

vporpo (Contributor) commented Feb 27, 2025

This new option, `-sbvec-allow-files`, takes an allow-list of source files as comma-separated regexes and disables vectorization when the source file path of the IR matches none of them. It is intended for bisecting miscompiles.
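The matching rule can be sketched outside LLVM as a small standalone function. This is only an illustration of the behavior described by the diff and its test, not the pass's actual code: `matchesAllowList` is a hypothetical name, and the function takes the allow-list as a plain string instead of reading a `cl::opt`.

```cpp
#include <regex>
#include <string>

// Hypothetical standalone helper (not LLVM API): returns true if SrcFilePath
// fully matches ".*" followed by any of the comma-separated regexes in
// AllowList.
bool matchesAllowList(const std::string &SrcFilePath,
                      const std::string &AllowList, char Delim = ',') {
  size_t Start = 0;
  while (true) {
    size_t End = AllowList.find(Delim, Start);
    // substr clamps the count, so End == npos takes the rest of the string.
    std::string Pattern = AllowList.substr(Start, End - Start);
    // An empty entry (including an empty list) stops the search: no match.
    if (Pattern.empty())
      return false;
    // regex_match must consume the whole path, so the ".*" prefix lets the
    // pattern line up with the end of the path.
    if (std::regex_match(SrcFilePath, std::regex(".*" + Pattern)))
      return true;
    if (End == std::string::npos)
      return false;
    Start = End + 1;
  }
}
```

Because the whole path has to match, a pattern such as `allow` does not enable a file named `allow_files.ll`, while `allow_files.ll` or `al.*_files.ll` does.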

llvmbot (Member) commented Feb 27, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: vporpo (vporpo)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/129127.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h (+3)
  • (modified) llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp (+42)
  • (added) llvm/test/Transforms/SandboxVectorizer/allow_files.ll (+39)
diff --git a/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
index 7ea9386f08bee..fea53329719b9 100644
--- a/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
+++ b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
@@ -37,6 +37,9 @@ class SandboxVectorizerPass : public PassInfoMixin<SandboxVectorizerPass> {
   // within FPM may register/unregister callbacks, so they need access to
   // Context.
   sandboxir::FunctionPassManager FPM;
+  /// \Returns true if we should attempt to vectorize \p SrcFilePath based on
+  /// `AllowFiles` option.
+  bool allowFile(const std::string &SrcFilePath);
 
   bool runImpl(Function &F);
 
diff --git a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
index 5837cc16fcbac..bffb9f187e882 100644
--- a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
@@ -8,9 +8,11 @@
 
 #include "llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
+#include "llvm/IR/Module.h"
 #include "llvm/SandboxIR/Constant.h"
 #include "llvm/Support/CommandLine.h"
 #include "llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerPassBuilder.h"
+#include <regex>
 
 using namespace llvm;
 
@@ -29,6 +31,22 @@ static cl::opt<std::string> UserDefinedPassPipeline(
     cl::desc("Comma-separated list of vectorizer passes. If not set "
              "we run the predefined pipeline."));
 
+// This option is useful for bisection debugging.
+// For example you may use it to figure out which filename is the one causing a
+// miscompile. You can specify a regex for the filename like: "/[a-m][^/]*"
+// which will enable any file name starting with 'a' to 'm' and disable the
+// rest. If the miscompile goes away, then we try "/[n-z][^/]*" for the other
+// half of the range, from 'n' to 'z'. If we can reproduce the miscompile then
+// we can keep looking in [n-r] and [s-z] and so on, in a binary-search fashion.
+//
+// Please note that we are using [^/]* and not .* to make sure that we are
+// matching the actual filename and not some other directory in the path.
+cl::opt<std::string> AllowFiles(
+    "sbvec-allow-files", cl::init(".*"), cl::Hidden,
+    cl::desc("Run the vectorizer only on file paths that match any in the "
+             "list of comma-separated regex's."));
+static constexpr const char AllowFilesDelim = ',';
+
 SandboxVectorizerPass::SandboxVectorizerPass() : FPM("fpm") {
   if (UserDefinedPassPipeline == DefaultPipelineMagicStr) {
     // TODO: Add passes to the default pipeline. It currently contains:
@@ -66,6 +84,23 @@ PreservedAnalyses SandboxVectorizerPass::run(Function &F,
   return PA;
 }
 
+bool SandboxVectorizerPass::allowFile(const std::string &SrcFilePath) {
+  // Iterate over all files in AllowFiles separated by `AllowFilesDelim`.
+  size_t DelimPos = 0;
+  do {
+    size_t LastPos = DelimPos != 0 ? DelimPos + 1 : DelimPos;
+    DelimPos = AllowFiles.find(AllowFilesDelim, LastPos);
+    auto FileNameToMatch = AllowFiles.substr(LastPos, DelimPos - LastPos);
+    if (FileNameToMatch.empty())
+      return false;
+    // Note: This only runs when debugging so it's OK not to reuse the regex.
+    std::regex FileNameRegex(std::string(".*") + FileNameToMatch);
+    if (std::regex_match(SrcFilePath, FileNameRegex))
+      return true;
+  } while (DelimPos != std::string::npos);
+  return false;
+}
+
 bool SandboxVectorizerPass::runImpl(Function &LLVMF) {
   if (Ctx == nullptr)
     Ctx = std::make_unique<sandboxir::Context>(LLVMF.getContext());
@@ -75,6 +110,13 @@ bool SandboxVectorizerPass::runImpl(Function &LLVMF) {
     return false;
   }
 
+  // This is used for debugging.
+  if (LLVM_UNLIKELY(AllowFiles != ".*")) {
+    const auto &SrcFilePath = LLVMF.getParent()->getSourceFileName();
+    if (!allowFile(SrcFilePath))
+      return false;
+  }
+
   // If the target claims to have no vector registers early return.
   if (!TTI->getNumberOfRegisters(TTI->getRegisterClassForType(true))) {
     LLVM_DEBUG(dbgs() << "SBVec: Target has no vector registers, return.\n");
diff --git a/llvm/test/Transforms/SandboxVectorizer/allow_files.ll b/llvm/test/Transforms/SandboxVectorizer/allow_files.ll
new file mode 100644
index 0000000000000..0929eca6a1047
--- /dev/null
+++ b/llvm/test/Transforms/SandboxVectorizer/allow_files.ll
@@ -0,0 +1,39 @@
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="some_other_file" %s -S | FileCheck %s --check-prefix=ALLOW_OTHER
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="allow_files.ll" %s -S | FileCheck %s --check-prefix=ALLOW_THIS
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="al.*_files.ll" %s -S | FileCheck %s --check-prefix=ALLOW_REGEX
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="some_file,.*_files.ll,some_other_file" %s -S | FileCheck %s --check-prefix=ALLOW_REGEX_CSV
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="allow" %s -S | FileCheck %s --check-prefix=ALLOW_BAD_REGEX
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="some_file,some_other_file1,some_other_file2" %s -S | FileCheck %s --check-prefix=ALLOW_OTHER_CSV
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" -sbvec-allow-files="" %s -S | FileCheck %s --check-prefix=ALLOW_EMPTY
+; RUN: opt -passes=sandbox-vectorizer -sbvec-vec-reg-bits=1024 -sbvec-allow-non-pow2 -sbvec-passes="seed-collection<tr-save,bottom-up-vec,tr-accept>" %s -S | FileCheck %s --check-prefix=DEFAULT
+
+; Checks the command-line option `-sbvec-allow-files`.
+define void @widen(ptr %ptr) {
+; ALLOW_OTHER:     store float {{%.*}}, ptr {{%.*}}, align 4
+; ALLOW_OTHER:     store float {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_THIS:      store <2 x float> {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_REGEX:     store <2 x float> {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_REGEX_CSV: store <2 x float> {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_BAD_REGEX: store float {{%.*}}, ptr {{%.*}}, align 4
+; ALLOW_BAD_REGEX: store float {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_OTHER_CSV: store float {{%.*}}, ptr {{%.*}}, align 4
+; ALLOW_OTHER_CSV: store float {{%.*}}, ptr {{%.*}}, align 4
+;
+; ALLOW_EMPTY:     store float {{%.*}}, ptr {{%.*}}, align 4
+; ALLOW_EMPTY:     store float {{%.*}}, ptr {{%.*}}, align 4
+;
+; DEFAULT:         store <2 x float> {{%.*}}, ptr {{%.*}}, align 4
+;
+  %ptr0 = getelementptr float, ptr %ptr, i32 0
+  %ptr1 = getelementptr float, ptr %ptr, i32 1
+  %ld0 = load float, ptr %ptr0
+  %ld1 = load float, ptr %ptr1
+  store float %ld0, ptr %ptr0
+  store float %ld1, ptr %ptr1
+  ret void
+}
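The comment above the option in the diff recommends `[^/]*` over `.*` when bisecting by the first letter of the file name. A minimal sketch of why, assuming the same ".*" prefix and full-match semantics as the patch; `pathAllowed` and `rangePattern` are hypothetical helpers introduced here for illustration and are not part of LLVM:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies one allow-list regex the way the patch
// describes, i.e. ".*" + Pattern must match the entire path.
bool pathAllowed(const std::string &Path, const std::string &Pattern) {
  return std::regex_match(Path, std::regex(".*" + Pattern));
}

// Hypothetical helper: builds a bisection pattern for file names whose first
// character lies in [Lo, Hi], e.g. rangePattern('a', 'm') -> "/[a-m][^/]*".
std::string rangePattern(char Lo, char Hi) {
  std::string P = "/[";
  P += Lo;
  P += '-';
  P += Hi;
  P += "][^/]*";
  return P;
}
```

With a path such as `/home/bob/src/zeta.cpp`, the pattern `/[a-m].*` matches because the directory `/home` satisfies `/[a-m]` and `.*` consumes the rest of the path. The pattern from `rangePattern('a', 'm')` does not match, since `[^/]*` cannot cross a later `/`, so only the final path component is tested. Note that under these assumptions a path containing no `/` at all matches neither half of the range.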

vporpo merged commit 32bcc9f into llvm:main on Feb 27, 2025
11 of 14 checks passed
cheezeburglar pushed a commit to cheezeburglar/llvm-project that referenced this pull request Feb 28, 2025
…vm#129127)