Skip to content

[SandboxVec] Clear Context's state within runOnFunction() #124842

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 29, 2025

Conversation

vporpo
Copy link
Contributor

@vporpo vporpo commented Jan 28, 2025

sandboxir::Context is defined at a pass-level scope with the SandboxVectorizerPass class because the function pass manager FPM object depends on it, and that is in pass-level scope to avoid recreating the pass pipeline every single time runOnFunction() is called.

This means that the Context's state lives on across function passes. The problem is twofold:
(i) the LLVM IR to Sandbox IR map can grow very large including objects from different functions, which is of no use to the vectorizer, as it's a function-level pass.
(ii) this can result in stale data in the LLVM IR to Sandbox IR object map, as other passes may delete LLVM IR objects.

To fix both issues this patch introduces a Context::clear() function that clears the LLVMValueToValueMap.

`sandboxir::Context` is defined at a pass-level scope with the
`SandboxVectorizerPass` class because the function pass manager `FPM` object
depends on it, and that is in pass-level scope to avoid recreating the pass
pipeline every single time `runOnFunction()` is called.

This means that the Context's state lives on across function passes.
The problem is twofold:
(i) the LLVM IR to Sandbox IR map can grow very large including objects from
different functions, which is of no use to the vectorizer, as it's a
function-level pass.
(ii) this can result in stale data in the LLVM IR to Sandbox IR object map,
as other passes may delete LLVM IR objects.

To fix both issues this patch introduces a `Context::clear()` function
that clears the `LLVMValueToValueMap`.
@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: vporpo (vporpo)

Changes

sandboxir::Context is defined at a pass-level scope with the SandboxVectorizerPass class because the function pass manager FPM object depends on it, and that is in pass-level scope to avoid recreating the pass pipeline every single time runOnFunction() is called.

This means that the Context's state lives on across function passes. The problem is twofold:
(i) the LLVM IR to Sandbox IR map can grow very large including objects from different functions, which is of no use to the vectorizer, as it's a function-level pass.
(ii) this can result in stale data in the LLVM IR to Sandbox IR object map, as other passes may delete LLVM IR objects.

To fix both issues this patch introduces a Context::clear() function that clears the LLVMValueToValueMap.


Full diff: https://github.com/llvm/llvm-project/pull/124842.diff

6 Files Affected:

  • (modified) llvm/include/llvm/SandboxIR/Context.h (+2)
  • (modified) llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h (+9-1)
  • (modified) llvm/lib/SandboxIR/Context.cpp (+6)
  • (modified) llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp (+5-2)
  • (modified) llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt (+1)
  • (added) llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerTest.cpp (+63)
diff --git a/llvm/include/llvm/SandboxIR/Context.h b/llvm/include/llvm/SandboxIR/Context.h
index 7fe97d984b9589..a88b0003f55bdb 100644
--- a/llvm/include/llvm/SandboxIR/Context.h
+++ b/llvm/include/llvm/SandboxIR/Context.h
@@ -218,6 +218,8 @@ class Context {
 public:
   Context(LLVMContext &LLVMCtx);
   ~Context();
+  /// Clears function-level state.
+  void clear();
 
   Tracker &getTracker() { return IRTracker; }
   /// Convenience function for `getTracker().save()`
diff --git a/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
index 09369dbb496fce..7ea9386f08beef 100644
--- a/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
+++ b/llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h
@@ -24,10 +24,18 @@ class SandboxVectorizerPass : public PassInfoMixin<SandboxVectorizerPass> {
   TargetTransformInfo *TTI = nullptr;
   AAResults *AA = nullptr;
   ScalarEvolution *SE = nullptr;
-
+  // NOTE: We define the Context as a pass-scope object instead of local object
+  // in runOnFunction() because the passes defined in the pass-manager need
+  // access to it for registering/deregistering callbacks during construction
+  // and destruction.
   std::unique_ptr<sandboxir::Context> Ctx;
 
   // A pipeline of SandboxIR function passes run by the vectorizer.
+  // NOTE: We define this as a pass-scope object to avoid recreating the
+  // pass-pipeline every time in runOnFunction(). The downside is that the
+  // Context also needs to be defined as a pass-scope object because the passes
+  // within FPM may register/unregister callbacks, so they need access to
+  // Context.
   sandboxir::FunctionPassManager FPM;
 
   bool runImpl(Function &F);
diff --git a/llvm/lib/SandboxIR/Context.cpp b/llvm/lib/SandboxIR/Context.cpp
index 440210f5a1bf7b..830f2832853fe5 100644
--- a/llvm/lib/SandboxIR/Context.cpp
+++ b/llvm/lib/SandboxIR/Context.cpp
@@ -611,6 +611,12 @@ Context::Context(LLVMContext &LLVMCtx)
 
 Context::~Context() {}
 
+void Context::clear() {
+  // TODO: Ideally we should clear only function-scope objects, and keep global
+  // objects, like Constants to avoid recreating them.
+  LLVMValueToValueMap.clear();
+}
+
 Module *Context::getModule(llvm::Module *LLVMM) const {
   auto It = LLVMModuleToModuleMap.find(LLVMM);
   if (It != LLVMModuleToModuleMap.end())
diff --git a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
index 542fcde71e83c6..798a0ad915375b 100644
--- a/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.cpp
@@ -88,7 +88,10 @@ bool SandboxVectorizerPass::runImpl(Function &LLVMF) {
   sandboxir::Function &F = *Ctx->createFunction(&LLVMF);
   sandboxir::Analyses A(*AA, *SE, *TTI);
   bool Change = FPM.runOnFunction(F, A);
-  // TODO: This is a function pass, so we won't be needing the function-level
-  // Sandbox IR objects in the future. So we should clear them.
+  // Given that sandboxir::Context `Ctx` is defined at a pass-level scope, the
+  // maps from LLVM IR to Sandbox IR may go stale as later passes remove LLVM IR
+  // objects. To avoid issues caused by this clear the context's state.
+  // NOTE: The alternative would be to define Ctx and FPM within runOnFunction()
+  Ctx->clear();
   return Change;
 }
diff --git a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
index bbfbcc730a4cbe..104a27975cfc0c 100644
--- a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
+++ b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/CMakeLists.txt
@@ -12,6 +12,7 @@ add_llvm_unittest(SandboxVectorizerTests
   InstrMapsTest.cpp
   IntervalTest.cpp
   LegalityTest.cpp
+  SandboxVectorizerTest.cpp
   SchedulerTest.cpp
   SeedCollectorTest.cpp	
   VecUtilsTest.cpp
diff --git a/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerTest.cpp b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerTest.cpp
new file mode 100644
index 00000000000000..e9b304618a06ca
--- /dev/null
+++ b/llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizerTest.cpp
@@ -0,0 +1,63 @@
+//===- SandboxVectorizerTest.cpp ------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Vectorize/SandboxVectorizer/SandboxVectorizer.h"
+#include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BasicAliasAnalysis.h"
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/Analysis/ScalarEvolution.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
+#include "llvm/AsmParser/Parser.h"
+#include "llvm/IR/DataLayout.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/PassInstrumentation.h"
+#include "llvm/SandboxIR/Function.h"
+#include "llvm/SandboxIR/Instruction.h"
+#include "llvm/Support/SourceMgr.h"
+#include "gmock/gmock.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+
+struct SandboxVectorizerTest : public testing::Test {
+  LLVMContext C;
+  std::unique_ptr<Module> M;
+
+  void parseIR(LLVMContext &C, const char *IR) {
+    SMDiagnostic Err;
+    M = parseAssemblyString(IR, Err, C);
+    if (!M)
+      Err.print("SandboxVectorizerTest", errs());
+  }
+};
+
+// Check that we can run the pass on the same function more than once without
+// issues. This basically checks that Sandbox IR Context gets cleared after we
+// run the function pass.
+TEST_F(SandboxVectorizerTest, ContextCleared) {
+  parseIR(C, R"IR(
+define void @foo() {
+  ret void
+}
+)IR");
+  auto &LLVMF = *M->getFunction("foo");
+  SandboxVectorizerPass SVecPass;
+  FunctionAnalysisManager AM;
+  AM.registerPass([] { return TargetIRAnalysis(); });
+  AM.registerPass([] { return AAManager(); });
+  AM.registerPass([] { return ScalarEvolutionAnalysis(); });
+  AM.registerPass([] { return PassInstrumentationAnalysis(); });
+  AM.registerPass([] { return TargetLibraryAnalysis(); });
+  AM.registerPass([] { return AssumptionAnalysis(); });
+  AM.registerPass([] { return DominatorTreeAnalysis(); });
+  AM.registerPass([] { return LoopAnalysis(); });
+  SVecPass.run(LLVMF, AM);
+  // This shouldn't crash.
+  SVecPass.run(LLVMF, AM);
+}

AM.registerPass([] { return LoopAnalysis(); });
SVecPass.run(LLVMF, AM);
// This shouldn't crash.
SVecPass.run(LLVMF, AM);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to also test the case where some llvm::Value is deleted and you try to get a sandboxir::Value from the Context by using the pointer to the deleted value? (and check you don't get a sandboxir::Value back, of course)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that I don't have access to sandboxir::Context at this level. The context is created and destroyed within the SVecPass. Any thoughts on how we could test what you are suggesting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will push this patch to get rid of the bug. We can follow up with a follow-up patch if needed.

@vporpo vporpo merged commit 79cbad1 into llvm:main Jan 29, 2025
12 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 30, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-win running on as-worker-93 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/2195

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM-Unit :: Support/./SupportTests.exe/38/87' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe-LLVM-Unit-13104-38-87.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=87 GTEST_SHARD_INDEX=38 C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe
--

Script:
--
C:\a\lld-x86_64-win\build\unittests\Support\.\SupportTests.exe --gtest_filter=ProgramEnvTest.CreateProcessLongPath
--
C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(160): error: Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp(163): error: fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied



C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:160
Expected equality of these values:
  0
  RC
    Which is: -2

C:\a\lld-x86_64-win\llvm-project\llvm\unittests\Support\ProgramTest.cpp:163
fs::remove(Twine(LongPath)): did not return errc::success.
error number: 13
error message: permission denied




********************


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants