Skip to content

[mlir][Pass] Enable the option for reproducer generation without crashing #75421

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 3, 2024

Conversation

plotfi
Copy link
Collaborator

@plotfi plotfi commented Dec 14, 2023

This PR adds API makeReproducer and cl::opt flag --mlir-generate-reproducer=<filename> in order to allow for mlir reproducer dumps even when the pipeline doesn't crash.

This PR also decouples the code that handles generation of an MLIR reproducer from the crash recovery portion. The purpose is to allow for generating reproducers outside of the context of a compiler crash.

This will be useful for frameworks and runtimes that use MLIR where it is needed to reproduce the pipeline behavior for reasons outside of diagnosing crashes. An example is for diagnosing performance issues using offline tools, where being able to dump the reproducer from a runtime compiler would be helpful.

@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels Dec 14, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 14, 2023

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-core

Author: Puyan Lotfi (plotfi)

Changes

Adding API enableReproducerGeneration (with an optional alwaysGenerateReproducer parameter) and cl::opt flag -mlir-pass-pipeline-always-generate-reproducer in order to allow for mlir reproducer dumps even when the pipeline doesn't crash.

This will be useful for frameworks and runtimes that use MLIR where it is needed to reproduce the pipeline behavior for reasons outside of diagnosing crashes. An example is for diagnosing performance issue using offline tools, where being able to dump the reproducer from a runtime compiler would be helpful.


Full diff: https://github.com/llvm/llvm-project/pull/75421.diff

5 Files Affected:

  • (modified) mlir/include/mlir/Pass/PassManager.h (+20)
  • (modified) mlir/lib/Pass/PassCrashRecovery.cpp (+55-25)
  • (modified) mlir/lib/Pass/PassDetail.h (+2-1)
  • (modified) mlir/lib/Pass/PassManagerOptions.cpp (+14-3)
  • (added) mlir/test/Pass/crashless-reproducer.mlir (+13)
diff --git a/mlir/include/mlir/Pass/PassManager.h b/mlir/include/mlir/Pass/PassManager.h
index d5f1ea0fe0350d..e6461d90abea98 100644
--- a/mlir/include/mlir/Pass/PassManager.h
+++ b/mlir/include/mlir/Pass/PassManager.h
@@ -243,6 +243,16 @@ class PassManager : public OpPassManager {
   void enableCrashReproducerGeneration(StringRef outputFile,
                                        bool genLocalReproducer = false);
 
+  /// Enable support for the pass manager to generate a reproducer. With this
+  /// invocation the reproducer is generated depending on the parameter
+  /// `alwaysGenerateReproducer`. `outputFile` is a .mlir filename used to
+  /// write the generated reproducer. If `genLocalReproducer` is true, the pass
+  /// manager will attempt to generate a local reproducer that contains the
+  /// smallest pipeline.
+  void enableReproducerGeneration(StringRef outputFile,
+                                  bool genLocalReproducer = false,
+                                  bool alwaysGenerateReproducer = false);
+
   /// Streams on which to output crash reproducer.
   struct ReproducerStream {
     virtual ~ReproducerStream() = default;
@@ -266,6 +276,16 @@ class PassManager : public OpPassManager {
   void enableCrashReproducerGeneration(ReproducerStreamFactory factory,
                                        bool genLocalReproducer = false);
 
+  /// Enable support for the pass manager to generate a reproducer. With this
+  /// invocation the reproducer is generated depending on the parameter
+  /// `alwaysGenerateReproducer`. `factory` is used to construct the streams
+  /// to write the generated reproducer to. If `genLocalReproducer` is true, the
+  /// pass manager will attempt to generate a local reproducer that contains the
+  /// smallest pipeline.
+  void enableReproducerGeneration(ReproducerStreamFactory factory,
+                                  bool genLocalReproducer = false,
+                                  bool alwaysGenerateReproducer = false);
+
   /// Runs the verifier after each individual pass.
   void enableVerifier(bool enabled = true);
 
diff --git a/mlir/lib/Pass/PassCrashRecovery.cpp b/mlir/lib/Pass/PassCrashRecovery.cpp
index df1a0762ae34a5..4480d6797de4c5 100644
--- a/mlir/lib/Pass/PassCrashRecovery.cpp
+++ b/mlir/lib/Pass/PassCrashRecovery.cpp
@@ -176,8 +176,10 @@ void RecoveryReproducerContext::registerSignalHandler() {
 
 struct PassCrashReproducerGenerator::Impl {
   Impl(PassManager::ReproducerStreamFactory &streamFactory,
-       bool localReproducer)
-      : streamFactory(streamFactory), localReproducer(localReproducer) {}
+       bool localReproducer,
+       bool alwaysGenerateReproducer = false)
+      : streamFactory(streamFactory), localReproducer(localReproducer),
+        alwaysGenerateReproducer(alwaysGenerateReproducer){}
 
   /// The factory to use when generating a crash reproducer.
   PassManager::ReproducerStreamFactory streamFactory;
@@ -195,11 +197,17 @@ struct PassCrashReproducerGenerator::Impl {
 
   /// Various pass manager flags that get emitted when generating a reproducer.
   bool pmFlagVerifyPasses = false;
+
+  /// Flag indicating if reproducer generation should occur regardless of 
+  /// a crash or failing pass.
+  bool alwaysGenerateReproducer = false;
 };
 
 PassCrashReproducerGenerator::PassCrashReproducerGenerator(
-    PassManager::ReproducerStreamFactory &streamFactory, bool localReproducer)
-    : impl(std::make_unique<Impl>(streamFactory, localReproducer)) {}
+    PassManager::ReproducerStreamFactory &streamFactory, bool localReproducer,
+    bool alwaysGenerateReproducer)
+    : impl(std::make_unique<Impl>(streamFactory, localReproducer,
+                                  alwaysGenerateReproducer)) {}
 PassCrashReproducerGenerator::~PassCrashReproducerGenerator() = default;
 
 void PassCrashReproducerGenerator::initialize(
@@ -235,13 +243,10 @@ void PassCrashReproducerGenerator::finalize(Operation *rootOp,
     return;
 
   // If the pass manager execution succeeded, we don't generate any reproducers.
-  if (succeeded(executionResult))
+  const bool executionResultSucceeded = succeeded(executionResult);
+  if (executionResultSucceeded && !impl->alwaysGenerateReproducer)
     return impl->activeContexts.clear();
 
-  InFlightDiagnostic diag = emitError(rootOp->getLoc())
-                            << "Failures have been detected while "
-                               "processing an MLIR pass pipeline";
-
   // If we are generating a global reproducer, we include all of the running
   // passes in the error message for the only active context.
   if (!impl->localReproducer) {
@@ -251,13 +256,18 @@ void PassCrashReproducerGenerator::finalize(Operation *rootOp,
     std::string description;
     impl->activeContexts.front()->generate(description);
 
-    // Emit an error to the user.
-    Diagnostic &note = diag.attachNote() << "Pipeline failed while executing [";
-    llvm::interleaveComma(impl->runningPasses, note,
-                          [&](const std::pair<Pass *, Operation *> &value) {
-                            formatPassOpReproducerMessage(note, value);
-                          });
-    note << "]: " << description;
+    if (!executionResultSucceeded) {
+      InFlightDiagnostic diag = emitError(rootOp->getLoc())
+                                << "Failures have been detected while "
+                                   "processing an MLIR pass pipeline";
+      // Emit an error to the user.
+      Diagnostic &note = diag.attachNote() << "Pipeline failed while executing [";
+      llvm::interleaveComma(impl->runningPasses, note,
+                            [&](const std::pair<Pass *, Operation *> &value) {
+                              formatPassOpReproducerMessage(note, value);
+                            });
+      note << "]: " << description;
+    }
     impl->runningPasses.clear();
     impl->activeContexts.clear();
     return;
@@ -274,10 +284,15 @@ void PassCrashReproducerGenerator::finalize(Operation *rootOp,
   std::string description;
   reproducerContext.generate(description);
 
-  // Emit an error to the user.
-  Diagnostic &note = diag.attachNote() << "Pipeline failed while executing ";
-  formatPassOpReproducerMessage(note, impl->runningPasses.back());
-  note << ": " << description;
+  if (!executionResultSucceeded) {
+    InFlightDiagnostic diag = emitError(rootOp->getLoc())
+                              << "Failures have been detected while "
+                                 "processing an MLIR pass pipeline";
+    // Emit an error to the user.
+    Diagnostic &note = diag.attachNote() << "Pipeline failed while executing ";
+    formatPassOpReproducerMessage(note, impl->runningPasses.back());
+    note << ": " << description;
+  }
 
   impl->activeContexts.clear();
   impl->runningPasses.clear();
@@ -420,10 +435,22 @@ LogicalResult PassManager::runWithCrashRecovery(Operation *op,
 
 void PassManager::enableCrashReproducerGeneration(StringRef outputFile,
                                                   bool genLocalReproducer) {
+  enableReproducerGeneration(outputFile, genLocalReproducer);
+}
+
+void PassManager::enableCrashReproducerGeneration(
+    ReproducerStreamFactory factory, bool genLocalReproducer) {
+  enableReproducerGeneration(factory, genLocalReproducer);
+}
+
+void PassManager::enableReproducerGeneration(
+    StringRef outputFile,
+    bool genLocalReproducer,
+    bool alwaysGenerateReproducer) {
   // Capture the filename by value in case outputFile is out of scope when
   // invoked.
   std::string filename = outputFile.str();
-  enableCrashReproducerGeneration(
+  enableReproducerGeneration(
       [filename](std::string &error) -> std::unique_ptr<ReproducerStream> {
         std::unique_ptr<llvm::ToolOutputFile> outputFile =
             mlir::openOutputFile(filename, &error);
@@ -433,11 +460,14 @@ void PassManager::enableCrashReproducerGeneration(StringRef outputFile,
         }
         return std::make_unique<FileReproducerStream>(std::move(outputFile));
       },
-      genLocalReproducer);
+      genLocalReproducer,
+      alwaysGenerateReproducer);
 }
 
-void PassManager::enableCrashReproducerGeneration(
-    ReproducerStreamFactory factory, bool genLocalReproducer) {
+void PassManager::enableReproducerGeneration(
+    ReproducerStreamFactory factory,
+    bool genLocalReproducer,
+    bool alwaysGenerateReproducer) {
   assert(!crashReproGenerator &&
          "crash reproducer has already been initialized");
   if (genLocalReproducer && getContext()->isMultithreadingEnabled())
@@ -446,7 +476,7 @@ void PassManager::enableCrashReproducerGeneration(
         "pass-manager without disabling multi-threading first.");
 
   crashReproGenerator = std::make_unique<PassCrashReproducerGenerator>(
-      factory, genLocalReproducer);
+      factory, genLocalReproducer, alwaysGenerateReproducer);
   addInstrumentation(
       std::make_unique<CrashReproducerInstrumentation>(*crashReproGenerator));
 }
diff --git a/mlir/lib/Pass/PassDetail.h b/mlir/lib/Pass/PassDetail.h
index 0e964b6d6d36bc..decc34c2af6c28 100644
--- a/mlir/lib/Pass/PassDetail.h
+++ b/mlir/lib/Pass/PassDetail.h
@@ -100,7 +100,8 @@ class PassCrashReproducerGenerator {
 public:
   PassCrashReproducerGenerator(
       PassManager::ReproducerStreamFactory &streamFactory,
-      bool localReproducer);
+      bool localReproducer,
+      bool alwaysGenerateReproducer = false);
   ~PassCrashReproducerGenerator();
 
   /// Initialize the generator in preparation for reproducer generation. The
diff --git a/mlir/lib/Pass/PassManagerOptions.cpp b/mlir/lib/Pass/PassManagerOptions.cpp
index ffc53b7e3ed023..d77aed772fee85 100644
--- a/mlir/lib/Pass/PassManagerOptions.cpp
+++ b/mlir/lib/Pass/PassManagerOptions.cpp
@@ -29,6 +29,10 @@ struct PassManagerOptions {
       llvm::cl::desc("When generating a crash reproducer, attempt to generated "
                      "a reproducer with the smallest pipeline."),
       llvm::cl::init(false)};
+  llvm::cl::opt<bool> alwaysGenerateReproducer{
+      "mlir-pass-pipeline-always-generate-reproducer",
+      llvm::cl::desc("Generating a reproducer even if a crash did not occur "),
+      llvm::cl::init(false)};
 
   //===--------------------------------------------------------------------===//
   // IR Printing
@@ -135,9 +139,16 @@ LogicalResult mlir::applyPassManagerCLOptions(PassManager &pm) {
     return failure();
 
   // Generate a reproducer on crash/failure.
-  if (options->reproducerFile.getNumOccurrences())
-    pm.enableCrashReproducerGeneration(options->reproducerFile,
-                                       options->localReproducer);
+  if (options->reproducerFile.getNumOccurrences()) {
+    if (options->alwaysGenerateReproducer) {
+      pm.enableReproducerGeneration(options->reproducerFile,
+                                    options->localReproducer,
+                                    true /*alwaysGenerateReproducer*/);
+    } else {
+      pm.enableCrashReproducerGeneration(options->reproducerFile,
+                                         options->localReproducer);
+    }
+  }
 
   // Enable statistics dumping.
   if (options->passStatistics)
diff --git a/mlir/test/Pass/crashless-reproducer.mlir b/mlir/test/Pass/crashless-reproducer.mlir
new file mode 100644
index 00000000000000..e34efaeace89ea
--- /dev/null
+++ b/mlir/test/Pass/crashless-reproducer.mlir
@@ -0,0 +1,13 @@
+// RUN: mlir-opt %s -pass-pipeline='builtin.module(builtin.module(test-module-pass))' \
+// RUN:   -mlir-pass-pipeline-crash-reproducer=%t \
+// RUN:   -mlir-pass-pipeline-always-generate-reproducer=true -verify-diagnostics
+
+// RUN: cat %t | FileCheck -check-prefix=REPRO %s
+
+module @inner_mod1 {
+  module @foo {}
+}
+
+// REPRO: module @inner_mod1
+// REPRO: module @foo {
+// REPRO: pipeline: "builtin.module(builtin.module(test-module-pass))"

Copy link

github-actions bot commented Dec 14, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch 2 times, most recently from 0fb3d49 to 02d37a9 Compare December 14, 2023 02:11
@joker-eph
Copy link
Collaborator

Being able to generate a reproducer for a framework (or other) outside of a crash seems useful, but it’s not clear to me why it is a passe manager option?
I would think that the framework can already generate it independently before calling the pass manager.

@plotfi
Copy link
Collaborator Author

plotfi commented Dec 14, 2023

Being able to generate a reproducer for a framework (or other) outside of a crash seems useful, but it’s not clear to me why it is a passe manager option?

I considered that, but I am primarily trying to target the mlir-opt's reproducer file ease of use here. Can I generate a reproducer file that could be loaded by mlir-opt or similar tools without doing a pass manager option?

I guess I should I mention this in my commit message explicitly?

@plotfi
Copy link
Collaborator Author

plotfi commented Dec 14, 2023

@joker-eph Ah, do you mean then to invoke something like RecoveryReproducerContext::generate() from the runtime?

@joker-eph
Copy link
Collaborator

@joker-eph Ah, do you mean then to invoke something like RecoveryReproducerContext::generate() from the runtime?

Yes exactly!

If we need to extract the implementation of RecoveryReproducerContext::generate() as a more easily reusable API, that's an OK refactoring of course.

@plotfi
Copy link
Collaborator Author

plotfi commented Dec 14, 2023

@joker-eph Ah, do you mean then to invoke something like RecoveryReproducerContext::generate() from the runtime?

Yes exactly!

If we need to extract the implementation of RecoveryReproducerContext::generate() as a more easily reusable API, that's an OK refactoring of course.

I will have to dive a little deeper into the current implementation to see how things could be setup without doing it in the pass manager directly. Ideally I'd want a similar mechanism to how the pass manager does it where you dump out the reproducer after everything has run. The pass manager's finalizer just made this kind of conveniently easy, but I am not certain if it is the right approach either.

@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch from 02d37a9 to 15b3128 Compare December 21, 2023 09:12
@plotfi
Copy link
Collaborator Author

plotfi commented Dec 21, 2023

@joker-eph Ah, do you mean then to invoke something like RecoveryReproducerContext::generate() from the runtime?

Yes exactly!

If we need to extract the implementation of RecoveryReproducerContext::generate() as a more easily reusable API, that's an OK refactoring of course.

Work in progress, but I have refactored the bits that append the reproducer info without having to do it from inside of the pass manager. The reproducers can be appended like so:

  for (auto &pass : pm.getPasses())
    makeReproducer(&pass, op.get(), "reproducer.mlir");

@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch 5 times, most recently from c4d138c to bfb0682 Compare December 22, 2023 21:38
@plotfi
Copy link
Collaborator Author

plotfi commented Dec 22, 2023

@joker-eph I've separated out the NFC portions from the test and the new functionality. Ready for review.

@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch 6 times, most recently from c3f9e71 to 67130c5 Compare December 29, 2023 21:30
llvm::cl::desc("Generate a reproducer at"
" --mlir-reproducer-filename=<filename> "
" (no crash required)"),
cl::location(generateReproducerFlag), cl::init(false));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use one option?

--mlir-generate-reproducer=<file>

If it is set we generate the repro using the provided path.
(cl::opt exposes getNumOccurences())

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is ok with me. Do you think it matters if we generate the same file path for a crash reproducer versus not? I wasn't completely certain about this, so thats why I did a separate cl::opt flag file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't follow what you're asking about here?

You're introducing 2 new options, I'm saying you can introduce one only. But that won't conflict with the mlir-pass-pipeline-crash-reproducer option right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, I understand now.

@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch 2 times, most recently from ede2eab to 101b205 Compare January 3, 2024 01:06
plotfi added 2 commits January 2, 2024 17:14
…ation

This patch decouples the code that handles generation of an MLIR reproducer
from the crash recovery portion. The purpose is to allow for generating
reproducers outside of the context of a compiler crash.

(cherry picked from commit bfb0682f2a318a84a05f91466757a03d68ddfd1f)
…hing

This patch adds API `makeReproducer` and cl::opt flag
--mlir-generate-reproducer=<filename> in order to allow for mlir
reproducer dumps even when the pipeline doesn't crash.

This will be useful for frameworks and runtimes that use MLIR where it is
needed to reproduce the pipeline behavior for reasons outside of diagnosing
crashes. An example is for diagnosing performance issues using offline tools,
where being able to dump the reproducer from a runtime compiler would be
helpful.
@plotfi plotfi force-pushed the plotfi-mlir-pass-always-reproducer branch from 101b205 to 8d21a8d Compare January 3, 2024 01:14
@plotfi
Copy link
Collaborator Author

plotfi commented Jan 3, 2024

@joker-eph Updated based on your feedback.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants