Skip to content

[MLIR][Pass] Full & deterministic diagnostics #110311

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 2, 2024

Conversation

zyx-billy
Copy link
Contributor

Today, when the pass infra schedules a pass/nested-pipeline on a set of ops, it exits early as soon as it fails on one of the ops. This leads to non-exhaustive, and more importantly, non-deterministic error reporting (under async).

This PR removes the early termination behavior so that all ops have a chance to run through the current pass/nested-pipeline, and all errors are reported (async diagnostics are already ordered). This guarantees deterministic & full error reporting. As a result, it's also no longer necessary to -split-input-file with one error per split when testing with -verify-diagnostics.

@zyx-billy zyx-billy force-pushed the mlir/pass/full_diagnostics branch from 324fe56 to aa4a0b6 Compare September 27, 2024 19:56
@zyx-billy zyx-billy changed the title [MLIR][Pass] Deterministic diagnostics [MLIR][Pass] Full & deterministic diagnostics Sep 30, 2024
@zyx-billy zyx-billy marked this pull request as ready for review September 30, 2024 20:13
@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir labels Sep 30, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-mlir

Author: Billy Zhu (zyx-billy)

Changes

Today, when the pass infra schedules a pass/nested-pipeline on a set of ops, it exits early as soon as it fails on one of the ops. This leads to non-exhaustive, and more importantly, non-deterministic error reporting (under async).

This PR removes the early termination behavior so that all ops have a chance to run through the current pass/nested-pipeline, and all errors are reported (async diagnostics are already ordered). This guarantees deterministic & full error reporting. As a result, it's also no longer necessary to -split-input-file with one error per split when testing with -verify-diagnostics.


Full diff: https://github.com/llvm/llvm-project/pull/110311.diff

4 Files Affected:

  • (modified) mlir/lib/Pass/Pass.cpp (+6-3)
  • (modified) mlir/test/Pass/crash-recovery-dynamic-failure.mlir (+1-1)
  • (added) mlir/test/Pass/full_diagnostics.mlir (+17)
  • (modified) mlir/test/lib/Pass/TestPassManager.cpp (+11-1)
diff --git a/mlir/lib/Pass/Pass.cpp b/mlir/lib/Pass/Pass.cpp
index 83b46e8f5680a0..90fd9a713e41d7 100644
--- a/mlir/lib/Pass/Pass.cpp
+++ b/mlir/lib/Pass/Pass.cpp
@@ -732,7 +732,7 @@ void OpToOpPassAdaptor::runOnOperationImpl(bool verifyPasses) {
         unsigned initGeneration = mgr->impl->initializationGeneration;
         if (failed(runPipeline(*mgr, &op, am.nest(&op), verifyPasses,
                                initGeneration, instrumentor, &parentInfo)))
-          return signalPassFailure();
+          signalPassFailure();
       }
     }
   }
@@ -799,6 +799,7 @@ void OpToOpPassAdaptor::runOnOperationAsyncImpl(bool verifyPasses) {
   // An atomic failure variable for the async executors.
   std::vector<std::atomic<bool>> activePMs(asyncExecutors.size());
   std::fill(activePMs.begin(), activePMs.end(), false);
+  std::atomic<bool> hasFailure = false;
   auto processFn = [&](OpPMInfo &opInfo) {
     // Find an executor for this operation.
     auto it = llvm::find_if(activePMs, [](std::atomic<bool> &isActive) {
@@ -812,14 +813,16 @@ void OpToOpPassAdaptor::runOnOperationAsyncImpl(bool verifyPasses) {
     LogicalResult pipelineResult = runPipeline(
         pm, opInfo.op, opInfo.am, verifyPasses,
         pm.impl->initializationGeneration, instrumentor, &parentInfo);
+    if (failed(pipelineResult))
+      hasFailure.store(true);
 
     // Reset the active bit for this pass manager.
     activePMs[pmIndex].store(false);
-    return pipelineResult;
   };
+  parallelForEach(context, opInfos, processFn);
 
   // Signal a failure if any of the executors failed.
-  if (failed(failableParallelForEach(context, opInfos, processFn)))
+  if (hasFailure)
     signalPassFailure();
 }
 
diff --git a/mlir/test/Pass/crash-recovery-dynamic-failure.mlir b/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
index cea4f6a79963bd..6d2bd696391086 100644
--- a/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
+++ b/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
@@ -15,4 +15,4 @@ module @inner_mod1 {
 // REPRO_LOCAL_DYNAMIC_FAILURE: module @inner_mod1
 // REPRO_LOCAL_DYNAMIC_FAILURE: module @foo {
 
-// REPRO_LOCAL_DYNAMIC_FAILURE: pipeline: "builtin.module(builtin.module(test-pass-failure))"
+// REPRO_LOCAL_DYNAMIC_FAILURE: pipeline: "builtin.module(builtin.module(test-pass-failure{gen-diagnostics=false}))"
diff --git a/mlir/test/Pass/full_diagnostics.mlir b/mlir/test/Pass/full_diagnostics.mlir
new file mode 100644
index 00000000000000..881260ce60d321
--- /dev/null
+++ b/mlir/test/Pass/full_diagnostics.mlir
@@ -0,0 +1,17 @@
+// RUN: mlir-opt %s -pass-pipeline='builtin.module(func.func(test-pass-failure{gen-diagnostics}))' -verify-diagnostics
+
+// Test that all errors are reported.
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass1() {
+  return
+}
+
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass2() {
+  return
+}
+
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass3() {
+  return
+}
diff --git a/mlir/test/lib/Pass/TestPassManager.cpp b/mlir/test/lib/Pass/TestPassManager.cpp
index ee32bec0c79bd4..7afe2109f04db3 100644
--- a/mlir/test/lib/Pass/TestPassManager.cpp
+++ b/mlir/test/lib/Pass/TestPassManager.cpp
@@ -151,11 +151,21 @@ struct TestCrashRecoveryPass
 struct TestFailurePass : public PassWrapper<TestFailurePass, OperationPass<>> {
   MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestFailurePass)
 
-  void runOnOperation() final { signalPassFailure(); }
+  TestFailurePass() = default;
+  TestFailurePass(const TestFailurePass &other) : PassWrapper(other) {}
+
+  void runOnOperation() final {
+    signalPassFailure();
+    if (genDiagnostics)
+      mlir::emitError(getOperation()->getLoc(), "illegal operation");
+  }
   StringRef getArgument() const final { return "test-pass-failure"; }
   StringRef getDescription() const final {
     return "Test a pass in the pass manager that always fails";
   }
+
+  Option<bool> genDiagnostics{*this, "gen-diagnostics",
+                              llvm::cl::desc("Generate a diagnostic message")};
 };
 
 /// A test pass that creates an invalid operation in a function body.

@llvmbot
Copy link
Member

llvmbot commented Sep 30, 2024

@llvm/pr-subscribers-mlir-core

Author: Billy Zhu (zyx-billy)

Changes

Today, when the pass infra schedules a pass/nested-pipeline on a set of ops, it exits early as soon as it fails on one of the ops. This leads to non-exhaustive, and more importantly, non-deterministic error reporting (under async).

This PR removes the early termination behavior so that all ops have a chance to run through the current pass/nested-pipeline, and all errors are reported (async diagnostics are already ordered). This guarantees deterministic & full error reporting. As a result, it's also no longer necessary to -split-input-file with one error per split when testing with -verify-diagnostics.


Full diff: https://github.com/llvm/llvm-project/pull/110311.diff

4 Files Affected:

  • (modified) mlir/lib/Pass/Pass.cpp (+6-3)
  • (modified) mlir/test/Pass/crash-recovery-dynamic-failure.mlir (+1-1)
  • (added) mlir/test/Pass/full_diagnostics.mlir (+17)
  • (modified) mlir/test/lib/Pass/TestPassManager.cpp (+11-1)
diff --git a/mlir/lib/Pass/Pass.cpp b/mlir/lib/Pass/Pass.cpp
index 83b46e8f5680a0..90fd9a713e41d7 100644
--- a/mlir/lib/Pass/Pass.cpp
+++ b/mlir/lib/Pass/Pass.cpp
@@ -732,7 +732,7 @@ void OpToOpPassAdaptor::runOnOperationImpl(bool verifyPasses) {
         unsigned initGeneration = mgr->impl->initializationGeneration;
         if (failed(runPipeline(*mgr, &op, am.nest(&op), verifyPasses,
                                initGeneration, instrumentor, &parentInfo)))
-          return signalPassFailure();
+          signalPassFailure();
       }
     }
   }
@@ -799,6 +799,7 @@ void OpToOpPassAdaptor::runOnOperationAsyncImpl(bool verifyPasses) {
   // An atomic failure variable for the async executors.
   std::vector<std::atomic<bool>> activePMs(asyncExecutors.size());
   std::fill(activePMs.begin(), activePMs.end(), false);
+  std::atomic<bool> hasFailure = false;
   auto processFn = [&](OpPMInfo &opInfo) {
     // Find an executor for this operation.
     auto it = llvm::find_if(activePMs, [](std::atomic<bool> &isActive) {
@@ -812,14 +813,16 @@ void OpToOpPassAdaptor::runOnOperationAsyncImpl(bool verifyPasses) {
     LogicalResult pipelineResult = runPipeline(
         pm, opInfo.op, opInfo.am, verifyPasses,
         pm.impl->initializationGeneration, instrumentor, &parentInfo);
+    if (failed(pipelineResult))
+      hasFailure.store(true);
 
     // Reset the active bit for this pass manager.
     activePMs[pmIndex].store(false);
-    return pipelineResult;
   };
+  parallelForEach(context, opInfos, processFn);
 
   // Signal a failure if any of the executors failed.
-  if (failed(failableParallelForEach(context, opInfos, processFn)))
+  if (hasFailure)
     signalPassFailure();
 }
 
diff --git a/mlir/test/Pass/crash-recovery-dynamic-failure.mlir b/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
index cea4f6a79963bd..6d2bd696391086 100644
--- a/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
+++ b/mlir/test/Pass/crash-recovery-dynamic-failure.mlir
@@ -15,4 +15,4 @@ module @inner_mod1 {
 // REPRO_LOCAL_DYNAMIC_FAILURE: module @inner_mod1
 // REPRO_LOCAL_DYNAMIC_FAILURE: module @foo {
 
-// REPRO_LOCAL_DYNAMIC_FAILURE: pipeline: "builtin.module(builtin.module(test-pass-failure))"
+// REPRO_LOCAL_DYNAMIC_FAILURE: pipeline: "builtin.module(builtin.module(test-pass-failure{gen-diagnostics=false}))"
diff --git a/mlir/test/Pass/full_diagnostics.mlir b/mlir/test/Pass/full_diagnostics.mlir
new file mode 100644
index 00000000000000..881260ce60d321
--- /dev/null
+++ b/mlir/test/Pass/full_diagnostics.mlir
@@ -0,0 +1,17 @@
+// RUN: mlir-opt %s -pass-pipeline='builtin.module(func.func(test-pass-failure{gen-diagnostics}))' -verify-diagnostics
+
+// Test that all errors are reported.
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass1() {
+  return
+}
+
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass2() {
+  return
+}
+
+// expected-error@below {{illegal operation}}
+func.func @TestAlwaysIllegalOperationPass3() {
+  return
+}
diff --git a/mlir/test/lib/Pass/TestPassManager.cpp b/mlir/test/lib/Pass/TestPassManager.cpp
index ee32bec0c79bd4..7afe2109f04db3 100644
--- a/mlir/test/lib/Pass/TestPassManager.cpp
+++ b/mlir/test/lib/Pass/TestPassManager.cpp
@@ -151,11 +151,21 @@ struct TestCrashRecoveryPass
 struct TestFailurePass : public PassWrapper<TestFailurePass, OperationPass<>> {
   MLIR_DEFINE_EXPLICIT_INTERNAL_INLINE_TYPE_ID(TestFailurePass)
 
-  void runOnOperation() final { signalPassFailure(); }
+  TestFailurePass() = default;
+  TestFailurePass(const TestFailurePass &other) : PassWrapper(other) {}
+
+  void runOnOperation() final {
+    signalPassFailure();
+    if (genDiagnostics)
+      mlir::emitError(getOperation()->getLoc(), "illegal operation");
+  }
   StringRef getArgument() const final { return "test-pass-failure"; }
   StringRef getDescription() const final {
     return "Test a pass in the pass manager that always fails";
   }
+
+  Option<bool> genDiagnostics{*this, "gen-diagnostics",
+                              llvm::cl::desc("Generate a diagnostic message")};
 };
 
 /// A test pass that creates an invalid operation in a function body.

@joker-eph joker-eph requested a review from River707 September 30, 2024 21:26
Copy link
Contributor

@River707 River707 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me!

};
parallelForEach(context, opInfos, processFn);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just inline processFn now? It was out of line when the function was called in the if.

@zyx-billy zyx-billy force-pushed the mlir/pass/full_diagnostics branch from aa4a0b6 to e3f62cf Compare October 1, 2024 17:08
@zyx-billy zyx-billy merged commit 5b21fd2 into llvm:main Oct 2, 2024
8 checks passed
Sterling-Augustine pushed a commit to Sterling-Augustine/llvm-project that referenced this pull request Oct 3, 2024
Today, when the pass infra schedules a pass/nested-pipeline on a set of
ops, it exits early as soon as it fails on one of the ops. This leads to
non-exhaustive, and more importantly, non-deterministic error reporting
(under async).

This PR removes the early termination behavior so that all ops have a
chance to run through the current pass/nested-pipeline, and all errors
are reported (async diagnostics are already ordered). This guarantees
deterministic & full error reporting. As a result, it's also no longer
necessary to -split-input-file with one error per split when testing
with -verify-diagnostics.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants