Skip to content

[mlir][transform] Improve error message of tracking listener. #66987

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

ingomueller-net
Copy link
Contributor

This PR extends the error message of the tracking listener when replacement ops cannot be found. That may happen if the applied patterns replace an op by an op of a different kind or by block arguments. However, this only matters if there are alive handles to the replaced op. The new error message mentions that explicitly and reports the alive handles.

@llvmbot
Copy link
Member

llvmbot commented Sep 21, 2023

@llvm/pr-subscribers-mlir-tensor

@llvm/pr-subscribers-mlir

Changes

This PR extends the error message of the tracking listener when replacement ops cannot be found. That may happen if the applied patterns replace an op by an op of a different kind or by block arguments. However, this only matters if there are alive handles to the replaced op. The new error message mentions that explicitly and reports the alive handles.


Full diff: https://github.com/llvm/llvm-project/pull/66987.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h (+8-5)
  • (modified) mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp (+14-11)
  • (modified) mlir/test/Dialect/Transform/test-pattern-application.mlir (+2-1)
diff --git a/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h b/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h
index 86af59142b77d9c..e1169f45d17f699 100644
--- a/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h
+++ b/mlir/include/mlir/Dialect/Transform/IR/TransformInterfaces.h
@@ -774,7 +774,8 @@ class TransformResults {
   /// corresponds to the given list of payload IR ops. Each result must be set
   /// by the transformation exactly once in case of transformation succeeding.
   /// The value must have a type implementing TransformHandleTypeInterface.
-  template <typename Range> void set(OpResult value, Range &&ops) {
+  template <typename Range>
+  void set(OpResult value, Range &&ops) {
     int64_t position = value.getResultNumber();
     assert(position < static_cast<int64_t>(operations.size()) &&
            "setting results for a non-existent handle");
@@ -942,8 +943,9 @@ class TrackingListener : public RewriterBase::Listener,
   /// This function is called when a tracked payload op is dropped because no
   /// replacement op was found. Derived classes can implement this function for
   /// custom error handling.
-  virtual void notifyPayloadReplacementNotFound(Operation *op,
-                                                ValueRange values) {}
+  virtual void
+  notifyPayloadReplacementNotFound(Operation *op, ValueRange values,
+                                   ArrayRef<Operation *> aliveUsers) {}
 
   /// Return the single op that defines all given values (if any).
   static Operation *getCommonDefiningOp(ValueRange values);
@@ -983,8 +985,9 @@ class ErrorCheckingTrackingListener : public TrackingListener {
   bool failed() const;
 
 protected:
-  void notifyPayloadReplacementNotFound(Operation *op,
-                                        ValueRange values) override;
+  void
+  notifyPayloadReplacementNotFound(Operation *op, ValueRange values,
+                                   ArrayRef<Operation *> aliveUsers) override;
 
 private:
   /// The error state of this listener. "Success" indicates that no error
diff --git a/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp b/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp
index 9cac178d3c2b869..ab26ec66a9ac4e3 100644
--- a/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp
+++ b/mlir/lib/Dialect/Transform/IR/TransformInterfaces.cpp
@@ -1396,14 +1396,13 @@ void transform::TrackingListener::notifyOperationReplaced(
   };
 
   // Helper function to check if the handle is alive.
-  auto hasAliveUser = [&]() {
-    for (Value v : opHandles) {
-      for (Operation *user : v.getUsers())
-        if (user != transformOp && !happensBefore(user, transformOp))
-          return true;
-    }
-    return false;
-  };
+  SmallVector<Operation *> aliveUsers;
+  for (Value v : opHandles) {
+    for (Operation *user : v.getUsers())
+      if (user != transformOp && !happensBefore(user, transformOp))
+        aliveUsers.push_back(user);
+  }
+  auto hasAliveUser = [&]() { return !aliveUsers.empty(); };
 
   if (!hasAliveUser() || handleWasConsumed()) {
     // The op is tracked but the corresponding handles are dead or were
@@ -1416,7 +1415,7 @@ void transform::TrackingListener::notifyOperationReplaced(
   // If the op is tracked but no replacement op was found, send a
   // notification.
   if (failed(replacement)) {
-    notifyPayloadReplacementNotFound(op, newValues);
+    notifyPayloadReplacementNotFound(op, newValues, aliveUsers);
     (void)replacePayloadOp(op, nullptr);
     return;
   }
@@ -1444,16 +1443,20 @@ bool transform::ErrorCheckingTrackingListener::failed() const {
 }
 
 void transform::ErrorCheckingTrackingListener::notifyPayloadReplacementNotFound(
-    Operation *op, ValueRange values) {
+    Operation *op, ValueRange values, ArrayRef<Operation *> aliveUsers) {
   if (status.succeeded()) {
     status = emitSilenceableFailure(
-        getTransformOp(), "tracking listener failed to find replacement op");
+        getTransformOp(), "op was replaced but replacement was of different "
+                          "kind, invalidating alive handles");
   }
 
   status.attachNote(op->getLoc()) << "[" << errorCounter << "] replaced op";
   for (auto &&[index, value] : llvm::enumerate(values))
     status.attachNote(value.getLoc())
         << "[" << errorCounter << "] replacement value " << index;
+  for (auto &&[index, user] : llvm::enumerate(aliveUsers))
+    status.attachNote(user->getLoc())
+        << "[" << errorCounter << "] alive handle " << index;
 
   ++errorCounter;
 }
diff --git a/mlir/test/Dialect/Transform/test-pattern-application.mlir b/mlir/test/Dialect/Transform/test-pattern-application.mlir
index efbdab78d397faa..f6a0801204b80d7 100644
--- a/mlir/test/Dialect/Transform/test-pattern-application.mlir
+++ b/mlir/test/Dialect/Transform/test-pattern-application.mlir
@@ -37,12 +37,13 @@ transform.sequence failures(propagate) {
 ^bb1(%arg1: !transform.any_op):
   %0 = transform.structured.match ops{["test.container"]} in %arg1 : (!transform.any_op) -> !transform.any_op
   %1 = transform.structured.match ops{["test.foo"]} in %arg1 : (!transform.any_op) -> !transform.any_op
-  // expected-error @below {{tracking listener failed to find replacement op}}
+  // expected-error @below {{op was replaced but replacement was of different kind, invalidating alive handles}}
   transform.apply_patterns to %0 {
     transform.apply_patterns.transform.test_patterns
   } : !transform.any_op
   // %1 must be used in some way. If no replacement payload op could be found,
   // an error is thrown only if the handle is not dead.
+  // expected-note @below {{[0] alive handle 0}}
   transform.annotate %1 "annotated" : !transform.any_op
 }
 

@ingomueller-net
Copy link
Contributor Author

The PR currently changes the interface of TrackingListener by adding an argument to notifyPayloadReplacementNotFound. This may not be desired. An alternative would be to attach a note to the private status after calling notifyPayloadReplacementNotFound. (That may not be perfect either: it assumes a certain behavior of that function, which could be overridden by a derived class.)

Comment on lines 1449 to 1450
getTransformOp(), "op was replaced but replacement was of different "
"kind, invalidating alive handles");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacement may not be found for other reasons though, like replacement values belonging to different ops (so there is no single replacement though all of the new producers have the right type). I suspect the message should be produced by the function that looks for replacement and forwarded here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, I see. I guess that that part of the error messages should thus come from TrackingListener::findReplacementOp, which currently returns a FailureOr<Operation *>. What mechanism can I use here to return the replacment op and diagnostics? There is no DiagnosedSilenceableFailureOr<T>, AFAIK...

Copy link
Member

@ftynse ftynse Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can always return additional things via references, use std::tuple or implement DiagnosedSilenceableFailureOr. Your choice :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked a bit with Matthias about this aspect. Indeed, there are a couple of options. The next revision returns a DiagnosedSilenceableFailure and passes the result Operation* through an output argument. Let me know if you prefer one of the other options.

}

status.attachNote(op->getLoc()) << "[" << errorCounter << "] replaced op";
for (auto &&[index, value] : llvm::enumerate(values))
status.attachNote(value.getLoc())
<< "[" << errorCounter << "] replacement value " << index;
for (auto &&[index, user] : llvm::enumerate(aliveUsers))
status.attachNote(user->getLoc())
<< "[" << errorCounter << "] alive handle " << index;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels misleading. The user is not the handle, its operand is. We should capture uses (OpOperand *) rather than users and attach the note to the location of the value that is being used. Bonus points for giving extra information like the value being n-th result of the op (values have the location of the defining op so we cannot otherwise differentiate between them).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think that the newest revision implements that suggestion. Please take another look.

if (status.succeeded()) {
status = emitSilenceableFailure(
getTransformOp(), "tracking listener failed to find replacement op");
Copy link
Member

@matthias-springer matthias-springer Sep 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe "tracking listener failed to find replacement op for op that is tracked by non-dead handle".

But even that could be confusing. Maybe we should just add a comment in the C++ code that mentions all the conditions (type changed, handle not dead, etc.).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking literally returning DiagnosedSilecenableFailure instead of LogicalResult, then the replacement lookup can specify the message.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of that is good input. I now assemble the error message in different places, each contributing the information they have. Please take another look.

Copy link
Contributor Author

@ingomueller-net ingomueller-net left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the reviews! Issues addressed (or attempted to do so). Please have another look!

if (status.succeeded()) {
status = emitSilenceableFailure(
getTransformOp(), "tracking listener failed to find replacement op");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of that is good input. I now assemble the error message in different places, each contributing the information they have. Please take another look.

Comment on lines 1449 to 1450
getTransformOp(), "op was replaced but replacement was of different "
"kind, invalidating alive handles");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked a bit with Matthias about this aspect. Indeed, there are a couple of options. The next revision returns a DiagnosedSilenceableFailure and passes the result Operation* through an output argument. Let me know if you prefer one of the other options.

}

status.attachNote(op->getLoc()) << "[" << errorCounter << "] replaced op";
for (auto &&[index, value] : llvm::enumerate(values))
status.attachNote(value.getLoc())
<< "[" << errorCounter << "] replacement value " << index;
for (auto &&[index, user] : llvm::enumerate(aliveUsers))
status.attachNote(user->getLoc())
<< "[" << errorCounter << "] alive handle " << index;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think that the newest revision implements that suggestion. Please take another look.

Comment on lines 1453 to 1481
status.attachNote(op->getLoc()) << "[" << errorCounter << "] replaced op";
for (auto &&[index, value] : llvm::enumerate(values))
status.attachNote(value.getLoc())
<< "[" << errorCounter << "] replacement value " << index;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This information could (almost) be added to the diagnostics findReplacementOp; however, errorCounter is not available there. Are we sure we need this? (There are no tests where errorCounter > 0.) Note that some of the diagnostics that this PR currently adds also don't report the current error count...

This PR extends the error message of the tracking listener when
replacement ops cannot be found. That may happen if the applied patterns
replace an op by an op of a different kind or by block arguments.
However, this only matters if there are alive handles to the replaced
op. The new error message mentions that explicitly and reports the alive
handles.
In particular:

* Change `TrackingListener::findReplacementOp` such that it returns a
  `DiagnosedSilenceableFailure` rather than just a `FailureOr`.
* Have `TrackingListener::notifyPayloadReplacementNotFound` accept a
  `DiagnosedSilenceableFailure` with previous diagnostics.
* Create initial diagnostics in `findReplacementOp` and give information
  about why finding the replacement failed.
* Forward that result into `notifyPayloadReplacementNotFound` and
  adapt how the `ErroriCheckingTrackingListener` deals with the
  pre-existing diagnostics.
* Only report the first alive user.
* Adapt error messages to suggestions.
That op created an empty (and invalid) transform op with
`TransformOpInterface()` for testing the tracking listeners. The
previous listeners never accessed the op created this way, so this
wasn't a problem. However, the new version does, so we need to create a
real op, which is what this commit does.
@ingomueller-net ingomueller-net force-pushed the transform-tracking-listener-error branch from bf4fa56 to c25f347 Compare September 21, 2023 14:25
Copy link
Member

@ftynse ftynse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@ingomueller-net ingomueller-net merged commit a753045 into llvm:main Sep 25, 2023
@ingomueller-net ingomueller-net deleted the transform-tracking-listener-error branch September 25, 2023 11:57
@vitalybuka
Copy link
Collaborator

There is a leak https://lab.llvm.org/buildbot/#/builders/5/builds/36953/steps/9/logs/stdio
Please fix or revert.

@ingomueller-net
Copy link
Contributor Author

Thanks for notifying! Saw it just now as well. I am AFK and can only fix tomorrow and revert in 30m-2h. I'll do that if nobody else has reverted by then.

@vitalybuka
Copy link
Collaborator

Thanks, I will revert.

@vitalybuka
Copy link
Collaborator

Reverted 78c8ab5

DummyTrackingListener listener(transformState,
transform::TransformOpInterface());
MLIRContext *context = rootOp->getContext();
OpBuilder builder(context);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This OpBuilder has no insertion point. That's likely what caused the leak.

ingomueller-net added a commit that referenced this pull request Sep 26, 2023
#66987)"

This commit reapplies #66987, which got original contained a memory leak
and got reverted by 78c8ab5. The leak
is now fixed.

Original description:

This PR extends the error message of the tracking listener when
replacement ops cannot be found. That may happen if the applied patterns
replace an op by an op of a different kind or by block arguments.
However, this only matters if there are alive handles to the replaced
op. The new error message mentions that explicitly and reports the alive
handles.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants