-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[MLIR] Fuse locations of merged constants #74670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8402239
to
90c1e7b
Compare
@llvm/pr-subscribers-mlir-core @llvm/pr-subscribers-mlir Author: Billy Zhu (zyx-billy) ChangesWhen hoisting & merging constants by the operation folder, the location of the op that remains should be updated to track the new meaning of this op. This way we do not lose track of all possible source locations that the constant op came from, and the final location of the op is less reliant on the order of folding. This will also help debuggers understand how to step these instructions. This PR introduces a helper for operation folder to fuse another location into the location of an op. There are two ways this is used now:
The FusedLoc will have a string metadata to help understand the reason for the location fusion (motivated by the example in the docstring of FusedLoc). Full diff: https://github.com/llvm/llvm-project/pull/74670.diff 2 Files Affected:
diff --git a/mlir/lib/Transforms/Utils/FoldUtils.cpp b/mlir/lib/Transforms/Utils/FoldUtils.cpp
index 90ee5ba51de3ad..23fcb4796892d6 100644
--- a/mlir/lib/Transforms/Utils/FoldUtils.cpp
+++ b/mlir/lib/Transforms/Utils/FoldUtils.cpp
@@ -19,6 +19,38 @@
using namespace mlir;
+// Fuse `foldedLocation` into the Location of `retainedOp`.
+// This will result in `retainedOp` having a FusedLoc with a StringAttr tag
+// "OpFold" to help trace the source of the fusion. If `retainedOp` already had
+// a FusedLoc with the same tag, `foldedLocation` will simply be appended to it.
+// Usage:
+// - When an op is deduplicated, fuse the location of the op to be removed into
+// the op that is retained.
+// - When an op is hoisted to the front/back of a block, fuse the location of
+// the parent region of the block into the hoisted op.
+static void appendFoldedLocation(Operation *retainedOp,
+ Location foldedLocation) {
+ constexpr std::string_view tag = "OpFold";
+ // Append into existing fused location if it has the same tag.
+ if (auto existingFusedLoc =
+ retainedOp->getLoc().dyn_cast<FusedLocWith<StringAttr>>()) {
+ StringAttr existingMetadata = existingFusedLoc.getMetadata();
+ if (existingMetadata.strref().equals(tag)) {
+ SmallVector<Location> locations(existingFusedLoc.getLocations());
+ locations.push_back(foldedLocation);
+ Location newFusedLoc =
+ FusedLoc::get(retainedOp->getContext(), locations, existingMetadata);
+ retainedOp->setLoc(newFusedLoc);
+ return;
+ }
+ }
+ // Create a new fusedloc with retainedOp's loc and foldedLocation.
+ Location newFusedLoc = FusedLoc::get(
+ retainedOp->getContext(), {retainedOp->getLoc(), foldedLocation},
+ StringAttr::get(retainedOp->getContext(), tag));
+ retainedOp->setLoc(newFusedLoc);
+}
+
/// Given an operation, find the parent region that folded constants should be
/// inserted into.
static Region *
@@ -77,8 +109,10 @@ LogicalResult OperationFolder::tryToFold(Operation *op, bool *inPlaceUpdate) {
// Check to see if we should rehoist, i.e. if a non-constant operation was
// inserted before this one.
Block *opBlock = op->getBlock();
- if (&opBlock->front() != op && !isFolderOwnedConstant(op->getPrevNode()))
+ if (&opBlock->front() != op && !isFolderOwnedConstant(op->getPrevNode())) {
op->moveBefore(&opBlock->front());
+ appendFoldedLocation(op, opBlock->getParent()->getLoc());
+ }
return failure();
}
@@ -112,8 +146,10 @@ bool OperationFolder::insertKnownConstant(Operation *op, Attribute constValue) {
// If this is a constant we unique'd, we don't need to insert, but we can
// check to see if we should rehoist it.
if (isFolderOwnedConstant(op)) {
- if (&opBlock->front() != op && !isFolderOwnedConstant(op->getPrevNode()))
+ if (&opBlock->front() != op && !isFolderOwnedConstant(op->getPrevNode())) {
op->moveBefore(&opBlock->front());
+ appendFoldedLocation(op, opBlock->getParent()->getLoc());
+ }
return true;
}
@@ -141,6 +177,7 @@ bool OperationFolder::insertKnownConstant(Operation *op, Attribute constValue) {
// If there is an existing constant, replace `op`.
if (folderConstOp) {
notifyRemoval(op);
+ appendFoldedLocation(folderConstOp, op->getLoc());
rewriter.replaceOp(op, folderConstOp->getResults());
return false;
}
@@ -151,8 +188,10 @@ bool OperationFolder::insertKnownConstant(Operation *op, Attribute constValue) {
// anything. Otherwise, we move the constant to the insertion block.
Block *insertBlock = &insertRegion->front();
if (opBlock != insertBlock || (&insertBlock->front() != op &&
- !isFolderOwnedConstant(op->getPrevNode())))
+ !isFolderOwnedConstant(op->getPrevNode()))) {
op->moveBefore(&insertBlock->front());
+ appendFoldedLocation(op, insertBlock->getParent()->getLoc());
+ }
folderConstOp = op;
referencedDialects[op].push_back(op->getDialect());
@@ -264,8 +303,10 @@ OperationFolder::processFoldResults(Operation *op,
// with. This may not automatically happen if the operation being folded
// was inserted before the constant within the insertion block.
Block *opBlock = op->getBlock();
- if (opBlock == constOp->getBlock() && &opBlock->front() != constOp)
+ if (opBlock == constOp->getBlock() && &opBlock->front() != constOp) {
constOp->moveBefore(&opBlock->front());
+ appendFoldedLocation(constOp, opBlock->getParent()->getLoc());
+ }
results.push_back(constOp->getResult(0));
continue;
@@ -294,8 +335,10 @@ OperationFolder::tryGetOrCreateConstant(ConstantMap &uniquedConstants,
// Check if an existing mapping already exists.
auto constKey = std::make_tuple(dialect, value, type);
Operation *&constOp = uniquedConstants[constKey];
- if (constOp)
+ if (constOp) {
+ appendFoldedLocation(constOp, loc);
return constOp;
+ }
// If one doesn't exist, try to materialize one.
if (!(constOp = materializeConstant(dialect, rewriter, value, type, loc)))
@@ -316,6 +359,7 @@ OperationFolder::tryGetOrCreateConstant(ConstantMap &uniquedConstants,
// materialized operation in favor of the existing one.
if (auto *existingOp = uniquedConstants.lookup(newKey)) {
notifyRemoval(constOp);
+ appendFoldedLocation(existingOp, constOp->getLoc());
rewriter.eraseOp(constOp);
referencedDialects[existingOp].push_back(dialect);
return constOp = existingOp;
diff --git a/mlir/test/Transforms/canonicalize-debuginfo.mlir b/mlir/test/Transforms/canonicalize-debuginfo.mlir
new file mode 100644
index 00000000000000..9cee2fec83bad6
--- /dev/null
+++ b/mlir/test/Transforms/canonicalize-debuginfo.mlir
@@ -0,0 +1,42 @@
+// RUN: mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(canonicalize{test-convergence}))' -split-input-file -mlir-print-debuginfo | FileCheck %s
+
+// CHECK-LABEL: func @merge_constants
+func.func @merge_constants() -> (index, index, index) {
+ // CHECK-NEXT: arith.constant 42 : index loc(#[[FusedLoc:.*]])
+ %0 = arith.constant 42 : index loc("merge_constants":0:0)
+ %1 = arith.constant 42 : index loc("merge_constants":1:0)
+ %2 = arith.constant 42 : index loc("merge_constants":2:0)
+ return %0, %1, %2: index, index, index
+}
+
+// CHECK-DAG: #[[LocConst0:.*]] = loc("merge_constants":0:0)
+// CHECK-DAG: #[[LocConst1:.*]] = loc("merge_constants":1:0)
+// CHECK-DAG: #[[LocConst2:.*]] = loc("merge_constants":2:0)
+
+// CHECK: #[[FusedLoc]] = loc(fused<"OpFold">[
+// CHECK-SAME: #[[LocConst0]]
+// CHECK-SAME: #[[LocConst1]]
+// CHECK-SAME: #[[LocConst2]]
+
+// -----
+
+// CHECK-LABEL: func @hoist_constant
+func.func @hoist_constant(%arg0: memref<8xi32>) {
+ // CHECK-NEXT: arith.constant 42 : i32 loc(#[[FusedWithFunction:.*]])
+ affine.for %arg1 = 0 to 8 {
+ %0 = arith.constant 42 : i32 loc("hoist_constant":0:0)
+ %1 = arith.constant 42 : i32 loc("hoist_constant":1:0)
+ memref.store %0, %arg0[%arg1] : memref<8xi32>
+ memref.store %1, %arg0[%arg1] : memref<8xi32>
+ }
+ return
+} loc("hoist_constant":2:0)
+
+// CHECK-DAG: #[[LocConst0:.*]] = loc("hoist_constant":0:0)
+// CHECK-DAG: #[[LocConst1:.*]] = loc("hoist_constant":1:0)
+// CHECK-DAG: #[[LocFunc:.*]] = loc("hoist_constant":2:0)
+
+// CHECK: #[[FusedWithFunction]] = loc(fused<"OpFold">[
+// CHECK-SAME: #[[LocConst0]]
+// CHECK-SAME: #[[LocFunc]]
+// CHECK-SAME: #[[LocConst1]]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice idea!
I'm not convinced about the block parent location part, we should probably move this to a separate PR and discuss a bit more what we'd be trying to achieve exactly.
// the parent region of the block into the hoisted op. | ||
static void appendFoldedLocation(Operation *retainedOp, | ||
Location foldedLocation) { | ||
constexpr std::string_view tag = "OpFold"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we usually use StringLiteral in LLVM: has there been a migration to std::string_view?
But even better, could you make this a StringAttr member of the OperationFolder
class? (this could be a private method, or just take it an extra argument).
That way we'd avoid the repeated StringAttr creation in this function, and the string comparison line 38 would become a pointer comparison instead.
For the naming, we already have a "CSE" tag I think, it may be useful to reuse the naming here instead?
retainedOp->getLoc().dyn_cast<FusedLocWith<StringAttr>>()) { | ||
StringAttr existingMetadata = existingFusedLoc.getMetadata(); | ||
if (existingMetadata.strref().equals(tag)) { | ||
SmallVector<Location> locations(existingFusedLoc.getLocations()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It this possible we'd fuse the same location over and over? Should this be a SmallPtrSet to remove the duplication?
Please add tests for this.
op->moveBefore(&opBlock->front()); | ||
appendFoldedLocation(op, opBlock->getParent()->getLoc()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure that makes sense to me: this is already a "folderOwnedConstant" and we just moved it up in the block
op->moveBefore(&opBlock->front()); | ||
appendFoldedLocation(op, opBlock->getParent()->getLoc()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
return; | ||
} | ||
} | ||
// Create a new fusedloc with retainedOp's loc and foldedLocation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably check that the new loc isn't identical to the old one first.
constOp->moveBefore(&opBlock->front()); | ||
appendFoldedLocation(constOp, opBlock->getParent()->getLoc()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as before, I don't think this makes sense to me.
@@ -0,0 +1,42 @@ | |||
// RUN: mlir-opt -allow-unregistered-dialect %s -pass-pipeline='builtin.module(func.func(canonicalize{test-convergence}))' -split-input-file -mlir-print-debuginfo | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove -allow-unregistered-dialect
// CHECK: #[[FusedWithFunction]] = loc(fused<"OpFold">[ | ||
// CHECK-SAME: #[[LocConst0]] | ||
// CHECK-SAME: #[[LocFunc]] | ||
// CHECK-SAME: #[[LocConst1]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have 7 invocations of appendFoldedLocation
, I would think there would be 7 tests?
@@ -19,6 +19,38 @@ | |||
|
|||
using namespace mlir; | |||
|
|||
// Fuse `foldedLocation` into the Location of `retainedOp`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: please use ///
for docstrings
constexpr std::string_view tag = "OpFold"; | ||
// Append into existing fused location if it has the same tag. | ||
if (auto existingFusedLoc = | ||
retainedOp->getLoc().dyn_cast<FusedLocWith<StringAttr>>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
retainedOp->getLoc().dyn_cast<FusedLocWith<StringAttr>>()) { | |
dyn_cast<FusedLocWith<StringAttr>>(retainedOp->getLoc())) { |
Concretely, the problem with hoisting constants to the top of basic blocks is that if those constant ops don't become "trivial" (e.g. they don't lower directly to LLVM constants and have some codegen), debuggers will pick them up when stepping into a function. E.g. func @my_function() {
"some.op"() : () -> ()
%c0 = some.complex.constant = [1, 2, 3]
return
} If the constant gets hoisted and lowers to non-trivial code: llvm.func @my_function() {
%0 = llvm.mlir.undef : ...
// bunch of code constructing the "constant"
"some.op"() :() -> ()
} The debugger will jump around because the source locations are messed up. |
I do agree that fusing with the block location is more controversial than fusing constant op locations together. Perhaps the policy should lie with the constant ops themselves |
It's not clear to me why one constant vs another would treat it differently, I think it's best to have a separate PR just so that we can see examples of what we're trying to address exactly with this. |
Thanks for the reviews! Sounds like I can probably split up constant op location fusing & block parent location fusing, but let me elaborate on the motivation since they both are for the same cause. Like @Mogball said, these locations end up informing the debugger which line of source code is being stepped on, which is a nice proxy for testing whether the location of an op still makes sense*. Any time we do these constant merging, even though procedurally we're erasing ops & moving ops to the front, semantically it's more like saying we're eliminating all the same constants in the block, and creating a unified one at the front of the block to be shared by all their users. The reason we have this new constant op in the front of the block, is because of all the source lines that contributed to the original constants in this block. That's why I think both "the original constants" and "this block" are equally important in the creation of the new constant (after all, the block dictated the scope & position of this shared constant in the IR). At this point it's basically common setup code for the block (just like you might have language-dependent function setup code at the front of a function, and those carry the location of the function definition). The parallel optimization in LLVM actually does the same thing (though it achieves it in a different way, perhaps a bit more weird?) for the same reason (https://reviews.llvm.org/D38088):
Location merging in LLVM returns the location of the common parent scope, which will result in a similar behavior as us. Unlike us, it will unfortunately lose the location of the original constant, whereas we can hold both. * I'm not sure if this represents a slight semantic shift for what locations mean (I'm hoping it's captured by the original definition), but if it is, I'm also open to considering new concepts for capturing a "hoisted location", or perhaps more broadly a debug location vs. source location distinction if we want to go down that route. LMK what you think. Happy to discuss more! |
I looked at the LLVM revision, but I don't find the "parent op location" equivalent in MLIR here? I actually may start to understand somehow: they attach the fused location of the insertion point somehow (for PGO reasons apparently), and you're trying to translate it to the "parent op" because you think that in the entry point it'll translate to the closest to this? It'll probably help to look at a dedicated revision with dedicated examples crafted to showcase this. |
Ah, this happens inside the logic for For hoisting too, instead of following LLVM and just using the location of the insertion point instruction, we can also make things clearer by using the location of the parent scope directly, since we always move to the front of the block. This way we can later tell that this op is part of the common setup of the parent scope. DWARF emitters can then use this to mark this op as part of the prologue if appropriate.
Sounds good. Let me split up the PR so it can be considered separately. |
OK, this PR is now updated to only include changes for fusing merged constants. Also updated tests to check every case where deduplication happens. PTAL when you get a chance 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, thanks! Looking forward for the follow up on blocks ;)
Thank you @joker-eph! Do you mind clicking merge for me? I don't yet have commit access 🙏 |
[PR 74670](llvm#74670) added support for merging locations at constant folding time. We have discovered that in some cases, the number of locations grows so big as to cause a compilation process to OOM. In that case, many of the locations end up appearing several times in nested fused locations. We add here a helper that always flattens fused locations in order to eliminate duplicates in the case of nested fused locations. We only allow flattening nested fused locations when the inner fused location has no metadata, or has the same metadata as the outer fused location.
[PR 74670](#74670) added support for merging locations at constant folding time. We have discovered that in some cases, the number of locations grows so big as to cause a compilation process to OOM. In that case, many of the locations end up appearing several times in nested fused locations. We add here a helper that always flattens fused locations in order to eliminate duplicates in the case of nested fused locations.
A heads-up that this caused a lot of OOM and timeout failures to our internal users and the follow-up mitigation #75218 doesn't fully address. @bchetioui |
This reverts commit 87e2e89. and its follow-ups 0d1490f (llvm#75218) and 6fe3cd5 (llvm#75312). We observed significant OOM/timeout issues due to llvm#74670 to quite a few services including google-research/swirl-lm. The follow-up llvm#75218 and llvm#75312 do not address the issue. Perhaps this is worth more investigation.
This reverts commit 87e2e89. and its follow-ups 0d1490f (#75218) and 6fe3cd5 (#75312). We observed significant OOM/timeout issues due to #74670 to quite a few services including google-research/swirl-lm. The follow-up #75218 and #75312 do not address the issue. Perhaps this is worth more investigation.
Follow up to the discussion from #75258, and serves as an alternate solution for #74670. Set the location to Unknown for deduplicated / moved / materialized constants by OperationFolder. This makes sure that the folded constants don't end up with an arbitrary location of one of the original ops that became it, and that hoisted ops don't confuse the stepping order.
When merging constants by the operation folder, the location of the op that remains should be updated to track the new meaning of this op. This way we do not lose track of all possible source locations that the constant op came from, and the final location of the op is less reliant on the order of folding. This will also help debuggers understand how to step these instructions.
This PR introduces a helper for operation folder to fuse another location into the location of an op. When an op is deduplicated, fuse the location of the op to be removed into the op that is retained. The retained op now represents both original ops.
The FusedLoc will have a string metadata to help understand the reason for the location fusion (motivated by the example in the docstring of FusedLoc).