
[MLIR][Tensor] Remove tensor.dim canonicalization patterns registered on tensor.expand_shape/tensor.collapse_shape #134219


Merged: 5 commits into llvm:main, Apr 11, 2025

Conversation

@vivekkhandelwal1 (Contributor) commented Apr 3, 2025

These patterns are problematic because the iterative application that locally resolves the tensor.dim operation introduces an intermediate floor_div, which loses the information about the exact division carried out in the original IR, so the iterative algorithm cannot converge towards the simplest form. Information loss is not acceptable for canonicalization.

Resolving the dimOp can instead be achieved through the resolve-ranked-shaped-type-result-dims and resolve-shaped-type-result-dims passes.

This commit removes the FoldDimOfExpandShape and FoldDimOfCollapseShape patterns from the list of ExpandShapeOp's canonicalization patterns.

These patterns caused a crash while folding the dims of an expanded tensor. The issue can be reproduced by undoing the changes in this commit and running the command:

```
mlir-opt --linalg-fuse-elementwise-ops repro.mlir
```

over the IR: https://gist.github.com/vivekkhandelwal1/56a1a398c21d739df77a67ce372b9366.
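
For illustration, here is a minimal sketch distilled from the canonicalization test updated in this PR (function and value names are illustrative):

```
func.func @dim_of_expand_shape(%t: tensor<?x?xf32>, %sz0: index, %sz1: index) -> index {
  %c2 = arith.constant 2 : index
  // Result dimension 2 is dynamic; its size is already given by %sz1 in output_shape.
  %expanded = tensor.expand_shape %t [[0], [1, 2, 3, 4, 5]] output_shape [%sz0, 1, %sz1, 5, 1, 8]
      : tensor<?x?xf32> into tensor<?x1x?x5x1x8xf32>
  %dim = tensor.dim %expanded, %c2 : tensor<?x1x?x5x1x8xf32>
  return %dim : index
}
```

The removed canonicalization rewrote %dim into `affine.apply affine_map<()[s0] -> (s0 floordiv 40)>()[%src]` with `%src = tensor.dim %t, %c1`, reintroducing a division even though the exact size is already carried by the output_shape operand.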

Signed-off-by: Vivek Khandelwal <[email protected]>
@llvmbot (Member) commented Apr 3, 2025

@llvm/pr-subscribers-mlir-linalg
@llvm/pr-subscribers-mlir-tensor

@llvm/pr-subscribers-mlir

Author: Vivek Khandelwal (vivekkhandelwal1)

Changes

This commit removes the FoldDimOfExpandShape and FoldDimOfCollapseShape patterns from the list of ExpandShapeOp's canonicalization patterns.

These patterns caused a crash while folding the dims of an expanded tensor. The issue can be reproduced by undoing the changes in this commit and running the command:

mlir-opt --linalg-fuse-elementwise-ops repro.mlir

over the IR: https://gist.github.com/vivekkhandelwal1/56a1a398c21d739df77a67ce372b9366.


Full diff: https://github.com/llvm/llvm-project/pull/134219.diff

1 File Affected:

  • (modified) mlir/lib/Dialect/Tensor/IR/TensorOps.cpp (+1-2)
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index d589f627d896e..35265ac49fcbf 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -2158,8 +2158,7 @@ void ExpandShapeOp::getCanonicalizationPatterns(RewritePatternSet &results,
       ComposeExpandOfCollapseOp<ExpandShapeOp, CollapseShapeOp>,
       ConvertToStaticExpandShape, FoldReshapeWithConstant<ExpandShapeOp>,
       FoldReshapeWithSplat<ExpandShapeOp>,
-      FoldReshapeWithFromElements<ExpandShapeOp>, FoldDimOfExpandShape,
-      FoldDimOfCollapseShape>(context);
+      FoldReshapeWithFromElements<ExpandShapeOp>>(context);
 }
 
 void CollapseShapeOp::getCanonicalizationPatterns(RewritePatternSet &results,

@joker-eph (Collaborator) left a comment

I'm very confused that there is no test change? That points to a previous lack of coverage...
Edit: actually the CI is failing; there are test changes and we had coverage :)

Can you add a minimal test demonstrating the issue you're fixing?

@vivekkhandelwal1 (Contributor, Author):

I'm very confused that there is no test change? That points to a previous lack of coverage... Edit: actually the CI is failing, there are test changes and we had coverage :)

Can you add minimal test demonstrating the issue you're fixing?

I have fixed the failing regression tests. Sorry, I forgot to do that while creating the PR.

@joker-eph (Collaborator) commented Apr 3, 2025

We still need a test reproducing the crash you mentioned; right now it's not clear whether what you're doing in this PR is fundamentally the right fix or just a "big hammer" solution for the crash.

@vivekkhandelwal1 (Contributor, Author):

We still need a test reproducing the crash you mentioned, right now it's not clear if what you're doing in this PR is fundamentally the right fix here or just a "big hammer" solution for the crash.

Sure, I will add a test which would crash without these changes.

@vivekkhandelwal1 (Contributor, Author):

We still need a test reproducing the crash you mentioned, right now it's not clear if what you're doing in this PR is fundamentally the right fix here or just a "big hammer" solution for the crash.

Hi @joker-eph, I have added a test which will crash without the changes made in this PR.

@vivekkhandelwal1 (Contributor, Author):

Also, after the removal of those canonicalization patterns, the code below becomes dead code since it's not used anywhere else. Shall I remove that, too?

```
struct FoldDimOfExpandShape : public OpRewritePattern<DimOp> {
  using OpRewritePattern<DimOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(DimOp dimOp,
                                PatternRewriter &rewriter) const override {
    auto expandShapeOp = dimOp.getSource().getDefiningOp<ExpandShapeOp>();
    if (!expandShapeOp)
      return failure();
    // Only constant dimension values are supported.
    std::optional<int64_t> dim = dimOp.getConstantIndex();
    if (!dim.has_value())
      return failure();
    // Skip static dims. These are folded to constant ops.
    RankedTensorType resultType = expandShapeOp.getResultType();
    if (!resultType.isDynamicDim(*dim))
      return failure();
    // Find the reassociation group that contains this result dimension.
    int64_t srcDim = expandShapeOp.getCorrespondingSourceDim(*dim);
    // `dim` is the only dynamic dimension in `group`. (Otherwise, the
    // ExpandShapeOp would be ambiguous.)
    int64_t product = 1;
    ReassociationIndices grp = expandShapeOp.getReassociationIndices()[srcDim];
    for (int64_t d : grp) {
      if (d != dim) {
        assert(!resultType.isDynamicDim(d) && "expected static dim");
        product *= resultType.getDimSize(d);
      }
    }
    // result dim size = src dim size / (product(other dims in reassoc group))
    Value srcDimSz =
        rewriter.create<DimOp>(dimOp.getLoc(), expandShapeOp.getSrc(), srcDim);
    AffineExpr expr;
    bindSymbols(dimOp.getContext(), expr);
    rewriter.replaceOpWithNewOp<affine::AffineApplyOp>(
        dimOp, expr.floorDiv(product), srcDimSz);
    return success();
  }
};

struct FoldDimOfCollapseShape : public OpRewritePattern<DimOp> {
  using OpRewritePattern<DimOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(DimOp dimOp,
                                PatternRewriter &rewriter) const override {
    auto collapseShapeOp = dimOp.getSource().getDefiningOp<CollapseShapeOp>();
    if (!collapseShapeOp)
      return failure();
    // Only constant dimension values are supported.
    std::optional<int64_t> dim = dimOp.getConstantIndex();
    if (!dim.has_value() ||
        dim.value() >= collapseShapeOp.getResultType().getRank())
      return failure();
    // Skip static dims. These are folded to constant ops.
    RankedTensorType resultType = collapseShapeOp.getResultType();
    if (!resultType.isDynamicDim(*dim))
      return failure();
    // Get the reassociation group of the result dimension.
    ReassociationIndices group =
        collapseShapeOp.getReassociationIndices()[*dim];
    // result dim size = product(dims in reassoc group)
    SmallVector<Value> srcDimSizes;
    SmallVector<AffineExpr> syms;
    AffineExpr product;
    for (const auto &it : llvm::enumerate(group)) {
      srcDimSizes.push_back(rewriter.create<DimOp>(
          dimOp.getLoc(), collapseShapeOp.getSrc(), it.value()));
      syms.push_back(rewriter.getAffineSymbolExpr(it.index()));
      product = product ? product * syms.back() : syms.back();
    }
    rewriter.replaceOpWithNewOp<affine::AffineApplyOp>(dimOp, product,
                                                       srcDimSizes);
    return success();
  }
};
```

@Groverkss (Member) left a comment

Why should the canonicalization patterns be removed instead of fixing them? I don't understand the reasoning behind this PR.

@joker-eph (Collaborator):

Hi @joker-eph, I have added a test which will crash without the changes made in this PR.

Thanks! Is it really minimal?

Also, can you comment on my previous question:

right now it's not clear if what you're doing in this PR is fundamentally the right fix here or just a "big hammer" solution for the crash.

Why is removing these the right fix for the crash?

@qedawkins (Contributor) left a comment

Agree, dropping the canonicalizers is not the right fix here. The crash is because this comment is outdated:

// `dim` is the only dynamic dimension in `group`. (Otherwise, the
// ExpandShapeOp would be ambiguous.)

expand_shape takes a list of output sizes now, so we can simplify these canonicalizers to just forward the value in output_shape.
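
A minimal MLIR sketch of that observation (shapes and names here are illustrative, not taken from the PR):

```
func.func @forward_output_shape(%t: tensor<?x?xf32>, %d0: index, %d1: index) -> index {
  %c1 = arith.constant 1 : index
  // The dynamic result sizes are already operands of the op (output_shape).
  %e = tensor.expand_shape %t [[0], [1, 2]] output_shape [%d0, %d1, 8]
      : tensor<?x?xf32> into tensor<?x?x8xf32>
  // A fixed pattern could rewrite this tensor.dim directly to %d1, with no floordiv.
  %dim = tensor.dim %e, %c1 : tensor<?x?x8xf32>
  return %dim : index
}
```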

@MaheshRavishankar (Contributor):

Dropping these canonicalizations is the right thing. These patterns are redundant w.r.t. `mlir::tensor::registerInferTypeOpInterfaceExternalModels()`, which is the current way of resolving tensor.dim of results in terms of the dimensions of the operands. There are two reasons why this canonicalization is bad:

  1. In general, all tensor.dim resolutions have been moved to use the ReifyRankedShapedTypeOpInterface, which is wrapped in `populateResolveRankedShapedTypeResultDimsPatterns()`. These two patterns are outliers and redundant.
  2. None of the other tensor.dim resolution patterns are run during canonicalization. They have non-local effects that might be too hard a hammer for canonicalization, and they were removed from canonicalization a really long time ago.

@vivekkhandelwal1 (Contributor, Author):

Dropping these canonicalizations is the right thing. These patterns are redundant w.r.t. `mlir::tensor::registerInferTypeOpInterfaceExternalModels()`, which is the current way of resolving tensor.dim of results in terms of the dimensions of the operands. There are two reasons why this canonicalization is bad:

  1. In general, all tensor.dim resolutions have been moved to use the ReifyRankedShapedTypeOpInterface, which is wrapped in `populateResolveRankedShapedTypeResultDimsPatterns()`. These two patterns are outliers and redundant.
  2. None of the other tensor.dim resolution patterns are run during canonicalization. They have non-local effects that might be too hard a hammer for canonicalization, and they were removed from canonicalization a really long time ago.

Thanks @MaheshRavishankar for adding the rationale behind these changes.

I request all the existing reviewers to re-review this PR, since we have been blocked on the Torch-MLIR side for quite some time because of this.

CC: @joker-eph

@qedawkins (Contributor) left a comment

Ok if this is in line with canonicalization of tensor.dim w.r.t. other ops then this makes sense. This wasn't immediately clear from the initial PR description. I'll bow out of the review here.

@Groverkss (Member) left a comment

LGTM, can you just update the PR description to include:

This PR makes the tensor.dim folders for tensor.expand_shape/tensor.collapse_shape consistent with other tensor operations. In general:

1. All tensor.dim operation resolutions have been moved to use the ReifyRankedShapedTypeOpInterface.
2. None of the other tensor.dim resolution patterns are run during canonicalizations. They have non-local effects, which make them bad for canonicalizations, and were removed from other tensor operations some time ago.

@qedawkins dismissed their stale review on April 7, 2025 16:37 ("self dismissing review").

@vivekkhandelwal1 (Contributor, Author):

Hi, I don't have merge access. Can someone please merge this for me?

@joker-eph (Collaborator):

None of the other tensor.dim resolution patterns are run during canonicalizations. They have non-local effects that might be too hard a hammer for canonicalizations, and was removed from canonicalizations a really long time ago.

I don't see the non-local effects in the current test change; can you elaborate on this, please? Maybe an example showing the effect of this canonicalization pattern that shouldn't be a canonicalization?

(I'd rather see this resolved before anyone merges this)

@MaheshRavishankar (Contributor) commented Apr 8, 2025

None of the other tensor.dim resolution patterns are run during canonicalizations. They have non-local effects that might be too hard a hammer for canonicalizations, and was removed from canonicalizations a really long time ago.

I don't see the non-local effects on the current test change, can you elaborate on this please? Maybe an example showing the effect of this canonicalization pattern that shouldn't be a canonicalization?

(I'd rather see this resolved before anyone merges this)

Let me clarify. These patterns are the outliers: none of the other tensor.dim resolution patterns are run during canonicalization. These patterns are also wrong and redundant. I see no reason to fix them, because they are inconsistent and a duplicate of the tensor.dim resolution that happens through the interface. So the resolution here is to remove the patterns. The question of whether all tensor.dim resolution needs to happen during canonicalization or not is a separate issue that we can revisit.

It would be better not to block this change, because it is removing a redundant and wrong pattern.

@joker-eph (Collaborator) commented Apr 8, 2025

. These pattern are also wrong

Wrong how?

and redundant

It's a bit ambiguous what you call redundant: if you compare to "other things that are not part of canonicalization", then it's a bit dubious to claim it "redundant", as it won't be part of the canonicalization run.

So the resolution here is to remove this pattern. The question of whether all the tensor.dim resolution needs to happen during canonicalization or not is a separate issue that we can revisit. It would be better to not block this change cause this is removing a redundant and wrong pattern.

This is all an opinion, but I'd rather see you elaborate your thinking to enlighten everyone, instead of just making a strong claim that you're not trying to substantiate right now (other than with very unspecific terms like "wrong" and "redundant").
Because right now:

So the resolution here is to remove this pattern.

may very well be the absolutely right thing; however, it's important to be able to convey why it is ultimately the right thing.

@MaheshRavishankar (Contributor):

. These pattern are also wrong

Wrong how?

Wrong because they are crashing.

and redundant

It's a bit ambiguous what you call redundant, if you compare to "other things that are not part of canonicalization" then it's a bit dubious to claim it "redundant" as it won't be part of the canonicalization run.

Redundant because these patterns replace tensor.dim of a result in terms of tensor.dim of the operands, which is exactly what `populateResolveRankedShapedTypeResultDimsPatterns()` does, using the interface registered in `mlir::tensor::registerInferTypeOpInterfaceExternalModels()`. So it's literally the same thing.

This is all an opinion, but I'd rather see you elaborate you're thinking to enlighten everyone, instead of just making a strong claim that you're not trying to substantiate right now (other than very unspecific terms like "wrong" and "redundant"). Because right now:

It isn't a strong claim. They literally do the same thing that another pattern in the code base does (redundant), and they are crashing (wrong). I don't know what else would be needed to substantiate that this is wrong and redundant.

So the resolution here is to remove this pattern.

may very well be the absolute right thing, however it's important to be able to convey why this is the right ultimate thing.

I don't want to get dragged into the canonicalization vs. non-canonicalization discussion again and again. These two patterns are outliers since they resolve tensor.dim operations of a result in terms of tensor.dim operations of the operands. The way the rest of the in-tree operations handle that is through the interface (including the two ops here); no other ops have such patterns. So I think we don't even need to get into the canonicalization vs. not discussion. These patterns are outliers.

@Groverkss (Member):

OK, let's take a step back and see why this is a good patch. I looked through the history and the code, and here is the reasoning:

There are two questions we are answering here:

  1. Is the current implementation of this pattern wrong? Should we fix it?
  2. Should the correct pattern be a canonicalization?

The answer to Question 1 is that this pattern is wrong. It's trying to infer the output shape of the tensor, which is already provided in the operation! This pattern is from the time when expand_shape did not carry this information. The output shape was explicitly added to the operation in #90040, which explains why, for dynamic shapes, inferring the output shape is wrong.

So now we have two options: either fix the pattern or remove it. The thing is, the same pattern is implemented correctly through ReifyRankedShapedTypeOpInterface as a general pattern. So it is okay to remove the pattern: once fixed, it would be a duplicate of a more general pattern.

So now, to Question 2: is the correct pattern, in whatever shape or form, a good canonicalization? For static cases, expand_shape already has a folder that moves the output_shape field to a more canonical form (dynamic -> static). For dynamic cases, the answer to this question is no. The pattern is not a canonicalization for expand_shape at all! In its correct form, it doesn't change the representation of expand_shape in any way; instead, it uses information from expand_shape to bubble the tensor.dim operations upwards. What I understand by "non-local" effects here is that it's doing transformations around the operation that is meant to be canonicalized, but not doing anything to the operation itself. So it's "non-local" to the operation being canonicalized. This is a clear indication that this pattern is not really a canonicalization.

Finally, the pattern is implemented in transformations that refine shape information through ReifyRankedShapedTypeOpInterface, so we aren't losing any old transformations here; this patch addresses the questions explained above.

@joker-eph @MaheshRavishankar Hope this clears it up. I think this should provide a clear answer why this pattern is being removed as a canonicalization, and also being deleted at the same time. Let me know if you need more clarification.
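
As a rough sketch of the interface-based path (an illustrative expectation, not verified output): running the resolve-shaped-type-result-dims pass mentioned in the PR description over a dim-of-expand_shape like the example above should reify the result size from the op's own output_shape operand rather than recompute it:

```
// Input (same shape of example as in the PR description):
//   %dim = tensor.dim %expanded, %c2 : tensor<?x1x?x5x1x8xf32>
// Expected after `mlir-opt --resolve-shaped-type-result-dims` (illustrative):
//   %dim is replaced by %sz1, the value already supplied in output_shape;
//   no floor_div is introduced and no information is lost.
```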

Comment on lines -1108 to +1114
// CHECK: #[[$MAP:.*]] = affine_map<()[s0] -> (s0 * 64)>
// CHECK-LABEL: @compose_expand_of_collapse_last_two_dims
// CHECK-SAME: %[[ARG0:.+]]: tensor<?x64x1xf32>
// CHECK: %[[CONSTANT0:.+]] = arith.constant 0 : index
// CHECK: %[[CONSTANT384:.+]] = arith.constant 384 : index
// CHECK: %[[CONSTANT0:.+]] = arith.constant 0 : index
// CHECK: %[[COLLAPSE:.+]] = tensor.collapse_shape %[[ARG0]] {{\[}}[0, 1, 2]] : tensor<?x64x1xf32> into tensor<?xf32>
// CHECK: %[[DIM:.+]] = tensor.dim %[[ARG0]], %[[CONSTANT0]] : tensor<?x64x1xf32>
// CHECK: %[[AFFAPPLY:.+]] = affine.apply #[[$MAP]]()[%[[DIM]]]
// CHECK: %[[DIVUI:.+]] = arith.divui %[[AFFAPPLY]], %[[CONSTANT384]] : index
// CHECK: %[[DIM:.+]] = tensor.dim %[[COLLAPSE]], %[[CONSTANT0]] : tensor<?xf32>
// CHECK: %[[DIVUI:.+]] = arith.divui %[[DIM]], %[[CONSTANT384]] : index

@Groverkss (Member):

@joker-eph For an example of non-local effects: the previous canonicalization pattern for collapse_shape was not moving collapse_shape to a more canonical form. It's actually not touching the collapse_shape at all; it is doing things around it.

It's very hard to say what the canonical form of tensor.dim should be here. There is no reason for the tensor.dim, when propagated up, to be more canonical. It can possibly introduce two tensor.dim operations if there were two dynamic dimensions before the collapse_shape. If there is a cost to a tensor.dim operation, you now pay that cost twice. We cannot really say whether one form is more canonical than the other.

@joker-eph (Collaborator):

In the current example, simplified, we have:

```
func.func @dim_expand_shape(%arg0: tensor<?x64x1xf32>) -> index {
  %c0 = arith.constant 0 : index
  %collapsed = tensor.collapse_shape %arg0 [[0, 1, 2]] : tensor<?x64x1xf32> into tensor<?xf32>
  %dim = tensor.dim %collapsed, %c0 : tensor<?xf32>
  return %dim : index
}
```

If I want to know something about the dim, it's just the first dimension of the collapse_shape result, and I now have to reason about it.

After canonicalization we have:

```
#map = affine_map<()[s0] -> (s0 * 64)>
module {
  func.func @dim_expand_shape(%arg0: tensor<?x64x1xf32>) -> index {
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg0, %c0 : tensor<?x64x1xf32>
    %0 = affine.apply #map()[%dim]
    return %0 : index
  }
}
```

The dimension computation is made explicit in terms of the input, and the tensor.collapse_shape even disappears.

There is no reason for the tensor.dim, when propagated up, to be more canonical.

It's not always obvious what the "best" canonical form should be, but that does not mean we can't pick one: often it comes down to picking one convention and sticking to it for consistency (I don't know if this is the right convention here, just saying).

It can possibly introduce 2 tensor.dim operations, if there were 2 dynamic dimensions before the collapse shape.

Here is the example you describe I believe:

```
func.func @dim_expand_shape(%arg0: tensor<?x?x64x1xf32>) -> index {
  %c0 = arith.constant 0 : index
  %collapsed = tensor.collapse_shape %arg0 [[0, 1, 2, 3]] : tensor<?x?x64x1xf32> into tensor<?xf32>
  %dim = tensor.dim %collapsed, %c0 : tensor<?xf32>
  return %dim : index
}
```

Becomes:

```
#map = affine_map<()[s0, s1] -> ((s0 * s1) * 64)>
module {
  func.func @dim_expand_shape(%arg0: tensor<?x?x64x1xf32>) -> index {
    %c1 = arith.constant 1 : index
    %c0 = arith.constant 0 : index
    %dim = tensor.dim %arg0, %c0 : tensor<?x?x64x1xf32>
    %dim_0 = tensor.dim %arg0, %c1 : tensor<?x?x64x1xf32>
    %0 = affine.apply #map()[%dim, %dim_0]
    return %0 : index
  }
}
```

As mentioned elsewhere, the cost of "dim" does not seem meaningful to me here. The question is rather whether this IR is more amenable to reasoning about the values, or about equivalence between values, in further analysis. Any argument along this line of reasoning would be very valuable to make a statement about whether this clearly belongs / does not belong in canonicalization.

Comment on lines -2140 to +2143
// CHECK: #[[$map:.*]] = affine_map<()[s0] -> (s0 floordiv 40)>
// CHECK-LABEL: func @dim_of_expand_shape(
// CHECK-SAME: %[[t:.*]]: tensor<?x?xf32>
// CHECK: %[[c1:.*]] = arith.constant 1 : index
// CHECK: %[[dim:.*]] = tensor.dim %[[t]], %[[c1]] : tensor<?x?xf32>
// CHECK: %[[apply:.*]] = affine.apply #[[$map]]()[%[[dim]]]
// CHECK: return %[[apply]]
// CHECK: %[[c2:.*]] = arith.constant 2 : index
// CHECK: %[[expanded:.*]] = tensor.expand_shape %[[t]] {{\[\[}}0], [1, 2, 3, 4, 5]] output_shape [%arg1, 1, %arg2, 5, 1, 8] : tensor<?x?xf32> into tensor<?x1x?x5x1x8xf32>
// CHECK: %[[dim:.*]] = tensor.dim %[[expanded]], %[[c2]] : tensor<?x1x?x5x1x8xf32>
// CHECK: return %[[dim]]

@Groverkss (Member):

@joker-eph For an example of the old pattern being wrong, notice that we already know the output shape. The result should just be %arg2 here! But instead it was doing some magic with affine.apply, which is wrong.

@Groverkss (Member):

We could implement a folder for this for tensor.dim, but that doesn't change the fact that this pattern is wrong, and doesn't even belong as a canonicalization to expand_shape.

@Groverkss (Member):

I left some inline comments on the test changes to back my reasoning.

@joker-eph (Collaborator):

These pattern are also wrong
Wrong how?
Wrong cause they are crashing.

OK, we had a vocabulary mismatch here; "wrong" confused me. Thanks for clarifying what you meant.

So I think we dont even need to get into canonicalization vs. not discussion.

You want to remove something done in canonicalization to make it not part of canonicalization: so while I sympathize, I don't quite see how you can avoid the conversation.

@joker-eph (Collaborator):

Thanks for elaborating @Groverkss!

So now, to Question 2, is the correct pattern, in whatever shape or form, a good canonicalization? For static cases, expand_shape already has a folder to move to a more canonical form, where the output_shape field in the op moves to a more canonical form (dynamic -> static). For dynamic cases, the answer to this question is No. The pattern is not a canonicalization for expand_shape at all! In it's correct form, it doesn't change the representation of expand_shape in any form, instead it is using information from expand_shape to bubble up the tensor.dim operations upwards. What I understand by "non-local" effects here is that it's doing transformations around the operation that is meant to be canonicalized, but not doing anything to the operation itself. So it's "non-local" to the operation being canonicalized. This is a clear indication that this pattern is not really a canonicalization.

I see two things here:

  1. "a pattern not touching the operation itself its not a canonicalization", I'm not sure where is this coming from? Idon't quite see why this should be criteria? This is actually something that is often done: you know you can simplify an op from another dialect when it consumes results from your own dialect, so you just load such a pattern in your op that actually affect another one (cf below as well wrt layering).
  2. The pattern is actually a DimOp canonicalization, this is in the signature: LogicalResult matchAndRewrite(DimOp dimOp. It is meant to simplify the dim operation, not the expand_shape operation.
    Where is the pattern loaded isn't very important: I'm not sure why this pattern is loaded here instead of in the dimOp for example, in general when we do this is just a dialect layering question (one dialect loads canonicalization for an op in another dialect)

Finally, The pattern is implemented in transformations that refine shape information through ReifyRankedShapedTypeOpInterface

But that does not run in canonicalization, right? If you wanted to just eliminate a duplicate implementation, you could call the implementation in ReifyRankedShapedTypeOpInterface from here to delegate to it.

I'm surprised you didn't mention the use of affine.apply: are there other patterns in the Tensor dialect that introduce affine.apply (or affine dialect constructs) "out of thin air"? It seems like these are the only two I can spot.

Would we be able to completely remove the dependency from the Tensor dialect to the affine dialect if these patterns are removed, or are there other things left right now?

@MaheshRavishankar (Contributor):

These pattern are also wrong
Wrong how?
Wrong cause they are crashing.

OK, we had a vocabulary mismatch here, "wrong" confused me, thanks for clarifying what you meant.

So I think we dont even need to get into canonicalization vs. not discussion.

You want to remove something done in canonicalization to make it not part of canonicalization: so while I sympathize, I don't quite see how you can avoid the conversation.

Let me understand what you are suggesting. Basically this pattern should never have been part of canonicalization, but it was. It is an outlier w.r.t. the designated in-tree way of resolving tensor.dim operations, and you want to "fix" this pattern to do exactly what the other pattern does. I don't see how that is a reasonable outcome here. I suggest you provide concrete details of how you want to proceed. Blocking changes without providing guidance on next steps is not a very productive position to take.

@joker-eph (Collaborator) commented Apr 10, 2025

Basically this pattern should never have been part of canonicalization,

Can you elaborate on why? You're stating a fact, but as I mentioned before, while you may very well be right (I don't know, you still haven't explained), it would be helpful if you could make the reasoning behind this explicit.

. Blocking changes without providing guidance of next steps is not a very productive position to take.

Sorry if you feel that way; I believe I provided next steps quite a few times above, but you may have missed them.
For example, I wrote this, which you didn't follow up on (unless I missed it?):

Maybe an example showing the effect of this canonicalization pattern that shouldn't be a canonicalization?

@joker-eph (Collaborator) commented Apr 10, 2025

I chatted with @Groverkss and we identified the reason these can't work well as canonicalizations. It isn't visible in the collapse_shape example because that one introduces a multiplication; however, it becomes very visible for the expand_shape operation, which introduces a floor_div. The problem is that this is information loss that can't easily be recovered: the expand_shape guarantees an "exact division", which apparently can't be expressed with affine.apply. Losing information is never OK in canonicalization.
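
A small worked illustration of that information loss, reusing the factor 40 from the expand_shape test in this PR (illustrative):

```
// The expand_shape guarantees an exact division:
//   tensor.dim(%t, 1) == 40 * tensor.dim(%expanded, 2)
// The removed pattern only keeps
//   affine.apply affine_map<()[s0] -> (s0 floordiv 40)>()[%src]
// where the divisibility fact (s0 mod 40 == 0) is dropped. As a consequence,
//   affine_map<()[s0] -> ((s0 floordiv 40) * 40)>
// can no longer be simplified back to s0, so later rewrites cannot converge
// to the simpler form that the original IR expressed directly.
```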

Please update the PR description accordingly, @vivekkhandelwal1, to something like:

Remove tensor.dim canonicalization patterns registered on tensor.expand_shape/tensor.collapse_shape.

These are problematic because the iterative application that locally resolves the tensor.dim operation introduces intermediate floor_div operations, which lose the information about the exact division that was carried out in the original IR, so the iterative algorithm can't converge towards the simplest form. Information loss is not acceptable for a canonicalization.

Resolving the dimOp can be achieved through the resolve-ranked-shaped-type-result-dims and resolve-shaped-type-result-dims passes.

@vivekkhandelwal1 changed the title from "[MLIR][Tensor] Remove FoldDimOf[Expand|Collapse]Shape Pattern" to "[MLIR][Tensor] Remove tensor.dim canonicalization patterns registered on tensor.expand_shape/tensor.collapse_shape" on Apr 10, 2025.
@vivekkhandelwal1 (Contributor, Author):

I chatted with @Groverkss and we identified the reason these can't work well as canonicalization. It isn't visible on collapse_shape example because they trigger a multiplication operation, however it becomes very visible on the expand_shape operation which introduces a floor_div. The problem is that this is an information loss that can't be just recovered easily. The expand_shape guarantees an "exact division", which can't be expressed with affine_apply apparently. Losing information is obviously never OK in canonicalization.

Please update the PR description accordingly please @vivekkhandelwal1, something like:

Remove tensor.dim canonicalization patterns registered on tensor.expand_shape/tensor.collapse_shape.
These are problematic because the iterative application that locally resolves the tensor.dim operation introduces intermediate floor_div operations, which lose the information about the exact division that was carried out in the original IR, so the iterative algorithm can't converge towards the simplest form. Information loss is not acceptable for a canonicalization.
Resolving the dimOp can be achieved through the resolve-ranked-shaped-type-result-dims and resolve-shaped-type-result-dims passes.

@joker-eph I have made the changes as asked. Thanks!

@qedawkins merged commit e377a5d into llvm:main on Apr 11, 2025 (11 checks passed).
@vivekkhandelwal1 deleted the expandshape-op branch on April 12, 2025 at 04:33.
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
… on tensor.expand_shape/tensor.collapse_shape (llvm#134219)

These are problematic because the iterative application that locally resolves the tensor.dim operation introduces intermediate floor_div, which is losing the information about the exact division that was carried out in the original IR, and the iterative algorithm can't converge towards the simplest form. Information loss is not acceptable for canonicalization.

Resolving the dimOp can be achieved through resolve-ranked-shaped-type-result-dims and resolve-shaped-type-result-dims passes.

---------

Signed-off-by: Vivek Khandelwal <[email protected]>