[mlir][docs] Update documentation for canonicalize. #99753
Conversation
@llvm/pr-subscribers-mlir
Author: donald chen (cxy-1993)
Changes: Update canonicalize docs.
Full diff: https://github.com/llvm/llvm-project/pull/99753.diff
1 file affected:
diff --git a/mlir/docs/Canonicalization.md b/mlir/docs/Canonicalization.md
index d1cba572af212..f4c4edcefb115 100644
--- a/mlir/docs/Canonicalization.md
+++ b/mlir/docs/Canonicalization.md
@@ -33,6 +33,10 @@ together.
Some important things to think about w.r.t. canonicalization patterns:
+* The goal of canonicalization is to make subsequent analyses and
+ optimizations more effective. Therefore, performance improvements are not
+ necessary for canonicalization.
+
* Pass pipelines should not rely on the canonicalizer pass for correctness.
They should work correctly with all instances of the canonicalization pass
removed.
@@ -51,6 +55,61 @@ Some important things to think about w.r.t. canonicalization patterns:
* It is always good to eliminate operations entirely when possible, e.g. by
folding known identities (like "x + 0 = x").
+* Canonicalization isn't a great place to put patterns with expensive running
+ time (i.e., those with O(n) complexity) or complicated cost models.
+
+* Canonicalization shouldn't lose the semantics of the original operation: the
+ original information should always be recoverable from the transformed IR.
+
+For example, a pattern that transforms
+
+```
+ %0 = tensor.insert_slice %slice into
+ %x[0, 0, 0, 0, 0][1, 1, 1, 16, 32][1, 1, 1, 1, 1] :
+ tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
+```
+
+to
+
+```
+ %0 = tensor.expand_shape %slice[[0,1,2,3], [4]] :
+ tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
+```
+
+is not a good canonicalization pattern because it loses the destination-style
+semantics.
+
+
+A pattern that transforms (assuming the linalg.transpose is the only user of %broadcast)
+
+```
+ %broadcast = linalg.broadcast
+ ins(%input : tensor<2x4x5xf32>)
+ outs(%init1 : tensor<1x2x3x4x5x6xf32>)
+ dimensions = [0, 2, 5]
+ %transpose = linalg.transpose
+ ins(%broadcast : tensor<1x2x3x4x5x6xf32>)
+ outs(%init2 : tensor<1x6x2x3x5x4xf32>)
+ permutation = [0, 5, 1, 2, 4, 3]
+```
+
+to
+
+```
+ %transpose = linalg.transpose
+ ins(%input : tensor<2x4x5xf32>)
+ outs(%tmp_init : tensor<2x5x4xf32>)
+ permutation = [0, 2, 1]
+ %broadcast = linalg.broadcast
+ ins(%transpose : tensor<2x5x4xf32>)
+ outs(%init2 : tensor<1x6x2x3x5x4xf32>)
+ dimensions = [0, 1, 3]
+```
+
+is a good canonicalization pattern because it always transforms the program
+towards reducing the amount of data being computed on, while preserving the
+semantics of the original operation.
+
## Globally Applied Rules
These transformations are applied to all levels of IR:
@@ -189,7 +248,7 @@ each of the operands, returning the corresponding constant attribute. These
operands are those that implement the `ConstantLike` trait. If any of the
operands are non-constant, a null `Attribute` value is provided instead. For
example, if MyOp provides three operands [`a`, `b`, `c`], but only `b` is
-constant then `adaptor` will return Attribute() for `getA()` and `getC()`,
+constant then `adaptor` will return Attribute() for `getA()` and `getC()`,
and b-value for `getB()`.
Also above, is the use of `OpFoldResult`. This class represents the possible
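To make the identity-folding bullet quoted in the diff context above concrete, a rewrite like "x + 0 = x" looks roughly as follows in IR. This is only an illustrative sketch using the arith dialect and made-up SSA names; it is not part of the patch.

```
// Before: %sum adds a known zero to %x.
%zero = arith.constant 0 : i32
%sum = arith.addi %x, %zero : i32

// After folding the identity "x + 0 = x": every use of %sum is replaced with
// %x, and both operations above become dead and can be erased.
```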
This PR is a resubmit of #97063 with the comments modified.
mlir/docs/Canonicalization.md
Outdated
is not a good canonicalize pattern because it lose the destination style
semantic.
I don't know what "lose the destination style semantic" means, nor why it would be a bad canonicalization.
I think we should stick to very obvious examples here.
This example was probably chosen because it could trip up the bufferization analysis (and cause less efficient codegen). Destination-style was originally introduced to help with bufferization and I think there is currently no consensus on whether such "hints"/IR structures should be preserved by canonicalization.
The reader of this section may not know about destination style. Maybe a better example: rewriting a multiplication into a sequence of additions?
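For concreteness, the multiplication-to-additions rewrite suggested above could look like the sketch below (purely illustrative, using arith ops and made-up SSA names; not text from the patch). Presumably it is meant as an example of a rewrite that does not belong in canonicalization, since it obscures the multiplication and is hard for later passes to undo.

```
// Before: multiply %x by a small constant.
%c3 = arith.constant 3 : i32
%prod = arith.muli %x, %c3 : i32

// After the hypothetical rewrite: the same value computed as repeated
// additions. The fact that this was a multiplication is no longer obvious.
%t0 = arith.addi %x, %x : i32
%t1 = arith.addi %t0, %x : i32
```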
mlir/docs/Canonicalization.md
Outdated
semantic.

A pattern that transform (linalg.transpose is only use of %broadcast)
(linalg.transpose is only use of %broadcast)
This seems to indicate that the canonicalization needs to take into account the number of users; however, this does not seem like a generally good guideline.
I think, again, this example just isn't obvious enough to be in the general canonicalization documentation.
This pattern would work even if %broadcast has multiple users, by creating a new broadcast instead of modifying the existing one in-place. If the old broadcast actually had only a single user, it would then DCE away. If not, we now have two broadcasts (which may not be a bad thing).
Swapping patterns/bubble-up patterns/bubble-down patterns are a common class of patterns. It would be nice to clarify whether they are suitable for canonicalization.
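To illustrate the multi-user case, here is a sketch that reuses the shapes from the patch. The second transpose (%other) and the extra init tensors are assumptions added for illustration; they are not part of the original example.

```
// Before: %broadcast has two users.
%broadcast = linalg.broadcast
    ins(%input : tensor<2x4x5xf32>)
    outs(%init1 : tensor<1x2x3x4x5x6xf32>)
    dimensions = [0, 2, 5]
%transpose = linalg.transpose
    ins(%broadcast : tensor<1x2x3x4x5x6xf32>)
    outs(%init2 : tensor<1x6x2x3x5x4xf32>)
    permutation = [0, 5, 1, 2, 4, 3]
%other = linalg.transpose
    ins(%broadcast : tensor<1x2x3x4x5x6xf32>)
    outs(%init3 : tensor<6x5x4x3x2x1xf32>)
    permutation = [5, 4, 3, 2, 1, 0]

// After rewriting only %transpose: a new transpose/broadcast pair operates on
// the small tensor, and the original %broadcast is left in place for %other.
// If %other did not exist, %broadcast would become dead and be removed.
%small_transpose = linalg.transpose
    ins(%input : tensor<2x4x5xf32>)
    outs(%tmp_init : tensor<2x5x4xf32>)
    permutation = [0, 2, 1]
%new_broadcast = linalg.broadcast
    ins(%small_transpose : tensor<2x5x4xf32>)
    outs(%init2 : tensor<1x6x2x3x5x4xf32>)
    dimensions = [0, 1, 3]
```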
mlir/docs/Canonicalization.md
Outdated
* Canonicalization isn't a great place to put pattens with expensive running
time (i.e. have O(n) complexity) or complicated cost models.
Suggested change:
- * Canonicalization isn't a great place to put pattens with expensive running
-   time (i.e. have O(n) complexity) or complicated cost models.
+ * Patterns with expensive running time (i.e. have O(n) complexity) or complicated cost models
+   don't belong to canonicalization: since the algorithm is executed iteratively until fixed-point,
+   we want patterns that execute quickly (in particular their matching phase).
Force-pushed from bc3fbd0 to 28926a3.
Initially, I wanted to use some less clear examples to express our preferences. After all, it is natural to use patterns like x + 0 -> x for canonicalization without the need for further emphasis. However, I also respect your opinion that such patterns fall within a gray area and should not be used as examples. I have made changes based on your suggestions; please review them again. @joker-eph @matthias-springer
Update canonicalize docs.