@@ -33,6 +33,11 @@ together.
Some important things to think about w.r.t. canonicalization patterns:
+ * The goal of canonicalization is to make subsequent optimizations more
+   effective. Therefore, performance improvements are not necessary for
+   canonicalization, but it is generally better to define canonicalization
+   patterns that do not harm performance.
+
* Pass pipelines should not rely on the canonicalizer pass for correctness.
They should work correctly with all instances of the canonicalization pass
removed.
@@ -51,6 +56,60 @@ Some important things to think about w.r.t. canonicalization patterns:
* It is always good to eliminate operations entirely when possible, e.g. by
folding known identities (like "x + 0 = x").
+ * Canonicalization isn't a great place to put patterns with expensive
+   compile time (i.e. patterns with O(n) complexity) or complicated cost
+   models.
+
+ * Canonicalization shouldn't drop the semantics of the original operation.
+
+   For example, a pattern that transforms
+
+   ```
+   %res = vector.transpose %0, [1, 0] : vector<nx1x<eltty>> to vector<1xnx<eltty>>
+   ```
+
+   to
+
+   ```
+   %res = vector.shape_cast %0 : vector<nx1x<eltty>> to vector<1xnx<eltty>>
+   ```
+
+   is not a good canonicalization pattern because it drops the transpose
+   semantics.
+
+   A pattern that transforms (where the `linalg.transpose` is the only use
+   of `%broadcast`)
+
+   ```
+   %broadcast = linalg.broadcast
+       ins(%input : tensor<2x4x5xf32>)
+       outs(%init1 : tensor<1x2x3x4x5x6xf32>)
+       dimensions = [0, 2, 5]
+   %transpose = linalg.transpose
+       ins(%broadcast : tensor<1x2x3x4x5x6xf32>)
+       outs(%init2 : tensor<1x6x2x3x5x4xf32>)
+       permutation = [0, 5, 1, 2, 4, 3]
+   ```
+
+   to
+
+   ```
+   %transpose = linalg.transpose
+       ins(%input : tensor<2x4x5xf32>)
+       outs(%tmp_init : tensor<2x5x4xf32>)
+       permutation = [0, 2, 1]
+   %broadcast = linalg.broadcast
+       ins(%transpose : tensor<2x5x4xf32>)
+       outs(%init2 : tensor<1x6x2x3x5x4xf32>)
+       dimensions = [0, 3, 1]
+   ```
+
+   is a good canonicalization pattern because:
+
+   1. This pattern converges.
+   2. This pattern always transforms the program towards reducing the amount
+      of computational data, which forms a clear lattice.
+   3. This is not a one-off pattern; new matches may be generated during the
+      application process.
+
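+ To make these guidelines concrete, a canonicalization like the `linalg`
+ rewrite above is typically written as an `OpRewritePattern` and registered
+ through the op's `getCanonicalizationPatterns` hook. The following is a
+ minimal sketch modeled on the Toy tutorial, not code from the tree:
+ `TransposeOp` is a hypothetical op assumed to be a simple 2-D transpose (so
+ transposing twice is the identity), while the pattern and registration APIs
+ are MLIR's:
+
+ ```c++
+ #include "mlir/IR/PatternMatch.h"
+
+ using namespace mlir;
+
+ /// Fold transpose(transpose(x)) back to x. The rewrite only removes
+ /// operations, preserves semantics, and trivially converges.
+ struct SimplifyRedundantTranspose : OpRewritePattern<TransposeOp> {
+   using OpRewritePattern<TransposeOp>::OpRewritePattern;
+
+   LogicalResult matchAndRewrite(TransposeOp op,
+                                 PatternRewriter &rewriter) const override {
+     // Match only when the input is itself produced by a transpose.
+     auto producer = op.getOperand().getDefiningOp<TransposeOp>();
+     if (!producer)
+       return failure();
+     // Replace the pair of transposes with the original value.
+     rewriter.replaceOp(op, producer.getOperand());
+     return success();
+   }
+ };
+
+ /// Hook the canonicalizer pass uses to collect an op's patterns.
+ void TransposeOp::getCanonicalizationPatterns(RewritePatternSet &results,
+                                               MLIRContext *context) {
+   results.add<SimplifyRedundantTranspose>(context);
+ }
+ ```
+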
## Globally Applied Rules
These transformations are applied to all levels of IR:
@@ -189,7 +248,7 @@ each of the operands, returning the corresponding constant attribute. These
operands are those that implement the `ConstantLike` trait. If any of the
operands are non-constant, a null `Attribute` value is provided instead. For
example, if MyOp provides three operands [`a`, `b`, `c`], but only `b` is
- constant then `adaptor` will return Attribute() for `getA()` and `getC()`,
+ constant then `adaptor` will return Attribute() for `getA()` and `getC()`,
and b-value for `getB()`.
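
To illustrate, here is a hedged sketch of a fold hook that uses the adaptor
as described, folding the "x + 0 = x" identity mentioned earlier. `MyAddOp`
is a hypothetical binary op with operands `lhs`/`rhs`; the `FoldAdaptor` and
`OpFoldResult` machinery is MLIR's:

```c++
#include "mlir/IR/BuiltinAttributes.h" // IntegerAttr

using namespace mlir;

/// Fold "x + 0 = x". The adaptor getters return the operand's constant
/// Attribute, or a null Attribute when the operand is not ConstantLike.
OpFoldResult MyAddOp::fold(FoldAdaptor adaptor) {
  // getRhs() yields a null Attribute unless the rhs operand is constant.
  if (auto rhs = llvm::dyn_cast_if_present<IntegerAttr>(adaptor.getRhs()))
    if (rhs.getValue().isZero())
      return getLhs(); // Fold to the existing SSA value for x.
  return {}; // Null OpFoldResult: nothing was folded.
}
```
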
Also above is the use of `OpFoldResult`. This class represents the possible