-
Notifications
You must be signed in to change notification settings - Fork 14.4k
[MLIR][SCF] Fix LoopPeelOp documentation (NFC) #113179
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-mlir Author: Hugo Trachino (nujaa) ChangesThere is a mistake in the SCF documentation explaining the order of results of PeelOp in the case where
Gives :
Full diff: https://github.com/llvm/llvm-project/pull/113179.diff 2 Files Affected:
diff --git a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
index 20880d94a83cac..d13c1ba9a8ed1e 100644
--- a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
+++ b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
@@ -146,7 +146,7 @@ def LoopPeelOp : Op<Transform_Dialect, "loop.peel",
let summary = "Peels the first or last iteration of the loop";
let description = [{
Rewrite the given loop with a main loop and a partial (first or last) loop.
- When the `peelFront` option is set as true, the first iteration is peeled off.
+ When the `peelFront` option is set to true, the first iteration is peeled off.
Otherwise, updates the given loop so that its step evenly divides its range and puts
the remaining iteration into a separate loop or a conditional.
@@ -158,12 +158,12 @@ def LoopPeelOp : Op<Transform_Dialect, "loop.peel",
This operation ignores non-scf::ForOp ops and drops them in the return.
When `peelFront` is true, this operation returns two scf::ForOp Ops, the
- first scf::ForOp corresponds to the first iteration of the loop which can
- be canonicalized away in the following optimization. The second loop Op
- contains the remaining iteration, and the new lower bound is the original
- lower bound plus the number of steps.
+ first scf::ForOp corresponds to the target loop, whose lower bound has
+ been updated to the original lower bound plus the step. The second result
+ is the first iteration of the loop which can be canonicalized away in the
+ following optimization.
- When `peelFront` is not true, this operation returns two scf::ForOp Ops, with the first
+ When `peelFront` is false, this operation returns two scf::ForOp Ops, with the first
scf::ForOp satisfying: "the loop trip count is divisible by the step".
The second loop Op contains the remaining iteration. Note that even though the
Payload IR modification may be performed in-place, this operation consumes
diff --git a/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp b/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
index a30e349d49136c..c06a448c3108c3 100644
--- a/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
@@ -206,12 +206,11 @@ LogicalResult mlir::scf::peelForLoopAndSimplifyBounds(RewriterBase &rewriter,
return success();
}
-/// When the `peelFront` option is set as true, the first iteration of the loop
-/// is peeled off. This function rewrites the original scf::ForOp as two
-/// scf::ForOp Ops, the first scf::ForOp corresponds to the first iteration of
-/// the loop which can be canonicalized away in the following optimization. The
-/// second loop Op contains the remaining iteration, and the new lower bound is
-/// the original lower bound plus the number of steps.
+/// Rewrites the original scf::ForOp as two scf::ForOp Ops, the first scf::ForOp
+/// corresponds to the first iteration of the loop which can be canonicalized
+/// away in the following optimization. The second loop Op contains the
+/// remaining iteration, and the new lower bound is the original lower bound
+/// plus the number of steps.
LogicalResult mlir::scf::peelForLoopFirstIteration(RewriterBase &b, ForOp forOp,
ForOp &firstIteration) {
RewriterBase::InsertionGuard guard(b);
|
@llvm/pr-subscribers-mlir-scf Author: Hugo Trachino (nujaa) ChangesThere is a mistake in the SCF documentation explaining the order of results of PeelOp in the case where
Gives :
Full diff: https://github.com/llvm/llvm-project/pull/113179.diff 2 Files Affected:
diff --git a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
index 20880d94a83cac..d13c1ba9a8ed1e 100644
--- a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
+++ b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
@@ -146,7 +146,7 @@ def LoopPeelOp : Op<Transform_Dialect, "loop.peel",
let summary = "Peels the first or last iteration of the loop";
let description = [{
Rewrite the given loop with a main loop and a partial (first or last) loop.
- When the `peelFront` option is set as true, the first iteration is peeled off.
+ When the `peelFront` option is set to true, the first iteration is peeled off.
Otherwise, updates the given loop so that its step evenly divides its range and puts
the remaining iteration into a separate loop or a conditional.
@@ -158,12 +158,12 @@ def LoopPeelOp : Op<Transform_Dialect, "loop.peel",
This operation ignores non-scf::ForOp ops and drops them in the return.
When `peelFront` is true, this operation returns two scf::ForOp Ops, the
- first scf::ForOp corresponds to the first iteration of the loop which can
- be canonicalized away in the following optimization. The second loop Op
- contains the remaining iteration, and the new lower bound is the original
- lower bound plus the number of steps.
+ first scf::ForOp corresponds to the target loop, whose lower bound has
+ been updated to the original lower bound plus the step. The second result
+ is the first iteration of the loop which can be canonicalized away in the
+ following optimization.
- When `peelFront` is not true, this operation returns two scf::ForOp Ops, with the first
+ When `peelFront` is false, this operation returns two scf::ForOp Ops, with the first
scf::ForOp satisfying: "the loop trip count is divisible by the step".
The second loop Op contains the remaining iteration. Note that even though the
Payload IR modification may be performed in-place, this operation consumes
diff --git a/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp b/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
index a30e349d49136c..c06a448c3108c3 100644
--- a/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/LoopSpecialization.cpp
@@ -206,12 +206,11 @@ LogicalResult mlir::scf::peelForLoopAndSimplifyBounds(RewriterBase &rewriter,
return success();
}
-/// When the `peelFront` option is set as true, the first iteration of the loop
-/// is peeled off. This function rewrites the original scf::ForOp as two
-/// scf::ForOp Ops, the first scf::ForOp corresponds to the first iteration of
-/// the loop which can be canonicalized away in the following optimization. The
-/// second loop Op contains the remaining iteration, and the new lower bound is
-/// the original lower bound plus the number of steps.
+/// Rewrites the original scf::ForOp as two scf::ForOp Ops, the first scf::ForOp
+/// corresponds to the first iteration of the loop which can be canonicalized
+/// away in the following optimization. The second loop Op contains the
+/// remaining iteration, and the new lower bound is the original lower bound
+/// plus the number of steps.
LogicalResult mlir::scf::peelForLoopFirstIteration(RewriterBase &b, ForOp forOp,
ForOp &firstIteration) {
RewriterBase::InsertionGuard guard(b);
|
first scf::ForOp corresponds to the first iteration of the loop which can | ||
be canonicalized away in the following optimization. The second loop Op | ||
contains the remaining iteration, and the new lower bound is the original | ||
lower bound plus the number of steps. | ||
first scf::ForOp corresponds to the target loop, whose lower bound has | ||
been updated to the original lower bound plus the step. The second result | ||
is the first iteration of the loop which can be canonicalized away in the | ||
following optimization. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, the original comment is almost correct ("remaining iteration" -> "remaining iterations" and "new lower bound is the original lower bound plus the number of steps" -> "new lower bound is the original lower bound plus the step"). What am I missing?
The newly added paragraph is not clear to me:
lower bound has been updated to the original lower bound plus the step
Are you sure? The lower bound doesn't change. In general, the lower bound of the "first" loop is equal to the lower bound of the original loop. That's one thing that peelFront
won't change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I can understand my suggestion is unclear i'll try to come up with something better. What is originally wrong is the order of the results of the transformOp when peel_front is true.
The transform op returns 2 forops like :
%iteration1_to_n, %iteration_0= transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">)
Whereas the original description suggested %iteration0, %iteration_1_to_n
New suggestion :
When peelFront
is true, this operation returns two scf::ForOp Ops, the
first scf::ForOp executes all iterations of the target loop but the first one. The second result is another scf::ForOp corresponding to the first iteration of the loop which can
be canonicalized away in the following optimizations. Although not following the execution order, the results are kept in this order to keep the [mainloop, remainder]
format of PeelOp general case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The transform op returns 2 forops like :
%iteration1_to_n, %iteration_0= transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">)
Whereas the original description suggested %iteration0, %iteration_1_to_n
Yeah, that's clearly not correct, thanks for the example!
When peelFront is true, this operation returns two scf::ForOp Ops
The op returns two loops regardless of peelFront
. How about something like this (two avoid repeating that there are two results):
The op returns two loops, the peeled loop and the remainder loop.
When
peelFront
is true, the first scf::ForOp executes all iterations of the target loop but the first one (i.e. the remainder loop). The second scf::ForOp corresponds to the first iteration of the loop which can
be canonicalized away in the following optimizations (the peeled loop).When
peelFront
is false, ...
So basically what you are saying, but a bit shorter and trying to clearly identify the peeled and the remainder loop. Hope I got my terminology right 😅
Also, apologies for the delay. I am travelling for LLVM Dev this week and trying to prioritise F2F interactions over GitHub. I really appreciate you improving this!
The op returns two loops, the peeled loop whose tripcount is divisible by | ||
the step and the remainder loop. | ||
|
||
When `peelFront` is true, the first scf::ForOp (the remainder loop) | ||
executes all iterations of the target loop but the first one. The second | ||
scf::ForOp corresponds to the first iteration of the loop which can be | ||
canonicalized away in the following optimizations (the peeled loop). | ||
|
||
When `peelFront` is false, the first result (peeled loop) is "the loop | ||
with the highest upper bound that is divisible by the step". The second | ||
loopOp (remainder loop) contains the remaining iterations. Note that even | ||
though the Payload IR modification may be performed in-place, this | ||
operation consumes the operand handle and produces a new one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a massive improvement, thanks! This is already so much clearer 🤩
I have a few small suggestions.
- "first scf:ForOp" vs "first result" -> let's stick to one style
- "Note that even though the Payload IR modification may be performed in-place ..." -> IIUC, this applies to both paragraphs (i.e. when
peelFront
is true andfalse
?) So it should probably be a separate paragraph. - punctuation.
The op returns two loops, the peeled loop whose tripcount is divisible by | |
the step and the remainder loop. | |
When `peelFront` is true, the first scf::ForOp (the remainder loop) | |
executes all iterations of the target loop but the first one. The second | |
scf::ForOp corresponds to the first iteration of the loop which can be | |
canonicalized away in the following optimizations (the peeled loop). | |
When `peelFront` is false, the first result (peeled loop) is "the loop | |
with the highest upper bound that is divisible by the step". The second | |
loopOp (remainder loop) contains the remaining iterations. Note that even | |
though the Payload IR modification may be performed in-place, this | |
operation consumes the operand handle and produces a new one. | |
The op returns two loops: the peeled loop, which has trip count divisible by | |
the step, and the remainder loop. | |
When `peelFront` is true, the first result (remainder loop) | |
executes all but the first iteration of the target loop. The second | |
result (peeled loop) corresponds to the first iteration of the target loop, which can | |
later be canonicalized away in the following optimizations. | |
When `peelFront` is false, the first result (peeled loop) is the portion of the target | |
loop with the highest upper bound that is divisible by the step. The second | |
result (remainder loop) contains the remaining iterations. | |
Note that although the Payload IR modification may be done | |
in-place, this operation consumes the operand handle and produces | |
a new one. |
/// Rewrites the original scf::ForOp as two scf::ForOp Ops, the first scf::ForOp | ||
/// corresponds to the first iteration of the loop which can be canonicalized | ||
/// away in the following optimization. The second loop Op contains the | ||
/// remaining iteration, and the new lower bound is the original lower bound |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// remaining iteration, and the new lower bound is the original lower bound | |
/// remaining iterations, and the new lower bound is updated as the original lower bound |
/// corresponds to the first iteration of the loop which can be canonicalized | ||
/// away in the following optimization. The second loop Op contains the | ||
/// remaining iteration, and the new lower bound is the original lower bound | ||
/// plus the number of steps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// plus the number of steps. | |
/// plus one (i.e. skips the first iteration). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for the improvements and for bearing with me 🙏🏻
Co-authored-by: Andrzej Warzyński <[email protected]>
As an example, I added annotations to the peel_front unit test. ``` func.func @loop_peel_first_iter_op() { // CHECK: %[[C0:.+]] = arith.constant 0 // CHECK: %[[C41:.+]] = arith.constant 41 // CHECK: %[[C5:.+]] = arith.constant 5 // CHECK: %[[C5_0:.+]] = arith.constant 5 // CHECK: scf.for %{{.+}} = %[[C0]] to %[[C5_0]] step %[[C5]] // CHECK: arith.addi // CHECK: scf.for %{{.+}} = %[[C5_0]] to %[[C41]] step %[[C5]] // CHECK: arith.addi %0 = arith.constant 0 : index %1 = arith.constant 41 : index %2 = arith.constant 5 : index scf.for %i = %0 to %1 step %2 { arith.addi %i, %i : index } return } module attributes {transform.with_named_sequence} { transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) { %0 = transform.structured.match ops{["arith.addi"]} in %arg1 : (!transform.any_op) -> !transform.any_op %1 = transform.get_parent_op %0 {op_name = "scf.for"} : (!transform.any_op) -> !transform.op<"scf.for"> %main_loop, %remainder = transform.loop.peel %1 {peel_front = true} : (!transform.op<"scf.for">) -> (!transform.op<"scf.for">, !transform.op<"scf.for">) transform.annotate %main_loop "main_loop" : !transform.op<"scf.for"> transform.annotate %remainder "remainder" : !transform.op<"scf.for"> transform.yield } } ``` Gives : ``` func.func @loop_peel_first_iter_op() { %c0 = arith.constant 0 : index %c41 = arith.constant 41 : index %c5 = arith.constant 5 : index %c5_0 = arith.constant 5 : index scf.for %arg0 = %c0 to %c5_0 step %c5 { %0 = arith.addi %arg0, %arg0 : index } {remainder} // The first iteration loop (second result) has been annotated remainder scf.for %arg0 = %c5_0 to %c41 step %c5 { %0 = arith.addi %arg0, %arg0 : index } {main_loop} // The main loop (first result) has been annotated main_loop return } ``` --------- Co-authored-by: Andrzej Warzyński <[email protected]>
There is a mistake in the SCF documentation explaining the order of results of PeelOp in the case where
peel_front
is true.As an example, I added annotations to the peel_front unit test.
Gives :