-
Notifications
You must be signed in to change notification settings - Fork 10.5k
SESE: update LoopInfo incrementally when cloning a loop body with nested loops. #20167
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SESE: update LoopInfo incrementally when cloning a loop body with nested loops. #20167
Conversation
@swift-ci please test tensorflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code structure looks good from a quick scan.
But I lost some larger context -- can you please extend PR description by elaborating on what "preserve the nesting structure" means? It might be useful to motivate this PR with an example, and describe how this PR helps in that context.
// the body of a loop during canonicalization. | ||
//expected-warning @+1 {{value implicitly copied to the host}} | ||
public func testLoopWithNestedLoopsRequiringCloning( | ||
_ breakIndex:Int32, _ repetitions: Int32) -> Tensor<Int32> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace tab with space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
// CHECK: Loop at depth 3 containing: {{.*}} | ||
// CHECK: Loop at depth 4 containing: {{.*}} | ||
|
||
// CHECK: --- XLA CFG Loops Before Canonicalize end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we instead CHECK for // CHECK-LABEL: --- XLA CFG Loops After Canonicalize:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe CHECK-LABEL is better here. As far as I understand, CHECK-LABEL is used for breaking checks into logical blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. My question is whether we want to check for After
instead of Before Canonicalize end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, it should After
.
// CHECK: Loop at depth 2 containing: {{.*}} | ||
// CHECK: Loop at depth 3 containing: {{.*}} | ||
|
||
//-- Check the structure of the SESE region. (Note the cloned doubly nested loop.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this CFG below is very complex and I did not review it closely.
What are the key parts that we want to check and make sure we get right, and is it possible to simplify the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the simplest example that I could come up with.
Basically, we are checking a couple of things:
- The LoopInfo is correct before and after the canonicalization. We only check the loop nesting structure though.
- The clone loop body is structurally correct. I added a comment pointing to the loop that is being cloned in the test. Hope that improves the readability a bit?
/// fragment into the parent loop. If `startBlock` is the same as the header | ||
/// of `loop`, we clone the entire loop including the back edge. Otherwise, | ||
/// we clone one iteration of the loop body without the back edge. | ||
SILLoop *cloneLoop(SILLoopInfo *LI, SILLoop *loop, SILBasicBlock *startBlock) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from the call site below, it seems startBlock
is always header
. Why do we need the other code path?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other code path will be used in a subsequent PR that I am preparing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mhong has a valid point
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this use case, we are only cloning part of the loop. The header
that is being passed at the call-site is the original header of the loop before canonicalization. After canonicalization, the loop will look like this:
new_header:
cond_br stayInLoop header loop_exit
header:
...
br latch
latch:
...
br new_header
exit:
...
We are only cloning the part of the loop reachable from header
and the result would look something like this (primed names are the clones):
header':
...
br latch'
latch':
...
br new_header
new_header:
cond_br stayInLoop header loop_exit
header:
...
br latch
latch:
...
br new_header
exit:
...
In the other path, we will pass in new_header
and thereby clone the entire loop. Will send out that PR shortly.
I have the same question. What does "preserve loop structure" mean? Does it simply mean that LoopInfo will be incrementally updating during cloning? |
Yes, LoopInfo is incrementally updated during cloning. I have changed the PR description as it is much better. :) |
0224b15
to
226cde3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the additional context/discussion!
@@ -189,6 +198,18 @@ namespace { | |||
SILBasicBlock *endBB); | |||
void processLoop(SILLoop *loop); | |||
void ensureSingleExitFromLoops(); | |||
|
|||
// Dump top-level loop information for debugging purposes. | |||
void dumpTopLevelLoopInfo(llvm::raw_ostream* outs, const char* stage) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be useful to have a routine that dumps loop info for a specific loop? this way after transforming one loop, we can just dump out it's structure before and after.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can already do that with loop->dump() or loop->print(...)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be useful for debugging, if we dump the loop info out before and after the transformation, for each transformation step?
// CHECK: Loop at depth 1 containing: {{.*}} | ||
// CHECK: Loop at depth 2 containing: {{.*}} | ||
// CHECK: Loop at depth 3 containing: {{.*}} | ||
// CHECK: Loop at depth 2 containing: {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there another nested "Loop at depth 3 containing" that we should also CHECK here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. I think the confusion comes from the use of the term "cloning" here. I am actually referring to the unrolling the body of the loop at depth 2. When we nest the inner loop at depth 3 into the outer loop at depth 1, we get a loop of depth 2.
I have replaced uses of cloning with unrolling. Hope that clarifies things a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand now. Thanks.
//-- Check the structure of the SESE region. (Note the cloned loop.) | ||
// CHECK-LABEL: --- XLA CFG Canonicalize: {{.*}}LoopWithNestedLoopsRequiringCloning{{.*}} | ||
// CHECK: [sequence | ||
// CHECK: <while Preheader: {{.*}}, Header: {{.*}}, exit: {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this correspond to "Loop at depth 2" above? I didn't find 3 levels of loop nesting in the CFG here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above.
// CHECK: block {{.*}} | ||
// CHECK: block {{.*}}}]} | ||
// ---The body of this loop is cloned above--- | ||
// CHECK: <while Preheader: {{.*}}, Header: {{.*}}, exit: {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the indentation here match that of the line 58 above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the indentation is correct. This loop is nested in the while-loop at line 58.
// CHECK: Loop at depth 2 containing: {{.*}} | ||
// CHECK: Loop at depth 3 containing: {{.*}} | ||
// CHECK: Loop at depth 4 containing: {{.*}} | ||
// CHECK: Loop at depth 2 containing: {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we are cloning the loop at depth 2 -- consider adding a comment at the source code test case above, pointing out which loop this corresponds to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Done.
//-- Check the structure of the SESE region. (Note the cloned doubly nested loop.) | ||
// CHECK-LABEL: --- XLA CFG Canonicalize: {{.*}}LoopWithDoublyNestedLoopsRequiringCloning{{.*}} | ||
// CHECK: [sequence | ||
// CHECK: <while Preheader: {{.*}}, Header: {{.*}}, exit: {{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar questions to above:
- does this loop correspond to "loop at depth 2", not 1?
- should line 180 be indented to match the indentation here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the related comments above.
… unrolling and cloning.
Thanks for the comments, Ming. I think the source of confusion in interpreting tests is that I was using the term cloning instead of unrolling. Hope the new changes make it more readable. I will go ahead and merge it so that I can send out the subsequent PRs. |
When a loop contains nested loops and is cloned, the LoopInfo becomes stale. e.g.,
After cloning, we have the following:
Note that we have a new top-level loop
loop 2'
. This PR makes sure that LoopInfo is incrementally updated for such cases.