Commit e8d5db2

aleks-tmb and Aleksandr Popov authored
[LoopPeeling] Fix weights updating of peeled off branches (#70094)
In https://reviews.llvm.org/D64235 a new algorithm was introduced for updating the branch weights of latch blocks and their copies. It increases the probability of going to the exit block with each successive peeled iteration, calculating the weights as (F - I * E, E), where:
- F is the weight of the edge from latch to header.
- E is the weight of the edge from latch to exit.
- I is the number of the peeling iteration.

E.g.: Let's say the latch branch weights are (100,300) and the estimated trip count is 4. If we peel off all 4 iterations, the weights of the copied branches will be:
0: (100,300)
1: (100,200)
2: (100,100)
3: (100,1)

https://godbolt.org/z/93KnoEsT6

So we make the original loop almost unreachable from the 3rd peeled copy according to the profile data. But that's only true if the profiling data is accurate. An underestimated trip count can lead to performance issues with the register allocator, which may decide to spill intervals inside the loop, assuming it's unreachable.

Since we don't know how accurate the profiling data is, it seems better to set neutral 1/1 weights on the last peeled latch branch. After this change, the weights in the example above will look like this:
0: (100,300)
1: (100,200)
2: (100,100)
3: (100,100)

Co-authored-by: Aleksandr Popov <[email protected]>
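The example in the message can be replayed with a small stand-alone model. This is only a sketch, not the LLVM code: `stepOld`, `stepNew` and `peeledHeaderWeights` are hypothetical names, and it assumes the pairs in the example are written as (exit weight, header weight), so that only the second component changes from one peeled copy to the next.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one update step; this is not the LLVM function.
// Old rule: subtract the exit weight, falling back to 1 when nothing is left.
uint32_t stepOld(uint32_t weight, uint32_t sub) {
  assert(sub != 0 && "the patched code skips entries with a zero SubWeight");
  return weight > sub ? weight - sub : 1;
}

// New rule: same subtraction, but never drop below the exit weight (1:1).
uint32_t stepNew(uint32_t weight, uint32_t sub) {
  assert(sub != 0 && "the patched code skips entries with a zero SubWeight");
  return weight > sub ? std::max(weight - sub, sub) : sub;
}

// Header weight attached to the latch copy of each peeled iteration.
// The current weight is recorded first and then updated for the next copy.
std::vector<uint32_t> peeledHeaderWeights(uint32_t header, uint32_t exit,
                                          unsigned peelCount,
                                          uint32_t (*step)(uint32_t, uint32_t)) {
  std::vector<uint32_t> result;
  for (unsigned i = 0; i < peelCount; ++i) {
    result.push_back(header);
    header = step(header, exit);
  }
  return result;
}
```

Calling `peeledHeaderWeights(300, 100, 4, stepOld)` yields 300, 200, 100, 1, while the same call with `stepNew` yields 300, 200, 100, 100, matching the two listings in the message.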
1 parent e62d25e commit e8d5db2

3 files changed: +12 -9 lines changed

llvm/lib/Transforms/Utils/LoopPeel.cpp

Lines changed: 8 additions & 3 deletions
@@ -636,9 +636,14 @@ static void updateBranchWeights(Instruction *Term, WeightInfo &Info) {
                        MDB.createBranchWeights(Info.Weights));
   for (auto [Idx, SubWeight] : enumerate(Info.SubWeights))
     if (SubWeight != 0)
-      Info.Weights[Idx] = Info.Weights[Idx] > SubWeight
-                              ? Info.Weights[Idx] - SubWeight
-                              : 1;
+      // Don't set the probability of taking the edge from latch to loop header
+      // to less than 1:1 ratio (meaning Weight should not be lower than
+      // SubWeight), as this could significantly reduce the loop's hotness,
+      // which would be incorrect in the case of underestimating the trip count.
+      Info.Weights[Idx] =
+          Info.Weights[Idx] > SubWeight
+              ? std::max(Info.Weights[Idx] - SubWeight, SubWeight)
+              : SubWeight;
 }
 
 /// Initialize the weights for all exiting blocks.

llvm/test/Transforms/LoopUnroll/peel-loop-pgo-deopt.ll

Lines changed: 2 additions & 3 deletions
@@ -21,7 +21,7 @@
 ; CHECK: br i1 %{{.*}}, label %[[NEXT2:.*]], label %for.cond.for.end_crit_edge, !prof !18
 ; CHECK: [[NEXT2]]:
 ; CHECK: br i1 %c, label %{{.*}}, label %side_exit.loopexit, !prof !15
-; CHECK: br i1 %{{.*}}, label %for.body, label %{{.*}}, !prof !19
+; CHECK: br i1 %{{.*}}, label %for.body, label %{{.*}}, !prof !18
 
 define i32 @basic(ptr %p, i32 %k, i1 %c) #0 !prof !15 {
 entry:
@@ -85,6 +85,5 @@ attributes #1 = { nounwind optsize }
 ; This is a weights of latch and its copies.
 ;CHECK: !16 = !{!"branch_weights", i32 3001, i32 1001}
 ;CHECK: !17 = !{!"branch_weights", i32 2000, i32 1001}
-;CHECK: !18 = !{!"branch_weights", i32 999, i32 1001}
-;CHECK: !19 = !{!"branch_weights", i32 1, i32 1001}
+;CHECK: !18 = !{!"branch_weights", i32 1001, i32 1001}
 

llvm/test/Transforms/LoopUnroll/peel-loop-pgo.ll

Lines changed: 2 additions & 3 deletions
@@ -24,7 +24,7 @@
 ; CHECK: [[NEXT1]]:
 ; CHECK: br i1 %{{.*}}, label %[[NEXT2:.*]], label %for.cond.for.end_crit_edge, !prof !17
 ; CHECK: [[NEXT2]]:
-; CHECK: br i1 %{{.*}}, label %for.body, label %{{.*}}, !prof !18
+; CHECK: br i1 %{{.*}}, label %for.body, label %{{.*}}, !prof !17
 
 define void @basic(ptr %p, i32 %k) #0 !prof !15 {
 entry:
@@ -105,6 +105,5 @@ attributes #1 = { nounwind optsize }
 
 ;CHECK: !15 = !{!"branch_weights", i32 3001, i32 1001}
 ;CHECK: !16 = !{!"branch_weights", i32 2000, i32 1001}
-;CHECK: !17 = !{!"branch_weights", i32 999, i32 1001}
-;CHECK: !18 = !{!"branch_weights", i32 1, i32 1001}
+;CHECK: !17 = !{!"branch_weights", i32 1001, i32 1001}
 
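The updated expectations in both tests drop one metadata line, and the loop latch now points at the same `!prof` node as the last peeled copy. A rough way to see why, assuming identical `branch_weights` tuples end up sharing a single metadata node (which is what the CHECK lines suggest), is to replay the new rule on the test's weights (3001, 1001). The helpers `latchHeaderWeights` and `distinctWeights` below are hypothetical and only model the arithmetic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical model: header weights for the peeled latch copies followed by
// the latch of the remaining loop, using the rule introduced by this commit.
std::vector<uint32_t> latchHeaderWeights(uint32_t header, uint32_t exit,
                                         unsigned branches) {
  assert(exit != 0 && "the patched code skips entries with a zero SubWeight");
  std::vector<uint32_t> weights;
  for (unsigned i = 0; i < branches; ++i) {
    weights.push_back(header);
    // Subtract the exit weight, but never go below a 1:1 ratio.
    header = header > exit ? std::max(header - exit, exit) : exit;
  }
  return weights;
}

// The exit weight is constant, so counting distinct header weights gives the
// number of distinct (header, exit) pairs.
std::size_t distinctWeights(const std::vector<uint32_t> &weights) {
  return std::set<uint32_t>(weights.begin(), weights.end()).size();
}
```

For four latch branches (three peeled copies plus the loop itself), `latchHeaderWeights(3001, 1001, 4)` gives 3001, 2000, 1001, 1001, so only three distinct pairs remain, where the old rule produced four (3001, 2000, 999, 1).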
