[VPlan] Also print final VPlan directly before codegen/execute. #82269
Some optimizations are applied after UF and VF have been chosen. This patch adds an extra print of the final VPlan just before codegen/execution. In the future, there will be additional transforms that are applied later (interleaving, for example).
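To illustrate why the extra print sits after the late transforms, here is a toy C++ sketch. `ToyPlan`, `optimizeAndExecute`, and the `DebugEnabled` parameter are hypothetical stand-ins and not LLVM APIs; the actual patch uses `BestVPlan.setName` and `LLVM_DEBUG(BestVPlan.dump())`, as the diff shows. The string rewrite only mimics the idea of a VF/UF-specific simplification.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy stand-in for a VPlan: just a name and a printable body.
// None of these names are LLVM APIs; they only mirror the shape of the patch.
struct ToyPlan {
  std::string Name = "Initial VPlan";
  std::string Body;

  void setName(const std::string &NewName) { Name = NewName; }

  // Print the plan in a format loosely resembling the debug output.
  void print(std::ostream &OS) const {
    OS << "VPlan '" << Name << "' {\n" << Body << "}\n";
  }
};

// Apply a late rewrite, rename the plan, and print it right before
// "execution". DebugEnabled plays the role of a debug switch (assumption).
inline std::string optimizeAndExecute(ToyPlan &Plan, std::ostream &DebugOS,
                                      bool DebugEnabled) {
  assert(!Plan.Body.empty() && "nothing to execute");

  // Stand-in for a late transform: the latch branch becomes unconditional.
  const std::string From = "branch-on-count";
  auto Pos = Plan.Body.find(From);
  if (Pos != std::string::npos)
    Plan.Body.replace(Pos, From.size(), "branch-on-cond true");

  // Printing here, after the rewrite, shows exactly what gets executed.
  Plan.setName("Final VPlan");
  if (DebugEnabled)
    Plan.print(DebugOS);

  return Plan.Body; // "Code generation" sees the already rewritten body.
}
```

Had the print happened before the rewrite, the dump would still show `branch-on-count` even though the executed plan no longer contains it.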
@llvm/pr-subscribers-backend-risc-v @llvm/pr-subscribers-llvm-transforms
Author: Florian Hahn (fhahn)
Full diff: https://github.com/llvm/llvm-project/pull/82269.diff
3 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index eca901fcdae4ce..9a4d031e776de1 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -7455,6 +7455,9 @@ LoopVectorizationPlanner::executePlan(
   if (!IsEpilogueVectorization)
     VPlanTransforms::optimizeForVFAndUF(BestVPlan, BestVF, BestUF, PSE);
 
+  BestVPlan.setName("Final VPlan");
+  LLVM_DEBUG(BestVPlan.dump());
+
   // Perform the actual loop transformation.
   VPTransformState State(BestVF, BestUF, LI, DT, ILV.Builder, &ILV, &BestVPlan,
                          OrigLoop->getHeader()->getContext());
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
index 1bcd7a2e009e0b..da6dc34e409684 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
@@ -120,7 +120,7 @@ define void @vector_reverse_i64(ptr nocapture noundef writeonly %A, ptr nocaptur
; CHECK-NEXT: LV: Found a vectorizable loop (vscale x 4) in <stdin>
; CHECK-NEXT: LEV: Epilogue vectorization is not profitable for this loop
; CHECK-NEXT: Executing best plan with VF=vscale x 4, UF=1
-; CHECK-NEXT: LV: Interleaving disabled by the pass manager
+; CHECK: LV: Interleaving disabled by the pass manager
; CHECK-NEXT: LV: Vectorizing: innermost loop.
;
entry:
@@ -260,7 +260,7 @@ define void @vector_reverse_f32(ptr nocapture noundef writeonly %A, ptr nocaptur
; CHECK-NEXT: LV: Found a vectorizable loop (vscale x 4) in <stdin>
; CHECK-NEXT: LEV: Epilogue vectorization is not profitable for this loop
; CHECK-NEXT: Executing best plan with VF=vscale x 4, UF=1
-; CHECK-NEXT: LV: Interleaving disabled by the pass manager
+; CHECK: LV: Interleaving disabled by the pass manager
; CHECK-NEXT: LV: Vectorizing: innermost loop.
;
entry:
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-printing-before-execute.ll b/llvm/test/Transforms/LoopVectorize/vplan-printing-before-execute.ll
new file mode 100644
index 00000000000000..1dddbfe20a2ed7
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/vplan-printing-before-execute.ll
@@ -0,0 +1,90 @@
+; RUN: opt -passes=loop-vectorize -force-vector-width=8 -force-vector-interleave=2 -disable-output -debug -S %s 2>&1 | FileCheck --check-prefixes=CHECK %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+; REQUIRES: asserts
+
+; Check if the vector loop condition can be simplified to true for a given
+; VF/IC combination.
+define void @test_tc_less_than_16(ptr %A, i64 %N) {
+; CHECK: LV: Scalarizing: %cmp =
+; CHECK-NEXT: VPlan 'Initial VPlan for VF={8},UF>=1' {
+; CHECK-NEXT: Live-in vp<[[VFxUF:%.+]]> = VF * UF
+; CHECK-NEXT: Live-in vp<[[VTC:%.+]]> = vector-trip-count
+; CHECK-NEXT: vp<[[TC:%.+]]> = original trip-count
+; CHECK-EMPTY:
+; CHECK-NEXT: ph:
+; CHECK-NEXT: EMIT vp<[[TC]]> = EXPAND SCEV (zext i4 (trunc i64 %N to i4) to i64)
+; CHECK-NEXT: No successors
+; CHECK-EMPTY:
+; CHECK-NEXT: vector.ph:
+; CHECK-NEXT: Successor(s): vector loop
+; CHECK-EMPTY:
+; CHECK-NEXT: <x1> vector loop: {
+; CHECK-NEXT: vector.body:
+; CHECK-NEXT: EMIT vp<[[CAN_IV:%.+]]> = CANONICAL-INDUCTION ir<0>, vp<[[CAN_IV_NEXT:%.+]]>
+; CHECK-NEXT: EMIT ir<%p.src> = WIDEN-POINTER-INDUCTION ir<%A>, 1
+; CHECK-NEXT: vp<[[VPTR:%.]]> = vector-pointer ir<%p.src>
+; CHECK-NEXT: WIDEN ir<%l> = load vp<[[VPTR]]>
+; CHECK-NEXT: WIDEN ir<%add> = add nsw ir<%l>, ir<10>
+; CHECK-NEXT: vp<[[VPTR2:%.+]]> = vector-pointer ir<%p.src>
+; CHECK-NEXT: WIDEN store vp<[[VPTR2]]>, ir<%add>
+; CHECK-NEXT: EMIT vp<[[CAN_IV_NEXT]]> = add nuw vp<[[CAN_IV:%.+]]>, vp<[[VFxUF]]>
+; CHECK-NEXT: EMIT branch-on-count vp<[[CAN_IV_NEXT]]>, vp<[[VTC]]>
+; CHECK-NEXT: No successors
+; CHECK-NEXT: }
+; CHECK-NEXT: Successor(s): middle.block
+; CHECK-EMPTY:
+; CHECK-NEXT: middle.block:
+; CHECK-NEXT: No successors
+; CHECK-NEXT: }
+;
+; CHECK: Executing best plan with VF=8, UF=2
+; CHECK-NEXT: VPlan 'Final VPlan for VF={8},UF={2}' {
+; CHECK-NEXT: Live-in vp<[[VFxUF:%.+]]> = VF * UF
+; CHECK-NEXT: vp<[[TC:%.+]]> = original trip-count
+; CHECK-EMPTY:
+; CHECK-NEXT: ph:
+; CHECK-NEXT: EMIT vp<[[TC]]> = EXPAND SCEV (zext i4 (trunc i64 %N to i4) to i64)
+; CHECK-NEXT: No successors
+; CHECK-EMPTY:
+; CHECK-NEXT: vector.ph:
+; CHECK-NEXT: Successor(s): vector loop
+; CHECK-EMPTY:
+; CHECK-NEXT: <x1> vector loop: {
+; CHECK-NEXT: vector.body:
+; CHECK-NEXT: EMIT vp<[[CAN_IV:%.+]]> = CANONICAL-INDUCTION ir<0>, vp<[[CAN_IV_NEXT:%.+]]>
+; CHECK-NEXT: EMIT ir<%p.src> = WIDEN-POINTER-INDUCTION ir<%A>, 1
+; CHECK-NEXT: vp<[[VPTR:%.]]> = vector-pointer ir<%p.src>
+; CHECK-NEXT: WIDEN ir<%l> = load vp<[[VPTR]]>
+; CHECK-NEXT: WIDEN ir<%add> = add nsw ir<%l>, ir<10>
+; CHECK-NEXT: vp<[[VPTR2:%.+]]> = vector-pointer ir<%p.src>
+; CHECK-NEXT: WIDEN store vp<[[VPTR2]]>, ir<%add>
+; CHECK-NEXT: EMIT vp<[[CAN_IV_NEXT]]> = add nuw vp<[[CAN_IV:%.+]]>, vp<[[VFxUF]]>
+; CHECK-NEXT: EMIT branch-on-cond ir<true>
+; CHECK-NEXT: No successors
+; CHECK-NEXT: }
+; CHECK-NEXT: Successor(s): middle.block
+; CHECK-EMPTY:
+; CHECK-NEXT: middle.block:
+; CHECK-NEXT: No successors
+; CHECK-NEXT: }
+;
+entry:
+ %and = and i64 %N, 15
+ br label %loop
+
+loop:
+ %iv = phi i64 [ %and, %entry ], [ %iv.next, %loop ]
+ %p.src = phi ptr [ %A, %entry ], [ %p.src.next, %loop ]
+ %p.src.next = getelementptr inbounds i8, ptr %p.src, i64 1
+ %l = load i8, ptr %p.src, align 1
+ %add = add nsw i8 %l, 10
+ store i8 %add, ptr %p.src
+ %iv.next = add nsw i64 %iv, -1
+ %cmp = icmp eq i64 %iv.next, 0
+ br i1 %cmp, label %exit, label %loop
+
+exit:
+ ret void
+}
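The new test expects `branch-on-count` in the initial plan and `branch-on-cond ir<true>` in the final one. A plausible reading, sketched below in C++, is that the trip count is masked to at most 15, which is below VF * UF = 8 * 2 = 16, so a single vector iteration already covers it. The helper names `tripCount` and `latchAlwaysExits` are hypothetical, and the comparison is a simplified model rather than the logic in `optimizeForVFAndUF`.

```cpp
#include <cassert>
#include <cstdint>

// Trip count as reported in the test's CHECK lines: the low four bits of N,
// matching `%and = and i64 %N, 15` in the IR.
inline std::uint64_t tripCount(std::uint64_t N) { return N & 15; }

// Simplified model (an assumption, not LLVM's code): the latch test of the
// vector loop is redundant when one vector iteration, which covers VF * UF
// scalar iterations, already covers the largest possible trip count.
inline bool latchAlwaysExits(std::uint64_t MaxTripCount, unsigned VF,
                             unsigned UF) {
  assert(VF > 0 && UF > 0 && "factors must be positive");
  return MaxTripCount <= static_cast<std::uint64_t>(VF) * UF;
}
```

With the test's flags (`-force-vector-width=8 -force-vector-interleave=2`), `latchAlwaysExits(15, 8, 2)` holds in this model, while a smaller combination such as VF=4, UF=2 would not satisfy it.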
ping :)
Good addition, raises some thoughts.
@@ -7456,6 +7456,9 @@ LoopVectorizationPlanner::executePlan(
   if (!IsEpilogueVectorization)
     VPlanTransforms::optimizeForVFAndUF(BestVPlan, BestVF, BestUF, PSE);
 
+  BestVPlan.setName("Final VPlan");
"Final" VPlans should indeed be printed after the last optimizeForVFAndUF(), but preferably along with the "Executing best plan ..." dump above?
Regarding VPlan's name - "Initial" VPlans are currently dumped following buildVPlansWithVPRecipes() which also optimize()'s them, except for the last optimizeForVFAndUF(). Perhaps the processing stage a VPlan is currently in should be better maintained than in a name string.
Yep, will move printing "Executing best plan..." down!
> Regarding VPlan's name - "Initial" VPlans are currently dumped following buildVPlansWithVPRecipes() which also optimize()'s them, except for the last optimizeForVFAndUF(). Perhaps the processing stage a VPlan is currently in should be better maintained than in a name string.
Perhaps Initial = after initial construction, Optimized = during most VPlan transforms, Final = just before execution?
> Perhaps Initial = after initial construction, Optimized = during most VPlan transforms, Final = just before execution?
SGTM. Although the name may be used more persistently to associate a VPlan with the fixed original loop, complemented with optimization decisions (e.g., VF, UF), rather than keep track of which optimizations were processed.
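One way to read the suggestion in this thread is to keep the processing stage as explicit state and reserve the name for identifying the loop and the chosen factors. The following C++ sketch is purely illustrative: `PlanStage`, `StagedPlan`, `advanceTo`, and `title` are hypothetical names that do not exist in LLVM, and the three stages are simply the ones proposed above.

```cpp
#include <cassert>
#include <string>
#include <utility>

// The three stages proposed in the discussion (hypothetical enum).
enum class PlanStage { Initial, Optimized, Final };

inline const char *stageName(PlanStage S) {
  switch (S) {
  case PlanStage::Initial:
    return "Initial";
  case PlanStage::Optimized:
    return "Optimized";
  case PlanStage::Final:
    return "Final";
  }
  return "Unknown";
}

// Toy plan that separates the persistent loop identity from the stage.
class StagedPlan {
  std::string LoopName;                 // Fixed for the plan's lifetime.
  PlanStage Stage = PlanStage::Initial; // Changes as transforms run.

public:
  explicit StagedPlan(std::string Name) : LoopName(std::move(Name)) {}

  // Stages only move forward: Initial -> Optimized -> Final.
  void advanceTo(PlanStage Next) {
    assert(static_cast<int>(Next) >= static_cast<int>(Stage) &&
           "stages only move forward");
    Stage = Next;
  }

  PlanStage stage() const { return Stage; }

  // Build the printed title from the stage plus the chosen VF and UF.
  std::string title(unsigned VF, unsigned UF) const {
    return std::string(stageName(Stage)) + " VPlan for '" + LoopName +
           "', VF=" + std::to_string(VF) + ", UF=" + std::to_string(UF);
  }
};
```

In this sketch the printed title is derived on demand, so a stale name string cannot disagree with the stage the plan is actually in.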
✅ With the latest revision this PR passed the C/C++ code formatter.
…c746d9591 Local branch amd-gfx df9c746 Merged main:37daff028fcec27f2be1bb990df77e19c0244ccf into amd-gfx:2c4204e0a79d Remote branch main 15d9d0f [VPlan] Also print final VPlan directly before codegen/execute. (llvm#82269)