-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[AArch64] Don't try to sink and(load) #122274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-aarch64 Author: David Green (davemgreen) ChangesIf we sink the and in and(load), CGP can hoist is back again to the load, getting into an infinite loop. This prevents sinking the and in this case. Fixes #122074 Full diff: https://github.com/llvm/llvm-project/pull/122274.diff 2 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 29ea098386cec1..932a6f9ce23fd2 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -5514,7 +5514,10 @@ bool AArch64TTIImpl::isProfitableToSinkOperands(
NumZExts++;
}
- Ops.push_back(&Insert->getOperandUse(1));
+ // And(Load) is excluded to prevent CGP getting stuck in a loop of sinking
+ // the And, just to hoist it again back to the load.
+ if (!match(OperandInstr, m_And(m_Load(m_Value()), m_Value())))
+ Ops.push_back(&Insert->getOperandUse(1));
Ops.push_back(&Shuffle->getOperandUse(0));
Ops.push_back(&Op);
}
diff --git a/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll
index 95c54cd8b01511..60781466f0eb79 100644
--- a/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll
+++ b/llvm/test/CodeGen/AArch64/aarch64-dup-ext-crash.ll
@@ -40,3 +40,49 @@ vector.body: ; preds = %vector.body, %vecto
store <2 x i32> %3, ptr %4, align 4
br label %vector.body
}
+
+; This test got stuck in a loop hoisting the and to the load, and sinking it back to the mull
+define i32 @dup_and_load(ptr %p, i1 %c) {
+; CHECK-LABEL: dup_and_load:
+; CHECK: // %bb.0: // %for.body.lr.ph
+; CHECK-NEXT: mov x8, x0
+; CHECK-NEXT: ldrb w0, [x0]
+; CHECK-NEXT: tbz w1, #0, .LBB1_3
+; CHECK-NEXT: // %bb.1: // %ph
+; CHECK-NEXT: dup v0.8h, w0
+; CHECK-NEXT: mov w9, wzr
+; CHECK-NEXT: .LBB1_2: // %vector.body
+; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
+; CHECK-NEXT: ldr d1, [x8]
+; CHECK-NEXT: add w9, w9, #1
+; CHECK-NEXT: cmp w9, #100
+; CHECK-NEXT: ushll v1.8h, v1.8b, #0
+; CHECK-NEXT: umull2 v2.4s, v0.8h, v1.8h
+; CHECK-NEXT: umull v1.4s, v0.4h, v1.4h
+; CHECK-NEXT: stp q1, q2, [x8]
+; CHECK-NEXT: b.lt .LBB1_2
+; CHECK-NEXT: .LBB1_3: // %end
+; CHECK-NEXT: ret
+for.body.lr.ph:
+ %l = load i32, ptr %p
+ %conv314 = and i32 %l, 255
+ br i1 %c, label %ph, label %end
+
+ph:
+ %broadcast.splatinsert = insertelement <8 x i32> poison, i32 %conv314, i32 0
+ %broadcast.splat = shufflevector <8 x i32> %broadcast.splatinsert, <8 x i32> poison, <8 x i32> zeroinitializer
+ br label %vector.body
+
+vector.body: ; preds = %vector.body, %vector.ph
+ %iv = phi i32 [ 0, %ph ], [ %iv.next, %vector.body ]
+ %wide.load = load <8 x i8>, ptr %p, align 4
+ %0 = zext <8 x i8> %wide.load to <8 x i32>
+ %1 = mul <8 x i32> %broadcast.splat, %0
+ store <8 x i32> %1, ptr %p, align 4
+ %iv.next = add i32 %iv, 1
+ %e = icmp slt i32 %iv.next, 100
+ br i1 %e, label %vector.body, label %end
+
+end:
+ ret i32 %conv314
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, LGTM - thanks for the fix! :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a sensible fix! I just had a few minor comments if you agree with them.
%broadcast.splat = shufflevector <8 x i32> %broadcast.splatinsert, <8 x i32> poison, <8 x i32> zeroinitializer | ||
br label %vector.body | ||
|
||
vector.body: ; preds = %vector.body, %vector.ph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need to be a loop in order to trigger the crash? Just wondering if it could be a normal block to reduce the lines of IR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is OK - it was copied from the test above and modified enough to trigger the issue.
; CHECK-NEXT: b.lt .LBB1_2 | ||
; CHECK-NEXT: .LBB1_3: // %end | ||
; CHECK-NEXT: ret | ||
for.body.lr.ph: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Might be nicer to change the block name to entry
or something like that?
; CHECK-NEXT: ret | ||
for.body.lr.ph: | ||
%l = load i32, ptr %p | ||
%conv314 = and i32 %l, 255 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Perhaps rename the variables to something a bit more readable, i.e. %and
or %and_255
. Similarly for the zext and mul in the loop below.
Thanks, verified that this change fixes the Android's build hang. |
If we sink the and in and(load), CGP can hoist is back again to the laod, getting into an infinite loop. This prevents sinking the and in this case. Fixes llvm#122074
dae8a51
to
e295a9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, verified that this change fixes the Android's build hang.
Thanks for checking!
%broadcast.splat = shufflevector <8 x i32> %broadcast.splatinsert, <8 x i32> poison, <8 x i32> zeroinitializer | ||
br label %vector.body | ||
|
||
vector.body: ; preds = %vector.body, %vector.ph |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is OK - it was copied from the test above and modified enough to trigger the issue.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/154/builds/10060 Here is the relevant piece of the build log for the reference
|
If we sink the and in and(load), CGP can hoist is back again to the load, getting into an infinite loop. This prevents sinking the and in this case. Fixes llvm#122074
If we sink the and in and(load), CGP can hoist is back again to the load, getting into an infinite loop. This prevents sinking the and in this case.
Fixes #122074