-
Notifications
You must be signed in to change notification settings - Fork 14.3k
release/20.x: [DAGCombiner] Don't ignore N2's undef elements in foldVSelectOfConstants
(#129272)
#129383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@RKSimon What do you think about merging this PR to the release branch? |
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-backend-x86 Author: None (llvmbot) ChangesBackport 2709366 Requested by: @dtcxzyw Full diff: https://github.com/llvm/llvm-project/pull/129383.diff 2 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 9d04568483677..e921ced8326b1 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -12547,9 +12547,10 @@ SDValue DAGCombiner::foldVSelectOfConstants(SDNode *N) {
for (unsigned i = 0; i != Elts; ++i) {
SDValue N1Elt = N1.getOperand(i);
SDValue N2Elt = N2.getOperand(i);
- if (N1Elt.isUndef() || N2Elt.isUndef())
+ if (N1Elt.isUndef())
continue;
- if (N1Elt.getValueType() != N2Elt.getValueType()) {
+ // N2 should not contain undef values since it will be reused in the fold.
+ if (N2Elt.isUndef() || N1Elt.getValueType() != N2Elt.getValueType()) {
AllAddOne = false;
AllSubOne = false;
break;
diff --git a/llvm/test/CodeGen/X86/vselect-constants.ll b/llvm/test/CodeGen/X86/vselect-constants.ll
index 901f7e4a00eb5..34bda718db8f6 100644
--- a/llvm/test/CodeGen/X86/vselect-constants.ll
+++ b/llvm/test/CodeGen/X86/vselect-constants.ll
@@ -302,3 +302,21 @@ define i32 @wrong_min_signbits(<2 x i16> %x) {
%t1 = bitcast <2 x i16> %sel to i32
ret i32 %t1
}
+
+define i32 @pr129181() {
+; SSE-LABEL: pr129181:
+; SSE: # %bb.0: # %entry
+; SSE-NEXT: xorl %eax, %eax
+; SSE-NEXT: retq
+;
+; AVX-LABEL: pr129181:
+; AVX: # %bb.0: # %entry
+; AVX-NEXT: xorl %eax, %eax
+; AVX-NEXT: retq
+entry:
+ %x = insertelement <4 x i32> zeroinitializer, i32 0, i32 0
+ %cmp = icmp ult <4 x i32> %x, splat (i32 1)
+ %sel = select <4 x i1> %cmp, <4 x i32> zeroinitializer, <4 x i32> <i32 0, i32 0, i32 1, i32 poison>
+ %reduce = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %sel)
+ ret i32 %reduce
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Does this need to be in 20.1.0 or could it wait until 20.1.1 ? |
I think it can wait, cheers. |
…ants` (llvm#129272) Since N2 will be reused in the fold, we cannot skip N2's undef elements if the corresponding element in N1 is well-defined. For example: ``` t2: v4i32 = BUILD_VECTOR Constant:i32<0>, Constant:i32<0>, Constant:i32<0>, Constant:i32<0> t24: v4i32 = BUILD_VECTOR undef:i32, undef:i32, Constant:i32<1>, undef:i32 t11: v4i32 = vselect t8, t2, t10 ``` Before this patch, we fold t11 into: ``` t26: v4i32 = sign_extend t8 t27: v4i32 = add t26, t24 ``` The last element of t27 is incorrect. Closes llvm#129181. (cherry picked from commit 2709366)
@dtcxzyw (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR. |
Backport 2709366
Requested by: @dtcxzyw