Skip to content

[SelectionDAG] Simplify vselect true, T, F -> T #100992

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Aug 6, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions llvm/include/llvm/CodeGen/SelectionDAG.h
Original file line number Diff line number Diff line change
Expand Up @@ -2320,6 +2320,11 @@ class SelectionDAG {
isConstantFPBuildVectorOrConstantFP(N);
}

/// Check if a value \op N is a constant using the target's BooleanContent for
/// its type.
std::optional<bool> isBoolConstant(SDValue N,
bool AllowTruncation = false) const;

/// Set CallSiteInfo to be associated with Node.
void addCallSiteInfo(const SDNode *Node, CallSiteInfo &&CallInfo) {
SDEI[Node].CSInfo = std::move(CallInfo);
Expand Down
23 changes: 2 additions & 21 deletions llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3260,28 +3260,9 @@ static SDValue extractBooleanFlip(SDValue V, SelectionDAG &DAG,
if (V.getOpcode() != ISD::XOR)
return SDValue();

ConstantSDNode *Const = isConstOrConstSplat(V.getOperand(1), false);
if (!Const)
return SDValue();

EVT VT = V.getValueType();

bool IsFlip = false;
switch(TLI.getBooleanContents(VT)) {
case TargetLowering::ZeroOrOneBooleanContent:
IsFlip = Const->isOne();
break;
case TargetLowering::ZeroOrNegativeOneBooleanContent:
IsFlip = Const->isAllOnes();
break;
case TargetLowering::UndefinedBooleanContent:
IsFlip = (Const->getAPIntValue() & 0x01) == 1;
break;
}

if (IsFlip)
if (DAG.isBoolConstant(V.getOperand(1)) == true)
return V.getOperand(0);
if (Force)
if (Force && isConstOrConstSplat(V.getOperand(1), false))
return DAG.getLogicalNOT(SDLoc(V), V, V.getValueType());
return SDValue();
}
Expand Down
36 changes: 27 additions & 9 deletions llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -9924,15 +9924,8 @@ SDValue SelectionDAG::simplifySelect(SDValue Cond, SDValue T, SDValue F) {

// select true, T, F --> T
// select false, T, F --> F
if (auto *CondC = dyn_cast<ConstantSDNode>(Cond))
return CondC->isZero() ? F : T;

// TODO: This should simplify VSELECT with non-zero constant condition using
// something like this (but check boolean contents to be complete?):
if (ConstantSDNode *CondC = isConstOrConstSplat(Cond, /*AllowUndefs*/ false,
/*AllowTruncation*/ true))
if (CondC->isZero())
return F;
if (auto C = isBoolConstant(Cond, /*AllowTruncation=*/true))
return *C ? T : F;

// select ?, T, T --> T
if (T == F)
Expand Down Expand Up @@ -13124,6 +13117,31 @@ SDNode *SelectionDAG::isConstantFPBuildVectorOrConstantFP(SDValue N) const {
return nullptr;
}

std::optional<bool> SelectionDAG::isBoolConstant(SDValue N,
bool AllowTruncation) const {
ConstantSDNode *Const = isConstOrConstSplat(N, false, AllowTruncation);
if (!Const)
return std::nullopt;

const APInt &CVal = Const->getAPIntValue();
switch (TLI->getBooleanContents(N.getValueType())) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be safer if we only accept the 'clean' 1/0, -1/0 and 1b/0b T/F values and std::nullopt otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see what you meant by the original signature you suggested, that makes sense

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - an alternative would be an optional result that returns the exact match - which would mean we only need one call in simplifySelect. No strong preference on this one though tbh - as long as we only accept the 'clean' boolean contents values.

case TargetLowering::ZeroOrOneBooleanContent:
if (CVal.isOne())
return true;
if (CVal.isZero())
return false;
return std::nullopt;
case TargetLowering::ZeroOrNegativeOneBooleanContent:
if (CVal.isAllOnes())
return true;
if (CVal.isZero())
return false;
return std::nullopt;
case TargetLowering::UndefinedBooleanContent:
return CVal[0];
}
}

void SelectionDAG::createOperands(SDNode *Node, ArrayRef<SDValue> Vals) {
assert(!Node->OperandList && "Node already has operands");
assert(SDNode::getMaxNumOperands() >= Vals.size() &&
Expand Down
2 changes: 1 addition & 1 deletion llvm/test/CodeGen/AArch64/srem-seteq-vec-splat.ll
Original file line number Diff line number Diff line change
Expand Up @@ -206,7 +206,7 @@ define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {
; CHECK-NEXT: movi v1.4s, #128, lsl #24
; CHECK-NEXT: usra v3.4s, v2.4s, #1
; CHECK-NEXT: and v1.16b, v3.16b, v1.16b
; CHECK-NEXT: add v0.4s, v1.4s, v0.4s
; CHECK-NEXT: add v0.4s, v0.4s, v1.4s
; CHECK-NEXT: movi v1.4s, #1
; CHECK-NEXT: cmeq v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
Expand Down
19 changes: 19 additions & 0 deletions llvm/test/CodeGen/RISCV/rvv/vp-select.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 5
; RUN: llc < %s -mtriple=riscv64 -mattr=+v -verify-machineinstrs | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
; RUN: llc < %s -mtriple=riscv64 -mattr=+v -verify-machineinstrs | FileCheck %s
; RUN: llc -mtriple=riscv64 -mattr=+v < %s | FileCheck %s

Don't need to add -verify-machineinstrs in any random test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I specifically added it for this test since it's dealing with RVV instructions. After we migrated the RISCVInsertVSETVLI pass to operate on LiveIntervals it really helps to have the machine verifier check that they're intact. It's already caught quite a few issues just on the existing RVV codegen tests alone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They'll get caught in expensive_checks builds. Unless this test is specifically trying to stress a verifier error, which it isn't, it's just wasting test time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already the de-facto for the RVV in-tree tests:

$ git grep "\-verify\-machineinstrs" -- llvm/test/CodeGen/RISCV/rvv | wc -l
    2017

I get your point about slow test times but we rely very heavily on this for coverage on RISCVInsertVSETVLI. Should we create a separate issue to discuss this outside of this PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think this is a long standing issue, and we have thousands of violations. It defeats the point of having a separate EXPENSIVE_CHECKS behavior

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right I think this is a long standing issue, and we have thousands of violations. It defeats the point of having a separate EXPENSIVE_CHECKS behavior

Do we have a RISCV expensive checks bot?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the x86_64 expensive checks bot catch this? https://lab.llvm.org/buildbot/#/builders/16

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the x86_64 expensive checks bot catch this? https://lab.llvm.org/buildbot/#/builders/16

I think it would. I just wasn't sure what targets it builds.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created #102092 to separate out this discussion


define <vscale x 1 x i64> @all_ones(<vscale x 1 x i64> %true, <vscale x 1 x i64> %false, i32 %evl) {
; CHECK-LABEL: all_ones:
; CHECK: # %bb.0:
; CHECK-NEXT: ret
%v = call <vscale x 1 x i64> @llvm.vp.select.nxv1i64(<vscale x 1 x i1> splat (i1 true), <vscale x 1 x i64> %true, <vscale x 1 x i64> %false, i32 %evl)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also test the all false case? Plus another target with the other bits behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried creating a test case on r600 which seems to have the different behaviour but couldn't come up with anything meaningful. Mainly because the combine kicks in when the condition vector is at nxi1, so every behaviour considers it true.

ret <vscale x 1 x i64> %v
}

define <vscale x 1 x i64> @all_zeroes(<vscale x 1 x i64> %true, <vscale x 1 x i64> %false, i32 %evl) {
; CHECK-LABEL: all_zeroes:
; CHECK: # %bb.0:
; CHECK-NEXT: vmv1r.v v8, v9
; CHECK-NEXT: ret
%v = call <vscale x 1 x i64> @llvm.vp.select.nxv1i64(<vscale x 1 x i1> splat (i1 false), <vscale x 1 x i64> %true, <vscale x 1 x i64> %false, i32 %evl)
ret <vscale x 1 x i64> %v
}
40 changes: 22 additions & 18 deletions llvm/test/CodeGen/X86/combine-srem.ll
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ define <4 x i32> @combine_vec_srem_by_minsigned(<4 x i32> %x) {
; AVX1-NEXT: vpsrld $1, %xmm1, %xmm1
; AVX1-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; AVX1-NEXT: vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
; AVX1-NEXT: vpaddd %xmm0, %xmm1, %xmm0
; AVX1-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; AVX1-NEXT: retq
;
; AVX2-LABEL: combine_vec_srem_by_minsigned:
Expand All @@ -93,7 +93,7 @@ define <4 x i32> @combine_vec_srem_by_minsigned(<4 x i32> %x) {
; AVX2-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
; AVX2-NEXT: vpand %xmm2, %xmm1, %xmm1
; AVX2-NEXT: vpaddd %xmm0, %xmm1, %xmm0
; AVX2-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; AVX2-NEXT: retq
%1 = srem <4 x i32> %x, <i32 -2147483648, i32 -2147483648, i32 -2147483648, i32 -2147483648>
ret <4 x i32> %1
Expand Down Expand Up @@ -225,24 +225,28 @@ define <4 x i32> @combine_vec_srem_by_pow2a_neg(<4 x i32> %x) {
; SSE-NEXT: psrad $31, %xmm1
; SSE-NEXT: psrld $30, %xmm1
; SSE-NEXT: paddd %xmm0, %xmm1
; SSE-NEXT: psrld $2, %xmm1
; SSE-NEXT: pxor %xmm2, %xmm2
; SSE-NEXT: psubd %xmm1, %xmm2
; SSE-NEXT: pslld $2, %xmm2
; SSE-NEXT: paddd %xmm2, %xmm0
; SSE-NEXT: pand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1
; SSE-NEXT: psubd %xmm1, %xmm0
; SSE-NEXT: retq
;
; AVX-LABEL: combine_vec_srem_by_pow2a_neg:
; AVX: # %bb.0:
; AVX-NEXT: vpsrad $31, %xmm0, %xmm1
; AVX-NEXT: vpsrld $30, %xmm1, %xmm1
; AVX-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; AVX-NEXT: vpsrld $2, %xmm1, %xmm1
; AVX-NEXT: vpxor %xmm2, %xmm2, %xmm2
; AVX-NEXT: vpsubd %xmm1, %xmm2, %xmm1
; AVX-NEXT: vpslld $2, %xmm1, %xmm1
; AVX-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; AVX-NEXT: retq
; AVX1-LABEL: combine_vec_srem_by_pow2a_neg:
; AVX1: # %bb.0:
; AVX1-NEXT: vpsrad $31, %xmm0, %xmm1
; AVX1-NEXT: vpsrld $30, %xmm1, %xmm1
; AVX1-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; AVX1-NEXT: vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
; AVX1-NEXT: vpsubd %xmm1, %xmm0, %xmm0
; AVX1-NEXT: retq
;
; AVX2-LABEL: combine_vec_srem_by_pow2a_neg:
; AVX2: # %bb.0:
; AVX2-NEXT: vpsrad $31, %xmm0, %xmm1
; AVX2-NEXT: vpsrld $30, %xmm1, %xmm1
; AVX2-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; AVX2-NEXT: vpbroadcastd {{.*#+}} xmm2 = [4294967292,4294967292,4294967292,4294967292]
; AVX2-NEXT: vpand %xmm2, %xmm1, %xmm1
; AVX2-NEXT: vpsubd %xmm1, %xmm0, %xmm0
; AVX2-NEXT: retq
%1 = srem <4 x i32> %x, <i32 -4, i32 -4, i32 -4, i32 -4>
ret <4 x i32> %1
}
Expand Down
6 changes: 3 additions & 3 deletions llvm/test/CodeGen/X86/srem-seteq-vec-splat.ll
Original file line number Diff line number Diff line change
Expand Up @@ -624,7 +624,7 @@ define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {
; CHECK-AVX1-NEXT: vpsrld $1, %xmm1, %xmm1
; CHECK-AVX1-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; CHECK-AVX1-NEXT: vpand {{\.?LCPI[0-9]+_[0-9]+}}(%rip), %xmm1, %xmm1
; CHECK-AVX1-NEXT: vpaddd %xmm0, %xmm1, %xmm0
; CHECK-AVX1-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; CHECK-AVX1-NEXT: vpxor %xmm1, %xmm1, %xmm1
; CHECK-AVX1-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
; CHECK-AVX1-NEXT: vpsrld $31, %xmm0, %xmm0
Expand All @@ -637,7 +637,7 @@ define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {
; CHECK-AVX2-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; CHECK-AVX2-NEXT: vpbroadcastd {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
; CHECK-AVX2-NEXT: vpand %xmm2, %xmm1, %xmm1
; CHECK-AVX2-NEXT: vpaddd %xmm0, %xmm1, %xmm0
; CHECK-AVX2-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; CHECK-AVX2-NEXT: vpxor %xmm1, %xmm1, %xmm1
; CHECK-AVX2-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
; CHECK-AVX2-NEXT: vpsrld $31, %xmm0, %xmm0
Expand All @@ -649,7 +649,7 @@ define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {
; CHECK-AVX512VL-NEXT: vpsrld $1, %xmm1, %xmm1
; CHECK-AVX512VL-NEXT: vpaddd %xmm1, %xmm0, %xmm1
; CHECK-AVX512VL-NEXT: vpandd {{\.?LCPI[0-9]+_[0-9]+}}(%rip){1to4}, %xmm1, %xmm1
; CHECK-AVX512VL-NEXT: vpaddd %xmm0, %xmm1, %xmm0
; CHECK-AVX512VL-NEXT: vpaddd %xmm1, %xmm0, %xmm0
; CHECK-AVX512VL-NEXT: vpxor %xmm1, %xmm1, %xmm1
; CHECK-AVX512VL-NEXT: vpcmpeqd %xmm1, %xmm0, %xmm0
; CHECK-AVX512VL-NEXT: vpsrld $31, %xmm0, %xmm0
Expand Down
Loading