Skip to content

[DAG] Prefer 0.0 over -0.0 as neutral value for FADD w/NoSignedZero #106616

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 30, 2024

Conversation

preames
Copy link
Collaborator

@preames preames commented Aug 29, 2024

When getting a neutral value, we can prefer using a positive zero over a negative zero if nsz is set on the FADD (or reduction). A positive zero should be cheaper to materialize on basically all targets.

Arguably, we should be doing this kind of canonicalization in DAGCombine, but we don't do that for any of the other reduction variants, so this seems like path of least resistance. This does mean that we can only do this for "fast" reductions. Just nsz isn't enough, as that goes through the SEQ_FADD path where the IR level start value isn't folded away.

If folks think this is to RISCV specific, let me know. There's a trivial RISCV specific implementation. I went with the generic one as I through this might benefit other targets.

When getting a neutral value, we can prefer using a positive zero
over a negative zero if nsz is set on the FADD (or reduction).  A
positive zero should be cheaper to materialize on basically all
targets.

Arguably, we should be doing this kind of canonicalization in
DAGCombine, but we don't do that for any of the other reduction
variants, so this seems like path of least resistance.  This does
mean that we can only do this for "fast" reductions.  Just nsz
isn't enough, as that goes through the SEQ_FADD path where the
IR level start value isn't folded away.

If folks think this is to RISCV specific, let me know.  There's a
trivial RISCV specific implementation. I went with the generic one
as I through this might benefit other targets.
@preames preames requested a review from topperc August 29, 2024 19:35
@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Aug 29, 2024
@llvmbot
Copy link
Member

llvmbot commented Aug 29, 2024

@llvm/pr-subscribers-llvm-selectiondag

Author: Philip Reames (preames)

Changes

When getting a neutral value, we can prefer using a positive zero over a negative zero if nsz is set on the FADD (or reduction). A positive zero should be cheaper to materialize on basically all targets.

Arguably, we should be doing this kind of canonicalization in DAGCombine, but we don't do that for any of the other reduction variants, so this seems like path of least resistance. This does mean that we can only do this for "fast" reductions. Just nsz isn't enough, as that goes through the SEQ_FADD path where the IR level start value isn't folded away.

If folks think this is to RISCV specific, let me know. There's a trivial RISCV specific implementation. I went with the generic one as I through this might benefit other targets.


Full diff: https://github.com/llvm/llvm-project/pull/106616.diff

2 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (+3-1)
  • (modified) llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll (+1-2)
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
index 9efcd3f25797b5..7f57b6db40ef49 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
@@ -13267,7 +13267,9 @@ SDValue SelectionDAG::getNeutralElement(unsigned Opcode, const SDLoc &DL,
   case ISD::SMIN:
     return getConstant(APInt::getSignedMaxValue(VT.getSizeInBits()), DL, VT);
   case ISD::FADD:
-    return getConstantFP(-0.0, DL, VT);
+    // If flags allow, prefer positive zero single it's generally cheaper
+    // to materialize on most targets.
+    return getConstantFP(Flags.hasNoSignedZeros() ? 0.0 : -0.0, DL, VT);
   case ISD::FMUL:
     return getConstantFP(1.0, DL, VT);
   case ISD::FMINNUM:
diff --git a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
index 5d5807cbadbad5..4be680e272e5b9 100644
--- a/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
@@ -524,8 +524,7 @@ define float @vreduce_fadd_v7f32_neutralstart_fast(ptr %x) {
 ; CHECK:       # %bb.0:
 ; CHECK-NEXT:    vsetivli zero, 7, e32, m2, ta, ma
 ; CHECK-NEXT:    vle32.v v8, (a0)
-; CHECK-NEXT:    lui a0, 524288
-; CHECK-NEXT:    vmv.s.x v10, a0
+; CHECK-NEXT:    vmv.s.x v10, zero
 ; CHECK-NEXT:    vfredusum.vs v8, v8, v10
 ; CHECK-NEXT:    vfmv.f.s fa0, v8
 ; CHECK-NEXT:    ret

@dtcxzyw dtcxzyw requested a review from arsenm August 30, 2024 01:32
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this function isn't called getIdentityValue like in other places

@preames preames merged commit 924907b into llvm:main Aug 30, 2024
10 checks passed
@preames preames deleted the pr-dag-fadd_reduce_nsz branch August 30, 2024 14:56
@@ -13267,7 +13267,9 @@ SDValue SelectionDAG::getNeutralElement(unsigned Opcode, const SDLoc &DL,
case ISD::SMIN:
return getConstant(APInt::getSignedMaxValue(VT.getSizeInBits()), DL, VT);
case ISD::FADD:
return getConstantFP(-0.0, DL, VT);
// If flags allow, prefer positive zero single it's generally cheaper
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird typo "single" for "since"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in e1bde1c

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:SelectionDAG SelectionDAGISel as well
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants