[RISCV] Optimize lowering of VECREDUCE_FMINIMUM/VECREDUCE_FMAXIMUM. #85165

Merged 2 commits on Mar 14, 2024

38 changes: 32 additions & 6 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -717,7 +717,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,

     static const unsigned FloatingPointVecReduceOps[] = {
         ISD::VECREDUCE_FADD, ISD::VECREDUCE_SEQ_FADD, ISD::VECREDUCE_FMIN,
-        ISD::VECREDUCE_FMAX};
+        ISD::VECREDUCE_FMAX, ISD::VECREDUCE_FMINIMUM, ISD::VECREDUCE_FMAXIMUM};

     if (!Subtarget.is64Bit()) {
       // We must custom-lower certain vXi64 operations on RV32 due to the vector
@@ -6541,6 +6541,8 @@ SDValue RISCVTargetLowering::LowerOperation(SDValue Op,
   case ISD::VECREDUCE_SEQ_FADD:
   case ISD::VECREDUCE_FMIN:
   case ISD::VECREDUCE_FMAX:
+  case ISD::VECREDUCE_FMAXIMUM:
+  case ISD::VECREDUCE_FMINIMUM:
     return lowerFPVECREDUCE(Op, DAG);
   case ISD::VP_REDUCE_ADD:
   case ISD::VP_REDUCE_UMAX:
@@ -9541,14 +9543,17 @@ getRVVFPReductionOpAndOperands(SDValue Op, SelectionDAG &DAG, EVT EltVT,
   case ISD::VECREDUCE_SEQ_FADD:
     return std::make_tuple(RISCVISD::VECREDUCE_SEQ_FADD_VL, Op.getOperand(1),
                            Op.getOperand(0));
+  case ISD::VECREDUCE_FMINIMUM:
+  case ISD::VECREDUCE_FMAXIMUM:
   case ISD::VECREDUCE_FMIN:
   case ISD::VECREDUCE_FMAX: {
     SDValue Front =
         DAG.getNode(ISD::EXTRACT_VECTOR_ELT, DL, EltVT, Op.getOperand(0),
                     DAG.getVectorIdxConstant(0, DL));
-    unsigned RVVOpc = (Opcode == ISD::VECREDUCE_FMIN)
-                          ? RISCVISD::VECREDUCE_FMIN_VL
-                          : RISCVISD::VECREDUCE_FMAX_VL;
+    unsigned RVVOpc =
+        (Opcode == ISD::VECREDUCE_FMIN || Opcode == ISD::VECREDUCE_FMINIMUM)
+            ? RISCVISD::VECREDUCE_FMIN_VL
+            : RISCVISD::VECREDUCE_FMAX_VL;
     return std::make_tuple(RVVOpc, Op.getOperand(0), Front);
   }
   }
@@ -9571,9 +9576,30 @@ SDValue RISCVTargetLowering::lowerFPVECREDUCE(SDValue Op,
     VectorVal = convertToScalableVector(ContainerVT, VectorVal, DAG, Subtarget);
   }

+  MVT ResVT = Op.getSimpleValueType();
   auto [Mask, VL] = getDefaultVLOps(VecVT, ContainerVT, DL, DAG, Subtarget);
-  return lowerReductionSeq(RVVOpcode, Op.getSimpleValueType(), ScalarVal,
-                           VectorVal, Mask, VL, DL, DAG, Subtarget);
+  SDValue Res = lowerReductionSeq(RVVOpcode, ResVT, ScalarVal, VectorVal, Mask,
+                                  VL, DL, DAG, Subtarget);
+  if (Op.getOpcode() != ISD::VECREDUCE_FMINIMUM &&
+      Op.getOpcode() != ISD::VECREDUCE_FMAXIMUM)
+    return Res;
+
+  if (Op->getFlags().hasNoNaNs())
+    return Res;
+
+  // Force output to NaN if any element is NaN.
+  SDValue IsNan =
+      DAG.getNode(RISCVISD::SETCC_VL, DL, Mask.getValueType(),
+                  {VectorVal, VectorVal, DAG.getCondCode(ISD::SETNE),
+                   DAG.getUNDEF(Mask.getValueType()), Mask, VL});
+  MVT XLenVT = Subtarget.getXLenVT();
+  SDValue CPop = DAG.getNode(RISCVISD::VCPOP_VL, DL, XLenVT, IsNan, Mask, VL);
+  SDValue NoNaNs = DAG.getSetCC(DL, XLenVT, CPop,
+                                DAG.getConstant(0, DL, XLenVT), ISD::SETEQ);
+  return DAG.getSelect(
+      DL, ResVT, NoNaNs, Res,
+      DAG.getConstantFP(APFloat::getNaN(DAG.EVTToAPFloatSemantics(ResVT)), DL,
Collaborator

I think this is okay, but let me call out a possible concern here to confirm.

This is not propagating the NaN payload (if any) from the argument. Instead, it returns a canonical NaN if any of the inputs is NaN (with any payload). I don't believe the relevant spec documents require us to propagate NaN payloads, but it would be good to confirm you agree.

Collaborator Author

My reasoning was that if RISC-V had a NaN-propagating min/max reduction instruction, it would produce a canonical NaN like every other instruction that propagates NaN. So we should be fine to use a canonical NaN.

Contributor

I see, so both llvm.maximum and llvm.vector.reduce.fmaximum.* consistently produce a canonical NaN.

+                        ResVT));
 }

 SDValue RISCVTargetLowering::lowerVPREDUCE(SDValue Op,
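
For readers who want the intent of the new `lowerFPVECREDUCE` tail without the SelectionDAG plumbing, the following is a minimal scalar sketch of the same three steps for the FMINIMUM case: a NaN-ignoring min reduction seeded with element 0, a count of elements that compare unequal to themselves, and a select between the reduction result and a NaN. The function name `reduceFMinimumModel` is hypothetical and not part of LLVM, and `std::fmin` only stands in for the NaN-ignoring reduction node (it does not guarantee an ordering of signed zeros, so that aspect is not modelled).

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical scalar model of the lowered sequence; this is not LLVM code.
float reduceFMinimumModel(const std::vector<float> &v) {
  // The lowering seeds the reduction with element 0, so the input must be
  // non-empty.
  assert(!v.empty());

  // Step 1: NaN-ignoring min reduction (stands in for VECREDUCE_FMIN_VL).
  // std::fmin returns the non-NaN operand when exactly one operand is NaN.
  float res = v[0];
  for (float x : v)
    res = std::fmin(res, x);

  // Step 2: count the elements for which x != x holds. Only NaN compares
  // unequal to itself (stands in for SETCC_VL with SETNE plus VCPOP_VL).
  unsigned nanCount = 0;
  for (float x : v)
    if (x != x)
      ++nanCount;

  // Step 3: keep the reduction result only if no element was NaN; otherwise
  // force the output to a NaN (stands in for the final select).
  return nanCount == 0 ? res : std::numeric_limits<float>::quiet_NaN();
}
```

Calling `reduceFMinimumModel({3.0f, 1.0f, 2.0f})` yields the ordinary minimum, while any input containing a NaN yields a NaN regardless of the other elements. The early return on `hasNoNaNs()` in the diff corresponds to skipping steps 2 and 3 entirely.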
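
The review thread's point about NaN payloads can be made concrete at the bit level. The sketch below, for single precision only, builds a quiet NaN with a nonzero payload and shows that replacing it with the canonical NaN (bit pattern `0x7fc00000` on RISC-V) discards that payload. All helper names (`bitsOf`, `floatFromBits`, `makeQuietNaN`, `canonicalizeIfNaN`) are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Bit pattern of the canonical single-precision NaN: quiet bit set, zero payload.
constexpr std::uint32_t kCanonicalNaNBits = 0x7fc00000u;

// Reinterpret a float as its raw bits.
std::uint32_t bitsOf(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Reinterpret raw bits as a float.
float floatFromBits(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Build a quiet NaN carrying the given payload in the 22 bits below the
// quiet bit.
float makeQuietNaN(std::uint32_t payload) {
  assert(payload < (1u << 22));
  return floatFromBits(kCanonicalNaNBits | payload);
}

// Only NaN compares unequal to itself.
bool isNaN(float f) { return f != f; }

// Hypothetical model of the behaviour discussed in the thread: any NaN input
// is replaced by the canonical NaN, so the input's payload is not propagated.
float canonicalizeIfNaN(float f) {
  return isNaN(f) ? floatFromBits(kCanonicalNaNBits) : f;
}
```

For example, `makeQuietNaN(0x12345)` has the bit pattern `0x7fc12345`, whereas `canonicalizeIfNaN` applied to it has the bit pattern `0x7fc00000`; non-NaN values pass through unchanged.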