Skip to content

[RISCV] Fix mgather -> riscv.masked.strided.load combine not extending indices #82506

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 22, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 12 additions & 8 deletions llvm/lib/Target/RISCV/RISCVISelLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3240,15 +3240,16 @@ static std::optional<uint64_t> getExactInteger(const APFloat &APF,
// Note that this method will also match potentially unappealing index
// sequences, like <i32 0, i32 50939494>, however it is left to the caller to
// determine whether this is worth generating code for.
static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op,
unsigned EltSizeInBits) {
unsigned NumElts = Op.getNumOperands();
assert(Op.getOpcode() == ISD::BUILD_VECTOR && "Unexpected BUILD_VECTOR");
bool IsInteger = Op.getValueType().isInteger();

std::optional<unsigned> SeqStepDenom;
std::optional<int64_t> SeqStepNum, SeqAddend;
std::optional<std::pair<uint64_t, unsigned>> PrevElt;
unsigned EltSizeInBits = Op.getValueType().getScalarSizeInBits();
assert(EltSizeInBits >= Op.getValueType().getScalarSizeInBits());
for (unsigned Idx = 0; Idx < NumElts; Idx++) {
// Assume undef elements match the sequence; we just have to be careful
// when interpolating across them.
Expand All @@ -3261,14 +3262,14 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
if (!isa<ConstantSDNode>(Op.getOperand(Idx)))
return std::nullopt;
Val = Op.getConstantOperandVal(Idx) &
maskTrailingOnes<uint64_t>(EltSizeInBits);
maskTrailingOnes<uint64_t>(Op.getScalarValueSizeInBits());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make Op.getScalarValueSizeInBits() a variable? It has a lot of usages.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can also reduce the number of usages by only creating the list of integers once. I'll try and create a PR for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've opened #82590, hopefully it removes some duplication

} else {
// The BUILD_VECTOR must be all constants.
if (!isa<ConstantFPSDNode>(Op.getOperand(Idx)))
return std::nullopt;
if (auto ExactInteger = getExactInteger(
cast<ConstantFPSDNode>(Op.getOperand(Idx))->getValueAPF(),
EltSizeInBits))
Op.getScalarValueSizeInBits()))
Val = *ExactInteger;
else
return std::nullopt;
Expand Down Expand Up @@ -3324,11 +3325,11 @@ static std::optional<VIDSequence> isSimpleVIDSequence(SDValue Op) {
uint64_t Val;
if (IsInteger) {
Val = Op.getConstantOperandVal(Idx) &
maskTrailingOnes<uint64_t>(EltSizeInBits);
maskTrailingOnes<uint64_t>(Op.getScalarValueSizeInBits());
} else {
Val = *getExactInteger(
cast<ConstantFPSDNode>(Op.getOperand(Idx))->getValueAPF(),
EltSizeInBits);
Op.getScalarValueSizeInBits());
}
uint64_t ExpectedVal =
(int64_t)(Idx * (uint64_t)*SeqStepNum) / *SeqStepDenom;
Expand Down Expand Up @@ -3598,7 +3599,7 @@ static SDValue lowerBuildVectorOfConstants(SDValue Op, SelectionDAG &DAG,
// Try and match index sequences, which we can lower to the vid instruction
// with optional modifications. An all-undef vector is matched by
// getSplatValue, above.
if (auto SimpleVID = isSimpleVIDSequence(Op)) {
if (auto SimpleVID = isSimpleVIDSequence(Op, Op.getScalarValueSizeInBits())) {
int64_t StepNumerator = SimpleVID->StepNumerator;
unsigned StepDenominator = SimpleVID->StepDenominator;
int64_t Addend = SimpleVID->Addend;
Expand Down Expand Up @@ -15978,7 +15979,10 @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,

if (Index.getOpcode() == ISD::BUILD_VECTOR &&
MGN->getExtensionType() == ISD::NON_EXTLOAD && isTypeLegal(VT)) {
if (std::optional<VIDSequence> SimpleVID = isSimpleVIDSequence(Index);
// The sequence will be XLenVT, not the type of Index. Tell
// isSimpleVIDSequence this so we avoid overflow.
if (std::optional<VIDSequence> SimpleVID =
isSimpleVIDSequence(Index, Subtarget.getXLen());
SimpleVID && SimpleVID->StepDenominator == 1) {
const int64_t StepNumerator = SimpleVID->StepNumerator;
const int64_t Addend = SimpleVID->Addend;
Expand Down
12 changes: 4 additions & 8 deletions llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
Original file line number Diff line number Diff line change
Expand Up @@ -15086,23 +15086,19 @@ define <32 x i64> @mgather_strided_split(ptr %base) {
ret <32 x i64> %x
}

; FIXME: This is a miscompile triggered by the mgather ->
; riscv.masked.strided.load combine. In order for it to trigger we need either a
; strided gather that RISCVGatherScatterLowering doesn't pick up, or a new
; strided gather generated by the widening sew combine.
define <4 x i32> @masked_gather_widen_sew_negative_stride(ptr %base) {
; RV32V-LABEL: masked_gather_widen_sew_negative_stride:
; RV32V: # %bb.0:
; RV32V-NEXT: addi a0, a0, -120
; RV32V-NEXT: li a1, 120
; RV32V-NEXT: addi a0, a0, 136
; RV32V-NEXT: li a1, -136
; RV32V-NEXT: vsetivli zero, 2, e64, m1, ta, ma
; RV32V-NEXT: vlse64.v v8, (a0), a1
; RV32V-NEXT: ret
;
; RV64V-LABEL: masked_gather_widen_sew_negative_stride:
; RV64V: # %bb.0:
; RV64V-NEXT: addi a0, a0, -120
; RV64V-NEXT: li a1, 120
; RV64V-NEXT: addi a0, a0, 136
; RV64V-NEXT: li a1, -136
; RV64V-NEXT: vsetivli zero, 2, e64, m1, ta, ma
; RV64V-NEXT: vlse64.v v8, (a0), a1
; RV64V-NEXT: ret
Expand Down