Skip to content

[DAG] Matched FixedWidth pattern for ISD::AVGFLOORU #84903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Mar 19, 2024
21 changes: 21 additions & 0 deletions llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2820,6 +2820,23 @@ SDValue DAGCombiner::visitADDLike(SDNode *N) {
return SDValue();
}

// Attempt to form avgflooru(A, B) from (A & B) + ((A ^ B) >> 1)
static SDValue combineFixedwidthToAVGFLOORU(SDNode *N, SelectionDAG &DAG) {
const TargetLowering &TLI = DAG.getTargetLoweringInfo();
SDValue N0 = N->getOperand(0);
EVT VT = N0.getValueType();
SDLoc DL(N);
if (TLI.isOperationLegal(ISD::AVGFLOORU, VT)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the operands of the Add would be treated commutatively - either could be the and and either could be the srl. Same for the and/xor args below.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using sd_match would handle commutative matching for us

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the kind of thing I had in mind:

  if (TLI.isOperationLegal(ISD::AVGFLOORU, VT)) {
    SDValue A, B;
    if (sd_match(N, m_Add(m_And(m_Value(A), m_Value(B)),
                          m_Srl(m_Xor(m_Deferred(A), m_Deferred(B)),
                                m_SpecificInt(1))))) {
      return DAG.getNode(ISD::AVGFLOORU, DL, VT, A, B);
    }
  }

Copy link
Member Author

@Sh0g0-1758 Sh0g0-1758 Mar 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But since m_And and m_Xor are not available in SDPatternMatch, we will have to use llvm::PatternMatch for them. But I don't think if this will work. This is what I get finally and it crashes the build process.

  const TargetLowering &TLI = DAG.getTargetLoweringInfo();
  SDValue N0 = N->getOperand(0);
  EVT VT = N0.getValueType();
  SDLoc DL(N);
  if (TLI.isOperationLegal(ISD::AVGFLOORU, VT)) {
    SDValue A, B;
    if (sd_match(N, llvm::SDPatternMatch::m_Add(llvm::PatternMatch::m_And(m_Value(A), m_Value(B)),
                          llvm::SDPatternMatch::m_Srl(llvm::PatternMatch::m_Xor(m_Deferred(A), m_Deferred(B)),
                                m_SpecificInt(1))))) {
      return DAG.getNode(ISD::AVGFLOORU, DL, VT, A, B);
    }
  }

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the code for commutative matching, but not using the method as suggested by @RKSimon.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll get m_And / mXor / m_Or added shortly

Copy link
Member Author

@Sh0g0-1758 Sh0g0-1758 Mar 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, sure thing. Then we can use the shortened version as part of this PR itself.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified the code accordingly.

SDValue A, B;
if (sd_match(N, m_Add(m_And(m_Value(A), m_Value(B)),
m_Srl(m_Xor(m_Deferred(A), m_Deferred(B)),
m_SpecificInt(1))))) {
return DAG.getNode(ISD::AVGFLOORU, DL, VT, A, B);
}
}
return SDValue();
}

SDValue DAGCombiner::visitADD(SDNode *N) {
SDValue N0 = N->getOperand(0);
SDValue N1 = N->getOperand(1);
Expand All @@ -2835,6 +2852,10 @@ SDValue DAGCombiner::visitADD(SDNode *N) {
if (SDValue V = foldAddSubOfSignBit(N, DAG))
return V;

// Try to match AVGFLOORU fixedwidth pattern
if (SDValue V = combineFixedwidthToAVGFLOORU(N, DAG))
return V;

// fold (a+b) -> (a|b) iff a and b share no bits.
if ((!LegalOperations || TLI.isOperationLegal(ISD::OR, VT)) &&
DAG.haveNoCommonBitsSet(N0, N1))
Expand Down
12 changes: 12 additions & 0 deletions llvm/test/CodeGen/AArch64/hadd-combine.ll
Original file line number Diff line number Diff line change
Expand Up @@ -859,6 +859,18 @@ define <4 x i32> @urhadd_v4i32(<4 x i32> %x) {
ret <4 x i32> %r
}

define <8 x i16> @uhadd_fixedwidth_v4i32(<8 x i16> %a0, <8 x i16> %a1) {
; CHECK-LABEL: uhadd_fixedwidth_v4i32:
; CHECK: // %bb.0:
; CHECK-NEXT: uhadd v0.8h, v0.8h, v1.8h
; CHECK-NEXT: ret
%and = and <8 x i16> %a0, %a1
%xor = xor <8 x i16> %a0, %a1
%srl = lshr <8 x i16> %xor, <i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1, i16 1>
%res = add <8 x i16> %and, %srl
ret <8 x i16> %res
}

declare <8 x i8> @llvm.aarch64.neon.shadd.v8i8(<8 x i8>, <8 x i8>)
declare <4 x i16> @llvm.aarch64.neon.shadd.v4i16(<4 x i16>, <4 x i16>)
declare <2 x i32> @llvm.aarch64.neon.shadd.v2i32(<2 x i32>, <2 x i32>)
Expand Down