-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[DAG] Matched FixedWidth pattern for ISD::AVGFLOORU #84903
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
f24f251
c2f47d6
298712b
93c9305
194fd28
46c23cb
ed3aae8
766caed
9365317
00986c4
eb994e0
c706403
e7dc0c0
830ac3c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2826,6 +2826,38 @@ SDValue DAGCombiner::visitADDLike(SDNode *N) { | |
return SDValue(); | ||
} | ||
|
||
// Attempt to form ext(avgflooru(A, B)) from add(and(A, B), lshr(xor(A, B), 1)) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ideally the operands of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Using sd_match would handle commutative matching for us There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is the kind of thing I had in mind: if (TLI.isOperationLegal(ISD::AVGFLOORU, VT)) {
SDValue A, B;
if (sd_match(N, m_Add(m_And(m_Value(A), m_Value(B)),
m_Srl(m_Xor(m_Deferred(A), m_Deferred(B)),
m_SpecificInt(1))))) {
return DAG.getNode(ISD::AVGFLOORU, DL, VT, A, B);
}
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. But since const TargetLowering &TLI = DAG.getTargetLoweringInfo();
SDValue N0 = N->getOperand(0);
EVT VT = N0.getValueType();
SDLoc DL(N);
if (TLI.isOperationLegal(ISD::AVGFLOORU, VT)) {
SDValue A, B;
if (sd_match(N, llvm::SDPatternMatch::m_Add(llvm::PatternMatch::m_And(m_Value(A), m_Value(B)),
llvm::SDPatternMatch::m_Srl(llvm::PatternMatch::m_Xor(m_Deferred(A), m_Deferred(B)),
m_SpecificInt(1))))) {
return DAG.getNode(ISD::AVGFLOORU, DL, VT, A, B);
}
} There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have added the code for commutative matching, but not using the method as suggested by @RKSimon. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll get m_And / mXor / m_Or added shortly There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, sure thing. Then we can use the shortened version as part of this PR itself. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Modified the code accordingly. |
||
static SDValue combineFixedwidthToAVG(SDNode *N, SelectionDAG &DAG) { | ||
assert(N->getOpcode() == ISD::ADD && "ADD node is required here"); | ||
SDValue And = N->getOperand(0); | ||
SDValue Lshr = N->getOperand(1); | ||
if (And.getOpcode() != ISD::AND || Lshr.getOpcode() != ISD::SRL) | ||
return SDValue(); | ||
SDValue Xor = Lshr.getOperand(0); | ||
if (Xor.getOpcode() != ISD::XOR) | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
return SDValue(); | ||
SDValue And1 = And.getOperand(0); | ||
SDValue And2 = And.getOperand(1); | ||
SDValue Xor1 = Xor.getOperand(0); | ||
SDValue Xor2 = Xor.getOperand(1); | ||
if (Xor1 != And1 && Xor2 != And2) | ||
return SDValue(); | ||
// Is the right shift using an immediate value of 1? | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
ConstantSDNode *N1C = isConstOrConstSplat(Lshr.getOperand(1)); | ||
if (!N1C || N1C->getAPIntValue() != 1) | ||
return SDValue(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should be able to vastly simplify this matching code by using the sd_match pattern helper - #84759 will introduce this very soon, but if you're willing to try it you can just add the include and using lines from the patch you should be able to use it here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. #84759 has now been committed if you want to try it There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I would much rather do it as a separate issue after implementing the fixedwidthpatterns for both floor and ceil. But if you want it to be part of this PR only, please do let me know. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think using sd_match is the way to go from the start There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sure thing. Please push the code for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. saw the updates, updating it. |
||
EVT VT = And.getValueType(); | ||
if (VT.isVector()) | ||
return SDValue(); | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
SDLoc DL(N); | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
const TargetLowering &TLI = DAG.getTargetLoweringInfo(); | ||
if (!TLI.isOperationLegalOrCustom(ISD::AVGFLOORU, VT)) | ||
return SDValue(); | ||
return DAG.getNode(ISD::AVGFLOORU, DL, VT, | ||
DAG.getExtOrTrunc(false, And1, DL, VT), | ||
DAG.getExtOrTrunc(false, And2, DL, VT)); | ||
} | ||
|
||
SDValue DAGCombiner::visitADD(SDNode *N) { | ||
SDValue N0 = N->getOperand(0); | ||
SDValue N1 = N->getOperand(1); | ||
|
@@ -2841,6 +2873,10 @@ SDValue DAGCombiner::visitADD(SDNode *N) { | |
if (SDValue V = foldAddSubOfSignBit(N, DAG)) | ||
return V; | ||
|
||
// Try to match AVG fixedwidth pattern | ||
if (SDValue V = combineFixedwidthToAVG(N, DAG)) | ||
return V; | ||
|
||
// fold (a+b) -> (a|b) iff a and b share no bits. | ||
if ((!LegalOperations || TLI.isOperationLegal(ISD::OR, VT)) && | ||
DAG.haveNoCommonBitsSet(N0, N1)) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -859,6 +859,21 @@ define <4 x i32> @urhadd_v4i32(<4 x i32> %x) { | |
ret <4 x i32> %r | ||
} | ||
|
||
define <4 x i32> @fixedwidth(<4 x i32> %a0, <4 x i32> %a1) { | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
; CHECK-LABEL: fixedwidth: | ||
; CHECK: // %bb.0: | ||
; CHECK-NEXT: and v2.16b, v0.16b, v1.16b | ||
; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b | ||
; CHECK-NEXT: usra v2.4s, v0.4s, #1 | ||
; CHECK-NEXT: mov v0.16b, v2.16b | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any idea why this is not producing a single uhadd instruction? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just ran the automation script for the tests. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let me rephrase that :) - this test case should be creating an uhadd (which is the aarch64 spelling for AVGFLOORU). Currently it's just producing the same as without the patch. I think it's because of the vector type check. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, indeed but whenever I run the update_llc_test_checks.py on the test case : define <4 x i32> @uhadd_fixedwidth_v4i32(<4 x i32> %a0, <4 x i32> %a1) {
%and = and <4 x i32> %a0, %a1
%xor = xor <4 x i32> %a0, %a1
%srl = lshr <4 x i32> %xor, <i32 1,i32 1,i32 1,i32 1>
%res = add <4 x i32> %and, %srl
ret <4 x i32> %res
} I get the same check assertions for both when I add the change in VT to vector type and when I don't. The only difference is that in the former, the tests pass. Is there something else that needs to be done to generate the correct assertions or is there some change in the original DAGCombiner.cpp required? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah, I wrote the correct check assertions manually now with the original value type and the checks are passing now. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the automation script |
||
; CHECK-NEXT: ret | ||
Sh0g0-1758 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
%and = and <4 x i32> %a0, %a1 | ||
%xor = xor <4 x i32> %a0, %a1 | ||
%srl = lshr <4 x i32> %xor, <i32 1,i32 1,i32 1,i32 1> | ||
%res = add <4 x i32> %and, %srl | ||
ret <4 x i32> %res | ||
} | ||
|
||
declare <8 x i8> @llvm.aarch64.neon.shadd.v8i8(<8 x i8>, <8 x i8>) | ||
declare <4 x i16> @llvm.aarch64.neon.shadd.v4i16(<4 x i16>, <4 x i16>) | ||
declare <2 x i32> @llvm.aarch64.neon.shadd.v2i32(<2 x i32>, <2 x i32>) | ||
|
Uh oh!
There was an error while loading. Please reload this page.