-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[SDAG] Adding scalarization of llvm.vector.insert
#71614
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -675,6 +675,9 @@ bool DAGTypeLegalizer::ScalarizeVectorOperand(SDNode *N, unsigned OpNo) { | |
case ISD::BITCAST: | ||
Res = ScalarizeVecOp_BITCAST(N); | ||
break; | ||
case ISD::INSERT_SUBVECTOR: | ||
Res = ScalarizeVecOp_INSERT_SUBVECTOR(N, OpNo); | ||
break; | ||
case ISD::ANY_EXTEND: | ||
case ISD::ZERO_EXTEND: | ||
case ISD::SIGN_EXTEND: | ||
|
@@ -766,6 +769,22 @@ SDValue DAGTypeLegalizer::ScalarizeVecOp_BITCAST(SDNode *N) { | |
N->getValueType(0), Elt); | ||
} | ||
|
||
/// If the value to subvector is a vector that needs to be scalarized, it must | ||
/// be <1 x ty>. Return the element instead. | ||
SDValue DAGTypeLegalizer::ScalarizeVecOp_INSERT_SUBVECTOR(SDNode *N, | ||
unsigned OpNo) { | ||
// If the destination vector is unary, we can just return the source vector | ||
auto Src = GetScalarizedVector(N->getOperand(1)); | ||
if (OpNo == 0) { | ||
return Src; | ||
} | ||
|
||
auto Dest = N->getOperand(0); | ||
auto Idx = N->getOperand(2); | ||
return DAG.getNode(ISD::INSERT_VECTOR_ELT, SDLoc(N), N->getValueType(0), Dest, | ||
Src, Idx); | ||
} | ||
|
||
/// If the input is a vector that needs to be scalarized, it must be <1 x ty>. | ||
/// Do the operation on the element instead. | ||
SDValue DAGTypeLegalizer::ScalarizeVecOp_UnaryOp(SDNode *N) { | ||
|
@@ -5891,8 +5910,11 @@ SDValue DAGTypeLegalizer::WidenVecRes_SETCC(SDNode *N) { | |
InOp1 = GetWidenedVector(InOp1); | ||
InOp2 = GetWidenedVector(InOp2); | ||
} else { | ||
InOp1 = DAG.WidenVector(InOp1, SDLoc(N)); | ||
InOp2 = DAG.WidenVector(InOp2, SDLoc(N)); | ||
do { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Type legalization doesn't operate incrementally in individual steps. Newly legalized operations are supposed to be re-legalized later if necessary There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For having looked at this with Rob, the problem we are facing here is:
There may be a better way to fix that but the newly legalized operations won't get a chance to be re-legalized. Open to suggestions, admittedly I went with a quick fix here. FWIW, GISel doesn't have this issue. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The widening is only really intended to apply to integer types. A There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think the correct way to handle this is by inserting an explicit conversion instruction from the widened v2f16 to match v4f16. i.e. create a new INSERT_SUBVECTOR v4f16:undef, (v2f16 GetWidenedVector), 0 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes, but that's not what the AArch64 lowering wants (see https://github.com/llvm/llvm-project/blob/f0cdf4b468f6ee48b0d0d51ce78145455e2f07a6/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L24550). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let me re-phrase, we could change the code to explicitly create an insert_subvector here, but I don't see how this is better :). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, The point of that code is to act upon integer types, to keep them in vector registers as opposed to scalarizing them to gprs. A I agree that we shouldn't be changing the way that integer types are handled in general, but if there was a way to specify that a setcc with fp operands should really be treated like fp types, that would avoid a lot of the awkwardness here where we attempt to widen it only to scalarize again. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I agree, but could you refresh my memory: is it something we have a control on? IIRC, the selection of a strategy for type legalization is agnostic of the particular instruction being legalized and happens before the custom lowering, but it's been a while since I touched SDISel. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thats always the question, and Im not sure there is a way to specify the types for operations separately. Looking through the example though, it appears that the I've put up #72048 as a possible fix. It just tries to prevent the generation of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Works for me. |
||
InOp1 = DAG.WidenVector(InOp1, SDLoc(N)); | ||
InOp2 = DAG.WidenVector(InOp2, SDLoc(N)); | ||
} while (ElementCount::isKnownLT( | ||
InOp1.getValueType().getVectorElementCount(), WidenEC)); | ||
} | ||
|
||
// Assume that the input and output will be widen appropriately. If not, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -67,3 +67,12 @@ if.then: | |
if.end: | ||
ret i32 1; | ||
} | ||
|
||
define dso_local <1 x half> @cmp_select(<1 x half> %i105, <1 x half> %in) { | ||
; CHECK-LABEL: @cmp_select | ||
; CHECL: fcmge | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CHECL->CHECK |
||
newFuncRoot: | ||
%i179 = fcmp uno <1 x half> %i105, zeroinitializer | ||
%i180 = select <1 x i1> %i179, <1 x half> %in, <1 x half> %i105 | ||
ret <1 x half> %i180 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this ever be called with OpNo==0? Wouldn't ScalarVecRes_INSERT_SUBVECTOR be called first? The first operand the result must have the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not certain, it would depend where the type propagation occurs. My assumption is that this could iteratively come from a previously operation now scalarized without the result type necessarily being updated (which is only an asumption). Do you have a suggestion for how to verify this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Result types are always legalized first and need to update the first operand along with them. You could try just asserting OpNo == 1