-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[DAGCombiner][X86] Improve chain handling in fold (fshl ld1, ld0, c) -> (ld0[ofs]) combine. #124871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-llvm-selectiondag Author: Craig Topper (topperc) ChangesHappened to notice some odd things related to chains in this code. I don't have a good test case yet. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. Please let me know if I've gotten any of this wrong. If it looks good I'll work on getting a test. Full diff: https://github.com/llvm/llvm-project/pull/124871.diff 1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index a0c703d2df8a2f..925da2f0ff2732 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -11005,8 +11005,8 @@ SDValue DAGCombiner::visitFunnelShift(SDNode *N) {
auto *RHS = dyn_cast<LoadSDNode>(N1);
if (LHS && RHS && LHS->isSimple() && RHS->isSimple() &&
LHS->getAddressSpace() == RHS->getAddressSpace() &&
- (LHS->hasOneUse() || RHS->hasOneUse()) && ISD::isNON_EXTLoad(RHS) &&
- ISD::isNON_EXTLoad(LHS)) {
+ (LHS->hasNUsesOfValue(1, 0) || RHS->hasNUsesOfValue(1, 0)) &&
+ ISD::isNON_EXTLoad(RHS) && ISD::isNON_EXTLoad(LHS)) {
if (DAG.areNonVolatileConsecutiveLoads(LHS, RHS, BitWidth / 8, 1)) {
SDLoc DL(RHS);
uint64_t PtrOff =
@@ -11024,9 +11024,8 @@ SDValue DAGCombiner::visitFunnelShift(SDNode *N) {
VT, DL, RHS->getChain(), NewPtr,
RHS->getPointerInfo().getWithOffset(PtrOff), NewAlign,
RHS->getMemOperand()->getFlags(), RHS->getAAInfo());
- // Replace the old load's chain with the new load's chain.
- WorklistRemover DeadNodes(*this);
- DAG.ReplaceAllUsesOfValueWith(N1.getValue(1), Load.getValue(1));
+ DAG.makeEquivalentMemoryOrdering(LHS, Load.getValue(1));
+ DAG.makeEquivalentMemoryOrdering(RHS, Load.getValue(1));
return Load;
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most hasOneUse checks on SDNode and not SDValue are probably wrong
…d0[ofs]) combine. Happened to notice some odd things related to chains in this code. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. I'm having trouble getting a test for any ordering issue. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because hasOneUse will return false for both.
5ad7186
to
4ff8dc5
Compare
I haven't been able to get a test that shows a ordering problem. See the description. I was able to test that we didn't do the transform previously if a store occurs after the loads. I'll pre-commit if it looks good. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - cheers. (I think DAGCombiner::CombineConsecutiveLoads has a similar issue?)
This shows missed opportunity to fold (fshl ld1, ld0, c) -> (ld0[ofs]) if the load chain results are used.
…d0[ofs]) combine. (#124871) Happened to notice some odd things related to chains in this code. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. I'm having trouble getting a test for any ordering issue with the current code. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that, I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because SDNode::hasOneUse will return false for both.
Committed as 788bbd2 |
This shows missed opportunity to fold (fshl ld1, ld0, c) -> (ld0[ofs]) if the load chain results are used.
…d0[ofs]) combine. (llvm#124871) Happened to notice some odd things related to chains in this code. The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform. The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later. I'm having trouble getting a test for any ordering issue with the current code. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that, I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because SDNode::hasOneUse will return false for both.
Happened to notice some odd things related to chains in this code.
The code calls hasOneUse on LoadSDNode* which will check users of the data and the chain. I think this was trying to check that the data had one use so one of the loads would definitely be removed by the transform. Load chains don't always have users so our testing may not have noticed that the chains being used would block the transform.
The code makes all users of ld1's chain use the new load's chain, but we don't know that ld1 becomes dead. This can cause incorrect dependencies if ld1's chain is used and it isn't deleted. I think the better thing to do is use makeEquivalentMemoryOrdering to make all users of ld0 and ld1 depend on the new load and the original loads. If the olds loads become dead, their chain will be cleaned up later.
I'm having trouble getting a test for any ordering issue. areNonVolatileConsecutiveLoads requires the two loads to have the same input chain. Given that I don't know how to use one of the load chain results without also using the other. If they are both used we don't do the transform because hasOneUse will return false for both.
Please let me know if I've gotten any of this wrong.