[AMDGPU] Implement hasAndNot for scalar bitwise AND-NOT operations. #112647
base: main
Changes from all commits
a29b0b8
e0ddab1
7c900c2
563de33
b06240e
244612d
70d8ac0
ee5ca4e
2ec01c6
28ea084
dba6155
9990cfb
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17577,3 +17577,13 @@ SITargetLowering::lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const { | |
AI->eraseFromParent(); | ||
return LI; | ||
} | ||
|
||
bool SITargetLowering::hasAndNot(SDValue Op) const { | ||
// AND-NOT is only valid on uniform (SGPR) values; divergent values live in | ||
// VGPRs. | ||
if (Op->isDivergent()) | ||
return false; | ||
|
||
Comment why this is the set of cases |
Done. |
Shouldn't really need to consider the machine opcode case |
Sorry for the late update on this PR. Last month, I was still thinking about this patch and forgot to push it to the origin branch. I'm still considering this issue, because some lit tests show an increase in the number of instructions, while others show a decrease. So, I'm not yet sure whether it impacts performance. |
Most of the work of this change is avoiding the regressions |
Thanks! Do you have any further suggestions? Also, do you think it's ready to be merged now? :-) |
||
EVT VT = Op.getValueType(); | ||
return VT == MVT::i32 || VT == MVT::i64; | ||
I'm not sure we need to check types here. How about just |
If it's a different type and then is legalized, there will be intermediate instructions that break the and not pattern |
||
} | ||
harrisonGPU marked this conversation as resolved.
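
For readers unfamiliar with the hook, the following is a minimal standalone sketch of what a scalar AND-NOT computes and of one rewrite that such an instruction makes profitable. It is not code from this patch: the helper names `andNot`, `containsMask`, and `containsMaskAndNot` are hypothetical, and plain `uint32_t` values stand in for uniform i32 operands. On AMDGPU the corresponding scalar instruction would be something like `s_andn2_b32`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: AND-NOT keeps the bits of `a` that are clear in `b`.
constexpr uint32_t andNot(uint32_t a, uint32_t b) { return a & ~b; }

// "Every bit of `mask` is set in `x`", written as a plain AND plus compare.
constexpr bool containsMask(uint32_t x, uint32_t mask) {
  return (x & mask) == mask;
}

// The same predicate expressed with AND-NOT: no bit of `mask` is missing
// from `x`. A target with a native AND-NOT can compare the result to zero.
constexpr bool containsMaskAndNot(uint32_t x, uint32_t mask) {
  return andNot(mask, x) == 0;
}
```

The two predicates agree for every input, which is why a combiner can pick whichever form is cheaper once the target reports that AND-NOT is available for the value in question.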