[NVPTX] Add support for atomic add for f16 type #84295
atom.add.noftz.f16 is supported since SM 7.0
LGTM in general, modulo the missing constraint on the PTX version.
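To illustrate the kind of constraint meant here: the instruction needs both a new enough GPU (sm_70) and a new enough PTX ISA (6.3, which matches the `+ptx63` in the test's RUN line). A minimal sketch of such a combined check, assuming a hypothetical helper named `hasAtomAddF16` that takes the two version numbers directly (this is not the actual NVPTX backend API):

```cpp
// Illustrative only: hasAtomAddF16 and its parameters are hypothetical names,
// not taken from the NVPTX backend.
// SmVersion:  GPU architecture, e.g. 70 for sm_70.
// PTXVersion: PTX ISA version, e.g. 63 for PTX 6.3.
constexpr bool hasAtomAddF16(unsigned SmVersion, unsigned PTXVersion) {
  // Both conditions must hold; checking only the SM version would let the
  // instruction be selected for a PTX version that does not define it.
  return SmVersion >= 70 && PTXVersion >= 63;
}

// Compile-time sanity checks for the sketch.
static_assert(hasAtomAddF16(70, 63), "sm_70 with PTX 6.3 should qualify");
static_assert(!hasAtomAddF16(70, 60), "an older PTX version should not qualify");
```

In the backend itself the equivalent condition would presumably be expressed as a predicate on the instruction definition rather than as a free function.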
@@ -0,0 +1,28 @@
; RUN: llc < %s -march=nvptx -mcpu=sm_70 -mattr=+ptx63 | FileCheck %s
For this test it might be convenient to autogenerate the checks with llvm/utils/update_test_checks.py.
Thanks for the suggestion. Done.
The ordering of the ld.param.* instructions relative to the atom.* instructions isn't relevant to this test, correct? If so, we may not want to include those CHECK-NEXT: lines in the test.
Reverts #84295 due to breakages.
I now have a reproducer for what caused the revert. If I adjust the test slightly and change the line with %r1:
The codegen for this becomes:
And this fails verification with this error:
I don't know PTX well enough to tell what is wrong with that. I will try to figure it out.
OK, it seems that PTX distinguishes between floating-point constants and integer constants, and we are generating an integer constant, I guess because we use Int16Register. @Artem-B, if you can give me a pointer to where I can make sure that we use a floating-point constant here, that would be great :)
OK, I found something in NVPTXISelDAGToDAG.cpp: it seems there is no hex representation for f16 constants, and the constant is replaced by a load from an f16 register. Is this the right place to look further?
Huh. It appears that the instruction does not accept immediate arguments for the f16 variants, though it does accept them for f32. This looks like a bug in ptxas. Normally, f16 instruction variants accept plain hex immediate values. Looks like we'll need to disable the instruction variant with an immediate argument and force passing the value via a register.
Tried to disable the instruction variant with an immediate argument, and that seems to work: