Skip to content

[NVPTX] Add support for atomic add for bf16 type #89586

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 8, 2024
Merged

Conversation

akuegel
Copy link
Member

@akuegel akuegel commented Apr 22, 2024

atom.add.noftz.bf16 is supported since SM 9.0 and PTX 7.8

atom.add.noftz.bf16 is supported since SM 9.0 and PTX 7.8
@akuegel akuegel requested a review from Artem-B April 22, 2024 10:21
Copy link
Member

@Artem-B Artem-B left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Do we need to worry about denormals? I.e. we only have noftz variant of the atomic, which potentially will give us different results that a regular addition would, depending on denormals handling mode. Do we need to add an extra op that folds denormals to zero when normal FP ops would be expected to do so?

I'm way out of my depth on FP handling nuances here. We need to ask an FP expert.

@akuegel akuegel requested a review from ThomasRaoux April 23, 2024 07:31
@akuegel
Copy link
Member Author

akuegel commented May 7, 2024

@Artem-B Do you know someone we could ask who is a floating point expert?
Note that in my previous change where I added support for f16 type, I also didn't handle the case with ftz mode. So do I need to revert that previous patch?

@Artem-B
Copy link
Member

Artem-B commented May 7, 2024

Do you know someone we could ask who is a floating point expert?

@majnemer Do you have an opinion on atomics and denormals handling for bf16/f16? Who would be the right folks to consult on that?

@majnemer
Copy link
Contributor

majnemer commented May 7, 2024

I think it is OK for us not to do anything special for noftz.

It is a bit tricky to fix it post-hoc because of cases like:

  x = 0
  x += 0x1.02p-126
  x += -0x1p-126

Here, none of the inputs were subnormal but x ends up as 0x1p-133 which is subnormal.

Also, LLVM semantics allow us to return subnormal results instead of flushing.

LGTM.

@akuegel akuegel merged commit bafbe39 into llvm:main May 8, 2024
@akuegel akuegel deleted the atomics branch May 8, 2024 08:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants