Skip to content

[libclc] Fix int<->float conversion builtins #126905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

frasercrmck
Copy link
Contributor

@frasercrmck frasercrmck commented Feb 12, 2025

While working on moving the conversion builtins to the CLC library in 25c0554 it was discovered that many weren't passing the OpenCL-CTS tests.

As it happens, the clspv-specific code for conversion implementations between integer and floating-point types was more correct. However:

  • The clspv code was generating 'sat' conversions to floating-point types, which are not legal
  • The clspv code around rtn/rtz conversions needed tweaking as it wasn't validating when sizeof(dst) > sizeof(src), e.g., int -> double.

With this commit, the CTS failures seen before have been resolved.

This also assumes that the new implementations are correct also for clspv. If this is the case, then 'clc' and 'clspv' modes are mutually exclusive and we can simplify the build process for conversions by not building clc-clspv-convert.cl.

While working on moving the conversion builtins to the CLC library in
25c0554 it was discovered that many weren't passing the OpenCL-CTS
tests.

As it happens, the clspv-specific code for conversion implementations
between integer and floating-point types was more correct. However:

* The clspv code was generating 'sat' conversions to floating-point
  types, which are not legal
* The clspv code around rtn/rtz conversions needed tweaking as it wasn't
  validating when sizeof(dst) > sizeof(src), e.g., int -> double.

With this commit, the CTS failures seen before have been resolved.

This also assumes that the new implementations are correct also for
clspv. If this is the case, then 'clc' and 'clspv' modes are mutually
exclusive and we can simplify the build process for conversions by not
building clc-clspv-convert.cl.

Note that there are still outstanding failures in 'rtz' and 'rtp'
rounding modes between float->half, double->half, and double->float
(i.e., floating-point types when sizeof(src) > sizeof(dst)). These will
be looked at separately.
@frasercrmck frasercrmck added the libclc libclc OpenCL library label Feb 12, 2025
@frasercrmck
Copy link
Contributor Author

frasercrmck commented Feb 12, 2025

CC @rjodinchr. I remember you saying you're OOO so no pressure, but perhaps you could verify this commit with clspv?

For clarity, the remaining failures I see are in half_rtp_float half_rtz_float half_rtp_double half_rtz_double float_rtp_double float_rtz_double (non-wimpy), e.g.,

Error for vector size 1 found at 0x00000001:  *-0x0p+0 vs 0x0p+0
Input value: -0x1.000002p-25  (convert_half_rtp( float ))
         *** convert_halfn_rtp( floatn ) FAILED **
Building convert_half_rtz( float ) test
Testing... ......
Error for vector size 1 found at 0x00000001:  *-0x0p+0 vs 0x0p+0
Input value: -0x1.000002p-25  (convert_half_rtz( float ))
         *** convert_halfn_rtz( floatn ) FAILED **

Looks like a problem with rounding to near zero to me. Does anyone else see this?

@frasercrmck frasercrmck requested a review from arsenm February 12, 2025 13:10
@arsenm
Copy link
Contributor

arsenm commented Feb 13, 2025

Do we really need to generate source for this? It's harder to reason about. It looks like the sign bit of the input isn't preserved on underflow to 0

@frasercrmck
Copy link
Contributor Author

Do we really need to generate source for this? It's harder to reason about. It looks like the sign bit of the input isn't preserved on underflow to 0

Perhaps we could simply check in the generated source? It shouldn't change all that often. We might want to split it up into separate files as it's massive.

This sort of thing should really be in an issue ticket, but I had a brief look at the remaining failures yesterday. For some context:

_CLC_DEF _CLC_OVERLOAD half __clc_convert_half_rtp(float x) {
  half r = __clc_convert_half(x);
  float y = __clc_convert_float(r);
  half sel = __clc_select(r, __clc_nextafter(r, (half)INFINITY), __clc_convert_short(y < x));
  return sel;
}

At least it's nice and small.

Since the first two conversions are just fptrunc (0xBE60000020000000 > 0xH8001) and fpext (0xH8001 -> 0xBE70000000000000), and y < x should be true, it seems we expect nextafter to take us from from -0x1p-24 (0xH8001) to -0x0p+0 (0x8000). This makes sense to me, but it takes us to 0xH0000 instead.

I do pass the OpenCL-CTS for nextafter in the same (admittedly janky) configuration so I may have crossed some wires somewhere.

Anyway, I ran the full OpenCL-CTS with this PR overnight and the only remaining failures are those six I listed above.

@frasercrmck
Copy link
Contributor Author

See #127469 for the fix for the other conversion builtins.

@frasercrmck frasercrmck merged commit 6c2d418 into llvm:main Feb 24, 2025
10 checks passed
@frasercrmck frasercrmck deleted the libclc-clc-fix-int-fp-conversions branch February 24, 2025 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libclc libclc OpenCL library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants