Skip to content

libclc: increase fp16 support #98149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 18, 2024
Merged

Conversation

rjodinchr
Copy link
Contributor

Increase fp16 support to allow clspv to continue to be OpenCL compliant following the update of the OpenCL-CTS adding more testing on math functions and conversions with half.

Math functions are implemented by upscaling to fp32 and using the fp32 implementation. It garantees the accuracy required for half-precision float-point by the CTS.

@rjodinchr
Copy link
Contributor Author

Those implementations have been tested using 1 Intel and 1 AMD devices (with clvk/clspv)

@rjodinchr
Copy link
Contributor Author

@alan-baker @kpet , could you please review?

Copy link

github-actions bot commented Jul 9, 2024

✅ With the latest revision this PR passed the Python code formatter.

Copy link

github-actions bot commented Jul 9, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@rjodinchr rjodinchr force-pushed the pr/libclc-fp16 branch 2 times, most recently from c03c6e3 to a387a97 Compare July 9, 2024 12:53
Copy link
Contributor

@alan-baker alan-baker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me.

@rjodinchr
Copy link
Contributor Author

@frasercrmck Could you review it please?

@frasercrmck
Copy link
Contributor

@frasercrmck Could you review it please?

Sure, will do, thanks for this. We've done something similar downstream so I can hopefully verify against what we've done.

Copy link
Contributor

@frasercrmck frasercrmck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

Increase fp16 support to allow clspv to continue to be OpenCL
compliant following the update of the OpenCL-CTS adding more testing
on math functions and conversions with half.

Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It garantees the accuracy required for half-precision
float-point by the CTS.
@rjodinchr
Copy link
Contributor Author

Also, just leave that thought here: A next step might be to use native functions instead (when available). They should provide enough precision for half, but be more performance. Also as the accuracy is undefined for those functions, maybe it should be an opt-in option in the libclc (clspv would enable it for sure).

@rjodinchr
Copy link
Contributor Author

@frasercrmck could you submit that PR? I cannot do it.

@frasercrmck
Copy link
Contributor

Also, just leave that thought here: A next step might be to use native functions instead (when available). They should provide enough precision for half, but be more performance. Also as the accuracy is undefined for those functions, maybe it should be an opt-in option in the libclc (clspv would enable it for sure).

Yeah, it would definitely need to be opt-in. Ideally we'd also implement the half builtins properly so they're not going through float and are as fast as they can be, but then also allow native functions if users so wish. But that involves maths, haha.

@frasercrmck
Copy link
Contributor

@rjodinchr I was about to merge and noticed your email is hidden. I just wanted to check with you whether you want to make it public, as per the discussion https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it/74223/67. AFAICT there's still no official policy, but thought I'd check.

@rjodinchr
Copy link
Contributor Author

@rjodinchr I was about to merge and noticed your email is hidden. I just wanted to check with you whether you want to make it public, as per the discussion https://discourse.llvm.org/t/hidden-emails-on-github-should-we-do-something-about-it/74223/67. AFAICT there's still no official policy, but thought I'd check.

I was not aware my email was private. It should be public now. Thanks

@frasercrmck frasercrmck merged commit 7e6a739 into llvm:main Jul 18, 2024
7 checks passed
@rjodinchr rjodinchr deleted the pr/libclc-fp16 branch July 18, 2024 11:01
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jul 18, 2024

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building libclc at step 6 "build-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/3385

Here is the relevant piece of the build log for the reference:

Step 6 (build-unified-tree) failure: build (failure)
...
17.724 [2782/58/268] Generating obj.libclc.dir/tahiti-amdgcn-mesa-mesa3d/generic/lib/relational/signbit.cl.bc
17.726 [2781/58/269] Generating obj.libclc.dir/nvptx--/generic/lib/math/asinpi.cl.bc
17.727 [2780/58/270] Generating obj.libclc.dir/nvptx--/generic/lib/math/acospi.cl.bc
17.728 [2779/58/271] Linking CXX shared module tools/mlir/python_packages/mlir_core/mlir/_mlir_libs/_mlirDialectsSparseTensor.cpython-310-x86_64-linux-gnu.so
17.728 [2778/58/272] Linking CXX shared module tools/mlir/python_packages/mlir_core/mlir/_mlir_libs/_mlirExecutionEngine.cpython-310-x86_64-linux-gnu.so
17.731 [2777/58/273] Generating obj.libclc.dir/amdgcn--amdhsa/generic/lib/cl_khr_int64_base_atomics/atom_sub.cl.bc
17.744 [2776/58/274] Generating obj.libclc.dir/clspv--/generic/lib/math/clc_remainder.cl.bc
17.753 [2775/58/275] Generating obj.libclc.dir/clspv--/generic/lib/math/ldexp.cl.bc
17.754 [2774/58/276] Generating obj.libclc.dir/tahiti-amdgcn-mesa-mesa3d/generic/lib/shared/max.cl.bc
17.856 [2773/58/277] Generating obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc
FAILED: tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc 
cd /build/buildbot/premerge-monolithic-linux/build/tools/libclc && /build/buildbot/premerge-monolithic-linux/build/bin/clang-19 -target r600-- -c -mcpu=barts -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_R600 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_base_atomics -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl && /etc/cmake/bin/cmake -E cmake_transform_depfile Ninja gccdepfile /build/buildbot/premerge-monolithic-linux/llvm-project/llvm /build/buildbot/premerge-monolithic-linux/llvm-project/libclc /build/buildbot/premerge-monolithic-linux/build /build/buildbot/premerge-monolithic-linux/build/tools/libclc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl.bc.d /build/buildbot/premerge-monolithic-linux/build/CMakeFiles/d/54ef415d6b9c7c45422c2441355b19af2ed1958f751add27287a5a7a084406ed.d
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_base_atomics/atom_inc.cl:1:
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/clc.h:22:
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:73:1: error: expected function body after function declarator
   73 | _CLC_VECTOR_CONVERT_TO_SUFFIX(_rtn)
      | ^
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:71:3: note: expanded from macro '_CLC_VECTOR_CONVERT_TO_SUFFIX'
   71 |   _CLC_VECTOR_CONVERT_TO(ROUND)
      |   ^
1 error generated.
17.856 [2773/57/278] Generating obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc
FAILED: tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc 
cd /build/buildbot/premerge-monolithic-linux/build/tools/libclc && /build/buildbot/premerge-monolithic-linux/build/bin/clang-19 -target r600-- -c -mcpu=barts -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_R600 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl && /etc/cmake/bin/cmake -E cmake_transform_depfile Ninja gccdepfile /build/buildbot/premerge-monolithic-linux/llvm-project/llvm /build/buildbot/premerge-monolithic-linux/llvm-project/libclc /build/buildbot/premerge-monolithic-linux/build /build/buildbot/premerge-monolithic-linux/build/tools/libclc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl.bc.d /build/buildbot/premerge-monolithic-linux/build/CMakeFiles/d/4916e754d4323cb996fa367f8ae31fa5d58f414c0e6e482bbe15a131d1cb4920.d
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics/atom_and.cl:3:
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics/../atom_int32_binary.inc:1:
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/clc.h:22:
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:73:1: error: expected function body after function declarator
   73 | _CLC_VECTOR_CONVERT_TO_SUFFIX(_rtn)
      | ^
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:71:3: note: expanded from macro '_CLC_VECTOR_CONVERT_TO_SUFFIX'
   71 |   _CLC_VECTOR_CONVERT_TO(ROUND)
      |   ^
1 error generated.
17.874 [2773/56/279] Generating obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc
FAILED: tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc 
cd /build/buildbot/premerge-monolithic-linux/build/tools/libclc && /build/buildbot/premerge-monolithic-linux/build/bin/clang-19 -target r600-- -c -mcpu=barts -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_R600 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_int64_base_atomics -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_int64_base_atomics/atom_add.cl && /etc/cmake/bin/cmake -E cmake_transform_depfile Ninja gccdepfile /build/buildbot/premerge-monolithic-linux/llvm-project/llvm /build/buildbot/premerge-monolithic-linux/llvm-project/libclc /build/buildbot/premerge-monolithic-linux/build /build/buildbot/premerge-monolithic-linux/build/tools/libclc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_int64_base_atomics/atom_add.cl.bc.d /build/buildbot/premerge-monolithic-linux/build/CMakeFiles/d/a38604cc3e9218cb5ee9648f70d45dfc701f9c9ca53b6a98f44565b6f90607d0.d
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_int64_base_atomics/atom_add.cl:1:
In file included from /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/clc.h:22:
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:73:1: error: expected function body after function declarator
   73 | _CLC_VECTOR_CONVERT_TO_SUFFIX(_rtn)
      | ^
/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include/clc/convert.h:71:3: note: expanded from macro '_CLC_VECTOR_CONVERT_TO_SUFFIX'
   71 |   _CLC_VECTOR_CONVERT_TO(ROUND)
      |   ^
1 error generated.
17.875 [2773/55/280] Generating obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc
FAILED: tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc 
cd /build/buildbot/premerge-monolithic-linux/build/tools/libclc && /build/buildbot/premerge-monolithic-linux/build/bin/clang-19 -target r600-- -c -mcpu=barts -fno-builtin -nostdlib -D__CLC_INTERNAL -DCLC_R600 -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/include -Wno-bitwise-conditional-parentheses -I/build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics -MD -MF /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc.d -MT /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc -cl-no-stdinc -emit-llvm -o /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc -x cl /build/buildbot/premerge-monolithic-linux/llvm-project/libclc/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl && /etc/cmake/bin/cmake -E cmake_transform_depfile Ninja gccdepfile /build/buildbot/premerge-monolithic-linux/llvm-project/llvm /build/buildbot/premerge-monolithic-linux/llvm-project/libclc /build/buildbot/premerge-monolithic-linux/build /build/buildbot/premerge-monolithic-linux/build/tools/libclc /build/buildbot/premerge-monolithic-linux/build/tools/libclc/obj.libclc.dir/barts-r600--/generic/lib/cl_khr_global_int32_extended_atomics/atom_or.cl.bc.d /build/buildbot/premerge-monolithic-linux/build/CMakeFiles/d/6b47d740c1c252336bdd546b73b979676993080ac978783cb9fd776930286d5e.d

@rjodinchr
Copy link
Contributor Author

I am able to reproduce it, working on it right now

@rjodinchr
Copy link
Contributor Author

This PR should fix the build issue: #99481

@mgorny
Copy link
Member

mgorny commented Jul 20, 2024

This change seems to have caused test regression. This is Gentoo Linux amd64, tried LLVM 72d8c27 and 18.1.8.

To reproduce, I've done:

cmake -G Ninja "${llvm_project}"/libclc -DLIBCLC_TARGETS_TO_BUILD="spirv-mesa3d-;spirv64-mesa3d-;r600--;amdgcn--;amdgcn-mesa-mesa3d;amdgcn--amdhsa"
ninja
ninja test

The result:

[0/1] Running tests...
Test project /home/mgorny/git/llvm-project/libclc/build
    Start 1: external-calls-tahiti-amdgcn--.bc
1/7 Test #1: external-calls-tahiti-amdgcn--.bc .............***Failed    5.46 sec
    Start 2: external-calls-amdgcn--amdhsa.bc
2/7 Test #2: external-calls-amdgcn--amdhsa.bc ..............***Failed    5.54 sec
    Start 3: external-calls-tahiti-amdgcn-mesa-mesa3d.bc
3/7 Test #3: external-calls-tahiti-amdgcn-mesa-mesa3d.bc ...***Failed    5.43 sec
    Start 4: external-calls-cedar-r600--.bc
4/7 Test #4: external-calls-cedar-r600--.bc ................   Passed    2.05 sec
    Start 5: external-calls-cypress-r600--.bc
5/7 Test #5: external-calls-cypress-r600--.bc ..............   Passed    2.07 sec
    Start 6: external-calls-barts-r600--.bc
6/7 Test #6: external-calls-barts-r600--.bc ................   Passed    2.03 sec
    Start 7: external-calls-cayman-r600--.bc
7/7 Test #7: external-calls-cayman-r600--.bc ...............   Passed    2.07 sec

57% tests passed, 3 tests failed out of 7

Total Test time (real) =  24.66 sec

The following tests FAILED:
	  1 - external-calls-tahiti-amdgcn--.bc (Failed)
	  2 - external-calls-amdgcn--amdhsa.bc (Failed)
	  3 - external-calls-tahiti-amdgcn-mesa-mesa3d.bc (Failed)
Errors while running CTest
Output from these tests are in: /home/mgorny/git/llvm-project/libclc/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
FAILED: CMakeFiles/test.util 
cd /home/mgorny/git/llvm-project/libclc/build && /usr/bin/ctest --force-new-ctest-process
ninja: build stopped: subcommand failed.

LastTest.log: LastTest.log

If I go back to 6838c7a, they all pass.

@rjodinchr
Copy link
Contributor Author

@mgorny thank you for the report. It should be fixed with: #99841

yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Increase fp16 support to allow clspv to continue to be OpenCL compliant
following the update of the OpenCL-CTS adding more testing on math
functions and conversions with half.

Math functions are implemented by upscaling to fp32 and using the fp32
implementation. It garantees the accuracy required for half-precision
float-point by the CTS.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants