Skip to content

[libclc] Move min/max/clamp into the CLC builtins library #114386

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 31, 2024

Conversation

frasercrmck
Copy link
Contributor

These functions are "shared" between integer and floating-point types, hence the directory name. They are used in several CLC internal functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions, so pre-processor macros replace calls to __clc_min with regular min, for example. This means they can use as much of the generic CLC source files as possible, but where CLC functions would usually call out to an external __clc_min symbol, they call out to an external min symbol. Then they opt out of defining __clc_min itself in their CLC builtins library.

Preprocessor definitions for these targets have also been changed somewhat: what used to be CLC_SPIRV (the 32-bit target) is now CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final builtins libraries for nvptx, amdgpu, or clspv. Neither are there differences in the SPIR-V targets' LLVM IR before it's actually lowered to SPIR-V.

@frasercrmck
Copy link
Contributor Author

CC @rjodinchr, I hope this method of dealing with clspv works for you.

@frasercrmck frasercrmck requested a review from arsenm October 31, 2024 10:08
Copy link

github-actions bot commented Oct 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@frasercrmck frasercrmck force-pushed the libclc-clc-shared-fns branch 3 times, most recently from c0e8cf9 to 13e6967 Compare October 31, 2024 10:39
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins
library.

Preprocessor definitions for these targets have also been
changed somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
@frasercrmck frasercrmck force-pushed the libclc-clc-shared-fns branch from 13e6967 to 0465b89 Compare October 31, 2024 11:25
Comment on lines +1 to +11
_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_max(__CLC_GENTYPE a,
__CLC_GENTYPE b) {
return (a > b ? a : b);
}

#ifndef __CLC_SCALAR
_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_max(__CLC_GENTYPE a,
__CLC_SCALAR_GENTYPE b) {
return (a > (__CLC_GENTYPE)b ? a : (__CLC_GENTYPE)b);
}
#endif
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use __builtin_elementwise_max? It's simplest to just not have all this boilerplate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, seems like a sensible refactor while we're at it. I'm generally just copy/pasting code over to try and affect as little as possible.

@arsenm arsenm added the libclc libclc OpenCL library label Oct 31, 2024
Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving them seems fine but these should probably just be deleted. __clc_min is equivalent to __builtin_elementwise_min etc.

I see there's a __builtin_hlsl_elementwise_clamp, but given the language prefix it may be more abusive

@frasercrmck
Copy link
Contributor Author

frasercrmck commented Oct 31, 2024

Moving them seems fine but these should probably just be deleted. __clc_min is equivalent to __builtin_elementwise_min etc.

I see there's a __builtin_hlsl_elementwise_clamp, but given the language prefix it may be more abusive

OpenCL min/max specify min(vector, scalar) (forgive the syntax) which __builtin_elementwise_min doesn't like. So I think sticking with the clc_min body might make sense.

However, changing the definition to

_CLC_OVERLOAD _CLC_DEF __CLC_GENTYPE __clc_max(__CLC_GENTYPE a,
                                               __CLC_SCALAR_GENTYPE b) {
  return __builtin_elementwise_max(a, (__CLC_GENTYPE)b);
}

does improve the LLVM IR, to use llvm.maxnum.v2f32 where we currently scalarize it. If we think that's okay then it's still an improvement over what we have.

@frasercrmck frasercrmck reopened this Oct 31, 2024
@arsenm
Copy link
Contributor

arsenm commented Oct 31, 2024

We could probably extend the elementwise builtins to handle the splat case

@frasercrmck
Copy link
Contributor Author

We could probably extend the elementwise builtins to handle the splat case

Yeah I don't see why not. Let me see what I can do. If you're okay with it, I might just merge this as it is and come back to it if and when the builtins are improved.

@frasercrmck frasercrmck merged commit d12a8da into llvm:main Oct 31, 2024
14 checks passed
@frasercrmck frasercrmck deleted the libclc-clc-shared-fns branch October 31, 2024 16:45
smallp-o-p pushed a commit to smallp-o-p/llvm-project that referenced this pull request Nov 3, 2024
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.

Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
NoumanAmir657 pushed a commit to NoumanAmir657/llvm-project that referenced this pull request Nov 4, 2024
These functions are "shared" between integer and floating-point types,
hence the directory name. They are used in several CLC internal
functions such as __clc_ldexp.

Note that clspv and spirv targets don't want to define these functions,
so pre-processor macros replace calls to __clc_min with regular min, for
example. This means they can use as much of the generic CLC source files
as possible, but where CLC functions would usually call out to an
external __clc_min symbol, they call out to an external min symbol. Then
they opt out of defining __clc_min itself in their CLC builtins library.

Preprocessor definitions for these targets have also been changed
somewhat: what used to be CLC_SPIRV (the 32-bit target) is now
CLC_SPIRV32, and CLC_SPIRV now represents either CLC_SPIRV32 or
CLC_SPIRV64. Same goes for CLC_CLSPV.

There are no differences (measured with llvm-diff) in any of the final
builtins libraries for nvptx, amdgpu, or clspv. Neither are there
differences in the SPIR-V targets' LLVM IR before it's actually lowered
to SPIR-V.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libclc libclc OpenCL library
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants