
[libclc] Move smoothstep to CLC and optimize its codegen #123183


Merged 1 commit on Jan 16, 2025


frasercrmck
Contributor

This commit moves the implementation of the smoothstep function to the CLC library, whilst optimizing the codegen.

This commit also adds support for 'half' versions of smoothstep, which were previously missing.

The CLC smoothstep implementation now keeps everything in vectors, rather than recursively splitting vectors in half down to the scalar base form. This should produce better codegen across the board.
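As a rough illustration of "keeping everything in vectors", the sketch below evaluates the usual smoothstep polynomial on a whole four-lane vector at once. It is not the libclc source: it is plain C using the GCC/Clang `vector_size` extension in place of OpenCL C's `float4`, and the names `float4_t`, `clamp01`, and `smoothstep4` are hypothetical. The clamp is written lane by lane only to keep the sketch portable; the PR itself relies on the CLC clamp function.

```c
#include <stddef.h>

/* Four-lane float vector via the GCC/Clang vector extension. This is an
   assumption of the sketch; libclc uses OpenCL C vector types instead. */
typedef float float4_t __attribute__((vector_size(16)));

/* Clamp every lane to [0, 1]. Done lane by lane here for portability. */
static float4_t clamp01(float4_t v) {
    for (size_t i = 0; i < 4; ++i) {
        if (v[i] < 0.0f) {
            v[i] = 0.0f;
        } else if (v[i] > 1.0f) {
            v[i] = 1.0f;
        }
    }
    return v;
}

/* smoothstep on whole vectors: t = clamp((x - edge0) / (edge1 - edge0), 0, 1),
   then t * t * (3 - 2 * t). No splitting into halves or scalars. */
static float4_t smoothstep4(float4_t edge0, float4_t edge1, float4_t x) {
    const float4_t two = {2.0f, 2.0f, 2.0f, 2.0f};
    const float4_t three = {3.0f, 3.0f, 3.0f, 3.0f};
    float4_t t = clamp01((x - edge0) / (edge1 - edge0));
    return t * t * (three - two * t);
}
```

With edges 0 and 1, an input lane of 0.5 maps to 0.5, and lanes below or above the edges map to 0 and 1 respectively.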

This commit also removes some non-standard overloads of smoothstep with mixed types, such as 'double smoothstep(float, float, float)'. As far as I can see, there aren't any versions of smoothstep with mixed types or mixed element types; the only forms are:

gentype smoothstep(gentype edge0, gentype edge1, gentype x)
gentypef smoothstep(float edge0, float edge1, gentypef x)
gentyped smoothstep(double edge0, double edge1, gentyped x)
gentypeh smoothstep(half edge0, half edge1, gentypeh x)

The CLC library only defines the first form, for simplicity; the OpenCL layer is responsible for handling the scalar/scalar/vector forms. Note that the scalar/scalar/vector forms now splat the scalars to the vector type, rather than recursively splitting vectors as before. The macro that used to 'vectorize' smoothstep in this way has been moved out of the shared clcmacro.h header, as it was only used for the smoothstep builtin.
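The splat approach for the scalar/scalar/vector forms might look roughly like the following. This is again a sketch in plain C with the GCC/Clang `vector_size` extension rather than the actual libclc code; `float4v`, `splat4`, `smoothstep_vvv`, and `smoothstep_ssv` are hypothetical names chosen for illustration.

```c
/* Four-lane float vector via the GCC/Clang vector extension (an assumption;
   the real code uses OpenCL C vector types). */
typedef float float4v __attribute__((vector_size(16)));

/* Stand-in for the all-vector (gentype, gentype, gentype) form. */
static float4v smoothstep_vvv(float4v edge0, float4v edge1, float4v x) {
    const float4v two = {2.0f, 2.0f, 2.0f, 2.0f};
    const float4v three = {3.0f, 3.0f, 3.0f, 3.0f};
    float4v t = (x - edge0) / (edge1 - edge0);
    /* Clamp each lane to [0, 1]; lane-wise here only for portability. */
    for (int i = 0; i < 4; ++i) {
        t[i] = t[i] < 0.0f ? 0.0f : (t[i] > 1.0f ? 1.0f : t[i]);
    }
    return t * t * (three - two * t);
}

/* Broadcast one scalar into every lane of a vector. */
static float4v splat4(float s) {
    float4v v = {s, s, s, s};
    return v;
}

/* Scalar/scalar/vector form: splat both edges once, then make a single call
   to the all-vector form instead of recursing on vector halves. */
static float4v smoothstep_ssv(float edge0, float edge1, float4v x) {
    return smoothstep_vvv(splat4(edge0), splat4(edge1), x);
}
```

The wrapper does no per-lane work of its own, so only the all-vector form needs a real implementation.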

Note that the CLC clamp function is now built for both SPIR-V targets. This is to help build the CLC smoothstep function for the Mesa SPIR-V target.

@frasercrmck frasercrmck added the libclc libclc OpenCL library label Jan 16, 2025
@frasercrmck frasercrmck requested a review from arsenm January 16, 2025 11:12
@frasercrmck
Contributor Author

Here's an example of the difference in LLVM IR for smoothstep(double16, double16, double16): https://godbolt.org/z/P9sM3Wjjn

@@ -0,0 +1,52 @@
/*
Contributor Author


I wasn't sure if the AMD copyright still applies to this copied file, so I kept it.

@frasercrmck frasercrmck merged commit b7e2014 into llvm:main Jan 16, 2025
10 checks passed
@frasercrmck frasercrmck deleted the libclc-clc-smoothstep branch January 16, 2025 11:44
Labels
libclc libclc OpenCL library
2 participants