Skip to content

[SYCL][ESIMD] Add hidden compiler option to not compile host code #11037

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 14, 2023

Conversation

sarnex
Copy link
Contributor

@sarnex sarnex commented Aug 31, 2023

ESIMD code can only be run on an Intel GPU. Previously we had the ESMID emulator that called into host implementations of some of the GPU intrinsics, but that has been deprecated and will be removed. Even when the emulator was supported, it was seldom used by customers.

To summarize, the host compile for ESIMD code had little value to customers previous, and will soon have zero value once the emulator is removed.

However, the host compile can be slow. We have a bug report from a customer reporting slow compilation time for an ESIMD kernel with a large vector. The reason for the slowdown was investigated by the X86 backend team and they found for the host compile, the vector exceeds the maximum legal vector size on X86 and instruction selection needs to spend a lot of time doing heavy lifting to generate the right assembly.

This is a total waste of the user's time as that code will never and can never be run.

As a workaround for these cases, add a new hidden option, -fsycl-esimd-build-host-code/-fno-sycl-esimd-build-host-code, default -fsycl-esimd-build-host-code, to try to speed up the host compile. This is mostly done through a new pass which runs early and removes all IR for ESIMD functions and replaces them with a single return statement. This allows later optimizations to do even more cleanup.

Also, I modified some of the ESIMD macros to generate IR that is more likely to be optimized out for the host, just disabling inlining and setting internal linkage.

We do not intend to advertise this option as there should be no issue in most cases, but we would like to be able to provide a workaround if users hit an issue.

I have manually verified this option fixes the slowdown reported by the customer.

@sarnex sarnex force-pushed the hostopt branch 3 times, most recently from 0c14bba to 9147564 Compare September 7, 2023 14:54
@sarnex sarnex marked this pull request as ready for review September 7, 2023 17:44
@sarnex sarnex requested review from a team as code owners September 7, 2023 17:44
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE changes LGTM

Copy link
Contributor

@smanna12 smanna12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FE changes LGTM

@AlexeySachkov
Copy link
Contributor

To me, that seems like a hack. We already have a lot of functionality which is not available on host (sub-groups, matrix, a lot of other built-ins) and this is handled in headers by using #ifdef __SYCL_DEVICE_ONLY__.

Is it legal to call ESIMD functions directly from host? Does ESIMD emulator skip SPIR-V generation? If the answer is no to both, I would prefer to adjust ESIMD header files to erase built-ins implementation for host there and enable some errors reporting (through exceptions, for example).

@sarnex
Copy link
Contributor Author

sarnex commented Sep 12, 2023

@AlexeySachkov Thanks for the review.

I would agree this is a hack and it's not intended to be a user-visible solution which is why it's hidden and default off. It's intended to be a last ditch workaround for customers hitting compile slowdowns that we haven't fixed the root cause for yet, so we tell them about this option in the meantime.

Is it legal to call ESIMD functions directly from host?

Right now yes because the ESIMD emulator calls the Clang-generated X86 directly but is deprecated, very shortly the answer will be no as the emulator will be removed and there will be no legal way to call ESIMD code on the host.

Does ESIMD emulator skip SPIR-V generation?

Yes, it doesn't use SPIR-V, it uses Clang's X86 from the host compile but it is going to be removed anyway

For some ESIMD functions/classes, adjusting the header files will be trivial and we plan to do this, and actually we already do it for some things. However, for some cases, managing the header files can be somewhat tricky. For example, our users interact with the sycl::ext::intel::esimd::simd class very often, it's basically just a wrapper around a clang vector with some useful functions. If the host instantiates the clang vector, they may see slowdowns for reasons mentioned above.
We can't wrap the entire class in the macro because then the user would need to wrap their kernel code, so we'd have to wrap the vector member and all functions referencing it, friend classes referencing it, etc. We plan to do this but it may be tricky and we may miss something so I was thinking a hidden default off option would be okay as a last ditch workaround.

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Seems modular and changes are also nicely localized to ESIMD lowering. Just one code change requested.

Thanks

Copy link
Contributor

@AlexeySachkov AlexeySachkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, @sarnex. I'm fine then with adding this as a temporary solution, however, I would suggest to add a TODO comment somewhere to remove both the flag and the pass once ESIMD emulator is removed

Signed-off-by: Sarnie, Nick <[email protected]>
@sarnex
Copy link
Contributor Author

sarnex commented Sep 14, 2023

@intel/llvm-gatekeepers This one is ready to go, thanks!

@steffenlarsen steffenlarsen merged commit 3d6bc0f into intel:sycl Sep 14, 2023
Comment on lines +3628 to +3629
PosFlag<SetTrue, [], [ClangOption], "Build the host implementation of ESIMD functions."
"Enabled by default.">,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sarnex, thinking about this a bit more, I don't think that we should be introducing a positive flag at all. If the option is a temporary thing, which is supposed to go away together with ESIMD emulator (or at some other point), I don't see a value in introducing the positive flag here. It is just a one more thing which users will have to adjust to if they start using it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is a good point, I didn't want to add a positive option either originally but I got the idea there was an informal rule to have one. I don't think that's true and it doesn't make sense here anyway. I'll make a PR to remove it.

sarnex added a commit to sarnex/llvm that referenced this pull request Sep 21, 2023
Remove the positive option and keep only the negative option.

Based on feedback from intel#11037 (comment)

Signed-off-by: Sarnie, Nick <[email protected]>
steffenlarsen pushed a commit that referenced this pull request Sep 22, 2023
Remove the positive option and keep only the negative option. The option
was added only a couple of days ago and nobody should be using it. It is
a hidden option as well.

Based on feedback from
#11037 (comment)

---------

Signed-off-by: Sarnie, Nick <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants