Skip to content

[SYCL] Make invoke_simd convert its arguments to appropriate type. #6544

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Aug 16, 2022

Conversation

kbobrovs
Copy link
Contributor

@kbobrovs kbobrovs commented Aug 9, 2022

  • Introduce an intermediate lambda in invoke_simd and call SIMD target
    from the lambda with given (SIMD) arguments coming from the lambda's
    formal parameters. This way compiler automatically performs necessary
    argument type conversion.

  • Introduce a new implicit constructor for simd to allow conversion
    of simd objects with _VecExt storage kind (used in invoke_simd
    extension).

  • Fix linkonce_odr linkage of entry points : change it to external
    linkage to avoid removal by the inliner and llvm-linker.

This new scheme complicates the SIMD target function pointer flow, so
LowerInvokeSimd.cpp update is needed to accommodate the new flow.

E2E tests: intel/llvm-test-suite#1146

Signed-off-by: Konstantin S Bobrovsky [email protected]

@kbobrovs kbobrovs requested review from a team as code owners August 9, 2022 06:23
@kbobrovs kbobrovs requested a review from cperkinsintel August 9, 2022 06:23
@@ -139,14 +139,14 @@ bool isESIMDFunction(const Function &F) {

// This function makes one or two groups depending on kernel types (SYCL, ESIMD)
EntryPointGroupVec
groupEntryPointsByKernelType(const ModuleDesc &MD,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const is removed in a number of places because some entry points can be changed now - their linkage fixed.

@kbobrovs
Copy link
Contributor Author

kbobrovs commented Aug 9, 2022

Folks, I would appreciate if could you review the following parts:

@rolandschulz:
sycl/include/std/experimental/simd.hpp
sycl/include/sycl/ext/oneapi/experimental/invoke_simd.hpp
sycl/test/invoke_simd/invoke_simd.cpp

@v-klochkov:
llvm/lib/SYCLLowerIR/LowerInvokeSimd.cpp

@asudarsa:
llvm/tools/sycl-post-link/ModuleSplitter.cpp
llvm/tools/sycl-post-link/ModuleSplitter.h

@kbobrovs kbobrovs force-pushed the invoke_simd_conv_args branch from 9a8d833 to 345e247 Compare August 9, 2022 06:37
- Introduce an intermediate lambda in invoke_simd and call SIMD target
from the lambda with given (SIMD) arguments coming from the lambda's
formal paremeters. This way compiler automatically performs necessary
argument type conversion.

- Introduce a new implicit constructor for simd to allow conversion
of simd objects with _VecExt storage kind (used in invoke_simd
extension).

- Fix linkonce_odr linkage of entry points : change it to external
  linkage to avoid removal by the inliner and llvm-linker.

This new scheme complicates the SIMD target function pointer flow, so
LowerInvokeSimd.cpp update is needed to accomodate the new flow.

Signed-off-by: Konstantin S Bobrovsky <[email protected]>
@kbobrovs kbobrovs force-pushed the invoke_simd_conv_args branch from 345e247 to 222263a Compare August 9, 2022 06:41
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Aug 9, 2022

The llvm/test/SYCLLowerIR/ESIMD/lower_invoke_simd.ll failure will be fixed by test update (hopefully).

@@ -209,8 +211,15 @@ using SpmdRetType =

template <class SimdCallable, class... SpmdArgs>
static constexpr int get_sg_size() {
if constexpr (has_uniform_signature<SimdCallable, SpmdArgs...>()) {
return 0; // subgroup size does not matter then
if constexpr (all_uniform_types<SpmdArgs...>()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a a good idea. But we should update the extension document to reflect this. It seems at the moment it is missing a description when the sg-size can be deduced. BTW now that this is implemented should the extension be moved to experimental?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I can create a PR, you can do this as well if you wish

BTW now that this is implemented should the extension be moved to experimental?

This is in sycl::ext::oneapi::experimental already, am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I can create a PR, you can do this as well if you wish

happy for you to do it

I meant the location of https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/proposed/sycl_ext_oneapi_invoke_simd.asciidoc. I think this should be moved from proposed to experimental.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy for you to do it

OK, I'll do this

I think this should be moved from proposed to experimental.

Oh, that's what you mean. Agree.

typename detail::spmd2simd<T, N>::type... simd_args) {
SYCL_EXTERNAL __regcall detail::SimdRetType<N, Callable, T...>
simd_obj_call_helper(const void *obj_ptr,
typename detail::spmd2simd<T, N>::type... simd_args) {
auto f =
*reinterpret_cast<const std::remove_reference_t<Callable> *>(obj_ptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this reinterpret_cast needed? (I realize it's not changed in this PR but I would like to understand)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because const void* can't be used as a function pointer. Another question is why obj_ptr is const void* and not const Callable*. I tried changing (along with __builtin_invoke_simd declaration, got lots of errors. Will need to figure out once functors/lambdas are enabled.

rolandschulz
rolandschulz previously approved these changes Aug 9, 2022
// definition. This function changes linkage of linkonce_odr entry points.
// An example of a such entry point is an instantiation of a function template
// even marked SYCL_EXTERNAL.
// TODO: is it practical to fix FE instead to assign different linkage for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be useful to get @rolandschulz's comment here. Thanks

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either inliner pass or LLVM linker will remove functions with such linkage

They will only remove them if unused. Why do they incorrectly assume they are unused? Is it because you spit into SPMD and SIMD and the entry point into SIMD is unused inside the SIMD part? And therefore it gets removed after splitting?

// definition. This function changes linkage of linkonce_odr entry points.
// An example of a such entry point is an instantiation of a function template
// even marked SYCL_EXTERNAL.
// TODO: is it practical to fix FE instead to assign different linkage for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

either inliner pass or LLVM linker will remove functions with such linkage

They will only remove them if unused. Why do they incorrectly assume they are unused? Is it because you spit into SPMD and SIMD and the entry point into SIMD is unused inside the SIMD part? And therefore it gets removed after splitting?


if ((L == GlobalValue::LinkageTypes::LinkOnceODRLinkage) ||
(L == GlobalValue::LinkageTypes::LinkOnceAnyLinkage)) {
F->setLinkage(GlobalValue::LinkageTypes::ExternalLinkage);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you change it to external and not weak_odr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entry point need to be available externally, yet an attempt to later link in another module which defines the same entry point would be an error. So ExternalLinkage seems the best fit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would it be an error to have the same template function (which are entry point to invoke_simd) in two different TUs? This is valid C++. Why should this not work for invoke_simd?

@kbobrovs
Copy link
Contributor Author

kbobrovs commented Aug 10, 2022

They will only remove them if unused. Why do they incorrectly assume they are unused? Is it because you spit into SPMD and SIMD and the entry point into SIMD is unused inside the SIMD part? And therefore it gets removed after splitting?

No, it is not because of splitting. They (linker and inliner) correctly assume they are unused, because there can remain no calls to entry points within the module. In this case inliner turns this

invoke_simd(Helper,...)
Helper(...) {
  call UserF
}
UserF(...) { code; } // <-- declared as entry point in the user code (SYCL_EXTERNAL)

into

invoke_simd(Helper,...)
Helper(...) {
  code;
}

which is correct from the LLVMIR optimization standpoint, but not correct conceptually because UserF is declared as entry point and someone might want to link to it / call it later. Ideally, SYCL_EXTERNAL must not lead to linkonce* linkage, even for template instantiations (or linkonce* must be interpreted differently by inliner/linker, but this might be against its definition).

More practical example would be a template SYCL_EXTERNAL function.

@rolandschulz
Copy link
Contributor

I'm pretty sure the intention of SYCL_EXTERNAL (as defined by SYCL spec and therefore not in the context of invoke_simd) was never to give a function external linkage if it didn't have it according to the C++ spec. Only to guarantee that a function which has external linkage also has external linkage for device linking. An implicit instantiated template function doesn't have external linkage in C++. You need to explicit instantiate it. And if you do that the LLVM-IR linkage is weak not linkonce. Which guarantees it works.

Ideally, SYCL_EXTERNAL must not lead to linkonce* linkage, even for template instantiations
Therefore I disagree with that.

For the case of invoke_simd why is it possible that the inliner gets rid of the call. My understanding is that we don't support inlining through invoke_simd (yet).

@kbobrovs
Copy link
Contributor Author

'm pretty sure the intention of SYCL_EXTERNAL (as defined by SYCL spec and therefore not in the context of invoke_simd) was never to give a function external linkage if it didn't have it according to the C++ spec. Only to guarantee that a function which has external linkage also has external linkage for device linking.

Hm. This excerpt from the spec + details in (5.10.1.) pretty much means SYCL_EXTERNAL = external linkage for me:

Any variable or function that is odr-used from a device function must be defined in the same translation unit as that use. However, a function may be defined in another translation unit if the implementation defines the SYCL_EXTERNAL macro as described in Section 5.10.1.

But WeakODR might be better in this case of implicit instantiation indeed. I changed the code.

@kbobrovs kbobrovs requested a review from a team as a code owner August 11, 2022 06:19
We now don't try to retain linkonce_odr entry points if they are removed
by optimizations.
- "__invoke_simd_target" function attribute introduced to mark call
  targets. The attribute is used to protect call targets referenced from
  SYCL callgraph from DCE'ing on split ESIMD module.
- Automatically mark invoke_simd call helper library function as ESIMD,
  so that no ESIMD specifics needs to be added to the invoke_simd.hpp.
- Fix linkage of the helper functions so that they are not dropped by
  Linker/Global DCE.
- Fix test.
@kbobrovs
Copy link
Contributor Author

kbobrovs commented Aug 14, 2022

We discussed the linkonce_odr/SYCL_EXTERNAL problem with @rolandschulz and others in more details. Key takeaways:

  • SYCL_EXTERNAL does not affect function linkage as defined by C++ rules
  • linkonce_odr can be SYCL_EXTERNAL. But if no uses after optimizations - compiler is free to remove it.
  • To avoid removal, weak_odr can replace linkonce_odr w/o affecting correctness.

In this patch I do the linkage replacement for both helper and target. For helper this is permanent change, for target - temporary. Just to protect the target from removal when ModuleDesc::cleanup() is called on split ESIMD module, because there can be other uses of the target in the SYCL part of the program. Once ESIMD and SYCL are linked back together (this is what invoke_simd requires), cleanup is run on the linked module.

Original Pseudo-IR illustrating helper and target:

...
__builtin_invoke_simd(helper, target,...);
...

ret helper(callable f, args...) {
  f(args...);
}

Optimized to:

...
__builtin_invoke_simd(helper, ...);
...

ret helper(args...) {
  target(args...);
}

but there can be still other uses of target. Otherwise it gets removed.

@kbobrovs kbobrovs requested a review from rolandschulz August 14, 2022 00:43
@kbobrovs
Copy link
Contributor Author

The only failure is InvokeSimd/invoke_simd_smoke.cpp, which is fixed by intel/llvm-test-suite#1146

@kbobrovs
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1146

v-klochkov
v-klochkov previously approved these changes Aug 15, 2022
rolandschulz
rolandschulz previously approved these changes Aug 15, 2022
@kbobrovs
Copy link
Contributor Author

@asudarsa, @cperkinsintel - I resolved outstanding comments with @rolandschulz and @v-klochkov. Please take a look.

@@ -613,6 +603,52 @@ void ModuleDesc::renameDuplicatesOf(const Module &MA, StringRef Suff) {
}
}

constexpr char SYCL_ORIG_LINKAGE_ATTR[] = "__sycl_orig_linkage";
Copy link
Contributor

@asudarsa asudarsa Aug 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will help to have a short comment about this constexpr and why we need this. Thanks

}
}

// TODO: try to move including all passes (cleanup, spec consts, compile time
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo? may be 'including' needs to be removed?

asudarsa
asudarsa previously approved these changes Aug 15, 2022
Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only nitpicks. LGTM.
Thanks

@kbobrovs
Copy link
Contributor Author

Only nitpicks. LGTM.

@asudarsa, thanks - let me address your NIT comments in a separate PR to reduce the risk of conflicts for this big PR.

@kbobrovs kbobrovs dismissed stale reviews from asudarsa, rolandschulz, and v-klochkov via b4b6749 August 15, 2022 23:59
@kbobrovs
Copy link
Contributor Author

Had to fix test failure (see 9dd341e), so I addressed @asudarsa's review comments as well (b4b6749).
@rolandschulz, @v-klochkov, @asudarsa - please re-approve.

@kbobrovs
Copy link
Contributor Author

/verify with intel/llvm-test-suite#1146

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks

@kbobrovs
Copy link
Contributor Author

  • Expected failure:

  • Unrelated failures:

    • SYCL / Linux / HIP AMDGPU LLVM Test Suite (pull_request_target) :
    • SYCL / Linux / CUDA LLVM Test Suite (pull_request_target):
      • SYCL :: DeviceLib/imf_simd_emulate_test.cpp

@kbobrovs
Copy link
Contributor Author

@cperkinsintel, please review/approve for @intel/llvm-reviewers-runtime

@kbobrovs
Copy link
Contributor Author

@cperkinsintel, please review/approve for @intel/llvm-reviewers-runtime

Actually, I have approval from @intel/llvm-reviewers-runtime - from @v-klochkov. Merging.

@kbobrovs kbobrovs merged commit 038764f into intel:sycl Aug 16, 2022
@kbobrovs kbobrovs deleted the invoke_simd_conv_args branch August 16, 2022 18:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants