-
Notifications
You must be signed in to change notification settings - Fork 787
[sycl-post-link] Do not drop SYCL_EXTERNAL functions #3793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[sycl-post-link] Do not drop SYCL_EXTERNAL functions #3793
Conversation
I'm not 100% sure it always safe to allow unreferenced functions to go through sycl-post-link. One case I have in mind is when we split per-kernel. In such a scenario, sycl-post-link will place unreferenced functions in a separate module. So far, I believe the DPCPP runtime expects to see kernels in each SPIRV module, but there might be none. I know that RT claims binary images based on symbols they contain, but still, I'm not 100% sure about it. @romanovvlad, do you have any comments here? |
I will address CI failures shortly. |
Currently SYCL RT will assume a device images has all kernels if the list of kernel in the device image is zero. |
llvm/test/tools/sycl-post-link/unreferenced_funcs/split-global.ll
Outdated
Show resolved
Hide resolved
llvm/test/tools/sycl-post-link/unreferenced_funcs/split-per-source.ll
Outdated
Show resolved
Hide resolved
|
||
std::vector<llvm::Function *> UnreferencedFuncs; | ||
for (auto &F : M.functions()) { | ||
if (!F.isDeclaration() && !ReferencedFuncs.count(&F)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we have an internal
(I mean linkage type) unreferenced function here? Could it cause any conflicts later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can ignore unreferenced internal
functions, just like we did before. AFAIK if two LLVM modules defined internal functions with same names, they won't cause conflicts when these two LLVM modules linked. I'm not sure about SPIR-V though, I can't find any analog of internal
linkage type in SPIR-V spec, but the translator seems to support some Internal linkage type. (@AlexeySotkin , any idea why?).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There might be cases when we want to preserve unreferenced functions in SPIR-V, e.g. those which are mentioned in @llvm.used
variable.
Whether they will conflict or not depends on linkage type.
internal
linkage type is the default in SPIR-V, i.e. when global variable or function doesn't have LinkageType
decoration it has internal linkage.
The idea behind dynamic linking support was to consider |
That's exactly what I'm trying to do in this patch.
I thought that non-
Do I understand correctly that your idea is to distinguish roots (entry points) by the presence of Also, there is a comment in clang/lib/CodeGen/CodeGenFunction.cpp that says:
|
Yes, FE ignores functions that is not referenced from kernels/SYCL_EXTERNALs (I'll call it device context), but with early optimizations there is the case when function was originally referenced from device context, but use of it was optimized away. In this case we get unreferenced non-
Yes, that is right.
Yes, there is still a room for improvement, I don't remember if anyone ever returned to it. But it shouldn't block us from re-using the attribute for |
210d80b
to
d76610c
Compare
Thanks, @Fznamznon, |
" any kernel are dropped from the resulting module(s), except from\n" | ||
" SYCL_EXTERNAL functions.\n" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about "Functions unreachable from any kernel or SYCL_EXTERNAL function are dropped from the resulting module(s)."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
target datalayout = "e-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024" | ||
target triple = "spir64-unknown-linux-sycldevice" | ||
|
||
define dso_local spir_func void @externalDeviceFunc() #0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a test where some SYCL_EXTERNAL function foo
is defined in source "a.cpp" but it is also referenced by a kernel defined in source "b.cpp"? In this case design of dynamic code linking says that definition of function foo
should be present in both resulting modules when per-source split is requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
This PR is supposed to fix #3239. |
llvm/test/tools/sycl-post-link/sycl-external-funcs/split-and-lower-esimd.ll
Show resolved
Hide resolved
This is an NFC patch, a preparation for further changes.
Looks like CUDA AOT mode generates call without |
I have a new issue that popped up in the CI testing (
And this is why DPCPP RT asserts that the set of kernel symbols are disjoint across all the images in the device binary. I tend to think we should filter SPIRV ("_spirv") and SYCL ("_sycl") built-in functions in @Fznamznon, @AlexeySachkov, @kbobrovs any thoughts? |
Is there any check for functions? I remember RT has a thing called kernel set id, it is based on kernels presented in the device image. Are you saying with your changes, we add names of llvm/sycl/include/CL/sycl/detail/pi.h Line 728 in a3fc51a
The idea behind dynamic linking was to have a separate property set for SYCL_EXTERNAL function names and not add them to entries table, so this thing wouldn't break. However I think we actually could use either only entries table or only property table to save kernel names and SYCL_EXTERNAL names both, but we will need to re-do RT part with kernel set id. If I understand it correct, this kernel set id is actually not a necessity, but some choice made by RT developers some time ago. @s-kanaev , WDYT?
Not sure about |
We do have a few |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DenisBakhvalov, looks like there is a real regression: [SYCL :: AOT/multiple-devices.cpp]
Right. I didn't have the time to fix it yet. But it's still under my radar. :) |
Sorry for the late response.
Yes, here is the assert that gets triggered:
So, I went ahead and added this exception. |
@Fznamznon, @AlexeySachkov, please review. |
@@ -21,25 +21,27 @@ | |||
|
|||
declare dso_local spir_func zeroext i1 @_Z33__sycl_getScalarSpecConstantValueIbET_PKc(i8 addrspace(4)*) | |||
|
|||
define dso_local spir_kernel void @KERNEL_AAA() { | |||
define dso_local spir_kernel void @KERNEL_AAA() #0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, why do we need to add the attribute to this test. Why It wasn't here before? Is it not possible to process the code without the attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this is how we detect whether the function should be considered as an entry point or not
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct.
@@ -6,20 +6,22 @@ | |||
; RUN: FileCheck %s -input-file=%t.files.table --check-prefixes CHECK-TABLE | |||
; RUN: FileCheck %s -input-file=%t.files_0.sym --match-full-lines --check-prefixes CHECK-SYM | |||
|
|||
define dso_local spir_kernel void @KERNEL_AAA() { | |||
define dso_local spir_kernel void @KERNEL_AAA() #0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
@@ -21,25 +21,27 @@ | |||
|
|||
declare dso_local spir_func zeroext i1 @_Z33__sycl_getScalarSpecConstantValueIbET_PKc(i8 addrspace(4)*) | |||
|
|||
define dso_local spir_kernel void @KERNEL_AAA() { | |||
define dso_local spir_kernel void @KERNEL_AAA() #0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, this is how we detect whether the function should be considered as an entry point or not
There are a few use cases where unreferenced device functions should go through sycl-post-link. Such functions may be referenced later, for example on a SPIRV level.
Dynamic linking of device binary images is one such case. I.e. when the kernel will have an import list with functionality implemented in a dynamic library. DPCPP RT will resolve imported functions, which could be implemented as SYCL_EXTERNAL functions. Thus we need to allow SYCL_EXTERNAL to go through sycl-post-link to make it work.
ESIMD/ISPC interoperability requested by @aneshlya is another motivation for this functionality.