-
Notifications
You must be signed in to change notification settings - Fork 789
[DeviceSanitizer] Checking out-of-bounds error on sycl::local_accessor #13247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
FunctionType *NewFTy = FunctionType::get(F->getReturnType(), Types, false); | ||
|
||
std::string OrigFuncName = F->getName().str(); | ||
F->setName(OrigFuncName + "_del"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NewF->takeName(F);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think added "_del" suffix in old func is helpful for debugging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F is going to erased from parent at the end of this function, right? So why still need for debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I forget to replace the old function in other instructions/metadata, I can know this immediately.
} | ||
|
||
// New Argument | ||
Types.push_back(IntptrTy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it right that the new arg points to a global memory? If so, why are we using integer type for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's a pointer point to global memory. I use integer type for simplicity (just like the "address" parameter is also an integer).
In fact, its type should be __SYCL_GLOBAL__ LaunchInfo*
, or simply __SYCL_GLOBAL__ void*
, or simply "void *".
I found that using integer value just worked as expect.
But it's ok for me to change it to pointer type if you think it's necessary (I'll also need to modify libdevice file).
What's your idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is probably better to use its real type:
- metadata in
!kernel_arg_addr_space !kernel_arg_access_qual !kernel_arg_type !kernel_arg_base_type !kernel_arg_type_qual
will be correct. - the backend compiler is aware that it is a global buffer.
- easy for people to understand the IR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll change its type to "i64 as(1)*".
|
||
FixupMetadata("kernel_arg_buffer_location", Builder.getInt32(-1)); | ||
FixupMetadata("kernel_arg_runtime_aligned", Builder.getFalse()); | ||
FixupMetadata("kernel_arg_exclusive_ptr", Builder.getFalse()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are metadata !kernel_arg_addr_space !kernel_arg_access_qual !kernel_arg_type !kernel_arg_base_type !kernel_arg_type_qual !kernel_arg_accessor_ptr
fixed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Offline sync with @wenju-he, these metadata don't need to update here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These metadata probably still need updating as I've met a case that fails because these metadata are not updated by AddressSanitizerPass
if (F->getCallingConv() == CallingConv::SPIR_KERNEL) | ||
F->removeFromParent(); | ||
else | ||
SpirFuncs.emplace_back(F, NewF); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a sycl kernel be called from another sycl kernel as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure about this, I assume sycl kernel can't call another sycl kernel.
This is the answer from IntelGPU(typo, IntelGPT):
As of the SYCL 2020 specification, SYCL does not natively support invoking one kernel directly from another kernel, a concept often referred to as dynamic parallelism in the context of some other parallel computing platforms like CUDA.
In SYCL, the programming model is generally centered around the submission of kernels from the host code to the device. The kernels are enqueued onto command queues, and the SYCL runtime handles their scheduling and execution. This model does not inherently support kernel-to-kernel launches as one might see with CUDA's dynamic parallelism feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not 100% sure about this, I assume sycl kernel can't call another sycl kernel. This is the answer from IntelGPU ...
As of the SYCL 2020 specification, SYCL does not natively support invoking one kernel directly from another kernel, a concept often referred to as dynamic parallelism in the context of some other parallel computing platforms like CUDA.
In SYCL, the programming model is generally centered around the submission of kernels from the host code to the device. The kernels are enqueued onto command queues, and the SYCL runtime handles their scheduling and execution. This model does not inherently support kernel-to-kernel launches as one might see with CUDA's dynamic parallelism feature.
I don't think it applies to the @wenju-he's question as his question is not related to the dynamic parallelism. According to my understanding, the question is "can a function with spir_kernel calling convention call another function with spir_kernel calling convention"?
This is not prohibited by SYCL or SPIR-V standards. It might be tricky to create a reproducer for that case using existing DPC++ implementation, but it's not impossible.
BTW, who is IntelGPU?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for correcting me. I'll handle this situation as well.
"IntelGPU" -> "IntelGPT", typo.
I found it's hard to search from SYCL spec or internet, so I tried to ask ChatGPT for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not prohibited by SYCL or SPIR-V standards. It might be tricky to create a reproducer for that case using existing DPC++ implementation, but it's not impossible.
Hi @bader, can you help to craft one example?
I tried to use function object but failed, and device code can't use function pointer either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now we only extend kernels' argument, if we can't call kernel in device code, then I needn't to fix the users of kernels at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can do this using online compiler extension.
E.g. https://github.com/intel/llvm/blob/c0ade1b520b79d0e1666f193b34524f1e33e29b7/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler_opencl.asciidoc
UPDATE: I'm not sure how well SYCL extensions are supported by the Device Sanitizer.
for (auto [F, NewF] : SpirFuncs) { | ||
SmallVector<User *, 16> Users(F->users()); | ||
for (User *U : Users) { | ||
if (auto *GA = dyn_cast<GlobalAlias>(U)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm ValueMapper might simplify the logic below and handle more cases like nested constexpr and indirect call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indirect call is introduced by mismatched func arguments, device code doesn't support calling function pointer.
Since we only extend spir kernel, we needn't to fix alias and contexpr either.
// Append a new argument "launch_data" to user's spir_kernel & spir_func | ||
static void ExtendSpirKernelArgs(Module &M, FunctionAnalysisManager &FAM) { | ||
SmallVector<Function *> SpirFixupFuncs; | ||
for (Function &F : M) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need to patch functions that doesn't use local_accessor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, launch_data is also used to store sanitizer report.
#include <sycl/ext/oneapi/group_local_memory.hpp> | ||
#include <sycl/usm.hpp> | ||
// REQUIRES: linux | ||
// RUN: %{build} %device_sanitizer_flags -g -O0 -o %t.out |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add tests to cover sycl external usage scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we can't handle "sycl external" well because global variables (defined in libdevice) are redefined in shared library.
I will fix this when we support link libdevice in UR.
Hi Can you help to review this PR? Thank you very much! |
@AllanZyne |
Sorry, this pr merged into wrong branch! I'll fix all comments in new PR. |
Recreated PR: #13503 |
Hi @maksimsab, I've added LIT tests in #13503. |
UR: AllanZyne/unified-runtime#16 (for internal review)
To check sycl::local_accessor(aka, dynamic local memory), we need to extend a new argument in spir kernel, this is because:
I named this argument as "__asan_launch", which is a pointer pointed to "LaunchInfo" structure and allocated it in shared USM. To make this pointer can be used in spir_func w/o extending their argument, I created a global external local memory (external, so that it can be shared with other translation units, and its instance is defined in libdevice), and save the "__asan_launch" into this local memory immediately at the entry of kernel.
UR can't check the name of kernel arguments, so it can't know if the kernel has "__asan_launch". So I assume the "__asan_launch" is always there, and added a check to prevent DAE pass from removing it.