Skip to content

[DeviceSanitizer] Checking out-of-bounds error on sycl::local_accessor #13247

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

AllanZyne
Copy link
Contributor

@AllanZyne AllanZyne commented Apr 2, 2024

UR: AllanZyne/unified-runtime#16 (for internal review)

To check sycl::local_accessor(aka, dynamic local memory), we need to extend a new argument in spir kernel, this is because:

  • ASan needs to know some size information of local buffer, like its size and size with redzone, so that it can poison its shadow memory
  • By using this new argument, we can also pass some per-launch information (that is, it is different in each launch of kernel). One obvious example is SanitizerReport, which saves the error message, so that we can store and print multiple error reports for one kernel with different arguments. Another example is the shadow memory of local memory, this should be different per-launch as well, since one kernel can be launched multiple times and executed in parallel.

I named this argument as "__asan_launch", which is a pointer pointed to "LaunchInfo" structure and allocated it in shared USM. To make this pointer can be used in spir_func w/o extending their argument, I created a global external local memory (external, so that it can be shared with other translation units, and its instance is defined in libdevice), and save the "__asan_launch" into this local memory immediately at the entry of kernel.

UR can't check the name of kernel arguments, so it can't know if the kernel has "__asan_launch". So I assume the "__asan_launch" is always there, and added a check to prevent DAE pass from removing it.

@AllanZyne AllanZyne requested review from a team as code owners April 2, 2024 07:46
@AllanZyne AllanZyne requested review from bso-intel and removed request for a team April 2, 2024 07:46
@AllanZyne AllanZyne marked this pull request as draft April 2, 2024 08:08
FunctionType *NewFTy = FunctionType::get(F->getReturnType(), Types, false);

std::string OrigFuncName = F->getName().str();
F->setName(OrigFuncName + "_del");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewF->takeName(F);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think added "_del" suffix in old func is helpful for debugging

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F is going to erased from parent at the end of this function, right? So why still need for debugging?

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I forget to replace the old function in other instructions/metadata, I can know this immediately.

}

// New Argument
Types.push_back(IntptrTy);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it right that the new arg points to a global memory? If so, why are we using integer type for it.

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a pointer point to global memory. I use integer type for simplicity (just like the "address" parameter is also an integer).
In fact, its type should be __SYCL_GLOBAL__ LaunchInfo*, or simply __SYCL_GLOBAL__ void*, or simply "void *".
I found that using integer value just worked as expect.
But it's ok for me to change it to pointer type if you think it's necessary (I'll also need to modify libdevice file).
What's your idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is probably better to use its real type:

  1. metadata in !kernel_arg_addr_space !kernel_arg_access_qual !kernel_arg_type !kernel_arg_base_type !kernel_arg_type_qual will be correct.
  2. the backend compiler is aware that it is a global buffer.
  3. easy for people to understand the IR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll change its type to "i64 as(1)*".


FixupMetadata("kernel_arg_buffer_location", Builder.getInt32(-1));
FixupMetadata("kernel_arg_runtime_aligned", Builder.getFalse());
FixupMetadata("kernel_arg_exclusive_ptr", Builder.getFalse());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are metadata !kernel_arg_addr_space !kernel_arg_access_qual !kernel_arg_type !kernel_arg_base_type !kernel_arg_type_qual !kernel_arg_accessor_ptr fixed?

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offline sync with @wenju-he, these metadata don't need to update here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These metadata probably still need updating as I've met a case that fails because these metadata are not updated by AddressSanitizerPass

if (F->getCallingConv() == CallingConv::SPIR_KERNEL)
F->removeFromParent();
else
SpirFuncs.emplace_back(F, NewF);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can a sycl kernel be called from another sycl kernel as well?

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure about this, I assume sycl kernel can't call another sycl kernel.
This is the answer from IntelGPU(typo, IntelGPT):

As of the SYCL 2020 specification, SYCL does not natively support invoking one kernel directly from another kernel, a concept often referred to as dynamic parallelism in the context of some other parallel computing platforms like CUDA.

In SYCL, the programming model is generally centered around the submission of kernels from the host code to the device. The kernels are enqueued onto command queues, and the SYCL runtime handles their scheduling and execution. This model does not inherently support kernel-to-kernel launches as one might see with CUDA's dynamic parallelism feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure about this, I assume sycl kernel can't call another sycl kernel. This is the answer from IntelGPU ...

As of the SYCL 2020 specification, SYCL does not natively support invoking one kernel directly from another kernel, a concept often referred to as dynamic parallelism in the context of some other parallel computing platforms like CUDA.
In SYCL, the programming model is generally centered around the submission of kernels from the host code to the device. The kernels are enqueued onto command queues, and the SYCL runtime handles their scheduling and execution. This model does not inherently support kernel-to-kernel launches as one might see with CUDA's dynamic parallelism feature.

I don't think it applies to the @wenju-he's question as his question is not related to the dynamic parallelism. According to my understanding, the question is "can a function with spir_kernel calling convention call another function with spir_kernel calling convention"?
This is not prohibited by SYCL or SPIR-V standards. It might be tricky to create a reproducer for that case using existing DPC++ implementation, but it's not impossible.

BTW, who is IntelGPU?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for correcting me. I'll handle this situation as well.

"IntelGPU" -> "IntelGPT", typo.
I found it's hard to search from SYCL spec or internet, so I tried to ask ChatGPT for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not prohibited by SYCL or SPIR-V standards. It might be tricky to create a reproducer for that case using existing DPC++ implementation, but it's not impossible.

Hi @bader, can you help to craft one example?
I tried to use function object but failed, and device code can't use function pointer either.

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we only extend kernels' argument, if we can't call kernel in device code, then I needn't to fix the users of kernels at all.

Copy link
Contributor

@bader bader Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do this using online compiler extension.
E.g. https://github.com/intel/llvm/blob/c0ade1b520b79d0e1666f193b34524f1e33e29b7/sycl/doc/extensions/experimental/sycl_ext_oneapi_kernel_compiler_opencl.asciidoc

UPDATE: I'm not sure how well SYCL extensions are supported by the Device Sanitizer.

for (auto [F, NewF] : SpirFuncs) {
SmallVector<User *, 16> Users(F->users());
for (User *U : Users) {
if (auto *GA = dyn_cast<GlobalAlias>(U)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

llvm ValueMapper might simplify the logic below and handle more cases like nested constexpr and indirect call.

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indirect call is introduced by mismatched func arguments, device code doesn't support calling function pointer.
Since we only extend spir kernel, we needn't to fix alias and contexpr either.

// Append a new argument "launch_data" to user's spir_kernel & spir_func
static void ExtendSpirKernelArgs(Module &M, FunctionAnalysisManager &FAM) {
SmallVector<Function *> SpirFixupFuncs;
for (Function &F : M) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to patch functions that doesn't use local_accessor?

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, launch_data is also used to store sanitizer report.

#include <sycl/ext/oneapi/group_local_memory.hpp>
#include <sycl/usm.hpp>
// REQUIRES: linux
// RUN: %{build} %device_sanitizer_flags -g -O0 -o %t.out
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add tests to cover sycl external usage scenario?

Copy link
Contributor Author

@AllanZyne AllanZyne Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we can't handle "sycl external" well because global variables (defined in libdevice) are redefined in shared library.
I will fix this when we support link libdevice in UR.

@AllanZyne
Copy link
Contributor Author

Hi
@intel/dpcpp-clang-driver-reviewers
@intel/dpcpp-tools-reviewers
@intel/unified-runtime-reviewers
@intel/llvm-reviewers-runtime

Can you help to review this PR? Thank you very much!

@AllanZyne AllanZyne marked this pull request as ready for review April 11, 2024 09:27
@AllanZyne AllanZyne requested review from a team as code owners April 11, 2024 09:27
@maksimsab
Copy link
Contributor

@AllanZyne
Should llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp and llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp be LIT tested?

@AllanZyne AllanZyne deleted the branch review/yang/use-after-free April 22, 2024 01:56
@AllanZyne AllanZyne closed this Apr 22, 2024
@AllanZyne
Copy link
Contributor Author

Sorry, this pr merged into wrong branch! I'll fix all comments in new PR.

@AllanZyne AllanZyne deleted the review/yang/local_accessor branch April 22, 2024 01:59
@AllanZyne
Copy link
Contributor Author

AllanZyne commented Apr 22, 2024

Recreated PR: #13503
Thanks!

@AllanZyne
Copy link
Contributor Author

@AllanZyne Should llvm/lib/Transforms/IPO/DeadArgumentElimination.cpp and llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp be LIT tested?

Hi @maksimsab, I've added LIT tests in #13503.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants