[SYCL] Free functions and dynamic linking fixes for CUDA/HIP #17899

Merged 6 commits on Apr 11, 2025
15 changes: 10 additions & 5 deletions llvm/lib/SYCLPostLink/ModuleSplitter.cpp
@@ -121,8 +121,9 @@ bool isGenericBuiltin(StringRef FName) {
 }

 bool isKernel(const Function &F) {
-  return F.getCallingConv() == CallingConv::SPIR_KERNEL ||
-         F.getCallingConv() == CallingConv::AMDGPU_KERNEL;
+  const auto CC = F.getCallingConv();
+  return CC == CallingConv::SPIR_KERNEL || CC == CallingConv::AMDGPU_KERNEL ||
+         CC == CallingConv::PTX_Kernel;
 }

 bool isEntryPoint(const Function &F, bool EmitOnlyKernelsAsEntryPoints) {
@@ -697,9 +698,13 @@ static bool mustPreserveGV(const GlobalValue &GV) {
     // kernels which are the entry points from host code to device code) that
     // cannot be imported which also means that there is no point of having it
     // visible outside of the current module.
-    if (AllowDeviceImageDependencies)
-      return F->getCallingConv() == CallingConv::SPIR_KERNEL ||
-             canBeImportedFunction(*F);
+    if (AllowDeviceImageDependencies) {
+      const auto CC = F->getCallingConv();
+      const bool SpirOrGPU = CC == CallingConv::SPIR_KERNEL ||
+                             CC == CallingConv::AMDGPU_KERNEL ||
+                             CC == CallingConv::PTX_Kernel;
+      return SpirOrGPU || canBeImportedFunction(*F);
+    }

     // Otherwise, we are being even more aggressive: SYCL modules are expected
     // to be self-contained, meaning that they have no external dependencies.
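
Note on the two ModuleSplitter.cpp hunks: both now apply the same three-way calling-convention test, so PTX (CUDA) and AMDGPU (HIP) kernels are treated like SPIR kernels. The following is a minimal standalone sketch of that predicate, not the code in ModuleSplitter.cpp. The `CallingConv` enum, the `Function` struct, its `Importable` field, and the name `mustPreserveWithDependencies` are hypothetical stand-ins for the LLVM types and for `canBeImportedFunction`; the sketch assumes C++14 or later.

```cpp
// Hypothetical stand-ins for llvm::CallingConv and llvm::Function; only the
// pieces the predicate needs are modelled here.
enum class CallingConv { C, SPIR_KERNEL, AMDGPU_KERNEL, PTX_Kernel };

struct Function {
  CallingConv CC;
  bool Importable; // stand-in for the result of canBeImportedFunction(F)
  constexpr CallingConv getCallingConv() const { return CC; }
};

// Mirrors the shape of the patched isKernel: a function counts as a kernel
// if it uses any of the three kernel calling conventions.
constexpr bool isKernel(const Function &F) {
  const auto CC = F.getCallingConv();
  return CC == CallingConv::SPIR_KERNEL || CC == CallingConv::AMDGPU_KERNEL ||
         CC == CallingConv::PTX_Kernel;
}

// Models the AllowDeviceImageDependencies branch of mustPreserveGV: keep the
// symbol if it is a kernel or if it can be imported by another device image.
constexpr bool mustPreserveWithDependencies(const Function &F) {
  return isKernel(F) || F.Importable;
}

// Compile-time checks of the behaviour described above.
static_assert(isKernel(Function{CallingConv::PTX_Kernel, false}),
              "PTX kernels are recognised after the change");
static_assert(!isKernel(Function{CallingConv::C, false}),
              "ordinary functions are not kernels");
static_assert(mustPreserveWithDependencies(Function{CallingConv::C, true}),
              "importable non-kernels are still preserved");
```

Before the change, the first hunk omitted `PTX_Kernel` and the second checked only `SPIR_KERNEL`, so in this model a CUDA or HIP kernel that was not importable would not have been preserved.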
@@ -1,16 +1,12 @@
+// UNSUPPORTED: true
+// UNSUPPORTED-TRACKER: https://github.com/intel/llvm/issues/17812
+
 // Ensure -fsycl-allow-device-dependencies can work with free function kernels.

 // REQUIRES: aspect-usm_shared_allocations
 // RUN: %{build} -o %t.out --offload-new-driver -fsycl-allow-device-image-dependencies
 // RUN: %{run} %t.out

-// The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: target-nvidia
-// UNSUPPORTED-INTENDED: Not implemented yet for Nvidia/AMD backends.
-
-// XFAIL: target-amd
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/15742
-
 #include <iostream>
 #include <sycl/detail/core.hpp>
 #include <sycl/ext/oneapi/free_function_queries.hpp>
@@ -4,12 +4,6 @@
 // RUN: %{build} -o %t.out -fsycl-allow-device-image-dependencies
 // RUN: %{run} %t.out

-// The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: cuda
-
-// XFAIL: hip
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/15742
-
 #include <iostream>
 #include <sycl/detail/core.hpp>
 #include <sycl/ext/oneapi/free_function_queries.hpp>
3 changes: 0 additions & 3 deletions sycl/test-e2e/Graph/Explicit/free_function_kernels.cpp
Contributor commented:

Would be worth checking in Graph/Update/FreeFunctionKernels for other tests to re-enable. I think most (all?) of them have this cuda xfail.

@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 #define GRAPH_E2E_EXPLICIT

@@ -5,9 +5,6 @@
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}

-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004
-
 #define GRAPH_E2E_EXPLICIT

 #include "../Inputs/work_group_memory_free_function.cpp"
3 changes: 0 additions & 3 deletions sycl/test-e2e/Graph/RecordReplay/free_function_kernels.cpp
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 #define GRAPH_E2E_RECORD_REPLAY

@@ -5,9 +5,6 @@
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}

-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004
-
 #define GRAPH_E2E_RECORD_REPLAY

 #include "../Inputs/work_group_memory_free_function.cpp"
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating dynamic_work_group_memory with a new size.

@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a graph node before finalization

@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests creating multiple executable graphs from the same modifiable graph and
 // only updating one of them.
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests that updating a graph is ordered with respect to previous executions of
 // the graph which may be in flight.
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a graph node using index-based explicit update

@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a 3D ND-Range graph kernel node using index-based explicit
 // update
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a graph node using index-based explicit update

@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a single dynamic parameter which is registered with multiple
 // graph nodes
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating multiple parameters to a singlegraph node using index-based
 // explicit update
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a graph node in an executable graph that was used as a
 // subgraph node in another executable graph is not reflected in the graph
@@ -4,9 +4,6 @@
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=0 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
 // Extra run to check for immediate-command-list in Level Zero
 // RUN: %if level_zero %{env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1 %{l0_leak_check} %{run} %t.out 2>&1 | FileCheck %s --implicit-check-not=LEAK %}
-//
-// XFAIL: cuda
-// XFAIL-TRACKER: https://github.com/intel/llvm/issues/16004

 // Tests updating a graph node scalar argument using index-based explicit update

3 changes: 0 additions & 3 deletions sycl/test-e2e/KernelAndProgram/free_function_apis.cpp
@@ -2,9 +2,6 @@
 // RUN: %{build} -o %t.out
 // RUN: %{run} %t.out

-// The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: cuda
-
 #include <iostream>
 #include <sycl/detail/core.hpp>
 #include <sycl/ext/oneapi/experimental/free_function_traits.hpp>
3 changes: 0 additions & 3 deletions sycl/test-e2e/KernelAndProgram/free_function_kernels.cpp
@@ -2,9 +2,6 @@
 // RUN: %{build} -o %t.out
 // RUN: %{run} %t.out

-// The name mangling for free function kernels currently does not work with PTX.
-// UNSUPPORTED: cuda
-
 // This test tests free function kernel code generation and execution.

 #include <iostream>
3 changes: 0 additions & 3 deletions sycl/test-e2e/WorkGroupMemory/reduction_free_function.cpp
@@ -2,9 +2,6 @@
 // RUN: %{build} -o %t.out
 // RUN: %{run} %t.out

-// UNSUPPORTED: cuda
-// UNSUPPORTED-TRACKER: https://github.com/intel/llvm/issues/16004
-
 #include "common_free_function.hpp"

 // Basic usage reduction test using free function kernels.
@@ -54,7 +54,7 @@
 // tests to match the required format and in that case you should just update
 // (i.e. reduce) the number and the list below.
 //
-// NUMBER-OF-UNSUPPORTED-WITHOUT-INFO: 276
+// NUMBER-OF-UNSUPPORTED-WITHOUT-INFO: 273
 //
 // List of improperly UNSUPPORTED tests.
 // Remove the CHECK once the test has been properly UNSUPPORTED.
@@ -93,7 +93,6 @@
 // CHECK-NEXT: Basic/kernel_info_attr.cpp
 // CHECK-NEXT: Basic/submit_time.cpp
 // CHECK-NEXT: DeviceImageDependencies/dynamic.cpp
-// CHECK-NEXT: DeviceImageDependencies/free_function_kernels.cpp
 // CHECK-NEXT: DeviceImageDependencies/math_device_lib.cpp
 // CHECK-NEXT: DeviceImageDependencies/objects.cpp
 // CHECK-NEXT: DeviceImageDependencies/singleDynamicLibrary.cpp
@@ -205,8 +204,6 @@
 // CHECK-NEXT: InvokeSimd/Regression/ImplicitSubgroup/call_vadd_1d_spill.cpp
 // CHECK-NEXT: InvokeSimd/Regression/call_vadd_1d_spill.cpp
 // CHECK-NEXT: KernelAndProgram/cache-build-result.cpp
-// CHECK-NEXT: KernelAndProgram/free_function_apis.cpp
-// CHECK-NEXT: KernelAndProgram/free_function_kernels.cpp
 // CHECK-NEXT: KernelAndProgram/kernel-bundle-merge-options-env.cpp
 // CHECK-NEXT: KernelAndProgram/kernel-bundle-merge-options.cpp
 // CHECK-NEXT: KernelAndProgram/level-zero-static-link-flow.cpp
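
Note on the count change above: `NUMBER-OF-UNSUPPORTED-WITHOUT-INFO` drops from 276 to 273, matching the three tests in this diff that lose a bare `UNSUPPORTED: cuda` line. The sketch below illustrates one rule that is consistent with that bookkeeping. It assumes an `UNSUPPORTED:` directive counts as "without info" when the next line is neither `UNSUPPORTED-TRACKER:` nor `UNSUPPORTED-INTENDED:`. That rule is inferred from this diff, and the helper names are hypothetical; the real checker's implementation is not shown here and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if Line mentions the given lit directive, e.g. "UNSUPPORTED:".
// Note that "UNSUPPORTED-TRACKER:" does not contain "UNSUPPORTED:".
bool hasDirective(const std::string &Line, const std::string &Directive) {
  return Line.find(Directive) != std::string::npos;
}

// Counts "UNSUPPORTED:" lines that are NOT immediately followed by an
// "UNSUPPORTED-TRACKER:" or "UNSUPPORTED-INTENDED:" line.
// Assumed rule, inferred from the diff; not the checker's actual code.
std::size_t countUnsupportedWithoutInfo(const std::vector<std::string> &Lines) {
  std::size_t Count = 0;
  for (std::size_t I = 0; I < Lines.size(); ++I) {
    if (!hasDirective(Lines[I], "UNSUPPORTED:"))
      continue;
    const bool HasInfo =
        I + 1 < Lines.size() &&
        (hasDirective(Lines[I + 1], "UNSUPPORTED-TRACKER:") ||
         hasDirective(Lines[I + 1], "UNSUPPORTED-INTENDED:"));
    if (!HasInfo)
      ++Count;
  }
  return Count;
}

// Test headers taken from lines removed in this diff.
void checkExamples() {
  const std::vector<std::string> Bare = {
      "// The name mangling for free function kernels currently does not "
      "work with PTX.",
      "// UNSUPPORTED: cuda", ""};
  const std::vector<std::string> Tracked = {
      "// UNSUPPORTED: cuda",
      "// UNSUPPORTED-TRACKER: https://github.com/intel/llvm/issues/16004", ""};
  assert(countUnsupportedWithoutInfo(Bare) == 1);    // listed as improper
  assert(countUnsupportedWithoutInfo(Tracked) == 0); // properly annotated
}
```

Under this assumed rule, WorkGroupMemory/reduction_free_function.cpp already carried a tracker, so removing its directive does not change the count. The two KernelAndProgram tests and DeviceImageDependencies/free_function_kernels.cpp each contributed one.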