Skip to content

[AMDGPU] Do not enable GPU sanitizers by default #126090

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 6, 2025
Merged

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Feb 6, 2025

Summary:
This probably wasn't the intended result, but the code here causes
OpenMP to always link in ockl.bc which was intentionally not linked.
This results in the OCKL definitions conflicting with the OpenMP ones
and also prevents them from being optimized out (Might be fixed with
newer ROCm that actually builds the visibility correctly).

I'm pretty sure the only reason this didn't break the tests is because
we're smart and pass -nogpulib there to keep the environment from
being poisoned with stuff like this.

Summary:
This probably wasn't the intended result, but the code here causes
OpenMP to always link in `ockl.bc` which was intentionally not linked.
This results in the OCKL definitions conflicting with the OpenMP ones
and also prevents them from being optimized out (Might be fixed with
newer ROCm that actually builds the visibility correctly).

I'm pretty sure the only reason this didn't break the tests is because
we're smart and pass `-nogpulib` there to keep the environment from
being poisoned with stuff like this.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' labels Feb 6, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Summary:
This probably wasn't the intended result, but the code here causes
OpenMP to always link in ockl.bc which was intentionally not linked.
This results in the OCKL definitions conflicting with the OpenMP ones
and also prevents them from being optimized out (Might be fixed with
newer ROCm that actually builds the visibility correctly).

I'm pretty sure the only reason this didn't break the tests is because
we're smart and pass -nogpulib there to keep the environment from
being poisoned with stuff like this.


Full diff: https://github.com/llvm/llvm-project/pull/126090.diff

2 Files Affected:

  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+2-2)
  • (modified) clang/test/Driver/hip-sanitize-options.hip (+1-1)
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index 83f486611bc946..e66e5a32e58acd 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -1066,7 +1066,7 @@ ROCMToolChain::getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
   // them all?
   std::tuple<bool, const SanitizerArgs> GPUSan(
       DriverArgs.hasFlag(options::OPT_fgpu_sanitize,
-                         options::OPT_fno_gpu_sanitize, true),
+                         options::OPT_fno_gpu_sanitize, false),
       getSanitizerArgs(DriverArgs));
   bool DAZ = DriverArgs.hasFlag(options::OPT_fgpu_flush_denormals_to_zero,
                                 options::OPT_fno_gpu_flush_denormals_to_zero,
@@ -1099,7 +1099,7 @@ bool AMDGPUToolChain::shouldSkipSanitizeOption(
     return false;
 
   if (!DriverArgs.hasFlag(options::OPT_fgpu_sanitize,
-                          options::OPT_fno_gpu_sanitize, true))
+                          options::OPT_fno_gpu_sanitize, false))
     return true;
 
   auto &Diags = TC.getDriver().getDiags();
diff --git a/clang/test/Driver/hip-sanitize-options.hip b/clang/test/Driver/hip-sanitize-options.hip
index 8de0ee9e18426b..8a852867f5b3b3 100644
--- a/clang/test/Driver/hip-sanitize-options.hip
+++ b/clang/test/Driver/hip-sanitize-options.hip
@@ -1,5 +1,5 @@
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu --offload-arch=gfx900:xnack+ \
-// RUN:   -fsanitize=address \
+// RUN:   -fsanitize=address -fgpu-sanitize \
 // RUN:   -nogpuinc --rocm-path=%S/Inputs/rocm \
 // RUN:   %s 2>&1 | FileCheck -check-prefixes=NORDC %s
 

@ampandey-1995
Copy link
Contributor

Accepting this PR right now , will come with fix https://github.com/llvm/llvm-project/pull/123922/files#r1945159951 soon

@jhuber6 jhuber6 merged commit ff049e0 into llvm:main Feb 6, 2025
10 of 11 checks passed
@jhuber6 jhuber6 deleted the Sanitize branch February 6, 2025 18:10
@yxsamliu
Copy link
Collaborator

yxsamliu commented Feb 6, 2025

This may break ROCm santizer builds since whether santizer is on should be determined by -fsanitize only, i.e. by default -fgpu-sanitize should be on.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Feb 6, 2025

This may break ROCm santizer builds since whether santizer is on should be determined by -fsanitize only, i.e. by default -fgpu-sanitize should be on.

Yeah @ampandey-1995 will fix it properly, the issue with the previous patch was that it was enabling it for all compilations because it considered -fgpu-sanitize always on regardless of -fsanitize. This is a temporary revert of that part upstream.

@yxsamliu yxsamliu requested a review from b-sumner February 6, 2025 19:16
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
Summary:
This probably wasn't the intended result, but the code here causes
OpenMP to always link in `ockl.bc` which was intentionally not linked.
This results in the OCKL definitions conflicting with the OpenMP ones
and also prevents them from being optimized out (Might be fixed with
newer ROCm that actually builds the visibility correctly).

I'm pretty sure the only reason this didn't break the tests is because
we're smart and pass `-nogpulib` there to keep the environment from
being poisoned with stuff like this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants