[AMDGPU] Do not emit arch dependent macros with unspecified cpu #80035

jhuber6 · 2024-01-30T17:08:28Z

Summary:
Currently, the AMDGPU toolchain accepts not passing -mcpu as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into another
program via -mlink-builtin-bitcode to inherit its attributes.

However, there are a handful of macros that can leak incorrect
information when compiling for an unspecified architecture. Currently,
things like the wavefront size will default to 64, which is actually
variable. We should not expose these macros unless it is known.

Summary: Currently, the AMDGPU toolchain accepts not passing `-mcpu` as a means to create a sort of "generic" IR. The resulting IR will not contain any target dependent attributes and can then be inserted into another program via `-mlink-builtin-bitcode` to inherit its attributes. However, there are a handful of macros that can leak incorrect information when compiling for an unspecified architecture. Currently, things like the wavefront size will default to 64, which is actually variable. We should not expose these macros unless it is known.

llvmbot · 2024-01-30T17:08:57Z

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
Currently, the AMDGPU toolchain accepts not passing -mcpu as a means
to create a sort of "generic" IR. The resulting IR will not contain any
target dependent attributes and can then be inserted into another
program via -mlink-builtin-bitcode to inherit its attributes.

However, there are a handful of macros that can leak incorrect
information when compiling for an unspecified architecture. Currently,
things like the wavefront size will default to 64, which is actually
variable. We should not expose these macros unless it is known.

Full diff: https://github.com/llvm/llvm-project/pull/80035.diff

4 Files Affected:

(modified) clang/docs/HIPSupport.rst (+2)
(modified) clang/lib/Basic/Targets/AMDGPU.cpp (+24-24)
(modified) clang/test/CodeGenOpenCL/builtins-amdgcn-wave64.cl (+1-1)
(modified) clang/test/Preprocessor/predefined-arch-macros.c (+12-5)

diff --git a/clang/docs/HIPSupport.rst b/clang/docs/HIPSupport.rst
index 84cee45e83ba3..191efdc1db689 100644
--- a/clang/docs/HIPSupport.rst
+++ b/clang/docs/HIPSupport.rst
@@ -175,6 +175,8 @@ Predefined Macros
      - Defined when the GPU default stream is set to per-thread mode.
    * - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
      - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
+   * - ``__AMDGCN_WAVEFRONT_SIZE__``
+     - Represents the target wavefront size for HIP device compilation and defaults to 64 for HIP host compilation.
 
 Compilation Modes
 =================
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 6f3a4908623da..ab77d116f8ad5 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -274,30 +274,30 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
   else
     Builder.defineMacro("__R600__");
 
-  if (GPUKind != llvm::AMDGPU::GK_NONE) {
-    StringRef CanonName = isAMDGCN(getTriple()) ?
-      getArchNameAMDGCN(GPUKind) : getArchNameR600(GPUKind);
-    Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
-    // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
-    if (isAMDGCN(getTriple())) {
-      assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
-      Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
-                          Twine("__"));
-    }
-    if (isAMDGCN(getTriple())) {
-      Builder.defineMacro("__amdgcn_processor__",
-                          Twine("\"") + Twine(CanonName) + Twine("\""));
-      Builder.defineMacro("__amdgcn_target_id__",
-                          Twine("\"") + Twine(*getTargetID()) + Twine("\""));
-      for (auto F : getAllPossibleTargetIDFeatures(getTriple(), CanonName)) {
-        auto Loc = OffloadArchFeatures.find(F);
-        if (Loc != OffloadArchFeatures.end()) {
-          std::string NewF = F.str();
-          std::replace(NewF.begin(), NewF.end(), '-', '_');
-          Builder.defineMacro(Twine("__amdgcn_feature_") + Twine(NewF) +
-                                  Twine("__"),
-                              Loc->second ? "1" : "0");
-        }
+  // Legacy HIP host code relies on these default attributes to be defined.
+  if (GPUKind == llvm::AMDGPU::GK_NONE && !(Opts.HIP && !Opts.CUDAIsDevice))
+    return;
+
+  StringRef CanonName = isAMDGCN(getTriple()) ? getArchNameAMDGCN(GPUKind)
+                                              : getArchNameR600(GPUKind);
+  Builder.defineMacro(Twine("__") + Twine(CanonName) + Twine("__"));
+  // Emit macros for gfx family e.g. gfx906 -> __GFX9__, gfx1030 -> __GFX10___
+  if (isAMDGCN(getTriple())) {
+    assert(CanonName.starts_with("gfx") && "Invalid amdgcn canonical name");
+    Builder.defineMacro(Twine("__") + Twine(CanonName.drop_back(2).upper()) +
+                        Twine("__"));
+    Builder.defineMacro("__amdgcn_processor__",
+                        Twine("\"") + Twine(CanonName) + Twine("\""));
+    Builder.defineMacro("__amdgcn_target_id__",
+                        Twine("\"") + Twine(*getTargetID()) + Twine("\""));
+    for (auto F : getAllPossibleTargetIDFeatures(getTriple(), CanonName)) {
+      auto Loc = OffloadArchFeatures.find(F);
+      if (Loc != OffloadArchFeatures.end()) {
+        std::string NewF = F.str();
+        std::replace(NewF.begin(), NewF.end(), '-', '_');
+        Builder.defineMacro(Twine("__amdgcn_feature_") + Twine(NewF) +
+                                Twine("__"),
+                            Loc->second ? "1" : "0");
       }
     }
   }
diff --git a/clang/test/CodeGenOpenCL/builtins-amdgcn-wave64.cl b/clang/test/CodeGenOpenCL/builtins-amdgcn-wave64.cl
index 53f34c6a44ae7..5875f6fef2f27 100644
--- a/clang/test/CodeGenOpenCL/builtins-amdgcn-wave64.cl
+++ b/clang/test/CodeGenOpenCL/builtins-amdgcn-wave64.cl
@@ -44,6 +44,6 @@ void test_read_exec_hi(global ulong* out) {
   *out = __builtin_amdgcn_read_exec_hi();
 }
 
-#if __AMDGCN_WAVEFRONT_SIZE != 64
+#if defined(__AMDGCN_WAVEFRONT_SIZE__) && __AMDGCN_WAVEFRONT_SIZE__ != 64
 #error Wrong wavesize detected
 #endif
diff --git a/clang/test/Preprocessor/predefined-arch-macros.c b/clang/test/Preprocessor/predefined-arch-macros.c
index 27c7b4a271fee..9879a61825fbb 100644
--- a/clang/test/Preprocessor/predefined-arch-macros.c
+++ b/clang/test/Preprocessor/predefined-arch-macros.c
@@ -4294,13 +4294,20 @@
 
 // Begin amdgcn tests ----------------
 
-// RUN: %clang -march=amdgcn -E -dM %s -o - 2>&1 \
+// RUN: %clang -mcpu=gfx803 -E -dM %s -o - 2>&1 \
+// RUN:     -target amdgcn-unknown-unknown \
+// RUN:   | FileCheck -match-full-lines %s -check-prefixes=CHECK_AMDGCN,CHECK_AMDGCN_803
+// RUN: %clang -E -dM %s -o - 2>&1 \
 // RUN:     -target amdgcn-unknown-unknown \
-// RUN:   | FileCheck -match-full-lines %s -check-prefix=CHECK_AMDGCN
+// RUN:   | FileCheck -match-full-lines %s -check-prefixes=CHECK_AMDGCN,CHECK_AMDGCN_NONE
 // CHECK_AMDGCN: #define __AMDGCN__ 1
-// CHECK_AMDGCN: #define __HAS_FMAF__ 1
-// CHECK_AMDGCN: #define __HAS_FP64__ 1
-// CHECK_AMDGCN: #define __HAS_LDEXPF__ 1
+// CHECK_AMDGCN_803: #define __HAS_FMAF__ 1
+// CHECK_AMDGCN_803: #define __HAS_FP64__ 1
+// CHECK_AMDGCN_803: #define __HAS_LDEXPF__ 1
+// CHECK_AMDGCN_NONE-NOT: #define __HAS_FMAF__
+// CHECK_AMDGCN_NONE-NOT: #define __HAS_FP64__
+// CHECK_AMDGCN_NONE-NOT: #define __HAS_LDEXPF__
+// CHECK_AMDGCN_NONE-NOT: #define __AMDGCN_WAVEFRONT_SIZE__
 
 // Begin r600 tests ----------------

jhuber6 · 2024-01-30T17:16:21Z

Rework of #79660 to handle old behavior of these being defined for the host.

yxsamliu · 2024-01-30T17:54:00Z

clang/docs/HIPSupport.rst

@@ -175,6 +175,8 @@ Predefined Macros
     - Defined when the GPU default stream is set to per-thread mode.
   * - ``HIP_API_PER_THREAD_DEFAULT_STREAM``
     - Alias to ``__HIP_API_PER_THREAD_DEFAULT_STREAM__``. Deprecated.
+   * - ``__AMDGCN_WAVEFRONT_SIZE__``


this macro and other AMDGPU predefined macros are documented at https://clang.llvm.org/docs/AMDGPUSupport.html

we should not just mention the link and talk about their dependency on specific GPU arch and taking default value in host compilation.

do these really need to be defined for the host?

No, it's explicitly wrong, but there's a lot of HIP headers that depend on it for some reason so we can't break it.

Tried to explain this further.

clang/test/Preprocessor/predefined-arch-macros.c

yxsamliu

LGTM. Thanks

nico · 2024-01-30T19:23:55Z

This seems to break tests: http://45.33.8.238/linux/129493/step_7.txt

Please take a look and revert for now if it takes a while to fix.

jhuber6 · 2024-01-30T19:26:10Z

This seems to break tests: http://45.33.8.238/linux/129493/step_7.txt

Please take a look and revert for now if it takes a while to fix.

Is it still broken? I pushed a fix because I'm pretty sure the problem was not passing -nogpulib -nogpuinc so the test runs on machines without ROCm. I always forget about that part because I have ROCm on my machine. It used to work silently before a patch last year or something that made clang failing make the test fail.

nico · 2024-01-30T19:32:53Z

This is with that change: http://45.33.8.238/linux/129493/step_7.txt

nico · 2024-01-30T19:34:48Z

i.e. it helped with Clang :: Preprocessor/predefined-arch-macros.c but not with:

Failed Tests (2):
Clang :: Driver/amdgpu-macros.cl
Clang :: Driver/target-id-macros.cl

jhuber6 · 2024-01-30T19:36:37Z

i.e. it helped with Clang :: Preprocessor/predefined-arch-macros.c but not with:

Failed Tests (2): Clang :: Driver/amdgpu-macros.cl Clang :: Driver/target-id-macros.cl

Thanks, seeing it locally now. I'll try to fix it quick and revert if it's not working soon.

jhuber6 · 2024-01-30T19:45:24Z

i.e. it helped with Clang :: Preprocessor/predefined-arch-macros.c but not with:

Failed Tests (2): Clang :: Driver/amdgpu-macros.cl Clang :: Driver/target-id-macros.cl

Pushed a fix, check-clang passes on my machine now. Let me know if it's still broken.

nico · 2024-01-30T19:55:05Z

My bot is happy again, thanks.

Please run tests before committing :)

jhuber6 requested review from b-sumner, JonChesterfield and yxsamliu January 30, 2024 17:08

llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 30, 2024

yxsamliu reviewed Jan 30, 2024

View reviewed changes

clang/test/Preprocessor/predefined-arch-macros.c Show resolved Hide resolved

Address comments

9105363

yxsamliu approved these changes Jan 30, 2024

View reviewed changes

jhuber6 merged commit f2a78e6 into llvm:main Jan 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AMDGPU] Do not emit arch dependent macros with unspecified cpu #80035

[AMDGPU] Do not emit arch dependent macros with unspecified cpu #80035

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

llvmbot commented Jan 30, 2024 •

edited

Loading

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

yxsamliu Jan 30, 2024

Uh oh!

arsenm Jan 30, 2024

Uh oh!

jhuber6 Jan 30, 2024

Uh oh!

jhuber6 Jan 30, 2024

Uh oh!

Uh oh!

yxsamliu left a comment

Uh oh!

nico commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

Uh oh!

[AMDGPU] Do not emit arch dependent macros with unspecified cpu #80035

[AMDGPU] Do not emit arch dependent macros with unspecified cpu #80035

Uh oh!

Conversation

jhuber6 commented Jan 30, 2024

Uh oh!

llvmbot commented Jan 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

yxsamliu Jan 30, 2024

Choose a reason for hiding this comment

Uh oh!

arsenm Jan 30, 2024

Choose a reason for hiding this comment

Uh oh!

jhuber6 Jan 30, 2024

Choose a reason for hiding this comment

Uh oh!

jhuber6 Jan 30, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

yxsamliu left a comment

Choose a reason for hiding this comment

Uh oh!

nico commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

jhuber6 commented Jan 30, 2024

Uh oh!

nico commented Jan 30, 2024

Uh oh!

Uh oh!

llvmbot commented Jan 30, 2024 •

edited

Loading