Skip to content

[Clang][HIP] Deprecate the AMDGCN_WAVEFRONT_SIZE macros #112849

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 8, 2024

Conversation

ritter-x2a
Copy link
Member

So far, these macros can be used in contexts where no meaningful wavefront size is available. We therefore deprecate these macros, to replace them with a more resilient interface to access wavefront size information where it is available.

For SWDEV-491529.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Oct 18, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Fabian Ritter (ritter-x2a)

Changes

So far, these macros can be used in contexts where no meaningful wavefront size is available. We therefore deprecate these macros, to replace them with a more resilient interface to access wavefront size information where it is available.

For SWDEV-491529.


Full diff: https://github.com/llvm/llvm-project/pull/112849.diff

5 Files Affected:

  • (modified) clang/docs/AMDGPUSupport.rst (+2-2)
  • (modified) clang/docs/HIPSupport.rst (+1-1)
  • (modified) clang/include/clang/Basic/MacroBuilder.h (+8-1)
  • (modified) clang/lib/Basic/Targets/AMDGPU.cpp (+6-3)
  • (added) clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip (+111)
diff --git a/clang/docs/AMDGPUSupport.rst b/clang/docs/AMDGPUSupport.rst
index e63c0e1ba7d67b..3eada5f900613a 100644
--- a/clang/docs/AMDGPUSupport.rst
+++ b/clang/docs/AMDGPUSupport.rst
@@ -50,9 +50,9 @@ Predefined Macros
    * - ``__AMDGCN_UNSAFE_FP_ATOMICS__``
      - Defined if unsafe floating-point atomics are allowed.
    * - ``__AMDGCN_WAVEFRONT_SIZE__``
-     - Defines the wavefront size. Allowed values are 32 and 64.
+     - Defines the wavefront size. Allowed values are 32 and 64 (deprecated).
    * - ``__AMDGCN_WAVEFRONT_SIZE``
-     - Alias to ``__AMDGCN_WAVEFRONT_SIZE__``. To be deprecated.
+     - Alias to ``__AMDGCN_WAVEFRONT_SIZE__`` (deprecated).
    * - ``__HAS_FMAF__``
      - Defined if FMAF instruction is available (deprecated).
    * - ``__HAS_LDEXPF__``
diff --git a/clang/docs/HIPSupport.rst b/clang/docs/HIPSupport.rst
index e26297c7af97ac..e830acd8dd85c0 100644
--- a/clang/docs/HIPSupport.rst
+++ b/clang/docs/HIPSupport.rst
@@ -178,7 +178,7 @@ Predefined Macros
 
 Note that some architecture specific AMDGPU macros will have default values when
 used from the HIP host compilation. Other :doc:`AMDGPU macros <AMDGPUSupport>`
-like ``__AMDGCN_WAVEFRONT_SIZE__`` will default to 64 for example.
+like ``__AMDGCN_WAVEFRONT_SIZE__`` (deprecated) will default to 64 for example.
 
 Compilation Modes
 =================
diff --git a/clang/include/clang/Basic/MacroBuilder.h b/clang/include/clang/Basic/MacroBuilder.h
index 96e67cbbfa3f21..c8236cb40a1cf2 100644
--- a/clang/include/clang/Basic/MacroBuilder.h
+++ b/clang/include/clang/Basic/MacroBuilder.h
@@ -17,6 +17,7 @@
 #include "clang/Basic/LLVM.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Support/raw_ostream.h"
+#include <optional>
 
 namespace clang {
 
@@ -26,8 +27,14 @@ class MacroBuilder {
   MacroBuilder(raw_ostream &Output) : Out(Output) {}
 
   /// Append a \#define line for macro of the form "\#define Name Value\n".
-  void defineMacro(const Twine &Name, const Twine &Value = "1") {
+  /// If DeprecationMsg is provided, also append a pragma to deprecate the
+  /// defined macro.
+  void defineMacro(const Twine &Name, const Twine &Value = "1",
+                   std::optional<Twine> DeprecationMsg = std::nullopt) {
     Out << "#define " << Name << ' ' << Value << '\n';
+    if (DeprecationMsg.has_value())
+      Out << "#pragma clang deprecated(" << Name << ", \""
+          << DeprecationMsg.value() << "\")\n";
   }
 
   /// Append a \#undef line for Name.  Name should be of the form XXX
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 3b748d0249d57b..8bb4cf5c597dd7 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -337,9 +337,12 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (hasFastFMA())
     Builder.defineMacro("FP_FAST_FMA");
 
-  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize));
-  // ToDo: deprecate this macro for naming consistency.
-  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize));
+  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize),
+                      "compile-time-constant access to the wavefront size will "
+                      "be removed in a future release");
+  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize),
+                      "compile-time-constant access to the wavefront size will "
+                      "be removed in a future release");
   Builder.defineMacro("__AMDGCN_CUMODE__", Twine(CUMode));
 }
 
diff --git a/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip b/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip
new file mode 100644
index 00000000000000..aca591536a76c0
--- /dev/null
+++ b/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip
@@ -0,0 +1,111 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-host-only -pedantic -nogpuinc -nogpulib -nobuiltininc -fsyntax-only -Xclang -verify %s
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-device-only -pedantic -nogpuinc -nogpulib -nobuiltininc -fsyntax-only -Xclang -verify %s
+
+// Test that deprecation warnings for the wavefront size macro are emitted properly.
+
+#include <type_traits>
+
+#define WRAPPED __AMDGCN_WAVEFRONT_SIZE__
+
+#define DOUBLE_WRAPPED (WRAPPED)
+
+__attribute__((host, device)) void use(int, const char*);
+
+template<int N> __attribute__((host, device)) int templatify(int x) {
+    return x + N;
+}
+
+__attribute__((device)) const int GlobalConst = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+constexpr int GlobalConstExpr = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+#if defined(__HIP_DEVICE_COMPILE__) && (__AMDGCN_WAVEFRONT_SIZE__ == 64) // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int foo(void);
+#endif
+
+__attribute__((device)) int device_var = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+__attribute__((device))
+void device_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(GlobalConst, "device function");
+    use(GlobalConstExpr, "device function");
+}
+
+__attribute__((global))
+void global_fun() {
+    // no warnings expected
+    use(__AMDGCN_WAVEFRONT_SIZE, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+int host_var = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int host_var_alt = __AMDGCN_WAVEFRONT_SIZE; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+int host_var_wrapped = WRAPPED; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int host_var_double_wrapped = DOUBLE_WRAPPED; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+__attribute__((host))
+void host_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(GlobalConst, "host function");
+    use(GlobalConstExpr, "host function");
+}
+
+__attribute((host, device))
+void host_device_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE__, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+template <unsigned int OuterWarpSize = __AMDGCN_WAVEFRONT_SIZE__> // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+class FunSelector {
+public:
+    template<unsigned int FunWarpSize = OuterWarpSize>
+    __attribute__((device))
+    auto fun(void)
+        -> typename std::enable_if<(FunWarpSize <= __AMDGCN_WAVEFRONT_SIZE__), void>::type // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    {
+        use(1, "yay!");
+    }
+
+    template<unsigned int FunWarpSize = OuterWarpSize>
+    __attribute__((device))
+    auto fun(void)
+        -> typename std::enable_if<(FunWarpSize > __AMDGCN_WAVEFRONT_SIZE__), void>::type // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    {
+        use(0, "nay!");
+    }
+};
+
+__attribute__((device))
+void device_fun_selector_user() {
+    FunSelector<> f;
+    f.fun<>();
+    f.fun<1>();
+    f.fun<1000>();
+
+    std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type x = 42; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+__attribute__((device)) std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type DeviceFunTemplateRet(void) { // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    return 42;
+}
+
+__attribute__((device)) int DeviceFunTemplateArg(std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type x) { // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    return x;
+}
+
+// expected-note@* 0+ {{macro marked 'deprecated' here}}

@llvmbot
Copy link
Member

llvmbot commented Oct 18, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Fabian Ritter (ritter-x2a)

Changes

So far, these macros can be used in contexts where no meaningful wavefront size is available. We therefore deprecate these macros, to replace them with a more resilient interface to access wavefront size information where it is available.

For SWDEV-491529.


Full diff: https://github.com/llvm/llvm-project/pull/112849.diff

5 Files Affected:

  • (modified) clang/docs/AMDGPUSupport.rst (+2-2)
  • (modified) clang/docs/HIPSupport.rst (+1-1)
  • (modified) clang/include/clang/Basic/MacroBuilder.h (+8-1)
  • (modified) clang/lib/Basic/Targets/AMDGPU.cpp (+6-3)
  • (added) clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip (+111)
diff --git a/clang/docs/AMDGPUSupport.rst b/clang/docs/AMDGPUSupport.rst
index e63c0e1ba7d67b..3eada5f900613a 100644
--- a/clang/docs/AMDGPUSupport.rst
+++ b/clang/docs/AMDGPUSupport.rst
@@ -50,9 +50,9 @@ Predefined Macros
    * - ``__AMDGCN_UNSAFE_FP_ATOMICS__``
      - Defined if unsafe floating-point atomics are allowed.
    * - ``__AMDGCN_WAVEFRONT_SIZE__``
-     - Defines the wavefront size. Allowed values are 32 and 64.
+     - Defines the wavefront size. Allowed values are 32 and 64 (deprecated).
    * - ``__AMDGCN_WAVEFRONT_SIZE``
-     - Alias to ``__AMDGCN_WAVEFRONT_SIZE__``. To be deprecated.
+     - Alias to ``__AMDGCN_WAVEFRONT_SIZE__`` (deprecated).
    * - ``__HAS_FMAF__``
      - Defined if FMAF instruction is available (deprecated).
    * - ``__HAS_LDEXPF__``
diff --git a/clang/docs/HIPSupport.rst b/clang/docs/HIPSupport.rst
index e26297c7af97ac..e830acd8dd85c0 100644
--- a/clang/docs/HIPSupport.rst
+++ b/clang/docs/HIPSupport.rst
@@ -178,7 +178,7 @@ Predefined Macros
 
 Note that some architecture specific AMDGPU macros will have default values when
 used from the HIP host compilation. Other :doc:`AMDGPU macros <AMDGPUSupport>`
-like ``__AMDGCN_WAVEFRONT_SIZE__`` will default to 64 for example.
+like ``__AMDGCN_WAVEFRONT_SIZE__`` (deprecated) will default to 64 for example.
 
 Compilation Modes
 =================
diff --git a/clang/include/clang/Basic/MacroBuilder.h b/clang/include/clang/Basic/MacroBuilder.h
index 96e67cbbfa3f21..c8236cb40a1cf2 100644
--- a/clang/include/clang/Basic/MacroBuilder.h
+++ b/clang/include/clang/Basic/MacroBuilder.h
@@ -17,6 +17,7 @@
 #include "clang/Basic/LLVM.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Support/raw_ostream.h"
+#include <optional>
 
 namespace clang {
 
@@ -26,8 +27,14 @@ class MacroBuilder {
   MacroBuilder(raw_ostream &Output) : Out(Output) {}
 
   /// Append a \#define line for macro of the form "\#define Name Value\n".
-  void defineMacro(const Twine &Name, const Twine &Value = "1") {
+  /// If DeprecationMsg is provided, also append a pragma to deprecate the
+  /// defined macro.
+  void defineMacro(const Twine &Name, const Twine &Value = "1",
+                   std::optional<Twine> DeprecationMsg = std::nullopt) {
     Out << "#define " << Name << ' ' << Value << '\n';
+    if (DeprecationMsg.has_value())
+      Out << "#pragma clang deprecated(" << Name << ", \""
+          << DeprecationMsg.value() << "\")\n";
   }
 
   /// Append a \#undef line for Name.  Name should be of the form XXX
diff --git a/clang/lib/Basic/Targets/AMDGPU.cpp b/clang/lib/Basic/Targets/AMDGPU.cpp
index 3b748d0249d57b..8bb4cf5c597dd7 100644
--- a/clang/lib/Basic/Targets/AMDGPU.cpp
+++ b/clang/lib/Basic/Targets/AMDGPU.cpp
@@ -337,9 +337,12 @@ void AMDGPUTargetInfo::getTargetDefines(const LangOptions &Opts,
   if (hasFastFMA())
     Builder.defineMacro("FP_FAST_FMA");
 
-  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize));
-  // ToDo: deprecate this macro for naming consistency.
-  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize));
+  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize),
+                      "compile-time-constant access to the wavefront size will "
+                      "be removed in a future release");
+  Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize),
+                      "compile-time-constant access to the wavefront size will "
+                      "be removed in a future release");
   Builder.defineMacro("__AMDGCN_CUMODE__", Twine(CUMode));
 }
 
diff --git a/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip b/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip
new file mode 100644
index 00000000000000..aca591536a76c0
--- /dev/null
+++ b/clang/test/Driver/hip-wavefront-size-deprecation-diagnostics.hip
@@ -0,0 +1,111 @@
+// REQUIRES: amdgpu-registered-target
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-host-only -pedantic -nogpuinc -nogpulib -nobuiltininc -fsyntax-only -Xclang -verify %s
+// RUN: %clang -xhip --offload-arch=gfx1030 --offload-device-only -pedantic -nogpuinc -nogpulib -nobuiltininc -fsyntax-only -Xclang -verify %s
+
+// Test that deprecation warnings for the wavefront size macro are emitted properly.
+
+#include <type_traits>
+
+#define WRAPPED __AMDGCN_WAVEFRONT_SIZE__
+
+#define DOUBLE_WRAPPED (WRAPPED)
+
+__attribute__((host, device)) void use(int, const char*);
+
+template<int N> __attribute__((host, device)) int templatify(int x) {
+    return x + N;
+}
+
+__attribute__((device)) const int GlobalConst = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+constexpr int GlobalConstExpr = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+#if defined(__HIP_DEVICE_COMPILE__) && (__AMDGCN_WAVEFRONT_SIZE__ == 64) // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int foo(void);
+#endif
+
+__attribute__((device)) int device_var = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+__attribute__((device))
+void device_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(GlobalConst, "device function");
+    use(GlobalConstExpr, "device function");
+}
+
+__attribute__((global))
+void global_fun() {
+    // no warnings expected
+    use(__AMDGCN_WAVEFRONT_SIZE, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "global function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+int host_var = __AMDGCN_WAVEFRONT_SIZE__; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int host_var_alt = __AMDGCN_WAVEFRONT_SIZE; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+int host_var_wrapped = WRAPPED; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+int host_var_double_wrapped = DOUBLE_WRAPPED; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+
+__attribute__((host))
+void host_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE' has been marked as deprecated}}
+    use(__AMDGCN_WAVEFRONT_SIZE__, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(GlobalConst, "host function");
+    use(GlobalConstExpr, "host function");
+}
+
+__attribute((host, device))
+void host_device_fun() {
+    use(__AMDGCN_WAVEFRONT_SIZE__, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(WRAPPED, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(DOUBLE_WRAPPED, "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    use(templatify<__AMDGCN_WAVEFRONT_SIZE__>(42), "host device function"); // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+template <unsigned int OuterWarpSize = __AMDGCN_WAVEFRONT_SIZE__> // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+class FunSelector {
+public:
+    template<unsigned int FunWarpSize = OuterWarpSize>
+    __attribute__((device))
+    auto fun(void)
+        -> typename std::enable_if<(FunWarpSize <= __AMDGCN_WAVEFRONT_SIZE__), void>::type // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    {
+        use(1, "yay!");
+    }
+
+    template<unsigned int FunWarpSize = OuterWarpSize>
+    __attribute__((device))
+    auto fun(void)
+        -> typename std::enable_if<(FunWarpSize > __AMDGCN_WAVEFRONT_SIZE__), void>::type // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    {
+        use(0, "nay!");
+    }
+};
+
+__attribute__((device))
+void device_fun_selector_user() {
+    FunSelector<> f;
+    f.fun<>();
+    f.fun<1>();
+    f.fun<1000>();
+
+    std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type x = 42; // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+}
+
+__attribute__((device)) std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type DeviceFunTemplateRet(void) { // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    return 42;
+}
+
+__attribute__((device)) int DeviceFunTemplateArg(std::enable_if<(1 <= __AMDGCN_WAVEFRONT_SIZE__), int>::type x) { // expected-warning {{macro '__AMDGCN_WAVEFRONT_SIZE__' has been marked as deprecated}}
+    return x;
+}
+
+// expected-note@* 0+ {{macro marked 'deprecated' here}}

Comment on lines +340 to +345
Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE__", Twine(WavefrontSize),
"compile-time-constant access to the wavefront size will "
"be removed in a future release");
Builder.defineMacro("__AMDGCN_WAVEFRONT_SIZE", Twine(WavefrontSize),
"compile-time-constant access to the wavefront size will "
"be removed in a future release");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is unhelpful to deprecate without a suggested replacement.

I also disagree with removing this for the device side compile. There still are uses for it, and there are still non-offloading languages

Copy link
Contributor

@AlexVlx AlexVlx Oct 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are removing an antipattern, the replacement is “don't do this”? Not all things that exist are valid. We should absolutely remove this, at least for HIP, where it creates needless divergence from CUDA and is a bug farm as a consequence.

Orthogonally, what are the uses for it that you have in mind, and which non-offloading languages are you thinking of? The only scenario I can envision here is if one does an end-to-end compile (so not the weird 1-per-target thing we do with offload, and not playing games around single AST that gets forked at CodeGen time). In that case, yes, that might be useful, but delivering it as a macro is fundamentally a bad design IMHO, so even there we should not have this. Furthermore, at the moment, that sort of use is incredibly underspecified / governed by happens to work by accident rather than by design, so building some early technical debt seems problematic, at least to me

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never should've been defined on the host side for sure. We do this in the OpenMP device runtime.

return llvm::omp::getAMDGPUGridValues<__AMDGCN_WAVEFRONT_SIZE>();

That could probably be replaced with

return __builtin_amdgcn_wavefrontsize();

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of lower level uses. i.e. not user facing code we would want in headers or people using directly. e.g. cases where implementing cross lane operations relies on the operations that only work on half the lanes, or don't work at all when executed in wave32

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use we currently have in our project is to detect, via CMake's try_compile, whether we have Wave64 or Wave32 targets (or both) to later choose which data structure flavors to instantiate. With the macro, we can easily do #if+#warning inside a kernel, and then parse the build log.

Without it, the only solution we found was to parse the list of targets, which feels more brittle. __builtin_amdgcn_wavefrontsize() is not a compile-time constant expression, so cannot be used for such checks. Am I missing something? What is the planned timeline for the removal of this macro?

Copy link
Contributor

@jhuber6 jhuber6 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that it's not that simple, since the user can easily change the wavefront size by compiling with -mwavefrontsize64. I ran into these types of issues myself while working on the RPC interface for the libc . There I pretty much just take the wavefront size as an argument to the indexing / allocation functions. Alternatively you use a template and do runtime dispatch and assume that optimizations will DCE the other code.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that it's not that simple, since the user can easily change the wavefront size by compiling with -mwavefrontsize64.

I assume you are highlighting a problem with parsing the target list? That's a good point.

I ran into these types of issues myself while working on the RPC interface for the rpc interface.

Ah, the new meaning of "running DOOM on GPUs" 😄 👍

Alternatively you use a template and do runtime dispatch and assume that optimizations will DCE the other code.

That's a nice solution and we are moving into that direction (with AMD's help). Unfortunately, we already template on a lot of things, and doubling the compile time in the most common case (compiling for a single device architecture) would be quite noticeable. So, having a compile-time thing to only instantiate the data structures / functions for the wavefront size we need would still be desirable.

Not saying that the whole SPIR-V / JIT thing is bad (having portable binaries would be great, especially the ones using standardized IR), but it's not all upsides.

Copy link
Contributor

@jhuber6 jhuber6 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I think these macros are fine for the case where the user manually passed an -mcpu= argument to the compilation job, and we're not targeting SPIR-V. As far as I know, the main motivation is because for HIP SPIR-V it's an incorrect value since it will JIT compile, but if we're not doing that then it's fine. So it's mostly for compatibility between the two. You could probably make CMake define a custom macro depending on what rocminfo states if nothing else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, the change was not motivated by anything SPIR-V related, the discussion predates it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The big problem previously was that it was defined on the host for some reason.

Copy link
Contributor

@shiltian shiltian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a heads up, this change is gonna break some OpenMP tests, specifically those prefix with ompx.

jhuber6 added a commit to jhuber6/llvm-project that referenced this pull request Oct 21, 2024
Summary:
This is going to be deprecated in
llvm#112849. This patch ports it to
use the builtin instead. This isn't a compile constant, so it could
slightly negatively affect codegen. There really should be an IR pass to
turn it into a constant if the function has known attributes.

Using the builtin is correct when we just do it for knowing the size
like we do here. Obviously guarding w32/w64 code with this check would
be broken.
@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 21, 2024

Just a heads up, this change is gonna break some OpenMP tests, specifically those prefix with ompx.

Just to clarify, adding the deprecation warning will break them, or the eventual, as-of-yet not-scheduled, removal, will?

@ritter-x2a
Copy link
Member Author

Just a heads up, this change is gonna break some OpenMP tests, specifically those prefix with ompx.

Just to clarify, adding the deprecation warning will break them, or the eventual, as-of-yet not-scheduled, removal, will?

On my system, the deprecation warning did not break any lit tests (nor tests from our internal testing). The ompx tests use the macro and will break once we actually remove it.

@jhuber6
Copy link
Contributor

jhuber6 commented Oct 21, 2024

I made #113156 to hopefully address the OpenMP issue. We use it in the runtime itself for a lot of stuff so removing it would definitely break that. The function call here matches what NVIDIA provides via PTX. I think we should have an optimization that replaces the intrinsic with a constant if the function it belongs to has wavefrontsize32 or whatever attributes.

@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 21, 2024

I made #113156 to hopefully address the OpenMP issue. We use it in the runtime itself for a lot of stuff so removing it would definitely break that. The function call here matches what NVIDIA provides via PTX. I think we should have an optimization that replaces the intrinsic with a constant if the function it belongs to has wavefrontsize32 or whatever attributes.

We already do this during legalisation / legacy ISEL; there might be merit into doing it as early as possible though.

@shiltian
Copy link
Contributor

Just a heads up, this change is gonna break some OpenMP tests, specifically those prefix with ompx.

Just to clarify, adding the deprecation warning will break them, or the eventual, as-of-yet not-scheduled, removal, will?

On my system, the deprecation warning did not break any lit tests (nor tests from our internal testing). The ompx tests use the macro and will break once we actually remove it.

Oh didn’t notice this PR was actually to just warn it instead of entirely removing it. Sorry for the notice.

@arsenm
Copy link
Contributor

arsenm commented Oct 22, 2024

We already do this during legalisation / legacy ISEL; there might be merit into doing it as early as possible though.

Yes, we definitely need to start folding this out much earlier.

@ritter-x2a
Copy link
Member Author

Should we move on with the deprecation of the macros with this PR then? Please let me know if there are technical concerns remaining with the PR or approve it so that it can land in trunk.

@jhuber6
Copy link
Contributor

jhuber6 commented Oct 29, 2024

Should we move on with the deprecation of the macros with this PR then? Please let me know if there are technical concerns remaining with the PR or approve it so that it can land in trunk.

CI is red but it seems to be some weird Windows failures. Ideally if the builtin / intrinsic is the way to go I'd like that to be folded much earlier because it prevents loop unrolling for cases like SIMT scan / reduce.

@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 29, 2024

Should we move on with the deprecation of the macros with this PR then? Please let me know if there are technical concerns remaining with the PR or approve it so that it can land in trunk.

CI is red but it seems to be some weird Windows failures. Ideally if the builtin / intrinsic is the way to go I'd like that to be folded much earlier because it prevents loop unrolling for cases like SIMT scan / reduce.

I should have a patch for earlier folding up today or tomorrow, so that will hopefully be addressed.

@ritter-x2a
Copy link
Member Author

Should we move on with the deprecation of the macros with this PR then? Please let me know if there are technical concerns remaining with the PR or approve it so that it can land in trunk.

CI is red but it seems to be some weird Windows failures. Ideally if the builtin / intrinsic is the way to go I'd like that to be folded much earlier because it prevents loop unrolling for cases like SIMT scan / reduce.

Let me rebase so that CI runs again; Windows CI was a bit of a mess in the last week or so.

So far, these macros can be used in contexts where no meaningful wavefront size
is available. We therefore deprecate these macros, to replace them with a more
resilient interface to access wavefront size information where it is available.

For SWDEV-491529.
use plain Twine with empty string instead of optional
@ritter-x2a ritter-x2a force-pushed the deprecate-wavefrontsize-macro branch from a798615 to 96eb2c8 Compare October 29, 2024 14:50
@ritter-x2a
Copy link
Member Author

Should we move on with the deprecation of the macros with this PR then? Please let me know if there are technical concerns remaining with the PR or approve it so that it can land in trunk.

CI is red but it seems to be some weird Windows failures. Ideally if the builtin / intrinsic is the way to go I'd like that to be folded much earlier because it prevents loop unrolling for cases like SIMT scan / reduce.

Let me rebase so that CI runs again; Windows CI was a bit of a mess in the last week or so.

CI looks better now.

@AlexVlx
Copy link
Contributor

AlexVlx commented Oct 31, 2024

CI is red but it seems to be some weird Windows failures. Ideally if the builtin / intrinsic is the way to go I'd like that to be folded much earlier because it prevents loop unrolling for cases like SIMT scan / reduce.

#114481 should deal with this @arsenm @jhuber6.

@ritter-x2a
Copy link
Member Author

@AlexVlx @jhuber6 @arsenm is there a dependence between this deprecation PR and #114481, or can we already go ahead with the deprecation of the macro via this PR?

@jhuber6
Copy link
Contributor

jhuber6 commented Nov 6, 2024

This just emits a warning so it doesn't actually change anything yet. I'm waiting on the wavesize folding before I land the changes in libomptarget because it'd have performance considerations.

Copy link
Contributor

@AlexVlx AlexVlx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

@ritter-x2a ritter-x2a merged commit e5c6d1f into llvm:main Nov 8, 2024
9 checks passed

// Test that deprecation warnings for the wavefront size macro are emitted properly.

#include <type_traits>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Clang's tests must be hermetic and can't include system headers
  2. This fails on Mac: http://45.33.8.238/macm1/95266/step_6.txt

Please take a look and revert for now if it takes a while to fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nico Thanks for the report and sorry for the inconvenience! I reverted the patch with #115499 and opened PR #115507 with a fix. I'd appreciate it if you could take a look and see if that fixes the issue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Since the test is now hermetic, I'm guessing it'll work now. I'll check my bot after you've relanded your patch.

ritter-x2a added a commit that referenced this pull request Nov 8, 2024
ritter-x2a added a commit that referenced this pull request Nov 8, 2024
ritter-x2a added a commit to ritter-x2a/llvm-project that referenced this pull request Nov 8, 2024
So far, these macros can be used in contexts where no meaningful
wavefront size is available. We therefore deprecate these macros, to
replace them with a more resilient interface to access wavefront size
information where it is available.

Reapplies llvm#112849 with a fix for the non-hermetic clang test that failed
on Mac after the revert in llvm#115499.

For SWDEV-491529.
ritter-x2a added a commit that referenced this pull request Nov 11, 2024
…5507)

So far, these macros can be used in contexts where no meaningful
wavefront size is available. We therefore deprecate these macros, to
replace them with a more resilient interface to access wavefront size
information where it is available.

Reapplies #112849 with a fix for the non-hermetic clang test that failed
on Mac after the revert in #115499.

For SWDEV-491529.
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
So far, these macros can be used in contexts where no meaningful
wavefront size is available. We therefore deprecate these macros, to
replace them with a more resilient interface to access wavefront size
information where it is available.

For SWDEV-491529.
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
…m#115507)

So far, these macros can be used in contexts where no meaningful
wavefront size is available. We therefore deprecate these macros, to
replace them with a more resilient interface to access wavefront size
information where it is available.

Reapplies llvm#112849 with a fix for the non-hermetic clang test that failed
on Mac after the revert in llvm#115499.

For SWDEV-491529.
jhuber6 added a commit that referenced this pull request Nov 25, 2024
Summary:
This is going to be deprecated in
#112849. This patch ports it to
use the builtin instead. This isn't a compile constant, so it could
slightly negatively affect codegen. There really should be an IR pass to
turn it into a constant if the function has known attributes.

Using the builtin is correct when we just do it for knowing the size
like we do here. Obviously guarding w32/w64 code with this check would
be broken.
searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request Dec 18, 2024
…m#115507)

So far, these macros can be used in contexts where no meaningful
wavefront size is available. We therefore deprecate these macros, to
replace them with a more resilient interface to access wavefront size
information where it is available.

Reapplies llvm#112849 with a fix for the non-hermetic clang test that failed
on Mac after the revert in llvm#115499.

For SWDEV-491529.

(cherry picked from commit d893c5a)
Change-Id: I84ab3daf63afa68e47d8df79a72a7825e055c6c3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants