[Clang][AMDGPU] Improve error message when device libraries for COV6 are missing #134745

Merged: shiltian merged 1 commit into main from users/shiltian/better-cov6-err-msg on Apr 8, 2025

shiltian (Contributor) commented Apr 7, 2025

#130963 switches the default code object version to COV6, which requires ROCm 6.3.
Currently, if the device libraries for COV6 are not found, the error message does
not mention this requirement. This PR extends the diagnostic to state that ABI
version 6 requires ROCm 6.3 or higher.
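
As a rough illustration of the message this change produces, the sketch below mimics how the nested `%select` in `err_drv_no_rocm_device_lib` expands. It is not Clang code: `renderNoRocmDeviceLib` and its parameters are hypothetical names chosen for this example, and the real formatting is done by Clang's diagnostic engine.

```cpp
#include <string>

// Hypothetical stand-in for the driver's diagnostic formatting; not part of Clang.
// kind selects the first %select arm:
//   0 = no suffix, 1 = " for <name>", 2 = " for ABI version <name>".
// needsRocm63 models the nested %select and only matters when kind == 2.
std::string renderNoRocmDeviceLib(int kind, const std::string &name,
                                  bool needsRocm63) {
  std::string msg = "cannot find ROCm device library";
  if (kind == 1) {
    msg += " for " + name;
  } else if (kind == 2) {
    msg += " for ABI version " + name;
    if (needsRocm63)
      msg += ", which requires ROCm 6.3 or higher";
  }
  msg += "; provide its path via '--rocm-path' or '--rocm-device-lib-path', "
         "or pass '-nogpulib' to build without ROCm device library";
  return msg;
}
```

Calling `renderNoRocmDeviceLib(2, "6", true)` yields the text that the updated `NOABI6` check line in `hip-device-libs.hip` expects, while `renderNoRocmDeviceLib(2, "5", false)` matches the unchanged `NOABI5` line.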

llvmbot added the clang, backend:AMDGPU, clang:driver, and clang:frontend labels Apr 7, 2025
llvmbot (Member) commented Apr 7, 2025

@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-clang

Author: Shilei Tian (shiltian)

Changes

#130963 switches the default to COV6, which requires ROCm 6.3. Currently, if the
device libraries for COV6 are not found, the error message is not very helpful.
This PR provides a more informative error message in such cases.


Full diff: https://github.com/llvm/llvm-project/pull/134745.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticDriverKinds.td (+2-1)
  • (modified) clang/lib/Driver/ToolChains/AMDGPU.cpp (+4-1)
  • (modified) clang/lib/Driver/ToolChains/ROCm.h (+4-2)
  • (modified) clang/test/Driver/hip-device-libs.hip (+1-1)
diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index df24cca49aaae..636c3a879d26b 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -67,7 +67,8 @@ def err_drv_no_cuda_libdevice : Error<
   "libdevice">;
 
 def err_drv_no_rocm_device_lib : Error<
-  "cannot find ROCm device library%select{| for %1| for ABI version %1}0; provide its path via "
+  "cannot find ROCm device library%select{| for %1| for ABI version %1"
+  "%select{|, which requires ROCm 6.3 or higher}2}0; provide its path via "
   "'--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build "
   "without ROCm device library">;
 def err_drv_no_hip_runtime : Error<
diff --git a/clang/lib/Driver/ToolChains/AMDGPU.cpp b/clang/lib/Driver/ToolChains/AMDGPU.cpp
index dffc70d5e5b69..2b1d996f9652e 100644
--- a/clang/lib/Driver/ToolChains/AMDGPU.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPU.cpp
@@ -935,7 +935,10 @@ bool RocmInstallationDetector::checkCommonBitcodeLibs(
     return false;
   }
   if (ABIVer.requiresLibrary() && getABIVersionPath(ABIVer).empty()) {
-    D.Diag(diag::err_drv_no_rocm_device_lib) << 2 << ABIVer.toString();
+    if (ABIVer.getAsCodeObjectVersion() < 6)
+      D.Diag(diag::err_drv_no_rocm_device_lib) << 2 << ABIVer.toString() << 0;
+    else
+      D.Diag(diag::err_drv_no_rocm_device_lib) << 2 << ABIVer.toString() << 1;
     return false;
   }
   return true;
diff --git a/clang/lib/Driver/ToolChains/ROCm.h b/clang/lib/Driver/ToolChains/ROCm.h
index a6cc41db383b6..1ba0f1b9f30d6 100644
--- a/clang/lib/Driver/ToolChains/ROCm.h
+++ b/clang/lib/Driver/ToolChains/ROCm.h
@@ -37,9 +37,11 @@ struct DeviceLibABIVersion {
   /// and below works with ROCm 5.0 and below which does not have
   /// abi_version_*.bc. Code object v5 requires abi_version_500.bc.
   bool requiresLibrary() { return ABIVersion >= 500; }
-  std::string toString() {
+  std::string toString() { return Twine(getAsCodeObjectVersion()).str(); }
+
+  unsigned getAsCodeObjectVersion() const {
     assert(ABIVersion % 100 == 0 && "Not supported");
-    return Twine(ABIVersion / 100).str();
+    return ABIVersion / 100;
   }
 };
 
diff --git a/clang/test/Driver/hip-device-libs.hip b/clang/test/Driver/hip-device-libs.hip
index c7cafd0027bc5..b123f741bdee5 100644
--- a/clang/test/Driver/hip-device-libs.hip
+++ b/clang/test/Driver/hip-device-libs.hip
@@ -254,4 +254,4 @@
 // NOABI4-NOT: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_400.bc"
 // NOABI4-NOT: "-mlink-builtin-bitcode" "{{.*}}oclc_abi_version_500.bc"
 // NOABI5: error: cannot find ROCm device library for ABI version 5; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
-// NOABI6: error: cannot find ROCm device library for ABI version 6; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
+// NOABI6: error: cannot find ROCm device library for ABI version 6, which requires ROCm 6.3 or higher; provide its path via '--rocm-path' or '--rocm-device-lib-path', or pass '-nogpulib' to build without ROCm device library
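
To make the interaction between the `ROCm.h` and `AMDGPU.cpp` hunks easier to follow, here is a minimal standalone sketch of the version helper after this change. It is an approximation under stated assumptions: `std::to_string` replaces `llvm::Twine` so the snippet builds without LLVM headers, the struct is reduced to the members the diff touches, and `rocm63HintSelector` is a hypothetical helper that does not exist in the PR.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for DeviceLibABIVersion in ROCm.h (assumption: only the
// members shown in the diff are modeled here).
struct DeviceLibABIVersion {
  unsigned ABIVersion; // e.g. 500 for code object v5, 600 for v6

  // Code object v5 and above need a matching abi_version_*.bc library.
  bool requiresLibrary() const { return ABIVersion >= 500; }

  // New in this PR: expose the numeric code object version.
  unsigned getAsCodeObjectVersion() const {
    assert(ABIVersion % 100 == 0 && "Not supported");
    return ABIVersion / 100;
  }

  // std::to_string is used instead of llvm::Twine to stay self-contained.
  std::string toString() const {
    return std::to_string(getAsCodeObjectVersion());
  }
};

// Hypothetical helper (not in the PR): the value streamed as the diagnostic's
// third argument, 1 when the ROCm 6.3 hint should be printed and 0 otherwise.
inline unsigned rocm63HintSelector(const DeviceLibABIVersion &V) {
  return V.getAsCodeObjectVersion() >= 6 ? 1u : 0u;
}
```

With a helper like this, the two `D.Diag` calls in `checkCommonBitcodeLibs` could in principle be written as a single call that streams the selector value. The merged change keeps the explicit `if`/`else`, which behaves the same way.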

llvmbot (Member) commented Apr 7, 2025

@llvm/pr-subscribers-backend-amdgpu

shiltian force-pushed the users/shiltian/better-cov6-err-msg branch from 1bdbd17 to aab55f6 Apr 8, 2025 02:06
shiltian force-pushed the users/shiltian/better-cov6-err-msg branch from aab55f6 to c16e980 Apr 8, 2025 02:14
shiltian merged commit f19c6f2 into main Apr 8, 2025
11 checks passed
shiltian deleted the users/shiltian/better-cov6-err-msg branch Apr 8, 2025 13:57
4 participants