Skip to content

Reland [HIP] fix host min/max in header #133590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 1, 2025
Merged

Conversation

yxsamliu
Copy link
Collaborator

CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA.

Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__, which is 0 by default. In the future, this will change to 1 by
default after most existing HIP apps adopt this change.

Also allows users to define
__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__ to disable host max/min in global namespace.

min/max functions with mixed signed/unsigned integer parameters are not defined unless
__HIP_DEFINE_MIXED_HOST_MIN_MAX__ is defined.

Fixes: SWDEV-446564

@yxsamliu yxsamliu requested a review from Artem-B March 29, 2025 17:24
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics labels Mar 29, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 29, 2025

@llvm/pr-subscribers-clang

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA.

Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__, which is 0 by default. In the future, this will change to 1 by
default after most existing HIP apps adopt this change.

Also allows users to define
__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__ to disable host max/min in global namespace.

min/max functions with mixed signed/unsigned integer parameters are not defined unless
__HIP_DEFINE_MIXED_HOST_MIN_MAX__ is defined.

Fixes: SWDEV-446564


Full diff: https://github.com/llvm/llvm-project/pull/133590.diff

1 Files Affected:

  • (modified) clang/lib/Headers/__clang_hip_math.h (+78-6)
diff --git a/clang/lib/Headers/__clang_hip_math.h b/clang/lib/Headers/__clang_hip_math.h
index f6c06eaf4afe0..297f82ebf46dd 100644
--- a/clang/lib/Headers/__clang_hip_math.h
+++ b/clang/lib/Headers/__clang_hip_math.h
@@ -1311,15 +1311,87 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); }
 __DEVICE__
 double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
 
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __arg1, int __arg2) {
-  return __arg1 < __arg2 ? __arg1 : __arg2;
+// Define host min/max functions.
+#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&                  \
+    !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
+
+// TODO: make this default to 1 after existing HIP apps adopting this change.
+#ifndef __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+#define __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__ 0
+#endif
+
+#ifndef __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+#define __HIP_DEFINE_MIXED_HOST_MIN_MAX__ 0
+#endif
+
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#define DEFINE_MIN_MAX_FUNCTIONS(ret_type, type1, type2)                       \
+  inline ret_type min(const type1 __a, const type2 __b) {                      \
+    return (__a < __b) ? __a : __b;                                            \
+  }                                                                            \
+  inline ret_type max(const type1 __a, const type2 __b) {                      \
+    return (__a > __b) ? __a : __b;                                            \
+  }
+
+// Define min and max functions for same type comparisons
+DEFINE_MIN_MAX_FUNCTIONS(int, int, int)
+
+#if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(long, long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(long long, long long, long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long,
+                         unsigned long long)
+#endif // if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+
+// The host min/max functions below accept mixed signed/unsigned integer
+// parameters and perform unsigned comparisons, which may produce unexpected
+// results if a signed integer was passed unintentionally. To avoid this
+// happening silently, these overloaded functions are not defined by default.
+// However, for compatibility with CUDA, they will be defined if users define
+// __HIP_DEFINE_MIXED_HOST_MIN_MAX__.
+#if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, long long, unsigned long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long, long long)
+#endif // if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+
+// Floating-point comparisons using built-in functions
+inline float min(float const __a, float const __b) {
+  return __builtin_fminf(__a, __b);
+}
+inline double min(double const __a, double const __b) {
+  return __builtin_fmin(__a, __b);
+}
+inline double min(float const __a, double const __b) {
+  return __builtin_fmin(__a, __b);
+}
+inline double min(double const __a, float const __b) {
+  return __builtin_fmin(__a, __b);
 }
 
-__host__ inline static int max(int __arg1, int __arg2) {
-  return __arg1 > __arg2 ? __arg1 : __arg2;
+inline float max(float const __a, float const __b) {
+  return __builtin_fmaxf(__a, __b);
+}
+inline double max(double const __a, double const __b) {
+  return __builtin_fmax(__a, __b);
 }
-#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
+inline double max(float const __a, double const __b) {
+  return __builtin_fmax(__a, __b);
+}
+inline double max(double const __a, float const __b) {
+  return __builtin_fmax(__a, __b);
+}
+
+#pragma pop_macro("DEFINE_MIN_MAX_FUNCTIONS")
+
+#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&
+       // !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
 #endif
 
 #pragma pop_macro("__DEVICE__")

@llvmbot
Copy link
Member

llvmbot commented Mar 29, 2025

@llvm/pr-subscribers-backend-x86

Author: Yaxun (Sam) Liu (yxsamliu)

Changes

CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA.

Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__, which is 0 by default. In the future, this will change to 1 by
default after most existing HIP apps adopt this change.

Also allows users to define
__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__ to disable host max/min in global namespace.

min/max functions with mixed signed/unsigned integer parameters are not defined unless
__HIP_DEFINE_MIXED_HOST_MIN_MAX__ is defined.

Fixes: SWDEV-446564


Full diff: https://github.com/llvm/llvm-project/pull/133590.diff

1 Files Affected:

  • (modified) clang/lib/Headers/__clang_hip_math.h (+78-6)
diff --git a/clang/lib/Headers/__clang_hip_math.h b/clang/lib/Headers/__clang_hip_math.h
index f6c06eaf4afe0..297f82ebf46dd 100644
--- a/clang/lib/Headers/__clang_hip_math.h
+++ b/clang/lib/Headers/__clang_hip_math.h
@@ -1311,15 +1311,87 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); }
 __DEVICE__
 double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
 
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __arg1, int __arg2) {
-  return __arg1 < __arg2 ? __arg1 : __arg2;
+// Define host min/max functions.
+#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&                  \
+    !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
+
+// TODO: make this default to 1 after existing HIP apps adopting this change.
+#ifndef __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+#define __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__ 0
+#endif
+
+#ifndef __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+#define __HIP_DEFINE_MIXED_HOST_MIN_MAX__ 0
+#endif
+
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#define DEFINE_MIN_MAX_FUNCTIONS(ret_type, type1, type2)                       \
+  inline ret_type min(const type1 __a, const type2 __b) {                      \
+    return (__a < __b) ? __a : __b;                                            \
+  }                                                                            \
+  inline ret_type max(const type1 __a, const type2 __b) {                      \
+    return (__a > __b) ? __a : __b;                                            \
+  }
+
+// Define min and max functions for same type comparisons
+DEFINE_MIN_MAX_FUNCTIONS(int, int, int)
+
+#if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(long, long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(long long, long long, long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long,
+                         unsigned long long)
+#endif // if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+
+// The host min/max functions below accept mixed signed/unsigned integer
+// parameters and perform unsigned comparisons, which may produce unexpected
+// results if a signed integer was passed unintentionally. To avoid this
+// happening silently, these overloaded functions are not defined by default.
+// However, for compatibility with CUDA, they will be defined if users define
+// __HIP_DEFINE_MIXED_HOST_MIN_MAX__.
+#if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, long long, unsigned long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long, long long)
+#endif // if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+
+// Floating-point comparisons using built-in functions
+inline float min(float const __a, float const __b) {
+  return __builtin_fminf(__a, __b);
+}
+inline double min(double const __a, double const __b) {
+  return __builtin_fmin(__a, __b);
+}
+inline double min(float const __a, double const __b) {
+  return __builtin_fmin(__a, __b);
+}
+inline double min(double const __a, float const __b) {
+  return __builtin_fmin(__a, __b);
 }
 
-__host__ inline static int max(int __arg1, int __arg2) {
-  return __arg1 > __arg2 ? __arg1 : __arg2;
+inline float max(float const __a, float const __b) {
+  return __builtin_fmaxf(__a, __b);
+}
+inline double max(double const __a, double const __b) {
+  return __builtin_fmax(__a, __b);
 }
-#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
+inline double max(float const __a, double const __b) {
+  return __builtin_fmax(__a, __b);
+}
+inline double max(double const __a, float const __b) {
+  return __builtin_fmax(__a, __b);
+}
+
+#pragma pop_macro("DEFINE_MIN_MAX_FUNCTIONS")
+
+#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&
+       // !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
 #endif
 
 #pragma pop_macro("__DEVICE__")

@yxsamliu
Copy link
Collaborator Author

This is an effort to reland #82956

Since we kept encountering regressions, we plan to add the change conditionally under a macro so that we can deliver the fix to intended users without causing regressions. Then we will try making the change to be default on.

CUDA defines min/max functions for host in global namespace.
HIP header needs to define them too to be compatible.
Currently only min/max(int, int) is defined. This causes
wrong result for arguments that are out of range for int.
This patch defines host min/max functions to be compatible
with CUDA.

Since some HIP apps defined min/max functions by themselves, newly
added min/max function are under the control of macro
`__HIP_DEFINE_EXTENDED_HOST_MIN_MAX__`, which is 0 by
default. In the future, this will change to 1 by
default after most existing HIP apps adopt this change.

Also allows users to define
`__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__` to disable
host max/min in global namespace.

min/max functions with mixed signed/unsigned integer
parameters are not defined unless
`__HIP_DEFINE_MIXED_HOST_MIN_MAX__` is defined.

Fixes: SWDEV-446564
@yxsamliu yxsamliu merged commit 0248d27 into llvm:main Apr 1, 2025
11 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 1, 2025

LLVM Buildbot has detected a new failure on builder lldb-aarch64-ubuntu running on linaro-lldb-aarch64-ubuntu while building clang at step 6 "test".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/15227

Here is the relevant piece of the build log for the reference
Step 6 (test) failure: build (failure)
...
PASS: lldb-api :: lang/cpp/function-qualifiers/TestCppFunctionQualifiers.py (797 of 2111)
PASS: lldb-api :: lang/cpp/forward/TestCPPForwardDeclaration.py (798 of 2111)
PASS: lldb-api :: lang/cpp/const_static_integral_member/TestConstStaticIntegralMember.py (799 of 2111)
PASS: lldb-api :: lang/cpp/function-ref-qualifiers/TestCppFunctionRefQualifiers.py (800 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/gmodules/alignment/TestPchAlignment.py (801 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/gmodules/basic/TestWithModuleDebugging.py (802 of 2111)
PASS: lldb-api :: lang/cpp/function_refs/TestFunctionRefs.py (803 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/gmodules/pch-chain/TestPchChain.py (804 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/gmodules/template-with-same-arg/TestTemplateWithSameArg.py (805 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/gmodules/templates/TestGModules.py (806 of 2111)
FAIL: lldb-api :: lang/cpp/global_variables/TestCPPGlobalVariables.py (807 of 2111)
******************** TEST 'lldb-api :: lang/cpp/global_variables/TestCPPGlobalVariables.py' FAILED ********************
Script:
--
/usr/bin/python3.10 /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/dotest.py -u CXXFLAGS -u CFLAGS --env LLVM_LIBS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib --env LLVM_INCLUDE_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/include --env LLVM_TOOLS_DIR=/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --arch aarch64 --build-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex --lldb-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-lldb/lldb-api --clang-module-cache-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/lldb-test-build.noindex/module-cache-clang/lldb-api --executable /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/lldb --compiler /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/clang --dsymutil /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin/dsymutil --make /usr/bin/gmake --llvm-tools-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./bin --lldb-obj-root /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/tools/lldb --lldb-libs-dir /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/./lib /home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/llvm-project/lldb/test/API/lang/cpp/global_variables -p TestCPPGlobalVariables.py
--
Exit Code: -11

Command Output (stdout):
--
lldb version 21.0.0git (https://github.com/llvm/llvm-project.git revision 0248d277cabab370b48114cc62aff393b273971b)
  clang revision 0248d277cabab370b48114cc62aff393b273971b
  llvm revision 0248d277cabab370b48114cc62aff393b273971b
Skipping the following test categories: ['libc++', 'dsym', 'gmodules', 'debugserver', 'objc']

--
Command Output (stderr):
--
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_access_by_mangled_name_dsym (TestCPPGlobalVariables.GlobalVariablesCppTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_access_by_mangled_name_dwarf (TestCPPGlobalVariables.GlobalVariablesCppTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_access_by_mangled_name_dwo (TestCPPGlobalVariables.GlobalVariablesCppTestCase)
UNSUPPORTED: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_dsym (TestCPPGlobalVariables.GlobalVariablesCppTestCase) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_dwarf (TestCPPGlobalVariables.GlobalVariablesCppTestCase)
PASS: LLDB (/home/tcwg-buildbot/worker/lldb-aarch64-ubuntu/build/bin/clang-aarch64) :: test_dwo (TestCPPGlobalVariables.GlobalVariablesCppTestCase)
----------------------------------------------------------------------
Ran 6 tests in 1.037s

OK (skipped=2)

--

********************
PASS: lldb-api :: lang/cpp/global_operators/TestCppGlobalOperators.py (808 of 2111)
PASS: lldb-api :: lang/cpp/incompatible-class-templates/TestCppIncompatibleClassTemplates.py (809 of 2111)
PASS: lldb-api :: lang/cpp/keywords_enabled/TestCppKeywordsEnabled.py (810 of 2111)
PASS: lldb-api :: lang/cpp/inlines/TestInlines.py (811 of 2111)
PASS: lldb-api :: lang/cpp/incomplete-types/members/TestCppIncompleteTypeMembers.py (812 of 2111)
UNSUPPORTED: lldb-api :: lang/cpp/libcxx-internals-recognizer/TestLibcxxInternalsRecognizer.py (813 of 2111)
PASS: lldb-api :: lang/cpp/incomplete-stl-types/TestStlIncompleteTypes.py (814 of 2111)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:X86 clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants