-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Reland [HIP] fix host min/max in header #133590
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang Author: Yaxun (Sam) Liu (yxsamliu) ChangesCUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA. Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro Also allows users to define min/max functions with mixed signed/unsigned integer parameters are not defined unless Fixes: SWDEV-446564 Full diff: https://github.com/llvm/llvm-project/pull/133590.diff 1 Files Affected:
diff --git a/clang/lib/Headers/__clang_hip_math.h b/clang/lib/Headers/__clang_hip_math.h
index f6c06eaf4afe0..297f82ebf46dd 100644
--- a/clang/lib/Headers/__clang_hip_math.h
+++ b/clang/lib/Headers/__clang_hip_math.h
@@ -1311,15 +1311,87 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __arg1, int __arg2) {
- return __arg1 < __arg2 ? __arg1 : __arg2;
+// Define host min/max functions.
+#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) && \
+ !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
+
+// TODO: make this default to 1 after existing HIP apps adopting this change.
+#ifndef __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+#define __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__ 0
+#endif
+
+#ifndef __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+#define __HIP_DEFINE_MIXED_HOST_MIN_MAX__ 0
+#endif
+
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#define DEFINE_MIN_MAX_FUNCTIONS(ret_type, type1, type2) \
+ inline ret_type min(const type1 __a, const type2 __b) { \
+ return (__a < __b) ? __a : __b; \
+ } \
+ inline ret_type max(const type1 __a, const type2 __b) { \
+ return (__a > __b) ? __a : __b; \
+ }
+
+// Define min and max functions for same type comparisons
+DEFINE_MIN_MAX_FUNCTIONS(int, int, int)
+
+#if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(long, long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(long long, long long, long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long,
+ unsigned long long)
+#endif // if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+
+// The host min/max functions below accept mixed signed/unsigned integer
+// parameters and perform unsigned comparisons, which may produce unexpected
+// results if a signed integer was passed unintentionally. To avoid this
+// happening silently, these overloaded functions are not defined by default.
+// However, for compatibility with CUDA, they will be defined if users define
+// __HIP_DEFINE_MIXED_HOST_MIN_MAX__.
+#if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, long long, unsigned long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long, long long)
+#endif // if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+
+// Floating-point comparisons using built-in functions
+inline float min(float const __a, float const __b) {
+ return __builtin_fminf(__a, __b);
+}
+inline double min(double const __a, double const __b) {
+ return __builtin_fmin(__a, __b);
+}
+inline double min(float const __a, double const __b) {
+ return __builtin_fmin(__a, __b);
+}
+inline double min(double const __a, float const __b) {
+ return __builtin_fmin(__a, __b);
}
-__host__ inline static int max(int __arg1, int __arg2) {
- return __arg1 > __arg2 ? __arg1 : __arg2;
+inline float max(float const __a, float const __b) {
+ return __builtin_fmaxf(__a, __b);
+}
+inline double max(double const __a, double const __b) {
+ return __builtin_fmax(__a, __b);
}
-#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
+inline double max(float const __a, double const __b) {
+ return __builtin_fmax(__a, __b);
+}
+inline double max(double const __a, float const __b) {
+ return __builtin_fmax(__a, __b);
+}
+
+#pragma pop_macro("DEFINE_MIN_MAX_FUNCTIONS")
+
+#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&
+ // !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
#endif
#pragma pop_macro("__DEVICE__")
|
@llvm/pr-subscribers-backend-x86 Author: Yaxun (Sam) Liu (yxsamliu) ChangesCUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA. Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro Also allows users to define min/max functions with mixed signed/unsigned integer parameters are not defined unless Fixes: SWDEV-446564 Full diff: https://github.com/llvm/llvm-project/pull/133590.diff 1 Files Affected:
diff --git a/clang/lib/Headers/__clang_hip_math.h b/clang/lib/Headers/__clang_hip_math.h
index f6c06eaf4afe0..297f82ebf46dd 100644
--- a/clang/lib/Headers/__clang_hip_math.h
+++ b/clang/lib/Headers/__clang_hip_math.h
@@ -1311,15 +1311,87 @@ float min(float __x, float __y) { return __builtin_fminf(__x, __y); }
__DEVICE__
double min(double __x, double __y) { return __builtin_fmin(__x, __y); }
-#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
-__host__ inline static int min(int __arg1, int __arg2) {
- return __arg1 < __arg2 ? __arg1 : __arg2;
+// Define host min/max functions.
+#if !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) && \
+ !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
+
+// TODO: make this default to 1 after existing HIP apps adopting this change.
+#ifndef __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+#define __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__ 0
+#endif
+
+#ifndef __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+#define __HIP_DEFINE_MIXED_HOST_MIN_MAX__ 0
+#endif
+
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#pragma push_macro("DEFINE_MIN_MAX_FUNCTIONS")
+#define DEFINE_MIN_MAX_FUNCTIONS(ret_type, type1, type2) \
+ inline ret_type min(const type1 __a, const type2 __b) { \
+ return (__a < __b) ? __a : __b; \
+ } \
+ inline ret_type max(const type1 __a, const type2 __b) { \
+ return (__a > __b) ? __a : __b; \
+ }
+
+// Define min and max functions for same type comparisons
+DEFINE_MIN_MAX_FUNCTIONS(int, int, int)
+
+#if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(long, long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(long long, long long, long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long,
+ unsigned long long)
+#endif // if __HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
+
+// The host min/max functions below accept mixed signed/unsigned integer
+// parameters and perform unsigned comparisons, which may produce unexpected
+// results if a signed integer was passed unintentionally. To avoid this
+// happening silently, these overloaded functions are not defined by default.
+// However, for compatibility with CUDA, they will be defined if users define
+// __HIP_DEFINE_MIXED_HOST_MIN_MAX__.
+#if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, int, unsigned int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned int, unsigned int, int)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, long, unsigned long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long, unsigned long, long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, long long, unsigned long long)
+DEFINE_MIN_MAX_FUNCTIONS(unsigned long long, unsigned long long, long long)
+#endif // if __HIP_DEFINE_MIXED_HOST_MIN_MAX__
+
+// Floating-point comparisons using built-in functions
+inline float min(float const __a, float const __b) {
+ return __builtin_fminf(__a, __b);
+}
+inline double min(double const __a, double const __b) {
+ return __builtin_fmin(__a, __b);
+}
+inline double min(float const __a, double const __b) {
+ return __builtin_fmin(__a, __b);
+}
+inline double min(double const __a, float const __b) {
+ return __builtin_fmin(__a, __b);
}
-__host__ inline static int max(int __arg1, int __arg2) {
- return __arg1 > __arg2 ? __arg1 : __arg2;
+inline float max(float const __a, float const __b) {
+ return __builtin_fmaxf(__a, __b);
+}
+inline double max(double const __a, double const __b) {
+ return __builtin_fmax(__a, __b);
}
-#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__)
+inline double max(float const __a, double const __b) {
+ return __builtin_fmax(__a, __b);
+}
+inline double max(double const __a, float const __b) {
+ return __builtin_fmax(__a, __b);
+}
+
+#pragma pop_macro("DEFINE_MIN_MAX_FUNCTIONS")
+
+#endif // !defined(__HIPCC_RTC__) && !defined(__OPENMP_AMDGCN__) &&
+ // !defined(__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__)
#endif
#pragma pop_macro("__DEVICE__")
|
This is an effort to reland #82956 Since we kept encountering regressions, we plan to add the change conditionally under a macro so that we can deliver the fix to intended users without causing regressions. Then we will try making the change to be default on. |
CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA. Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro `__HIP_DEFINE_EXTENDED_HOST_MIN_MAX__`, which is 0 by default. In the future, this will change to 1 by default after most existing HIP apps adopt this change. Also allows users to define `__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__` to disable host max/min in global namespace. min/max functions with mixed signed/unsigned integer parameters are not defined unless `__HIP_DEFINE_MIXED_HOST_MIN_MAX__` is defined. Fixes: SWDEV-446564
a2909e9
to
3f533ad
Compare
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/15227 Here is the relevant piece of the build log for the reference
|
CUDA defines min/max functions for host in global namespace. HIP header needs to define them too to be compatible. Currently only min/max(int, int) is defined. This causes wrong result for arguments that are out of range for int. This patch defines host min/max functions to be compatible with CUDA.
Since some HIP apps defined min/max functions by themselves, newly added min/max function are under the control of macro
__HIP_DEFINE_EXTENDED_HOST_MIN_MAX__
, which is 0 by default. In the future, this will change to 1 bydefault after most existing HIP apps adopt this change.
Also allows users to define
__HIP_NO_HOST_MIN_MAX_IN_GLOBAL_NAMESPACE__
to disable host max/min in global namespace.min/max functions with mixed signed/unsigned integer parameters are not defined unless
__HIP_DEFINE_MIXED_HOST_MIN_MAX__
is defined.Fixes: SWDEV-446564