Skip to content

[libc] Work around incorrect fmin/fmax results for +/-x #83158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Feb 27, 2024

Conversation

jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Feb 27, 2024

Summary:
The IEEE 754 standard as of the 2019 revision states that for fmin -0.0
is always less than 0.0 and for fmax 0.0 is always greater than 0.0.
These are currently not respected by the builtin value and thus cause
the tests to fail. This patch works around it in the implementation for
now by explicitly modifying the sign bit.

Summary:
The IEEE 754 standard as of the 2019 revision states that for fmin -0.0
is always less than 0.0 and for fmax 0.0 is always greater than 0.0.
These are currently not respected by the builtin value and thus cause
the tests to fail. This patch works around it in the implementation for
now by explicitly modifying the sign bit.
@llvmbot
Copy link
Member

llvmbot commented Feb 27, 2024

@llvm/pr-subscribers-libc

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
The IEEE 754 standard as of the 2019 revision states that for fmin -0.0
is always less than 0.0 and for fmax 0.0 is always greater than 0.0.
These are currently not respected by the builtin value and thus cause
the tests to fail. This patch works around it in the implementation for
now by explicitly modifying the sign bit.


Full diff: https://github.com/llvm/llvm-project/pull/83158.diff

9 Files Affected:

  • (modified) libc/src/math/amdgpu/fmax.cpp (+7)
  • (modified) libc/src/math/amdgpu/fmaxf.cpp (+7)
  • (modified) libc/src/math/amdgpu/fmin.cpp (+7)
  • (modified) libc/src/math/amdgpu/fminf.cpp (+7)
  • (modified) libc/src/math/nvptx/fmax.cpp (+7)
  • (modified) libc/src/math/nvptx/fmaxf.cpp (+7)
  • (modified) libc/src/math/nvptx/fmin.cpp (+7)
  • (modified) libc/src/math/nvptx/fminf.cpp (+7)
  • (modified) libc/test/src/math/smoke/CMakeLists.txt (+96-99)
diff --git a/libc/src/math/amdgpu/fmax.cpp b/libc/src/math/amdgpu/fmax.cpp
index a2c35371d12b6a..09624cc6f092af 100644
--- a/libc/src/math/amdgpu/fmax.cpp
+++ b/libc/src/math/amdgpu/fmax.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmax.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmax, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) &
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmax(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fmaxf.cpp b/libc/src/math/amdgpu/fmaxf.cpp
index 67178b3e273575..f6ed46699a049f 100644
--- a/libc/src/math/amdgpu/fmaxf.cpp
+++ b/libc/src/math/amdgpu/fmaxf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmaxf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fmaxf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) &
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fmaxf(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fmin.cpp b/libc/src/math/amdgpu/fmin.cpp
index 7303adcd347ee9..8977ff7a066c6b 100644
--- a/libc/src/math/amdgpu/fmin.cpp
+++ b/libc/src/math/amdgpu/fmin.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmin.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmin, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) |
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmin(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fminf.cpp b/libc/src/math/amdgpu/fminf.cpp
index bbf0c677b5e3ae..3be55257f61649 100644
--- a/libc/src/math/amdgpu/fminf.cpp
+++ b/libc/src/math/amdgpu/fminf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fminf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fminf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) |
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fminf(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmax.cpp b/libc/src/math/nvptx/fmax.cpp
index a2c35371d12b6a..09624cc6f092af 100644
--- a/libc/src/math/nvptx/fmax.cpp
+++ b/libc/src/math/nvptx/fmax.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmax.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmax, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) &
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmax(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmaxf.cpp b/libc/src/math/nvptx/fmaxf.cpp
index 67178b3e273575..f6ed46699a049f 100644
--- a/libc/src/math/nvptx/fmaxf.cpp
+++ b/libc/src/math/nvptx/fmaxf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmaxf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fmaxf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) &
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fmaxf(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmin.cpp b/libc/src/math/nvptx/fmin.cpp
index 7303adcd347ee9..8977ff7a066c6b 100644
--- a/libc/src/math/nvptx/fmin.cpp
+++ b/libc/src/math/nvptx/fmin.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmin.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmin, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) |
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmin(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fminf.cpp b/libc/src/math/nvptx/fminf.cpp
index bbf0c677b5e3ae..3be55257f61649 100644
--- a/libc/src/math/nvptx/fminf.cpp
+++ b/libc/src/math/nvptx/fminf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fminf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fminf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) |
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fminf(x, y);
 }
 
diff --git a/libc/test/src/math/smoke/CMakeLists.txt b/libc/test/src/math/smoke/CMakeLists.txt
index 7ef87a0dd2eecf..825000e1cb7afe 100644
--- a/libc/test/src/math/smoke/CMakeLists.txt
+++ b/libc/test/src/math/smoke/CMakeLists.txt
@@ -1092,112 +1092,109 @@ add_fp_unittest(
     libc.src.__support.FPUtil.fp_bits
 )
 
-# FIXME: These tests are currently broken on the GPU.
-if(NOT LIBC_TARGET_OS_IS_GPU)
-  add_fp_unittest(
-    fminf_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminf_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminf
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminf_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminf_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminf
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmin_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmin_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fmin
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmin_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmin_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fmin
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fminl_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminl_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminl
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminl_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminl_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminl
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fminf128_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminf128_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminf128
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminf128_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminf128_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminf128
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxf_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxf_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxf
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmaxf_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxf_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxf
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmax_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmax_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmax
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmax_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmax_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmax
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxl_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxl_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxl
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmaxl_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxl_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxl
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxf128_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxf128_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxf128
-      libc.src.__support.FPUtil.fp_bits
-  )
-endif()
+add_fp_unittest(
+  fmaxf128_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxf128_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxf128
+    libc.src.__support.FPUtil.fp_bits
+)
 
 add_fp_unittest(
   sqrtf_test

@jhuber6
Copy link
Contributor Author

jhuber6 commented Feb 27, 2024

@arsenm I'm guessing this is a change we'll need to put in the backend at some point? The C/C++ standard seems lax on this point, but it's explicitly stated in IEEE754 2019 to have these semantics. The implementation of fmin in the ISA is listed as the following, which seems to provide these semantics, but looking at https://godbolt.org/z/oP8zrYh6T it doesn't seem to be the case.

// V_MIN_F64 Compute the minimum of two double-precision floats.
if (IEEE_MODE && S0.d == sNaN)
  D.d = Quiet(S0.d);
else if (IEEE_MODE && S1.d == sNaN)
  D.d = Quiet(S1.d);
else if (S0.d == NaN)
  D.d = S1.d;
else if (S1.d == NaN)
  D.d = S0.d;
else if (S0.d == +0.0 && S1.d == -0.0)
  D.d = S1.d;
else if (S0.d == -0.0 && S1.d == +0.0)
  D.d = S0.d;
else
  D.d = (S0.d < S1.d ? S0.d : S1.d);
endif.

@jhuber6 jhuber6 merged commit 7706945 into llvm:main Feb 27, 2024
@arsenm
Copy link
Contributor

arsenm commented Feb 28, 2024

https://godbolt.org/z/oP8zrYh6T

This change should be reverted

  1. The library shouldn't be working around instruction behavior
  2. C does not require this
  3. The instruction (and expand-with-quieting implementation) does have the correct signed behavior, I don't see how your godbolt example demonstrates it does not
  4. We would be better off with a separate set of intrinsics which do have the 2019 behavior

arsenm added a commit to arsenm/llvm-project that referenced this pull request Feb 29, 2024
Follow the 2019 rules and order -0 as less than +0 and +0 as greater than
-0. As currently defined this isn't required for the intrinsics, but is
a better QoI.

This will avoid the workaround in libc added by llvm#83158
sys-ceuplift pushed a commit to nstester/llvm-project that referenced this pull request Feb 29, 2024
Follow the 2019 rules and order -0 as less than +0 and +0 as greater
than -0. As currently defined this isn't required for the intrinsics,
but is a better QoI.

This will avoid the workaround in libc added by llvm#83158
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants