[libc] Work around incorrect fmin/fmax results for +/-x #83158

jhuber6 · 2024-02-27T17:43:14Z

Summary:
The IEEE 754 standard as of the 2019 revision states that for fmin -0.0
is always less than 0.0 and for fmax 0.0 is always greater than 0.0.
These are currently not respected by the builtin value and thus cause
the tests to fail. This patch works around it in the implementation for
now by explicitly modifying the sign bit.

Summary: The IEEE 754 standard as of the 2019 revision states that for fmin -0.0 is always less than 0.0 and for fmax 0.0 is always greater than 0.0. These are currently not respected by the builtin value and thus cause the tests to fail. This patch works around it in the implementation for now by explicitly modifying the sign bit.

llvmbot · 2024-02-27T17:43:44Z

@llvm/pr-subscribers-libc

@llvm/pr-subscribers-backend-amdgpu

Author: Joseph Huber (jhuber6)

Changes

Summary:
The IEEE 754 standard as of the 2019 revision states that for fmin -0.0
is always less than 0.0 and for fmax 0.0 is always greater than 0.0.
These are currently not respected by the builtin value and thus cause
the tests to fail. This patch works around it in the implementation for
now by explicitly modifying the sign bit.

Full diff: https://github.com/llvm/llvm-project/pull/83158.diff

9 Files Affected:

(modified) libc/src/math/amdgpu/fmax.cpp (+7)
(modified) libc/src/math/amdgpu/fmaxf.cpp (+7)
(modified) libc/src/math/amdgpu/fmin.cpp (+7)
(modified) libc/src/math/amdgpu/fminf.cpp (+7)
(modified) libc/src/math/nvptx/fmax.cpp (+7)
(modified) libc/src/math/nvptx/fmaxf.cpp (+7)
(modified) libc/src/math/nvptx/fmin.cpp (+7)
(modified) libc/src/math/nvptx/fminf.cpp (+7)
(modified) libc/test/src/math/smoke/CMakeLists.txt (+96-99)

diff --git a/libc/src/math/amdgpu/fmax.cpp b/libc/src/math/amdgpu/fmax.cpp
index a2c35371d12b6a..09624cc6f092af 100644
--- a/libc/src/math/amdgpu/fmax.cpp
+++ b/libc/src/math/amdgpu/fmax.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmax.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmax, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) &
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmax(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fmaxf.cpp b/libc/src/math/amdgpu/fmaxf.cpp
index 67178b3e273575..f6ed46699a049f 100644
--- a/libc/src/math/amdgpu/fmaxf.cpp
+++ b/libc/src/math/amdgpu/fmaxf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmaxf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fmaxf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) &
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fmaxf(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fmin.cpp b/libc/src/math/amdgpu/fmin.cpp
index 7303adcd347ee9..8977ff7a066c6b 100644
--- a/libc/src/math/amdgpu/fmin.cpp
+++ b/libc/src/math/amdgpu/fmin.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmin.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmin, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) |
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmin(x, y);
 }
 
diff --git a/libc/src/math/amdgpu/fminf.cpp b/libc/src/math/amdgpu/fminf.cpp
index bbf0c677b5e3ae..3be55257f61649 100644
--- a/libc/src/math/amdgpu/fminf.cpp
+++ b/libc/src/math/amdgpu/fminf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fminf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fminf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) |
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fminf(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmax.cpp b/libc/src/math/nvptx/fmax.cpp
index a2c35371d12b6a..09624cc6f092af 100644
--- a/libc/src/math/nvptx/fmax.cpp
+++ b/libc/src/math/nvptx/fmax.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmax.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmax, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) &
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmax(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmaxf.cpp b/libc/src/math/nvptx/fmaxf.cpp
index 67178b3e273575..f6ed46699a049f 100644
--- a/libc/src/math/nvptx/fmaxf.cpp
+++ b/libc/src/math/nvptx/fmaxf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmaxf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fmaxf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) &
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fmaxf(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fmin.cpp b/libc/src/math/nvptx/fmin.cpp
index 7303adcd347ee9..8977ff7a066c6b 100644
--- a/libc/src/math/nvptx/fmin.cpp
+++ b/libc/src/math/nvptx/fmin.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fmin.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(double, fmin, (double x, double y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<double>(cpp::bit_cast<uint64_t>(x) |
+                                 cpp::bit_cast<uint64_t>(y));
   return __builtin_fmin(x, y);
 }
 
diff --git a/libc/src/math/nvptx/fminf.cpp b/libc/src/math/nvptx/fminf.cpp
index bbf0c677b5e3ae..3be55257f61649 100644
--- a/libc/src/math/nvptx/fminf.cpp
+++ b/libc/src/math/nvptx/fminf.cpp
@@ -7,11 +7,18 @@
 //===----------------------------------------------------------------------===//
 
 #include "src/math/fminf.h"
+
+#include "src/__support/CPP/bit.h"
 #include "src/__support/common.h"
+#include "src/__support/macros/optimization.h"
 
 namespace LIBC_NAMESPACE {
 
 LLVM_LIBC_FUNCTION(float, fminf, (float x, float y)) {
+  // FIXME: The builtin function does not correctly handle the +/-0.0 case.
+  if (LIBC_UNLIKELY(x == y))
+    return cpp::bit_cast<float>(cpp::bit_cast<uint32_t>(x) |
+                                cpp::bit_cast<uint32_t>(y));
   return __builtin_fminf(x, y);
 }
 
diff --git a/libc/test/src/math/smoke/CMakeLists.txt b/libc/test/src/math/smoke/CMakeLists.txt
index 7ef87a0dd2eecf..825000e1cb7afe 100644
--- a/libc/test/src/math/smoke/CMakeLists.txt
+++ b/libc/test/src/math/smoke/CMakeLists.txt
@@ -1092,112 +1092,109 @@ add_fp_unittest(
     libc.src.__support.FPUtil.fp_bits
 )
 
-# FIXME: These tests are currently broken on the GPU.
-if(NOT LIBC_TARGET_OS_IS_GPU)
-  add_fp_unittest(
-    fminf_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminf_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminf
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminf_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminf_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminf
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmin_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmin_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fmin
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmin_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmin_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fmin
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fminl_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminl_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminl
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminl_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminl_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminl
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fminf128_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fminf128_test.cpp
-    HDRS
-      FMinTest.h
-    DEPENDS
-      libc.src.math.fminf128
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fminf128_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fminf128_test.cpp
+  HDRS
+    FMinTest.h
+  DEPENDS
+    libc.src.math.fminf128
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxf_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxf_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxf
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmaxf_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxf_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxf
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmax_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmax_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmax
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmax_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmax_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmax
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxl_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxl_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxl
-      libc.src.__support.FPUtil.fp_bits
-  )
+add_fp_unittest(
+  fmaxl_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxl_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxl
+    libc.src.__support.FPUtil.fp_bits
+)
 
-  add_fp_unittest(
-    fmaxf128_test
-    SUITE
-      libc-math-smoke-tests
-    SRCS
-      fmaxf128_test.cpp
-    HDRS
-      FMaxTest.h
-    DEPENDS
-      libc.src.math.fmaxf128
-      libc.src.__support.FPUtil.fp_bits
-  )
-endif()
+add_fp_unittest(
+  fmaxf128_test
+  SUITE
+    libc-math-smoke-tests
+  SRCS
+    fmaxf128_test.cpp
+  HDRS
+    FMaxTest.h
+  DEPENDS
+    libc.src.math.fmaxf128
+    libc.src.__support.FPUtil.fp_bits
+)
 
 add_fp_unittest(
   sqrtf_test

jhuber6 · 2024-02-27T17:57:06Z

@arsenm I'm guessing this is a change we'll need to put in the backend at some point? The C/C++ standard seems lax on this point, but it's explicitly stated in IEEE754 2019 to have these semantics. The implementation of fmin in the ISA is listed as the following, which seems to provide these semantics, but looking at https://godbolt.org/z/oP8zrYh6T it doesn't seem to be the case.

// V_MIN_F64 Compute the minimum of two double-precision floats.
if (IEEE_MODE && S0.d == sNaN)
  D.d = Quiet(S0.d);
else if (IEEE_MODE && S1.d == sNaN)
  D.d = Quiet(S1.d);
else if (S0.d == NaN)
  D.d = S1.d;
else if (S1.d == NaN)
  D.d = S0.d;
else if (S0.d == +0.0 && S1.d == -0.0)
  D.d = S1.d;
else if (S0.d == -0.0 && S1.d == +0.0)
  D.d = S0.d;
else
  D.d = (S0.d < S1.d ? S0.d : S1.d);
endif.

arsenm · 2024-02-28T03:36:08Z

https://godbolt.org/z/oP8zrYh6T

This change should be reverted

The library shouldn't be working around instruction behavior
C does not require this
The instruction (and expand-with-quieting implementation) does have the correct signed behavior, I don't see how your godbolt example demonstrates it does not
We would be better off with a separate set of intrinsics which do have the 2019 behavior

Follow the 2019 rules and order -0 as less than +0 and +0 as greater than -0. As currently defined this isn't required for the intrinsics, but is a better QoI. This will avoid the workaround in libc added by llvm#83158

jhuber6 requested review from arsenm, Artem-B, lntue and michaelrj-google February 27, 2024 17:43

llvmbot added backend:AMDGPU libc labels Feb 27, 2024

lntue approved these changes Feb 27, 2024

View reviewed changes

jhuber6 merged commit 7706945 into llvm:main Feb 27, 2024

arsenm mentioned this pull request Feb 29, 2024

APFloat: Fix signed zero handling in minnum/maxnum #83376

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[libc] Work around incorrect fmin/fmax results for +/-x #83158

[libc] Work around incorrect fmin/fmax results for +/-x #83158

Uh oh!

jhuber6 commented Feb 27, 2024

Uh oh!

llvmbot commented Feb 27, 2024 •

edited

Loading

Uh oh!

jhuber6 commented Feb 27, 2024 •

edited

Loading

Uh oh!

arsenm commented Feb 28, 2024

Uh oh!

Uh oh!

[libc] Work around incorrect fmin/fmax results for +/-x #83158

[libc] Work around incorrect fmin/fmax results for +/-x #83158

Uh oh!

Conversation

jhuber6 commented Feb 27, 2024

Uh oh!

llvmbot commented Feb 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jhuber6 commented Feb 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

arsenm commented Feb 28, 2024

Uh oh!

Uh oh!

llvmbot commented Feb 27, 2024 •

edited

Loading

jhuber6 commented Feb 27, 2024 •

edited

Loading