-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[libc][math] Implement a fast pass for atan2f128 with 1ULP error using DyadicFloat<128>. #133150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…g DyadicFloat<128>.
@llvm/pr-subscribers-libc Author: None (lntue) ChangesPart of #131642. Patch is 28.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/133150.diff 15 Files Affected:
diff --git a/libc/config/linux/aarch64/entrypoints.txt b/libc/config/linux/aarch64/entrypoints.txt
index bca0b13feb1a0..3020f924bcf1d 100644
--- a/libc/config/linux/aarch64/entrypoints.txt
+++ b/libc/config/linux/aarch64/entrypoints.txt
@@ -758,6 +758,7 @@ endif()
if(LIBC_TYPES_HAS_FLOAT128)
list(APPEND TARGET_LIBM_ENTRYPOINTS
# math.h C23 _Float128 entrypoints
+ libc.src.math.atan2f128
libc.src.math.canonicalizef128
libc.src.math.ceilf128
libc.src.math.copysignf128
diff --git a/libc/config/linux/riscv/entrypoints.txt b/libc/config/linux/riscv/entrypoints.txt
index 9cf05ef6d5a61..479b97abfcf6b 100644
--- a/libc/config/linux/riscv/entrypoints.txt
+++ b/libc/config/linux/riscv/entrypoints.txt
@@ -639,6 +639,7 @@ endif()
if(LIBC_TYPES_HAS_FLOAT128)
list(APPEND TARGET_LIBM_ENTRYPOINTS
# math.h C23 _Float128 entrypoints
+ libc.src.math.atan2f128
libc.src.math.canonicalizef128
libc.src.math.ceilf128
libc.src.math.copysignf128
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index ac281e8d39066..9713a5420be95 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -773,6 +773,7 @@ endif()
if(LIBC_TYPES_HAS_FLOAT128)
list(APPEND TARGET_LIBM_ENTRYPOINTS
# math.h C23 _Float128 entrypoints
+ libc.src.math.atan2f128
libc.src.math.canonicalizef128
libc.src.math.ceilf128
libc.src.math.copysignf128
diff --git a/libc/docs/headers/math/index.rst b/libc/docs/headers/math/index.rst
index 16903a40c03fa..bef946f4b63d0 100644
--- a/libc/docs/headers/math/index.rst
+++ b/libc/docs/headers/math/index.rst
@@ -263,7 +263,7 @@ Higher Math Functions
+-----------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
| atan | |check| | 1 ULP | | | | 7.12.4.3 | F.10.1.3 |
+-----------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
-| atan2 | |check| | 1 ULP | | | | 7.12.4.4 | F.10.1.4 |
+| atan2 | |check| | 1 ULP | | | 1 ULP | 7.12.4.4 | F.10.1.4 |
+-----------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
| atan2pi | | | | | | 7.12.4.11 | F.10.1.11 |
+-----------+------------------+-----------------+------------------------+----------------------+------------------------+------------------------+----------------------------+
diff --git a/libc/include/math.yaml b/libc/include/math.yaml
index bd29eb0213c4f..5c1b3b594f3a9 100644
--- a/libc/include/math.yaml
+++ b/libc/include/math.yaml
@@ -86,6 +86,14 @@ functions:
arguments:
- type: long double
- type: long double
+ - name: atan2f128
+ standards:
+ - stdc
+ return_type: float128
+ arguments:
+ - type: float128
+ - type: float128
+ guard: LIBC_TYPES_HAS_FLOAT128
- name: atanf
standards:
- stdc
diff --git a/libc/src/__support/FPUtil/dyadic_float.h b/libc/src/__support/FPUtil/dyadic_float.h
index 65d55e16d95ec..673354888d589 100644
--- a/libc/src/__support/FPUtil/dyadic_float.h
+++ b/libc/src/__support/FPUtil/dyadic_float.h
@@ -104,7 +104,7 @@ template <size_t Bits> struct DyadicFloat {
normalize();
}
- LIBC_INLINE constexpr DyadicFloat(Sign s, int e, MantissaType m)
+ LIBC_INLINE constexpr DyadicFloat(Sign s, int e, const MantissaType &m)
: sign(s), exponent(e), mantissa(m) {
normalize();
}
diff --git a/libc/src/math/CMakeLists.txt b/libc/src/math/CMakeLists.txt
index 92c80a1053c9e..dcc35f1146d06 100644
--- a/libc/src/math/CMakeLists.txt
+++ b/libc/src/math/CMakeLists.txt
@@ -61,6 +61,7 @@ add_math_entrypoint_object(atanf)
add_math_entrypoint_object(atan2)
add_math_entrypoint_object(atan2f)
add_math_entrypoint_object(atan2l)
+add_math_entrypoint_object(atan2f128)
add_math_entrypoint_object(atanh)
add_math_entrypoint_object(atanhf)
diff --git a/libc/src/math/atan2f128.h b/libc/src/math/atan2f128.h
new file mode 100644
index 0000000000000..26f7ec624940c
--- /dev/null
+++ b/libc/src/math/atan2f128.h
@@ -0,0 +1,21 @@
+//===-- Implementation header for atan2f128 ---------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC_MATH_ATAN2F128_H
+#define LLVM_LIBC_SRC_MATH_ATAN2F128_H
+
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/properties/types.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+float128 atan2f128(float128 x, float128 y);
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC_MATH_ATAN2F128_H
diff --git a/libc/src/math/generic/CMakeLists.txt b/libc/src/math/generic/CMakeLists.txt
index 77c6244c5e5da..50b137c76407e 100644
--- a/libc/src/math/generic/CMakeLists.txt
+++ b/libc/src/math/generic/CMakeLists.txt
@@ -4080,8 +4080,11 @@ add_header_library(
HDRS
atan_utils.h
DEPENDS
+ libc.src.__support.integer_literals
libc.src.__support.FPUtil.double_double
+ libc.src.__support.FPUtil.dyadic_float
libc.src.__support.FPUtil.multiply_add
+ libc.src.__support.FPUtil.polyeval
libc.src.__support.macros.optimization
)
@@ -4163,6 +4166,23 @@ add_entrypoint_object(
.atan2
)
+add_entrypoint_object(
+ atan2f128
+ SRCS
+ atan2f128.cpp
+ HDRS
+ ../atan2f128.h
+ DEPENDS
+ .atan_utils
+ libc.src.__support.integer_literals
+ libc.src.__support.FPUtil.dyadic_float
+ libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.FPUtil.multiply_add
+ libc.src.__support.FPUtil.nearest_integer
+ libc.src.__support.macros.optimization
+ libc.src.__support.macros.properties.types
+)
+
add_entrypoint_object(
scalbln
SRCS
diff --git a/libc/src/math/generic/atan2f128.cpp b/libc/src/math/generic/atan2f128.cpp
new file mode 100644
index 0000000000000..808f41982cffc
--- /dev/null
+++ b/libc/src/math/generic/atan2f128.cpp
@@ -0,0 +1,201 @@
+//===-- Quad-precision atan2 function -------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "src/math/atan2f128.h"
+#include "atan_utils.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/dyadic_float.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/FPUtil/nearest_integer.h"
+#include "src/__support/integer_literals.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h" // LIBC_UNLIKELY
+#include "src/__support/macros/properties/types.h"
+#include "src/__support/uint128.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace {
+
+using Float128 = fputil::DyadicFloat<128>;
+
+static constexpr Float128 ZERO = {Sign::POS, 0, 0_u128};
+static constexpr Float128 MZERO = {Sign::NEG, 0, 0_u128};
+static constexpr Float128 PI = {Sign::POS, -126,
+ 0xc90fdaa2'2168c234'c4c6628b'80dc1cd1_u128};
+static constexpr Float128 MPI = {Sign::NEG, -126,
+ 0xc90fdaa2'2168c234'c4c6628b'80dc1cd1_u128};
+static constexpr Float128 PI_OVER_2 = {
+ Sign::POS, -127, 0xc90fdaa2'2168c234'c4c6628b'80dc1cd1_u128};
+static constexpr Float128 MPI_OVER_2 = {
+ Sign::NEG, -127, 0xc90fdaa2'2168c234'c4c6628b'80dc1cd1_u128};
+static constexpr Float128 PI_OVER_4 = {
+ Sign::POS, -128, 0xc90fdaa2'2168c234'c4c6628b'80dc1cd1_u128};
+static constexpr Float128 THREE_PI_OVER_4 = {
+ Sign::POS, -128, 0x96cbe3f9'990e91a7'9394c9e8'a0a5159d_u128};
+
+// Adjustment for constant term:
+// CONST_ADJ[x_sign][y_sign][recip]
+static constexpr Float128 CONST_ADJ[2][2][2] = {
+ {{ZERO, MPI_OVER_2}, {MZERO, MPI_OVER_2}},
+ {{MPI, PI_OVER_2}, {MPI, PI_OVER_2}}};
+
+} // anonymous namespace
+
+// There are several range reduction steps we can take for atan2(y, x) as
+// follow:
+
+// * Range reduction 1: signness
+// atan2(y, x) will return a number between -PI and PI representing the angle
+// forming by the 0x axis and the vector (x, y) on the 0xy-plane.
+// In particular, we have that:
+// atan2(y, x) = atan( y/x ) if x >= 0 and y >= 0 (I-quadrant)
+// = pi + atan( y/x ) if x < 0 and y >= 0 (II-quadrant)
+// = -pi + atan( y/x ) if x < 0 and y < 0 (III-quadrant)
+// = atan( y/x ) if x >= 0 and y < 0 (IV-quadrant)
+// Since atan function is odd, we can use the formula:
+// atan(-u) = -atan(u)
+// to adjust the above conditions a bit further:
+// atan2(y, x) = atan( |y|/|x| ) if x >= 0 and y >= 0 (I-quadrant)
+// = pi - atan( |y|/|x| ) if x < 0 and y >= 0 (II-quadrant)
+// = -pi + atan( |y|/|x| ) if x < 0 and y < 0 (III-quadrant)
+// = -atan( |y|/|x| ) if x >= 0 and y < 0 (IV-quadrant)
+// Which can be simplified to:
+// atan2(y, x) = sign(y) * atan( |y|/|x| ) if x >= 0
+// = sign(y) * (pi - atan( |y|/|x| )) if x < 0
+
+// * Range reduction 2: reciprocal
+// Now that the argument inside atan is positive, we can use the formula:
+// atan(1/x) = pi/2 - atan(x)
+// to make the argument inside atan <= 1 as follow:
+// atan2(y, x) = sign(y) * atan( |y|/|x|) if 0 <= |y| <= x
+// = sign(y) * (pi/2 - atan( |x|/|y| ) if 0 <= x < |y|
+// = sign(y) * (pi - atan( |y|/|x| )) if 0 <= |y| <= -x
+// = sign(y) * (pi/2 + atan( |x|/|y| )) if 0 <= -x < |y|
+
+// * Range reduction 3: look up table.
+// After the previous two range reduction steps, we reduce the problem to
+// compute atan(u) with 0 <= u <= 1, or to be precise:
+// atan( n / d ) where n = min(|x|, |y|) and d = max(|x|, |y|).
+// An accurate polynomial approximation for the whole [0, 1] input range will
+// require a very large degree. To make it more efficient, we reduce the input
+// range further by finding an integer idx such that:
+// | n/d - idx/64 | <= 1/128.
+// In particular,
+// idx := round(2^6 * n/d)
+// Then for the fast pass, we find a polynomial approximation for:
+// atan( n/d ) ~ atan( idx/64 ) + (n/d - idx/64) * Q(n/d - idx/64)
+// For the accurate pass, we use the addition formula:
+// atan( n/d ) - atan( idx/64 ) = atan( (n/d - idx/64)/(1 + (n*idx)/(64*d)) )
+// = atan( (n - d*(idx/64))/(d + n*(idx/64)) )
+// And for the fast pass, we use degree-9 Taylor polynomial to compute the RHS:
+// atan(u) ~ P(u) = u - u^3/3 + u^5/5 - u^7/7 + u^9/9
+// with absolute errors bounded by:
+// |atan(u) - P(u)| < |u|^11 / 11 < 2^-80
+// and relative errors bounded by:
+// |(atan(u) - P(u)) / P(u)| < u^10 / 11 < 2^-73.
+
+LLVM_LIBC_FUNCTION(float128, atan2f128, (float128 y, float128 x)) {
+ using FPBits = fputil::FPBits<float128>;
+ using Float128 = fputil::DyadicFloat<128>;
+
+ FPBits x_bits(x), y_bits(y);
+ bool x_sign = x_bits.sign().is_neg();
+ bool y_sign = y_bits.sign().is_neg();
+ x_bits = x_bits.abs();
+ y_bits = y_bits.abs();
+ UInt128 x_abs = x_bits.uintval();
+ UInt128 y_abs = y_bits.uintval();
+ bool recip = x_abs < y_abs;
+ UInt128 min_abs = recip ? x_abs : y_abs;
+ UInt128 max_abs = !recip ? x_abs : y_abs;
+ unsigned min_exp = static_cast<unsigned>(min_abs >> FPBits::FRACTION_LEN);
+ unsigned max_exp = static_cast<unsigned>(max_abs >> FPBits::FRACTION_LEN);
+
+ Float128 num(FPBits(min_abs).get_val());
+ Float128 den(FPBits(max_abs).get_val());
+
+ // Check for exceptional cases, whether inputs are 0, inf, nan, or close to
+ // overflow, or close to underflow.
+ if (LIBC_UNLIKELY(max_exp >= 0x7fffU || min_exp == 0U)) {
+ if (x_bits.is_nan() || y_bits.is_nan())
+ return FPBits::quiet_nan().get_val();
+ unsigned x_except = x == 0 ? 0 : (FPBits(x_abs).is_inf() ? 2 : 1);
+ unsigned y_except = y == 0 ? 0 : (FPBits(y_abs).is_inf() ? 2 : 1);
+
+ // Exceptional cases:
+ // EXCEPT[y_except][x_except][x_is_neg]
+ // with x_except & y_except:
+ // 0: zero
+ // 1: finite, non-zero
+ // 2: infinity
+ constexpr Float128 EXCEPTS[3][3][2] = {
+ {{ZERO, PI}, {ZERO, PI}, {ZERO, PI}},
+ {{PI_OVER_2, PI_OVER_2}, {ZERO, ZERO}, {ZERO, PI}},
+ {{PI_OVER_2, PI_OVER_2},
+ {PI_OVER_2, PI_OVER_2},
+ {PI_OVER_4, THREE_PI_OVER_4}},
+ };
+
+ if ((x_except != 1) || (y_except != 1)) {
+ Float128 r = EXCEPTS[y_except][x_except][x_sign];
+ if (y_sign)
+ r.sign = (r.sign == Sign::POS) ? Sign::NEG : Sign::POS;
+ return static_cast<float128>(r);
+ }
+ }
+
+ bool final_sign = ((x_sign != y_sign) != recip);
+ Float128 const_term = CONST_ADJ[x_sign][y_sign][recip];
+ int exp_diff = den.exponent - num.exponent;
+ // We have the following bound for normalized n and d:
+ // 2^(-exp_diff - 1) < n/d < 2^(-exp_diff + 1).
+ if (LIBC_UNLIKELY(exp_diff > FPBits::FRACTION_LEN + 2)) {
+ if (final_sign)
+ const_term.sign = (const_term.sign == Sign::POS) ? Sign::NEG : Sign::POS;
+ return static_cast<float128>(const_term);
+ }
+
+ // Take 24 leading bits of num and den to convert to float for fast division.
+ // We also multiply the numerator by 64 using integer addition directly to the
+ // exponent field.
+ float num_f =
+ cpp::bit_cast<float>(static_cast<uint32_t>(num.mantissa >> 104) +
+ (6U << fputil::FPBits<float>::FRACTION_LEN));
+ float den_f = cpp::bit_cast<float>(
+ static_cast<uint32_t>(den.mantissa >> 104) +
+ (static_cast<uint32_t>(exp_diff) << fputil::FPBits<float>::FRACTION_LEN));
+
+ float k = fputil::nearest_integer(num_f / den_f);
+ unsigned idx = static_cast<unsigned>(k);
+
+ // k_f128 = idx / 64
+ Float128 k_f128(Sign::POS, -6, Float128::MantissaType(idx));
+
+ // Range reduction:
+ // atan(n/d) - atan(k) = atan((n/d - k/64) / (1 + (n/d) * (k/64)))
+ // = atan((n - d * k/64)) / (d + n * k/64))
+ // num_f128 = n - d * k/64
+ Float128 num_f128 = fputil::multiply_add(den, -k_f128, num);
+ // den_f128 = d + n * k/64
+ Float128 den_f128 = fputil::multiply_add(num, k_f128, den);
+
+ // q = (n - d * k) / (d + n * k)
+ Float128 q = fputil::quick_mul(num_f128, fputil::approx_reciprocal(den_f128));
+ // p ~ atan(q)
+ Float128 p = atan_eval(q);
+
+ Float128 r =
+ fputil::quick_add(const_term, fputil::quick_add(ATAN_I_F128[idx], p));
+ if (final_sign)
+ r.sign = (r.sign == Sign::POS) ? Sign::NEG : Sign::POS;
+
+ return static_cast<float128>(r);
+}
+
+} // namespace LIBC_NAMESPACE_DECL
diff --git a/libc/src/math/generic/atan_utils.h b/libc/src/math/generic/atan_utils.h
index 3331843900d4b..24c7271b7e4ec 100644
--- a/libc/src/math/generic/atan_utils.h
+++ b/libc/src/math/generic/atan_utils.h
@@ -9,8 +9,11 @@
#ifndef LLVM_LIBC_SRC_MATH_GENERIC_ATAN_UTILS_H
#define LLVM_LIBC_SRC_MATH_GENERIC_ATAN_UTILS_H
+#include "src/__support/FPUtil/PolyEval.h"
#include "src/__support/FPUtil/double_double.h"
+#include "src/__support/FPUtil/dyadic_float.h"
#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/integer_literals.h"
#include "src/__support/macros/config.h"
namespace LIBC_NAMESPACE_DECL {
@@ -18,6 +21,7 @@ namespace LIBC_NAMESPACE_DECL {
namespace {
using DoubleDouble = fputil::DoubleDouble;
+using Float128 = fputil::DyadicFloat<128>;
// atan(i/64) with i = 0..64, generated by Sollya with:
// > for i from 0 to 64 do {
@@ -25,7 +29,7 @@ using DoubleDouble = fputil::DoubleDouble;
// b = round(atan(i/64) - a, D, RN);
// print("{", b, ",", a, "},");
// };
-constexpr fputil::DoubleDouble ATAN_I[65] = {
+constexpr DoubleDouble ATAN_I[65] = {
{0.0, 0.0},
{-0x1.220c39d4dff5p-61, 0x1.fff555bbb729bp-7},
{-0x1.5ec431444912cp-60, 0x1.ffd55bba97625p-6},
@@ -106,7 +110,7 @@ constexpr fputil::DoubleDouble ATAN_I[65] = {
// + x_lo * (1 - x_hi^2 + x_hi^4)
// Since p.lo is ~ x^3/3, the relative error from rounding is bounded by:
// |(atan(x) - P(x))/atan(x)| < ulp(x^2) <= 2^(-14-52) = 2^-66.
-DoubleDouble atan_eval(const DoubleDouble &x) {
+[[maybe_unused]] DoubleDouble atan_eval(const DoubleDouble &x) {
DoubleDouble p;
p.hi = x.hi;
double x_hi_sq = x.hi * x.hi;
@@ -130,6 +134,106 @@ DoubleDouble atan_eval(const DoubleDouble &x) {
return p;
}
+// Float128 versions.
+// atan(i/64) with i = 0..64, generated by Sollya with:
+// > for i from 1 to 64 do {
+// a = round(atan(i/64), 128, RN);
+// ll = ceil(log2(a));
+// b = 2^ll + a;
+// print("{Sign::POS, ", 2^(ll - 128), ",", b, "},");
+// };
+constexpr Float128 ATAN_I_F128[65] = {
+ {Sign::POS, 0, 0_u128},
+ {Sign::POS, -134, 0xfffaaadd'db94d5bb'e78c5640'15f76048_u128},
+ {Sign::POS, -133, 0xffeaaddd'4bb12542'779d776d'da8c6214_u128},
+ {Sign::POS, -132, 0xbfdc0c21'86d14fcf'220e10d6'1df56ec7_u128},
+ {Sign::POS, -132, 0xffaaddb9'67ef4e36'cb2792dc'0e2e0d51_u128},
+ {Sign::POS, -131, 0x9facf873'e2aceb58'99c50bbf'08e6cdf6_u128},
+ {Sign::POS, -131, 0xbf70c130'17887460'93567e78'4cf83676_u128},
+ {Sign::POS, -131, 0xdf1cf5f3'783e1bef'71e5340b'30e5d9ef_u128},
+ {Sign::POS, -131, 0xfeadd4d5'617b6e32'c897989f'3e888ef8_u128},
+ {Sign::POS, -130, 0x8f0fd7d8'21b93725'bd375929'83a0af9a_u128},
+ {Sign::POS, -130, 0x9eb77746'331362c3'47619d25'0360fe85_u128},
+ {Sign::POS, -130, 0xae4c08f1'f6134efa'b54d3fef'0c2de994_u128},
+ {Sign::POS, -130, 0xbdcbda5e'72d81134'7b0b4f88'1c9c7488_u128},
+ {Sign::POS, -130, 0xcd35474b'643130e7'b00f3da1'a46eeb3b_u128},
+ {Sign::POS, -130, 0xdc86ba94'93051022'f621a5c1'cb552f03_u128},
+ {Sign::POS, -130, 0xebbeaef9'02b9b38c'91a2a68b'2fbd78e8_u128},
+ {Sign::POS, -130, 0xfadbafc9'6406eb15'6dc79ef5'f7a217e6_u128},
+ {Sign::POS, -129, 0x84ee2cbe'c31b12c5'c8e72197'0cabd3a3_u128},
+ {Sign::POS, -129, 0x8c5fad18'5f8bc130'ca4748b1'bf88298d_u128},
+ {Sign::POS, -129, 0x93c1b902'bf7a2df1'06459240'6fe1447a_u128},
+ {Sign::POS, -129, 0x9b13b9b8'3f5e5e69'c5abb498'd27af328_u128},
+ {Sign::POS, -129, 0xa25521b6'15784d45'43787549'88b8d9e3_u128},
+ {Sign::POS, -129, 0xa9856cca'8e6a4eda'99b7f77b'f7d9e8c1_u128},
+ {Sign::POS, -129, 0xb0a42018'4e7f0cb1'b51d51dc'200a0fc3_u128},
+ {Sign::POS, -129, 0xb7b0ca0f'26f78473'8aa32122'dcfe4483_u128},
+ {Sign::POS, -129, 0xbeab025b'1d9fbad3'910b8564'93411026_u128},
+ {Sign::POS, -129, 0xc59269ca'50d92b6d'a1746e91'f50a28de_u128},
+ {Sign::POS, -129, 0xcc66aa2a'6b58c33c'd9311fa1'4ed9b7c4_u128},
+ {Sign::POS, -129, 0xd327761e'611fe5b6'427c95e9'001e7136_u128},
+ {Sign::POS, -129, 0xd9d488ed'32e3635c'30f6394a'0806345d_u128},
+ {Sign::POS, -129, 0xe06da64a'764f7c67'c631ed96'798cb804_u128},
+ {Sign::POS, -129, 0xe6f29a19'609a84ba'60b77ce1'ca6dc2c8_u128},
+ {Sign::POS, -129, 0xed63382b'0dda7b45'6fe445ec'bc3a8d03_u128},
+ {Sign::POS, -129, 0xf3bf5bf8'bad1a21c'a7b837e6'86adf3fa_u128},
+ {Sign::POS, -129, 0xfa06e85a'a0a0be5c'66d23c7d'5dc8ecc2_u128},
+ {Sign::POS, -128, 0x801ce39e'0d205c99'a6d6c6c5'4d938596_u128},
+ {Sign::POS, -128, 0x832bf4a6'd9867e2a'4b6a09cb'61a515c1_u128},
+ {Sign::POS, -128, 0x8630a2da'da1ed065'd3e84ed5'013ca37e_u128},
+ {Sign::POS, -128, 0x892aecdf'de9547b5'094478fc'472b4afc_u128},
+ {Sign::POS, -128, 0x8c1ad445'f3e09b8c'439d8018'60205921_u128},
+ {Sign::POS, -128, 0x8f005d5e'f7f59f9b'5c835e16'65c43748_u128},
+ {Sign::POS, -128, 0x91db8f16'64f3...
[truncated]
|
// > P = fpminimax(atan(x), [|1, 3, 5, 7, 9, 11, 13|], [|1, 128...|], | ||
// [0, 2^-7]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Sollya 0161d13c:
~ ➜ sollya
> P = fpminimax(atan(x), [|1, 3, 5, 7, 9, 11, 13|], [|1, 128...|],
[0, 2^-7]);
Warning in fpminimax: the list of formats is an numerical end-elliptic list.
This is deprecated and will be forbidden in future releases.
>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This warning message is confusing because its usage is still presented in the help text of fpminimax
, and there won't be any cleaner way to do this beside listing all of them explicitly. I'll check with Sollya's maintainer about this.
@@ -106,7 +110,7 @@ constexpr fputil::DoubleDouble ATAN_I[65] = { | |||
// + x_lo * (1 - x_hi^2 + x_hi^4) | |||
// Since p.lo is ~ x^3/3, the relative error from rounding is bounded by: | |||
// |(atan(x) - P(x))/atan(x)| < ulp(x^2) <= 2^(-14-52) = 2^-66. | |||
DoubleDouble atan_eval(const DoubleDouble &x) { | |||
[[maybe_unused]] DoubleDouble atan_eval(const DoubleDouble &x) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why add the [[maybe_unused]]
attribute here by the way? Isn't this always used by the implementaton of atan
and atan2
for double
s? Same question for the atan_eval
overload for Float128
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because atan_eval(const DoubleDouble &)
is not used in atan2f128
, and (right now) atan_eval(Float128)
is not used in atan
and atan2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/104/builds/19383 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/188/builds/12609 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/196/builds/6586 Here is the relevant piece of the build log for the reference
|
…g DyadicFloat<128>. (llvm#133150) Part of llvm#131642.
Part of #131642.