Skip to content

stdlib: excise the FP16 support routines on x64 #62988

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 10 additions & 0 deletions lib/ClangImporter/ClangImporter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -780,6 +780,16 @@ importer::addCommonInvocationArguments(
invocationArgStrs.push_back("-mcx16");
}

if (triple.isX86()) {
// Enable the FC16/CVT16 extensions which provide half prevision floating
// point support on x86_64. This bumps the minimum requirement of the CPU
// to Ivy Bridge, however, Windows 10 (at least as of 1709) requires
// ~Broadwell. At this point, the complexity of supporting an older release
// no longer is justified. For uniformity, bump the minimum x86 CPU on all
// the targets.
invocationArgStrs.push_back("-mf16c");
}

if (!importerOpts.Optimization.empty()) {
invocationArgStrs.push_back(importerOpts.Optimization);
}
Expand Down
12 changes: 12 additions & 0 deletions stdlib/cmake/modules/AddSwiftStdlib.cmake
Original file line number Diff line number Diff line change
Expand Up @@ -288,6 +288,18 @@ function(_add_target_variant_c_compile_flags)
endif()
endif()

# Avoid the need for the FP16 truncation and extension routines from
# compiler-rt by assuming that we have hardware capable of performing
# half-precision floating point conversions. This effectively pins the
# minimum CPU requirements to Ivy Bridge (~2013).
if(CFLAGS_ARCH STREQUAL x86_64 OR CFLAGS_ARCH STREQUAL i686)
if(SWIFT_COMPILER_IS_MSVC_LIKE)
list(APPEND result /clang:-mf16c)
else()
list(APPEND result -mf16c)
endif()
endif()

if(CFLAGS_ENABLE_ASSERTIONS)
list(APPEND result "-UNDEBUG")
else()
Expand Down
20 changes: 18 additions & 2 deletions stdlib/public/runtime/Float16Support.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
// __gnu_h2f_ieee
// __gnu_f2h_ieee
// __truncdfhf2
// __extendhfxf2
//
// On Darwin platforms, these are provided by the host compiler-rt, but we
// can't depend on that everywhere, so we have to provide them in the Swift
Expand All @@ -30,10 +31,12 @@
// Android NDK <r21 do not provide `__aeabi_d2h` in the compiler runtime,
// provide shims in that case.
#if (defined(__ANDROID__) && defined(__ARM_ARCH_7A__) && defined(__ARM_EABI__)) || \
((defined(__i386__) || defined(__i686__) || defined(__x86_64__)) && !defined(__APPLE__) && !defined(__linux__))
(!defined(__APPLE__) && (defined(__i386__) || defined(__x86_64__)))

#include "swift/shims/Visibility.h"

#include <stdint.h>

static unsigned toEncoding(float f) {
unsigned e;
static_assert(sizeof e == sizeof f, "float and int must have the same size");
Expand Down Expand Up @@ -150,10 +153,23 @@ SWIFT_RUNTIME_EXPORT unsigned short __truncdfhf2(double d) {
return __gnu_f2h_ieee(f);
}

// F16C does not cover FP80 conversions, so we still need an implementation
// here.
#if (defined(__i386__) || defined(__x86_64__)) && \
!(defined(__ANDROID__) || defined(__APPLE__) || defined(_WIN32))

SWIFT_RUNTIME_EXPORT long double __extendhfxf2(short h) {
return (long double)__gnu_h2f_ieee(h);
}

#endif // (defined(__i386__) || defined(__x86_64__)) &&
// !(defined(__ANDROID__) || defined(__APPLE__) || defined(_WIN32))

#if defined(__ARM_EABI__)
SWIFT_RUNTIME_EXPORT unsigned short __aeabi_d2h(double d) {
return __truncdfhf2(d);
}
#endif

#endif // defined(__x86_64__) && !defined(__APPLE__)
#endif // (defined(__ANDROID__) && defined(__ARM_ARCH_7A__) && defined(__ARM_EABI__)) ||
// (!defined(__APPLE__) && (defined(__i386__) || defined(__x86_64__)))
2 changes: 1 addition & 1 deletion test/IRGen/ordering_x86.sil
Original file line number Diff line number Diff line change
Expand Up @@ -41,4 +41,4 @@ bb0:
// the order of features differs.

// X86_64: define{{( protected)?}} swiftcc void @baz{{.*}}#0
// X86_64: #0 = {{.*}}"target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+ssse3,+x87"
// X86_64: #0 = {{.*}}"target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're definitely not OK with bumping the baseline requirement for swift on x86 by ~4 years without a language workgroup and/or core team discussion. I think we should really try to fix this in a different manner.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that the Language WG should be involved in that - I think that @airspeedswift was relayed this though, and @rjmccall was also CC'ed for that.

Copy link
Member Author

@compnerd compnerd Jan 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should also mention that the f16c removes the __extendhfxf2 not the __truncsfhf2 and that the latter is something that I will restore for the time being. I think that one option might be to add __extendhfxf2 to the support routines (Linux specific - FP80)