-
Notifications
You must be signed in to change notification settings - Fork 10.5k
stdlib: excise the FP16 support routines on x64 #62988
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
CC: @rjmccall |
This doesn’t actually solve the problem on its own; F16C provides conversions float <-> float16, but the compiler will still generate runtime calls for conversions to/from double and float80. I’ve worked around these in the stdlib for the conversions that were missing in macOS’s compiler-rt, but there are some others IIRC. |
@stephentyrone - that is good to know! Do you have an example of that? I am hitting another issue in the compiler atm, so I haven't gotten to that point yet. At least via testing through clang, it appeared that the truncation and extension would get lowered to F16C. |
@swift-ci please test Windows platform |
@swift-ci test |
@compnerd here's an example of a runtime call generated even with f16c (by clang): https://godbolt.org/z/6b7513on6 |
Interesting; we should definitely add a test case for that - https://godbolt.org/z/nbGjYG1bs - doesn't generate the libcall on Windows as per the ABI. Seems like we will need to ensure that the builtins are linked on non-Windows targets. |
Here's one where you generate a call on Windows: https://godbolt.org/z/qsYqTvb1c |
I can work around that one in the stdlib for you, but more generally we need to be able to use the arithmetic builtins from the just-built compiler-rt, so making that work is the long-term solution we really want. |
Most of the math routines shouldn't be forming libcalls on Windows. FP16 operations are odd since there is no FP16 on Windows AFAIK. We should be forcing LLVM to lower that properly in the longer term rather than having to workaround it at the library level. |
I'm not sure what "forcing LLVM to lower that properly" means other than generating a libcall; that's the proper lowering. We don't want to unconditionally expand everything that's a libcall on any platform inline. |
I was thinking that it should avoid the libcall on Windows specifically - as the OS doesn't expect to link compiler-rt ever (it should be similar to what MSVC does). However, a trophy for you - you correctly identified that |
Please test with following PRs: @swift-ci please test |
Please test with following PRs: @swift-ci please test Linux platform |
@swift-ci please test Windows platform |
@@ -41,4 +41,4 @@ bb0: | |||
// the order of features differs. | |||
|
|||
// X86_64: define{{( protected)?}} swiftcc void @baz{{.*}}#0 | |||
// X86_64: #0 = {{.*}}"target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+ssse3,+x87" | |||
// X86_64: #0 = {{.*}}"target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're definitely not OK with bumping the baseline requirement for swift on x86 by ~4 years without a language workgroup and/or core team discussion. I think we should really try to fix this in a different manner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that the Language WG should be involved in that - I think that @airspeedswift was relayed this though, and @rjmccall was also CC'ed for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should also mention that the f16c removes the __extendhfxf2
not the __truncsfhf2
and that the latter is something that I will restore for the time being. I think that one option might be to add __extendhfxf2
to the support routines (Linux specific - FP80)
The Foundation test failures are intriguing as locally, I am seeing only 2 failures. But seems that Swift, libdispatch, and XCTest are good. The Foundation test failures seem to match the current ones in CI, so I think that Windows is still looking good assuming we figure out the FP16 issue. |
Some text editing with the actual failures here indicates that the failures are the two that I see locally, and 4 related to symlink handling where the expectation doesn't match due to the build being on a different drive than the source. |
@swift-ci please build toolchain Windows platform |
@swift-ci please test Linux platform |
@swift-ci please test macOS platform |
Bump the minimum supported CPU to IvyBridge so that we can take advantage of the F16C/CVT16 extensions to support half floating point rounding. This avoids the undefined reference to `__extendsfhf2` in the FP16 support.
@swift-ci please test |
Add a simple special case implementation of the FP16 routines for the FP80 extension. Implementation by @al45tair!
@swift-ci please test |
@swift-ci please test |
@swift-ci please test |
@swift-ci please test |
@swift-ci please test Linux platform |
@swift-ci please test Linux platform |
1 similar comment
@swift-ci please test Linux platform |
@swift-ci please test Linux platform |
I think that this is no longer needed! |
I've been watching the progress on it, can you elaborate more on the reason? |
@etcwilde perfect, Thanks |
Bump the minimum supported CPU to ~Nahelem so that we can take advantage of the F16C/CVT16 extensions to support half floating point rounding. This should avoid the need to replicate the compiler-rt functionality in the runtime and the associated problems with linking compiler-rt builtins on non-Windows targets when performing static linking.