stdlib: excise the FP16 support routines on x64 #62988

compnerd · 2023-01-12T00:58:44Z

Bump the minimum supported CPU to ~Nehalem so that we can take advantage of the F16C/CVT16 extensions to support half-precision floating-point rounding. This should avoid the need to replicate the compiler-rt functionality in the runtime and the associated problems with linking compiler-rt builtins on non-Windows targets when performing static linking.

compnerd · 2023-01-12T01:00:19Z

CC: @stephentyrone @airspeedswift @bnbarham @etcwilde @eeckstein @3405691582

lib/ClangImporter/ClangImporter.cpp

stdlib/cmake/modules/AddSwiftStdlib.cmake

compnerd · 2023-01-12T01:30:53Z

CC: @rjmccall

stdlib/cmake/modules/AddSwiftStdlib.cmake

stdlib/public/runtime/Float16Support.cpp

lib/ClangImporter/ClangImporter.cpp

stephentyrone · 2023-01-12T13:32:19Z

This doesn’t actually solve the problem on its own; F16C provides conversions float <-> float16, but the compiler will still generate runtime calls for conversions to/from double and float80. I’ve worked around these in the stdlib for the conversions that were missing in macOS’s compiler-rt, but there are some others IIRC.
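One reason a conversion from double cannot simply be routed through the F16C float path is double rounding: going double -> float -> half rounds twice and can give a different result than rounding once. The sketch below illustrates that effect in portable software. It is not the stdlib or compiler-rt code; `roundToHalfBits`, `truncDirect`, `truncViaFloat`, and `checkDoubleRoundingExample` are hypothetical names, and the code assumes the default round-to-nearest-even floating-point environment.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not from the PR): round a double to the nearest
// IEEE binary16 value and return its bit pattern. Relies on std::nearbyint
// using the default round-to-nearest-even mode.
std::uint16_t roundToHalfBits(double x) {
  const std::uint16_t sign = std::signbit(x) ? 0x8000 : 0;
  const double a = std::fabs(x);
  if (std::isnan(a)) return sign | 0x7E00;  // quiet NaN
  if (std::isinf(a)) return sign | 0x7C00;  // infinity
  if (a == 0.0) return sign;                // signed zero

  int e = 0;
  std::frexp(a, &e);                        // a = m * 2^e with m in [0.5, 1)
  const int exp = std::max(e - 1, -14);     // clamp to the subnormal exponent
  // Scale so that one half-precision ulp equals 1.0, then round exactly once.
  const double r = std::nearbyint(std::ldexp(a, 10 - exp));
  // Adding r to the exponent field lets a mantissa carry bump the exponent.
  const long bits = (static_cast<long>(exp + 14) << 10) + static_cast<long>(r);
  // Anything at or beyond the infinity encoding overflows to infinity.
  return sign | static_cast<std::uint16_t>(std::min(bits, 0x7C00L));
}

// Round once, directly from double.
std::uint16_t truncDirect(double x) { return roundToHalfBits(x); }

// Round twice: double -> float (hardware cast), then float -> half.
std::uint16_t truncViaFloat(double x) {
  const float f = static_cast<float>(x);
  return roundToHalfBits(static_cast<double>(f));
}

// A value just above the midpoint between the halves 1.0 and 1.0 + 2^-10.
void checkDoubleRoundingExample() {
  const double x = 1.0 + std::ldexp(1.0, -11) + std::ldexp(1.0, -30);
  assert(truncDirect(x) == 0x3C01);    // rounds up, as a single rounding should
  assert(truncViaFloat(x) == 0x3C00);  // float lands on the tie, then ties-to-even
}
```

In the example value, the first rounding to float discards the small 2^-30 term and lands exactly on the midpoint, so the second rounding resolves a tie that the original value never had.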

compnerd · 2023-01-12T15:39:21Z

@stephentyrone - that is good to know! Do you have an example of that? I am hitting another issue in the compiler atm, so I haven't gotten to that point yet. At least via testing through clang, it appeared that the truncation and extension would get lowered to F16C.

compnerd · 2023-01-12T16:04:15Z

@swift-ci please test Windows platform

shahmishal · 2023-01-12T17:11:26Z

@swift-ci test

stephentyrone · 2023-01-12T17:39:07Z

@compnerd here's an example of a runtime call generated even with f16c (by clang): https://godbolt.org/z/6b7513on6

compnerd · 2023-01-12T17:40:39Z

Interesting; we should definitely add a test case for that - https://godbolt.org/z/nbGjYG1bs - doesn't generate the libcall on Windows as per the ABI. Seems like we will need to ensure that the builtins are linked on non-Windows targets.

stephentyrone · 2023-01-12T17:49:24Z

Here's one where you generate a call on Windows: https://godbolt.org/z/qsYqTvb1c

stephentyrone · 2023-01-12T18:11:37Z

I can work around that one in the stdlib for you, but more generally we need to be able to use the arithmetic builtins from the just-built compiler-rt, so making that work is the long-term solution we really want.

compnerd · 2023-01-12T18:27:28Z

Most of the math routines shouldn't be forming libcalls on Windows. FP16 operations are odd since there is no FP16 on Windows AFAIK. We should be forcing LLVM to lower that properly in the longer term rather than having to work around it at the library level.

stephentyrone · 2023-01-12T18:28:15Z

I'm not sure what "forcing LLVM to lower that properly" means other than generating a libcall; that's the proper lowering. We don't want to unconditionally expand everything that's a libcall on any platform inline.

compnerd · 2023-01-13T15:57:37Z

I was thinking that it should avoid the libcall on Windows specifically - as the OS doesn't expect to link compiler-rt ever (it should be similar to what MSVC does).

However, a trophy for you - you correctly identified that __truncdfhf2 is needed after the X86 CodeGen issue is resolved.

compnerd · 2023-01-13T19:50:57Z

Please test with following PRs:
swiftlang/llvm-project#5995

@swift-ci please test

compnerd · 2023-01-13T20:59:35Z

Please test with following PRs:
swiftlang/llvm-project#5995

@swift-ci please test Linux platform

compnerd · 2023-01-16T16:55:40Z

@swift-ci please test Windows platform

stephentyrone · 2023-01-17T17:09:29Z

test/IRGen/ordering_x86.sil

@@ -41,4 +41,4 @@ bb0:
 // the order of features differs.

 // X86_64: define{{( protected)?}} swiftcc void @baz{{.*}}#0
-// X86_64: #0 = {{.*}}"target-features"="+cx16,+cx8,+fxsr,+mmx,+sahf,+sse,+sse2,+sse3,+ssse3,+x87"
+// X86_64: #0 = {{.*}}"target-features"="+avx,+crc32,+cx16,+cx8,+f16c,+fxsr,+mmx,+popcnt,+sahf,+sse,+sse2,+sse3,+sse4.1,+sse4.2,+ssse3,+x87,+xsave"


We're definitely not OK with bumping the baseline requirement for swift on x86 by ~4 years without a language workgroup and/or core team discussion. I think we should really try to fix this in a different manner.

compnerd · 2023-01-17

Yeah, I think that the Language WG should be involved in that. I believe this was relayed to @airspeedswift, and @rjmccall was also CC'ed for that reason.

I should also mention that F16C removes the __extendhfxf2 call, not the __truncsfhf2 one, and that the latter is something that I will restore for the time being. I think that one option might be to add __extendhfxf2 to the support routines (Linux-specific, FP80).

compnerd · 2023-01-24T17:54:46Z

-- Testing: 8865 tests, 36 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

3 warning(s) in tests

Testing Time: 390.24s
  Unsupported      : 2104
  Passed           : 6884
  Expectedly Failed:   51

Test project T:/3
      Start  1: dispatch_apply
 1/22 Test  #1: dispatch_apply ...................   Passed    0.18 sec
      Start  2: dispatch_api
 2/22 Test  #2: dispatch_api .....................   Passed    0.14 sec
      Start  3: dispatch_debug
 3/22 Test  #3: dispatch_debug ...................   Passed    0.14 sec
      Start  4: dispatch_queue_finalizer
 4/22 Test  #4: dispatch_queue_finalizer .........   Passed    1.14 sec
      Start  5: dispatch_overcommit
 5/22 Test  #5: dispatch_overcommit ..............   Passed    0.15 sec
      Start  6: dispatch_context_for_key
 6/22 Test  #6: dispatch_context_for_key .........   Passed    0.14 sec
      Start  7: dispatch_after
 7/22 Test  #7: dispatch_after ...................   Passed    9.16 sec
      Start  8: dispatch_timer
 8/22 Test  #8: dispatch_timer ...................   Passed    2.15 sec
      Start  9: dispatch_timer_short
 9/22 Test  #9: dispatch_timer_short .............   Passed    1.40 sec
      Start 10: dispatch_timer_timeout
10/22 Test #10: dispatch_timer_timeout ...........   Passed    6.15 sec
      Start 11: dispatch_sema
11/22 Test #11: dispatch_sema ....................   Passed    0.22 sec
      Start 12: dispatch_timer_bit31
12/22 Test #12: dispatch_timer_bit31 .............   Passed    2.30 sec
      Start 13: dispatch_timer_bit63
13/22 Test #13: dispatch_timer_bit63 .............   Passed    1.15 sec
      Start 14: dispatch_timer_set_time
14/22 Test #14: dispatch_timer_set_time ..........   Passed    2.45 sec
      Start 15: dispatch_data
15/22 Test #15: dispatch_data ....................   Passed    0.15 sec
      Start 16: dispatch_io_muxed
16/22 Test #16: dispatch_io_muxed ................   Passed    1.36 sec
      Start 17: dispatch_io_net
17/22 Test #17: dispatch_io_net ..................   Passed    0.47 sec
      Start 18: dispatch_io_pipe
18/22 Test #18: dispatch_io_pipe .................   Passed   10.17 sec
      Start 19: dispatch_io_pipe_close
19/22 Test #19: dispatch_io_pipe_close ...........   Passed    0.16 sec
      Start 20: dispatch_select
20/22 Test #20: dispatch_select ..................   Passed    0.22 sec
      Start 21: dispatch_c99
21/22 Test #21: dispatch_c99 .....................   Passed    0.15 sec
      Start 22: dispatch_plusplus
22/22 Test #22: dispatch_plusplus ................   Passed    0.15 sec

100% tests passed, 0 tests failed out of 22

Total Test time (real) =  39.81 sec

The following tests FAILED:
	1281 - TestFoundation.TestNSString-test_NSHomeDirectoryForUser (Failed)
	1283 - TestFoundation.TestNSString-test_expandingTildeInPath (Failed)
	1347 - TestFoundation.TestProcess-test_multiProcesses (Failed)
	1377 - TestFoundation.TestURL-test_URLByResolvingSymlinksInPathShouldUseTheCurrentDirectory (Failed)
	1378 - TestFoundation.TestURL-test_resolvingSymlinksInPathShouldAppendTrailingSlashWhenExistingDirectory (Failed)
	1379 - TestFoundation.TestURL-test_resolvingSymlinksInPathShouldResolveSymlinks (Failed)

[0/1][  0%][0.000s] Running XCTest functional test suite
-- Testing: 26 tests, 26 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90.. 

Testing Time: 24.11s
  Passed: 26

The Foundation test failures are intriguing, as I am seeing only 2 failures locally. But it seems that Swift, libdispatch, and XCTest are good. The Foundation test failures seem to match the current ones in CI, so I think that Windows is still looking good, assuming we figure out the FP16 issue.

compnerd · 2023-01-24T18:02:31Z

swift-corelibs-foundation\Tests\Foundation\Tests\TestNSString.swift:1009: error: TestNSString.test_NSHomeDirectoryForUser :
  XCTAssertEqual failed: ("Optional("C:/Users/swift-ci")") is not equal to ("nil") - 

swift-corelibs-foundation\Tests\Foundation\Tests\TestNSString.swift:1036: error: TestNSString.test_expandingTildeInPath :
  XCTAssertEqual failed: ("~swift-ci") is not equal to ("C:/Users/swift-ci") - Could resolve home directory for specific user

swift-corelibs-foundation\Tests\Foundation\Tests\TestProcess.swift:803: error: TestProcess.test_multiProcesses :
  XCTAssertEqual failed: ("T:/4/TestFoundation.app") is not equal to ("C:/Users/swift-ci/jenkins/workspace/swift-PR-build-toolchain-windows/build/4/TestFoundation.app") - 

swift-corelibs-foundation\Tests\Foundation\Tests\TestURL.swift:562: error: TestURL.test_URLByResolvingSymlinksInPathShouldUseTheCurrentDirectory :
  XCTAssertEqual failed: ("file:///C:/Users/swift-ci/jenkins/workspace/swift-PR-build-toolchain-windows/build/tmp/org.swift.TestFoundation.TestURL.resourceValues.464/foo/bar/baz") is not equal to ("file:///T:/tmp/org.swift.TestFoundation.TestURL.resourceValues.464/foo/bar/baz") - 

swift-corelibs-foundation\Tests\Foundation\Tests\TestURL.swift:576: error: TestURL.test_resolvingSymlinksInPathShouldAppendTrailingSlashWhenExistingDirectory :
  XCTAssertEqual failed: ("file:///C:/Users/swift-ci/jenkins/workspace/swift-PR-build-toolchain-windows/build/tmp/org.swift.TestFoundation.TestURL.resourceValues.7768/") is not equal to ("file:///T:/tmp/org.swift.TestFoundation.TestURL.resourceValues.7768/") - 

swift-corelibs-foundation\Tests\Foundation\Tests\TestURL.swift:591: error: TestURL.test_resolvingSymlinksInPathShouldResolveSymlinks :
  XCTAssertEqual failed: ("file:///C:/Users/swift-ci/jenkins/workspace/swift-PR-build-toolchain-windows/build/tmp/org.swift.TestFoundation.TestURL.resourceValues.5012/destination") is not equal to ("file:///T:/tmp/org.swift.TestFoundation.TestURL.resourceValues.5012/destination") -

Some text editing of the actual failures here indicates that they are the two that I see locally, plus 4 related to symlink handling, where the expectation doesn't match because the build is on a different drive than the source.

compnerd · 2023-01-26T18:27:58Z

@swift-ci please build toolchain Windows platform

bnbarham · 2023-01-26T19:19:56Z

@swift-ci please test Linux platform

compnerd · 2023-01-27T15:52:30Z

@swift-ci please test macOS platform

Bump the minimum supported CPU to IvyBridge so that we can take advantage of the F16C/CVT16 extensions to support half floating point rounding. This avoids the undefined reference to `__extendsfhf2` in the FP16 support.

compnerd · 2023-01-30T16:38:51Z

@swift-ci please test

@al45tair

Add a simple special case implementation of the FP16 routines for the FP80 extension. Implementation by @al45tair!

compnerd · 2023-01-30T18:13:23Z

@swift-ci please test

compnerd · 2023-01-30T20:49:52Z

@swift-ci please test

compnerd · 2023-02-01T17:09:14Z

@swift-ci please test

compnerd · 2023-02-01T18:53:11Z

@swift-ci please test

compnerd · 2023-02-01T20:15:24Z

@swift-ci please test Linux platform

compnerd · 2023-02-01T22:29:53Z

@swift-ci please test Linux platform

bnbarham · 2023-02-02T00:16:13Z

@swift-ci please test Linux platform

compnerd · 2023-02-02T04:54:01Z

@swift-ci please test Linux platform

compnerd · 2023-02-03T18:44:01Z

I think that this is no longer needed!

lin72h · 2023-02-04T10:52:01Z

I've been watching the progress on it, can you elaborate more on the reason?

etcwilde · 2023-02-05T16:34:08Z

@lin72h
> I've been watching the progress on it, can you elaborate more on the reason?

@al45tair fixed it without requiring bumping the minimum CPU version in 125e54f and df1891e. Not having to bump the minimum processor version is desirable.

lin72h · 2023-02-06T13:22:22Z

@etcwilde perfect, Thanks

compnerd requested review from zoecarver, hyp and egorzhdan as code owners January 12, 2023 00:58

bnbarham reviewed Jan 12, 2023

lib/ClangImporter/ClangImporter.cpp

bnbarham reviewed Jan 12, 2023

stdlib/cmake/modules/AddSwiftStdlib.cmake (outdated)

finagolfin reviewed Jan 12, 2023

stdlib/cmake/modules/AddSwiftStdlib.cmake

finagolfin reviewed Jan 12, 2023

stdlib/public/runtime/Float16Support.cpp (outdated)

stephentyrone reviewed Jan 12, 2023

lib/ClangImporter/ClangImporter.cpp (outdated)

compnerd force-pushed the cvt16 branch from 14e7df3 to 57fdf30 Compare January 13, 2023 20:59

compnerd mentioned this pull request Jan 15, 2023

test: adjust path separator for tests #63027

Merged

compnerd force-pushed the cvt16 branch from 57fdf30 to 923c876 Compare January 16, 2023 19:02

stephentyrone reviewed Jan 17, 2023


compnerd force-pushed the cvt16 branch from 923c876 to a6cf09b Compare January 17, 2023 18:18

compnerd force-pushed the cvt16 branch from a6cf09b to da4e937 Compare January 30, 2023 16:01

X86: bump the minimum required CPU to IvyBridge

523e178

Bump the minimum supported CPU to IvyBridge so that we can take advantage of the F16C/CVT16 extensions to support half floating point rounding. This avoids the undefined reference to `__extendsfhf2` in the FP16 support.

compnerd force-pushed the cvt16 branch from da4e937 to 523e178 Compare January 30, 2023 16:38

runtime: add __extendhfxf2 implementation

a6de6d4

Add a simple special case implementation of the FP16 routines for the FP80 extension. Implementation by @al45tair!
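As a rough illustration of what a half-to-FP80 extension routine has to do, the sketch below widens an IEEE binary16 bit pattern to `long double` in portable C++. It is not the code from commit a6de6d4; `extendHalfBits` is a hypothetical name, the sketch takes raw bits instead of a native half type, and it does not preserve NaN payloads. On x86 Linux `long double` is the 80-bit x87 format, but the sketch makes no use of that layout.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical sketch: widen an IEEE binary16 bit pattern to long double.
// Every finite half value is exactly representable in the wider type, so
// the conversion never rounds.
long double extendHalfBits(std::uint16_t h) {
  const bool negative = (h & 0x8000) != 0;
  const int exponent = (h >> 10) & 0x1F;  // 5-bit biased exponent
  const int mantissa = h & 0x3FF;         // 10-bit stored fraction

  long double value;
  if (exponent == 0x1F) {
    // All-ones exponent: infinity, or NaN when the fraction is non-zero.
    value = mantissa ? std::numeric_limits<long double>::quiet_NaN()
                     : std::numeric_limits<long double>::infinity();
  } else if (exponent == 0) {
    // Zero or subnormal: no implicit leading one, fixed scale of 2^-24.
    value = std::ldexp(static_cast<long double>(mantissa), -24);
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 25) restores the implicit one.
    value = std::ldexp(static_cast<long double>(mantissa + 1024), exponent - 25);
  }
  return negative ? -value : value;
}
```

Because the extension is exact, the only cases that need care are the exponent encodings (zero, subnormal, infinity, NaN), which is why a small special-case routine is sufficient for this direction.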

Update Float16Support.cpp

593bfd9

Update Float16Support.cpp

d61181b

Update Float16Support.cpp

2e33d2f

Update Float16Support.cpp

975b436

Update Float16Support.cpp

5dde7e4

Update Float16Support.cpp

3aad090

compnerd closed this Feb 3, 2023

compnerd deleted the cvt16 branch February 3, 2023 18:44
