Make Float16 available for macOS on Apple Silicon #34821

Merged

stephentyrone
Contributor

Due to an unstable (and undesirable) calling convention in the LLVM layer for x86, I had previously marked Float16 unconditionally unavailable on macOS. My hope was that Intel would stabilize the calling convention and we could make it available on both macOS platforms at the same time. Unfortunately, that hasn't happened, and we want to make the type available for macOS/arm users.

So, I am making the availability mirror Float80--the type will be unavailable for macOS on x86_64, and available on all other platforms (the other x86 platforms don't have a binary-stability guarantee to worry about). This isn't ideal. In particular, if/when the calling conventions for Float16 stabilize in LLVM, we would want to make the type available, but it would then have different availability for different architectures of macOS, which the current availability system is not well-equipped to handle (it's possible, but not very ergonomic). Nonetheless, this seems like the best option.

The good news is that because the full API is already built in Swift (and simply marked unavailable), we can simply add macOS 11.0 availability for these API and it will work.
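
As a minimal sketch of what this availability scheme means for client code: the compile-time condition below mirrors the one quoted later in this thread, and the `@available` attribute uses the OS versions from the diff in this PR. The names `float16IsUsable` and `halfRoundTrip` are hypothetical and exist only for illustration; they are not part of the standard library or of this change.

```swift
// Hypothetical flag: true when Float16 can be named in this build configuration.
#if (os(macOS) || targetEnvironment(macCatalyst)) && arch(x86_64)
// Intel macOS: Float16 is marked unavailable, so code must not reference it.
let float16IsUsable = false
#else
let float16IsUsable = true

// Hypothetical helper: rounds a Float to the nearest half-precision value
// and widens the result back to Float.
@available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *)
func halfRoundTrip(_ x: Float) -> Float {
    return Float(Float16(x))
}
#endif
```

On Apple platforms, callers whose deployment target is older than the versions above would also need an `if #available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *)` check around any call to `halfRoundTrip`.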

@stephentyrone
Contributor Author

@swift-ci please test

@@ -136,7 +135,7 @@ extension ${Self}: LosslessStringConvertible {
 %if bits == 16:
 self.init(Substring(text))
 %else:
-if #available(macOS 10.16, iOS 14.0, watchOS 7.0, tvOS 14.0, *) {
+if #available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *) {
Contributor Author

@tbkka I think we made a change here that hasn't been pushed upstream for whatever reason--do you want to bring the two in sync now?

Contributor

Yes, we should sync these up.

Contributor Author

Do you want to put up a PR and tag me to review?

Member

@lorentey lorentey left a comment

Looks good! 👍

Using these will be a pain on macOS for a while, but partial availability is still better than none at all.

@stephentyrone stephentyrone merged commit abea46e into swiftlang:main Nov 19, 2020
@stephentyrone stephentyrone deleted the enable-float16-on-apple-silicon branch November 19, 2020 14:36
ainu-bot added a commit to google/swift that referenced this pull request Nov 19, 2020
* 'main' of github.com:apple/swift:
  Make Float16 available for macOS on Apple Silicon (swiftlang#34821)
  [concurrency] Correctly handle dead-end and unreachable blocks in OptimizeHopToExecutor
  [Concurrency] Remove the C++ runtime test for swift_future_task_wait.
  [Concurrency] Implement Task.Handle.get() in terms of an async runtime call.
  SILGen: Update emitForeignToNativeThunk to handle async methods.
@xedin
Contributor

xedin commented Nov 20, 2020

@stephentyrone @lorentey Looks like this has caused a regression in the Surge and Kingfisher projects. The source compat suite failed here - #34834. It looks like the main branch source compat suite just picked that commit up and is going to start failing as well - https://ci.swift.org/view/Source%20Compatibility/job/swift-main-source-compat-suite/5600/

@xedin
Contributor

xedin commented Nov 20, 2020

I'm not sure what the action plan is here - XFAIL the projects or fix Accelerate?

@lorentey
Member

Interesting! ci.swift.org currently builds with Xcode: 12.2 Beta 3 (12B5035g), whose Accelerate overlay includes the following:

@available(iOS 14, tvOS 14, watchOS 7, *)
@available(OSX, unavailable)
@available(macCatalyst, unavailable)
extension Float16 : Accelerate.BNNSScalar {
  public static var bnnsDataType: Accelerate.BNNSDataType {
    get
  }
}

This looks just fine to me. Perhaps the two flavors of unavailability differ somehow?

@lorentey
Member

Oh, I guess it could be because the x86_64 macOS interface now declares Float16 unavailable everywhere, including non-macOS platforms.

@AnthonyLatsis
Collaborator

AnthonyLatsis commented Nov 20, 2020

> Oh, I guess it could be because the x86_64 macOS interface now declares Float16 unavailable everywhere, including non-macOS platforms.

Yeah. I think we should keep the old availability for the macOS x86_64 interface:

...

% if bits == 16:
@available(iOS 14, tvOS 14, watchOS 7, *)
@available(OSX, unavailable)
@available(macCatalyst, unavailable)
% else:
@available(*, unavailable, message: "${Self} is not available on target platform.")
% end
public struct ${Self} {

@stephentyrone
Contributor Author

stephentyrone commented Nov 20, 2020

Yeah, I'll put the old availability back on. Then we should work with the Accelerate team to fix the overlay, and switch the availability back once CI is using an Xcode with the updated overlay.

@philipturner
Contributor

philipturner commented Jul 21, 2022

@stephentyrone

I'm trying to bring a GPU framework to macOS, where half precision is used in shaders. I can avoid doing actual Float16 computations on the CPU, but I want to use Float16 in the generic signature of a GPU-backed buffer. Here's my dilemma:

#if !((os(macOS) || targetEnvironment(macCatalyst)) && arch(x86_64))
test(Tensor<Float16>.incremented, input: 42, expected: 43)
#endif
test(Tensor<Float>.incremented, input: 42, expected: 43)
test(Tensor<Int8>.incremented, input: 42, expected: 43)
test(Tensor<Int16>.incremented, input: 42, expected: 43)
test(Tensor<Int32>.incremented, input: 42, expected: 43)
test(Tensor<UInt8>.incremented, input: 42, expected: 43)
test(Tensor<UInt16>.incremented, input: 42, expected: 43)

I want people with Intel Macs to take full advantage of half-precision processing power on their GPUs. I'm planning to create a swift-intel-float16 package, which brings Float16 to Intel Macs through conditional package dependencies. It will play out similarly to swift-reflection-mirror, which was Stdlib copypasta that enabled @_spi(Reflection) on release toolchains.

Is there anything I should look out for? I'm not trying to create ABI-stable Xcode frameworks. I just want users to compile my code via SwiftPM and play around with Swift for TensorFlow.
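
One way to keep Float16 in a generic signature while leaving Intel macOS builds intact is to gate only the Float16 conformance, so the generic code itself never names the type. The sketch below assumes a hypothetical `BufferScalar` protocol standing in for the scalar constraint of a GPU-backed buffer; neither it nor `byteWidth` comes from the thread or from any real framework.

```swift
// Hypothetical protocol: the constraint a GPU-backed buffer places on its scalar type.
protocol BufferScalar {
    static var byteWidth: Int { get }
}

extension BufferScalar {
    // Default: the number of bytes one element occupies in a contiguous buffer.
    static var byteWidth: Int { MemoryLayout<Self>.stride }
}

// These conformances are unconditional on every platform.
extension Float: BufferScalar {}
extension Int8: BufferScalar {}
extension Int32: BufferScalar {}

// Only the Float16 conformance is gated, using the same condition as the
// test list above plus the OS versions from this PR.
#if !((os(macOS) || targetEnvironment(macCatalyst)) && arch(x86_64))
@available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *)
extension Float16: BufferScalar {}
#endif
```

Generic code constrained to `BufferScalar` then compiles everywhere, and only call sites that spell out `Float16` need the `#if` guard.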

@stephentyrone
Contributor Author

stephentyrone commented Jul 21, 2022

If you don’t need ABI stability, you shouldn’t have any major issues. If you’re just marshaling data for GPU buffers though, it’s going to get you much better performance if you use Float for any CPU side work and then run through bulk-conversion API (as in vImage or BNNS) to convert all the data to Float16 to ship to GPU.
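
The suggestion above (do CPU-side work in Float, then convert in bulk before upload) can be sketched as follows. This is a portable stand-in, not the vImage or BNNS API: it uses the standard library's `Float16` initializer element by element, which shows the shape of the approach but would not match the performance of a real bulk-conversion routine. The helper name `halfBitPatterns(from:)` is hypothetical.

```swift
#if !((os(macOS) || targetEnvironment(macCatalyst)) && arch(x86_64))
// Hypothetical helper: converts CPU-side Float data to half-precision bit
// patterns in a single pass, ready to be copied into a GPU buffer.
// Assumption: a real implementation would call a bulk-conversion API instead.
@available(macOS 11.0, iOS 14.0, watchOS 7.0, tvOS 14.0, *)
func halfBitPatterns(from values: [Float]) -> [UInt16] {
    return values.map { Float16($0).bitPattern }
}
#endif
```

Returning `[UInt16]` keeps `Float16` out of the function's signature, so only the conversion step itself depends on the type being available.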
