Float16 (#1120)

stephentyrone · web-flow · commit b874d4b5e374 · 2020-02-06T16:34:42.000-08:00
* Float16 proposal
* Kick off Float16 review
diff --git a/proposals/0000-float16.md b/proposals/0000-float16.md
@@ -0,0 +1,125 @@
+# Float16
+
+* Proposal: [SE-0276](0276-float16.md)
+* Author: [Stephen Canon](https://github.com/stephentyrone)
+* Review Manager: [Ben Cohen](https://github.com/airspeedswift)
+* Pitch thread: [Add Float16](https://forums.swift.org/t/add-float16/33019)
+* Implementation: [Implement Float16](https://github.com/apple/swift/pull/21738)
+* Status: **Active review (6–17 February 2020)**
+
+## Introduction
+
+Introduce the `Float16` type conforming to the `BinaryFloatingPoint` and `SIMDScalar`
+protocols, binding the IEEE 754 *binary16* format (aka *float16*, *half-precision*, or *half*),
+and bridged by the compiler to the C `_Float16` type.
+
+* Old pitch thread: [Add `Float16`](https://forums.swift.org/t/add-float16/19370).
+
+## Motivation
+
+The last decade has seen a dramatic increase in the use of floating-point types smaller
+than (32-bit) `Float`. The most widely implemented is `Float16`, which is used
+extensively on mobile GPUs for computation, as a pixel format for HDR images, and as
+a compressed format for weights in ML applications.
+
+Introducing the type to Swift is especially important for interoperability with shader-language
+programs; users frequently need to set up data structures on the CPU to
+pass to their GPU programs. Without the type available in Swift, they are forced to use
+unsafe mechanisms to create these structures.
+
+In addition, C APIs that use these types simply cannot be imported, making them 
+unavailable in Swift.
+
+## Proposed solution
+
+Add `Float16` to the standard library.
+
+## Detailed design
+
+There is shockingly little to say here. We will add:
+```
+@frozen
+struct Float16: BinaryFloatingPoint, SIMDScalar, CustomStringConvertible { }
+```
+The entire API falls out from that, with no additional surface outside that provided by those 
+protocols. `Float16` will provide exactly the operations that `Float` and `Double` and `Float80` 
+do for their conformance to these protocols.
+
+We also need to ensure that the parameter passing conventions followed by the compiler
+for `Float16` are what we want; these values should be passed and returned in the
+floating-point registers, and vectors should be passed and returned in SIMD registers.
+
+On platforms that do not have native arithmetic support, we will convert `Float16` to
+`Float` and use the hardware support for `Float` to perform the operation. This is
+correctly-rounded for every operation except fused multiply-add. A software sequence
+will be used to emulate fused multiply-add in these cases (the easiest option is to convert
+to `Double`, but other options may be more efficient on some architectures, especially
+for vectors).
+
+## Source compatibility
+
+N/A
+
+## Effect on ABI stability
+
+There is no change to existing ABI. We will be introducing a new type, which will have
+appropriate availability annotations.
+
+## Effect on API resilience
+
+The `Float16` type would become part of the public API. It will be `@frozen`, so no
+further changes will be possible, but its API and layout are entirely constrained by
+IEEE 754 and conformance to `BinaryFloatingPoint`, so there are no alternatives
+possible anyway.
+
+## Alternatives considered
+
+Q: Why isn't it called `Half`?
+
+A: `FloatN` is the more consistent pattern. Swift already has `Float32`,
+`Float64` and `Float80` (with `Float` and `Double` as alternative spellings of `Float32`
+and `Float64`, respectively). At some future point we will add `Float128`. Plus, the C
+language type that this will bridge to is named `_Float16`.
+
+`Half` is not completely outrageous as an alias, but we shouldn't add aliases unless
+there's a really compelling reason.
+
+During the pitch phase, feedback was fairly overwhelmingly in favor of `Float16`, though
+there are a few people who would like to have both names available. Unless significantly
+more people come forward, however, we should make the "opinionated" choice to have a single
+name for the type. An alias could always be added with a subsequent minor proposal if
+necessary.
+
+Q: What about ARM's ["alternative half precision"](https://en.wikipedia.org/wiki/Half-precision_floating-point_format#ARM_alternative_half-precision)?
+What about [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format)?
+
+A: Alternative half-precision is no longer supported; ARMv8.2-FP16 only uses the IEEE 754
+encoding. Bfloat is something that we should *also* support eventually, but it's a separate
+proposal. and we should wait a little while before doing it. Conformance to IEEE 754 fully
+defines the semantics of `Float16` (and several hardware vendors, including Apple have
+been shipping processors that implement those semantics for a few years). By contrast,
+there are a few proposals for hardware that implements bfloat16; ARM and Intel designed
+different multiply-add units for it, and haven't shipped yet (and haven't defined any 
+operations *other* than a non-homogeneous multiply-add). Other companies have HW in
+use, but haven't (publicly) formally specified their arithmetic. It's a moving target, and it
+would be a mistake to attempt to specify language bindings today.
+
+Q: Do we need conformance to `BinaryFloatingPoint`? What if we made it a storage-only format?
+
+A: We could make it a type that can only be used to convert to/from `Float` and `Double`,
+forcing all arithmetic to be performed in another format. However, this would mean that
+it would be much harder, in some cases, to get the same numerical result on the CPU and
+GPU (when GPU computation is performed in half-precision). It's a very nice convenience
+to be able to do a computation in the REPL and see what a GPU is going to do.
+
+Q: Why not put it in Swift Numerics?
+
+A: The biggest reason to add the type is for interoperability with C-family and GPU programs
+that want to use their analogous types. In order to maximize the support for that interoperability,
+and to get the calling conventions that we want to have in the long-run, it makes more sense to put
+this type in the standard library.
+
+Q: What about math library support?
+
+A: If this proposal is approved, I will add conformance to `Real` in Swift Numerics, providing
+the math library functions (by using the corresponding `Float` implementations initially).