-
Notifications
You must be signed in to change notification settings - Fork 2.4k
[pitch] Span
-providing properties in the standard library
#2620
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
6984ba3
ac2f2db
670ac2d
a6744a7
94f0002
a70244f
79d15a3
ade59ca
bf7a3d3
31f8031
49668a2
9b1b015
325e3ac
a8c8559
c4f3921
1415c2f
a0dcdf1
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,295 @@ | ||
# Add `Span`-providing Properties to Standard Library Types | ||
|
||
* Proposal: [PR-2620](https://github.com/swiftlang/swift-evolution/pull/2620) | ||
* Author: [Guillaume Lessard](https://github.com/glessard) | ||
* Review Manager: (tbd) | ||
* Status: **Pitch** | ||
* Roadmap: [BufferView Roadmap](https://forums.swift.org/t/66211) | ||
* Bug: rdar://137710901 | ||
* Implementation: (tbd) | ||
* Review: [pitch](https://forums.swift.org/t/76138) | ||
|
||
[SE-0446]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0446-non-escapable.md | ||
[SE-0447]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0447-span-access-shared-contiguous-storage.md | ||
[PR-2305]: https://github.com/swiftlang/swift-evolution/pull/2305 | ||
[SE-0453]: https://github.com/swiftlang/swift-evolution/blob/main/proposals/0453-vector.md | ||
|
||
## Introduction | ||
|
||
We recently [introduced][SE-0447] the `Span` and `RawSpan` types, but did not provide ways to obtain instances of either from existing types. This proposal adds properties that vend a lifetime-dependent `Span` from a variety of standard library types, as well as vend a lifetime-dependent `RawSpan` when the underlying element type supports it. | ||
|
||
## Motivation | ||
|
||
Many standard library container types can provide direct access to their internal representation. Up to now, it has only been possible to do so in an unsafe way. The standard library provides this unsafe functionality with closure-taking functions such as `withUnsafeBufferPointer()`, `withContiguousStorageIfAvailable()` and `withUnsafeBytes()`. These functions have a few different drawbacks, most prominently their reliance on unsafe types, which makes them unpalatable in security-conscious environments. Closure-taking API can also be difficult to compose with new features and with one another. These issues are addressed head-on with non-escapable types in general, and `Span` in particular. With this proposal, compatible standard library types will provide access to their internal representation via computed properties of type `Span` and `RawSpan`. | ||
|
||
## Proposed solution | ||
|
||
Computed properties returning [non-escapable][SE-0446] copyable values represent a particular case of lifetime relationships between two bindings. While initializing a non-escapable value in general requires [lifetime annotations][PR-2305] in order to correctly describe the lifetime relationship, the specific case of computed properties returning non-escapable copyable values can only represent one type of relationship between the parent binding and the non-escapable instance it provides: a borrowing relationship. | ||
|
||
For example, in the example below we have an instance of type `A`, with a well-defined lifetime because it is non-copyable. An instance of `A` can provide access to a type `B` which borrows the instance `A`: | ||
|
||
```swift | ||
struct A: ~Copyable, Escapable {} | ||
struct B: ~Escapable, Copyable { | ||
init(_ a: borrowing A) {} | ||
} | ||
extension A { | ||
var b: B { B(self) } | ||
} | ||
|
||
func function() { | ||
var a = A() | ||
var b = a.b // access to `a` begins here | ||
read(b) | ||
// `b` has ended here, ending access to `a` | ||
modify(&a) // `modify()` can have exclusive access to `a` | ||
} | ||
``` | ||
If we were to attempt using `b` again after the call to `modify(&a)`, the compiler would report an overlapping access error, due to attempting to mutate `a` (with `modify(&a)`) while it is already being accessed through `b`'s borrow. Note that the copyability of `B` means that it cannot represent a mutation of `A`; it therefore represents a non-exclusive borrowing relationship. | ||
|
||
Given this, we propose to enable the definition of a borrowing relationship via a computed property. With this feature we then propose to add `storage` computed properties to standard library types that can share their internal typed storage, as well as `bytes` computed properties to those standard library types that can safely share their internal storage as untyped memory. | ||
|
||
One of the purposes of `Span` is to provide a safer alternative to `UnsafeBufferPointer`. This proposal builds on it and allows us to rewrite code reliant on `withUnsafeBufferPointer()` to use `storage` properties instead. Eventually, code that requires access to contiguous memory can be rewritten to use `Span`, gaining better composability in the process. For example: | ||
|
||
```swift | ||
let result = try myArray.withUnsafeBufferPointer { buffer in | ||
let indices = findElements(buffer) | ||
var myResult = MyResult() | ||
for i in indices { | ||
try myResult.modify(buffer[i]) | ||
} | ||
} | ||
``` | ||
|
||
This closure-based call is difficult to evolve, such as making `result` have a non-copyable type, adding a concurrent task, or adding typed throws. An alternative based on a vended `Span` property would look like this: | ||
|
||
```swift | ||
let span = myArray.storage | ||
let indices = findElements(span) | ||
var myResult = MyResult() | ||
for i in indices { | ||
try myResult.modify(span[i]) | ||
} | ||
``` | ||
|
||
In this version, code evolution is not constrained by a closure. Incorrect escapes of `span` will be diagnosed by the compiler, and the `modify()` function can be updated with typed throws, concurrency or other features as necessary. | ||
|
||
## Detailed Design | ||
|
||
Computed property getters returning non-escapable and copyable types (`~Escapable & Copyable`) become possible, requiring no additional annotations. The lifetime of their returned value depends on the type vending them. A `~Escapable & Copyable` value borrows another binding. In terms of the law of exclusivity, a borrow is a read-only access. Multiple borrows are allowed to overlap, but cannot overlap with any mutation. | ||
|
||
A computed property getter defined on an `Escapable` type and returning a `~Escapable & Copyable` value establishes a borrowing lifetime relationship of the returned value on the callee's binding. As long as the returned value exists (including local copies,) then the callee's binding remains borrowed. | ||
|
||
A computed property getter defined on a non-escapable and copyable (`~Escapable & Copyable`) type and returning a `~Escapable & Copyable` value copies the lifetime dependency of the callee. The returned value becomes an additional borrow of the callee's dependency, but is otherwise independent from the callee. | ||
|
||
A computed property getter defined on a non-escapable and non-copyable (`~Escapable & ~Copyable`) type returning a `~Escapable & Copyable` value establishes a borrowing lifetime relationship of the returned value on the callee's binding. As long as the returned value exists (including local copies,) then the callee's binding remains borrowed. | ||
|
||
By allowing the language to define lifetime dependencies in these limited ways, we can add `Span`-providing properties to standard library types. | ||
|
||
#### <a name="extensions"></a>Extensions to Standard Library types | ||
|
||
The standard library and Foundation will provide `storage` and computed properties, returning lifetime-dependent `Span` instances. These computed properties are the safe and composable replacements for the existing `withUnsafeBufferPointer` closure-taking functions. | ||
|
||
```swift | ||
extension Array { | ||
/// Share this `Array`'s elements as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension ArraySlice { | ||
/// Share this `Array`'s elements as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension ContiguousArray { | ||
/// Share this `Array`'s elements as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension String.UTF8View { | ||
/// Share this `UTF8View`'s code units as a `Span` | ||
var storage: Span<Unicode.UTF8.CodeUnit> { get } | ||
} | ||
|
||
extension Substring.UTF8View { | ||
/// Share this `UTF8View`'s code units as a `Span` | ||
var storage: Span<Unicode.UTF8.CodeUnit> { get } | ||
} | ||
|
||
extension CollectionOfOne { | ||
/// Share this `Collection`'s element as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension SIMD_N_ { // where _N_ ∈ {2, 3, 4 ,8, 16, 32, 64} | ||
/// Share this vector's elements as a `Span` | ||
var storage: Span<Scalar> { get } | ||
} | ||
|
||
extension KeyValuePairs { | ||
/// Share this `Collection`'s elements as a `Span` | ||
var storage: Span<(Key, Value)> { get } | ||
} | ||
``` | ||
|
||
Conditionally to the acceptance of [`Vector`][SE-0453], we will also add the following: | ||
|
||
```swift | ||
extension Vector where Element: ~Copyable { | ||
/// Share this vector's elements as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
``` | ||
|
||
#### Accessing the raw bytes of a `Span` | ||
|
||
When a `Span`'s element is `BitwiseCopyable`, we allow viewing the underlying storage as raw bytes with `RawSpan`: | ||
|
||
```swift | ||
extension Span where Element: BitwiseCopyable { | ||
/// Share the raw bytes of this `Span`'s elements | ||
var bytes: RawSpan { get } | ||
} | ||
``` | ||
|
||
The returned `RawSpan` instance will borrow the same binding as is borrowed by the `Span`. | ||
|
||
#### Extensions to unsafe buffer types | ||
|
||
We hope that `Span` and `RawSpan` will become the standard ways to access shared contiguous memory in Swift, but current API provide `UnsafeBufferPointer` and `UnsafeRawBufferPointer` instances to do this. We will provide ways to unsafely obtain `Span` and `RawSpan` instances from them, in order to bridge `UnsafeBufferPointer` to contexts that use `Span`, or `UnsafeRawBufferPointer` to contexts that use `RawSpan`. | ||
|
||
```swift | ||
extension UnsafeBufferPointer { | ||
/// Unsafely view this buffer as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension UnsafeMutableBufferPointer { | ||
/// Unsafely view this buffer as a `Span` | ||
var storage: Span<Element> { get } | ||
} | ||
|
||
extension UnsafeRawBufferPointer { | ||
/// Unsafely view this raw buffer as a `RawSpan` | ||
var bytes: RawSpan { get } | ||
} | ||
|
||
extension UnsafeMutableRawBufferPointer { | ||
/// Unsafely view this raw buffer as a `RawSpan` | ||
var bytes: RawSpan { get } | ||
} | ||
``` | ||
|
||
All of these unsafe conversions return a value whose lifetime is dependent on the _binding_ of the UnsafeBufferPointer. Note that this does not keep the underlying memory alive, as usual where the `UnsafePointer` family of types is involved. The programmer must ensure the following invariants for as long as the `Span` or `RawSpan` binding is valid: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This sentence is somewhat mysterious:
Here's an attempted concrete explanation... Swift does not manage the lifetime of unsafe buffer values. Consequently, when an unsafe buffer type provides a For example, if the unsafe buffer is a function argument, then it's storage is valid within the function body and can even be returned if the function's result depends on the argument:
It is an error to use the unsafe buffer's storage outside of it's variable's lexical scope, regardless of the buffer's origin:
|
||
|
||
- The underlying memory remains initialized. | ||
- The underlying memory is not mutated. | ||
|
||
Failure to keep these invariants results in undefined behaviour. | ||
|
||
#### Extensions to `Foundation.Data` | ||
|
||
While the `swift-foundation` package and the `Foundation` framework are not governed by the Swift evolution process, `Data` is similar in use to standard library types, and the project acknowledges that it is desirable for it to have similar API when appropriate. Accordingly, we would add the following properties to `Foundation.Data`: | ||
|
||
```swift | ||
extension Foundation.Data { | ||
// Share this `Data`'s bytes as a `Span` | ||
var storage: Span<UInt8> { get } | ||
|
||
// Share this `Data`'s bytes as a `RawSpan` | ||
var bytes: RawSpan { get } | ||
} | ||
``` | ||
|
||
Unlike with the standard library types, we plan to have a `bytes` property on `Foundation.Data` directly. This type conceptually consists of untyped bytes, and `bytes` is likely to be the primary way to directly access its memory. As `Data`'s API presents its storage as a collection of `UInt8` elements, we provide both `bytes` and `storage`. Types similar to `Data` may choose to provide both typed and untyped `Span` properties. | ||
|
||
#### <a name="performance"></a>Performance | ||
|
||
The `storage` and `bytes` properties should be performant and return their `Span` or `RawSpan` with very little work, in O(1) time. This is the case for all native standard library types. There is a performance wrinkle for bridged `Array` and `String` instances on Darwin-based platforms, where they can be bridged to Objective-C types that do not guarantee contiguous storage. In such cases the implementation will eagerly copy the underlying data to the native Swift form, and return a `Span` or `RawSpan` pointing to that copy. | ||
|
||
This eager copy behaviour will be specific to the `storage` and `bytes` properties, and therefore the memory usage behaviour of existing unchanged code will remain the same. New code that adopts the `storage` and `bytes` properties will occasionally have higher memory usage due to the eager copies, but we believe this performance compromise is the right approach for the standard library. The alternative is to compromise the design for all platforms supported by Swift, and we consider that a non-starter. | ||
|
||
As a result of the eager copy behaviour for bridged `String.UTF8View` and `Array` instances, the `storage` property for these types will have a documented performance characteristic of "amortized constant time performance." | ||
|
||
## Source compatibility | ||
|
||
This proposal is additive and source-compatible with existing code. | ||
|
||
## ABI compatibility | ||
|
||
This proposal is additive and ABI-compatible with existing code. | ||
|
||
## Implications on adoption | ||
|
||
The additions described in this proposal require a new version of the Swift standard library and runtime. | ||
|
||
## Alternatives considered | ||
|
||
#### Adding `withSpan()` and `withBytes()` closure-taking functions | ||
|
||
The `storage` and `bytes` properties aim to be safe replacements for the `withUnsafeBufferPointer()` and `withUnsafeBytes()` closure-taking functions. We could consider `withSpan()` and `withBytes()` closure-taking functions that would provide an quicker migration away from the older unsafe functions. We do not believe the closure-taking functions are desirable in the long run. In the short run, there may be a desire to clearly mark the scope where a `Span` instance is used. The default method would be to explicitly consume a `Span` instance: | ||
```swift | ||
var a = ContiguousArray(0..<8) | ||
var span = a.storage | ||
read(span) | ||
_ = consume span | ||
a.append(8) | ||
``` | ||
|
||
In order to visually distinguish this lifetime, we could simply use a `do` block: | ||
```swift | ||
var a = ContiguousArray(0..<8) | ||
do { | ||
let span = a.storage | ||
read(span) | ||
} | ||
a.append(8) | ||
``` | ||
|
||
A more targeted solution may be a consuming function that takes a non-escaping closure: | ||
```swift | ||
var a = ContiguousArray(0..<8) | ||
var span = a.storage | ||
consuming(span) { span in | ||
read(span) | ||
} | ||
a.append(8) | ||
``` | ||
|
||
During the evolution of Swift, we have learned that closure-based API are difficult to compose, especially with one another. They can also require alterations to support new language features. For example, the generalization of closure-taking API for non-copyable values as well as typed throws is ongoing; adding more closure-taking API may make future feature evolution more labor-intensive. By instead relying on returned values, whether from computed properties or functions, we build for greater composability. Use cases where this approach falls short should be reported as enhancement requests or bugs. | ||
|
||
#### Giving the properties different names | ||
|
||
We chose the names `storage` and `bytes` because those reflect _what_ they represent. Another option would be to name the properties after _how_ they represent what they do, which would be `span` and `rawSpan`. It is possible the name `storage` would be deemed to clash too much with existing properties of types that would like to provide views of their internal storage with `Span`-providing properties. For example, the Standard Library's concrete `SIMD`-conforming types have a property `var _storage`. The current proposal means that making this property of `SIMD` types into public API would entail a name change more significant than simply removing its leading underscore. | ||
|
||
#### Disallowing the definition of non-escapable properties of non-escapable types | ||
|
||
The particular case of the lifetime dependence created by a property of a copyable non-escapable type is not as simple as when the parent type is escapable. There are two possible ways to define the lifetime of the new instance: it can either depend on the lifetime of the original instance, or it can acquire the lifetime of the original instance and be otherwise independent. We believe that both these cases can be useful, but that in the majority of cases the desired behaviour will be to have an independent return value, where the newly returned value borrows the same binding as the callee. Therefore we believe that is reasonable to reserve the unannotated spelling for this more common case. | ||
|
||
The original version of this pitch disallowed this. As a consequence, the `bytes` property had to be added on each individual type, rather than having `bytes` as a conditional property of `Span`. | ||
|
||
#### Omitting extensions to `UnsafeBufferPointer` and related types | ||
|
||
We could omit the extensions to `UnsafeBufferPointer` and related types, and rely instead of future `Span` and `RawSpan` initializers. The initializers can have the advantage of being able to communicate semantics (somewhat) through their parameter labels. However, they also have a very different shape than the `storage` computed properties we are proposing. We believe that the adding the same API on both safe and unsafe types is advantageous, even if the preconditions for the properties cannot be statically enforced. | ||
|
||
## <a name="directions"></a>Future directions | ||
|
||
Note: The future directions stated in [SE-0447](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0447-span-access-shared-contiguous-storage.md#Directions) apply here as well. | ||
|
||
#### <a name="MutableSpan"></a>Safe mutations with `MutableSpan<T>` | ||
|
||
Some data structures can delegate mutations of their owned memory. In the standard library the function `withMutableBufferPointer()` provides this functionality in an unsafe manner. We expect to add a `MutableSpan` type to support delegating mutations of initialized memory. Standard library types will then add a way to vend `MutableSpan` instances. This could be with a closure-taking `withMutableSpan()` function, or a new property, such as `var mutableStorage`. Note that a computed property providing mutable access needs to have a different name than the `storage` properties proposed here, because we cannot overload the return type of computed properties based on whether mutation is desired. | ||
|
||
#### <a name="ContiguousStorage"></a>A `ContiguousStorage` protocol | ||
|
||
An early version of the `Span` proposal ( [SE-0447][SE-0447] ) proposed a `ContiguousStorage` protocol by which a type could indicate that it can provide a `Span`. `ContiguousStorage` would form a bridge between generically-typed interfaces and a performant concrete implementation. It would supersede the rejected [SE-0256](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0256-contiguous-collection.md), and many of the standard library collections could conform to `ContiguousStorage`. | ||
|
||
The properties added by this proposal are largely the concrete implementations of `ContiguousStorage`. As such, it seems like an obvious enhancement to this proposal. | ||
|
||
Unfortunately, a major issue prevents us from proposing it at this time: the ability to suppress requirements on `associatedtype` declarations was deferred during the review of [SE-0427](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0427-noncopyable-generics.md). Once this restriction is lifted, then we could propose a `ContiguousStorage` protocol. | ||
|
||
The other limitation stated in [SE-0447][SE-0447]'s section about `ContiguousStorage` is "the inability to declare a `_read` acessor as a protocol requirement." This proposal's addition to enable defining a borrowing relationship via a computed property is a solution to that, as long as we don't need to use a coroutine accessor to produce a `Span`. While allowing the return of `Span`s through coroutine accessors may be undesirable, whether it is undesirable is unclear until coroutine accessors are formalized in the language. | ||
|
||
## Acknowledgements | ||
|
||
Thanks to Ben Rimmington for suggesting that the `bytes` property should be on `Span` rather than on every type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: What is the behavior (and what are the performance characteristics) of this operation when a
String
has storage other than contiguous UTF-8? Would it be desirable to return an optional, or alternatively require the user to know or have calledmakeContiguousUTF8()
on theString
as a precondition?Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Behind the scenes, we are implementing a new "lazy eager bridging" behaviour for bridged
Array
and bridgedString
. They will always succeed, but they will sometimes entail an allocation and copy (i.e. "usually O(1), sometimes O(n)".) An initial step is here. For native non-bridged instances, always O(1).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool! The implementation details would of course be out of scope for this text, but might be worth calling out here the performance implications for calling
storage
andbytes
on these types, pros and cons for usability vs predictability, etc. (given that this is supposed to be a safe, performant API).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appended a paragraph to the detailed design to spell this out.