Commit 1a927c0

Merge pull request #882 from natecook1000/nc-array-uninitialized
Add a proposal for `Array.init(unsafeUninitializedCapacity:initializingWith:)`
2 parents 4ce84ca + b7107f4 commit 1a927c0

1 file changed: 340 additions, 0 deletions

# Accessing an Array's Uninitialized Buffer

* Proposal: [SE-NNNN](NNNN-array-uninitialized-buffer.md)
* Author: [Nate Cook](https://github.com/natecook1000)
* Review Manager: TBD
* Status: **Awaiting review**
* Implementation: https://github.com/apple/swift/pull/17389
* Bug: [SR-3087](https://bugs.swift.org/browse/SR-3087)

## Introduction

This proposal suggests a new initializer and method for `Array` and `ContiguousArray`
that provide access to an array's uninitialized storage buffer.

Swift-evolution thread: [https://forums.swift.org/t/array-initializer-with-access-to-uninitialized-buffer/13689](https://forums.swift.org/t/array-initializer-with-access-to-uninitialized-buffer/13689)

## Motivation

Some collection operations require working on a fixed-size buffer of uninitialized memory.
For example, one O(*n*) algorithm for performing a stable partition of an array is as follows:

1. Create a new array the same size as the original array.
2. Iterate over the original array,
   copying matching elements to the beginning of the new array
   and non-matching elements to the end.
3. When finished iterating, reverse the slice of non-matching elements.

Unfortunately, the standard library provides no way to create an array
of a particular size without initializing every element,
or to copy elements to the end of an array's buffer
without initializing every preceding element.
Even if we avoid initialization by manually allocating the memory using an `UnsafeMutableBufferPointer`,
there's no way to convert that buffer into an array without copying the contents.
There simply isn't a way to implement this particular algorithm with maximum efficiency in Swift.
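
To make the cost concrete, here is a minimal sketch of the manual-buffer workaround.
The function name `stablyPartitionedByCopying` is hypothetical and not part of this proposal;
the point is the final `Array(buffer)` call,
which has to copy every element out of the manually allocated memory.

```swift
// Hypothetical helper (not part of the proposal): partitions into a manually
// allocated buffer, then pays for a full copy to obtain an Array.
func stablyPartitionedByCopying<T>(
    _ source: [T],
    by belongsInFirstPartition: (T) -> Bool
) -> [T] {
    var buffer = UnsafeMutableBufferPointer<T>.allocate(capacity: source.count)
    defer { buffer.deallocate() }
    guard var low = buffer.baseAddress else { return [] }
    var high = low + buffer.count

    // Matching elements fill forward from the start,
    // non-matching elements fill backward from the end.
    for element in source {
        if belongsInFirstPartition(element) {
            low.initialize(to: element)
            low += 1
        } else {
            high -= 1
            high.initialize(to: element)
        }
    }

    // Restore the original order of the non-matching elements.
    let highIndex = high - buffer.baseAddress!
    buffer[highIndex...].reverse()

    // The unavoidable copy: the buffer cannot become the array's storage.
    let result = Array(buffer)
    buffer.baseAddress!.deinitialize(count: buffer.count)
    return result
}
```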

We also see this limitation when working with C APIs
that fill a buffer with an unknown number of elements and return the count.
The workarounds are the same as above:
either initialize an array before passing it
or copy the elements from an unsafe mutable buffer into an array after calling.

## Proposed solution

Adding a new `Array` initializer
that lets a program work with an uninitialized buffer,
and a method for accessing an existing array's buffer
of both initialized and uninitialized memory,
would fill in this missing functionality.

The new initializer takes a closure that operates on an `UnsafeMutableBufferPointer`
and an `inout` count of initialized elements.
This closure has access to the uninitialized contents
of the newly created array's storage,
and must set the initialized count of the array before exiting.

```swift
var myArray = Array<Int>(unsafeUninitializedCapacity: 10) { buffer, initializedCount in
    for x in 1..<5 {
        buffer[x] = x
    }
    buffer[0] = 10
    initializedCount = 5
}
// myArray == [10, 1, 2, 3, 4]
```

With this new initializer, it's possible to implement the stable partition
as an extension to the `Collection` protocol, without any unnecessary copies:

```swift
extension Collection {
    func stablyPartitioned(by belongsInFirstPartition: (Element) throws -> Bool) rethrows -> [Element] {
        return try Array<Element>(unsafeUninitializedCapacity: count) {
            buffer, initializedCount in
            // `low` fills matching elements forward from the start;
            // `high` fills non-matching elements backward from the end.
            // Note: if the predicate throws, `initializedCount` is still 0,
            // so elements written so far are not deinitialized.
            var low = buffer.baseAddress!
            var high = low + buffer.count
            for element in self {
                if try belongsInFirstPartition(element) {
                    low.initialize(to: element)
                    low += 1
                } else {
                    high -= 1
                    high.initialize(to: element)
                }
            }

            // Non-matching elements were written in reverse order; restore it.
            let highIndex = high - buffer.baseAddress!
            buffer[highIndex...].reverse()
            initializedCount = buffer.count
        }
    }
}
```
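
The initializer also covers the C API case from the motivation.
The sketch below assumes a toolchain that provides the initializer with the signature proposed here;
`fillSquares` is a hypothetical stand-in for an imported C function
that writes into a caller-supplied buffer and returns the number of elements written.

```swift
// Hypothetical stand-in for a C function: writes at most `capacity` values
// into `out` and returns how many were actually written.
func fillSquares(_ out: UnsafeMutablePointer<Int32>?, _ capacity: Int) -> Int {
    guard let out = out else { return 0 }
    let produced = min(capacity, 4)
    for i in 0..<produced {
        (out + i).initialize(to: Int32(i * i))
    }
    return produced
}

// The function fills the array's own storage; no zero-filling beforehand
// and no copy afterwards.
let squares = Array<Int32>(unsafeUninitializedCapacity: 16) { buffer, initializedCount in
    initializedCount = fillSquares(buffer.baseAddress, buffer.count)
}
// squares == [0, 1, 4, 9]
```

The array ends up with a count of four even though room for sixteen elements was requested.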

## Detailed design

The new initializer and method are added to both `Array` and `ContiguousArray`.

```swift
/// Creates an array with the specified capacity, then calls the given closure
/// with a buffer covering the array's uninitialized memory.
///
/// The closure must set its second parameter to a number `c`, the number
/// of elements that are initialized. The memory in the range `buffer[0..<c]`
/// must be initialized at the end of the closure's execution, and the memory
/// in the range `buffer[c...]` must be uninitialized.
///
/// - Note: While the resulting array may have a capacity larger than the
///   requested amount, the buffer passed to the closure will cover exactly
///   the requested number of elements.
///
/// - Parameters:
///   - unsafeUninitializedCapacity: The number of elements to allocate space
///     for in the new array.
///   - initializer: A closure that initializes elements and sets the count of
///     the new array.
///     - Parameters:
///       - buffer: A buffer covering uninitialized memory with room
///         for the specified number of elements.
///       - initializedCount: The count of the array's initialized elements.
///         After initializing any elements inside `initializer`, update
///         `initializedCount` with the new count for the array.
public init(
    unsafeUninitializedCapacity: Int,
    initializingWith initializer: (
        _ buffer: inout UnsafeMutableBufferPointer<Element>,
        _ initializedCount: inout Int
    ) throws -> Void
) rethrows

/// Calls the given closure with a buffer of the array's mutable contiguous
/// storage, reserving the specified capacity if necessary.
///
/// The closure must set its second parameter to a number `c`, the number
/// of elements that are initialized. The memory in the range `buffer[0..<c]`
/// must be initialized at the end of the closure's execution, and the memory
/// in the range `buffer[c...]` must be uninitialized.
///
/// - Parameters:
///   - capacity: The capacity to guarantee for the array. `capacity` must
///     be greater than or equal to the array's current `count`.
///   - body: A closure that can modify or deinitialize existing
///     elements or initialize new elements.
///     - Parameters:
///       - buffer: An unsafe mutable buffer of the array's storage, covering
///         memory for the number of elements specified by the `capacity`
///         parameter. The elements in `buffer[0..<initializedCount]` are
///         initialized, the memory in `buffer[initializedCount..<capacity]`
///         is uninitialized.
///       - initializedCount: The count of the array's initialized elements.
///         If you initialize or deinitialize any elements inside `body`,
///         update `initializedCount` with the new count for the array.
/// - Returns: The return value, if any, of the `body` closure parameter.
public mutating func withUnsafeMutableBufferPointerToStorage<Result>(
    capacity: Int,
    _ body: (
        _ buffer: inout UnsafeMutableBufferPointer<Element>,
        _ initializedCount: inout Int
    ) throws -> Result
) rethrows -> Result
```

### Specifying a capacity

Both the initializer and the mutating method take
the specific capacity that a user wants to work with as a parameter.
In each case, the buffer passed to the closure has a count
that is exactly the same as the specified capacity,
even if the ultimate capacity of the new or existing array is larger.
This helps avoid bugs where a user assumes that the capacity they observe
before calling the mutating method would match the size of the buffer.
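
As a small illustration for the initializer (the mutating method is not exercised here),
the sketch below records the size of the buffer that the closure receives.
It assumes a toolchain that provides the proposed initializer;
the variable names are illustrative only.

```swift
let requestedCapacity = 10
var observedBufferCount = -1

let numbers = [Int](unsafeUninitializedCapacity: requestedCapacity) { buffer, initializedCount in
    // The buffer covers exactly the requested number of elements.
    observedBufferCount = buffer.count
    for i in 0..<3 {
        (buffer.baseAddress! + i).initialize(to: i)
    }
    initializedCount = 3
}

// observedBufferCount matches requestedCapacity, while numbers.capacity
// is only guaranteed to be at least that large.
```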

The method requires that the capacity specified be at least the current `count` of the array
to prevent nonsensical operations,
like reducing the size of the array from the middle.
That is, this will result in a runtime error:

```swift
var a = Array(1...10)
a.withUnsafeMutableBufferPointerToStorage(capacity: 5) { ... }
```

### Guarantees after throwing

If the closure parameter to either the initializer
or the mutating method throws,
the `initializedCount` value at the time an error is thrown is assumed to be correct.
This means that a user who needs to throw from inside the closure has one of two options.
Before throwing, they must:

1. deinitialize any newly initialized instances or re-initialize any deinitialized instances, or
2. update `initializedCount` to the new count.

In either case,
the postconditions that `buffer[0..<initializedCount]` is initialized
and `buffer[initializedCount...]` is uninitialized still hold.
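
One way to satisfy the second option is to keep `initializedCount` in step with every initialization,
so that it is correct whenever an error is thrown.
The sketch below assumes a toolchain that provides the proposed initializer;
`Counter`, `Tracked`, `FillError`, and `makeTracked` are hypothetical names
used only to make the cleanup observable.

```swift
// Hypothetical types: `Tracked` reports its lifetime to a shared `Counter`.
final class Counter {
    var live = 0
}

final class Tracked {
    let counter: Counter
    init(_ counter: Counter) {
        self.counter = counter
        counter.live += 1
    }
    deinit {
        counter.live -= 1
    }
}

struct FillError: Error {}

// Fills five slots, throwing when `failIndex` is reached.
func makeTracked(_ counter: Counter, failingAt failIndex: Int) throws -> [Tracked] {
    return try [Tracked](unsafeUninitializedCapacity: 5) { buffer, initializedCount in
        for i in 0..<buffer.count {
            if i == failIndex {
                // `initializedCount` already equals the number of initialized
                // elements, so throwing here upholds the postconditions.
                throw FillError()
            }
            (buffer.baseAddress! + i).initialize(to: Tracked(counter))
            initializedCount += 1
        }
    }
}
```

Because the count is accurate at the moment of the throw,
the elements initialized before the error can be cleaned up rather than leaked.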

### Naming considerations

The names of these new additions are definitely a little on the long side!
Here are the considerations used when selecting these names.

#### `init(unsafeUninitializedCapacity:initializingWith:)`

There are two important details of this API that led to the proposed spelling.
First, the initializer is *unsafe*,
in that the user must be sure to properly manage the memory
addressed by the closure's buffer pointer parameter.
Second, the initializer provides access to the array's *uninitialized* storage,
unlike the other `Array.withUnsafe...` methods that already exist.
Because trailing closures are commonly used,
it's important to include those terms in the initial argument label,
such that they're always visible at the use site.

#### `withUnsafeMutableBufferPointerToStorage(capacity:_:)`

The mutating method is closely linked to the existing methods
for accessing an array's storage via mutable buffer pointer,
but has the important distinction of including access
to not just the elements of the array,
but also the uninitialized portion of the array's storage.
Extending the name of the closest existing method (`withUnsafeMutableBufferPointer`)
to mark the distinction makes the relationship (hopefully) clear.

**Suggested alternatives:**

- `withUnsafeMutableBufferPointerToReservedCapacity(_:_:)`
- `withUnsafeMutableBufferPointer(reservingCapacity:_:)`
- `withUnsafeMutableBufferPointerToFullCapacity(capacity:_:)`

#### Unused terminology

This proposal leaves out wording that would reference two other relevant concepts:

- *reserving capacity*:
  Arrays currently have a `reserveCapacity(_:)` method,
  which is somewhat akin to the first step of the initializer.
  However, that method is used for the sake of optimizing performance when adding to an array,
  rather than providing direct access to the array's capacity.
  In fact, as part of the `RangeReplaceableCollection` protocol,
  that method doesn't even require any action to be taken by the targeted type.
  For those reasons,
  the idea of "reserving" capacity doesn't seem as appropriate
  as providing a specific capacity that will be used.

- *unmanaged*:
  The proposed initializer is unusual in that it converts
  the lifetime management of manually initialized instances to be automatically managed,
  as elements of an `Array` instance.
  The only other type that performs this kind of conversion is `Unmanaged`,
  which is primarily used at the border of Swift and C interoperability,
  particularly with Core Foundation.
  Additionally, `Unmanaged` can be used to maintain and manage the lifetime of an instance
  over a long period of time,
  while this initializer performs the conversion as soon as the closure executes.
  As above, this term doesn't seem appropriate for use with this new API.
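
To illustrate the *reserving capacity* point,
this short sketch uses only existing standard library API:
after reserving capacity, `withUnsafeMutableBufferPointer` still covers only the initialized elements,
so the reserved space is not reachable through it.

```swift
var values = [1, 2, 3]
values.reserveCapacity(10)

// The existing accessor exposes the initialized elements only,
// even though the array now has room for at least ten.
let visibleCount = values.withUnsafeMutableBufferPointer { $0.count }
// visibleCount == 3, values.capacity >= 10
```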

## Source compatibility

This is an additive change to the standard library,
so there is no effect on source compatibility.

## Effect on ABI stability

This addition has no effect on ABI stability.

## Effect on API resilience

The additional APIs will be a permanent part of the standard library,
and will need to remain public API.

## Alternatives considered

### Returning the new count from the initializer closure

An earlier proposal had the initializer's closure return the new count,
instead of using an `inout` parameter.
This proposal uses the parameter instead,
so that the method and initializer use the same closure type.

In addition, the throwing behavior described above requires that
the initialized count be set as an `inout` parameter instead of as a return value.
Not every `Element` type can be trivially initialized,
so a user that deinitializes some elements and then needs to throw an error would be stuck.
(This is only an issue with the mutating method.)
Removing the `throws` capability from the closure
would solve this problem and simplify the new APIs' semantics,
but would be inconsistent with the other APIs in this space
and would make them more difficult to use as building blocks
for higher-level operations like `stablyPartitioned(by:)`.

### Creating an array from a buffer

An `Array` initializer that simply converts an `UnsafeMutableBufferPointer`
into an array's backing storage seems like it would be another solution.
However, an array's storage includes information
about the count and capacity at the beginning of its buffer,
so an `UnsafeMutableBufferPointer` created from scratch isn't usable.

## Addendum

You can Try This At Home™ with this extension,
which provides the semantics
(but not the copy-avoiding performance benefits)
of the proposed additions:

```swift
extension Array {
    public init(
        unsafeUninitializedCapacity: Int,
        initializingWith initializer: (
            _ buffer: inout UnsafeMutableBufferPointer<Element>,
            _ initializedCount: inout Int
        ) throws -> Void
    ) rethrows {
        self = []
        try self.withUnsafeMutableBufferPointerToStorage(capacity: unsafeUninitializedCapacity, initializer)
    }

    public mutating func withUnsafeMutableBufferPointerToStorage<Result>(
        capacity: Int,
        _ body: (
            _ buffer: inout UnsafeMutableBufferPointer<Element>,
            _ initializedCount: inout Int
        ) throws -> Result
    ) rethrows -> Result {
        // Copy the current elements into a temporary buffer of the requested size.
        var buffer = UnsafeMutableBufferPointer<Element>.allocate(capacity: capacity)
        _ = buffer.initialize(from: self)
        var initializedCount = self.count
        defer {
            // Runs on normal exit and on throw: destroy whatever is still initialized.
            buffer.baseAddress?.deinitialize(count: initializedCount)
            buffer.deallocate()
        }

        let result = try body(&buffer, &initializedCount)
        // Copy the initialized prefix back into the array.
        self = Array(buffer[..<initializedCount])
        self.reserveCapacity(capacity)
        return result
    }
}
```
