Commit fe3a90e

AsyncSequence (#1246)
* Initial version
* AsyncIterator.Element -> Element
* Result type of makeAsyncIterator is AsyncIterator
* Add flatMap
* Create an AsyncIterator instead of Iterator
* Apply feedback from pitch thread
* await try -> try await
* Minor process updates
1 parent a698729 commit fe3a90e

File tree

1 file changed

+386
-0
lines changed

proposals/NNNN-asyncsequence.md

Lines changed: 386 additions & 0 deletions
@@ -0,0 +1,386 @@
1+
# Async/Await: Sequences
2+
3+
* Proposal: [SE-NNNN](Async-Await-Series.md.md)
4+
* Authors: [Tony Parker](https://github.com/parkera), [Philippe Hausler](https://github.com/phausler)
5+
* Review Manager: TBD
6+
* Status: **Awaiting review**
7+
* Implementation: [apple/swift#35224](https://github.com/apple/swift/pull/35224)
8+
9+
## Introduction
10+
11+
Swift's [async/await](https://github.com/apple/swift-evolution/blob/main/proposals/0296-async-await.md) feature provides an intuitive, built-in way to write and use functions that return a single value at some future point in time. We propose building on top of this feature to create an intuitive, built-in way to write and use functions that return many values over time.
12+
13+
This proposal is composed of the following pieces:
14+
15+
1. A standard library definition of a protocol that represents an asynchronous sequence of values
16+
2. Compiler support to use `for...in` syntax on an asynchronous sequence of values
17+
3. A standard library implementation of commonly needed functions that operate on an asynchronous sequence of values
18+
19+
## Motivation
20+
21+
We'd like iterating over asynchronous sequences of values to be as easy as iterating over synchronous sequences of values. An example use case is iterating over the lines in a file, like this:
22+
23+
```swift
for try await line in myFile.lines() {
  // Do something with each line
}
```
28+
29+
Using the `for...in` syntax that Swift developers are already familiar with will reduce the barrier to entry when working with asynchronous APIs. Consistency with other Swift types and concepts is therefore one of our most important goals. The requirement of using the `await` keyword in this loop will distinguish it from synchronous sequences.
30+
### `for/in` Syntax
31+
32+
To enable the use of `for in`, we must define the return type from `func lines()` to be something that the compiler understands can be iterated. Today, we have the `Sequence` protocol. Let's try to use it here:
33+
34+
```swift
extension URL {
  struct Lines: Sequence { /* ... */ }
  func lines() async -> Lines
}
```
40+
41+
Unfortunately, what this function actually does is wait until *all* lines are available before returning. What we really wanted in this case was to await *each* line. While it is possible to imagine modifications to `lines` to behave differently (e.g., giving the result reference semantics), it would be better to define a new protocol to make this iteration behavior as simple as possible.
42+
43+
```swift
extension URL {
  struct Lines: AsyncSequence { /* ... */ }
  func lines() async -> Lines
}
```
49+
50+
`AsyncSequence` allows for waiting on each element instead of the entire result by defining an asynchronous `next()` function on its associated iterator type.
51+
52+
### Additional AsyncSequence functions
53+
54+
Going one step further, let's imagine how it might look to use our new `lines` function in more places. Perhaps we only want the first line of a file because it contains a header that we are interested in:
55+
56+
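```swift
// Declared as `var` with an initial value: a `let` cannot be assigned inside a loop,
// and the loop body never runs if the file has no lines.
var header: String? = nil
do {
  for try await line in myFile.lines() {
    header = line
    break
  }
} catch {
  header = nil // file didn't exist
}
```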
67+
68+
Or, perhaps we actually do want to read all lines in the file before starting our processing:
69+
70+
```swift
var allLines: [String] = []
do {
  for try await line in myFile.lines() {
    allLines.append(line)
  }
} catch {
  allLines = []
}
```
80+
81+
There's nothing wrong with the above code, and it must be possible for a developer to write it. However, it does seem like a lot of boilerplate for what might be a common operation. One way to solve this would be to add more functions to `URL`:
82+
83+
```swift
extension URL {
  struct Lines : AsyncSequence { }

  func lines() -> Lines
  func firstLine() async throws -> String?
  func collectLines() async throws -> [String]
}
```
92+
93+
It doesn't take much imagination to think of other places where we may want to do similar operations, though. Therefore, we believe the best place to put these functions is instead as an extension on `AsyncSequence` itself, specified generically -- just like `Sequence`.
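As a rough illustration of that idea, the two `URL` helpers above can be written once, generically, for every conforming type. This is only a sketch: the names `firstElement()`, `collected()`, and the sample sequence `UpTo` are hypothetical, the functions are marked `async throws` rather than `rethrows`, and the code is written against an `AsyncSequence` shape without `cancel()` so that it compiles on a current toolchain (an assumption about the reader's setup, not part of this proposal).

```swift
// Hypothetical sample sequence that yields 1...limit; used only to exercise the helpers.
struct UpTo: AsyncSequence {
  typealias Element = Int
  let limit: Int

  struct AsyncIterator: AsyncIteratorProtocol {
    let limit: Int
    var current = 1

    mutating func next() async -> Int? {
      guard current <= limit else { return nil }
      defer { current += 1 }
      return current
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(limit: limit)
  }
}

// Hypothetical helper names, chosen to avoid clashing with any existing API.
extension AsyncSequence {
  /// Returns the first element, or nil if the sequence is empty.
  func firstElement() async throws -> Element? {
    for try await element in self {
      return element
    }
    return nil
  }

  /// Awaits every element and gathers them into an array.
  func collected() async throws -> [Element] {
    var result: [Element] = []
    for try await element in self {
      result.append(element)
    }
    return result
  }
}
```

Any conforming type picks up both helpers without further work, which is the point of placing them on the protocol rather than on `URL`.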
94+
95+
## Proposed solution
96+
97+
The standard library will define the following protocols:
98+
99+
```swift
public protocol AsyncSequence {
  associatedtype AsyncIterator: AsyncIteratorProtocol where AsyncIterator.Element == Element
  associatedtype Element
  func makeAsyncIterator() -> AsyncIterator
}

public protocol AsyncIteratorProtocol {
  associatedtype Element
  mutating func next() async throws -> Element?
  __consuming mutating func cancel()
}
```
112+
113+
The compiler will generate code to allow use of a `for in` loop on any type which conforms to `AsyncSequence`. The standard library will also extend the protocol to provide familiar generic algorithms. Here is an example which does not actually call an `async` function within its `next`, but shows the basic shape:
114+
115+
```swift
struct Counter : AsyncSequence {
  let howHigh: Int

  struct AsyncIterator : AsyncIteratorProtocol {
    let howHigh: Int
    var current = 1
    mutating func next() async -> Int? {
      guard current <= howHigh else {
        return nil
      }

      let result = current
      current += 1
      return result
    }

    mutating func cancel() {
      current = howHigh + 1 // Make sure we do not emit another value
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    return AsyncIterator(howHigh: howHigh)
  }
}
```
142+
143+
At the call site, using `Counter` would look like this:
144+
145+
```swift
for await i in Counter(howHigh: 3) {
  print(i)
}

/*
Prints the following, and finishes the loop:
1
2
3
*/


for await i in Counter(howHigh: 3) {
  print(i)
  if i == 2 { break }
}
/*
Prints the following, then calls cancel before breaking out of the loop:
1
2
*/
```
168+
169+
Any other exit (e.g., `return` or `throw`) from the `for` loop will also call `cancel` first.
170+
171+
## Detailed design
172+
173+
Returning to our earlier example:
174+
175+
```swift
for try await line in myFile.lines() {
  // Do something with each line
}
```
180+
181+
The compiler will emit the equivalent of the following code:
182+
183+
```swift
var it = myFile.lines().makeAsyncIterator()
while let value = try await it.next() {
  // Do something with each line
}
```
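To make the equivalence concrete, the sketch below runs the same iteration twice: once with the loop syntax and once by driving the iterator by hand. The sequence `Ticks` and both function names are hypothetical. Its `next()` does not throw, so `try` is omitted, and it has no `cancel()` so that it compiles against an `AsyncSequence` shape without that requirement (an assumption about the toolchain in use).

```swift
// Hypothetical sequence yielding 1...count.
struct Ticks: AsyncSequence {
  typealias Element = Int
  let count: Int

  struct AsyncIterator: AsyncIteratorProtocol {
    let count: Int
    var current = 1

    mutating func next() async -> Int? {
      guard current <= count else { return nil }
      defer { current += 1 }
      return current
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(count: count)
  }
}

/// Sums the elements using the loop syntax.
func sumWithLoop() async -> Int {
  var total = 0
  for await tick in Ticks(count: 4) {
    total += tick
  }
  return total
}

/// Sums the elements by calling next() directly, mirroring the expansion above.
func sumByHand() async -> Int {
  var total = 0
  var it = Ticks(count: 4).makeAsyncIterator()
  while let tick = await it.next() {
    total += tick
  }
  return total
}
```

Both functions visit the same elements in the same order and produce the same total.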
189+
190+
All of the usual rules about error handling apply. For example, this iteration must be surrounded by `do/catch`, or be inside a `throws` function to handle the error. All of the usual rules about `await` also apply. For example, this iteration must be inside a context in which `await` is allowed, such as an `async` function.
191+
192+
### Cancellation
193+
194+
If `next()` returns `nil` then the iteration ends naturally and the compiler does not insert a call to `cancel()`. If `next()` throws an error, then iteration also ends and the compiler does not insert a call to `cancel()`. In both of these cases, it was the `AsyncSequence` itself which decided to end iteration and there is no need to tell it to cancel.
195+
196+
If, inside the body of the loop, the code calls `break`, `return` or `throw`, then the compiler first inserts a synchronous call to `cancel()` on the `it` iterator.
197+
198+
If this iteration is itself in a context in which cancellation can occur, then it is up to the developer to check for cancellation themselves and break out of the loop:
199+
200+
```swift
for try await line in myFile.lines() {
  // Do something
  ...
  // Check for cancellation
  try await Task.checkCancellation()
}
```
208+
209+
In this case, control of cancellation (which is a potential suspension point, and may be something to do either before or after receiving a value) is up to the author of the code.
210+
211+
#### Cancellation on Reference Types
212+
213+
If the `AsyncIterator` is a `class` type, it should assume that `deinit` is equivalent to calling `cancel`. This will prevent leaking of resources in cases where the iterator is used manually and `cancel` is not called. It also provides a future-proofing path for move-only iterators.
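A minimal sketch of that guidance, under stated assumptions: `ResourceLog`, `HandleIterator`, and `releasedAfterDrop()` are hypothetical names, and `cancel()` is written as an ordinary method rather than a protocol requirement so that the code compiles against an iterator protocol without `cancel()`. The iterator releases its stand-in resource either when `cancel()` is called or when it is deallocated, whichever comes first.

```swift
// Hypothetical stand-in for a resource such as a file handle.
final class ResourceLog {
  var released = false
}

// A class-based iterator that treats deinit as equivalent to cancel().
final class HandleIterator: AsyncIteratorProtocol {
  private let log: ResourceLog
  private var remaining: Int
  private var isCancelled = false

  init(count: Int, log: ResourceLog) {
    self.remaining = count
    self.log = log
  }

  func next() async -> Int? {
    guard !isCancelled, remaining > 0 else { return nil }
    remaining -= 1
    return remaining
  }

  // Idempotent, so an explicit call followed by deinit releases only once.
  func cancel() {
    guard !isCancelled else { return }
    isCancelled = true
    log.released = true
  }

  deinit {
    cancel()
  }
}

/// Uses the iterator manually and drops it without ever calling cancel().
func releasedAfterDrop() async -> Bool {
  let log = ResourceLog()
  do {
    let iterator = HandleIterator(count: 3, log: log)
    _ = await iterator.next()
  }
  // The iterator has been deallocated by this point, so deinit has run.
  return log.released
}
```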
214+
#### Automatic Cancellation
215+
216+
"Automatic" calls to `cancel` are conceptually compatible with `defer`. Given the following code:
217+
218+
```swift
for await x in seq {
  // code
}
```
223+
224+
The compiler generates code equivalent to this:
225+
226+
```swift
do {
  var $_iterator = seq.makeAsyncIterator()
  var $_element: Element? = await $_iterator.next()
  defer { if $_element != nil { $_iterator.cancel() } }
  while let x = $_element {
    // code
    $_element = await $_iterator.next()
  }
}
```
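The same pattern can be exercised by hand. The sketch below is an approximation rather than compiler output: `CancellableIterator` is a hypothetical stand-in for the proposed iterator protocol (declared locally so the example does not depend on any particular standard library), and `CountingIterator` and `iterate(upTo:stoppingAt:)` are illustrative names. It counts how often `cancel()` runs for a natural end versus an early `break`.

```swift
// Hypothetical mirror of the proposed iterator protocol, including cancel().
protocol CancellableIterator {
  associatedtype Element
  mutating func next() async -> Element?
  mutating func cancel()
}

struct CountingIterator: CancellableIterator {
  let howHigh: Int
  var current = 1
  var cancelCalls = 0

  mutating func next() async -> Int? {
    guard current <= howHigh else { return nil }
    defer { current += 1 }
    return current
  }

  mutating func cancel() {
    cancelCalls += 1
    current = howHigh + 1 // Make sure we do not emit another value
  }
}

/// Hand-written version of the expansion: cancel() runs only if the loop
/// is left while the last fetched element was non-nil.
func iterate(upTo howHigh: Int, stoppingAt stop: Int?) async -> (seen: [Int], cancelCalls: Int) {
  var iterator = CountingIterator(howHigh: howHigh)
  var seen: [Int] = []
  do {
    var element = await iterator.next()
    defer { if element != nil { iterator.cancel() } }
    while let x = element {
      seen.append(x)
      if x == stop { break }
      element = await iterator.next()
    }
  }
  return (seen, iterator.cancelCalls)
}
```

Running to completion leaves `element` as `nil`, so the `defer` does nothing. Breaking early leaves the last element in place, so `cancel()` runs exactly once.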
237+
238+
### Rethrows
239+
240+
This proposal will take advantage of a separate proposal to add specialized `rethrows` conformance in a protocol, pitched [here](https://forums.swift.org/t/pitch-rethrowing-protocol-conformances/42373). With the changes proposed there for `rethrows`, it will not be required to use `try` when iterating an `AsyncSequence` which does not itself throw.
241+
242+
The `await` is always required because the definition of the protocol is that it is always asynchronous.
243+
244+
## AsyncSequence Functions
245+
246+
The existence of a standard `AsyncSequence` protocol allows us to write generic algorithms for any type that conforms to it. There are two categories of functions: those that return a single value (and are thus marked as `async`), and those that return a new `AsyncSequence` (and are not marked as `async` themselves).
247+
248+
The functions that return a single value are especially interesting because they increase usability by changing a loop into a single `await` line. Functions in this category are `first`, `contains`, `count`, `min`, `max`, `reduce`, and more. Functions that return a new `AsyncSequence` include `filter`, `map`, and `compactMap`.
249+
250+
### AsyncSequence to single value
251+
252+
Algorithms that reduce a for loop into a single call can improve readability of code. They remove the boilerplate required to set up and iterate a loop.
253+
254+
For example, here is the `first` function:
255+
256+
```swift
extension AsyncSequence {
  public func first() async rethrows -> Element?
}
```
261+
262+
With this extension, our "first line" example from earlier becomes simply:
263+
264+
```swift
let first = try? await myFile.lines().first()
```
267+
268+
The following functions will be added to `AsyncSequence`:
269+
270+
| Function | Note |
| - | - |
| `contains(_ value: Element) async rethrows -> Bool` | Requires `Equatable` element |
| `contains(where: (Element) async throws -> Bool) async rethrows -> Bool` | The `async` on the closure allows optional async behavior, but does not require it |
| `allSatisfy(_ predicate: (Element) async throws -> Bool) async rethrows -> Bool` | |
| `first(where: (Element) async throws -> Bool) async rethrows -> Element?` | |
| `first() async rethrows -> Element?` | Not a property since properties cannot `throw` |
| `min() async rethrows -> Element?` | Requires `Comparable` element |
| `min(by: (Element, Element) async throws -> Bool) async rethrows -> Element?` | |
| `max() async rethrows -> Element?` | Requires `Comparable` element |
| `max(by: (Element, Element) async throws -> Bool) async rethrows -> Element?` | |
| `reduce<T>(_ initialResult: T, _ nextPartialResult: (T, Element) async throws -> T) async rethrows -> T` | |
| `reduce<T>(into initialResult: T, _ updateAccumulatingResult: (inout T, Element) async throws -> ()) async rethrows -> T` | |
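Each of these functions amounts to a loop wrapped in a single `async` call. The sketch below shows one possible shape for a `reduce`-like and a `contains(where:)`-like function. The names `folded` and `hasElement(where:)` and the sample sequence `Readings` are hypothetical, chosen so they do not clash with any existing API; the functions are marked `async throws` instead of `rethrows`, and the sequence omits `cancel()` so the code compiles against an iterator protocol without it.

```swift
// Hypothetical sample sequence yielding 1...count.
struct Readings: AsyncSequence {
  typealias Element = Int
  let count: Int

  struct AsyncIterator: AsyncIteratorProtocol {
    let count: Int
    var current = 1

    mutating func next() async -> Int? {
      guard current <= count else { return nil }
      defer { current += 1 }
      return current
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(count: count)
  }
}

extension AsyncSequence {
  /// reduce-like: folds every element into an accumulated result.
  func folded<T>(
    _ initialResult: T,
    _ nextPartialResult: (T, Element) async throws -> T
  ) async throws -> T {
    var result = initialResult
    for try await element in self {
      result = try await nextPartialResult(result, element)
    }
    return result
  }

  /// contains(where:)-like: stops iterating at the first match.
  func hasElement(where predicate: (Element) async throws -> Bool) async throws -> Bool {
    for try await element in self {
      if try await predicate(element) {
        return true
      }
    }
    return false
  }
}
```

Because the closure parameters are `async`, a caller may pass either a synchronous or an asynchronous closure, as noted in the table.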
283+
284+
### AsyncSequence to AsyncSequence
285+
286+
These functions on `AsyncSequence` return a result which is itself an `AsyncSequence`. Due to the asynchronous nature of `AsyncSequence`, the behavior is similar in many ways to the existing `Lazy` types in the standard library. Calling these functions does not eagerly `await` the next value in the sequence, leaving it up to the caller to decide when to start that work by simply starting iteration when they are ready.
287+
288+
As an example, let's look at `map`:
289+
290+
```swift
extension AsyncSequence {
  public func map<Transformed>(
    _ transform: @escaping (Element) async throws -> Transformed
  ) -> AsyncMapSequence<Self, Transformed>
}

public struct AsyncMapSequence<Upstream: AsyncSequence, Transformed>: AsyncSequence {
  public let upstream: Upstream
  public let transform: (Upstream.Element) async throws -> Transformed
  public struct AsyncIterator : AsyncIteratorProtocol {
    public mutating func next() async rethrows -> Transformed?
    public mutating func cancel()
  }
}
```
306+
307+
For each of these functions, we first define a type which conforms to the `AsyncSequence` protocol. The name is modeled after existing standard library `Sequence` types like `LazyDropWhileCollection` and `LazyMapSequence`. Then, we add a function in an extension on `AsyncSequence` which creates the new type (using `self` as the `upstream`) and returns it.
308+
309+
| Function |
| - |
| `map<T>(_ transform: (Element) async throws -> T) -> AsyncMapSequence` |
| `compactMap<T>(_ transform: (Element) async throws -> T?) -> AsyncCompactMapSequence` |
| `flatMap<SegmentOfResult: AsyncSequence>(_ transform: (Element) async throws -> SegmentOfResult) -> AsyncFlatMapSequence` |
| `drop(while: (Element) async throws -> Bool) -> AsyncDropWhileSequence` |
| `dropFirst(_ n: Int) -> AsyncDropFirstSequence` |
| `prefix(while: (Element) async throws -> Bool) -> AsyncPrefixWhileSequence` |
| `prefix(_ n: Int) -> AsyncPrefixSequence` |
| `filter(_ predicate: (Element) async throws -> Bool) -> AsyncFilterSequence` |
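The wrapper pattern described above can be sketched for `map` as follows. This is an illustration under assumptions, not the proposed implementation: `SketchMapSequence`, `mapped(_:)`, `Digits`, and `doubled(upTo:)` are hypothetical names, `next()` is marked `async throws` rather than `rethrows`, and `cancel()` is omitted so the code compiles against an iterator protocol without it.

```swift
// Hypothetical upstream sequence yielding 1...limit.
struct Digits: AsyncSequence {
  typealias Element = Int
  let limit: Int

  struct AsyncIterator: AsyncIteratorProtocol {
    let limit: Int
    var current = 1

    mutating func next() async -> Int? {
      guard current <= limit else { return nil }
      defer { current += 1 }
      return current
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(limit: limit)
  }
}

// The wrapper type: stores the upstream sequence and the transform.
struct SketchMapSequence<Upstream: AsyncSequence, Transformed>: AsyncSequence {
  typealias Element = Transformed
  let upstream: Upstream
  let transform: (Upstream.Element) async throws -> Transformed

  struct AsyncIterator: AsyncIteratorProtocol {
    var upstreamIterator: Upstream.AsyncIterator
    let transform: (Upstream.Element) async throws -> Transformed

    // Pulls one upstream element per call and transforms it.
    mutating func next() async throws -> Transformed? {
      guard let value = try await upstreamIterator.next() else { return nil }
      return try await transform(value)
    }
  }

  func makeAsyncIterator() -> AsyncIterator {
    AsyncIterator(upstreamIterator: upstream.makeAsyncIterator(), transform: transform)
  }
}

extension AsyncSequence {
  // Not async: it only builds the wrapper, using self as the upstream.
  func mapped<T>(_ transform: @escaping (Element) async throws -> T) -> SketchMapSequence<Self, T> {
    SketchMapSequence(upstream: self, transform: transform)
  }
}

/// Iterates the mapped sequence; no upstream element is requested before this loop starts.
func doubled(upTo limit: Int) async throws -> [Int] {
  var output: [Int] = []
  for try await value in Digits(limit: limit).mapped({ $0 * 2 }) {
    output.append(value)
  }
  return output
}
```

Calling `mapped` does no work on its own. The upstream iterator is created in `makeAsyncIterator()` and advanced only when the caller awaits `next()`, which matches the lazy behavior described for this category of functions.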
319+
320+
## Future Proposals
321+
322+
The following topics are things we consider important and worth discussion in future proposals:
323+
### Additional `AsyncSequence` functions
324+
325+
We've aimed for parity with the most relevant `Sequence` functions. There may be others that are worth adding in a future proposal.
326+
327+
API which uses a time argument must be coordinated with the discussion about `Executor` as part of the [structured concurrency proposal](https://github.com/DougGregor/swift-evolution/blob/structured-concurrency/proposals/nnnn-structured-concurrency.md).
328+
329+
### AsyncSequence Builder
330+
331+
In the standard library we have not only the `Sequence` and `Collection` protocols, but concrete types which adopt them (for example, `Array`). We will need a similar API for `AsyncSequence` that makes it easy to construct a concrete instance when needed, without declaring a new type and adding protocol conformance.
332+
333+
## Source compatibility
334+
335+
This new functionality will be source compatible with existing Swift.
336+
337+
## Effect on ABI stability
338+
339+
This change is additive to the ABI.
340+
341+
## Effect on API resilience
342+
343+
This change is additive to API.
344+
345+
## Alternatives considered
346+
347+
### Asynchronous cancellation
348+
349+
The `cancel()` function on the iterator could be marked as `async`. However, this means that the implicit cancellation done when leaving a `for/in` loop would require an implicit `await` -- something we think is probably too much to hide from the developer. Most cancellation behavior is going to be as simple as setting a flag to check later, so we leave it as a synchronous function and encourage adopters to make cancellation fast and non-blocking.
350+
### Opaque Types
351+
352+
Each `AsyncSequence`-to-`AsyncSequence` algorithm will define its own concrete type. We could attempt to hide these details behind a general purpose type eraser. We believe leaving the types exposed gives us (and the compiler) more optimization opportunities. A great future enhancement would be for the language to support `some AsyncSequence where Element=...`-style syntax, allowing hiding of concrete `AsyncSequence` types at API boundaries.
353+
354+
### Reusing Sequence
355+
356+
If the language supported a `reasync` concept, then it seems plausible that the `AsyncSequence` and `Sequence` APIs could be merged. However, we believe it is still valuable to consider these as two different types. The added complexity of a time dimension in asynchronous code means that some functions need more configuration options or more complex implementations. Some algorithms that are useful on asynchronous sequences are not meaningful on synchronous ones. We prefer not to complicate the API surface of the synchronous collection types in these cases.
357+
358+
### Naming
359+
360+
The names of the concrete `AsyncSequence` types are designed to mirror existing standard library API like `LazyMapSequence`. Another option is to introduce a new pattern with an empty enum or other namespacing mechanism.
361+
362+
We considered `AsyncGenerator` but would prefer to leave the `Generator` name for future language enhancements. `Stream` is a type in Foundation, so we did not reuse it here to avoid confusion.
363+
364+
### `await in`
365+
366+
We considered a shorter syntax of `await...in`. However, since the behavior here is fundamentally a loop, we feel it is important to use the existing `for` keyword as a strong signal of intent to readers of the code. Although there are a lot of keywords, each one has purpose and meaning to readers of the code.
367+
368+
### Add APIs to iterator instead of sequence
369+
370+
We discussed applying the fundamental API (`map`, `reduce`, etc.) to the `AsyncIterator` protocol instead of `AsyncSequence`. There has been a long-standing (albeit deliberate) ambiguity in the `Sequence` API -- is it supposed to be single-pass or multi-pass? This new kind of iterator & sequence could provide an opportunity to define this more concretely.
371+
372+
While it is tempting to use this new API to right past wrongs, we maintain that the high level goal of consistency with existing Swift concepts is more important.
373+
374+
For example, `for...in` cannot be used on an `Iterator` -- only a `Sequence`. If we chose to make `AsyncIterator` use `for...in` as described here, that leaves us with the choice of either introducing an inconsistency between `AsyncIterator` and `Iterator` or giving up on the familiar `for...in` syntax. Even if we decided to add `for...in` to `Iterator`, it would still be inconsistent because we would be required to leave `for...in` syntax on the existing `Sequence`.
375+
376+
Another point in favor of consistency is that implementing an `AsyncSequence` should feel familiar to anyone who knows how to implement a `Sequence`.
377+
378+
We are hoping for widespread adoption of the protocol in API which would normally have instead used a `Notification`, informational delegate pattern, or multi-callback closure argument. In many of these cases we feel like the API should return the 'factory type' (an `AsyncSequence`) so that it can be iterated again. It will still be up to the caller to be aware of any underlying cost of performing that operation, as with iteration of any `Sequence` today.
379+
380+
### Move-only iterator and removing Cancel
381+
382+
We discussed waiting to introduce this feature until move-only types are available in the future. This is a tradeoff in which we look to the Core Team for advice, but the authors believe the benefit of having this functionality now has the edge. In any case, move-only types will likely bring changes to other `Sequence` and `Iterator` types when they arrive.
383+
384+
Prototyping of the patch does not seem to indicate undue complexity in the compiler implementation. In fact, it appears that the existing ideas around `defer` actually match this concept cleanly.
385+
386+
We have included a `__consuming` attribute on the `cancel` function, which should allow move-only iterators to exist in the future.
