Commit 0b54163

[Proposal] AttributedString UTF-8 and UTF-16 Views (#1067)
* [Proposal] AttributedString UTF-8 and UTF-16 Views
* Remove discontiguous slices
* Add default implementations for new AttributedStringProtocol requirements
* Apply suggestions from code review

Co-authored-by: Tina Liu <[email protected]>
1 parent e25a8aa commit 0b54163

1 file changed: +94 -0 lines changed

@@ -0,0 +1,94 @@
# `AttributedString` UTF-8 and UTF-16 Views

* Proposal: [SF-0012](0012-attributedstring-utf8-utf16-views.md)
* Authors: [Jeremy Schonfeld](https://github.com/jmschonfeld)
* Review Manager: [Tina Liu](https://github.com/itingliu)
* Status: **Accepted**
* Implementation: [swiftlang/swift-foundation#1066](https://github.com/swiftlang/swift-foundation/pull/1066)

## Introduction/Motivation

In macOS 12-aligned releases, Foundation added the `AttributedString` type as a new API representing rich/attributed text. `AttributedString` itself is not a collection, but rather a type that offers various views into its contents, where each view is a `Collection` over a different type of element. Today, `AttributedString` offers three views: the character view (`.characters`), which provides a collection of grapheme clusters using the `Character` element type; the Unicode scalar view (`.unicodeScalars`), which provides a collection of `Unicode.Scalar`s; and the attribute runs view (`.runs`), which provides a collection of the attribute runs present across the text using the `AttributedString.Runs.Run` element type. These three views form the critical APIs required to interact with an `AttributedString` via its text (either at the visual, grapheme cluster level or the underlying scalar level) and its runs. However, more advanced use cases require other ways to view an `AttributedString`'s text.

When working with the text content of an `AttributedString`, it is sometimes necessary to view not only the characters or Unicode scalars, but also the underlying UTF-8 or UTF-16 code units that make up that text. This is especially useful when interoperating with other types that use UTF-8 or UTF-16 code units as their currency types (for example, `NSAttributedString` and `NSString`, which use UTF-16 offsets and UTF-16 code units as their index and element types). Today, `String` itself has UTF-8 and UTF-16 views that can be used to perform these encoding-specific operations, but `AttributedString` offers no equivalent. This proposal seeks to remedy this by adding equivalent UTF-8 and UTF-16 views to `AttributedString`, offering easy access to the encoded forms of the text.
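
To illustrate why the choice of view matters, the following minimal sketch uses the existing `String` views (not the proposed `AttributedString` ones) to show that the same text has a different length in each view. The sample text is an arbitrary choice for illustration.

```swift
// "cafe" + combining acute accent, a space, and a thumbs-up emoji.
let text = "cafe\u{301} \u{1F44D}"

// Grapheme clusters: "e" + U+0301 form a single Character.
let characterCount = text.count
// Unicode scalars: the combining accent is counted separately.
let scalarCount = text.unicodeScalars.count
// UTF-8 code units: U+0301 takes 2 bytes, the emoji takes 4.
let utf8Count = text.utf8.count
// UTF-16 code units: the emoji is encoded as a surrogate pair.
let utf16Count = text.utf16.count

print(characterCount, scalarCount, utf8Count, utf16Count)
```

An offset computed in one view is therefore generally not valid in another, which is why APIs that speak UTF-16 (such as `NSString`) need a UTF-16 view on the Swift side.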

## Proposed solution

Just like `String`, `AttributedString` will offer new, immutable UTF-8 and UTF-16 code unit views via the `.utf8` and `.utf16` properties. Developers will be able to use these new views as in the following example:

```swift
import Foundation

let attrStr = AttributedString("Hello, world")

// Iterate over the UTF-8 code units
for codeUnit in attrStr.utf8 {
    print(codeUnit)
}

// Determine the UTF-8 offset of a particular index
// (here, an example index five characters into the text)
let someOtherIndex = attrStr.characters.index(attrStr.startIndex, offsetBy: 5)
let offset = attrStr.utf8.distance(from: attrStr.startIndex, to: someOtherIndex)
```
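
As a further sketch of the `NSString` interoperability case from the motivation, the proposed `utf16` view can be used to turn a `Range<AttributedString.Index>` into UTF-16 offsets. The helper `utf16Range(of:in:)` below is hypothetical and not part of the proposal, and the code assumes a toolchain where the `FoundationPreview 6.2` APIs are available (availability checks are omitted).

```swift
import Foundation

// Hypothetical helper (not part of the proposal): locates `substring`
// in `text` and expresses its range in UTF-16 offsets.
func utf16Range(of substring: String, in text: AttributedString) -> NSRange? {
    guard let range = text.range(of: substring) else { return nil }
    let utf16 = text.utf16
    // Distances measured in the UTF-16 view are counts of UTF-16 code units.
    let location = utf16.distance(from: utf16.startIndex, to: range.lowerBound)
    let length = utf16.distance(from: range.lowerBound, to: range.upperBound)
    return NSRange(location: location, length: length)
}

// The emoji occupies two UTF-16 code units, so the UTF-16 offset of
// "world" is larger than its character offset.
let sample = AttributedString("Hello \u{1F44D} world")
if let nsRange = utf16Range(of: "world", in: sample) {
    print(nsRange)
}
```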

## Detailed design

We propose adding the following API surface:

```swift
@available(FoundationPreview 6.2, *)
extension AttributedString {
    public struct UTF8View : BidirectionalCollection, CustomStringConvertible, Sendable {
        public typealias Element = UTF8.CodeUnit
        public typealias Index = AttributedString.Index
        public typealias SubSequence = AttributedString.UTF8View
    }

    public struct UTF16View : BidirectionalCollection, CustomStringConvertible, Sendable {
        public typealias Element = UTF16.CodeUnit
        public typealias Index = AttributedString.Index
        public typealias SubSequence = AttributedString.UTF16View
    }

    public var utf8: UTF8View { get }
    public var utf16: UTF16View { get }
}

@available(macOS 12, iOS 15, tvOS 15, watchOS 8, *)
protocol AttributedStringProtocol {
    // ...

    @available(FoundationPreview 6.2, *)
    var utf8: AttributedString.UTF8View { get }
    @available(FoundationPreview 6.2, *)
    var utf16: AttributedString.UTF16View { get }
}

@available(FoundationPreview 6.2, *)
extension AttributedStringProtocol {
    public var utf8: AttributedString.UTF8View { get }
    public var utf16: AttributedString.UTF16View { get }
}

@available(FoundationPreview 6.2, *)
extension AttributedSubstring {
    public var utf8: AttributedString.UTF8View { get }
    public var utf16: AttributedString.UTF16View { get }
}
```

_Note: omitted here for brevity, `AttributedString.UTF8View` and `AttributedString.UTF16View` will also implement all relevant optional protocol requirements of `BidirectionalCollection` to ensure efficient operations over the underlying storage. Because both views are immutable, `RangeReplaceableCollection` is not among the conformances listed above._
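
Because `utf8` and `utf16` are requirements of `AttributedStringProtocol`, generic code can use them on both `AttributedString` and `AttributedSubstring`. The sketch below assumes a toolchain where the `FoundationPreview 6.2` APIs are available; `utf8Length(of:)` is a hypothetical helper, not part of the proposal.

```swift
import Foundation

// Hypothetical helper: works for any AttributedStringProtocol conformer.
func utf8Length<S: AttributedStringProtocol>(of text: S) -> Int {
    text.utf8.count
}

// "é" is written as the precomposed scalar U+00E9 (2 bytes in UTF-8).
let whole = AttributedString("H\u{E9}llo")

// Slice the first two characters to obtain an AttributedSubstring.
let end = whole.characters.index(whole.startIndex, offsetBy: 2)
let firstTwo = whole[whole.startIndex ..< end]

print(utf8Length(of: whole))
print(utf8Length(of: firstTwo))
```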

## Source compatibility

All of these changes are additive and have no impact on source compatibility. The requirements added to `AttributedStringProtocol` come with default implementations, so they are neither API-breaking nor ABI-breaking changes.
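
The general mechanism can be sketched with a toy protocol. All names here (`TextContainer`, `LegacyContainer`, `utf8Count`) are hypothetical and unrelated to Foundation's actual implementation; the point is only that a conformer written before a requirement was added keeps compiling when the requirement has a default implementation.

```swift
// Hypothetical protocol standing in for AttributedStringProtocol.
protocol TextContainer {
    var text: String { get }
    // Requirement added in a later version of the protocol.
    var utf8Count: Int { get }
}

// The default implementation satisfies the new requirement
// for every existing conformer.
extension TextContainer {
    var utf8Count: Int { text.utf8.count }
}

// A conformer written before `utf8Count` existed: it does not
// mention the new requirement, yet it still conforms.
struct LegacyContainer: TextContainer {
    let text: String
}

let legacy = LegacyContainer(text: "h\u{E9}llo")
print(legacy.utf8Count)
```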

## Implications on adoption

These new views will be annotated with `FoundationPreview 6.2` availability. On platforms where availability is relevant, these APIs may only be used on versions where these new views are present.

## Future directions

No future directions are considered at this time.

## Alternatives considered

No alternatives are considered at this time.
