- Motivation for adding algorithms
- Motivation for `CustomRegexComponent`
- Design for added algorithms
- Design for `CustomRegexComponent`
- Add a few doc comments
Documentation/Evolution/StringProcessingAlgorithms.md
87 additions & 76 deletions (+87, -76)
@@ -6,37 +6,12 @@ The standard library is currently missing a large number of `String` algorithms
6
6
7
7
## Motivation
8
8
9
-
TODO
9
+
TODO: Motivation for adding both generic `<R: RegexProtocol>` and non-generic algorithm functions.
10
10
11
-
## Proposed solution
12
-
13
-
We introduce internal infrastructure that allows groups of `Collection` algorithms that perform the same operations on different types to share their implementation, leading to a more coherent set of public APIs. This allows us to more easily provide algorithms that work with `RegexProtocol` values, such as
We also introduce the `CustomRegexComponent` protocol that conveniently lets types from outside the standard library participate in regex builders and `RegexProtocol` algorithms:
12
+
### Use custom parsers in regex builders and `RegexProtocol` algorithms
/// Match the input string within the specified bounds, beginning at the given index, and return
26
-
/// the end position (upper bound) of the match and the matched instance.
27
-
/// - Parameters:
28
-
/// - input: The string in which the match is performed.
29
-
/// - index: An index of `input` at which to begin matching.
30
-
/// - bounds: The bounds in `input` in which the match is performed.
31
-
/// - Returns: The upper bound where the match terminates and a matched instance, or nil if
32
-
/// there isn't a match.
33
-
func match(
34
-
_ input: String,
35
-
startingAt index: String.Index,
36
-
in bounds: Range<String.Index>
37
-
) -> (upperBound: String.Index, match: Match)?
38
-
}
39
-
```
14
+
It would be handy if you could use types from outside the standard library in regex builders and `RegexProtocol` algorithms.
40
15
41
16
Consider parsing an HTTP header to capture the date field as a `Date` type:
42
17
@@ -52,6 +27,7 @@ Content-Language: en
52
27
You are likely going to match a substring that looks like a date string (`16 Feb 2022`), and parse the substring as a `Date` with one of Foundation's date parsers:
53
28
54
29
```swift
30
+
let dateParser = Date.ParseStrategy(format: "\(day: .twoDigits) \(month: .abbreviated) \(year: .padded(4))")
@@ -82,32 +58,9 @@ if let match = header.firstMatch(of: regex) {
82
58
}
83
59
```
84
60
85
-
You can do this because Foundation framework's `Date.ParseStrategy` conforms to `CustomRegexComponent`, defined above. You can also conform your custom parser to `CustomRegexComponent`. Conformance is simple: implement the `match` function to return the upper bound of the matched substring, and the type represented by the matched range. It inherits from `RegexProtocol`, so you will be able to use it with all of the string algorithms that take a `RegexProtocol` type.
86
-
87
-
Foundation framework's `Date.ParseStrategy` conforms to `CustomRegexComponent` this way. It also adds a static function `date(format:timeZone:locale:)` as a member of `RegexProtocol`, so you can refer to it as `.date(format:...)` in the `Regex` result builder.
Here's another example of how you can use `FloatingPointFormatStyle<Double>.Currency` to parse a bank statement and record all the monetary values:
61
+
Here's another example of how you can use `Foundation.FloatingPointFormatStyle<Double>.Currency` to parse a bank statement and record all the monetary values:
Parsing a currency string such as `$3,020.85` with regex isn't trivial -- it can contain grouping separators, a decimal separator, and a currency symbol, all of which can be localized. Delegating the parsing of such strings to a dedicated currency parser means you don't have to handle these cases yourself.
131
82
132
-
### `CustomRegexComponent` protocol
83
+
In the second part of the pitch, we introduce the `CustomRegexComponent` protocol that conveniently lets types from outside the standard library participate in regex builders and `RegexProtocol` algorithms.
133
84
134
-
The `CustomRegexComponent` protocol inherits from `RegexProtocol` and satisfies its sole requirement. This enables the usage of types that conform to `CustomRegexComponent` in regex builders and `RegexProtocol` algorithms.
85
+
## Proposed solution
86
+
87
+
We introduce internal infrastructure that allows groups of `Collection` algorithms that perform the same operations on different types to share their implementation, leading to a more coherent set of public APIs. This allows us to more easily provide algorithms that work with `RegexProtocol` values, such as
We also introduce the `CustomRegexComponent` protocol that conveniently lets types from outside the standard library participate in regex builders and `RegexProtocol` algorithms.
96
+
97
+
98
+
## Detailed design
99
+
154
100
### Algorithms
155
101
156
102
The following algorithms are included in this pitch:
@@ -159,11 +105,17 @@ The following algorithms are included in this pitch:
159
105
160
106
```swift
161
107
extension Collection where Element: Equatable {
108
+
/// Returns a Boolean value indicating whether the collection contains the given sequence.
109
+
/// - Parameter other: A sequence to search for within this collection.
110
+
/// - Returns: `true` if the collection contains the specified sequence, otherwise `false`.
/// Returns a new collection of the same type by removing initial elements that satisfy the given predicate from the start.
139
+
/// - Parameter predicate: A closure that takes an element of the sequence as its argument and returns a Boolean value indicating whether the element should be removed from the collection.
140
+
/// - Returns: A collection containing the elements of the receiver that are not removed by `predicate`.
/// Removes the initial elements that satisfy the given predicate from the start of the sequence.
146
+
/// - Parameter predicate: A closure that takes an element of the sequence as its argument and returns a Boolean value indicating whether the element should be removed from the collection.
/// Removes the initial elements that satisfy the given predicate from the start of the sequence.
152
+
/// - Parameter predicate: A closure that takes an element of the sequence as its argument and returns a Boolean value indicating whether the element should be removed from the collection.
The `CustomRegexComponent` protocol inherits from `RegexProtocol` and satisfies its sole requirement. This enables the usage of types that conform to `CustomRegexComponent` in regex builders and `RegexProtocol` algorithms.
/// Match the input string within the specified bounds, beginning at the given index, and return
314
+
/// the end position (upper bound) of the match and the matched instance.
315
+
/// - Parameters:
316
+
/// - input: The string in which the match is performed.
317
+
/// - index: An index of `input` at which to begin matching.
318
+
/// - bounds: The bounds in `input` in which the match is performed.
319
+
/// - Returns: The upper bound where the match terminates and a matched instance, or nil if
320
+
/// there isn't a match.
321
+
func match(
322
+
_ input: String,
323
+
startingAt index: String.Index,
324
+
in bounds: Range<String.Index>
325
+
) -> (upperBound: String.Index, match: Match)?
326
+
}
327
+
```
328
+
329
+
You can conform your custom parser to `CustomRegexComponent`. Conformance is simple: implement the `match` function to return the upper bound of the matched substring and the value that the matched range represents. The protocol inherits from `RegexProtocol`, so you will be able to use a conforming type with all of the string algorithms that take a `RegexProtocol` type.
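As a rough illustration of what such a conformance might look like, here is a minimal sketch. It assumes a local stand-in protocol, `CustomRegexComponentSketch`, that copies only the `match` requirement shown above and omits the `RegexProtocol` inheritance, so that the example compiles on its own. `DigitsParser` is a hypothetical parser invented for this sketch, not part of the pitch.

```swift
// Stand-in for the pitched protocol (assumption: the real `CustomRegexComponent`
// also inherits from `RegexProtocol`, which is left out here).
protocol CustomRegexComponentSketch {
    associatedtype Match
    func match(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) -> (upperBound: String.Index, match: Match)?
}

// Hypothetical parser: consumes a run of ASCII digits and produces an `Int`.
struct DigitsParser: CustomRegexComponentSketch {
    func match(
        _ input: String,
        startingAt index: String.Index,
        in bounds: Range<String.Index>
    ) -> (upperBound: String.Index, match: Int)? {
        guard index >= bounds.lowerBound else { return nil }
        var end = index
        // Advance while we stay inside `bounds` and keep seeing ASCII digits.
        while end < bounds.upperBound, input[end].isASCII, input[end].isNumber {
            end = input.index(after: end)
        }
        // No digits consumed, or the value overflows `Int`: report no match.
        guard end > index, let value = Int(input[index..<end]) else { return nil }
        return (upperBound: end, match: value)
    }
}

let line = "Total: 2048 bytes"
let start = line.firstIndex(of: "2")!
let result = DigitsParser().match(
    line, startingAt: start, in: line.startIndex..<line.endIndex)
// `result?.match` holds the parsed value; `result?.upperBound` is where matching stopped.
```

The returned `upperBound` tells the caller where to resume matching, which is what would let a type like this compose with the rest of a regex.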
330
+
331
+
Here, we use Foundation framework's `FloatingPointFormatStyle<Double>.Currency` as an example. `FloatingPointFormatStyle<Double>.Currency` would conform to `CustomRegexComponent` by implementing the `match` function with `Match` being a `Double`. It could also add a static function `.localizedCurrency(code:)` as a member of `RegexProtocol`, so you can refer to it as `.localizedCurrency(code:)` in the `Regex` result builder.
Users could specify a pattern to match a localized currency amount such as `"$3,020.85"` simply with the following, and use it in any of the string matching algorithms introduced above.