To support `if` statements, `buildEither(first:)`, `buildEither(second:)` and `buildOptional(_:)` are defined with overloads to support up to 10 captures because each capture type needs to be transformed to an optional. The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to shadow other overloads. We expect that a variadic-generic version of this method will eventually supersede all of these overloads.
404
-
405
-
```swift
extension RegexComponentBuilder {
  // The following builder methods implement what would be possible with
  // variadic generics (using imaginary syntax) as a single method:
  //
  //   public static func buildEither<
  //     Component, WholeMatch, Capture...
  //   >(
  //     first component: Component
  //   ) -> Regex<(Substring, Capture...)>
  //     where Component.Output == (WholeMatch, Capture...)
  // ... `O(arity)` overloads of `buildOptional(_:)`
}
```
490
-
491
-
To support `if #available(...)` statements, `buildLimitedAvailability(_:)` is defined with overloads to support up to 10 captures. Similar to `buildOptional`, the overload for non-capturing regexes must be annotated with `@_disfavoredOverload`.
403
+
To support `if #available(...)` statements, `buildLimitedAvailability(_:)` is defined with overloads to support up to 10 captures. The overload for non-capturing regexes, due to the lack of generic constraints, must be annotated with `@_disfavoredOverload` in order not to shadow other overloads. We expect that a variadic-generic version of this method will eventually supersede all of these overloads.
`buildOptional` and `buildEither` are intentionally not supported due to ergonomic issues and fundamental semantic differences between regex conditionals and result builder conditionals. Please refer to the [alternatives considered](#support-buildoptional-and-buildeither) section for detailed rationale.
434
+
521
435
### Alternation
522
436
523
437
Alternations are used to match one of multiple patterns. An alternation wraps its underlying patterns' capture types in an `Optional` and concatenates them together, first to last.
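The optional wrapping described above can be illustrated with a small sketch. It assumes the `RegexBuilder` module as shipped with Swift 5.7 or later; the helper name `captures(in:)` is hypothetical and exists only for this illustration. Each branch of the `ChoiceOf` contributes one capture, so the output is expected to carry two optional slots, of which only the slot for the matched branch is populated.

```swift
import RegexBuilder

// Hypothetical helper: returns the two capture slots produced by a
// two-branch alternation, or nil when the input does not match at all.
func captures(in input: String) -> (Substring?, Substring?)? {
  // Each branch captures once; the alternation wraps both captures in
  // `Optional` and concatenates them, first branch to last.
  let numberOrWord = Regex {
    ChoiceOf {
      Capture { OneOrMore(.digit) }
      Capture { OneOrMore(.word) }
    }
  }
  guard let match = input.wholeMatch(of: numberOrWord) else { return nil }
  return (match.output.1, match.output.2)
}
```

For an input such as `"123"` the first slot holds the capture and the second is `nil`; for `"abc"` it is the other way around.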
@@ -620,99 +534,6 @@ public enum AlternationBuilder {
620
534
// ... `O(arity^2)` overloads of `buildPartialBlock(accumulated:next:)`
621
535
}
622
536
623
-
extension AlternationBuilder {
624
-
// The following builder methods implement what would be possible with
625
-
// variadic generics (using imaginary syntax) as a single method:
// The following builder methods implement what would be possible with
718
539
// variadic generics (using imaginary syntax) as a single method:
@@ -1290,6 +1111,53 @@ Regex { wholeSentence in
1290
1111
}
1291
1112
```
1292
1113
1114
+
### Scoping
1115
+
1116
+
In textual regexes, atomic groups (`(?>...)`) can be used to define a backtracking scope. That is, when the regex engine exits from the scope successfully, it throws away all backtracking positions from the scope. In regex builder, the `Local` type serves this purpose.
1117
+
1118
+
```swift
public struct Local<Output>: RegexComponent {
  public var regex: Regex<Output>

  // The following builder methods implement what would be possible with
  // variadic generics (using imaginary syntax) as a single set of methods:
  //
  //   public init<WholeMatch, Capture..., Component: RegexComponent>(
  // ...
```
For example, the following regex matches string `abcc` but not `abc`.
1147
+
1148
+
```swift
Regex {
  "a"
  Local {
    ChoiceOf {
      "bc"
      "b"
    }
  }
  "c"
}
```
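To make the effect of the backtracking scope concrete, here is a minimal sketch that compares the scoped regex with the same pattern without `Local`. It assumes the `RegexBuilder` module as shipped with Swift 5.7 or later; the function names `matchesScoped(_:)` and `matchesUnscoped(_:)` are hypothetical.

```swift
import RegexBuilder

// With `Local`, once "bc" succeeds the engine discards the alternative "b",
// so it cannot backtrack into the alternation when the trailing "c" fails.
func matchesScoped(_ input: String) -> Bool {
  let scoped = Regex {
    "a"
    Local {
      ChoiceOf {
        "bc"
        "b"
      }
    }
    "c"
  }
  return input.wholeMatch(of: scoped) != nil
}

// Without `Local`, the engine may backtrack and retry the "b" branch.
func matchesUnscoped(_ input: String) -> Bool {
  let unscoped = Regex {
    "a"
    ChoiceOf {
      "bc"
      "b"
    }
    "c"
  }
  return input.wholeMatch(of: unscoped) != nil
}
```

Under these assumptions, both variants accept `abcc`, but only the unscoped variant accepts `abc`, because it is free to retry the alternation with `"b"`.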
1160
+
1293
1161
## Source compatibility
1294
1162
1295
1163
Regex builder will be shipped in a new module named `RegexBuilder`, and thus will not affect the source compatibility of the existing code.
@@ -1306,7 +1174,7 @@ The proposed feature relies heavily upon overloads of `buildBlock` and `buildPar
1306
1174
1307
1175
### Operators for quantification and alternation
1308
1176
1309
-
While `ChoiceOf` and quantifier functions provide a general way of creating alternations and quantifications, we recognize that some syntactic sugar can be useful for creating one-liners like in textual regexes, e.g. infix operator `|`, postfix operator `*`, etc.
1177
+
While `ChoiceOf` and quantifier types provide a general way of creating alternations and quantifications, we recognize that some syntactic sugar can be useful for creating one-liners like in textual regexes, e.g. infix operator `|`, postfix operator `*`, etc.
1310
1178
1311
1179
```swift
1312
1180
// The following functions implement what would be possible with variadic
@@ -1441,6 +1309,83 @@ One could argue that type such as `OneOrMore<Output>` could be defined as a top-
1441
1309
1442
1310
Another reason to use types instead of free functions is consistency with existing result-builder-based DSLs such as SwiftUI.
1443
1311
1312
+
### Support `buildOptional` and `buildEither`
1313
+
1314
+
To support `if` statements, an earlier iteration of this proposal defined `buildEither(first:)`, `buildEither(second:)` and `buildOptional(_:)` as follows:
1315
+
1316
+
```swift
extension RegexComponentBuilder {
  public static func buildEither<
    Component, WholeMatch, Capture...
  >(
    first component: Component
  ) -> Regex<(Substring, Capture...)>
    where Component.Output == (WholeMatch, Capture...)

  public static func buildEither<
    Component, WholeMatch, Capture...
  >(
    second component: Component
  ) -> Regex<(Substring, Capture...)>
    where Component.Output == (WholeMatch, Capture...)

  public static func buildOptional<
    Component, WholeMatch, Capture...
  >(
    _ component: Component?
  ) where Component.Output == (WholeMatch, Capture...)
}
```
1339
+
1340
+
However, multiple-branch control flow statements (e.g. `if`-`else` and `switch`) would be required to produce either the same regex type, which is limiting, or an "either-like" type, which can be difficult to work with when nested. Unlike `ChoiceOf`, producing a tuple of optionals is not an option, because the branch taken would be decided when the builder closure is executed, and it would cause capture numbering to be inconsistent with conventional regex.
1341
+
1342
+
Moreover, result builder conditionals do not work the same way as regex conditionals. In regex conditionals, the conditions are themselves regexes and are evaluated by the regex engine during matching, whereas result builder conditionals are evaluated as part of the builder closure. We hope that a future result builder feature will support "lifting" control flow conditions into the DSL domain, e.g. supporting `Regex<Bool>` as a condition.
1343
+
1344
+
### Flatten optionals
1345
+
1346
+
With the proposed design, `ChoiceOf` with `AlternationBuilder` wraps every component's capture type with an `Optional`. This means that any `ChoiceOf` with optional-capturing components would lead to doubly-nested optional captures. This could make the result of matching harder to use.
One way to improve this could be overloading quantifier initializers (e.g. `ZeroOrMore.init(_:)`) and `AlternationBuilder.buildPartialBlock` to flatten any optionals upon composition. However, this would be non-trivial. Quantifier initializers would need to be overloaded `O(2^arity)` times to account for all possible positions of `Optional` that may appear in the `Output` tuple. Even worse, `AlternationBuilder.buildPartialBlock` would need to be overloaded `O(arity!)` times to account for all possible combinations of two `Output` tuples with all possible positions of `Optional` that may appear in one of the `Output` tuples.
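The doubly-nested optional described above can be sketched as follows. This assumes the `RegexBuilder` module as shipped with Swift 5.7 or later, and the helper name `flattenedFirstCapture(in:)` is hypothetical. The first branch already has an optional capture (from `Optionally`), and `ChoiceOf` is expected to wrap it once more, so callers have to flatten the value by hand.

```swift
import RegexBuilder

// Hypothetical helper: returns the first capture with both optional layers
// collapsed into one, or nil when the input does not match.
func flattenedFirstCapture(in input: String) -> Substring? {
  let regex = Regex {
    ChoiceOf {
      // `Optionally` makes this capture `Substring?`; the enclosing
      // alternation wraps it again, giving `Substring??`.
      Optionally { Capture { "a" } }
      Capture { "b" }
    }
  }
  guard let match = input.wholeMatch(of: regex) else { return nil }
  let nested: Substring?? = match.output.1
  // Manual flattening: collapse the two optional layers into one.
  return nested.flatMap { $0 }
}
```

The explicit `flatMap` is the kind of boilerplate that automatic flattening would remove, at the overload cost discussed above.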
1359
+
1360
+
### Structured rather than flat captures
1361
+
1362
+
We propose inferring capture types in such a way as to align with the traditional numbering of backreferences. This is because much of the motivation behind providing regex literals in Swift is their familiarity.
1363
+
1364
+
If we decided to deprioritize this motivation, there are opportunities to infer safer, more ergonomic, and arguably more intuitive types for captures. For example, to be consistent with traditional regex backreferences, quantifications of multiple or nested captures had to produce parallel arrays rather than an array of tuples.
Similarly, an alternation of multiple or nested captures could produce a structured alternation type (or an anonymous sum type) rather than flat optionals.
1385
+
1386
+
This is cool, but it adds extra complexity to regex builder and it isn't as clear because the generic type no longer aligns with the traditional regex backreference numbering. We think the consistency of the flat capture types trumps the added safety and ergonomics of the structured capture types.