Commit 604678e

Amend SE-0351 and SE-0350 (#1612)
SE-0351:
- Add future direction about conversion to textual regex.
- Move recursive subpatterns to future directions.
- Clarify `Regex.Match.subscript(_:)` precondition.

SE-0350:
- Rename `firstMatch(_: Substring)` to `firstMatch(in: Substring)` to be consistent with the `String` variant.
- Specify accessors on properties to clarify mutability.
1 parent 34ce4fc commit 604678e

2 files changed: 75 additions & 50 deletions

proposals/0350-regex-type-overview.md

Lines changed: 5 additions & 5 deletions
@@ -339,7 +339,7 @@ public struct Regex<Output> {
 /// Find the first match in a substring
 ///
 /// Returns `nil` if no match is found and throws on abort
-public func firstMatch(_ s: Substring) throws -> Regex<Output>.Match?
+public func firstMatch(in s: Substring) throws -> Regex<Output>.Match?

 /// The result of matching a regex against a string.
 ///
@@ -348,19 +348,19 @@ public struct Regex<Output> {
 @dynamicMemberLookup
 public struct Match {
 /// The range of the overall match
-public let range: Range<String.Index>
+public var range: Range<String.Index> { get }

 /// The produced output from the match operation
-public var output: Output
+public var output: Output { get }

 /// Lookup a capture by name or number
-public subscript<T>(dynamicMember keyPath: KeyPath<Output, T>) -> T
+public subscript<T>(dynamicMember keyPath: KeyPath<Output, T>) -> T { get }

 /// Lookup a capture by number
 @_disfavoredOverload
 public subscript(
 dynamicMember keyPath: KeyPath<(Output, _doNotUse: ()), Output>
-) -> Output
+) -> Output { get }
 // Note: this allows `.0` when `Match` is not a tuple.

 }
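
As a rough illustration of the renamed `firstMatch(in:)` and the read-only `Match` accessors, the sketch below matches against a `Substring`. It assumes Swift 5.7 or later; the pattern and the names `line`, `field`, and `pairRegex` are invented for this example and do not come from the proposal.

```swift
// Illustrative sketch only: `pairRegex`, `line`, and `field` are made-up names.
// The extended-delimiter literal avoids needing the bare-slash regex flag.
let pairRegex = #/([a-z]+)=([a-z]+)/#

let line = "key=value; other=thing"
// A Substring covering the first field; it shares indices with `line`.
let field: Substring = line.prefix(while: { $0 != ";" })

// `firstMatch(in:)` now uses the `in:` label for Substring as well as String.
// It returns nil when nothing matches and throws if matching is aborted.
let match = try pairRegex.firstMatch(in: field)

if let found = match {
    // `range` and `output` are get-only, so a match cannot be mutated.
    print(line[found.range])   // key=value
    // Dynamic member lookup forwards `.1` and `.2` to the output tuple.
    print(found.1, found.2)    // key value
}
```

Because `range`, `output`, and the subscripts are declared with `{ get }`, assigning to any of them would be rejected at compile time.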

proposals/0351-regex-builder.md

Lines changed: 70 additions & 45 deletions
@@ -1008,12 +1008,16 @@ let regex = Regex {
 Variants of `Capture` and `TryCapture` accept a `Reference` argument. References can be used to achieve named captures and named backreferences from textual regexes.

 ```swift
+/// A reference to a regex capture.
 public struct Reference<Capture>: RegexComponent {
 public init(_ captureType: Capture.Type = Capture.self)
 public var regex: Regex<Capture>
 }

 extension Regex.Match {
+/// Returns the capture referenced by the given reference.
+///
+/// - Precondition: The reference must have been captured in the regex that produced this match.
 public subscript<Capture>(_ reference: Reference<Capture>) -> Capture { get }
 }
 ```
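
The precondition documented above can be illustrated with a small sketch. It assumes Swift 5.7 or later with the `RegexBuilder` module available; the names `year`, `dated`, and `input` are hypothetical and chosen only for this example.

```swift
import RegexBuilder

// Illustrative sketch only: `year`, `dated`, and `input` are made-up names.
let year = Reference(Substring.self)

// `year` is captured inside `dated`, so it is safe to look up on its matches.
let dated = Regex {
    "year: "
    Capture(as: year) {
        OneOrMore(.digit)
    }
}

let input = "year: 2022"
if let match = input.firstMatch(of: dated) {
    // Satisfies the precondition: `dated` produced this match and captured `year`.
    print(match[year])   // 2022
}
```

Subscripting a match with a reference that its regex never captured would violate the precondition, so the example only ever uses `year` with matches produced by `dated`.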
@@ -1036,7 +1040,7 @@ if let result = input.firstMatch(of: regex) {
 }
 ```

-A regex is considered invalid when it contains a use of reference without it ever being captured in the regex. When this occurs in the regex builder DSL, an runtime error will be reported.
+A regex is considered invalid when it contains a use of reference without it ever being captured in the regex. When this occurs in the regex builder DSL, a runtime error will be reported. Similarly, a reference passed to `Regex.Match.subscript(_:)` must have been captured in the regex that produced the match.

 ### Subpattern

@@ -1056,54 +1060,21 @@ With regex builder, there is no special API required to reuse existing subpatter

 ```swift
 Regex {
-let subject = ChoiceOf {
-"I"
-"you"
-}
-let object = ChoiceOf {
-"goodbye"
-"hello"
-}
-subject
-"say"
-object
-";"
-subject
-"say"
-object
-}
-```
-
-Sometimes, a textual regex may also use `(?R)` or `(?0)` to recursively evaluate the entire regex. For example, the following textual regex matches "I say you say I say you say hello".
-
-```
-(you|I) say (goodbye|hello|(?R))
-```
-
-For this, `Regex` offers a special initializer that allows its pattern to recursively reference itself. This is somewhat akin to a fixed-point combinator.
-
-```swift
-extension Regex {
-public init<R: RegexComponent>(
-@RegexComponentBuilder _ content: (Regex<Substring>) -> R
-) where R.Output == Match
-}
-```
-
-With this initializer, the above regex can be expressed as the following using regex builder.
-
-```swift
-Regex { wholeSentence in
-ChoiceOf {
-"I"
-"you"
+let subject = ChoiceOf {
+"I"
+"you"
 }
-"say"
-ChoiceOf {
+let object = ChoiceOf {
 "goodbye"
 "hello"
-wholeSentence
 }
+subject
+"say"
+object
+";"
+subject
+"say"
+object
 }
 ```

@@ -1166,6 +1137,59 @@ The proposed feature does not change the ABI of existing features.

 The proposed feature relies heavily upon overloads of `buildBlock` and `buildPartialBlock(accumulated:next:)` to work for different capture arities. In the fullness of time, we are hoping for variadic generics to supersede existing overloads. Such a change should not involve ABI-breaking modifications as it is merely a change of overload resolution.

+## Future directions
+
+### Conversion to textual regex
+
+Sometimes it may be useful to convert a regex created using regex builder to textual regex. This may be achieved in the future by extending `RegexComponent` with a method.
+
+```swift
+extension RegexComponent {
+public func makeTextualRegex() -> String?
+}
+```
+
+It is worth noting that the internal representation of a `Regex` is _not_ textual regex, but an efficient pattern matching bytecode compiled from an abstract syntax tree. Moreover, not every `Regex` can be converted to textual regex. Regex builder supports arbitrary types that conform to the `RegexComponent` protocol, including `CustomMatchingRegexComponent` (pitched in [String Processing Algorithms]) which can be implemented with arbitrary code. If a `Regex` contains a `CustomMatchingRegexComponent`, it cannot be converted to textual regex.
+
+### Recursive subpatterns
+
+Sometimes, a textual regex may also use `(?R)` or `(?0)` to recursively evaluate the entire regex. For example, the following textual regex matches "I say you say I say you say hello".
+
+```
+(you|I) say (goodbye|hello|(?R))
+```
+
+For this, `Regex` offers a special initializer that allows its pattern to recursively reference itself. This is somewhat akin to a fixed-point combinator.
+
+```swift
+extension Regex {
+public init<R: RegexComponent>(
+@RegexComponentBuilder _ content: (Regex<Substring>) -> R
+) where R.Output == Match
+}
+```
+
+With this initializer, the above regex can be expressed as the following using regex builder.
+
+```swift
+Regex { wholeSentence in
+ChoiceOf {
+"I"
+"you"
+}
+"say"
+ChoiceOf {
+"goodbye"
+"hello"
+wholeSentence
+}
+}
+```
+
+There are some concerns with this design which we need to consider:
+- Due to the lack of labeling, the argument to the builder closure can be arbitrarily named and cause confusion.
+- When there is an initializer that accepts a result builder closure, overloading that initializer with the same argument labels could lead to bad error messages upon interior type errors.
+
 ## Alternatives considered

 ### Operators for quantification and alternation
@@ -1385,3 +1409,4 @@ This is cool, but it adds extra complexity to regex builder and it isn't as clea
 [Declarative String Processing]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/DeclarativeStringProcessing.md
 [Strongly Typed Regex Captures]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/StronglyTypedCaptures.md
 [Regex Syntax]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/RegexSyntax.md
+[String Processing Algorithms]: https://github.com/apple/swift-experimental-string-processing/blob/main/Documentation/Evolution/StringProcessingAlgorithms.md
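
The subpattern hunk in `proposals/0351-regex-builder.md` reuses `subject` and `object` as ordinary values. A minimal sketch of that idea follows, assuming Swift 5.7 or later with `RegexBuilder`. Here the subpatterns are declared outside the builder closure, the literals include spaces so that a natural sentence matches, and `sentence`, `text`, and `matched` are names made up for this example, so it is a variation on the proposal's snippet and not a copy of it.

```swift
import RegexBuilder

// Illustrative sketch only: a variation on the proposal's subpattern example.
let subject = ChoiceOf {
    "I"
    "you"
}
let object = ChoiceOf {
    "goodbye"
    "hello"
}

// Each subpattern is a plain value, so it can appear more than once.
let sentence = Regex {
    subject
    " say "
    object
    "; "
    subject
    " say "
    object
}

let text = "I say hello; you say goodbye"
let matched = text.wholeMatch(of: sentence) != nil
print(matched)   // true
```

No dedicated subpattern API is involved; reuse falls out of regex components being regular Swift values.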
