Commit a1ae820

Merge branch 'main' into algorithms_updates
2 parents 709659b + 6fab471 commit a1ae820

10 files changed, +165 -106 lines changed


Documentation/Evolution/ProposalOverview.md

Lines changed: 1 addition & 1 deletion
@@ -39,7 +39,7 @@ Covers the "interior" syntax, extended syntaxes, run-time construction of a rege

 Proposes a slew of Regex-powered algorithms.

-Introduces `CustomPrefixMatchRegexComponent`, which is a monadic-parser style interface for external parsers to be used as components of a regex.
+Introduces `CustomConsumingRegexComponent`, which is a monadic-parser style interface for external parsers to be used as components of a regex.

 ## Unicode for String Processing

Documentation/Evolution/RegexTypeOverview.md

Lines changed: 4 additions & 4 deletions
@@ -231,7 +231,7 @@ The result builder allows for inline failable value construction, which particip

 Swift regexes describe an unambiguous algorithm, where choice is ordered and effects can be reliably observed. For example, a `print()` statement inside the `TryCapture`'s transform function will run whenever the overall algorithm naturally dictates an attempt should be made. Optimizations can only elide such calls if they can prove it is behavior-preserving (e.g. "pure").

-`CustomPrefixMatchRegexComponent`, discussed in [String Processing Algorithms][pitches], allows industrial-strength parsers to be used a regex components. This allows us to drop the overly-permissive pre-parsing step:
+`CustomConsumingRegexComponent`, discussed in [String Processing Algorithms][pitches], allows industrial-strength parsers to be used a regex components. This allows us to drop the overly-permissive pre-parsing step:

 ```swift
 func processEntry(_ line: String) -> Transaction? {
@@ -431,7 +431,7 @@ Regular expressions have a deservedly mixed reputation, owing to their historica

 * "Regular expressions are bad because you should use a real parser"
   - In other systems, you're either in or you're out, leading to a gravitational pull to stay in when... you should get out
-  - Our remedy is interoperability with real parsers via `CustomPrefixMatchRegexComponent`
+  - Our remedy is interoperability with real parsers via `CustomConsumingRegexComponent`
   - Literals with refactoring actions provide an incremental off-ramp from regex syntax to result builders and real parsers
 * "Regular expressions are bad because ugly unmaintainable syntax"
   - We propose literals with source tools support, allowing for better syntax highlighting and analysis
@@ -516,7 +516,7 @@ Regex are compiled into an intermediary representation and fairly simple analysi

 ### Future work: parser combinators

-What we propose here is an incremental step towards better parsing support in Swift using parser-combinator style libraries. The underlying execution engine supports recursive function calls and mechanisms for library extensibility. `CustomPrefixMatchRegexComponent`'s protocol requirement is effectively a [monadic parser](https://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf), meaning `Regex` provides a regex-flavored combinator-like system.
+What we propose here is an incremental step towards better parsing support in Swift using parser-combinator style libraries. The underlying execution engine supports recursive function calls and mechanisms for library extensibility. `CustomConsumingRegexComponent`'s protocol requirement is effectively a [monadic parser](https://homepages.inf.ed.ac.uk/wadler/papers/marktoberdorf/baastad.pdf), meaning `Regex` provides a regex-flavored combinator-like system.

 An issues with traditional parser combinator libraries are the compilation barriers between call-site and definition, resulting in excessive and overly-cautious backtracking traffic. These can be eliminated through better [compilation techniques](https://core.ac.uk/download/pdf/148008325.pdf). As mentioned above, Swift's support for custom static compilation is still under development.

@@ -565,7 +565,7 @@ Regexes are often used for tokenization and tokens can be represented with Swift

 ### Future work: baked-in localized processing

-- `CustomPrefixMatchRegexComponent` gives an entry point for localized processors
+- `CustomConsumingRegexComponent` gives an entry point for localized processors
 - Future work includes (sub?)protocols to communicate localization intent

 -->

Documentation/Evolution/StringProcessingAlgorithms.md

Lines changed: 99 additions & 61 deletions
Large diffs are not rendered by default.

Sources/VariadicsGenerator/VariadicsGenerator.swift

Lines changed: 28 additions & 23 deletions
@@ -101,8 +101,17 @@ let defaultAvailableAttr = "@available(SwiftStdlib 5.7, *)"
 @main
 struct VariadicsGenerator: ParsableCommand {
   @Option(help: "The maximum arity of declarations to generate.")
-  var maxArity: Int
+  var maxArity: Int = 10
+
+  @Flag(help: "Suppress status messages while generating.")
+  var silent: Bool = false

+  func log(_ message: String, terminator: String = "\n") {
+    if !silent {
+      print(message, terminator: terminator, to: &standardError)
+    }
+  }
+
   func run() throws {
     precondition(maxArity > 1)
     precondition(maxArity < Counter.bitWidth)
@@ -126,14 +135,12 @@ struct VariadicsGenerator: ParsableCommand {

       """)

-    print("Generating concatenation overloads...", to: &standardError)
+    log("Generating concatenation overloads...")
     for (leftArity, rightArity) in Permutations(totalArity: maxArity) {
       guard rightArity != 0 else {
         continue
       }
-      print(
-        " Left arity: \(leftArity) Right arity: \(rightArity)",
-        to: &standardError)
+      log(" Left arity: \(leftArity) Right arity: \(rightArity)")
       emitConcatenation(leftArity: leftArity, rightArity: rightArity)
     }

@@ -143,50 +150,48 @@ struct VariadicsGenerator: ParsableCommand {

     output("\n\n")

-    print("Generating quantifiers...", to: &standardError)
+    log("Generating quantifiers...")
     for arity in 0...maxArity {
-      print(" Arity \(arity): ", terminator: "", to: &standardError)
+      log(" Arity \(arity): ", terminator: "")
       for kind in QuantifierKind.allCases {
-        print("\(kind.rawValue) ", terminator: "", to: &standardError)
+        log("\(kind.rawValue) ", terminator: "")
         emitQuantifier(kind: kind, arity: arity)
       }
-      print("repeating ", terminator: "", to: &standardError)
+      log("repeating ", terminator: "")
       emitRepeating(arity: arity)
-      print(to: &standardError)
+      log("")
     }

-    print("Generating atomic groups...", to: &standardError)
+    log("Generating atomic groups...")
     for arity in 0...maxArity {
-      print(" Arity \(arity): ", terminator: "", to: &standardError)
+      log(" Arity \(arity): ", terminator: "")
       emitAtomicGroup(arity: arity)
-      print(to: &standardError)
+      log("")
     }

-    print("Generating alternation overloads...", to: &standardError)
+    log("Generating alternation overloads...")
     for (leftArity, rightArity) in Permutations(totalArity: maxArity) {
-      print(
-        " Left arity: \(leftArity) Right arity: \(rightArity)",
-        to: &standardError)
+      log(" Left arity: \(leftArity) Right arity: \(rightArity)")
       emitAlternation(leftArity: leftArity, rightArity: rightArity)
     }

-    print("Generating 'AlternationBuilder.buildBlock(_:)' overloads...", to: &standardError)
+    log("Generating 'AlternationBuilder.buildBlock(_:)' overloads...")
     for arity in 1...maxArity {
-      print(" Capture arity: \(arity)", to: &standardError)
+      log(" Capture arity: \(arity)")
       emitUnaryAlternationBuildBlock(arity: arity)
     }

-    print("Generating 'capture' and 'tryCapture' overloads...", to: &standardError)
+    log("Generating 'capture' and 'tryCapture' overloads...")
     for arity in 0...maxArity {
-      print(" Capture arity: \(arity)", to: &standardError)
+      log(" Capture arity: \(arity)")
       emitCapture(arity: arity)
     }

     output("\n\n")

     output("// END AUTO-GENERATED CONTENT\n")

-    print("Done!", to: &standardError)
+    log("Done!")
   }

   func tupleType(arity: Int, genericParameters: () -> String) -> String {
@@ -517,7 +522,7 @@ struct VariadicsGenerator: ParsableCommand {
       \(params.disfavored)\
       public init<\(params.genericParams), R: RangeExpression>(
         _ expression: R,
-        _ behavior: QuantificationBehavior? = nil,
+        _ behavior: RegexRepetitionBehavior? = nil,
         @\(concatBuilderName) _ component: () -> Component
       ) \(params.repeatingWhereClause) {
         self.init(node: .repeating(expression.relative(to: 0..<Int.max), behavior, component().regex.root))
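
The `silent` flag and `log` helper introduced in this file route every status message through a single check. A minimal standalone sketch of that pattern follows; `StandardErrorStream` and `StatusLogger` are hypothetical names invented for illustration (the generator itself uses its own `standardError` stream and a method on `VariadicsGenerator`), and the sink is made generic here only so the behavior can be checked against a `String`.

```swift
import Foundation

// Hypothetical stand-in for the generator's `standardError` stream.
struct StandardErrorStream: TextOutputStream {
  mutating func write(_ string: String) {
    FileHandle.standardError.write(Data(string.utf8))
  }
}

// Hypothetical logger; only the `silent` check and the `log` signature
// mirror the diff. The sink is generic so a String can capture output.
struct StatusLogger<Sink: TextOutputStream> {
  var silent = false
  var sink: Sink

  mutating func log(_ message: String, terminator: String = "\n") {
    // Status messages are dropped entirely when `silent` is set.
    guard !silent else { return }
    print(message, terminator: terminator, to: &sink)
  }
}

// Usage: status messages go to stderr, keeping stdout free for generated code.
var logger = StatusLogger(sink: StandardErrorStream())
logger.log("Generating quantifiers...")
logger.log(" Arity 0: ", terminator: "")
logger.log("")  // ends the partial line, like the old `print(to: &standardError)`
```

Note that `log("")` still emits the default `"\n"` terminator, which is why it can replace the argument-less `print(to: &standardError)` calls in the diff.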

Sources/_StringProcessing/Regex/CustomComponents.swift

Lines changed: 2 additions & 2 deletions
@@ -12,7 +12,7 @@
 @available(SwiftStdlib 5.7, *)
 /// A protocol allowing custom types to function as regex components by
 /// providing the raw functionality backing `prefixMatch`.
-public protocol CustomPrefixMatchRegexComponent: RegexComponent {
+public protocol CustomConsumingRegexComponent: RegexComponent {
   /// Process the input string within the specified bounds, beginning at the given index, and return
   /// the end position (upper bound) of the match and the produced output.
   /// - Parameters:
@@ -29,7 +29,7 @@ public protocol CustomPrefixMatchRegexComponent: RegexComponent {
 }

 @available(SwiftStdlib 5.7, *)
-extension CustomPrefixMatchRegexComponent {
+extension CustomConsumingRegexComponent {
   public var regex: Regex<RegexOutput> {
     let node: DSLTree.Node = .matcher(RegexOutput.self, { input, index, bounds in
       try consuming(input, startingAt: index, in: bounds)
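
As a rough illustration of what a conformer to the renamed protocol looks like, the sketch below assumes the requirement has the shape used by the call above and by the test parsers in this commit: `consuming(_:startingAt:in:)` returns the match's upper bound together with the output, or `nil` when nothing matches. `DigitRun` is a hypothetical type, not part of the commit, and the code assumes a Swift 5.7 or later toolchain (on Apple platforms an availability annotation may also be needed).

```swift
// Hypothetical component: consumes a run of ASCII digits and yields an Int.
struct DigitRun: CustomConsumingRegexComponent {
  typealias RegexOutput = Int

  func consuming(
    _ input: String,
    startingAt index: String.Index,
    in bounds: Range<String.Index>
  ) throws -> (upperBound: String.Index, output: Int)? {
    var end = index
    // Advance while still inside the bounds and looking at an ASCII digit.
    while end < bounds.upperBound,
          input[end].isASCII, input[end].isWholeNumber {
      input.formIndex(after: &end)
    }
    // Returning nil reports "no match at this position" to the engine.
    guard end > index, let value = Int(input[index..<end]) else {
      return nil
    }
    return (upperBound: end, output: value)
  }
}

// Usage: the component can be passed anywhere a RegexComponent is accepted.
let firstNumber = "order 42 shipped".firstMatch(of: DigitRun())?.output
```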

Sources/_StringProcessing/Regex/Match.swift

Lines changed: 12 additions & 0 deletions
@@ -185,3 +185,15 @@ extension Regex {
     self.init(node: .quotedLiteral(string))
   }
 }
+
+@available(SwiftStdlib 5.7, *)
+public func ~=<Output>(regex: Regex<Output>, input: String) -> Bool {
+  guard let _ = try? regex.wholeMatch(in: input) else { return false }
+  return true
+}
+
+@available(SwiftStdlib 5.7, *)
+public func ~=<Output>(regex: Regex<Output>, input: Substring) -> Bool {
+  guard let _ = try? regex.wholeMatch(in: input) else { return false }
+  return true
+}
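
Both new `~=` overloads use `wholeMatch`, so a `case` pattern succeeds only when the regex matches the entire input, not merely a prefix. The sketch below restates that check as a free function so it runs without relying on the operator itself; `matchesWhole` is a hypothetical helper name, and a Swift 5.7 or later toolchain is assumed.

```swift
// Hypothetical helper mirroring the body of the `~=` overloads in this diff.
func matchesWhole<Output>(_ regex: Regex<Output>, _ input: String) -> Bool {
  // `wholeMatch` can throw (for example from a custom component);
  // any error is treated as "no match", as in the diff.
  guard let _ = try? regex.wholeMatch(in: input) else { return false }
  return true
}

let input = "abcde"
let endsInF = matchesWhole(try! Regex("a.*f"), input)  // input has no "f"
let prefix = matchesWhole(try! Regex("abc"), input)    // matches only a prefix
let whole = matchesWhole(try! Regex("a.*e"), input)    // spans the whole input
```

Under these semantics only the last check is true, which is what the `testSwitches` case added in `AlgorithmsTests.swift` expects of the `switch`.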

Sources/_StringProcessing/Unicode/NecessaryEvils.swift

Lines changed: 0 additions & 11 deletions
@@ -88,14 +88,3 @@ extension UTF16 {
       (UInt32(lead & 0x03ff) &<< 10 | UInt32(trail & 0x03ff)))
   }
 }
-
-extension String.Index {
-  internal var _encodedOffset: Int {
-    // The encoded offset is found in the top 48 bits.
-    Int(unsafeBitCast(self, to: UInt64.self) >> 16)
-  }
-
-  internal init(_encodedOffset offset: Int) {
-    self = unsafeBitCast(offset << 16, to: Self.self)
-  }
-}

Tests/RegexBuilderTests/CustomTests.swift

Lines changed: 3 additions & 3 deletions
@@ -14,7 +14,7 @@ import _StringProcessing
 @testable import RegexBuilder

 // A nibbler processes a single character from a string
-private protocol Nibbler: CustomPrefixMatchRegexComponent {
+private protocol Nibbler: CustomConsumingRegexComponent {
   func nibble(_: Character) -> RegexOutput?
 }

@@ -49,7 +49,7 @@ private struct Asciibbler: Nibbler {
   }
 }

-private struct IntParser: CustomPrefixMatchRegexComponent {
+private struct IntParser: CustomConsumingRegexComponent {
   struct ParseError: Error, Hashable {}
   typealias RegexOutput = Int
   func consuming(_ input: String,
@@ -71,7 +71,7 @@ private struct IntParser: CustomPrefixMatchRegexComponent {
   }
 }

-private struct CurrencyParser: CustomPrefixMatchRegexComponent {
+private struct CurrencyParser: CustomConsumingRegexComponent {
   enum Currency: String, Hashable {
     case usd = "USD"
     case ntd = "NTD"

Tests/RegexBuilderTests/RegexDSLTests.swift

Lines changed: 1 addition & 1 deletion
@@ -863,7 +863,7 @@ class RegexDSLTests: XCTestCase {
       var patch: Int
       var dev: String?
     }
-    struct SemanticVersionParser: CustomPrefixMatchRegexComponent {
+    struct SemanticVersionParser: CustomConsumingRegexComponent {
       typealias RegexOutput = SemanticVersion
       func consuming(
         _ input: String,

Tests/RegexTests/AlgorithmsTests.swift

Lines changed: 15 additions & 0 deletions
@@ -324,4 +324,19 @@ class AlgorithmTests: XCTestCase {
       s2.matches(of: regex).map(\.0),
       ["aa"])
   }
+
+  func testSwitches() {
+    switch "abcde" {
+    case try! Regex("a.*f"):
+      XCTFail()
+    case try! Regex("abc"):
+      XCTFail()
+
+    case try! Regex("a.*e"):
+      break // success
+
+    default:
+      XCTFail()
+    }
+  }
 }
