Add assertions to the DSL #154

natecook1000 · 2022-02-10T17:31:06Z

This adds an Anchor type that handles the different kinds of assertions supported by regular expression literals. Some spellings are different: instead of separate \b and \B assertions, this design provides just Anchor.wordBoundary as well as an inverted property that is available on all assertions. (This property should perhaps be named negated instead.) That is, \B in a regex literal maps to Anchor.wordBoundary.inverted.

A lookahead(isNegative:) function provides the positive/negative lookahead functionality from regex literals.
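To make the description above concrete, here is a minimal, self-contained sketch of an anchor value with an `inverted` property. It is a simplified stand-in and not the code in this PR: the `Kind` enum, the `matches(_:at:)` helper, and its matching rules are assumptions made for illustration only.

```swift
// Hypothetical, simplified model of the design; not the _StringProcessing source.
struct Anchor: Equatable {
  enum Kind: Equatable { case startOfLine, endOfLine, wordBoundary }

  var kind: Kind
  var isInverted = false

  static let startOfLine = Anchor(kind: .startOfLine)
  static let endOfLine = Anchor(kind: .endOfLine)
  static let wordBoundary = Anchor(kind: .wordBoundary)

  /// A copy of this anchor that succeeds exactly where the original fails.
  var inverted: Anchor {
    var result = self
    result.isInverted.toggle()
    return result
  }

  /// Tests the anchor at position `i` of `s` (illustrative semantics only).
  func matches(_ s: String, at i: String.Index) -> Bool {
    func isWord(_ c: Character) -> Bool { c.isLetter || c.isNumber || c == "_" }
    // Neighbouring characters, or nil at either end of the string.
    let before: Character? = i > s.startIndex ? s[s.index(before: i)] : nil
    let after: Character? = i < s.endIndex ? s[i] : nil

    let raw: Bool
    switch kind {
    case .startOfLine:
      raw = before.map { $0.isNewline } ?? true
    case .endOfLine:
      raw = after.map { $0.isNewline } ?? true
    case .wordBoundary:
      // A boundary exists when exactly one side is a word character.
      raw = (before.map(isWord) ?? false) != (after.map(isWord) ?? false)
    }
    // Inversion flips the result, like \B versus \b in a regex literal.
    return raw != isInverted
  }
}

let text = "ab cd"
let mid = text.index(after: text.startIndex)  // position between "a" and "b"
print(Anchor.wordBoundary.matches(text, at: mid))           // false
print(Anchor.wordBoundary.inverted.matches(text, at: mid))  // true
```

Because inversion is a flag on the value rather than a separate case per anchor, every anchor gets a negated form without adding new cases.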

natecook1000 · 2022-02-10T17:34:26Z

Sources/_StringProcessing/RegexDSL/Assertion.swift

+      case .startOfSubject: fatalError("Not yet supported")
+      case .endOfSubjectBeforeNewline: fatalError("Not yet supported")
+      case .endOfSubject: fatalError("Not yet supported")
+      case .firstMatchingPositionInSubject: fatalError("Not yet supported")
+      case .textSegmentBoundary: return .notTextSegment
+      case .startOfLine: fatalError("Not yet supported")
+      case .endOfLine: fatalError("Not yet supported")


We don't currently have a representation for these negated assertions in the AST, since things like notWordBoundary are represented as specific individual cases. These fatalError'd cases don't have a regex literal equivalent, but would be available if we use an API like Assertion.wordBoundary.inverted.

Why are we going to an AST here?

DSLTree tracks assertions using AST.Atom.AssertionKind right now.
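The gap being discussed can be sketched as a mapping from an (anchor kind, inverted) pair to a fixed set of assertion cases. The enums below are local stand-ins rather than the real `AST.Atom.AssertionKind`. Their case names are modeled on the ones mentioned in this thread and are otherwise assumptions. The sketch returns `nil` where the PR currently calls `fatalError`.

```swift
// Stand-in types for illustration; the real AST type lives in the parser module.
enum AnchorKind { case startOfLine, endOfLine, wordBoundary, textSegmentBoundary }

enum AssertionKind: Equatable {
  case startOfLine, endOfLine
  case wordBoundary, notWordBoundary
  case textSegment, notTextSegment
}

/// Maps a DSL anchor to an assertion case, or nil when the target enum
/// has no dedicated case for the negated form.
func assertionKind(for kind: AnchorKind, isInverted: Bool) -> AssertionKind? {
  switch (kind, isInverted) {
  case (.startOfLine, false): return .startOfLine
  case (.endOfLine, false): return .endOfLine
  case (.wordBoundary, false): return .wordBoundary
  case (.wordBoundary, true): return .notWordBoundary
  case (.textSegmentBoundary, false): return .textSegment
  case (.textSegmentBoundary, true): return .notTextSegment
  // No "not start of line" or "not end of line" case exists to map to.
  case (.startOfLine, true), (.endOfLine, true): return nil
  }
}

print(assertionKind(for: .wordBoundary, isInverted: true) == .notWordBoundary)  // true
print(assertionKind(for: .startOfLine, isInverted: true) == nil)                // true
```

Supporting the remaining negated anchors would mean either adding cases to the target enum or carrying the inversion flag alongside it.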

milseman · 2022-02-10T22:05:54Z

Sources/_StringProcessing/RegexDSL/Assertion.swift

+    case endOfLine
+    case wordBoundary
+    case lookahead(DSLTree.Node)
+  }


Is this meant to be a listing of built-in assertions, or is each of these a kind of assertion someone could write?

milseman · 2022-02-10T22:06:37Z

Sources/_StringProcessing/RegexDSL/Assertion.swift

+      case .startOfSubject: fatalError("Not yet supported")
+      case .endOfSubjectBeforeNewline: fatalError("Not yet supported")
+      case .endOfSubject: fatalError("Not yet supported")
+      case .firstMatchingPositionInSubject: fatalError("Not yet supported")
+      case .textSegmentBoundary: return .notTextSegment
+      case .startOfLine: fatalError("Not yet supported")
+      case .endOfLine: fatalError("Not yet supported")


Why are we going to an AST here?

milseman · 2022-02-10T22:08:18Z

Tests/RegexTests/RegexDSLTests.swift

+      ("aaaaabc", nil),
+      captureType: Substring.self, ==)
+    {
+      Assertion.startOfLine


Many of the built-in ones are more commonly called "anchors", which might be worth considering too.

`lookahead` is moved to a free function, since it isn't an anchor like the others

milseman

LGTM.

milseman · 2022-02-18T23:10:20Z

Sources/_StringProcessing/RegexDSL/Assertion.swift

+    var result = self
+    result.isInverted.toggle()
+    return result
+  }


Would we want an isInverted then? Also, is it the case that all anchors can be inverted?

I don't know that a property makes sense if we aren't going to also expose kind as public API, and the purpose of this is just to carry the regex wrapper.

Everything in this PR can be inverted; it just needs a little more plumbing. If we want to provide the functionality of a "reset match" assertion, that could just be a separate function or type, since it isn't an anchor anyway.

natecook1000 · 2022-02-21T08:02:05Z

@swift-ci Please test Linux platform

rxwei · 2022-02-21T08:14:50Z

Sources/_StringProcessing/RegexDSL/Assertion.swift

+}
+
+public func lookahead<R: RegexProtocol>(
+  isNegative: Bool = false,


While Boolean properties should generally read like an assertion about the receiver, I'm not sure that's quite as useful for function parameters. In this case, isNegative is not forming a phrase with the base name to produce a Boolean result, but is a dictation by the caller to modify the behavior of the callee. As such, I wonder if we should call it negative instead.

You're right, this is like the second parameter in split(separator: "-", omittingEmptySubsequences: false). 👍🏻
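The naming point can be seen at the call site. The snippet below uses the standard library's `split(separator:omittingEmptySubsequences:)` next to a toy `lookaheadSyntax(negative:_:)` function. The toy function is hypothetical and only builds the equivalent regex literal text; it is not the `lookahead` function added in this PR.

```swift
// Standard library precedent: the Boolean label instructs the callee
// and does not read as an "is..." question about a receiver.
let kept = "a--b".split(separator: "-", omittingEmptySubsequences: false)
let dropped = "a--b".split(separator: "-")
print(kept)     // ["a", "", "b"]
print(dropped)  // ["a", "b"]

/// Hypothetical helper: returns the regex literal spelling of a lookahead.
/// `negative` follows the same labeling style as `omittingEmptySubsequences`.
func lookaheadSyntax(negative: Bool = false, _ pattern: String) -> String {
  negative ? "(?!\(pattern))" : "(?=\(pattern))"
}

print(lookaheadSyntax("abc"))                  // (?=abc)
print(lookaheadSyntax(negative: true, "abc"))  // (?!abc)
```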

natecook1000 · 2022-02-21T09:01:34Z

@swift-ci Please test Linux platform

natecook1000 · 2022-02-21T09:01:51Z

@swift-ci Please test Linux platform

natecook1000 added 5 commits February 9, 2022 12:55

Add assertion type for DSL

2f825ff

Merge branch 'main' into dsl_assertions

38d61b3

Merge branch 'main' into dsl_assertions

091443b

Update tests

2741126

Move Assertion into its own file

31d0a05

natecook1000 commented Feb 10, 2022

Remove a merge-o

c674db9

natecook1000 requested a review from rxwei February 10, 2022 17:38

milseman reviewed Feb 10, 2022

milseman marked this pull request as draft February 11, 2022 00:42

Anchor is a better API name than Assertion

f629a4a

`lookahead` is moved to a free function, since it isn't an anchor like the others

natecook1000 force-pushed the dsl_assertions branch from 22e5194 to f629a4a Compare February 14, 2022 21:43

natecook1000 marked this pull request as ready for review February 18, 2022 21:02

natecook1000 requested a review from milseman February 18, 2022 21:02

milseman approved these changes Feb 18, 2022

natecook1000 changed the title ~~[Draft] Add assertions to the DSL~~ Add assertions to the DSL Feb 21, 2022

rxwei reviewed Feb 21, 2022

lookahead(isNegative:_:) -> lookahead(negative:_:)

c970ab2

rxwei approved these changes Feb 21, 2022

natecook1000 closed this Feb 21, 2022

natecook1000 reopened this Feb 21, 2022

natecook1000 merged commit f8e1dc2 into swiftlang:main Feb 21, 2022

natecook1000 deleted the dsl_assertions branch February 21, 2022 09:03
