Skip to content

Some AST construction and DSL refactoring #60

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 8, 2021

Conversation

milseman
Copy link
Member

@milseman milseman commented Dec 8, 2021

  • Add in PatternTree, which holds a formula over an algebra of patterns
  • Minor refactorings

Doing the AST overhaul involves heavily modifying the DSL code, and I realize much of this overhaul can be refactored out and developed in parallel / incrementally. Trying to do it all at once is proving troublesome.

This does some cosmetic changes and starts us down the path towards a pattern algebra. It still falls back to regex AST. I also didn't bother to rename anything fundamental (like Pattern vs Regex). This is meant to be the minimal changes that help unblock AST reworking.

Next step will be to get groupTransform cleanly off the AST and into PatternTree. Compilation should compile a pattern tree. I don't know how or if we want to do a lot of nesting here...

@milseman milseman requested a review from rxwei December 8, 2021 17:31

init(ast: AST) {
self.ast = ast
init(_ tree: PatternTree) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I’d prefer keeping an ast: or tree: argument label since it’s not exactly a value-preserving conversion and it’s better to be able to tell this is calling an internal API at call sites.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When might it not be value-preserving? It's weird to have the type name as the label, if there's a value loss, is there a way to describe what could be lost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagined there would eventually be invalid trees? If so it would not be a monomorphism, otherwise never mind the value-preserving conversion argument.

That said, IMO the main value of having an argument label is to distinguish between calls to internal API and calls to public API more easily in our tests. I'm imagining there will be call sites like Pattern(.hexDigit), where which API being called would be less distinguishable between that and Pattern(.regex(...)). An argument label will describe the argument's role and surely doesn't have to repeat the type information. Something like init(representing:) would work, too.

init(ast: AST) {
self.program = Program(ast: ast)
var tree: PatternTree { program.tree }
var regexAST: AST? { program.regexAST }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we kept its name ast and made it non-optional (it okay to fail for now since it’s to be overhauled again), most of the changes to the gyb’d contents will not be necessary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But we'd still need to change them when we stop using the AST, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we can swap the regex AST out for PatternTree under the same name. No changes to the call site.

/// A regular expression.
public struct Regex<Capture>: RegexProtocol {
/// A program representation that caches any lowered representation for
/// execution.
internal class Program {
/// The underlying AST.
let ast: AST
/// The underlying pattern tree
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// The underlying pattern tree
/// The underlying pattern tree.

@milseman milseman marked this pull request as draft December 8, 2021 19:23
@milseman milseman force-pushed the range_generic_babblings branch from 2b0d915 to 1903170 Compare December 8, 2021 22:10
@milseman milseman changed the title Some DSL refactoring Some AST construction and DSL refactoring Dec 8, 2021
@milseman milseman marked this pull request as ready for review December 8, 2021 22:13
@milseman
Copy link
Member Author

milseman commented Dec 8, 2021

@swift-ci please test linux platform

AST.

*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For namespacing, could you move these top-level symbols to an enum, e.g. ASTBuilder, as static methods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe eventually. I'm inclined not to for now because otherwise you end up either with a bunch of ASTBuilder.concat(ASTBuilder.atom....) in unit tests, or every client just reinvents these. Since the contextual type requested is AST and not ASTBuilder, they wouldn't be seen for lookup unless everything was typed as a builder instead of an AST. These don't work well hosted on AST because names can collide with case names.

Longer term, I don't think we'll be as enthusiastically creating AST nodes for every need. This is public in the "core" module right now (so visible to _StringProcessing but not its clients). Another option is to make these internal inside _StringProcessing and the test use case can use @testable anyways.

Also, we'll see if AST even stays as an enum or in its current form. We might need to overhaul it more for options tracking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the contextual type requested is AST and not ASTBuilder, they wouldn't be seen for lookup unless everything was typed as a builder instead of an AST.

Hmm, these could also be static methods on AST itself though. I'm still concerned about polluting the global namespace.

Another option is to make these internal inside _StringProcessing and the test use case can use @testable anyways.

I think that this approach works better; we only build location-less ASTs in _StringProcessing module and its tests.

@milseman milseman merged commit 843b541 into swiftlang:main Dec 8, 2021
@milseman milseman deleted the range_generic_babblings branch December 8, 2021 23:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants