Skip to content

Introduce a library to handle user-defined operators #619

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 50 commits into from
Sep 17, 2022

Conversation

DougGregor
Copy link
Member

@DougGregor DougGregor commented Aug 23, 2022

Introduce a library that implements the semantics of Swift user-defined operators and precedence groups. The primary operation is a "fold", which takes a sequence expression (i.e., something like a + b * c) and turns it into a structured syntax tree that reflects the relative precedence of the operators involved (here, * has higher precedence than +). This is one step beyond simple parsing that provides the necessary structure for later stages of processing of a Swift syntax tree.

There are several pieces to this new SwiftOperators library, including:

  • PrecedenceGroup and Operator, which are semantic representations of precedence groups and operator declarations.
  • PrecedenceGraph, which keeps track of a set of precedence groups and their relationships to each other, and is able to answer questions like "how to the precedences of two different groups compare?".
  • Syntactic -> semantic translation to go from PrecedenceGroupDeclSyntax and OperatorDeclSyntax nodes to PrecedenceGroup and Operator, by walking the child syntax.
  • OperatorTable, which contains a PrecedenceGraph that can be created by processing a source file to pick out the various operator and precedence-group declarations. It provides the core "fold" operation that turns a SequenceExprSyntax node into one that is fully-structured to show the order of operations.
  • Prefabricated OperatorTable instances for the logical operators (&& and ||, as used in Swift #if conditions) and the Swift standard library operators (all the ones most Swift developers are used to), which can handle most Swift code.
  • OperatorError, which describes the kinds of errors that can occur when handling operators, from missing precedence groups to ambiguous parses. There is a mapping from OperatorError to Diagnostic so that we can emit diagnostics for each error via the normal means.
  • A new --fold-sequences option to -swift-parser-test to apply sequence folding to the input file as part of processing.

All that said, there are a few things that need to be finished before this library is ready:

  • Finish remaining syntactic expressions that use operator precedence:
    • Explicit casts
    • Ternary expressions (? :)
    • Assignment expressions
    • Arrow expression (->)
  • Integration in swift-parser-test via an option, so we can try this on lots more code
  • Detect cycles in precedence groups
  • Diagnostics for semantic errors in operator and precedence-group declarations
  • Diagnostics for try that doesn't extend all the way to the right of ternary and assignment expressions
  • Throwing lots and lots of code at it to make sure it works

@DougGregor DougGregor requested a review from ahoppen as a code owner August 23, 2022 04:57
@DougGregor
Copy link
Member Author

@swift-ci please test

// handler to throw, invoke the error handler again with the original
// error.
if let origFatalError = folder.firstFatalError {
try errorHandler(origFatalError)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t this call errorHandler twice? Once from SequenceFolder and once from here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the problem here is that rethrows checking cannot account for errors thrown in the visitor. I'll document this as a limitation. The real fix is something like swiftlang/swift-evolution#1271.

@DougGregor
Copy link
Member Author

@ahoppen thanks for the great review! I answered a few things up front, and will go through and address the rest of your comments in the code before replying.

@DougGregor DougGregor force-pushed the operator-precedence branch 2 times, most recently from 5d46f91 to 9621183 Compare September 2, 2022 04:52
Comment on lines 127 to 147
return try searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .lowerThan,
errorHandler: errorHandler
) ?? searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .higherThan,
errorHandler: errorHandler
) ?? searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .lowerThan,
errorHandler: errorHandler
).map {
$0.flipped
} ?? searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .higherThan,
errorHandler: errorHandler
).map {
$0.flipped
} ?? .unrelated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading this I initially thought that $0.flipped matched all previous checks (my brain somehow thought that ??) has higher precedence than .. Just wondering whether this would read easier as follows (I’m also not a huge fan of the 20 line expression)

Suggested change
return try searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .lowerThan,
errorHandler: errorHandler
) ?? searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .higherThan,
errorHandler: errorHandler
) ?? searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .lowerThan,
errorHandler: errorHandler
).map {
$0.flipped
} ?? searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .higherThan,
errorHandler: errorHandler
).map {
$0.flipped
} ?? .unrelated
if let relationship = try searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .lowerThan,
errorHandler: errorHandler
) {
return relationship
} else if let relationship = try searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .higherThan,
errorHandler: errorHandler
) {
return relationship
} else if let relationship = try searchRelationships(
initialGroupName: startGroupName, initialSyntax: startSyntax,
targetGroupName: endGroupName, direction: .lowerThan,
errorHandler: errorHandler
) {
return relationship.flipped
} else if relationship = try searchRelationships(
initialGroupName: endGroupName, initialSyntax: endSyntax,
targetGroupName: startGroupName, direction: .higherThan,
errorHandler: errorHandler
) {
return relationship.flipped
} else {
return .unrelated
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my eye, both look equally messy. I'd rather find a better answer, perhaps with a real data structure.

@DougGregor DougGregor changed the title Introduce an operator precedence library Introduce a library to handle user-defined operators Sep 6, 2022
@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor
Copy link
Member Author

@swift-ci please test

1 similar comment
@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor
Copy link
Member Author

@swift-ci please test

1 similar comment
@DougGregor
Copy link
Member Author

@swift-ci please test

Implement the sequence "fold" operation that takes a sequence
expression (i.e., something like a + b * c) and turns it into a
structured syntax tree that reflects the relative precedence of the
operators involved (here, * has higher precedence than +). The core of
this algorithm is a direct port of the C++ implementation
of operator-precedence parsing. There are several pieces to it:

* PrecedenceGroup and Operator, which are semantic representations of
precedence groups and operator declarations.
* PrecedenceGraph, which keeps track of a set of precedence groups and
their relationships to each other, and is able to answer questions
like "how to the precedences of two different groups compare?".
* Syntactic -> semantic translation to go from
PrecedenceGroupDeclSyntax and OperatorDeclSyntax nodes to
PrecedenceGroup and Operator, by walking the child syntax.
* OperatorPrecedence, which contains a PrecedenceGraph that can be
created by processing a source file to pick out the various operator
and precedence-group declarations. It provides the core "fold"
operation that turns a SequenceExprSyntax node into one that is
fully-structured to show the order of operations.

The parsing infrastructure and basic algorithm work (modulo lots of
testing), but there are a lot of missing pieces:

* Right now we either throw an error or outright ignore pretty much
any problem that shows up. Neither of those is an acceptable answer,
so we'll need to decide how to
* There are default precedence groups for certain expression kinds
(e.g., '=', 'as?', '->') that we will need to implement by looking at
the syntax node kind for the "operator" and handling each of the
cases. This is easy but tedious to do/test.
* There is no syntax node for "infix binary expression" in the
grammar, so we cannot actually represent the result the way we want
to. For testing purposes now, we create a new TupleExprSyntax instead,
so we can see the full parenthesization in the output.
* We do not have primitives for merging precedence graphs when
combining multiple source files, modules, etc. We'll need to define
those rules.
* We don't make any attempt at caching repeated queries, which could
matter later on.
* We probably want to provide some pre-canned OperatorPrecedence
instances for, e.g., the logical operators (&&, ||, !) used when
evaluating preprocessor conditionals (#if x || y) and possibly the
standard library itself.

The biggest question of all is whether any of this even makes sense to
do as part of the syntax library, or whether it belongs with the main
Swift compiler.
…source file

Once the parser switches over to building raw syntax syntax trees, this
will allow us to realize fewer nodes.
When the operator-precedence parser encounters errors at any point,
provide the errors to an error handler and then keep going. This makes
all of the entrypoints rethrowing, so clients can bail out early if they
wish (the default handler just throws the error) or can record/report
the errors and continue with some default behavior.
Plus, remove an unnecessary function whose optionality was unused.
At present, this only ends up checking for duplicate definitions.
This module has been generalized (slightly) to handle everything involving
Swift's user-defined operators and precedence groups, so rename it
accordingly.
More renaming for generalization.
Introduce `synthesizeSyntax()` operations for `Operator` and
`PrecedenceGroup` to produce syntax nodes describing the semantics.
…ible.

Make the string representation of the main data types be their syntactic
form. This makes it easier to inspect and reason about their values.
Always provide syntax nodes, even if they are synthesized, when forming
an OperatorError. This eliminates the need for clients to do this
synthesis.
Use the `SwiftOperators` library to add support for sequence folding
into `swift-parser-test`. Each of the subcommands now supports the
`--fold-sequences` operation, which applies sequence folding using the
standard library operators and any additional precedence groups and
operators defines within the provided source file. The resulting
behavior for each of the subcommands is:

* `verify-round-trip`: ensure that, after folding, the resulting tree
still reproduces the input source.
* `print-diags`: also prints any diagnostics from sequence folding,
e.g., a ** b ** c where ** is non-associative will produce an error.
* `dump-tree`: prints the folded tree, which (for example) will no
longer have any SequenceExprSyntax nodes.
* `reduce`: passes `--fold-sequences` down for reduction.
When folding an expression such as `try x + y`, instead of producing
`(try x) + y`, hoist the `try` up to produce `try (x + y)`.
@DougGregor
Copy link
Member Author

@swift-ci please test

@DougGregor DougGregor merged commit 42a1131 into swiftlang:main Sep 17, 2022
@DougGregor DougGregor deleted the operator-precedence branch September 17, 2022 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants