Skip to content

Add debug descriptions for Lexeme and LexemeSequence #758

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Oct 22, 2022

Conversation

ahoppen
Copy link
Member

@ahoppen ahoppen commented Sep 9, 2022

The debug description of Lexeme outputs the current token’s text and the debug description of LexemeSequence outputs the source code that’s remaining in the LexemeSequence. In practice this means that po currentToken output’s the current token’s text and po lexemes shows the remaining source code.

@ahoppen ahoppen requested a review from CodaFi September 9, 2022 05:13
@ahoppen
Copy link
Member Author

ahoppen commented Sep 9, 2022

@swift-ci Please test

@CodaFi
Copy link
Contributor

CodaFi commented Sep 9, 2022

shows the remaining source code.

That's... a lot to print. Perhaps we can display the code up to the next newline or some fixed set of bytes to lookahead like 100.

@ahoppen
Copy link
Member Author

ahoppen commented Sep 10, 2022

That's... a lot to print. Perhaps we can display the code up to the next newline or some fixed set of bytes to lookahead like 100.

Just printing to the end of the next newline is sometimes now sufficient to disambiguate where you are in the source file if you’ve got code snippets that are repeated.

I also thought about capping at a character limit but decided against capping it because

  • I couldn’t decide on a limit
  • I tried printing the remaining source code in one of the self-parser tests and it was fine
  • I expect this to mostly be used in test cases that have already been reduced, so there’s no need to cap it
  • I have a vague memory that the remaining source code is capped at some point by the debugger and that annoyed me once

I don’t have super strong feelings about it though.

@ahoppen ahoppen force-pushed the ahoppen/debug-descriptions branch from 47952bc to c9b63d1 Compare September 20, 2022 13:37
@@ -150,6 +154,10 @@ extension Lexer {
func peek() -> Lexer.Lexeme {
return self.nextToken
}

public var debugDescription: String {
return self.nextToken.debugDescription + String(syntaxText: SyntaxText(baseAddress: self.cursor.input.baseAddress, count: self.cursor.input.count))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is backwards - it’ll print the next then the current source text behind it. If there were delimiters here that call that out I’d be happier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My ide was that po self.lexemes should be analogous to po *Ptr (or something like this) in the C++ parser, i.e. print the remaining source text, except for the current token. And AFAICT that’s what it does. self.nextToken prints the next token and SyntaxText(baseAddress: self.cursor.input.baseAddress, count: self.cursor.input.count) prints the remaining source text thereafter.

In my experience printing self.lexemes is useful to disambiguate where in the source file the current token is positioned, e.g. if there are multiple function declarations in the file and we’re currently at a func keyword.

If you want to inspect the next token in particular, you can always po self.peek(), which will print the next token similar to self.currentToken.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like *Ptr

I think that line of reasoning works for Lexer.Cursor, but for the lexeme sequence I would very much expect to see the current token, the lookahead token, and then the rest of the buffer.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How exactly would you expect those to be printed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this has sat long enough, I'll just ask for a delimiter between these two pieces to separate the head and tail of the sequence. I still think we ought to limit the amount of text returned in that tail (even lldb gives up after a while and gives you ...), but it's not as important.

@@ -97,13 +97,17 @@ public struct Lexer {
SyntaxText(baseAddress: start.advanced(by: leadingTriviaByteLength+textByteLength),
count: trailingTriviaByteLength)
}

public var debugDescription: String {
return String(syntaxText: SyntaxText(baseAddress: start, count: byteLength))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s a good start. Maybe we can render the flagset here. And even dump the trivia.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flags are still being rendered if you po currentToken in the parser. This is what the debug output looks like with this change.

(lldb) po self.currentToken
▿ func 
  - tokenKind : SwiftSyntax.RawTokenKind.funcKeyword
  ▿ flags : Flags
    - rawValue : 0
  ▿ start : 0x0000000104014000
    - pointerValue : 4362158080
  - leadingTriviaByteLength : 0
  - textByteLength : 4
  - trailingTriviaByteLength : 1

@ahoppen ahoppen force-pushed the ahoppen/debug-descriptions branch from c9b63d1 to f9d8e9f Compare October 22, 2022 16:44
@ahoppen
Copy link
Member Author

ahoppen commented Oct 22, 2022

@swift-ci Please test

@ahoppen ahoppen merged commit 87d4a8e into swiftlang:main Oct 22, 2022
@ahoppen ahoppen deleted the ahoppen/debug-descriptions branch October 22, 2022 20:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants