Add debug descriptions for `Lexeme` and `LexemeSequence` #758

ahoppen · 2022-09-09T05:13:40Z

The debug description of `Lexeme` outputs the current token’s text, and the debug description of `LexemeSequence` outputs the source code that remains in the `LexemeSequence`. In practice, this means that `po currentToken` outputs the current token’s text and `po lexemes` shows the remaining source code.
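As a rough illustration of the behaviour described above, here is a minimal sketch. `ToyLexeme` and `ToyLexemeSequence` are hypothetical stand-ins that store plain strings; they are not the real `Lexer.Lexeme` and `Lexer.LexemeSequence`, which work on raw buffers. The only thing the sketch relies on is that conforming to `CustomDebugStringConvertible` changes what `String(reflecting:)` (and, as described in this PR, `po`) displays.

```swift
// Hypothetical stand-ins for illustration; not the SwiftParser types.
struct ToyLexeme: CustomDebugStringConvertible {
  var text: String

  // The lexeme describes itself as its own source text.
  var debugDescription: String { text }
}

struct ToyLexemeSequence: CustomDebugStringConvertible {
  var nextToken: ToyLexeme
  var remainingSource: String

  // The sequence describes itself as everything that is still unconsumed:
  // the lookahead token followed by the rest of the input.
  var debugDescription: String {
    nextToken.debugDescription + remainingSource
  }
}

let lexemes = ToyLexemeSequence(
  nextToken: ToyLexeme(text: "func"),
  remainingSource: " foo() {}"
)
let rendered = String(reflecting: lexemes)
print(rendered)
```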

ahoppen · 2022-09-09T05:13:50Z

@swift-ci Please test

CodaFi · 2022-09-09T21:32:08Z

> shows the remaining source code.

That's... a lot to print. Perhaps we can display the code up to the next newline or some fixed set of bytes to lookahead like 100.

ahoppen · 2022-09-10T05:31:06Z

> That's... a lot to print. Perhaps we can display the code up to the next newline or some fixed set of bytes to lookahead like 100.

Just printing to the end of the next newline is sometimes not sufficient to disambiguate where you are in the source file if you’ve got code snippets that are repeated.

I also thought about capping at a character limit but decided against it because

- I couldn’t decide on a limit
- I tried printing the remaining source code in one of the self-parser tests and it was fine
- I expect this to mostly be used in test cases that have already been reduced, so there’s no need to cap it
- I have a vague memory that the remaining source code is capped at some point by the debugger and that annoyed me once

I don’t have super strong feelings about it though.
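For reference, the two capping strategies discussed here (stop at the next newline, or stop after a fixed number of bytes) could be sketched roughly as follows. `previewOfRemainingSource` is a hypothetical helper written for this illustration; it is not part of SwiftParser and was not adopted in this PR. The default of 100 bytes simply mirrors the number floated above.

```swift
// Hypothetical helper, not part of SwiftParser: shorten the remaining source
// for display, optionally at the first newline, and after at most `maxBytes`
// UTF-8 bytes. Whole characters are kept so no code point is split.
func previewOfRemainingSource(
  _ source: String,
  maxBytes: Int = 100,
  stopAtNewline: Bool = false
) -> String {
  var preview = Substring(source)
  if stopAtNewline, let newline = preview.firstIndex(where: { $0.isNewline }) {
    preview = preview[..<newline]
  }

  // Walk character by character until the byte budget is used up.
  var byteCount = 0
  var end = preview.startIndex
  for index in preview.indices {
    byteCount += String(preview[index]).utf8.count
    if byteCount > maxBytes { break }
    end = preview.index(after: index)
  }

  // Append an ellipsis marker whenever something was left out.
  let shown = preview[..<end]
  let wasTruncated = shown.utf8.count < source.utf8.count
  return String(shown) + (wasTruncated ? "..." : "")
}

let source = "func a() {}\nfunc b() {}"
print(previewOfRemainingSource(source, stopAtNewline: true))
print(previewOfRemainingSource(source, maxBytes: 4))
```

With the newline variant, two identical declarations on separate lines would produce the same preview, which is the ambiguity raised above.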

CodaFi · 2022-09-20T17:13:27Z

Sources/SwiftParser/Lexer.swift

@@ -150,6 +154,10 @@ extension Lexer {
    func peek() -> Lexer.Lexeme {
      return self.nextToken
    }
+
+    public var debugDescription: String {
+      return self.nextToken.debugDescription + String(syntaxText: SyntaxText(baseAddress: self.cursor.input.baseAddress, count: self.cursor.input.count))


This is backwards - it’ll print the next token and then the current source text behind it. If there were delimiters here that call that out I’d be happier.

ahoppen · Sep 21, 2022

My idea was that `po self.lexemes` should be analogous to `po *Ptr` (or something like this) in the C++ parser, i.e. print the remaining source text, except for the current token. And AFAICT that’s what it does. `self.nextToken` prints the next token and `SyntaxText(baseAddress: self.cursor.input.baseAddress, count: self.cursor.input.count)` prints the remaining source text thereafter.

In my experience printing `self.lexemes` is useful to disambiguate where in the source file the current token is positioned, e.g. if there are multiple function declarations in the file and we’re currently at a `func` keyword.

If you want to inspect the next token in particular, you can always `po self.peek()`, which will print the next token similar to `self.currentToken`.

CodaFi · Sep 30, 2022

> Like *Ptr

I think that line of reasoning works for `Lexer.Cursor`, but for the lexeme sequence I would very much expect to see the current token, the lookahead token, and then the rest of the buffer.

ahoppen · Sep 30, 2022

How exactly would you expect those to be printed?

CodaFi · Oct 21, 2022

Since this has sat long enough, I'll just ask for a delimiter between these two pieces to separate the head and tail of the sequence. I still think we ought to limit the amount of text returned in that tail (even lldb gives up after a while and gives you `...`), but it's not as important.
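One way the requested delimiter might look is sketched below. This is an assumption for illustration only: `DelimitedLexemeSequence` and the `" | "` marker are hypothetical and are not taken from the code that was merged.

```swift
// Hypothetical sketch, not the merged implementation: separate the lookahead
// token (head) from the rest of the buffer (tail) with a visible delimiter.
struct DelimitedLexemeSequence: CustomDebugStringConvertible {
  var nextToken: String
  var remainingSource: String
  // Assumed marker; any string that cannot be confused with source text works.
  var delimiter = " | "

  var debugDescription: String {
    nextToken + delimiter + remainingSource
  }
}

let delimited = DelimitedLexemeSequence(
  nextToken: "func",
  remainingSource: " foo() {}"
)
let delimitedText = String(reflecting: delimited)
print(delimitedText)
```

The delimiter makes it visible where the already-lexed lookahead token ends and the unlexed buffer begins, including any whitespace at the start of the tail.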

CodaFi · 2022-09-20T17:14:26Z

Sources/SwiftParser/Lexer.swift

@@ -97,13 +97,17 @@ public struct Lexer {
      SyntaxText(baseAddress: start.advanced(by: leadingTriviaByteLength+textByteLength),
                 count: trailingTriviaByteLength)
    }
+
+    public var debugDescription: String {
+      return String(syntaxText: SyntaxText(baseAddress: start, count: byteLength))


It’s a good start. Maybe we can render the flagset here. And even dump the trivia.

ahoppen · Sep 21, 2022

The flags are still being rendered if you `po currentToken` in the parser. This is what the debug output looks like with this change.

    (lldb) po self.currentToken
    ▿ func
      - tokenKind : SwiftSyntax.RawTokenKind.funcKeyword
      ▿ flags : Flags
        - rawValue : 0
      ▿ start : 0x0000000104014000
        - pointerValue : 4362158080
      - leadingTriviaByteLength : 0
      - textByteLength : 4
      - trailingTriviaByteLength : 1

ahoppen · 2022-10-22T16:45:04Z

@swift-ci Please test

ahoppen requested a review from CodaFi September 9, 2022 05:13

ahoppen force-pushed the ahoppen/debug-descriptions branch from 47952bc to c9b63d1 Compare September 20, 2022 13:37

CodaFi reviewed Sep 20, 2022

ahoppen added 2 commits October 21, 2022 19:05

Add debug description to lexeme printing the lexeme’s content

b604a48

Make LexemeSequence CustomDebugStringConvertible

f9d8e9f

ahoppen force-pushed the ahoppen/debug-descriptions branch from c9b63d1 to f9d8e9f Compare October 22, 2022 16:44

ahoppen merged commit 87d4a8e into swiftlang:main Oct 22, 2022

ahoppen deleted the ahoppen/debug-descriptions branch October 22, 2022 20:14
