-
Notifications
You must be signed in to change notification settings - Fork 441
Refactor the lexer to store lexer errors in the tokens instead of having a diagnosticsHandler #1191
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please test |
Sources/SwiftParser/Lexer.swift
Outdated
self.advance(while: { $0.isValidIdentifierContinuationCodePoint }) | ||
return (.integerLiteral, [.isErroneous]) | ||
return LexerResult(.integerLiteral) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to accept the literal 0x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This one is still an issue too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is not a regression, I just opened #1214 so we can fix it in a follow-up PR.
Sources/SwiftSyntax/LexerError.swift
Outdated
/// If the token has a lexical error, this defines the type of the error. | ||
/// `lexerErrorOffset` in the token will specify at which offset the error | ||
/// occurred. | ||
public enum LexerError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any thoughts on something like:
public struct LexerError {
public enum Kind {
case ...
}
public let kind: Kind
public let byteOffset: UInt16 // No need for alias, it's only defined in one place now
public init(...) { ... }
}
And then just having the one public var lexerError: LexerError
field and a
public var hasLexerError: Bool {
return lexerError != nil
}
?
This would also simplify all the comments that mention "if there's an error, lexerErrorOffset
will specify the offset".
As an aside, I'd personally be fine with UInt8
as the offset type. Do people really have more than 255 characters for a token?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s a much better idea. No idea why I modeled it the way I did…
As an aside, I'd personally be fine with UInt8 as the offset type. Do people really have more than 255 characters for a token?
String literals are currently single tokens and those can reasonably be bigger than 255 characters, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
String literals are currently single tokens and those can reasonably be bigger than 255 characters, I think.
Oh, unfortunate.
80cd97b
to
f80753d
Compare
@@ -169,6 +169,11 @@ public struct TokenSyntax: SyntaxProtocol, SyntaxHashable { | |||
public func childNameForDiagnostics(_ index: SyntaxChildrenIndex) -> String? { | |||
return nil | |||
} | |||
|
|||
/// If the token has a lexial error, the type of the error. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// If the token has a lexial error, the type of the error. | |
/// If the token has a lexical error, the type of the error. |
if tokenView.hasLexerError || tokenView.presence == .missing { | ||
if tokenView.lexerError != nil || tokenView.presence == .missing { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it wouldn't hurt to add a hasLexerError
, but up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer to remove it.
f80753d
to
58554b0
Compare
58554b0
to
13e60bb
Compare
@swift-ci Please test |
13e60bb
to
b72c418
Compare
…ing a diagnosticsHandler Instead of marking a token as having an error and relying on re-lexing to emit a diagnostic, store the type of error and the offset at which it occurred in the token itself. This is a more sound design IMO and will continue to work even if we introduce state into the lexer (e.g. whether we are currently in a string literal or inside a conflict marker range).
b72c418
to
8345eac
Compare
@swift-ci Please test |
Instead of marking a token as having an error and relying on re-lexing to emit a diagnostic, store the type of error and the offset at which it occurred in the token itself.
This is a more sound design IMO and will continue to work even if we introduce state into the lexer (e.g. whether we are currently in a string literal or inside a conflict marker range).