Parse 0.2 after as float literal, not member access #1277

Merged 1 commit on Feb 2, 2023
19 changes: 15 additions & 4 deletions Sources/SwiftParser/Lexer/Cursor.swift
@@ -201,6 +201,8 @@ extension Lexer {
struct Cursor {
var input: UnsafeBufferPointer<UInt8>
var previous: UInt8
/// If we have already lexed a token, the kind of the previously lexed token
var previousTokenKind: RawTokenBaseKind?
private var stateStack: StateStack = StateStack()

init(input: UnsafeBufferPointer<UInt8>, previous: UInt8) {
@@ -335,6 +337,7 @@ extension Lexer.Cursor {
flags.insert(.isAtStartOfLine)
}

self.previousTokenKind = result.tokenKind.base
@bnbarham (Contributor), Jan 26, 2023:

There's Cursor.backUp, where previousTokenKind wouldn't be correct any more. We also don't set the state in there, but I suppose the assumption is that we're always in the normal state when splitting tokens.

(Just have to pass the token kind through to backUp)

Member Author:

Cursor.backUp is still correct because the previous token keeps the same kind when we resetForSplit, which is the only caller of this function (and should remain the only caller).

If we have just lexed <.. as a binary operator, previousTokenKind is binaryOperator. If we now back up by two characters to consume only <, we are placed at the first . but the previous token is still of kind binaryOperator. So no change to previousTokenKind is necessary here.
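
To illustrate the argument above, here is a minimal sketch of a cursor that records the kind of the last token and backs up within it. `ToyTokenKind`, `ToyCursor`, `consume(_:as:)` and `lastTokenLength` are hypothetical names invented for this example; they are not the swift-syntax types, and the real `backUp(by:)` also restores the `previous` byte.

```swift
// Hypothetical, simplified stand-ins for the lexer's types (assumption:
// none of these names exist in swift-syntax).
enum ToyTokenKind { case binaryOperator, period, integerLiteral }

struct ToyCursor {
    let bytes: [UInt8]
    var position: Int = 0
    var lastTokenLength: Int = 0
    var previousTokenKind: ToyTokenKind? = nil

    /// Consume `length` bytes as one token of `kind` and remember that kind.
    mutating func consume(_ length: Int, as kind: ToyTokenKind) {
        position += length
        lastTokenLength = length
        previousTokenKind = kind
    }

    /// Move back by `offset` bytes while staying inside the token just lexed.
    /// `previousTokenKind` is deliberately left untouched.
    mutating func backUp(by offset: Int) {
        precondition(offset < lastTokenLength, "must not back up past the last token")
        position -= offset
        lastTokenLength -= offset
    }
}

var cursor = ToyCursor(bytes: Array("<..".utf8))
cursor.consume(3, as: .binaryOperator) // lexed "<.." as one operator
cursor.backUp(by: 2)                   // split: keep only "<"

// The cursor now sits on the first ".", and the token before it ("<")
// is still a binary operator, so the recorded kind remains valid.
print(cursor.position, cursor.previousTokenKind == .binaryOperator)
```

The precondition models the constraint discussed here: backing up further than the last token would leave the recorded kind describing the wrong token.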

Contributor:

Makes sense. It would be nice to have a comment mentioning this (just in case we ever end up calling backUp elsewhere). Thanks for the explanation!

Member Author:

Added one

let error = result.error.map { error in
return LexerError(error.kind, byteOffset: cursor.distance(to: error.position))
}
@@ -676,6 +679,9 @@ extension Lexer.Cursor {
}

/// Revert the lexer by `offset` bytes. This should only be used by `resetForSplit`.
/// This must not back up by more bytes than the last token because that would
/// require us to also update `previousTokenKind`, which we don't do in this
/// function.
mutating func backUp(by offset: Int) {
assert(!self.isAtStartOfFile)
self.previous = self.input.baseAddress!.advanced(by: -(offset + 1)).pointee
@@ -1224,11 +1230,16 @@ extension Lexer.Cursor {

// TODO: This can probably be unified with lexHexNumber somehow

// Lex things like 4.x as '4' followed by a tok::period.
if self.is(at: ".") {
// NextToken is the soon to be previous token
// Therefore: x.0.1 is sub-tuple access, not x.float_literal
if let peeked = self.peek(at: 1), !Unicode.Scalar(peeked).isDigit || tokenStart.previous == UInt8(ascii: ".") {
if self.peek(at: 1) == nil {
// If there are no more digits following the '.', we don't have a float
// literal.
return Lexer.Result(.integerLiteral)
} else if let peeked = self.peek(at: 1), !Unicode.Scalar(peeked).isDigit {
// ".a" is a member access and certainly not a float literal
return Lexer.Result(.integerLiteral)
} else if self.previousTokenKind == .period {
// Lex x.0.1 as sub-tuple access, not x.float_literal.
return Lexer.Result(.integerLiteral)
}
} else if self.isAtEndOfFile || self.is(notAt: "e", "E") {
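
The decision this hunk makes can be sketched outside the real lexer. The toy lexer below is an illustration under stated assumptions: `ToyKind`, `ToyLexeme` and `toyLex` are hypothetical names, it handles only ASCII identifiers, digits and runs of `.`, and it ignores exponents and hex literals. It only shows why consulting the kind of the previous token, rather than the previous byte as the removed condition did, lets `0.2` after `...` be read as a float while `x.13.1` stays tuple access.

```swift
// Hypothetical toy lexer; these names are illustrative, not swift-syntax API.
enum ToyKind: Equatable {
    case identifier, period, integerLiteral, floatingLiteral, binaryOperator
}

struct ToyLexeme: Equatable {
    let kind: ToyKind
    let text: String
}

func toyLex(_ source: String) -> [ToyLexeme] {
    let bytes = Array(source.utf8)
    var lexemes: [ToyLexeme] = []
    var previousKind: ToyKind? = nil
    var i = 0

    func isDigit(_ b: UInt8) -> Bool {
        b >= UInt8(ascii: "0") && b <= UInt8(ascii: "9")
    }
    func isLetter(_ b: UInt8) -> Bool {
        (b >= UInt8(ascii: "a") && b <= UInt8(ascii: "z"))
            || (b >= UInt8(ascii: "A") && b <= UInt8(ascii: "Z"))
    }

    while i < bytes.count {
        let start = i
        let kind: ToyKind
        if isDigit(bytes[i]) {
            while i < bytes.count, isDigit(bytes[i]) { i += 1 }
            let atPeriod = i < bytes.count && bytes[i] == UInt8(ascii: ".")
            let digitFollows = i + 1 < bytes.count && isDigit(bytes[i + 1])
            // A fraction needs a digit after the "." and must not come
            // right after a period token (x.0.1 is tuple access).
            if atPeriod && digitFollows && previousKind != .period {
                i += 1
                while i < bytes.count, isDigit(bytes[i]) { i += 1 }
                kind = .floatingLiteral
            } else {
                kind = .integerLiteral
            }
        } else if isLetter(bytes[i]) {
            while i < bytes.count, isLetter(bytes[i]) || isDigit(bytes[i]) { i += 1 }
            kind = .identifier
        } else if bytes[i] == UInt8(ascii: ".") {
            // A single "." is a period; a longer run is treated as an operator.
            while i < bytes.count, bytes[i] == UInt8(ascii: ".") { i += 1 }
            kind = (i - start == 1) ? .period : .binaryOperator
        } else {
            i += 1 // skip anything this sketch does not model
            continue
        }
        lexemes.append(ToyLexeme(kind: kind, text: String(decoding: bytes[start..<i], as: UTF8.self)))
        previousKind = kind
    }
    return lexemes
}

print(toyLex("0.1...0.2").map { $0.text })
```

In `0.1...0.2` the byte before `0.2` is a `.`, but the previous token is the operator `...`, so a byte-based check would split `0.2` while the kind-based check keeps it whole.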
34 changes: 34 additions & 0 deletions Tests/SwiftParserTest/LexerTests.swift
@@ -1002,4 +1002,38 @@ public class LexerTests: XCTestCase {
]
)
}

func testMultiDigitTupleAccess() {
AssertLexemes(
"x.13.1",
lexemes: [
LexemeSpec(.identifier, text: "x"),
LexemeSpec(.period, text: "."),
LexemeSpec(.integerLiteral, text: "13"),
LexemeSpec(.period, text: "."),
LexemeSpec(.integerLiteral, text: "1"),
]
)
}

func testFloatingPointNumberAfterRangeOperator() {
AssertLexemes(
"0.1...0.2",
lexemes: [
LexemeSpec(.floatingLiteral, text: "0.1"),
LexemeSpec(.binaryOperator, text: "..."),
LexemeSpec(.floatingLiteral, text: "0.2"),
]
)
}

func testUnterminatedFloatLiteral() {
AssertLexemes(
"0.",
lexemes: [
LexemeSpec(.integerLiteral, text: "0"),
LexemeSpec(.period, text: "."),
]
)
}
}