[Parser, NameLookup, ASTScope] Parser changes for lazy ASTScopes #26768

davidungar · 2019-08-21T18:41:05Z

Factors locations for missing close braces (etc.) and errors into two functions: 'getConfabulatedMatchingTokenLoc' and 'getErrorOrMissingLocForLazyASTScopes'.
Also takes AppendingExpr into account when computing the end location of a TapExpr.
Adds a check in ASTVerifier for the end locations.
Moves the fictional closing brace in the synthesized expression for an InterpolatedStringLiteralExpr back one character so it coincides with the closing quote, in order to avoid a nesting violation.
Leaves the PreviousTokenLoc pointing to the start of an InterpolatedStringLiteralExpr instead of somewhere in the middle, and uses that to locate a missing closing brace at the close quote.

davidungar · 2019-08-21T19:05:32Z

@swift-ci please smoke test os x platform

davidungar · 2019-08-21T20:46:43Z

@swift-ci please test source compatibility

akyrtzi

These changes break the invariant that the EndLocs of AST nodes are token-based. My recommendation is to preserve token-based EndLocs, and have AST scopes record their CharSourceRanges and do their binary lookup using character ranges.
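As a rough illustration of this recommendation, here is a minimal sketch of keeping token-based ranges on the AST while letting scopes do their lookup on character ranges. All types and function names here (`TokenRange`, `CharRange`, `toCharRange`, `findScopeContaining`) are hypothetical stand-ins using plain offsets, not the compiler's actual `SourceRange`/`CharSourceRange` API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token-based range: both ends are START offsets of tokens.
struct TokenRange {
  std::size_t startTokLoc;
  std::size_t endTokLoc;
};

// Hypothetical character-based, half-open range [begin, end).
struct CharRange {
  std::size_t begin;
  std::size_t end;
  bool contains(std::size_t off) const { return begin <= off && off < end; }
};

// Convert a token-based range to a character range by extending the end
// to one past the last character of the final token.
CharRange toCharRange(TokenRange r, std::size_t endTokLength) {
  assert(r.startTokLoc <= r.endTokLoc);
  return {r.startTokLoc, r.endTokLoc + endTokLength};
}

// Binary search over sorted, non-overlapping sibling scopes.
// Returns the index of the scope containing `off`, or -1 if none does.
int findScopeContaining(const std::vector<CharRange> &scopes, std::size_t off) {
  // First scope that begins strictly after `off`.
  auto it = std::upper_bound(
      scopes.begin(), scopes.end(), off,
      [](std::size_t o, const CharRange &s) { return o < s.begin; });
  if (it == scopes.begin())
    return -1;
  --it; // Last scope that begins at or before `off`.
  return it->contains(off) ? static_cast<int>(it - scopes.begin()) : -1;
}
```

The point of the sketch is that the AST node never needs a character-based EndLoc: the scope derives its own character range once, from the token-based range plus the length of the last token.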

akyrtzi · 2019-08-22T00:08:18Z

lib/Parse/ParseExpr.cpp

+  // of the literal so that it doesn't go past where a missing close brace
+  // for an enclosing IterableTypeDecl will be added.
+  // (See \c parseMatchingToken.)
+  SourceLoc EndLoc = Loc.getAdvancedLoc(Tok.getLength() - 1);


This EndLoc that is used for the "closing brace" is character-based; it should actually be token-based (it was character-based before this change as well, but we should correct it).

It's not clear to me whether the implicit BraceStmt that is created for the TapExpr is supposed to cover the SourceRange of the whole string literal or just the range of the expression segments. If it is supposed to cover the whole string literal, then it should match the string literal's SourceRange, whose start and end locations are both the location of the string literal token, i.e. where the literal begins.

Sounds good; I'll try that. As you saw, before this change it was one character past the end of the token; neither fish nor fowl.

I think it will work better to point at the last token in the AppendingExpression. Going to try that.

akyrtzi · 2019-08-22T00:13:40Z

lib/Parse/Parser.cpp

+  // middle.
+  return PreviousTok.getKind() != tok::string_literal
+    ? PreviousLoc
+    : PreviousLoc.getAdvancedLoc(PreviousTok.getLength() - 1);


Same issue here: this returns a character-based location. I think this should always return PreviousLoc, which AFAICT is the location of the last parsed token.

I can do this, and try the change suggested to ASTScopes (using character locations). And I'm happy to give it a whirl. But I'm still bothered because the notion of "last parsed token" seems ill-defined to me, as if we're building on sand.

Consider "\(foo)". You could say that the last parsed token is the whole interpolated string literal, except that the last parsed token is also the closing parenthesis, or maybe the foo.
Before I added the save-and-restore line for PreviousLoc, that value did point to the foo, IIRC. Part of the confusion is in the word "last", I suppose. Is that the most-recently-parsed? (In which case the ambiguity is whether it's the most-recently-started-to-parse or the most-recently-finished-to-parse.) Or is it the last token that has been parsed in the file? (I.e., the token furthest from the start of the file that has been parsed?)

If you help me understand this, I'll try to add some comments, too.

The thing that complicates the interpolated string literal is that it is a "tokens within a token" situation, but a way to reason about it is that there are these sets of tokens:

The string literal token "\(foo)"

The interpolated tokens '(', 'foo', ')'

Then the question is what kind of source range you want. If you want the range of the whole string literal then you should use the literal token range. If you want the range of the interpolated segment then you should use the interpolated tokens.

But you cannot have the EndLoc of the SourceRange be the closing quote, because the closing quote is not a token, and the SourceRange you'd form with it would not be token-based.
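To make the "tokens within a token" picture concrete, here is a small sketch using plain character offsets into the text `"\(foo)"`. The `Tok` and `TokRange` types and the helper names are hypothetical, chosen only for illustration; they are not the parser's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical token: start offset and length in characters.
struct Tok {
  std::size_t loc;
  std::size_t len;
};

// Hypothetical token-based range: both ends are token START offsets.
struct TokRange {
  std::size_t start;
  std::size_t end;
};

// Range of the whole literal: it is a single token, so the start and the
// end are both the location where the literal begins.
TokRange literalRange(const Tok &literal) {
  return {literal.loc, literal.loc};
}

// Range of the interpolated segment: from the first inner token to the
// START of the last inner token.
TokRange segmentRange(const std::vector<Tok> &inner) {
  assert(!inner.empty());
  return {inner.front().loc, inner.back().loc};
}

// A location is usable as a token-based EndLoc only if some token starts there.
bool isTokenStart(std::size_t loc, const Tok &literal,
                  const std::vector<Tok> &inner) {
  if (loc == literal.loc)
    return true;
  for (const Tok &t : inner)
    if (t.loc == loc)
      return true;
  return false;
}
```

With the literal at offset 0 (length 8) and inner tokens `(` at 2, `foo` at 3, and `)` at 6, the closing quote at offset 7 is not the start of any token, which is why it cannot serve as a token-based EndLoc.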

Thanks, that helps. Given that, if I want to keep ASTScope based on token-like SourceRanges, the key would be to look inside the string literal, at the tokens of the expression within, and take the location of the last of those tokens, the ')'. Or, as you say, change ASTScopes to be character-location based.

The other question is where should the EndLoc of a StructDecl (etc.) be when the close brace is missing? Given the rule that SourceLocs refer to tokens, and that sub-nodes are enclosed (perhaps improperly) by super-nodes, one could put that EndLoc either at the SourceLoc of the string literal (i.e. the open '"'), or at the SourceLoc of the ')'. Do you have a sense of which of these would be more fitting? What's tough here is that we're talking about illegal code.

In the latter case, I think I saw other things break, so I suspect the closing brace belongs with the open quote of the string literal.

Going to try using the SourceLoc of the string literal, the open quote.

davidungar · 2019-08-22T00:42:54Z

Thanks a lot, @akyrtzi , for getting to this so quickly. I'll try out the directions you suggest.

davidungar · 2019-08-22T05:19:38Z

@akyrtzi I'll make the changes I think will work for the invariants, and do some testing tomorrow. Will push when it looks like it will work, and let you know so you can take a look. Thanks again!

davidungar · 2019-08-22T05:27:43Z

@akyrtzi Update: I pushed, but haven't tested or proofread yet.

davidungar · 2019-08-22T05:37:28Z

@swift-ci please test source code compatibility

davidungar · 2019-08-22T05:37:40Z

@swift-ci please test

swift-ci · 2019-08-22T05:39:53Z

Build failed
Swift Test Linux Platform
Git Sha - d84d541

swift-ci · 2019-08-22T05:41:26Z

Build failed
Swift Test OS X Platform
Git Sha - d84d541

davidungar · 2019-08-22T05:42:14Z

@akyrtzi I think this PR is now ready for your perusal. I haven't tested it yet, nor changed ASTScope to work with it. I'm thinking that a lazy IterableTypeDecl scope can ask the lexer for the token at the EndLoc of an IterableTypeDecl. If it's an InterpolatedStringLiteral, it can lex it to find the end. And, as you suggest, the ASTScope system can use the charSourceLocations. For expressions in general, come to think of it, the same thing ought to work: lex the EndLoc of the expression, and use the charSourceLoc of the end. Will try that tomorrow.
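The idea of re-lexing the token at a token-based EndLoc to recover its character end might look roughly like the sketch below. This is a toy re-lexer over a plain string, not the compiler's Lexer: the function name `endOfTokenAt` is hypothetical, and the string-literal case is deliberately simplified (it tracks parenthesis depth inside `\( ... )` but does not handle string literals nested inside an interpolation).

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the offset one past the last character of the token that starts
// at `start`. Hypothetical helper, for illustration only.
std::size_t endOfTokenAt(const std::string &src, std::size_t start) {
  assert(start < src.size());
  unsigned char c = static_cast<unsigned char>(src[start]);

  if (c == '"') {
    // String literal, possibly interpolated: scan to the matching close quote.
    std::size_t i = start + 1;
    int depth = 0; // Parenthesis depth inside a \( ... ) interpolation.
    while (i < src.size()) {
      char ch = src[i];
      if (depth == 0) {
        if (ch == '\\' && i + 1 < src.size()) {
          if (src[i + 1] == '(')
            depth = 1; // Entering an interpolation.
          i += 2;      // Skip the escape sequence.
          continue;
        }
        if (ch == '"')
          return i + 1; // One past the closing quote.
      } else if (ch == '(') {
        ++depth;
      } else if (ch == ')') {
        --depth;
      }
      ++i;
    }
    return src.size(); // Unterminated literal: run to end of buffer.
  }

  if (std::isalnum(c) || c == '_') {
    // Identifier or number: consume the alphanumeric run.
    std::size_t i = start;
    while (i < src.size() &&
           (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
      ++i;
    return i;
  }

  return start + 1; // Treat anything else as single-character punctuation.
}
```

A scope holding a token-based EndLoc could then form its character range as the half-open interval from its start offset to `endOfTokenAt(src, endLoc)`, leaving the AST's token-based locations untouched.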

akyrtzi · 2019-08-22T17:34:07Z

lib/Parse/ParseExpr.cpp

-    auto Body = BraceStmt::create(Context, Loc, Stmts, EndLoc,
+    // At this point, PreviousLoc points to the last token parsed within
+    // the body, so use that for the brace statement location.
+    auto Body = BraceStmt::create(Context, Loc, Stmts, PreviousLoc,


The source range of this BraceStmt is rather strange because, according to the comment, the start loc is the start of the literal and the end loc is at a token inside it, like this:

" blah \(some) blah \(thing) blah " ^ ^

I'd recommend that it should either be the full range of the literal ([Loc, Loc]), or the range of the statements ([Stmts.first.StartLoc, Stmts.last.EndLoc]), or this range if you can manage it:

" blah \(some) blah \(thing) blah " ^ ^

I like the Stmts.{first,last} idea. I'll try that.

davidungar · 2019-08-23T01:26:02Z

@swift-ci please smoke test

davidungar · 2019-08-23T03:05:08Z

@swift-ci please test source compatibility

davidungar · 2019-08-23T20:46:08Z

@akyrtzi Thanks!!

Parser changes for lazy ASTScopes

0dbf7e6

davidungar changed the title ~~Parser changes for lazy ASTScopes~~ [WIP, DNM, Parser, NameLookup, ASTScope] Parser changes for lazy ASTScopes Aug 21, 2019

De-lazify

d84d541

davidungar changed the title ~~[WIP, DNM, Parser, NameLookup, ASTScope] Parser changes for lazy ASTScopes~~ [Parser, NameLookup, ASTScope] Parser changes for lazy ASTScopes Aug 21, 2019

davidungar requested review from beccadax and akyrtzi August 21, 2019 20:45

akyrtzi requested changes Aug 22, 2019

davidungar force-pushed the A-8-21-parserPR branch 2 times, most recently from 4c880d6 to 79c7cbd Compare August 22, 2019 05:33

Use token start locations

a40b694

davidungar force-pushed the A-8-21-parserPR branch from 79c7cbd to a40b694 Compare August 22, 2019 05:36

akyrtzi requested changes Aug 22, 2019

Back off and strategically retreat from more sensible ranges for TapExpressions and their subexpressions

35e8218

akyrtzi approved these changes Aug 23, 2019

davidungar merged commit 72b1e27 into swiftlang:master Aug 23, 2019
