Skip to content

Parser: avoid on-demand parsing when parsing interface tokens. #19192

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 35 commits into from

Conversation

nkcsgexi
Copy link
Contributor

@nkcsgexi nkcsgexi commented Sep 7, 2018

This is to address @rintaro 's comments on the previous PR about on-demand parsing. The interface hash relies on the order of tokens getting consumed; thus on-demand parsing will make the generated hash unstable.

@nkcsgexi
Copy link
Contributor Author

nkcsgexi commented Sep 7, 2018

@swift-ci please smoke test

@jrose-apple
Copy link
Contributor

Can you add a test case? We definitely don't want to regress here!

@rintaro
Copy link
Member

rintaro commented Sep 7, 2018

I'm worrying about delayed parsing might randomly updates the interface-hash.
But I think we cannot simply disable it for actual delayed parsing, because if we do that, changes in iterable decl contexts aren't reflected in interface-hash..

@@ -3332,6 +3332,8 @@ bool Parser::parseDeclList(SourceLoc LBLoc, SourceLoc &RBLoc,
}

bool Parser::canDelayBodyParsing() {
if (IsParsingInterfaceTokens)
return false;
Copy link
Member

@rintaro rintaro Sep 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this disables "delay body parsing" effectively.
llvm::SaveAndRestore<bool> S(P.IsParsingInterfaceTokens, true) is at swift::parseIntoSourceFile() (it's in ParseSIL.cpp but this is the main entry point of parsing)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, does it mean we always collect interface tokens for hashing and this requires no delayed parsing at all?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, something like this:

/*<enabled>*/
class C {
  var value = 12 // This affects interface-hash because modifying the initializer might change the type of the property. 

  func foo<T>()  where T : P /*<disabled>*/{
    // Here, no token affects interface-hash.
  }/*</disabled>*/

  func bar() -> Baz /*<disabled>*/{
  }/*</disabled>*/
}
/*</enabled>*/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the example! probably the only parsing can be delayed is for function bodies then :(

@nkcsgexi
Copy link
Contributor Author

nkcsgexi commented Sep 7, 2018

@jrose-apple is it possible that we generate interface-hash from the AST level rather than the token level? which solves this breakages fundamentally?

@jrose-apple
Copy link
Contributor

We don't need to generate interface hashes for anything but the primary source files, which as @slavapestov pointed out are the ones that we are going to parse everything for anyway. And we don't need to generate it if we're not generating reference dependencies.

Generating the interface hash from the AST is a good idea and also terrifying, since it's way easier to miss something. It's not something I want to just sneak in.

@nkcsgexi
Copy link
Contributor Author

nkcsgexi commented Sep 7, 2018

But P.IsParsingInterfaceTokens is always set true in ParseSIL.cpp:120. Do we ever set the flag to false in the parser except skipped function bodies? or are you saying we collect these hashes all the time but we don't use them except for primary files?

@jrose-apple
Copy link
Contributor

The latter, sorry. We could have parseIntoSourceFile take a flag to decide whether we need it, or put the flag in PersistentParserState (which is already per-SourceFile, right?).

@jrose-apple
Copy link
Contributor

Or make SourceFile's InterfaceHash Optional, and add a getter to decide if it's wanted or not.

@nkcsgexi
Copy link
Contributor Author

nkcsgexi commented Sep 7, 2018

Can we make parser collect these tokens only when front-end action -dump-interface-hash is specified? disregarding of the file under parsing is primary or not?

@jrose-apple
Copy link
Contributor

No, -dump-interface-hash is a debugging tool. The common use for interface hashes is in reference dependency files.

airspeedswift and others added 14 commits September 10, 2018 14:53
* Switch Array to use subscript _modify

* Add _modify to ContiguousArray

* XFAIL linux failing test
Package up the logic that generates a full Clang module name, so that
(a) we don't have to deal with clang::Module in quite as many places
in the /Swift/ compiler, and (b) we can avoid the cost of a temporary
string in a few places.

The main places where this is /not/ adopted is where we don't just
want to know the parent module name, but actually the module itself.
This is mostly indexing-related queries, which use the very similar
ModuleEntity class also defined in Module.h. I didn't quite see an
obvious way to unify these, but that might be where we want to go.

No functionality change.
Fixes a longstanding issue where submodules with the same name in
different top-level modules weren't being sorted deterministically.
This doesn't come up very much in practice, and it would have been
hard to notice anything wrong, but it's good to be right.
Since this library uses ADT headers, it will pull in a dependency for
LLVMSupport (Error.cpp has the necessary variable declaration for ABI breaking
checks).  Disable the ABI breaking check validation as we only use the library
for header-only dependencies.  Avoiding this dependency shaves nearly 900 KiB
from the resultant binary.
It's not correct to depend on the order of entries in a witness table
if the protocol is resilient, so don't even try.
This eliminates the 'non-resilient class' case here entirely since
all class methods have method descriptors.
broken by b4a23ad

rdar://problem/44229005 - for erik
* [AST] Remove stored TypeLoc from TypedPattern

TypedPattern was only using this TypeLoc as a means to a TypeRepr, which
caused it to store the pattern type twice (through the superclass and through
the TypeLoc itself.)

This also fixes a bug where deserializing a TypedPattern doesn't store
the type correctly and generally cleans up TypedPattern initialization.

Resolves rdar://44144435

* Address review comments
Now we properly enforce that all guaranteed phi arguments's lifetime is
completely enclosed by the lifetimes of the phi argument's inputs. This is
implemented by inserting an "implicit" regular use into the use stream of the
phi argument's inputs. This is nice since it ensures that when we are verifying
ownership of a specific argument, we will not walk through all
subarguments... instead we just need to verify that the end_borrow is completely
enclosed within the lifetime of the input guaranteed value.

I also fixed a case where we were bailing early out of the SIL Ownership
Verifier causing us to emit an incorrect error. This error would only occur if
we already diagnosed an error.

I also re-added all of the tests that I removed when I replaced
end_borrow_argument with end_borrow.

rdar://33440767
…nd interface types.

Works around an oddity of the associated type inference when the
potential witness is generic and returns (e.g.) Self.

Associated type inference needs to move off of archetypes for the
implementation to become sane. For now, this eliminates a crash.

Fixes rdar://problem/43591024.
…initializers.

Before adding implicit initializers synthesizes a new initializer, make sure that
we’ve validated the current class declaration to determine it’s generic signature.

Fixes rdar://problem/44235762.
slavapestov and others added 20 commits September 10, 2018 14:53
…e dispatch thunk

Make sure we check resilience of the class defining the method,
and not the class with the override we're calling.
…ta layouts

If the class is resilient to us, we cannot assume anything about the
layout of its member area, so let's not even try; just skip visiting
the members entirely.
…ber parsing.

New construction of a parser instance will reset the marked parser
position in PersistentParserState. We should recover the marked parser
position during on-demand parsing in case the marked position will be
later used to create another parser instance to continue parsing
linearly.
This enables in-place mutations of the key-based subscript.
These get rid of one of two full hash table lookup operations (including hashing the key) relative to the getter+setter approach. They can also eliminate COW copies in code like this:

dictionary[foo]?.append(bar)

(Admittedly this isn’t a very useful example.)
in attribute message

Strings of diagnostics message processed in EncodedDiagnosticMessage
aren't necessarily from parsed Swift source code. That means, they might
not have quotes around it. Furthermore, memory around them might not be
managed. The logic in 'Lexer::getEncodedStringSegment()' used to cause
access violation.

For now, disable multiline string literal and extended escaping in string
literal for attribute message position. Considering the message might be
from Clang, we cannot simply enable this.

rdar://problem/44228891
This should have happened when closures_swift3.swift was removed.
When emitting accessor calls for keypaths, make sure that we reference
protocol requirements that introduce witness table entries. Other
protocol requirements don't have the necessary dispatch thunks,
resulting in linker errors.

Fixes rdar://problem/44187969.
@nkcsgexi nkcsgexi closed this Sep 10, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants