Skip to content

[Serialization] Add a "nested types" lookup table for partial modules #7067

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 27, 2017

Conversation

jrose-apple
Copy link
Contributor

There's a class of errors in Serialization called "circularity issues", where declaration A in file A.swift depends on declaration B in file B.swift, and B also depends on A. In some cases we can manage to type-check each of these files individually due to the laziness of 'validateDecl', but then fail to merge the "partial modules" generated from A.swift and B.swift to form a single swiftmodule for the library (because deserialization is a little less lazy for some things). A common case of this is when at least one of the declarations is nested, in which case a lookup to find that declaration needs to load all the members of the parent type. This gets even worse when the nested type is defined in an extension.

This commit sidesteps that issue specifically for nested types by creating a top-level, per-file table of nested types in the "partial modules". When a type is in the same module, we can then look it up without importing all other members of the parent type.

The long-term solution is to allow accessing any members of a type without having to load them all, something we should support not just for module-merging while building a single target but when reading from imported modules as well. This should improve both compile time and memory usage, though I'm not sure to what extent. (Unfortunately, too many things still depend on the whole members list being loaded.)

Because this is a new code path, I put in a switch to turn it off: frontend flag -disable-serialization-nested-type-lookup-table.

SR-3707 / rdar://problem/30172856 (and possibly others)

- We aren't using derived top-level declarations anymore.
- We're looking for more than just operators.
- Nested members of local decls are still local, so we don't need to record
  their Objective-C methods.

No intended functionality change.
There's a class of errors in Serialization called "circularity
issues", where declaration A in file A.swift depends on declaration B
in file B.swift, and B also depends on A. In some cases we can manage
to type-check each of these files individually due to the laziness of
'validateDecl', but then fail to merge the "partial modules" generated
from A.swift and B.swift to form a single swiftmodule for the library
(because deserialization is a little less lazy for some things). A
common case of this is when at least one of the declarations is
nested, in which case a lookup to find that declaration needs to load
all the members of the parent type. This gets even worse when the
nested type is defined in an extension.

This commit sidesteps that issue specifically for nested types by
creating a top-level, per-file table of nested types in the "partial
modules". When a type is in the same module, we can then look it up
/without/ importing all other members of the parent type.

The long-term solution is to allow accessing any members of a type
without having to load them all, something we should support not just
for module-merging while building a single target but when reading
from imported modules as well. This should improve both compile time
and memory usage, though I'm not sure to what extent. (Unfortunately,
too many things still depend on the whole members list being loaded.)

Because this is a new code path, I put in a switch to turn it off:
frontend flag -disable-serialization-nested-type-lookup-table

https://bugs.swift.org/browse/SR-3707 (and possibly others)
@jrose-apple
Copy link
Contributor Author

I'd like to pull this into 3.1, because SR-3707 is a regression, but it's definitely a fairly major change to this part of serialization. What do you think, @DougGregor?

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 8145cd0
Test requested by - @jrose-apple

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test Linux Platform
Git Commit - 8145cd0
Test requested by - @jrose-apple

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test

@swift-ci
Copy link
Contributor

Build failed
Jenkins build - Swift Test OS X Platform
Git Commit - 8145cd0
Test requested by - @jrose-apple

@jrose-apple
Copy link
Contributor Author

@swift-ci Please test macOS

Copy link
Member

@DougGregor DougGregor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite the size of the change, it looks good and relatively safe. Putting the workaround flags into the crash diagnostic is a nice, paranoid touch :)

for (std::pair<DeclID, DeclID> entry : data) {
assert(entry.first);
auto declOrOffset = Decls[entry.first - 1];
if (!declOrOffset.isComplete())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Now I understand the "fault in the extensions" comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I don't love it, but the old code path would have done that anyway.


void EmitKey(raw_ostream &out, key_type_ref key, unsigned len) {
// FIXME: Avoid writing string data for identifiers here.
out << key.str();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was it hard to use IdentifierID's here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was where I was leaning but for now I kept it at close as possible to the existing tables. I'll file an SR for it and maybe someone can pick it up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed SR-3764.

for (const Decl *member : members) {
if (auto memberValue = dyn_cast<ValueDecl>(member)) {
if (!memberValue->hasName())
continue;

if (isDerivedTopLevel) {
topLevelDecls[memberValue->getName()].push_back({
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. Is topLevelDecls fully dead now? It probably died when we started putting synthesized operators into nominal types or extensions thereof rather than at the top level, but I failed to delete this part of the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to collect decls from anywhere but the actual top level anymore, but it's still how we do top-level lookups without deserializing everything.

@jrose-apple jrose-apple merged commit 99d4761 into swiftlang:master Jan 27, 2017
@jrose-apple jrose-apple deleted the nested-type-lookup-table branch January 27, 2017 16:55
jrose-apple added a commit to jrose-apple/swift that referenced this pull request Jan 27, 2017
…-table

[Serialization] Add a "nested types" lookup table for partial modules
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants