-
Notifications
You must be signed in to change notification settings - Fork 10.5k
[Serialization] Add a "nested types" lookup table for partial modules #7067
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Serialization] Add a "nested types" lookup table for partial modules #7067
Conversation
- We aren't using derived top-level declarations anymore. - We're looking for more than just operators. - Nested members of local decls are still local, so we don't need to record their Objective-C methods. No intended functionality change.
There's a class of errors in Serialization called "circularity issues", where declaration A in file A.swift depends on declaration B in file B.swift, and B also depends on A. In some cases we can manage to type-check each of these files individually due to the laziness of 'validateDecl', but then fail to merge the "partial modules" generated from A.swift and B.swift to form a single swiftmodule for the library (because deserialization is a little less lazy for some things). A common case of this is when at least one of the declarations is nested, in which case a lookup to find that declaration needs to load all the members of the parent type. This gets even worse when the nested type is defined in an extension. This commit sidesteps that issue specifically for nested types by creating a top-level, per-file table of nested types in the "partial modules". When a type is in the same module, we can then look it up /without/ importing all other members of the parent type. The long-term solution is to allow accessing any members of a type without having to load them all, something we should support not just for module-merging while building a single target but when reading from imported modules as well. This should improve both compile time and memory usage, though I'm not sure to what extent. (Unfortunately, too many things still depend on the whole members list being loaded.) Because this is a new code path, I put in a switch to turn it off: frontend flag -disable-serialization-nested-type-lookup-table https://bugs.swift.org/browse/SR-3707 (and possibly others)
I'd like to pull this into 3.1, because SR-3707 is a regression, but it's definitely a fairly major change to this part of serialization. What do you think, @DougGregor? @swift-ci Please test |
Build failed |
Build failed |
@swift-ci Please test |
Build failed |
@swift-ci Please test macOS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite the size of the change, it looks good and relatively safe. Putting the workaround flags into the crash diagnostic is a nice, paranoid touch :)
for (std::pair<DeclID, DeclID> entry : data) { | ||
assert(entry.first); | ||
auto declOrOffset = Decls[entry.first - 1]; | ||
if (!declOrOffset.isComplete()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Now I understand the "fault in the extensions" comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. I don't love it, but the old code path would have done that anyway.
|
||
void EmitKey(raw_ostream &out, key_type_ref key, unsigned len) { | ||
// FIXME: Avoid writing string data for identifiers here. | ||
out << key.str(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was it hard to use IdentifierID
's here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was where I was leaning but for now I kept it at close as possible to the existing tables. I'll file an SR for it and maybe someone can pick it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed SR-3764.
for (const Decl *member : members) { | ||
if (auto memberValue = dyn_cast<ValueDecl>(member)) { | ||
if (!memberValue->hasName()) | ||
continue; | ||
|
||
if (isDerivedTopLevel) { | ||
topLevelDecls[memberValue->getName()].push_back({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh. Is topLevelDecls
fully dead now? It probably died when we started putting synthesized operators into nominal types or extensions thereof rather than at the top level, but I failed to delete this part of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to collect decls from anywhere but the actual top level anymore, but it's still how we do top-level lookups without deserializing everything.
…-table [Serialization] Add a "nested types" lookup table for partial modules
There's a class of errors in Serialization called "circularity issues", where declaration A in file A.swift depends on declaration B in file B.swift, and B also depends on A. In some cases we can manage to type-check each of these files individually due to the laziness of 'validateDecl', but then fail to merge the "partial modules" generated from A.swift and B.swift to form a single swiftmodule for the library (because deserialization is a little less lazy for some things). A common case of this is when at least one of the declarations is nested, in which case a lookup to find that declaration needs to load all the members of the parent type. This gets even worse when the nested type is defined in an extension.
This commit sidesteps that issue specifically for nested types by creating a top-level, per-file table of nested types in the "partial modules". When a type is in the same module, we can then look it up without importing all other members of the parent type.
The long-term solution is to allow accessing any members of a type without having to load them all, something we should support not just for module-merging while building a single target but when reading from imported modules as well. This should improve both compile time and memory usage, though I'm not sure to what extent. (Unfortunately, too many things still depend on the whole members list being loaded.)
Because this is a new code path, I put in a switch to turn it off: frontend flag -disable-serialization-nested-type-lookup-table.
SR-3707 / rdar://problem/30172856 (and possibly others)