Skip to content

add support to getTopLevelDecls for clang submodules #76401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jan 30, 2025

Conversation

QuietMisdreavus
Copy link
Contributor

Resolves rdar://126031510

This PR adds partial support for Clang submodules in the ClangModuleUnit code. Specifically, it allows getTopLevelDeclsForDisplay (and the crawling methods it calls) to correctly load decls when given a submodule.

This is a behavior change for this function; specifically, passing in the top-level module now only returns the decls declared specifically in that module, as opposed to adding in the decls from its submodules as well. This broke the module interface printer, which i updated in this PR.

The goal of this was to allow swift-symbolgraph-extract to receive the name of a Clang submodule and correctly load its constituent symbols (without tripping an assertion on assertions builds), which this now enables.

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

To aid code review, i'll provide a bit of a writeup of what i changed:

AST

@hborla @slavapestov @xedin

I added the ClangNode::getOwningClangModule method, which gets the owning module of the underlying Clang decl, if it exists. Module nodes are returned as-is, and MacroInfo macros return nullptr since they do not have a recorded owning module.

IDE

@ahoppen @bnbarham @hamishknight @rintaro

I had to update the module interface printer to accommodate the change in behavior surrounding Clang modules. Parent Clang modules now no longer report the decls of their submodules, so they will need to be crawled separately. I split out a list of "modules to crawl" and added submodules to it if VisitSubmodules is given as a traversal option. Then getTopLevelDeclsForDisplay is called repeatedly with the same output vector to accumulate all the requested decls.

I also updated the test output for SourceKit/DocSupport/doc_clang_module.swift and SourceKit/InterfaceGen/gen_clang_module.swift when printing the interface for the submodule. The import FooHelper statement was actually not present in the original submodule, and this update changed the printer behavior to reflect that.

SymbolGraphGen

@daniel-grumberg

I tweaked the SymbolGraphASTWalker to probe the underlying Clang module of any Clang nodes when determining which symbol graph to sort a symbol into. Since the owning module still reports the top-level module, this step is required to accurately sort symbols.

ClangImporter

@zoecarver @hyp @egorzhdan @beccadax @ian-twilightcoder

To collect decls into the right module when crawling a Clang module, i updated isDeclaredInModule to compare the owning module as reported by Clang against our ModuleFilter, rather than using the owning module as reported by Swift. As mentioned above, i did not update the owning module given to imported decls. (Doing so created deserialization errors when building the stdlib that i didn't want to debug fully.) So to accommodate that, we need to circumvent it and ask Clang for the owning module instead, which accurately reports the submodule. This way, if a submodule is given to getTopLevelDeclsForDisplay (which creates a swift::ModuleDecl that wraps the submodule itself), we can correctly report which decls belong to that submodule.

I also updated ClangModuleUnit::getTopLevelDecls to double-check that any Objective-C categories in the ClangImporter lookup table actually belong to this module before adding them to the output. When getTopLevelDeclsForDisplay is called repeatedly, like i did in the module interface printer, the lookup table is not cleared, leading to categories appearing multiple times. This filter prevents that from happening.

Finally, i removed the CMU->isTopLevel() component of the assertion in the FilteringDeclaredDeclConsumer constructor, since we now support submodules in the decl crawler.


Since the biggest conceptual changes are in the ClangImporter and the module interface printer, i would most appreciate a review from those code owners. Thanks!

Copy link
Member

@ahoppen ahoppen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDE changes look good to me. I don’t understand where the FooHelper import was coming from before because FooSub.h doesn’t include anything that might be from FooHelper and the module map also doesn’t have any mention of FooHelper. @rintaro @bnbarham @hamishknight Do you know?

@QuietMisdreavus
Copy link
Contributor Author

@ahoppen IIRC it has to do with how the decls were being collected before. Import decls are filtered separately from other kinds of decls, which means the decl filtering that was being done skipped any import decls. Since printing submodules involved crawling the top-level module, that meant that any import from the top-level module also made its way into the submodule, regardless of whether the submodule actually imported it. That's what happened in this situation.

@ahoppen
Copy link
Member

ahoppen commented Sep 16, 2024

I see. In that case: Good fix, thank you!

Copy link
Contributor

@hamishknight hamishknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDE changes LGTM

@daniel-grumberg
Copy link
Contributor

I am not entirely sure but it seems like generating symbol graphs for a top level module doesn't include symbols from submodules. This seems incorrect as unfortunately most ObjC frameworks in SDK have a modulemap that looks something like:

framework module MyFramework {
    umbrella header "MyFramework.h"
    
    export *
    module * {
        export *
    }
}

which basically says reexport all the modules it pulled in by the umbrella header (that's the export *), and that every header is it's own submodule that's the part (that's the module *), So effectively most ObjC symbols are actually in a submodule and would get dropped if I interpreted this change correctly.

@QuietMisdreavus QuietMisdreavus force-pushed the QuietMisdreavus/submodule-decls branch from 9d83f19 to 8299c83 Compare September 30, 2024 22:35
@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

@daniel-grumberg I've just pushed a commit that should fix that. It required some tweaks in the export-import logic and around SymbolGraphGen, but it should automatically treat Clang submodules as re-exported if they were declared with the export *; module * { export * } structure.

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

if (!ClangModule)
return false;

while ((ClangModule = ClangModule->Parent)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
while ((ClangModule = ClangModule->Parent)) {
ClangModule->isSubmoduleOf(ClangParent)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the implementation of isSubmoduleOf, and this is a single-equals assignment anyway; i'm checking that ClangModule->Parent exists, and also assigning it to ClangModule for the next loop iteration.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to using this https://clang.llvm.org/doxygen/classclang_1_1Module.html#ab9ffc11ef69081f60cdc8fabe67f1cbf but maybe I got confused by which types are swift modules vs clang modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I totally missed that there was an isSubmoduleOf on Clang modules. I can replace the entire loop with that call, then.

Copy link
Contributor

@slavapestov slavapestov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your change looks good; there's an existing problem in collectExportedImports() you may want to address

@@ -1435,8 +1445,12 @@ collectExportedImports(const ModuleDecl *topLevelModule,
stack.push_back(topLevelModule);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the import graph is not a tree, you will push the same node multiple times on the stack and repeat the work of visiting its successors. Eg, suppose we're asked to collectExportedImports() of A with the import graph below, where A re-exports B and C, B and C both re-export D, etc:

    B       E       H   
   + \     + \     + \    
  /   \   /   \   /   \   
 /     + /     + /     + 
A       D       G       J
 \     + \     + \     + 
  \   /   \   /   \   /  
   + /     + /     + /   
    C       F       I

We're going to visit D twice, G four times, J eight times, etc, so this algorithm has a worst case running time of O(2^n). You can fix this by adding a visited set alongside stack. Before pushing something on the stack, check if it's already been visited, and skip it if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! I've just pushed a commit that adds a visited set to counteract this.

@QuietMisdreavus QuietMisdreavus force-pushed the QuietMisdreavus/submodule-decls branch from 5e1c071 to 936d105 Compare October 17, 2024 18:54
@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

@daniel-grumberg I've pushed an update that handles export statements in Clang modules, as well as simplifies the isSubmoduleOf implementation. Can you take another look?

@QuietMisdreavus QuietMisdreavus force-pushed the QuietMisdreavus/submodule-decls branch from 936d105 to b3348a6 Compare November 7, 2024 21:51
@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

I pushed a new commit that tweaks the behavior of collectExportedImports to deal with Clang modules that have Swift overlays. Previously we were not properly crawling the Clang module for its exports, since we were skipping underlying Clang modules in this function. We still don't want to process the import and include its symbols twice, but we do want to process its re-exports for header-based submodules. This should fix an issue we saw when trying this change offline. /cc @daniel-grumberg

@QuietMisdreavus QuietMisdreavus force-pushed the QuietMisdreavus/submodule-decls branch from b3348a6 to 10818f5 Compare December 9, 2024 20:39
@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

I've had to update the PR again to handle an edge case when using this code against some test data offline. It turns out that some modules don't properly add an export * statement in their module map, which means that technically the top-level module doesn't actually export the symbols from its submodules. To counteract this, i introduced a workaround in SymbolGraphGen to consider umbrella headers to implicitly export a module's submodules if there's an umbrella statement but no export statements. This means that it no longer accurately reflects the actual module structure, but it should reflect the intended module structure, and that feels like a better solution for something documentation-focused. I'll do more offline testing before merging this, in case there are yet more issues i need to handle.

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please smoke test

@QuietMisdreavus
Copy link
Contributor Author

@swift-ci Please test

@QuietMisdreavus QuietMisdreavus merged commit ab26b8b into main Jan 30, 2025
5 checks passed
@QuietMisdreavus QuietMisdreavus deleted the QuietMisdreavus/submodule-decls branch January 30, 2025 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants