Skip to content

[clang][deps] Cherry-pick some dependency-scanner related commits #3441

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Oct 22, 2021

Conversation

jansvoboda11
Copy link

No description provided.

@jansvoboda11
Copy link
Author

@swift-ci test

@jansvoboda11
Copy link
Author

swiftlang/swift#39810
@swift-ci test

1 similar comment
@jansvoboda11
Copy link
Author

swiftlang/swift#39810
@swift-ci test

…earch::lookupModule`

This patch propagates the import `SourceLocation` into `HeaderSearch::lookupModule`. This enables remarks on search path usage (implemented in D102923) to point to the source code that initiated header search.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111557
…:lookupModule`

This fixes an LLDB build failure where the `ImportLoc` argument is missing: https://lab.llvm.org/buildbot#builders/68/builds/19975

This change also makes it possible to drop `SourceLocation()` in `Preprocessor::getCurrentModule`.
For dependency scanning, it would be useful to collect header search paths (provided on command-line via `-I` and friends) that were actually used during preprocessing. This patch adds that feature to `HeaderSearch` along with a new remark that reports such paths as they get used.

Previous version of this patch tried to use the existing `LookupFileCache` to report used paths via `HitIdx`. That doesn't work for `ComputeUserEntryUsage` (which is intended to be called *after* preprocessing), because it indexes used search paths by the file name. This means the values get overwritten when the code contains `#include_next`.

Note that `HeaderSearch` doesn't use `HeaderSearchOptions::UserEntries` directly. Instead, `InitHeaderSearch` pre-processes them (adds platform-specific paths, removes duplicates, removes paths that don't exist) and creates `DirectoryLookup` instances. This means we need a mechanism for translating between those two. It's not possible to go from `DirectoryLookup` back to the original `HeaderSearch`, so `InitHeaderSearch` now tracks the relationships explicitly.

Depends on D111557.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102923
To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.

In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.

We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we could do it only once instead of every load of the PCM file by dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which is probably highly surprising.

There is an alternative approach to storing extra information into the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. The second run of dependency scanner loads the PCM from the first run, skipping the preprocessing altogether, which would result in different results between runs. We can revisit this approach when we stop building implicitly during dependency scanning.

Depends on D102923.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102488
At this point, `F.ImportLoc` has not been initialized by the `ASTReader` yet and using it leads to an assertion failure.

Introduced in 638c673 and 4445135.
This is a prep patch for D111560.
During explicit modular build, PCM files are typically specified via the `-fmodule-file=<path>` command-line option. Early during the compilation, Clang uses the `ASTReader` to read their contents and caches the result so that the module isn't loaded implicitly later on. A listener is attached to the `ASTReader` to collect names of the modules read from the PCM files. However, if the PCM has already been loaded previously via PCH:
1. the `ASTReader` doesn't do anything for the second time,
2. the listener is not invoked at all,
3. the module load result is not cached,
4. the compilation fails when attempting to load the module implicitly later on.

This patch solves this problem by attaching the listener to the `ASTReader` for PCH reading as well.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111560
…t modules

When using explicit Clang modules, some declarations might unexpectedly become invisible.

This is caused by the mechanism that loads PCM files passed via `-fmodule-file=<path>` and creates an `IdentifierInfo` for the module name. The `IdentifierInfo` creation takes place when the `ASTReader` is in a weird state, with modules that are loaded but not yet set up properly. This patch delays the creation of `IdentifierInfo` until the `ASTReader` is done with reading the PCM.

Note that the `-fmodule-file=<name>=<path>` form of the argument doesn't suffer from this issue, since it doesn't create `IdentifierInfo` for the module name.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111543
One of main goals of the dependency scanner is to be strict about module compatibility. This is achieved through strict context hash. This patch ensures that strict context hash is enabled not only during the scan itself (and its minimized implicit build), but also when actually reporting the dependency.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111720
The `ModuleDepCollectorPP` class holds a reference to `ModuleDepCollector` as well as `ModuleDepCollector`'s `CompilerInstance`. The fact that these refer to the same object is non-obvious.

This patch removes the `CompilerInvocation` reference from `ModuleDepCollectorPP` and accesses it through `ModuleDepCollector` instead.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111724
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives a distinct name to the `CompilerInstance` that's used to run the implicit build during dependency scan.

Depends on D111724.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111725
@jansvoboda11 jansvoboda11 force-pushed the jan_svoboda/cherry_pick branch from 82f8e45 to 110502c Compare October 21, 2021 12:22
The dependency scanner works with multiple instances of `Compiler{Instance,Invocation}`. From names of the variables/members, their purpose is not obvious.

This patch gives descriptive name to the generated `CompilerInvocation` that can be used to derive the command-line to build a modular dependency.

Depends on D111725.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D111728
The `clang-scan-deps` CLI tool invokes the compiler with `-print-resource-dir` in case the `-resource-dir` argument is missing from the compilation command line. This is to enable running the tool on compilation databases that use compiler from a different toolchain than `clang-scan-deps` itself. While this doesn't make sense when scanning modular builds (due to the `-cc1` arguments the tool generates), the tool can can be used to efficiently scan for file dependencies of non-modular builds too.

This patch stops deducing the resource directory by invoking the compiler by default. This mode can still be enabled by invoking `clang-scan-deps` with `--resource-dir-recipe invoke-compiler`. The new default is `--resource-dir-recipe modify-compiler-path` which relies on the resource directory deduction taking place in `Driver::Driver` which is based on the compiler path. This makes the default more aligned with the intended usage of the tool while still allowing it to serve other use-cases.

Note that this functionality was also influenced by D108979, where the dependency scanner stopped going through `ClangTool::run`. The function tried to deduce the resource directory based on the current executable path, which might not be what the users expect when invoked from within a shared library.

Depends on D108979.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D108366
@jansvoboda11 jansvoboda11 force-pushed the jan_svoboda/cherry_pick branch from 110502c to 6029b46 Compare October 21, 2021 12:42
@jansvoboda11
Copy link
Author

swiftlang/swift#39810
@swift-ci test

@jansvoboda11 jansvoboda11 merged commit eca53ab into stable/20210726 Oct 22, 2021
@jansvoboda11 jansvoboda11 deleted the jan_svoboda/cherry_pick branch October 22, 2021 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant