-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][deps] Overload Filesystem::exists
in DependencyScanningFilesystem
to have it use cached status
#88152
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[clang][deps] Overload Filesystem::exists
in DependencyScanningFilesystem
to have it use cached status
#88152
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write If you have received no comments on your PR for a week, you can request a review If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
@llvm/pr-subscribers-llvm-support @llvm/pr-subscribers-clang Author: Artem Chikin (artemcm) ChangesAs-is, calls to Instead, directly overloading Full diff: https://github.com/llvm/llvm-project/pull/88152.diff 2 Files Affected:
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
index 9a522a3e2fe252..4fdfebada5b7f7 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningFilesystem.h
@@ -365,6 +365,10 @@ class DependencyScanningWorkerFilesystem
return LocalCache.insertEntryForFilename(Filename, Entry);
}
+ /// Check whether \p Path exists. By default checks cached result of \c status(),
+ /// and falls back on FS if unable to do so.
+ bool exists(const Twine &Path) override;
+
/// Returns entry associated with the filename in the shared cache if there is
/// some. Otherwise, constructs new one with the given error code, associates
/// it with the filename and returns the result.
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
index 9b7812a1adb9e3..174edc98da5877 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
@@ -270,6 +270,12 @@ DependencyScanningWorkerFilesystem::status(const Twine &Path) {
return Result->getStatus();
}
+bool
+DependencyScanningWorkerFilesystem::exists(const Twine &Path) {
+ auto Status = status(Path);
+ return Status && Status->exists();
+}
+
namespace {
/// The VFS that is used by clang consumes the \c CachedFileSystemEntry using
|
8be194a
to
0ab9e02
Compare
e73e72a
to
c094e57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clang/lib/Tooling/DependencyScanning/DependencyScanningFilesystem.cpp
Outdated
Show resolved
Hide resolved
c094e57
to
1ee923c
Compare
This patch doesn`t improve my usage, it seems I'm hitting a different codepath than you do. I'll investigate. |
Ah I see, the commit b768a8c didn't make it in time for the 18.x branch. The issue I'm seeing here is with regular C++ code, no PCH, no modules. It is related to
As an order of magnitude, the Unreal Engine target I'm testing with Not sure @jansvoboda11 perhaps if we want to cherry pick b768a8c on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise this PR LGTM.
I can try pulling b768a8c into the release branch. |
I'd like to see a unit test specific to |
1ee923c
to
8a2c67f
Compare
Done! Added one in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Would be nice to land Ben's change separately.
Seems like some tests failed on Linux. |
|
||
DepFS.exists("/foo"); | ||
DepFS.exists("/bar"); | ||
EXPECT_EQ(InstrumentingFS->NumStatCalls, 2u); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, doesn't this test pass even prior to this patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gah, was silly to use a ProxyFilesystem as the underlying FS. Fixed, thank you very much for catching.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to explicitly check that we're not calling exists()
instead of relying on the fact that InMemoryFileSystem
keeps implementing exists()
in terms of status()
.
Sorry I've been misleading, pushing b768a8c on 18.x doesn't fix the issue. The problem is around the directory queries and the function in b768a8c returns Running with:
Takes 4 min 8 sec. Then running with:
Takes 1 min 47 sec. I think something more involved would be needed here. @jansvoboda11 in which case we don't want to consider the file system immutable during the execution of clang-scan-deps? |
@aganea Ah, got it. Unfortunately, caching stat failures for all directories doesn't work for modules. Clang is supposed to create the modules cache directory if one doesn't exist. But if we first cache its non-existence, Clang will never see it again, even after Clang itself created it. I think we could turn off caching of stat failures only for the modules cache directory by giving the build system an API to configure In the meantime, are you able to work around this for your non-modular use-case by applying your change when building your own compiler downstream? Also, just to be sure, this is not a regression, correct? |
1b27958
to
2c94305
Compare
Yes I fixed it in our downstream. I am more worried about other users, since the initial version that @hyp committed a while ago in e1f4c4a didn't suffer of these issues.
Not per se, not from the latest commits. It seems the issue was introduced when module support was added in 9ab6d82. Unfortunately there are users like us with big C++ codebases and stuck with regular #includes, no modules for now. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
Please apply the code formatting suggestions. |
2c94305
to
f1c4624
Compare
8e98fd0
to
24e869d
Compare
Clang test now looks good to me. Might be nice to drop |
e010a76
to
8164aaf
Compare
I had to update the commit anyway. Unified on |
1dd15b3
to
b7c6930
Compare
…esystem' to have it use cached `status` As-is, calls to `exists()` fallback on the implementation in 'ProxyFileSystem::exists' which explicitly calls out to the underlying `FS`, which for the 'DependencyScanningFilesystem' (overlay) is the real underlying filesystem. Instead, directly overloading 'exists' allows us to have it rely on the cached `status` behavior used elsewhere by the 'DependencyScanningFilesystem'.
b7c6930
to
1989bbb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot!
@artemcm Congratulations on having your first Pull Request (PR) merged into the LLVM Project! Your changes will be combined with recent changes from other authors, then tested Please check whether problems have been caused by your change specifically, as How to do this, and the rest of the post-merge process, is covered in detail here. If your change does cause a problem, it may be reverted, or you can revert it yourself. If you don't get any reports, no action is required from you. Your changes are working as expected, well done! |
…esystem` to have it use cached `status` (llvm#88152) As-is, calls to `exists()` fallback on the implementation in `ProxyFileSystem::exists` which explicitly calls out to the underlying `FS`, which for the `DependencyScanningFilesystem` (overlay) is the real underlying filesystem. Instead, directly overloading `exists` allows us to have it rely on the cached `status` behavior used elsewhere by the `DependencyScanningFilesystem`. (cherry picked from commit 779ba60)
…esystem` to have it use cached `status` (llvm#88152) As-is, calls to `exists()` fallback on the implementation in `ProxyFileSystem::exists` which explicitly calls out to the underlying `FS`, which for the `DependencyScanningFilesystem` (overlay) is the real underlying filesystem. Instead, directly overloading `exists` allows us to have it rely on the cached `status` behavior used elsewhere by the `DependencyScanningFilesystem`.
…esystem` to have it use cached `status` (llvm#88152) As-is, calls to `exists()` fallback on the implementation in `ProxyFileSystem::exists` which explicitly calls out to the underlying `FS`, which for the `DependencyScanningFilesystem` (overlay) is the real underlying filesystem. Instead, directly overloading `exists` allows us to have it rely on the cached `status` behavior used elsewhere by the `DependencyScanningFilesystem`. (cherry picked from commit 779ba60)
…n hot code paths in `FileManager`. (#88427) `FileManager::getDirectoryRef()` and `FileManager::getFileRef()` are hot code paths in `clang-scan-deps`. These functions are updating on every call a few atomics related to printing statistics, which causes contention on high core count machines.    After this patch we make the variables local to the `FileManager`. In our test case, this saves about 49 sec over 1 min 47 sec of `clang-scan-deps` run time (1 min 47 sec before, 58 sec after). These figures are after applying my suggestion in #88152 (comment), that is: ``` static bool shouldCacheStatFailures(StringRef Filename) { return true; } ``` Without the above, there's just too much OS noise from the high volume of `status()` calls with regular non-modules C++ code. Tested on Windows with clang-cl.
As-is, calls to
exists()
fallback on the implementation inProxyFileSystem::exists
which explicitly calls out to the underlyingFS
, which for theDependencyScanningFilesystem
(overlay) is the real underlying filesystem.Instead, directly overloading
exists
allows us to have it rely on the cachedstatus
behavior used elsewhere by theDependencyScanningFilesystem
.