Skip to content

[clang][deps] Print tracing VFS data #108056

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 11, 2024
Merged

Conversation

jansvoboda11
Copy link
Contributor

Clang's -cc1 -print-stats shows lots of useful internal data including basic FileManager stats. Since this layer caches some results, it is unclear how that information translates to actual filesystem accesses. This PR uses llvm::vfs::TracingFileSystem to provide that missing information.

Similar mechanism is implemented for clang-scan-deps's verbose mode (-v). IO contention proved to be a real bottleneck a couple of times already and this new feature should make those easier to detect in the future. The tracing VFS is inserted below the caching FS and above the real FS.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Sep 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 10, 2024

@llvm/pr-subscribers-clang

Author: Jan Svoboda (jansvoboda11)

Changes

Clang's -cc1 -print-stats shows lots of useful internal data including basic FileManager stats. Since this layer caches some results, it is unclear how that information translates to actual filesystem accesses. This PR uses llvm::vfs::TracingFileSystem to provide that missing information.

Similar mechanism is implemented for clang-scan-deps's verbose mode (-v). IO contention proved to be a real bottleneck a couple of times already and this new feature should make those easier to detect in the future. The tracing VFS is inserted below the caching FS and above the real FS.


Full diff: https://github.com/llvm/llvm-project/pull/108056.diff

10 Files Affected:

  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h (+5-1)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h (+2)
  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h (+2)
  • (modified) clang/lib/Basic/FileManager.cpp (+11)
  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+3)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp (+2-2)
  • (modified) clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp (+3)
  • (added) clang/test/ClangScanDeps/verbose.test (+28)
  • (added) clang/test/Misc/print-stats-vfs.test (+17)
  • (modified) clang/tools/clang-scan-deps/ClangScanDeps.cpp (+29-1)
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 557f0e547ab4a8..4a343f2872d8d9 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -76,7 +76,7 @@ class DependencyScanningService {
   DependencyScanningService(
       ScanningMode Mode, ScanningOutputFormat Format,
       ScanningOptimizations OptimizeArgs = ScanningOptimizations::Default,
-      bool EagerLoadModules = false);
+      bool EagerLoadModules = false, bool TraceVFS = false);
 
   ScanningMode getMode() const { return Mode; }
 
@@ -86,6 +86,8 @@ class DependencyScanningService {
 
   bool shouldEagerLoadModules() const { return EagerLoadModules; }
 
+  bool shouldTraceVFS() const { return TraceVFS; }
+
   DependencyScanningFilesystemSharedCache &getSharedCache() {
     return SharedCache;
   }
@@ -97,6 +99,8 @@ class DependencyScanningService {
   const ScanningOptimizations OptimizeArgs;
   /// Whether to set up command-lines to load PCM files eagerly.
   const bool EagerLoadModules;
+  /// Whether to trace VFS accesses.
+  const bool TraceVFS;
   /// The global file system cache.
   DependencyScanningFilesystemSharedCache SharedCache;
 };
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
index cb9476d1550df3..012237e0278f4a 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningTool.h
@@ -144,6 +144,8 @@ class DependencyScanningTool {
       StringRef CWD, const llvm::DenseSet<ModuleID> &AlreadySeen,
       LookupModuleOutputCallback LookupModuleOutput);
 
+  llvm::vfs::FileSystem &getWorkerVFS() const { return Worker.getVFS(); }
+
 private:
   DependencyScanningWorker Worker;
 };
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
index 0f607862194b31..da6e0401411a34 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
@@ -104,6 +104,8 @@ class DependencyScanningWorker {
 
   bool shouldEagerLoadModules() const { return EagerLoadModules; }
 
+  llvm::vfs::FileSystem &getVFS() const { return *BaseFS; }
+
 private:
   std::shared_ptr<PCHContainerOperations> PCHContainerOps;
   /// The file system to be used during the scan.
diff --git a/clang/lib/Basic/FileManager.cpp b/clang/lib/Basic/FileManager.cpp
index 4509cee1ca0fed..6097b85a03064b 100644
--- a/clang/lib/Basic/FileManager.cpp
+++ b/clang/lib/Basic/FileManager.cpp
@@ -692,5 +692,16 @@ void FileManager::PrintStats() const {
   llvm::errs() << NumFileLookups << " file lookups, "
                << NumFileCacheMisses << " file cache misses.\n";
 
+  getVirtualFileSystem().visit([](llvm::vfs::FileSystem &VFS) {
+    if (auto *T = dyn_cast_or_null<llvm::vfs::TracingFileSystem>(&VFS))
+      llvm::errs() << "\n*** Virtual File System Stats:\n"
+                   << T->NumStatusCalls << " status() calls\n"
+                   << T->NumOpenFileForReadCalls << " openFileForRead() calls\n"
+                   << T->NumDirBeginCalls << " dir_begin() calls\n"
+                   << T->NumGetRealPathCalls << " getRealPath() calls\n"
+                   << T->NumExistsCalls << " exists() calls\n"
+                   << T->NumIsLocalCalls << " isLocal() calls\n";
+  });
+
   //llvm::errs() << PagesMapped << BytesOfPagesMapped << FSLookups;
 }
diff --git a/clang/lib/Frontend/CompilerInstance.cpp b/clang/lib/Frontend/CompilerInstance.cpp
index 1364641a9b71e1..5a273474f1d6b6 100644
--- a/clang/lib/Frontend/CompilerInstance.cpp
+++ b/clang/lib/Frontend/CompilerInstance.cpp
@@ -381,6 +381,9 @@ FileManager *CompilerInstance::createFileManager(
                   : createVFSFromCompilerInvocation(getInvocation(),
                                                     getDiagnostics());
   assert(VFS && "FileManager has no VFS?");
+  if (getFrontendOpts().ShowStats)
+    VFS =
+        llvm::makeIntrusiveRefCnt<llvm::vfs::TracingFileSystem>(std::move(VFS));
   FileMgr = new FileManager(getFileSystemOpts(), std::move(VFS));
   return FileMgr.get();
 }
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
index 7458ef484b16c4..4fb5977580497c 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
@@ -15,9 +15,9 @@ using namespace dependencies;
 
 DependencyScanningService::DependencyScanningService(
     ScanningMode Mode, ScanningOutputFormat Format,
-    ScanningOptimizations OptimizeArgs, bool EagerLoadModules)
+    ScanningOptimizations OptimizeArgs, bool EagerLoadModules, bool TraceVFS)
     : Mode(Mode), Format(Format), OptimizeArgs(OptimizeArgs),
-      EagerLoadModules(EagerLoadModules) {
+      EagerLoadModules(EagerLoadModules), TraceVFS(TraceVFS) {
   // Initialize targets for object file support.
   llvm::InitializeAllTargets();
   llvm::InitializeAllTargetMCs();
diff --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index 09ad5ebc7954cf..d77187bfb1f2b8 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -501,6 +501,9 @@ DependencyScanningWorker::DependencyScanningWorker(
   // The scanner itself writes only raw ast files.
   PCHContainerOps->registerWriter(std::make_unique<RawPCHContainerWriter>());
 
+  if (Service.shouldTraceVFS())
+    FS = llvm::makeIntrusiveRefCnt<llvm::vfs::TracingFileSystem>(std::move(FS));
+
   switch (Service.getMode()) {
   case ScanningMode::DependencyDirectivesScan:
     DepFS =
diff --git a/clang/test/ClangScanDeps/verbose.test b/clang/test/ClangScanDeps/verbose.test
new file mode 100644
index 00000000000000..99c5214c762018
--- /dev/null
+++ b/clang/test/ClangScanDeps/verbose.test
@@ -0,0 +1,28 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb.json.in > %t/cdb.json
+
+// RUN: clang-scan-deps -compilation-database %t/cdb.json -v -o %t/result.json 2>&1 | FileCheck %s
+// CHECK:      *** Virtual File System Stats:
+// CHECK-NEXT: {{[[:digit:]]+}} status() calls
+// CHECK-NEXT: {{[[:digit:]]+}} openFileForRead() calls
+// CHECK-NEXT: {{[[:digit:]]+}} dir_begin() calls
+// CHECK-NEXT: {{[[:digit:]]+}} getRealPath() calls
+// CHECK-NEXT: {{[[:digit:]]+}} exists() calls
+// CHECK-NEXT: {{[[:digit:]]+}} isLocal() calls
+
+//--- tu.c
+
+//--- cdb.json.in
+[
+  {
+    "file": "DIR/tu.c"
+    "directory": "DIR",
+    "command": "clang -c DIR/tu.c -o DIR/tu.o"
+  },
+  {
+    "file": "DIR/tu.c"
+    "directory": "DIR",
+    "command": "clang -c DIR/tu.c -o DIR/tu.o"
+  }
+]
diff --git a/clang/test/Misc/print-stats-vfs.test b/clang/test/Misc/print-stats-vfs.test
new file mode 100644
index 00000000000000..65446cb7a5077d
--- /dev/null
+++ b/clang/test/Misc/print-stats-vfs.test
@@ -0,0 +1,17 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+
+// RUN: %clang_cc1 -fsyntax-only %t/tu.c -I %t/dir1 -I %t/dir2 -print-stats 2>&1 | FileCheck %s
+
+//--- tu.c
+#include "header.h"
+//--- dir1/other.h
+//--- dir2/header.h
+
+// CHECK:      *** Virtual File System Stats:
+// CHECK-NEXT: {{[[:digit:]]+}} status() calls
+// CHECK-NEXT: {{[[:digit:]]+}} openFileForRead() calls
+// CHECK-NEXT: {{[[:digit:]]+}} dir_begin() calls
+// CHECK-NEXT: {{[[:digit:]]+}} getRealPath() calls
+// CHECK-NEXT: {{[[:digit:]]+}} exists() calls
+// CHECK-NEXT: {{[[:digit:]]+}} isLocal() calls
diff --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index a8f6150dd3493d..bf1621c91c2c92 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -915,6 +915,13 @@ int clang_scan_deps_main(int argc, char **argv, const llvm::ToolContext &) {
   if (Format == ScanningOutputFormat::Full)
     FD.emplace(ModuleName.empty() ? Inputs.size() : 0);
 
+  std::atomic<size_t> NumStatusCalls = 0;
+  std::atomic<size_t> NumOpenFileForReadCalls = 0;
+  std::atomic<size_t> NumDirBeginCalls = 0;
+  std::atomic<size_t> NumGetRealPathCalls = 0;
+  std::atomic<size_t> NumExistsCalls = 0;
+  std::atomic<size_t> NumIsLocalCalls = 0;
+
   auto ScanningTask = [&](DependencyScanningService &Service) {
     DependencyScanningTool WorkerTool(Service);
 
@@ -999,10 +1006,21 @@ int clang_scan_deps_main(int argc, char **argv, const llvm::ToolContext &) {
           HadErrors = true;
       }
     }
+
+    WorkerTool.getWorkerVFS().visit([&](llvm::vfs::FileSystem &VFS) {
+      if (auto *T = dyn_cast_or_null<llvm::vfs::TracingFileSystem>(&VFS)) {
+        NumStatusCalls += T->NumStatusCalls;
+        NumOpenFileForReadCalls += T->NumOpenFileForReadCalls;
+        NumDirBeginCalls += T->NumDirBeginCalls;
+        NumGetRealPathCalls += T->NumGetRealPathCalls;
+        NumExistsCalls += T->NumExistsCalls;
+        NumIsLocalCalls += T->NumIsLocalCalls;
+      }
+    });
   };
 
   DependencyScanningService Service(ScanMode, Format, OptimizeArgs,
-                                    EagerLoadModules);
+                                    EagerLoadModules, /*TraceVFS=*/Verbose);
 
   llvm::Timer T;
   T.startTimer();
@@ -1024,6 +1042,16 @@ int clang_scan_deps_main(int argc, char **argv, const llvm::ToolContext &) {
     Pool.wait();
   }
 
+  if (Verbose) {
+    llvm::errs() << "\n*** Virtual File System Stats:\n"
+                 << NumStatusCalls << " status() calls\n"
+                 << NumOpenFileForReadCalls << " openFileForRead() calls\n"
+                 << NumDirBeginCalls << " dir_begin() calls\n"
+                 << NumGetRealPathCalls << " getRealPath() calls\n"
+                 << NumExistsCalls << " exists() calls\n"
+                 << NumIsLocalCalls << " isLocal() calls\n";
+  }
+
   T.stopTimer();
   if (PrintTiming)
     llvm::errs() << llvm::format(

Copy link
Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a comment.

@jansvoboda11 jansvoboda11 merged commit 6e4dcbb into llvm:main Sep 11, 2024
5 of 7 checks passed
@jansvoboda11 jansvoboda11 deleted the use-vfs-tracing branch September 11, 2024 23:05
jansvoboda11 added a commit to swiftlang/llvm-project that referenced this pull request Nov 4, 2024
Clang's `-cc1 -print-stats` shows lots of useful internal data including
basic `FileManager` stats. Since this layer caches some results, it is
unclear how that information translates to actual filesystem accesses.
This PR uses `llvm::vfs::TracingFileSystem` to provide that missing
information.

Similar mechanism is implemented for `clang-scan-deps`'s verbose mode
(`-v`). IO contention proved to be a real bottleneck a couple of times
already and this new feature should make those easier to detect in the
future. The tracing VFS is inserted below the caching FS and above the
real FS.

(cherry picked from commit 6e4dcbb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants