[clang module] Current Working Directory Pruning #124786

Merged
merged 5 commits into from
Feb 5, 2025

Conversation

qiongsiwu
Contributor

When computing the context hash, clang always includes the compiler's working directory. As a result, two compilations that differ only in their working directory generate different module variants, and these variants are redundant. This PR implements an optimization that ignores the working directory when computing the context hash, provided it is safe to do so.

Specifically, clang checks whether it is safe to ignore the working directory in isSafeToIgnoreCWD. The check goes through the compile command options to see whether any of the specified paths are relative. A path counts as relative here if it is not empty and llvm::sys::path::is_absolute returns false for it. If none of the examined paths is relative, clang considers it safe to ignore the current working directory and leaves it out of the context hash.
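
The check described above can be sketched outside of LLVM as follows. This is a minimal illustration, not the code from the PR: `SketchOptions`, `isRelativePosix`, and `sketchIsSafeToIgnoreCWD` are hypothetical names, and a POSIX-only "starts with `/`" test stands in for `llvm::sys::path::is_absolute`.

```cpp
#include <string>
#include <vector>

// A path counts as relative if it is non-empty and not absolute.
// Assumption: POSIX-style paths only; the PR uses
// llvm::sys::path::is_absolute, which also handles other platforms.
inline bool isRelativePosix(const std::string &Path) {
  return !Path.empty() && Path.front() != '/';
}

// Hypothetical, trimmed-down option set. The real check walks many more
// fields of the compiler invocation.
struct SketchOptions {
  std::string Sysroot;
  std::string ModuleCachePath;
  std::vector<std::string> Includes;
};

// Returns true only if none of the examined paths is relative.
inline bool sketchIsSafeToIgnoreCWD(const SketchOptions &Opts) {
  if (isRelativePosix(Opts.Sysroot) || isRelativePosix(Opts.ModuleCachePath))
    return false;
  for (const std::string &Include : Opts.Includes)
    if (isRelativePosix(Include))
      return false;
  return true;
}
```

Note that an empty path is treated as "not relative", so unset options do not block the optimization.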

@qiongsiwu qiongsiwu marked this pull request as draft January 28, 2025 16:41
@llvmbot llvmbot added the clang Clang issues not falling into any other category label Jan 28, 2025
@llvmbot
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-clang

Author: Qiongsi Wu (qiongsiwu)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/124786.diff

5 Files Affected:

  • (modified) clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h (+4-1)
  • (modified) clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp (+89-3)
  • (added) clang/test/ClangScanDeps/modules-context-hash-cwd.c (+123)
  • (modified) clang/test/ClangScanDeps/working-dir.m (+1-1)
  • (modified) clang/tools/clang-scan-deps/ClangScanDeps.cpp (+2)
diff --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 4a343f2872d8d9..9ad8e68c33eb10 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -63,7 +63,10 @@ enum class ScanningOptimizations {
   /// Canonicalize -D and -U options.
   Macros = 8,
 
-  DSS_LAST_BITMASK_ENUM(Macros),
+  /// Ignore the compiler's working directory if it is safe.
+  IgnoreCWD = 0x10,
+
+  DSS_LAST_BITMASK_ENUM(IgnoreCWD),
   Default = All
 };
 
diff --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 2e97cac0796cee..714efb86fa3796 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -397,9 +397,92 @@ void ModuleDepCollector::applyDiscoveredDependencies(CompilerInvocation &CI) {
   }
 }
 
+static bool isSafeToIgnoreCWD(const CowCompilerInvocation &CI) {
+  // Check if the command line input uses relative paths.
+  // It is not safe to ignore the current working directory if any of the
+  // command line inputs use relative paths.
+#define IF_RELATIVE_RETURN_FALSE(PATH)                                         \
+  do {                                                                         \
+    if (!PATH.empty() && !llvm::sys::path::is_absolute(PATH))                  \
+      return false;                                                            \
+  } while (0)
+
+#define IF_ANY_RELATIVE_RETURN_FALSE(PATHS)                                    \
+  do {                                                                         \
+    if (std::any_of(PATHS.begin(), PATHS.end(), [](const auto &P) {            \
+          return !P.empty() && !llvm::sys::path::is_absolute(P);               \
+        }))                                                                    \
+      return false;                                                            \
+  } while (0)
+
+  // Header search paths.
+  const auto &HeaderSearchOpts = CI.getHeaderSearchOpts();
+  IF_RELATIVE_RETURN_FALSE(HeaderSearchOpts.Sysroot);
+  for (auto &Entry : HeaderSearchOpts.UserEntries)
+    if (Entry.IgnoreSysRoot)
+      IF_RELATIVE_RETURN_FALSE(Entry.Path);
+  IF_RELATIVE_RETURN_FALSE(HeaderSearchOpts.ResourceDir);
+  IF_RELATIVE_RETURN_FALSE(HeaderSearchOpts.ModuleCachePath);
+  IF_RELATIVE_RETURN_FALSE(HeaderSearchOpts.ModuleUserBuildPath);
+  for (auto I = HeaderSearchOpts.PrebuiltModuleFiles.begin(),
+            E = HeaderSearchOpts.PrebuiltModuleFiles.end();
+       I != E;) {
+    auto Current = I++;
+    IF_RELATIVE_RETURN_FALSE(Current->second);
+  }
+  IF_ANY_RELATIVE_RETURN_FALSE(HeaderSearchOpts.PrebuiltModulePaths);
+  IF_ANY_RELATIVE_RETURN_FALSE(HeaderSearchOpts.VFSOverlayFiles);
+
+  // Preprocessor options.
+  const auto &PPOpts = CI.getPreprocessorOpts();
+  IF_ANY_RELATIVE_RETURN_FALSE(PPOpts.MacroIncludes);
+  IF_ANY_RELATIVE_RETURN_FALSE(PPOpts.Includes);
+  IF_RELATIVE_RETURN_FALSE(PPOpts.ImplicitPCHInclude);
+
+  // Frontend options.
+  const auto &FrontendOpts = CI.getFrontendOpts();
+  for (const FrontendInputFile &Input : FrontendOpts.Inputs) {
+    if (Input.isBuffer())
+      continue; // FIXME: Can this happen when parsing command-line?
+
+    IF_RELATIVE_RETURN_FALSE(Input.getFile());
+  }
+  IF_RELATIVE_RETURN_FALSE(FrontendOpts.CodeCompletionAt.FileName);
+  IF_ANY_RELATIVE_RETURN_FALSE(FrontendOpts.ModuleMapFiles);
+  IF_ANY_RELATIVE_RETURN_FALSE(FrontendOpts.ModuleFiles);
+  IF_ANY_RELATIVE_RETURN_FALSE(FrontendOpts.ModulesEmbedFiles);
+  IF_ANY_RELATIVE_RETURN_FALSE(FrontendOpts.ASTMergeFiles);
+  IF_RELATIVE_RETURN_FALSE(FrontendOpts.OverrideRecordLayoutsFile);
+  IF_RELATIVE_RETURN_FALSE(FrontendOpts.StatsFile);
+
+  // Filesystem options.
+  const auto &FileSystemOpts = CI.getFileSystemOpts();
+  IF_RELATIVE_RETURN_FALSE(FileSystemOpts.WorkingDir);
+
+  // Codegen options.
+  const auto &CodeGenOpts = CI.getCodeGenOpts();
+  IF_RELATIVE_RETURN_FALSE(CodeGenOpts.DebugCompilationDir);
+  IF_RELATIVE_RETURN_FALSE(CodeGenOpts.CoverageCompilationDir);
+
+  // Sanitizer options.
+  IF_ANY_RELATIVE_RETURN_FALSE(CI.getLangOpts().NoSanitizeFiles);
+
+  // Coverage mappings.
+  IF_RELATIVE_RETURN_FALSE(CodeGenOpts.ProfileInstrumentUsePath);
+  IF_RELATIVE_RETURN_FALSE(CodeGenOpts.SampleProfileFile);
+  IF_RELATIVE_RETURN_FALSE(CodeGenOpts.ProfileRemappingFile);
+
+  // Dependency output options.
+  for (auto &ExtraDep : CI.getDependencyOutputOpts().ExtraDeps)
+    IF_RELATIVE_RETURN_FALSE(ExtraDep.first);
+
+  return true;
+}
+
 static std::string getModuleContextHash(const ModuleDeps &MD,
                                         const CowCompilerInvocation &CI,
                                         bool EagerLoadModules,
+                                        bool IgnoreCWD,
                                         llvm::vfs::FileSystem &VFS) {
   llvm::HashBuilder<llvm::TruncatedBLAKE3<16>, llvm::endianness::native>
       HashBuilder;
@@ -410,7 +493,7 @@ static std::string getModuleContextHash(const ModuleDeps &MD,
   HashBuilder.add(getClangFullRepositoryVersion());
   HashBuilder.add(serialization::VERSION_MAJOR, serialization::VERSION_MINOR);
   llvm::ErrorOr<std::string> CWD = VFS.getCurrentWorkingDirectory();
-  if (CWD)
+  if (CWD && !IgnoreCWD)
     HashBuilder.add(*CWD);
 
   // Hash the BuildInvocation without any input files.
@@ -443,8 +526,11 @@ static std::string getModuleContextHash(const ModuleDeps &MD,
 
 void ModuleDepCollector::associateWithContextHash(
     const CowCompilerInvocation &CI, ModuleDeps &Deps) {
-  Deps.ID.ContextHash = getModuleContextHash(
-      Deps, CI, EagerLoadModules, ScanInstance.getVirtualFileSystem());
+  bool IgnoreCWD = any(OptimizeArgs & ScanningOptimizations::IgnoreCWD) &&
+                   isSafeToIgnoreCWD(CI);
+  Deps.ID.ContextHash =
+      getModuleContextHash(Deps, CI, EagerLoadModules, IgnoreCWD,
+                           ScanInstance.getVirtualFileSystem());
   bool Inserted = ModuleDepsByID.insert({Deps.ID, &Deps}).second;
   (void)Inserted;
   assert(Inserted && "duplicate module mapping");
diff --git a/clang/test/ClangScanDeps/modules-context-hash-cwd.c b/clang/test/ClangScanDeps/modules-context-hash-cwd.c
new file mode 100644
index 00000000000000..45be72301c635d
--- /dev/null
+++ b/clang/test/ClangScanDeps/modules-context-hash-cwd.c
@@ -0,0 +1,123 @@
+// Test current directory pruning when computing the context hash.
+
+// REQUIRES: shell
+
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb0.json.in > %t/cdb0.json
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb1.json.in > %t/cdb1.json
+// RUN: sed -e "s|DIR|%/t|g" %t/cdb2.json.in > %t/cdb2.json
+// RUN: clang-scan-deps -compilation-database %t/cdb0.json -format experimental-full > %t/result0.json
+// RUN: clang-scan-deps -compilation-database %t/cdb1.json -format experimental-full > %t/result1.json
+// RUN: clang-scan-deps -compilation-database %t/cdb2.json -format experimental-full -optimize-args=header-search,system-warnings,vfs,canonicalize-macros > %t/result2.json
+// RUN: cat %t/result0.json %t/result1.json | FileCheck %s
+// RUN: cat %t/result0.json %t/result2.json | FileCheck %s -check-prefix=SKIPOPT
+
+//--- cdb0.json.in
+[{
+  "directory": "DIR",
+  "command": "clang -c DIR/tu.c -fmodules -fmodules-cache-path=DIR/cache -IDIR/include/ -o DIR/tu.o",
+  "file": "DIR/tu.c"
+}]
+
+//--- cdb1.json.in
+[{
+  "directory": "DIR/a",
+  "command": "clang -c DIR/tu.c -fmodules -fmodules-cache-path=DIR/cache -IDIR/include/ -o DIR/tu.o",
+  "file": "DIR/tu.c"
+}]
+
+//--- cdb2.json.in
+[{
+  "directory": "DIR/a/",
+  "command": "clang -c DIR/tu.c -fmodules -fmodules-cache-path=DIR/cache -IDIR/include/ -o DIR/tu.o",
+  "file": "DIR/tu.c"
+}]
+
+//--- include/module.modulemap
+module mod {
+  header "mod.h"
+}
+
+//--- include/mod.h
+
+//--- tu.c
+#include "mod.h"
+
+// Check that result0 and result1 compute the same hash with optimization
+// on. The only difference between result0 and result1 is the compiler's
+// working directory.
+// CHECK:     {
+// CHECK-NEXT:  "modules": [
+// CHECK-NEXT:   {
+// CHECK-NEXT:     "clang-module-deps": [],
+// CHECK:          "context-hash": "[[HASH:.*]]",
+// CHECK:        }
+// CHECK:       "translation-units": [
+// CHECK:        {
+// CHECK:          "commands": [
+// CHECK:          {
+// CHECK-NEXT:        "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:        "clang-module-deps": [
+// CHECK-NEXT:          {
+// CHECK-NEXT:            "context-hash": "[[HASH]]",
+// CHECK-NEXT:            "module-name": "mod"
+// CHECK:               }
+// CHECK:             ],
+// CHECK:     {
+// CHECK-NEXT:   "modules": [
+// CHECK-NEXT:    {
+// CHECK-NEXT:      "clang-module-deps": [],
+// CHECK:           "context-hash": "[[HASH]]",
+// CHECK:         }
+// CHECK:        "translation-units": [
+// CHECK:         {
+// CHECK:           "commands": [
+// CHECK:           {
+// CHECK-NEXT:         "clang-context-hash": "{{.*}}",
+// CHECK-NEXT:         "clang-module-deps": [
+// CHECK-NEXT:           {
+// CHECK-NEXT:             "context-hash": "[[HASH]]",
+// CHECK-NEXT:             "module-name": "mod"
+// CHECK:               }
+// CHECK:              ],
+
+// Check that result0 and result2 compute different hashes because
+// the working directory optimization is turned off for result2.
+// SKIPOPT:      {
+// SKIPOPT-NEXT:   "modules": [
+// SKIPOPT-NEXT:    {
+// SKIPOPT-NEXT:      "clang-module-deps": [],
+// SKIPOPT:           "context-hash": "[[HASH0:.*]]",
+// SKIPOPT:         }
+// SKIPOPT:        "translation-units": [
+// SKIPOPT:         {
+// SKIPOPT:            "commands": [
+// SKIPOPT:             {
+// SKIPOPT-NEXT:          "clang-context-hash": "{{.*}}",
+// SKIPOPT-NEXT:          "clang-module-deps": [
+// SKIPOPT-NEXT:            {
+// SKIPOPT-NEXT:              "context-hash": "[[HASH0]]",
+// SKIPOPT-NEXT:              "module-name": "mod"
+// SKIPOPT:            }
+// SKIPOPT:          ],
+// SKIPOPT:      {
+// SKIPOPT-NEXT:   "modules": [
+// SKIPOPT-NEXT:     {
+// SKIPOPT-NEXT:       "clang-module-deps": [],
+// SKIPOPT-NOT:        "context-hash": "[[HASH0]]",
+// SKIPOPT:            "context-hash": "[[HASH2:.*]]",
+// SKIPOPT:          }
+// SKIPOPT:       "translation-units": [
+// SKIPOPT:         {
+// SKIPOPT:           "commands": [
+// SKIPOPT:             {
+// SKIPOPT-NEXT:          "clang-context-hash": "{{.*}}",
+// SKIPOPT-NEXT:          "clang-module-deps": [
+// SKIPOPT-NEXT:            {
+// SKIPOPT-NOT:              "context-hash": "[[HASH0]]",
+// SKIPOPT-NEXT:             "context-hash": "[[HASH2]]"
+// SKIPOPT-NEXT:              "module-name": "mod"
+// SKIPOPT:            }
+// SKIPOPT:          ],
+
diff --git a/clang/test/ClangScanDeps/working-dir.m b/clang/test/ClangScanDeps/working-dir.m
index a04f8c2486b98d..c6b7b1988d3cf7 100644
--- a/clang/test/ClangScanDeps/working-dir.m
+++ b/clang/test/ClangScanDeps/working-dir.m
@@ -2,7 +2,7 @@
 // RUN: split-file %s %t
 // RUN: sed -e "s|DIR|%/t|g" %t/build/compile-commands.json.in > %t/build/compile-commands.json
 // RUN: clang-scan-deps -compilation-database %t/build/compile-commands.json \
-// RUN:   -j 1 -format experimental-full --optimize-args=all > %t/deps.db
+// RUN:   -j 1 -format experimental-full --optimize-args=header-search,system-warnings,vfs,canonicalize-macros > %t/deps.db
 // RUN: cat %t/deps.db | sed 's:\\\\\?:/:g' | FileCheck %s -DPREFIX=%/t
 
 // Check that there are two separate modules hashes. One for each working dir.
diff --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index 709dc513be2811..8d429534a20073 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -164,6 +164,8 @@ static void ParseArgs(int argc, char **argv) {
             .Case("system-warnings", ScanningOptimizations::SystemWarnings)
             .Case("vfs", ScanningOptimizations::VFS)
             .Case("canonicalize-macros", ScanningOptimizations::Macros)
+            .Case("ignore-current-working-dir",
+                  ScanningOptimizations::IgnoreCWD)
             .Case("all", ScanningOptimizations::All)
             .Default(std::nullopt);
     if (!Optimization) {
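
The effect of the `IgnoreCWD` parameter on the context hash can be illustrated with a small standalone sketch. It makes several assumptions: `std::hash` stands in for the BLAKE3-based `HashBuilder` in the diff, a list of argument strings stands in for the hashed invocation, and `addToHash` and `sketchContextHash` are hypothetical helpers.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Mixes one string into a running hash value. std::hash is only a stand-in
// for the HashBuilder used in the diff.
inline void addToHash(std::size_t &Seed, const std::string &Value) {
  Seed ^= std::hash<std::string>{}(Value) + 0x9e3779b9u + (Seed << 6) +
          (Seed >> 2);
}

// Hypothetical helper: hashes the arguments and, unless IgnoreCWD is set,
// the working directory as well.
inline std::size_t sketchContextHash(const std::vector<std::string> &Args,
                                     const std::string &CWD, bool IgnoreCWD) {
  std::size_t Seed = 0;
  if (!IgnoreCWD)
    addToHash(Seed, CWD);
  for (const std::string &Arg : Args)
    addToHash(Seed, Arg);
  return Seed;
}
```

With `IgnoreCWD` set, two calls with the same arguments and different working directories return the same value, which corresponds to what the `CHECK` lines in the new test expect for `result0` and `result1`.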


github-actions bot commented Jan 28, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Contributor

@Bigcheese Bigcheese left a comment

This looks like the right approach, but I think it would be good to have a test for the relative path checking. Not every option needs a test, just one should be fine.

@qiongsiwu qiongsiwu self-assigned this Jan 30, 2025
@qiongsiwu qiongsiwu marked this pull request as ready for review January 30, 2025 19:06
@qiongsiwu qiongsiwu requested a review from Bigcheese January 30, 2025 19:06
Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

Would it be better for this optimization to happen very early in the process, since you only visit the options in CI? In that case, you can just reset the CurrentWorkingDirectory in the VFS so that all searching is done without a CWD. This avoids hard-to-debug issues when some option is not taken care of (it needs the CWD but is not checked); the trade-off is more explicit errors during scanning.

@Bigcheese
Contributor

It has to happen after the header search optimization in case that removes relative header search paths.
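
One way to read this ordering constraint is sketched below, under the assumption that the header search optimization drops search paths that scanning never used. `SketchSearchPath`, `pruneUnused`, and `anyRelative` are hypothetical names and the relative-path test is POSIX-only.

```cpp
#include <string>
#include <vector>

// Hypothetical search path entry with a flag recording whether scanning
// actually found a header through it.
struct SketchSearchPath {
  std::string Path;
  bool Used;
};

// Assumption: POSIX-style paths only.
inline bool isRelativePosix(const std::string &Path) {
  return !Path.empty() && Path.front() != '/';
}

inline bool anyRelative(const std::vector<SketchSearchPath> &Paths) {
  for (const SketchSearchPath &Entry : Paths)
    if (isRelativePosix(Entry.Path))
      return true;
  return false;
}

// Stand-in for the header search optimization: keep only the used entries.
inline std::vector<SketchSearchPath>
pruneUnused(const std::vector<SketchSearchPath> &Paths) {
  std::vector<SketchSearchPath> Kept;
  for (const SketchSearchPath &Entry : Paths)
    if (Entry.Used)
      Kept.push_back(Entry);
  return Kept;
}
```

If the relative-path check ran before pruning, an unused relative entry would make the invocation look unsafe; running it after pruning lets such an invocation qualify for the optimization.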

@qiongsiwu
Contributor Author

Gentle ping for review. Thanks!

Contributor

@Bigcheese Bigcheese left a comment

Looks good, thanks!

Collaborator

@cachemeifyoucan cachemeifyoucan left a comment

LGTM with some small style comments.

@qiongsiwu qiongsiwu merged commit 54acda2 into llvm:main Feb 5, 2025
8 checks passed
qiongsiwu added a commit to qiongsiwu/llvm-project that referenced this pull request Feb 5, 2025

(cherry picked from commit 54acda2)
cyndyishida pushed a commit to swiftlang/llvm-project that referenced this pull request Feb 7, 2025

(cherry picked from commit 54acda2)
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
qiongsiwu added a commit that referenced this pull request Feb 26, 2025
…re current working directory (#128446)

This PR explicitly sets `DebugCompilationDir` to the system's root
directory if it is safe to ignore the current working directory.

This fixes a problem where a PCM file's embedded debug information can
lead to compilation failure. The compiler may have decided it is indeed
safe to ignore the current working directory. In this case, the PCM
file's content is functionally correct regardless of the current working
directory because no inputs use relative paths (see
#124786). However, a PCM may
contain debug info. If debug info is requested, the compiler uses the
current working directory value to set `DW_AT_comp_dir`. This may lead
to the following situation:
1. Two different compilations need the same PCM file. 
2. The PCM file is compiled assuming a working directory, which is
embedded in the debug info, but otherwise has no effect.
3. The second compilation assumes a different working directory and
expects an identically sized PCM file. However, it cannot find such a
PCM, because the existing PCM file was compiled with a different
`DW_AT_comp_dir`, which is embedded in the debug info.

This PR resets the `DebugCompilationDir` if it is functionally safe to
ignore the working directory so the above situation is avoided, since
all debug information will share the same working directory.
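
The reset described in this commit message could look roughly like the sketch below. It is an illustration only: `SketchCodeGenOptions` and `resetDebugCompilationDir` are hypothetical, and the root directory is hard-coded to the POSIX root `/` rather than being derived from the actual working directory.

```cpp
#include <string>

// Hypothetical stand-in for the code generation options that carry the
// directory later emitted as DW_AT_comp_dir.
struct SketchCodeGenOptions {
  std::string DebugCompilationDir;
};

// If the working directory can be ignored, pin the debug compilation
// directory to the root so every variant embeds the same value.
// Assumption: POSIX root; the real change is described as using the
// system's root directory.
inline void resetDebugCompilationDir(SketchCodeGenOptions &Opts,
                                     bool SafeToIgnoreCWD) {
  if (SafeToIgnoreCWD)
    Opts.DebugCompilationDir = "/";
}
```

After the reset, two invocations that started from different working directories carry the same `DebugCompilationDir`, so the embedded debug info no longer distinguishes them.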

rdar://145249881
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 26, 2025
…afe to ignore current working directory (#128446)

qiongsiwu added a commit to qiongsiwu/llvm-project that referenced this pull request Feb 26, 2025
…re current working directory (llvm#128446)

(cherry picked from commit 7f482aa)

 Conflicts:
	clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
qiongsiwu added a commit that referenced this pull request Mar 5, 2025
…on Off by Default (#129809)

#124786 implemented the current
working directory (CWD) optimization and turned it on by default. We
have since discovered that the build system needs to be compatible
with the CWD optimization, so defaulting to off is the better behavior.
The build system needs to be aware that the current working directory is
ignored. Without a good way of notifying the build system, it is less
risky to default to off. This PR implements the change.
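
A default that excludes one flag from a bitmask enum can be expressed as in the sketch below. This is not the code of that change, which is not shown here: `SketchOptimizations` and `has` are hypothetical, and only the values `Macros = 8` and `IgnoreCWD = 0x10` come from the diff in #124786; the other values and the definition of `All` are assumptions.

```cpp
#include <cstdint>

// Hypothetical mirror of the scanning optimization flags.
enum class SketchOptimizations : std::uint8_t {
  None = 0,
  HeaderSearch = 1,   // assumed value
  SystemWarnings = 2, // assumed value
  VFS = 4,            // assumed value
  Macros = 8,
  IgnoreCWD = 0x10,
  All = HeaderSearch | SystemWarnings | VFS | Macros | IgnoreCWD,
  // Everything except the working directory optimization.
  Default = All & ~IgnoreCWD
};

// Returns true if Flag is enabled in Set.
constexpr bool has(SketchOptimizations Set, SketchOptimizations Flag) {
  return (static_cast<std::uint8_t>(Set) & static_cast<std::uint8_t>(Flag)) !=
         0;
}
```

With such a definition, callers that rely on the default no longer get the CWD optimization, while an explicit request for it (or for all optimizations) still enables it.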

rdar://145860213
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 5, 2025
… Optimization Off by Default (#129809)

qiongsiwu added a commit to qiongsiwu/llvm-project that referenced this pull request Mar 17, 2025
…on Off by Default (llvm#129809)

(cherry picked from commit 7bd492f)

 Conflicts:
	clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
qiongsiwu added a commit to swiftlang/llvm-project that referenced this pull request Mar 18, 2025
…on Off by Default (llvm#129809)

(cherry picked from commit 7bd492f)

 Conflicts:
	clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
…on Off by Default (llvm#129809)

Labels
clang Clang issues not falling into any other category