Add -funique-source-file-identifier option. #142901

Merged on Jun 5, 2025 (3 commits)

@pcc pcc commented Jun 5, 2025

This option complements -funique-source-file-names and allows the user
to use a different unique identifier than the source file path.
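
To illustrate how the two options interact at the driver level, here is a minimal, self-contained C++ sketch. It is a simplified model and not Clang's code: the function name `forwardUniqueSourceFileArgs` and the plain string vectors are hypothetical stand-ins for the driver's argument handling. It assumes that the last of `-funique-source-file-names` / `-fno-unique-source-file-names` wins, that the last `-funique-source-file-identifier=` value wins, and that the input path is used when no identifier is given.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the driver-side decision; not the real Clang API.
// Returns the arguments that would be forwarded to the frontend.
std::vector<std::string> forwardUniqueSourceFileArgs(
    const std::vector<std::string> &driverArgs, const std::string &baseInput) {
  const std::string idPrefix = "-funique-source-file-identifier=";
  bool enabled = false;                  // default when neither flag is given
  std::optional<std::string> explicitId; // last explicit identifier option seen

  for (const std::string &arg : driverArgs) {
    if (arg == "-funique-source-file-names")
      enabled = true;                    // last occurrence wins
    else if (arg == "-fno-unique-source-file-names")
      enabled = false;
    else if (arg.rfind(idPrefix, 0) == 0) // arg starts with idPrefix
      explicitId = arg;
  }

  std::vector<std::string> cc1Args;
  if (enabled) {
    // Forward the explicit identifier if present; otherwise synthesize one
    // from the input file path.
    cc1Args.push_back(explicitId ? *explicitId : idPrefix + baseInput);
  }
  return cc1Args;
}
```

In this model, passing an identifier without `-funique-source-file-names` forwards nothing, so the identifier only takes effect when the uniqueness assumption is enabled.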

Created using spr 1.3.6-beta.1
@llvmbot llvmbot added the clang, clang:driver, clang:frontend, clang:codegen, and llvm:transforms labels Jun 5, 2025
@pcc pcc requested a review from teresajohnson June 5, 2025 05:37
llvmbot commented Jun 5, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-llvm-transforms

Author: Peter Collingbourne (pcc)

Changes

This flag complements -funique-source-file-names and allows the user to
use a different unique identifier than the source file path.


Full diff: https://github.com/llvm/llvm-project/pull/142901.diff

10 Files Affected:

  • (modified) clang/docs/UsersManual.rst (+12-5)
  • (modified) clang/include/clang/Basic/CodeGenOptions.def (-2)
  • (modified) clang/include/clang/Basic/CodeGenOptions.h (+4)
  • (modified) clang/include/clang/Driver/Options.td (+9-7)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+7-2)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+8-2)
  • (modified) clang/test/CodeGen/unique-source-file-names.c (+3-2)
  • (modified) clang/test/Driver/unique-source-file-names.c (+9-3)
  • (modified) llvm/lib/Transforms/Utils/ModuleUtils.cpp (+6-4)
  • (modified) llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll (+2-1)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 8c72f95b94095..62844f7e6a2fa 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -2300,12 +2300,14 @@ are listed below.
 .. option:: -f[no-]unique-source-file-names
 
    When enabled, allows the compiler to assume that each object file
-   passed to the linker has been compiled using a unique source file
-   path. This is useful for reducing link times when doing ThinLTO
-   in combination with whole-program devirtualization or CFI.
+   passed to the linker has a unique identifier. The identifier for
+   an object file is either the source file path or the value of the
+   argument `-funique-source-file-identifier` if specified. This is
+   useful for reducing link times when doing ThinLTO in combination with
+   whole-program devirtualization or CFI.
 
-   The full source path passed to the compiler must be unique. This
-   means that, for example, the following is a usage error:
+   The full source path or identifier passed to the compiler must be
+   unique. This means that, for example, the following is a usage error:
 
    .. code-block:: console
 
@@ -2327,6 +2329,11 @@ are listed below.
    A misuse of this flag may result in a duplicate symbol error at
    link time.
 
+.. option:: -funique-source-file-identifier=IDENTIFIER
+
+   Used with `-funique-source-file-names` to specify a source file
+   identifier.
+
 .. option:: -fforce-emit-vtables
 
    In order to improve devirtualization, forces emitting of vtables even in
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index aad4e107cbeb3..fa9474d63ae42 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -278,8 +278,6 @@ CODEGENOPT(SanitizeCfiICallNormalizeIntegers, 1, 0) ///< Normalize integer types
                                                     ///< CFI icall function signatures
 CODEGENOPT(SanitizeCfiCanonicalJumpTables, 1, 0) ///< Make jump table symbols canonical
                                                  ///< instead of creating a local jump table.
-CODEGENOPT(UniqueSourceFileNames, 1, 0) ///< Allow the compiler to assume that TUs
-                                        ///< have unique source file names at link time
 CODEGENOPT(SanitizeKcfiArity, 1, 0) ///< Embed arity in KCFI patchable function prefix
 CODEGENOPT(SanitizeCoverageType, 2, 0) ///< Type of sanitizer coverage
                                        ///< instrumentation.
diff --git a/clang/include/clang/Basic/CodeGenOptions.h b/clang/include/clang/Basic/CodeGenOptions.h
index 278803f7bb960..f6a6a7fcfa6d7 100644
--- a/clang/include/clang/Basic/CodeGenOptions.h
+++ b/clang/include/clang/Basic/CodeGenOptions.h
@@ -338,6 +338,10 @@ class CodeGenOptions : public CodeGenOptionsBase {
   /// -fsymbol-partition (see https://lld.llvm.org/Partitions.html).
   std::string SymbolPartition;
 
+  /// If non-empty, allow the compiler to assume that the given source file
+  /// identifier is unique at link time.
+  std::string UniqueSourceFileIdentifier;
+  
   enum RemarkKind {
     RK_Missing,            // Remark argument not present on the command line.
     RK_Enabled,            // Remark enabled via '-Rgroup'.
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 5ca31c253ed8f..f04e214066ccb 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4204,13 +4204,15 @@ def ftrigraphs : Flag<["-"], "ftrigraphs">, Group<f_Group>,
 def fno_trigraphs : Flag<["-"], "fno-trigraphs">, Group<f_Group>,
   HelpText<"Do not process trigraph sequences">,
   Visibility<[ClangOption, CC1Option]>;
-defm unique_source_file_names: BoolOption<"f", "unique-source-file-names",
-  CodeGenOpts<"UniqueSourceFileNames">, DefaultFalse,
-  PosFlag<SetTrue, [], [CC1Option], "Allow">,
-  NegFlag<SetFalse, [], [], "Do not allow">,
-  BothFlags<[], [ClangOption], " the compiler to assume that each translation unit has a unique "
-                               "source file name at link time">>,
-  Group<f_clang_Group>;
+def funique_source_file_names: Flag<["-"], "funique-source-file-names">, Group<f_Group>,
+  HelpText<"Allow the compiler to assume that each translation unit has a unique "                       
+           "source file identifier (see funique-source-file-identifier) at link time">;
+def fno_unique_source_file_names: Flag<["-"], "fno-unique-source-file-names">;
+def unique_source_file_identifier_EQ: Joined<["-"], "funique-source-file-identifier=">, Group<f_Group>,
+  Visibility<[ClangOption, CC1Option]>,
+  HelpText<"Specify the source file identifier for -funique-source-file-names; "
+           "uses the source file path if not specified">,
+  MarshallingInfoString<CodeGenOpts<"UniqueSourceFileIdentifier">>;
 def funsigned_bitfields : Flag<["-"], "funsigned-bitfields">, Group<f_Group>;
 def funsigned_char : Flag<["-"], "funsigned-char">, Group<f_Group>;
 def fno_unsigned_char : Flag<["-"], "fno-unsigned-char">;
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 468fc6e0e5c56..4885965b35abb 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1146,8 +1146,13 @@ void CodeGenModule::Release() {
                               1);
   }
 
-  if (CodeGenOpts.UniqueSourceFileNames) {
-    getModule().addModuleFlag(llvm::Module::Max, "Unique Source File Names", 1);
+  if (!CodeGenOpts.UniqueSourceFileIdentifier.empty()) {
+    getModule().addModuleFlag(
+        llvm::Module::Append, "Unique Source File Identifier",
+        llvm::MDTuple::get(
+            TheModule.getContext(),
+            llvm::MDString::get(TheModule.getContext(),
+                                CodeGenOpts.UniqueSourceFileIdentifier)));
   }
 
   if (LangOpts.Sanitize.has(SanitizerKind::KCFI)) {
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 13842b8cc2870..504d79461d534 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -7740,8 +7740,14 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA,
   Args.addOptInFlag(CmdArgs, options::OPT_fexperimental_late_parse_attributes,
                     options::OPT_fno_experimental_late_parse_attributes);
 
-  Args.addOptInFlag(CmdArgs, options::OPT_funique_source_file_names,
-                    options::OPT_fno_unique_source_file_names);
+  if (Args.hasFlag(options::OPT_funique_source_file_names,
+                    options::OPT_fno_unique_source_file_names, false)) {
+    if (Arg *A = Args.getLastArg(options::OPT_unique_source_file_identifier_EQ))
+      A->render(Args, CmdArgs);
+    else
+      CmdArgs.push_back(Args.MakeArgString(
+          Twine("-funique-source-file-identifier=") + Input.getBaseInput()));
+  }
 
   // Setup statistics file output.
   SmallString<128> StatsFile = getStatsFileName(Args, Output, Input, D);
diff --git a/clang/test/CodeGen/unique-source-file-names.c b/clang/test/CodeGen/unique-source-file-names.c
index 1d5a4a5e8e4c5..df8e3025870ae 100644
--- a/clang/test/CodeGen/unique-source-file-names.c
+++ b/clang/test/CodeGen/unique-source-file-names.c
@@ -1,2 +1,3 @@
-// RUN: %clang_cc1 -funique-source-file-names -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
-// CHECK:  !{i32 7, !"Unique Source File Names", i32 1}
+// RUN: %clang_cc1 -funique-source-file-identifier=foo -triple x86_64-linux-gnu -emit-llvm %s -o - | FileCheck %s
+// CHECK:  !{i32 5, !"Unique Source File Identifier", ![[MD:[0-9]*]]}
+// CHECK: ![[MD]] = !{!"foo"}
diff --git a/clang/test/Driver/unique-source-file-names.c b/clang/test/Driver/unique-source-file-names.c
index 8322f0e37b0c7..0dc71345d745c 100644
--- a/clang/test/Driver/unique-source-file-names.c
+++ b/clang/test/Driver/unique-source-file-names.c
@@ -1,5 +1,11 @@
 // RUN: %clang -funique-source-file-names -### %s 2> %t
-// RUN: FileCheck < %t %s
+// RUN: FileCheck --check-prefix=SRC < %t %s
 
-// CHECK: "-cc1"
-// CHECK: "-funique-source-file-names"
+// SRC: "-cc1"
+// SRC: "-funique-source-file-identifier={{.*}}unique-source-file-names.c"
+
+// RUN: %clang -funique-source-file-names -funique-source-file-identifier=foo -### %s 2> %t
+// RUN: FileCheck --check-prefix=ID < %t %s
+
+// ID: "-cc1"
+// ID: "-funique-source-file-identifier=foo"
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 10efdd61d4553..596849ecab742 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -18,6 +18,7 @@
 #include "llvm/IR/IRBuilder.h"
 #include "llvm/IR/MDBuilder.h"
 #include "llvm/IR/Module.h"
+#include "llvm/Support/Casting.h"
 #include "llvm/Support/MD5.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Support/xxhash.h"
@@ -346,10 +347,11 @@ void llvm::filterDeadComdatFunctions(
 std::string llvm::getUniqueModuleId(Module *M) {
   MD5 Md5;
 
-  auto *UniqueSourceFileNames = mdconst::extract_or_null<ConstantInt>(
-      M->getModuleFlag("Unique Source File Names"));
-  if (UniqueSourceFileNames && UniqueSourceFileNames->getZExtValue()) {
-    Md5.update(M->getSourceFileName());
+  auto *UniqueSourceFileIdentifier = dyn_cast_or_null<MDNode>(
+      M->getModuleFlag("Unique Source File Identifier"));
+  if (UniqueSourceFileIdentifier) {
+    Md5.update(
+        cast<MDString>(UniqueSourceFileIdentifier->getOperand(0))->getString());
   } else {
     bool ExportsSymbols = false;
     for (auto &GV : M->global_values()) {
diff --git a/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll b/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
index 0f3fd566f9b1c..13dcefcb70cb5 100644
--- a/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
+++ b/llvm/test/Transforms/ThinLTOBitcodeWriter/unique-source-file-names.ll
@@ -19,4 +19,5 @@ define internal void @f() {
 !0 = !{i32 0, !"typeid"}
 
 !llvm.module.flags = !{!1}
-!1 = !{i32 1, !"Unique Source File Names", i32 1}
+!1 = !{i32 5, !"Unique Source File Identifier", !2}
+!2 = !{!"unique-source-file-names.c"}
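
The `getUniqueModuleId` hunk above hashes the identifier string whenever the module flag is present. The sketch below models only that selection step, under stated assumptions: FNV-1a stands in for the MD5 hash used by LLVM, and `ModuleModel`, `fallbackInput`, and `modelUniqueModuleId` are hypothetical names. The real fallback branch is only partly visible in the diff and is not reproduced here.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// FNV-1a is a stand-in hash for illustration; LLVM uses MD5 here.
std::uint64_t fnv1a(const std::string &s) {
  std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;                   // FNV prime
  }
  return h;
}

// Hypothetical model of the inputs that matter for the module ID.
struct ModuleModel {
  // Set when the "Unique Source File Identifier" module flag is present.
  std::optional<std::string> uniqueSourceFileIdentifier;
  // Stands in for whatever the fallback branch would hash instead.
  std::string fallbackInput;
};

// If an identifier is present, the ID depends on it alone; otherwise the
// fallback input is hashed.
std::string modelUniqueModuleId(const ModuleModel &m) {
  const std::string &input = m.uniqueSourceFileIdentifier
                                 ? *m.uniqueSourceFileIdentifier
                                 : m.fallbackInput;
  return std::to_string(fnv1a(input));
}
```

Because the ID depends only on the identifier in this model, two modules that share an identifier get the same ID. That is the misuse the documentation warns may surface as a duplicate symbol error at link time.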

github-actions bot commented Jun 5, 2025

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions c,cpp,h -- clang/include/clang/Basic/CodeGenOptions.h clang/lib/CodeGen/CodeGenModule.cpp clang/lib/Driver/ToolChains/Clang.cpp clang/test/CodeGen/unique-source-file-names.c clang/test/Driver/unique-source-file-names.c llvm/lib/Transforms/Utils/ModuleUtils.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 596849eca..05470b5cd 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -11,8 +11,8 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/Transforms/Utils/ModuleUtils.h"
-#include "llvm/Analysis/VectorUtils.h"
 #include "llvm/ADT/SmallString.h"
+#include "llvm/Analysis/VectorUtils.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Function.h"
 #include "llvm/IR/IRBuilder.h"

MaskRay commented Jun 5, 2025

This should be called an "option". Within LLVM, "flag" refers to an option that takes no value.

  Group<f_clang_Group>;
def funique_source_file_names: Flag<["-"], "funique-source-file-names">, Group<f_Group>,
  HelpText<"Allow the compiler to assume that each translation unit has a unique "
           "source file identifier (see funique-source-file-identifier) at link time">;
Reviewer (Contributor):

nit: missing "-" in front of option name.

Author (Contributor):

Done

@pcc pcc changed the title from "Add -funique-source-file-identifier flag." to "Add -funique-source-file-identifier option." Jun 5, 2025
@pcc pcc merged commit d1b0b4b into main Jun 5, 2025
5 of 8 checks passed
@pcc pcc deleted the users/pcc/spr/add-funique-source-file-identifier-flag branch June 5, 2025 17:52
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Jun 5, 2025
This option complements -funique-source-file-names and allows the user
to use a different unique identifier than the source file path.

Reviewers: teresajohnson

Reviewed By: teresajohnson

Pull Request: llvm/llvm-project#142901
rorth pushed a commit to rorth/llvm-project that referenced this pull request Jun 11, 2025
DhruvSrivastavaX pushed a commit to DhruvSrivastavaX/lldb-for-aix that referenced this pull request Jun 12, 2025