Skip to content

[clang][llvm][aarch64][win] Add a clang flag and module attribute for import call optimization, and remove LLVM flag #122831

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 30, 2025

Conversation

dpaoliello
Copy link
Contributor

Switches import call optimization from being enabled by an LLVM flag to instead using a module attribute, and creates a new Clang flag that will set that attribute. This addresses the concern raised in the original PR: #121516 (comment)

This change also only creates the Called Global info if the module attribute is present, addressing this concern: #122762 (review)

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:AArch64 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jan 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 14, 2025

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-backend-aarch64

Author: Daniel Paoliello (dpaoliello)

Changes

Switches import call optimization from being enabled by an LLVM flag to instead using a module attribute, and creates a new Clang flag that will set that attribute. This addresses the concern raised in the original PR: <#121516 (comment)>

This change also only creates the Called Global info if the module attribute is present, addressing this concern: <#122762 (review)>


Full diff: https://github.com/llvm/llvm-project/pull/122831.diff

8 Files Affected:

  • (modified) clang/include/clang/Basic/CodeGenOptions.def (+4)
  • (modified) clang/include/clang/Driver/Options.td (+5)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+5)
  • (added) clang/test/CodeGen/import-call-optimization.c (+5)
  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+6-8)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+4-2)
  • (modified) llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll (+4-1)
  • (modified) llvm/test/CodeGen/AArch64/win-import-call-optimization.ll (+31-31)
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 0f4ed13d5f3d8c..e35e1a9e673dae 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -465,6 +465,10 @@ ENUM_CODEGENOPT(ZeroCallUsedRegs, llvm::ZeroCallUsedRegs::ZeroCallUsedRegsKind,
 /// non-deleting destructors. (No effect on Microsoft ABI.)
 CODEGENOPT(CtorDtorReturnThis, 1, 0)
 
+/// Enables emitting Import Call sections on supported targets that can be used
+/// by the Windows kernel to enable import call optimization.
+CODEGENOPT(ImportCallOptimization, 1, 0)
+
 /// FIXME: Make DebugOptions its own top-level .def file.
 #include "DebugOptions.def"
 
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index bbf5c0e7e7fd1a..8f4e5de74b20aa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7657,6 +7657,11 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme
 def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
   MarshallingInfoFlag<CodeGenOpts<"EnableTLSDESC">>;
 
+def import_call_optimization : Flag<["-"], "import-call-optimization">,
+    HelpText<"Emit Import Call sections on supported targets that can be used "
+             "by the Windows kernel to enable import call optimization">,
+    MarshallingInfoFlag<CodeGenOpts<"ImportCallOptimization">>;
+
 } // let Visibility = [CC1Option]
 
 //===----------------------------------------------------------------------===//
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index dfb51b11e1d851..8970992b2b413b 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1291,6 +1291,11 @@ void CodeGenModule::Release() {
   if (LangOpts.EHAsynch)
     getModule().addModuleFlag(llvm::Module::Warning, "eh-asynch", 1);
 
+  // Emit Import Call section.
+  if (CodeGenOpts.ImportCallOptimization)
+    getModule().addModuleFlag(llvm::Module::Warning, "import-call-optimization",
+                              1);
+
   // Indicate whether this Module was compiled with -fopenmp
   if (getLangOpts().OpenMP && !getLangOpts().OpenMPSimd)
     getModule().addModuleFlag(llvm::Module::Max, "openmp", LangOpts.OpenMP);
diff --git a/clang/test/CodeGen/import-call-optimization.c b/clang/test/CodeGen/import-call-optimization.c
new file mode 100644
index 00000000000000..cc4e37fda7bb1d
--- /dev/null
+++ b/clang/test/CodeGen/import-call-optimization.c
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -import-call-optimization -emit-llvm %s -o - | FileCheck %s
+
+void f(void) {}
+
+// CHECK: !"import-call-optimization", i32 1}
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 27e65d60122fd7..621b00dd8da8e9 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -78,11 +78,6 @@ static cl::opt<PtrauthCheckMode> PtrauthAuthChecks(
     cl::desc("Check pointer authentication auth/resign failures"),
     cl::init(Default));
 
-static cl::opt<bool> EnableImportCallOptimization(
-    "aarch64-win-import-call-optimization", cl::Hidden,
-    cl::desc("Enable import call optimization for AArch64 Windows"),
-    cl::init(false));
-
 #define DEBUG_TYPE "asm-printer"
 
 namespace {
@@ -95,6 +90,7 @@ class AArch64AsmPrinter : public AsmPrinter {
 #ifndef NDEBUG
   unsigned InstsEmitted;
 #endif
+  bool EnableImportCallOptimization = false;
   DenseMap<MCSection *, std::vector<std::pair<MCSymbol *, MCSymbol *>>>
       SectionToImportedFunctionCalls;
 
@@ -340,6 +336,9 @@ void AArch64AsmPrinter::emitStartOfAsmFile(Module &M) {
     OutStreamer->emitSymbolAttribute(S, MCSA_Global);
     OutStreamer->emitAssignment(
         S, MCConstantExpr::create(Feat00Value, MMI->getContext()));
+
+    if (M.getModuleFlag("import-call-optimization"))
+      EnableImportCallOptimization = true;
   }
 
   if (!TT.isOSBinFormatELF())
@@ -945,7 +944,7 @@ void AArch64AsmPrinter::emitEndOfAsmFile(Module &M) {
 
   // If import call optimization is enabled, emit the appropriate section.
   // We do this whether or not we recorded any import calls.
-  if (EnableImportCallOptimization && TT.isOSBinFormatCOFF()) {
+  if (EnableImportCallOptimization) {
     OutStreamer->switchSection(getObjFileLowering().getImportCallSection());
 
     // Section always starts with some magic.
@@ -3109,8 +3108,7 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
 
 void AArch64AsmPrinter::recordIfImportCall(
     const llvm::MachineInstr *BranchInst) {
-  if (!EnableImportCallOptimization ||
-      !TM.getTargetTriple().isOSBinFormatCOFF())
+  if (!EnableImportCallOptimization)
     return;
 
   auto [GV, OpFlags] = BranchInst->getMF()->tryGetCalledGlobal(BranchInst);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index d4a114c275fb76..d02ec0cca62b5c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -9577,7 +9577,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
 
     DAG.addNoMergeSiteInfo(Ret.getNode(), CLI.NoMerge);
     DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo));
-    if (CalledGlobal)
+    if (CalledGlobal &&
+        MF.getFunction().getParent()->getModuleFlag("import-call-optimization"))
       DAG.addCalledGlobal(Ret.getNode(), CalledGlobal, OpFlags);
     return Ret;
   }
@@ -9590,7 +9591,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
   DAG.addNoMergeSiteInfo(Chain.getNode(), CLI.NoMerge);
   InGlue = Chain.getValue(1);
   DAG.addCallSiteInfo(Chain.getNode(), std::move(CSInfo));
-  if (CalledGlobal)
+  if (CalledGlobal &&
+      MF.getFunction().getParent()->getModuleFlag("import-call-optimization"))
     DAG.addCalledGlobal(Chain.getNode(), CalledGlobal, OpFlags);
 
   uint64_t CalleePopBytes =
diff --git a/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll b/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
index 81d6d6369dcbf4..b17f2d113c0d29 100644
--- a/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
+++ b/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-pc-windows-msvc -aarch64-win-import-call-optimization < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s
 
 define dso_local void @normal_call() local_unnamed_addr {
 entry:
@@ -16,3 +16,6 @@ declare void @a() local_unnamed_addr
 ; CHECK-LABEL  .section   .impcall,"yi"
 ; CHECK-NEXT   .asciz  "Imp_Call_V1"
 ; CHECK-NOT    .secnum
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 1, !"import-call-optimization", i32 1}
diff --git a/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll b/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
index 6bb118ba1e1596..2cf0f8bb6baa5c 100644
--- a/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
+++ b/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
@@ -1,7 +1,4 @@
-; RUN: llc -mtriple=aarch64-pc-windows-msvc -aarch64-win-import-call-optimization < %s | FileCheck %s --check-prefix=CHECK-ENABLED
-; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK-DISABLED
-
-; CHECK-DISABLED-NOT: .section        .impcall
+; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK
 
 define dso_local void @normal_call() local_unnamed_addr section "nc_sect" {
 entry:
@@ -9,40 +6,43 @@ entry:
   call void @a()
   ret void
 }
-; CHECK-ENABLED-LABEL:  normal_call:
-; CHECK-ENABLED:        adrp    [[ADRPREG:x[0-9]+]], __imp_a
-; CHECK-ENABLED-NEXT:   ldr     [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_a]
-; CHECK-ENABLED-NEXT:   .Limpcall0:
-; CHECK-ENABLED-NEXT:   blr     [[LDRREG]]
-; CHECK-ENABLED-NEXT:   .Limpcall1:
-; CHECK-ENABLED-NEXT:   blr     [[LDRREG]]
+; CHECK-LABEL:  normal_call:
+; CHECK:        adrp    [[ADRPREG:x[0-9]+]], __imp_a
+; CHECK-NEXT:   ldr     [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_a]
+; CHECK-NEXT:   .Limpcall0:
+; CHECK-NEXT:   blr     [[LDRREG]]
+; CHECK-NEXT:   .Limpcall1:
+; CHECK-NEXT:   blr     [[LDRREG]]
 
 define dso_local void @tail_call() local_unnamed_addr section "tc_sect" {
 entry:
   tail call void @b()
   ret void
 }
-; CHECK-ENABLED-LABEL:  tail_call:
-; CHECK-ENABLED:        adrp    [[ADRPREG:x[0-9]+]], __imp_b
-; CHECK-ENABLED-NEXT:   ldr     [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_b]
-; CHECK-ENABLED-NEXT:   .Limpcall2:
-; CHECK-ENABLED-NEXT:   br      [[LDRREG]]
+; CHECK-LABEL:  tail_call:
+; CHECK:        adrp    [[ADRPREG:x[0-9]+]], __imp_b
+; CHECK-NEXT:   ldr     [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_b]
+; CHECK-NEXT:   .Limpcall2:
+; CHECK-NEXT:   br      [[LDRREG]]
 
 declare dllimport void @a() local_unnamed_addr
 declare dllimport void @b() local_unnamed_addr
 
-; CHECK-ENABLED-LABEL  .section   .impcall,"yi"
-; CHECK-ENABLED-NEXT   .asciz  "Imp_Call_V1"
-; CHECK-ENABLED-NEXT   .word   32
-; CHECK-ENABLED-NEXT   .secnum nc_sect
-; CHECK-ENABLED-NEXT   .word   19
-; CHECK-ENABLED-NEXT   .secoffset      .Limpcall0
-; CHECK-ENABLED-NEXT   .symidx __imp_a
-; CHECK-ENABLED-NEXT   .word   19
-; CHECK-ENABLED-NEXT   .secoffset      .Limpcall1
-; CHECK-ENABLED-NEXT   .symidx __imp_a
-; CHECK-ENABLED-NEXT   .word   20
-; CHECK-ENABLED-NEXT   .secnum tc_sect
-; CHECK-ENABLED-NEXT   .word   19
-; CHECK-ENABLED-NEXT   .secoffset      .Limpcall2
-; CHECK-ENABLED-NEXT   .symidx __imp_b
+; CHECK-LABEL  .section   .impcall,"yi"
+; CHECK-NEXT   .asciz  "Imp_Call_V1"
+; CHECK-NEXT   .word   32
+; CHECK-NEXT   .secnum nc_sect
+; CHECK-NEXT   .word   19
+; CHECK-NEXT   .secoffset      .Limpcall0
+; CHECK-NEXT   .symidx __imp_a
+; CHECK-NEXT   .word   19
+; CHECK-NEXT   .secoffset      .Limpcall1
+; CHECK-NEXT   .symidx __imp_a
+; CHECK-NEXT   .word   20
+; CHECK-NEXT   .secnum tc_sect
+; CHECK-NEXT   .word   19
+; CHECK-NEXT   .secoffset      .Limpcall2
+; CHECK-NEXT   .symidx __imp_b
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 1, !"import-call-optimization", i32 1}

@dpaoliello dpaoliello changed the title [aarch64][win] Add a clang flag and module attribute for import call optimization, and remove LLVM flag [clang][llvm][aarch64][win] Add a clang flag and module attribute for import call optimization, and remove LLVM flag Jan 14, 2025
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one minor comment.

@@ -945,7 +944,7 @@ void AArch64AsmPrinter::emitEndOfAsmFile(Module &M) {

// If import call optimization is enabled, emit the appropriate section.
// We do this whether or not we recorded any import calls.
if (EnableImportCallOptimization && TT.isOSBinFormatCOFF()) {
if (EnableImportCallOptimization) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe leave the isOSBinFormatCOFF() checks so things don't explode if someone tries to use the metadata on a non-COFF target.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnableImportCallOptimization will only be set to true if we are targeting COFF (see AArch64AsmPrinter::emitStartOfAsmFile)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I see your point: if this is every added for a non-COFF target we don't want to emit the COFF metadata.

@dpaoliello dpaoliello merged commit 845cc96 into llvm:main Jan 30, 2025
8 checks passed
@dpaoliello dpaoliello deleted the impcallmodattr branch January 30, 2025 17:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AArch64 clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants