-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[clang][llvm][aarch64][win] Add a clang flag and module attribute for import call optimization, and remove LLVM flag #122831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-backend-aarch64 Author: Daniel Paoliello (dpaoliello) ChangesSwitches import call optimization from being enabled by an LLVM flag to instead using a module attribute, and creates a new Clang flag that will set that attribute. This addresses the concern raised in the original PR: <#121516 (comment)> This change also only creates the Called Global info if the module attribute is present, addressing this concern: <#122762 (review)> Full diff: https://github.com/llvm/llvm-project/pull/122831.diff 8 Files Affected:
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 0f4ed13d5f3d8c..e35e1a9e673dae 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -465,6 +465,10 @@ ENUM_CODEGENOPT(ZeroCallUsedRegs, llvm::ZeroCallUsedRegs::ZeroCallUsedRegsKind,
/// non-deleting destructors. (No effect on Microsoft ABI.)
CODEGENOPT(CtorDtorReturnThis, 1, 0)
+/// Enables emitting Import Call sections on supported targets that can be used
+/// by the Windows kernel to enable import call optimization.
+CODEGENOPT(ImportCallOptimization, 1, 0)
+
/// FIXME: Make DebugOptions its own top-level .def file.
#include "DebugOptions.def"
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index bbf5c0e7e7fd1a..8f4e5de74b20aa 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -7657,6 +7657,11 @@ def fexperimental_assignment_tracking_EQ : Joined<["-"], "fexperimental-assignme
def enable_tlsdesc : Flag<["-"], "enable-tlsdesc">,
MarshallingInfoFlag<CodeGenOpts<"EnableTLSDESC">>;
+def import_call_optimization : Flag<["-"], "import-call-optimization">,
+ HelpText<"Emit Import Call sections on supported targets that can be used "
+ "by the Windows kernel to enable import call optimization">,
+ MarshallingInfoFlag<CodeGenOpts<"ImportCallOptimization">>;
+
} // let Visibility = [CC1Option]
//===----------------------------------------------------------------------===//
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index dfb51b11e1d851..8970992b2b413b 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -1291,6 +1291,11 @@ void CodeGenModule::Release() {
if (LangOpts.EHAsynch)
getModule().addModuleFlag(llvm::Module::Warning, "eh-asynch", 1);
+ // Emit Import Call section.
+ if (CodeGenOpts.ImportCallOptimization)
+ getModule().addModuleFlag(llvm::Module::Warning, "import-call-optimization",
+ 1);
+
// Indicate whether this Module was compiled with -fopenmp
if (getLangOpts().OpenMP && !getLangOpts().OpenMPSimd)
getModule().addModuleFlag(llvm::Module::Max, "openmp", LangOpts.OpenMP);
diff --git a/clang/test/CodeGen/import-call-optimization.c b/clang/test/CodeGen/import-call-optimization.c
new file mode 100644
index 00000000000000..cc4e37fda7bb1d
--- /dev/null
+++ b/clang/test/CodeGen/import-call-optimization.c
@@ -0,0 +1,5 @@
+// RUN: %clang_cc1 -import-call-optimization -emit-llvm %s -o - | FileCheck %s
+
+void f(void) {}
+
+// CHECK: !"import-call-optimization", i32 1}
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 27e65d60122fd7..621b00dd8da8e9 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -78,11 +78,6 @@ static cl::opt<PtrauthCheckMode> PtrauthAuthChecks(
cl::desc("Check pointer authentication auth/resign failures"),
cl::init(Default));
-static cl::opt<bool> EnableImportCallOptimization(
- "aarch64-win-import-call-optimization", cl::Hidden,
- cl::desc("Enable import call optimization for AArch64 Windows"),
- cl::init(false));
-
#define DEBUG_TYPE "asm-printer"
namespace {
@@ -95,6 +90,7 @@ class AArch64AsmPrinter : public AsmPrinter {
#ifndef NDEBUG
unsigned InstsEmitted;
#endif
+ bool EnableImportCallOptimization = false;
DenseMap<MCSection *, std::vector<std::pair<MCSymbol *, MCSymbol *>>>
SectionToImportedFunctionCalls;
@@ -340,6 +336,9 @@ void AArch64AsmPrinter::emitStartOfAsmFile(Module &M) {
OutStreamer->emitSymbolAttribute(S, MCSA_Global);
OutStreamer->emitAssignment(
S, MCConstantExpr::create(Feat00Value, MMI->getContext()));
+
+ if (M.getModuleFlag("import-call-optimization"))
+ EnableImportCallOptimization = true;
}
if (!TT.isOSBinFormatELF())
@@ -945,7 +944,7 @@ void AArch64AsmPrinter::emitEndOfAsmFile(Module &M) {
// If import call optimization is enabled, emit the appropriate section.
// We do this whether or not we recorded any import calls.
- if (EnableImportCallOptimization && TT.isOSBinFormatCOFF()) {
+ if (EnableImportCallOptimization) {
OutStreamer->switchSection(getObjFileLowering().getImportCallSection());
// Section always starts with some magic.
@@ -3109,8 +3108,7 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
void AArch64AsmPrinter::recordIfImportCall(
const llvm::MachineInstr *BranchInst) {
- if (!EnableImportCallOptimization ||
- !TM.getTargetTriple().isOSBinFormatCOFF())
+ if (!EnableImportCallOptimization)
return;
auto [GV, OpFlags] = BranchInst->getMF()->tryGetCalledGlobal(BranchInst);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index d4a114c275fb76..d02ec0cca62b5c 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -9577,7 +9577,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
DAG.addNoMergeSiteInfo(Ret.getNode(), CLI.NoMerge);
DAG.addCallSiteInfo(Ret.getNode(), std::move(CSInfo));
- if (CalledGlobal)
+ if (CalledGlobal &&
+ MF.getFunction().getParent()->getModuleFlag("import-call-optimization"))
DAG.addCalledGlobal(Ret.getNode(), CalledGlobal, OpFlags);
return Ret;
}
@@ -9590,7 +9591,8 @@ AArch64TargetLowering::LowerCall(CallLoweringInfo &CLI,
DAG.addNoMergeSiteInfo(Chain.getNode(), CLI.NoMerge);
InGlue = Chain.getValue(1);
DAG.addCallSiteInfo(Chain.getNode(), std::move(CSInfo));
- if (CalledGlobal)
+ if (CalledGlobal &&
+ MF.getFunction().getParent()->getModuleFlag("import-call-optimization"))
DAG.addCalledGlobal(Chain.getNode(), CalledGlobal, OpFlags);
uint64_t CalleePopBytes =
diff --git a/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll b/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
index 81d6d6369dcbf4..b17f2d113c0d29 100644
--- a/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
+++ b/llvm/test/CodeGen/AArch64/win-import-call-optimization-nocalls.ll
@@ -1,4 +1,4 @@
-; RUN: llc -mtriple=aarch64-pc-windows-msvc -aarch64-win-import-call-optimization < %s | FileCheck %s
+; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s
define dso_local void @normal_call() local_unnamed_addr {
entry:
@@ -16,3 +16,6 @@ declare void @a() local_unnamed_addr
; CHECK-LABEL .section .impcall,"yi"
; CHECK-NEXT .asciz "Imp_Call_V1"
; CHECK-NOT .secnum
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 1, !"import-call-optimization", i32 1}
diff --git a/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll b/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
index 6bb118ba1e1596..2cf0f8bb6baa5c 100644
--- a/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
+++ b/llvm/test/CodeGen/AArch64/win-import-call-optimization.ll
@@ -1,7 +1,4 @@
-; RUN: llc -mtriple=aarch64-pc-windows-msvc -aarch64-win-import-call-optimization < %s | FileCheck %s --check-prefix=CHECK-ENABLED
-; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK-DISABLED
-
-; CHECK-DISABLED-NOT: .section .impcall
+; RUN: llc -mtriple=aarch64-pc-windows-msvc < %s | FileCheck %s --check-prefix=CHECK
define dso_local void @normal_call() local_unnamed_addr section "nc_sect" {
entry:
@@ -9,40 +6,43 @@ entry:
call void @a()
ret void
}
-; CHECK-ENABLED-LABEL: normal_call:
-; CHECK-ENABLED: adrp [[ADRPREG:x[0-9]+]], __imp_a
-; CHECK-ENABLED-NEXT: ldr [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_a]
-; CHECK-ENABLED-NEXT: .Limpcall0:
-; CHECK-ENABLED-NEXT: blr [[LDRREG]]
-; CHECK-ENABLED-NEXT: .Limpcall1:
-; CHECK-ENABLED-NEXT: blr [[LDRREG]]
+; CHECK-LABEL: normal_call:
+; CHECK: adrp [[ADRPREG:x[0-9]+]], __imp_a
+; CHECK-NEXT: ldr [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_a]
+; CHECK-NEXT: .Limpcall0:
+; CHECK-NEXT: blr [[LDRREG]]
+; CHECK-NEXT: .Limpcall1:
+; CHECK-NEXT: blr [[LDRREG]]
define dso_local void @tail_call() local_unnamed_addr section "tc_sect" {
entry:
tail call void @b()
ret void
}
-; CHECK-ENABLED-LABEL: tail_call:
-; CHECK-ENABLED: adrp [[ADRPREG:x[0-9]+]], __imp_b
-; CHECK-ENABLED-NEXT: ldr [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_b]
-; CHECK-ENABLED-NEXT: .Limpcall2:
-; CHECK-ENABLED-NEXT: br [[LDRREG]]
+; CHECK-LABEL: tail_call:
+; CHECK: adrp [[ADRPREG:x[0-9]+]], __imp_b
+; CHECK-NEXT: ldr [[LDRREG:x[0-9]+]], [[[ADRPREG]], :lo12:__imp_b]
+; CHECK-NEXT: .Limpcall2:
+; CHECK-NEXT: br [[LDRREG]]
declare dllimport void @a() local_unnamed_addr
declare dllimport void @b() local_unnamed_addr
-; CHECK-ENABLED-LABEL .section .impcall,"yi"
-; CHECK-ENABLED-NEXT .asciz "Imp_Call_V1"
-; CHECK-ENABLED-NEXT .word 32
-; CHECK-ENABLED-NEXT .secnum nc_sect
-; CHECK-ENABLED-NEXT .word 19
-; CHECK-ENABLED-NEXT .secoffset .Limpcall0
-; CHECK-ENABLED-NEXT .symidx __imp_a
-; CHECK-ENABLED-NEXT .word 19
-; CHECK-ENABLED-NEXT .secoffset .Limpcall1
-; CHECK-ENABLED-NEXT .symidx __imp_a
-; CHECK-ENABLED-NEXT .word 20
-; CHECK-ENABLED-NEXT .secnum tc_sect
-; CHECK-ENABLED-NEXT .word 19
-; CHECK-ENABLED-NEXT .secoffset .Limpcall2
-; CHECK-ENABLED-NEXT .symidx __imp_b
+; CHECK-LABEL .section .impcall,"yi"
+; CHECK-NEXT .asciz "Imp_Call_V1"
+; CHECK-NEXT .word 32
+; CHECK-NEXT .secnum nc_sect
+; CHECK-NEXT .word 19
+; CHECK-NEXT .secoffset .Limpcall0
+; CHECK-NEXT .symidx __imp_a
+; CHECK-NEXT .word 19
+; CHECK-NEXT .secoffset .Limpcall1
+; CHECK-NEXT .symidx __imp_a
+; CHECK-NEXT .word 20
+; CHECK-NEXT .secnum tc_sect
+; CHECK-NEXT .word 19
+; CHECK-NEXT .secoffset .Limpcall2
+; CHECK-NEXT .symidx __imp_b
+
+!llvm.module.flags = !{!0}
+!0 = !{i32 1, !"import-call-optimization", i32 1}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor comment.
@@ -945,7 +944,7 @@ void AArch64AsmPrinter::emitEndOfAsmFile(Module &M) { | |||
|
|||
// If import call optimization is enabled, emit the appropriate section. | |||
// We do this whether or not we recorded any import calls. | |||
if (EnableImportCallOptimization && TT.isOSBinFormatCOFF()) { | |||
if (EnableImportCallOptimization) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe leave the isOSBinFormatCOFF() checks so things don't explode if someone tries to use the metadata on a non-COFF target.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EnableImportCallOptimization
will only be set to true if we are targeting COFF (see AArch64AsmPrinter::emitStartOfAsmFile
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I see your point: if this is every added for a non-COFF target we don't want to emit the COFF metadata.
…optimization, and remove LLVM flag
c9793e1
to
91aa313
Compare
Switches import call optimization from being enabled by an LLVM flag to instead using a module attribute, and creates a new Clang flag that will set that attribute. This addresses the concern raised in the original PR: #121516 (comment)
This change also only creates the Called Global info if the module attribute is present, addressing this concern: #122762 (review)