Commit 87fd09b

[InstrProfiling] Generate runtime hook for ELF platforms

When using -fprofile-list to selectively apply instrumentation only to certain files or functions, we may end up with a binary that doesn't have any counters in the case where no files were selected. However, because on Linux and Fuchsia, we pass -u__llvm_profile_runtime, the runtime would still be pulled in and incur some non-trivial overhead, especially in the case when the continuous or runtime counter relocation mode is being used. A better way would be to pull in the profile runtime only when needed by declaring the __llvm_profile_runtime symbol in the translation unit only when needed.

This approach was already used prior to 9a041a7, but we changed it to always generate the __llvm_profile_runtime due to a TAPI limitation. Since TAPI is only used on Mach-O platforms, we could use the early emission of __llvm_profile_runtime there, and on other platforms we could change back to the earlier approach where the symbol is generated later only when needed. We can stop passing -u__llvm_profile_runtime to the linker on Linux and Fuchsia since the generated undefined symbol in each translation unit that needed it serves the same purpose.

Differential Revision: https://reviews.llvm.org/D98061

1 parent 63e676f commit 87fd09b
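
The commit message relies on how static archives are linked: a member is pulled in only when something references a symbol it defines. The following is a minimal sketch of that rule, not the behavior of any particular linker. `ArchiveMember`, `pulledMembers`, `profileArchive`, and the member name `runtime_init.o` are hypothetical names introduced for illustration; only `__llvm_profile_runtime` comes from the commit.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of one static-archive member.
struct ArchiveMember {
  std::string Name;              // illustrative member name
  std::set<std::string> Defines; // symbols this member defines
};

// Returns the archive members this toy model would link: a member is pulled
// in only when it defines a symbol that is currently undefined. Real linkers
// also track symbols referenced by newly pulled members; that is omitted.
std::vector<std::string>
pulledMembers(const std::set<std::string> &Undefined,
              const std::vector<ArchiveMember> &Archive) {
  std::vector<std::string> Pulled;
  for (const ArchiveMember &M : Archive) {
    for (const std::string &Sym : M.Defines) {
      if (Undefined.count(Sym)) {
        Pulled.push_back(M.Name);
        break;
      }
    }
  }
  return Pulled;
}

// Stand-in for the profile runtime archive; the member name is made up.
std::vector<ArchiveMember> profileArchive() {
  std::vector<ArchiveMember> Archive;
  Archive.push_back(ArchiveMember{"runtime_init.o", {"__llvm_profile_runtime"}});
  return Archive;
}
```

In this model, passing `-u__llvm_profile_runtime` corresponds to always placing the symbol in the undefined set, while the approach in this commit places it there only when some translation unit actually declares it.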

10 files changed, 26 additions and 50 deletions

clang/docs/UsersManual.rst

Lines changed: 8 additions & 0 deletions
@@ -2301,6 +2301,14 @@ In these cases, you can use the flag ``-fno-profile-instr-generate`` (or
 Note that these flags should appear after the corresponding profile
 flags to have an effect.

+.. note::
+
+  When none of the translation units inside a binary is instrumented, in the
+  case of ELF and COFF binary format the profile runtime will not be linked
+  into the binary and no profile will be produced, while in the case of Mach-O
+  the profile runtime will be linked and profile will be produced but there
+  will not be any counters.
+
 Instrumenting only selected files or functions
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

clang/lib/Driver/ToolChains/Fuchsia.cpp

Lines changed: 0 additions & 10 deletions
@@ -394,13 +394,3 @@ SanitizerMask Fuchsia::getDefaultSanitizers() const {
   }
   return Res;
 }
-
-void Fuchsia::addProfileRTLibs(const llvm::opt::ArgList &Args,
-                               llvm::opt::ArgStringList &CmdArgs) const {
-  // Add linker option -u__llvm_profile_runtime to cause runtime
-  // initialization module to be linked in.
-  if (needsProfileRT(Args))
-    CmdArgs.push_back(Args.MakeArgString(
-        Twine("-u", llvm::getInstrProfRuntimeHookVarName())));
-  ToolChain::addProfileRTLibs(Args, CmdArgs);
-}

clang/lib/Driver/ToolChains/Fuchsia.h

Lines changed: 0 additions & 3 deletions
@@ -71,9 +71,6 @@ class LLVM_LIBRARY_VISIBILITY Fuchsia : public ToolChain {
   SanitizerMask getSupportedSanitizers() const override;
   SanitizerMask getDefaultSanitizers() const override;

-  void addProfileRTLibs(const llvm::opt::ArgList &Args,
-                        llvm::opt::ArgStringList &CmdArgs) const override;
-
   RuntimeLibType
   GetRuntimeLibType(const llvm::opt::ArgList &Args) const override;
   CXXStdlibType

clang/lib/Driver/ToolChains/Linux.cpp

Lines changed: 0 additions & 10 deletions
@@ -925,16 +925,6 @@ SanitizerMask Linux::getSupportedSanitizers() const {
   return Res;
 }

-void Linux::addProfileRTLibs(const llvm::opt::ArgList &Args,
-                             llvm::opt::ArgStringList &CmdArgs) const {
-  // Add linker option -u__llvm_profile_runtime to cause runtime
-  // initialization module to be linked in.
-  if (needsProfileRT(Args))
-    CmdArgs.push_back(Args.MakeArgString(
-        Twine("-u", llvm::getInstrProfRuntimeHookVarName())));
-  ToolChain::addProfileRTLibs(Args, CmdArgs);
-}
-
 llvm::DenormalMode
 Linux::getDefaultDenormalModeForType(const llvm::opt::ArgList &DriverArgs,
                                      const JobAction &JA,

clang/lib/Driver/ToolChains/Linux.h

Lines changed: 0 additions & 2 deletions
@@ -43,8 +43,6 @@ class LLVM_LIBRARY_VISIBILITY Linux : public Generic_ELF {
   bool isNoExecStackDefault() const override;
   bool IsMathErrnoDefault() const override;
   SanitizerMask getSupportedSanitizers() const override;
-  void addProfileRTLibs(const llvm::opt::ArgList &Args,
-                        llvm::opt::ArgStringList &CmdArgs) const override;
   std::string computeSysRoot() const override;

   std::string getDynamicLinker(const llvm::opt::ArgList &Args) const override;

clang/test/Driver/coverage-ld.c

Lines changed: 0 additions & 2 deletions
@@ -12,9 +12,7 @@
 // RUN: --sysroot=%S/Inputs/basic_linux_tree \
 // RUN: | FileCheck --check-prefix=CHECK-LINUX-I386 %s
 //
-// CHECK-LINUX-I386-NOT: "-u__llvm_profile_runtime"
 // CHECK-LINUX-I386: /Inputs/resource_dir{{/|\\\\}}lib{{/|\\\\}}linux{{/|\\\\}}libclang_rt.profile-i386.a"
-// CHECK-LINUX-I386-NOT: "-u__llvm_profile_runtime"
 // CHECK-LINUX-I386: "-lc"
 //
 // RUN: %clang -no-canonical-prefixes %s -### -o %t.o 2>&1 \

clang/test/Driver/fuchsia.c

Lines changed: 0 additions & 2 deletions
@@ -249,7 +249,6 @@
 // RUN: -fuse-ld=lld 2>&1 \
 // RUN: | FileCheck %s -check-prefix=CHECK-PROFRT-AARCH64
 // CHECK-PROFRT-AARCH64: "-resource-dir" "[[RESOURCE_DIR:[^"]+]]"
-// CHECK-PROFRT-AARCH64: "-u__llvm_profile_runtime"
 // CHECK-PROFRT-AARCH64: "[[RESOURCE_DIR]]{{/|\\\\}}lib{{/|\\\\}}aarch64-fuchsia{{/|\\\\}}libclang_rt.profile.a"

 // RUN: %clang %s -### --target=x86_64-fuchsia \
@@ -258,5 +257,4 @@
 // RUN: -fuse-ld=lld 2>&1 \
 // RUN: | FileCheck %s -check-prefix=CHECK-PROFRT-X86_64
 // CHECK-PROFRT-X86_64: "-resource-dir" "[[RESOURCE_DIR:[^"]+]]"
-// CHECK-PROFRT-X86_64: "-u__llvm_profile_runtime"
 // CHECK-PROFRT-X86_64: "[[RESOURCE_DIR]]{{/|\\\\}}lib{{/|\\\\}}x86_64-fuchsia{{/|\\\\}}libclang_rt.profile.a"

llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp

Lines changed: 6 additions & 7 deletions
@@ -543,8 +543,10 @@ bool InstrProfiling::run(
   UsedVars.clear();
   TT = Triple(M.getTargetTriple());

-  // Emit the runtime hook even if no counters are present.
-  bool MadeChange = emitRuntimeHook();
+  bool MadeChange = false;
+  // Emit the runtime hook even if no counters are present in Mach-O.
+  if (TT.isOSBinFormatMachO())
+    MadeChange = emitRuntimeHook();

   // Improve compile time by avoiding linear scans when there is no work.
   GlobalVariable *CoverageNamesVar =
@@ -584,6 +586,8 @@ bool InstrProfiling::run(
   emitVNodes();
   emitNameData();
   emitRegistration();
+  if (!TT.isOSBinFormatMachO())
+    emitRuntimeHook();
   emitUses();
   emitInitialization();
   return true;
@@ -1058,11 +1062,6 @@ void InstrProfiling::emitRegistration() {
 }

 bool InstrProfiling::emitRuntimeHook() {
-  // We expect the linker to be invoked with -u<hook_var> flag for Linux or
-  // Fuchsia, in which case there is no need to emit the user function.
-  if (TT.isOSLinux() || TT.isOSFuchsia())
-    return false;
-
   // If the module's provided its own runtime, we don't need to do anything.
   if (M->getGlobalVariable(getInstrProfRuntimeHookVarName()))
     return false;
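
The net effect of the three hunks above can be summarized as a small decision function. This is a simplified model, not the pass itself: `ObjectFormat`, `passEmitsRuntimeHook`, `HasProfilingWork`, and `ModuleDefinesHook` are hypothetical names, and the sketch assumes that the "no work" early exit hinted at in the context lines returns before the later `emitRuntimeHook()` call is reached.

```cpp
#include <cassert>

// Hypothetical stand-in for the object formats the pass distinguishes.
enum class ObjectFormat { ELF, COFF, MachO };

// Simplified model of whether the pass declares __llvm_profile_runtime in a
// module after this change. HasProfilingWork stands in for the pass's
// early-exit check; ModuleDefinesHook for a module that already provides the
// runtime hook variable itself.
bool passEmitsRuntimeHook(ObjectFormat Format, bool HasProfilingWork,
                          bool ModuleDefinesHook) {
  if (ModuleDefinesHook)
    return false; // emitRuntimeHook() returns early in this case
  if (Format == ObjectFormat::MachO)
    return true; // emitted up front, even when no counters are present
  return HasProfilingWork; // emitted late, only when there is work to lower
}
```

Under these assumptions, an uninstrumented ELF or COFF module never references the hook, which is what allows the linker to leave the profile runtime out of the binary.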

llvm/test/Instrumentation/InstrProfiling/linkage.ll

Lines changed: 0 additions & 1 deletion
@@ -10,7 +10,6 @@
 ; RUN: opt < %s -mtriple=x86_64-pc-win32-coff -passes=instrprof -S | FileCheck %s --check-prefixes=COFF

 ; MACHO: @__llvm_profile_runtime = external global i32
-; ELF-NOT: @__llvm_profile_runtime = external global i32

 ; ELF: $__profd_foo = comdat noduplicates
 ; ELF: $__profd_foo_weak = comdat noduplicates

llvm/test/Instrumentation/InstrProfiling/profiling.ll

Lines changed: 12 additions & 13 deletions
@@ -1,14 +1,11 @@
 ; RUN: opt < %s -mtriple=x86_64 -passes=instrprof -S | FileCheck %s --check-prefixes=CHECK,ELF,ELF_GENERIC
 ; RUN: opt < %s -mtriple=x86_64-linux -passes=instrprof -S | FileCheck %s --check-prefixes=CHECK,ELF_LINUX
 ; RUN: opt < %s -mtriple=x86_64-apple-macosx10.10.0 -passes=instrprof -S | FileCheck %s --check-prefixes=CHECK,MACHO
-; RUN: opt < %s -mtriple=x86_64-windows -passes=instrprof -S | FileCheck %s --check-prefixes=CHECK,WIN
+; RUN: opt < %s -mtriple=x86_64-windows -passes=instrprof -S | FileCheck %s --check-prefixes=CHECK,COFF

 ; RUN: opt < %s -mtriple=x86_64-apple-macosx10.10.0 -instrprof -S | FileCheck %s

-; ELF_GENERIC: @__llvm_profile_runtime = external global i32
-; ELF_LINUX-NOT: @__llvm_profile_runtime
 ; MACHO: @__llvm_profile_runtime = external global i32
-; WIN: @__llvm_profile_runtime = external global i32

 @__profn_foo = hidden constant [3 x i8] c"foo"
 ; CHECK-NOT: __profn_foo
@@ -21,8 +18,8 @@
 ; ELF: @__profd_foo = hidden {{.*}}, section "__llvm_prf_data", comdat, align 8
 ; MACHO: @__profc_foo = hidden global [1 x i64] zeroinitializer, section "__DATA,__llvm_prf_cnts", align 8
 ; MACHO: @__profd_foo = hidden {{.*}}, section "__DATA,__llvm_prf_data,regular,live_support", align 8
-; WIN: @__profc_foo = internal global [1 x i64] zeroinitializer, section ".lprfc$M", align 8
-; WIN: @__profd_foo = internal {{.*}}, section ".lprfd$M", align 8
+; COFF: @__profc_foo = internal global [1 x i64] zeroinitializer, section ".lprfc$M", align 8
+; COFF: @__profd_foo = internal {{.*}}, section ".lprfd$M", align 8
 define void @foo() {
   call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_foo, i32 0, i32 0), i64 0, i32 1, i32 0)
   ret void
@@ -32,8 +29,8 @@ define void @foo() {
 ; ELF: @__profd_bar = hidden {{.*}}, section "__llvm_prf_data", comdat, align 8
 ; MACHO: @__profc_bar = hidden global [1 x i64] zeroinitializer, section "__DATA,__llvm_prf_cnts", align 8
 ; MACHO: @__profd_bar = hidden {{.*}}, section "__DATA,__llvm_prf_data,regular,live_support", align 8
-; WIN: @__profc_bar = internal global [1 x i64] zeroinitializer, section ".lprfc$M", align 8
-; WIN: @__profd_bar = internal {{.*}}, section ".lprfd$M", align 8
+; COFF: @__profc_bar = internal global [1 x i64] zeroinitializer, section ".lprfc$M", align 8
+; COFF: @__profd_bar = internal {{.*}}, section ".lprfd$M", align 8
 define void @bar() {
   call void @llvm.instrprof.increment(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @__profn_bar, i32 0, i32 0), i64 0, i32 1, i32 0)
   ret void
@@ -43,8 +40,8 @@ define void @bar() {
 ; ELF: @__profd_baz = hidden {{.*}}, section "__llvm_prf_data", comdat, align 8
 ; MACHO: @__profc_baz = hidden global [3 x i64] zeroinitializer, section "__DATA,__llvm_prf_cnts", align 8
 ; MACHO: @__profd_baz = hidden {{.*}}, section "__DATA,__llvm_prf_data,regular,live_support", align 8
-; WIN: @__profc_baz = internal global [3 x i64] zeroinitializer, section ".lprfc$M", align 8
-; WIN: @__profd_baz = internal {{.*}}, section ".lprfd$M", align 8
+; COFF: @__profc_baz = internal global [3 x i64] zeroinitializer, section ".lprfc$M", align 8
+; COFF: @__profd_baz = internal {{.*}}, section ".lprfd$M", align 8
 define void @baz() {
   call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_baz, i32 0, i32 0), i64 0, i32 3, i32 0)
   call void @llvm.instrprof.increment(i8* getelementptr inbounds ([3 x i8], [3 x i8]* @__profn_baz, i32 0, i32 0), i64 0, i32 3, i32 1)
@@ -54,12 +51,14 @@ define void @baz() {

 declare void @llvm.instrprof.increment(i8*, i64, i32, i32)

-; ELF: @llvm.compiler.used = appending global {{.*}} @__llvm_profile_runtime {{.*}} @__profd_foo {{.*}} @__profd_bar {{.*}} @__profd_baz
+; ELF: @__llvm_profile_runtime = external global i32
+; COFF: @__llvm_profile_runtime = external global i32
+
+; ELF: @llvm.compiler.used = appending global {{.*}} @__profd_foo {{.*}} @__profd_bar {{.*}} @__profd_baz {{.*}} @__llvm_profile_runtime
 ; MACHO: @llvm.used = appending global {{.*}} @__llvm_profile_runtime {{.*}} @__profd_foo {{.*}} @__profd_bar {{.*}} @__profd_baz
-; WIN: @llvm.used = appending global {{.*}} @__llvm_profile_runtime {{.*}} @__profd_foo {{.*}} @__profd_bar {{.*}} @__profd_baz
+; COFF: @llvm.used = appending global {{.*}} @__profd_foo {{.*}} @__profd_bar {{.*}} @__profd_baz {{.*}} @__llvm_profile_runtime

 ; ELF_GENERIC: define internal void @__llvm_profile_register_functions() unnamed_addr {
-; ELF_GENERIC-NEXT: call void @__llvm_profile_register_function(i8* bitcast (i32* @__llvm_profile_runtime to i8*))
 ; ELF_GENERIC-NEXT: call void @__llvm_profile_register_function(i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @__profd_foo to i8*))
 ; ELF_GENERIC-NEXT: call void @__llvm_profile_register_function(i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @__profd_bar to i8*))
 ; ELF_GENERIC-NEXT: call void @__llvm_profile_register_function(i8* bitcast ({ i64, i64, i64*, i8*, i8*, i32, [2 x i16] }* @__profd_baz to i8*))
