Commit 07c9189

[PGO] Exposing PGO's Counter Reset and File Dumping APIs (#76471)
This PR exposes four PGO functions

- `__llvm_profile_set_filename`
- `__llvm_profile_reset_counters`
- `__llvm_profile_dump`
- `__llvm_orderfile_dump`

to user programs through the new header `instr_prof_interface.h` under `compiler-rt/include/profile`. This way, the user can include the header `profile/instr_prof_interface.h` to introduce these four names to their programs.

Additionally, this PR defines macro `__LLVM_INSTR_PROFILE_GENERATE` when the program is compiled with profile generation, and defines macro `__LLVM_INSTR_PROFILE_USE` when the program is compiled with profile use. `__LLVM_INSTR_PROFILE_GENERATE` together with `instr_prof_interface.h` define the PGO functions only when the program is compiled with profile generation. When profile generation is off, these PGO functions are defined away and leave no trace in the user's program.

Background: https://discourse.llvm.org/t/pgo-are-the-llvm-profile-functions-stable-c-apis-across-llvm-releases/75832
1 parent a85cbe8 commit 07c9189

12 files changed: +310 -57 lines changed

clang-tools-extra/clang-tidy/ExpandModularHeadersPPCallbacks.cpp

Lines changed: 1 addition & 1 deletion
@@ -100,7 +100,7 @@ ExpandModularHeadersPPCallbacks::ExpandModularHeadersPPCallbacks(
       /*OwnsHeaderSearch=*/false);
   PP->Initialize(Compiler.getTarget(), Compiler.getAuxTarget());
   InitializePreprocessor(*PP, *PO, Compiler.getPCHContainerReader(),
-                         Compiler.getFrontendOpts());
+                         Compiler.getFrontendOpts(), Compiler.getCodeGenOpts());
   ApplyHeaderSearchOptions(*HeaderInfo, *HSO, LangOpts,
                            Compiler.getTarget().getTriple());
 }

clang/docs/UsersManual.rst

Lines changed: 104 additions & 0 deletions
@@ -2809,6 +2809,110 @@ indexed format, regardless whether it is produced by frontend or the IR pass.
   overhead. ``prefer-atomic`` will be transformed to ``atomic`` when supported
   by the target, or ``single`` otherwise.
 
+Fine Tuning Profile Collection
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The PGO infrastructure provides user program knobs to fine tune profile
+collection. Specifically, the PGO runtime provides the following functions
+that can be used to control the regions in the program where profiles should
+be collected.
+
+* ``void __llvm_profile_set_filename(const char *Name)``: changes the name of
+  the profile file to ``Name``.
+* ``void __llvm_profile_reset_counters(void)``: resets all counters to zero.
+* ``int __llvm_profile_dump(void)``: write the profile data to disk.
+* ``int __llvm_orderfile_dump(void)``: write the order file to disk.
+
+For example, the following pattern can be used to skip profiling program
+initialization, profile two specific hot regions, and skip profiling program
+cleanup:
+
+.. code-block:: c
+
+  int main() {
+    initialize();
+
+    // Reset all profile counters to 0 to omit profile collected during
+    // initialize()'s execution.
+    __llvm_profile_reset_counters();
+    ... hot region 1
+    // Dump the profile for hot region 1.
+    __llvm_profile_set_filename("region1.profraw");
+    __llvm_profile_dump();
+
+    // Reset counters before proceeding to hot region 2.
+    __llvm_profile_reset_counters();
+    ... hot region 2
+    // Dump the profile for hot region 2.
+    __llvm_profile_set_filename("region2.profraw");
+    __llvm_profile_dump();
+
+    // Since the profile has been dumped, no further profile data
+    // will be collected beyond the above __llvm_profile_dump().
+    cleanup();
+    return 0;
+  }
+
+These APIs' names can be introduced to user programs in two ways.
+They can be declared as weak symbols on platforms which support
+treating weak symbols as ``null`` during linking. For example, the user can
+have
+
+.. code-block:: c
+
+  __attribute__((weak)) int __llvm_profile_dump(void);
+
+  // Then later in the same source file
+  if (__llvm_profile_dump)
+    if (__llvm_profile_dump() != 0) { ... }
+  // The first if condition tests if the symbol is actually defined.
+  // Profile dumping only happens if the symbol is defined. Hence,
+  // the user program works correctly during normal (not profile-generate)
+  // executions.
+
+Alternatively, the user program can include the header
+``profile/instr_prof_interface.h``, which contains the API names. For example,
+
+.. code-block:: c
+
+  #include "profile/instr_prof_interface.h"
+
+  // Then later in the same source file
+  if (__llvm_profile_dump() != 0) { ... }
+
+The user code does not need to check if the API names are defined, because
+these names are automatically replaced by ``(0)`` or the equivalence of noop
+if the ``clang`` is not compiling for profile generation.
+
+Such replacement can happen because ``clang`` adds one of two macros depending
+on the ``-fprofile-generate`` and the ``-fprofile-use`` flags.
+
+* ``__LLVM_INSTR_PROFILE_GENERATE``: defined when one of
+  ``-fprofile[-instr]-generate``/``-fcs-profile-generate`` is in effect.
+* ``__LLVM_INSTR_PROFILE_USE``: defined when one of
+  ``-fprofile-use``/``-fprofile-instr-use`` is in effect.
+
+The two macros can be used to provide more flexibility so a user program
+can execute code specifically intended for profile generate or profile use.
+For example, a user program can have special logging during profile generate:
+
+.. code-block:: c
+
+  #if __LLVM_INSTR_PROFILE_GENERATE
+  expensive_logging_of_full_program_state();
+  #endif
+
+The logging is automatically excluded during a normal build of the program,
+hence it does not impact performance during a normal execution.
+
+It is advised to use such fine tuning only in a program's cold regions. The weak
+symbols can introduce extra control flow (the ``if`` checks), while the macros
+(hence declarations they guard in ``profile/instr_prof_interface.h``)
+can change the control flow of the functions that use them between profile
+generation and profile use (which can lead to discarded counters in such
+functions). Using these APIs in the program's cold regions introduces less
+overhead and leads to more optimized code.
+
 Disabling Instrumentation
 ^^^^^^^^^^^^^^^^^^^^^^^^^
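The weak-symbol approach described in the documentation above can be wrapped in a small helper so that the null check lives in one cold function. The following is a minimal sketch, assuming a platform where an undefined weak symbol resolves to a null address (for example an ELF target such as Linux). The helper names `dump_profile_if_available` and `report_profile_status` are hypothetical and not part of the PGO runtime.

```c
#include <stdio.h>

/* Weak declaration: when the profile runtime is not linked in, the address of
 * this symbol is null (assumption: the platform supports weak undefined
 * symbols, as the documentation requires). */
__attribute__((weak)) int __llvm_profile_dump(void);

/* Hypothetical helper, not part of the PGO runtime.
 * Returns 1 if a profile was written, 0 if the runtime is absent,
 * and -1 if the runtime reported a failure. */
int dump_profile_if_available(void) {
  if (!__llvm_profile_dump)
    return 0; /* Normal (not profile-generate) build: nothing to do. */
  return __llvm_profile_dump() == 0 ? 1 : -1;
}

/* Hypothetical cold-path caller, e.g. invoked once after a benchmark phase. */
void report_profile_status(void) {
  switch (dump_profile_if_available()) {
  case 1:
    puts("profile written");
    break;
  case 0:
    puts("profile runtime not linked; skipping dump");
    break;
  default:
    puts("profile dump failed");
    break;
  }
}
```

Keeping the check inside one rarely called function follows the advice above to confine these APIs to cold regions, since the extra `if` then stays out of hot code.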

clang/include/clang/Basic/CodeGenOptions.h

Lines changed: 6 additions & 0 deletions
@@ -494,6 +494,12 @@ class CodeGenOptions : public CodeGenOptionsBase {
     return getProfileInstr() == ProfileCSIRInstr;
   }
 
+  /// Check if any form of instrumentation is on.
+  bool hasProfileInstr() const {
+    return hasProfileClangInstr() || hasProfileIRInstr() ||
+           hasProfileCSIRInstr();
+  }
+
   /// Check if Clang profile use is on.
   bool hasProfileClangUse() const {
     return getProfileUse() == ProfileClangInstr;

clang/include/clang/Frontend/Utils.h

Lines changed: 3 additions & 1 deletion
@@ -43,12 +43,14 @@ class PCHContainerReader;
 class Preprocessor;
 class PreprocessorOptions;
 class PreprocessorOutputOptions;
+class CodeGenOptions;
 
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
 void InitializePreprocessor(Preprocessor &PP, const PreprocessorOptions &PPOpts,
                             const PCHContainerReader &PCHContainerRdr,
-                            const FrontendOptions &FEOpts);
+                            const FrontendOptions &FEOpts,
+                            const CodeGenOptions &CodeGenOpts);
 
 /// DoPrintPreprocessedInput - Implement -E mode.
 void DoPrintPreprocessedInput(Preprocessor &PP, raw_ostream *OS,

clang/lib/Frontend/CompilerInstance.cpp

Lines changed: 1 addition & 1 deletion
@@ -470,7 +470,7 @@ void CompilerInstance::createPreprocessor(TranslationUnitKind TUKind) {
 
   // Predefine macros and configure the preprocessor.
   InitializePreprocessor(*PP, PPOpts, getPCHContainerReader(),
-                         getFrontendOpts());
+                         getFrontendOpts(), getCodeGenOpts());
 
   // Initialize the header search object. In CUDA compilations, we use the aux
   // triple (the host triple) to initialize our header search, since we need to

clang/lib/Frontend/InitPreprocessor.cpp

Lines changed: 19 additions & 4 deletions
@@ -1364,12 +1364,22 @@ static void InitializePredefinedMacros(const TargetInfo &TI,
   TI.getTargetDefines(LangOpts, Builder);
 }
 
+static void InitializePGOProfileMacros(const CodeGenOptions &CodeGenOpts,
+                                       MacroBuilder &Builder) {
+  if (CodeGenOpts.hasProfileInstr())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_GENERATE");
+
+  if (CodeGenOpts.hasProfileIRUse() || CodeGenOpts.hasProfileClangUse())
+    Builder.defineMacro("__LLVM_INSTR_PROFILE_USE");
+}
+
 /// InitializePreprocessor - Initialize the preprocessor getting it and the
 /// environment ready to process a single file.
-void clang::InitializePreprocessor(
-    Preprocessor &PP, const PreprocessorOptions &InitOpts,
-    const PCHContainerReader &PCHContainerRdr,
-    const FrontendOptions &FEOpts) {
+void clang::InitializePreprocessor(Preprocessor &PP,
+                                   const PreprocessorOptions &InitOpts,
+                                   const PCHContainerReader &PCHContainerRdr,
+                                   const FrontendOptions &FEOpts,
+                                   const CodeGenOptions &CodeGenOpts) {
   const LangOptions &LangOpts = PP.getLangOpts();
   std::string PredefineBuffer;
   PredefineBuffer.reserve(4080);
@@ -1416,6 +1426,11 @@ void clang::InitializePreprocessor(
   InitializeStandardPredefinedMacros(PP.getTargetInfo(), PP.getLangOpts(),
                                      FEOpts, Builder);
 
+  // The PGO instrumentation profile macros are driven by options
+  // -fprofile[-instr]-generate/-fcs-profile-generate/-fprofile[-instr]-use,
+  // hence they are not guarded by InitOpts.UsePredefines.
+  InitializePGOProfileMacros(CodeGenOpts, Builder);
+
   // Add on the predefines from the driver. Wrap in a #line directive to report
   // that they come from the command line.
   Builder.append("# 1 \"<command line>\" 1");
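From the user's side, the effect of `InitializePGOProfileMacros` is that a translation unit can test the two macros with the preprocessor. The sketch below shows one way a program might record which mode it was built in. The enum and the functions `pgo_build_flags`, `pgo_flags_name`, and `print_pgo_build_mode` are hypothetical names introduced for illustration only. Handling both macros at once is a defensive assumption, since the two checks in the hunk above are independent of each other.

```c
#include <stdio.h>

/* Hypothetical bit flags describing how this translation unit was built. */
enum { PGO_NONE = 0, PGO_GENERATE = 1, PGO_USE = 2 };

/* Returns a bitmask of the PGO macros visible in this translation unit. */
int pgo_build_flags(void) {
  int Flags = PGO_NONE;
#ifdef __LLVM_INSTR_PROFILE_GENERATE
  Flags |= PGO_GENERATE;
#endif
#ifdef __LLVM_INSTR_PROFILE_USE
  Flags |= PGO_USE;
#endif
  return Flags;
}

/* Maps a bitmask from pgo_build_flags() to a printable name. */
const char *pgo_flags_name(int Flags) {
  switch (Flags) {
  case PGO_NONE:
    return "none";
  case PGO_GENERATE:
    return "generate";
  case PGO_USE:
    return "use";
  default:
    return "generate+use";
  }
}

/* Prints the build mode, e.g. from a --version or diagnostics path. */
void print_pgo_build_mode(void) {
  printf("PGO build mode: %s\n", pgo_flags_name(pgo_build_flags()));
}
```

Because the macros are resolved at preprocessing time, a build without any profile flag compiles `pgo_build_flags` down to a function that returns `PGO_NONE`.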

clang/test/Profile/c-general.c

Lines changed: 10 additions & 0 deletions
@@ -9,6 +9,16 @@
 // Also check compatibility with older profiles.
 // RUN: %clang_cc1 -triple x86_64-apple-macosx10.9 -main-file-name c-general.c %s -o - -emit-llvm -fprofile-instrument-use-path=%S/Inputs/c-general.profdata.v1 | FileCheck -allow-deprecated-dag-overlap -check-prefix=PGOUSE %s
 
+// RUN: %clang -fprofile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fprofile-instr-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+// RUN: %clang -fcs-profile-generate -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFGENMACRO %s
+//
+// RUN: %clang -fprofile-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+// RUN: %clang -fprofile-instr-use=%t.profdata -E -dM %s | FileCheck -match-full-lines -check-prefix=PROFUSEMACRO %s
+
+// PROFGENMACRO:#define __LLVM_INSTR_PROFILE_GENERATE 1
+// PROFUSEMACRO:#define __LLVM_INSTR_PROFILE_USE 1
+
 // PGOGEN: @[[SLC:__profc_simple_loops]] = private global [4 x i64] zeroinitializer
 // PGOGEN: @[[IFC:__profc_conditionals]] = private global [13 x i64] zeroinitializer
 // PGOGEN: @[[EEC:__profc_early_exits]] = private global [9 x i64] zeroinitializer

compiler-rt/include/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ endif(COMPILER_RT_BUILD_ORC)
 if (COMPILER_RT_BUILD_PROFILE)
   set(PROFILE_HEADERS
     profile/InstrProfData.inc
+    profile/instr_prof_interface.h
     )
 endif(COMPILER_RT_BUILD_PROFILE)

compiler-rt/include/profile/instr_prof_interface.h

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+/*===---- instr_prof_interface.h - Instrumentation PGO User Program API ----===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ *
+ * This header provides a public interface for fine-grained control of counter
+ * reset and profile dumping. These interface functions can be directly called
+ * in user programs.
+ *
+\*===---------------------------------------------------------------------===*/
+
+#ifndef COMPILER_RT_INSTR_PROFILING
+#define COMPILER_RT_INSTR_PROFILING
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#ifdef __LLVM_INSTR_PROFILE_GENERATE
+// Profile file reset and dump interfaces.
+// When `-fprofile[-instr]-generate`/`-fcs-profile-generate` is in effect,
+// clang defines __LLVM_INSTR_PROFILE_GENERATE to pick up the API calls.
+
+/*!
+ * \brief Set the filename for writing instrumentation data.
+ *
+ * Sets the filename to be used for subsequent calls to
+ * \a __llvm_profile_write_file().
+ *
+ * \c Name is not copied, so it must remain valid. Passing NULL resets the
+ * filename logic to the default behaviour.
+ *
+ * Note: There may be multiple copies of the profile runtime (one for each
+ * instrumented image/DSO). This API only modifies the filename within the
+ * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
+ */
+void __llvm_profile_set_filename(const char *Name);
+
+/*!
+ * \brief Interface to set all PGO counters to zero for the current process.
+ *
+ */
+void __llvm_profile_reset_counters(void);
+
+/*!
+ * \brief this is a wrapper interface to \c __llvm_profile_write_file.
+ * After this interface is invoked, an already dumped flag will be set
+ * so that profile won't be dumped again during program exit.
+ * Invocation of interface __llvm_profile_reset_counters will clear
+ * the flag. This interface is designed to be used to collect profile
+ * data from user selected hot regions. The use model is
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 1
+ *      __llvm_profile_dump();
+ *      .. some other code
+ *      __llvm_profile_reset_counters();
+ *      ... hot region 2
+ *      __llvm_profile_dump();
+ *
+ * It is expected that on-line profile merging is on with \c %m specifier
+ * used in profile filename . If merging is not turned on, user is expected
+ * to invoke __llvm_profile_set_filename to specify different profile names
+ * for different regions before dumping to avoid profile write clobbering.
+ */
+int __llvm_profile_dump(void);
+
+// Interface to dump the current process' order file to disk.
+int __llvm_orderfile_dump(void);
+
+#else
+
+#define __llvm_profile_set_filename(Name)
+#define __llvm_profile_reset_counters()
+#define __llvm_profile_dump() (0)
+#define __llvm_orderfile_dump() (0)
+
+#endif
+
+#ifdef __cplusplus
+} // extern "C"
+#endif
+
+#endif
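The use model in the header comment (reset, run a hot region, set a per-region filename, dump) can be sketched as one helper. This is a minimal illustration under stated assumptions, not the project's recommended wrapper. `profile_region`, `sample_region`, `WorkDone`, and `HAVE_INSTR_PROF_INTERFACE` are hypothetical names. The fallback macros are an assumption that mirrors the header's `#else` branch so the sketch still compiles when the header is not on the include path. The filename is kept in static storage because the header states that `Name` is not copied.

```c
#include <stdio.h>

/* Use the real header when the compiler can find it. */
#if defined(__has_include)
#if __has_include("profile/instr_prof_interface.h")
#include "profile/instr_prof_interface.h"
#define HAVE_INSTR_PROF_INTERFACE 1
#endif
#endif

#ifndef HAVE_INSTR_PROF_INTERFACE
/* Assumption: header unavailable. Mirror its #else branch so that the calls
 * below compile away to nothing. */
#define __llvm_profile_set_filename(Name)
#define __llvm_profile_reset_counters()
#define __llvm_profile_dump() (0)
#endif

/* Name is not copied by __llvm_profile_set_filename, so it must outlive the
 * call; static storage satisfies that. */
static char ProfileName[64];

/* Hypothetical helper: profiles one region into "region<Id>.profraw".
 * Returns the result of __llvm_profile_dump(), or 0 when profiling is off. */
int profile_region(int RegionId, void (*Region)(void)) {
  __llvm_profile_reset_counters();
  Region();
  snprintf(ProfileName, sizeof(ProfileName), "region%d.profraw", RegionId);
  __llvm_profile_set_filename(ProfileName);
  return __llvm_profile_dump();
}

/* Hypothetical stand-in for a hot region. */
int WorkDone = 0;
void sample_region(void) { WorkDone++; }
```

A caller would invoke `profile_region(1, sample_region)` from a cold driver function. Giving each region its own filename follows the header's note about avoiding profile write clobbering when on-line merging with `%m` is not in use.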

compiler-rt/lib/profile/InstrProfiling.h

Lines changed: 11 additions & 50 deletions
@@ -12,6 +12,17 @@
 #include "InstrProfilingPort.h"
 #include <stdio.h>
 
+// Make sure __LLVM_INSTR_PROFILE_GENERATE is always defined before
+// including instr_prof_interface.h so the interface functions are
+// declared correctly for the runtime.
+// __LLVM_INSTR_PROFILE_GENERATE is always `#undef`ed after the header,
+// because compiler-rt does not support profiling the profiling runtime itself.
+#ifndef __LLVM_INSTR_PROFILE_GENERATE
+#define __LLVM_INSTR_PROFILE_GENERATE
+#endif
+#include "profile/instr_prof_interface.h"
+#undef __LLVM_INSTR_PROFILE_GENERATE
+
 #define INSTR_PROF_VISIBILITY COMPILER_RT_VISIBILITY
 #include "profile/InstrProfData.inc"
 
@@ -100,12 +111,6 @@ ValueProfNode *__llvm_profile_begin_vnodes();
 ValueProfNode *__llvm_profile_end_vnodes();
 uint32_t *__llvm_profile_begin_orderfile();
 
-/*!
- * \brief Clear profile counters to zero.
- *
- */
-void __llvm_profile_reset_counters(void);
-
 /*!
  * \brief Merge profile data from buffer.
  *
@@ -156,50 +161,6 @@ void __llvm_profile_instrument_target_value(uint64_t TargetValue, void *Data,
 int __llvm_profile_write_file(void);
 
 int __llvm_orderfile_write_file(void);
-/*!
- * \brief this is a wrapper interface to \c __llvm_profile_write_file.
- * After this interface is invoked, an already dumped flag will be set
- * so that profile won't be dumped again during program exit.
- * Invocation of interface __llvm_profile_reset_counters will clear
- * the flag. This interface is designed to be used to collect profile
- * data from user selected hot regions. The use model is
- *      __llvm_profile_reset_counters();
- *      ... hot region 1
- *      __llvm_profile_dump();
- *      .. some other code
- *      __llvm_profile_reset_counters();
- *      ... hot region 2
- *      __llvm_profile_dump();
- *
- * It is expected that on-line profile merging is on with \c %m specifier
- * used in profile filename . If merging is not turned on, user is expected
- * to invoke __llvm_profile_set_filename to specify different profile names
- * for different regions before dumping to avoid profile write clobbering.
- */
-int __llvm_profile_dump(void);
-
-int __llvm_orderfile_dump(void);
-
-/*!
- * \brief Set the filename for writing instrumentation data.
- *
- * Sets the filename to be used for subsequent calls to
- * \a __llvm_profile_write_file().
- *
- * \c Name is not copied, so it must remain valid. Passing NULL resets the
- * filename logic to the default behaviour.
- *
- * Note: There may be multiple copies of the profile runtime (one for each
- * instrumented image/DSO). This API only modifies the filename within the
- * copy of the runtime available to the calling image.
- *
- * Warning: This is a no-op if continuous mode (\ref
- * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
- * that in continuous mode, profile counters are mmap()'d to the profile at
- * program initialization time. Support for transferring the mmap'd profile
- * counts to a new file has not been implemented.
- */
-void __llvm_profile_set_filename(const char *Name);
 
 /*!
  * \brief Set the FILE object for writing instrumentation data. Return 0 if set
