Commit d889d1e

[profile] Add a mode to continuously sync counter updates to a file
Add support for continuously syncing profile counter updates to a file.

The motivation for this is that programs do not always exit cleanly. On iOS, for example, programs are usually killed via a signal from the OS. Running atexit() handlers after catching a signal is unreliable, so some method for progressively writing out profile data is necessary.

The approach taken here is to mmap() the `__llvm_prf_cnts` section onto a raw profile. To do this, the linker must page-align the counter and data sections, and the runtime must ensure that counters are mapped to a page-aligned offset within a raw profile.

Continuous mode is (for the moment) incompatible with the online merging mode. This limitation is lifted in https://reviews.llvm.org/D69586.

Continuous mode is also (for the moment) incompatible with value profiling, as I'm not sure whether there is interest in this and the implementation may be tricky.

As I have not been able to test extensively on non-Darwin platforms, only Darwin support is included for the moment. However, continuous mode may "just work" without modification on Linux and some UNIX-likes. AIUI the default value for the GNU linker's `--section-alignment` flag is set to the page size on many systems. This appears to be true for LLD as well, as its `no_nmagic` option is on by default.

Continuous mode will not "just work" on Fuchsia or Windows, as it's not possible to mmap() a section on these platforms. There is a proposal to add a layer of indirection to the profile instrumentation to support these platforms.

rdar://54210980

Differential Revision: https://reviews.llvm.org/D68351
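The mmap() approach described in the commit message can be illustrated with a small standalone sketch. It is not the runtime's code: the function name `demo_counter_via_file` and the temporary path are made up for illustration, the counters live in a fresh anonymous mapping rather than in an existing `__llvm_prf_cnts` section, and the file offset is simply 0 (which is trivially page-aligned). The explicit msync() is only there so the read-back is well-defined on any POSIX system; the point of a MAP_SHARED file mapping is that stores to the counters dirty pages that belong to the file rather than to the process.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo, not part of compiler-rt: map `num_counters` 64-bit
 * counters onto offset 0 of a temporary file, bump one counter through the
 * mapping, then read the value back through the file descriptor.
 * Returns the value found in the file, or -1 on any failure. */
int64_t demo_counter_via_file(size_t num_counters, size_t index) {
  assert(index < num_counters);

  char path[] = "/tmp/prof-demo-XXXXXX"; /* assumed scratch location */
  int fd = mkstemp(path);
  if (fd < 0)
    return -1;
  unlink(path); /* the file disappears once fd is closed */

  size_t len = num_counters * sizeof(uint64_t);
  if (ftruncate(fd, (off_t)len) != 0) {
    close(fd);
    return -1;
  }

  /* MAP_SHARED ties the pages to the file instead of to private memory. */
  uint64_t *counters =
      mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (counters == MAP_FAILED) {
    close(fd);
    return -1;
  }

  counters[index] += 1; /* the kind of store instrumented code performs */

  /* Force write-back so the pread() below is portable. */
  if (msync(counters, len, MS_SYNC) != 0) {
    munmap(counters, len);
    close(fd);
    return -1;
  }

  uint64_t on_disk = 0;
  ssize_t n = pread(fd, &on_disk, sizeof on_disk,
                    (off_t)(index * sizeof(uint64_t)));
  munmap(counters, len);
  close(fd);
  return n == (ssize_t)sizeof on_disk ? (int64_t)on_disk : -1;
}
```

Calling `demo_counter_via_file(8, 3)` should report a count of 1 read from the file. The real runtime has the extra constraint that the counters already exist at a fixed address inside the image, which is why the section and the file offset both need to be page-aligned.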
1 parent ade776b commit d889d1e

27 files changed: +690 -25 lines changed

clang/docs/SourceBasedCodeCoverage.rst

Lines changed: 10 additions & 0 deletions
@@ -87,6 +87,16 @@ directory structure will be created. Additionally, the following special
   be between 1 and 9. The merge pool specifier can only occur once per filename
   pattern.
 
+* "%c" expands out to nothing, but enables a mode in which profile counter
+  updates are continuously synced to a file. This means that if the
+  instrumented program crashes, or is killed by a signal, perfect coverage
+  information can still be recovered. Continuous mode is not yet compatible with
+  the "%Nm" merging mode described above, does not support value profiling for
+  PGO, and is only supported on Darwin. Support for Linux may be mostly
+  complete but requires testing, and support for Fuchsia/Windows may require
+  more extensive changes: please get involved if you are interested in porting
+  this feature.
+
 .. code-block:: console
 
     # Step 2: Run the program.
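
The documented behavior of "%c" (it contributes no characters to the filename but switches a mode on) can be sketched as follows. This is a hypothetical helper, not the profile runtime's actual pattern parser: the name `demo_strip_continuous` is invented, every other specifier is copied through untouched, and no escaping rules are modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the runtime's parser): copy `pattern` into `out`,
 * dropping every "%c" and setting *continuous to 1 if one was seen.
 * All other characters, including other '%' specifiers, are copied verbatim.
 * Returns 0 on success, -1 if `out` is too small. */
int demo_strip_continuous(const char *pattern, char *out, size_t out_size,
                          int *continuous) {
  assert(pattern != NULL && out != NULL && continuous != NULL);

  size_t n = 0;
  *continuous = 0;
  for (size_t i = 0; pattern[i] != '\0'; ++i) {
    if (pattern[i] == '%' && pattern[i + 1] == 'c') {
      *continuous = 1;
      ++i; /* skip the 'c'; the specifier expands to nothing */
      continue;
    }
    if (n + 1 >= out_size)
      return -1; /* no room for this character plus the terminator */
    out[n++] = pattern[i];
  }
  if (n >= out_size)
    return -1;
  out[n] = '\0';
  return 0;
}
```

With this sketch, the pattern `cov-%c.profraw` yields the filename `cov-.profraw` with the continuous flag set, while `plain.profraw` is returned unchanged with the flag clear.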

clang/lib/Driver/ToolChains/Darwin.cpp

Lines changed: 35 additions & 1 deletion
@@ -19,6 +19,7 @@
 #include "clang/Driver/SanitizerArgs.h"
 #include "llvm/ADT/StringSwitch.h"
 #include "llvm/Option/ArgList.h"
+#include "llvm/ProfileData/InstrProf.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/ScopedPrinter.h"
 #include "llvm/Support/TargetParser.h"
@@ -1110,18 +1111,33 @@ static void addExportedSymbol(ArgStringList &CmdArgs, const char *Symbol) {
   CmdArgs.push_back(Symbol);
 }
 
+/// Add a sectalign directive for \p Segment and \p Section to the maximum
+/// expected page size for Darwin.
+///
+/// On iPhone 6+ the max supported page size is 16K. On macOS, the max is 4K.
+/// Use a common alignment constant (16K) for now, and reduce the alignment on
+/// macOS if it proves important.
+static void addSectalignToPage(const ArgList &Args, ArgStringList &CmdArgs,
+                               StringRef Segment, StringRef Section) {
+  for (const char *A : {"-sectalign", Args.MakeArgString(Segment),
+                        Args.MakeArgString(Section), "0x4000"})
+    CmdArgs.push_back(A);
+}
+
 void Darwin::addProfileRTLibs(const ArgList &Args,
                               ArgStringList &CmdArgs) const {
   if (!needsProfileRT(Args)) return;
 
   AddLinkRuntimeLib(Args, CmdArgs, "profile",
                     RuntimeLinkOptions(RLO_AlwaysLink | RLO_FirstLink));
 
+  bool ForGCOV = needsGCovInstrumentation(Args);
+
   // If we have a symbol export directive and we're linking in the profile
   // runtime, automatically export symbols necessary to implement some of the
   // runtime's functionality.
   if (hasExportSymbolDirective(Args)) {
-    if (needsGCovInstrumentation(Args)) {
+    if (ForGCOV) {
       addExportedSymbol(CmdArgs, "___gcov_flush");
       addExportedSymbol(CmdArgs, "_flush_fn_list");
       addExportedSymbol(CmdArgs, "_writeout_fn_list");
@@ -1131,6 +1147,24 @@ void Darwin::addProfileRTLibs(const ArgList &Args,
     }
     addExportedSymbol(CmdArgs, "_lprofDirMode");
   }
+
+  // Align __llvm_prf_{cnts,data} sections to the maximum expected page
+  // alignment. This allows profile counters to be mmap()'d to disk. Note that
+  // it's not enough to just page-align __llvm_prf_cnts: the following section
+  // must also be page-aligned so that its data is not clobbered by mmap().
+  //
+  // The section alignment is only needed when continuous profile sync is
+  // enabled, but this is expected to be the default in Xcode. Specifying the
+  // extra alignment also allows the same binary to be used with/without sync
+  // enabled.
+  if (!ForGCOV) {
+    for (auto IPSK : {llvm::IPSK_cnts, llvm::IPSK_data}) {
+      addSectalignToPage(
+          Args, CmdArgs, "__DATA",
+          llvm::getInstrProfSectionName(IPSK, llvm::Triple::MachO,
+                                        /*AddSegmentInfo=*/false));
+    }
+  }
 }
 
 void DarwinClang::AddLinkSanitizerLibArgs(const ArgList &Args,
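
The choice of a single 0x4000 (16K) alignment for both Darwin page sizes rests on simple arithmetic: an address that is a multiple of 16K is also a multiple of 4K. The following sketch checks that property. The helper names are hypothetical and the sample address is arbitrary; nothing here comes from the driver itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `addr` is a multiple of `align`, where
 * `align` must be a nonzero power of two. */
int demo_is_aligned(uint64_t addr, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: round `addr` up to the next multiple of `align`
 * (a nonzero power of two), roughly what a linker alignment directive asks
 * for when placing a section. */
uint64_t demo_align_up(uint64_t addr, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (addr + align - 1) & ~(align - 1);
}
```

For example, `demo_align_up(0x12345, 0x4000)` gives 0x14000, which `demo_is_aligned` accepts for both 0x1000 (4K pages) and 0x4000 (16K pages). The cost of the common constant is up to 12K of extra padding per section on 4K-page systems, which is the trade-off the comment in `addSectalignToPage` alludes to.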

clang/test/Driver/darwin-ld.c

Lines changed: 6 additions & 0 deletions
@@ -345,6 +345,12 @@
 // RUN: FileCheck -check-prefix=LINK_PROFILE_FIRST %s < %t.log
 // LINK_PROFILE_FIRST: {{ld(.exe)?"}} "{{[^"]+}}libclang_rt.profile_{{[a-z]+}}.a"
 
+// RUN: %clang -target x86_64-apple-darwin12 -fprofile-instr-generate -### %t.o 2> %t.log
+// RUN: FileCheck -check-prefix=PROFILE_SECTALIGN %s < %t.log
+// RUN: %clang -target arm64-apple-ios12 -fprofile-instr-generate -### %t.o 2> %t.log
+// RUN: FileCheck -check-prefix=PROFILE_SECTALIGN %s < %t.log
+// PROFILE_SECTALIGN: "-sectalign" "__DATA" "__llvm_prf_cnts" "0x4000" "-sectalign" "__DATA" "__llvm_prf_data" "0x4000"
+
 // RUN: %clang -target x86_64-apple-darwin12 -fprofile-instr-generate -exported_symbols_list /dev/null -### %t.o 2> %t.log
 // RUN: FileCheck -check-prefix=PROFILE_EXPORT %s < %t.log
 // RUN: %clang -target x86_64-apple-darwin12 -fprofile-instr-generate -Wl,-exported_symbols_list,/dev/null -### %t.o 2> %t.log

compiler-rt/lib/profile/InstrProfData.inc

Lines changed: 3 additions & 1 deletion
@@ -130,7 +130,9 @@ INSTR_PROF_VALUE_NODE(PtrToNodeT, llvm::Type::getInt8PtrTy(Ctx), Next, \
 INSTR_PROF_RAW_HEADER(uint64_t, Magic, __llvm_profile_get_magic())
 INSTR_PROF_RAW_HEADER(uint64_t, Version, __llvm_profile_get_version())
 INSTR_PROF_RAW_HEADER(uint64_t, DataSize, DataSize)
+INSTR_PROF_RAW_HEADER(uint64_t, PaddingBytesBeforeCounters, PaddingBytesBeforeCounters)
 INSTR_PROF_RAW_HEADER(uint64_t, CountersSize, CountersSize)
+INSTR_PROF_RAW_HEADER(uint64_t, PaddingBytesAfterCounters, PaddingBytesAfterCounters)
 INSTR_PROF_RAW_HEADER(uint64_t, NamesSize, NamesSize)
 INSTR_PROF_RAW_HEADER(uint64_t, CountersDelta, (uintptr_t)CountersBegin)
 INSTR_PROF_RAW_HEADER(uint64_t, NamesDelta, (uintptr_t)NamesBegin)
@@ -628,7 +630,7 @@ serializeValueProfDataFrom(ValueProfRecordClosure *Closure,
        (uint64_t)'f' << 16 | (uint64_t)'R' << 8 | (uint64_t)129
 
 /* Raw profile format version (start from 1). */
-#define INSTR_PROF_RAW_VERSION 4
+#define INSTR_PROF_RAW_VERSION 5
 /* Indexed profile format version (start from 1). */
 #define INSTR_PROF_INDEX_VERSION 5
 /* Coverage mapping format version (start from 0). */
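
A reader of the version 5 raw format has to account for the two new padding fields when locating the counters and names. The sketch below shows one way that offset arithmetic could look, assuming the section order implied by the size computation in InstrProfilingBuffer.c (header, data records, padding, counters, padding, names). `DemoRawHeader` is a simplified stand-in containing only the fields visible in this hunk, and the header and data-record sizes are passed in as parameters because their real values are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the raw profile header: only the fields visible
 * in the hunk above. The real header may contain further fields. */
typedef struct {
  uint64_t Magic;
  uint64_t Version;
  uint64_t DataSize; /* number of data records, not bytes */
  uint64_t PaddingBytesBeforeCounters;
  uint64_t CountersSize; /* number of counters, not bytes */
  uint64_t PaddingBytesAfterCounters;
  uint64_t NamesSize; /* bytes */
  uint64_t CountersDelta;
  uint64_t NamesDelta;
} DemoRawHeader;

/* File offset of the first counter. `header_bytes` and `data_record_bytes`
 * are supplied by the caller (assumed values in any example). */
uint64_t demo_counters_offset(const DemoRawHeader *H, uint64_t header_bytes,
                              uint64_t data_record_bytes) {
  assert(H != NULL);
  return header_bytes + H->DataSize * data_record_bytes +
         H->PaddingBytesBeforeCounters;
}

/* File offset of the names blob, which follows the counters and their
 * trailing padding. */
uint64_t demo_names_offset(const DemoRawHeader *H, uint64_t header_bytes,
                           uint64_t data_record_bytes) {
  return demo_counters_offset(H, header_bytes, data_record_bytes) +
         H->CountersSize * sizeof(uint64_t) + H->PaddingBytesAfterCounters;
}
```

With illustrative sizes of 80 bytes for the header and 48 bytes per data record, three records and 3872 bytes of leading padding place the counters at offset 4096; ten counters followed by 4016 bytes of padding place the names at offset 8192.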

compiler-rt/lib/profile/InstrProfiling.h

Lines changed: 46 additions & 0 deletions
@@ -38,6 +38,22 @@ typedef struct ValueProfNode {
 #include "InstrProfData.inc"
 } ValueProfNode;
 
+/*!
+ * \brief Return 1 if profile counters are continuously synced to the raw
+ * profile via an mmap(). This is in contrast to the default mode, in which
+ * the raw profile is written out at program exit time.
+ */
+int __llvm_profile_is_continuous_mode_enabled(void);
+
+/*!
+ * \brief Enable continuous mode.
+ *
+ * See \ref __llvm_profile_is_continuous_mode_enabled. The behavior is undefined
+ * if continuous mode is already enabled, or if it cannot be enabled due to
+ * conflicting options.
+ */
+void __llvm_profile_enable_continuous_mode(void);
+
 /*!
  * \brief Get number of bytes necessary to pad the argument to eight
  * byte boundary.
@@ -159,6 +175,12 @@ int __llvm_orderfile_dump(void);
  * Note: There may be multiple copies of the profile runtime (one for each
  * instrumented image/DSO). This API only modifies the filename within the
  * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
  */
 void __llvm_profile_set_filename(const char *Name);
 
@@ -181,6 +203,12 @@ void __llvm_profile_set_filename(const char *Name);
  * Note: There may be multiple copies of the profile runtime (one for each
  * instrumented image/DSO). This API only modifies the file object within the
  * copy of the runtime available to the calling image.
+ *
+ * Warning: This is a no-op if continuous mode (\ref
+ * __llvm_profile_is_continuous_mode_enabled) is on. The reason for this is
+ * that in continuous mode, profile counters are mmap()'d to the profile at
+ * program initialization time. Support for transferring the mmap'd profile
+ * counts to a new file has not been implemented.
  */
 void __llvm_profile_set_file_object(FILE *File, int EnableMerge);
 
@@ -223,6 +251,24 @@ uint64_t __llvm_profile_get_version(void);
 uint64_t __llvm_profile_get_data_size(const __llvm_profile_data *Begin,
                                       const __llvm_profile_data *End);
 
+/*! \brief Given the sizes of the data and counter information, return the
+ * number of padding bytes before and after the counters, and after the names,
+ * in the raw profile.
+ *
+ * Note: In this context, "size" means "number of entries", i.e. the first two
+ * arguments must be the result of __llvm_profile_get_data_size() and of
+ * (__llvm_profile_end_counters() - __llvm_profile_begin_counters()) resp.
+ *
+ * Note: When mmap() mode is disabled, no padding bytes before/after counters
+ * are needed. However, in mmap() mode, the counter section in the raw profile
+ * must be page-aligned: this API computes the number of padding bytes
+ * needed to achieve that.
+ */
+void __llvm_profile_get_padding_sizes_for_counters(
+    uint64_t DataSize, uint64_t CountersSize, uint64_t NamesSize,
+    uint64_t *PaddingBytesBeforeCounters, uint64_t *PaddingBytesAfterCounters,
+    uint64_t *PaddingBytesAfterNames);
+
 /*!
  * \brief Set the flag that profile data has been dumped to the file.
  * This is useful for users to disable dumping profile data to the file for

compiler-rt/lib/profile/InstrProfilingBuffer.c

Lines changed: 70 additions & 4 deletions
@@ -8,6 +8,27 @@
 
 #include "InstrProfiling.h"
 #include "InstrProfilingInternal.h"
+#include "InstrProfilingPort.h"
+
+/* When continuous mode is enabled (%c), this parameter is set to 1. This is
+ * incompatible with the in-process merging mode. Lifting this restriction
+ * may be complicated, as merging mode requires a lock on the profile, and
+ * mmap() mode would require that lock to be held for the entire process
+ * lifetime.
+ *
+ * This parameter is defined here in InstrProfilingBuffer.o, instead of in
+ * InstrProfilingFile.o, to sequester all libc-dependent code in
+ * InstrProfilingFile.o. The test `instrprof-without-libc` will break if this
+ * layering is violated. */
+static int ContinuouslySyncProfile = 0;
+
+COMPILER_RT_VISIBILITY int __llvm_profile_is_continuous_mode_enabled(void) {
+  return ContinuouslySyncProfile;
+}
+
+COMPILER_RT_VISIBILITY void __llvm_profile_enable_continuous_mode(void) {
+  ContinuouslySyncProfile = 1;
+}
 
 COMPILER_RT_VISIBILITY
 uint64_t __llvm_profile_get_size_for_buffer(void) {
@@ -30,18 +51,63 @@ uint64_t __llvm_profile_get_data_size(const __llvm_profile_data *Begin,
          sizeof(__llvm_profile_data);
 }
 
+/// Calculate the number of padding bytes needed to add to \p Offset in order
+/// for (\p Offset + Padding) to be page-aligned.
+static uint64_t calculateBytesNeededToPageAlign(uint64_t Offset,
+                                                unsigned PageSize) {
+  uint64_t OffsetModPage = Offset % PageSize;
+  if (OffsetModPage > 0)
+    return PageSize - OffsetModPage;
+  return 0;
+}
+
+COMPILER_RT_VISIBILITY
+void __llvm_profile_get_padding_sizes_for_counters(
+    uint64_t DataSize, uint64_t CountersSize, uint64_t NamesSize,
+    uint64_t *PaddingBytesBeforeCounters, uint64_t *PaddingBytesAfterCounters,
+    uint64_t *PaddingBytesAfterNames) {
+  if (!__llvm_profile_is_continuous_mode_enabled()) {
+    *PaddingBytesBeforeCounters = 0;
+    *PaddingBytesAfterCounters = 0;
+    *PaddingBytesAfterNames = __llvm_profile_get_num_padding_bytes(NamesSize);
+    return;
+  }
+
+  // In continuous mode, the file offsets for headers and for the start of
+  // counter sections need to be page-aligned.
+  unsigned PageSize = getpagesize();
+  uint64_t DataSizeInBytes = DataSize * sizeof(__llvm_profile_data);
+  uint64_t CountersSizeInBytes = CountersSize * sizeof(uint64_t);
+  *PaddingBytesBeforeCounters = calculateBytesNeededToPageAlign(
+      sizeof(__llvm_profile_header) + DataSizeInBytes, PageSize);
+  *PaddingBytesAfterCounters =
+      calculateBytesNeededToPageAlign(CountersSizeInBytes, PageSize);
+  *PaddingBytesAfterNames =
+      calculateBytesNeededToPageAlign(NamesSize, PageSize);
+}
+
 COMPILER_RT_VISIBILITY
 uint64_t __llvm_profile_get_size_for_buffer_internal(
     const __llvm_profile_data *DataBegin, const __llvm_profile_data *DataEnd,
     const uint64_t *CountersBegin, const uint64_t *CountersEnd,
     const char *NamesBegin, const char *NamesEnd) {
   /* Match logic in __llvm_profile_write_buffer(). */
   const uint64_t NamesSize = (NamesEnd - NamesBegin) * sizeof(char);
-  const uint8_t Padding = __llvm_profile_get_num_padding_bytes(NamesSize);
+  uint64_t DataSize = __llvm_profile_get_data_size(DataBegin, DataEnd);
+  uint64_t CountersSize = CountersEnd - CountersBegin;
+
+  /* Determine how much padding is needed before/after the counters and after
+   * the names. */
+  uint64_t PaddingBytesBeforeCounters, PaddingBytesAfterCounters,
+      PaddingBytesAfterNames;
+  __llvm_profile_get_padding_sizes_for_counters(
+      DataSize, CountersSize, NamesSize, &PaddingBytesBeforeCounters,
+      &PaddingBytesAfterCounters, &PaddingBytesAfterNames);
+
   return sizeof(__llvm_profile_header) +
-         (__llvm_profile_get_data_size(DataBegin, DataEnd) *
-          sizeof(__llvm_profile_data)) +
-         (CountersEnd - CountersBegin) * sizeof(uint64_t) + NamesSize + Padding;
+         (DataSize * sizeof(__llvm_profile_data)) + PaddingBytesBeforeCounters +
+         (CountersSize * sizeof(uint64_t)) + PaddingBytesAfterCounters +
+         NamesSize + PaddingBytesAfterNames;
 }
 
 COMPILER_RT_VISIBILITY
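
The continuous-mode padding arithmetic in this file is easier to follow with concrete numbers. The sketch below mirrors the three padding computations and the final size sum, but it is a standalone illustration: all names are hypothetical, the page size is an explicit parameter instead of getpagesize(), and the header and data sizes are passed in as byte counts because the real struct sizes are not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>

/* Same idea as calculateBytesNeededToPageAlign(): bytes to add to `offset`
 * so that the sum is a multiple of `page_size`. */
uint64_t demo_bytes_to_page_align(uint64_t offset, uint64_t page_size) {
  assert(page_size > 0);
  uint64_t rem = offset % page_size;
  return rem ? page_size - rem : 0;
}

typedef struct {
  uint64_t before_counters;
  uint64_t after_counters;
  uint64_t after_names;
} DemoPadding;

/* Continuous-mode padding for the given section sizes (all in bytes). */
DemoPadding demo_continuous_padding(uint64_t header_bytes, uint64_t data_bytes,
                                    uint64_t counters_bytes,
                                    uint64_t names_bytes, uint64_t page_size) {
  DemoPadding p;
  p.before_counters =
      demo_bytes_to_page_align(header_bytes + data_bytes, page_size);
  p.after_counters = demo_bytes_to_page_align(counters_bytes, page_size);
  p.after_names = demo_bytes_to_page_align(names_bytes, page_size);
  return p;
}

/* Total raw profile size, summed in the same order as the return statement
 * of __llvm_profile_get_size_for_buffer_internal() above. */
uint64_t demo_total_size(uint64_t header_bytes, uint64_t data_bytes,
                         uint64_t counters_bytes, uint64_t names_bytes,
                         uint64_t page_size) {
  DemoPadding p = demo_continuous_padding(header_bytes, data_bytes,
                                          counters_bytes, names_bytes,
                                          page_size);
  return header_bytes + data_bytes + p.before_counters + counters_bytes +
         p.after_counters + names_bytes + p.after_names;
}
```

Taking an 80-byte header, 144 bytes of data, 80 bytes of counters, 100 bytes of names and a 4096-byte page (all illustrative values), the paddings come out as 3872, 4016 and 3996 bytes, so the counters start at offset 4096, the names at 8192, and the whole profile occupies exactly three pages (12288 bytes).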
