Commit 7df28fd

[SHT_LLVM_BB_ADDR_MAP][AsmPrinter] Implements PGOAnalysisMap emitting in AsmPrinter with tests. (llvm#75202)
Uses machine analyses to emit PGOAnalysisMap into the bb-addr-map ELF section. Implements FileCheck tests to verify that the new fields are emitted.

This patch emits optional PGO-related analyses into the bb-addr-map ELF section during AsmPrinter. It currently supports Function Entry Count, Machine Block Frequencies, and Machine Branch Probabilities. Each is independently enabled via the `feature` byte of `bb-addr-map` for the given function.

Part of [RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis](https://discourse.llvm.org/t/rfc-pgo-accuracy-metrics-emitting-and-evaluating-branch-and-block-analysis/73902).
1 parent ddfbca8 commit 7df28fd

4 files changed: +292 -5 lines changed

llvm/docs/Extensions.rst

Lines changed: 84 additions & 0 deletions
@@ -451,6 +451,90 @@ Example:
451451
.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size
452452
.byte y # BB_1 metadata
453453
454+
PGO Analysis Map
455+
""""""""""""""""
456+
457+
PGO related analysis data can be emitted after each function within the
458+
``SHT_LLVM_BB_ADDR_MAP`` through the optional ``pgo-analysis-map`` flag.
459+
Supported analyses currently are Function Entry Count, Basic Block Frequencies,
460+
and Branch Probabilities.
461+
462+
Each analysis is enabled or disabled via a bit in the feature byte. Currently
463+
those bits are:
464+
465+
#. Function Entry Count - Number of times the function was called as taken
466+
from a PGO profile. This will always be zero if PGO was not used or the
467+
function was not encountered in the profile.
468+
469+
#. Basic Block Frequencies - Encoded as raw block frequency value taken from
470+
MBFI analysis. This value is an integer that encodes the relative frequency
471+
compared to the entry block. More information can be found in
472+
'llvm/Support/BlockFrequency.h'.
473+
474+
#. Branch Probabilities - Encoded as raw numerator for branch probability
475+
taken from MBPI analysis. This value is the numerator for a fixed point ratio
476+
defined in 'llvm/Support/BranchProbability.h'. It indicates the probability
477+
that the block is followed by a given successor block during execution.
478+
479+
This extra data requires version 2 or above. This is necessary since successors
480+
of basic blocks won't know their index but will know their BB ID.
481+
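The feature byte described above can be read as a three-bit mask. The sketch below is a standalone illustration, not the `object::PGOAnalysisMap::Features` implementation: the struct and function names are hypothetical, and the bit positions (1, 2, 4) are taken from the feature values the tests in this commit expect for `func-entry-count`, `bb-freq`, and `br-prob`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the decoded flags. The field names mirror the
// ones the patch reads (FuncEntryCount, BBFreq, BrProb).
struct PgoFeatures {
  bool FuncEntryCount;
  bool BBFreq;
  bool BrProb;
  bool Valid; // false when bits outside the three known ones are set
};

// Decode the feature byte. Rejecting unknown bits is an assumption of this
// sketch, not behavior confirmed by the patch.
inline PgoFeatures decodePgoFeatures(uint8_t Byte) {
  const uint8_t KnownMask = 0x7;
  PgoFeatures F;
  F.FuncEntryCount = (Byte & 0x1) != 0; // bit 0
  F.BBFreq = (Byte & 0x2) != 0;         // bit 1
  F.BrProb = (Byte & 0x4) != 0;         // bit 2
  F.Valid = (Byte & ~KnownMask) == 0;
  return F;
}
```

With this layout, a feature byte of 7 enables all three analyses and a byte of 3 enables the entry count and block frequencies only.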
482+
Example of BBAddrMap with PGO data:
483+
484+
.. code-block:: gas
485+
486+
.section ".llvm_bb_addr_map","",@llvm_bb_addr_map
487+
.byte 2 # version number
488+
.byte 7 # feature byte - PGO analyses enabled mask
489+
.quad .Lfunc_begin0 # address of the function
490+
.uleb128 4 # number of basic blocks
491+
# BB record for BB_0
492+
.uleb128 0 # BB_0 BB ID
493+
.uleb128 .Lfunc_begin0-.Lfunc_begin0 # BB_0 offset relative to function entry (always zero)
494+
.uleb128 .LBB_END0_0-.Lfunc_begin0 # BB_0 size
495+
.byte 0x18 # BB_0 metadata (multiple successors)
496+
# BB record for BB_1
497+
.uleb128 1 # BB_1 BB ID
498+
.uleb128 .LBB0_1-.LBB_END0_0 # BB_1 offset relative to the end of last block (BB_0).
499+
.uleb128 .LBB_END0_1-.LBB0_1 # BB_1 size
500+
.byte 0x0 # BB_1 metadata (two successors)
501+
# BB record for BB_2
502+
.uleb128 2 # BB_2 BB ID
503+
.uleb128 .LBB0_2-.LBB_END0_1 # BB_2 offset relative to the end of last block (BB_1).
504+
.uleb128 .LBB_END0_2-.LBB0_2 # BB_2 size
505+
.byte 0x0 # BB_2 metadata (one successor)
506+
# BB record for BB_3
507+
.uleb128 3 # BB_3 BB ID
508+
.uleb128 .LBB0_3-.LBB_END0_2 # BB_3 offset relative to the end of last block (BB_2).
509+
.uleb128 .LBB_END0_3-.LBB0_3 # BB_3 size
510+
.byte 0x0 # BB_3 metadata (zero successors)
511+
# PGO Analysis Map
512+
.uleb128 1000 # function entry count (only when enabled)
513+
# PGO data record for BB_0
514+
.uleb128 1000 # BB_0 basic block frequency (only when enabled)
515+
.uleb128 3 # BB_0 successors count (only enabled with branch probabilities)
516+
.uleb128 1 # BB_0 successor 1 BB ID (only enabled with branch probabilities)
517+
.uleb128 0x22222222 # BB_0 successor 1 branch probability (only enabled with branch probabilities)
518+
.uleb128 2 # BB_0 successor 2 BB ID (only enabled with branch probabilities)
519+
.uleb128 0x33333333 # BB_0 successor 2 branch probability (only enabled with branch probabilities)
520+
.uleb128 3 # BB_0 successor 3 BB ID (only enabled with branch probabilities)
521+
.uleb128 0xaaaaaaaa # BB_0 successor 3 branch probability (only enabled with branch probabilities)
522+
# PGO data record for BB_1
523+
.uleb128 133 # BB_1 basic block frequency (only when enabled)
524+
.uleb128 2 # BB_1 successors count (only enabled with branch probabilities)
525+
.uleb128 2 # BB_1 successor 1 BB ID (only enabled with branch probabilities)
526+
.uleb128 0x11111111 # BB_1 successor 1 branch probability (only enabled with branch probabilities)
527+
.uleb128 3 # BB_1 successor 2 BB ID (only enabled with branch probabilities)
528+
.uleb128 0x11111111 # BB_1 successor 2 branch probability (only enabled with branch probabilities)
529+
# PGO data record for BB_2
530+
.uleb128 18 # BB_2 basic block frequency (only when enabled)
531+
.uleb128 1 # BB_2 successors count (only enabled with branch probabilities)
532+
.uleb128 3 # BB_2 successor 1 BB ID (only enabled with branch probabilities)
533+
.uleb128 0xffffffff # BB_2 successor 1 branch probability (only enabled with branch probabilities)
534+
# PGO data record for BB_3
535+
.uleb128 1000 # BB_3 basic block frequency (only when enabled)
536+
.uleb128 0 # BB_3 successors count (only enabled with branch probabilities)
537+
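The record layout in the example above (optional entry count, then per block an optional frequency and an optional successor list) can be walked with a small reader. This is a minimal sketch under stated assumptions: all type and function names are hypothetical, the input holds only the PGO analysis map portion, the number of basic blocks is already known from the preceding BB records, and no bounds or overflow checks are performed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SuccessorEntry { uint64_t BBID; uint64_t ProbNumerator; };
struct BlockEntry { uint64_t Frequency = 0; std::vector<SuccessorEntry> Successors; };
struct PgoMap { uint64_t EntryCount = 0; std::vector<BlockEntry> Blocks; };

// Read one ULEB128 value at Pos and advance Pos past it.
// Assumes well-formed input (sketch only).
inline uint64_t readULEB128(const std::vector<uint8_t> &Data, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (true) {
    uint8_t Byte = Data[Pos++];
    Value |= static_cast<uint64_t>(Byte & 0x7f) << Shift;
    if ((Byte & 0x80) == 0)
      break;
    Shift += 7;
  }
  return Value;
}

// Features uses the mask from the feature byte: 1 = entry count,
// 2 = block frequency, 4 = branch probability.
inline PgoMap parsePgoMap(const std::vector<uint8_t> &Data, uint8_t Features,
                          size_t NumBlocks) {
  PgoMap Map;
  size_t Pos = 0;
  if (Features & 1)
    Map.EntryCount = readULEB128(Data, Pos);
  if ((Features & 6) == 0)
    return Map; // no per-block records were emitted
  for (size_t I = 0; I < NumBlocks; ++I) {
    BlockEntry Block;
    if (Features & 2)
      Block.Frequency = readULEB128(Data, Pos);
    if (Features & 4) {
      uint64_t Count = readULEB128(Data, Pos);
      for (uint64_t S = 0; S < Count; ++S) {
        SuccessorEntry Succ;
        Succ.BBID = readULEB128(Data, Pos);
        Succ.ProbNumerator = readULEB128(Data, Pos);
        Block.Successors.push_back(Succ);
      }
    }
    Map.Blocks.push_back(Block);
  }
  return Map;
}
```

Because every field is optional, the reader has to consult the feature mask before each read; the same bytes parse differently under a different mask.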
454538
``SHT_LLVM_OFFLOADING`` Section (offloading data)
455539
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
456540
This section stores the binary data used to perform offloading device linking

llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp

Lines changed: 77 additions & 3 deletions
@@ -40,6 +40,7 @@
4040
#include "llvm/CodeGen/GCMetadataPrinter.h"
4141
#include "llvm/CodeGen/LazyMachineBlockFrequencyInfo.h"
4242
#include "llvm/CodeGen/MachineBasicBlock.h"
43+
#include "llvm/CodeGen/MachineBranchProbabilityInfo.h"
4344
#include "llvm/CodeGen/MachineConstantPool.h"
4445
#include "llvm/CodeGen/MachineDominators.h"
4546
#include "llvm/CodeGen/MachineFrameInfo.h"
@@ -140,6 +141,26 @@ static cl::opt<std::string> BasicBlockProfileDump(
140141
"performed with -basic-block-sections=labels. Enabling this "
141142
"flag during in-process ThinLTO is not supported."));
142143

144+
// This is a replication of fields of object::PGOAnalysisMap::Features. It
145+
// should match the order of the fields so that
146+
// `object::PGOAnalysisMap::Features::decode(PgoAnalysisMapFeatures.getBits())`
147+
// succeeds.
148+
enum class PGOMapFeaturesEnum {
149+
FuncEntryCount,
150+
BBFreq,
151+
BrProb,
152+
};
153+
static cl::bits<PGOMapFeaturesEnum> PgoAnalysisMapFeatures(
154+
"pgo-analysis-map", cl::Hidden, cl::CommaSeparated,
155+
cl::values(clEnumValN(PGOMapFeaturesEnum::FuncEntryCount,
156+
"func-entry-count", "Function Entry Count"),
157+
clEnumValN(PGOMapFeaturesEnum::BBFreq, "bb-freq",
158+
"Basic Block Frequency"),
159+
clEnumValN(PGOMapFeaturesEnum::BrProb, "br-prob",
160+
"Branch Probability")),
161+
cl::desc("Enable extended information within the BBAddrMap that is "
162+
"extracted from PGO related analysis."));
163+
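As a rough standalone model of what the `cl::bits` option above produces, the sketch below turns a comma-separated value list into a mask in which each enumerator sets bit `1 << value`. It does not use the LLVM CommandLine library: `parsePgoAnalysisMapOption` is a hypothetical helper, and returning -1 for an unknown name is an assumption of this sketch rather than LLVM's error handling.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Same enumerator order as in the patch; the order determines the bit index.
enum class PGOMapFeaturesEnum { FuncEntryCount, BBFreq, BrProb };

// Returns the feature mask, or -1 if a name is not recognized.
inline int parsePgoAnalysisMapOption(const std::string &Arg) {
  int Mask = 0;
  std::stringstream Stream(Arg);
  std::string Name;
  while (std::getline(Stream, Name, ',')) {
    PGOMapFeaturesEnum Feature;
    if (Name == "func-entry-count")
      Feature = PGOMapFeaturesEnum::FuncEntryCount;
    else if (Name == "bb-freq")
      Feature = PGOMapFeaturesEnum::BBFreq;
    else if (Name == "br-prob")
      Feature = PGOMapFeaturesEnum::BrProb;
    else
      return -1;
    Mask |= 1 << static_cast<int>(Feature);
  }
  return Mask;
}
```

Under this model, `func-entry-count,bb-freq` yields 3 and all three names yield 7, which matches the feature bytes the tests in this commit check for.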
143164
const char DWARFGroupName[] = "dwarf";
144165
const char DWARFGroupDescription[] = "DWARF Emission";
145166
const char DbgTimerName[] = "emit";
@@ -428,6 +449,7 @@ void AsmPrinter::getAnalysisUsage(AnalysisUsage &AU) const {
428449
AU.addRequired<MachineOptimizationRemarkEmitterPass>();
429450
AU.addRequired<GCModuleInfo>();
430451
AU.addRequired<LazyMachineBlockFrequencyInfoPass>();
452+
AU.addRequired<MachineBranchProbabilityInfo>();
431453
}
432454

433455
bool AsmPrinter::doInitialization(Module &M) {
@@ -1379,7 +1401,8 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
13791401
uint8_t BBAddrMapVersion = OutStreamer->getContext().getBBAddrMapVersion();
13801402
OutStreamer->emitInt8(BBAddrMapVersion);
13811403
OutStreamer->AddComment("feature");
1382-
OutStreamer->emitInt8(0);
1404+
auto FeaturesBits = static_cast<uint8_t>(PgoAnalysisMapFeatures.getBits());
1405+
OutStreamer->emitInt8(FeaturesBits);
13831406
OutStreamer->AddComment("function address");
13841407
OutStreamer->emitSymbolValue(FunctionSymbol, getPointerSize());
13851408
OutStreamer->AddComment("number of basic blocks");
@@ -1409,6 +1432,51 @@ void AsmPrinter::emitBBAddrMapSection(const MachineFunction &MF) {
14091432
OutStreamer->emitULEB128IntValue(getBBAddrMapMetadata(MBB));
14101433
PrevMBBEndSymbol = MBB.getEndSymbol();
14111434
}
1435+
1436+
if (FeaturesBits != 0) {
1437+
assert(BBAddrMapVersion >= 2 &&
1438+
"PGOAnalysisMap only supports version 2 or later");
1439+
1440+
auto FeatEnable =
1441+
cantFail(object::PGOAnalysisMap::Features::decode(FeaturesBits));
1442+
1443+
if (FeatEnable.FuncEntryCount) {
1444+
OutStreamer->AddComment("function entry count");
1445+
auto MaybeEntryCount = MF.getFunction().getEntryCount();
1446+
OutStreamer->emitULEB128IntValue(
1447+
MaybeEntryCount ? MaybeEntryCount->getCount() : 0);
1448+
}
1449+
const MachineBlockFrequencyInfo *MBFI =
1450+
FeatEnable.BBFreq
1451+
? &getAnalysis<LazyMachineBlockFrequencyInfoPass>().getBFI()
1452+
: nullptr;
1453+
const MachineBranchProbabilityInfo *MBPI =
1454+
FeatEnable.BrProb ? &getAnalysis<MachineBranchProbabilityInfo>()
1455+
: nullptr;
1456+
1457+
if (FeatEnable.BBFreq || FeatEnable.BrProb) {
1458+
for (const MachineBasicBlock &MBB : MF) {
1459+
if (FeatEnable.BBFreq) {
1460+
OutStreamer->AddComment("basic block frequency");
1461+
OutStreamer->emitULEB128IntValue(
1462+
MBFI->getBlockFreq(&MBB).getFrequency());
1463+
}
1464+
if (FeatEnable.BrProb) {
1465+
unsigned SuccCount = MBB.succ_size();
1466+
OutStreamer->AddComment("basic block successor count");
1467+
OutStreamer->emitULEB128IntValue(SuccCount);
1468+
for (const MachineBasicBlock *SuccMBB : MBB.successors()) {
1469+
OutStreamer->AddComment("successor BB ID");
1470+
OutStreamer->emitULEB128IntValue(SuccMBB->getBBID()->BaseID);
1471+
OutStreamer->AddComment("successor branch probability");
1472+
OutStreamer->emitULEB128IntValue(
1473+
MBPI->getEdgeProbability(&MBB, SuccMBB).getNumerator());
1474+
}
1475+
}
1476+
}
1477+
}
1478+
}
1479+
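Every PGO field in the block above is written with `emitULEB128IntValue`, so field widths vary with the value. The following is a minimal standalone sketch of unsigned LEB128 encoding, useful for reading the byte strings in the tests; `encodeULEB128` here is a local helper written for illustration, not the LLVM function of the same name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode Value as unsigned LEB128: seven payload bits per byte, least
// significant group first, high bit set on every byte except the last.
inline std::vector<uint8_t> encodeULEB128(uint64_t Value) {
  std::vector<uint8_t> Bytes;
  do {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7;
    if (Value != 0)
      Byte |= 0x80; // more bytes follow
    Bytes.push_back(Byte);
  } while (Value != 0);
  return Bytes;
}
```

A small value such as an entry count of 100 fits in one byte, which is why the test expects `.byte 100`, while a branch probability numerator of 0x80000000 takes five bytes (0x80 0x80 0x80 0x80 0x08).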
14121480
OutStreamer->popSection();
14131481
}
14141482

@@ -1934,8 +2002,14 @@ void AsmPrinter::emitFunctionBody() {
19342002

19352003
// Emit section containing BB address offsets and their metadata, when
19362004
// BB labels are requested for this function. Skip empty functions.
1937-
if (MF->hasBBLabels() && HasAnyRealCode)
1938-
emitBBAddrMapSection(*MF);
2005+
if (HasAnyRealCode) {
2006+
if (MF->hasBBLabels())
2007+
emitBBAddrMapSection(*MF);
2008+
else if (PgoAnalysisMapFeatures.getBits() != 0)
2009+
MF->getContext().reportWarning(
2010+
SMLoc(), "pgo-analysis-map is enabled for function " + MF->getName() +
2011+
" but it does not have labels");
2012+
}
19392013

19402014
// Emit sections containing instruction and function PCs.
19412015
emitPCSections(*MF);

llvm/test/CodeGen/X86/basic-block-sections-labels-empty-function.ll

Lines changed: 4 additions & 2 deletions
@@ -1,5 +1,6 @@
11
;; Verify that the BB address map is not emitted for empty functions.
2-
; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s
2+
; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
3+
; RUN: llc < %s -mtriple=x86_64 -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq | FileCheck %s --check-prefixes=CHECK,PGO
34

45
define void @empty_func() {
56
entry:
@@ -19,5 +20,6 @@ entry:
1920
; CHECK: .Lfunc_begin1:
2021
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text{{$}}
2122
; CHECK-NEXT: .byte 2 # version
22-
; CHECK-NEXT: .byte 0 # feature
23+
; BASIC-NEXT: .byte 0 # feature
24+
; PGO-NEXT: .byte 3 # feature
2325
; CHECK-NEXT: .quad .Lfunc_begin1 # function address
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
1+
; Check the basic block sections labels option
2+
; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels | FileCheck %s --check-prefixes=CHECK,BASIC
3+
4+
;; Also verify this holds for all PGO features enabled
5+
; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count,bb-freq,br-prob | FileCheck %s --check-prefixes=CHECK,PGO-ALL,PGO-FEC,PGO-BBF,PGO-BRP
6+
7+
;; Also verify that pgo extension only includes the enabled feature
8+
; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=func-entry-count | FileCheck %s --check-prefixes=CHECK,PGO-FEC,FEC-ONLY
9+
; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=bb-freq | FileCheck %s --check-prefixes=CHECK,PGO-BBF,BBF-ONLY
10+
; RUN: llc < %s -mtriple=x86_64 -function-sections -unique-section-names=true -basic-block-sections=labels -pgo-analysis-map=br-prob | FileCheck %s --check-prefixes=CHECK,PGO-BRP,BRP-ONLY
11+
12+
13+
define void @_Z3bazb(i1 zeroext, i1 zeroext) personality ptr @__gxx_personality_v0 !prof !0 {
14+
br i1 %0, label %3, label %8, !prof !1
15+
16+
3:
17+
%4 = invoke i32 @_Z3barv()
18+
to label %8 unwind label %6
19+
br label %10
20+
21+
6:
22+
landingpad { ptr, i32 }
23+
catch ptr null
24+
br label %12
25+
26+
8:
27+
%9 = call i32 @_Z3foov()
28+
br i1 %1, label %12, label %10, !prof !2
29+
30+
10:
31+
%11 = select i1 %1, ptr blockaddress(@_Z3bazb, %3), ptr blockaddress(@_Z3bazb, %12) ; <ptr> [#uses=1]
32+
indirectbr ptr %11, [label %3, label %12], !prof !3
33+
34+
12:
35+
ret void
36+
}
37+
38+
declare i32 @_Z3barv() #1
39+
40+
declare i32 @_Z3foov() #1
41+
42+
declare i32 @__gxx_personality_v0(...)
43+
44+
!0 = !{!"function_entry_count", i64 100}
45+
!1 = !{!"branch_weights", i32 80, i32 20}
46+
!2 = !{!"branch_weights", i32 70, i32 10}
47+
!3 = !{!"branch_weights", i32 15, i32 5}
48+
49+
; CHECK: .section .text._Z3bazb,"ax",@progbits{{$}}
50+
; CHECK-LABEL: _Z3bazb:
51+
; CHECK-LABEL: .Lfunc_begin0:
52+
; CHECK-LABEL: .LBB_END0_0:
53+
; CHECK-LABEL: .LBB0_1:
54+
; CHECK-LABEL: .LBB_END0_1:
55+
; CHECK-LABEL: .LBB0_2:
56+
; CHECK-LABEL: .LBB_END0_2:
57+
; CHECK-LABEL: .LBB0_3:
58+
; CHECK-LABEL: .LBB_END0_3:
59+
; CHECK-LABEL: .Lfunc_end0:
60+
61+
; CHECK: .section .llvm_bb_addr_map,"o",@llvm_bb_addr_map,.text._Z3bazb{{$}}
62+
; CHECK-NEXT: .byte 2 # version
63+
; BASIC-NEXT: .byte 0 # feature
64+
; PGO-ALL-NEXT: .byte 7 # feature
65+
; FEC-ONLY-NEXT:.byte 1 # feature
66+
; BBF-ONLY-NEXT:.byte 2 # feature
67+
; BRP-ONLY-NEXT:.byte 4 # feature
68+
; CHECK-NEXT: .quad .Lfunc_begin0 # function address
69+
; CHECK-NEXT: .byte 6 # number of basic blocks
70+
; CHECK-NEXT: .byte 0 # BB id
71+
; CHECK-NEXT: .uleb128 .Lfunc_begin0-.Lfunc_begin0
72+
; CHECK-NEXT: .uleb128 .LBB_END0_0-.Lfunc_begin0
73+
; CHECK-NEXT: .byte 8
74+
; CHECK-NEXT: .byte 1 # BB id
75+
; CHECK-NEXT: .uleb128 .LBB0_1-.LBB_END0_0
76+
; CHECK-NEXT: .uleb128 .LBB_END0_1-.LBB0_1
77+
; CHECK-NEXT: .byte 8
78+
; CHECK-NEXT: .byte 3 # BB id
79+
; CHECK-NEXT: .uleb128 .LBB0_2-.LBB_END0_1
80+
; CHECK-NEXT: .uleb128 .LBB_END0_2-.LBB0_2
81+
; CHECK-NEXT: .byte 8
82+
; CHECK-NEXT: .byte 5 # BB id
83+
; CHECK-NEXT: .uleb128 .LBB0_3-.LBB_END0_2
84+
; CHECK-NEXT: .uleb128 .LBB_END0_3-.LBB0_3
85+
; CHECK-NEXT: .byte 1
86+
; CHECK-NEXT: .byte 4 # BB id
87+
; CHECK-NEXT: .uleb128 .LBB0_4-.LBB_END0_3
88+
; CHECK-NEXT: .uleb128 .LBB_END0_4-.LBB0_4
89+
; CHECK-NEXT: .byte 16
90+
; CHECK-NEXT: .byte 2 # BB id
91+
; CHECK-NEXT: .uleb128 .LBB0_5-.LBB_END0_4
92+
; CHECK-NEXT: .uleb128 .LBB_END0_5-.LBB0_5
93+
; CHECK-NEXT: .byte 4
94+
95+
;; PGO Analysis Map
96+
; PGO-FEC-NEXT: .byte 100 # function entry count
97+
; PGO-BBF-NEXT: .ascii "\271\235\376\332\245\200\356\017" # basic block frequency
98+
; PGO-BRP-NEXT: .byte 2 # basic block successor count
99+
; PGO-BRP-NEXT: .byte 1 # successor BB ID
100+
; PGO-BRP-NEXT: .ascii "\346\314\231\263\006" # successor branch probability
101+
; PGO-BRP-NEXT: .byte 3 # successor BB ID
102+
; PGO-BRP-NEXT: .ascii "\232\263\346\314\001" # successor branch probability
103+
; PGO-BBF-NEXT: .ascii "\202\301\341\375\205\200\200\003" # basic block frequency
104+
; PGO-BRP-NEXT: .byte 2 # basic block successor count
105+
; PGO-BRP-NEXT: .byte 3 # successor BB ID
106+
; PGO-BRP-NEXT: .ascii "\200\360\377\377\007" # successor branch probability
107+
; PGO-BRP-NEXT: .byte 2 # successor BB ID
108+
; PGO-BRP-NEXT: .ascii "\200\020" # successor branch probability
109+
; PGO-BBF-NEXT: .ascii "\200\200\200\200\200\200\200 " # basic block frequency
110+
; PGO-BRP-NEXT: .byte 2 # basic block successor count
111+
; PGO-BRP-NEXT: .byte 5 # successor BB ID
112+
; PGO-BRP-NEXT: .ascii "\200\200\200\200\007" # successor branch probability
113+
; PGO-BRP-NEXT: .byte 4 # successor BB ID
114+
; PGO-BRP-NEXT: .ascii "\200\200\200\200\001" # successor branch probability
115+
; PGO-BBF-NEXT: .ascii "\271\235\376\332\245\200\356\017" # basic block frequency
116+
; PGO-BRP-NEXT: .byte 0 # basic block successor count
117+
; PGO-BBF-NEXT: .ascii "\210\214\356\257\200\200\230\002" # basic block frequency
118+
; PGO-BRP-NEXT: .byte 2 # basic block successor count
119+
; PGO-BRP-NEXT: .byte 1 # successor BB ID
120+
; PGO-BRP-NEXT: .ascii "\200\200\200\200\006" # successor branch probability
121+
; PGO-BRP-NEXT: .byte 5 # successor BB ID
122+
; PGO-BRP-NEXT: .ascii "\200\200\200\200\002" # successor branch probability
123+
; PGO-BBF-NEXT: .ascii "\235\323\243\200#" # basic block frequency
124+
; PGO-BRP-NEXT: .byte 1 # basic block successor count
125+
; PGO-BRP-NEXT: .byte 5 # successor BB ID
126+
; PGO-BRP-NEXT: .ascii "\200\200\200\200\b" # successor branch probability
127+
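The branch probability payloads checked above can be related back to the `branch_weights` metadata in the test. The sketch below assumes a fixed-point denominator of 2^31 (my understanding of `llvm/Support/BranchProbability.h`) and round-to-nearest conversion; the helper names are hypothetical. Under those assumptions, weights 80 and 20 give 1717986918 and 429496730, which are the values the first two `successor branch probability` byte strings decode to as ULEB128, and weights 70/10 and 15/5 give 0x70000000/0x10000000 and 0x60000000/0x20000000.

```cpp
#include <cassert>
#include <cstdint>

// Assumed fixed-point denominator for branch probabilities (2^31).
constexpr uint64_t ProbDenominator = uint64_t(1) << 31;

// Convert one branch weight into a numerator over ProbDenominator,
// rounding to nearest. TotalWeight must be nonzero.
inline uint32_t weightToNumerator(uint32_t Weight, uint32_t TotalWeight) {
  return static_cast<uint32_t>(
      (Weight * ProbDenominator + TotalWeight / 2) / TotalWeight);
}

// Interpret a raw numerator as a probability in [0, 1].
inline double numeratorToProbability(uint32_t Numerator) {
  return static_cast<double>(Numerator) /
         static_cast<double>(ProbDenominator);
}
```

A block with a single successor gets numerator 0x80000000 (probability 1), which appears in the test as the five-byte string ending in `\b`.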
