-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Implements PGOBBAddrMap in Object and ObjectYAML with tests [1/5] #71750
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements PGOBBAddrMap in Object and ObjectYAML with tests [1/5] #71750
Conversation
@llvm/pr-subscribers-llvm-binary-utilities @llvm/pr-subscribers-mc Author: Micah Weston (red1bluelost) ChangesA part of RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis. This PR adds the PGOBBAddrMap data structure and implements encoding and decoding through Object and ObjectYAML along with associated tests. Design AlternativeThe current implementation makes an effort to avoid modifying the original BBAddrMap data structure. This keeps the API the same as before for all pre-existing code and avoids unnecessary fields for regular BBAddrMap usage. It comes with the cost of some complexity in sharing encoding/decoding code. Alternatively, we could just place all new fields into BBAddrMap. This makes the encoding/decoding code simpler. It has the cost of changing some behavior in the original API. It also adds unnecessary fields for anyone wanting to just use the regular BBAddrMap. Please let us know if you feel that the alternate design of placing new fields into BBAddrMap is preferable. PR SeriesThe current code for PGOBBAddrMap to be upstreamed is split into five PRs:
If you would like to try testing PGOBBAddrMap locally on a program, PR-3 llvm-readobj is likely the minimum code needed to meaningfully use this feature. Patch is 60.80 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/71750.diff 15 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/ELF.h b/llvm/include/llvm/BinaryFormat/ELF.h
index 3596174f74dde80..299d7e8265317b3 100644
--- a/llvm/include/llvm/BinaryFormat/ELF.h
+++ b/llvm/include/llvm/BinaryFormat/ELF.h
@@ -1037,6 +1037,7 @@ enum : unsigned {
SHT_LLVM_BB_ADDR_MAP = 0x6fff4c0a, // LLVM Basic Block Address Map.
SHT_LLVM_OFFLOADING = 0x6fff4c0b, // LLVM device offloading data.
SHT_LLVM_LTO = 0x6fff4c0c, // .llvm.lto for fat LTO.
+ SHT_LLVM_PGO_BB_ADDR_MAP = 0x6fff4c0d, // LLVM PGO extended BB Addr Map.
// Android's experimental support for SHT_RELR sections.
// https://android.googlesource.com/platform/bionic/+/b7feec74547f84559a1467aca02708ff61346d2a/libc/include/elf.h#512
SHT_ANDROID_RELR = 0x6fffff00, // Relocation entries; only offsets.
diff --git a/llvm/include/llvm/Object/ELF.h b/llvm/include/llvm/Object/ELF.h
index a1cf47a1c4a6173..3768347ee59582b 100644
--- a/llvm/include/llvm/Object/ELF.h
+++ b/llvm/include/llvm/Object/ELF.h
@@ -417,6 +417,12 @@ class ELFFile {
Expected<std::vector<BBAddrMap>>
decodeBBAddrMap(const Elf_Shdr &Sec, const Elf_Shdr *RelaSec = nullptr) const;
+ /// Decodes same as decodeBBAddrMap but also decodes extra information from
+ /// features which are enabled.
+ Expected<std::vector<PGOBBAddrMap>>
+ decodePGOBBAddrMap(const Elf_Shdr &Sec,
+ const Elf_Shdr *RelaSec = nullptr) const;
+
/// Returns a map from every section matching \p IsMatch to its relocation
/// section, or \p nullptr if it has no relocation section. This function
/// returns an error if any of the \p IsMatch calls fail or if it fails to
diff --git a/llvm/include/llvm/Object/ELFObjectFile.h b/llvm/include/llvm/Object/ELFObjectFile.h
index d7947d85739eb3b..d6dd62d487affbe 100644
--- a/llvm/include/llvm/Object/ELFObjectFile.h
+++ b/llvm/include/llvm/Object/ELFObjectFile.h
@@ -114,6 +114,9 @@ class ELFObjectFileBase : public ObjectFile {
// corresponding to the section with that index.
Expected<std::vector<BBAddrMap>>
readBBAddrMap(std::optional<unsigned> TextSectionIndex = std::nullopt) const;
+
+ Expected<std::vector<PGOBBAddrMap>> readPGOBBAddrMap(
+ std::optional<unsigned> TextSectionIndex = std::nullopt) const;
};
class ELFSectionRef : public SectionRef {
diff --git a/llvm/include/llvm/Object/ELFTypes.h b/llvm/include/llvm/Object/ELFTypes.h
index 45fc52288bdd4c0..66880bdfac99581 100644
--- a/llvm/include/llvm/Object/ELFTypes.h
+++ b/llvm/include/llvm/Object/ELFTypes.h
@@ -10,9 +10,12 @@
#define LLVM_OBJECT_ELFTYPES_H
#include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/BitmaskEnum.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/BinaryFormat/ELF.h"
#include "llvm/Object/Error.h"
+#include "llvm/Support/BlockFrequency.h"
+#include "llvm/Support/BranchProbability.h"
#include "llvm/Support/Endian.h"
#include "llvm/Support/Error.h"
#include "llvm/Support/MathExtras.h"
@@ -806,6 +809,11 @@ struct BBAddrMap {
bool HasIndirectBranch : 1; // If this block ends with an indirect branch
// (branch via a register).
+ // Number of bits used when encoding Metadata, that way an extension
+ // can pack into the extra space if possible. This must be updated when
+ // new bits are added here.
+ static constexpr uint32_t NumberOfBits = 5;
+
bool operator==(const Metadata &Other) const {
return HasReturn == Other.HasReturn &&
HasTailCall == Other.HasTailCall && IsEHPad == Other.IsEHPad &&
@@ -865,6 +873,106 @@ struct BBAddrMap {
}
};
+/// An extension of BBAddrMap that holds information relevant to PGO.
+struct PGOBBAddrMap {
+ /// Bitmask of optional features to include in the PGO extended map.
+ enum class Features {
+ None = 0,
+ FuncEntryCnt = (1 << 0),
+ BBFreq = (1 << 1),
+ BrProb = (1 << 2),
+ LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/BrProb),
+ };
+
+ /// Super-set of BBAddrMap::BBEntry with additional fields for block frequency
+ /// and branch probability.
+ struct BBEntry {
+ using BaseMetadata = BBAddrMap::BBEntry::Metadata;
+
+ /// Enum indicating the how many successors a block has. This enum must fit
+ /// into two bits.
+ enum class SuccessorsType {
+ /// None should be present if PGOBBAddrMap has disabled branch
+ /// probability.
+ None = 0,
+ /// Single successor blocks are not present in the successor entries.
+ One = 1,
+ /// Common case for conditional branches to avoid encoding size.
+ Two = 2,
+ /// Uncommon case which needs successor size to be encoded.
+ Multiple = 3,
+ };
+
+ /// Single successor of a given basic block that contains the tag and branch
+ /// probability associated with it.
+ struct SuccessorEntry {
+ /// Unique ID of this successor basic block.
+ uint32_t ID;
+ /// Branch Probability of the edge to this successor taken from MBPI
+ BranchProbability Prob;
+
+ bool operator==(const SuccessorEntry &Other) const {
+ return std::tie(ID, Prob) == std::tie(Other.ID, Other.Prob);
+ }
+ };
+
+ /// Reuse of the fields provided by regular BBAddrMap
+ BBAddrMap::BBEntry Base;
+ /// Block frequency taken from MBFI
+ BlockFrequency BlockFreq;
+ /// List of successors of the current block
+ llvm::SmallVector<SuccessorEntry, 2> Successors;
+
+ /// Converts number of successors into a SuccessorsType.
+ static SuccessorsType getSuccessorsType(unsigned SuccessorsCount) {
+ return SuccessorsCount == 0 ? SuccessorsType::None
+ : SuccessorsCount == 1 ? SuccessorsType::One
+ : SuccessorsCount == 2 ? SuccessorsType::Two
+ : SuccessorsType::Multiple;
+ }
+
+ /// Encodes extra information in the free bits of the base metadata
+ static uint32_t encodeMD(BaseMetadata MD, SuccessorsType SuccType) {
+ return MD.encode() | static_cast<uint32_t>(SuccType)
+ << BaseMetadata::NumberOfBits;
+ }
+
+ /// Extracts successors type then defers all errors to the base metadata
+ static Expected<std::pair<BaseMetadata, SuccessorsType>>
+ decodeMD(uint32_t V) {
+ auto SuccType = SuccessorsType((V >> BaseMetadata::NumberOfBits) & 0b11);
+ V &= ~(0b11 << BaseMetadata::NumberOfBits); // Clear extra bits
+ BaseMetadata MD;
+ if (llvm::Error E = BaseMetadata::decode(V).moveInto(MD))
+ return std::move(E);
+ return std::make_pair(MD, SuccType);
+ }
+
+ bool operator==(const BBEntry &Other) const {
+ return std::tie(Base, BlockFreq, Successors) ==
+ std::tie(Other.Base, Other.BlockFreq, Other.Successors);
+ }
+ };
+ // This field is duplicated from BBAddrMap since this class needs a different
+ // type for the vector of entries.
+ uint64_t Addr; // Function address
+ std::vector<BBEntry> BBEntries; // Extended basic block entries
+ uint64_t FuncEntryCount; // Prof count from IR function
+
+ // Flags to indicate if each PGO related info was enabled in this function
+ bool FuncEntryCountEnabled : 1;
+ bool BBFreqEnabled : 1;
+ bool BBSuccProbEnabled : 1;
+
+ bool operator==(const PGOBBAddrMap &Other) const {
+ return std::tie(Addr, FuncEntryCount, BBEntries, FuncEntryCountEnabled,
+ BBFreqEnabled, BBSuccProbEnabled) ==
+ std::tie(Other.Addr, Other.FuncEntryCount, Other.BBEntries,
+ Other.FuncEntryCountEnabled, Other.BBFreqEnabled,
+ Other.BBSuccProbEnabled);
+ }
+};
+
} // end namespace object.
} // end namespace llvm.
diff --git a/llvm/include/llvm/ObjectYAML/ELFYAML.h b/llvm/include/llvm/ObjectYAML/ELFYAML.h
index 1ba41232f552e3a..3f18d189b7dc870 100644
--- a/llvm/include/llvm/ObjectYAML/ELFYAML.h
+++ b/llvm/include/llvm/ObjectYAML/ELFYAML.h
@@ -156,6 +156,13 @@ struct DynamicEntry {
llvm::yaml::Hex64 Val;
};
+struct BBAddrMapCommonBase {
+ uint8_t Version;
+ llvm::yaml::Hex8 Feature;
+ llvm::yaml::Hex64 Address;
+ std::optional<uint64_t> NumBlocks;
+};
+
struct BBAddrMapEntry {
struct BBEntry {
uint32_t ID;
@@ -163,10 +170,22 @@ struct BBAddrMapEntry {
llvm::yaml::Hex64 Size;
llvm::yaml::Hex64 Metadata;
};
- uint8_t Version;
- llvm::yaml::Hex8 Feature;
- llvm::yaml::Hex64 Address;
- std::optional<uint64_t> NumBlocks;
+ BBAddrMapCommonBase Common;
+ std::optional<std::vector<BBEntry>> BBEntries;
+};
+
+struct PGOBBAddrMapEntry {
+ struct BBEntry {
+ struct SuccessorEntry {
+ uint32_t ID;
+ llvm::yaml::Hex32 BrProb;
+ };
+ BBAddrMapEntry::BBEntry Base;
+ std::optional<uint64_t> BBFreq;
+ std::optional<std::vector<SuccessorEntry>> Successors;
+ };
+ BBAddrMapCommonBase Common;
+ std::optional<uint64_t> FuncEntryCount;
std::optional<std::vector<BBEntry>> BBEntries;
};
@@ -204,6 +223,7 @@ struct Chunk {
DependentLibraries,
CallGraphProfile,
BBAddrMap,
+ PGOBBAddrMap,
// Special chunks.
SpecialChunksStart,
@@ -329,6 +349,20 @@ struct BBAddrMapSection : Section {
}
};
+struct PGOBBAddrMapSection : Section {
+ std::optional<std::vector<PGOBBAddrMapEntry>> Entries;
+
+ PGOBBAddrMapSection() : Section(ChunkKind::PGOBBAddrMap) {}
+
+ std::vector<std::pair<StringRef, bool>> getEntries() const override {
+ return {{"Entries", Entries.has_value()}};
+ };
+
+ static bool classof(const Chunk *S) {
+ return S->Kind == ChunkKind::PGOBBAddrMap;
+ }
+};
+
struct StackSizesSection : Section {
std::optional<std::vector<StackSizeEntry>> Entries;
@@ -737,6 +771,10 @@ bool shouldAllocateFileSpace(ArrayRef<ProgramHeader> Phdrs,
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::StackSizeEntry)
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::BBAddrMapEntry)
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::BBAddrMapEntry::BBEntry)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::PGOBBAddrMapEntry)
+LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::PGOBBAddrMapEntry::BBEntry)
+LLVM_YAML_IS_SEQUENCE_VECTOR(
+ llvm::ELFYAML::PGOBBAddrMapEntry::BBEntry::SuccessorEntry)
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::DynamicEntry)
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::LinkerOption)
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::CallGraphEntryWeight)
@@ -905,6 +943,20 @@ template <> struct MappingTraits<ELFYAML::BBAddrMapEntry::BBEntry> {
static void mapping(IO &IO, ELFYAML::BBAddrMapEntry::BBEntry &Rel);
};
+template <> struct MappingTraits<ELFYAML::PGOBBAddrMapEntry> {
+ static void mapping(IO &IO, ELFYAML::PGOBBAddrMapEntry &Rel);
+};
+
+template <> struct MappingTraits<ELFYAML::PGOBBAddrMapEntry::BBEntry> {
+ static void mapping(IO &IO, ELFYAML::PGOBBAddrMapEntry::BBEntry &Rel);
+};
+
+template <>
+struct MappingTraits<ELFYAML::PGOBBAddrMapEntry::BBEntry::SuccessorEntry> {
+ static void mapping(IO &IO,
+ ELFYAML::PGOBBAddrMapEntry::BBEntry::SuccessorEntry &Rel);
+};
+
template <> struct MappingTraits<ELFYAML::GnuHashHeader> {
static void mapping(IO &IO, ELFYAML::GnuHashHeader &Rel);
};
diff --git a/llvm/lib/MC/MCParser/ELFAsmParser.cpp b/llvm/lib/MC/MCParser/ELFAsmParser.cpp
index dbfe0d83e1b29b1..e4c1cc7763f7804 100644
--- a/llvm/lib/MC/MCParser/ELFAsmParser.cpp
+++ b/llvm/lib/MC/MCParser/ELFAsmParser.cpp
@@ -673,6 +673,8 @@ bool ELFAsmParser::ParseSectionArguments(bool IsPush, SMLoc loc) {
Type = ELF::SHT_LLVM_SYMPART;
else if (TypeName == "llvm_bb_addr_map")
Type = ELF::SHT_LLVM_BB_ADDR_MAP;
+ else if (TypeName == "llvm_pgo_bb_addr_map")
+ Type = ELF::SHT_LLVM_PGO_BB_ADDR_MAP;
else if (TypeName == "llvm_offloading")
Type = ELF::SHT_LLVM_OFFLOADING;
else if (TypeName == "llvm_lto")
diff --git a/llvm/lib/MC/MCSectionELF.cpp b/llvm/lib/MC/MCSectionELF.cpp
index 95fdf33522076e4..02e63f7b03919a8 100644
--- a/llvm/lib/MC/MCSectionELF.cpp
+++ b/llvm/lib/MC/MCSectionELF.cpp
@@ -170,6 +170,8 @@ void MCSectionELF::printSwitchToSection(const MCAsmInfo &MAI, const Triple &T,
OS << "llvm_bb_addr_map";
else if (Type == ELF::SHT_LLVM_BB_ADDR_MAP_V0)
OS << "llvm_bb_addr_map_v0";
+ else if (Type == ELF::SHT_LLVM_PGO_BB_ADDR_MAP)
+ OS << "llvm_pgo_bb_addr_map";
else if (Type == ELF::SHT_LLVM_OFFLOADING)
OS << "llvm_offloading";
else if (Type == ELF::SHT_LLVM_LTO)
diff --git a/llvm/lib/Object/ELF.cpp b/llvm/lib/Object/ELF.cpp
index 0d1862e573712f5..e5b2b38048b9382 100644
--- a/llvm/lib/Object/ELF.cpp
+++ b/llvm/lib/Object/ELF.cpp
@@ -312,6 +312,7 @@ StringRef llvm::object::getELFSectionTypeName(uint32_t Machine, unsigned Type) {
STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_PART_PHDR);
STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_BB_ADDR_MAP_V0);
STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_BB_ADDR_MAP);
+ STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_PGO_BB_ADDR_MAP);
STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_OFFLOADING);
STRINGIFY_ENUM_CASE(ELF, SHT_LLVM_LTO);
STRINGIFY_ENUM_CASE(ELF, SHT_GNU_ATTRIBUTES);
@@ -645,11 +646,179 @@ ELFFile<ELFT>::toMappedAddr(uint64_t VAddr, WarningHandler WarnHandler) const {
return base() + Offset;
}
-template <class ELFT>
-Expected<std::vector<BBAddrMap>>
-ELFFile<ELFT>::decodeBBAddrMap(const Elf_Shdr &Sec,
- const Elf_Shdr *RelaSec) const {
- bool IsRelocatable = getHeader().e_type == ELF::ET_REL;
+// Helper to extract and decode the next ULEB128 value as unsigned int.
+// Returns zero and sets ULEBSizeErr if the ULEB128 value exceeds the unsigned
+// int limit.
+// Also returns zero if ULEBSizeErr is already in an error state.
+// ULEBSizeErr is an out variable if an error occurs.
+template <typename IntTy>
+static IntTy readULEB128As(DataExtractor &Data, DataExtractor::Cursor &Cur,
+ Error &ULEBSizeErr) {
+ static_assert(std::is_unsigned_v<IntTy> &&
+ (std::numeric_limits<IntTy>::radix == 2),
+ "only use unsigned radix 2");
+ // Bail out and do not extract data if ULEBSizeErr is already set.
+ if (ULEBSizeErr)
+ return 0;
+ uint64_t Offset = Cur.tell();
+ uint64_t Value = Data.getULEB128(Cur);
+ if (Value > std::numeric_limits<IntTy>::max()) {
+ ULEBSizeErr = createError("ULEB128 value at offset 0x" +
+ Twine::utohexstr(Offset) + " exceeds UINT" +
+ Twine(std::numeric_limits<IntTy>::digits) +
+ "_MAX (0x" + Twine::utohexstr(Value) + ")");
+ return 0;
+ }
+ return static_cast<IntTy>(Value);
+}
+
+namespace {
+struct BasicAddrMapBBEntry {
+ uint32_t ID;
+ uint32_t Offset;
+ uint32_t Size;
+ uint32_t MD;
+};
+
+template <typename ELFT, typename AddrMap> struct AddrMapDecodeTrait;
+
+template <typename ELFT> struct AddrMapDecodeTrait<ELFT, BBAddrMap> {
+ // Base addr map has no extra data
+ struct ExtraData {};
+
+ static constexpr unsigned SectionID = ELF::SHT_LLVM_BB_ADDR_MAP;
+
+ DataExtractor &Data;
+ DataExtractor::Cursor &Cur;
+ Error &ULEBSizeErr;
+ Error &MetadataDecodeErr;
+
+ uint32_t PrevBBEndOffset = 0;
+
+ AddrMapDecodeTrait(DataExtractor &Data, DataExtractor::Cursor &Cur,
+ Error &ULEBSizeErr, Error &MetadataDecodeErr)
+ : Data(Data), Cur(Cur), ULEBSizeErr(ULEBSizeErr),
+ MetadataDecodeErr(MetadataDecodeErr) {}
+
+ ExtraData decodeExtraDataHeader(uint8_t, uint8_t) {
+ PrevBBEndOffset = 0;
+ return {};
+ }
+
+ // This helper method avoids decoding the metadata so other AddrMaps can use
+ BasicAddrMapBBEntry decodeBasicInfo(uint8_t Version, uint8_t Feature,
+ uint32_t BlockIndex) {
+ uint32_t ID = Version >= 2 ? readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr)
+ : BlockIndex;
+ uint32_t Offset = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+ uint32_t Size = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+ uint32_t MD = readULEB128As<uint32_t>(Data, Cur, ULEBSizeErr);
+ if (Version >= 1) {
+ // Offset is calculated relative to the end of the previous BB.
+ Offset += PrevBBEndOffset;
+ PrevBBEndOffset = Offset + Size;
+ }
+ return {ID, Offset, Size, MD};
+ }
+
+ BBAddrMap::BBEntry decodeBBEntry(uint8_t Version, uint8_t Feature,
+ uint32_t BlockIndex) {
+ auto [ID, Offset, Size, MD] = decodeBasicInfo(Version, Feature, BlockIndex);
+ auto MetadataOrErr = BBAddrMap::BBEntry::Metadata::decode(MD);
+ if (Error E = MetadataOrErr.takeError()) {
+ MetadataDecodeErr = std::move(E);
+ return {{}, {}, {}, {}};
+ }
+ return {ID, Offset, Size, *MetadataOrErr};
+ }
+
+ BBAddrMap makeAddrMap(ExtraData, uint8_t Feature, uint64_t Address,
+ std::vector<BBAddrMap::BBEntry> BBEntries) {
+ return {Address, std::move(BBEntries)};
+ }
+};
+
+template <typename ELFT> struct AddrMapDecodeTrait<ELFT, PGOBBAddrMap> {
+ using Features = PGOBBAddrMap::Features;
+
+ struct ExtraData {
+ uint64_t FuncEntryCount;
+ };
+
+ static constexpr unsigned SectionID = ELF::SHT_LLVM_PGO_BB_ADDR_MAP;
+
+ AddrMapDecodeTrait<ELFT, BBAddrMap> Base;
+
+ AddrMapDecodeTrait(DataExtractor &Data, DataExtractor::Cursor &Cur,
+ Error &ULEBSizeErr, Error &MetadataDecodeErr)
+ : Base(Data, Cur, ULEBSizeErr, MetadataDecodeErr) {}
+
+ ExtraData decodeExtraDataHeader(uint8_t Version, uint8_t Feature) {
+ Base.decodeExtraDataHeader(Version, Feature);
+ if (Version < 2) {
+ // hijack the metadata error if version is too low
+ Base.MetadataDecodeErr =
+ createError("unsupported SHT_LLVM_PGO_BB_ADDR_MAP version: " +
+ Twine(static_cast<int>(Version)));
+ return {};
+ }
+ return {Feature & uint8_t(PGOBBAddrMap::Features::FuncEntryCnt)
+ ? readULEB128As<uint64_t>(Base.Data, Base.Cur, Base.ULEBSizeErr)
+ : 0};
+ }
+
+ PGOBBAddrMap::BBEntry decodeBBEntry(uint8_t Version, uint8_t Feature,
+ uint32_t BlockIndex) {
+ auto [ID, Offset, Size, MD] =
+ Base.decodeBasicInfo(Version, Feature, BlockIndex);
+ auto MetadataOrErr = PGOBBAddrMap::BBEntry::decodeMD(MD);
+ if (Error E = MetadataOrErr.takeError()) {
+ Base.MetadataDecodeErr = std::move(E);
+ return {{{}, {}, {}, {}}, {}, {}};
+ }
+ auto [MetaData, SuccsType] = *MetadataOrErr;
+
+ uint64_t BBF =
+ Feature & uint8_t(PGOBBAddrMap::Features::BBFreq)
+ ? readULEB128As<uint64_t>(Base.Data, Base.Cur, Base.ULEBSizeErr)
+ : 0;
+
+ llvm::SmallVector<PGOBBAddrMap::BBEntry::SuccessorEntry, 2> Successors;
+ if (Feature & uint8_t(PGOBBAddrMap::Features::BrProb)) {
+ auto SuccCount =
+ SuccsType == PGOBBAddrMap::BBEntry::SuccessorsType::Multiple
+ ? readULEB128As<uint64_t>(Base.Data, Base.Cur, Base.ULEBSizeErr)
+ : uint64_t(SuccsType);
+ for (uint64_t I = 0; I < SuccCount; ++I) {
+ uint32_t BBID =
+ readULEB128As<uint32_t>(Base.Data, Base.Cur, Base.ULEBSizeErr);
+ uint32_t BrProb =
+ readULEB128As<uint32_t>(Base.Data, Base.Cur, Base.ULEBSizeErr);
+ Successors.push_back({BBID, BranchProbability::getRaw(BrProb)});
+ }
+ }
+ return PGOBBAddrMap::BBEntry{
+ {ID, Offset, Size, MetaData}, BlockFrequency(BBF), Successors};
+ }
+
+ PGOBBAddrMap makeAddrMap(ExtraData ED, uint8_t Feature, uint64_t Address,
+ std::vector<PGOBBAddrMap::BBEntry> BBEntries) {
+ return {Address,
+ std::move(BBEntries),
+ ED.FuncEntryCount,
+ bool(Feature & uint8_t(Features::FuncEntryCnt)),
+ bool(Feature & uint8_t(Features::BBFreq)),
+ bool(Feature & uint8_t(Features::BrProb))};
+ }
+};
+} // namespace
+
+template <typename AddrMap, typename ELFT>
+static Expected<std::vector<AddrMap>>
+decodeBBAddrMapCommonImpl(const ELFFile<ELFT> &EF,
+ const typename ELFFile<ELFT>::Elf_Shdr &Sec,
+ const typename ELFFile<ELFT>::Elf_Shdr *RelaSec) {
+ bool IsRelocatable = EF.getHeader().e_type == ELF::ET_REL;
// This DenseMap maps the offset of each function (the location of the
// reference to the function in the SHT_LLVM_BB_ADDR_MAP section) to the
@@ -659,57 +828,44 @@ ELFFile<ELFT>::decodeBBAddrMap(const...
[truncated]
|
I've added @rlavaee as a reviewer, as they've done a lot of the work for the BB support in the various tools. I think it makes most sense for him to do the high-level review of this. |
Haven't read the code yet, but this is my high-level comment:
We can distinguish between the ELF representation (SHT_LLVM_BB_ADDR_MAP) and the in-memory representation ( |
I see what you mean by sharing ELF section (SHT_LLVM_BB_ADDR_MAP) rather than having it separate (SHT_PGO_LLVM_BB_ADDR_MAP). We would use feature byte to indicate enabled analyses. I would not be opposed to doing this. If we keep the two classes I'll give a try to combine and have BBAddrMap ignore extra data in the mean time. |
One way to do this is to have the decoding function return two structures For decoding, we can inspect the feature field and populate |
I see. I think that should be better. Using the "PGOFrequencyMap" will definitely help simplify Object's decode/encode. ObjectYAML is the harder area to tackle. I have one more idea to try so that we don't need new fields in yaml::BBAddrMapEntry, but if it fails, it might be cleanest to duplicate the fields just in yaml. I think mapping should be fine. I see two designs.
I'll try building the first designs. |
…for tests. Removes public and adds check for number of bits.
746ac2e
to
989c084
Compare
I went ahead and merged into this PR the design changes suggested @rlavaee. (I called it PGOAnalysisMap instead of PGOFrequencyMap). Using the same ELF section with the feature guard does help to simplify and shorten the code. If the user does not request the PGOAnalysisMap when decoding the BBAddrMap section then the code just ignores any extra data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments - I haven't done a particularly detailed review.
llvm/lib/Object/ELF.cpp
Outdated
static_assert(std::is_unsigned_v<IntTy> && | ||
(std::numeric_limits<IntTy>::radix == 2), | ||
"only use unsigned radix 2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you use enable_if
or similar to prevent this function being used by an incorrect type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That could work but it looks a bit ugly since formats weird and you need bool_constant. Since this method is just a static helper, the static_assert is cleaner and avoids accidental misuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if you're doing the same thing I imagined. I played around with your code, and the template signature just becomes:
template <typename IntTy,
typename = std::enable_if_t<std::is_unsigned_v<IntTy> &&
(std::numeric_limits<IntTy>::radix == 2)>>
Which doesn't seem weirdly formatted to me and doesn't use bool_constant
? I think enable_if
(or any other SFINAE approach) is the more canonical way in C++ of preventing use of a template function rather than a static_assert
.
But anyway, is the radix
part actually needed? What case would it suppress that you think is potentially going to happen? What would be the issue if the function were to be used with such an integer type (if the static_assert/enable_if weren't present)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using std conjunction and forgot I can just do &&. Either way, I do agree checking radix is probably overkill. The guard is primarily there just to make the next person stop and check if it is fine since we only used it with unsigned int values so far.
I switched it to SFINAE with just checking for unsigned.
IO.mapRequired("ID", E.ID); | ||
IO.mapRequired("BrProb", E.BrProb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these fields both REALLY required, or could reasonable defaults (including empty strings, 0 etc) be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are listing the successors in the YAML, these two fields are essential. It's optional to have the list of successor but not optional to have these fields in a successor.
Maybe in the future if SuccessorEntry gets more fields then optional would become worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my concerns.
I wonder if there is any benefit in emitting the PGOAnalysis BB entries next to their BBAddrMap counterparts in the section. I see we are using the Metadata field to emit the successor type (used primarily for succinct decoding/encoding), but we probably shouldn't introduce this kind of coupling.
Here is an alternative:
Emit the entire PGOAnalysisMap entry of each function right after its BBAddrMap (NBlocks and Feature are shared). Define and use your own fancy encoding for successor type and count.
uint64_t Offset = Cur.tell(); | ||
uint64_t Value = Data.getULEB128(Cur); | ||
if (Value > std::numeric_limits<IntTy>::max()) { | ||
ULEBSizeErr = createError("ULEB128 value at offset 0x" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any testing of this error? I couldn't immediately find any in this patch, but I suppose it could exist already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I this is already tested in the regular BBAddrMap tests. Found at ELFObjectFileTest.cpp:564
llvm/lib/Object/ELF.cpp
Outdated
static_assert(std::is_unsigned_v<IntTy> && | ||
(std::numeric_limits<IntTy>::radix == 2), | ||
"only use unsigned radix 2"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if you're doing the same thing I imagined. I played around with your code, and the template signature just becomes:
template <typename IntTy,
typename = std::enable_if_t<std::is_unsigned_v<IntTy> &&
(std::numeric_limits<IntTy>::radix == 2)>>
Which doesn't seem weirdly formatted to me and doesn't use bool_constant
? I think enable_if
(or any other SFINAE approach) is the more canonical way in C++ of preventing use of a template function rather than a static_assert
.
But anyway, is the radix
part actually needed? What case would it suppress that you think is potentially going to happen? What would be the issue if the function were to be used with such an integer type (if the static_assert/enable_if weren't present)?
I'll try that out. Without packing into Metadata there is no space saved for Branch Probability. But that probably doesn't matter much. |
Done. It now encoded PGOAnalysisMap after each BBAddrMap function if it was enabled in Features. |
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updates. No more comments from me, barring one possible nit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the updates. Only one comment from me.
Also, please add the "[SHT_LLVM_BB_ADDR_MAP]" tag to the commit message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two final nitpicks from me (I didn't check to see if there were other cases of the same issue, but if there are, please fix them.
Would you please wait until I merge #74107? This should allow removing some boilerplate code from your PR as well. |
A part of RFC - PGO Accuracy Metrics: Emitting and Evaluating Branch and Block Analysis.
This PR adds the PGOBBAddrMap data structure and implements encoding and decoding through Object and ObjectYAML along with associated tests.
Design Alternative
The current implementation makes an effort to avoid modifying the original BBAddrMap data structure. This keeps the API the same as before for all pre-existing code and avoids unnecessary fields for regular BBAddrMap usage. It comes with the cost of some complexity in sharing encoding/decoding code.
Alternatively, we could just place all new fields into BBAddrMap. This makes the encoding/decoding code simpler. It has the cost of changing some behavior in the original API. It also adds unnecessary fields for anyone wanting to just use the regular BBAddrMap.
Please let us know if you feel that the alternate design of placing new fields into BBAddrMap is preferable.
PR Series
The current code for PGOBBAddrMap to be upstreamed is split into five PRs:
If you would like to try testing PGOBBAddrMap locally on a program, PR-3 llvm-readobj is likely the minimum code needed to meaningfully use this feature.