Commit f66f385

wlei-llvm authored and yuxuanchen1997 committed
[SampleFDO] Stale profile call-graph matching (#95135)
Profile staleness can be caused by function renaming. Because the sample profile loader relies on exact string matching, a trivial change in a function signature (such as `int foo()` --> `long foo()`) can change the mangled name, which makes the function profile (including all nested children profiles) unavailable. This patch introduces stale profile call-graph level matching, which aims to identify trivial function renaming and reuse the old function profile.

Some noteworthy details:

1. Extend the LCS-based CFG level matching to identify new functions.
   - Allow a function and a profile with different names to match, instead of requiring an exact function name match. This leverages LCS: while finding the callsite anchor matching, when two function names differ, try matching the functions instead of returning.
   - In LCS, the equal-function check is replaced by `functionMatchesProfile`.
   - Only try matching new functions, that is, an IR function that does not appear in the profile against a profile that has no matching IR function. This reduces the matching scope, since functions that were already matched do not need to be matched again.
2. Determine the matching by a call-site anchor similarity check.
   - A new function `functionMatchesProfile(IRFunc, ProfFunc)` checks for renaming of a candidate <IRFunc, ProfFunc> pair. It uses the LCS (diff) matching to compute the equal set, and we define `Similarity = |equalSet * 2| / (|A| + |B|)`, where A and B are the callsite anchor sequences of the function and of the profile. The profile name is marked as renamed if the similarity is above a threshold (`-func-profile-similarity-threshold`).
3. Process the matching in top-down function order.
   - When a caller is done matching, the new function names are saved for later use; using top-down order maximizes the reuse of results.
   - `ProfileNameToFuncMap` is used to save or cache the matching result.
4. Update the original profile at the end using `ProfileNameToFuncMap`.
5. Add a new switch `--salvage-unused-profile` to control this; the default is false.

Verified on one of Meta's large internal services: 90%+ of the renaming pairs found were confirmed to be good. (Incorrect renaming pairs can occur when the number of anchors is small, but the affected functions were checked to be simple cold functions.)
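The similarity check in item 2 can be sketched in isolation. The block below is a minimal, standalone illustration and not the code from this patch: it swaps in a plain dynamic-programming LCS for the greedy diff the matcher uses, represents anchors as bare callee-name strings, and the names `AnchorNames`, `lcsLength`, `anchorSimilarity`, and `looksRenamed` are hypothetical. The threshold is passed in as a fraction because the commit message does not state the default of `-func-profile-similarity-threshold`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a callsite anchor list: callee names in order.
using AnchorNames = std::vector<std::string>;

// Length of the longest common subsequence of two anchor sequences.
// Plain O(|A| * |B|) dynamic programming (an assumption for brevity; the
// patch itself reuses its greedy diff-based LCS).
std::size_t lcsLength(const AnchorNames &A, const AnchorNames &B) {
  std::vector<std::vector<std::size_t>> DP(
      A.size() + 1, std::vector<std::size_t>(B.size() + 1, 0));
  for (std::size_t I = 1; I <= A.size(); ++I)
    for (std::size_t J = 1; J <= B.size(); ++J)
      DP[I][J] = (A[I - 1] == B[J - 1])
                     ? DP[I - 1][J - 1] + 1
                     : std::max(DP[I - 1][J], DP[I][J - 1]);
  return DP[A.size()][B.size()];
}

// Similarity = 2 * |equal set| / (|A| + |B|), a value in [0, 1].
double anchorSimilarity(const AnchorNames &A, const AnchorNames &B) {
  if (A.empty() && B.empty())
    return 0.0; // Assumed choice: no anchors means no evidence of a match.
  return 2.0 * static_cast<double>(lcsLength(A, B)) /
         static_cast<double>(A.size() + B.size());
}

// A pair is treated as a rename when the similarity is above the threshold.
bool looksRenamed(const AnchorNames &IRAnchors, const AnchorNames &ProfAnchors,
                  double Threshold) {
  return anchorSimilarity(IRAnchors, ProfAnchors) > Threshold;
}
```

With anchors `{foo, bar, baz}` on the IR side and `{foo, qux, baz}` in the profile, the equal set has two elements, so the similarity is 4/6. This also shows why a small anchor count is fragile: a single differing anchor moves the ratio by a large step.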
1 parent d799725 commit f66f385

14 files changed: +1064 -127 lines changed

llvm/include/llvm/ProfileData/SampleProf.h

Lines changed: 14 additions & 9 deletions
@@ -919,12 +919,14 @@ class FunctionSamples {
   /// Returns a pointer to FunctionSamples at the given callsite location
   /// \p Loc with callee \p CalleeName. If no callsite can be found, relax
   /// the restriction to return the FunctionSamples at callsite location
-  /// \p Loc with the maximum total sample count. If \p Remapper is not
-  /// nullptr, use \p Remapper to find FunctionSamples with equivalent name
-  /// as \p CalleeName.
-  const FunctionSamples *
-  findFunctionSamplesAt(const LineLocation &Loc, StringRef CalleeName,
-                        SampleProfileReaderItaniumRemapper *Remapper) const;
+  /// \p Loc with the maximum total sample count. If \p Remapper or \p
+  /// FuncNameToProfNameMap is not nullptr, use them to find FunctionSamples
+  /// with equivalent name as \p CalleeName.
+  const FunctionSamples *findFunctionSamplesAt(
+      const LineLocation &Loc, StringRef CalleeName,
+      SampleProfileReaderItaniumRemapper *Remapper,
+      const HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+          *FuncNameToProfNameMap = nullptr) const;

   bool empty() const { return TotalSamples == 0; }

@@ -1172,11 +1174,14 @@ class FunctionSamples {
   /// tree nodes in the profile.
   ///
   /// \returns the FunctionSamples pointer to the inlined instance.
-  /// If \p Remapper is not nullptr, it will be used to find matching
-  /// FunctionSamples with not exactly the same but equivalent name.
+  /// If \p Remapper or \p FuncNameToProfNameMap is not nullptr, it will be used
+  /// to find matching FunctionSamples with not exactly the same but equivalent
+  /// name.
   const FunctionSamples *findFunctionSamples(
       const DILocation *DIL,
-      SampleProfileReaderItaniumRemapper *Remapper = nullptr) const;
+      SampleProfileReaderItaniumRemapper *Remapper = nullptr,
+      const HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+          *FuncNameToProfNameMap = nullptr) const;

   static bool ProfileIsProbeBased;

llvm/include/llvm/Transforms/IPO/SampleProfileMatcher.h

Lines changed: 101 additions & 14 deletions
@@ -26,6 +26,7 @@ using AnchorMap = std::map<LineLocation, FunctionId>;
 class SampleProfileMatcher {
   Module &M;
   SampleProfileReader &Reader;
+  LazyCallGraph &CG;
   const PseudoProbeManager *ProbeManager;
   const ThinOrFullLTOPhase LTOPhase;
   SampleProfileMap FlattenedProfiles;
@@ -58,6 +59,40 @@ class SampleProfileMatcher {
   StringMap<std::unordered_map<LineLocation, MatchState, LineLocationHash>>
       FuncCallsiteMatchStates;

+  struct FuncToProfileNameMapHash {
+    uint64_t
+    operator()(const std::pair<const Function *, FunctionId> &P) const {
+      return hash_combine(P.first, P.second);
+    }
+  };
+  // A map from a pair of function and profile name to a boolean value
+  // indicating whether they are matched. This is used as a cache for the
+  // matching result.
+  std::unordered_map<std::pair<const Function *, FunctionId>, bool,
+                     FuncToProfileNameMapHash>
+      FuncProfileMatchCache;
+  // The new functions found by the call graph matching. The map's key is the
+  // the new(renamed) function pointer and the value is old(unused) profile
+  // name.
+  std::unordered_map<Function *, FunctionId> FuncToProfileNameMap;
+
+  // A map pointer to the FuncNameToProfNameMap in SampleProfileLoader,
+  // which maps the function name to the matched profile name. This is used
+  // for sample loader to look up profile using the new name.
+  HashKeyMap<std::unordered_map, FunctionId, FunctionId> *FuncNameToProfNameMap;
+
+  // A map pointer to the SymbolMap in SampleProfileLoader, which stores all
+  // the original matched symbols before the matching. this is to determine if
+  // the profile is unused(to be matched) or not.
+  HashKeyMap<std::unordered_map, FunctionId, Function *> *SymbolMap;
+
+  // The new functions from IR.
+  HashKeyMap<std::unordered_map, FunctionId, Function *>
+      FunctionsWithoutProfile;
+
+  // Pointer to the Profile Symbol List in the reader.
+  std::shared_ptr<ProfileSymbolList> PSL;
+
   // Profile mismatch statstics:
   uint64_t TotalProfiledFunc = 0;
   // Num of checksum-mismatched function.
@@ -72,34 +107,61 @@ class SampleProfileMatcher {
   uint64_t MismatchedCallsiteSamples = 0;
   uint64_t RecoveredCallsiteSamples = 0;

+  // Profile call-graph matching statstics:
+  uint64_t NumCallGraphRecoveredProfiledFunc = 0;
+  uint64_t NumCallGraphRecoveredFuncSamples = 0;
+
   // A dummy name for unknown indirect callee, used to differentiate from a
   // non-call instruction that also has an empty callee name.
   static constexpr const char *UnknownIndirectCallee =
       "unknown.indirect.callee";

 public:
-  SampleProfileMatcher(Module &M, SampleProfileReader &Reader,
-                       const PseudoProbeManager *ProbeManager,
-                       ThinOrFullLTOPhase LTOPhase)
-      : M(M), Reader(Reader), ProbeManager(ProbeManager), LTOPhase(LTOPhase){};
+  SampleProfileMatcher(
+      Module &M, SampleProfileReader &Reader, LazyCallGraph &CG,
+      const PseudoProbeManager *ProbeManager, ThinOrFullLTOPhase LTOPhase,
+      HashKeyMap<std::unordered_map, FunctionId, Function *> &SymMap,
+      std::shared_ptr<ProfileSymbolList> PSL,
+      HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+          &FuncNameToProfNameMap)
+      : M(M), Reader(Reader), CG(CG), ProbeManager(ProbeManager),
+        LTOPhase(LTOPhase), FuncNameToProfNameMap(&FuncNameToProfNameMap),
+        SymbolMap(&SymMap), PSL(PSL) {};
   void runOnModule();
   void clearMatchingData() {
     // Do not clear FuncMappings, it stores IRLoc to ProfLoc remappings which
     // will be used for sample loader.
-    FuncCallsiteMatchStates.clear();
+    // Do not clear FlattenedProfiles as it contains function names referenced
+    // by FuncNameToProfNameMap. Clearing this memory could lead to a
+    // use-after-free error.
+    freeContainer(FuncCallsiteMatchStates);
+    freeContainer(FunctionsWithoutProfile);
+    freeContainer(FuncToProfileNameMap);
   }

 private:
-  FunctionSamples *getFlattenedSamplesFor(const Function &F) {
-    StringRef CanonFName = FunctionSamples::getCanonicalFnName(F);
-    auto It = FlattenedProfiles.find(FunctionId(CanonFName));
+  FunctionSamples *getFlattenedSamplesFor(const FunctionId &Fname) {
+    auto It = FlattenedProfiles.find(Fname);
     if (It != FlattenedProfiles.end())
       return &It->second;
     return nullptr;
   }
+  FunctionSamples *getFlattenedSamplesFor(const Function &F) {
+    StringRef CanonFName = FunctionSamples::getCanonicalFnName(F);
+    return getFlattenedSamplesFor(FunctionId(CanonFName));
+  }
+  template <typename T> inline void freeContainer(T &C) {
+    T Empty;
+    std::swap(C, Empty);
+  }
+  void getFilteredAnchorList(const AnchorMap &IRAnchors,
+                             const AnchorMap &ProfileAnchors,
+                             AnchorList &FilteredIRAnchorsList,
+                             AnchorList &FilteredProfileAnchorList);
   void runOnFunction(Function &F);
-  void findIRAnchors(const Function &F, AnchorMap &IRAnchors);
-  void findProfileAnchors(const FunctionSamples &FS, AnchorMap &ProfileAnchors);
+  void findIRAnchors(const Function &F, AnchorMap &IRAnchors) const;
+  void findProfileAnchors(const FunctionSamples &FS,
+                          AnchorMap &ProfileAnchors) const;
   // Record the callsite match states for profile staleness report, the result
   // is saved in FuncCallsiteMatchStates.
   void recordCallsiteMatchStates(const Function &F, const AnchorMap &IRAnchors,
@@ -124,6 +186,9 @@ class SampleProfileMatcher {
            State == MatchState::RemovedMatch;
   };

+  void countCallGraphRecoveredSamples(
+      const FunctionSamples &FS,
+      std::unordered_set<FunctionId> &MatchedUnusedProfile);
   // Count the samples of checksum mismatched function for the top-level
   // function and all inlinees.
   void countMismatchedFuncSamples(const FunctionSamples &FS, bool IsTopLevel);
@@ -151,15 +216,37 @@ class SampleProfileMatcher {
   // parts from the resulting SES are used to remap the IR locations to the
   // profile locations. As the number of function callsite is usually not big,
   // we currently just implements the basic greedy version(page 6 of the paper).
-  LocToLocMap
-  longestCommonSequence(const AnchorList &IRCallsiteAnchors,
-                        const AnchorList &ProfileCallsiteAnchors) const;
+  LocToLocMap longestCommonSequence(const AnchorList &IRCallsiteAnchors,
+                                    const AnchorList &ProfileCallsiteAnchors,
+                                    bool MatchUnusedFunction);
   void matchNonCallsiteLocs(const LocToLocMap &AnchorMatchings,
                             const AnchorMap &IRAnchors,
                             LocToLocMap &IRToProfileLocationMap);
   void runStaleProfileMatching(const Function &F, const AnchorMap &IRAnchors,
                                const AnchorMap &ProfileAnchors,
-                               LocToLocMap &IRToProfileLocationMap);
+                               LocToLocMap &IRToProfileLocationMap,
+                               bool RunCFGMatching, bool RunCGMatching);
+  // If the function doesn't have profile, return the pointer to the function.
+  bool functionHasProfile(const FunctionId &IRFuncName,
+                          Function *&FuncWithoutProfile);
+  bool isProfileUnused(const FunctionId &ProfileFuncName);
+  bool functionMatchesProfileHelper(const Function &IRFunc,
+                                    const FunctionId &ProfFunc);
+  // Determine if the function matches profile. If FindMatchedProfileOnly is
+  // set, only search the existing matched function. Otherwise, try matching the
+  // two functions.
+  bool functionMatchesProfile(const FunctionId &IRFuncName,
+                              const FunctionId &ProfileFuncName,
+                              bool FindMatchedProfileOnly);
+  // Determine if the function matches profile by computing a similarity ratio
+  // between two sequences of callsite anchors extracted from function and
+  // profile. If it's above the threshold, the function matches the profile.
+  bool functionMatchesProfile(Function &IRFunc, const FunctionId &ProfFunc,
+                              bool FindMatchedProfileOnly);
+  // Find functions that don't show in the profile or profile symbol list,
+  // which are supposed to be new functions. We use them as the targets for
+  // call graph matching.
+  void findFunctionsWithoutProfile();
   void reportOrPersistProfileStats();
 };
 } // end namespace llvm
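The header above caches match results in an `std::unordered_map` keyed by a (function pointer, profile name) pair, which needs a custom hasher because the standard library provides no `std::hash` for `std::pair`. The following is a standalone sketch of that pattern under simplifying assumptions: it uses `const void *` and `std::string` in place of `const Function *` and `FunctionId`, a hand-written hash mix in place of LLVM's `hash_combine`, and the names `PairHash` and `MatchCache` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hasher for a (pointer, name) key. The mixing constant is a common
// convention, used here only as a stand-in for LLVM's hash_combine.
struct PairHash {
  std::size_t operator()(const std::pair<const void *, std::string> &P) const {
    std::size_t H1 = std::hash<const void *>()(P.first);
    std::size_t H2 = std::hash<std::string>()(P.second);
    return H1 ^ (H2 + 0x9e3779b9 + (H1 << 6) + (H1 >> 2));
  }
};

// Memoizes the (possibly expensive) match decision per <IRFunc, ProfName>.
class MatchCache {
  std::unordered_map<std::pair<const void *, std::string>, bool, PairHash>
      Cache;

public:
  unsigned Computations = 0; // Counts cache misses, for illustration only.

  // Returns the cached result if present; otherwise runs Compute once and
  // stores its result.
  template <typename Fn>
  bool matches(const void *IRFunc, const std::string &ProfName, Fn Compute) {
    auto Key = std::make_pair(IRFunc, ProfName);
    auto It = Cache.find(Key);
    if (It != Cache.end())
      return It->second;
    ++Computations;
    bool Result = Compute();
    Cache.emplace(Key, Result);
    return Result;
  }
};
```

Caching both positive and negative results matters here, since the same candidate pair can be reached from several callers during the top-down traversal.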

llvm/include/llvm/Transforms/Utils/SampleProfileLoaderBaseImpl.h

Lines changed: 17 additions & 0 deletions
@@ -22,6 +22,7 @@
 #include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
+#include "llvm/Analysis/LazyCallGraph.h"
 #include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
 #include "llvm/Analysis/PostDominators.h"
@@ -155,6 +156,22 @@ static inline bool skipProfileForFunction(const Function &F) {
   return F.isDeclaration() || !F.hasFnAttribute("use-sample-profile");
 }

+static inline void
+buildTopDownFuncOrder(LazyCallGraph &CG,
+                      std::vector<Function *> &FunctionOrderList) {
+  CG.buildRefSCCs();
+  for (LazyCallGraph::RefSCC &RC : CG.postorder_ref_sccs()) {
+    for (LazyCallGraph::SCC &C : RC) {
+      for (LazyCallGraph::Node &N : C) {
+        Function &F = N.getFunction();
+        if (!skipProfileForFunction(F))
+          FunctionOrderList.push_back(&F);
+      }
+    }
+  }
+  std::reverse(FunctionOrderList.begin(), FunctionOrderList.end());
+}
+
 template <typename FT> class SampleProfileLoaderBaseImpl {
 public:
   SampleProfileLoaderBaseImpl(std::string Name, std::string RemapName,
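`buildTopDownFuncOrder` obtains a top-down order by collecting functions in post-order and reversing the list. The idea can be shown without LLVM on a toy call graph. This is only a sketch: it assumes a plain adjacency map with string names instead of `LazyCallGraph`, uses a depth-first post-order rather than an SCC-based one, and `CallGraph`, `postOrder`, and `topDownOrder` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: caller name -> list of callee names.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order: a function is emitted after all of its callees.
// The Visited set also keeps recursion finite when the graph has cycles.
static void postOrder(const CallGraph &G, const std::string &F,
                      std::set<std::string> &Visited,
                      std::vector<std::string> &Order) {
  if (!Visited.insert(F).second)
    return;
  auto It = G.find(F);
  if (It != G.end())
    for (const std::string &Callee : It->second)
      postOrder(G, Callee, Visited, Order);
  Order.push_back(F);
}

// Reversing a post-order puts callers before their callees (top-down).
std::vector<std::string> topDownOrder(const CallGraph &G) {
  std::set<std::string> Visited;
  std::vector<std::string> Order;
  for (const auto &Entry : G)
    postOrder(G, Entry.first, Visited, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For an acyclic graph, every caller appears before its callees in the result, so by the time a callee is processed, any renaming discovered while matching its callers is already recorded.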

llvm/lib/ProfileData/SampleProf.cpp

Lines changed: 26 additions & 10 deletions
@@ -236,7 +236,9 @@ LineLocation FunctionSamples::getCallSiteIdentifier(const DILocation *DIL,
 }

 const FunctionSamples *FunctionSamples::findFunctionSamples(
-    const DILocation *DIL, SampleProfileReaderItaniumRemapper *Remapper) const {
+    const DILocation *DIL, SampleProfileReaderItaniumRemapper *Remapper,
+    const HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+        *FuncNameToProfNameMap) const {
   assert(DIL);
   SmallVector<std::pair<LineLocation, StringRef>, 10> S;

@@ -256,7 +258,8 @@ const FunctionSamples *FunctionSamples::findFunctionSamples(
     return this;
   const FunctionSamples *FS = this;
   for (int i = S.size() - 1; i >= 0 && FS != nullptr; i--) {
-    FS = FS->findFunctionSamplesAt(S[i].first, S[i].second, Remapper);
+    FS = FS->findFunctionSamplesAt(S[i].first, S[i].second, Remapper,
+                                   FuncNameToProfNameMap);
   }
   return FS;
 }
@@ -277,19 +280,32 @@ void FunctionSamples::findAllNames(DenseSet<FunctionId> &NameSet) const {

 const FunctionSamples *FunctionSamples::findFunctionSamplesAt(
     const LineLocation &Loc, StringRef CalleeName,
-    SampleProfileReaderItaniumRemapper *Remapper) const {
+    SampleProfileReaderItaniumRemapper *Remapper,
+    const HashKeyMap<std::unordered_map, FunctionId, FunctionId>
+        *FuncNameToProfNameMap) const {
   CalleeName = getCanonicalFnName(CalleeName);

-  auto iter = CallsiteSamples.find(mapIRLocToProfileLoc(Loc));
-  if (iter == CallsiteSamples.end())
+  auto I = CallsiteSamples.find(mapIRLocToProfileLoc(Loc));
+  if (I == CallsiteSamples.end())
     return nullptr;
-  auto FS = iter->second.find(getRepInFormat(CalleeName));
-  if (FS != iter->second.end())
+  auto FS = I->second.find(getRepInFormat(CalleeName));
+  if (FS != I->second.end())
     return &FS->second;
+
+  if (FuncNameToProfNameMap && !FuncNameToProfNameMap->empty()) {
+    auto R = FuncNameToProfNameMap->find(FunctionId(CalleeName));
+    if (R != FuncNameToProfNameMap->end()) {
+      CalleeName = R->second.stringRef();
+      auto FS = I->second.find(getRepInFormat(CalleeName));
+      if (FS != I->second.end())
+        return &FS->second;
+    }
+  }
+
   if (Remapper) {
     if (auto NameInProfile = Remapper->lookUpNameInProfile(CalleeName)) {
-      auto FS = iter->second.find(getRepInFormat(*NameInProfile));
-      if (FS != iter->second.end())
+      auto FS = I->second.find(getRepInFormat(*NameInProfile));
+      if (FS != I->second.end())
         return &FS->second;
     }
   }
@@ -300,7 +316,7 @@ const FunctionSamples *FunctionSamples::findFunctionSamplesAt(
     return nullptr;
   uint64_t MaxTotalSamples = 0;
   const FunctionSamples *R = nullptr;
-  for (const auto &NameFS : iter->second)
+  for (const auto &NameFS : I->second)
     if (NameFS.second.getTotalSamples() >= MaxTotalSamples) {
       MaxTotalSamples = NameFS.second.getTotalSamples();
       R = &NameFS.second;
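The new lookup step in `findFunctionSamplesAt` tries the exact callee name first and only then consults the rename map. Below is a reduced, standalone sketch of that fallback order. It assumes plain `std::unordered_map` containers and a sample count in place of `FunctionSamples`; `SamplesByName`, `RenameMap`, and `findSamples` are hypothetical names, and the remapper and max-sample fallbacks of the real function are left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Samples recorded at one callsite, keyed by the callee name in the profile.
using SamplesByName = std::unordered_map<std::string, std::uint64_t>;
// Maps a current (IR) function name to the old name found in the profile.
using RenameMap = std::unordered_map<std::string, std::string>;

// Looks up samples for CalleeName: exact name first, then the renamed
// profile name if a rename map is supplied. Returns nullptr on a miss.
const std::uint64_t *findSamples(const SamplesByName &Callsite,
                                 const std::string &CalleeName,
                                 const RenameMap *FuncNameToProfName = nullptr) {
  auto It = Callsite.find(CalleeName);
  if (It != Callsite.end())
    return &It->second;

  if (FuncNameToProfName && !FuncNameToProfName->empty()) {
    auto R = FuncNameToProfName->find(CalleeName);
    if (R != FuncNameToProfName->end()) {
      It = Callsite.find(R->second);
      if (It != Callsite.end())
        return &It->second;
    }
  }
  return nullptr;
}
```

Keeping the map parameter optional with a `nullptr` default mirrors the patched declarations, so existing callers that pass no map keep their previous behavior.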
