Skip to content

Commit 6c3bf34

Browse files
[MemProf] Fix summary identification for imported locals (#124659)
When we apply cloning decisions in the ThinLTO backend, we need to find the corresponding summary for each function in the IR, and in some cases for callee functions. This is complicated when the function was a promoted local, in which case the GUID was formed from the hash of the original source file prepended to the function name. Those functions can be identified by the fact that they were given a ".llvm." suffix during promotion. We previously didn't do this correctly for promoted locals imported from other modules, as we only tried the current module source name. This led to crashes, in particular when the current module also had an local function of the same original name. In particular, we were attempting to iterate through the wrong summary's callsites, and there were fewer than in the actual function so we accessed data off the end (in a release build with assertion checking off - with assertion checking on we double check the stack ids and that would have failed). Even if we hadn't crashed or hit an assert, we could have applied the wrong cloning decisions, leading to unsats at link time. Luckily, function importing attaches thinlto_src_file metadata containing the original source file name to all imported functions. It normally doesn't do this by default, however, it always does if MemProf context disambiguation is enabled. Therefore, we can just look to see if the function contains this metadata and if so use it to recreate the original GUID. A similar issue can occur when looking for the ValueInfo / GUID of a direct tail call to see if we synthesized a callsite record for a missing tail call frame. In that case, the callee function may be a declaration, if we imported its caller but not the callee function definition. Because imported declarations don't get the thinlto_src_file metadata, we instead look at its caller (which works because this happens very early in the backend before any inlining).
1 parent a3a3e69 commit 6c3bf34

File tree

4 files changed

+297
-20
lines changed

4 files changed

+297
-20
lines changed

llvm/lib/AsmParser/LLParser.cpp

Lines changed: 9 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -10710,12 +10710,15 @@ bool LLParser::parseOptionalCallsites(std::vector<CallsiteInfo> &Callsites) {
1071010710
return true;
1071110711

1071210712
SmallVector<unsigned> StackIdIndices;
10713-
do {
10714-
uint64_t StackId = 0;
10715-
if (parseUInt64(StackId))
10716-
return true;
10717-
StackIdIndices.push_back(Index->addOrGetStackIdIndex(StackId));
10718-
} while (EatIfPresent(lltok::comma));
10713+
// Synthesized callsite records will not have a stack id list.
10714+
if (Lex.getKind() != lltok::rparen) {
10715+
do {
10716+
uint64_t StackId = 0;
10717+
if (parseUInt64(StackId))
10718+
return true;
10719+
StackIdIndices.push_back(Index->addOrGetStackIdIndex(StackId));
10720+
} while (EatIfPresent(lltok::comma));
10721+
}
1071910722

1072010723
if (parseToken(lltok::rparen, "expected ')' in stackIds"))
1072110724
return true;

llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp

Lines changed: 43 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4205,7 +4205,8 @@ static SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> createFunctionClones(
42054205
// Locate the summary for F. This is complicated by the fact that it might
42064206
// have been internalized or promoted.
42074207
static ValueInfo findValueInfoForFunc(const Function &F, const Module &M,
4208-
const ModuleSummaryIndex *ImportSummary) {
4208+
const ModuleSummaryIndex *ImportSummary,
4209+
const Function *CallingFunc = nullptr) {
42094210
// FIXME: Ideally we would retain the original GUID in some fashion on the
42104211
// function (e.g. as metadata), but for now do our best to locate the
42114212
// summary without that information.
@@ -4220,20 +4221,48 @@ static ValueInfo findValueInfoForFunc(const Function &F, const Module &M,
42204221
// Now query with the original name before any promotion was performed.
42214222
StringRef OrigName =
42224223
ModuleSummaryIndex::getOriginalNameBeforePromote(F.getName());
4224+
// When this pass is enabled, we always add thinlto_src_file provenance
4225+
// metadata to imported function definitions, which allows us to recreate the
4226+
// original internal symbol's GUID.
4227+
auto SrcFileMD = F.getMetadata("thinlto_src_file");
4228+
// If this is a call to an imported/promoted local for which we didn't import
4229+
// the definition, the metadata will not exist on the declaration. However,
4230+
// since we are doing this early, before any inlining in the LTO backend, we
4231+
// can simply look at the metadata on the calling function which must have
4232+
// been from the same module if F was an internal symbol originally.
4233+
if (!SrcFileMD && F.isDeclaration()) {
4234+
// We would only call this for a declaration for a direct callsite, in which
4235+
// case the caller would have provided the calling function pointer.
4236+
assert(CallingFunc);
4237+
SrcFileMD = CallingFunc->getMetadata("thinlto_src_file");
4238+
// If this is a promoted local (OrigName != F.getName()), since this is a
4239+
// declaration, it must be imported from a different module and therefore we
4240+
// should always find the metadata on its calling function. Any call to a
4241+
// promoted local that came from this module should still be a definition.
4242+
assert(SrcFileMD || OrigName == F.getName());
4243+
}
4244+
StringRef SrcFile = M.getSourceFileName();
4245+
if (SrcFileMD)
4246+
SrcFile = dyn_cast<MDString>(SrcFileMD->getOperand(0))->getString();
42234247
std::string OrigId = GlobalValue::getGlobalIdentifier(
4224-
OrigName, GlobalValue::InternalLinkage, M.getSourceFileName());
4248+
OrigName, GlobalValue::InternalLinkage, SrcFile);
42254249
TheFnVI = ImportSummary->getValueInfo(GlobalValue::getGUID(OrigId));
4226-
if (TheFnVI)
4227-
return TheFnVI;
4228-
// Could be a promoted local imported from another module. We need to pass
4229-
// down more info here to find the original module id. For now, try with
4230-
// the OrigName which might have been stored in the OidGuidMap in the
4231-
// index. This would not work if there were same-named locals in multiple
4232-
// modules, however.
4233-
auto OrigGUID =
4234-
ImportSummary->getGUIDFromOriginalID(GlobalValue::getGUID(OrigName));
4235-
if (OrigGUID)
4236-
TheFnVI = ImportSummary->getValueInfo(OrigGUID);
4250+
// Internal func in original module may have gotten a numbered suffix if we
4251+
// imported an external function with the same name. This happens
4252+
// automatically during IR linking for naming conflicts. It would have to
4253+
// still be internal in that case (otherwise it would have been renamed on
4254+
// promotion in which case we wouldn't have a naming conflict).
4255+
if (!TheFnVI && OrigName == F.getName() && F.hasLocalLinkage() &&
4256+
F.getName().contains('.')) {
4257+
OrigName = F.getName().rsplit('.').first;
4258+
OrigId = GlobalValue::getGlobalIdentifier(
4259+
OrigName, GlobalValue::InternalLinkage, SrcFile);
4260+
TheFnVI = ImportSummary->getValueInfo(GlobalValue::getGUID(OrigId));
4261+
}
4262+
// The only way we may not have a VI is if this is a declaration created for
4263+
// an imported reference. For distributed ThinLTO we may not have a VI for
4264+
// such declarations in the distributed summary.
4265+
assert(TheFnVI || F.isDeclaration());
42374266
return TheFnVI;
42384267
}
42394268

@@ -4592,7 +4621,7 @@ bool MemProfContextDisambiguation::applyImport(Module &M) {
45924621
// Locate the synthesized callsite info for the callee VI, if any was
45934622
// created, and use that for cloning.
45944623
ValueInfo CalleeVI =
4595-
findValueInfoForFunc(*CalledFunction, M, ImportSummary);
4624+
findValueInfoForFunc(*CalledFunction, M, ImportSummary, &F);
45964625
if (CalleeVI && MapTailCallCalleeVIToCallsite.count(CalleeVI)) {
45974626
auto Callsite = MapTailCallCalleeVIToCallsite.find(CalleeVI);
45984627
assert(Callsite != MapTailCallCalleeVIToCallsite.end());
Lines changed: 152 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,152 @@
1+
;; Test to make sure that the memprof ThinLTO backend finds the correct summary
2+
;; for an imported promoted local, so that we can perform the correct cloning.
3+
;; In particular, we should be able to use the thinlto_src_file metadata to
4+
;; recreate its original GUID. In particular, this test contains promoted
5+
;; internal functions with the same original name as those that were imported,
6+
;; and we want to ensure we don't use those by mistake.
7+
8+
;; The original code looks something like:
9+
;;
10+
;; src1.cc:
11+
;; extern void external1();
12+
;; extern void external2();
13+
;; static void internal1() {
14+
;; external2();
15+
;; }
16+
;; static void internal2() {
17+
;; external2();
18+
;; }
19+
;; int main() {
20+
;; internal1();
21+
;; internal2();
22+
;; external1();
23+
;; return 0;
24+
;; }
25+
;;
26+
;; src2.cc:
27+
;; extern void external2();
28+
;; static void internal1() {
29+
;; external2();
30+
;; }
31+
;; static void internal2() {
32+
;; external2();
33+
;; }
34+
;; void external1() {
35+
;; internal1();
36+
;; internal2();
37+
;; }
38+
;;
39+
;; The assembly for src1 shown below was dumped after function importing, with
40+
;; some hand modification to ensure we import the definitions of src2.cc's
41+
;; external1 and internal1 functions, and the declaration only for its
42+
;; internal2 function. I also hand modified it to add !callsite metadata
43+
;; to a few calls, and the distributed ThinLTO summary in src1.o.thinlto.ll to
44+
;; contain callsite metadata records with cloning results.
45+
46+
; RUN: rm -rf %t && split-file %s %t && cd %t
47+
; RUN: llvm-as src1.ll -o src1.o
48+
; RUN: llvm-as src1.o.thinlto.ll -o src1.o.thinlto.bc
49+
50+
; RUN: opt -passes=memprof-context-disambiguation src1.o -S -memprof-import-summary=src1.o.thinlto.bc | FileCheck %s
51+
52+
;; Per the cloning results in the summary, none of the original functions should
53+
;; call any memprof clones.
54+
; CHECK-NOT: memprof
55+
;; We should have one clone of src1.cc's internal1 that calls a clone of
56+
;; external2.
57+
; CHECK-LABEL: define void @_ZL9internal1v.llvm.5985484347676238233.memprof.1()
58+
; CHECK: tail call void @_Z9external2v.memprof.1()
59+
; CHECK-LABEL: declare void @_Z9external2v.memprof.1()
60+
;; We should have one clone of external1 that calls a clone of internal2 from
61+
;; a synthesized callsite record (for a tail call with a missing frame).
62+
; CHECK-LABEL: define available_externally {{.*}} void @_Z9external1v.memprof.1()
63+
; CHECK: tail call void @_ZL9internal1v.llvm.3267420853450984672()
64+
; CHECK: tail call void @_ZL9internal2v.llvm.3267420853450984672.memprof.1()
65+
; CHECK-LABEL: declare void @_ZL9internal2v.llvm.3267420853450984672.memprof.1()
66+
;; We should have 2 clones of src2.cc's internal1 function, calling a single
67+
;; clone of external2.
68+
; CHECK-LABEL: define available_externally void @_ZL9internal1v.llvm.3267420853450984672.memprof.1()
69+
; CHECK: tail call void @_Z9external2v.memprof.1()
70+
; CHECK: tail call void @_Z9external2v.memprof.1()
71+
; CHECK-LABEL: define available_externally void @_ZL9internal1v.llvm.3267420853450984672.memprof.2()
72+
; CHECK: tail call void @_Z9external2v.memprof.1()
73+
; CHECK: tail call void @_Z9external2v.memprof.1()
74+
; CHECK-NOT: memprof
75+
76+
;--- src1.ll
77+
; ModuleID = 'src1.o'
78+
source_filename = "src1.cc"
79+
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
80+
target triple = "x86_64-unknown-linux-gnu"
81+
82+
define dso_local noundef i32 @main() {
83+
entry:
84+
tail call void @_ZL9internal1v.llvm.5985484347676238233()
85+
tail call void @_ZL9internal2v.llvm.5985484347676238233()
86+
tail call void @_Z9external1v()
87+
ret i32 0
88+
}
89+
90+
define void @_ZL9internal1v.llvm.5985484347676238233() {
91+
entry:
92+
tail call void @_Z9external2v(), !callsite !8
93+
ret void
94+
}
95+
96+
define void @_ZL9internal2v.llvm.5985484347676238233() {
97+
entry:
98+
tail call void @_Z9external2v()
99+
ret void
100+
}
101+
102+
declare void @_Z9external2v()
103+
104+
define available_externally dso_local void @_Z9external1v() !thinlto_src_module !6 !thinlto_src_file !7 {
105+
entry:
106+
tail call void @_ZL9internal1v.llvm.3267420853450984672()
107+
tail call void @_ZL9internal2v.llvm.3267420853450984672()
108+
ret void
109+
}
110+
111+
define available_externally void @_ZL9internal1v.llvm.3267420853450984672() !thinlto_src_module !6 !thinlto_src_file !7 {
112+
entry:
113+
;; This one has more callsite records than the other version of internal1,
114+
;; which would cause the code to iterate past the end of the callsite
115+
;; records if we incorrectly got the other internal1's summary.
116+
tail call void @_Z9external2v(), !callsite !9
117+
tail call void @_Z9external2v(), !callsite !10
118+
ret void
119+
}
120+
121+
declare void @_ZL9internal2v.llvm.3267420853450984672()
122+
123+
!6 = !{!"src2.o"}
124+
!7 = !{!"src2.cc"}
125+
!8 = !{i64 12345}
126+
!9 = !{i64 23456}
127+
!10 = !{i64 34567}
128+
129+
;--- src1.o.thinlto.ll
130+
; ModuleID = 'src1.o.thinlto.bc'
131+
source_filename = "src1.o.thinlto.bc"
132+
133+
^0 = module: (path: "src1.o", hash: (1393604173, 1072112025, 2857473630, 2016801496, 3238735916))
134+
^1 = module: (path: "src2.o", hash: (760755700, 1705397472, 4198605753, 677969311, 2408738824))
135+
;; src2.o:internal1. It specifies that we should have 3 clones total (including
136+
;; original).
137+
^3 = gv: (guid: 1143217136900127394, summaries: (function: (module: ^1, flags: (linkage: available_externally, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^6, tail: 1), (callee: ^6, tail: 1)), callsites: ((callee: ^6, clones: (0, 1, 1), stackIds: (23456)), (callee: ^6, clones: (0, 1, 1), stackIds: (34567))))))
138+
;; src2.o:internal2. It was manually modified to have importType = declaration.
139+
^4 = gv: (guid: 3599593882704738259, summaries: (function: (module: ^1, flags: (linkage: available_externally, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: declaration), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^6, tail: 1)))))
140+
;; src1.o:internal1.
141+
^5 = gv: (guid: 6084810090198994915, summaries: (function: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^6, tail: 1)), callsites: ((callee: ^6, clones: (0, 1), stackIds: (12345))))))
142+
^6 = gv: (guid: 8596367375252297795)
143+
;; src1.o:internal2.
144+
^7 = gv: (guid: 11092151021205906565, summaries: (function: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^6, tail: 1)))))
145+
;; src2.o:external1. It contains a synthesized callsite record for the tail call
146+
;; to internal2 (the empty stackId list indicates it is synthesized for a
147+
;; discovered missing tail call frame.
148+
^8 = gv: (guid: 12313225385227428720, summaries: (function: (module: ^1, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 3, calls: ((callee: ^3, tail: 1), (callee: ^4, tail: 1)), callsites: ((callee: ^4, clones: (0, 1), stackIds: ())))))
149+
;; src1.o:main.
150+
^9 = gv: (guid: 15822663052811949562, summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 4, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 0, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^5, tail: 1), (callee: ^7, tail: 1), (callee: ^8, tail: 1)))))
151+
^10 = flags: 97
152+
^11 = blockcount: 0
Lines changed: 93 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,93 @@
1+
;; Test to make sure that the memprof ThinLTO backend finds the correct summary
2+
;; when there was a naming conflict between an internal function in the original
3+
;; module and an imported external function with the same name. The IR linking
4+
;; will automatically append a "." followed by a numbered suffix to the existing
5+
;; local name in that case. Note this can happen with C where the mangling would
6+
;; be the same for the internal and external functions of the same name (C++
7+
;; would have different mangling).
8+
9+
;; Note we don't need any MemProf related metadata for this to fail to find a
10+
;; ValueInfo and crash if the wrong GUID is computed for the renamed local.
11+
12+
;; The original code looks something like:
13+
;;
14+
;; src1.c:
15+
;; extern void external1();
16+
;; extern void external2();
17+
;; static void foo() {
18+
;; external2();
19+
;; }
20+
;; int main() {
21+
;; external1();
22+
;; foo();
23+
;; return 0;
24+
;; }
25+
;;
26+
;; src2.c:
27+
;; extern void external2();
28+
;; void foo() {
29+
;; external2();
30+
;; }
31+
;; void external1() {
32+
;; foo();
33+
;; }
34+
;;
35+
;; The assembly for src1 shown below was dumped after function importing.
36+
37+
; RUN: rm -rf %t && split-file %s %t && cd %t
38+
; RUN: llvm-as src1.ll -o src1.o
39+
; RUN: llvm-as src1.o.thinlto.ll -o src1.o.thinlto.bc
40+
41+
;; Simply check that we don't crash when trying to find the ValueInfo for each
42+
;; function in the IR.
43+
; RUN: opt -passes=memprof-context-disambiguation src1.o -S -memprof-import-summary=src1.o.thinlto.bc
44+
45+
;--- src1.ll
46+
; ModuleID = 'src1.o'
47+
source_filename = "src1.c"
48+
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
49+
target triple = "x86_64-unknown-linux-gnu"
50+
51+
define dso_local noundef i32 @main() {
52+
entry:
53+
tail call void @external1()
54+
tail call void @foo.2()
55+
ret i32 0
56+
}
57+
58+
define internal void @foo.2() {
59+
entry:
60+
tail call void @external2()
61+
ret void
62+
}
63+
64+
declare void @external2()
65+
66+
define available_externally dso_local void @foo() !thinlto_src_module !1 !thinlto_src_file !2 {
67+
entry:
68+
tail call void @external2()
69+
ret void
70+
}
71+
72+
define available_externally dso_local void @external1() !thinlto_src_module !1 !thinlto_src_file !2 {
73+
entry:
74+
tail call void @foo()
75+
ret void
76+
}
77+
78+
!1 = !{!"src2.o"}
79+
!2 = !{!"src2.c"}
80+
81+
;--- src1.o.thinlto.ll
82+
; ModuleID = 'src1.o.thinlto.bc'
83+
source_filename = "src1.o.thinlto.bc"
84+
85+
^0 = module: (path: "src1.o", hash: (2435941910, 498944982, 2551913764, 2759430100, 3918124321))
86+
^1 = module: (path: "src2.o", hash: (1826286437, 1557684621, 1220464477, 2734102338, 1025249503))
87+
^2 = module: (path: "src3.o", hash: (1085916433, 503665945, 2163560042, 340524, 2255774964))
88+
^3 = gv: (guid: 1456206394295721279, summaries: (function: (module: ^0, flags: (linkage: internal, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0)))) ;; src1.c:foo
89+
^4 = gv: (guid: 6699318081062747564, summaries: (function: (module: ^1, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 1, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0)))) ;; src2.c:foo
90+
^5 = gv: (guid: 13087145834073153720, summaries: (function: (module: ^1, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 2, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^4, tail: 1))))) ;; src1.c:external1
91+
^6 = gv: (guid: 15822663052811949562, summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0, importType: definition), insts: 3, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 1, mayThrow: 0, hasUnknownCall: 0, mustBeUnreachable: 0), calls: ((callee: ^5, tail: 1), (callee: ^3, tail: 1))))) ;; src1.c:main
92+
^8 = flags: 97
93+
^9 = blockcount: 0

0 commit comments

Comments
 (0)