[PGO][GlobalValue][LTO] In GlobalValue::getGlobalIdentifier, use semicolon as delimiter for local-linkage variables. #74008
…colon as delimiter for local-linkage variables.

Since commit fe05193 (phab D156569), IRPGO names use the format '[<filepath>;]<linkage-name>', whereas the prior format was '[<filepath>:]<linkage-name>'. The format change breaks the use case demonstrated in the (updated) test llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll.

This patch changes GlobalValue::getGlobalIdentifier to use the semicolon.

To elaborate on how things break without this PR:

1. IRPGO raw profiles store the (compressed) IRPGO names of functions in one section and the per-function profile data in another section. One field of the per-function profile data is the MD5 hash of the IRPGO name.
2. When raw profiles are converted to indexed-format profiles, each profiled address is mapped to the MD5 hash of the callee.
3. In the ThinLTO prelink pipeline, the MD5 hashes of IRPGO names are annotated as value profiles and used to import indirect-call-promotion (ICP) candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercises the following path: annotate raw profiles and generate import summaries. Using the import summaries, it then checks that functions are correctly imported and that the ICP transformations happen.
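To make step 3 concrete, here is a minimal, self-contained sketch of why the delimiter matters. It is not LLVM code: `hashName` is an FNV-1a stand-in for the MD5-based hash, and `makeLocalName`, `canImport`, and the file and symbol names are hypothetical. The only point is that two names differing in one delimiter character hash to different keys, so the lookup on the import side misses.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in 64-bit hash (FNV-1a). The real pipeline hashes names with MD5;
// this sketch only needs a deterministic hash to show the mismatch.
std::uint64_t hashName(const std::string &Name) {
  std::uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : Name) {
    H ^= C;
    H *= 1099511628211ULL;
  }
  return H;
}

// Hypothetical helper: build a local-linkage name with the given delimiter.
std::string makeLocalName(const std::string &File, const std::string &Sym,
                          char Delim) {
  return (File.empty() ? std::string("<unknown>") : File) + Delim + Sym;
}

// Returns true if a callee annotated with AnnotDelim is found in a summary
// whose keys were computed with SummaryDelim.
bool canImport(char AnnotDelim, char SummaryDelim) {
  std::unordered_map<std::uint64_t, std::string> Summary;
  Summary[hashName(makeLocalName("a.cpp", "callee", SummaryDelim))] = "callee";
  std::uint64_t Annotated =
      hashName(makeLocalName("a.cpp", "callee", AnnotDelim));
  return Summary.count(Annotated) != 0;
}
```

Under these assumptions, `canImport(';', ':')` models the broken state (annotation uses the new format, import uses the prior one) and `canImport(';', ';')` models the state after this patch.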
@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-lto
Author: Mingming Liu (minglotus-6)
Patch is 28.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74008.diff 12 Files Affected:
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
// that it will stay the same, e.g., if the files are checked out from
// version control in different locations.
if (FileName.empty())
- NewName = NewName.insert(0, "<unknown>:");
+ NewName = NewName.insert(0, "<unknown>;");
else
- NewName = NewName.insert(0, FileName.str() + ":");
+ NewName = NewName.insert(0, FileName.str() + ";");
}
return NewName;
}
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
char InstrProfError::ID = 0;
-std::string getPGOFuncName(StringRef RawFuncName,
- GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
StringRef FileName,
uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
- return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+ // Value names may be prefixed with a binary '1' to indicate
+ // that the backend should not modify the symbols due to any platform
+ // naming convention. Do not include that '1' in the PGO profile name.
+ if (Name[0] == '\1')
+ Name = Name.substr(1);
+
+ std::string NewName = std::string(Name);
+ if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+ // For local symbols, prepend the main file name to distinguish them.
+ // Do not include the full path in the file name since there's no guarantee
+ // that it will stay the same, e.g., if the files are checked out from
+ // version control in different locations.
+ if (FileName.empty())
+ NewName = NewName.insert(0, "<unknown>:");
+ else
+ NewName = NewName.insert(0, FileName.str() + ":");
+ }
+ return NewName;
}
// Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
GlobalValue::LinkageTypes Linkage,
StringRef FileName) {
SmallString<64> Name;
- if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
- Name.append(FileName.empty() ? "<unknown>" : FileName);
- Name.append(";");
- }
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
- return Name.str().str();
+ return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
}
static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
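The net effect of the two InstrProf.cpp hunks above can be summarized in a small standalone sketch. It is a simplified model, not the LLVM implementation: `pgoFuncNameSketch` and `irpgoNameSketch` are hypothetical names, linkage is reduced to a boolean, the `Mangler` step is omitted, and an empty-name guard is added that the patch itself does not have. It shows only what the diff shows: the legacy PGO function name keeps the ':' delimiter (and drops a leading '\1'), while the IRPGO name now gets its ';' delimiter by delegating to the shared identifier helper.

```cpp
#include <string>

// Model of the inlined getPGOFuncName body: strip a leading '\1' marker and,
// for local linkage, prepend "<file>:" (the prior delimiter is kept here).
std::string pgoFuncNameSketch(std::string Name, bool IsLocalLinkage,
                              const std::string &FileName) {
  if (!Name.empty() && Name[0] == '\1') // empty check added in this sketch
    Name.erase(0, 1);
  if (IsLocalLinkage)
    Name.insert(0, (FileName.empty() ? std::string("<unknown>") : FileName) + ":");
  return Name;
}

// Model of the IRPGO name after the patch: for local linkage, the shared
// identifier helper prepends "<file>;". Name mangling is not modeled.
std::string irpgoNameSketch(std::string Name, bool IsLocalLinkage,
                            const std::string &FileName) {
  if (IsLocalLinkage)
    Name.insert(0, (FileName.empty() ? std::string("<unknown>") : FileName) + ";");
  return Name;
}
```

For a local symbol `foo` in `a.cpp`, the first helper yields `a.cpp:foo` and the second yields `a.cpp;foo`; for non-local linkage both return the bare name.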
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
; COMBINED: <GLOBALVAL_SUMMARY_BLOCK
; COMBINED-NEXT: <VERSION
; COMBINED-NEXT: <FLAGS
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-8118561185538785069/>
; COMBINED-DAG: <COMBINED{{ }}
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
; COMBINED-DAG: <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
; DUMP: Node [[BAZ]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
; DUMP: Node [[FOO]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAR2:0x[a-z0-9]+]]
; DUMP: Node [[BAZ]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAZ2:0x[a-z0-9]+]]
; DUMP: Node [[FOO]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO2:0x[a-z0-9]+]]
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[FOO2]]
-; DUMP: Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clone of [[FOO]]
; DUMP: Node [[BAZ2]]
-; DUMP: Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
; DOTCLONED: }
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 6
; DUMP: CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[C]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 2 5
; DUMP: CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[B]]
-; DUMP: Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3 4
; DUMP: CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
; DUMP: Node [[AX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO2:0x[a-z0-9]+]]
; DUMP: Node [[AX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
;; This is the node synthesized for the call to bar in foo that was created
;; by inlining baz into foo.
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 3 4
; DUMP: CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAR2:0x[a-z0-9]+]]
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3...
[truncated]
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[FOO2:0x[a-z0-9]+]]
; DUMP: Node [[AX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 1 2
; DUMP: CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
; DUMP: Node [[MAIN3]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1
; DUMP: CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN4]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2
; DUMP: CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3
; DUMP: CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[BX]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 4 5
; DUMP: CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
; DUMP: Node [[MAIN5]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 4
; DUMP: CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN6]]
-; DUMP: Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 5
; DUMP: CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 6
; DUMP: CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
;; This is the node synthesized for the call to bar in foo that was created
;; by inlining baz into foo.
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotColdCold
; DUMP: ContextIds: 3 4
; DUMP: CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
; DUMP: Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
; DUMP: Node [[MAIN1]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 1 3
; DUMP: CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
; DUMP: CallerEdges:
; DUMP: Node [[MAIN2]]
-; DUMP: Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
+; DUMP: Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3 (clone 0)
; DUMP: AllocTypes: Cold
; DUMP: ContextIds: 2 4
; DUMP: CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
; DUMP: Clones: [[BAR2:0x[a-z0-9]+]]
; DUMP: Node [[FOO]]
-; DUMP: Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP: Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
; DUMP: AllocTypes: NotCold
; DUMP: ContextIds: 3...
[truncated]
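The GUID churn in these test expectations is consistent with how ThinLTO identifies globals: a GUID is a hash of the global identifier string, so changing the delimiter in the prefix of a local-linkage name changes the GUID, while a name without a prefix (such as `main` above, whose GUID is unchanged) is unaffected. The sketch below illustrates only that effect. It is not LLVM code: `standInHash` is a 64-bit FNV-1a hash used as a stand-in for the MD5-based hash LLVM uses, and `identifierFor` and the file name `a.cpp` are hypothetical.

```cpp
#include <cstdint>
#include <string>

// Stand-in 64-bit FNV-1a hash. LLVM derives GUIDs from MD5; FNV-1a is used
// here only to keep the sketch short and dependency-free.
std::uint64_t standInHash(const std::string &s) {
  std::uint64_t h = 14695981039346656037ULL; // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL; // FNV prime
  }
  return h;
}

// Hypothetical helper: local-linkage names get a "<file><delim>" prefix,
// all other names are used as-is.
std::string identifierFor(const std::string &name, bool isLocal,
                          const std::string &fileName, char delim) {
  return isLocal ? fileName + delim + name : name;
}
```

With these definitions, hashing `identifierFor("foo", true, "a.cpp", ':')` and `identifierFor("foo", true, "a.cpp", ';')` gives different values, whereas an external name such as `main` hashes identically under both delimiters. That is the kind of mismatch the PR description blames for the failed import.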
Thanks! A few comments below.
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
char InstrProfError::ID = 0;
-std::string getPGOFuncName(StringRef RawFuncName,
- GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
If we're going to keep the original name, we probably want to at least leave a prominent comment in the header saying not to use it because it is legacy.
Also change the function name to getLegacyPGOFuncName
I slightly prefer using the (simpler) `getPGOFuncName` for the new `;` format and `getLegacyPGOFuncName` for the `:` format, and chose not to do the full rename here (I'd want an NFC patch if renaming is the way to go, since there are many existing callers).
I updated the header and cpp, and did a minimal rename (this function has 3 callers: one in the clang directory and two in InstrProf.cpp).
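To make the two formats under discussion concrete, here is a minimal sketch of the `:` and `;` name construction. It mirrors the prefixing logic visible in the diff but is not the LLVM implementation: `prefixedName`, `legacyStyleName`, and `irpgoStyleName` are hypothetical names, and linkage is reduced to a boolean.

```cpp
#include <string>

// Hypothetical helper, not an LLVM API: prepends "<file><delim>" to the
// names of local-linkage symbols and leaves other names untouched.
std::string prefixedName(const std::string &name, bool isLocal,
                         const std::string &fileName, char delim) {
  if (!isLocal)
    return name;
  // An empty file name falls back to "<unknown>", as in the diff.
  std::string prefix = fileName.empty() ? "<unknown>" : fileName;
  prefix.push_back(delim);
  return prefix + name;
}

// Prior format: "[<filepath>:]<name>".
std::string legacyStyleName(const std::string &name, bool isLocal,
                            const std::string &fileName) {
  return prefixedName(name, isLocal, fileName, ':');
}

// Newer format: "[<filepath>;]<name>".
std::string irpgoStyleName(const std::string &name, bool isLocal,
                           const std::string &fileName) {
  return prefixedName(name, isLocal, fileName, ';');
}
```

For a local symbol `foo` in `a.cpp`, `legacyStyleName` gives `a.cpp:foo` and `irpgoStyleName` gives `a.cpp;foo`; for a non-local symbol both return `foo` unchanged.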
Is this resolved? It looks like the most recent version of the patch changes getPGOFuncName to getLegacyPGOFuncName for the old format, and uses getIRPGOFuncName for the new format.
I didn't rename `getPGOFuncName()` in https://reviews.llvm.org/D156569 because Swift and Clang FE-PGO relied on it. Is that still the case?
Oh I missed the fact that one getPGOFuncName interface was left. I am only seeing 2 invocations of that interface outside of the unittest test. I would be in favor of doing renames of both getPGOFuncName interfaces in one patch. A separate NFC patch is a fine option, and keeps this patch just about fixing the ICP breakage caused by the delimiter change.
Is that still the case?
Clang FE-PGO calls this function, but afaik it's used for coverage rather than performance nowadays. I guess if Clang FE-PGO continues to use this function, it shouldn't have `legacy` in its name.
I'm wondering if it makes more sense if I rename `getPGOFuncName` (taking stringified names) to `getClangPGOFuncName` in this PR? Thoughts?
I guess it depends on whether it is intentional that Clang (and Swift apparently?) use the old name.
@ellishg was there a reason to leave that as is?
If they must use the old one then maybe getFEPGOFuncName. If they should be transitioned eventually then using "Legacy" makes sense.
Oh I missed the fact that one getPGOFuncName interface was left. I am only seeing 2 invocations of that interface outside of the unittest test. I would be in favor of doing renames of both getPGOFuncName interfaces in one patch. A separate NFC patch is a fine option, and keeps this patch just about fixing the ICP breakage caused by the delimiter change.
Got it. I'll probably undo the rename then. We can figure out the names in a follow-up patch.
I might need to tweak my GitHub notifications a little bit. Right now I cannot see real-time updates when I have a PR open and have replied to a comment. Sorry for this.
Undid the rename.
I didn't find a way to configure GitHub notifications for real-time comments. I guess I'll need to combine them with email notifications.
if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
  Name.append(FileName.empty() ? "<unknown>" : FileName);
  Name.append(";");
}
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
Can we move the Mangler() handling into getGlobalIdentifier(), then remove the handling there of "Name[0] == '\1'", as I believe it is handled by Mangler().getNameWithPrefix()?
I also thought it was feasible initially. Two kinds of rough edges came up when I tried the implementation (I added some comments to document them).
a) `getIRPGONameForGlobalObject` currently mangles names by passing global values to `getNameWithPrefix`. `getNameWithPrefix` with a `GlobalValue` parameter makes use of other information (e.g., linkage, calling-conv of functions, etc.). `getNameWithPrefix` with stringified names as the parameter just uses the names. For parity we might need a `getGlobalIdentifier` that takes global values (not just stringified names).
b) `getGlobalIdentifier` gets called in many places, and the `\1` handling function in Mangler does more than dropping `\1`, so embedding Mangler's way of handling `\1` might affect all callers.
@ellishg How much of D156569 relied on the invocation of Mangler? It is not mentioned in the patch description, only the rationale for changing ":" to ";". The problem is if these are out of sync, then cross-module importing of indirectly called local functions will continue to be broken in whatever cases Mangler().getNameWithPrefix affects.
D156569 changes the format from `[<filepath>:]<function-name>` to `[<filepath>;]<linkage-name>`. Note the change from function-name to linkage-name. Having the mangled name is required so that we can pass a symbol order via `-symbol-ordering-file` or `-order_file`.
What I'm confused about is that the function name is already the mangled name, at least using clang with c++. Is it not for objective C?
To avoid subtle issues when the linkage-name is different from the mangled name, I'm wondering if it warrants a change to use linkage-names (as opposed to mangled names) in `GlobalValue::getGlobalIdentifier` in this PR. The global identifier is supposed to be a hash of unique names, and the linkage-name is already unique.
It wasn't broken in general, but it's needed to get -order_file working correctly.
Unfortunately this change broke aspects of ThinLTO ICP, however. Is it possible to change the handling of -order_file in the linker to modify the symbol names appropriately?
To avoid subtle issues when linkage-name is different from mangled names,,I'm wondering if it warrants a change to use linkage-names (as opposed to mangled name) in GlobalValue::getGlobalIdentifier in this PR.
If the -order_file handling cannot be fixed as I suggested above, then yes, we need some solution to ensure that the hashed symbol names are consistent across PGO and ThinLTO.
I think it makes more sense to use linkage-names for IRPGO, `-order_file`, and ThinLTO. `-order_file` is used in the linker when it only knows linkage-names, so I don't think it makes sense to feed it mangled names.
I think it makes more sense to use linkage-names for IRPGO, -order_file, and ThinLTO. -order_file is used in the linker when it only knows linkage-names, so I don't think it makes sense to feed it mangled names.
Thanks for the input.
Using linkage-name would be a non-trivial change, given that the static `getGlobalIdentifier` currently takes a stringified name, and using linkage-name requires a compatible change at each callsite (e.g., if the caller context doesn't have a `GlobalValue` but just stringified names in the bitcode, we need to make sure the linkage-name exists in the bitcode; this seems to be one example). An alternative fixup is to store the MD5 of `[filename;]linkage-name` (this is what's currently stored in this field) and the MD5 of `[filename:]mangled-name` (the original hash before D156569) in the raw profiles, so `llvm-profdata` could choose properly (the former for ordering and the latter for ICP).
@ellishg Would you mind if I create a GitHub issue to track how to fix other potential cases and assign it to you? This PR would solve the colon and semicolon difference.
Thanks for giving the thumb-up Ellis! Just created #74565.
EOF

# Creates lib.cc. global_func might call one of two indirect callees. Both
# indirect callees have internal linkage.
Are we now missing testing of ThinLTO ICP importing for a global function? What if one of the functions was left with non-local linkage like before?
good catch. I changed 'callee1' to external linkage in this script and the test IRs.
Using the same added ICP test, profile matching on local-linkage However, when trying to follow the counter matching code path, I came across another place(
Can you clarify what you are saying here - is it working or not working?
It's working. The local-linkage David says the itanium remapper file was only used once during gcc to llvm transition, so not relevant here.
Just noticed there is a merge conflict now. Will update my fork and merge.
I believe it was actually for the libstdc++ to libc++ transition (see https://reviews.llvm.org/D51247 and https://reviews.llvm.org/D51240). If it is broken we'll at least want to add a FIXME there.
Yes, I meant the libstdc++ to libc++ transition. Which source line is this comment addressing? I'll take a look at the changes/comments there.
Sorry for the misinformation, and thanks for the Phab links. I think the itanium remapper needs a

// Test should fail where linkage-name and mangled-name diverges, see issue https://github.com/llvm/llvm-project/issues/74565).
// Currently, this name divergence happens on Mach-O object file format, or on
// many (but not all) 32-bit Windows systems.
Some more details regarding this statement:
1. The global prefix is `_` for {`MachO`, `WinCOFFX86`} and `\0` (i.e., no prefix) for the rest of the mangling modes (source code).
   - The global prefix is used by the IR Mangler in global symbols (e.g., the prefix is emitted if it's not `\0`).
2. In the data-layout string, `m:o` specifies `MachO`, `m:x` specifies `WinCOFFX86`, and `m:e` specifies `ELF` (parser source code).
With 1 and 2, note that not all 32-bit Windows targets use the `WinCOFFX86` mangling mode. For instance, one windows-32 target specifies `m:e`, and some [1] windows-32 targets choose between `m:e` and one of {`m:x`, `m:o`}.
[1] example 1 chooses between `m:x` and `m:e`, and example 2 chooses between `m:o` and `m:e`.
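To make the prefix rule above concrete, here is a minimal, self-contained sketch. It is not LLVM's actual code: the `ManglingMode` enum and the helpers `parseManglingSpec`, `globalPrefix`, and `linkageName` are hypothetical names introduced only for illustration, and the sketch covers only the three mangling modes named in this comment. The real Mangler also considers linkage, calling convention, and private-label prefixes, which are ignored here.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the three mangling modes discussed above;
// this is NOT LLVM's real enum.
enum class ManglingMode { ELF, MachO, WinCOFFX86 };

// Maps the data-layout mangling component to a mode:
// "m:e" -> ELF, "m:o" -> MachO, "m:x" -> WinCOFFX86.
std::optional<ManglingMode> parseManglingSpec(const std::string &Spec) {
  if (Spec == "m:e") return ManglingMode::ELF;
  if (Spec == "m:o") return ManglingMode::MachO;
  if (Spec == "m:x") return ManglingMode::WinCOFFX86;
  return std::nullopt; // other modes are out of scope for this sketch
}

// '_' for MachO and WinCOFFX86, '\0' (no prefix) otherwise.
char globalPrefix(ManglingMode Mode) {
  switch (Mode) {
  case ManglingMode::MachO:
  case ManglingMode::WinCOFFX86:
    return '_';
  case ManglingMode::ELF:
    return '\0';
  }
  return '\0';
}

// Simplified linkage-name: the IR name with the global prefix prepended
// when the prefix is not '\0'.
std::string linkageName(const std::string &IRName, ManglingMode Mode) {
  char Prefix = globalPrefix(Mode);
  return Prefix == '\0' ? IRName : std::string(1, Prefix) + IRName;
}
```

Under these assumptions, the same IR name `_ZL7callee0v` stays unchanged for `m:e` but becomes `__ZL7callee0v` for `m:o` or `m:x`, which is the divergence between mangled-name and linkage-name that the test comment refers to.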
LGTM once these changes are made.
- unused `%clangxx_pgouse=`
- profraw => proftext for the PGOProfile test
- lld-available -fuse-ld=lld and windows triple simplification for the compiler-rt test
It seems useful to wait for others' opinions as well.
// IR-NEXT: %1 = load ptr, ptr getelementptr inbounds ([2 x ptr], ptr @calleeAddrs,
// IR-NEXT: tail call void %1(), !prof ![[#PROF2:]]

// The GUID of indirect callee is the MD5 hash of `/path/to/lib.cpp:_ZL7callee0v`
should this have the semicolon separator not a colon?
good catch, done.
llvm/lib/ProfileData/InstrProf.cpp
Outdated
@@ -352,6 +366,8 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
return getIRPGOObjectName(F, InLTO, getPGOFuncNameMetadata(F));
}

// DEPRECATED. Use `getIRPGOFuncName`for new code. See that function for
In the header it is described as for FE instrumentation. Probably want the comments to be consistent?
done.
LGTM other than a couple of minor comments and pending resolution of the LLVM IR test. Thanks!
Resolved comments except the pending discussion about whether to use `.proftext` or `.profraw`.
Added `REQUIRES: zlib` for the LLVM IR test since the profile reader should be built with zlib support.
I'll probably spend some time getting this test running on my laptop (I haven't tried to build llvm on mac before) while waiting for more feedback. I'm thinking of submitting it on Thursday or Friday. @ellishg I think the added compiler-rt test (on macosx) should be a test case for issue 74565.
for disassembled IR. - Currently, assume dso_local is true on ELF and COFF according to https://github.com/llvm/llvm-project/blob/4f1ddf7523c0bbb4075b1682dbe2278080642eee/clang/lib/CodeGen/CodeGenModule.cpp#L1528 - Running the compiler-rt test on mac would fail on `dso_local` without this. The test should fail on imports of indirect callees as well as finding `PGONameMetadata` for external functions.
The test failed on mac, but initially on trying to find
After making the
|
…use semicolon as delimiter for local-linkage varibles." (#75835) Reverts #74008 The compiler-rt test failed due to `llvm-dis` not found (https://lab.llvm.org/buildbot/#/builders/127/builds/59884) Will revert and investigate how to require the proper dependency.
Change the format of IRPGO counter names to `[<filepath>;]<mangled-name>` which is computed by `GlobalValue::getGlobalIdentifier()` to fix #74565. In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is basically `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O (yes, it is confusing that `<linkage-name>` is computed with `Mangler().getNameWithPrefix()` while `<mangled-name>` is just `F.getName()`). We discovered in #74565 that this causes some missed import issues on some targets and #74008 is a partial fix. Since `<mangled-name>` may not match the `<linkage-name>` on some targets like Mach-O, we will need to post-process the output of `llvm-profdata order` before passing to the linker via `-order_file`. Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch landed after the LLVM 18 cut which hasn't been released yet.
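The `[<filepath>;]<mangled-name>` format described above can be sketched as follows. This is a simplified illustration, not the signature of `GlobalValue::getGlobalIdentifier()`: the helper name `makeGlobalIdentifier` and its parameters are assumptions made for this example. The `<unknown>` fallback for an empty file name follows the snippet quoted earlier in this review.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the quoted name format; it is not LLVM's
// actual API. Only local-linkage symbols get the "<filepath>;" prefix.
std::string makeGlobalIdentifier(const std::string &Name, bool IsLocalLinkage,
                                 const std::string &FileName,
                                 char Delimiter = ';') {
  std::string Id;
  if (IsLocalLinkage) {
    // Fall back to "<unknown>" when no file name is available.
    Id += FileName.empty() ? "<unknown>" : FileName;
    Id += Delimiter;
  }
  Id += Name;
  return Id;
}
```

For example, a local-linkage `_ZL7callee0v` defined in `lib.cpp` would be identified as `lib.cpp;_ZL7callee0v`, while an external function keeps its bare name. Passing `':'` as the delimiter reproduces the older form for comparison.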
In commit fe05193 (phab D156569), IRPGO names use the format `[<filepath>;]<linkage-name>`, while the prior format is `[<filepath>:]<mangled-name>`. The format change would break the use case demonstrated in (updated) `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` and `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp`.
This patch changes `GlobalValue::getGlobalIdentifier` to use the semicolon.
To elaborate on how things break without this PR:
In the `pgo-instr-use` thin-lto prelink pipeline, the MD5 hash of IRPGO names will be annotated as value profiles and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.
- `compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp` is added to have an end-to-end test.
- `llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll` is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors).
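The failure mode described in this summary can be illustrated with a toy lookup. This is only a sketch under stated assumptions: the real pipeline keys on MD5-derived GUIDs, whereas this example keys directly on the identifier string (two different strings stand in for two different hashes). `SummaryIndex`, `localName`, and `canImport` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Toy stand-in for the ThinLTO summary: the set of identifiers known to the
// importer. Real code uses GUIDs derived from MD5 hashes of these names.
using SummaryIndex = std::unordered_set<std::string>;

// Builds the identifier of a local-linkage symbol with the given delimiter.
std::string localName(const std::string &File, const std::string &Name,
                      char Delim) {
  return File + Delim + Name;
}

// A callee can be imported only if the identifier annotated in the value
// profile matches an identifier in the summary.
bool canImport(const SummaryIndex &Index, const std::string &ProfiledCallee) {
  return Index.count(ProfiledCallee) != 0;
}
```

With a summary built from `lib.cpp:_ZL7callee0v` (colon) and a profile annotated with `lib.cpp;_ZL7callee0v` (semicolon), `canImport` returns false. Once both sides use the semicolon, the lookup succeeds, which is the consistency this patch aims for.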