Skip to content

[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. #74008

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 18, 2023

Conversation

mingmingl-llvm
Copy link
Contributor

@mingmingl-llvm mingmingl-llvm commented Dec 1, 2023

Commit fe05193 (phab D156569), IRPGO names uses format [<filepath>;]<linkage-name> while prior format is [<filepath>:<mangled-name>. The format change would break the use case demonstrated in (updated)
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll and compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp

This patch changes GlobalValues::getGlobalIdentifer to use the semicolon.

To elaborate on the scenario how things break without this PR

  1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. The NameRef field in per-function profile data is the MD5 hash of IRPGO names.
  2. When raw profiles are converted to indexed format profiles, the profiled address is mapped to the MD5 hash of the callee.
  3. In pgo-instr-use thin-lto prelink pipeline, MD5 hash of IRPGO names will be annotated as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.
  • compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp is added to have an end-to-end test.
  • llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll is updated to have better test coverage from another aspect (as runtime tests are more sensitive to the environment and may be skipped by some contributors)

…colon as delimiter for local-linkage varibles.

Commit fe05193 (phab D156569), IRPGO names uses format
'[<filepath>;]<linkage-name>' while prior format is
[<filepath>:<linkage-name>'. The format change would break the use caes
demonstrated in (updated)
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll

This patch changes GlobalValues::getGlobalIdentifer to use the
semicolon.

To elaborate on the scenario how things break without this PR
1. IRPGO raw profiles stores (compressed) IRPGO names of functions in
   one section, and per-function profile data in another section. One
   field in per-function profile data is the MD5 hash of IRPGO names.
2. When raw profiles are converted to indexed format profiles, the
   profiled address is mapped to the MD5 hash of the callee.
3. In thin-lto prelink pipeline, MD5 hash of IRPGO names will be
   annotated as value profiles, and used to import indirect-call-prom
   candidates. If the annotated MD5 hash is computed from the new format
   while import uses the prior format, the callee cannot be imported.

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercise the following path
- Annotate raw profiles and generate import summaries. Using the
  imported summaries, it tests that functions are correctly imported and
  ICP transformations happened.
@llvmbot llvmbot added PGO Profile Guided Optimizations LTO Link time optimization (regular/full LTO or ThinLTO) llvm:ir llvm:transforms labels Dec 1, 2023
@llvmbot
Copy link
Member

llvmbot commented Dec 1, 2023

@llvm/pr-subscribers-clang-codegen
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-lto

Author: Mingming Liu (minglotus-6)

Changes

Commit fe05193 (phab D156569), IRPGO names uses format [&lt;filepath&gt;;]&lt;linkage-name&gt; while prior format is [&lt;filepath&gt;:&lt;linkage-name&gt;. The format change would break the use caes demonstrated in (updated)
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll

This patch changes GlobalValues::getGlobalIdentifer to use the semicolon.

To elaborate on the scenario how things break without this PR

  1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. One field in per-function profile data is the MD5 hash of IRPGO names.
  2. When raw profiles are converted to indexed format profiles, the profiled address is mapped to the MD5 hash of the callee.
  3. In thin-lto prelink pipeline, MD5 hash of IRPGO names will be annotated as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercise the following path

  • Annotate raw profiles, turn-off ICP transformations in the thin-lto prelink pipeline and generate import summaries. Using the imported summaries, it tests that functions are correctly imported and ICP transformations happened.

Patch is 28.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74008.diff

12 Files Affected:

  • (modified) llvm/lib/IR/Globals.cpp (+2-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+20-8)
  • (modified) llvm/test/Bitcode/thinlto-function-summary-originalnames.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-basic.ll (+13-13)
  • (modified) llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll (+7-5)
  • (modified) llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-indirectcall.ll (+16-16)
  • (modified) llvm/test/ThinLTO/X86/memprof-inlined.ll (+7-7)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata ()
  • (modified) llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll (+27-5)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh (+70)
  • (modified) llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll (+26-26)
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
     // that it will stay the same, e.g., if the files are checked out from
     // version control in different locations.
     if (FileName.empty())
-      NewName = NewName.insert(0, "<unknown>:");
+      NewName = NewName.insert(0, "<unknown>;");
     else
-      NewName = NewName.insert(0, FileName.str() + ":");
+      NewName = NewName.insert(0, FileName.str() + ";");
   }
   return NewName;
 }
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
 
 char InstrProfError::ID = 0;
 
-std::string getPGOFuncName(StringRef RawFuncName,
-                           GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
                            StringRef FileName,
                            uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
-  return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+  // Value names may be prefixed with a binary '1' to indicate
+  // that the backend should not modify the symbols due to any platform
+  // naming convention. Do not include that '1' in the PGO profile name.
+  if (Name[0] == '\1')
+    Name = Name.substr(1);
+
+  std::string NewName = std::string(Name);
+  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+    // For local symbols, prepend the main file name to distinguish them.
+    // Do not include the full path in the file name since there's no guarantee
+    // that it will stay the same, e.g., if the files are checked out from
+    // version control in different locations.
+    if (FileName.empty())
+      NewName = NewName.insert(0, "<unknown>:");
+    else
+      NewName = NewName.insert(0, FileName.str() + ":");
+  }
+  return NewName;
 }
 
 // Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
                             GlobalValue::LinkageTypes Linkage,
                             StringRef FileName) {
   SmallString<64> Name;
-  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
-    Name.append(FileName.empty() ? "<unknown>" : FileName);
-    Name.append(";");
-  }
   Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return Name.str().str();
+  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
 }
 
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
 ; COMBINED:       <GLOBALVAL_SUMMARY_BLOCK
 ; COMBINED-NEXT:    <VERSION
 ; COMBINED-NEXT:    <FLAGS
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-8118561185538785069/>
 ; COMBINED-DAG:    <COMBINED{{ }}
 ; COMBINED-DAG:    <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
 ; COMBINED-DAG:    <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAZ2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[FOO2]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clone of [[FOO]]
 
 ; DUMP: Node [[BAZ2]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
 ; DOTCLONED: }
 
 
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
 ; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
 ; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
 
 
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
 
 ; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 6
 ; DUMP: 	CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[C]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 2 5
 ; DUMP: 	CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[B]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
 ;; This is the node synthesized for the call to bar in foo that was created
 ;; by inlining baz into foo.
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
+; DUMP: 	Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:         Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP:         Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP:         Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Dec 1, 2023

@llvm/pr-subscribers-pgo

Author: Mingming Liu (minglotus-6)

Changes

Commit fe05193 (phab D156569), IRPGO names uses format [&lt;filepath&gt;;]&lt;linkage-name&gt; while prior format is [&lt;filepath&gt;:&lt;linkage-name&gt;. The format change would break the use caes demonstrated in (updated)
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll

This patch changes GlobalValues::getGlobalIdentifer to use the semicolon.

To elaborate on the scenario how things break without this PR

  1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. One field in per-function profile data is the MD5 hash of IRPGO names.
  2. When raw profiles are converted to indexed format profiles, the profiled address is mapped to the MD5 hash of the callee.
  3. In thin-lto prelink pipeline, MD5 hash of IRPGO names will be annotated as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercise the following path

  • Annotate raw profiles, turn-off ICP transformations in the thin-lto prelink pipeline and generate import summaries. Using the imported summaries, it tests that functions are correctly imported and ICP transformations happened.

Patch is 28.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74008.diff

12 Files Affected:

  • (modified) llvm/lib/IR/Globals.cpp (+2-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+20-8)
  • (modified) llvm/test/Bitcode/thinlto-function-summary-originalnames.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-basic.ll (+13-13)
  • (modified) llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll (+7-5)
  • (modified) llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-indirectcall.ll (+16-16)
  • (modified) llvm/test/ThinLTO/X86/memprof-inlined.ll (+7-7)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata ()
  • (modified) llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll (+27-5)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh (+70)
  • (modified) llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll (+26-26)
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
     // that it will stay the same, e.g., if the files are checked out from
     // version control in different locations.
     if (FileName.empty())
-      NewName = NewName.insert(0, "<unknown>:");
+      NewName = NewName.insert(0, "<unknown>;");
     else
-      NewName = NewName.insert(0, FileName.str() + ":");
+      NewName = NewName.insert(0, FileName.str() + ";");
   }
   return NewName;
 }
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
 
 char InstrProfError::ID = 0;
 
-std::string getPGOFuncName(StringRef RawFuncName,
-                           GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
                            StringRef FileName,
                            uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
-  return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+  // Value names may be prefixed with a binary '1' to indicate
+  // that the backend should not modify the symbols due to any platform
+  // naming convention. Do not include that '1' in the PGO profile name.
+  if (Name[0] == '\1')
+    Name = Name.substr(1);
+
+  std::string NewName = std::string(Name);
+  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+    // For local symbols, prepend the main file name to distinguish them.
+    // Do not include the full path in the file name since there's no guarantee
+    // that it will stay the same, e.g., if the files are checked out from
+    // version control in different locations.
+    if (FileName.empty())
+      NewName = NewName.insert(0, "<unknown>:");
+    else
+      NewName = NewName.insert(0, FileName.str() + ":");
+  }
+  return NewName;
 }
 
 // Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
                             GlobalValue::LinkageTypes Linkage,
                             StringRef FileName) {
   SmallString<64> Name;
-  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
-    Name.append(FileName.empty() ? "<unknown>" : FileName);
-    Name.append(";");
-  }
   Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return Name.str().str();
+  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
 }
 
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
 ; COMBINED:       <GLOBALVAL_SUMMARY_BLOCK
 ; COMBINED-NEXT:    <VERSION
 ; COMBINED-NEXT:    <FLAGS
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-8118561185538785069/>
 ; COMBINED-DAG:    <COMBINED{{ }}
 ; COMBINED-DAG:    <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
 ; COMBINED-DAG:    <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAZ2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[FOO2]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clone of [[FOO]]
 
 ; DUMP: Node [[BAZ2]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
 ; DOTCLONED: }
 
 
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
 ; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
 ; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
 
 
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
 
 ; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 6
 ; DUMP: 	CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[C]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 2 5
 ; DUMP: 	CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[B]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
 ;; This is the node synthesized for the call to bar in foo that was created
 ;; by inlining baz into foo.
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
+; DUMP: 	Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:         Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP:         Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP:         Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Dec 1, 2023

@llvm/pr-subscribers-llvm-transforms

Author: Mingming Liu (minglotus-6)

Changes

Commit fe05193 (phab D156569), IRPGO names uses format [&lt;filepath&gt;;]&lt;linkage-name&gt; while prior format is [&lt;filepath&gt;:&lt;linkage-name&gt;. The format change would break the use caes demonstrated in (updated)
llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll

This patch changes GlobalValues::getGlobalIdentifer to use the semicolon.

To elaborate on the scenario how things break without this PR

  1. IRPGO raw profiles stores (compressed) IRPGO names of functions in one section, and per-function profile data in another section. One field in per-function profile data is the MD5 hash of IRPGO names.
  2. When raw profiles are converted to indexed format profiles, the profiled address is mapped to the MD5 hash of the callee.
  3. In thin-lto prelink pipeline, MD5 hash of IRPGO names will be annotated as value profiles, and used to import indirect-call-prom candidates. If the annotated MD5 hash is computed from the new format while import uses the prior format, the callee cannot be imported.

The updated test case Transforms/PGOProfile/thinlto_indirect_call_promotion.ll exercise the following path

  • Annotate raw profiles, turn-off ICP transformations in the thin-lto prelink pipeline and generate import summaries. Using the imported summaries, it tests that functions are correctly imported and ICP transformations happened.

Patch is 28.48 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/74008.diff

12 Files Affected:

  • (modified) llvm/lib/IR/Globals.cpp (+2-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+20-8)
  • (modified) llvm/test/Bitcode/thinlto-function-summary-originalnames.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-basic.ll (+13-13)
  • (modified) llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll (+7-5)
  • (modified) llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll (+3-3)
  • (modified) llvm/test/ThinLTO/X86/memprof-indirectcall.ll (+16-16)
  • (modified) llvm/test/ThinLTO/X86/memprof-inlined.ll (+7-7)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/thinlto_icall_prom.profdata ()
  • (modified) llvm/test/Transforms/PGOProfile/Inputs/thinlto_indirect_call_promotion.ll (+27-5)
  • (added) llvm/test/Transforms/PGOProfile/Inputs/update_icall_promotion_inputs.sh (+70)
  • (modified) llvm/test/Transforms/PGOProfile/thinlto_indirect_call_promotion.ll (+26-26)
diff --git a/llvm/lib/IR/Globals.cpp b/llvm/lib/IR/Globals.cpp
index 7bd4503a689e4ae..e821de3b198f1b6 100644
--- a/llvm/lib/IR/Globals.cpp
+++ b/llvm/lib/IR/Globals.cpp
@@ -158,9 +158,9 @@ std::string GlobalValue::getGlobalIdentifier(StringRef Name,
     // that it will stay the same, e.g., if the files are checked out from
     // version control in different locations.
     if (FileName.empty())
-      NewName = NewName.insert(0, "<unknown>:");
+      NewName = NewName.insert(0, "<unknown>;");
     else
-      NewName = NewName.insert(0, FileName.str() + ":");
+      NewName = NewName.insert(0, FileName.str() + ";");
   }
   return NewName;
 }
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 236b083a1e2155b..d9ad5c8b6f6838d 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {
 
 char InstrProfError::ID = 0;
 
-std::string getPGOFuncName(StringRef RawFuncName,
-                           GlobalValue::LinkageTypes Linkage,
+std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
                            StringRef FileName,
                            uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
-  return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
+  // Value names may be prefixed with a binary '1' to indicate
+  // that the backend should not modify the symbols due to any platform
+  // naming convention. Do not include that '1' in the PGO profile name.
+  if (Name[0] == '\1')
+    Name = Name.substr(1);
+
+  std::string NewName = std::string(Name);
+  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
+    // For local symbols, prepend the main file name to distinguish them.
+    // Do not include the full path in the file name since there's no guarantee
+    // that it will stay the same, e.g., if the files are checked out from
+    // version control in different locations.
+    if (FileName.empty())
+      NewName = NewName.insert(0, "<unknown>:");
+    else
+      NewName = NewName.insert(0, FileName.str() + ":");
+  }
+  return NewName;
 }
 
 // Strip NumPrefix level of directory name from PathNameStr. If the number of
@@ -300,12 +316,8 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
                             GlobalValue::LinkageTypes Linkage,
                             StringRef FileName) {
   SmallString<64> Name;
-  if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
-    Name.append(FileName.empty() ? "<unknown>" : FileName);
-    Name.append(";");
-  }
   Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
-  return Name.str().str();
+  return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
 }
 
 static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
diff --git a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
index 4d840d1f8ec8dda..24bb2a4efff509b 100644
--- a/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
+++ b/llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
@@ -6,9 +6,9 @@
 ; COMBINED:       <GLOBALVAL_SUMMARY_BLOCK
 ; COMBINED-NEXT:    <VERSION
 ; COMBINED-NEXT:    <FLAGS
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4947176790635855146/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-6591587165810580810/>
-; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-4377693495213223786/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=686735765308251824/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=4507502870619175775/>
+; COMBINED-NEXT:    <VALUE_GUID {{.*}} op1=-8118561185538785069/>
 ; COMBINED-DAG:    <COMBINED{{ }}
 ; COMBINED-DAG:    <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
 ; COMBINED-DAG:    <COMBINED_GLOBALVAR_INIT_REFS
diff --git a/llvm/test/ThinLTO/X86/memprof-basic.ll b/llvm/test/ThinLTO/X86/memprof-basic.ll
index 0d466830ba57d62..54e01e5fcdf9555 100644
--- a/llvm/test/ThinLTO/X86/memprof-basic.ll
+++ b/llvm/test/ThinLTO/X86/memprof-basic.ll
@@ -148,7 +148,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[BAZ:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -157,7 +157,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAZ]] to Caller: [[FOO:0x[a-z0-9]+]] AllocTypes: NotColdCold ContextIds: 1 2
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -167,7 +167,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -175,7 +175,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -197,7 +197,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[BAZ]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -207,7 +207,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[BAZ2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -217,7 +217,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 0     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 6731117468105397038 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
+; DUMP: 	Callee: 8107868197919466657 (_Z3foov) Clones: 0 StackIds: 1     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[FOO2]]
-; DUMP: 	Callee: 5878270615442837395 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
+; DUMP: 	Callee: 1807954217441101578 (_Z3bazv) Clones: 0 StackIds: 3    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clone of [[FOO]]
 
 ; DUMP: Node [[BAZ2]]
-; DUMP: 	Callee: 9832687305761716512 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
+; DUMP: 	Callee: 11481133863268513686 (_Z3barv) Clones: 0 StackIds: 2    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -344,7 +344,7 @@ attributes #0 = { noinline optnone }
 ; DOTCLONED: }
 
 
-; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 5878270615442837395, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
-; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 6731117468105397038, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
-; DISTRIB: ^[[BAR]] = gv: (guid: 9832687305761716512, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[BAZ:[0-9]+]] = gv: (guid: 1807954217441101578, {{.*}} callsites: ((callee: ^[[BAR:[0-9]+]], clones: (0, 1)
+; DISTRIB: ^[[FOO:[0-9]+]] = gv: (guid: 8107868197919466657, {{.*}} callsites: ((callee: ^[[BAZ]], clones: (0, 1)
+; DISTRIB: ^[[BAR]] = gv: (guid: 11481133863268513686, {{.*}} allocs: ((versions: (notcold, cold)
 ; DISTRIB: ^[[MAIN:[0-9]+]] = gv: (guid: 15822663052811949562, {{.*}} callsites: ((callee: ^[[FOO]], clones: (0), {{.*}} (callee: ^[[FOO]], clones: (1)
diff --git a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
index f7ba0d27dca78a7..7a0b4a36dbad4dd 100644
--- a/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
+++ b/llvm/test/ThinLTO/X86/memprof-duplicate-context-ids.ll
@@ -260,8 +260,10 @@ attributes #0 = { noinline optnone}
 ; STATS-BE: 1 memprof-context-disambiguation - Number of original (not cloned) allocations with memprof profiles during ThinLTO backend
 
 
-; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 1643923691937891493, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
-; DISTRIB: ^[[D]] = gv: (guid: 4881081444663423788, {{.*}} allocs: ((versions: (notcold, cold)
-; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 14590037969532473829, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
-; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 17035303613541779335, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
-; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 17820708772846654376, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[E:[0-9]+]] = gv: (guid: 331966645857188136, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+; DISTRIB: ^[[D]] = gv: (guid: 11079124245221721799, {{.*}} allocs: ((versions: (notcold, cold)
+; DISTRIB: ^[[F:[0-9]+]] = gv: (guid: 11254287701717398916, {{.*}} callsites: ((callee: ^[[D]], clones: (0)
+; DISTRIB: ^[[B:[0-9]+]] = gv: (guid: 13579056193435805313, {{.*}} callsites: ((callee: ^[[D]], clones: (1)
+; DISTRIB: ^[[C:[0-9]+]] = gv: (guid: 15101436305866936160, {{.*}} callsites: ((callee: ^[[D:[0-9]+]], clones: (1)
+
+
diff --git a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
index 9a72ae43b2f1e48..f1a494d077fefca 100644
--- a/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
+++ b/llvm/test/ThinLTO/X86/memprof-funcassigncloning.ll
@@ -176,7 +176,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	Clones: [[ENEW1CLONE:0x[a-z0-9]+]]
 
 ; DUMP: Node [[D:0x[a-z0-9]+]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 0 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 6
 ; DUMP: 	CalleeEdges:
@@ -185,7 +185,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[C]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 1 (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 2 5
 ; DUMP: 	CalleeEdges:
@@ -194,7 +194,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[B]]
-; DUMP: 	Callee: 10758063066234039248 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
+; DUMP: 	Callee: 16147627620923572899 (_Z1EPPcS0_) Clones: 0 StackIds: 2 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
index 76273959f4f4ac8..07a52f441ca2783 100644
--- a/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
+++ b/llvm/test/ThinLTO/X86/memprof-indirectcall.ll
@@ -202,7 +202,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 6
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -225,7 +225,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6:0x[a-z0-9]+]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -233,7 +233,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -241,7 +241,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -249,7 +249,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -258,7 +258,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -266,7 +266,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -274,7 +274,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
@@ -302,7 +302,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:		Clones: [[FOO2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[AX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 6    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 1 2
 ; DUMP: 	CalleeEdges:
@@ -324,7 +324,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BAR]] to Caller: [[MAIN6]] AllocTypes: NotCold ContextIds: 5
 
 ; DUMP: Node [[MAIN3]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 4   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1
 ; DUMP: 	CalleeEdges:
@@ -332,7 +332,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN4]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 5   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2
 ; DUMP: 	CalleeEdges:
@@ -340,7 +340,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 0    (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3
 ; DUMP: 	CalleeEdges:
@@ -348,7 +348,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[BX]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 7    (clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 4 5
 ; DUMP: 	CalleeEdges:
@@ -357,7 +357,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[BX]] to Caller: [[BAR]] AllocTypes: NotColdCold ContextIds: 4 5
 
 ; DUMP: Node [[MAIN5]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 2   (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 4
 ; DUMP: 	CalleeEdges:
@@ -365,7 +365,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN6]]
-; DUMP: 	Callee: 4095956691517954349 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
+; DUMP: 	Callee: 2040285415115148168 (_Z3barP1A) Clones: 0 StackIds: 3   (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 5
 ; DUMP: 	CalleeEdges:
@@ -373,7 +373,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 12914368124089294956 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
+; DUMP: 	Callee: 15844184524768596045 (_Z3foov) Clones: 0 StackIds: 1    (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 6
 ; DUMP: 	CalleeEdges:
diff --git a/llvm/test/ThinLTO/X86/memprof-inlined.ll b/llvm/test/ThinLTO/X86/memprof-inlined.ll
index feb9c94344223c9..89df345b2204239 100644
--- a/llvm/test/ThinLTO/X86/memprof-inlined.ll
+++ b/llvm/test/ThinLTO/X86/memprof-inlined.ll
@@ -170,7 +170,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2:0x[a-z0-9]+]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2	(clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -179,7 +179,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP: 	Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
+; DUMP: 	Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3	(clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -201,7 +201,7 @@ attributes #0 = { noinline optnone }
 ;; This is the node synthesized for the call to bar in foo that was created
 ;; by inlining baz into foo.
 ; DUMP: Node [[FOO]]
-; DUMP: 	Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
+; DUMP: 	Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1	(clone 0)
 ; DUMP: 	AllocTypes: NotColdCold
 ; DUMP: 	ContextIds: 3 4
 ; DUMP: 	CalleeEdges:
@@ -234,7 +234,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 		Edge from Callee [[FOO2]] to Caller: [[MAIN2]] AllocTypes: Cold ContextIds: 2
 
 ; DUMP: Node [[MAIN1]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 2     (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 1 3
 ; DUMP: 	CalleeEdges:
@@ -243,7 +243,7 @@ attributes #0 = { noinline optnone }
 ; DUMP: 	CallerEdges:
 
 ; DUMP: Node [[MAIN2]]
-; DUMP:         Callee: 2229562716906371625 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
+; DUMP:         Callee: 644169328058379925 (_Z3foov) Clones: 0 StackIds: 3     (clone 0)
 ; DUMP: 	AllocTypes: Cold
 ; DUMP: 	ContextIds: 2 4
 ; DUMP: 	CalleeEdges:
@@ -264,7 +264,7 @@ attributes #0 = { noinline optnone }
 ; DUMP:         Clones: [[BAR2:0x[a-z0-9]+]]
 
 ; DUMP: Node [[FOO]]
-; DUMP:         Callee: 16064618363798697104 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
+; DUMP:         Callee: 10349908617508457487 (_Z3barv) Clones: 0 StackIds: 0, 1 (clone 0)
 ; DUMP: 	AllocTypes: NotCold
 ; DUMP: 	ContextIds: 3...
[truncated]

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! A few comments below.

@@ -246,11 +246,27 @@ std::string InstrProfError::message() const {

char InstrProfError::ID = 0;

std::string getPGOFuncName(StringRef RawFuncName,
GlobalValue::LinkageTypes Linkage,
std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to keep the original name, probably want to at least leave big comments in the header not to use it as it is legacy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also change the function name to getLegacyPGOFuncName

Copy link
Contributor Author

@mingmingl-llvm mingmingl-llvm Dec 1, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slightly prefer to use (simpler) getPGOFuncName for new ; format and use getLegacyPGOFuncName for : format, and chose not to do all the rename (want an NFC patch if rename is the way to go, since there are many existing callers).

I updated header and cpp, and did a minimal rename (this function has 3 callers, one in clang directory and another two in InstrProf.cpp)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this resolved? It looks like the most recent version of the patch changes getPGOFuncName to getLegacyPGOFuncName for the old format, and uses getIRPGOFuncName for the new format.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't rename getPGOFuncName() in https://reviews.llvm.org/D156569 because Swift and Clang FE-PGO relied on it. Is that still the case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed the fact that one getPGOFuncName interface was left. I am only seeing 2 invocations of that interface outside of the unittest test. I would be in favor of doing renames of both getPGOFuncName interfaces in one patch. A separate NFC patch is a fine option, and keeps this patch just about fixing the ICP breakage caused by the delimiter change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that still the case?

Clang FE-PGO calls this function, but afaik it's used for coverage rather than performance nowadays. I guess if Clang FE-PGO continues to use this function it shouldn't be called legacy in its name.

I'm wondering if it makes more sense if I rename getPGOFuncName (taking stringified names) to getClangPGOFuncName in this PR? Thoughts?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on whether it is intentional that Clang (and Swift apparently?) use the old name.

@ellishg was there a reason to leave that as is?

If they must use the old one then maybe getFEPGOFuncName. If they should be transitioned eventually then using "Legacy" makes sense.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I missed the fact that one getPGOFuncName interface was left. I am only seeing 2 invocations of that interface outside of the unittest test. I would be in favor of doing renames of both getPGOFuncName interfaces in one patch. A separate NFC patch is a fine option, and keeps this patch just about fixing the ICP breakage caused by the delimiter change.

got it. I'll probably un-do the rename then. We can figure out about names in a follow-up patch.

I might need to tweak my Github notifications a little bit. Right now I cannot see real-time updates when I opened a PR and replied on one comment. Sorry for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undo the rename.

I didn't find a way to configure the Github notifications for real-time comments. Guess I'll need to combine email notifications.

if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
Name.append(FileName.empty() ? "<unknown>" : FileName);
Name.append(";");
}
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the Mangler() handling into getGlobalIdentifier(), then remove the handling there of "Name[0] == '\1'" as I believe it is handled by Manger().getNameWithPrefix().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also thought it was feasible initially. Two kinds of rough edges came up when I tried the implementation. (added some comments to document them)
a) getIRPGONameForGlobalObject currently mangles names currently, by passing global values to getNameWithPrefix.getNameWithPrefix with GlobalValue parameter makes use of other information (e.g., linkage, calling-conv of functions, etc). getNameWithPrefix with stringified names as parameter just use names. For parity we might need a getGlobalIdentifier that takes global values (not just stringified names).
b) getGlobalIdentifier gets called in many places, and the \1 handling function in Mangler does more than dropping \1, so embedding Mangler's way of handling \1 might affect all callers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ellishg How much of D156569 relied on the invocation of Mangler? It is not mentioned in the patch description, only the rationale for changing ":" to ";". The problem is if these are out of sync, then cross-module importing of indirectly called local functions will continue to be broken in whatever cases Mangler().getNameWithPrefix affects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D156569 changes the format from [<filepath>:]<function-name> to [<filepath>;]<linkage-name>. Note the change from function-name to linkage-name. Having the mangled name is required so that we can pass a symbol order via -symbol-ordering-file or -order_file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I'm confused about is that the function name is already the mangled name, at least using clang with c++. Is it not for objective C?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid subtle issues when linkage-name is different from mangled names,,I'm wondering if it warrants a change to use linkage-names (as opposed to mangled name) in GlobalValue::getGlobalIdentifier in this PR. Global identifier is supposed to be hash of unique names, and linkage-name is already unique.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't broken in general, but it's needed to get -order_file working correctly.

Unfortunately this change broke aspects of ThinLTO ICP, however. Is it possible to change the handling of -order_file in the linker to modify the symbol names appropriately?

To avoid subtle issues when linkage-name is different from mangled names,,I'm wondering if it warrants a change to use linkage-names (as opposed to mangled name) in GlobalValue::getGlobalIdentifier in this PR.

If the -order_name handling cannot be fixed as I suggested above, then yes, we need some solution to ensure that the hashed symbol names are consistent across PGO and ThinLTO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to use linkage-names for IRPGO, -order_file, and ThinLTO. -order_file is used in the linker when it only knows linkage-names, so I don't think it makes sense to feed it mangled names.

Copy link
Contributor Author

@mingmingl-llvm mingmingl-llvm Dec 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes more sense to use linkage-names for IRPGO, -order_file, and ThinLTO. -order_file is used in the linker when it only knows linkage-names, so I don't think it makes sense to feed it mangled names.

Thanks for the input.

Using linkage-name would be a non-trivial change, given the static getGlobalIdentifier takes a stringified name currently, and using linkage-name means requiring compatible change in each callsite (e.g., if the caller context doesn't have GlobalValue but just stringified names in the bitcode, make sure linkage-name exists in the bitcode, this seems one example). An alternative fixup is to store the MD5 of [filename;]linkage-name (this is what's currently stored in this field) and the MD5 of [filename:]mangled-name (the original hash before D156569) in the raw profiles, so llvm-profdata could choose properly (former for ordering and latter for ICP)

@ellishg Would you mind if I create a Github issue to track how to fix other potential cases and assign it to you ? This PR would solve the colon and semicolon difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving the thumb-up Ellis! Just created #74565.

EOF

# Creates lib.cc. global_func might call one of two indirect callees. Both
# indirect callees have internal linkage.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we now missing testing of ThinLTO ICP importing for a global function? What if one of the functions was left with non-local linkage like before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. I changed 'callee1' to external linkage in this script and the test IRs.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Dec 1, 2023
@nikic nikic removed their request for review December 1, 2023 08:27
@mingmingl-llvm
Copy link
Contributor Author

Using the same added ICP test, profile matching on local-linkage _ZL7callee0v using clang++ -v -fuse-ld=lld -O2 -fprofile-use=thinlto_icall_prom.profdata <file-list> , as this pgo-instr-use output shows.

However, when trying to follow the counter matching code path, I came across another place(

Parts = Parts.second.split(':');
) that might have something to do with colon/semicolon delimiter inside itanium remapper (
StringRef RealName = extractName(FuncName);
). I'm wondering if this needs an update (with updated regression test)? @david-xl @xur-llvm who might have more context.

@teresajohnson
Copy link
Contributor

Using the same added ICP test, profile matching on local-linkage _ZL7callee0v using clang++ -v -fuse-ld=lld -O2 -fprofile-use=thinlto_icall_prom.profdata <file-list> , as this pgo-instr-use output shows.

Can you clarify what you are saying here - is it working or not working?

@mingmingl-llvm
Copy link
Contributor Author

Can you clarify what you are saying here - is it working or not working?

It's working. The local-linkage _ZL7callee0v has function entry count annotated as !prof.

David says the itanium remapper file was only used once during gcc to llvm transition, so not relevant here.

@mingmingl-llvm
Copy link
Contributor Author

mingmingl-llvm commented Dec 6, 2023

Just noticed there is a merge conflict now. Will update my fork and merge.

@teresajohnson
Copy link
Contributor

David says the itanium remapper file was only used once during gcc to llvm transition, so not relevant here.

I believe it was actually for the libstdc++ to libc++ transition (see https://reviews.llvm.org/D51247 and https://reviews.llvm.org/D51240).

If it is broken we'll at least want to add a FIXME there.

@david-xl
Copy link
Contributor

david-xl commented Dec 6, 2023

David says the itanium remapper file was only used once during gcc to llvm transition, so not relevant here.

I believe it was actually for the libstdc++ to libc++ transition (see https://reviews.llvm.org/D51247 and https://reviews.llvm.org/D51240).

If it is broken we'll at least want to add a FIXME there.

Yes, I meant libstdc++ to libc++ transition. Why source line is this comment addressing? I take take a look the changes/comments there.

@mingmingl-llvm
Copy link
Contributor Author

mingmingl-llvm commented Dec 6, 2023

David says the itanium remapper file was only used once during gcc to llvm transition, so not relevant here.

I believe it was actually for the libstdc++ to libc++ transition (see https://reviews.llvm.org/D51247 and https://reviews.llvm.org/D51240).
If it is broken we'll at least want to add a FIXME there.

Yes, I meant libstdc++ to libc++ transition. Why source line is this comment addressing? I take take a look the changes/comments there.

Sorry for the misinformation, and thanks for the Phab links.

I think the itanium remapper needs a : -> ; update (going to update this PR and related tests), since (for local-linkage functions) the function name used to look up profiles should use ; delimiter.


// Test should fail where linkage-name and mangled-name diverges, see issue https://github.com/llvm/llvm-project/issues/74565).
// Currently, this name divergence happens on Mach-O object file format, or on
// many (but not all) 32-bit Windows systems.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more details regarding this statement

  1. The global prefix is _ for {MachO, WinCOFFX86} and \0 (i.e., no prefix) for the rest of mangling modes (source code).
  2. In the data-layout string,m:o specifies MachO, m:x specifies WinCOFFX86, and m:e specifies ELF (parser source code)

With 1 and 2, note not all 32-bit windows uses WinCOFFX86 mangling-mode. For instance, one windows-32 target specifies m:e, and some [1] windows-32 target chooses between m:e and one of {m:x, m:o}

[1] example 1 chooses between m:x and m:e,and example 2 chooses between m:o and m:e

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once these changes are made.

  • unused %clangxx_pgouse=
  • profraw => proftext for the PGOProfile test
  • lld-available -fuse-ld=lld and windows triple simplification for the compiler-rt test

It seems useful to wait for others' opinions as well.

// IR-NEXT: %1 = load ptr, ptr getelementptr inbounds ([2 x ptr], ptr @calleeAddrs,
// IR-NEXT: tail call void %1(), !prof ![[#PROF2:]]

// The GUID of indirect callee is the MD5 hash of `/path/to/lib.cpp:_ZL7callee0v`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this have the semicolon separator not a colon?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, done.

@@ -352,6 +366,8 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
return getIRPGOObjectName(F, InLTO, getPGOFuncNameMetadata(F));
}

// DEPRECATED. Use `getIRPGOFuncName`for new code. See that function for
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the header it is described as for FE instrumentation. Probably want the comments to be consistent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than a couple of minor comments and pending resolution of the LLVM IR test. Thanks!

Copy link
Contributor Author

@mingmingl-llvm mingmingl-llvm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved comments except the pending discussion about whether to use .proftext or.profraw.

Added REQUIRES: zlib for LLVM IR test since the profile reader should be built with zlib support.

I'll probably spend sometime to get this test running on my laptop (haven't tried to build llvm on mac before), while waiting for more feedbacks. I'm thinking of submitting it on Thursday or Friday. @ellishg I think the added compiler-rt test (on macosx) should be a test case for issue 74565.

// IR-NEXT: %1 = load ptr, ptr getelementptr inbounds ([2 x ptr], ptr @calleeAddrs,
// IR-NEXT: tail call void %1(), !prof ![[#PROF2:]]

// The GUID of indirect callee is the MD5 hash of `/path/to/lib.cpp:_ZL7callee0v`
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, done.

@@ -352,6 +366,8 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
return getIRPGOObjectName(F, InLTO, getPGOFuncNameMetadata(F));
}

// DEPRECATED. Use `getIRPGOFuncName`for new code. See that function for
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

for disassembled IR.
- Currently, assume dso_local is true on ELF and COFF according to https://github.com/llvm/llvm-project/blob/4f1ddf7523c0bbb4075b1682dbe2278080642eee/clang/lib/CodeGen/CodeGenModule.cpp#L1528
- Running the compiler-rt test on mac would fail on `dso_local` without this. The test should fail on imports of indirect callees as well as finding `PGONameMetadata` for external functions.
@mingmingl-llvm
Copy link
Contributor Author

mingmingl-llvm commented Dec 14, 2023

Resolved comments except the pending discussion about whether to use .proftext or.profraw.

Added REQUIRES: zlib for LLVM IR test since the profile reader should be built with zlib support.

I'll probably spend sometime to get this test running on my laptop (haven't tried to build llvm on mac before), while waiting for more feedbacks. I'm thinking of submitting it on Thursday or Friday. @ellishg I think the added compiler-rt test (on macosx) should be a test case for issue 74565.

The test failed on mac, but initially on trying to find dso_local of line PGOName: define dso_local void @_Z7callee1v() {{.*}} !prof ![[#]] !PGOFuncName ![[#]] .

  • Debugging with lldb points to this comment, saying only COFF and ELF assumes dso_local.

After making the dso_local optional in the latest commit, the test started to fail due to the difference of <mangled-name> and <linkage-name>.

  1. Test failed, when external func callee1 in expected output shouldn't have !PGOFuncName metadata but the test output has it. Note !PGOFuncName is annotated only if PGOName != Func.getName()
  2. Fixing the first error allows me to see the missed import errors.

@mingmingl-llvm mingmingl-llvm removed clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Dec 14, 2023
@mingmingl-llvm mingmingl-llvm merged commit 245cdda into llvm:main Dec 18, 2023
mingmingl-llvm added a commit that referenced this pull request Dec 18, 2023
…use semicolon as delimiter for local-linkage varibles. (#74008)"

This reverts commit 245cdda.
mingmingl-llvm added a commit that referenced this pull request Dec 18, 2023
…use semicolon as delimiter for local-linkage varibles." (#75835)

Reverts #74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
@mingmingl-llvm mingmingl-llvm deleted the icptest branch December 20, 2023 22:10
ellishg added a commit to ellishg/llvm-project that referenced this pull request Jan 4, 2024
Change the format of IRPGO counter names to `[<filepath>;]<mangled-name>` which is computed by `GlobalValue::getGlobalIdentifier()` to fix llvm#74565.

In fe05193 (https://reviews.llvm.org/D156569) the format of IRPGO counter names was changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O (it is confusing that `<linkage-name>` is computed with `Mangler().getNameWithPrefix()` while `<mangled-name>` is just `F.getName()`). We discovered in llvm#74565 that this causes some missed import issues on some targets and llvm#74008 is a partial fix.

Since `<mangled-name>` may not match the `<linkage-name>` on some targets like Mach-O, we will need to post-process the output of `llvm-profdata order` before passing to the linker via `-order_file`.

Profiles generated after fe05193 will become stale after this diff, but I think this is acceptable since that patch after the LLVM 18 cut which has not been released yet.
ellishg added a commit that referenced this pull request Jan 5, 2024
Change the format of IRPGO counter names to
`[<filepath>;]<mangled-name>` which is computed by
`GlobalValue::getGlobalIdentifier()` to fix #74565.

In fe05193
(https://reviews.llvm.org/D156569) the format of IRPGO counter names was
changed to be `[<filepath>;]<linkage-name>` where `<linkage-name>` is
basically `F.getName()` with some prefix, e.g., `_` or `l_` on Mach-O
(yes, it is confusing that `<linkage-name>` is computed with
`Mangler().getNameWithPrefix()` while `<mangled-name>` is just
`F.getName()`). We discovered in #74565 that this causes some missed
import issues on some targets and #74008 is a partial fix.

Since `<mangled-name>` may not match the `<linkage-name>` on some
targets like Mach-O, we will need to post-process the output of
`llvm-profdata order` before passing to the linker via `-order_file`.

Profiles generated after fe05193 will
become stale after this diff, but I think this is acceptable since that
patch landed after the LLVM 18 cut which hasn't been released yet.
qihangkong pushed a commit to rvgpu/rvgpu-llvm that referenced this pull request Apr 23, 2024
…use semicolon as delimiter for local-linkage varibles." (#75835)

Reverts llvm/llvm-project#74008

The compiler-rt test failed due to `llvm-dis` not found
(https://lab.llvm.org/buildbot/#/builders/127/builds/59884)
Will revert and investigate how to require the proper dependency.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler-rt llvm:ir llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO) PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants