Skip to content

[MemProf] Avoid incorrect ICP symtab canonicalization #115419

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 8, 2024

Conversation

teresajohnson
Copy link
Contributor

ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.

ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.
@llvmbot llvmbot added PGO Profile Guided Optimizations LTO Link time optimization (regular/full LTO or ThinLTO) llvm:transforms labels Nov 8, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Teresa Johnson (teresajohnson)

Changes

ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.


Full diff: https://github.com/llvm/llvm-project/pull/115419.diff

4 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+5-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+10-7)
  • (modified) llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (+11-1)
  • (modified) llvm/test/ThinLTO/X86/memprof-icp.ll (+11)
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index b0b2258735e2ae..c5f7800097807d 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -504,7 +504,8 @@ class InstrProfSymtab {
   // name-set = {PGOFuncName} union {getCanonicalName(PGOFuncName)}
   // - In MD5NameMap: <MD5Hash(name), name> for name in name-set
   // - In MD5FuncMap: <MD5Hash(name), &F> for name in name-set
-  Error addFuncWithName(Function &F, StringRef PGOFuncName);
+  // The canonical name is only added if \c AddCanonical is true.
+  Error addFuncWithName(Function &F, StringRef PGOFuncName, bool AddCanonical);
 
   // Add the vtable into the symbol table, by creating the following
   // map entries:
@@ -560,7 +561,9 @@ class InstrProfSymtab {
   /// decls from module \c M. This interface is used by transformation
   /// passes such as indirect function call promotion. Variable \c InLTO
   /// indicates if this is called from LTO optimization passes.
-  Error create(Module &M, bool InLTO = false);
+  /// A canonical name, removing non-__uniq suffixes, is added if
+  /// \c AddCanonical is true.
+  Error create(Module &M, bool InLTO = false, bool AddCanonical = true);
 
   /// Create InstrProfSymtab from a set of names iteratable from
   /// \p IterRange. This interface is used by IndexedProfReader.
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index b9937c9429b77d..6e0575db26d298 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -483,16 +483,16 @@ GlobalVariable *createPGOFuncNameVar(Function &F, StringRef PGOFuncName) {
   return createPGOFuncNameVar(*F.getParent(), F.getLinkage(), PGOFuncName);
 }
 
-Error InstrProfSymtab::create(Module &M, bool InLTO) {
+Error InstrProfSymtab::create(Module &M, bool InLTO, bool AddCanonical) {
   for (Function &F : M) {
     // Function may not have a name: like using asm("") to overwrite the name.
     // Ignore in this case.
     if (!F.hasName())
       continue;
-    if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO)))
+    if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO), AddCanonical))
       return E;
     // Also use getPGOFuncName() so that we can find records from older profiles
-    if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO)))
+    if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO), AddCanonical))
       return E;
   }
 
@@ -630,7 +630,8 @@ StringRef InstrProfSymtab::getCanonicalName(StringRef PGOName) {
   return PGOName;
 }
 
-Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
+Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName,
+                                       bool AddCanonical) {
   auto NameToGUIDMap = [&](StringRef Name) -> Error {
     if (Error E = addFuncName(Name))
       return E;
@@ -640,9 +641,11 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
   if (Error E = NameToGUIDMap(PGOFuncName))
     return E;
 
-  StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
-  if (CanonicalFuncName != PGOFuncName)
-    return NameToGUIDMap(CanonicalFuncName);
+  if (AddCanonical) {
+    StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
+    if (CanonicalFuncName != PGOFuncName)
+      return NameToGUIDMap(CanonicalFuncName);
+  }
 
   return Error::success();
 }
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index 1a5e75eb42dead..f855037500ee2f 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -4143,7 +4143,17 @@ bool MemProfContextDisambiguation::initializeIndirectCallPromotionInfo(
     Module &M) {
   ICallAnalysis = std::make_unique<ICallPromotionAnalysis>();
   Symtab = std::make_unique<InstrProfSymtab>();
-  if (Error E = Symtab->create(M, /*InLTO=*/true)) {
+  // Don't add canonical names, to avoid multiple functions to the symtab
+  // when they both have the same root name with "." suffixes stripped.
+  // If we pick the wrong one then this could lead to incorrect ICP and calling
+  // a memprof clone that we don't actually create (resulting in linker unsats).
+  // What this means is that the GUID of the function (or its PGOFuncName
+  // metadata) *must* match that in the VP metadata to allow promotion.
+  // In practice this should not be a limitation, since local functions should
+  // have PGOFuncName metadata and global function names shouldn't need any
+  // special handling (they should not get the ".llvm.*" suffix that the
+  // canonicalization handling is attempting to strip.
+  if (Error E = Symtab->create(M, /*InLTO=*/true, /*AddCanonical=*/false)) {
     std::string SymtabFailure = toString(std::move(E));
     M.getContext().emitError("Failed to create symtab: " + SymtabFailure);
     return false;
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index 99e07189876556..d4d0e21b4bd7db 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -91,6 +91,7 @@
 ; RUN:	-enable-memprof-indirect-call-support=true \
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -r=%t/foo.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -121,6 +122,7 @@
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -thinlto-distributed-indexes \
 ; RUN:  -r=%t/foo.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -155,6 +157,7 @@
 ; RUN:	-enable-memprof-indirect-call-support=false \
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -r=%t/foo.noicp.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.noicp.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.noicp.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -261,6 +264,14 @@ target triple = "x86_64-unknown-linux-gnu"
 
 declare i32 @_Z3xyzR2B0j(ptr %b)
 
+;; Add a function that has the same name as one of the indirect callees, but
+;; with a suffix, to make sure we don't incorrectly pick the wrong one as the
+;; promoted target (which happens if we attempt to canonicalize the names
+;; when building the ICP symtab).
+define i32 @_ZN2B03barEj.abc(ptr %this, i32 %s) {
+  ret i32 0
+}
+
 define i32 @_Z3fooR2B0j(ptr %b) {
 entry:
   %0 = load ptr, ptr %b, align 8

@llvmbot
Copy link
Member

llvmbot commented Nov 8, 2024

@llvm/pr-subscribers-lto

Author: Teresa Johnson (teresajohnson)

Changes

ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.


Full diff: https://github.com/llvm/llvm-project/pull/115419.diff

4 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+5-2)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+10-7)
  • (modified) llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp (+11-1)
  • (modified) llvm/test/ThinLTO/X86/memprof-icp.ll (+11)
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index b0b2258735e2ae..c5f7800097807d 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -504,7 +504,8 @@ class InstrProfSymtab {
   // name-set = {PGOFuncName} union {getCanonicalName(PGOFuncName)}
   // - In MD5NameMap: <MD5Hash(name), name> for name in name-set
   // - In MD5FuncMap: <MD5Hash(name), &F> for name in name-set
-  Error addFuncWithName(Function &F, StringRef PGOFuncName);
+  // The canonical name is only added if \c AddCanonical is true.
+  Error addFuncWithName(Function &F, StringRef PGOFuncName, bool AddCanonical);
 
   // Add the vtable into the symbol table, by creating the following
   // map entries:
@@ -560,7 +561,9 @@ class InstrProfSymtab {
   /// decls from module \c M. This interface is used by transformation
   /// passes such as indirect function call promotion. Variable \c InLTO
   /// indicates if this is called from LTO optimization passes.
-  Error create(Module &M, bool InLTO = false);
+  /// A canonical name, removing non-__uniq suffixes, is added if
+  /// \c AddCanonical is true.
+  Error create(Module &M, bool InLTO = false, bool AddCanonical = true);
 
   /// Create InstrProfSymtab from a set of names iteratable from
   /// \p IterRange. This interface is used by IndexedProfReader.
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index b9937c9429b77d..6e0575db26d298 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -483,16 +483,16 @@ GlobalVariable *createPGOFuncNameVar(Function &F, StringRef PGOFuncName) {
   return createPGOFuncNameVar(*F.getParent(), F.getLinkage(), PGOFuncName);
 }
 
-Error InstrProfSymtab::create(Module &M, bool InLTO) {
+Error InstrProfSymtab::create(Module &M, bool InLTO, bool AddCanonical) {
   for (Function &F : M) {
     // Function may not have a name: like using asm("") to overwrite the name.
     // Ignore in this case.
     if (!F.hasName())
       continue;
-    if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO)))
+    if (Error E = addFuncWithName(F, getIRPGOFuncName(F, InLTO), AddCanonical))
       return E;
     // Also use getPGOFuncName() so that we can find records from older profiles
-    if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO)))
+    if (Error E = addFuncWithName(F, getPGOFuncName(F, InLTO), AddCanonical))
       return E;
   }
 
@@ -630,7 +630,8 @@ StringRef InstrProfSymtab::getCanonicalName(StringRef PGOName) {
   return PGOName;
 }
 
-Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
+Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName,
+                                       bool AddCanonical) {
   auto NameToGUIDMap = [&](StringRef Name) -> Error {
     if (Error E = addFuncName(Name))
       return E;
@@ -640,9 +641,11 @@ Error InstrProfSymtab::addFuncWithName(Function &F, StringRef PGOFuncName) {
   if (Error E = NameToGUIDMap(PGOFuncName))
     return E;
 
-  StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
-  if (CanonicalFuncName != PGOFuncName)
-    return NameToGUIDMap(CanonicalFuncName);
+  if (AddCanonical) {
+    StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
+    if (CanonicalFuncName != PGOFuncName)
+      return NameToGUIDMap(CanonicalFuncName);
+  }
 
   return Error::success();
 }
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index 1a5e75eb42dead..f855037500ee2f 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -4143,7 +4143,17 @@ bool MemProfContextDisambiguation::initializeIndirectCallPromotionInfo(
     Module &M) {
   ICallAnalysis = std::make_unique<ICallPromotionAnalysis>();
   Symtab = std::make_unique<InstrProfSymtab>();
-  if (Error E = Symtab->create(M, /*InLTO=*/true)) {
+  // Don't add canonical names, to avoid multiple functions to the symtab
+  // when they both have the same root name with "." suffixes stripped.
+  // If we pick the wrong one then this could lead to incorrect ICP and calling
+  // a memprof clone that we don't actually create (resulting in linker unsats).
+  // What this means is that the GUID of the function (or its PGOFuncName
+  // metadata) *must* match that in the VP metadata to allow promotion.
+  // In practice this should not be a limitation, since local functions should
+  // have PGOFuncName metadata and global function names shouldn't need any
+  // special handling (they should not get the ".llvm.*" suffix that the
+  // canonicalization handling is attempting to strip.
+  if (Error E = Symtab->create(M, /*InLTO=*/true, /*AddCanonical=*/false)) {
     std::string SymtabFailure = toString(std::move(E));
     M.getContext().emitError("Failed to create symtab: " + SymtabFailure);
     return false;
diff --git a/llvm/test/ThinLTO/X86/memprof-icp.ll b/llvm/test/ThinLTO/X86/memprof-icp.ll
index 99e07189876556..d4d0e21b4bd7db 100644
--- a/llvm/test/ThinLTO/X86/memprof-icp.ll
+++ b/llvm/test/ThinLTO/X86/memprof-icp.ll
@@ -91,6 +91,7 @@
 ; RUN:	-enable-memprof-indirect-call-support=true \
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -r=%t/foo.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -121,6 +122,7 @@
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -thinlto-distributed-indexes \
 ; RUN:  -r=%t/foo.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -155,6 +157,7 @@
 ; RUN:	-enable-memprof-indirect-call-support=false \
 ; RUN:  -supports-hot-cold-new \
 ; RUN:  -r=%t/foo.noicp.o,_Z3fooR2B0j,plx \
+; RUN:  -r=%t/foo.noicp.o,_ZN2B03barEj.abc,plx \
 ; RUN:  -r=%t/foo.noicp.o,_Z3xyzR2B0j, \
 ; RUN:  -r=%t/main.o,_Z3fooR2B0j, \
 ; RUN:  -r=%t/main.o,_Znwm, \
@@ -261,6 +264,14 @@ target triple = "x86_64-unknown-linux-gnu"
 
 declare i32 @_Z3xyzR2B0j(ptr %b)
 
+;; Add a function that has the same name as one of the indirect callees, but
+;; with a suffix, to make sure we don't incorrectly pick the wrong one as the
+;; promoted target (which happens if we attempt to canonicalize the names
+;; when building the ICP symtab).
+define i32 @_ZN2B03barEj.abc(ptr %this, i32 %s) {
+  ret i32 0
+}
+
 define i32 @_Z3fooR2B0j(ptr %b) {
 entry:
   %0 = load ptr, ptr %b, align 8

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

StringRef CanonicalFuncName = getCanonicalName(PGOFuncName);
if (CanonicalFuncName != PGOFuncName)
return NameToGUIDMap(CanonicalFuncName);
if (AddCanonical) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe early return with an inverted condition to avoid nesting below?

if(!AddCanonical) return Error::success

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// metadata) *must* match that in the VP metadata to allow promotion.
// In practice this should not be a limitation, since local functions should
// have PGOFuncName metadata and global function names shouldn't need any
// special handling (they should not get the ".llvm.*" suffix that the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parenthesis not closed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@teresajohnson teresajohnson merged commit 594e11c into llvm:main Nov 8, 2024
5 of 8 checks passed
Groverkss pushed a commit to iree-org/llvm-project that referenced this pull request Nov 15, 2024
ICP builds a symtab from the symbols in the module allowing mapping from
the VP metadata GUIDs to the Function. MemProf uses this same symtab
handling for its ICP during cloning. When symbols are added to the
symtab, the handling adds both a GUID computed from the function name,
or from the attached PGOFuncName metadata for locals, as well as a GUID
computed from the "canonicalized" name, which strips all "." suffixes
other than ".__uniq". This was originally meant to remove the ".llvm.*"
suffix added to promoted locals (done earlier in the ThinLTO backend).
In theory, it should no longer be needed as locals should have
PGOFuncName metadata.

However, this was causing a linker unsat, in code that used coroutines.
For an original coroutine function, there were several additional
functions created that had the same name, but different "." suffixes.
Therefore the canonical name for these additional functions had the same
GUID as that of the original function, leading to extra entries in the
symtab, and to selecting the wrong function for promotion. For regular
ICP this can happen, but is just a performance issue. However, for
memprof the promoted direct call calls a memprof clone, and because we
called the wrong function, in this case it didn't have a memprof clone
and we got a linker unsat.

We may be able to remove the canonical name handling for ICP in general,
but for now disable it for MemProf. At worst this could lead to not
finding a GUID in the symtab and not performing an ICP, so should be
conservatively correct.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO) PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants