Skip to content

Reland "[PGO][GlobalValue][LTO]In GlobalValues::getGlobalIdentifier, use semicolon as delimiter for local-linkage varibles. " #75860

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 19, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion compiler-rt/test/profile/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@ set(PROFILE_TESTSUITES)
set(PROFILE_TEST_DEPS ${SANITIZER_COMMON_LIT_TEST_DEPS} compiler-rt-headers)
list(APPEND PROFILE_TEST_DEPS profile)
if(NOT COMPILER_RT_STANDALONE_BUILD)
list(APPEND PROFILE_TEST_DEPS llvm-profdata llvm-cov)
list(APPEND PROFILE_TEST_DEPS llvm-cov llvm-dis llvm-lto llvm-profdata opt)
if(NOT APPLE AND COMPILER_RT_HAS_LLD AND "lld" IN_LIST LLVM_ENABLE_PROJECTS)
list(APPEND PROFILE_TEST_DEPS lld)
endif()
Expand Down
115 changes: 115 additions & 0 deletions compiler-rt/test/profile/instrprof-thinlto-indirect-call-promotion.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,115 @@
// This is a regression test for ThinLTO indirect-call-promotion when candidate
// callees need to be imported from another IR module. In the C++ test case,
// `main` calls `global_func` which is defined in another module. `global_func`
// has two indirect callees, one has external linkage and one has local linkage.
// All three functions should be imported into the IR module of main.

// What the test does:
// - Generate raw profiles from executables and convert it to indexed profiles.
// During the conversion, a profiled callee address in raw profiles will be
// converted to function hash in indexed profiles.
// - Run IRPGO profile use and ThinTLO prelink pipeline and get LLVM bitcodes
// for both cpp files in the C++ test case.
// - Generate ThinLTO summary file with LLVM bitcodes, and run `function-import` pass.
// - Run `pgo-icall-prom` pass for the IR module which needs to import callees.

// Use lld as linker for more robust test. We need to REQUIRE LLVMgold.so for
// LTO if default linker is GNU ld or gold anyway.
// REQUIRES: lld-available

// Test should fail where linkage-name and mangled-name diverges, see issue https://github.com/llvm/llvm-project/issues/74565).
// Currently, this name divergence happens on Mach-O object file format, or on
// many (but not all) 32-bit Windows systems.
//
// XFAIL: system-darwin
//
// Mark 32-bit Windows as UNSUPPORTED for now as opposed to XFAIL. This test
// should fail on many (but not all) 32-bit Windows systems and succeed on the
// rest. The flexibility in triple string parsing makes it tricky to capture
// both sets accurately. i[3-9]86 specifies arch as Triple::ArchType::x86, (win32|windows)
// specifies OS as Triple::OS::Win32
//
// UNSUPPORTED: target={{i.86.*windows.*}}

// RUN: rm -rf %t && split-file %s %t && cd %t

// Do setup work for all below tests.
// Generate raw profiles from real programs and convert it into indexed profiles.
// Use clangxx_pgogen for IR level instrumentation for C++.
// RUN: %clangxx_pgogen -fuse-ld=lld -O2 lib.cpp main.cpp -o main
// RUN: env LLVM_PROFILE_FILE=main.profraw %run ./main
// RUN: llvm-profdata merge main.profraw -o main.profdata

// Use profile on lib and get bitcode, test that local function callee0 has
// expected !PGOFuncName metadata and external function callee1 doesn't have
// !PGOFuncName metadata. Explicitly skip ICP pass to test ICP happens as
// expected in the IR module that imports functions from lib.
// RUN: %clang -mllvm -disable-icp -fprofile-use=main.profdata -flto=thin -O2 -c lib.cpp -o lib.bc
// RUN: llvm-dis lib.bc -o - | FileCheck %s --check-prefix=PGOName

// Use profile on main and get bitcode.
// RUN: %clang -fprofile-use=main.profdata -flto=thin -O2 -c main.cpp -o main.bc

// Run llvm-lto to get summary file.
// RUN: llvm-lto -thinlto -o summary main.bc lib.bc

// Test the imports of functions. Default import thresholds would work but do
// explicit override to be more futureproof. Note all functions have one basic
// block with a function-entry-count of one, so they are actually hot functions
// per default profile summary hotness cutoff.
// RUN: opt -passes=function-import -import-instr-limit=100 -import-cold-multiplier=1 -summary-file summary.thinlto.bc main.bc -o main.import.bc -print-imports 2>&1 | FileCheck %s --check-prefix=IMPORTS
// Test that '_Z11global_funcv' has indirect calls annotated with value profiles.
// RUN: llvm-dis main.import.bc -o - | FileCheck %s --check-prefix=IR

// Test that both candidates are ICP'ed and there is no `!VP` in the IR.
// RUN: opt main.import.bc -icp-lto -passes=pgo-icall-prom -S -pass-remarks=pgo-icall-prom 2>&1 | FileCheck %s --check-prefixes=ICP-IR,ICP-REMARK --implicit-check-not="!VP"

// IMPORTS: main.cpp: Import _Z7callee1v
// IMPORTS: main.cpp: Import _ZL7callee0v.llvm.[[#]]
// IMPORTS: main.cpp: Import _Z11global_funcv

// PGOName: define {{(dso_local )?}}void @_Z7callee1v() #[[#]] !prof ![[#]] {
// PGOName: define internal void @_ZL7callee0v() #[[#]] !prof ![[#]] !PGOFuncName ![[#MD:]] {
// PGOName: ![[#MD]] = !{!"{{.*}}lib.cpp;_ZL7callee0v"}

// IR-LABEL: define available_externally {{.*}} void @_Z11global_funcv() {{.*}} !prof ![[#]] {
// IR-NEXT: entry:
// IR-NEXT: %0 = load ptr, ptr @calleeAddrs
// IR-NEXT: tail call void %0(), !prof ![[#PROF1:]]
// IR-NEXT: %1 = load ptr, ptr getelementptr inbounds ([2 x ptr], ptr @calleeAddrs,
// IR-NEXT: tail call void %1(), !prof ![[#PROF2:]]

// The GUID of indirect callee is the MD5 hash of `/path/to/lib.cpp;_ZL7callee0v`
// that depends on the directory. Use [[#]] for its MD5 hash.
// Use {{.*}} for integer types so the test works on 32-bit and 64-bit systems.
// IR: ![[#PROF1]] = !{!"VP", i32 0, {{.*}} 1, {{.*}} [[#]], {{.*}} 1}
// IR: ![[#PROF2]] = !{!"VP", i32 0, {{.*}} 1, {{.*}} -3993653843325621743, {{.*}} 1}

// ICP-REMARK: Promote indirect call to _ZL7callee0v.llvm.[[#]] with count 1 out of 1
// ICP-REMARK: Promote indirect call to _Z7callee1v with count 1 out of 1

// ICP-IR: br i1 %[[#]], label %if.true.direct_targ, label %if.false.orig_indirect, !prof ![[#BRANCH_WEIGHT1:]]
// ICP-IR: br i1 %[[#]], label %if.true.direct_targ1, label %if.false.orig_indirect2, !prof ![[#BRANCH_WEIGHT1]]
// ICP-IR: ![[#BRANCH_WEIGHT1]] = !{!"branch_weights", i32 1, i32 0}

//--- lib.h
void global_func();

//--- lib.cpp
#include "lib.h"
static void callee0() {}
void callee1() {}
typedef void (*FPT)();
FPT calleeAddrs[] = {callee0, callee1};
// `global_func`` might call one of two indirect callees. callee0 has internal
// linkage and callee1 has external linkage.
void global_func() {
FPT fp = calleeAddrs[0];
fp();
fp = calleeAddrs[1];
fp();
}

//--- main.cpp
#include "lib.h"
int main() { global_func(); }
4 changes: 4 additions & 0 deletions llvm/include/llvm/IR/GlobalValue.h
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,10 @@ namespace Intrinsic {
typedef unsigned ID;
} // end namespace Intrinsic

// Choose ';' as the delimiter. ':' was used once but it doesn't work well for
// Objective-C functions which commonly have :'s in their names.
inline constexpr char kGlobalIdentifierDelimiter = ';';

class GlobalValue : public Constant {
public:
/// An enumeration for the kinds of linkage for global values.
Expand Down
26 changes: 15 additions & 11 deletions llvm/include/llvm/ProfileData/InstrProf.h
Original file line number Diff line number Diff line change
Expand Up @@ -171,6 +171,8 @@ inline StringRef getInstrProfCounterBiasVarName() {
/// Return the marker used to separate PGO names during serialization.
inline StringRef getInstrProfNameSeparator() { return "\01"; }

/// Please use getIRPGOFuncName for LLVM IR instrumentation. This function is
/// for front-end (Clang, etc) instrumentation.
/// Return the modified name for function \c F suitable to be
/// used the key for profile lookup. Variable \c InLTO indicates if this
/// is called in LTO optimization passes.
Expand All @@ -196,20 +198,22 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO = false);
std::pair<StringRef, StringRef> getParsedIRPGOFuncName(StringRef IRPGOFuncName);

/// Return the name of the global variable used to store a function
/// name in PGO instrumentation. \c FuncName is the name of the function
/// returned by the \c getPGOFuncName call.
/// name in PGO instrumentation. \c FuncName is the IRPGO function name
/// (returned by \c getIRPGOFuncName) for LLVM IR instrumentation and PGO
/// function name (returned by \c getPGOFuncName) for front-end instrumentation.
std::string getPGOFuncNameVarName(StringRef FuncName,
GlobalValue::LinkageTypes Linkage);

/// Create and return the global variable for function name used in PGO
/// instrumentation. \c FuncName is the name of the function returned
/// by \c getPGOFuncName call.
/// instrumentation. \c FuncName is the IRPGO function name (returned by
/// \c getIRPGOFuncName) for LLVM IR instrumentation and PGO function name
/// (returned by \c getPGOFuncName) for front-end instrumentation.
GlobalVariable *createPGOFuncNameVar(Function &F, StringRef PGOFuncName);

/// Create and return the global variable for function name used in PGO
/// instrumentation. /// \c FuncName is the name of the function
/// returned by \c getPGOFuncName call, \c M is the owning module,
/// and \c Linkage is the linkage of the instrumented function.
/// instrumentation. \c FuncName is the IRPGO function name (returned by
/// \c getIRPGOFuncName) for LLVM IR instrumentation and PGO function name
/// (returned by \c getPGOFuncName) for front-end instrumentation.
GlobalVariable *createPGOFuncNameVar(Module &M,
GlobalValue::LinkageTypes Linkage,
StringRef PGOFuncName);
Expand Down Expand Up @@ -417,11 +421,11 @@ uint64_t ComputeHash(StringRef K);

} // end namespace IndexedInstrProf

/// A symbol table used for function PGO name look-up with keys
/// A symbol table used for function [IR]PGO name look-up with keys
/// (such as pointers, md5hash values) to the function. A function's
/// PGO name or name's md5hash are used in retrieving the profile
/// data of the function. See \c getPGOFuncName() method for details
/// on how PGO name is formed.
/// [IR]PGO name or name's md5hash are used in retrieving the profile
/// data of the function. See \c getIRPGOFuncName() and \c getPGOFuncName
/// methods for details how [IR]PGO name is formed.
class InstrProfSymtab {
public:
using AddrHashMap = std::vector<std::pair<uint64_t, uint64_t>>;
Expand Down
12 changes: 7 additions & 5 deletions llvm/lib/IR/Globals.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -144,25 +144,27 @@ void GlobalObject::copyAttributesFrom(const GlobalObject *Src) {
std::string GlobalValue::getGlobalIdentifier(StringRef Name,
GlobalValue::LinkageTypes Linkage,
StringRef FileName) {

// Value names may be prefixed with a binary '1' to indicate
// that the backend should not modify the symbols due to any platform
// naming convention. Do not include that '1' in the PGO profile name.
if (Name[0] == '\1')
Name = Name.substr(1);

std::string NewName = std::string(Name);
std::string GlobalName;
if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
// For local symbols, prepend the main file name to distinguish them.
// Do not include the full path in the file name since there's no guarantee
// that it will stay the same, e.g., if the files are checked out from
// version control in different locations.
if (FileName.empty())
NewName = NewName.insert(0, "<unknown>:");
GlobalName += "<unknown>";
else
NewName = NewName.insert(0, FileName.str() + ":");
GlobalName += FileName;

GlobalName += kGlobalIdentifierDelimiter;
}
return NewName;
GlobalName += Name;
return GlobalName;
}

std::string GlobalValue::getGlobalIdentifier() const {
Expand Down
36 changes: 27 additions & 9 deletions llvm/lib/ProfileData/InstrProf.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -246,11 +246,27 @@ std::string InstrProfError::message() const {

char InstrProfError::ID = 0;

std::string getPGOFuncName(StringRef RawFuncName,
GlobalValue::LinkageTypes Linkage,
std::string getPGOFuncName(StringRef Name, GlobalValue::LinkageTypes Linkage,
StringRef FileName,
uint64_t Version LLVM_ATTRIBUTE_UNUSED) {
return GlobalValue::getGlobalIdentifier(RawFuncName, Linkage, FileName);
// Value names may be prefixed with a binary '1' to indicate
// that the backend should not modify the symbols due to any platform
// naming convention. Do not include that '1' in the PGO profile name.
if (Name[0] == '\1')
Name = Name.substr(1);

std::string NewName = std::string(Name);
if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
// For local symbols, prepend the main file name to distinguish them.
// Do not include the full path in the file name since there's no guarantee
// that it will stay the same, e.g., if the files are checked out from
// version control in different locations.
if (FileName.empty())
NewName = NewName.insert(0, "<unknown>:");
else
NewName = NewName.insert(0, FileName.str() + ":");
}
return NewName;
}

// Strip NumPrefix level of directory name from PathNameStr. If the number of
Expand Down Expand Up @@ -300,12 +316,10 @@ getIRPGONameForGlobalObject(const GlobalObject &GO,
GlobalValue::LinkageTypes Linkage,
StringRef FileName) {
SmallString<64> Name;
if (llvm::GlobalValue::isLocalLinkage(Linkage)) {
Name.append(FileName.empty() ? "<unknown>" : FileName);
Name.append(";");
}
// FIXME: Mangler's handling is kept outside of `getGlobalIdentifier` for now.
// For more details please check issue #74565.
Mangler().getNameWithPrefix(Name, &GO, /*CannotUsePrivateLabel=*/true);
return Name.str().str();
return GlobalValue::getGlobalIdentifier(Name, Linkage, FileName);
}

static std::optional<std::string> lookupPGONameFromMetadata(MDNode *MD) {
Expand Down Expand Up @@ -352,6 +366,9 @@ std::string getIRPGOFuncName(const Function &F, bool InLTO) {
return getIRPGOObjectName(F, InLTO, getPGOFuncNameMetadata(F));
}

// Please use getIRPGOFuncName for LLVM IR instrumentation. This function is
// for front-end (Clang, etc) instrumentation.
// The implementation is kept for profile matching from older profiles.
// This is similar to `getIRPGOFuncName` except that this function calls
// 'getPGOFuncName' to get a name and `getIRPGOFuncName` calls
// 'getIRPGONameForGlobalObject'. See the difference between two callees in the
Expand Down Expand Up @@ -384,7 +401,8 @@ getParsedIRPGOFuncName(StringRef IRPGOFuncName) {
StringRef getFuncNameWithoutPrefix(StringRef PGOFuncName, StringRef FileName) {
if (FileName.empty())
return PGOFuncName;
// Drop the file name including ':'. See also getPGOFuncName.
// Drop the file name including ':' or ';'. See getIRPGONameForGlobalObject as
// well.
if (PGOFuncName.starts_with(FileName))
PGOFuncName = PGOFuncName.drop_front(FileName.size() + 1);
return PGOFuncName;
Expand Down
9 changes: 5 additions & 4 deletions llvm/lib/ProfileData/InstrProfReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1008,12 +1008,13 @@ class llvm::InstrProfReaderItaniumRemapper

/// Extract the original function name from a PGO function name.
static StringRef extractName(StringRef Name) {
// We can have multiple :-separated pieces; there can be pieces both
// before and after the mangled name. Find the first part that starts
// with '_Z'; we'll assume that's the mangled name we want.
// We can have multiple pieces separated by kGlobalIdentifierDelimiter (
// semicolon now and colon in older profiles); there can be pieces both
// before and after the mangled name. Find the first part that starts with
// '_Z'; we'll assume that's the mangled name we want.
std::pair<StringRef, StringRef> Parts = {StringRef(), Name};
while (true) {
Parts = Parts.second.split(':');
Parts = Parts.second.split(kGlobalIdentifierDelimiter);
if (Parts.first.starts_with("_Z"))
return Parts.first;
if (Parts.second.empty())
Expand Down
10 changes: 5 additions & 5 deletions llvm/test/Bitcode/thinlto-function-summary-originalnames.ll
Original file line number Diff line number Diff line change
Expand Up @@ -6,13 +6,13 @@
; COMBINED: <GLOBALVAL_SUMMARY_BLOCK
; COMBINED-NEXT: <VERSION
; COMBINED-NEXT: <FLAGS
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4947176790635855146/>
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-6591587165810580810/>
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-4377693495213223786/>
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=686735765308251824/>
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=4507502870619175775/>
; COMBINED-NEXT: <VALUE_GUID {{.*}} op1=-8118561185538785069/>
; COMBINED-DAG: <COMBINED_PROFILE{{ }}
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
; COMBINED-DAG: <COMBINED_GLOBALVAR_INIT_REFS
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=-2012135647395072713/>
; COMBINED-DAG: <COMBINED_GLOBALVAR_INIT_REFS
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=6699318081062747564/>
; COMBINED-DAG: <COMBINED_ALIAS
; COMBINED-DAG: <COMBINED_ORIGINAL_NAME op0=-4170563161550796836/>
; COMBINED-NEXT: </GLOBALVAL_SUMMARY_BLOCK>
Expand Down
Loading