Skip to content

[SampleProfile] Fix bug where remapper returns empty string and crashing Sample Profile loader #71479

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 10, 2023

Conversation

huangjd
Copy link
Contributor

@huangjd huangjd commented Nov 7, 2023

Normally SampleContext does not allow using an empty StirngRef to construct an object, this is to prevent bugs reading the profile. However empty names may be emitted by a function which its name is intentionally set to empty, or a bug in the remapper that returns an empty string. Regardless, converting it to FunctionId first will prevent the assert, and that assert check is unnecessary, which will be addressed in another patch

empty name asserts.

Normally SampleContext does not allow using an empty StirngRef to
construct an object, this is to prevent bugs reading the profile.
However empty names may be emitted by a function which its name is
intentionally set to empty, or a bug in the remapper (which should be
investigated separately) that remaps a non-empty name to an empty
string. Regardless, converting it to FunctionId first will prevent the
assert, and that assert check is unnecessary, which will be addressed in
another patch
@llvmbot
Copy link
Member

llvmbot commented Nov 7, 2023

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: William Junda Huang (huangjd)

Changes

Normally SampleContext does not allow using an empty StirngRef to construct an object, this is to prevent bugs reading the profile. However empty names may be emitted by a function which its name is intentionally set to empty, or a bug in the remapper (which should be investigated separately) that remaps a non-empty name to an empty string. Regardless, converting it to FunctionId first will prevent the assert, and that assert check is unnecessary, which will be addressed in another patch


Full diff: https://github.com/llvm/llvm-project/pull/71479.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/SampleProfile.cpp (+3-3)
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index a7773737ef16bd3..063f7b42022ff83 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -1630,7 +1630,7 @@ void SampleProfileLoader::promoteMergeNotInlinedContextSamples(
         // separate map so that it does not rehash the original profile.
         if (!OutlineFS)
           OutlineFS = &OutlineFunctionSamples[
-              FunctionSamples::getCanonicalFnName(Callee->getName())];
+              FunctionId(FunctionSamples::getCanonicalFnName(Callee->getName()))];
         OutlineFS->merge(*FS, 1);
         // Set outlined profile to be synthetic to not bias the inliner.
         OutlineFS->SetContextSynthetic();
@@ -2675,12 +2675,12 @@ bool SampleProfileLoader::runOnFunction(Function &F, ModuleAnalysisManager *AM)
     // into base.
     if (!Samples) {
       StringRef CanonName = FunctionSamples::getCanonicalFnName(F);
-      auto It = OutlineFunctionSamples.find(CanonName);
+      auto It = OutlineFunctionSamples.find(FunctionId(CanonName));
       if (It != OutlineFunctionSamples.end()) {
         Samples = &It->second;
       } else if (auto Remapper = Reader->getRemapper()) {
         if (auto RemppedName = Remapper->lookUpNameInProfile(CanonName)) {
-          It = OutlineFunctionSamples.find(*RemppedName);
+          It = OutlineFunctionSamples.find(FunctionId(*RemppedName));
           if (It != OutlineFunctionSamples.end())
             Samples = &It->second;
         }

@huangjd
Copy link
Contributor Author

huangjd commented Nov 7, 2023

Specifically in SampleProfileLoader::runOnFunction,

    if (auto RemppedName = Remapper->lookUpNameInProfile(CanonName)) {
          It = OutlineFunctionSamples.find(*RemppedName);
          if (It != OutlineFunctionSamples.end())
            Samples = &It->second;
        }

Triggers the assert because *RemappedName is empty.

@huangjd huangjd added the PGO Profile Guided Optimizations label Nov 7, 2023
Copy link

github-actions bot commented Nov 7, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 6846258aa62ebeb64d23fc8839e3ab471b18f6dd 7ea2ed8dc65d76b4a2b460de002702d449a079aa -- llvm/lib/ProfileData/SampleProfReader.cpp llvm/lib/Transforms/IPO/SampleProfile.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/IPO/SampleProfile.cpp b/llvm/lib/Transforms/IPO/SampleProfile.cpp
index 063f7b42022f..b93157fe7852 100644
--- a/llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ b/llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -1629,8 +1629,8 @@ void SampleProfileLoader::promoteMergeNotInlinedContextSamples(
         // If outlined function does not exist in the profile, add it to a
         // separate map so that it does not rehash the original profile.
         if (!OutlineFS)
-          OutlineFS = &OutlineFunctionSamples[
-              FunctionId(FunctionSamples::getCanonicalFnName(Callee->getName()))];
+          OutlineFS = &OutlineFunctionSamples[FunctionId(
+              FunctionSamples::getCanonicalFnName(Callee->getName()))];
         OutlineFS->merge(*FS, 1);
         // Set outlined profile to be synthetic to not bias the inliner.
         OutlineFS->SetContextSynthetic();

Copy link
Contributor

@snehasish snehasish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add a test?

Added safeguard to remapper
@huangjd huangjd requested a review from snehasish November 8, 2023 02:29
@huangjd huangjd changed the title [SampleProfile] Fix bug where intentionally constructed function with empty name asserts. [SampleProfile] Fix bug where remapper returns empty string and crashing Sample Profile loader Nov 8, 2023
@huangjd huangjd added the bug Indicates an unexpected problem or unintended behavior label Nov 8, 2023
@huangjd
Copy link
Contributor Author

huangjd commented Nov 8, 2023

Is it possible to add a test?

Added test case, which crashes on latest build on main

@@ -1630,7 +1630,7 @@ void SampleProfileLoader::promoteMergeNotInlinedContextSamples(
// separate map so that it does not rehash the original profile.
if (!OutlineFS)
OutlineFS = &OutlineFunctionSamples[
FunctionSamples::getCanonicalFnName(Callee->getName())];
FunctionId(FunctionSamples::getCanonicalFnName(Callee->getName()))];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pretty much a drive-by comment

It seems to me SampleContext could take either a StringRef or a FunctionId in its constructor, and the former asserts names are not empty (

assert(!Name.empty() && "Name is empty");
) while the latter doesn't (FunctionId ctor doesn't require names are not empty either); and this PR wants to fix a crash for the former constructor.

If I understand correctly, is this FunctionId needed here and below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is intentionally using FunctionId constructor here to avoid that assert, and any other potential situation where a function with empty name is passed in. Since the front end is relatively independent, so we are not making any assumption there. Accepting an empty string in profile matching will at worst degrade the profile's accuracy but still result in a correct binary, which is something we want to ensure here: If both the source code and the sample profile file are valid, the compiler should not exit abnormally.

Also there's an upcoming patch which will make this moot (https://github.com/huangjd/llvm-project/tree/md5phase3) I am still validating it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the StringRef based constructor in SampleConext be removed to avoid future confusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will be addressed in MD5phase3 (pending), in that patch FunctionId no longer requires explicit constructor since the difference between MD5 and String FunctionId will be minimized

@huangjd huangjd merged commit 683f2df into llvm:main Nov 10, 2023
@huangjd huangjd deleted the md5phase2_bug1 branch November 10, 2023 22:04
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
…ing Sample Profile loader (llvm#71479)

Normally SampleContext does not allow using an empty StirngRef to
construct an object, this is to prevent bugs reading the profile.
However empty names may be emitted by a function which its name is
intentionally set to empty, or a bug in the remapper that returns an
empty string. Regardless, converting it to FunctionId first will prevent
the assert, and that assert check is unnecessary, which will be
addressed in another patch
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior llvm:transforms PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants