[ProfileData] Add a variant of getValueProfDataFromInst #95993

Conversation

kazutakahirata
Contributor

This patch adds a variant of getValueProfDataFromInst that returns
std::vector<InstrProfValueData> instead of
std::unique_ptr<InstrProfValueData[]>. The new return type carries its
length with it, so we can drop the out parameter ActualNumValueData.
The caller can also feed the return value directly into a range-based
for loop, as shown in the patch.

I'm planning to migrate other callers of getValueProfDataFromInst to
the new variant in follow-up patches.
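
As a quick illustration, here is a minimal sketch of how a caller could use the new variant; countVTableTargets and the MaxVals limit are illustrative names, not part of the patch:

#include "llvm/IR/Instruction.h"
#include "llvm/ProfileData/InstrProf.h"
#include "llvm/Support/raw_ostream.h"
using namespace llvm;

// Hypothetical caller; the function name and the MaxVals limit are
// illustrative only.
void countVTableTargets(const Instruction &Inst) {
  const uint32_t MaxVals = 24; // illustrative cap on returned entries
  uint64_t TotalCount = 0;
  // The returned container knows its own size, so no ActualNumValueData
  // out parameter is needed and the result feeds a range-based for loop.
  auto ValueData =
      getValueProfDataFromInst(Inst, IPVK_VTableTarget, MaxVals, TotalCount);
  for (const InstrProfValueData &V : ValueData)
    errs() << "value " << V.Value << " count " << V.Count << "\n";
}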

@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:analysis labels Jun 18, 2024
@llvmbot
Member

llvmbot commented Jun 18, 2024

@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-analysis

Author: Kazu Hirata (kazutakahirata)

Changes

This patch adds a variant of getValueProfDataFromInst that returns
std::vector<InstrProfValueData> instead of
std::unique_ptr<InstrProfValueData[]>. The new return type carries its
length with it, so we can drop the out parameter ActualNumValueData.
The caller can also feed the return value directly into a range-based
for loop, as shown in the patch.

I'm planning to migrate other callers of getValueProfDataFromInst to
the new variant in follow-up patches.


Full diff: https://github.com/llvm/llvm-project/pull/95993.diff

3 Files Affected:

  • (modified) llvm/include/llvm/ProfileData/InstrProf.h (+7)
  • (modified) llvm/lib/Analysis/ModuleSummaryAnalysis.cpp (+6-11)
  • (modified) llvm/lib/ProfileData/InstrProf.cpp (+37)
diff --git a/llvm/include/llvm/ProfileData/InstrProf.h b/llvm/include/llvm/ProfileData/InstrProf.h
index df79073da5b50..2150a5ec8dc21 100644
--- a/llvm/include/llvm/ProfileData/InstrProf.h
+++ b/llvm/include/llvm/ProfileData/InstrProf.h
@@ -294,6 +294,13 @@ getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
                          uint32_t MaxNumValueData, uint32_t &ActualNumValueData,
                          uint64_t &TotalC, bool GetNoICPValue = false);
 
+/// Extract the value profile data from \p Inst and returns them if \p Inst is
+/// annotated with value profile data. Returns an empty vector otherwise.
+std::vector<InstrProfValueData>
+getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
+                         uint32_t MaxNumValueData, uint64_t &TotalC,
+                         bool GetNoICPValue = false);
+
 inline StringRef getPGOFuncNameMetadataName() { return "PGOFuncName"; }
 
 /// Return the PGOFuncName meta data associated with a function.
diff --git a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
index c6934f55ee114..94ac0484f5ec7 100644
--- a/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
+++ b/llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
@@ -143,20 +143,15 @@ static bool findRefEdges(ModuleSummaryIndex &Index, const User *CurUser,
 
   const Instruction *I = dyn_cast<Instruction>(CurUser);
   if (I) {
-    uint32_t ActualNumValueData = 0;
     uint64_t TotalCount = 0;
     // MaxNumVTableAnnotations is the maximum number of vtables annotated on
     // the instruction.
-    auto ValueDataArray =
-        getValueProfDataFromInst(*I, IPVK_VTableTarget, MaxNumVTableAnnotations,
-                                 ActualNumValueData, TotalCount);
-
-    if (ValueDataArray.get()) {
-      for (uint32_t j = 0; j < ActualNumValueData; j++) {
-        RefEdges.insert(Index.getOrInsertValueInfo(/* VTableGUID = */
-                                                   ValueDataArray[j].Value));
-      }
-    }
+    auto ValueDataArray = getValueProfDataFromInst(
+        *I, IPVK_VTableTarget, MaxNumVTableAnnotations, TotalCount);
+
+    for (const auto &V : ValueDataArray)
+      RefEdges.insert(Index.getOrInsertValueInfo(/* VTableGUID = */
+                                                 V.Value));
   }
   return HasBlockAddress;
 }
diff --git a/llvm/lib/ProfileData/InstrProf.cpp b/llvm/lib/ProfileData/InstrProf.cpp
index 8cc1625e1c05c..e94cdc3b3235e 100644
--- a/llvm/lib/ProfileData/InstrProf.cpp
+++ b/llvm/lib/ProfileData/InstrProf.cpp
@@ -1381,6 +1381,43 @@ getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
   return ValueDataArray;
 }
 
+std::vector<InstrProfValueData>
+getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
+                         uint32_t MaxNumValueData, uint64_t &TotalC,
+                         bool GetNoICPValue) {
+  std::vector<InstrProfValueData> ValueData;
+  MDNode *MD = mayHaveValueProfileOfKind(Inst, ValueKind);
+  if (!MD)
+    return ValueData;
+  const unsigned NOps = MD->getNumOperands();
+  // Get total count
+  ConstantInt *TotalCInt = mdconst::dyn_extract<ConstantInt>(MD->getOperand(2));
+  if (!TotalCInt)
+    return ValueData;
+  TotalC = TotalCInt->getZExtValue();
+
+  ValueData.reserve(MaxNumValueData);
+  for (unsigned I = 3; I < NOps; I += 2) {
+    if (ValueData.size() >= MaxNumValueData)
+      break;
+    ConstantInt *Value = mdconst::dyn_extract<ConstantInt>(MD->getOperand(I));
+    ConstantInt *Count =
+        mdconst::dyn_extract<ConstantInt>(MD->getOperand(I + 1));
+    if (!Value || !Count) {
+      ValueData.clear();
+      return ValueData;
+    }
+    uint64_t CntValue = Count->getZExtValue();
+    if (!GetNoICPValue && (CntValue == NOMORE_ICP_MAGICNUM))
+      continue;
+    InstrProfValueData V;
+    V.Value = Value->getZExtValue();
+    V.Count = CntValue;
+    ValueData.push_back(V);
+  }
+  return ValueData;
+}
+
 MDNode *getPGOFuncNameMetadata(const Function &F) {
   return F.getMetadata(getPGOFuncNameMetadataName());
 }

@kazutakahirata
Contributor Author

I've switched to SmallVector. Please take a look. Thanks!
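
(If the return type becomes a SmallVector, the declaration would presumably take a shape like the sketch below; the inline element count of 4 is an assumption, not confirmed anywhere in this thread.)

// Hypothetical SmallVector-based declaration; the inline element count (4)
// is an assumption, not taken from the merged commit.
SmallVector<InstrProfValueData, 4>
getValueProfDataFromInst(const Instruction &Inst, InstrProfValueKind ValueKind,
                         uint32_t MaxNumValueData, uint64_t &TotalC,
                         bool GetNoICPValue = false);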

Contributor

@mingmingl-llvm mingmingl-llvm left a comment

lgtm but it'd be good to wait a little bit more for Ellis's comments

@kazutakahirata kazutakahirata merged commit b0ae923 into llvm:main Jun 22, 2024
7 checks passed
@kazutakahirata kazutakahirata deleted the cleanup_ProfileData_getValueProfDataFromInst_1 branch June 22, 2024 07:40
@llvm-ci
Collaborator

llvm-ci commented Jun 22, 2024

LLVM Buildbot has detected a new failure on builder sanitizer-x86_64-linux-qemu running on sanitizer-buildbot3 while building llvm at step 2 "annotate".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/139/builds/348

Here is the relevant piece of the build log for the reference:

Step 2 (annotate) failure: 'python ../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py' (failure)
...
1 warning generated.
[72/76] Generating ScudoCUnitTest-mips64-Test
[73/76] Generating ScudoCxxUnitTest-mips64-Test
[74/76] Generating ScudoUnitTestsObjects.combined_test.cpp.mips64.o
[75/76] Generating ScudoUnitTest-mips64-Test
[75/76] Running Scudo Standalone tests
llvm-lit: /b/sanitizer-x86_64-linux-qemu/build/llvm-project/llvm/utils/lit/lit/main.py:72: note: The test suite configuration requested an individual test timeout of 0 seconds but a timeout of 900 seconds was requested on the command line. Forcing timeout to be 900 seconds.
-- Testing: 151 tests, 80 workers --
Testing:  0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
TIMEOUT: ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/71/133 (151 of 151)
******************** TEST 'ScudoStandalone-Unit :: ./ScudoUnitTest-mips64-Test/71/133' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/b/sanitizer-x86_64-linux-qemu/build/llvm_build2_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test-ScudoStandalone-Unit-399528-71-133.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=133 GTEST_SHARD_INDEX=71 /b/sanitizer-x86_64-linux-qemu/build/qemu_build/qemu-mips64 -L /usr/mips64-linux-gnuabi64 /b/sanitizer-x86_64-linux-qemu/build/llvm_build2_mips64_qemu/lib/scudo/standalone/tests/./ScudoUnitTest-mips64-Test
--

Note: This is test shard 72 of 133.
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from ScudoCombinedDeathTestBasicCombined15_TestConditionVariableConfig
[ RUN      ] ScudoCombinedDeathTestBasicCombined15_TestConditionVariableConfig.BasicCombined15
Stats: SizeClassAllocator64: 0M mapped (0M rss) in 15 allocations; remains 15; ReleaseToOsIntervalMs = 1000
  00 (    64): mapped:    256K popped:      13 pushed:       0 inuse:     13 total:    104 releases:      0 last released:      0K latest pushed bytes:      5K region: 0x5555c6811000 (0x5555c6808000)
  31 ( 33296): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      7 releases:      0 last released:      0K latest pushed bytes:    195K region: 0x555726810000 (0x555726808000)
  32 ( 65552): mapped:    256K popped:       1 pushed:       0 inuse:      1 total:      3 releases:      0 last released:      0K latest pushed bytes:    128K region: 0x555656810000 (0x555656808000)
Stats: MapAllocator: allocated 81 times (7848K), freed 81 times (7848K), remains 0 (0K) max 0M, Fragmented 0K
Secondary Cache Disabled
Stats: Quarantine: batches: 0; bytes: 0 (user: 0); chunks: 0 (capacity: 0); 0% chunks used; 0% memory overhead
Quarantine limits: global: 0K; thread local: 0K
Stats: SharedTSDs: 4 available; total 8
  Shared TSD[0]:
    00 (    64): cached:    9 max:   26
    31 ( 33296): cached:    1 max:    2
    32 ( 65552): cached:    1 max:    2
  Shared TSD[1]:
    No block is cached.
  Shared TSD[2]:
    No block is cached.
  Shared TSD[3]:
    No block is cached.
Fragmentation Stats: SizeClassAllocator64: page size = 4096 bytes
  01 (    32): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  02 (    48): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  03 (    64): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  04 (    80): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  05 (    96): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  06 (   112): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  07 (   144): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
  08 (   176): inuse/total blocks:      0/     0 inuse/total pages:      0/     0 inuse bytes:      0K util: 100.00%
Step 17 (scudo mips64_qemu) failure: scudo mips64_qemu (failure)

AlexisPerry pushed a commit to llvm-project-tlp/llvm-project that referenced this pull request Jul 9, 2024