Skip to content

[ctx_prof] Fix checks in PGOCtxprofFlattening #108467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 18, 2024
Merged

[ctx_prof] Fix checks in PGOCtxprofFlattening #108467

merged 10 commits into from
Sep 18, 2024

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented Sep 12, 2024

The assertion that all out-edges of a BB can't be 0 is incorrect: they can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the propagated counters.

The assertion that all out-edges of a BB can't be 0 is incorrect: they
can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the
propagated counters.
@llvmbot llvmbot added PGO Profile Guided Optimizations llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Sep 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2024

@llvm/pr-subscribers-lto
@llvm/pr-subscribers-pgo

@llvm/pr-subscribers-llvm-transforms

Author: Mircea Trofin (mtrofin)

Changes

The assertion that all out-edges of a BB can't be 0 is incorrect: they can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the propagated counters.


Full diff: https://github.com/llvm/llvm-project/pull/108467.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp (+56-9)
  • (added) llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll (+55)
diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
index fc78d8c60ec050..77cfa2cdcff7e8 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
@@ -34,6 +34,7 @@
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Scalar/DCE.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include <deque>
 
 using namespace llvm;
 
@@ -110,7 +111,7 @@ class ProfileAnnotator final {
     }
 
     bool tryTakeCountFromKnownOutEdges(const BasicBlock &BB) {
-      if (!succ_empty(&BB) && !UnknownCountOutEdges) {
+      if (!UnknownCountOutEdges) {
         computeCountFrom(OutEdges);
         return true;
       }
@@ -118,7 +119,7 @@ class ProfileAnnotator final {
     }
 
     bool tryTakeCountFromKnownInEdges(const BasicBlock &BB) {
-      if (!BB.isEntryBlock() && !UnknownCountInEdges) {
+      if (!UnknownCountInEdges) {
         computeCountFrom(InEdges);
         return true;
       }
@@ -198,6 +199,50 @@ class ProfileAnnotator final {
 
   BBInfo &getBBInfo(const BasicBlock &BB) { return BBInfos.find(&BB)->second; }
 
+  const BBInfo &getBBInfo(const BasicBlock &BB) const {
+    return BBInfos.find(&BB)->second;
+  }
+
+  // validation function after we propagate the counters: all BBs and edges'
+  // counters must have a value.
+  bool allCountersAreAssinged() const {
+    for (const auto &BBInfo : BBInfos)
+      if (!BBInfo.second.hasCount())
+        return false;
+    for (const auto &EdgeInfo : EdgeInfos)
+      if (!EdgeInfo.Count.has_value())
+        return false;
+    return true;
+  }
+
+  bool allTakenPathsExit() const {
+    std::deque<const BasicBlock *> Worklist;
+    DenseSet<const BasicBlock *> Visited;
+    Worklist.push_back(&F.getEntryBlock());
+    Visited.insert(&F.getEntryBlock());
+    while (!Worklist.empty()) {
+      const auto *BB = Worklist.front();
+      Worklist.pop_front();
+      if (succ_size(BB) <= 1)
+        continue;
+      const auto &BBInfo = getBBInfo(*BB);
+      bool Inserted = false;
+      for (auto I = 0U; I < BB->getTerminator()->getNumSuccessors(); ++I) {
+        const auto *Succ = BB->getTerminator()->getSuccessor(I);
+        if (!shouldExcludeEdge(*BB, *Succ)) {
+          if (BBInfo.getEdgeCount(I) > 0)
+            if (Visited.insert(Succ).second) {
+              Worklist.push_back(Succ);
+              Inserted = true;
+            }
+        }
+      }
+      if (!Inserted)
+        return false;
+    }
+    return true;
+  }
+
 public:
   ProfileAnnotator(Function &F, const SmallVectorImpl<uint64_t> &Counters,
                    InstrProfSummaryBuilder &PB)
@@ -268,14 +313,16 @@ class ProfileAnnotator final {
         PB.addInternalCount(EdgeCount);
       }
 
-      if (MaxCount == 0)
-        F.getContext().emitError(
-            "[ctx-prof] Encountered a BB with more than one successor, where "
-            "all outgoing edges have a 0 count. This occurs in non-exiting "
-            "functions (message pumps, usually) which are not supported in the "
-            "contextual profiling case");
-      setProfMetadata(F.getParent(), Term, EdgeCounts, MaxCount);
+      if (MaxCount != 0)
+        setProfMetadata(F.getParent(), Term, EdgeCounts, MaxCount);
     }
+    assert(allCountersAreAssinged() &&
+           "Expected all counters have been assigned.");
+    assert(allTakenPathsExit() &&
+           "[ctx-prof] Encountered a BB with more than one successor, where "
+           "all outgoing edges have a 0 count. This occurs in non-exiting "
+           "functions (message pumps, usually) which are not supported in the "
+           "contextual profiling case");
   }
 };
 
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll
new file mode 100644
index 00000000000000..d4ed92438ce5a8
--- /dev/null
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll
@@ -0,0 +1,55 @@
+; RUN: split-file %s %t
+; RUN: llvm-ctxprof-util fromJSON --input=%t/profile.json --output=%t/profile.ctxprofdata
+; RUN: opt -passes=ctx-prof-flatten %t/example.ll -use-ctx-profile=%t/profile.ctxprofdata -S -o - | FileCheck %s
+
+; CHECK-LABEL: entry:
+; CHECK:          br i1 %t, label %yes, label %no, !prof ![[C1:[0-9]+]]
+; CHECK-LABEL: no:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: no1:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: no2:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: yes:
+; CHECK:          br i1 %t3, label %yes1, label %yes2, !prof ![[C1]]
+; CHECK-NOT:   !prof
+; CHECK: ![[C1]] = !{!"branch_weights", i32 6, i32 0}
+
+;--- example.ll
+define void @f1(i32 %cond) !guid !0 {
+entry:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 0)
+  %t = icmp eq i32 %cond, 1
+  br i1 %t, label %yes, label %no
+
+no:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 1)
+  %t2 = icmp eq i32 %cond, 2
+  br i1 %t2, label %no1, label %no2
+no1:
+  unreachable
+no2:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 2)
+  unreachable
+yes:
+  %t3 = icmp eq i32 %cond, 3
+  br i1 %t3, label %yes1, label %yes2
+yes1:
+  br label %exit
+yes2:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 3)
+  %t4 = icmp eq i32 %cond, 4
+  br i1 %t4, label %yes3, label %yes4
+yes3:
+  br label %exit
+yes4:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 4)
+  unreachable
+exit:
+  ret void
+}
+
+!0 = !{i64 1234}
+
+;--- profile.json
+[{"Guid":1234, "Counters":[6,0,0,0,0]}]

@llvmbot
Copy link
Member

llvmbot commented Sep 12, 2024

@llvm/pr-subscribers-llvm-analysis

Author: Mircea Trofin (mtrofin)

Changes

The assertion that all out-edges of a BB can't be 0 is incorrect: they can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the propagated counters.


Full diff: https://github.com/llvm/llvm-project/pull/108467.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp (+56-9)
  • (added) llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll (+55)
diff --git a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
index fc78d8c60ec050..77cfa2cdcff7e8 100644
--- a/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
+++ b/llvm/lib/Transforms/Instrumentation/PGOCtxProfFlattening.cpp
@@ -34,6 +34,7 @@
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Scalar/DCE.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include <deque>
 
 using namespace llvm;
 
@@ -110,7 +111,7 @@ class ProfileAnnotator final {
     }
 
     bool tryTakeCountFromKnownOutEdges(const BasicBlock &BB) {
-      if (!succ_empty(&BB) && !UnknownCountOutEdges) {
+      if (!UnknownCountOutEdges) {
         computeCountFrom(OutEdges);
         return true;
       }
@@ -118,7 +119,7 @@ class ProfileAnnotator final {
     }
 
     bool tryTakeCountFromKnownInEdges(const BasicBlock &BB) {
-      if (!BB.isEntryBlock() && !UnknownCountInEdges) {
+      if (!UnknownCountInEdges) {
         computeCountFrom(InEdges);
         return true;
       }
@@ -198,6 +199,50 @@ class ProfileAnnotator final {
 
   BBInfo &getBBInfo(const BasicBlock &BB) { return BBInfos.find(&BB)->second; }
 
+  const BBInfo &getBBInfo(const BasicBlock &BB) const {
+    return BBInfos.find(&BB)->second;
+  }
+
+  // validation function after we propagate the counters: all BBs and edges'
+  // counters must have a value.
+  bool allCountersAreAssinged() const {
+    for (const auto &BBInfo : BBInfos)
+      if (!BBInfo.second.hasCount())
+        return false;
+    for (const auto &EdgeInfo : EdgeInfos)
+      if (!EdgeInfo.Count.has_value())
+        return false;
+    return true;
+  }
+
+  bool allTakenPathsExit() const {
+    std::deque<const BasicBlock *> Worklist;
+    DenseSet<const BasicBlock *> Visited;
+    Worklist.push_back(&F.getEntryBlock());
+    Visited.insert(&F.getEntryBlock());
+    while (!Worklist.empty()) {
+      const auto *BB = Worklist.front();
+      Worklist.pop_front();
+      if (succ_size(BB) <= 1)
+        continue;
+      const auto &BBInfo = getBBInfo(*BB);
+      bool Inserted = false;
+      for (auto I = 0U; I < BB->getTerminator()->getNumSuccessors(); ++I) {
+        const auto *Succ = BB->getTerminator()->getSuccessor(I);
+        if (!shouldExcludeEdge(*BB, *Succ)) {
+          if (BBInfo.getEdgeCount(I) > 0)
+            if (Visited.insert(Succ).second) {
+              Worklist.push_back(Succ);
+              Inserted = true;
+            }
+        }
+      }
+      if (!Inserted)
+        return false;
+    }
+    return true;
+  }
+
 public:
   ProfileAnnotator(Function &F, const SmallVectorImpl<uint64_t> &Counters,
                    InstrProfSummaryBuilder &PB)
@@ -268,14 +313,16 @@ class ProfileAnnotator final {
         PB.addInternalCount(EdgeCount);
       }
 
-      if (MaxCount == 0)
-        F.getContext().emitError(
-            "[ctx-prof] Encountered a BB with more than one successor, where "
-            "all outgoing edges have a 0 count. This occurs in non-exiting "
-            "functions (message pumps, usually) which are not supported in the "
-            "contextual profiling case");
-      setProfMetadata(F.getParent(), Term, EdgeCounts, MaxCount);
+      if (MaxCount != 0)
+        setProfMetadata(F.getParent(), Term, EdgeCounts, MaxCount);
     }
+    assert(allCountersAreAssinged() &&
+           "Expected all counters have been assigned.");
+    assert(allTakenPathsExit() &&
+           "[ctx-prof] Encountered a BB with more than one successor, where "
+           "all outgoing edges have a 0 count. This occurs in non-exiting "
+           "functions (message pumps, usually) which are not supported in the "
+           "contextual profiling case");
   }
 };
 
diff --git a/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll b/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll
new file mode 100644
index 00000000000000..d4ed92438ce5a8
--- /dev/null
+++ b/llvm/test/Analysis/CtxProfAnalysis/flatten-zero-path.ll
@@ -0,0 +1,55 @@
+; RUN: split-file %s %t
+; RUN: llvm-ctxprof-util fromJSON --input=%t/profile.json --output=%t/profile.ctxprofdata
+; RUN: opt -passes=ctx-prof-flatten %t/example.ll -use-ctx-profile=%t/profile.ctxprofdata -S -o - | FileCheck %s
+
+; CHECK-LABEL: entry:
+; CHECK:          br i1 %t, label %yes, label %no, !prof ![[C1:[0-9]+]]
+; CHECK-LABEL: no:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: no1:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: no2:
+; CHECK-NOT:   !prof
+; CHECK-LABEL: yes:
+; CHECK:          br i1 %t3, label %yes1, label %yes2, !prof ![[C1]]
+; CHECK-NOT:   !prof
+; CHECK: ![[C1]] = !{!"branch_weights", i32 6, i32 0}
+
+;--- example.ll
+define void @f1(i32 %cond) !guid !0 {
+entry:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 0)
+  %t = icmp eq i32 %cond, 1
+  br i1 %t, label %yes, label %no
+
+no:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 1)
+  %t2 = icmp eq i32 %cond, 2
+  br i1 %t2, label %no1, label %no2
+no1:
+  unreachable
+no2:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 2)
+  unreachable
+yes:
+  %t3 = icmp eq i32 %cond, 3
+  br i1 %t3, label %yes1, label %yes2
+yes1:
+  br label %exit
+yes2:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 3)
+  %t4 = icmp eq i32 %cond, 4
+  br i1 %t4, label %yes3, label %yes4
+yes3:
+  br label %exit
+yes4:
+  call void @llvm.instrprof.increment(ptr @f1, i64 42, i32 42, i32 4)
+  unreachable
+exit:
+  ret void
+}
+
+!0 = !{i64 1234}
+
+;--- profile.json
+[{"Guid":1234, "Counters":[6,0,0,0,0]}]

Copy link

github-actions bot commented Sep 13, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@llvmbot llvmbot added the LTO Link time optimization (regular/full LTO or ThinLTO) label Sep 14, 2024
return true;
}

bool allTakenPathsExit() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a description for the function. Also make it static?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc done.

re. static, same as above - it uses private members of this class.


// validation function after we propagate the counters: all BBs and edges'
// counters must have a value.
bool allCountersAreAssinged() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make it static?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why, it uses private members of ProfileAnnotator.

for (const auto *E : Edges)
if (E)
Sum += AssumeAllKnown ? *E->Count : E->Count.value_or(0U);
if (E) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the situation where we have an entry in the Edges vector but it is null?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edges that are removed, e.g. the faux coroutine suspend edge. I'll add a comment on the Edges member to explain the contents.

@@ -178,7 +191,7 @@ class ProfileAnnotator final {
bool KeepGoing = true;
while (KeepGoing) {
KeepGoing = false;
for (const auto &BB : reverse(F)) {
for (const auto &BB : F) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We always have the entry BB's count, so it makes more sense to traverse top down.

@@ -47,18 +47,21 @@
; NOPROF-1-NOT: m2_f1()
; NOPROF-2-NOT: m1_f1()
;
; The run with workload definitions - same other options.
; The run with workload definitions - same other options. We do need to re-generate the .bc
; files, to include instrumentation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did this work before this change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions have no edges. Before, the validation had to do with edges, now we insist all counters are defined.

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@mtrofin mtrofin merged commit b2d3c31 into llvm:main Sep 18, 2024
6 of 7 checks passed
@mtrofin mtrofin deleted the profile branch September 18, 2024 01:19
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 18, 2024

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/7647

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: ExecutionEngine/JITLink/AArch64/ELF_relocations.s' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 1: rm -rf /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp && mkdir -p /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp
+ rm -rf /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp
+ mkdir -p /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp
RUN: at line 2: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llvm-mc -triple=aarch64-unknown-linux-gnu -x86-relax-relocations=false    -position-independent -filetype=obj -o /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp/elf_reloc.o /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ExecutionEngine/JITLink/AArch64/ELF_relocations.s
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llvm-mc -triple=aarch64-unknown-linux-gnu -x86-relax-relocations=false -position-independent -filetype=obj -o /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp/elf_reloc.o /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ExecutionEngine/JITLink/AArch64/ELF_relocations.s
RUN: at line 4: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llvm-jitlink -noexec               -abs external_data=0xdeadbeef               -abs external_func=0xcafef00d               -check /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ExecutionEngine/JITLink/AArch64/ELF_relocations.s /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp/elf_reloc.o
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/llvm-jitlink -noexec -abs external_data=0xdeadbeef -abs external_func=0xcafef00d -check /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ExecutionEngine/JITLink/AArch64/ELF_relocations.s /b/1/clang-x86_64-debian-fast/llvm.obj/test/ExecutionEngine/JITLink/AArch64/Output/ELF_relocations.s.tmp/elf_reloc.o
Expression 'decode_operand(test_adr_gotpage_external, 1) =      (got_addr(elf_reloc.o, external_data)[32:12] -         test_adr_gotpage_external[32:12])' is false: 0x108 != 0xffffffffffe00108
/b/1/clang-x86_64-debian-fast/llvm.obj/bin/llvm-jitlink: Some checks in /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/ExecutionEngine/JITLink/AArch64/ELF_relocations.s failed

--

********************


tmsri pushed a commit to tmsri/llvm-project that referenced this pull request Sep 19, 2024
The assertion that all out-edges of a BB can't be 0 is incorrect: they
can be, if that branch is on a cold subgraph.

Added validators and asserts about the expected proprerties of the
propagated counters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms LTO Link time optimization (regular/full LTO or ThinLTO) PGO Profile Guided Optimizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants