[DebugInfo] Preserve line and column number when merging debug info. #129960

snehasish · 2025-03-06T00:03:25Z

This patch introduces a new option -preserve-merged-debug-info to preserve an arbitrary but deterministic version of debug information when DILocations are merged. This is intended to be used in production environments from which sample based profiles are derived such as AutoFDO and MemProf.

With this patch we have see a 0.2% improvement on an internal workload at Google when generating AutoFDO profiles. It also significantly improves the ability for MemProf by preserving debug info for merged call instructions used in the contextual profile.

This patch introduces a new option `-preserve-merged-debug-info` to preserve an arbitrary version of debug information when DILocations are merged. This is intended to be used in production environments from which sample based profiles are derived such as AutoFDO and MemProf. With this patch we have see a 0.2% improvement on an internal workload at Google when generating AutoFDO profiles. It also significantly improves the ability for MemProf by preserving debug info for merged call instructions used in the contextual profile. Co-authored-by: Krzysztof Pszeniczny <[email protected]>

llvmbot · 2025-03-06T00:03:58Z

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-debuginfo

Author: Snehasish Kumar (snehasish)

Changes

This patch introduces a new option -preserve-merged-debug-info to preserve an arbitrary version of debug information when DILocations are merged. This is intended to be used in production environments from which sample based profiles are derived such as AutoFDO and MemProf.

With this patch we have see a 0.2% improvement on an internal workload at Google when generating AutoFDO profiles. It also significantly improves the ability for MemProf by preserving debug info for merged call instructions used in the contextual profile.

Full diff: https://github.com/llvm/llvm-project/pull/129960.diff

2 Files Affected:

(modified) llvm/lib/IR/DebugInfoMetadata.cpp (+15)
(added) llvm/test/DebugInfo/preserve-merged-debug-info.ll (+65)

diff --git a/llvm/lib/IR/DebugInfoMetadata.cpp b/llvm/lib/IR/DebugInfoMetadata.cpp
index f975d4ca33ad9..b18db09820068 100644
--- a/llvm/lib/IR/DebugInfoMetadata.cpp
+++ b/llvm/lib/IR/DebugInfoMetadata.cpp
@@ -21,6 +21,7 @@
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Value.h"
+#include "llvm/Support/CommandLine.h"
 
 #include <numeric>
 #include <optional>
@@ -34,6 +35,12 @@ cl::opt<bool> EnableFSDiscriminator(
     cl::desc("Enable adding flow sensitive discriminators"));
 } // namespace llvm
 
+// When true, preserves line and column number by picking one of the merged
+// location info in a deterministic manner to assist sample based PGO.
+static cl::opt<bool> PreserveMergedDebugInfo(
+    "preserve-merged-debug-info", cl::init(false), cl::Hidden,
+    cl::desc("Preserve line and column number when merging locations."));
+
 uint32_t DIType::getAlignInBits() const {
   return (getTag() == dwarf::DW_TAG_LLVM_ptrauth_type ? 0 : SubclassData32);
 }
@@ -125,6 +132,14 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
   if (LocA == LocB)
     return LocA;
 
+  if (PreserveMergedDebugInfo) {
+    auto A = std::make_tuple(LocA->getLine(), LocA->getColumn(),
+                             LocA->getDiscriminator());
+    auto B = std::make_tuple(LocB->getLine(), LocB->getColumn(),
+                             LocB->getDiscriminator());
+    return A < B ? LocA : LocB;
+  }
+
   LLVMContext &C = LocA->getContext();
 
   using LocVec = SmallVector<const DILocation *>;
diff --git a/llvm/test/DebugInfo/preserve-merged-debug-info.ll b/llvm/test/DebugInfo/preserve-merged-debug-info.ll
new file mode 100644
index 0000000000000..d2b5af0d2d2c3
--- /dev/null
+++ b/llvm/test/DebugInfo/preserve-merged-debug-info.ll
@@ -0,0 +1,65 @@
+; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S | FileCheck %s
+; CHECK: tail call i32 @bar{{.*!dbg !}}[[TAG:[0-9]+]]
+; CHECK: ![[TAG]] = !DILocation(line: 9, column: 16, scope: !9) 
+
+; ModuleID = '../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c'
+source_filename = "../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c"
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+; Function Attrs: nounwind uwtable
+define dso_local i32 @test(i32 noundef %n) local_unnamed_addr #0 !dbg !9 {
+entry:
+  %call = tail call i32 @foo(i32 noundef %n) #2, !dbg !12
+  %cmp1 = icmp sgt i32 %n, 100, !dbg !13
+  br i1 %cmp1, label %if.then, label %if.else, !dbg !13
+
+if.then:                                          ; preds = %entry
+  %call2 = tail call i32 @bar(i32 noundef %n) #2, !dbg !14
+  %add = add nsw i32 %call2, %call, !dbg !15
+  br label %if.end, !dbg !16
+
+if.else:                                          ; preds = %entry
+  %call4 = tail call i32 @bar(i32 noundef %n) #2, !dbg !17
+  br label %if.end
+
+if.end:                                           ; preds = %if.else, %if.then
+  %r.0 = phi i32 [ %add, %if.then ], [ %call4, %if.else ], !dbg !18
+  ret i32 %r.0, !dbg !19
+}
+
+declare !dbg !20 i32 @foo(i32 noundef) local_unnamed_addr #1
+
+declare !dbg !21 i32 @bar(i32 noundef) local_unnamed_addr #1
+
+attributes #0 = { nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #2 = { nounwind }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 21.0.0git ([email protected]:snehasish/llvm-project.git 6ce41db6b0275d060d6e60f88b96a1657024345c)", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "../llvm/test/DebugInfo/Inputs/debug-info-merge-call.c", directory: "/usr/local/google/home/snehasishk/working/llvm-project/build-assert", checksumkind: CSK_MD5, checksum: "ac1be6c40dad11691922d600f9d55c55")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"PIE Level", i32 2}
+!7 = !{i32 7, !"uwtable", i32 2}
+!8 = !{!"clang version 21.0.0git ([email protected]:snehasish/llvm-project.git 6ce41db6b0275d060d6e60f88b96a1657024345c)"}
+!9 = distinct !DISubprogram(name: "test", scope: !1, file: !1, line: 5, type: !10, scopeLine: 5, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!10 = !DISubroutineType(types: !11)
+!11 = !{}
+!12 = !DILocation(line: 7, column: 13, scope: !9)
+!13 = !DILocation(line: 8, column: 8, scope: !9)
+!14 = !DILocation(line: 9, column: 16, scope: !9)
+!15 = !DILocation(line: 9, column: 14, scope: !9)
+!16 = !DILocation(line: 10, column: 3, scope: !9)
+!17 = !DILocation(line: 11, column: 10, scope: !9)
+!18 = !DILocation(line: 0, scope: !9)
+!19 = !DILocation(line: 13, column: 3, scope: !9)
+!20 = !DISubprogram(name: "foo", scope: !1, file: !1, line: 2, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
+!21 = !DISubprogram(name: "bar", scope: !1, file: !1, line: 1, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
+

jmorse

My understanding of this is that with this flag on, a consistent but essentially arbitrary source location will be kept for merged instructions -- potentially misleading for debugging (hence it's an option), but good for PGO, yes?

I wonder how effective this is in the general case, but you've got numbers for a workload, so SGTM.

jmorse · 2025-03-06T10:45:52Z

llvm/test/DebugInfo/preserve-merged-debug-info.ll

+attributes #0 = { nounwind uwtable "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #1 = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
+attributes #2 = { nounwind }


Nit -- if we can remove or reduce these attributes that'd be appreciated, it's a future maintenance burden when attributes change otherwise.

jmorse · 2025-03-06T10:46:41Z

llvm/test/DebugInfo/preserve-merged-debug-info.ll

+; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S | FileCheck %s
+; CHECK: tail call i32 @bar{{.*!dbg !}}[[TAG:[0-9]+]]
+; CHECK: ![[TAG]] = !DILocation(line: 9, column: 16, scope: !9) 
+


A file-level comment recording the intention of this test will assist future developers who cause this test to fail.

You know, we have unit tests for this API with decent fixtures:

llvm-project/llvm/unittests/IR/MetadataTest.cpp

Line 921 in a8d2d16

TEST_F(DILocationTest, Merge) {

Maybe this is more trouble than it's worth, but you could make the cl::opt non-static, declare it in the unittest, modify it directly (PreserveMergedDebugInfo = true/false) and test it that way. No strong preference on my part, though,

I'd prefer to leave this as is since the current test also covers the SimplifyCfg transformations usage of getMergedLocation . This is the case we found to be particularly impactful to preserve.

OCHyams · 2025-03-06T11:43:50Z

llvm/test/DebugInfo/preserve-merged-debug-info.ll

@@ -0,0 +1,65 @@
+; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S | FileCheck %s


I think the test could be made more robust by adding CHECKs to show that simplifycfg has done the hoisting (at the moment, I think this test would pass even if the hoisting didn't take place?).

Possibly also worth checking with -preserve-merged-debug-info=false too (that the instruction gets a merged loc) to avoid this test spuriously passing if, for whatever reason, the default behaviour changes to a non-merged-loc for this test? YMMV.

Done for both, thanks for the suggestion.

pogo59 · 2025-03-06T16:58:23Z

The commit message should say "arbitrary but deterministic." If it's just arbitrary, the workload got lucky and it might not remain lucky. If it's deterministic, the luck won't change.

dwblaikie · 2025-03-06T19:02:04Z

This seems like a pretty substantial departure from LLVM's documented approach to instruction line information - a policy that was established predominantly as an agreement/unified needs between Google (for profile accuracy) and Sony (for user debuggability/not so "jumpy" stepping).

So this deserves a bit more discussion to better understand why we're considering departing from that established decision now.

For instance, this seems like it does the wrong thing for PGO, as far I've understood the goals, for an example like this:

void f1();
void f2();
void f3();
void f4(bool b) {
  if (b) {
    f1();
    f2();
  } else {
    f1();
    f3();
  }
}

The call to f1() is hoisted out of the if/else, and currently would be given a source location of line zero - whereas with this change, it receives line 6 (the first of the two calls).

The premise so far has been that that change is harmful for users (they might incorrectly conclude that since line 6 is reached, then b must be true, even when it is false) and for profile-guided optimizations (for similar reasons - the compiler might assume that b is true, that the f2 call is similarly hot, and the f3 call is cold).

Could you provide more context for how this ^ benefits PGO? Or which cases do benefit?

snehasish · 2025-03-06T22:54:59Z

Thanks for taking a look @dwblaikie.

This seems like a pretty substantial departure from LLVM's documented approach to instruction line information - a policy that was established predominantly as an agreement/unified needs between Google (for profile accuracy) and Sony (for user debuggability/not so "jumpy" stepping).

We do not intend to change the behaviour for scenarios where users are stepping through the binary. This behaviour is intended to be enabled in production where sample profiles are collected, similar in spirit to -fdebug-info-for-profiling. This is the reason why we will keep this gated behind a flag.

So this deserves a bit more discussion to better understand why we're considering departing from that established decision now.

For instance, this seems like it does the wrong thing for PGO, as far I've understood the goals, for an example like this:
void f1();
void f2();
void f3();
void f4(bool b) {
  if (b) {
    f1();
    f2();
  } else {
    f1();
    f3();
  }
}
The call to f1() is hoisted out of the if/else, and currently would be given a source location of line zero - whereas with this change, it receives line 6 (the first of the two calls).

The premise so far has been that that change is harmful for users (they might incorrectly conclude that since line 6 is reached, then b must be true, even when it is false) and for profile-guided optimizations (for similar reasons - the compiler might assume that b is true, that the f2 call is similarly hot, and the f3 call is cold).

Could you provide more context for how this ^ benefits PGO? Or which cases do benefit?

I believe this notion of attribution for AutoFDO is outdated. Today hotness inference for a basic block relies on branch target information. In your example, we would use the debug information attached to the if statement to determine the hotness of the callsites for f2 and f3.

Or which cases do benefit?

Debug info loss on merged indirect calls is particularly important since it disallows indirect call promotion. The net impact of retaining a deterministic version of merged debug info for a large workload is positive. MemProf is a context sensitive profile which is particularly susceptible to loss of debug information on direct and indirect calls. This is part of a broader effort to improve profile quality for AutoFDO and MemProf by addressing missing debug info. I think we will find additional opportunities and once we have a better idea, I can update the documentation and guidance at https://llvm.org/docs/HowToUpdateDebugInfo.html.

Does that sound reasonable to you?

snehasish · 2025-03-06T22:59:31Z

Thanks for taking a look @pogo59 @jmorse!

The commit message should say "arbitrary but deterministic." If it's just arbitrary, the workload got lucky and it might not remain lucky. If it's deterministic, the luck won't change.

Yes, determinism is necessary for accurate profile attribution. Updated the PR description.

My understanding of this is that with this flag on, a consistent but essentially arbitrary source location will be kept for merged instructions -- potentially misleading for debugging (hence it's an option), but good for PGO, yes?

I wonder how effective this is in the general case, but you've got numbers for a workload, so SGTM.

Yes, it's a net positive for PGO and not meant to be used in interactive debugging sessions.

dwblaikie · 2025-03-06T23:49:04Z

Thanks for taking a look @dwblaikie.

This seems like a pretty substantial departure from LLVM's documented approach to instruction line information - a policy that was established predominantly as an agreement/unified needs between Google (for profile accuracy) and Sony (for user debuggability/not so "jumpy" stepping).

We do not intend to change the behaviour for scenarios where users are stepping through the binary.

That's hard to separate -people still want to debug their production binaries.

This behaviour is intended to be enabled in production where sample profiles are collected, similar in spirit to -fdebug-info-for-profiling. This is the reason why we will keep this gated behind a flag.

-fdebug-info-for-profiling, so far, adds extra debug info but doesn't harm/change/remove existing debug info - sounds like we're considering diverging from that, which seems difficult.

I believe this notion of attribution for AutoFDO is outdated. Today hotness inference for a basic block relies on branch target information. In your example, we would use the debug information attached to the if statement to determine the hotness of the callsites for f2 and f3.

Hrm :/ Sort of following, thanks.

Or which cases do benefit?

Debug info loss on merged indirect calls is particularly important since it disallows indirect call promotion.

Could you walk me through this in more detail - exactly what DWARF is generated, how does it get consumed, and what decisions are made based on it?

The net impact of retaining a deterministic version of merged debug info for a large workload is positive. MemProf is a context sensitive profile which is particularly susceptible to loss of debug information on direct and indirect calls. This is part of a broader effort to improve profile quality for AutoFDO and MemProf by addressing missing debug info. I think we will find additional opportunities and once we have a better idea, I can update the documentation and guidance at https://llvm.org/docs/HowToUpdateDebugInfo.html.

Does that sound reasonable to you?

I'd be inclined to figure out the principles here a bit earlier/somewhat up front, before we make changes. I'm especially concerned about a divergence between what's good for profilers and what's good for humans.

rnk · 2025-03-06T23:53:53Z

I think we will find additional opportunities and once we have a better idea, I can update the documentation and guidance at https://llvm.org/docs/HowToUpdateDebugInfo.html.
Does that sound reasonable to you?

I'd be inclined to figure out the principles here a bit earlier/somewhat up front, before we make changes. I'm especially concerned about a divergence between what's good for profilers and what's good for humans.

I think this is probably the most important thing: documenting the principles of what we do, why, what AFDO needs, what profilers need, etc, what modes we have, and what we expect them to do.

Over a long enough time horizon, it seems like we'd want to make this new behavior part of -fdebug-info-for-profiling, but I understand the need for separate feature flags in the short term.

llvm/docs/SourceLevelDebugging.rst

dwblaikie · 2025-03-17T14:52:32Z

I think we will find additional opportunities and once we have a better idea, I can update the documentation and guidance at https://llvm.org/docs/HowToUpdateDebugInfo.html.
Does that sound reasonable to you?

I'd be inclined to figure out the principles here a bit earlier/somewhat up front, before we make changes. I'm especially concerned about a divergence between what's good for profilers and what's good for humans.

I think this is probably the most important thing: documenting the principles of what we do, why, what AFDO needs, what profilers need, etc, what modes we have, and what we expect them to do.

Documenting it seems good - perhaps more than has been posted so far (the AFDO documentation doesn't discuss the issue of "samples of a source location inside a block could lead to misconstrued assumptions about that block being reachable"). But even then, I'm still really worried that this makes the needs of different consumers divergent to a problematic degree - we managed to make the direction align between a variety of consumers (humans, profilers, profile driven optimization, etc) so far and diverging from that seems like a real problem. (even just at Google - humans use profiles from this same debug info, and if we tell them that some block is hot when it isn't, that seems potentially really unfortunate)

snehasish · 2025-03-18T00:53:25Z

Debug info loss on merged indirect calls is particularly important since it disallows indirect call promotion.

Could you walk me through this in more detail - exactly what DWARF is generated, how does it get consumed, and what decisions are made based on it?

Here is a pointer to a prior discussion on merging value profile metadata which notes the need to retain debug information to be able to assign the right profile information to indirect calls. For AFDO we rely on the start line of the function (needs -fdebug-info-for-profiling) and the line number of the indirect call (-g1 is sufficient) to compute the line offset. This is used to generate a profile like this example. Here Line 9 in the profile has 2 entries which indicate the targets for the indirect call and their counts. In the absence of line numbers for a merged indirect call we fail to map the profiled targets to the indirect call in the IR. Note that retaining line number information is the first step to enabling more effective ICP (details in D148876). So the 0.2% number I shared is conservative and there is more room for improvement!

I think this is probably the most important thing: documenting the principles of what we do, why, what AFDO needs, what profilers need, etc, what modes we have, and what we expect them to do.

Documenting it seems good - perhaps more than has been posted so far (the AFDO documentation doesn't discuss the issue of "samples of a source location inside a block could lead to misconstrued assumptions about that block being reachable"). But even then, I'm still really worried that this makes the needs of different consumers divergent to a problematic degree - we managed to make the direction align between a variety of consumers (humans, profilers, profile driven optimization, etc) so far and diverging from that seems like a real problem. (even just at Google - humans use profiles from this same debug info, and if we tell them that some block is hot when it isn't, that seems potentially really unfortunate)

I agree - with this change we are no longer have a single aligned implementation for all users. However, the cost we pay in terms of performance for alignment increases with time as more aggressive optimizations are introduced. For newer optimizations such as MemProf the loss of debug information on call insts is a significant contributor to coverage loss. Basically, the trade off we have encoded in LLVM increases the downside for PGO with the passage of time.

With this change I want us to be able to quantify the impact in production. It's possible we will encounter situations where this is not ideal (e.g. crash stack traces) but that's ok in my opinion. In the short term we can disable the change for those affected binaries and in the long term, gives us motivation to champion work on improving source location fidelity for humans and tools.

Are there any further changes / clarifications I can make to help move this along? @rnk @dwblaikie

rnk

Thanks for the docs!

This makes me wonder if we should look into some higher-level integration testing that tracks the number of call sites that lack source locations, since call site source locations are critical for so many applications.

rnk · 2025-03-20T16:37:45Z

llvm/docs/HowToUpdateDebugInfo.rst

+In particular, loss of location information for calls inhibit optimizations 
+such as indirect call promotion. This behavior can be optionally enabled until
+support for accurately representing merged instructions in the line table is
+implemented. 


Are branch instruction locations also critical? I've had more conversations with other sample PGO folks, and my mental model is that PGO constructs branch weights from LBR (Last Branch Record) counters. So, the key to getting accurate branch weights is not having deterministic, distinct source locations on every instruction in the basic block, it's tracking deterministic, distinct source locations on every branch instruction. I think it turns out that, in practice, branch instructions are control instructions, so they tend to either folded away, duplicated in order (inlining, unswitching), or left alone. It doesn't really make sense to speculate a branch instruction. The folding we do just needs to be extremely careful about tracking slocs, just like it has to care about updating branch weights.

If that's accurate, I think it would be helpful to document that maintaining deterministic and distinct source locations are what's important for sample PGO. I think this would go a long way to helping motivate why we can safely retain more source locations on out-of-order instructions without negatively impacting sample PGO results.

Are branch instruction locations also critical?

Yes, though the design of sample attribution mitigates this somewhat by inferring block weight from the max of all instructions in the basic block. Calls are more important since they carry additional metadata apart from their own execution count as noted earlier.

Updated text, ptal.

rnk · 2025-03-20T18:18:01Z

I'd be inclined to figure out the principles here a bit earlier/somewhat up front, before we make changes. I'm especially concerned about a divergence between what's good for profilers and what's good for humans.
...
I'm still really worried that this makes the needs of different consumers divergent to a problematic degree - we managed to make the direction align between a variety of consumers (humans, profilers, profile driven optimization, etc) so far and diverging from that seems like a real problem.

I share this concern about aligning the needs . At the end of the day, the compiler only emits one kind of debug info, and it's used for many purposes. We need to make reasonable tradeoffs. I view this copt as a form of short term tech debt to unblock current PGO use cases. It doesn't feel reasonable to me to insist that at this moment, to uphold the desired PGO invariant that all call sites have determinstic source locations, that we stop now and come up with a design for that up front.

As long as we have agreement that we need such a design and that this is the long term goal, I'm comfortable living with this (I won't even say short term, because everything lives longer than you expect) tech debt tradeoff.

rnk · 2025-03-20T18:23:05Z

As long as we have agreement that we need such a design and that this is the long term goal, I'm comfortable living with this (I won't even say short term, because everything lives longer than you expect) tech debt tradeoff.

To say more (I should edit more before replying), I think documenting what PGO needs from debug info is a necessary step on the path to creating this alignment. We have some of this in HowToUpdateDebugInfo.rst, but a lot of these requirements for interactive stepping, PGO control dependency, and sanitizer reports have largely been in our heads, and that makes it harder for the LLVM project to function as a globally distributed open source community. Which is why I think it's a good tradeoff to accept some tech debt for improved documentation.

snehasish · 2025-03-24T22:49:52Z

Thanks for the docs!

This makes me wonder if we should look into some higher-level integration testing that tracks the number of call sites that lack source locations, since call site source locations are critical for so many applications.

+1 to this. We've had similar discussions about profile information propagation before.

I think it would be helpful to document that maintaining deterministic and distinct source locations are what's important for sample PGO.

Yes, the need for distinction also motivated the work on DWARF path discriminators for duplicated basic blocks. More recent work on flow sensitive AFDO has extended the use of discriminators to enhance profile accuracy particularly in the backend.

As long as we have agreement that we need such a design and that this is the long term goal, I'm comfortable living with this

Let me know if you want me to file an issue or if you want any further clarification on SamplePGO needs in the debug info documentation. Thanks!

rnk · 2025-03-31T20:25:36Z

llvm/lib/IR/DebugInfoMetadata.cpp

@@ -34,6 +35,12 @@ cl::opt<bool> EnableFSDiscriminator(
    cl::desc("Enable adding flow sensitive discriminators"));
 } // namespace llvm

+// When true, preserves line and column number by picking one of the merged


I'd like to have a more descriptive name for this option, since we are discarding information, we're not preserving half of the merged source location. I came up with some suggestions:

-merge-source-locs-by-picking

-pick-merged-source-locations

-merge-source-locations-with-selection

-merge-source-locations-by-choice

-no-approximate-merged-source-locs

I'm really just trying various synonyms for "pick", "select", "choose".

Done, "pick-merged-source-locations" sounds good to me.

llvm/lib/IR/DebugInfoMetadata.cpp

rnk · 2025-03-31T20:36:32Z

llvm/lib/IR/DebugInfoMetadata.cpp

@@ -125,6 +132,14 @@ DILocation *DILocation::getMergedLocation(DILocation *LocA, DILocation *LocB) {
  if (LocA == LocB)
    return LocA;

+  if (PreserveMergedDebugInfo) {
+    auto A = std::make_tuple(LocA->getLine(), LocA->getColumn(),


Should filenames be compared here? DILocation has getFilename and getDirectory, which return StringRefs, so you can fill them in here in a reasonably straightforward way.

However, given the use case, would it be reasonable to always pick LocA?

Regardless, I think adding directories and files would follow the principle of least surprise.

Sure, I've added getFilename and getDirectory to the comparison. Note that SamplePGO doesn't rely on file and directory today when using instruction locations.

would it be reasonable to always pick LocA

I think the pick policy should be independent of the order of the parameters passed to this function. So getMergedLocation(A, B) == getMergedLocation(B, A).

rnk · 2025-03-31T20:43:09Z

llvm/test/DebugInfo/preserve-merged-debug-info.ll

+; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S | FileCheck %s
+; CHECK: tail call i32 @bar{{.*!dbg !}}[[TAG:[0-9]+]]
+; CHECK: ![[TAG]] = !DILocation(line: 9, column: 16, scope: !9) 
+


You know, we have unit tests for this API with decent fixtures:

llvm-project/llvm/unittests/IR/MetadataTest.cpp

Line 921 in a8d2d16

TEST_F(DILocationTest, Merge) {

Maybe this is more trouble than it's worth, but you could make the cl::opt non-static, declare it in the unittest, modify it directly (PreserveMergedDebugInfo = true/false) and test it that way. No strong preference on my part, though,

github-actions · 2025-03-31T23:56:57Z

✅ With the latest revision this PR passed the C/C++ code formatter.

rnk

Thanks!

snehasish · 2025-04-02T17:01:52Z

I'll wait a day and merge this on Friday morning if there are no more comments. Thanks!

rlavaee · 2025-05-02T16:34:08Z

llvm/lib/IR/DebugInfoMetadata.cpp

+  // rather than computing a merged location using line 0, which is typically
+  // not useful for PGO.
+  if (PickMergedSourceLocations) {
+    auto A = std::make_tuple(LocA->getLine(), LocA->getColumn(),


nit: Does std::forward_as_tuple work here (to avoid copies)?

No, the only place it would be useful is if getFilename and getDirectory returned string, but they return StringRef. Also you'd have to rewrite this to do the comparison without the auto A and auto B vars or you'd introduce lifetime issues if you just replaced make_tuple with forward_as_tuple.

llvmbot added debuginfo llvm:ir labels Mar 6, 2025

snehasish requested review from teresajohnson and dwblaikie March 6, 2025 00:04

jmorse reviewed Mar 6, 2025

View reviewed changes

OCHyams reviewed Mar 6, 2025

View reviewed changes

Address comments.

df04e25

Update docs to mention SamplePGO debug info usage.

e43ffd1

dwblaikie reviewed Mar 17, 2025

View reviewed changes

llvm/docs/SourceLevelDebugging.rst Outdated Show resolved Hide resolved

snehasish requested review from rnk, dwblaikie, jmorse and OCHyams March 18, 2025 00:54

Fix typo.

ffea531

rnk reviewed Mar 20, 2025

View reviewed changes

Enhance the documentation for SamplePGO.

713ae85

snehasish requested a review from rnk March 25, 2025 16:20

rnk reviewed Mar 31, 2025

View reviewed changes

Address comments.

4ba8427

Fix formatting.

7129758

dwblaikie mentioned this pull request Apr 1, 2025

[DebugInfo] getMergedLocation: match scopes based on their location #132286

Merged

rnk approved these changes Apr 2, 2025

View reviewed changes

snehasish merged commit f9193f3 into llvm:main Apr 4, 2025
12 checks passed

rlavaee reviewed May 2, 2025

View reviewed changes

		@@ -0,0 +1,65 @@
		; RUN: opt %s -passes=simplifycfg -hoist-common-insts -preserve-merged-debug-info -S \| FileCheck %s

[DebugInfo] Preserve line and column number when merging debug info. #129960

[DebugInfo] Preserve line and column number when merging debug info. #129960

Uh oh!

Conversation

snehasish commented Mar 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Mar 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jmorse left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pogo59 commented Mar 6, 2025

Uh oh!

dwblaikie commented Mar 6, 2025

Uh oh!

snehasish commented Mar 6, 2025

Uh oh!

snehasish commented Mar 6, 2025

Uh oh!

dwblaikie commented Mar 6, 2025

Uh oh!

rnk commented Mar 6, 2025

Uh oh!

Uh oh!

dwblaikie commented Mar 17, 2025

Uh oh!

snehasish commented Mar 18, 2025

Uh oh!

rnk left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rnk commented Mar 20, 2025

Uh oh!

rnk commented Mar 20, 2025

Uh oh!

snehasish commented Mar 24, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Mar 31, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rnk left a comment

Choose a reason for hiding this comment

Uh oh!

snehasish commented Apr 2, 2025

Uh oh!

Uh oh!

Choose a reason for hiding this comment

snehasish commented Mar 6, 2025 •

edited

Loading

llvmbot commented Mar 6, 2025 •

edited

Loading

github-actions bot commented Mar 31, 2025 •

edited

Loading