Skip to content

[DWARF] Emit line 0 source locations for BB padding nops #99496

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 29, 2024

Conversation

omern1
Copy link
Member

@omern1 omern1 commented Jul 18, 2024

This patch makes LLVM emit line 0 source locations for nops emitted as basic block padding.

CC @OCHyams

This patch makes LLVM emit line 0 source locations for nops emitted as
basic block padding.

Co-authored-by: Orlando Cazalet-Hyams <[email protected]>
@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2024

@llvm/pr-subscribers-debuginfo

@llvm/pr-subscribers-backend-x86

Author: Nabeel Omer (omern1)

Changes

This patch makes LLVM emit line 0 source locations for nops emitted as basic block padding.

CC @OCHyams


Full diff: https://github.com/llvm/llvm-project/pull/99496.diff

7 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/DebugHandlerBase.h (+2)
  • (modified) llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (+4)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp (+16)
  • (modified) llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h (+3)
  • (modified) llvm/test/CodeGen/X86/fsafdo_test1.ll (+2-2)
  • (modified) llvm/test/CodeGen/X86/fsafdo_test4.ll (+9-4)
  • (added) llvm/test/DebugInfo/X86/loop-align-debug.ll (+55)
diff --git a/llvm/include/llvm/CodeGen/DebugHandlerBase.h b/llvm/include/llvm/CodeGen/DebugHandlerBase.h
index 36a844e7087fa..30e56269b529d 100644
--- a/llvm/include/llvm/CodeGen/DebugHandlerBase.h
+++ b/llvm/include/llvm/CodeGen/DebugHandlerBase.h
@@ -137,6 +137,8 @@ class DebugHandlerBase {
   void beginBasicBlockSection(const MachineBasicBlock &MBB);
   void endBasicBlockSection(const MachineBasicBlock &MBB);
 
+  virtual void beginCodeAlignment(const MachineBasicBlock& MBB) {}
+
   /// Return Label preceding the instruction.
   MCSymbol *getLabelBeforeInsn(const MachineInstr *MI);
 
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 1f59ec545b4f7..78cb5df4bd02c 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -3967,7 +3967,11 @@ void AsmPrinter::emitBasicBlockStart(const MachineBasicBlock &MBB) {
   // Emit an alignment directive for this block, if needed.
   const Align Alignment = MBB.getAlignment();
   if (Alignment != Align(1))
+  {
+    for (auto &Handler : DebugHandlers)
+      Handler->beginCodeAlignment(MBB);
     emitAlignment(Alignment, nullptr, MBB.getMaxBytesForAlignment());
+  }
 
   // If the block has its address taken, emit any labels that were used to
   // reference the block.  It is possible that there is more than one label
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 80cd5ec501f25..36e53702930f8 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -3668,3 +3668,19 @@ bool DwarfDebug::alwaysUseRanges(const DwarfCompileUnit &CU) const {
     return true;
   return false;
 }
+
+void DwarfDebug::beginCodeAlignment(const MachineBasicBlock& MBB) {
+  auto *SP = MBB.getParent()->getFunction().getSubprogram();
+
+  bool NoDebug =
+      !SP || SP->getUnit()->getEmissionKind() == DICompileUnit::NoDebug;
+
+  if (NoDebug)
+    return;
+
+  auto PrevLoc = Asm->OutStreamer->getContext().getCurrentDwarfLoc();
+  Asm->OutStreamer->emitDwarfLocDirective(PrevLoc.getFileNum(), 0,
+                                                      0, 0, 0, 0, StringRef());
+  MCDwarfLineEntry::make(Asm->OutStreamer.get(),
+                         Asm->OutStreamer->getCurrentSectionOnly());
+}
\ No newline at end of file
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
index 452485b632c45..eb474c0366d82 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
@@ -727,6 +727,9 @@ class DwarfDebug : public DebugHandlerBase {
   /// Process beginning of an instruction.
   void beginInstruction(const MachineInstr *MI) override;
 
+  /// Process beginning of code alignment.
+  void beginCodeAlignment(const MachineBasicBlock& MBB) override;
+
   /// Perform an MD5 checksum of \p Identifier and return the lower 64 bits.
   static uint64_t makeTypeSignature(StringRef Identifier);
 
diff --git a/llvm/test/CodeGen/X86/fsafdo_test1.ll b/llvm/test/CodeGen/X86/fsafdo_test1.ll
index 61c0f59aba6f8..e80a7f2f354f2 100644
--- a/llvm/test/CodeGen/X86/fsafdo_test1.ll
+++ b/llvm/test/CodeGen/X86/fsafdo_test1.ll
@@ -4,9 +4,9 @@
 ; Check that fs-afdo discriminators are generated.
 ; V01: .loc    1 7 3 is_stmt 0 discriminator 2 # foo.c:7:3
 ; V01: .loc    1 9 5 is_stmt 1 discriminator 2 # foo.c:9:5
-; V0: .loc    1 9 5 is_stmt 0 discriminator 11266 # foo.c:9:5
+; V0: .loc    1 9 5 discriminator 11266 # foo.c:9:5
 ; V0: .loc    1 7 3 is_stmt 1 discriminator 11266 # foo.c:7:3
-; V1: .loc    1 9 5 is_stmt 0 discriminator 514 # foo.c:9:5
+; V1: .loc    1 9 5 discriminator 514 # foo.c:9:5
 ; V1: .loc    1 7 3 is_stmt 1 discriminator 258 # foo.c:7:3
 ; Check that variable __llvm_fs_discriminator__ is generated.
 ; V01: .type   __llvm_fs_discriminator__,@object # @__llvm_fs_discriminator__
diff --git a/llvm/test/CodeGen/X86/fsafdo_test4.ll b/llvm/test/CodeGen/X86/fsafdo_test4.ll
index 6a22ea9822412..aa05a0c4cd3ef 100644
--- a/llvm/test/CodeGen/X86/fsafdo_test4.ll
+++ b/llvm/test/CodeGen/X86/fsafdo_test4.ll
@@ -1,11 +1,16 @@
-; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false < %s | FileCheck %s
-; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true < %s | FileCheck %s
+; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false < %s | FileCheck --implicit-check-not=.loc %s
+; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true < %s | FileCheck --implicit-check-not=.loc %s
 ;
 ; Check that fs-afdo discriminators are NOT generated, as debugInfoForProfiling is false (not set).
+; CHECK: .loc    1 7 15 prologue_end discriminator 2 # foo.c:7:15
 ; CHECK: .loc    1 7 3 is_stmt 0 discriminator 2 # foo.c:7:3
+; CHECK: .loc    1 0 3 # foo.c:0:3
 ; CHECK: .loc    1 9 5 is_stmt 1 discriminator 2 # foo.c:9:5
-; CHECK-NOT: .loc    1 9 5 is_stmt 0 discriminator
-; CHECK-NOT: .loc    1 7 3 is_stmt 1 discriminator
+; CHECK: .loc    1 0 0 is_stmt 0 # :0:0
+; CHECK: .loc    1 9 5 discriminator 2 # foo.c:9:5
+; CHECK: .loc    1 0 0 # :0:0
+; CHECK: .loc	   1 7 3 is_stmt 1 discriminator 2 # foo.c:7:3
+; CHECK: .loc    1 14 3 # foo.c:14:3
 ; Check that variable __llvm_fs_discriminator__ is NOT generated.
 ; CHECK-NOT: __llvm_fs_discriminator__:
 
diff --git a/llvm/test/DebugInfo/X86/loop-align-debug.ll b/llvm/test/DebugInfo/X86/loop-align-debug.ll
new file mode 100644
index 0000000000000..a0302d08faa0c
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/loop-align-debug.ll
@@ -0,0 +1,55 @@
+; RUN: llc %s --filetype=obj -o %t
+; RUN: llvm-objdump -d %t | FileCheck %s --check-prefixes=OBJ
+; RUN: llvm-dwarfdump --debug-line %t | FileCheck %s --check-prefixes=DBG
+; RUN: llc %s -o - | FileCheck %s --check-prefixes=ASM
+
+; OBJ: 1:{{.*}}nop
+
+;;     Address            Line   Column File   ISA Discriminator OpIndex Flags
+; DBG: 0x0000000000000000      3      0      0   0             0       0  is_stmt
+; DBG: 0x0000000000000001      0      0      0   0             0       0
+; DBG: 0x0000000000000010      5      0      0   0             0       0  is_stmt prologue_end
+; DBG: 0x0000000000000017      5      0      0   0             0       0  is_stmt end_sequence
+
+; ASM:      .loc    0 0 0 is_stmt 0
+; ASM-NEXT: .L{{.*}}:
+; ASM-NEXT: .p2align        4, 0x90
+
+;; $ cat test.cpp
+;; void g();
+;; void f() {
+;;   [[clang::code_align(16)]]
+;;   while (1) { g(); }
+;; }
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local void @f() local_unnamed_addr !dbg !9 {
+entry:
+  br label %while.body, !dbg !12
+
+while.body:                                       ; preds = %entry, %while.body
+  tail call void @g(), !dbg !12
+  br label %while.body, !dbg !12, !llvm.loop !13
+}
+
+declare !dbg !16 void @g() local_unnamed_addr
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 19.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "test.cpp", directory: "/")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!8 = !{!"clang version 19.0.0git"}
+!9 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 3, type: !10, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
+!10 = !DISubroutineType(types: !11)
+!11 = !{}
+!12 = !DILocation(line: 5, scope: !9)
+!13 = distinct !{!13, !12, !12, !14, !15}
+!14 = !{!"llvm.loop.mustprogress"}
+!15 = !{!"llvm.loop.align", i32 16}
+!16 = !DISubprogram(name: "g", scope: !1, file: !1, line: 2, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)

Copy link

github-actions bot commented Jul 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@OCHyams
Copy link
Contributor

OCHyams commented Jul 18, 2024

cc @dwblaikie, for some context, we noticed this as it manifested in our debugger "step-over"-ing to an out of scope line number (a loop alignment nop inherited the source location from an inlined instruction before it, but isn't included in the inlined function's scope range). GDB and LLDB seem to avoid running into that issue while stepping over but it can still be observed while stepping at an instruction level. Either way, a code alignment nop feels unambiguously like an instruction that should be line-zeroed to me. wdyt?

; CHECK: .loc 1 9 5 is_stmt 1 discriminator 2 # foo.c:9:5
; CHECK-NOT: .loc 1 9 5 is_stmt 0 discriminator
; CHECK-NOT: .loc 1 7 3 is_stmt 1 discriminator
; CHECK: .loc 1 0 0 is_stmt 0 # :0:0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting the column to zero is sub-optimal, it requires extra line-table opcodes to set it to zero and then put it back. Please preserve the column number.

; V0: .loc 1 7 3 is_stmt 1 discriminator 11266 # foo.c:7:3
; V1: .loc 1 9 5 is_stmt 0 discriminator 514 # foo.c:9:5
; V1: .loc 1 9 5 discriminator 514 # foo.c:9:5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does your patch cause this change?

Copy link
Member Author

@omern1 omern1 Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC .loc directives only show state changes, and we emit a .loc 1 0 0 is_stmt 0 before that so the assembly printer omits the is_stmt there. I've also verified that the line table is the same with and without this patch (apart from the new line 0 entries).

auto PrevLoc = Asm->OutStreamer->getContext().getCurrentDwarfLoc();
Asm->OutStreamer->emitDwarfLocDirective(
PrevLoc.getFileNum(), 0, PrevLoc.getColumn(), 0, 0, 0, StringRef());
MCDwarfLineEntry::make(Asm->OutStreamer.get(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this call should be unnecessary? It's not used anywhere else in CodeGen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instructions, usually the line information is recorded by AsmPrinter::emitFunctionBody() -> DwarfDebug::beginInstruction() -> recordSourceLine() -> MCStreamer::emitDwarfLocDirective() and then MCDwarfLineEntry::make is called by MCObjectStreamer::emitInstructionImpl().

In the case of nops we aren't going through the normal flow of emitting instructions so we want the line entry for our line information to be emitted immediately otherwise it'll be emitted at the beginning of the next instruction (the end of the nops).

This is also what MCObjectStreamer::emitInstructionImpl() does. The line entry is emitted first so that the label points to the beginning of the instruction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so because in this case we're looking at a .align directive rather than a real instruction, it won't be passing through emitInstructionImpl() so we need to call MCDwarfLineEntry::make() directly. Sounds reasonable.

return;

auto PrevLoc = Asm->OutStreamer->getContext().getCurrentDwarfLoc();
Asm->OutStreamer->emitDwarfLocDirective(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we call recordSourceLine(), but in this case we're assuming we're at the top of a basic block and (maybe) emitting nops; so, zeroing out flags and discriminator as well as line seems like a reasonable simplification.

What happens if the block is already aligned? No nops will be emitted, and presumably you'd end up with a .loc for a line 0 immediately followed by a .loc for the correct line. The line table will encode that correctly, but I'm suspicious that some consumers will decide the instruction actually has line 0 (because that's what it will find first for that address). Please test that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. We control whether or not we call beginCodeAlignment in this if block but if other users call beginCodeAlignment unconditionally they would end up with a line 0 .loc immediately followed by the correct .loc.

The function is named beginCodeAlignment so one may argue you should only call it when emitting alignment nops but on the other hand we can easily move the if block into beginCodeAlignment so I'll do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it now skips the line-0 for a block that has 1-byte alignment, but not if the block happens to be aligned.

Copy link
Member Author

@omern1 omern1 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean.

0000000000000000 <f>:
       ; ...
       3: 7e 2b                         jle     0x30 <f+0x30>
       5: 66 2e 0f 1f 84 00 00 00 00 00 nopw    %cs:(%rax,%rax)
       f: 90                            nop
      10: e8 00 00 00 00                callq   0x15 <f+0x15>
      15: e8 00 00 00 00                callq   0x1a <f+0x1a>
      1a: e8 00 00 00 00                callq   0x1f <f+0x1f>
      1f: e8 00 00 00 00                callq   0x24 <f+0x24>
      24: e8 00 00 00 00                callq   0x29 <f+0x29>
      29: e8 00 00 00 00                callq   0x2e <f+0x2e>
      2e: eb e0                         jmp     0x10 <f+0x10>

      30: e8 00 00 00 00                callq   0x35 <f+0x35> ; BB starting here is a loop but has no nops
      35: eb f9                         jmp     0x30 <f+0x30>

We will end up with a line 0 record for address 0x30 immediately followed by a record with the correct line number.

The MCAlignFragment type only allows for the possibility of nops being emitted, it does not guarantee that they'll be emitted. The addresses aren't known till after the assembler computes layout which happens much later than here. I can't think of a way to handle this case as things currently stand though there may be other debug-info related mechanisms to deal with this that I'm not aware of, what do you suggest?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yes at the AsmPrinter level we don't know whether we'll need any nops. That suggests you want to figure this out down in MC when you do know whether you will be emitting any nops. Or, possibly, when encoding the line table, look to see whether there's a line-0 immediately followed by a non-0 at the same offset, and suppress the line-0. Offhand I don't know where those kinds of things happen, other than "somewhere in MC," sorry.

Copy link
Member Author

@omern1 omern1 Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That suggests you want to figure this out down in MC when you do know whether you will be emitting any nops.

That's going to be difficult because the line table has already been encoded by that time. In theory one could evaluate the line table to find the MCDwarfLineAddrFragment associated with the address that the nops are being emitted at but that doesn't seem practical.

Or, possibly, when encoding the line table, look to see whether there's a line-0 immediately followed by a non-0 at the same offset, and suppress the line-0.

Dwarf line entries are encoded in a MCDwarfLineAddrFragment which contains an expression and fixups to compute the delta between the current symbol and the last symbol encoded with fixups and then eventually end up as a link-time relocation so the absolute address isn't known until that time.

This seems like a design problem in the MC layer, there isn't a way to correlate a fragment with it's DWARF-related counterpart. This is probably partly due to the fact that fragments don't normally need to change once they've been created. I don't think this is solvable within the scope of this patch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dwarf line entries are encoded in a MCDwarfLineAddrFragment which contains an expression and fixups to compute the delta between the current symbol and the last symbol encoded with fixups and then eventually end up as a link-time relocation so the absolute address isn't known until that time.

That doesn't sound right. There should be only one link-time relocation per sequence (usually, one per function). Line-table entries are encoded with a line-adjustment and an instruction-offset delta from the previous entry. It should be feasible to detect a zero instruction delta.

That said, it is a broader problem as I've seen these zero deltas in other situations as well. This patch makes it more prevalent but isn't introducing a new problem. At the AsmPrinter level, you have to assume that there could be nops introduced, and attaching line-0 to those nops is one reasonable solution.

Have you considered instead using the .loc of the first instruction of the new block? That is, emit the new block's .loc before emitting the .align directive. This is something that occurred to me just now, I haven't looked at the code to see whether that's feasible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind. The line-table heuristic will put is_stmt on the nop instead of at the top of the block, and that will probably cause other problems. Go ahead with the line-0 idea.

emitAlignment(Alignment, nullptr, MBB.getMaxBytesForAlignment());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this isn't going to change, don't keep the braces.

return;

auto PrevLoc = Asm->OutStreamer->getContext().getCurrentDwarfLoc();
Asm->OutStreamer->emitDwarfLocDirective(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it now skips the line-0 for a block that has 1-byte alignment, but not if the block happens to be aligned.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still two points to answer: The call to MCDwarfLineEntry::make, and the braces on the if where you undid a code change.

Copy link
Collaborator

@pogo59 pogo59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@omern1 omern1 merged commit 03e17da into llvm:main Jul 29, 2024
7 checks passed
@omern1 omern1 deleted the nop-fix branch July 29, 2024 09:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants