Skip to content

[DWARF] Emit line 0 source locations for BB padding nops #99496

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 29, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions llvm/include/llvm/CodeGen/DebugHandlerBase.h
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,8 @@ class DebugHandlerBase {
void beginBasicBlockSection(const MachineBasicBlock &MBB);
void endBasicBlockSection(const MachineBasicBlock &MBB);

virtual void beginCodeAlignment(const MachineBasicBlock &MBB) {}

/// Return Label preceding the instruction.
MCSymbol *getLabelBeforeInsn(const MachineInstr *MI);

Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3964,6 +3964,9 @@ void AsmPrinter::emitBasicBlockStart(const MachineBasicBlock &MBB) {
CurrentSectionBeginSym = MBB.getSymbol();
}

for (auto &Handler : DebugHandlers)
Handler->beginCodeAlignment(MBB);

// Emit an alignment directive for this block, if needed.
const Align Alignment = MBB.getAlignment();
if (Alignment != Align(1))
Expand Down
18 changes: 18 additions & 0 deletions llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3668,3 +3668,21 @@ bool DwarfDebug::alwaysUseRanges(const DwarfCompileUnit &CU) const {
return true;
return false;
}

void DwarfDebug::beginCodeAlignment(const MachineBasicBlock &MBB) {
if (MBB.getAlignment() == Align(1))
return;

auto *SP = MBB.getParent()->getFunction().getSubprogram();
bool NoDebug =
!SP || SP->getUnit()->getEmissionKind() == DICompileUnit::NoDebug;

if (NoDebug)
return;

auto PrevLoc = Asm->OutStreamer->getContext().getCurrentDwarfLoc();
Asm->OutStreamer->emitDwarfLocDirective(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually we call recordSourceLine(), but in this case we're assuming we're at the top of a basic block and (maybe) emitting nops; so, zeroing out flags and discriminator as well as line seems like a reasonable simplification.

What happens if the block is already aligned? No nops will be emitted, and presumably you'd end up with a .loc for a line 0 immediately followed by a .loc for the correct line. The line table will encode that correctly, but I'm suspicious that some consumers will decide the instruction actually has line 0 (because that's what it will find first for that address). Please test that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right. We control whether or not we call beginCodeAlignment in this if block but if other users call beginCodeAlignment unconditionally they would end up with a line 0 .loc immediately followed by the correct .loc.

The function is named beginCodeAlignment so one may argue you should only call it when emitting alignment nops but on the other hand we can easily move the if block into beginCodeAlignment so I'll do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it now skips the line-0 for a block that has 1-byte alignment, but not if the block happens to be aligned.

Copy link
Member Author

@omern1 omern1 Jul 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you mean.

0000000000000000 <f>:
       ; ...
       3: 7e 2b                         jle     0x30 <f+0x30>
       5: 66 2e 0f 1f 84 00 00 00 00 00 nopw    %cs:(%rax,%rax)
       f: 90                            nop
      10: e8 00 00 00 00                callq   0x15 <f+0x15>
      15: e8 00 00 00 00                callq   0x1a <f+0x1a>
      1a: e8 00 00 00 00                callq   0x1f <f+0x1f>
      1f: e8 00 00 00 00                callq   0x24 <f+0x24>
      24: e8 00 00 00 00                callq   0x29 <f+0x29>
      29: e8 00 00 00 00                callq   0x2e <f+0x2e>
      2e: eb e0                         jmp     0x10 <f+0x10>

      30: e8 00 00 00 00                callq   0x35 <f+0x35> ; BB starting here is a loop but has no nops
      35: eb f9                         jmp     0x30 <f+0x30>

We will end up with a line 0 record for address 0x30 immediately followed by a record with the correct line number.

The MCAlignFragment type only allows for the possibility of nops being emitted, it does not guarantee that they'll be emitted. The addresses aren't known till after the assembler computes layout which happens much later than here. I can't think of a way to handle this case as things currently stand though there may be other debug-info related mechanisms to deal with this that I'm not aware of, what do you suggest?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm yes at the AsmPrinter level we don't know whether we'll need any nops. That suggests you want to figure this out down in MC when you do know whether you will be emitting any nops. Or, possibly, when encoding the line table, look to see whether there's a line-0 immediately followed by a non-0 at the same offset, and suppress the line-0. Offhand I don't know where those kinds of things happen, other than "somewhere in MC," sorry.

Copy link
Member Author

@omern1 omern1 Jul 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That suggests you want to figure this out down in MC when you do know whether you will be emitting any nops.

That's going to be difficult because the line table has already been encoded by that time. In theory one could evaluate the line table to find the MCDwarfLineAddrFragment associated with the address that the nops are being emitted at but that doesn't seem practical.

Or, possibly, when encoding the line table, look to see whether there's a line-0 immediately followed by a non-0 at the same offset, and suppress the line-0.

Dwarf line entries are encoded in a MCDwarfLineAddrFragment which contains an expression and fixups to compute the delta between the current symbol and the last symbol encoded with fixups and then eventually end up as a link-time relocation so the absolute address isn't known until that time.

This seems like a design problem in the MC layer, there isn't a way to correlate a fragment with it's DWARF-related counterpart. This is probably partly due to the fact that fragments don't normally need to change once they've been created. I don't think this is solvable within the scope of this patch.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dwarf line entries are encoded in a MCDwarfLineAddrFragment which contains an expression and fixups to compute the delta between the current symbol and the last symbol encoded with fixups and then eventually end up as a link-time relocation so the absolute address isn't known until that time.

That doesn't sound right. There should be only one link-time relocation per sequence (usually, one per function). Line-table entries are encoded with a line-adjustment and an instruction-offset delta from the previous entry. It should be feasible to detect a zero instruction delta.

That said, it is a broader problem as I've seen these zero deltas in other situations as well. This patch makes it more prevalent but isn't introducing a new problem. At the AsmPrinter level, you have to assume that there could be nops introduced, and attaching line-0 to those nops is one reasonable solution.

Have you considered instead using the .loc of the first instruction of the new block? That is, emit the new block's .loc before emitting the .align directive. This is something that occurred to me just now, I haven't looked at the code to see whether that's feasible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, never mind. The line-table heuristic will put is_stmt on the nop instead of at the top of the block, and that will probably cause other problems. Go ahead with the line-0 idea.

PrevLoc.getFileNum(), 0, PrevLoc.getColumn(), 0, 0, 0, StringRef());
MCDwarfLineEntry::make(Asm->OutStreamer.get(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this call should be unnecessary? It's not used anywhere else in CodeGen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For instructions, usually the line information is recorded by AsmPrinter::emitFunctionBody() -> DwarfDebug::beginInstruction() -> recordSourceLine() -> MCStreamer::emitDwarfLocDirective() and then MCDwarfLineEntry::make is called by MCObjectStreamer::emitInstructionImpl().

In the case of nops we aren't going through the normal flow of emitting instructions so we want the line entry for our line information to be emitted immediately otherwise it'll be emitted at the beginning of the next instruction (the end of the nops).

This is also what MCObjectStreamer::emitInstructionImpl() does. The line entry is emitted first so that the label points to the beginning of the instruction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so because in this case we're looking at a .align directive rather than a real instruction, it won't be passing through emitInstructionImpl() so we need to call MCDwarfLineEntry::make() directly. Sounds reasonable.

Asm->OutStreamer->getCurrentSectionOnly());
}
3 changes: 3 additions & 0 deletions llvm/lib/CodeGen/AsmPrinter/DwarfDebug.h
Original file line number Diff line number Diff line change
Expand Up @@ -727,6 +727,9 @@ class DwarfDebug : public DebugHandlerBase {
/// Process beginning of an instruction.
void beginInstruction(const MachineInstr *MI) override;

/// Process beginning of code alignment.
void beginCodeAlignment(const MachineBasicBlock &MBB) override;

/// Perform an MD5 checksum of \p Identifier and return the lower 64 bits.
static uint64_t makeTypeSignature(StringRef Identifier);

Expand Down
4 changes: 2 additions & 2 deletions llvm/test/CodeGen/X86/fsafdo_test1.ll
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,9 @@
; Check that fs-afdo discriminators are generated.
; V01: .loc 1 7 3 is_stmt 0 discriminator 2 # foo.c:7:3
; V01: .loc 1 9 5 is_stmt 1 discriminator 2 # foo.c:9:5
; V0: .loc 1 9 5 is_stmt 0 discriminator 11266 # foo.c:9:5
; V0: .loc 1 9 5 discriminator 11266 # foo.c:9:5
; V0: .loc 1 7 3 is_stmt 1 discriminator 11266 # foo.c:7:3
; V1: .loc 1 9 5 is_stmt 0 discriminator 514 # foo.c:9:5
; V1: .loc 1 9 5 discriminator 514 # foo.c:9:5
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does your patch cause this change?

Copy link
Member Author

@omern1 omern1 Jul 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC .loc directives only show state changes, and we emit a .loc 1 0 0 is_stmt 0 before that so the assembly printer omits the is_stmt there. I've also verified that the line table is the same with and without this patch (apart from the new line 0 entries).

; V1: .loc 1 7 3 is_stmt 1 discriminator 258 # foo.c:7:3
; Check that variable __llvm_fs_discriminator__ is generated.
; V01: .type __llvm_fs_discriminator__,@object # @__llvm_fs_discriminator__
Expand Down
13 changes: 9 additions & 4 deletions llvm/test/CodeGen/X86/fsafdo_test4.ll
Original file line number Diff line number Diff line change
@@ -1,11 +1,16 @@
; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false < %s | FileCheck %s
; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true < %s | FileCheck %s
; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=false < %s | FileCheck --implicit-check-not=.loc %s
; RUN: llc -enable-fs-discriminator -improved-fs-discriminator=true < %s | FileCheck --implicit-check-not=.loc %s
;
; Check that fs-afdo discriminators are NOT generated, as debugInfoForProfiling is false (not set).
; CHECK: .loc 1 7 15 prologue_end discriminator 2 # foo.c:7:15
; CHECK: .loc 1 7 3 is_stmt 0 discriminator 2 # foo.c:7:3
; CHECK: .loc 1 0 3 # foo.c:0:3
; CHECK: .loc 1 9 5 is_stmt 1 discriminator 2 # foo.c:9:5
; CHECK-NOT: .loc 1 9 5 is_stmt 0 discriminator
; CHECK-NOT: .loc 1 7 3 is_stmt 1 discriminator
; CHECK: .loc 1 0 5 is_stmt 0 # :0:5
; CHECK: .loc 1 9 5 discriminator 2 # foo.c:9:5
; CHECK: .loc 1 0 5 # :0:5
; CHECK: .loc 1 7 3 is_stmt 1 discriminator 2 # foo.c:7:3
; CHECK: .loc 1 14 3 # foo.c:14:3
; Check that variable __llvm_fs_discriminator__ is NOT generated.
; CHECK-NOT: __llvm_fs_discriminator__:

Expand Down
55 changes: 55 additions & 0 deletions llvm/test/DebugInfo/X86/loop-align-debug.ll
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
; RUN: llc %s --filetype=obj -o %t
; RUN: llvm-objdump -d %t | FileCheck %s --check-prefixes=OBJ
; RUN: llvm-dwarfdump --debug-line %t | FileCheck %s --check-prefixes=DBG
; RUN: llc %s -o - | FileCheck %s --check-prefixes=ASM

; OBJ: 1:{{.*}}nop

;; Address Line Column File ISA Discriminator OpIndex Flags
; DBG: 0x0000000000000000 3 0 0 0 0 0 is_stmt
; DBG: 0x0000000000000001 0 0 0 0 0 0
; DBG: 0x0000000000000010 5 0 0 0 0 0 is_stmt prologue_end
; DBG: 0x0000000000000017 5 0 0 0 0 0 is_stmt end_sequence

; ASM: .loc 0 0 0 is_stmt 0
; ASM-NEXT: .L{{.*}}:
; ASM-NEXT: .p2align 4, 0x90

;; $ cat test.cpp
;; void g();
;; void f() {
;; [[clang::code_align(16)]]
;; while (1) { g(); }
;; }

target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define dso_local void @f() local_unnamed_addr !dbg !9 {
entry:
br label %while.body, !dbg !12

while.body: ; preds = %entry, %while.body
tail call void @g(), !dbg !12
br label %while.body, !dbg !12, !llvm.loop !13
}

declare !dbg !16 void @g() local_unnamed_addr

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}
!llvm.ident = !{!8}

!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, producer: "clang version 19.0.0git", isOptimized: true, runtimeVersion: 0, emissionKind: LineTablesOnly, splitDebugInlining: false, nameTableKind: None)
!1 = !DIFile(filename: "test.cpp", directory: "/")
!2 = !{i32 7, !"Dwarf Version", i32 5}
!3 = !{i32 2, !"Debug Info Version", i32 3}
!8 = !{!"clang version 19.0.0git"}
!9 = distinct !DISubprogram(name: "f", scope: !1, file: !1, line: 3, type: !10, scopeLine: 3, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0)
!10 = !DISubroutineType(types: !11)
!11 = !{}
!12 = !DILocation(line: 5, scope: !9)
!13 = distinct !{!13, !12, !12, !14, !15}
!14 = !{!"llvm.loop.mustprogress"}
!15 = !{!"llvm.loop.align", i32 16}
!16 = !DISubprogram(name: "g", scope: !1, file: !1, line: 2, type: !10, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized)
Loading