[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences #110192
@llvm/pr-subscribers-mc

Author: None (alx32)

Changes

Summary

This patch introduces a new compiler option […]

Background

Previous similar PR: #93137. That PR was very similar to this one, but at the time the assembler had no support for emitting labels within the line table. That support was added in PR #99710, and this PR builds on part of it.

In the current implementation, Clang generates line information in the […]

For example, when functions are merged by ICF in LLD, multiple functions may end up sharing the same address range. Without an explicit link between functions and their line entries, tools cannot attribute line information to the correct function, which degrades debugging and call stack resolution.

Implementation Details

End-of-Sequence Markers: Emits an explicit DW_LNE_end_sequence after each function's line entries in the line table. This marks the end of the line information for that function, so each function's line entries are clearly delimited.

Assembler and Streamer Modifications: Modifies MCStreamer and related classes to emit the necessary labels and to track the current function's line entries. A new flag, GenerateFuncLineTableOffsets, controls this behavior.

Compiler Option: Introduces the […]

Full diff: https://github.com/llvm/llvm-project/pull/110192.diff

7 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/Dwarf.def b/llvm/include/llvm/BinaryFormat/Dwarf.def
index d55947fc5103ac..b1fa81a2fc6abd 100644
--- a/llvm/include/llvm/BinaryFormat/Dwarf.def
+++ b/llvm/include/llvm/BinaryFormat/Dwarf.def
@@ -617,6 +617,7 @@ HANDLE_DW_AT(0x3e07, LLVM_apinotes, 0, APPLE)
HANDLE_DW_AT(0x3e08, LLVM_ptrauth_isa_pointer, 0, LLVM)
HANDLE_DW_AT(0x3e09, LLVM_ptrauth_authenticates_null_values, 0, LLVM)
HANDLE_DW_AT(0x3e0a, LLVM_ptrauth_authentication_mode, 0, LLVM)
+HANDLE_DW_AT(0x3e0b, LLVM_stmt_sequence, 0, LLVM)
// Apple extensions.
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index bea79545d1ab96..e7e1bef1ad2d72 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -123,6 +123,9 @@ class MCDwarfLoc {
friend class MCContext;
friend class MCDwarfLineEntry;
+ // DwarfDebug::endFunctionImpl needs to construct MCDwarfLoc(IsEndOfFunction)
+ friend class DwarfDebug;
+
MCDwarfLoc(unsigned fileNum, unsigned line, unsigned column, unsigned flags,
unsigned isa, unsigned discriminator)
: FileNum(fileNum), Line(line), Column(column), Flags(flags), Isa(isa),
@@ -239,7 +242,7 @@ class MCLineSection {
// Add an end entry by cloning the last entry, if exists, for the section
// the given EndLabel belongs to. The label is replaced by the given EndLabel.
- void addEndEntry(MCSymbol *EndLabel);
+ void addEndEntry(MCSymbol *EndLabel, bool generatingFuncLineTableOffsets);
using MCDwarfLineEntryCollection = std::vector<MCDwarfLineEntry>;
using iterator = MCDwarfLineEntryCollection::iterator;
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 707aecc5dc578e..d6d5970917401d 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -251,6 +251,15 @@ class MCStreamer {
/// discussion for future inclusion.
bool AllowAutoPadding = false;
+ // Flag specifying whether functions will have an offset into the line table
+ // where the line data for that function starts.
+ bool GenerateFuncLineTableOffsets = false;
+
+ // Stream symbol for the first line entry of the function currently being
+ // generated. This symbol can be used to reference where the line entries
+ // for the function start in the generated line table.
+ MCSymbol *CurrentFuncFirstLineStreamSym = nullptr;
+
protected:
MCFragment *CurFrag = nullptr;
@@ -313,6 +322,24 @@ class MCStreamer {
void setAllowAutoPadding(bool v) { AllowAutoPadding = v; }
bool getAllowAutoPadding() const { return AllowAutoPadding; }
+ void setGenerateFuncLineTableOffsets(bool v) {
+ GenerateFuncLineTableOffsets = v;
+ }
+ bool getGenerateFuncLineTableOffsets() const {
+ return GenerateFuncLineTableOffsets;
+ }
+
+ // Use the below functions to track the symbol that points to the current
+ // function's line info in the output stream.
+ void beginFunction() { CurrentFuncFirstLineStreamSym = nullptr; }
+ void emittedLineStreamSym(MCSymbol *StreamSym) {
+ if (!CurrentFuncFirstLineStreamSym)
+ CurrentFuncFirstLineStreamSym = StreamSym;
+ }
+ MCSymbol *getCurrentFuncFirstLineStreamSym() {
+ return CurrentFuncFirstLineStreamSym;
+ }
+
/// When emitting an object file, create and emit a real label. When emitting
/// textual assembly, this should do nothing to avoid polluting our output.
virtual MCSymbol *emitCFILabel();
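
The tracking added to MCStreamer in the hunk above follows a "first label after a reset wins" pattern. The following is a minimal standalone sketch of that pattern, not LLVM code: `Symbol`, `FirstLineTracker`, and `replayTwoFunctions` are hypothetical names introduced only for illustration, and the label names are made up.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for MCSymbol; only its identity and name matter here.
struct Symbol {
  std::string Name;
};

// Hypothetical miniature of the per-function state the hunk adds to MCStreamer.
class FirstLineTracker {
  const Symbol *First = nullptr;

public:
  // Start of a function: forget the previous function's first-line label.
  void beginFunction() { First = nullptr; }

  // Called for every emitted line entry: only the first label after a
  // reset is recorded, later ones are ignored.
  void emittedLineStreamSym(const Symbol *S) {
    if (!First)
      First = S;
  }

  const Symbol *firstLineSym() const { return First; }
};

// Replays the line entries of two functions and returns the label name
// recorded for each of them.
std::pair<std::string, std::string> replayTwoFunctions() {
  Symbol A{"Ltmp0"}, B{"Ltmp1"}, C{"Ltmp2"};
  FirstLineTracker T;

  T.beginFunction();
  T.emittedLineStreamSym(&A);
  T.emittedLineStreamSym(&B); // ignored: A was recorded first
  assert(T.firstLineSym() != nullptr);
  std::string First = T.firstLineSym()->Name;

  T.beginFunction(); // the second function starts with no label
  assert(T.firstLineSym() == nullptr);
  T.emittedLineStreamSym(&C);
  std::string Second = T.firstLineSym()->Name;

  return {First, Second};
}
```

In this sketch a function that emits no line entries leaves the tracker null, which matches the null check guarding the DW_AT_LLVM_stmt_sequence attribute in the DwarfCompileUnit.cpp hunk.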
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 0a1ff189bedbc4..c62075cf77c45a 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -527,6 +527,14 @@ DIE &DwarfCompileUnit::updateSubprogramScopeDIE(const DISubprogram *SP) {
*DD->getCurrentFunction()))
addFlag(*SPDie, dwarf::DW_AT_APPLE_omit_frame_ptr);
+ if (Asm->OutStreamer->getGenerateFuncLineTableOffsets() &&
+ Asm->OutStreamer->getCurrentFuncFirstLineStreamSym()) {
+ addSectionLabel(
+ *SPDie, dwarf::DW_AT_LLVM_stmt_sequence,
+ Asm->OutStreamer->getCurrentFuncFirstLineStreamSym(),
+ Asm->getObjFileLowering().getDwarfLineSection()->getBeginSymbol());
+ }
+
// Only include DW_AT_frame_base in full debug info
if (!includeMinimalInlineScopes()) {
const TargetFrameLowering *TFI = Asm->MF->getSubtarget().getFrameLowering();
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index e9649f9ff81658..bd6d5e0ea7a363 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -170,6 +170,12 @@ static cl::opt<DwarfDebug::MinimizeAddrInV5> MinimizeAddrInV5Option(
"Stuff")),
cl::init(DwarfDebug::MinimizeAddrInV5::Default));
+static cl::opt<bool> EmitFuncLineTableOffsetsOption(
+ "emit-func-debug-line-table-offsets", cl::Hidden,
+ cl::desc("Include line table offset in function's debug info and emit end "
+ "sequence after each function's line data."),
+ cl::init(false));
+
static constexpr unsigned ULEB128PadSize = 4;
void DebugLocDwarfExpression::emitOp(uint8_t Op, const char *Comment) {
@@ -443,6 +449,8 @@ DwarfDebug::DwarfDebug(AsmPrinter *A)
Asm->OutStreamer->getContext().setDwarfVersion(DwarfVersion);
Asm->OutStreamer->getContext().setDwarfFormat(Dwarf64 ? dwarf::DWARF64
: dwarf::DWARF32);
+ Asm->OutStreamer->setGenerateFuncLineTableOffsets(
+ EmitFuncLineTableOffsetsOption);
}
// Define out of line so we don't have to include DwarfUnit.h in DwarfDebug.h.
@@ -2221,6 +2229,10 @@ void DwarfDebug::beginFunctionImpl(const MachineFunction *MF) {
if (SP->getUnit()->getEmissionKind() == DICompileUnit::NoDebug)
return;
+ // Notify the streamer that we are beginning a function - this will reset the
+ // label pointing to the currently generated function's first line entry
+ Asm->OutStreamer->beginFunction();
+
DwarfCompileUnit &CU = getOrCreateDwarfCompileUnit(SP->getUnit());
Asm->OutStreamer->getContext().setDwarfCompileUnitID(
@@ -2249,7 +2261,8 @@ void DwarfDebug::terminateLineTable(const DwarfCompileUnit *CU) {
getDwarfCompileUnitIDForLineTable(*CU));
// Add the last range label for the given CU.
LineTable.getMCLineSections().addEndEntry(
- const_cast<MCSymbol *>(CURanges.back().End));
+ const_cast<MCSymbol *>(CURanges.back().End),
+ EmitFuncLineTableOffsetsOption);
}
void DwarfDebug::skippedNonDebugFunction() {
@@ -2342,6 +2355,21 @@ void DwarfDebug::endFunctionImpl(const MachineFunction *MF) {
// Construct call site entries.
constructCallSiteEntryDIEs(*SP, TheCU, ScopeDIE, *MF);
+ // If we're emitting line table offsets, we also need to emit an end label
+ // after all of the function's line entries.
+ if (EmitFuncLineTableOffsetsOption) {
+ MCSymbol *LineSym = Asm->OutStreamer->getContext().createTempSymbol();
+ Asm->OutStreamer->emitLabel(LineSym);
+ MCDwarfLoc DwarfLoc(
+ 1, 1, 0, DWARF2_LINE_DEFAULT_IS_STMT ? DWARF2_FLAG_IS_STMT : 0, 0, 0);
+ MCDwarfLineEntry LineEntry(LineSym, DwarfLoc);
+ Asm->OutStreamer->getContext()
+ .getMCDwarfLineTable(
+ Asm->OutStreamer->getContext().getDwarfCompileUnitID())
+ .getMCLineSections()
+ .addLineEntry(LineEntry, Asm->OutStreamer->getCurrentSectionOnly());
+ }
+
// Clear debug info
// Ownership of DbgVariables is a bit subtle - ScopeVariables owns all the
// DbgVariables except those that are also in AbstractVariables (since they
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index 8ff097f29aebd1..34a9541bbbcc3a 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -104,8 +104,17 @@ void MCDwarfLineEntry::make(MCStreamer *MCOS, MCSection *Section) {
// Get the current .loc info saved in the context.
const MCDwarfLoc &DwarfLoc = MCOS->getContext().getCurrentDwarfLoc();
+ MCSymbol *LineStreamLabel = nullptr;
+ // If functions need offsets into the generated line table, then we need to
+ // create a label referencing where the line was generated in the output
+ // stream
+ if (MCOS->getGenerateFuncLineTableOffsets()) {
+ LineStreamLabel = MCOS->getContext().createTempSymbol();
+ MCOS->emittedLineStreamSym(LineStreamLabel);
+ }
+
// Create a (local) line entry with the symbol and the current .loc info.
- MCDwarfLineEntry LineEntry(LineSym, DwarfLoc);
+ MCDwarfLineEntry LineEntry(LineSym, DwarfLoc, LineStreamLabel);
// clear DwarfLocSeen saying the current .loc info is now used.
MCOS->getContext().clearDwarfLocSeen();
@@ -145,7 +154,8 @@ makeStartPlusIntExpr(MCContext &Ctx, const MCSymbol &Start, int IntVal) {
return Res;
}
-void MCLineSection::addEndEntry(MCSymbol *EndLabel) {
+void MCLineSection::addEndEntry(MCSymbol *EndLabel,
+ bool generatingFuncLineTableOffsets) {
auto *Sec = &EndLabel->getSection();
// The line table may be empty, which we should skip adding an end entry.
// There are two cases:
@@ -158,8 +168,12 @@ void MCLineSection::addEndEntry(MCSymbol *EndLabel) {
if (I != MCLineDivisions.end()) {
auto &Entries = I->second;
auto EndEntry = Entries.back();
- EndEntry.setEndLabel(EndLabel);
- Entries.push_back(EndEntry);
+ // If generatingFuncLineTableOffsets is set, then we already generated an
+ // end label at the end of the last function, so skip generating another one
+ if (!generatingFuncLineTableOffsets) {
+ EndEntry.setEndLabel(EndLabel);
+ Entries.push_back(EndEntry);
+ }
}
}
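
The change to addEndEntry above can be read as: clone the last entry into a section-level end entry, unless per-function end entries are being generated. Below is a simplified standalone model of that control flow, under stated assumptions: `LineEntry`, its fields, the free function `addEndEntry`, `buildAndTerminate`, and all label names are hypothetical, and the real code additionally looks up the entries by section.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified line entry: a label, a line number, and a mark
// saying whether the entry closes a sequence.
struct LineEntry {
  std::string Label;
  unsigned Line;
  bool IsEndEntry;
};

// Mirrors the shape of the patched logic: append a cloned end entry unless
// per-function end entries were already generated.
void addEndEntry(std::vector<LineEntry> &Entries, const std::string &EndLabel,
                 bool GeneratingFuncLineTableOffsets) {
  assert(!EndLabel.empty() && "an end entry needs a label");
  if (Entries.empty())
    return; // an empty line table gets no end entry
  if (GeneratingFuncLineTableOffsets)
    return; // the last function already closed its own sequence
  LineEntry End = Entries.back(); // clone the last entry
  End.Label = EndLabel;           // and retarget it at the end label
  End.IsEndEntry = true;
  Entries.push_back(End);
}

// Builds a one-function table, then terminates it the way the compile unit
// would at the end of emission.
std::vector<LineEntry> buildAndTerminate(bool PerFunction) {
  std::vector<LineEntry> Entries;
  Entries.push_back(LineEntry{"Ltmp0", 2, false});
  if (PerFunction) // assumed per-function end entry, added at function end
    Entries.push_back(LineEntry{"Lfunc_end0", 1, true});
  addEndEntry(Entries, "Lsec_end", PerFunction);
  return Entries;
}
```

In both modes the model ends with exactly one closing entry for the function: the cloned `Lsec_end` entry in the default mode, and the function's own end entry when per-function offsets are enabled.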
diff --git a/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll b/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll
new file mode 100644
index 00000000000000..ef8b0c817cfb67
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll
@@ -0,0 +1,82 @@
+; RUN: llc -mtriple=i686-w64-mingw32 -o %t -filetype=obj %s
+; RUN: llvm-dwarfdump -v -all %t | FileCheck %s -check-prefix=NO_STMT_SEQ
+
+; RUN: llc -mtriple=i686-w64-mingw32 -o %t -filetype=obj %s -emit-func-debug-line-table-offsets
+; RUN: llvm-dwarfdump -v -all %t | FileCheck %s -check-prefix=STMT_SEQ
+
+; NO_STMT_SEQ-NOT: DW_AT_LLVM_stmt_sequence
+
+; STMT_SEQ: [[[ABBREV_CODE:[0-9]+]]] DW_TAG_subprogram
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence DW_FORM_sec_offset
+; STMT_SEQ: DW_TAG_subprogram [[[ABBREV_CODE]]]
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset] (0x00000028)
+; STMT_SEQ: DW_AT_name {{.*}}func01
+; STMT_SEQ: DW_TAG_subprogram [[[ABBREV_CODE]]]
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset] (0x00000033)
+; STMT_SEQ: DW_AT_name {{.*}}main
+
+;; Check that the line table starts at 0x00000028 (first function)
+; STMT_SEQ: Address Line Column File ISA Discriminator OpIndex Flags
+; STMT_SEQ-NEXT: ------------------ ------ ------ ------ --- ------------- ------- -------------
+; STMT_SEQ-NEXT: 0x00000028: 00 DW_LNE_set_address (0x00000006)
+
+;; Check that we have an 'end_sequence' just before the next function (0x00000033)
+; STMT_SEQ: 0x0000000000000006 1 0 1 0 0 0 is_stmt end_sequence
+; STMT_SEQ-NEXT: 0x00000033: 00 DW_LNE_set_address (0x00000027)
+
+;; Check that the end of the line table still has an 'end_sequence'
+; STMT_SEQ 0x00000049: 00 DW_LNE_end_sequence
+; STMT_SEQ-NEXT 0x0000000000000027 6 3 1 0 0 0 end_sequence
+
+
+; generated from:
+; clang -g -S -emit-llvm test.c -o test.ll
+; ======= test.c ======
+; int func01() {
+; return 1;
+; }
+; int main() {
+; return 0;
+; }
+; =====================
+
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+target triple = "arm64-apple-macosx14.0.0"
+
+; Function Attrs: noinline nounwind optnone ssp uwtable(sync)
+define i32 @func01() #0 !dbg !9 {
+ ret i32 1, !dbg !13
+}
+
+; Function Attrs: noinline nounwind optnone ssp uwtable(sync)
+define i32 @main() #0 !dbg !14 {
+ %1 = alloca i32, align 4
+ store i32 0, ptr %1, align 4
+ ret i32 0, !dbg !15
+}
+
+attributes #0 = { noinline nounwind optnone ssp uwtable(sync) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="apple-m1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+sha3,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+zcm,+zcz" }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "Homebrew clang version 17.0.6", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: Apple, sysroot: "/Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk", sdk: "MacOSX14.sdk")
+!1 = !DIFile(filename: "test.c", directory: "/tmp/clang_test")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"uwtable", i32 1}
+!7 = !{i32 7, !"frame-pointer", i32 1}
+!8 = !{!"Homebrew clang version 17.0.6"}
+!9 = distinct !DISubprogram(name: "func01", scope: !1, file: !1, line: 1, type: !10, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0)
+!10 = !DISubroutineType(types: !11)
+!11 = !{!12}
+!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!13 = !DILocation(line: 2, column: 3, scope: !9)
+!14 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 5, type: !10, scopeLine: 5, spFlags: DISPFlagDefinition, unit: !0)
+!15 = !DILocation(line: 6, column: 3, scope: !14)
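
To illustrate the ICF scenario from the summary, here is a hedged sketch of how a consumer might use a per-function line table offset. It is a toy model, not a DWARF parser: `Row`, `LineTable`, `makeFoldedTable`, `sequencesCovering`, and `sequenceAt` are hypothetical names, and the shared address 0x1000 is invented. Only the sequence offsets (0x28, 0x33) and line numbers (2, 6) are borrowed from the test's expectations above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// One row of a decoded line sequence (heavily simplified).
struct Row {
  uint64_t Address;
  unsigned Line;
};

// Toy line table: each sequence is keyed by its offset in the line section.
using LineTable = std::map<uint64_t, std::vector<Row>>;

// Hypothetical table after ICF folded two functions onto address 0x1000.
LineTable makeFoldedTable() {
  LineTable T;
  T[0x28] = std::vector<Row>{Row{0x1000, 2}}; // sequence of func01
  T[0x33] = std::vector<Row>{Row{0x1000, 6}}; // sequence of main
  return T;
}

// Address-only lookup: returns the offset of every sequence with a row at Addr.
std::vector<uint64_t> sequencesCovering(const LineTable &T, uint64_t Addr) {
  std::vector<uint64_t> Offsets;
  for (const auto &Seq : T)
    for (const Row &R : Seq.second)
      if (R.Address == Addr) {
        Offsets.push_back(Seq.first);
        break;
      }
  return Offsets;
}

// Offset lookup: what a tool could do with a per-function sequence offset.
// Returns nullptr when no sequence starts at that offset.
const std::vector<Row> *sequenceAt(const LineTable &T, uint64_t StmtSequence) {
  auto It = T.find(StmtSequence);
  return It == T.end() ? nullptr : &It->second;
}
```

With addresses alone, both sequences match 0x1000 and the tool cannot tell the functions apart. With the offset stored on each subprogram, the lookup is unambiguous, provided each function's rows end in their own end_sequence.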
@llvm/pr-subscribers-llvm-binary-utilities Author: None (alx32) ChangesSummary This patch introduces a new compiler option Background Previous similar PR: #93137 – This PR was very similar to the current one but at the time, the assembler had no support for emitting labels within the line table. That support was added in PR #99710 - and in this PR we use some of the support added in the assembler PR. In the current implementation, Clang generates line information in the For example, when functions are merged by ICF in LLD, multiple functions may end up sharing the same address range. Without explicit linkage between functions and their line entries, tools cannot accurately attribute line information to the correct function, adversely affecting debugging and call stack resolution. Implementation Details
End-of-Sequence Markers: Emits an explicit DW_LNE_end_sequence after each function's line entries in the line table. This marks the end of the line information for that function, ensuring that line entries are correctly delimited. Assembler and Streamer Modifications: Modifies the MCStreamer and related classes to support emitting the necessary labels and tracking the current function's line entries. A new flag GenerateFuncLineTableOffsets is added to control this behavior. Compiler Option: Introduces the Full diff: https://github.com/llvm/llvm-project/pull/110192.diff 7 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/Dwarf.def b/llvm/include/llvm/BinaryFormat/Dwarf.def
index d55947fc5103ac..b1fa81a2fc6abd 100644
--- a/llvm/include/llvm/BinaryFormat/Dwarf.def
+++ b/llvm/include/llvm/BinaryFormat/Dwarf.def
@@ -617,6 +617,7 @@ HANDLE_DW_AT(0x3e07, LLVM_apinotes, 0, APPLE)
HANDLE_DW_AT(0x3e08, LLVM_ptrauth_isa_pointer, 0, LLVM)
HANDLE_DW_AT(0x3e09, LLVM_ptrauth_authenticates_null_values, 0, LLVM)
HANDLE_DW_AT(0x3e0a, LLVM_ptrauth_authentication_mode, 0, LLVM)
+HANDLE_DW_AT(0x3e0b, LLVM_stmt_sequence, 0, LLVM)
// Apple extensions.
diff --git a/llvm/include/llvm/MC/MCDwarf.h b/llvm/include/llvm/MC/MCDwarf.h
index bea79545d1ab96..e7e1bef1ad2d72 100644
--- a/llvm/include/llvm/MC/MCDwarf.h
+++ b/llvm/include/llvm/MC/MCDwarf.h
@@ -123,6 +123,9 @@ class MCDwarfLoc {
friend class MCContext;
friend class MCDwarfLineEntry;
+ // DwarfDebug::endFunctionImpl needs to construct MCDwarfLoc(IsEndOfFunction)
+ friend class DwarfDebug;
+
MCDwarfLoc(unsigned fileNum, unsigned line, unsigned column, unsigned flags,
unsigned isa, unsigned discriminator)
: FileNum(fileNum), Line(line), Column(column), Flags(flags), Isa(isa),
@@ -239,7 +242,7 @@ class MCLineSection {
// Add an end entry by cloning the last entry, if exists, for the section
// the given EndLabel belongs to. The label is replaced by the given EndLabel.
- void addEndEntry(MCSymbol *EndLabel);
+ void addEndEntry(MCSymbol *EndLabel, bool generatingFuncLineTableOffsets);
using MCDwarfLineEntryCollection = std::vector<MCDwarfLineEntry>;
using iterator = MCDwarfLineEntryCollection::iterator;
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 707aecc5dc578e..d6d5970917401d 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -251,6 +251,15 @@ class MCStreamer {
/// discussion for future inclusion.
bool AllowAutoPadding = false;
+ // Flag specifying whether functions will have an offset into the line table
+ // where the line data for that function starts
+ bool GenerateFuncLineTableOffsets = false;
+
+ // Symbol that tracks the stream symbol for first line of the current function
+ // being generated. This symbol can be used to reference where the line
+ // entries for the function start in the generated line table.
+ MCSymbol *CurrentFuncFirstLineStreamSym;
+
protected:
MCFragment *CurFrag = nullptr;
@@ -313,6 +322,24 @@ class MCStreamer {
void setAllowAutoPadding(bool v) { AllowAutoPadding = v; }
bool getAllowAutoPadding() const { return AllowAutoPadding; }
+ void setGenerateFuncLineTableOffsets(bool v) {
+ GenerateFuncLineTableOffsets = v;
+ }
+ bool getGenerateFuncLineTableOffsets() const {
+ return GenerateFuncLineTableOffsets;
+ }
+
+ // Use the below functions to track the symbol that points to the current
+ // function's line info in the output stream.
+ void beginFunction() { CurrentFuncFirstLineStreamSym = nullptr; }
+ void emittedLineStreamSym(MCSymbol *StreamSym) {
+ if (!CurrentFuncFirstLineStreamSym)
+ CurrentFuncFirstLineStreamSym = StreamSym;
+ }
+ MCSymbol *getCurrentFuncFirstLineStreamSym() {
+ return CurrentFuncFirstLineStreamSym;
+ }
+
/// When emitting an object file, create and emit a real label. When emitting
/// textual assembly, this should do nothing to avoid polluting our output.
virtual MCSymbol *emitCFILabel();
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index 0a1ff189bedbc4..c62075cf77c45a 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -527,6 +527,14 @@ DIE &DwarfCompileUnit::updateSubprogramScopeDIE(const DISubprogram *SP) {
*DD->getCurrentFunction()))
addFlag(*SPDie, dwarf::DW_AT_APPLE_omit_frame_ptr);
+ if (Asm->OutStreamer->getGenerateFuncLineTableOffsets() &&
+ Asm->OutStreamer->getCurrentFuncFirstLineStreamSym()) {
+ addSectionLabel(
+ *SPDie, dwarf::DW_AT_LLVM_stmt_sequence,
+ Asm->OutStreamer->getCurrentFuncFirstLineStreamSym(),
+ Asm->getObjFileLowering().getDwarfLineSection()->getBeginSymbol());
+ }
+
// Only include DW_AT_frame_base in full debug info
if (!includeMinimalInlineScopes()) {
const TargetFrameLowering *TFI = Asm->MF->getSubtarget().getFrameLowering();
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index e9649f9ff81658..bd6d5e0ea7a363 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -170,6 +170,12 @@ static cl::opt<DwarfDebug::MinimizeAddrInV5> MinimizeAddrInV5Option(
"Stuff")),
cl::init(DwarfDebug::MinimizeAddrInV5::Default));
+static cl::opt<bool> EmitFuncLineTableOffsetsOption(
+ "emit-func-debug-line-table-offsets", cl::Hidden,
+ cl::desc("Include line table offset in function's debug info and emit end "
+ "sequence after each function's line data."),
+ cl::init(false));
+
static constexpr unsigned ULEB128PadSize = 4;
void DebugLocDwarfExpression::emitOp(uint8_t Op, const char *Comment) {
@@ -443,6 +449,8 @@ DwarfDebug::DwarfDebug(AsmPrinter *A)
Asm->OutStreamer->getContext().setDwarfVersion(DwarfVersion);
Asm->OutStreamer->getContext().setDwarfFormat(Dwarf64 ? dwarf::DWARF64
: dwarf::DWARF32);
+ Asm->OutStreamer->setGenerateFuncLineTableOffsets(
+ EmitFuncLineTableOffsetsOption);
}
// Define out of line so we don't have to include DwarfUnit.h in DwarfDebug.h.
@@ -2221,6 +2229,10 @@ void DwarfDebug::beginFunctionImpl(const MachineFunction *MF) {
if (SP->getUnit()->getEmissionKind() == DICompileUnit::NoDebug)
return;
+ // Notify the streamer that we are beginning a function - this will reset the
+ // label pointing to the currently generated function's first line entry
+ Asm->OutStreamer->beginFunction();
+
DwarfCompileUnit &CU = getOrCreateDwarfCompileUnit(SP->getUnit());
Asm->OutStreamer->getContext().setDwarfCompileUnitID(
@@ -2249,7 +2261,8 @@ void DwarfDebug::terminateLineTable(const DwarfCompileUnit *CU) {
getDwarfCompileUnitIDForLineTable(*CU));
// Add the last range label for the given CU.
LineTable.getMCLineSections().addEndEntry(
- const_cast<MCSymbol *>(CURanges.back().End));
+ const_cast<MCSymbol *>(CURanges.back().End),
+ EmitFuncLineTableOffsetsOption);
}
void DwarfDebug::skippedNonDebugFunction() {
@@ -2342,6 +2355,21 @@ void DwarfDebug::endFunctionImpl(const MachineFunction *MF) {
// Construct call site entries.
constructCallSiteEntryDIEs(*SP, TheCU, ScopeDIE, *MF);
+ // If we're emitting line table offsets, we also need to emit an end label
+ // after all function's line entries
+ if (EmitFuncLineTableOffsetsOption) {
+ MCSymbol *LineSym = Asm->OutStreamer->getContext().createTempSymbol();
+ Asm->OutStreamer->emitLabel(LineSym);
+ MCDwarfLoc DwarfLoc(
+ 1, 1, 0, DWARF2_LINE_DEFAULT_IS_STMT ? DWARF2_FLAG_IS_STMT : 0, 0, 0);
+ MCDwarfLineEntry LineEntry(LineSym, DwarfLoc);
+ Asm->OutStreamer->getContext()
+ .getMCDwarfLineTable(
+ Asm->OutStreamer->getContext().getDwarfCompileUnitID())
+ .getMCLineSections()
+ .addLineEntry(LineEntry, Asm->OutStreamer->getCurrentSectionOnly());
+ }
+
// Clear debug info
// Ownership of DbgVariables is a bit subtle - ScopeVariables owns all the
// DbgVariables except those that are also in AbstractVariables (since they
diff --git a/llvm/lib/MC/MCDwarf.cpp b/llvm/lib/MC/MCDwarf.cpp
index 8ff097f29aebd1..34a9541bbbcc3a 100644
--- a/llvm/lib/MC/MCDwarf.cpp
+++ b/llvm/lib/MC/MCDwarf.cpp
@@ -104,8 +104,17 @@ void MCDwarfLineEntry::make(MCStreamer *MCOS, MCSection *Section) {
// Get the current .loc info saved in the context.
const MCDwarfLoc &DwarfLoc = MCOS->getContext().getCurrentDwarfLoc();
+ MCSymbol *LineStreamLabel = nullptr;
+ // If functions need offsets into the generated line table, then we need to
+ // create a label referencing where the line was generated in the output
+ // stream
+ if (MCOS->getGenerateFuncLineTableOffsets()) {
+ LineStreamLabel = MCOS->getContext().createTempSymbol();
+ MCOS->emittedLineStreamSym(LineStreamLabel);
+ }
+
// Create a (local) line entry with the symbol and the current .loc info.
- MCDwarfLineEntry LineEntry(LineSym, DwarfLoc);
+ MCDwarfLineEntry LineEntry(LineSym, DwarfLoc, LineStreamLabel);
// clear DwarfLocSeen saying the current .loc info is now used.
MCOS->getContext().clearDwarfLocSeen();
@@ -145,7 +154,8 @@ makeStartPlusIntExpr(MCContext &Ctx, const MCSymbol &Start, int IntVal) {
return Res;
}
-void MCLineSection::addEndEntry(MCSymbol *EndLabel) {
+void MCLineSection::addEndEntry(MCSymbol *EndLabel,
+ bool generatingFuncLineTableOffsets) {
auto *Sec = &EndLabel->getSection();
// The line table may be empty, which we should skip adding an end entry.
// There are two cases:
@@ -158,8 +168,12 @@ void MCLineSection::addEndEntry(MCSymbol *EndLabel) {
if (I != MCLineDivisions.end()) {
auto &Entries = I->second;
auto EndEntry = Entries.back();
- EndEntry.setEndLabel(EndLabel);
- Entries.push_back(EndEntry);
+ // If generatingFuncLineTableOffsets is set, then we already generated an
+ // end label at the end of the last function, so skip generating another one
+ if (!generatingFuncLineTableOffsets) {
+ EndEntry.setEndLabel(EndLabel);
+ Entries.push_back(EndEntry);
+ }
}
}
diff --git a/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll b/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll
new file mode 100644
index 00000000000000..ef8b0c817cfb67
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/DW_AT_LLVM_stmt_seq_sec_offset.ll
@@ -0,0 +1,82 @@
+; RUN: llc -mtriple=i686-w64-mingw32 -o %t -filetype=obj %s
+; RUN: llvm-dwarfdump -v -all %t | FileCheck %s -check-prefix=NO_STMT_SEQ
+
+; RUN: llc -mtriple=i686-w64-mingw32 -o %t -filetype=obj %s -emit-func-debug-line-table-offsets
+; RUN: llvm-dwarfdump -v -all %t | FileCheck %s -check-prefix=STMT_SEQ
+
+; NO_STMT_SEQ-NOT: DW_AT_LLVM_stmt_sequence
+
+; STMT_SEQ: [[[ABBREV_CODE:[0-9]+]]] DW_TAG_subprogram
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence DW_FORM_sec_offset
+; STMT_SEQ: DW_TAG_subprogram [[[ABBREV_CODE]]]
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset] (0x00000028)
+; STMT_SEQ: DW_AT_name {{.*}}func01
+; STMT_SEQ: DW_TAG_subprogram [[[ABBREV_CODE]]]
+; STMT_SEQ: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset] (0x00000033)
+; STMT_SEQ: DW_AT_name {{.*}}main
+
+;; Check that the line table starts at 0x00000028 (first function)
+; STMT_SEQ: Address Line Column File ISA Discriminator OpIndex Flags
+; STMT_SEQ-NEXT: ------------------ ------ ------ ------ --- ------------- ------- -------------
+; STMT_SEQ-NEXT: 0x00000028: 00 DW_LNE_set_address (0x00000006)
+
+;; Check that we have an 'end_sequence' just before the next function (0x00000033)
+; STMT_SEQ: 0x0000000000000006 1 0 1 0 0 0 is_stmt end_sequence
+; STMT_SEQ-NEXT: 0x00000033: 00 DW_LNE_set_address (0x00000027)
+
+;; Check that the end of the line table still has an 'end_sequence'
+; STMT_SEQ 0x00000049: 00 DW_LNE_end_sequence
+; STMT_SEQ-NEXT 0x0000000000000027 6 3 1 0 0 0 end_sequence
+
+
+; generated from:
+; clang -g -S -emit-llvm test.c -o test.ll
+; ======= test.c ======
+; int func01() {
+; return 1;
+; }
+; int main() {
+; return 0;
+; }
+; =====================
+
+
+; ModuleID = 'test.c'
+source_filename = "test.c"
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+target triple = "arm64-apple-macosx14.0.0"
+
+; Function Attrs: noinline nounwind optnone ssp uwtable(sync)
+define i32 @func01() #0 !dbg !9 {
+ ret i32 1, !dbg !13
+}
+
+; Function Attrs: noinline nounwind optnone ssp uwtable(sync)
+define i32 @main() #0 !dbg !14 {
+ %1 = alloca i32, align 4
+ store i32 0, ptr %1, align 4
+ ret i32 0, !dbg !15
+}
+
+attributes #0 = { noinline nounwind optnone ssp uwtable(sync) "frame-pointer"="non-leaf" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="apple-m1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fp16fml,+fullfp16,+lse,+neon,+ras,+rcpc,+rdm,+sha2,+sha3,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8a,+zcm,+zcz" }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4, !5, !6, !7}
+!llvm.ident = !{!8}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "Homebrew clang version 17.0.6", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: Apple, sysroot: "/Library/Developer/CommandLineTools/SDKs/MacOSX14.sdk", sdk: "MacOSX14.sdk")
+!1 = !DIFile(filename: "test.c", directory: "/tmp/clang_test")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{i32 8, !"PIC Level", i32 2}
+!6 = !{i32 7, !"uwtable", i32 1}
+!7 = !{i32 7, !"frame-pointer", i32 1}
+!8 = !{!"Homebrew clang version 17.0.6"}
+!9 = distinct !DISubprogram(name: "func01", scope: !1, file: !1, line: 1, type: !10, scopeLine: 1, spFlags: DISPFlagDefinition, unit: !0)
+!10 = !DISubroutineType(types: !11)
+!11 = !{!12}
+!12 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!13 = !DILocation(line: 2, column: 3, scope: !9)
+!14 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 5, type: !10, scopeLine: 5, spFlags: DISPFlagDefinition, unit: !0)
+!15 = !DILocation(line: 6, column: 3, scope: !14)
llvm/include/llvm/MC/MCDwarf.h
Outdated
// DwarfDebug::endFunctionImpl needs to construct MCDwarfLoc(IsEndOfFunction)
friend class DwarfDebug;
That feels like a bit of a layering and encapsulation break & hopefully can be avoided?
What happens if DwarfDebug doesn't do this? Doesn't MCDwarf correctly implicitly terminate the sequence when a new label is requested? Isn't that enough?
I tried to keep this implementation as close as possible to Apple's very similar implementation of `-fcas-friendly-debug-info` in their fork of llvm-project - see MCDwarf.h in Apple's llvm-project fork. This is both to stick to their proposed pattern and to make it easier to merge the changes later on.
Hi @dwblaikie, this was done so that DwarfDebug can create and add a new `MCDwarfLineEntry` at the end of `DwarfDebug::endFunctionImpl`. Do you have a better way to add a `DW_LNE_end_sequence` line entry from `DwarfDebug`?
If DwarfDebug needs some functionality in MCDwarf, it should probably be exposed to any MCDwarf client rather than uniquely to DwarfDebug.
But I don't know that DW_LNE_end_sequence placement should be chosen by DwarfDebug - it should go at the end of any chunk of .text that can be sliced-and-diced by the linker. (which means it should probably be at the end of every function on MachO when using subsections-via-symbols (but I guess it isn't currently, because the DWARF only gets rewritten by dsymutil which is DWARF-aware), or on ELF with `-ffunction-sections`)
In general I'd expect DwarfDebug to not actually request where line table sequences start and end - they're a function of the object format about where content can be treated as contiguous or not.
Except for this patch, which needs a label that starts a sequence even if it isn't in what would be an isolated chunk (though I have my doubts about that - if the function isn't at the start of an isolated chunk of .text, does it need one of these labels? Could it use the label of the start of the isolated chunk it's part of? Could it share that location somehow with all functions in that chunk?)
if (Asm->OutStreamer->getGenerateFuncLineTableOffsets() &&
    Asm->OutStreamer->getCurrentFuncFirstLineStreamSym()) {
  addSectionLabel(
      *SPDie, dwarf::DW_AT_LLVM_stmt_sequence,
      Asm->OutStreamer->getCurrentFuncFirstLineStreamSym(),
      Asm->getObjFileLowering().getDwarfLineSection()->getBeginSymbol());
}
Rather than wiring up the attribute into the streamer, then querying it out here in DwarfCompileUnit, then going back into the streamer with labels - could the streamer "do the right thing" when a label is requested, and otherwise do the old/usual thing?
It doesn't seem like MCStreamer should "know" about function labels, it should know about line table labels, and you could in theory request them anywhere you want to be able to jump into parsing the line table without having to parse additional context. (ie: in `beginFunctionImpl`? (though maybe that's after the prologue, in which case maybe there's some callback before the prologue, or we should add one))
That sounds like a better design and I think it may work. For the current implementation - same as above - I was trying to follow Apple's existing design. See here for example.
Before taking the more straightforward approach, I just want to confirm that we don't need to weigh Apple's existing design here, and that we should aim for the best design for the current feature only.
Ah, fair enough - prior art and all.
Could you rope someone in from that work to join this discussion, then?
@rastogishubham / @adrian-prantl - In this PR I'm doing something similar to `-fcas-friendly-debug-info` in the Apple branch (and also a bit extra on top). Do you think this is the way to go for future merges / consistency / etc?
@bulbazord / @kastiglione - We're trying to do something similar to `-fcas-friendly-debug-info` in the Apple branch - could you have a look and say whether you're OK with this approach for consistency with the Apple branch?
Sorry, I missed this when you pinged me originally.
There is no good reason for `-fcas-friendly-debug-info` to live downstream so we should try to unify both implementations.
@rastogishubham Can you take a look at both patches and see if we could rebase `-fcas-friendly-debug-info` on top of this?
Thanks for the ping, let me take a look. I also need to familiarize myself with the `-fcas-friendly-debug-info` code again.
This feedback still seems relevant, to me, to the current version of the code - it's still strange to me that MCStreamer needs to know about this with an attribute up-front with `setGenerateFuncLineTableOffsets` - in assembly that isn't necessary: the asm directive is used wherever needed, and MCStreamer behaves appropriately/as needed. (It's not like we needed a directive at the start of the file to announce that we might use line table offset directives later in the file - so why would we need that at the API level? We should just query for line table offsets when we need them and MCStreamer should provide them at that point, without needing some prior warning/up front thing.)
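To make the "request a label when you need it" idea concrete, here is a minimal sketch using a toy line-table builder. `ToyLineTable`, `addRow` and `requestLabel` are hypothetical names for illustration only, not LLVM APIs; the assumption is simply that requesting a label terminates any open sequence, so no up-front flag is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a line-table builder. None of these names exist in LLVM;
// they only illustrate the "query when needed" API shape discussed above.
struct ToyLineTable {
  struct Row {
    unsigned Line;
    bool EndSequence;
  };

  std::vector<Row> Rows;
  std::vector<std::size_t> Labels; // row index that each label points at
  bool SequenceOpen = false;

  // Append an ordinary row; this opens a sequence if none is open.
  void addRow(unsigned Line) {
    Rows.push_back({Line, false});
    SequenceOpen = true;
  }

  // Request a label at the current position. If a sequence is open, it is
  // terminated first, so the label always marks the start of a new sequence.
  // Returns the index of the new label in Labels.
  std::size_t requestLabel() {
    if (SequenceOpen) {
      Rows.push_back({Rows.back().Line, true});
      SequenceOpen = false;
    }
    Labels.push_back(Rows.size());
    return Labels.size() - 1;
  }
};
```

A caller would invoke `requestLabel()` at the start of each function and store the result wherever the offset is needed; a client that never asks for labels gets the usual single-sequence behaviour.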
Good call. The new implementation seems much more straightforward. LMK what you think.
I also tested it on a more complex example and the result seems correct.
llvm/lib/MC/MCDwarf.cpp
Outdated
@@ -158,8 +168,12 @@ void MCLineSection::addEndEntry(MCSymbol *EndLabel) {
if (I != MCLineDivisions.end()) {
  auto &Entries = I->second;
  auto EndEntry = Entries.back();
  EndEntry.setEndLabel(EndLabel);
  Entries.push_back(EndEntry);
  // If generatingFuncLineTableOffsets is set, then we already generated an
Can you please move this comment up to be the third case? So change line 161 to say
// There are three cases:
and then add this comment as the third case?
@alx32 there is a major bug with this patch and I am not sure why it is happening. If we take this test file:
compile it regularly:
then use dwarfdump on it with the verbose mode (-v): We will see:
However, when I apply your patch and use: then use dwarfdump: I see:
Notice that there are no line table opcodes that advance the file and line, the file and line stay 1 and 0 for the duration of the line table. That is very wrong.
Thanks - will have a look!
About the issues found by @rastogishubham above - these need some context to explain: Timeline:
The specific issue is that in the MC layer change, if a This behavior led to the issue @rastogishubham pointed out where valid The reason for the above context is to bring up another issue - after the MC change, the behavior is that if a So it looks like the options here are:
I think Nr.2 above would not be ideal - but still wanted to present it as an option.
The latest change is possibility Nr.1 presented in the above comment - similar to David's proposal here where we simply insert a line label and have that terminate the current sequence. @rastogishubham - LMK how you think we should proceed here either of the above 2 options or some other way. PS - There is still lots of back/forth through
|
@alx32 thanks for the detailed update, as long as the patch doesn't break the This should merge fine with what we have in the Apple branch.
@alx32 can you please add a test that checks to make sure the line table is correct like in #110192 (comment) If that is added, I can approve the patch, thanks!
@rastogishubham - sorry, I was out last week. What about the latest test change? Do you think we need more coverage? EDIT: Failures are infra related - not an actual issue:
LGTM!
01add90 to ab07e40
Updated (and rebased) to address merge conflict.
Sure, reckon this is worth a go.
…e lookups (#123391)
**Summary**
Add support for filtering line table entries based on `DW_AT_LLVM_stmt_sequence` attribute when looking up address ranges. This ensures that line entries are correctly attributed to their corresponding functions, even when multiple functions share the same address range due to optimizations.
**Background**
In #110192 we added support to clang to generate the `DW_AT_LLVM_stmt_sequence` attribute for `DW_TAG_subprogram`'s. Corresponding RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)
The `DW_AT_LLVM_stmt_sequence` attribute allows accurate attribution of line number information to their corresponding functions, even in scenarios where functions are merged or share the same address space due to optimizations like Identical Code Folding (ICF) in the linker.
**Implementation Details**
The patch modifies `DWARFDebugLine::lookupAddressRange` to accept an optional DWARFDie parameter. When provided, the function checks if the `DIE` has a `DW_AT_LLVM_stmt_sequence` attribute. This attribute contains an offset into the line table that marks where the line entries for this DIE's function begin. If the attribute is present, the function filters the results to only include line entries from the sequence that starts at the specified offset. This ensures that even when multiple functions share the same address range, we return only the line entries that actually belong to the function represented by the DIE.
The implementation:
- Adds an optional DWARFDie parameter to lookupAddressRange
- Extracts the `DW_AT_LLVM_stmt_sequence` offset if present
- Modifies the address range lookup logic to filter sequences based on their offset
- Returns only line entries from the matching sequence
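The filtering described above can be sketched roughly as follows. This is a simplified model, not the actual `DWARFDebugLine::lookupAddressRange` code: `ToySequence` and `lookupRows` are hypothetical names, and each sequence is assumed to already carry the line-table offset at which it starts.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical, simplified model of one line-table sequence.
struct ToySequence {
  uint64_t StmtSeqOffset; // offset in the line table where the sequence starts
  uint64_t LowPC;         // first address covered
  uint64_t HighPC;        // one past the last address covered
  unsigned FirstRow;      // first row index belonging to the sequence
  unsigned LastRow;       // one past the last row index
};

// Collect the row indices of every sequence overlapping [Addr, Addr + Size).
// If StmtSeq is set (standing in for a DW_AT_LLVM_stmt_sequence value), only
// the sequence starting at exactly that offset is considered.
std::vector<unsigned> lookupRows(const std::vector<ToySequence> &Seqs,
                                 uint64_t Addr, uint64_t Size,
                                 std::optional<uint64_t> StmtSeq) {
  std::vector<unsigned> Result;
  const uint64_t End = Addr + Size;
  for (const ToySequence &S : Seqs) {
    if (S.HighPC <= Addr || S.LowPC >= End)
      continue; // no address overlap
    if (StmtSeq && S.StmtSeqOffset != *StmtSeq)
      continue; // overlaps, but belongs to a different function
    for (unsigned R = S.FirstRow; R < S.LastRow; ++R)
      Result.push_back(R);
  }
  return Result;
}
```

With two ICF-merged functions whose sequences cover the same addresses, an unfiltered lookup returns rows from both, while passing one function's offset returns only that function's rows.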
…es (#128953)
**Summary:**
This update adds handling for `DW_AT_LLVM_stmt_sequence` attributes in the DWARF linker. These attributes point to rows in the line table, which gets rewritten during linking. Since the row positions change, the offsets in these attributes need to be updated to match the new layout in the output `.debug_line` section. The changes add new data structures and tweak existing functions to track and fix these attributes.
**Background**
In #110192 we added support to clang to generate the `DW_AT_LLVM_stmt_sequence` attribute for `DW_TAG_subprogram`'s. Corresponding RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434). This attribute holds a label pointing to the offset in the line table where the function's line entries begin.
**Implementation details:**
Here’s what’s changed in the code:
- **New Tracking in `CompileUnit`:** A `StmtSeqListAttributes` vector is added to the `CompileUnit` class. It stores the locations where `DW_AT_LLVM_stmt_sequence` attributes need to be patched, recorded when cloning DIEs (debug info entries).
- **Updated `emitLineTableForUnit` Function:** This function now has an optional `RowOffsets` parameter. It collects the byte offsets of each row in the output line table. We only need to use this functionality if `DW_AT_LLVM_stmt_sequence` attributes are present in the unit.
- **Row Tracking with `TrackedRow`:** A `TrackedRow` struct keeps track of each input row’s original index and whether it starts a sequence in the output table. This links old rows to their new positions in the rewritten line table. Several implementations were considered and prototyped here, but so far this has proven the simplest and cleanest approach.
- **Patching Step:** After the line table is written, the linker uses the data in `TrackedRow`'s objects and `RowOffsets` array to update the `DW_AT_LLVM_stmt_sequence` attributes with the correct offsets.
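A rough sketch of the patching step, under stated assumptions: `ToyTrackedRow` and `patchStmtSequences` are hypothetical names, not the DWARF linker's types, and the map from an old attribute offset to an input row index is assumed to be available from parsing the input line table (the description above does not say how it is obtained).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical model of a tracked output row.
struct ToyTrackedRow {
  std::size_t InputIndex; // index of this row in the input line table
  bool StartsSequence;    // whether this row opens a sequence in the output
};

// Rewrite each attribute value (an offset into the input line table) to the
// offset of the corresponding sequence start in the output line table.
// OutputRows[i] is written at byte offset RowOffsets[i] in the output.
// Values that cannot be mapped are left untouched.
void patchStmtSequences(
    std::vector<uint64_t> &AttrValues,
    const std::unordered_map<uint64_t, std::size_t> &OldOffsetToInputRow,
    const std::vector<ToyTrackedRow> &OutputRows,
    const std::vector<uint64_t> &RowOffsets) {
  // Index the sequence-starting output rows by their input row index.
  std::unordered_map<std::size_t, uint64_t> InputRowToNewOffset;
  for (std::size_t I = 0; I < OutputRows.size() && I < RowOffsets.size(); ++I)
    if (OutputRows[I].StartsSequence)
      InputRowToNewOffset[OutputRows[I].InputIndex] = RowOffsets[I];

  for (uint64_t &Value : AttrValues) {
    auto Old = OldOffsetToInputRow.find(Value);
    if (Old == OldOffsetToInputRow.end())
      continue;
    auto New = InputRowToNewOffset.find(Old->second);
    if (New != InputRowToNewOffset.end())
      Value = New->second;
  }
}
```

The point of the two-step lookup is that the linker may reorder or drop rows, so old offsets cannot be adjusted by a fixed delta; they have to be re-derived from where each tracked row actually landed.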
…n line tables (#128953) **Summary:** This update adds handling for `DW_AT_LLVM_stmt_sequence` attributes in the DWARF linker. These attributes point to rows in the line table, which gets rewritten during linking. Since the row positions change, the offsets in these attributes need to be updated to match the new layout in the output `.debug_line` section. The changes add new data structures and tweak existing functions to track and fix these attributes. **Background** In llvm/llvm-project#110192 we added support to clang to generate the `DW_AT_LLVM_stmt_sequence` attribute for `DW_TAG_subprogram`'s. Corresponding RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434). This attribute holds a label pointing to the offset in the line table where the function's line entries begin. **Implementation details:** Here’s what’s changed in the code: - **New Tracking in `CompileUnit`:** A `StmtSeqListAttributes` vector is added to the `CompileUnit` class. It stores the locations where `DW_AT_LLVM_stmt_sequence` attributes need to be patched, recorded when cloning DIEs (debug info entries). - **Updated `emitLineTableForUnit` Function:** This function now has an optional `RowOffsets` parameter. It collects the byte offsets of each row in the output line table. We only need to use this functionality if `DW_AT_LLVM_stmt_sequence` attributes are present in the unit. - **Row Tracking with `TrackedRow`:** A `TrackedRow` struct keeps track of each input row’s original index and whether it starts a sequence in the output table. This links old rows to their new positions in the rewritten line table. Several implementations were considered and prototyped here, but so far this has proven the simplest and cleanest approach. 
- **Patching Step:** After the line table is written, the linker uses the data in `TrackedRow`'s objects and `RowOffsets` array to update the `DW_AT_LLVM_stmt_sequence` attributes with the correct offsets.
…es (llvm#128953) **Summary:** This update adds handling for `DW_AT_LLVM_stmt_sequence` attributes in the DWARF linker. These attributes point to rows in the line table, which gets rewritten during linking. Since the row positions change, the offsets in these attributes need to be updated to match the new layout in the output `.debug_line` section. The changes add new data structures and tweak existing functions to track and fix these attributes. **Background** In llvm#110192 we added support to clang to generate the `DW_AT_LLVM_stmt_sequence` attribute for `DW_TAG_subprogram`'s. Corresponding RFC: [New DWARF Attribute for Symbolication of Merged Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434). This attribute holds a label pointing to the offset in the line table where the function's line entries begin. **Implementation details:** Here’s what’s changed in the code: - **New Tracking in `CompileUnit`:** A `StmtSeqListAttributes` vector is added to the `CompileUnit` class. It stores the locations where `DW_AT_LLVM_stmt_sequence` attributes need to be patched, recorded when cloning DIEs (debug info entries). - **Updated `emitLineTableForUnit` Function:** This function now has an optional `RowOffsets` parameter. It collects the byte offsets of each row in the output line table. We only need to use this functionality if `DW_AT_LLVM_stmt_sequence` attributes are present in the unit. - **Row Tracking with `TrackedRow`:** A `TrackedRow` struct keeps track of each input row’s original index and whether it starts a sequence in the output table. This links old rows to their new positions in the rewritten line table. Several implementations were considered and prototyped here, but so far this has proven the simplest and cleanest approach. 
- **Patching Step:** After the line table is written, the linker uses the data in `TrackedRow`'s objects and `RowOffsets` array to update the `DW_AT_LLVM_stmt_sequence` attributes with the correct offsets.
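
The tracking and patching steps described in this commit message can be sketched roughly as follows. This is a simplified model, not the linker's actual code: the field layout of `TrackedRow`, the `StmtSeqPatch` record, and the `patchStmtSequences` helper are all assumptions made for illustration, and the sketch treats a row's output byte offset as the new attribute value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the linker's per-row bookkeeping.
struct TrackedRow {
  std::size_t InputIndex; // index of the row in the input line table
  bool StartsSequence;    // true if this row opens a sequence in the output
};

// Hypothetical record of one DW_AT_LLVM_stmt_sequence value to rewrite.
struct StmtSeqPatch {
  std::size_t InputRowIndex;   // input row the old attribute value pointed at
  std::uint64_t NewOffset = 0; // filled in by patchStmtSequences
  bool Resolved = false;       // false if no matching output sequence exists
};

// Rows[i] is assumed to have been written at byte offset RowOffsets[i] of the
// output line table. Returns the number of patches that were resolved.
inline std::size_t
patchStmtSequences(const std::vector<TrackedRow> &Rows,
                   const std::vector<std::uint64_t> &RowOffsets,
                   std::vector<StmtSeqPatch> &Patches) {
  assert(Rows.size() == RowOffsets.size() && "expected one offset per row");

  // Map each input row that starts an output sequence to its new offset.
  std::unordered_map<std::size_t, std::uint64_t> InputIndexToOffset;
  for (std::size_t I = 0; I < Rows.size(); ++I)
    if (Rows[I].StartsSequence)
      InputIndexToOffset.emplace(Rows[I].InputIndex, RowOffsets[I]);

  std::size_t NumResolved = 0;
  for (StmtSeqPatch &P : Patches) {
    auto It = InputIndexToOffset.find(P.InputRowIndex);
    if (It == InputIndexToOffset.end())
      continue; // leave unresolved; a real linker would need a fallback here
    P.NewOffset = It->second;
    P.Resolved = true;
    ++NumResolved;
  }
  return NumResolved;
}
```

Keying the lookup on the input row index is what lets the output rows be reordered freely: the attribute only needs to know which input row it used to point at.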
Hi @alx32, I've ported the patch to the Swift compiler and tested it on a Swift codebase and found it fails to compile a Swift module. It is a bit challenging to create a minimal reproduction, but please let me know if you need one.
@@ -2223,6 +2223,9 @@ void DwarfDebug::beginFunctionImpl(const MachineFunction *MF) {
    return;

  DwarfCompileUnit &CU = getOrCreateDwarfCompileUnit(SP->getUnit());
  FunctionLineTableLabel = CU.emitFuncLineTableOffsets()
Should this call be placed after `setDwarfCompileUnitID()`? If the compile unit ID is changed, it seems that `FunctionLineTableLabel` would no longer be in the same line table as the debug lines added later.
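
The ordering concern can be illustrated with a toy model. It assumes that both labels and line entries are filed under whichever compile unit ID is current when they are emitted; `ToyLineTables` and `tableHolding` are hypothetical names and do not correspond to the real `MCStreamer` or `DwarfDebug` interfaces.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy model: each compile unit ID owns its own list of line-table items.
struct ToyLineTables {
  unsigned CurrentCUID = 0;
  std::map<unsigned, std::vector<std::string>> Tables;

  void setCompileUnitID(unsigned ID) { CurrentCUID = ID; }

  // Labels and line entries both land in the table of the current ID.
  void emitLabel(const std::string &Name) {
    Tables[CurrentCUID].push_back("label:" + Name);
  }
  void emitLine(unsigned Line) {
    Tables[CurrentCUID].push_back("line:" + std::to_string(Line));
  }
};

// Returns the compile unit ID whose table holds Item, or -1 if none does.
inline int tableHolding(const ToyLineTables &T, const std::string &Item) {
  for (const auto &Entry : T.Tables) {
    const std::vector<std::string> &Items = Entry.second;
    if (std::find(Items.begin(), Items.end(), Item) != Items.end())
      return static_cast<int>(Entry.first);
  }
  return -1;
}
```

In this model, emitting the label and then switching the ID leaves the label in table 0 while the function's lines go to the new table; switching first keeps both in the same table.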
I also suggest adding an assertion in `MCDwarfLineEntry::setEndLabel` to ensure that `this->LineStreamLabel == nullptr`. I found a case where `LineStreamLabel` was emitted twice, which triggered a complaint from `MCStreamer::emitLabel`, which is the reason why I caught the potential bug here.
> I found a case where `LineStreamLabel` was emitted twice

Is this happening in the current upstreamed version of the patch or in the Swift implementation?
I manually ported your patches, as well as some other necessary ones like b468ed4, back to a downstream Swift compiler repo.
After further investigation, I realized that this issue isn't strictly specific to Swift. Instead, it's related to scenarios where multiple CUs are emitted in one compiler invocation. In my case, the affected Swift module is configured with `-enable-single-module-llvm-emission`.
The issue can be reproduced using tip-of-tree LLVM compiling llvm-test-suite with a slightly modified `ReleaseLTO-g.cmake` configuration (in order to enable `-emit-func-debug-line-table-offsets` for LTO objects):
set(CMAKE_EXE_LINKER_FLAGS_RELEASE "-Wl,-mllvm,-emit-func-debug-line-table-offsets" CACHE STRING "")
Thanks for the repro. I have managed to reproduce it locally and will look into it.
Minimal repro via:
llc -O3 -mcpu=x86-64 -emit-func-debug-line-table-offsets -filetype=obj debug-line-lto-bug.ll
where `debug-line-lto-bug.ll` is https://gist.github.com/alx32/5d5db8dc9818e17e59e47c4f70d91ad1
Fix: #142253
@nocchijiang Thanks for the report.
Do you mean within a larger codebase it fails to compile one particular Swift module? Or that any Swift module will not compile after the port? I will probably be adding Swift support in the future, but it's not planned for the immediate timeline. Do you have plans currently to upstream the Swift support?
It fails on one particular Swift module.
I am working on creating a minimal reproduction of the issue. Ideally I would make it an LLVM regression test, which would allow me to submit the proposed fix as mentioned in the review comment.
@nocchijiang - Without a repro I won't quite be able to come up with an adequate fix here. Also note that if your downstream branch has |
Summary

This patch introduces a new compiler option `-mllvm -emit-func-debug-line-table-offsets` that enables the emission of per-function line table offsets and end sequences in DWARF debug information. This enhancement allows tools and debuggers to accurately attribute line number information to their corresponding functions, even in scenarios where functions are merged or share the same address space due to optimizations like Identical Code Folding (ICF) in the linker.

Background

RFC: New DWARF Attribute for Symbolication of Merged Functions

Previous similar PR: #93137 – This PR was very similar to the current one but at the time, the assembler had no support for emitting labels within the line table. That support was added in PR #99710 - and in this PR we use some of the support added in the assembler PR.

In the current implementation, Clang generates line information in the `debug_line` section without directly associating line entries with their originating `DW_TAG_subprogram` DIEs. This can lead to issues when post-compilation optimizations merge functions, resulting in overlapping address ranges and ambiguous line information.

For example, when functions are merged by ICF in LLD, multiple functions may end up sharing the same address range. Without explicit linkage between functions and their line entries, tools cannot accurately attribute line information to the correct function, adversely affecting debugging and call stack resolution.

Implementation Details

To address the above issue, the patch makes the following key changes:
- **`DW_AT_LLVM_stmt_sequence` Attribute:** Introduces a new LLVM-specific attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE. This attribute holds a label pointing to the offset in the line table where the function's line entries begin.
- **End-of-Sequence Markers:** Emits an explicit `DW_LNE_end_sequence` after each function's line entries in the line table. This marks the end of the line information for that function, ensuring that line entries are correctly delimited.
- **Assembler and Streamer Modifications:** Modifies the `MCStreamer` and related classes to support emitting the necessary labels and tracking the current function's line entries. A new flag `GenerateFuncLineTableOffsets` is added to control this behavior.
- **Compiler Option:** Introduces the `-mllvm -emit-func-debug-line-table-offsets` option to enable this functionality, allowing users to opt-in as needed.
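
To make the motivation concrete, here is a minimal sketch of how a symbolication tool might use a per-function sequence offset to disambiguate two functions that share an address range after ICF. Everything in it is a toy model under stated assumptions: `ToyRow`, `ToyLineTable`, `ToySubprogram`, and `lineFor` are hypothetical names, and sequences are simply keyed by their byte offset rather than decoded from a real `.debug_line` section.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// One row of a toy line table: an address and the source line it maps to.
struct ToyRow {
  std::uint64_t Address;
  unsigned Line;
};

// Sequences keyed by their byte offset in the toy line table. Each sequence
// is assumed to be terminated by its own end-of-sequence marker, so the
// sequences of different functions never run into each other.
using ToyLineTable = std::map<std::uint64_t, std::vector<ToyRow>>;

struct ToySubprogram {
  std::string Name;
  std::uint64_t StmtSequence; // models the DW_AT_LLVM_stmt_sequence value
};

// Looks up Addr only inside the function's own sequence and returns the line
// of the last row whose address is <= Addr, or -1 if there is no such row.
inline int lineFor(const ToyLineTable &Table, const ToySubprogram &Fn,
                   std::uint64_t Addr) {
  auto It = Table.find(Fn.StmtSequence);
  if (It == Table.end())
    return -1;
  int Result = -1;
  for (const ToyRow &Row : It->second) {
    if (Row.Address > Addr)
      break; // rows within a sequence are in increasing address order
    Result = static_cast<int>(Row.Line);
  }
  return Result;
}
```

With two folded functions that both cover address `0x1004`, an address-only lookup cannot tell their rows apart, whereas starting from each function's own sequence offset yields that function's line.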