-
Notifications
You must be signed in to change notification settings - Fork 14.3k
Symbolize line zero as if no source info is available #124846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Since line zero means "no line information", when symbolizing a location (an address or an inline frame associated with the address) that has a line zero location, we shouldn't include other irrelevant data (like filename) in the result.
@llvm/pr-subscribers-llvm-binary-utilities Author: David Blaikie (dwblaikie) ChangesSince line zero means "no line information", when symbolizing a location Full diff: https://github.com/llvm/llvm-project/pull/124846.diff 4 Files Affected:
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
index 99e1642ff23ad5..15eac10688f18f 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
@@ -1869,7 +1869,7 @@ DWARFContext::getInliningInfoForAddress(object::SectionedAddress Address,
LineTable->getFileLineInfoForAddress(
{Address.Address, Address.SectionIndex}, Spec.ApproximateLine,
CU->getCompilationDir(), Spec.FLIKind, Frame);
- } else {
+ } else if (CallLine != 0) {
// Otherwise, use call file, call line and call column from
// previous DIE in inlined chain.
if (LineTable)
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
index 939a5163d55abc..a30ae163fa5a98 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
@@ -1508,7 +1508,8 @@ bool DWARFDebugLine::LineTable::getFileLineInfoForAddress(
return false;
// Take file number and line/column from the row.
const auto &Row = Rows[RowIndex];
- if (!getFileNameByIndex(Row.File, CompDir, Kind, Result.FileName))
+ if (Row.Line == 0 ||
+ !getFileNameByIndex(Row.File, CompDir, Kind, Result.FileName))
return false;
Result.Line = Row.Line;
Result.Column = Row.Column;
diff --git a/llvm/test/tools/llvm-symbolizer/skip-line-zero.s b/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
index e9fbea558e0eb8..74dfb5cdc1aaea 100644
--- a/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
+++ b/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
@@ -20,13 +20,13 @@
## Check that without '--skip-line-zero', line zero is displayed for a line-table entry which has no source correspondence.
# RUN: llvm-symbolizer --obj=%t.o -f=none 0x16d4 | FileCheck --strict-whitespace --match-full-lines --check-prefix=DISABLE %s
-# DISABLE:main.c:0:0
+# DISABLE:??:0:0
## Check that the '--skip-line-zero' does not cross sequence boundaries.
## If it fails to find in the current sequence then line zero is returned for the queried address.
# RUN: llvm-symbolizer --obj=%t.o -f=none --skip-line-zero 0x16c0 | FileCheck --strict-whitespace --match-full-lines --check-prefix=FAIL-ACROSS-SEQ %s
-# FAIL-ACROSS-SEQ:main.c:0:0
+# FAIL-ACROSS-SEQ:??:0:0
## Check that with '--skip-line-zero', the last non-zero line in the current sequence is displayed.
# RUN: llvm-symbolizer --obj=%t.o -f=none --skip-line-zero 0x1717 | FileCheck --strict-whitespace --match-full-lines --check-prefix=WITHIN-SEQ %s
diff --git a/llvm/test/tools/llvm-symbolizer/sym-verbose.test b/llvm/test/tools/llvm-symbolizer/sym-verbose.test
index 831fd6c7f05071..224c317f558a1a 100644
--- a/llvm/test/tools/llvm-symbolizer/sym-verbose.test
+++ b/llvm/test/tools/llvm-symbolizer/sym-verbose.test
@@ -50,13 +50,12 @@ CHECK-NEXT: Column: 0
CHECK: 0x4005ad
CHECK-NEXT: foo
-CHECK-NEXT: Filename: /tmp{{[\\/]}}discrim.c
+CHECK-NEXT: Filename: ??
CHECK-NEXT: Function start filename: /tmp{{[\\/]}}discrim.c
CHECK-NEXT: Function start line: 4
CHECK-NEXT: Function start address: 0x400590
CHECK-NEXT: Line: 0
-CHECK-NEXT: Column: 30
-CHECK-NEXT: Discriminator: 4
+CHECK-NEXT: Column: 0
CHECK-NEXT: main
CHECK-NEXT: Filename: /tmp{{[\\/]}}discrim.c
CHECK-NEXT: Function start filename: /tmp{{[\\/]}}discrim.c
|
@llvm/pr-subscribers-debuginfo Author: David Blaikie (dwblaikie) ChangesSince line zero means "no line information", when symbolizing a location Full diff: https://github.com/llvm/llvm-project/pull/124846.diff 4 Files Affected:
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
index 99e1642ff23ad5..15eac10688f18f 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
@@ -1869,7 +1869,7 @@ DWARFContext::getInliningInfoForAddress(object::SectionedAddress Address,
LineTable->getFileLineInfoForAddress(
{Address.Address, Address.SectionIndex}, Spec.ApproximateLine,
CU->getCompilationDir(), Spec.FLIKind, Frame);
- } else {
+ } else if (CallLine != 0) {
// Otherwise, use call file, call line and call column from
// previous DIE in inlined chain.
if (LineTable)
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
index 939a5163d55abc..a30ae163fa5a98 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
@@ -1508,7 +1508,8 @@ bool DWARFDebugLine::LineTable::getFileLineInfoForAddress(
return false;
// Take file number and line/column from the row.
const auto &Row = Rows[RowIndex];
- if (!getFileNameByIndex(Row.File, CompDir, Kind, Result.FileName))
+ if (Row.Line == 0 ||
+ !getFileNameByIndex(Row.File, CompDir, Kind, Result.FileName))
return false;
Result.Line = Row.Line;
Result.Column = Row.Column;
diff --git a/llvm/test/tools/llvm-symbolizer/skip-line-zero.s b/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
index e9fbea558e0eb8..74dfb5cdc1aaea 100644
--- a/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
+++ b/llvm/test/tools/llvm-symbolizer/skip-line-zero.s
@@ -20,13 +20,13 @@
## Check that without '--skip-line-zero', line zero is displayed for a line-table entry which has no source correspondence.
# RUN: llvm-symbolizer --obj=%t.o -f=none 0x16d4 | FileCheck --strict-whitespace --match-full-lines --check-prefix=DISABLE %s
-# DISABLE:main.c:0:0
+# DISABLE:??:0:0
## Check that the '--skip-line-zero' does not cross sequence boundaries.
## If it fails to find in the current sequence then line zero is returned for the queried address.
# RUN: llvm-symbolizer --obj=%t.o -f=none --skip-line-zero 0x16c0 | FileCheck --strict-whitespace --match-full-lines --check-prefix=FAIL-ACROSS-SEQ %s
-# FAIL-ACROSS-SEQ:main.c:0:0
+# FAIL-ACROSS-SEQ:??:0:0
## Check that with '--skip-line-zero', the last non-zero line in the current sequence is displayed.
# RUN: llvm-symbolizer --obj=%t.o -f=none --skip-line-zero 0x1717 | FileCheck --strict-whitespace --match-full-lines --check-prefix=WITHIN-SEQ %s
diff --git a/llvm/test/tools/llvm-symbolizer/sym-verbose.test b/llvm/test/tools/llvm-symbolizer/sym-verbose.test
index 831fd6c7f05071..224c317f558a1a 100644
--- a/llvm/test/tools/llvm-symbolizer/sym-verbose.test
+++ b/llvm/test/tools/llvm-symbolizer/sym-verbose.test
@@ -50,13 +50,12 @@ CHECK-NEXT: Column: 0
CHECK: 0x4005ad
CHECK-NEXT: foo
-CHECK-NEXT: Filename: /tmp{{[\\/]}}discrim.c
+CHECK-NEXT: Filename: ??
CHECK-NEXT: Function start filename: /tmp{{[\\/]}}discrim.c
CHECK-NEXT: Function start line: 4
CHECK-NEXT: Function start address: 0x400590
CHECK-NEXT: Line: 0
-CHECK-NEXT: Column: 30
-CHECK-NEXT: Discriminator: 4
+CHECK-NEXT: Column: 0
CHECK-NEXT: main
CHECK-NEXT: Filename: /tmp{{[\\/]}}discrim.c
CHECK-NEXT: Function start filename: /tmp{{[\\/]}}discrim.c
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable to me, but probably worth getting one or two others to confirm they're happy with this idea too.
Can you link to the code or documentation that explains this DWARF line table size encoding optimization? My reading of |
Not sure if this is what you're asking for, but DWARF has compact encoding for line-table entries that advance the line number. That is, things like "line=line+1" are very compact. The idea is that line numbers tend to go up by relatively small amounts, so it's worth making that not take up a lot of space. I remember doing some work to make sure that going to line 0 and back did NOT affect/reset the column number. Setting the column number to zero, and then putting it back, seemed like a waste of space. Line 0 means "no source" regardless of file or column or anything else. I think what David is doing here is reasonable for symbolizing. The other source-coordinate parameters are meaningless when line = 0. I haven't looked at what |
Agreed, what David is doing here seems correct. Lines are 1-based in DWARF which is why line 0 is treated as a special marker to indicate no line information. |
I agree with everything you said, but where is this actually implemented in the LLVM source code? I looked hard at the MC DWARF line table encoding, but I couldn't find where this is implemented. |
The DWARF spec wording for this in DWARFv5 (https://dwarfstd.org/doc/DWARF5.pdf) is page 151, table 6.3 which says "The compiler may emit the value 0 in cases where an instruction cannot be attributed to any source line." (by "line" read that as "entry" - it doesn't mean "oh, there's no line, but maybe there's a column", the column is a modifier on the line - so if there's no line, the column is meaningless (& if there's zero line, the file doesn't matter, etc)) Admittedly, the DWARF spec isn't /entirely/ clear that the file doesn't matter if line is zero - you could argue that the file is meaningful even if the line is zero and means the instruction came from "somewhere in this file" - but that generally hasn't been the interpretation the LLVM project has taken. (we could file a DWARF issue to get this clarified) So I wrote a small test case with DILocations, and didn't observe the behavior I was expecting - LLVM lowering did respect the varying column on the instructions. That's because we rarely use line zero on an instruction, I think - we mostly remove the location from the instruction entirely (and that usually causes the location of a previous instruction to "flow" on to the unlocated instruction, when th eline table is created). But a place where we then do produce a zero location - is when a flow-on location would be problematic. Especially at the start of a basic block, where a previous block's location would flow across the boundary which would be bad (sample accuracy, human user - it'd show we are in a block we aren't in, might not be reached/reachable/hot, etc) Ah, here we go: llvm-project/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp Lines 2101 to 2120 in 8b44884
|
We could probably sink this handling down into recordSourceLine so it does kick in for explicitly zero DILocation'd instructions too (these happen for things like call instructions that get merged) |
Thanks for the pointer! |
Not sure I've seen anyone suggest the "somewhere in this file" interpretation before. I find it hard to imagine "from this file but no particular line" as being usefully distinct from "no source line whatsoever." (And DWARF (at least as of v5) has no way to specify "no file" in any case.) |
In 124846 the symbolizer was changed to ignore 0-column entries, which lead to a slightly different representation in the stack traces. This patch addresses these differences. Not sure if the difference in kernel_trap.c is also a result of this change or not. Can be tracked separate from this, after the bots are back to green.
In #124846 the symbolizer was changed to ignore 0-column entries, which lead to a slightly different representation in the stack traces. This patch addresses these differences. Not sure if the difference in kernel_trap.c is also a result of this change or not. Can be tracked separate from this, after the bots are back to green.
Since line zero means "no line information", when symbolizing a location (an address or an inline frame associated with the address) that has a line zero location, we shouldn't include other irrelevant data (like filename) in the result.
In llvm#124846 the symbolizer was changed to ignore 0-column entries, which lead to a slightly different representation in the stack traces. This patch addresses these differences. Not sure if the difference in kernel_trap.c is also a result of this change or not. Can be tracked separate from this, after the bots are back to green.
This commit creates an inconsistency on `__sanitizer_symbolize_pc` API. Before this change, this API always uses the filename from debug info when the line number is 0. After this change, the filename becomes invalid when line number is 0. The sanitizer might fall back to use base filename from symbol table. So, this API may return either `??:0` or `{filename from symbol table}:0` depending on if the symbol table has the filename for it. Make sure this inconsistency is resolved before relanding the commit.
…debug info for the given address is available. (#128619) To reland llvm/llvm-project#124846, we need to make symbolizer consistent with the case when line number is 0. Always using filename and line from debug info even if the line number is 0 sounds like the reasonable path to go.
…24846)" This land commits 23aca2f and 1b15a89. llvm#128619 makes symbolizer to always use debug info when available so we can reland this chagnge.
…#124846)" (#133798)" This reverts commit 3483740 because #128619 doesn't handle the case when we have an empty frame from `getInliningInfoForAddress` because line num is 0 which makes it non-differentiable from missing debug info. So, we end up using the base filename from symtab again. Reverting for now until that issus is solved.
…llvm#124846)" (llvm#133798)" This reverts commit 3483740 because llvm#128619 doesn't handle the case when we have an empty frame from `getInliningInfoForAddress` because line num is 0 which makes it non-differentiable from missing debug info. So, we end up using the base filename from symtab again. Reverting for now until that issus is solved.
…llvm#124846)" (llvm#133798)" This reverts commit 3483740 because llvm#128619 doesn't handle the case when we have an empty frame from `getInliningInfoForAddress` because line num is 0 which makes it non-differentiable from missing debug info. So, we end up using the base filename from symtab again. Reverting for now until that issus is solved.
Since line zero means "no line information", when symbolizing a location
(an address or an inline frame associated with the address) that has a
line zero location, we shouldn't include other irrelevant data (like
filename) in the result.