Skip to content

[lldb/DWARF] Make sure bad abbreviation codes do not crash lldb #93006

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 23, 2024

Conversation

labath
Copy link
Collaborator

@labath labath commented May 22, 2024

We currently cannot represent abbreviation codes with more than 16 bits, and we were lldb-asserting if we ever ran into one. While I haven't seen any real DWARF with these kinds of abbreviations, it is possible to hit this with handcrafted evil dwarf, due some sort of corruptions, or just bugs (the addition of PeekDIEName makes these bugs more likely, as the function blindly dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns sure that large abbreviations into an error as well, and adds a test for both cases.

We currently cannot represent abbreviation codes with more than 16 bits,
and we were lldb-asserting if we ever ran into one. While I haven't seen
any real DWARF with these kinds of abbreviations, it is possible to hit
this with handcrafted evil dwarf, due some sort of corruptions, or just
bugs (the addition of PeekDIEName makes these bugs more likely, as the
function blindly dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns
sure that large abbreviations into an error as well, and adds a test for
both cases.
@labath labath requested a review from felipepiovezan May 22, 2024 09:07
@labath labath requested a review from JDevlieghere as a code owner May 22, 2024 09:07
@llvmbot llvmbot added the lldb label May 22, 2024
@llvmbot
Copy link
Member

llvmbot commented May 22, 2024

@llvm/pr-subscribers-lldb

Author: Pavel Labath (labath)

Changes

We currently cannot represent abbreviation codes with more than 16 bits, and we were lldb-asserting if we ever ran into one. While I haven't seen any real DWARF with these kinds of abbreviations, it is possible to hit this with handcrafted evil dwarf, due some sort of corruptions, or just bugs (the addition of PeekDIEName makes these bugs more likely, as the function blindly dereferences offsets within the debug info section) .

Missing abbreviations were already reporting an error. This patch turns sure that large abbreviations into an error as well, and adds a test for both cases.


Full diff: https://github.com/llvm/llvm-project/pull/93006.diff

2 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp (+16-18)
  • (added) lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s (+47)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
index 1b0fefedf9836..4357ccb2f5c9f 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -11,6 +11,7 @@
 #include <cassert>
 
 #include <algorithm>
+#include <limits>
 #include <optional>
 
 #include "llvm/Support/LEB128.h"
@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor &data,
                                   const DWARFUnit *cu,
                                   lldb::offset_t *offset_ptr) {
   m_offset = *offset_ptr;
+  auto report_error = [&](const char *fmt, const auto &...vals) {
+    cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
+        "[{0:x16}]: {1}, please file a bug and "
+        "attach the file at the start of this error message",
+        static_cast<uint64_t>(m_offset), llvm::formatv(fmt, vals...));
+    *offset_ptr = std::numeric_limits<lldb::offset_t>::max();
+    return false;
+  };
+
   m_parent_idx = 0;
   m_sibling_idx = 0;
   const uint64_t abbr_idx = data.GetULEB128(offset_ptr);
-  lldbassert(abbr_idx <= UINT16_MAX);
+  if (abbr_idx > std::numeric_limits<uint16_t>::max())
+    return report_error("abbreviation code {0} too big", abbr_idx);
   m_abbr_idx = abbr_idx;
 
   if (m_abbr_idx == 0) {
@@ -57,16 +68,9 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor &data,
   }
 
   const auto *abbrevDecl = GetAbbreviationDeclarationPtr(cu);
-  if (abbrevDecl == nullptr) {
-    cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-        "[{0:x16}]: invalid abbreviation code {1}, "
-        "please file a bug and "
-        "attach the file at the start of this error message",
-        (uint64_t)m_offset, (unsigned)abbr_idx);
-    // WE can't parse anymore if the DWARF is borked...
-    *offset_ptr = UINT32_MAX;
-    return false;
-  }
+  if (abbrevDecl == nullptr)
+    return report_error("invalid abbreviation code {0}", abbr_idx);
+
   m_tag = abbrevDecl->getTag();
   m_has_children = abbrevDecl->hasChildren();
   // Skip all data in the .debug_info or .debug_types for the attributes
@@ -74,13 +78,7 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor &data,
     if (DWARFFormValue::SkipValue(attribute.Form, data, offset_ptr, cu))
       continue;
 
-    cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
-        "[{0:x16}]: Unsupported DW_FORM_{1:x}, please file a bug "
-        "and "
-        "attach the file at the start of this error message",
-        (uint64_t)m_offset, (unsigned)attribute.Form);
-    *offset_ptr = m_offset;
-    return false;
+    return report_error("Unsupported DW_FORM_{1:x}", attribute.Form);
   }
   return true;
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
new file mode 100644
index 0000000000000..3f32c037aeb20
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/invalid_abbreviation.s
@@ -0,0 +1,47 @@
+# REQUIRES: x86
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
+# RUN: %lldb %t \
+# RUN:   -o exit 2>&1 | FileCheck %s
+
+# CHECK-DAG: error: {{.*}} [0x0000000000000022]: abbreviation code 65536 too big, please file a bug and attach the file at the start of this error message
+# CHECK-DAG: error: {{.*}} [0x0000000000000048]: invalid abbreviation code 47, please file a bug and attach the file at the start of this error message
+
+
+        .section        .debug_abbrev,"",@progbits
+        .uleb128 65535                  # Largest representable Abbreviation Code
+        .byte   17                      # DW_TAG_compile_unit
+        .byte   1                       # DW_CHILDREN_yes
+        .byte   37                      # DW_AT_producer
+        .byte   8                       # DW_FORM_string
+        .byte   0                       # EOM(1)
+        .byte   0                       # EOM(2)
+        .byte   0                       # EOM(3)
+
+        .section        .debug_info,"",@progbits
+.Lcu_begin0:
+        .long   .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+        .short  5                       # DWARF version number
+        .byte   1                       # DWARF Unit Type
+        .byte   8                       # Address Size (in bytes)
+        .long   .debug_abbrev           # Offset Into Abbrev. Section
+        .uleb128 65535                  # DW_TAG_compile_unit
+        .asciz  "Hand-written DWARF"    # DW_AT_producer
+        .uleb128 65536                  # Unrepresentable abbreviation
+        .byte   0                       # End Of Children Mark
+.Ldebug_info_end0:
+
+        .section        .debug_info,"",@progbits
+.Lcu_begin1:
+        .long   .Ldebug_info_end1-.Ldebug_info_start1 # Length of Unit
+.Ldebug_info_start1:
+        .short  5                       # DWARF version number
+        .byte   1                       # DWARF Unit Type
+        .byte   8                       # Address Size (in bytes)
+        .long   .debug_abbrev           # Offset Into Abbrev. Section
+        .uleb128 65535                  # DW_TAG_compile_unit
+        .asciz  "Hand-written DWARF"    # DW_AT_producer
+        .byte   47                      # Missing abbreviation
+        .byte   0                       # End Of Children Mark
+.Ldebug_info_end1:

Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for catching this

@@ -44,10 +45,20 @@ bool DWARFDebugInfoEntry::Extract(const DWARFDataExtractor &data,
const DWARFUnit *cu,
lldb::offset_t *offset_ptr) {
m_offset = *offset_ptr;
auto report_error = [&](const char *fmt, const auto &...vals) {
cu->GetSymbolFileDWARF().GetObjectFile()->GetModule()->ReportError(
Copy link
Member

@JDevlieghere JDevlieghere May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we assume cu is always a valid pointer? If so, should the signature take it by const reference instead?

AFAICT only GetAbbreviationDeclarationPtr checks that it's not NULL but we have a code path that calls the lambda before as well as when abbrevDecl is NULL which would happen if CU is NULL.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of the callers are in DWARFUnit (passing this), but I'm not going to pass up an opportunity to add more references. :P

Copy link
Collaborator

@clayborg clayborg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to back out the PeekDieName() patch locally at Meta because it was crashing us due to this assertion due to .debug_names tables having incorrect values for type units. Only one type unit will appear in the final file and type units can have differing contents for the same type. This means accelerator table entries from .o file that had a type unit, but its type unit didn't end up in the final output file, can be bogus and not point to valid DIE offsets which can cause PeekDieName to parse at random offsets in a type unit and crash or report an error now. So we might need an extra bool to be passed to the DWARFDebugInfoEntry::Extract(...) function that says to report an error and have PeekDieName call this with "report_errors = false". Right now many of these entries will cause a large numver of errors to be reported. This is being fixed by verifying that type unit accelerator table entries are matched to the right type unit, but that PR isn't in yet.

@labath
Copy link
Collaborator Author

labath commented May 23, 2024

We have to back out the PeekDieName() patch locally at Meta because it was crashing us due to this assertion due to .debug_names tables having incorrect values for type units. Only one type unit will appear in the final file and type units can have differing contents for the same type. This means accelerator table entries from .o file that had a type unit, but its type unit didn't end up in the final output file, can be bogus and not point to valid DIE offsets which can cause PeekDieName to parse at random offsets in a type unit and crash or report an error now. So we might need an extra bool to be passed to the DWARFDebugInfoEntry::Extract(...) function that says to report an error and have PeekDieName call this with "report_errors = false". Right now many of these entries will cause a large numver of errors to be reported. This is being fixed by verifying that type unit accelerator table entries are matched to the right type unit, but that PR isn't in yet.

I think that is orthogonal to this patch, but it is troubling nonetheless (thank you for bringing it to my attention). If you want to go down this path, you might want to consider something like call_once (or call_once_every_N_sec) to make sure there is some record of the issue somewhere. If not, I might get around to this at some point.

@labath labath merged commit a282463 into llvm:main May 23, 2024
3 checks passed
@labath labath deleted the crash branch May 23, 2024 09:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants