Skip to content

[libDebugInfo] Prevent infinite recursion in DWARFDie::getTypeSize() #74681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 7, 2023

Conversation

adrian-prantl
Copy link
Collaborator

when run on invalid input.

@llvmbot
Copy link
Member

llvmbot commented Dec 7, 2023

@llvm/pr-subscribers-debuginfo

Author: Adrian Prantl (adrian-prantl)

Changes

when run on invalid input.


Full diff: https://github.com/llvm/llvm-project/pull/74681.diff

5 Files Affected:

  • (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h (+4-1)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFContext.cpp (+4-2)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFDie.cpp (+10-4)
  • (modified) llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp (+5-2)
  • (modified) llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp (+39)
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h
index 421b84d644db64..5ccd93a979fe1b 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFDie.h
@@ -10,6 +10,7 @@
 #define LLVM_DEBUGINFO_DWARF_DWARFDIE_H
 
 #include "llvm/ADT/ArrayRef.h"
+#include "llvm/ADT/SmallPtrSet.h"
 #include "llvm/ADT/iterator.h"
 #include "llvm/ADT/iterator_range.h"
 #include "llvm/BinaryFormat/Dwarf.h"
@@ -284,7 +285,9 @@ class DWARFDie {
   /// \param PointerSize the pointer size of the containing CU.
   /// \returns if this is a type DIE, or this DIE contains a DW_AT_type, returns
   /// the size of the type.
-  std::optional<uint64_t> getTypeSize(uint64_t PointerSize);
+  std::optional<uint64_t>
+  getTypeSize(uint64_t PointerSize,
+              llvm::SmallPtrSetImpl<const DWARFDebugInfoEntry *> &visited);
 
   class iterator;
 
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
index c671aedbc9e52b..87af76ae72ae0e 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
@@ -1663,8 +1663,10 @@ void DWARFContext::addLocalsForDie(DWARFCompileUnit *CU, DWARFDie Subprogram,
     if (auto NameAttr = Die.find(DW_AT_name))
       if (std::optional<const char *> Name = dwarf::toString(*NameAttr))
         Local.Name = *Name;
-    if (auto Type = Die.getAttributeValueAsReferencedDie(DW_AT_type))
-      Local.Size = Type.getTypeSize(getCUAddrSize());
+    if (auto Type = Die.getAttributeValueAsReferencedDie(DW_AT_type)) {
+      llvm::SmallPtrSet<const DWARFDebugInfoEntry *, 4> visited;
+      Local.Size = Type.getTypeSize(getCUAddrSize(), visited);
+    }
     if (auto DeclFileAttr = Die.find(DW_AT_decl_file)) {
       if (const auto *LT = CU->getContext().getLineTableForUnit(CU))
         LT->getFileNameByIndex(
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
index 0f01933002c06d..3fec05e49a9210 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDie.cpp
@@ -489,7 +489,12 @@ void DWARFDie::getCallerFrame(uint32_t &CallFile, uint32_t &CallLine,
   CallDiscriminator = toUnsigned(find(DW_AT_GNU_discriminator), 0);
 }
 
-std::optional<uint64_t> DWARFDie::getTypeSize(uint64_t PointerSize) {
+std::optional<uint64_t>
+DWARFDie::getTypeSize(uint64_t PointerSize,
+                      SmallPtrSetImpl<const DWARFDebugInfoEntry *> &visited) {
+  // Cycle detected?
+  if (!visited.insert(Die).second)
+    return {};
   if (auto SizeAttr = find(DW_AT_byte_size))
     if (std::optional<uint64_t> Size = SizeAttr->getAsUnsignedConstant())
       return Size;
@@ -511,14 +516,15 @@ std::optional<uint64_t> DWARFDie::getTypeSize(uint64_t PointerSize) {
   case DW_TAG_restrict_type:
   case DW_TAG_typedef: {
     if (DWARFDie BaseType = getAttributeValueAsReferencedDie(DW_AT_type))
-      return BaseType.getTypeSize(PointerSize);
+      return BaseType.getTypeSize(PointerSize, visited);
     break;
   }
   case DW_TAG_array_type: {
     DWARFDie BaseType = getAttributeValueAsReferencedDie(DW_AT_type);
     if (!BaseType)
       return std::nullopt;
-    std::optional<uint64_t> BaseSize = BaseType.getTypeSize(PointerSize);
+    std::optional<uint64_t> BaseSize =
+        BaseType.getTypeSize(PointerSize, visited);
     if (!BaseSize)
       return std::nullopt;
     uint64_t Size = *BaseSize;
@@ -543,7 +549,7 @@ std::optional<uint64_t> DWARFDie::getTypeSize(uint64_t PointerSize) {
   }
   default:
     if (DWARFDie BaseType = getAttributeValueAsReferencedDie(DW_AT_type))
-      return BaseType.getTypeSize(PointerSize);
+      return BaseType.getTypeSize(PointerSize, visited);
     break;
   }
   return std::nullopt;
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
index 9f455fa7e96a7e..8ac31e4a2a94fa 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
@@ -817,9 +817,12 @@ void DWARFUnit::updateVariableDieMap(DWARFDie Die) {
   // no type), then we use a size of one to still allow symbolization of the
   // exact address.
   uint64_t GVSize = 1;
-  if (Die.getAttributeValueAsReferencedDie(DW_AT_type))
-    if (std::optional<uint64_t> Size = Die.getTypeSize(getAddressByteSize()))
+  if (Die.getAttributeValueAsReferencedDie(DW_AT_type)) {
+    llvm::SmallPtrSet<const DWARFDebugInfoEntry *, 4> visited;
+    if (std::optional<uint64_t> Size =
+            Die.getTypeSize(getAddressByteSize(), visited))
       GVSize = *Size;
+  }
 
   if (Address != UINT64_MAX)
     VariableDieMap[Address] = {Address + GVSize, Die};
diff --git a/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp b/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
index 0b7f8f41bc53f4..7b9b25eaf5b750 100644
--- a/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
+++ b/llvm/unittests/DebugInfo/DWARF/DWARFDebugInfoTest.cpp
@@ -1615,6 +1615,45 @@ TEST(DWARFDebugInfo, TestFindRecurse) {
   EXPECT_EQ(AbsDieName, StringOpt.value_or(nullptr));
 }
 
+TEST(DWARFDebugInfo, TestSelfRecursiveType) {
+  typedef uint32_t AddrType;
+  Triple Triple = getDefaultTargetTripleForAddrSize(sizeof(AddrType));
+  if (!isConfigurationSupported(Triple))
+    GTEST_SKIP();
+
+  auto ExpectedDG = dwarfgen::Generator::create(Triple, 4);
+  ASSERT_THAT_EXPECTED(ExpectedDG, Succeeded());
+  dwarfgen::Generator *DG = ExpectedDG.get().get();
+  dwarfgen::CompileUnit &CU = DG->addCompileUnit();
+  dwarfgen::DIE CUDie = CU.getUnitDIE();
+
+  // Create an invalid self-recursive typedef.
+  dwarfgen::DIE TypedefDie = CUDie.addChild(DW_TAG_typedef);
+  TypedefDie.addAttribute(DW_AT_name, DW_FORM_strp, "illegal");
+  TypedefDie.addAttribute(DW_AT_type, DW_FORM_ref_addr, TypedefDie);
+
+  MemoryBufferRef FileBuffer(DG->generate(), "dwarf");
+  auto Obj = object::ObjectFile::createObjectFile(FileBuffer);
+  EXPECT_TRUE((bool)Obj);
+  std::unique_ptr<DWARFContext> DwarfContext = DWARFContext::create(**Obj);
+
+  // Verify the number of compile units is correct.
+  uint32_t NumCUs = DwarfContext->getNumCompileUnits();
+  EXPECT_EQ(NumCUs, 1u);
+  DWARFCompileUnit *U =
+      cast<DWARFCompileUnit>(DwarfContext->getUnitAtIndex(0));
+
+  {
+    DWARFDie CUDie = U->getUnitDIE(false);
+    EXPECT_TRUE(CUDie.isValid());
+    DWARFDie TypedefDie = CUDie.getFirstChild();
+
+    // Test that getTypeSize doesn't get into an infinite loop.
+    llvm::SmallPtrSet<const DWARFDebugInfoEntry *, 1> visited;
+    EXPECT_EQ(TypedefDie.getTypeSize(sizeof(AddrType), visited), std::nullopt);
+  }
+}
+
 TEST(DWARFDebugInfo, TestDwarfToFunctions) {
   // Test all of the dwarf::toXXX functions that take a
   // std::optional<DWARFFormValue> and extract the values from it.

Copy link

github-actions bot commented Dec 7, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@cyndyishida
Copy link
Member

This fixes tapi issues I've seen on my end, thank you!

@dwblaikie
Copy link
Collaborator

A more general question: libDebugInfoDWARF isn't at all robust to "interesting" let alone invalid input - this fixes one instance, but I'm wondering what your goals are. If this bug was worth fixing, what's your use case/what else are you planning to fix/handle?

(might be worth a broader discussion about how we might go about doing that? I'm not sure how this compares to Clang's recursion limits, for instance? Not sure if they use a similarly "visited" system, or if there's some other tools/techniques to consider?)

@pogo59
Copy link
Collaborator

pogo59 commented Dec 7, 2023

This appears to be imposing an implementation-detail cost on the caller?

@adrian-prantl
Copy link
Collaborator Author

A more general question: libDebugInfoDWARF isn't at all robust to "interesting" let alone invalid input - this fixes one instance, but I'm wondering what your goals are. If this bug was worth fixing, what's your use case/what else are you planning to fix/handle?

To answer this question narrowly, the kind of debug info in the test case (a self-recursive typedef) can be created by a combination of an LTO bug (which I'm trying to fix next, though it make time, because LTO) and a bug in dsymutil, and the resulting DWARF caused tooling like TAPI to crash with an infinite recursion. My goal is to just make sure that this kind of DWARF that is now out in the wild can be handled by users of libDebugInfo.

(might be worth a broader discussion about how we might go about doing that? I'm not sure how this compares to Clang's recursion limits, for instance? Not sure if they use a similarly "visited" system, or if there's some other tools/techniques to consider?)

Generally, DWARF consumers are good at dealing with recursive types. This is a special case where the getTypeSize() method which is AFAIK not used in e.g., LLDB, didn't expect to find a self-referential type.

std::optional<uint64_t> getTypeSize(uint64_t PointerSize);
std::optional<uint64_t>
getTypeSize(uint64_t PointerSize,
llvm::SmallPtrSetImpl<const DWARFDebugInfoEntry *> &Visited);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're already inside the llvm:: namespace here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I kind of agree with @pogo59 , this is exposing an implementation detail to the users of this call.
The more general way of handling this would be to do one of two things:

  1. Keep the interface as before, but change the implementation to be iterative and have the set be defined inside the implementation.
  2. Keep the interface as before, and keep the implementation recursive, but rename the old implementation to an impl method that takes the set as a parameter, and make the implementation of the "normal" method define the set and pass it to the impl method.

Both of those will keep callers unchanged.

@dwblaikie
Copy link
Collaborator

A more general question: libDebugInfoDWARF isn't at all robust to "interesting" let alone invalid input - this fixes one instance, but I'm wondering what your goals are. If this bug was worth fixing, what's your use case/what else are you planning to fix/handle?

To answer this question narrowly, the kind of debug info in the test case (a self-recursive typedef) can be created by a combination of an LTO bug (which I'm trying to fix next, though it make time, because LTO) and a bug in dsymutil, and the resulting DWARF caused tooling like TAPI to crash with an infinite recursion. My goal is to just make sure that this kind of DWARF that is now out in the wild can be handled by users of libDebugInfo.

(might be worth a broader discussion about how we might go about doing that? I'm not sure how this compares to Clang's recursion limits, for instance? Not sure if they use a similarly "visited" system, or if there's some other tools/techniques to consider?)

Generally, DWARF consumers are good at dealing with recursive types. This is a special case where the getTypeSize() method which is AFAIK not used in e.g., LLDB, didn't expect to find a self-referential type.

A more general question: libDebugInfoDWARF isn't at all robust to "interesting" let alone invalid input - this fixes one instance, but I'm wondering what your goals are. If this bug was worth fixing, what's your use case/what else are you planning to fix/handle?

To answer this question narrowly, the kind of debug info in the test case (a self-recursive typedef) can be created by a combination of an LTO bug (which I'm trying to fix next, though it make time, because LTO) and a bug in dsymutil, and the resulting DWARF caused tooling like TAPI to crash with an infinite recursion. My goal is to just make sure that this kind of DWARF that is now out in the wild can be handled by users of libDebugInfo.

Ah, fair enough.

(might be worth a broader discussion about how we might go about doing that? I'm not sure how this compares to Clang's recursion limits, for instance? Not sure if they use a similarly "visited" system, or if there's some other tools/techniques to consider?)

Generally, DWARF consumers are good at dealing with recursive types. This is a special case where the getTypeSize() method which is AFAIK not used in e.g., LLDB, didn't expect to find a self-referential type.

Huh, somewhat surprises me - but fair enough. Thanks for the context.

@adrian-prantl
Copy link
Collaborator Author

This appears to be imposing an implementation-detail cost on the caller?

Fixed!

Copy link
Contributor

@felipepiovezan felipepiovezan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@adrian-prantl adrian-prantl merged commit c6805ea into llvm:main Dec 7, 2023
adrian-prantl added a commit to adrian-prantl/llvm-project that referenced this pull request Dec 8, 2023
adrian-prantl added a commit to swiftlang/llvm-project that referenced this pull request Dec 11, 2023
…25-libDebugInfo-Prevent-infinite-recursion-in-DWARFDie-getTypeSize-74681

[Cherry-pick into stable/20230725] [libDebugInfo] Prevent infinite recursion in DWARFDie::getTypeSize() (llvm#74681)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants