Skip to content

[lldb][SymbolFileDWARF] Share GetDIEToType between SymbolFiles of a SymbolFileDWARFDebugMap #120569

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Dec 23, 2024

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Dec 19, 2024

The problem here manifests as follows:

  1. We are stopped in main.o, so the first ParseTypeFromDWARF on FooImpl<char> gets called on main.o's SymbolFile. This adds a mapping from declaration die -> TypeSP into main.o's GetDIEToType map.
  2. We then CompleteType(FooImpl<char>). Depending on the order of entries in the debug-map, this might call CompleteType on lib.o's SymbolFile. In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).
  3. While in CompleteTypeFromDWARF, we call ParseTypeFromDWARF again. This will parse the member function FooImpl::Create and its return type which is a typedef to FooImpl*. But now we're inside lib.o's SymbolFile, so we call it on the definition DIE. In step (2) we just inserted a nullptr into GetDIEToType for the definition DIE, so we trivially return a nullptr from ParseTypeFromDWARF. Instead of reporting back this parse failure to the user LLDB trucks on and marks FooImpl::Ref to be void*.

This test-case will trigger an assert in TypeSystemClang::VerifyDecl even if we just frame var (but only in debug-builds). In release builds where this function is a no-op, we'll create an incorrect Clang AST node for the Ref typedef.

The proposed fix here is to share the GetDIEToType map between SymbolFiles if a debug-map exists.

Alternatives considered

  • Check the GetDIEToType map of the SymbolFile that the declaration DIE belongs to. The assumption here being that if we called ParseTypeFromDWARF on a declaration, the GetDIEToType map that the result was inserted into was the one on that DIE's SymbolFile. This was the first version of this patch, but that felt like a weaker version sharing the map. It complicates the code in CompleteType and is less consistent with the other bookkeeping structures we already share between SymbolFiles
  • Return from SymbolFileDWARF::CompleteType if there is no type in the current GetDIEToType. Then SymbolFileDWARFDebugMap::CompleteType could continue to the next SymbolFile which does own the type. But that didn't quite work because we remove the GetForwardCompilerTypeToDie entry in SymbolFile::CompleteType, which SymbolFileDWARFDebugMap::CompleteType relies upon for iterating

…DIE's SymbolFile

The problem here manifests as follows:
1. We are stopped in main.o, so the first `ParseTypeFromDWARF`
on `FooImpl<char>` gets called on `main.o`'s SymbolFile. This
adds a mapping from *declaration die* -> `TypeSP` into `main.o`'s
`GetDIEToType` map.
2. We then `CompleteType(FooImpl<char>)`. Depending on the order of
entries in the debug-map, this might call `CompleteType` on `lib.o`'s
SymbolFile. In which case, `GetDIEToType().lookup(decl_die)` will
return a `nullptr`. This is already a bit iffy because surrounding code
used to assume we don't call `CompleteTypeFromDWARF` with a `nullptr` `Type*`.
This used to be more of an issue when `CompleteRecordType` actually made use of
the `Type*` parameter (see llvm#120456).
3. While in `CompleteTypeFromDWARF`, we call `ParseTypeFromDWARF` again.
This will parse the member function `FooImpl::Create` and its return
type which is a typedef to `FooImpl*`. But now we're inside `lib.o`'s SymbolFile,
so we call it on the definition DIE. In step (2) we just inserted a `nullptr` into
`GetDIEToType` for the definition DIE, so we trivially return a `nullptr` from
`ParseTypeFromDWARF`. Instead of reporting back this parse failure to
the user LLDB trucks on and marks `FooImpl::Ref` to be `void*`.

This test-case will trigger an assert in `TypeSystemClang::VerifyDecl`
even if we just `frame var` (but only in debug-builds). In release
builds where this function is a no-op, we'll create an incorrect Clang
AST node for the `Ref` typedef.
@llvmbot
Copy link
Member

llvmbot commented Dec 19, 2024

@llvm/pr-subscribers-lldb

Author: Michael Buch (Michael137)

Changes

…DIE's SymbolFile

The problem here manifests as follows:

  1. We are stopped in main.o, so the first ParseTypeFromDWARF on FooImpl&lt;char&gt; gets called on main.o's SymbolFile. This adds a mapping from declaration die -> TypeSP into main.o's GetDIEToType map.
  2. We then CompleteType(FooImpl&lt;char&gt;). Depending on the order of entries in the debug-map, this might call CompleteType on lib.o's SymbolFile. In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because surrounding code used to assume we don't call CompleteTypeFromDWARF with a nullptr Type*. This used to be more of an issue when CompleteRecordType actually made use of the Type* parameter (see [lldb][DWARFASTParserClang][NFC] Remove unused parameter to CompleteRecordType #120456).
  3. While in CompleteTypeFromDWARF, we call ParseTypeFromDWARF again. This will parse the member function FooImpl::Create and its return type which is a typedef to FooImpl*. But now we're inside lib.o's SymbolFile, so we call it on the definition DIE. In step (2) we just inserted a nullptr into GetDIEToType for the definition DIE, so we trivially return a nullptr from ParseTypeFromDWARF. Instead of reporting back this parse failure to the user LLDB trucks on and marks FooImpl::Ref to be void*.

This test-case will trigger an assert in TypeSystemClang::VerifyDecl even if we just frame var (but only in debug-builds). In release builds where this function is a no-op, we'll create an incorrect Clang AST node for the Ref typedef.

The proposed fix here is to check the GetDIEToType map of the SymbolFile that the declaration DIE belongs to. The assumption here being that if we called ParseTypeFromDWARF on a declaration, the GetDIEToType map that the result was inserted into was the one on that DIE's SymbolFile.


Full diff: https://github.com/llvm/llvm-project/pull/120569.diff

6 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp (+3-1)
  • (added) lldb/test/API/lang/cpp/typedef-to-outer-fwd/Makefile (+3)
  • (added) lldb/test/API/lang/cpp/typedef-to-outer-fwd/TestTypedefToOuterFwd.py (+32)
  • (added) lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.cpp (+15)
  • (added) lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.h (+10)
  • (added) lldb/test/API/lang/cpp/typedef-to-outer-fwd/main.cpp (+8)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 68e50902d641a2..936a44865b4a7e 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -1593,7 +1593,9 @@ bool SymbolFileDWARF::CompleteType(CompilerType &compiler_type) {
   DWARFASTParser *dwarf_ast = GetDWARFParser(*def_die.GetCU());
   if (!dwarf_ast)
     return false;
-  Type *type = GetDIEToType().lookup(decl_die.GetDIE());
+  Type *type = decl_die.GetDWARF()->GetDIEToType().lookup(decl_die.GetDIE());
+  assert (type);
+
   if (decl_die != def_die) {
     GetDIEToType()[def_die.GetDIE()] = type;
     DWARFASTParserClang *ast_parser =
diff --git a/lldb/test/API/lang/cpp/typedef-to-outer-fwd/Makefile b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/Makefile
new file mode 100644
index 00000000000000..d9db5666f9b6cd
--- /dev/null
+++ b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := lib.cpp main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/lang/cpp/typedef-to-outer-fwd/TestTypedefToOuterFwd.py b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/TestTypedefToOuterFwd.py
new file mode 100644
index 00000000000000..ad26d503c545b2
--- /dev/null
+++ b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/TestTypedefToOuterFwd.py
@@ -0,0 +1,32 @@
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class TestCaseTypedefToOuterFwd(TestBase):
+    '''
+    We are stopped in main.o, which only sees a forward declaration
+    of FooImpl. We then try to get the FooImpl::Ref typedef (whose
+    definition is in lib.o). Make sure we correctly resolve this
+    typedef.
+    '''
+    def test(self):
+        self.build()
+        (_, _, thread, _) = lldbutil.run_to_source_breakpoint(
+            self, "return", lldb.SBFileSpec("main.cpp")
+        )
+
+        foo = thread.frames[0].FindVariable('foo')
+        self.assertSuccess(foo.GetError(), "Found foo")
+
+        foo_type = foo.GetType()
+        self.assertTrue(foo_type)
+
+        impl = foo_type.GetPointeeType()
+        self.assertTrue(impl)
+
+        ref = impl.FindDirectNestedType('Ref')
+        self.assertTrue(ref)
+
+        self.assertEqual(ref.GetCanonicalType(), foo_type)
diff --git a/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.cpp b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.cpp
new file mode 100644
index 00000000000000..2bc38f6503396d
--- /dev/null
+++ b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.cpp
@@ -0,0 +1,15 @@
+#include "lib.h"
+
+template <typename T> struct FooImpl {
+  using Ref = FooImpl<T> *;
+
+  Ref Create() { return new FooImpl<T>(); }
+};
+
+Foo getString() {
+  FooImpl<char> impl;
+  Foo ret;
+  ret.impl = impl.Create();
+
+  return ret;
+}
diff --git a/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.h b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.h
new file mode 100644
index 00000000000000..d5dfddf2246209
--- /dev/null
+++ b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/lib.h
@@ -0,0 +1,10 @@
+#ifndef LIB_H_IN
+#define LIB_H_IN
+
+template <typename T> struct FooImpl;
+
+struct Foo {
+  FooImpl<char> *impl = nullptr;
+};
+
+#endif // LIB_H_IN
diff --git a/lldb/test/API/lang/cpp/typedef-to-outer-fwd/main.cpp b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/main.cpp
new file mode 100644
index 00000000000000..61c819ca0d31a1
--- /dev/null
+++ b/lldb/test/API/lang/cpp/typedef-to-outer-fwd/main.cpp
@@ -0,0 +1,8 @@
+#include "lib.h"
+
+extern Foo getString();
+
+int main() {
+  FooImpl<char> *foo = getString().impl;
+  return 0;
+}

Copy link

github-actions bot commented Dec 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link

github-actions bot commented Dec 19, 2024

✅ With the latest revision this PR passed the Python code formatter.

@Michael137
Copy link
Member Author

Michael137 commented Dec 19, 2024

This opens a couple cans of worms (though I think this fix still makes sense in isolation):

  1. This one might be by design and hard to change at this point but do we want to silently ignore failures to resolve a type in ResolveCompilerType?:
    // We have no encoding type, return void?
  2. The order of the entries in the debug-map for the test-case matter! Which is worrying because that is probably not a deterministic property we can rely on? If lib.cpp and main.cpp were swapped we actually run into an entirely different assertion (which I'm looking into separately)! If anyone has ideas on how to make this more deterministic let me know. Perhaps a unit-test for ParseTypeFromDWARF/CompleteTypeFromDWARF is the best way forward for that?
  3. Are nullptr entries in GetDIEToType ever allowed? Or does LLDB ever depend on that? If not, it might be worth enforcing this somehow (maybe even getting rid of DIE_IS_BEING_PARSED).

@Michael137 Michael137 changed the title [lldb][SymbolFileDWARF] CompleteType: Lookup type in the declaration … [lldb][SymbolFileDWARF] CompleteType: Lookup type in the declaration DIE's SymbolFile Dec 19, 2024
@@ -1593,7 +1593,9 @@ bool SymbolFileDWARF::CompleteType(CompilerType &compiler_type) {
DWARFASTParser *dwarf_ast = GetDWARFParser(*def_die.GetCU());
if (!dwarf_ast)
return false;
Type *type = GetDIEToType().lookup(decl_die.GetDIE());
Type *type = decl_die.GetDWARF()->GetDIEToType().lookup(decl_die.GetDIE());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you edit down your commit message into a comment that explains the double-lookup here?

@ZequanWu
Copy link
Contributor

In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).

Should we bail out early if the Type* is null and return false to tell SymbolFileDWARFDebugMap::CompleteType that it can not complete this type and let it iterate to the symbol file that has the entry in its map.

@Michael137
Copy link
Member Author

In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).

Should we bail out early if the Type* is null and return false to tell SymbolFileDWARFDebugMap::CompleteType that it can not complete this type and let it iterate to the symbol file that has the entry in its map.

Yea that might be even simpler. Let me try that.

Also, the other crash i alluded to seems to come from the fact that UniqueDWARFASTTypeList::Find will compare Declarations (i.e., line and col info) regardless of whether we're dealing with C++ or not. But GetUniqueTypeNameAndDeclaration will return a 0 Declaration for C++. So we can end up inserting the Declaration for a definition DIE into the map (via UpdateToDefDIE), but then try to retrieve it using a 0 Declaration. Should have a patch up for that soon (I think we can just skip the declaration check if we're dealing with C++?)

@ZequanWu
Copy link
Contributor

In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).

Should we bail out early if the Type* is null and return false to tell SymbolFileDWARFDebugMap::CompleteType that it can not complete this type and let it iterate to the symbol file that has the entry in its map.

Yea that might be even simpler. Let me try that.

Also, the other crash i alluded to seems to come from the fact that UniqueDWARFASTTypeList::Find will compare Declarations (i.e., line and col info) regardless of whether we're dealing with C++ or not. But GetUniqueTypeNameAndDeclaration will return a 0 Declaration for C++. So we can end up inserting the Declaration for a definition DIE into the map (via UpdateToDefDIE), but then try to retrieve it using a 0 Declaration. Should have a patch up for that soon (I think we can just skip the declaration check if we're dealing with C++?)

I agree that checking Declaration for C++ die isn't necessary, the unique name should be sufficient. I get that Declaration is cleared for all C++ die in GetUniqueTypeNameAndDeclaration, so they all have 0 declaration and checking on declaration will just always be true for C++ right? How does it cause a problem?

@Michael137
Copy link
Member Author

Michael137 commented Dec 19, 2024

In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).

Should we bail out early if the Type* is null and return false to tell SymbolFileDWARFDebugMap::CompleteType that it can not complete this type and let it iterate to the symbol file that has the entry in its map.

Yea that might be even simpler. Let me try that.
Also, the other crash i alluded to seems to come from the fact that UniqueDWARFASTTypeList::Find will compare Declarations (i.e., line and col info) regardless of whether we're dealing with C++ or not. But GetUniqueTypeNameAndDeclaration will return a 0 Declaration for C++. So we can end up inserting the Declaration for a definition DIE into the map (via UpdateToDefDIE), but then try to retrieve it using a 0 Declaration. Should have a patch up for that soon (I think we can just skip the declaration check if we're dealing with C++?)

I agree that checking Declaration for C++ die isn't necessary, the unique name should be sufficient. I get that Declaration is cleared for all C++ die in GetUniqueTypeNameAndDeclaration, so they all have 0 declaration and checking on declaration will just always be true for C++ right? How does it cause a problem?

It's causing issues when we call ParseTypeFromDWARF while we're inside CompleteRecordType. E.g., in the test-case in this patch.

CompleteRecordType will call UpdateToDefDIE, which will update the Declaration stored in the map to something non-zero (we get it from the ParsedDWARFTypeAttributes(def_die)). Then if we call ParseTypeFromDWARF, we will search the map for a 0 Declaration (because of GetUniqueTypeNameAndDeclaration). So we don't find the type and create a new forward declaration. Which causes problems later down the line. Specifically in the test-case, we use the forward declaration as the type that we call CreateTypedef on, but because we never called CompleteType on this one (we called it on the type we stored in the UniqueDWARFASTTypeList, but we never used it) we never started its definition. This violates the invariant in CreateTypedef when we call decl_ctx->addDecl, which requires decl_ctx to have a definition.

@ZequanWu
Copy link
Contributor

CompleteRecordType will call UpdateToDefDIE, which will update the Declaration stored in the map to something non-zero (we get it from the ParsedDWARFTypeAttributes(def_die)).

Ah, I missed this part. Thanks for the explanation. Also thanks for fixing those bugs introduced by the delay def die searching change.

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, we might be able to make the die-to-type map shared between all symbol files in a debug map (similar to how "unique dwarf ast type map" is shared).

@labath
Copy link
Collaborator

labath commented Dec 20, 2024

1. This one might be by design and hard to change at this point but do we want to silently ignore failures to resolve a type in `ResolveCompilerType`?: https://github.com/llvm/llvm-project/blob/eb812d28f542bf0de54c157a7391e446739570cc/lldb/source/Symbol/Type.cpp#L647

What would be the alternative? We might be able to log something or somehow report an error, but I don't think we can exclude the possibility of failure because of how the lazy parsing works.

2. The order of the entries in the debug-map for the test-case matter! Which is worrying because that is probably not a deterministic property we can rely on? If `lib.cpp` and `main.cpp` were swapped we actually run into an entirely different assertion (which I'm looking into separately)! If anyone has ideas on how to make this more deterministic let me know. Perhaps a unit-test for `ParseTypeFromDWARF`/`CompleteTypeFromDWARF` is the best way forward for that?

Maybe, but I don't expect the test to be pretty. What I (sometimes) do in these cases is to make the test bidirectional, i.e., have two compile unit where each one has a definition for a type declared/used in the other one. This way, you're guaranteed to hit the error code path regardless of the order of the two units. This is easier to do if you don't actually need to run the binary (which you probably don't need to do here).

3. Are `nullptr` entries in `GetDIEToType` ever allowed? Or does LLDB ever depend on that? If not, it might be worth enforcing this somehow (maybe even getting rid of `DIE_IS_BEING_PARSED`).

The way I interpret nullptr here is "I've tried parsing this die/type and failed -- don't bother trying again". DIE_IS_BEING_PARSED is "I'm currently parsing this type -- bail out to avoid recursion". We probably can merge these two cases, with the only loss being the inability to distinguish between these two cases. It might be slightly tricky though -- I remember contemplating that when I was working on this a couple of months back, and realizing it's not as easy as I had hoped. I don't remember what the problem was though.

@Michael137
Copy link
Member Author

Michael137 commented Dec 20, 2024

In which case, GetDIEToType().lookup(decl_die) will return a nullptr. This is already a bit iffy because some of the surrounding code assumes we don't call CompleteTypeFromDWARF with a nullptr Type*. E.g., CompleteEnumType blindly dereferences it (though enums will never encounter this because their definition is fetched in ParseEnum, unlike for structures).

Should we bail out early if the Type* is null and return false to tell SymbolFileDWARFDebugMap::CompleteType that it can not complete this type and let it iterate to the symbol file that has the entry in its map.

Just tried this. Unfortunately we remove the CompilerType from the GetForwardDeclCompilerTypeToDIE map inside of CompleteType. So in the next iteration of the SymbolFileDWARFDebugMap::CompleteType loop, it won't ever call into SymbolFileDWARF::CompleteType again :(

I guess we could put it back on failure? Not sure if it's worth the potential trouble

@Michael137
Copy link
Member Author

Michael137 commented Dec 20, 2024

As an alternative, we might be able to make the die-to-type map shared between all symbol files in a debug map (similar to how "unique dwarf ast type map" is shared).

Yea I considered that, but I was concerned with the unforeseen complications this might have. Particularly I'm not sure how the TypeSP ownership should be managed. IIUC, a TypeSP is solely supposed to be owned by a single SymbolFile (MakeType kind of binds the two together). But then if we bundled TypeSPs from different modules into a single map on SymbolFileDWARFDebugMap, could we run into issues if one of the SymbolFile's gets torn down (if that can happen)? Though I'm also not certain that we only store TypeSPs inside the SymbolFileDWARF. I don't think there's anything asserting/enforcing this.

@Michael137
Copy link
Member Author

Maybe, but I don't expect the test to be pretty. What I (sometimes) do in these cases is to make the test bidirectional, i.e., have two compile unit where each one has a definition for a type declared/used in the other one. This way, you're guaranteed to hit the error code path regardless of the order of the two units. This is easier to do if you don't actually need to run the binary (which you probably don't need to do here).

Great idea, let me try this out

@Michael137
Copy link
Member Author

Michael137 commented Dec 20, 2024

As an alternative, we might be able to make the die-to-type map shared between all symbol files in a debug map (similar to how "unique dwarf ast type map" is shared).

Yea I considered that, but I was concerned with the unforeseen complications this might have. Particularly I'm not sure how the TypeSP ownership should be managed. IIUC, a TypeSP is solely supposed to be owned by a single SymbolFile (MakeType kind of binds the two together). But then if we bundled TypeSPs from different modules into a single map on SymbolFileDWARFDebugMap, could we run into issues if one of the SymbolFile's gets torn down (if that can happen)? Though I'm also not certain that we only store TypeSPs inside the SymbolFileDWARF. I don't think there's anything asserting/enforcing this.

On second thought, this does feel more consistent with how all the other bookkeeping structures already work. And re. the ownership, my current patch pretty much achieves the same thing (but in a less apparent manner). I'm still mixing Type* into GetDIEToType of a different SymbolFile. And the UniqueDWARFASTType owns a TypeSP, and we are sharing those via a debug-map already. So I'll go with this suggestion. Confirmed that this does fix the test attached test-case

@Michael137 Michael137 changed the title [lldb][SymbolFileDWARF] CompleteType: Lookup type in the declaration DIE's SymbolFile [lldb][SymbolFileDWARF] Share GetDIEToType between SymbolFiles of a SymbolFileDWARFDebugMap Dec 20, 2024
@ZequanWu
Copy link
Contributor

Just tried this. Unfortunately we remove the CompilerType from the GetForwardDeclCompilerTypeToDIE map inside of CompleteType. So in the next iteration of the SymbolFileDWARFDebugMap::CompleteType loop, it won't ever call into SymbolFileDWARF::CompleteType again :(

I guess we could put it back on failure? Not sure if it's worth the potential trouble

We can move the the retrieval and null check of Type* before we remove it from GetForwardDeclCompilerTypeToDIE(), right?

@ZequanWu
Copy link
Contributor

As an alternative, we might be able to make the die-to-type map shared between all symbol files in a debug map (similar to how "unique dwarf ast type map" is shared).

Yea I considered that, but I was concerned with the unforeseen complications this might have. Particularly I'm not sure how the TypeSP ownership should be managed. IIUC, a TypeSP is solely supposed to be owned by a single SymbolFile (MakeType kind of binds the two together). But then if we bundled TypeSPs from different modules into a single map on SymbolFileDWARFDebugMap, could we run into issues if one of the SymbolFile's gets torn down (if that can happen)? Though I'm also not certain that we only store TypeSPs inside the SymbolFileDWARF. I don't think there's anything asserting/enforcing this.

On second thought, this does feel more consistent with how all the other bookkeeping structures already work. And re. the ownership, my current patch pretty much achieves the same thing (but in a less apparent manner). I'm still mixing Type* into GetDIEToType of a different SymbolFile. And the UniqueDWARFASTType owns a TypeSP, and we are sharing those via a debug-map already. So I'll go with this suggestion. Confirmed that this does fix the test attached test-case

Good to hear that works and making TypeSP shared among different SymbolFile make life easier.


self.assertEqual(ref.GetCanonicalType(), var_type)

def test_definition_in_main(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe merge these into one so we don't end up building the binary twice (for each debug info flavour)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point, done

@Michael137 Michael137 force-pushed the bugfix/lldb-typedef-to-outer branch from 8397543 to d87abe7 Compare December 23, 2024 10:33
@Michael137 Michael137 merged commit 28d1490 into llvm:main Dec 23, 2024
7 checks passed
@Michael137 Michael137 deleted the bugfix/lldb-typedef-to-outer branch December 23, 2024 11:51
Michael137 added a commit that referenced this pull request Dec 23, 2024
…queDWARFASTTypeList in C++ (#120809)

This addresses an issue encountered when investigating
#120569.

In `ParseTypeFromDWARF`, we insert the parsed type into
`UniqueDWARFASTTypeMap` using the typename and `Declaration` obtained
with `GetUniqueTypeNameAndDeclaration`. For C++
`GetUniqueTypeNameAndDeclaration` will zero the `Declaration` info
(presumably because forward declaration may not have it, and for C++,
ODR means we can rely on fully qualified typenames for uniqueing). When
we then called `CompleteType`, we would first `MapDeclDIEToDefDIE`,
updating the `UniqueDWARFASTType` we inserted previously with the
`Declaration` of the definition DIE (without zeroing it). We would then
call into `ParseTypeFromDWARF` for the same type again, and search the
`UniqueDWARFASTTypeMap`. But we do the search using a zeroed declaration
we get from `GetUniqueTypeNameAndDeclaration`.

This particular example was fixed by
#120569 but this still feels a
little inconsistent. I'm not entirely sure we can get into that
situation after that patch anymore. So the only test-case I could come
up with was the unit-test which calls `MapDeclDIEToDefDIE` directly.
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Dec 23, 2024
…ymbolFileDWARFDebugMap (llvm#120569)

The problem here manifests as follows:
1. We are stopped in main.o, so the first `ParseTypeFromDWARF` on
`FooImpl<char>` gets called on `main.o`'s SymbolFile. This adds a
mapping from *declaration die* -> `TypeSP` into `main.o`'s
`GetDIEToType` map.
2. We then `CompleteType(FooImpl<char>)`. Depending on the order of
entries in the debug-map, this might call `CompleteType` on `lib.o`'s
SymbolFile. In which case, `GetDIEToType().lookup(decl_die)` will return
a `nullptr`. This is already a bit iffy because some of the surrounding
code assumes we don't call `CompleteTypeFromDWARF` with a `nullptr`
`Type*`. E.g., `CompleteEnumType` blindly dereferences it (though enums
will never encounter this because their definition is fetched in
ParseEnum, unlike for structures).
3. While in `CompleteTypeFromDWARF`, we call `ParseTypeFromDWARF` again.
This will parse the member function `FooImpl::Create` and its return
type which is a typedef to `FooImpl*`. But now we're inside `lib.o`'s
SymbolFile, so we call it on the definition DIE. In step (2) we just
inserted a `nullptr` into `GetDIEToType` for the definition DIE, so we
trivially return a `nullptr` from `ParseTypeFromDWARF`. Instead of
reporting back this parse failure to the user LLDB trucks on and marks
`FooImpl::Ref` to be `void*`.

This test-case will trigger an assert in `TypeSystemClang::VerifyDecl`
even if we just `frame var` (but only in debug-builds). In release
builds where this function is a no-op, we'll create an incorrect Clang
AST node for the `Ref` typedef.

The proposed fix here is to share the `GetDIEToType` map between
SymbolFiles if a debug-map exists.

**Alternatives considered**
* Check the `GetDIEToType` map of the `SymbolFile` that the declaration
DIE belongs to. The assumption here being that if we called
`ParseTypeFromDWARF` on a declaration, the `GetDIEToType` map that the
result was inserted into was the one on that DIE's SymbolFile. This was
the first version of this patch, but that felt like a weaker version
sharing the map. It complicates the code in `CompleteType` and is less
consistent with the other bookkeeping structures we already share
between SymbolFiles
* Return from `SymbolFileDWARF::CompleteType` if there is no type in the
current `GetDIEToType`. Then `SymbolFileDWARFDebugMap::CompleteType`
could continue to the next `SymbolFile` which does own the type. But
that didn't quite work because we remove the
`GetForwardCompilerTypeToDie` entry in `SymbolFile::CompleteType`, which
`SymbolFileDWARFDebugMap::CompleteType` relies upon for iterating

(cherry picked from commit 28d1490)
Michael137 added a commit to swiftlang/llvm-project that referenced this pull request Dec 23, 2024
…queDWARFASTTypeList in C++ (llvm#120809)

This addresses an issue encountered when investigating
llvm#120569.

In `ParseTypeFromDWARF`, we insert the parsed type into
`UniqueDWARFASTTypeMap` using the typename and `Declaration` obtained
with `GetUniqueTypeNameAndDeclaration`. For C++
`GetUniqueTypeNameAndDeclaration` will zero the `Declaration` info
(presumably because forward declaration may not have it, and for C++,
ODR means we can rely on fully qualified typenames for uniqueing). When
we then called `CompleteType`, we would first `MapDeclDIEToDefDIE`,
updating the `UniqueDWARFASTType` we inserted previously with the
`Declaration` of the definition DIE (without zeroing it). We would then
call into `ParseTypeFromDWARF` for the same type again, and search the
`UniqueDWARFASTTypeMap`. But we do the search using a zeroed declaration
we get from `GetUniqueTypeNameAndDeclaration`.

This particular example was fixed by
llvm#120569 but this still feels a
little inconsistent. I'm not entirely sure we can get into that
situation after that patch anymore. So the only test-case I could come
up with was the unit-test which calls `MapDeclDIEToDefDIE` directly.

(cherry picked from commit c660b28)
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 10, 2025
…through UniqueDWARFASTTypeList in C++ (#120809)

This addresses an issue encountered when investigating
llvm/llvm-project#120569.

In `ParseTypeFromDWARF`, we insert the parsed type into
`UniqueDWARFASTTypeMap` using the typename and `Declaration` obtained
with `GetUniqueTypeNameAndDeclaration`. For C++
`GetUniqueTypeNameAndDeclaration` will zero the `Declaration` info
(presumably because forward declaration may not have it, and for C++,
ODR means we can rely on fully qualified typenames for uniqueing). When
we then called `CompleteType`, we would first `MapDeclDIEToDefDIE`,
updating the `UniqueDWARFASTType` we inserted previously with the
`Declaration` of the definition DIE (without zeroing it). We would then
call into `ParseTypeFromDWARF` for the same type again, and search the
`UniqueDWARFASTTypeMap`. But we do the search using a zeroed declaration
we get from `GetUniqueTypeNameAndDeclaration`.

This particular example was fixed by
llvm/llvm-project#120569 but this still feels a
little inconsistent. I'm not entirely sure we can get into that
situation after that patch anymore. So the only test-case I could come
up with was the unit-test which calls `MapDeclDIEToDefDIE` directly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants