-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb][DWARFASTParserClang] Resolve nested types when parsing structures #66879
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lldb ChangesMigrated from https://reviews.llvm.org/D156774 Full diff: https://github.com/llvm/llvm-project/pull/66879.diff 3 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 37fb16d4e0351c9..fc340da8eba0763 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2150,14 +2150,14 @@ bool DWARFASTParserClang::CompleteRecordType(const DWARFDIE &die,
std::vector<std::unique_ptr<clang::CXXBaseSpecifier>> bases;
// Parse members and base classes first
- std::vector<DWARFDIE> member_function_dies;
+ std::vector<DWARFDIE> member_function_and_type_dies;
DelayedPropertyList delayed_properties;
- ParseChildMembers(die, clang_type, bases, member_function_dies,
+ ParseChildMembers(die, clang_type, bases, member_function_and_type_dies,
delayed_properties, default_accessibility, layout_info);
- // Now parse any methods if there were any...
- for (const DWARFDIE &die : member_function_dies)
+ // Now parse any methods or nested types if there were any...
+ for (const DWARFDIE &die : member_function_and_type_dies)
dwarf->ResolveType(die);
if (type_is_objc_object_or_interface) {
@@ -3153,7 +3153,7 @@ void DWARFASTParserClang::ParseSingleMember(
bool DWARFASTParserClang::ParseChildMembers(
const DWARFDIE &parent_die, CompilerType &class_clang_type,
std::vector<std::unique_ptr<clang::CXXBaseSpecifier>> &base_classes,
- std::vector<DWARFDIE> &member_function_dies,
+ std::vector<DWARFDIE> &member_function_and_type_dies,
DelayedPropertyList &delayed_properties,
const AccessType default_accessibility,
ClangASTImporter::LayoutInfo &layout_info) {
@@ -3189,8 +3189,11 @@ bool DWARFASTParserClang::ParseChildMembers(
break;
case DW_TAG_subprogram:
+ case DW_TAG_enumeration_type:
+ case DW_TAG_structure_type:
+ case DW_TAG_union_type:
// Let the type parsing code handle this one for us.
- member_function_dies.push_back(die);
+ member_function_and_type_dies.push_back(die);
break;
case DW_TAG_inheritance:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
index 88bfc490e890744..d4218959e61a8fa 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -157,7 +157,7 @@ class DWARFASTParserClang : public DWARFASTParser {
bool ParseChildMembers(
const DWARFDIE &die, lldb_private::CompilerType &class_compiler_type,
std::vector<std::unique_ptr<clang::CXXBaseSpecifier>> &base_classes,
- std::vector<DWARFDIE> &member_function_dies,
+ std::vector<DWARFDIE> &member_function_and_type_dies,
DelayedPropertyList &delayed_properties,
const lldb::AccessType default_accessibility,
lldb_private::ClangASTImporter::LayoutInfo &layout_info);
diff --git a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
index 20a085741f73d00..d47f9b636ec15eb 100644
--- a/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
+++ b/lldb/unittests/SymbolFile/DWARF/DWARFASTParserClangTests.cpp
@@ -462,3 +462,96 @@ TEST_F(DWARFASTParserClangTests, TestDefaultTemplateParamParsing) {
}
}
}
+
+TEST_F(DWARFASTParserClangTests, EnsureNestedTypesOfTypeAreParsed) {
+ const char *yamldata = R"(
+--- !ELF
+FileHeader:
+ Class: ELFCLASS64
+ Data: ELFDATA2LSB
+ Type: ET_EXEC
+ Machine: EM_386
+DWARF:
+ debug_str:
+ - Info
+ - B
+ - C
+ - Mask
+ - Enum
+ debug_abbrev:
+ - Table:
+ - Code: 0x1
+ Tag: DW_TAG_compile_unit
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_language
+ Form: DW_FORM_data2
+ - Code: 0x2
+ Tag: DW_TAG_structure_type
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Code: 0x3
+ Tag: DW_TAG_union_type
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Code: 0x4
+ Tag: DW_TAG_structure_type
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Code: 0x5
+ Tag: DW_TAG_enumeration_type
+ Children: DW_CHILDREN_yes
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ - Code: 0x6
+ Tag: DW_TAG_enumerator
+ Children: DW_CHILDREN_no
+ Attributes:
+ - Attribute: DW_AT_name
+ Form: DW_FORM_strp
+ debug_info:
+ - Version: 5
+ AddrSize: 8
+ UnitType: DW_UT_compile
+ Entries:
+ - AbbrCode: 0x1
+ Values:
+ - Value: 0x21
+ - AbbrCode: 0x2
+ Values:
+ - Value: 0x0
+ - AbbrCode: 0x3
+ Values:
+ - Value: 0x5
+ - AbbrCode: 0x4
+ Values:
+ - Value: 0x7
+ - AbbrCode: 0x5
+ Values:
+ - Value: 0x9
+ - AbbrCode: 0x6
+ Values:
+ - Value: 0xE
+)";
+
+ YAMLModuleTester t(yamldata);
+
+ DWARFUnit *unit = t.GetDwarfUnit();
+ ASSERT_NE(unit, nullptr);
+ const DWARFDebugInfoEntry *cu_entry = unit->DIE().GetDIE();
+ ASSERT_EQ(cu_entry->Tag(), DW_TAG_compile_unit);
+ DWARFDIE cu_die(unit, cu_entry);
+
+ auto holder = std::make_unique<clang_utils::TypeSystemClangHolder>("ast");
+ auto &ast_ctx = *holder->GetAST();
+ DWARFASTParserClangStub ast_parser(ast_ctx);
+
+ EXPECT_TRUE(false);
+}
|
auto &ast_ctx = *holder->GetAST(); | ||
DWARFASTParserClangStub ast_parser(ast_ctx); | ||
|
||
EXPECT_TRUE(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Michael137 Can you help me with the rest of this test? I'm changing function that complete (presumably) incomplete CompilerType's, but I'm not sure how to gather all required input to call it. Maybe that's not even the right way to write the test I need to write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for continuing!
I'll have a look at this unit-test locally. We can try doing ParseTypeFromDWARF
on the Info
DIE to do the DWARF parsing (we do it elsewhere in the test too). Then call CompleteType
on the resulting type to go through the new codepath you added.
@@ -462,3 +462,96 @@ TEST_F(DWARFASTParserClangTests, TestDefaultTemplateParamParsing) { | |||
} | |||
} | |||
} | |||
|
|||
TEST_F(DWARFASTParserClangTests, EnsureNestedTypesOfTypeAreParsed) { | |||
const char *yamldata = R"( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this YAML way of describing debug info quite unfriendly, especially compared to the output of llvm-dwarfdump
. Anyway, I tried to encode debug info for the following snippet:
struct Info {
enum Mask : uintptr_t {
Enum = ~(uintptr_t)(((intptr_t)1 << 3) - 1);
};
union B {};
struct C {};
}
Filtered out llvm-dwarfdump output:
0x0000000c: DW_TAG_compile_unit
DW_AT_language [DW_FORM_data2] (DW_LANG_C_plus_plus_14)
0x0000002c: DW_TAG_structure_type
DW_AT_calling_convention [DW_FORM_data1] (DW_CC_pass_by_value)
DW_AT_name [DW_FORM_strx1] ("Info")
DW_AT_byte_size [DW_FORM_data1] (0x04)
0x0000003a: DW_TAG_union_type
DW_AT_calling_convention [DW_FORM_data1] (DW_CC_pass_by_value)
DW_AT_name [DW_FORM_strx1] ("B")
DW_AT_byte_size [DW_FORM_data1] (0x01)
0x00000049: DW_TAG_structure_type
DW_AT_calling_convention [DW_FORM_data1] (DW_CC_pass_by_value)
DW_AT_name [DW_FORM_strx1] ("C")
DW_AT_byte_size [DW_FORM_data1] (0x01)
0x0000003b: DW_TAG_enumeration_type
DW_AT_type [DW_FORM_ref4] (0x0000005e "uintptr_t")
DW_AT_name [DW_FORM_strx1] ("Mask")
DW_AT_byte_size [DW_FORM_data1] (0x08)
0x00000044: DW_TAG_enumerator
DW_AT_name [DW_FORM_strx1] ("Enum")
DW_AT_const_value [DW_FORM_udata] (18446744073709551608)
Form: DW_FORM_strp | ||
- Code: 0x5 | ||
Tag: DW_TAG_enumeration_type | ||
Children: DW_CHILDREN_yes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently I have to encode tree-like structure (see C++ code above) in a flat list, which doesn't seem natural. I wonder if YAML I've written is even sensible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the back and forth, but I had a look and I'd suggest writing an API test instead. The yaml looks fine to me and we could use it to write a unit-test as a follow-up, but we'll have to jump through some hoops to get the unit-test down the code-path we want anyway. So an API test would be sufficient.
See reference example: https://github.com/llvm/llvm-project/blob/main/lldb/test/API/lang/cpp/bool/TestCPPBool.py
Put your test program into a main.cpp
and call the relevant member types from the expression evaluator using self.expect_expr()
in the python test file. E.g., for this change the test would like:
class NestedTypesTestCase(TestBase):
def test_with_run_command(self):
"""Test that referencing nested types work in the expression parser"""
self.build()
lldbutil.run_to_source_breakpoint(
self, "// breakpoint 1", lldb.SBFileSpec("main.cpp")
)
self.expect_expr(
"Info::Mask::Enum",
result_type=......,
result_value=....,
)
self.expect_expr("Info::B::val", result_type=..., result_value=...)
Let me know if you need more details on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I'll give it a try.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we need to complete nested types within a type? Seems like we can put that off until later. Right now if you parse a member function, any types it needs will be parsed lazily and only if needed, which is ok. If we start completing all types within types without ever needing to actually use then, it will make debugging slower and cause our memory usage to grow. I didn't see much explanation as to why this is needed in the bug report.
If we have a end to end test that shows the problem you are trying to fix it would help as well. |
The motivating example is something like:
LLDB will first resolve the |
What we do for classes is we create forward declarations for them at first. As soon as someone wants to know more, we complete the type via the external source interface. This works for expressions since this is what precomiled headers do, it also works with our CompilerType as we can ask for the forward type, the layout type, or the complete type. I wonder if we can make this work for enums as well. I am worried about a class that contains many types. Usually when we need to know something about a type, like in your example above, we will get a callback to the ClangExternalASTSourceCallbacks methods:
So if you typed Info::Mask::Enum, the expression parser will first ask about "Info" with no containing decl context (or maybe it specifies the translation unit decl context to indicate it is a root type) and we will return the "struct Info". Then it would call FindExternalVisibleDeclsByName() with the DeclContext set to "struct Info" and we will find "Mask" inside of this type, then it will ask about "Enum" within the "Info::Mask" type and it should be in the type definition. Maybe some isn't working correctly for enums in this lookup mechanism? |
Yup that was my understanding too. Though last I stepped through clang lookup for this case the call back into LLDB to get |
Thanks, the API test looks good Apologies for the delay, didn't have a chance to follow-up on Greg's concern yet. But will hopefully get the time in the next few days |
Superseded by #68705 |
I recently ran into this issue when looking at new failures in: And this was referred to in bug report: After looking at the code, I believe we do need to parse contained types for a struct/class/union as we don't get an external AST source call for them. So I do believe this is the right thing to do. Even if we do parse the extra types, if we have a class within a class, then it will add a forward declaration to it only, and expand it later if any only if it is needed. I have attached a slightly different version of this patch to #53904 for reference. |
So while you were able to work around issues with the expression parser with #68705, I do believe a modified version of this patch is needed. |
#77029 is a slightly modified version of this patch where it parses the types after the struct/union/class is created. |
Migrated from https://reviews.llvm.org/D156774