-
Notifications
You must be signed in to change notification settings - Fork 14.3k
[lldb] Fix expressions that involve nested structs/classes/unions. #77029
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The LLDB expression parser relies on using the external AST source support in LLDB. This allows us to find a class at the root namespace level, but it wouldn't allow us to find nested classes all of the time. When LLDB finds a class via this mechanism, it would be able to complete this class when needed, but during completion, we wouldn't populate nested types within this class which would prevent us from finding contained types when needed as clang would expect them to be present if a class was completed. When we parse a type for a class, struct or union, we make a forward declaration to the class which can be completed. Now when the class is completed, we also add any contained types to the class' declaration context which now allows these types to be found. If we have a struct that contains a struct, we will add the forward declaration of the contained structure which can be c ompleted later. Having this forward declaration makes it possible for LLDB to find everything it needs now. This should fix an existing issue: llvm#53904 Previously, contained types could be parsed by accident and allow expression to complete successfully. Other times we would have to run an expression multiple times because our old type lookup from our expressions would cau se a type to be parsed, but not used in the current expression, but this would have parsed a type into the containing decl context and the expression might succeed if it is run again.
@llvm/pr-subscribers-lldb Author: Greg Clayton (clayborg) ChangesThe LLDB expression parser relies on using the external AST source support in LLDB. This allows us to find a class at the root namespace level, but it wouldn't allow us to find nested classes all of the time. When LLDB finds a class via this mechanism, it would be able to complete this class when needed, but during completion, we wouldn't populate nested types within this class which would prevent us from finding contained types when needed as clang would expect them to be present if a class was completed. When we parse a type for a class, struct or union, we make a forward declaration to the class which can be completed. Now when the class is completed, we also add any contained types to the class' declaration context which now allows these types to be found. If we have a struct that contains a struct, we will add the forward declaration of the contained structure which can be c ompleted later. Having this forward declaration makes it possible for LLDB to find everything it needs now. This should fix an existing issue: #53904 Previously, contained types could be parsed by accident and allow expression to complete successfully. Other times we would have to run an expression multiple times because our old type lookup from our expressions would cau se a type to be parsed, but not used in the current expression, but this would have parsed a type into the containing decl context and the expression might succeed if it is run again. Full diff: https://github.com/llvm/llvm-project/pull/77029.diff 5 Files Affected:
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 3e08f2550081f2..233de2f1ac58cc 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2150,6 +2150,7 @@ bool DWARFASTParserClang::CompleteRecordType(const DWARFDIE &die,
SymbolFileDWARF *dwarf = die.GetDWARF();
ClangASTImporter::LayoutInfo layout_info;
+ std::vector<DWARFDIE> contained_type_dies;
if (die.HasChildren()) {
const bool type_is_objc_object_or_interface =
@@ -2175,7 +2176,8 @@ bool DWARFASTParserClang::CompleteRecordType(const DWARFDIE &die,
DelayedPropertyList delayed_properties;
ParseChildMembers(die, clang_type, bases, member_function_dies,
- delayed_properties, default_accessibility, layout_info);
+ contained_type_dies, delayed_properties,
+ default_accessibility, layout_info);
// Now parse any methods if there were any...
for (const DWARFDIE &die : member_function_dies)
@@ -2231,6 +2233,13 @@ bool DWARFASTParserClang::CompleteRecordType(const DWARFDIE &die,
if (record_decl)
GetClangASTImporter().SetRecordLayout(record_decl, layout_info);
}
+ // Now parse all contained types inside of the class. We make forward
+ // declarations to all classes, but we need the CXXRecordDecl to have decls
+ // for all contained types because we don't get asked for them via the
+ // external AST support.
+ for (const DWARFDIE &die : contained_type_dies)
+ dwarf->ResolveType(die);
+
return (bool)clang_type;
}
@@ -2260,7 +2269,7 @@ bool DWARFASTParserClang::CompleteTypeFromDWARF(const DWARFDIE &die,
// Disable external storage for this type so we don't get anymore
// clang::ExternalASTSource queries for this type.
- m_ast.SetHasExternalStorage(clang_type.GetOpaqueQualType(), false);
+ //m_ast.SetHasExternalStorage(clang_type.GetOpaqueQualType(), false);
if (!die)
return false;
@@ -3106,10 +3115,39 @@ void DWARFASTParserClang::ParseSingleMember(
std::make_pair(field_decl, field_bit_offset));
}
+static bool IsTypeTag(dw_tag_t tag) {
+ switch (tag) {
+ case DW_TAG_typedef:
+ case DW_TAG_base_type:
+ case DW_TAG_pointer_type:
+ case DW_TAG_reference_type:
+ case DW_TAG_rvalue_reference_type:
+ case DW_TAG_const_type:
+ case DW_TAG_restrict_type:
+ case DW_TAG_volatile_type:
+ case DW_TAG_atomic_type:
+ case DW_TAG_unspecified_type:
+ case DW_TAG_structure_type:
+ case DW_TAG_union_type:
+ case DW_TAG_class_type:
+ case DW_TAG_enumeration_type:
+ case DW_TAG_inlined_subroutine:
+ case DW_TAG_subprogram:
+ case DW_TAG_subroutine_type:
+ case DW_TAG_array_type:
+ case DW_TAG_ptr_to_member_type:
+ return true;
+ default:
+ break;
+ }
+ return false;
+}
+
bool DWARFASTParserClang::ParseChildMembers(
const DWARFDIE &parent_die, CompilerType &class_clang_type,
std::vector<std::unique_ptr<clang::CXXBaseSpecifier>> &base_classes,
std::vector<DWARFDIE> &member_function_dies,
+ std::vector<DWARFDIE> &contained_type_dies,
DelayedPropertyList &delayed_properties,
const AccessType default_accessibility,
ClangASTImporter::LayoutInfo &layout_info) {
@@ -3159,6 +3197,8 @@ bool DWARFASTParserClang::ParseChildMembers(
break;
default:
+ if (IsTypeTag(tag))
+ contained_type_dies.push_back(die);
break;
}
}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
index 3e28e54d6220db..8d4af203bb2871 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.h
@@ -175,6 +175,7 @@ class DWARFASTParserClang : public lldb_private::plugin::dwarf::DWARFASTParser {
lldb_private::CompilerType &class_compiler_type,
std::vector<std::unique_ptr<clang::CXXBaseSpecifier>> &base_classes,
std::vector<lldb_private::plugin::dwarf::DWARFDIE> &member_function_dies,
+ std::vector<lldb_private::plugin::dwarf::DWARFDIE> &contained_type_dies,
DelayedPropertyList &delayed_properties,
const lldb::AccessType default_accessibility,
lldb_private::ClangASTImporter::LayoutInfo &layout_info);
diff --git a/lldb/test/API/commands/expression/nested/Makefile b/lldb/test/API/commands/expression/nested/Makefile
new file mode 100644
index 00000000000000..99998b20bcb050
--- /dev/null
+++ b/lldb/test/API/commands/expression/nested/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/commands/expression/nested/TestNestedExpressions.py b/lldb/test/API/commands/expression/nested/TestNestedExpressions.py
new file mode 100644
index 00000000000000..7f194e921e5628
--- /dev/null
+++ b/lldb/test/API/commands/expression/nested/TestNestedExpressions.py
@@ -0,0 +1,70 @@
+"""
+Test calling an expression with errors that a FixIt can fix.
+"""
+
+import lldb
+from lldbsuite.test.decorators import *
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test import lldbutil
+
+
+class NestedExpressions(TestBase):
+
+ def test_enum_in_nested_structs(self):
+ """
+ Test expressions that references an enumeration in nested structs.
+ """
+ self.build()
+ exe_path = self.getBuildArtifact("a.out")
+ target = self.dbg.CreateTarget(exe_path)
+ self.assertTrue(target, "Target: %s is not valid." % (exe_path))
+ self.expect_expr("A::B::C::EnumType::Eleven",
+ result_type="A::B::C::EnumType",
+ result_value="Eleven")
+
+ def test_struct_in_nested_structs(self):
+ """
+ Test expressions that references a struct in nested structs.
+ """
+ self.build()
+ exe_path = self.getBuildArtifact("a.out")
+ target = self.dbg.CreateTarget(exe_path)
+ self.assertTrue(target, "Target: %s is not valid." % (exe_path))
+ self.expect_expr("sizeof(A::B::C)", result_value="1")
+ self.expect_expr("sizeof(A::B)", result_value="2")
+
+ def test_static_in_nested_structs(self):
+ """
+ Test expressions that references a static variable in nested structs.
+ """
+ self.build()
+ (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+ self, "Stop here to evaluate expressions", lldb.SBFileSpec("main.cpp")
+ )
+ self.expect_expr("A::B::C::enum_static",
+ result_type="A::B::C::EnumType",
+ result_value="Eleven")
+
+ def test_enum_in_nested_namespaces(self):
+ """
+ Test expressions that references an enumeration in nested namespaces.
+ """
+ self.build()
+ exe_path = self.getBuildArtifact("a.out")
+ target = self.dbg.CreateTarget(exe_path)
+ self.assertTrue(target, "Target: %s is not valid." % (exe_path))
+ self.expect_expr("a::b::c::Color::Blue",
+ result_type="a::b::c::Color",
+ result_value="Blue")
+
+ def test_static_in_nested_namespaces(self):
+ """
+ Test expressions that references an enumeration in nested namespaces.
+ """
+ self.build()
+ (target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
+ self, "Stop here to evaluate expressions", lldb.SBFileSpec("main.cpp")
+ )
+ self.expect_expr("a::b::c::d",
+ result_type="int",
+ result_value="12")
diff --git a/lldb/test/API/commands/expression/nested/main.cpp b/lldb/test/API/commands/expression/nested/main.cpp
new file mode 100644
index 00000000000000..9f8baaf35510d2
--- /dev/null
+++ b/lldb/test/API/commands/expression/nested/main.cpp
@@ -0,0 +1,35 @@
+namespace a {
+ namespace b {
+ namespace c {
+ static int d = 12;
+ enum Color {
+ Red, Green, Blue
+ };
+ }
+ }
+}
+
+struct A {
+ int _a = 'a';
+ struct B {
+ short _b = 'b';
+ struct C {
+ char _c = 'c';
+ enum EnumType : int {
+ Eleven = 11
+ };
+ static EnumType enum_static;
+ };
+ };
+};
+
+A::B::C::EnumType A::B::C::enum_static = A::B::C::Eleven;
+
+int foo() {
+ a::b::c::Color color = a::b::c::Blue;
+ return A::B::C::enum_static == a::b::c::d && ((int)color == 0);
+}
+
+int main() {
+ return foo(); // Stop here to evaluate expressions
+}
|
You can test this locally with the following command:darker --check --diff -r e68a0320a1592bf408ac6458efa2d1c548cfed7a...e5aac3b38124117c6e5077f1ae077378984aaf25 lldb/test/API/commands/expression/nested/TestNestedExpressions.py View the diff from darker here.--- TestNestedExpressions.py 2024-01-05 00:17:44.000000 +0000
+++ TestNestedExpressions.py 2024-01-05 17:10:41.544191 +0000
@@ -7,64 +7,65 @@
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil
class NestedExpressions(TestBase):
-
def test_enum_in_nested_structs(self):
"""
- Test expressions that references an enumeration in nested structs.
+ Test expressions that references an enumeration in nested structs.
"""
self.build()
exe_path = self.getBuildArtifact("a.out")
target = self.dbg.CreateTarget(exe_path)
self.assertTrue(target, "Target: %s is not valid." % (exe_path))
- self.expect_expr("A::B::C::EnumType::Eleven",
- result_type="A::B::C::EnumType",
- result_value="Eleven")
+ self.expect_expr(
+ "A::B::C::EnumType::Eleven",
+ result_type="A::B::C::EnumType",
+ result_value="Eleven",
+ )
def test_struct_in_nested_structs(self):
"""
- Test expressions that references a struct in nested structs.
+ Test expressions that references a struct in nested structs.
"""
self.build()
exe_path = self.getBuildArtifact("a.out")
target = self.dbg.CreateTarget(exe_path)
self.assertTrue(target, "Target: %s is not valid." % (exe_path))
self.expect_expr("sizeof(A::B::C)", result_value="1")
self.expect_expr("sizeof(A::B)", result_value="2")
def test_static_in_nested_structs(self):
"""
- Test expressions that references a static variable in nested structs.
+ Test expressions that references a static variable in nested structs.
"""
self.build()
(target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
self, "Stop here to evaluate expressions", lldb.SBFileSpec("main.cpp")
)
- self.expect_expr("A::B::C::enum_static",
- result_type="A::B::C::EnumType",
- result_value="Eleven")
+ self.expect_expr(
+ "A::B::C::enum_static",
+ result_type="A::B::C::EnumType",
+ result_value="Eleven",
+ )
def test_enum_in_nested_namespaces(self):
"""
- Test expressions that references an enumeration in nested namespaces.
+ Test expressions that references an enumeration in nested namespaces.
"""
self.build()
exe_path = self.getBuildArtifact("a.out")
target = self.dbg.CreateTarget(exe_path)
self.assertTrue(target, "Target: %s is not valid." % (exe_path))
- self.expect_expr("a::b::c::Color::Blue",
- result_type="a::b::c::Color",
- result_value="Blue")
+ self.expect_expr(
+ "a::b::c::Color::Blue", result_type="a::b::c::Color", result_value="Blue"
+ )
def test_static_in_nested_namespaces(self):
"""
- Test expressions that references an enumeration in nested namespaces.
+ Test expressions that references an enumeration in nested namespaces.
"""
self.build()
(target, process, thread, bkpt) = lldbutil.run_to_source_breakpoint(
self, "Stop here to evaluate expressions", lldb.SBFileSpec("main.cpp")
)
- self.expect_expr("a::b::c::d",
- result_type="int",
- result_value="12")
+ self.expect_expr("a::b::c::d", result_type="int", result_value="12")
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
This patch was in response to this issue: And is related to #66879, but does things a bit differently. |
break; | ||
} | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already implemented in llvm::dwarf::isType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to find this function, but was able to find it because llvm::dwarf::isType
uses the #include style to make the contents which is not readable or grep'able:
inline bool isType(Tag T) {
switch (T) {
default:
return false;
#define HANDLE_DW_TAG(ID, NAME, VERSION, VENDOR, KIND) \
case DW_TAG_##NAME: \
return (KIND == DW_KIND_TYPE);
#include "llvm/BinaryFormat/Dwarf.def"
}
}
No wonder I couldn't find it with a code grep! Thanks for the pointer, I will use this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(kudos to @felipepiovezan who coincidentally told me about it for something else today morning)
@@ -2260,7 +2269,7 @@ bool DWARFASTParserClang::CompleteTypeFromDWARF(const DWARFDIE &die, | |||
|
|||
// Disable external storage for this type so we don't get anymore | |||
// clang::ExternalASTSource queries for this type. | |||
m_ast.SetHasExternalStorage(clang_type.GetOpaqueQualType(), false); | |||
//m_ast.SetHasExternalStorage(clang_type.GetOpaqueQualType(), false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on why this is necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be in here and I will revert this. I was testing to see if I removed this, if I would get a find 'B' in 'A'
call to the ClangASTSource::FindExternalVisibleDecls(...)
, but it never happened. I was hoping we could just leave the external AST source enabled for classes and that would allow us to find the contained types efficiently, so that was my original approach.
Fixes: - revert the code that commetned out turning off external AST for a class, this was from initial testing - remove IsTypeTag() and use llvm::dwarf::isType()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for the fix! I did test this patch with both our regression issue, and also allowing us to remove the double-lookup working around #53904 and can confirm it addressed both issues. Thanks again! |
Glad it worked! It will also be a huge improvement in the expression parser and avoid these weird expression parser bugs where it doesn't work sometimes and does other times, so it was totally worth fixing. I am also glad to know that the type lookups are being so much more efficient. Having a repro case is always the best way to help us fix bugs, so thanks for posting the details so we can repro easily. |
One of the new tests fails on Windows:
But given the other issues we've had lately, it's not surprising. I've skipped it for now - ba4cf31. |
…lvm#77029) The LLDB expression parser relies on using the external AST source support in LLDB. This allows us to find a class at the root namespace level, but it wouldn't allow us to find nested classes all of the time. When LLDB finds a class via this mechanism, it would be able to complete this class when needed, but during completion, we wouldn't populate nested types within this class which would prevent us from finding contained types when needed as clang would expect them to be present if a class was completed. When we parse a type for a class, struct or union, we make a forward declaration to the class which can be completed. Now when the class is completed, we also add any contained types to the class' declaration context which now allows these types to be found. If we have a struct that contains a struct, we will add the forward declaration of the contained structure which can be c ompleted later. Having this forward declaration makes it possible for LLDB to find everything it needs now. This should fix an existing issue: llvm#53904 Previously, contained types could be parsed by accident and allow expression to complete successfully. Other times we would have to run an expression multiple times because our old type lookup from our expressions would cau se a type to be parsed, but not used in the current expression, but this would have parsed a type into the containing decl context and the expression might succeed if it is run again. (cherry picked from commit e42edb5)
Turns out it actually got worse, see #79668 - now whether or not a given nested type is accessible via name lookup depends on which copy of the outer type is loaded first & there's no workaround that I know of - the lookup then fails/succeeds permanently. Be good to get this addressed - as now it's not possible to lookup these nested members at all reliably (even the "repeat the lookup" workaround doesn't work - maybe there are other workarounds? But none that I can think of) |
The main issue we have is they way that clang works currently with the external AST source. We currently create a forward declaration type "struct Foo;" and we allow it to be completed once. Once it is completed clang expects the struct type to have everything that it needs, like all contained types (forward decls are allowed), otherwise we will never find that type as the external AST source thinks it is done. I would like to avoid having to find all copies of a struct type and parse all contained types in all of matching entries from everywhere in the DWARF if possible. If we did this then we would be parsing tons of debug info to add any and all potentially contained types and that would cause way too much debug info to be parsed. The best solution I can think of is to have the compile continue to call the external AST source to say "please find this type within this struct/class". These calls don't happen today, but if they did, we could do this very efficiently without having to populate all contained types into every struct/class anytime we complete the type. So right now the flow is:
The new flow could be:
The last item here would require modifications to clang which could be debugger specific changes that are only enabled in a special debug compiler mode so they don't affect the actual compiler. Thoughts? |
How'd this work before your recent changes, then - when each repeated query would get one level further down in the nesting? How'd that work given the clang limitations you're describing? In any case, the extra clang requirements here seem like they might be of uncertain cost/maintainability (if it's only updating one place that everyone calls through - great, but if it's updating multiple callers, and the risk that new callers miss this handling - that seems like a maintainability problem) which is worrying. So, I'd be uncertain about that direction without more info - and with that info, if it is just one codepath, yeah, maybe it's quick enough to address the regression. But if it'd require substantial reengineering to clang to get this working - this seems an unfortunate regression to carry in the interim, and it might be worth reverting to the previous (differently buggy, but at least work-around-able) behavior. And if this would require updating many callers in clang (unless such updates include an API change that would make it difficult to accidentally introduce a new caller that didn't handle things correctly for lldb), I'd worry about the future stability of such a change & might be inclined towards the less efficient "search all the DWARF" thing. |
It was because we were actually parsing and converting many extra types. Anytime we did a type query for a given type, we would end up converting all of those types into the clang AST. So for
Yeah, it would be great if anyone with compiler experience could check and see if this is controllable and easy to fix. I was thinking we could add new APIs to the External AST that are marked as "only override these if you are a debugger"
I would rather have the current problem over the previous one. Are we mostly worried about template classes here? Or do all classes, even non templated ones, have this same issue?
I mean, in the expression parser we clearly say with |
To clarify, the regression @dwblaikie reported in #79668 was introduced in #74786, not in this current PR. IIUC, this PR just partially fixed #53904?
Following overview of the clang<->lldb call-chain might be useful: #53904 (comment). Not sure I fully understand the "we leave the external AST enabled so we can get called for contained types" idea. Happy to chat about how to make this work though |
When we parse a class, we enable the external AST support on the CXXRecordDecl object. When we are asked to complete the type it will eventually make it down into
So we never get a query again to find anything inside of this class. That being said, I am not sure anything ever was doing another external AST lookup in this class after we completed it. But if we are able to add debugger specific support into the external AST virtual interface, we might be able to gets asked "withing this 'class A' please find me a contained type" and then modify the compiler to use the external AST source when a lookup in a class decl context fails. We currently modified it so that when we parse a struct/union/class type, we also parse all contained child DIE types so that they are available. This is failing when some classes have contained types defined and others don't because we only add the contained types for the struct/class/union DIE we used to create the type. If we can modify the compiler to make an external lookup call only when debugging, we can avoid this need to parse all contained types and wait until the expresion parser asks for that type. |
Right, what we want is for the |
Be great to see D101950 picked up again, so I'm glad to hear that's being looked at. Be nice to get some way to address this regression sooner rather than later... & yeah, I'd argue that classes derived from DWARF are not complete - nested types, member function templates, special member functions, and... static members I think - none of those things are consistently emitted in every copy of a type. So we probably shouldn't turn off the queries back into lldb for a type ever, really. |
The LLDB expression parser relies on using the external AST source support in LLDB. This allows us to find a class at the root namespace level, but it wouldn't allow us to find nested classes all of the time. When LLDB finds a class via this mechanism, it would be able to complete this class when needed, but during completion, we wouldn't populate nested types within this class which would prevent us from finding contained types when needed as clang would expect them to be present if a class was completed. When we parse a type for a class, struct or union, we make a forward declaration to the class which can be completed. Now when the class is completed, we also add any contained types to the class' declaration context which now allows these types to be found. If we have a struct that contains a struct, we will add the forward declaration of the contained structure which can be c ompleted later. Having this forward declaration makes it possible for LLDB to find everything it needs now.
This should fix an existing issue: #53904
Previously, contained types could be parsed by accident and allow expression to complete successfully. Other times we would have to run an expression multiple times because our old type lookup from our expressions would cau se a type to be parsed, but not used in the current expression, but this would have parsed a type into the containing decl context and the expression might succeed if it is run again.