Skip to content

Fix type lookup bug where wrong decl context was being used for a DIE. #94846

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions lldb/source/Plugins/SymbolFile/DWARF/DWARFDIE.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -491,6 +491,18 @@ static void GetTypeLookupContextImpl(DWARFDIE die,
case DW_TAG_base_type:
push_ctx(CompilerContextKind::Builtin, name);
break;
// If any of the tags below appear in the parent chain, stop the decl
// context and return. Prior to these being in here, if a type existed in a
// namespace "a" like "a::my_struct", but we also have a function in that
// same namespace "a" which contained a type named "my_struct", both would
// return "a::my_struct" as the declaration context since the
// DW_TAG_subprogram would be skipped and its parent would be found.
case DW_TAG_compile_unit:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah looks like the early-return was removed in the refactor in #93291?
This LGTM but @labath had plans to align GetTypeLookupContext and GetDeclContext, so wanted him to have a look too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine. The removal of the early return was unintentional, although (see other comment) I don't think it's totally the right thing to do either.

case DW_TAG_type_unit:
case DW_TAG_subprogram:
case DW_TAG_lexical_block:
case DW_TAG_inlined_subroutine:
return;
default:
break;
}
Expand Down
2 changes: 2 additions & 0 deletions lldb/test/API/functionalities/type_types/Makefile
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
CXX_SOURCES := main.cpp
include Makefile.rules
66 changes: 66 additions & 0 deletions lldb/test/API/functionalities/type_types/TestFindTypes.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's also a unit-test for GetTypeLookupContext in lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp. Would be nice to add a test-case there too (if the yaml isn't too much of a hassle)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that. I have a DWARF generator that I wrote in python.

Test the SBModule and SBTarget type lookup APIs to find multiple types.
"""

import lldb
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil


class TypeFindFirstTestCase(TestBase):
def test_find_first_type(self):
"""
Test SBTarget::FindTypes() and SBModule::FindTypes() APIs.
We had issues where our declaration context when finding types was
incorrectly calculated where a type in a namepace, and a type in a
function that was also in the same namespace would match a lookup. For
example:
namespace a {
struct Foo {
int foo;
};
unsigned foo() {
typedef unsigned Foo;
Foo foo = 12;
return foo;
}
} // namespace a
Previously LLDB would calculate the declaration context of "a::Foo"
correctly, but incorrectly calculate the declaration context of "Foo"
from within the foo() function as "a::Foo". Adding tests to ensure this
works correctly.
"""
self.build()
target = self.createTestTarget()
exe_module = target.GetModuleAtIndex(0)
self.assertTrue(exe_module.IsValid())
# Test the SBTarget and SBModule APIs for FindFirstType
for api in [target, exe_module]:
# We should find the "a::Foo" but not the "Foo" type in the function
types = api.FindTypes("a::Foo")
self.assertEqual(types.GetSize(), 1)
type_str0 = str(types.GetTypeAtIndex(0))
self.assertIn('struct Foo {', type_str0)

# When we search by type basename, we should find any type whose
# basename matches "Foo", so "a::Foo" and the "Foo" type in the
# function.
types = api.FindTypes("Foo")
self.assertEqual(types.GetSize(), 2)
type_str0 = str(types.GetTypeAtIndex(0))
type_str1 = str(types.GetTypeAtIndex(1))
# We don't know which order the types will come back as, so
self.assertEqual(set([str(t).split('\n')[0] for t in types]), set(["typedef Foo", "struct Foo {"]))

# When we search by type basename with "::" prepended, we should
# only types in the root namespace which means only "Foo" type in
Comment on lines +60 to +61
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# When we search by type basename with "::" prepended, we should
# only types in the root namespace which means only "Foo" type in
# When we search by type basename with "::" prepended, we should
# return only types in the root namespace which means only "Foo" type in

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just asking why the typedef will be at the root level but not the struct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say this is actually a bug, and it's one of the reasons why I said the two functions should be merged (as Michael alluded to in the other comment). I think that a better full name for this type would be something like ::a::foo()::Foo
I did look into this briefly (before being preempted). Basically, I had this function follow DW_AT_specification links just like it was done for GetDeclContext, but I ran into two problems:

  • this just gives you a name like ::a::foo::Foo which, while probably better than ::Foo is still not fully correct as the type Foo in foo(void) can be completely different from type Foo in foo(something_else).
  • it prevented us from accessing the type Foo in the expression evaluator, which currently "works" (scare quotes, because it only works if there are no conflicts) because we pretend the type to be global. To fix this we'd need to implement a dedicated method for injecting local types into the expression, just like we do with local variables. I don't think that's particularly hard, but it definitely takes more time that I had at the moment.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I don't know of anyone that would use the function name when asking for a type, even if that type was defined in a function.

To make the expression parser work, we should be able to find all of the types using a context like just "Foo", where we would find both ::Foo and the Foo from a::foo(), but sort the results based on the CompilerDeclContext of each type that was returned. If we are evaluating an expression in a::foo(), the Foo type from that function could tell us the CompilerDeclContext it was defined in and we would find the closest match to where the expression is being run.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I don't know of anyone that would use the function name when asking for a type, even if that type was defined in a function.

It's certainly not valid C++, if that's what you mean (c++ just doesn't give you the option of referring to a local type from outside of the function), but I don't think this kind of qualification is a revolutionary concept. For example if the aforementioned local struct had a constructor, it's mangled name would be _ZZN1a3fooEvEN3FooC1Ev , that is a::foo()::Foo::Foo(). Gdb (AFAICT) does not support these kinds of lookups, but it also does not return the type for query like ::Foo. (instead gdb supports searching for a type in a specific block/function, which is kind of similar to this)

To make the expression parser work, we should be able to find all of the types using a context like just "Foo", where we would find both ::Foo and the Foo from a::foo(), but sort the results based on the CompilerDeclContext of each type that was returned. If we are evaluating an expression in a::foo(), the Foo type from that function could tell us the CompilerDeclContext it was defined in and we would find the closest match to where the expression is being run.

This approach has the problem that a type (or really, anything) defined with the same name at a class level would take precedence over a local declaration.

For example with code like:

struct Foo { int one; }
class Class {
  struct Foo { int two; }
  void Method() {
    struct Foo { int three; }
  }
};

if we were stopped in Class::Method, then the name Foo should refer to the third definition with that name, but I believe this will end up using the second one because clang will see the class-level declaration first and will not bother asking for other possible definitions. I think we used to use the same approach for local variables, but then we changed the implementation because of the same shadowing problem.

# the function.
types = api.FindTypes("::Foo")
self.assertEqual(types.GetSize(), 1)
type_str0 = str(types.GetTypeAtIndex(0))
self.assertIn('typedef Foo', type_str0)
15 changes: 15 additions & 0 deletions lldb/test/API/functionalities/type_types/main.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
namespace a {
struct Foo {};

unsigned foo() {
typedef unsigned Foo;
Foo foo = 12;
return foo;
}
} // namespace a

int main() {
a::Foo f = {};
a::foo();
return 0;
}
107 changes: 107 additions & 0 deletions lldb/unittests/SymbolFile/DWARF/DWARFDIETest.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -258,3 +258,110 @@ TEST(DWARFDIETest, GetContext) {
struct_die.GetTypeLookupContext(),
testing::ElementsAre(make_namespace("NAMESPACE"), make_struct("STRUCT")));
}

TEST(DWARFDIETest, GetContextInFunction) {
// Make sure we get the right context fo each "struct_t" type. The first
// should be "a::struct_t" and the one defined in the "foo" function should be
// "struct_t". Previous DWARFDIE::GetTypeLookupContext() function calls would
// have the "struct_t" in "foo" be "a::struct_t" because it would traverse the
// entire die parent tree and ignore DW_TAG_subprogram and keep traversing the
// parents.
//
// 0x0000000b: DW_TAG_compile_unit
// 0x0000000c: DW_TAG_namespace
// DW_AT_name("a")
// 0x0000000f: DW_TAG_structure_type
// DW_AT_name("struct_t")
// 0x00000019: DW_TAG_subprogram
// DW_AT_name("foo")
// 0x0000001e: DW_TAG_structure_type
// DW_AT_name("struct_t")
// 0x00000028: NULL
// 0x00000029: NULL
// 0x0000002a: NULL
const char *yamldata = R"(
--- !ELF
FileHeader:
Class: ELFCLASS64
Data: ELFDATA2LSB
Type: ET_EXEC
Machine: EM_386
DWARF:
debug_str:
- ''
debug_abbrev:
- ID: 0
Table:
- Code: 0x1
Tag: DW_TAG_compile_unit
Children: DW_CHILDREN_yes
- Code: 0x2
Tag: DW_TAG_namespace
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_string
- Code: 0x3
Tag: DW_TAG_structure_type
Children: DW_CHILDREN_no
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_string
- Code: 0x4
Tag: DW_TAG_subprogram
Children: DW_CHILDREN_yes
Attributes:
- Attribute: DW_AT_name
Form: DW_FORM_string
debug_info:
- Length: 0x27
Version: 4
AbbrevTableID: 0
AbbrOffset: 0x0
AddrSize: 8
Entries:
- AbbrCode: 0x1
- AbbrCode: 0x2
Values:
- Value: 0xDEADBEEFDEADBEEF
CStr: a
- AbbrCode: 0x3
Values:
- Value: 0xDEADBEEFDEADBEEF
CStr: struct_t
- AbbrCode: 0x4
Values:
- Value: 0xDEADBEEFDEADBEEF
CStr: foo
- AbbrCode: 0x3
Values:
- Value: 0xDEADBEEFDEADBEEF
CStr: struct_t
- AbbrCode: 0x0
- AbbrCode: 0x0
- AbbrCode: 0x0)";

YAMLModuleTester t(yamldata);
auto *symbol_file =
llvm::cast<SymbolFileDWARF>(t.GetModule()->GetSymbolFile());
DWARFUnit *unit = symbol_file->DebugInfo().GetUnitAtIndex(0);
ASSERT_TRUE(unit);

auto make_namespace = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::Namespace, ConstString(name));
};
auto make_struct = [](llvm::StringRef name) {
return CompilerContext(CompilerContextKind::Struct, ConstString(name));
};
// Grab the "a::struct_t" type from the "a" namespace
DWARFDIE a_struct_die = unit->DIE().GetFirstChild().GetFirstChild();
ASSERT_TRUE(a_struct_die);
EXPECT_THAT(
a_struct_die.GetDeclContext(),
testing::ElementsAre(make_namespace("a"), make_struct("struct_t")));
// Grab the "struct_t" defined in the "foo" function.
DWARFDIE foo_struct_die =
unit->DIE().GetFirstChild().GetFirstChild().GetSibling().GetFirstChild();
EXPECT_THAT(foo_struct_die.GetTypeLookupContext(),
testing::ElementsAre(make_struct("struct_t")));
}
Loading