Skip to content

[cherry-pick][stable/20230725] [lldb] Make only one function that needs to be implemented when searching for types #7885

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Michael137
Copy link

This patch revives the effort to get this Phabricator patch into
upstream:

https://reviews.llvm.org/D137900

This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.

A fixed up version of the description from the original patch starts
now.

This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.

Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.

As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:

  • Find a single type
  • Find all types
  • Find types in a namespace

This patch introduces a TypeQuery class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.

If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.

Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"

This change expands on the clang modules code that already used a
vector items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the TypeQuery objects.

This mirrors the most recent addition of type lookups that took a
vector that allowed lookups to happen for the
expression parser in certain places.

Prior to this we had the following APIs in Module:

void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeList &types);

void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeMap &types);

void Module::FindTypesInNamespace(ConstString type_name,
                                  const CompilerDeclContext &parent_decl_ctx,
                                  size_t max_matches, TypeList &type_list);

The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:

void FindTypes(const TypeQuery &query, TypeResults &results);

The TypeQuery class contains all of the needed settings:

  • The vector that allow efficient lookups in the symbol
    file classes since they can look at basename matches only realize fully
    matching types. Before this any basename that matched was fully realized
    only to be removed later by code outside of the SymbolFile layer which
    could cause many types to be realized when they didn't need to.
  • If the lookup is exact or not. If not exact, then the compiler context
    must match the bottom most items that match the compiler context,
    otherwise it must match exactly
  • If the compiler context match is for clang modules or not. Clang
    modules matches include a Module compiler context kind that allows types
    to be matched only from certain modules and these matches are not needed
    when d oing user type lookups.
  • An optional list of languages to use to limit the search to only
    certain languages

The TypeResults object contains all state required to do the lookup
and store the results:

  • The max number of matches
  • The set of SymbolFile objects that have already been searched
  • The matching type list for any matches that are found

The benefits of this approach are:

  • Simpler API, and only one API to implement in SymbolFile classes
  • Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
    way to limit the search, but this only worked if the TypeSystem matched
    the current symbol file's type system, so you couldn't use it to lookup
    a type in another module
  • Fixes a serious bug in our FindFirstType functions where if we were
    searching for "foo::bar", and we found a "baz::bar" first, the basename
    would match and we would only fetch 1 type using the basename, only to
    drop it from the matching list and returning no results

@Michael137
Copy link
Author

Michael137 commented Dec 15, 2023

This cherry-picks @clayborg's FindTypes work. Cherry-picked the vtable ValueObject changes too for convenience

@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

Have to fixup the Swift plugin side of things now

Copy link

@adrian-prantl adrian-prantl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo failing Swift tests.

@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows platform

@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows platform

@Michael137
Copy link
Author

Requires llvm#75926

@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows platform

@Michael137
Copy link
Author

@swift-ci test

@Michael137 Michael137 force-pushed the lldb/find-types-rework-to-20230725 branch from 599923a to ed8cbe8 Compare December 20, 2023 09:09
@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci please test Windows platform

@Michael137
Copy link
Author

@swift-ci test
@swift-ci please test Windows platform

@Michael137
Copy link
Author

@swift-ci test

@Michael137 Michael137 force-pushed the lldb/find-types-rework-to-20230725 branch from 1ec0685 to 1a943f5 Compare January 8, 2024 12:20
@Michael137
Copy link
Author

@swift-ci test

1 similar comment
@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci please test Windows platform

@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci please test Windows platform

1 similar comment
@Michael137
Copy link
Author

@swift-ci please test Windows platform

@Michael137
Copy link
Author

Following tests are failing:

Failed Tests (5):
  lldb-api :: lang/c/tls_globals/TestTlsGlobals.py
  lldb-api :: lang/cpp/thread_local/TestThreadLocal.py
  lldb-api :: macosx/indirect_symbol/TestIndirectSymbols.py
  lldb-shell :: Unwind/eh-frame-dwarf-unwind.test
  lldb-shell :: Unwind/thread-step-out-ret-addr-check.test

But those also fail on stable/20230725, so I'll go ahead and merge this.

Michael137 added a commit to llvm/llvm-project that referenced this pull request Jan 9, 2024
This is required for users of `TypeQuery` that limit the set of
languages of the query using APIs such as
`GetSupportedLanguagesForTypes` or
`GetSupportedLanguagesForExpressions`.

Example usage: swiftlang#7885
@Michael137 Michael137 force-pushed the lldb/find-types-rework-to-20230725 branch from e048eab to f8c59ef Compare January 9, 2024 16:08
@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows

@Michael137
Copy link
Author

1 test failure remaining:

FAIL: LLDB (/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/llvm-macosx-x86_64/bin/clang-x86_64) :: test_swift_deployment_target_from_macho (TestSwiftDeploymentTarget.TestSwiftDeploymentTarget)
======================================================================
FAIL: test_swift_deployment_target_from_macho (TestSwiftDeploymentTarget.TestSwiftDeploymentTarget)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 169, in wrapper
    return func(*args, **kwargs)
  File "/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/llvm-project/lldb/packages/Python/lldbsuite/test/decorators.py", line 169, in wrapper
    return func(*args, **kwargs)
  File "/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/llvm-project/lldb/test/API/lang/swift/deployment_target/TestSwiftDeploymentTarget.py", line 64, in test_swift_deployment_target_from_macho
    self.expect("expression f", substrs=["i = 23"])
  File "/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2433, in expect
    self.runCmd(
  File "/Users/ec2-user/jenkins/workspace/apple-llvm-project-pr-macos/branch-stable/20230725/llvm-project/lldb/packages/Python/lldbsuite/test/lldbtest.py", line 2109, in runCmd
    self.assertTrue(self.res.Succeeded(), msg if (msg) else CMD_MSG(cmd))
AssertionError: False is not True : Command 'expression f
Error output:
error: <EXPR>:3:1: error: cannot find 'f' in scope
f
^

Doesn't repro locally with exactly the same dotest invocation.

@adrian-prantl
Copy link

@swift-ci test macos

@Michael137
Copy link
Author

@swift-ci test

Michael137 added a commit to Michael137/llvm-project that referenced this pull request Jan 18, 2024
…eligibility

When we lookup a clang type from DWARF by name, we
use the first one we find. In this test we have two
entities with the name `ComparisonResult`, en enum
and a typedef to the enum. So if we happened to find
the typedef first, we would fail to apply the `OptionSet`
formatter to it because it explicitly wants a QualType
whose type-class is an enum.

This fixes `lang/swift/enum_objc/TestEnumObjC.py` when applying
using the new `FindTypes` `TypeQuery` APIs (see
swiftlang#7885)

The fix simply gets the canonical type before we check
its type-class.
@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows

@Michael137 Michael137 force-pushed the lldb/find-types-rework-to-20230725 branch from dec8971 to aabddfa Compare January 18, 2024 13:50
clayborg and others added 10 commits January 18, 2024 13:52
llvm#67599)

Add the ability to get a C++ vtable ValueObject from another
ValueObject.

This patch adds the ability to ask a ValueObject for a ValueObject that
represents the virtual function table for a C++ class. If the
ValueObject is not a C++ class with a vtable, a valid ValueObject value
will be returned that contains an appropriate error. If it is successful
a valid ValueObject that represents vtable will be returned. The
ValueObject that is returned will have a name that matches the demangled
value for a C++ vtable mangled name like "vtable for <class-name>". It
will have N children, one for each virtual function pointer. Each
child's value is the function pointer itself, the summary is the
symbolication of this function pointer, and the type will be a valid
function pointer from the debug info if there is debug information
corresponding to the virtual function pointer.

The vtable SBValue will have the following:
- SBValue::GetName() returns "vtable for <class>"
- SBValue::GetValue() returns a string representation of the vtable
address
- SBValue::GetSummary() returns NULL
- SBValue::GetType() returns a type appropriate for a uintptr_t type for
the current process
- SBValue::GetLoadAddress() returns the address of the vtable adderess
- SBValue::GetValueAsUnsigned(...) returns the vtable address
- SBValue::GetNumChildren() returns the number of virtual function
pointers in the vtable
- SBValue::GetChildAtIndex(...) returns a SBValue that represents a
virtual function pointer

The child SBValue objects that represent a virtual function pointer has
the following values:
- SBValue::GetName() returns "[%u]" where %u is the vtable function
pointer index
- SBValue::GetValue() returns a string representation of the virtual
function pointer
- SBValue::GetSummary() returns a symbolicated respresentation of the
virtual function pointer
- SBValue::GetType() returns the function prototype type if there is
debug info, or a generic funtion prototype if there is no debug info
- SBValue::GetLoadAddress() returns the address of the virtual function
pointer
- SBValue::GetValueAsUnsigned(...) returns the virtual function pointer
- SBValue::GetNumChildren() returns 0
- SBValue::GetChildAtIndex(...) returns invalid SBValue for any index

Examples of using this API via python:

```
(lldb) script vtable = lldb.frame.FindVariable("shape_ptr").GetVTable()
(lldb) script vtable
vtable for Shape = 0x0000000100004088 {
  [0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
  [1] = 0x0000000100003e4c a.out`Shape::~Shape() at main.cpp:3
  [2] = 0x0000000100003e7c a.out`Shape::area() at main.cpp:4
  [3] = 0x0000000100003e3c a.out`Shape::optional() at main.cpp:7
}
(lldb) script c = vtable.GetChildAtIndex(0)
(lldb) script c
(void ()) [0] = 0x0000000100003d20 a.out`Shape::~Shape() at main.cpp:3
```

(cherry picked from commit 7fbd427)
The current Darwin arm64e ABI on AArch64 systems using ARMv8.3 & newer
cores, adds authentication bits to the vtable pointer address. The
vtable address must be in addressable memory, so running it through
Process::FixDataAddress will be a no-op on other targets.

This was originally a downstream change that I hadn't upstreamed yet,
and it was surfaced by Greg's changes in
llvm#67599
so I needed to update the local patch, and was reminded that I should
upstream this.

(cherry picked from commit de24b0e)
…hing for types (llvm#74786)

This patch revives the effort to get this Phabricator patch into
upstream:

https://reviews.llvm.org/D137900

This patch was accepted before in Phabricator but I found some
-gsimple-template-names issues that are fixed in this patch.

A fixed up version of the description from the original patch starts
now.

This patch started off trying to fix Module::FindFirstType() as it
sometimes didn't work. The issue was the SymbolFile plug-ins didn't do
any filtering of the matching types they produced, and they only looked
up types using the type basename. This means if you have two types with
the same basename, your type lookup can fail when only looking up a
single type. We would ask the Module::FindFirstType to lookup "Foo::Bar"
and it would ask the symbol file to find only 1 type matching the
basename "Bar", and then we would filter out any matches that didn't
match "Foo::Bar". So if the SymbolFile found "Foo::Bar" first, then it
would work, but if it found "Baz::Bar" first, it would return only that
type and it would be filtered out.

Discovering this issue lead me to think of the patch Alex Langford did a
few months ago that was done for finding functions, where he allowed
SymbolFile objects to make sure something fully matched before parsing
the debug information into an AST type and other LLDB types. So this
patch aimed to allow type lookups to also be much more efficient.

As LLDB has been developed over the years, we added more ways to to type
lookups. These functions have lots of arguments. This patch aims to make
one API that needs to be implemented that serves all previous lookups:

- Find a single type
- Find all types
- Find types in a namespace

This patch introduces a `TypeQuery` class that contains all of the state
needed to perform the lookup which is powerful enough to perform all of
the type searches that used to be in our API. It contain a vector of
CompilerContext objects that can fully or partially specify the lookup
that needs to take place.

If you just want to lookup all types with a matching basename,
regardless of the containing context, you can specify just a single
CompilerContext entry that has a name and a CompilerContextKind mask of
CompilerContextKind::AnyType.

Or you can fully specify the exact context to use when doing lookups
like: CompilerContextKind::Namespace "std"
CompilerContextKind::Class "foo"
CompilerContextKind::Typedef "size_type"

This change expands on the clang modules code that already used a
vector<CompilerContext> items, but it modifies it to work with
expression type lookups which have contexts, or user lookups where users
query for types. The clang modules type lookup is still an option that
can be enabled on the `TypeQuery` objects.

This mirrors the most recent addition of type lookups that took a
vector<CompilerContext> that allowed lookups to happen for the
expression parser in certain places.

Prior to this we had the following APIs in Module:

```
void
Module::FindTypes(ConstString type_name, bool exact_match, size_t max_matches,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeList &types);

void
Module::FindTypes(llvm::ArrayRef<CompilerContext> pattern, LanguageSet languages,
                  llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
                  TypeMap &types);

void Module::FindTypesInNamespace(ConstString type_name,
                                  const CompilerDeclContext &parent_decl_ctx,
                                  size_t max_matches, TypeList &type_list);
```

The new Module API is much simpler. It gets rid of all three above
functions and replaces them with:

```
void FindTypes(const TypeQuery &query, TypeResults &results);
```
The `TypeQuery` class contains all of the needed settings:

- The vector<CompilerContext> that allow efficient lookups in the symbol
file classes since they can look at basename matches only realize fully
matching types. Before this any basename that matched was fully realized
only to be removed later by code outside of the SymbolFile layer which
could cause many types to be realized when they didn't need to.
- If the lookup is exact or not. If not exact, then the compiler context
must match the bottom most items that match the compiler context,
otherwise it must match exactly
- If the compiler context match is for clang modules or not. Clang
modules matches include a Module compiler context kind that allows types
to be matched only from certain modules and these matches are not needed
when d oing user type lookups.
- An optional list of languages to use to limit the search to only
certain languages

The `TypeResults` object contains all state required to do the lookup
and store the results:
- The max number of matches
- The set of SymbolFile objects that have already been searched
- The matching type list for any matches that are found

The benefits of this approach are:
- Simpler API, and only one API to implement in SymbolFile classes
- Replaces the FindTypesInNamespace that used a CompilerDeclContext as a
way to limit the search, but this only worked if the TypeSystem matched
the current symbol file's type system, so you couldn't use it to lookup
a type in another module
- Fixes a serious bug in our FindFirstType functions where if we were
searching for "foo::bar", and we found a "baz::bar" first, the basename
would match and we would only fetch 1 type using the basename, only to
drop it from the matching list and returning no results

(cherry picked from commit dd95877)
Fix unexpected pass after
llvm#74786.

(cherry picked from commit dcbf1e4)
…lvm#77029)

The LLDB expression parser relies on using the external AST source
support in LLDB. This allows us to find a class at the root namespace
level, but it wouldn't allow us to find nested classes all of the time.
When LLDB finds a class via this mechanism, it would be able to complete
this class when needed, but during completion, we wouldn't populate
nested types within this class which would prevent us from finding
contained types when needed as clang would expect them to be present if
a class was completed. When we parse a type for a class, struct or
union, we make a forward declaration to the class which can be
completed. Now when the class is completed, we also add any contained
types to the class' declaration context which now allows these types to
be found. If we have a struct that contains a struct, we will add the
forward declaration of the contained structure which can be c ompleted
later. Having this forward declaration makes it possible for LLDB to
find everything it needs now.

This should fix an existing issue:
llvm#53904

Previously, contained types could be parsed by accident and allow
expression to complete successfully. Other times we would have to run an
expression multiple times because our old type lookup from our
expressions would cau se a type to be parsed, but not used in the
current expression, but this would have parsed a type into the
containing decl context and the expression might succeed if it is run
again.

(cherry picked from commit e42edb5)
This is required for users of `TypeQuery` that limit
the set of languages of the query using APIs such as
`GetSupportedLanguagesForTypes` or
`GetSupportedLanguagesForExpressions`.

(cherry picked from commit c322019)
This patch adjusts the `FindTypes` calls to the API changes
introduced in `dd95877`.
…llvm#68408)

Split out the assertions that fail on Windows in preparation to
XFAILing them.

Drive-by change:
* Add a missing `self.build()` call in `test_union_in_anon_namespace`
* Fix formatting
* Add expectedFailureWindows decorator

(cherry picked from commit d579a1a)
…eligibility

When we lookup a clang type from DWARF by name, we
use the first one we find. In this test we have two
entities with the name `ComparisonResult`, en enum
and a typedef to the enum. So if we happened to find
the typedef first, we would fail to apply the `OptionSet`
formatter to it because it explicitly wants a QualType
whose type-class is an enum.

This fixes `lang/swift/enum_objc/TestEnumObjC.py` when applying
using the new `FindTypes` `TypeQuery` APIs (see
swiftlang#7885)

The fix simply gets the canonical type before we check
its type-class.

(cherry picked from commit f7364b56732abeddb35082ba6002f75f26d795d0)
@Michael137 Michael137 force-pushed the lldb/find-types-rework-to-20230725 branch from aabddfa to 5ed31c3 Compare January 18, 2024 13:52
@Michael137
Copy link
Author

@swift-ci test

@Michael137
Copy link
Author

@swift-ci test Windows

1 similar comment
@Michael137
Copy link
Author

@swift-ci test Windows

@adrian-prantl adrian-prantl merged commit 6e3538d into swiftlang:stable/20230725 Jan 18, 2024
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
This is required for users of `TypeQuery` that limit the set of
languages of the query using APIs such as
`GetSupportedLanguagesForTypes` or
`GetSupportedLanguagesForExpressions`.

Example usage: swiftlang#7885
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants