Skip to content

Backport the import-std-module patch and enable the fallback mode by default #2222

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

Teemperor
Copy link

This PR contains the LLDB/ASTImporter patches for the import-std-module feature + 1 dependent Clang patch that just removes unspecified template args from printed C++ type names.

Also one final commit that enables the fallback mode by default.

Teemperor and others added 27 commits December 10, 2020 20:03
…ion bodies

Right now the ASTImporter assumes for most Expr nodes that they are always equal
which leads to non-compatible declarations ending up being merged. This patch
adds the basic framework for comparing Stmts (and with that also Exprs) and
implements the custom checks for a few Stmt subclasses. I'll implement the
remaining subclasses in follow up patches (mostly because there are a lot of
subclasses and some of them require further changes like having GNU language in
the testing framework)

The motivation for this is that in LLDB we try to import libc++ source code and
some of the types we are importing there contain expressions (e.g. because they
use `enable_if<expr>`), so those declarations are currently merged even if they
are completely different (e.g. `enable_if<value> ...` and `enable_if<!value>
...` are currently considered equal which is clearly not true).

Reviewed By: martong, balazske

Differential Revision: https://reviews.llvm.org/D87444

(cherry picked from commit c0bcd11)
…e more consistent

There are several `::IsStructurallyEquivalent` overloads for Decl subclasses
that are used for comparing declarations. There is also one overload that takes
just two Decl pointers which ends up queuing the passed Decls to be later
compared in `CheckKindSpecificEquivalence`.

`CheckKindSpecificEquivalence` implements the dispatch logic for the different
Decl subclasses. It is supposed to hand over the queued Decls to the
subclass-specific `::IsStructurallyEquivalent` overload that will actually
compare the Decl instance. It also seems to implement a few pieces of actual
node comparison logic inbetween the dispatch code.

This implementation causes that the different overloads of
`::IsStructurallyEquivalent` do different (and sometimes no) comparisons
depending on which overload of `::IsStructurallyEquivalent` ends up being
called.

For example, if I want to compare two FieldDecl instances, then I could either
call the `::IsStructurallyEquivalent` with `Decl *` or with `FieldDecl *`
parameters. The overload that takes FieldDecls is doing a correct comparison.
However, the `Decl *` overload just queues the Decl pair.
`CheckKindSpecificEquivalence` has no dispatch logic for `FieldDecl`, so it
always returns true and never does any actual comparison.

On the other hand, if I try to compare two FunctionDecl instances the two
possible overloads of `::IsStructurallyEquivalent` have the opposite behaviour:
The overload that takes `FunctionDecl` pointers isn't comparing the names of the
FunctionDecls while the overload taking a plain `Decl` ends up comparing the
function names (as the comparison logic for that is implemented in
`CheckKindSpecificEquivalence`).

This patch tries to make this set of functions more consistent by making
`CheckKindSpecificEquivalence` a pure dispatch function without any
subclass-specific comparison logic. Also the dispatch logic is now autogenerated
so it can no longer miss certain subclasses.

The comparison code from `CheckKindSpecificEquivalence` is moved to the
respective `::IsStructurallyEquivalent` overload so that the comparison result
no longer depends if one calls the `Decl *` overload or the overload for the
specific subclass. The only difference is now that the `Decl *` overload is
queuing the parameter while the subclass-specific overload is directly doing the
comparison.

`::IsStructurallyEquivalent` is an implementation detail and I don't think the
behaviour causes any bugs in the current implementation (as carefully calling
the right overload for the different classes works around the issue), so the
test for this change is that I added some new code for comparing `MemberExpr`.
The new comparison code always calls the dispatching overload and it previously
failed as the dispatch didn't support FieldDecls.

Reviewed By: martong, a_sidorin

Differential Revision: https://reviews.llvm.org/D87619

(cherry picked from commit 7c4575e)
…of imported decls when merging ClassTemplateSpecializationDecls

When importing a `ClassTemplateSpecializationDecl` definition into a TU with a matching
`ClassTemplateSpecializationDecl` definition and a more recent forward decl, the ASTImporter
currently will call `MapImported()` for the definitions, but will return the forward declaration
from the `ASTImporter::Import()` call.

This is triggering some assertions in LLDB when we try to fully import some DeclContexts
before we delete the 'From' AST. The returned 'To' Decl before this patch is just the most recent
forward decl but that's not the Decl with the definition to which the ASTImporter will import
the child declarations.

This patch just changes that the ASTImporter returns the definition that the imported Decl was
merged with instead of the found forward declaration.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D92016

(cherry picked from commit 0c926e6)
The test case isn't using the AST matchers for all checks as there doesn't seem to be support for
matching NonTypeTemplateParmDecl default arguments. Otherwise this is simply importing the
default arguments.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D92106

(cherry picked from commit 89c1a7a)
…when trying to complete a forward decl

SemaSourceWithPriorities is a special SemaSource that wraps our normal LLDB
ExternalASTSource and the ASTReader (which is used for the C++ module loading).
It's only active when the `import-std-module` setting is turned on.

The `CompleteType` function there in `SemaSourceWithPriorities` is looping over
all ExternalASTSources and asks each to complete the type. However, that loop is
in another loop that keeps doing that until the type is complete. If that
function is ever called on a type that is a forward decl then that causes LLDB
to go into an infinite loop.

I remember I added that second loop and the comment because I thought I saw a
similar pattern in some other Clang code, but after some grepping I can't find
that code anywhere and it seems the rest of the code base only calls
CompleteType once (It would also be kinda silly to have calling it multiple
times). So it seems that's just a silly mistake.

The is implicitly tested by importing `std::pair`, but I also added a simpler
dedicated test that creates a dummy libc++ module with some forward declarations
and then imports them into the scratch AST context. At some point the
ASTImporter will check if one of the forward decls could be completed by the
ExternalASTSource, which will cause the `SemaSourceWithPriorities` to go into an
infinite loop once it receives the `CompleteType` call.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D87289

(cherry picked from commit 32c8da4)
This adds support for substituting std::pair instantiations with enabled
import-std-module.

With the fixes in parent revisions we can currently substitute a single pair
(however, a result that returns a second pair currently causes LLDB to crash
while importing the second template instantiation).

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D85141

(cherry picked from commit b852225)
With the recent patches to the ASTImporter that improve template type importing
(D87444), most of the import-std-module tests can now finally import the
type of the STL container they are testing. This patch removes most of the casts
that were added to simplify types to something the ASTImporter can import
(for example, std::vector<int>::size_type was casted to `size_t` until now).
Also adds the missing tests that require referencing the container type (for
example simply printing the whole container) as here we couldn't use a casting
workaround.

The only casts that remain are in the forward_list tests that reference
the iterator and the stack test. Both tests are still failing to import the
respective container type correctly (or crash while trying to import).

(cherry picked from commit cabee89)
"[lldb/DataFormatters] Display null C++ pointers as nullptr" added an assumption
that the member we check for is always a nullptr, but it is actually never
initialized. That causes the test to randomly fail due to the pointer having
some random value that isn't 0.

(cherry picked from commit dc848a0)
We can handle all the types in the expression evaluator now without casting.
On Linux, we have a system header declaration that is still causing issues, so
I'm skipping the test there until I get around to fix this.

(cherry picked from commit a703998)
The test case isn't using the AST matchers for all checks as there doesn't seem to be support for
matching TemplateTypeParmDecl default arguments. Otherwise this is simply importing the
default arguments.

Also updates several LLDB tests that now as intended omit the default template
arguments of several std templates.

Reviewed By: martong

Differential Revision: https://reviews.llvm.org/D92103

(cherry picked from commit 3f6c856)
LLDB is supposed to ask the Clang Driver what the default module cache path is
and then use that value as the default for the
`symbols.clang-modules-cache-path` setting. However, we use the property type
`String` to change `symbols.clang-modules-cache-path` even though the type of
that setting is `FileSpec`, so the setter will simply do nothing and return
`false`. We also don't check the return value of the setter, so this whole code
ends up not doing anything at all.

This changes the setter to use the correct property type and adds an assert that
we actually successfully set the default path. Also adds a test that checks that
the default value for this setting is never unset/empty path as this would
effectively disable the import-std-module feature from working by default.

Reviewed By: JDevlieghere, shafik

Differential Revision: https://reviews.llvm.org/D92772

(cherry picked from commit 208e3f5)
…types are in the scratch AST

Several data formatters assume their types are in the Target's scratch AST and
build new types from that scratch AST instance. However, types from different
ASTs shouldn't be mixed, so this (unchecked) assumption may lead to problems if
we ever have more than one scratch AST or someone somehow invokes data
formatters on a type that are not in the scratch AST.

Instead we can use in all the formatters just the TypeSystem of the type we're
formatting. That's much simpler and avoids all the headache of finding the right
TypeSystem that matches the one of the formatted type.

Right now LLDB only has one scratch TypeSystemClang instance and we format only
types that are in the scratch AST, so this doesn't change anything in the
current way LLDB works. The intention here is to allow follow up refactorings
that introduce multiple scratch ASTs with the same Target.

Differential Revision: https://reviews.llvm.org/D92757

(cherry picked from commit 839e845)
arguments of types by default.

This somewhat improves the worst-case printing of types like
std::string, std::vector, etc., where many irrelevant default arguments
can be included in the type as printed if we've lost the type sugar.

(cherry picked from commit e7f3e21)
…pressed inline namespaces

Commit 5f12f4f made suppressing inline namespaces
when printing typenames default to true. As we're using the inline namespaces
in LLDB to construct internal type names (which need internal namespaces in them
to, for example, differentiate libc++'s std::__1::string from the std::string
from libstdc++), this broke most of the type formatting logic.

(cherry picked from commit da121ff)
…e in the internal type name

Our type formatters/summaries match on the internal type name we generate in LLDB for Clang types.

These names were generated using Clang's default printing policy. However Clang's
default printing policy got tweaked over the last month to make the generated type
names more readable (by for example excluding inline/anonymous namespaces and
removing template arguments that have their default value). This broke the formatter
system where LLDB's matching logic now no longer can format certain types as
the new type names generated by Clang's default printing policy no longer match
the type names that LLDB/the user specified.

I already introduced LLDB's own type printing policy and fixed the inline/anonymous
namespaces in da121ff (just to get the
test suite passing again).

This patch is restoring the old type printing behaviour where always include the template
arguments in the internal type name (even if they match the default args). This should get
template type formatters/summaries working again in the rare situation where we do
know template default arguments within LLDB. This can only happen when either having
a template that was parsed in the expression parser or when we get type information from a C++ module.

The Clang change that removed defaulted template arguments from Clang's printing policy was
e7f3e21

Reviewed By: labath

Differential Revision: https://reviews.llvm.org/D92311

(cherry picked from commit e0e7bbe)
Clang has some type sugar that only serves as a way to preserve the way a user
has typed a certain type in the source code. These types are currently not
unwrapped when we query the type name for a Clang type, which means that this
type sugar actually influences what formatters are picked for a certain type.
Currently if a user decides to reference a type by doing `::GlobalDecl Var = 3;`,
the type formatter for `GlobalDecl` will not be used (as the type sugar
around the type gives it the name `::GlobalDecl`. The same goes for other ways
to spell out a type such as `auto` etc.

With this patch most of this type sugar gets stripped when the full type name is
calculated. Typedefs are not getting desugared as that seems counterproductive.
I also don't desugar atomic types as that's technically not type sugar.

Reviewed By: jarin

Differential Revision: https://reviews.llvm.org/D87481

(cherry picked from commit b5e49e9)
Like the other type sugar removed by RemoveWrappingTypes, SubstTemplateTypeParm
is just pure sugar that should be ignored. If we don't ignore it (as we do now),
LLDB will fail to read values from record fields that have a
SubstTemplateTypeParm type.

Only way to produce such a type in LLDB is to either use the `import-std-module`
setting to get a template into the expression parser or just create your own
template directly in the expression parser which is what we do in the test.

Reviewed By: jarin

Differential Revision: https://reviews.llvm.org/D85132

(cherry picked from commit 950f1bf)
…nd a Python>=3.7 crash

Usually when we enter a SWIG wrapper function from Python, SWIG automatically
adds a `Py_BEGIN_ALLOW_THREADS`/`Py_END_ALLOW_THREADS` around the call to the SB
API C++ function. This will ensure that Python's GIL is released when we enter
LLDB and locked again when we return to the wrapper code.

D51569 changed this behaviour but only for the generated `__str__` wrappers. The
added `nothreadallow` disables the injection of the GIL release/re-acquire code
and the GIL is now kept locked when entering LLDB and is expected to be still
locked when returning from the LLDB implementation. The main reason for that was
that back when D51569 landed the wrapper itself created a Python string. These
days it just creates a std::string and SWIG itself takes care of getting the GIL
and creating the Python string from the std::string, so that workaround isn't
necessary anymore.

This patch just removes `nothreadallow` so that our `__str__` functions now
behave like all other wrapper functions in that they release the GIL when
calling into the SB API implementation.

The motivation here is actually to work around another potential bug in LLDB.
When one calls into the LLDB SB API while holding the GIL and that call causes
LLDB to interpret some Python script via `ScriptInterpreterPython`, then the GIL
will be unlocked when the control flow returns from the SB API. In the case of
the `__str__` wrapper this would cause that the next call to a Python function
requiring the GIL would fail (as SWIG will not try to reacquire the GIL as it
isn't aware that LLDB removed it).

The reason for this unexpected GIL release seems to be a workaround for recent
Python versions:
```
    // The only case we should go further and acquire the GIL: it is unlocked.
    if (PyGILState_Check())
      return;
```

The early-exit here causes `InitializePythonRAII::m_was_already_initialized` to
be always false and that causes that `InitializePythonRAII`'s destructor always
directly unlocks the GIL via `PyEval_SaveThread`. I'm investigating how to
properly fix this bug in a follow up patch, but for now this straightforward
patch seems to be enough to unblock my other patches (and it also has the
benefit of removing this workaround).

The test for this is just a simple test for `std::deque` which has a synthetic
child provider implemented as a Python script. Inspecting the deque object will
cause `expect_expr` to create a string error message by calling
`str(deque_object)`. Printing the ValueObject causes the Python script for the
synthetic children to execute which then triggers the bug described above where
the GIL ends up being unlocked.

Reviewed By: JDevlieghere

Differential Revision: https://reviews.llvm.org/D88302

(cherry picked from commit 070a1d5)
…Clang

We keep referring to the single object created by this class as
'scratch AST/Context/TypeSystem' so at this point we might as well rename the
class. It's also not involved at all in expression evaluation, so the
'ForExpressions' prefix is a bit misleading.

(cherry picked from commit 973f390)

 Conflicts:
	lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp
Template specializations are not handled in many of the
TypeSystemClang methods. For example, GetNumChildren does not handle
the TemplateSpecialization type class, so template specializations
always look like empty objects.

This patch just desugars template specializations in the existing
RemoveWrappingTypes desugaring helper.

Differential Revision: https://reviews.llvm.org/D83858

(cherry picked from commit 93ec6cd)
Just adds the respective accessor functions to ASTContextMetadata instead
of directly exposing the OriginMap to the whole world.

(cherry picked from commit f6913e7)
…lbacks for the same target decl

The ASTImporter has an `Imported(From, To)` callback that notifies subclasses
that a declaration has been imported in some way. LLDB uses this in the
`CompleteTagDeclsScope` to see which records have been imported into the scratch
context. If the record was declared inside the expression, then the
`CompleteTagDeclsScope` will forcibly import the full definition of that record
to the scratch context so that the expression AST can safely be disposed later
(otherwise we might end up going back to the deleted AST to complete the
minimally imported record). The way this is implemented is that there is a list
of decls that need to be imported (`m_decls_to_complete`) and we keep completing
the declarations inside that list until the list is empty. Every `To` Decl we
get via the `Imported` callback will be added to the list of Decls to be
completed.

There are some situations where the ASTImporter will actually give us two
`Imported` calls with the same `To` Decl. One way where this happens is if the
ASTImporter decides to merge an imported definition into an already imported
one. Another way is that the ASTImporter just happens to get two calls to
`ASTImporter::Import` for the same Decl. This for example happens when importing
the DeclContext of a Decl requires importing the Decl itself, such as when
importing a RecordDecl that was declared inside a function.

The bug addressed in this patch is that when we end up getting two `Imported`
calls for the same `To` Decl, then we would crash in the
`CompleteTagDeclsScope`.  That's because the first time we complete the Decl we
remove the Origin tracking information (that maps the Decl back to from where it
came from). The next time we try to complete the same `To` Decl the Origin
tracking information is gone and we hit the `to_context_md->getOrigin(decl).ctx
== m_src_ctx` assert (`getOrigin(decl).ctx` is a nullptr the second time as the
Origin was deleted).

This is actually a regression coming from D72495. Before D72495
`m_decls_to_complete` was actually a set so every declaration in there could
only be queued once to be completed. The set was changed to a vector to make the
iteration over it deterministic, but that also causes that we now potentially
end up trying to complete a Decl twice.

This patch essentially just reverts D72495 and makes the `CompleteTagDeclsScope`
use a SetVector for the list of declarations to be completed. The SetVector
should filter out the duplicates (as the original `set` did) and also ensure that
the completion order is deterministic. I actually couldn't find any way to cause
LLDB to reproduce this bug by merging declarations (this would require that we
for example declare two namespaces in a non-top-level expression which isn't
possible). But the bug reproduces very easily by just declaring a class in an
expression, so that's what the test is doing.

Reviewed By: shafik

Differential Revision: https://reviews.llvm.org/D85648

(cherry picked from commit 7866b91)
(cherry picked from commit 5c61c46f5c5bc142d9047bdb5f39ee7f36962c3c)
…imported std C++ module

By now LLDB can import the 'std' C++ module to improve expression evaluation,
but there are still a few problems to solve before we can do this by default.
One is that importing the C++ module is slightly slower than normal expression
evaluation (mostly because the disk access and loading the initial lookup data
is quite slow in comparison to the barebone Clang setup the rest of the LLDB
expression evaluator is usually doing). Another problem is that some complicated
types in the standard library aren't fully supported yet by the ASTImporter, so
we end up types that fail to import (which usually appears to the user as if the
type is empty or there is just no result variable).

To still allow people to adopt this mode in their daily debugging, this patch
adds a setting that allows LLDB to automatically retry failed expression with a
loaded C++ module. All success expressions will behave exactly as they would do
before this patch. Failed expressions get a another parse attempt if we find a
usable C++ module in the current execution context. This way we shouldn't have
any performance/parsing regressions in normal debugging workflows, while the
debugging workflows involving STL containers benefit from the C++ module type
info.

This setting is off by default for now with the intention to enable it by
default on macOS soon-ish.

The implementation is mostly just extracting the existing parse logic into its
own function and then calling the parse function again if the first evaluation
failed and we have a C++ module to retry the parsing with.

Reviewed By: shafik, JDevlieghere, aprantl

Differential Revision: https://reviews.llvm.org/D92784

(cherry picked from commit 9586082)
…ang::GetForTarget

Also add some documentation while I'm at it.

(cherry picked from commit 594308c)
… imported from C++ modules.

Right now we have one large AST for all types in LLDB. All ODR violations in
types we reconstruct are resolved by just letting the ASTImporter handle the
conflicts (either by merging types or somehow trying to introduce a duplicated
declaration in the AST). This works ok for the normal types we build from debug
information as most of them are just simple CXXRecordDecls or empty template
declarations.

However, with a loaded `std` C++ module we have alternative versions of pretty
much all declarations in the `std` namespace that are much more fleshed out than
the debug information declarations. They have all the information that is lost
when converting to DWARF, such as default arguments, template default arguments,
the actual uninstantiated template declarations and so on.

When we merge these C++ module types into the big scratch AST (that might
already contain debug information types) we give the ASTImporter the tricky task
of somehow creating a consistent AST out of all these declarations. Usually this
ends in a messy AST that contains a mostly broken mix of both module and debug
info declarations. The ASTImporter in LLDB is also importing types with the
MinimalImport setting, which usually means the only information we have when
merging two types is often just the name of the declaration and the information
that it contains some child declarations. This makes it pretty much impossible
to even implement a better merging logic (as the names of C++ module
declarations and debug info declarations are identical).

This patch works around this whole merging problem by separating C++ module
types from debug information types. This is done by splitting up the single
scratch AST into two: One default AST for debug information and a dedicated AST
for C++ module types.

The C++ module AST is implemented as a 'specialised AST' that lives within the
default ScratchTypeSystemClang. When we select the scratch AST we can explicitly
request that we want such a isolated sub-AST of the scratch AST. I kept the
infrastructure more general as we probably can use the same mechanism for other
features that introduce conflicting types (such as programs that are compiled
with a custom -wchar-size= option).

There are just two places where we explicitly have request the C++ module AST:
When we export persistent declarations (`$mytype`) and when we create our
persistent result variable (`$0`, `$1`, ...). There are a few formatters that
were previously assuming that there is only one scratch AST which I cleaned up
in a preparation revision here (D92757).

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D92759

(cherry picked from commit 47e7ecd)
This enables the import-std-module setting's fallback mode by default. With
this all expressions are from no on retried with a loaded C++ module if they
fail to parse without.
@Teemperor
Copy link
Author

@swift-ci test

@fredriss fredriss merged commit 4903d45 into swiftlang:apple/stable/20200714 Dec 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants