Skip to content

[WIP][clangd] Resolve the dependent type from its single instantiation. Take 1 #71279

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

zyn0217
Copy link
Contributor

@zyn0217 zyn0217 commented Nov 4, 2023

This is an enhancement to the HeuristicResolver, trying to extract the deduced type from the single instantiation for a template. This partially addresses the point 1 from
clangd/clangd#1768.

This patch doesn't tackle CXXUnresolvedConstructExpr or similarities since I feel that is more arduous and would prefer to leave it for my future work.

@llvmbot
Copy link
Member

llvmbot commented Nov 4, 2023

@llvm/pr-subscribers-clangd

Author: Younan Zhang (zyn0217)

Changes

This is an enhancement to the HeuristicResolver, trying to extract the deduced type from the single instantiation for a template. This partially addresses the point #1 from
clangd/clangd#1768.

This patch doesn't tackle CXXUnresolvedConstructExpr or similarities since I feel that is more arduous and would prefer to leave it for my future work.


Full diff: https://github.com/llvm/llvm-project/pull/71279.diff

2 Files Affected:

  • (modified) clang-tools-extra/clangd/HeuristicResolver.cpp (+101)
  • (modified) clang-tools-extra/clangd/unittests/XRefsTests.cpp (+48)
diff --git a/clang-tools-extra/clangd/HeuristicResolver.cpp b/clang-tools-extra/clangd/HeuristicResolver.cpp
index 3c147b6b582bf0b..d3dced9b325367a 100644
--- a/clang-tools-extra/clangd/HeuristicResolver.cpp
+++ b/clang-tools-extra/clangd/HeuristicResolver.cpp
@@ -7,10 +7,14 @@
 //===----------------------------------------------------------------------===//
 
 #include "HeuristicResolver.h"
+#include "AST.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/CXXInheritance.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
 #include "clang/AST/DeclTemplate.h"
 #include "clang/AST/ExprCXX.h"
+#include "clang/AST/RecursiveASTVisitor.h"
 #include "clang/AST/Type.h"
 
 namespace clang {
@@ -46,6 +50,98 @@ const Type *resolveDeclsToType(const std::vector<const NamedDecl *> &Decls,
   return nullptr;
 }
 
+// Visitor that helps to extract deduced type from instantiated entities.
+// This merely performs the source location comparison against each Decl
+// until it finds a Decl with the same location as the
+// dependent one. Its associated type will then be extracted.
+struct InstantiatedDeclVisitor : RecursiveASTVisitor<InstantiatedDeclVisitor> {
+
+  InstantiatedDeclVisitor(NamedDecl *DependentDecl) : DependentDecl(DependentDecl) {}
+
+  bool shouldVisitTemplateInstantiations() const { return true; }
+
+  bool shouldVisitLambdaBody() const { return true; }
+
+  bool shouldVisitImplicitCode() const { return true; }
+
+  template <typename D>
+  bool onDeclVisited(D *MaybeInstantiated) {
+    if (MaybeInstantiated->getDeclContext()->isDependentContext())
+      return true;
+    auto *Dependent = dyn_cast<D>(DependentDecl);
+    if (!Dependent)
+      return true;
+    auto LHS = MaybeInstantiated->getTypeSourceInfo(),
+         RHS = Dependent->getTypeSourceInfo();
+    if (!LHS || !RHS)
+      return true;
+    if (LHS->getTypeLoc().getSourceRange() !=
+        RHS->getTypeLoc().getSourceRange())
+      return true;
+    DeducedType = MaybeInstantiated->getType();
+    return false;
+  }
+
+  bool VisitFieldDecl(FieldDecl *FD) {
+    return onDeclVisited(FD);
+  }
+
+  bool VisitVarDecl(VarDecl *VD) {
+    return onDeclVisited(VD);
+  }
+
+  NamedDecl *DependentDecl;
+  QualType DeducedType;
+};
+
+/// Attempt to resolve the dependent type from the surrounding context for which
+/// a single instantiation is available.
+const Type *
+resolveTypeFromInstantiatedTemplate(const CXXDependentScopeMemberExpr *Expr) {
+  if (Expr->isImplicitAccess())
+    return nullptr;
+
+  auto *Base = Expr->getBase();
+  NamedDecl *ND = nullptr;
+  if (auto *CXXMember = dyn_cast<MemberExpr>(Base))
+    ND = CXXMember->getMemberDecl();
+
+  if (auto *DRExpr = dyn_cast<DeclRefExpr>(Base))
+    ND = DRExpr->getFoundDecl();
+
+  // FIXME: Handle CXXUnresolvedConstructExpr. This kind of type doesn't have
+  // available Decls to be matched against. Which inhibits the current heuristic
+  // from resolving expressions such as `T().fo^o()`, where T is a
+  // single-instantiated template parameter.
+  if (!ND)
+    return nullptr;
+
+  NamedDecl *Instantiation = nullptr;
+
+  // Find out a single instantiation that we can start with. The enclosing
+  // context for the current Decl might not be a templated entity (e.g. a member
+  // function inside a class template), hence we shall walk up the decl
+  // contexts first.
+  for (auto *EnclosingContext = ND->getDeclContext(); EnclosingContext;
+       EnclosingContext = EnclosingContext->getParent()) {
+    if (auto *ND = dyn_cast<NamedDecl>(EnclosingContext)) {
+      Instantiation = getOnlyInstantiation(ND);
+      if (Instantiation)
+        break;
+    }
+  }
+
+  if (!Instantiation)
+    return nullptr;
+
+  // This will traverse down the instantiation entity, visit each Decl, and
+  // extract the deduced type for the undetermined Decl `ND`.
+  InstantiatedDeclVisitor Visitor(ND);
+  Visitor.TraverseDecl(Instantiation);
+
+  return Visitor.DeducedType.getTypePtrOrNull();
+}
+
 } // namespace
 
 // Helper function for HeuristicResolver::resolveDependentMember()
@@ -150,6 +246,11 @@ std::vector<const NamedDecl *> HeuristicResolver::resolveMemberExpr(
   if (ME->isArrow()) {
     BaseType = getPointeeType(BaseType);
   }
+
+  if (BaseType->isDependentType())
+    if (auto *MaybeResolved = resolveTypeFromInstantiatedTemplate(ME))
+      BaseType = MaybeResolved;
+
   if (!BaseType)
     return {};
   if (const auto *BT = BaseType->getAs<BuiltinType>()) {
diff --git a/clang-tools-extra/clangd/unittests/XRefsTests.cpp b/clang-tools-extra/clangd/unittests/XRefsTests.cpp
index f53cbf01b7992c8..ead24dec575de0d 100644
--- a/clang-tools-extra/clangd/unittests/XRefsTests.cpp
+++ b/clang-tools-extra/clangd/unittests/XRefsTests.cpp
@@ -1222,6 +1222,54 @@ TEST(LocateSymbol, TextualSmoke) {
                         hasID(getSymbolID(&findDecl(AST, "MyClass"))))));
 }
 
+TEST(LocateSymbol, DeduceDependentTypeFromSingleInstantiation) {
+  Annotations T(R"cpp(
+    struct WildCat {
+      void $wild_meow[[meow]]();
+    };
+
+    struct DomesticCat {
+      void $domestic_meow[[meow]]();
+    };
+
+    template <typename Ours>
+    struct Human {
+      template <typename Others>
+      void feed(Others O) {
+        O.me$1^ow();
+        Others Child;
+        Child.me$2^ow();
+        // FIXME: Others().me^ow();
+        Ours Baby;
+        Baby.me$3^ow();
+        // struct Inner {
+        //   Ours Pet;
+        // };
+        // Inner().Pet.me^ow();
+        auto Lambda = [](auto C) {
+          C.me$4^ow();
+        };
+        Lambda(Others());
+      }
+    };
+
+    void foo() {
+      Human<DomesticCat>().feed(WildCat());
+    }
+  )cpp");
+
+  auto TU = TestTU::withCode(T.code());
+  auto AST = TU.build();
+  EXPECT_THAT(locateSymbolAt(AST, T.point("1")),
+              ElementsAre(sym("meow", T.range("wild_meow"), std::nullopt)));
+  EXPECT_THAT(locateSymbolAt(AST, T.point("2")),
+              ElementsAre(sym("meow", T.range("wild_meow"), std::nullopt)));
+  EXPECT_THAT(locateSymbolAt(AST, T.point("3")),
+              ElementsAre(sym("meow", T.range("domestic_meow"), std::nullopt)));
+  EXPECT_THAT(locateSymbolAt(AST, T.point("4")),
+              ElementsAre(sym("meow", T.range("wild_meow"), std::nullopt)));
+}
+
 TEST(LocateSymbol, Textual) {
   const char *Tests[] = {
       R"cpp(// Comment

Comment on lines +112 to +126
// FIXME: Handle CXXUnresolvedConstructExpr. This kind of type doesn't have
// available Decls to be matched against. Which inhibits the current heuristic
// from resolving expressions such as `T().fo^o()`, where T is a
// single-instantiated template parameter.
if (!ND)
return nullptr;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFC: Do we have a neat approach to deal with such an expression? One approach that comes to mind is:

  1. Extract the TemplateTypeParmDecl (TTPD) from the expression.

  2. Traverse every Decls inside the enclosing DeclContext for TTPD until we find a template entity whose template parameters contain TTPD.

  3. Find the instantiation for that entity. If succeeds, perform the similar steps below to extract the Stmt corresponding to the dependent expression (i.e. our interest).

  4. Figure out a way to extract the type from that expression, which can be intricate, as we don't know to which Decl the dependent expression is tied.

Copy link

github-actions bot commented Nov 4, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 99eb843c8bda5d636338edfd4f67728b6573a2a9 09f0acb3bdfe64cbc09116cd318f56cef242cc7c -- clang-tools-extra/clangd/AST.cpp clang-tools-extra/clangd/AST.h clang-tools-extra/clangd/FindTarget.cpp clang-tools-extra/clangd/FindTarget.h clang-tools-extra/clangd/HeuristicResolver.cpp clang-tools-extra/clangd/HeuristicResolver.h clang-tools-extra/clangd/Hover.cpp clang-tools-extra/clangd/InlayHints.cpp clang-tools-extra/clangd/ParsedAST.cpp clang-tools-extra/clangd/ParsedAST.h clang-tools-extra/clangd/SemanticHighlighting.cpp clang-tools-extra/clangd/XRefs.cpp clang-tools-extra/clangd/refactor/Rename.cpp clang-tools-extra/clangd/refactor/tweaks/DefineInline.cpp clang-tools-extra/clangd/refactor/tweaks/DefineOutline.cpp clang-tools-extra/clangd/refactor/tweaks/ExtractFunction.cpp clang-tools-extra/clangd/unittests/ASTTests.cpp clang-tools-extra/clangd/unittests/FindTargetTests.cpp clang-tools-extra/clangd/unittests/XRefsTests.cpp
View the diff from clang-format here.
diff --git a/clang-tools-extra/clangd/AST.cpp b/clang-tools-extra/clangd/AST.cpp
index d42dbf3163..120ba7a2b9 100644
--- a/clang-tools-extra/clangd/AST.cpp
+++ b/clang-tools-extra/clangd/AST.cpp
@@ -744,8 +744,7 @@ getOnlyInstantiatedNode(const DeclContext *StartingPoint,
   }
 
   NamedDecl *InstantiatedEnclosingDecl = nullptr;
-  for (auto *DC = StartingPoint; DC;
-       DC = DC->getParent()) {
+  for (auto *DC = StartingPoint; DC; DC = DC->getParent()) {
     auto *ND = llvm::dyn_cast<NamedDecl>(DC);
     if (!ND)
       continue;
@@ -836,7 +835,6 @@ getOnlyInstantiationForMemberFunction(const CXXMethodDecl *TemplatedDecl) {
         return MD;
       return nullptr;
     }
-
   };
   return Visitor(TemplatedDecl).Visit(OuterTemplate);
 }
diff --git a/clang-tools-extra/clangd/HeuristicResolver.cpp b/clang-tools-extra/clangd/HeuristicResolver.cpp
index 072ff4cf13..98fc83fa73 100644
--- a/clang-tools-extra/clangd/HeuristicResolver.cpp
+++ b/clang-tools-extra/clangd/HeuristicResolver.cpp
@@ -260,7 +260,8 @@ std::vector<const NamedDecl *> HeuristicResolver::resolveMemberExpr(
     return {};
 
   if (BaseType->isDependentType())
-    if (auto *MaybeResolved = resolveTypeFromInstantiatedTemplate(EnclosingDecl, ME))
+    if (auto *MaybeResolved =
+            resolveTypeFromInstantiatedTemplate(EnclosingDecl, ME))
       BaseType = MaybeResolved;
 
   if (const auto *BT = BaseType->getAs<BuiltinType>()) {
diff --git a/clang-tools-extra/clangd/HeuristicResolver.h b/clang-tools-extra/clangd/HeuristicResolver.h
index 16650e503a..261dcff138 100644
--- a/clang-tools-extra/clangd/HeuristicResolver.h
+++ b/clang-tools-extra/clangd/HeuristicResolver.h
@@ -45,7 +45,8 @@ namespace clangd {
 // not a specialization. More advanced heuristics may be added in the future.
 class HeuristicResolver {
 public:
-  HeuristicResolver(const ASTContext &Ctx, const DeclContext *EnclosingDecl = nullptr)
+  HeuristicResolver(const ASTContext &Ctx,
+                    const DeclContext *EnclosingDecl = nullptr)
       : Ctx(Ctx), EnclosingDecl(EnclosingDecl) {}
 
   // Try to heuristically resolve certain types of expressions, declarations, or
diff --git a/clang-tools-extra/clangd/ParsedAST.h b/clang-tools-extra/clangd/ParsedAST.h
index 1f1425a995..b2ae420e7d 100644
--- a/clang-tools-extra/clangd/ParsedAST.h
+++ b/clang-tools-extra/clangd/ParsedAST.h
@@ -24,13 +24,13 @@
 #include "Compiler.h"
 #include "Diagnostics.h"
 #include "Headers.h"
+#include "HeuristicResolver.h"
 #include "Preamble.h"
 #include "clang-include-cleaner/Record.h"
 #include "support/Path.h"
 #include "clang/Frontend/FrontendAction.h"
 #include "clang/Lex/Preprocessor.h"
 #include "clang/Tooling/Syntax/Tokens.h"
-#include "HeuristicResolver.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/StringRef.h"
 #include <memory>
diff --git a/clang-tools-extra/clangd/SemanticHighlighting.cpp b/clang-tools-extra/clangd/SemanticHighlighting.cpp
index d5cd8a3c08..21b347be5f 100644
--- a/clang-tools-extra/clangd/SemanticHighlighting.cpp
+++ b/clang-tools-extra/clangd/SemanticHighlighting.cpp
@@ -419,8 +419,7 @@ public:
   HighlightingsBuilder(const ParsedAST &AST, const HighlightingFilter &Filter,
                        const HeuristicResolver &Resolver)
       : TB(AST.getTokens()), SourceMgr(AST.getSourceManager()),
-        LangOpts(AST.getLangOpts()), Filter(Filter),
-        Resolver(Resolver) {}
+        LangOpts(AST.getLangOpts()), Filter(Filter), Resolver(Resolver) {}
 
   HighlightingToken &addToken(SourceLocation Loc, HighlightingKind Kind) {
     auto Range = getRangeForSourceLocation(Loc);

@@ -150,6 +246,11 @@ std::vector<const NamedDecl *> HeuristicResolver::resolveMemberExpr(
if (ME->isArrow()) {
BaseType = getPointeeType(BaseType);
}

if (BaseType->isDependentType())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check BaseType before access (line 254)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I should assume the non-nullity for BaseType: All of the construction points appear to create expressions with valid base types. Except for ASTImporter and ASTReader, where an empty expression may be created. But would we somehow end up resolving expressions created from these two places? I suppose we might encounter these inside a preamble, but I don't think this is where the heuristic resolver would work.

Anyway, I moved this statement after the non-null check. Thanks for spotting it!

Copy link
Collaborator

@HighCommander4 HighCommander4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks very much for looking at this!

I'm wondering if it's possible to achieve the same results (and also solve your CXXUnresolvedConstructExpr issue) with a more targeted change that may also be simpler.

While I haven't experimented with this approach to verify that it's feasible, here's what I had in mind:

  • In a suitable place where we simplify dependent types (I'm thinking around here), check if the type is a TemplateTypeParmType.
  • If so, get its TemplateTypeParmDecl, find the template whose parameter it is, and see if the template has a getOnlyInstantiation. If so, find the argument type substituted for this template parameter, and replace the template parameter with this type.

As a bonus, the "default argument heuristic" discussed here could be implemented in the same place. (Even if the template doesn't have exactly one instantiation, if the template parameter has a default argument, replace the template parameter with the default argument.)

I think this should handle the cases in this testcase without the need for any Visitor, though it's definitely possible I'm overlooking something.

@zyn0217
Copy link
Contributor Author

zyn0217 commented Nov 13, 2023

If so, get its TemplateTypeParmDecl, find the template whose parameter it is, and see if the template has a getOnlyInstantiation.

Thanks for the suggestion! Admittedly, it looks neat and terse. However, I suspect it won't work with the present TemplateTypeParm{Type,Decl} models. TemplateTypeParmDecl per se is not tied to any CXXRecordDecl nor FunctionDecl that the template parameter is declaring with. (For the most seemingly-relevant part, i.e. the DeclContext it records, is the scope that the template defines.)

If so, find the argument type substituted for this template parameter, and replace the template parameter with this type.

I was thinking this way before getting into the details :-). Soon I realized that we had to tackle synthesized types e.g.

template <template <class, int> typename C, typename D, int E>
void work() {
  C<D, E> complicated_type;
  // ^ Shall we perform the type substitution once again? ;)
}

@HighCommander4
Copy link
Collaborator

HighCommander4 commented Nov 20, 2023

I suspect it won't work with the present TemplateTypeParm{Type,Decl} models. TemplateTypeParmDecl per se is not tied to any CXXRecordDecl nor FunctionDecl that the template parameter is declaring with. (For the most seemingly-relevant part, i.e. the DeclContext it records, is the scope that the template defines.)

I've done some local experimentation, and what I'm seeing is that TemplateTypeParmDecl::getDeclContext() does return the FunctionDecl or CXXRecordDecl which owns the template parameter.

One tricky thing I found is that when starting from a template parameter type, it's important to use dyn_cast<TemplateTypeParmType>(T) rather than T->getAs<TemplateTypeParmType>(); the latter does more than just cast, it returns the canonical type whose associated Decl is null.

I also discovered some complications related to nested templates, and I have some thoughts on how to resolve them, but I'm going to suggest that we start with getting a simple case (e.g. just one level of templates, no nested templates) to work, and then tackle nested templates as a next step.

If so, find the argument type substituted for this template parameter, and replace the template parameter with this type.

I was thinking this way before getting into the details :-). Soon I realized that we had to tackle synthesized types e.g.

template <template <class, int> typename C, typename D, int E>
void work() {
  C<D, E> complicated_type;
  // ^ Shall we perform the type substitution once again? ;)
}

I'm not too concerned about this. The idea behind HeuristicResolver is to perform name lookups in the primary template as a heuristic.

So, to give an example:

template <typename T>
void work() {
  std::vector<T> a;
  a.pop_back();
}

here, pop_back is looked up in the primary template scope of std::vector; information from the argument list <T> is not used.

Now, applying this to your example:

template <template <typename> class Container, typename T>
void work() {
  Container<T> a;
  a.pop_back();
}

// work() has one instantiation, with [Container = std::vector, T = bar] 

applying the "only instantiation" heuristic will tell us that Container = std::vector, and then we can look up pop_back in the primary template scope of std::vector; we still don't use information from the argument list <T>.

We could potentially envision future enhancements to make use of information from the "only instantiation" to make lookup inside template specialization types more accurate than just "use the primary template" (which could then apply to the std::vector<T> case as well, and e.g. return a more accurate result if the "only instantiation" is T = bool), but that seems like more of an edge case and I'd rather leave it to a future change.

@HighCommander4
Copy link
Collaborator

I thought some more about this.

I appreciate better now the question you asked here:

I realized that we had to tackle synthesized types e.g.

template <template <class, int> typename C, typename D, int E>
void work() {
  C<D, E> complicated_type;
  // ^ Shall we perform the type substitution once again? ;)
}

My previous suggestion side-steps this by saying that we're only going to do lookups in the primary template scope of the argument for C, so we don't need to worry about substituting the arguments for D and E.

However, if in the future we want to be more accurate and e.g. handle cases where a template specialization resolves to something different than the primary template (e.g. a partial specialization), then your question becomes relevant: do we perform type substitution again with the arguments for D and E, or do we just fish out the already-substituted type from the "only instantiation"?

I realized that this consideration actually applies to any other "resolution" step performed in HeuristicResolver, including steps it already does today such as name lookup.

To illustrate, let's go back to the vector example:

template <typename T>
void work() {
  std::vector<T> a;
  a.pop_back();
}

To resolve the dependent member expression a.pop_back, HeuristicResolver currently performs name lookup in the primary template scope of std::vector for the name pop_back.

The name lookup step could also be avoided in principle if we had an "only instantiation": if we find the non-dependent MemberExpr that a.pop_back instantiates to in the "only instantiation", that will point directly to a CXXMethoDecl, and we have our result without having done the work of a name lookup in the primary template scope.

And this method can lead to more accurate results, e.g. if instead of pop_back we have an overloaded method name, the name lookup in the primary template scope will just return all overloads (since we're not using information about the argument types), whereas the instantiated expression in the "only instantiation" will point to the correct overload resolution result.


So, I think we could consider two alternative approaches to this:

Approach 1: What I suggested in the previous review: a limited use of the "only instantiation" in HeuristicResolver to map template parameter types to the type of their argument in the "only instantiation".

Approach 2: If there is an "only instantiation", then resolve any expression, type, etc. in the template body to the corresponding expression, type, etc. in the "only instantiation".

The implementation of "Approach 2" could look similar to the current InstantiatedDeclVisitor in the patch, except the node type we are searching for is not limited to a Decl, it can also be other node types like Stmt etc.

"Approach 2" can be largely independent of HeuristicResolver: since we are finding a concrete result for the thing we're trying to resolve in the "only instantiation", no further resolution steps (e.g. name lookup, or anything else that HeuristicResolver does) should be necessary. However, the entry point to the check can be the same (e.g. for resolving a dependent member expr, we'd want to do the "only instantiation" check in the same cases where we call HeuristicResolver::resolveMemberExpr() from external code), so the functionality could still live in HeuristicResolver for organizational purposes.

@zyn0217
Copy link
Contributor Author

zyn0217 commented Nov 26, 2023

Thank you again for reconsidering about this, and sorry for not responding for over a week.

Just now, I replicated the experimentation

I've done some local experimentation, and what I'm seeing is that TemplateTypeParmDecl::getDeclContext() does return the FunctionDecl or CXXRecordDecl which owns the template parameter.

and had to admit that I was wrong previously. Something must be wrong with my recollection, but I could clearly remember that I'd run into the same pitfall at the first time. (I was about to write the comment to highlight this, but soon, I gave up taking this approach since it looks more constricted than I thought. And here I'm explaining it.)

One tricky thing I found is that when starting from a template parameter type, it's important to use dyn_cast(T) rather than T->getAs(); the latter does more than just cast, it returns the canonical type whose associated Decl is null.

The contrived example involving template template parameter is one obstruction; Another more common example could be such. (Which is simplified from the test case):

template <typename T>
struct Outer {
  template <typename U>
  void foo(U arg) {
     arg.fo^o();
  }
};

struct Widget {
  void foo();
};

Outer<int>().foo(Widget());

Here, we want to find the only instantiation before locating foo(). Which is Outer<int>::foo<Widget>. To locate that, we shall find the templated Decl for foo, which could be done by conducting the ParmVarDecl -> TemplateParmVarDecl -> getDeclContext -> FunctionTemplateDecl dance. Everything goes well until the specialization set for the FunctionTemplateDecl is empty. Hence, our getOnlyInstantiation returns null, which is crazy!

This is because when clang performs the instantiation for member templates, it creates a new FunctionTemplateDecl from the primary template (i.e. where we start from) and links the instantiation declaration to that new Decl.

if (TemplateParams) {
// Our resulting instantiation is actually a function template, since we
// are substituting only the outer template parameters. For example, given
//
// template<typename T>
// struct X {
// template<typename U> friend void f(T, U);
// };
//
// X<int> x;
//
// We are instantiating the friend function template "f" within X<int>,
// which means substituting int for T, but leaving "f" as a friend function
// template.
// Build the function template itself.
FunctionTemplate = FunctionTemplateDecl::Create(SemaRef.Context, DC,
Function->getLocation(),
Function->getDeclName(),
TemplateParams, Function);
Function->setDescribedFunctionTemplate(FunctionTemplate);

And what makes it worse is, as of now, we don't have any approach to retrieve the new templated Decl from the primary Decl:

TemplateDeclInstantiator::InitMethodInstantiation(CXXMethodDecl *New,
CXXMethodDecl *Tmpl) {
if (InitFunctionInstantiation(New, Tmpl))
return true;
if (isa<CXXDestructorDecl>(New) && SemaRef.getLangOpts().CPlusPlus11)
SemaRef.AdjustDestructorExceptionSpec(cast<CXXDestructorDecl>(New));
New->setAccess(Tmpl->getAccess());
if (Tmpl->isVirtualAsWritten())
New->setVirtualAsWritten(true);
// FIXME: New needs a pointer to Tmpl
return false;
}

("// FIXME: New needs a pointer to Tmpl", I was quite frustrated when I read it!)

(This patch first walks up the DeclContext to collect the very outer Decl for templates; then, the visitor could access the later-created template decl and its instantiation from the enclosing template. Again, I didn't realize the visitor resolved this issue this way (by accident), so I'm sorry for not being so straightforward.)

The implementation of "Approach 2" could look similar to the current InstantiatedDeclVisitor in the patch, except the node type we are searching for is not limited to a Decl, it can also be other node types like Stmt etc.

Exactly. And this is why the patch is marked as "Take 1" :)

However, the entry point to the check can be the same (e.g. for resolving a dependent member expr, we'd want to do the "only instantiation" check in the same cases where we call HeuristicResolver::resolveMemberExpr() from external code), so the functionality could still live in HeuristicResolver for organizational purposes.

Yeah, I'd love to extend the HeuristicResolver to make it more accurate and not just dependent on the mere name lookup.

@zyn0217
Copy link
Contributor Author

zyn0217 commented Nov 26, 2023

I also discovered some complications related to nested templates, and I have some thoughts on how to resolve them, but I'm going to suggest that we start with getting a simple case (e.g. just one level of templates, no nested templates) to work, and then tackle nested templates as a next step.

Oops, I replied too quickly and I should have noticed this paragraph. Does my previous comment elaborate on the same issue you've encountered? Could you kindly share your thoughts more?

@HighCommander4
Copy link
Collaborator

HighCommander4 commented Nov 27, 2023

Thinking some more about "Approach 2", I realized there is a challenge with it: given the node we are trying to resolve (for example, a dependent member expr) as a starting point, we need to find the enclosing template decl, so that we can query it for its "only instantiation".

The current patch does this by introspecting the starting node, and in a subset of cases (base expression is MemberExpr or DeclRefExpr) extracting a Decl from it, and then walking its DeclContext chain to find the template decl.

However, we would ideally like our "only instantiation" heuristic to work for any AST node inside a template (and so, for dependent member exprs, we would like it to work for any base expression type, not just specific ones).

In principle, it could be possible to extend the introspection approach to handle all expression types, since if an expression is dependent, it will ultimately reference a template parameter (or a Decl whose type references a template parameter, etc.) However, I feel like this may be a lot of work to implement and could be error-prone.

I wonder if we could do something easier: require that the caller pass in a DeclContext* which it knows to be enclosing the node we are trying to resolve (and then we can walk the chain from there to find the template).

Editor operations like go-to-def which involve taking an action at a cursor position use SelectionTree to "hit-test" the node under the cursor. SelectionTree exposes the chain of nodes that was hit (see e.g. SelectionTree::print() which prints the chain), so we can recover an enclosing DeclContext* from there; we'd just need to plumb the information through to HeuristicResolver.

In the future when we want to use HeuristicResolver for code completion, it can get the enclosing DeclContext* even more easily (Sema::CurContext).

@HighCommander4
Copy link
Collaborator

HighCommander4 commented Nov 27, 2023

I also discovered some complications related to nested templates [...]

Does my previous comment elaborate on the same issue you've encountered?

I believe so, yes.

Could you kindly share your thoughts more?

Let's consider a nested template like this:

template <typename T>
struct Outer {
  template <typename U>
  void inner(U u) {
    u.foo();
  }
}

I believe our heuristic needs to require two things:

  1. Outer has an "only instantiation", say Outer<A>
  2. Outer<A>::inner also has an "only instantiation", say Outer<A>::inner<B>

Now, suppose we have written an algorithm for "given a template X, and some AST node N inside the body of X, find the corresponding AST node inside an instantiation X<C>".

Then, I think we can proceed as follows:

  1. Start with the expression u.foo in the uninstantiated template
  2. Get the enclosing template decl, it's Outer<T>::inner
  3. (As you observed, Outer<T>::inner has no "only instantiation", only Outer<A>::inner does.)
  4. Walk the DeclContext chain further to see if there are any further enclosing templates. We find Outer
  5. Check Outer for an "only instantiation". It has one, Outer<A>!
  6. Run our algorithm with X = Outer, N = Outer<T>::inner, C = A. It finds Outer<A>::inner
  7. Check Outer<A>::inner for an "only instantiation". It has one, Outer<A>::inner<B>!
  8. Run our algorithm with X = Outer<A>::inner, N = u.foo, C = B. It finds the concrete MemberExpr which u.foo instantiated to inside Outer<A>::inner<B>.

@zyn0217
Copy link
Contributor Author

zyn0217 commented Dec 9, 2023

@HighCommander4 While at the HeuristicResolver, I think we may have a bug inside HighlightingsBuilder defined in SemanticHighlighting.cpp.

const HeuristicResolver *Resolver = nullptr;

I didn't see any other lines initializing this member. so we're always yielding nullptr from H.getResolver calls?

@HighCommander4
Copy link
Collaborator

@HighCommander4 While at the HeuristicResolver, I think we may have a bug inside HighlightingsBuilder defined in SemanticHighlighting.cpp.

const HeuristicResolver *Resolver = nullptr;

I didn't see any other lines initializing this member. so we're always yielding nullptr from H.getResolver calls?

Thanks for spotting that. Posted #74971 to fix.

…ke 1

This is an enhancement to the HeuristicResolver, trying to extract
the deduced type from the single instantiation for a template. This
partially addresses the point llvm#1 from
clangd/clangd#1768.

This patch doesn't tackle CXXUnresolvedConstructExpr or similarities
since I feel that is more arduous and would prefer to leave it for
my future work.
@zyn0217
Copy link
Contributor Author

zyn0217 commented Dec 24, 2023

Another issue I've encountered while at it is that, given the following code,

void foo();
auto lambda = [] {
  return ^foo();
};

let N represent the selection node for the expression foo(), N.getDeclContext() then yields TranslationUnitDecl rather than the CXXRecordDeclof the lambda expression.

I presume there's something wrong at SelectionTree::Node::getDeclContext(), because lambda expressions would slip through the current traversal logic.

`-LambdaExpr <col:15, line:4:1> '(lambda at line:2:15)'    <-- parent 3; not a Decl thus ignored!
  |-CXXRecordDecl <line:2:15> col:15 implicit class definition
  | |-DefinitionData
  | | |-DefaultConstructor
  | | |-CopyConstructor
  | | |-MoveConstructor
  | | |-CopyAssignment
  | | |-MoveAssignment
  | | `-Destructor
  | |-CXXMethodDecl <col:16, line:4:1> line:2:15 constexpr operator() 'auto () const -> void' inline
  | | `-CompoundStmt <col:18, line:4:1>
  | |   `-ReturnStmt <line:3:3, col:14>
  | |     `-CallExpr <col:10, col:14> 'void'
  | |       `-ImplicitCastExpr <col:10> 'void (*)()' <FunctionToPointerDecay>
  | |         `-DeclRefExpr <col:10> 'void ()' lvalue Function 0xb76c1c8 'foo' 'void ()'
  | |-CXXConversionDecl <line:2:15, line:4:1> line:2:15 implicit constexpr operator void (*)() 'auto (*() const noexcept)() -> void' inline
  | |-CXXMethodDecl <col:15, line:4:1> line:2:15 implicit __invoke 'auto () -> void' static inline
  | `-CXXDestructorDecl <col:15> col:15 implicit referenced constexpr ~(lambda at line:2:15) 'void () noexcept' inline default trivial
  `-CompoundStmt <col:18, line:4:1>       <--parent 2
    `-ReturnStmt <line:3:3, col:14>           <-- parent 1
      `-CallExpr <col:10, col:14> 'void'       <-- starting point
        `-ImplicitCastExpr <col:10> 'void (*)()' <FunctionToPointerDecay>
          `-DeclRefExpr <col:10> 'void ()' lvalue Function 0xb76c1c8 'foo' 'void ()'

const DeclContext &SelectionTree::Node::getDeclContext() const {
for (const Node *CurrentNode = this; CurrentNode != nullptr;
CurrentNode = CurrentNode->Parent) {
if (const Decl *Current = CurrentNode->ASTNode.get<Decl>()) {
if (CurrentNode != this)
if (auto *DC = dyn_cast<DeclContext>(Current))
return *DC;
return *Current->getLexicalDeclContext();
}
}
llvm_unreachable("A tree must always be rooted at TranslationUnitDecl.");
}


Proposed #76329 for it.

@zyn0217 zyn0217 changed the title [clangd] Resolve the dependent type from its single instantiation. Take 1 [WIP][clangd] Resolve the dependent type from its single instantiation. Take 1 Dec 24, 2023
@zyn0217 zyn0217 marked this pull request as draft December 24, 2023 10:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants