Skip to content

[clangd] Store documentation when indexing standard library #133681

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 5, 2025

Conversation

HighCommander4
Copy link
Collaborator

@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-clangd

@llvm/pr-subscribers-clang-tools-extra

Author: Nathan Ridge (HighCommander4)

Changes

Fixes clangd/clangd#2344


Full diff: https://github.com/llvm/llvm-project/pull/133681.diff

6 Files Affected:

  • (modified) clang-tools-extra/clangd/index/Background.cpp (+1)
  • (modified) clang-tools-extra/clangd/index/IndexAction.cpp (-1)
  • (modified) clang-tools-extra/clangd/indexer/IndexerMain.cpp (+1)
  • (modified) clang-tools-extra/clangd/unittests/StdLibTests.cpp (+37)
  • (modified) clang-tools-extra/clangd/unittests/SyncAPI.cpp (+7)
  • (modified) clang-tools-extra/clangd/unittests/SyncAPI.h (+3)
diff --git a/clang-tools-extra/clangd/index/Background.cpp b/clang-tools-extra/clangd/index/Background.cpp
index 496d1455def4b..8013e9ea86112 100644
--- a/clang-tools-extra/clangd/index/Background.cpp
+++ b/clang-tools-extra/clangd/index/Background.cpp
@@ -306,6 +306,7 @@ llvm::Error BackgroundIndex::index(tooling::CompileCommand Cmd) {
     return true;
   };
   IndexOpts.CollectMainFileRefs = true;
+  IndexOpts.StoreAllDocumentation = false;
 
   IndexFileIn Index;
   auto Action = createStaticIndexingAction(
diff --git a/clang-tools-extra/clangd/index/IndexAction.cpp b/clang-tools-extra/clangd/index/IndexAction.cpp
index ed56c2a9d2e81..cec5f4558455c 100644
--- a/clang-tools-extra/clangd/index/IndexAction.cpp
+++ b/clang-tools-extra/clangd/index/IndexAction.cpp
@@ -223,7 +223,6 @@ std::unique_ptr<FrontendAction> createStaticIndexingAction(
   Opts.CollectIncludePath = true;
   if (Opts.Origin == SymbolOrigin::Unknown)
     Opts.Origin = SymbolOrigin::Static;
-  Opts.StoreAllDocumentation = false;
   if (RefsCallback != nullptr) {
     Opts.RefFilter = RefKind::All;
     Opts.RefsInHeaders = true;
diff --git a/clang-tools-extra/clangd/indexer/IndexerMain.cpp b/clang-tools-extra/clangd/indexer/IndexerMain.cpp
index bc5d1a7408991..806734f6ad40e 100644
--- a/clang-tools-extra/clangd/indexer/IndexerMain.cpp
+++ b/clang-tools-extra/clangd/indexer/IndexerMain.cpp
@@ -55,6 +55,7 @@ class IndexActionFactory : public tooling::FrontendActionFactory {
   std::unique_ptr<FrontendAction> create() override {
     SymbolCollector::Options Opts;
     Opts.CountReferences = true;
+    Opts.StoreAllDocumentation = false;
     Opts.FileFilter = [&](const SourceManager &SM, FileID FID) {
       const auto F = SM.getFileEntryRefForID(FID);
       if (!F)
diff --git a/clang-tools-extra/clangd/unittests/StdLibTests.cpp b/clang-tools-extra/clangd/unittests/StdLibTests.cpp
index a7a33f78303d3..00c6d629e1c25 100644
--- a/clang-tools-extra/clangd/unittests/StdLibTests.cpp
+++ b/clang-tools-extra/clangd/unittests/StdLibTests.cpp
@@ -158,6 +158,43 @@ TEST(StdLibTests, EndToEnd) {
       UnorderedElementsAre(StdlibSymbol("list"), StdlibSymbol("vector")));
 }
 
+TEST(StdLibTests, StdLibDocComments) {
+  Config Cfg;
+  Cfg.Index.StandardLibrary = true;
+  WithContextValue Enabled(Config::Key, std::move(Cfg));
+
+  MockFS FS;
+  FS.Files["stdlib/vector"] = R"cpp(
+    namespace std {
+      template <typename T>
+      class vector {
+      public:
+        /**doc comment*/
+        unsigned int size() const;
+      };
+    }
+  )cpp";
+  MockCompilationDatabase CDB;
+  CDB.ExtraClangFlags.push_back("-isystem" + testPath("stdlib"));
+  ClangdServer::Options Opts = ClangdServer::optsForTest();
+  Opts.BuildDynamicSymbolIndex = true; // also used for stdlib index
+  ClangdServer Server(CDB, FS, Opts);
+
+  Annotations A(R"cpp(
+    #include <vector>
+    void foo() {
+      std::vector<int> v;
+      v.si^ze();
+    }
+  )cpp");
+
+  Server.addDocument(testPath("foo.cc"), A.code());
+  ASSERT_TRUE(Server.blockUntilIdleForTest());
+  auto HI = cantFail(runHover(Server, testPath("foo.cc"), A.point()));
+  EXPECT_TRUE(HI.has_value());
+  EXPECT_EQ(HI->Documentation, "doc comment");
+}
+
 } // namespace
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/unittests/SyncAPI.cpp b/clang-tools-extra/clangd/unittests/SyncAPI.cpp
index d48622eba5378..00bec7afd1a98 100644
--- a/clang-tools-extra/clangd/unittests/SyncAPI.cpp
+++ b/clang-tools-extra/clangd/unittests/SyncAPI.cpp
@@ -68,6 +68,13 @@ template <typename T> CaptureProxy<T> capture(std::optional<T> &Target) {
 }
 } // namespace
 
+llvm::Expected<std::optional<HoverInfo>> runHover(ClangdServer &Server,
+                                                  PathRef File, Position Pos) {
+  std::optional<llvm::Expected<std::optional<HoverInfo>>> HI;
+  Server.findHover(File, Pos, capture(HI));
+  return std::move(*HI);
+}
+
 llvm::Expected<CodeCompleteResult>
 runCodeComplete(ClangdServer &Server, PathRef File, Position Pos,
                 clangd::CodeCompleteOptions Opts) {
diff --git a/clang-tools-extra/clangd/unittests/SyncAPI.h b/clang-tools-extra/clangd/unittests/SyncAPI.h
index cf3de4f742e84..e0c7c4d72e73e 100644
--- a/clang-tools-extra/clangd/unittests/SyncAPI.h
+++ b/clang-tools-extra/clangd/unittests/SyncAPI.h
@@ -29,6 +29,9 @@ void runAddDocument(ClangdServer &Server, PathRef File, StringRef Contents,
                     WantDiagnostics WantDiags = WantDiagnostics::Auto,
                     bool ForceRebuild = false);
 
+llvm::Expected<std::optional<HoverInfo>> runHover(ClangdServer &Server,
+                                                  PathRef File, Position Pos);
+
 llvm::Expected<CodeCompleteResult>
 runCodeComplete(ClangdServer &Server, PathRef File, Position Pos,
                 clangd::CodeCompleteOptions Opts);

@HighCommander4
Copy link
Collaborator Author

HighCommander4 commented Mar 31, 2025

See diagnosis here:

The actual reason that std::vector::size is missing documentation when the standard library AST is indexed is that, unlike during preamble indexing, Opts.StoreAllDocumentation is false here. (And std::vector::size is also not "indexed for code completion" because it's a class member so code completion for it relies only on the AST of the open file.)

This is clearly unintentional -- we set StoreAllDocumentation = true in indexStandardLibrary() but then unfortunately overwrite it right after to false in createStaticIndexingAction().

(and additional details earlier in the thread).

@HighCommander4
Copy link
Collaborator Author

Review ping

Copy link
Member

@kadircet kadircet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry for the late reply, we were both OOO :()

thanks a lot for great analysis in clangd/clangd#2344!

IIUC, the root cause is we're putting stdlib-index slabs into dynamic-index, but we're building those slabs as if they're static. Hence I think the fix should actually be limited to stdlib-index itself; we should either:

  1. layer it as a static-index, so that it won't take precedence over dynamic-index'd symbols.
  2. index using indexHeaderSymbols so that it gets the same treatment as rest of the dynamic-index'd symbols.

I must admit that it wasn't the best idea to use an option struct with some defaults partially in createStaticIndexingAction, but I think any more changes in there is going to muddy the waters further. the reason why we don't store docs for static indexing is subtle but valid (we rely on dynamic-index taking over and we always index docs for code-completion enabled symbols). I wouldn't rely on caller to decide on that.

So do you mind changing the stdlib-indexing instead? 2nd alternative I mentioned above might be easier, by just using a SyntaxOnlyAction and calling indexHeaderSymbols after running Execute on the action. But I haven't tried it myself, so I'll leave that call to you.

@HighCommander4
Copy link
Collaborator Author

The described options seem a bit more involved than necessary to fix this bug, given that it's just the value of the StoreAllDocumentation flag that's a problem.

I revised the patch to add a new parameter to createStaticIndexingAction() (now called createIndexingAction()) and set the flag based on that -- does this address your concern about callers having to decide whether they want StoreAllDocumentation?

@HighCommander4 HighCommander4 requested a review from kadircet April 17, 2025 06:21
@kadircet
Copy link
Member

The described options seem a bit more involved than necessary to fix this bug, given that it's just the value of the StoreAllDocumentation flag that's a problem.

Maybe I miscommunicated something, but I was talking about a change like kadircet@ff0c31d.

I revised the patch to add a new parameter to createStaticIndexingAction() (now called createIndexingAction()) and set the flag based on that -- does this address your concern about callers having to decide whether they want StoreAllDocumentation?

I think this still leaves possibility for future divergences. Conceptually we should either make stdlib index act as dynamic-index (what I am suggesting) or change its priority to be similar to static-index. Otherwise we're likely to hit more discrepancies as the code evolves. Moreover changing createStaticIndexingAction also increases the mental load around all the complicated indexing architecture now.

@HighCommander4
Copy link
Collaborator Author

I was talking about a change like kadircet@ff0c31d

Thanks for the more fleshed-out suggestion.

As written, this would also have the following effects:

  1. we no longer use SymbolOrigin::StdLib for collected symbols
  2. we run the symbol collector with CollectMainFileSymbols = true
  3. we use the indexing option SystemSymbolFilterKind::DeclarationsOnly
  4. we no longer use this "deeply nested" optimization
  5. (maybe others, I haven't compared clangd::IndexAction to SyntaxOnlyAction + indexTopLevelDecls() in great depth)

Would I be right to say that (1) and (2) are undesirable? ((3) may actually be an improvement, and (4) may not make any difference as I'm guessing standard library implementations are unlikely to contain symbols with nesting depth 10 or more.)

To fix (1) and (2), I think we would need to add a new parameter to the indexHeaderSymbols interface. If that seems preferable to you over adding a parameter to createStaticIndexingAction, I'm happy to revise the patch along these lines.

@kadircet
Copy link
Member

  1. we no longer use SymbolOrigin::StdLib for collected symbols

thanks for catching that, I think that's OK to pass in an extra SymbolOrigin flag here.

  1. we run the symbol collector with CollectMainFileSymbols = true

I think this is WAI. Firstly we shouldn't have (m)any symbols in main-file when running stdlib indexing (as we just stitch together a file that #include <...> all stdlib headers we know of). but even if we run into some, these symbols are all part of the preamble section. hence even-if they're main-file only, they should get the same treatment as any other preamble symbol.

This also ensures we don't get different behavior if <vector> is indexed through preamble-index vs stdlib-index first.

  1. we use the indexing option SystemSymbolFilterKind::DeclarationsOnly

agreed that this sounds like an improvement.

  1. we no longer use this "deeply nested" optimization

I think this was just an oversight, we should probably have that in indexSymbols too. This can be a separate patch though.

If that seems preferable to you over adding a parameter to createStaticIndexingAction

Just to be clear, my concern is not about adding more knobs to createStaticIndexingAction, but rather it being the wrong fix. As your analysis revealed we actually had many more discrepancies between dynamic-index and stdlib-index. Amending createStaticIndexingAction to look more like dynamic-index will just blur the line between static and dynamic index, which sounds undesired overall.

I'm happy to revise the patch along these lines.

That would be great, thanks a lot for taking care of this!

@HighCommander4
Copy link
Collaborator Author

Addressed review comments

Copy link
Member

@kadircet kadircet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks a lot, implementation LG, just a small hurdle with the test (sorry for not checking that out earlier)

Copy link
Member

@kadircet kadircet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, LGTM!

FuzzyFindRequest Req;
Req.Query = "inner";
Req.Scopes = {"std::vector::"};
SymbolSlab Result = runFuzzyFind(*Server.getIndexForTest(), Req);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it might be better to use runCustomAction from syncapi (we should have index in the inputs).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks for the suggestion!

@HighCommander4 HighCommander4 merged commit 1b0a0c7 into llvm:main May 5, 2025
11 checks passed
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
IanWood1 pushed a commit to IanWood1/llvm-project that referenced this pull request May 6, 2025
GeorgeARM pushed a commit to GeorgeARM/llvm-project that referenced this pull request May 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clangd/lsp: no lsp documentation on textDocument/hover
3 participants