Skip to content

[SourceKit] Add optional declarations array to interface gen request #75802

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

J-MR-T
Copy link
Contributor

@J-MR-T J-MR-T commented Aug 9, 2024

Introduces the new DeclarationsArrayBuilder and adds it to the EditorConsumer. Declaration info always includes a kind, offset, and length, and includes a USR where applicable.
As the USR is already available for editor.open.interface type requests, this doesn't compute any new information, it just exposes more of what's there already.

The array can be enabled by setting key.enabledeclarations to true.

J-MR-T added 5 commits August 6, 2024 17:05
Introduces the new DeclarationsArrayBuilder and adds it to the
EditorConsumer. Declaration info always includes a kind, offset, and
length, and includes a USR where applicable.
As the USR is already available for editor.open.interface type requests,
this doesn't compute any new information, it just exposes more of what's
there already.
@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 9, 2024

I've run git-clang-format on the changes, but separated it out into 2 commits: c93a542 is the less radical one, that just includes formatting changes that agree with the code that already exists, while b18ec5f includes all the other changes by git-clang-format, that do make the style fit a bit less well with what was there already.

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 9, 2024

I wrote the test in the .swift file itself, as I thought that would be a bit more FileCheck idiomatic and robust than the .response files that are often used in testing right now, but I can of course change that to be more in line with the current tests, if that's desired :).

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 9, 2024

@swift-ci Please test
EDIT: I read over the part in the CI that specifies this only works for users with commit access ^^'

@ahoppen
Copy link
Member

ahoppen commented Aug 9, 2024

I would prefer to not add more options to these requests unless absolutely necessary. What’s the use case of the declarations array and why can’t the same information be extracted from the substructure?

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 9, 2024

Currently, the editor.open.interface request does not expose the USRs of the symbols defined in the module. This wouldn't be a good fit for the substructure, as that is purely syntactic in nature.

Getting the USRs of the symbols defined in the module is crucial to be able to pass it to other requests, as well as other tools like indexstore-db.

Protocol.md even describes a use case that's not easily possible without the USRs from the declaration array:

After 'opening' the module interface, to 'jump' to the location of a declaration with a particular USR, use the 'find_usr' request:

{
    <key.request>:       (UID) <source.request.editor.find_usr>
    <key.usr>:           (string) // USR to look for.
    <key.sourcefile>:    (string) // virtual name/path associated with the interface document
}

As, without this PR, editor.open.interface doesn't report USRs, this would require more intermediate requests to get individual USR for specific locations, through something like cursorinfo requests. But that is only really feasible for use cases like the one mentioned in Protocol.md, where only one symbol is required, as sending one cursorinfo request per declaration to get its USR is not practical. Especially considering that internally, USRs are already available anyway.

To me the snippet from Protocol.md reads as though it was originally intended for USRs to be exposed directly in this request. But I hope the case is compelling enough, that you would consider it, even if it wasn't the original intention :).

@bnbarham
Copy link
Contributor

bnbarham commented Aug 9, 2024

To me the snippet from Protocol.md reads as though it was originally intended for USRs to be exposed directly in this request. But I hope the case is compelling enough, that you would consider it, even if it wasn't the original intention :).

The docs there are referring to the case of the original request being cursorinfo referring to something inside an interface, eg. the expected flow would be:

  1. cursorinfo on a reference, definition in interface
  2. open interface
  3. find_usr (usr from cursorinfo)

And in general the expectation would be that further requests are sent for operations within that interface (eg. cursorinfo again for jump to def).

Getting the USRs of the symbols defined in the module is crucial to be able to pass it to other requests, as well as other tools like indexstore-db.

Do you have a concrete use case for needing all USRs? Is it specific to interfaces, ie. what about regular files?

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 13, 2024

Ah, I understood that the docs were about a single reference, but I hadn't noticed it was referring to a reference originating from a cursorinfo request. Thank you for clarifying that :)

About our use case:

In our IDE tooling, we keep track of our own symbols separately to be able to provide more information than what SourceKit itself offers. However, we can't build USRs for our own symbols, as we don't want to replicate the whole mangling logic on our side. Each of our symbols is instead uniquely identified by a subset of its properties, including the offset of its declaration within a source file. To resolve symbol references to their declaration efficiently, we use IndexSoreDB, which does not provide offsets of Swift module declarations. Therefore, we have to build USR <-> offset maps for each Swift module. For this, we currently use a combination of indexsource (to get all USRs present in a Swift module), open.interface, followed by calling find_usr for each of the USRs. We were hoping to replace this complicated workflow with a single open.interface request, which would provide both USRs and their offsets.

Is it specific to interfaces, ie. what about regular files?

For our use case, yes, this is specific to swift module interfaces. For regular files, the IndexStore already contains offsets of declarations along with their USRs. Of course, if it makes sense, we would be happy to implement this behavior for regular files as well.

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 26, 2024

@bnbarham Gentle ping :)

@bnbarham
Copy link
Contributor

Ah, sorry @J-MR-T! At a high level, I think it would be better to have this as a separate request. Makes it easier to both cancel as necessary + doesn't confuse the open with even more parameters - it's quite possible I'm missing something, but doesn't key.entities actually already contain this (it's just a lot more than you want)?

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 27, 2024

At a high level, I think it would be better to have this as a separate request. Makes it easier to both cancel as necessary + doesn't confuse the open with even more parameters

That is understandable, and we'd be happy to reimplement it as a separate request. The reason we chose to add it here is that the USRs and declarations are already available (this PR adds no logic to compute anything new).
On the other hand, if the option is the main concern, we could also just always output the key.declarations array, without having it be optional. That would require adjusting quite a few more tests, and reporting the array itself would have a slight performance impact, but those should be the only concerns (although I don't fully understand the problem with cancellability the way it is implemented now).

it's quite possible I'm missing something, but doesn't key.entities actually already contain this (it's just a lot more than you want)?

In theory, yes, key.entities from other requests would give us what we need, but editor.open.interface does not return key.entities, so I'm unsure which request you're referring to. A subsequent indexsource perhaps?

In the end I see see three options (pun not intended) here:

  1. If options for editor.open.interface are the main concern, then removing the option and always reporting the declarations array would work.
  2. If your preferred solution would be another separate request anyway, maybe it should be a new editor related request that can reuse the AST, and outputs either only key.entities, or the proposed key.declarations as well.
  3. We use either indexsource or preferably another request that already returns something like key.entities for an open editor.

Copy link
Contributor

@bnbarham bnbarham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is understandable, and we'd be happy to reimplement it as a separate request. The reason we chose to add it here is that the USRs and declarations are already available (this PR adds no logic to compute anything new). On the other hand, if the option is the main concern, we could also just always output the key.declarations array, without having it be optional. That would require adjusting quite a few more tests, and reporting the array itself would have a slight performance impact, but those should be the only concerns

Okay - let's just leave as is for now, it doesn't seem worth implementing the separate request unless there ends up being a need for similar data in regular files. But yes, re-using the already-built AST would be the way to go there if we went that route.

(although I don't fully understand the problem with cancellability the way it is implemented now).

Ignore me, editor.open (not editor.open.interface) is what I was concerned about here.

it's quite possible I'm missing something, but doesn't key.entities actually already contain this (it's just a lot more than you want)?

In theory, yes, key.entities from other requests would give us what we need, but editor.open.interface does not return key.entities, so I'm unsure which request you're referring to. A subsequent indexsource perhaps?

Seems I was thinking of doc-info here. It's a little different though and eg. doesn't take group name.

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 29, 2024

Seems I was thinking of doc-info here. It's a little different though and eg. doesn't take group name.

Yes. Another issue with doc-info is that it produces slightly different source code to editor.open.interface, so the offsets from doc-info wouldn't be correct for the source text from editor.open.interface (which we need to use to match Xcode).

@bnbarham
Copy link
Contributor

@swift-ci please test

Copy link
Contributor

@bnbarham bnbarham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working through this with us!

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 30, 2024

Thanks for working through this with us!

Thank you for considering our use-case in such depth! :)

Unfortunately, it seems I overlooked that sourcekitdAPI-InProc.cpp exists, as it wasn't in my local build. That's what caused the Linux build failure, this should be fixed now.

However, the Windows test failure perplexes me, the gen_swift_module.swift test crashes with:

SOURCEKITD FATAL ERROR: sourcekitd object did not resolve to a known type

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.

Exception Code: 0x80000003

 #0 0x00007ff7a882dd65 (t:\5\bin\sourcekitd-test.exe+0x2dd65)

 #1 0x00007ff9f24ad88d (C:\Windows\System32\ucrtbase.dll+0x6d88d)

 #2 0x00007ff9f24ae761 (C:\Windows\System32\ucrtbase.dll+0x6e761)

 #3 0x00007ff9c6f56e7b sourcekitd_request_create_from_yaml (t:\5\bin\sourcekitdInProc.dll+0x6e7b)

 #4 0x00007ff9c7415266 sourcekitd_set_uid_handlers (t:\5\bin\sourcekitdInProc.dll+0x4c5266)

 #5 0x00007ff9c7415401 sourcekitd_set_uid_handlers (t:\5\bin\sourcekitdInProc.dll+0x4c5401)

 #6 0x00007ff9c6f5cfdd sourcekitd_response_get_value (t:\5\bin\sourcekitdInProc.dll+0xcfdd)

 #7 0x00007ff9c6f5310c sourcekitd_variant_get_type (t:\5\bin\sourcekitdInProc.dll+0x310c)

 #8 0x00007ff7a88163c9 (t:\5\bin\sourcekitd-test.exe+0x163c9)

 #9 0x00007ff7a88048ce (t:\5\bin\sourcekitd-test.exe+0x48ce)

#10 0x00007ff7a8809ed9 (t:\5\bin\sourcekitd-test.exe+0x9ed9)

#11 0x00007ff7a8803f8b (t:\5\bin\sourcekitd-test.exe+0x3f8b)

#12 0x00007ff7a8803513 (t:\5\bin\sourcekitd-test.exe+0x3513)

#13 0x00007ff7a881efce (t:\5\bin\sourcekitd-test.exe+0x1efce)

#14 0x00007ff9f246268a (C:\Windows\System32\ucrtbase.dll+0x2268a)

#15 0x00007ff9f2dd7974 (C:\Windows\System32\KERNEL32.DLL+0x17974)

#16 0x00007ff9f5aba2f1 (C:\Windows\SYSTEM32\ntdll.dll+0x5a2f1)

That sounds like there's another place where I forgot to add the DeclarationsArray, and thus the object is somehow of unknown type. But I now grep'd through all usages of TokenAnnotationsArray (what I based the DeclarationsArray on), and now that sourcekitdAPI-InProc.cpp is handled, I can't find any relevant places in which TokenAnnotationsArray is used and DeclarationsArray isn't.

Unfortunately, I don't have a Windows machine to reproduce this with. Locally, on MacOS, gen_swift_module.swift passes (everything else that my changes could affect passes as locally well).

@J-MR-T
Copy link
Contributor Author

J-MR-T commented Aug 30, 2024

Hm, okay, the macos tests seem to have passed; that matches my local machine.

@bnbarham
Copy link
Contributor

Windows also uses inproc, so maybe the same issue? I'm a little confused as to why it didn't fail during the build. Let's just try re-test :)

@bnbarham
Copy link
Contributor

@swift-ci please test

@bnbarham
Copy link
Contributor

And yeah, from the logs (thanks @rintaro!):

C:\Users\swift-ci\jenkins\workspace\swift-PR-windows\swift\tools\SourceKit\tools\sourcekitd\lib\API\sourcekitdAPI-InProc.cpp(248,13): warning: enumeration value 'DeclarationsArray' not handled in switch [-Wswitch]
  248 |     switch (getBufferKind()) {
      |             ^~~~~~~~~~~~~~~
C:\Users\swift-ci\jenkins\workspace\swift-PR-windows\swift\tools\SourceKit\tools\sourcekitd\lib\API\sourcekitdAPI-InProc.cpp(248,13): note: add missing switch cases
  248 |     switch (getBufferKind()) {
      |             ^
  249 |       case CustomBufferKind::TokenAnnotationsArray:
  250 |       case CustomBufferKind::DocSupportAnnotationArray:

@bnbarham bnbarham merged commit f2e57d8 into swiftlang:main Aug 31, 2024
5 checks passed
@bnbarham
Copy link
Contributor

Thanks @J-MR-T!

@J-MR-T J-MR-T deleted the open-editor-interface-declarations-usrs branch May 2, 2025 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants