Skip to content

[SourceKit] Add locations for symbols in external modules #36942

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from

Conversation

bnbarham
Copy link
Contributor

@bnbarham bnbarham commented Apr 16, 2021

The main change here is adding offsets and presumed location handling to .swiftsourceinfo and removing OptimizeForIDE.

It previously output line/column and on deserialization would compute the line map to convert those back to an offset in order to retrieve the SourceLoc. These were also presumed locations, which meant that getLoc() was not the original location. Instead, output the offset as well as information about #sourceLocation directives so that the presumed location can still be calculated.

Where possible creating a SourceLoc for serialized locations is avoided, since it requires loading the file (even if it doesn't perform a read since we now have the offset).

Some notes:

  1. Serialized locations are fixed size, but this means they all have an extra 16 bytes to store the line directive. To minimize the number of changes I left this as is, but could also just serialize the offset rather than index.
  2. Serialized locations output a length of 0. Previously (if they were read) they differed from the non-serialized for functions but did output some length (ie. the name rather than signature length). I wanted to avoid reading the external file, in which case I could still get the name length (if it has a name), but this still differs from non-serialized. Thoughts?
  3. The cursor info request handles updating the main location using the snapshots, but nothing else in the request does (ie. the location in the DocComment and SymbolGraph). Have left for now, but probably want to look into at some point.
  4. I still need to add a test for the presumed locations, but wanted to get this up for some feedback.

@@ -802,6 +810,106 @@ static ArrayRef<T> copyAndClearArray(llvm::BumpPtrAllocator &Allocator,
return Ref;
}

static void setLocationInfoForClangNode(ClangNode ClangNode,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just moved from the other SwiftSourceDocInfo.cpp (yes, there are two).

return SM.getByteDistance(TokenRange.Start, CharEndLoc);
}

static void setLocationInfo(const ValueDecl *VD,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved and added the getSerializedLoc condition

@bnbarham
Copy link
Contributor Author

@swift-ci please test

@swift-ci
Copy link
Contributor

Build failed
Swift Test Linux Platform
Git Sha - 0ac9d749169ad91ac561b0efc1d9b58c96c90538

@swift-ci
Copy link
Contributor

Build failed
Swift Test OS X Platform
Git Sha - 0ac9d749169ad91ac561b0efc1d9b58c96c90538

EXPECT_EQ(FooOffs, Info.DeclarationLoc->first);
EXPECT_EQ(strlen("foo"), Info.DeclarationLoc->second);
EXPECT_EQ(FooOffs, Info.Offset);
EXPECT_EQ(strlen("foo"), Info.Offset);
Copy link
Contributor Author

@bnbarham bnbarham Apr 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops! I'll fix these (and the other two) up in the morning. I only ran SourceKit/IDE/refactoring... that'll teach me.

Rather than calculate the offset from the line and column by reading the
source file, just store the offset to begin with.

Since the location is used through `getLoc()`, also store the *original*
location rather than the presumed. Add line directive information so
that the presumed location can be retrieved if needed.

Move the serialized location cache out of `Decl` and into `ASTContext`
since very few will actually have their locations deserialized. Actually
use that cache - the old one was never actually written to.
.swiftsourceinfo files contain the serialized location for declarations.
Use this when outputting locations in cursor info so that clients need
not perform an extra index lookup for external modules.

Actual offset/line/column will be added in a later patch as we only want
to include those if the loaded file is up-to-date with respect to the
file on disk.

Resolves rdar://75582627 when optimize_for_ide is 0 (ie.
.swiftsourceinfo files are read).
Add line/column in addition to offset so clients need not map it
themselves. For serialized locations only output offset/line/column if
the file is currently up to date with the one loaded in the semantic
source manager.
SourceKit should return locations for symbols outside of the current
module as well. Callsites of location and comment information should
explicitly disable retrieving serialized information where performance
is a concern.
@bnbarham bnbarham force-pushed the sourceinfo-changes branch from 0ac9d74 to c6feca5 Compare April 17, 2021 06:15
@bnbarham
Copy link
Contributor Author

@swift-ci please test

@bnbarham
Copy link
Contributor Author

@swift-ci please build toolchain macOS platform

@bnbarham
Copy link
Contributor Author

That's an... odd error. Somehow using a super old compiler? Just going to re-run to see if that changes anything.

@swift-ci please build toolchain macOS platform

@bnbarham
Copy link
Contributor Author

Artem ended up making it compile-able in older compilers, so should hopefully work this time 🤞

@swift-ci please build toolchain macOS platform

@swift-ci
Copy link
Contributor

macOS Toolchain
Download Toolchain
Git Sha - c6feca5

Install command
tar -zxf swift-PR-36942-952-osx.tar.gz --directory ~/

Copy link
Contributor

@benlangmuir benlangmuir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still need to add a test for the presumed locations, but wanted to get this up for some feedback.

Just to confirm: this is still TODO, right?

if (!LocData.hasValue())

auto *File = D->getDeclContext()->getModuleScopeContext();
auto Positions = cast<FileUnit>(File)->getBasicPositionsForDecl(D);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we know at this point that it's not a Module?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is essentially just inlining getLocData to avoid the copy into the (now removed) DeclLocationsTableData. It had this cast previously and I assumed it was correct. If it wasn't though, it would imply we're serializing a Decl that has no FileUnit. Do you know when that could happen?

Note that this brings up another part of the serialization that I left as is. From what I can tell, the .swiftsourceinfo serialization was built similar to .swiftdocinfo. The docinfo serializes DeclCommentTableData directly into the on-disk hash table values, hence the need for DeclCommentTableData. I wasn't sure if there was a good reason that the sourceinfo doesn't do that, so I left as is but removed the unnecessary copy (mostly to avoid defining more slightly-different-to-basicsourceinfo structs) .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know when that could happen?

No, I just see code that gets a module scope context and assumes it's a file. So I'm asking how do we know it's a FileUnit? If the answer is that we would never serialize something outside a file unit, great.

I wasn't sure if there was a good reason that the sourceinfo doesn't do that

I don't know either. Maybe @nkcsgexi would know.

Writer.write<uint32_t>(Pos.Line);
Writer.write<uint32_t>(Pos.Column);

Writer.write<uint32_t>(Pos.Directive.Offset);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be worth having a separate table of these so that the position only needs an offset into that table, which for most locations I would expect is to a null directive? Is this what you meant by:

Serialized locations are fixed size, but this means they all have an extra 16 bytes to store the line directive. To minimize the number of changes I left this as is, but could also just serialize the offset rather than index.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could make another table, but no, that's not quite what I was thinking.

The position only needs to be fixed size at the moment because the values stored in the on-disk hash table for decl records are indices. That doesn't need to be the case though, ie. we could store the offset into the blob and just have a marker for whether there is a directive or not. This could just be length, if 0 we could skip reading offset/line offset/name.

I avoided doing this so as to minimize the serialization changes. Do you think this/your idea is worth doing now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. It's hard to say without trying different approaches and benchmarking them. Are you evaluating performance of these changes in some way?

@@ -2295,6 +2295,17 @@ ImmutableTextSnapshotRef SwiftEditorDocument::getLatestSnapshot() const {
return Impl.EditableBuffer->getSnapshot();
}

std::pair<unsigned, unsigned>
SwiftEditorDocument::getLineAndColumn(unsigned Offset) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably clarify this is location in the buffer not the presumed location. How about just use the same naming convention as SourceManager and call this getLineAndColumnInBuffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable, will do.

@bnbarham
Copy link
Contributor Author

I still need to add a test for the presumed locations, but wanted to get this up for some feedback.

Just to confirm: this is still TODO, right?

Yep

@bnbarham
Copy link
Contributor Author

Split this up into a few PRs

@bnbarham bnbarham closed this Apr 30, 2021
@bnbarham bnbarham deleted the sourceinfo-changes branch April 30, 2021 01:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants