Skip to content

Update index as files are modified on disk #1302

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
33 changes: 33 additions & 0 deletions Documentation/Files_To_Reindex.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
# Which files to re-index when a file is modified

## `.swift`

Obviously affects the file itself, which we do re-index.

If an internal declaration was changed, all files within the module might be also affected. If a public declaration was changed, all modules that depend on it might be affected. The effect can only really be in three ways:
1. It might change overload resolution in another file, which is fairly unlikely
2. A new declaration is introduced in this file that is already referenced by another file
3. A declaration is removed in this file that was referenced from other files. In those cases the other files now have an invalid reference.

We decided to not re-index any files other than the file itself because naively re-indexing all files that might depend on the modified file requires too much processing power that will likely have no or very little effect on the index – we are trading accuracy for CPU time here.
We mark the targets of the changed file as well as any dependent targets as out-of-date. The assumption is that most likely the user will go back to any of the affected files shortly afterwards and modify them again. When that happens, the affected file will get re-indexed and bring the index back up to date.

Alternatives would be:
- We could we check the file’s interface hash and re-index other files based on whether it has changed. But at that point we are somewhat implementing a build system. And even if a new public method was introduced it’s very likely that the user hasn’t actually used it anywhere yet, which means that re-indexing all dependent modules would still be doing unnecessary work.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, with fine enough dependencies and swift-driver support, whether some kind of incremental-mode-based indexing could make sense. But that's way off in the future 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if we had it, I’m not sure if we really want to do a whole downstream re-index. Say you add a new public function in a fairly low-level module then we would need to re-index the entire project based on incremental information because it might affect overload resolution. And that’s still quite a bit of work, considering that most likely nothing really changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends how fine grained the dependencies are, in theory if you're adding a function called foo, you shouldn't need to rebuild anything that didn't do a lookup for foo

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s a dream for a far and bright future ✨

- To cover case (2) from above, we could re-index only dependencies that previously indexed with errors. This is an alternative that hasn’t been fully explored.

## `.h`

Affects all files that include the header (including via other headers). For existing headers, we know which files import a header from the existing index. For new header files, we assume that it hasn’t been included in any file yet and so we don’t re-index any file. If it is, it’s likely that the user will modify the file that includes the new header file shortly after, which will re-index it.

## `.c` / `.cpp` / `.m`

This is the easy case since only the file itself is affected.

## Compiler settings (`compile_commands.json` / `Package.swift`)

Ideally, we would considered like a file change for all files whose compile commands have changed, if they changed in a meaningful way (ie. in a way that would also trigger re-compilation in an incremental build). Non-meaningful changes would be:
- If compiler arguments, like include paths are shuffled around. We could have a really quick check for compiler arguments equality by comparing them unordered. Any really compiler argument change will most likely do more than rearranging the arguments.
- The addition of a new Swift file to a target is equivalent to that file being modified and shouldn’t trigger a re-index of the entire target.

At the moment, unit files don’t include information about the compiler arguments with which they were created, so it’s impossible to know whether the compiler arguments have changed when a project is opened. Thus, for now, we don’t re-index any files on compiler settings changing.
2 changes: 2 additions & 0 deletions Sources/SKCore/BuildSystem.swift
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,8 @@ public protocol BuildSystem: AnyObject, Sendable {
func fileHandlingCapability(for uri: DocumentURI) async -> FileHandlingCapability

/// Returns the list of source files in the project.
///
/// Header files should not be considered as source files because they cannot be compiled.
func sourceFiles() async -> [SourceFileInfo]

/// Adds a callback that should be called when the value returned by `sourceFiles()` changes.
Expand Down
1 change: 1 addition & 0 deletions Sources/SKSupport/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ add_library(SKSupport STATIC
Process+WaitUntilExitWithCancellation.swift
Random.swift
Result.swift
Sequence+AsyncMap.swift
SwitchableProcessResultExitStatus.swift
ThreadSafeBox.swift
WorkspaceType.swift
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@

extension Sequence {
/// Just like `Sequence.map` but allows an `async` transform function.
func asyncMap<T>(
public func asyncMap<T>(
@_inheritActorContext _ transform: @Sendable (Element) async throws -> T
) async rethrows -> [T] {
var result: [T] = []
Expand All @@ -26,7 +26,7 @@ extension Sequence {
}

/// Just like `Sequence.compactMap` but allows an `async` transform function.
func asyncCompactMap<T>(
public func asyncCompactMap<T>(
@_inheritActorContext _ transform: @Sendable (Element) async throws -> T?
) async rethrows -> [T] {
var result: [T] = []
Expand Down
2 changes: 1 addition & 1 deletion Sources/SKSwiftPMWorkspace/SwiftPMBuildSystem.swift
Original file line number Diff line number Diff line change
Expand Up @@ -387,7 +387,7 @@ extension SwiftPMBuildSystem: SKCore.BuildSystem {
)
}

public func defaultLanguage(for document: DocumentURI) async -> Language? {
public func defaultLanguage(for document: DocumentURI) -> Language? {
// TODO (indexing): Query The SwiftPM build system for the document's language.
// https://github.com/apple/sourcekit-lsp/issues/1267
return nil
Expand Down
6 changes: 5 additions & 1 deletion Sources/SKTestSupport/Utils.swift
Original file line number Diff line number Diff line change
Expand Up @@ -25,8 +25,12 @@ extension Language {

init?(fileExtension: String) {
switch fileExtension {
case "c": self = .c
case "cpp": self = .cpp
case "m": self = .objective_c
default: self.init(rawValue: fileExtension)
case "mm": self = .objective_cpp
case "swift": self = .swift
default: return nil
}
}
}
Expand Down
31 changes: 22 additions & 9 deletions Sources/SemanticIndex/CheckedIndex.swift
Original file line number Diff line number Diff line change
Expand Up @@ -104,10 +104,6 @@ public final class CheckedIndex {
}
}

public func symbolProvider(for sourceFilePath: String) -> SymbolProviderKind? {
return index.symbolProvider(for: sourceFilePath)
}

public func symbols(inFilePath path: String) -> [Symbol] {
guard self.hasUpToDateUnit(for: URL(fileURLWithPath: path, isDirectory: false)) else {
return []
Expand All @@ -120,6 +116,19 @@ public final class CheckedIndex {
return index.unitTests(referencedByMainFiles: mainFilePaths).filter { checker.isUpToDate($0.location) }
}

/// Returns all the files that (transitively) include the header file at the given path.
///
/// If `crossLanguage` is set to `true`, Swift files that import a header will through a module will also be reported.
public func mainFilesContainingFile(uri: DocumentURI, crossLanguage: Bool = false) -> [DocumentURI] {
return index.mainFilesContainingFile(path: uri.pseudoPath, crossLanguage: crossLanguage).compactMap {
let url = URL(fileURLWithPath: $0)
guard checker.indexHasUpToDateUnit(for: url, index: self.index) else {
return nil
}
return DocumentURI(url)
}
}

/// Returns all unit test symbols in the index.
public func unitTests() -> [SymbolOccurrence] {
return index.unitTests().filter { checker.isUpToDate($0.location) }
Expand All @@ -139,11 +148,6 @@ public final class CheckedIndex {
public func fileHasInMemoryModifications(_ url: URL) -> Bool {
return checker.fileHasInMemoryModifications(url)
}

/// Wait for IndexStoreDB to be updated based on new unit files written to disk.
public func pollForUnitChangesAndWait() {
self.index.pollForUnitChangesAndWait()
}
}

/// A wrapper around `IndexStoreDB` that allows the retrieval of a `CheckedIndex` with a specified check level or the
Expand All @@ -167,6 +171,15 @@ public struct UncheckedIndex: Sendable {
public func checked(for checkLevel: IndexCheckLevel) -> CheckedIndex {
return CheckedIndex(index: underlyingIndexStoreDB, checkLevel: checkLevel)
}

public func symbolProvider(for sourceFilePath: String) -> SymbolProviderKind? {
return underlyingIndexStoreDB.symbolProvider(for: sourceFilePath)
}

/// Wait for IndexStoreDB to be updated based on new unit files written to disk.
public func pollForUnitChangesAndWait() {
self.underlyingIndexStoreDB.pollForUnitChangesAndWait()
}
}

/// Helper class to check if symbols from the index are up-to-date or if the source file has been modified after it was
Expand Down
55 changes: 48 additions & 7 deletions Sources/SemanticIndex/SemanticIndexManager.swift
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@ private enum FileIndexStatus {
public final actor SemanticIndexManager {
/// The underlying index. This is used to check if the index of a file is already up-to-date, in which case it doesn't
/// need to be indexed again.
private let index: CheckedIndex
private let index: UncheckedIndex

/// The build system manager that is used to get compiler arguments for a file.
private let buildSystemManager: BuildSystemManager
Expand Down Expand Up @@ -101,14 +101,17 @@ public final actor SemanticIndexManager {
indexTasksWereScheduled: @escaping @Sendable (Int) -> Void,
indexTaskDidFinish: @escaping @Sendable () -> Void
) {
self.index = index.checked(for: .modifiedFiles)
self.index = index
self.buildSystemManager = buildSystemManager
self.indexTaskScheduler = indexTaskScheduler
self.indexTasksWereScheduled = indexTasksWereScheduled
self.indexTaskDidFinish = indexTaskDidFinish
}

/// Schedules a task to index all files in `files` that don't already have an up-to-date index.
/// Schedules a task to index `files`. Files that are known to be up-to-date based on `indexStatus` will
/// not be re-indexed. The method will re-index files even if they have a unit with a timestamp that matches the
/// source file's mtime. This allows re-indexing eg. after compiler arguments or dependencies have changed.
///
/// Returns immediately after scheduling that task.
///
/// Indexing is being performed with a low priority.
Expand All @@ -117,11 +120,22 @@ public final actor SemanticIndexManager {
}

/// Regenerate the build graph (also resolving package dependencies) and then index all the source files known to the
/// build system.
/// build system that don't currently have a unit with a timestamp that matches the mtime of the file.
///
/// This method is intended to initially update the index of a project after it is opened.
public func scheduleBuildGraphGenerationAndBackgroundIndexAllFiles() async {
generateBuildGraphTask = Task(priority: .low) {
await orLog("Generating build graph") { try await self.buildSystemManager.generateBuildGraph() }
await scheduleBackgroundIndex(files: await self.buildSystemManager.sourceFiles().map(\.uri))
let index = index.checked(for: .modifiedFiles)
let filesToIndex = await self.buildSystemManager.sourceFiles().map(\.uri)
.filter { uri in
guard let url = uri.fileURL else {
// The URI is not a file, so there's nothing we can index.
return false
}
return !index.hasUpToDateUnit(for: url)
}
await scheduleBackgroundIndex(files: filesToIndex)
generateBuildGraphTask = nil
}
}
Expand Down Expand Up @@ -170,6 +184,34 @@ public final actor SemanticIndexManager {
logger.debug("Done waiting for up-to-date index")
}

public func filesDidChange(_ events: [FileEvent]) async {
let filesToReIndex = await filesToReIndex(changedFiles: events.map(\.uri))

// Reset the index status for these files so they get re-indexed by `index(files:priority:)`
for uri in filesToReIndex {
indexStatus[uri] = nil
}
await scheduleBackgroundIndex(files: filesToReIndex)
}

/// Returns the files that should be re-indexed if the given files have been modified.
///
/// - SeeAlso: The `Documentation/Files_To_Reindex.md` file.
private func filesToReIndex(changedFiles: [DocumentURI]) async -> Set<DocumentURI> {
let sourceFiles = Set(await buildSystemManager.sourceFiles().map(\.uri))
let filesToReIndex = await changedFiles.asyncMap { (uri) -> [DocumentURI] in
if sourceFiles.contains(uri) {
// If this is a source file, re-index it
return [uri]
}
// Otherwise, see if it is a header file. If so, re-index all the files that import it.
// We don't want to re-index `.swift` files that depend on a header file, similar to how we don't re-index a Swift
// module if one of its dependent modules has changed.
return index.checked(for: .deletedFiles).mainFilesContainingFile(uri: uri, crossLanguage: false)
}.flatMap { $0 }
return Set(filesToReIndex)
}

// MARK: - Helper functions

/// Prepare the given targets for indexing
Expand All @@ -189,8 +231,7 @@ public final actor SemanticIndexManager {
let taskDescription = AnyIndexTaskDescription(
UpdateIndexStoreTaskDescription(
filesToIndex: Set(files),
buildSystemManager: self.buildSystemManager,
index: self.index.unchecked
buildSystemManager: self.buildSystemManager
)
)
let updateIndexStoreTask = await self.indexTaskScheduler.schedule(priority: priority, taskDescription) { newState in
Expand Down
17 changes: 1 addition & 16 deletions Sources/SemanticIndex/UpdateIndexStoreTaskDescription.swift
Original file line number Diff line number Diff line change
Expand Up @@ -35,10 +35,6 @@ public struct UpdateIndexStoreTaskDescription: IndexTaskDescription {
/// The build system manager that is used to get the toolchain and build settings for the files to index.
private let buildSystemManager: BuildSystemManager

/// A reference to the underlying index store. Used to check if the index is already up-to-date for a file, in which
/// case we don't need to index it again.
private let index: UncheckedIndex

/// The task is idempotent because indexing the same file twice produces the same result as indexing it once.
public var isIdempotent: Bool { true }

Expand All @@ -54,12 +50,10 @@ public struct UpdateIndexStoreTaskDescription: IndexTaskDescription {

init(
filesToIndex: Set<DocumentURI>,
buildSystemManager: BuildSystemManager,
index: UncheckedIndex
buildSystemManager: BuildSystemManager
) {
self.filesToIndex = filesToIndex
self.buildSystemManager = buildSystemManager
self.index = index
}

public func execute() async {
Expand Down Expand Up @@ -111,15 +105,6 @@ public struct UpdateIndexStoreTaskDescription: IndexTaskDescription {
}

private func updateIndexStoreForSingleFile(_ uri: DocumentURI) async {
guard let url = uri.fileURL else {
// The URI is not a file, so there's nothing we can index.
return
}
guard !index.checked(for: .modifiedFiles).hasUpToDateUnit(for: url) else {
// We consider a file's index up-to-date if we have any up-to-date unit. Changing build settings does not
// invalidate the up-to-date status of the index.
return
}
guard let language = await buildSystemManager.defaultLanguage(for: uri) else {
logger.error("Not indexing \(uri.forLogging) because its language could not be determined")
return
Expand Down
1 change: 0 additions & 1 deletion Sources/SourceKitLSP/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,6 @@ add_library(SourceKitLSP STATIC
LanguageService.swift
Rename.swift
ResponseError+Init.swift
Sequence+AsyncMap.swift
SourceKitIndexDelegate.swift
SourceKitLSPCommandMetadata.swift
SourceKitLSPServer.swift
Expand Down
6 changes: 3 additions & 3 deletions Sources/SourceKitLSP/Rename.swift
Original file line number Diff line number Diff line change
Expand Up @@ -500,7 +500,7 @@ extension SourceKitLSPServer {
) async -> (swiftLanguageService: SwiftLanguageService, snapshot: DocumentSnapshot, location: SymbolLocation)? {
var reference: SymbolOccurrence? = nil
index.forEachSymbolOccurrence(byUSR: usr, roles: renameRoles) {
if index.symbolProvider(for: $0.location.path) == .swift {
if index.unchecked.symbolProvider(for: $0.location.path) == .swift {
reference = $0
// We have found a reference from Swift. Stop iteration.
return false
Expand Down Expand Up @@ -631,7 +631,7 @@ extension SourceKitLSPServer {
// If we terminate early by returning `false` from the closure, `forEachSymbolOccurrence` returns `true`,
// indicating that we have found a reference from clang.
let hasReferenceFromClang = !index.forEachSymbolOccurrence(byUSR: usr, roles: renameRoles) {
return index.symbolProvider(for: $0.location.path) != .clang
return index.unchecked.symbolProvider(for: $0.location.path) != .clang
}
let clangName: String?
if hasReferenceFromClang {
Expand Down Expand Up @@ -745,7 +745,7 @@ extension SourceKitLSPServer {
return cachedValue
}
let serverType = LanguageServerType(
symbolProvider: index.checked(for: .deletedFiles).symbolProvider(for: url.path)
symbolProvider: index.symbolProvider(for: url.path)
)
languageServerTypesCache[url] = serverType
return serverType
Expand Down
3 changes: 2 additions & 1 deletion Sources/SourceKitLSP/SourceKitLSPServer.swift
Original file line number Diff line number Diff line change
Expand Up @@ -847,6 +847,7 @@ public actor SourceKitLSPServer {
guard let toolchain = await workspace.buildSystemManager.toolchain(for: uri, language),
let service = await languageService(for: toolchain, language, in: workspace)
else {
logger.error("Failed to create language service for \(uri)")
return nil
}

Expand Down Expand Up @@ -2486,7 +2487,7 @@ extension SourceKitLSPServer {
func pollIndex(_ req: PollIndexRequest) async throws -> VoidResponse {
for workspace in workspaces {
await workspace.semanticIndexManager?.waitForUpToDateIndex()
workspace.index(checkedFor: .deletedFiles)?.pollForUnitChangesAndWait()
workspace.uncheckedIndex?.pollForUnitChangesAndWait()
}
return VoidResponse()
}
Expand Down
13 changes: 9 additions & 4 deletions Sources/SourceKitLSP/Workspace.swift
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,11 @@ public final class Workspace: Sendable {
/// The source code index, if available.
///
/// Usually a checked index (retrieved using `index(checkedFor:)`) should be used instead of the unchecked index.
private let uncheckedIndex: ThreadSafeBox<UncheckedIndex?>
private let _uncheckedIndex: ThreadSafeBox<UncheckedIndex?>

public var uncheckedIndex: UncheckedIndex? {
return _uncheckedIndex.value
}

/// The index that syntactically scans the workspace for tests.
let syntacticTestIndex = SyntacticTestIndex()
Expand Down Expand Up @@ -97,7 +101,7 @@ public final class Workspace: Sendable {
self.buildSetup = options.buildSetup
self.rootUri = rootUri
self.capabilityRegistry = capabilityRegistry
self.uncheckedIndex = ThreadSafeBox(initialValue: uncheckedIndex)
self._uncheckedIndex = ThreadSafeBox(initialValue: uncheckedIndex)
self.buildSystemManager = await BuildSystemManager(
buildSystem: underlyingBuildSystem,
fallbackBuildSystem: FallbackBuildSystem(buildSetup: buildSetup),
Expand Down Expand Up @@ -261,20 +265,21 @@ public final class Workspace: Sendable {
/// Returns a `CheckedIndex` that verifies that all the returned entries are up-to-date with the given
/// `IndexCheckLevel`.
func index(checkedFor checkLevel: IndexCheckLevel) -> CheckedIndex? {
return uncheckedIndex.value?.checked(for: checkLevel)
return _uncheckedIndex.value?.checked(for: checkLevel)
}

/// Write the index to disk.
///
/// After this method is called, the workspace will no longer have an index associated with it. It should only be
/// called when SourceKit-LSP shuts down.
func closeIndex() {
uncheckedIndex.value = nil
_uncheckedIndex.value = nil
}

public func filesDidChange(_ events: [FileEvent]) async {
await buildSystemManager.filesDidChange(events)
await syntacticTestIndex.filesDidChange(events)
await semanticIndexManager?.filesDidChange(events)
}
}

Expand Down
Loading