[SwiftScan] Add SwiftScan APIs to replay the cache output #68660

cachemeifyoucan · 2023-09-21T00:01:20Z

Add new C APIs to libSwiftScan for cache querying and cache replays.

cachemeifyoucan · 2023-09-21T00:03:40Z

I tried to abstract out the compilation so that the replay logic would not have to re-parse the command line, compute inputs/outputs, and create a compiler instance, but it turns out a lot of information needs to be passed, including:

- the diagnostics needed
- how to print diagnostics (color, format, etc.)
- (added) prefix map for cache replay
- inputs (to compute how to construct file-specific diagnostic consumers)
- outputs (to replay)

So this sticks with replaying from the command line for now.
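
To illustrate how much state a replay entry point that does not start from the command line would have to receive, here is a minimal sketch in C. Every name in it (`demo_replay_context`, `demo_prefix_map`, `demo_remap_path`) is hypothetical and not part of libSwiftScan, and the prefix remapping is a plain string-prefix substitution assumed only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical prefix map: paths starting with `from` are rewritten to `to`. */
typedef struct {
  const char *from;
  const char *to;
} demo_prefix_map;

/* Hypothetical bundle of everything the list above says replay needs. */
typedef struct {
  bool color_diagnostics;      /* how to print diagnostics */
  demo_prefix_map prefix_map;  /* prefix map for cache replay */
  const char *const *inputs;   /* inputs, for file-specific consumers */
  size_t num_inputs;
  const char *const *outputs;  /* outputs to replay */
  size_t num_outputs;
} demo_replay_context;

/* Writes the remapped form of `path` into `dest`.
   Returns false if the result does not fit in `dest_size` bytes. */
bool demo_remap_path(const demo_prefix_map *map, const char *path,
                     char *dest, size_t dest_size) {
  assert(map != NULL && path != NULL && dest != NULL);
  size_t from_len = strlen(map->from);
  int n;
  if (strncmp(path, map->from, from_len) == 0)
    n = snprintf(dest, dest_size, "%s%s", map->to, path + from_len);
  else
    n = snprintf(dest, dest_size, "%s", path);
  return n >= 0 && (size_t)n < dest_size;
}
```

Even this toy context needs four kinds of data, which is consistent with the decision to keep the command line as the single source of truth for now.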

cachemeifyoucan · 2023-09-21T00:03:49Z

@swift-ci please smoke test

cachemeifyoucan · 2023-09-21T16:10:50Z

@swift-ci please smoke test

include/swift-c/DependencyScan/DependencyScan.h

akyrtzi · 2023-09-21T23:15:28Z

The libclang API has this kind of interaction with the build system:

1. key query API call: if the key exists, it returns an abstract object that represents the outputs (the outputs are not downloaded by this call)
2. outputs materialize API call: downloads the outputs locally
3. replay API call: receives the arguments and the abstract object for the outputs

Separating (1) from (2) provides flexibility on the client side for treating them as separate actions that can be recorded in the build log with their own timing info and executed potentially with different priority/scheduling. I would prefer to preserve that for swift as well instead of swiftscan_cache_query bundling both actions into the same API call.

And once an opaque object for outputs is introduced you can pass it to swiftscan_cache_replay_compilation to formalize that the outputs must already be found and potentially simplify the work that the replay function needs to do.
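
The three-step interaction described above can be sketched as follows. This is a toy in-memory model, not the libclang or libSwiftScan API: `demo_outputs`, `demo_cache_query`, `demo_outputs_materialize`, and `demo_replay` are hypothetical names, and "materializing" only flips a flag instead of downloading anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the opaque "cached outputs" object. */
typedef struct demo_outputs {
  const char *key;     /* cache key this entry answers to */
  const char *payload; /* stands in for the remote output bytes */
  bool is_local;       /* true once the outputs have been materialized */
} demo_outputs;

/* Toy cache with a single entry. */
static demo_outputs demo_table[] = {
  {"key-a", "object-file-bytes", false},
};

/* Step 1: key query. Returns a handle on a hit, NULL on a miss.
   It never downloads the outputs. */
demo_outputs *demo_cache_query(const char *key) {
  for (size_t i = 0; i < sizeof demo_table / sizeof demo_table[0]; ++i)
    if (strcmp(demo_table[i].key, key) == 0)
      return &demo_table[i];
  return NULL;
}

/* Step 2: materialize. A real client could time and schedule this
   separately from the query; here it only marks the entry as local. */
bool demo_outputs_materialize(demo_outputs *out) {
  assert(out != NULL);
  out->is_local = true;
  return true;
}

/* Step 3: replay. Requires outputs that were already materialized.
   Returns 0 on success, 1 on failure. */
int demo_replay(const demo_outputs *out, char *dest, size_t dest_size) {
  if (out == NULL || !out->is_local)
    return 1;
  if (strlen(out->payload) + 1 > dest_size)
    return 1;
  strcpy(dest, out->payload);
  return 0;
}
```

Passing the handle to the replay step makes the precondition explicit: replay fails immediately if the outputs were never materialized.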

cachemeifyoucan · 2023-09-21T23:34:36Z

> Separating (1) from (2) provides flexibility on the client side for treating them as separate actions that can be recorded in the build log with their own timing info and executed potentially with different priority/scheduling. I would prefer to preserve that for swift as well instead of swiftscan_cache_query bundling both actions into the same API call.

Maybe. But isn't that one more round trip to the remote, which just adds latency? Maybe the query can return faster without the download, but what do you do if the query succeeds but materialization fails? Isn't that the same as a miss?

akyrtzi · 2023-09-22T04:56:43Z

> But isn't that one more round trip to the remote, which just adds latency?

Key/value queries are separated from CAS actions, both in the LLVMCAS APIs and in the network protocols, so there is no latency saving in putting them behind one C API function instead of two.

> Maybe the query can return faster without the download, but what do you do if the query succeeds but materialization fails? Isn't that the same as a miss?

Yes, the client should treat it as a miss. Or not: it is up to the client. In a stricter mode it may prefer to fail if the CAS download was not found. That is the benefit of the additional flexibility and control, and it will be more transparent to the client which exact action failed.
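
As a rough illustration of that client-side choice, the sketch below classifies the outcome of a separate query and materialize step under a lenient or a strict policy. The enum and function names are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of a cache lookup as seen by the client. */
typedef enum { DEMO_HIT, DEMO_MISS, DEMO_ERROR } demo_result;

/* Hypothetical client policy for a key hit whose outputs fail to load. */
typedef enum { DEMO_LENIENT, DEMO_STRICT } demo_policy;

/* Combines the results of the two separate steps into one outcome.
   Lenient clients fall back to compiling; strict clients report an error. */
demo_result demo_classify(bool key_found, bool materialized,
                          demo_policy policy) {
  assert(policy == DEMO_LENIENT || policy == DEMO_STRICT);
  if (!key_found)
    return DEMO_MISS;
  if (materialized)
    return DEMO_HIT;
  return policy == DEMO_STRICT ? DEMO_ERROR : DEMO_MISS;
}
```

Because the two steps report separately, the client knows which one failed and can log it accordingly.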

cachemeifyoucan · 2023-09-22T21:13:51Z

Split queryCacheKey and loadObject into separate APIs. Added a reserved field to the replay function in case we want to pass in the cache keys in the future.

cachemeifyoucan · 2023-09-22T21:13:59Z

@swift-ci please smoke test

cachemeifyoucan · 2023-09-25T15:59:14Z

@swift-ci please test

benlangmuir · 2023-09-25T18:27:14Z

I'm a bit unclear on the intended design for the replay function. Currently it works by triggering a performFrontend that internally does the caching. IIUC that will replay all the cached compilations associated with that frontend job in the case of batch mode. But if we have multiple keys for that and you have added one key parameter to the function for future use, won't you then need to make multiple calls to the same function?

I wonder if we should have a list of keys instead? I assume that even after you stop using performFrontend, you need to set up some state from the compiler arguments to do the replay. That seems like something we want to avoid repeating multiple times for the same invocation if possible.

cachemeifyoucan · 2023-09-25T19:47:23Z

> I'm a bit unclear on the intended design for the replay function. Currently it works by triggering a performFrontend that internally does the caching. IIUC that will replay all the cached compilations associated with that frontend job in the case of batch mode. But if we have multiple keys for that and you have added one key parameter to the function for future use, won't you then need to make multiple calls to the same function?
>
> I wonder if we should have a list of keys instead? I assume that even after you stop using performFrontend, you need to set up some state from the compiler arguments to do the replay. That seems like something we want to avoid repeating multiple times for the same invocation if possible.

See my comment at the beginning: #68660 (comment)

The ideal situation is to replay from each key, but the number of APIs required would grow significantly and might not be very stable. I was wondering whether there is a better way to communicate this (like setting up the replay state from JSON or a CAS object).
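
One way to picture the "set up once, replay many keys" shape discussed in this exchange is the sketch below. It is not the design that was adopted: `demo_instance`, `demo_instance_init`, `demo_replay_all`, and the `--color` argument are all hypothetical, and a key "replays" successfully here simply if it is non-empty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical replay state derived once from the compiler arguments. */
typedef struct {
  bool color; /* diagnostics option parsed from the arguments */
} demo_instance;

/* Counts how often the (potentially expensive) setup has run. */
static int demo_setup_calls = 0;

/* Parses the arguments a single time and records the derived state. */
void demo_instance_init(demo_instance *inst, int argc,
                        const char *const *argv) {
  assert(inst != NULL);
  ++demo_setup_calls;
  inst->color = false;
  for (int i = 0; i < argc; ++i)
    if (strcmp(argv[i], "--color") == 0)
      inst->color = true;
}

/* Toy replay of one key: succeeds for any non-empty key. */
static bool demo_replay_one(const demo_instance *inst, const char *key) {
  (void)inst; /* a real replay would consult the shared state */
  return key != NULL && key[0] != '\0';
}

/* Replays a list of keys against one shared instance.
   Returns the number of keys replayed successfully. */
size_t demo_replay_all(const demo_instance *inst, const char *const *keys,
                       size_t num_keys) {
  assert(inst != NULL);
  size_t replayed = 0;
  for (size_t i = 0; i < num_keys; ++i)
    if (demo_replay_one(inst, keys[i]))
      ++replayed;
  return replayed;
}
```

With this shape, a batch-mode job with several keys pays the argument-parsing cost once rather than once per key.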

cachemeifyoucan · 2023-09-27T22:42:47Z

After offline discussion, we are going to revise the cache key structure and this patch will be updated with the new design in mind later.

cachemeifyoucan · 2023-10-25T18:51:38Z

Updated the patch to use the new cache key schema and the new C APIs.

Add a tool swift-scan-test that can be used to test libswiftscan C APIs.

cachemeifyoucan · 2023-10-25T18:51:49Z

@swift-ci please smoke test

cachemeifyoucan · 2023-10-26T17:52:14Z

@swift-ci please smoke test

cachemeifyoucan · 2023-10-26T18:42:00Z

@swift-ci please smoke test

include/swift-c/DependencyScan/DependencyScan.h

tools/libSwiftScan/libSwiftScan.cpp

cachemeifyoucan · 2023-10-30T22:26:47Z

Also rewrote the patch to split the caching-related code from the dependency-scanning-related code in the libSwiftScan implementation, so the file is smaller and easier to read.

cachemeifyoucan · 2023-10-30T22:27:45Z

@swift-ci please smoke test

Change how cached diagnostics are stored inside the CAS. They used to be stored as a standalone entry for a frontend invocation in the cache; now they are associated with input files and stored together with other outputs such as object files. This enables cleaner cache replay APIs and, in the future, cached diagnostics that can be split up by file contribution.
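
The layout change can be pictured with a small sketch in which diagnostics are just one more output kind attached to an input file. The types and the lookup function are hypothetical and do not reflect the actual CAS schema.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output kinds stored for one input file. */
typedef enum { DEMO_OUT_OBJECT, DEMO_OUT_DIAGNOSTICS } demo_output_kind;

/* One cached output: its kind plus a placeholder CAS identifier. */
typedef struct {
  demo_output_kind kind;
  const char *cas_id;
} demo_output;

/* Cache entry keyed by input file; diagnostics sit next to the object file. */
typedef struct {
  const char *input;
  const demo_output *outputs;
  size_t num_outputs;
} demo_entry;

/* Returns the CAS identifier of the first output of `kind`, or NULL. */
const char *demo_find_output(const demo_entry *entry, demo_output_kind kind) {
  assert(entry != NULL);
  for (size_t i = 0; i < entry->num_outputs; ++i)
    if (entry->outputs[i].kind == kind)
      return entry->outputs[i].cas_id;
  return NULL;
}
```

With this shape, replaying one input file yields its diagnostics through the same lookup as its other outputs.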

cachemeifyoucan · 2023-10-30T23:04:01Z

@swift-ci please smoke test

include/swift-c/DependencyScan/DependencyScan.h

tools/libSwiftScan/SwiftCaching.cpp

include/swift-c/DependencyScan/DependencyScan.h

cachemeifyoucan · 2023-11-02T02:47:30Z

@swift-ci please smoke test

cachemeifyoucan · 2023-11-02T16:16:33Z

@swift-ci please smoke test

akyrtzi

Could you add a swiftscan_cached_output_get_casidstring function? It will be useful to clients for logging purposes.

I also foresee a need for something like swiftscan_cache_compilation_is_uncacheable, for the case where the compiler decided that the compilation was uncacheable, so the client can distinguish this case. I'd suggest adding the function as a stub (always returning false) so clients can call it, and implementing it properly in a future PR once we find a case for marking a key as 'uncacheable'.
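
The stub-first approach suggested here might look like the sketch below. The names are deliberately hypothetical (`demo_compilation`, `demo_compilation_is_uncacheable`, `demo_describe_miss`) and do not show the signature that was eventually added to libSwiftScan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical handle for a compilation looked up in the cache. */
typedef struct {
  const char *key;
} demo_compilation;

/* Stub: no compilation is marked uncacheable yet, so this always
   returns false. Clients can already call it and branch on the result. */
bool demo_compilation_is_uncacheable(const demo_compilation *comp) {
  (void)comp;
  return false;
}

/* Example client use: pick a log label for a lookup that did not hit. */
const char *demo_describe_miss(const demo_compilation *comp) {
  assert(comp != NULL);
  return demo_compilation_is_uncacheable(comp) ? "uncacheable" : "cache miss";
}
```

Clients written against the stub keep working unchanged once the function starts returning true for some keys.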

include/swift-c/DependencyScan/DependencyScan.h

tools/libSwiftScan/SwiftCaching.cpp

tools/swift-scan-test/swift-scan-test.cpp

include/swift-c/DependencyScan/DependencyScan.h

cachemeifyoucan · 2023-11-03T17:07:04Z

@swift-ci please smoke test

cachemeifyoucan · 2023-11-03T18:28:42Z

@swift-ci please smoke test

include/swift-c/DependencyScan/DependencyScan.h

tools/libSwiftScan/SwiftCaching.cpp

include/swift-c/DependencyScan/DependencyScan.h

tools/libSwiftScan/SwiftCaching.cpp

lib/DriverTool/swift_cache_tool_main.cpp

tools/libSwiftScan/SwiftCaching.cpp

cachemeifyoucan · 2023-11-03T22:14:33Z

@swift-ci please smoke test

tools/libSwiftScan/SwiftCaching.cpp

Add new APIs to libSwiftScan that can be used for cache query and cache replay. This enables swift-driver or the build system to query the cache and replay the compilation results without invoking swift-frontend, for better scheduling.

cachemeifyoucan · 2023-11-03T23:04:06Z

@swift-ci please smoke test

akyrtzi

Thank you for addressing all the feedback!

cachemeifyoucan · 2023-11-04T18:49:15Z

@swift-ci please smoke test windows platform

cachemeifyoucan requested review from akyrtzi, benlangmuir, artemcm and tshortli September 21, 2023 00:01

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 537f604 to 68edb2c Compare September 21, 2023 16:10

benlangmuir reviewed Sep 21, 2023

include/swift-c/DependencyScan/DependencyScan.h (outdated)

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 68edb2c to 7dd33b0 Compare September 22, 2023 21:12

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 7dd33b0 to 0d56a00 Compare September 22, 2023 23:07

cachemeifyoucan mentioned this pull request Sep 25, 2023

[SwiftDriver] PathRemapping and Cache Replay Support swiftlang/swift-driver#1453

Merged

cachemeifyoucan changed the title ~~[SwiftScan] Add SwiftScan APIs to replay the cache output~~ [Do Not Merge/Review][SwiftScan] Add SwiftScan APIs to replay the cache output Sep 27, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 0d56a00 to 9177399 Compare October 25, 2023 18:50

cachemeifyoucan changed the title ~~[Do Not Merge/Review][SwiftScan] Add SwiftScan APIs to replay the cache output~~ [SwiftScan] Add SwiftScan APIs to replay the cache output Oct 25, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 9177399 to 9565794 Compare October 26, 2023 17:51

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 9565794 to 0d1c792 Compare October 26, 2023 18:33

artemcm approved these changes Oct 30, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from eb0eeb8 to e640e14 Compare October 30, 2023 22:24

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from e640e14 to ced0922 Compare October 30, 2023 23:03

akyrtzi reviewed Nov 1, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from ced0922 to b8f5226 Compare November 1, 2023 23:20

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from b8f5226 to ef24432 Compare November 2, 2023 16:15

akyrtzi reviewed Nov 2, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from ef24432 to ad78812 Compare November 3, 2023 17:06

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from ad78812 to e5076cb Compare November 3, 2023 18:17

benlangmuir reviewed Nov 3, 2023

include/swift-c/DependencyScan/DependencyScan.h

akyrtzi reviewed Nov 3, 2023

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from e5076cb to 79118aa Compare November 3, 2023 22:13

akyrtzi reviewed Nov 3, 2023

tools/libSwiftScan/SwiftCaching.cpp (outdated)

akyrtzi reviewed Nov 3, 2023

tools/libSwiftScan/SwiftCaching.cpp (outdated)

akyrtzi reviewed Nov 3, 2023

tools/libSwiftScan/SwiftCaching.cpp (outdated)

[Caching] Add new CacheReplay APIs to libSwiftScan

034c15c

Add new APIs to libSwiftScan that can be used for cache query and cache replay. This enables swift-driver or the build system to query the cache and replay the compilation results without invoking swift-frontend, for better scheduling.

cachemeifyoucan force-pushed the eng/PR-cache-replay-api branch from 79118aa to 034c15c Compare November 3, 2023 23:00

akyrtzi approved these changes Nov 3, 2023

cachemeifyoucan merged commit 9291c25 into swiftlang:main Nov 4, 2023
