forked from llvm/llvm-project
[🍒][llvm][cas] Add validate-if-needed to recover from invalid data #10602
Merged
mikeash
merged 8 commits into
swiftlang:swift/release/6.2
from
benlangmuir:eng/blangmuir/validate-if-needed-swift-123542312-release/6.2
May 1, 2025
benlangmuir
commented
May 1, 2025
- Explanation: The on-disk CAS data files are designed to remain correct in spite of any process crashes, but they are not designed to tolerate power loss or similar failures where the OS may fail to flush all pages to disk. To fix this, we introduce an API that validates the CAS once for every machine boot and recovers from invalid data by removing it.
- Scope: Affects compilations that use compilation caching (CAS).
- Issues: rdar://123542312
- Original PRs:
  - Main PR: [llvm][cas] Add validate-if-needed to recover from invalid data #10581
  - Additional small cherry-picks to reduce merge conflicts:
    - [llvm][cas] Extend on-disk CAS validation to ActionCache #10406
    - [clang][cas] Pass through LLVM_CAS_LOG env var to the depscan daemon #10360
    - [LLVM][Support] Add new CreateFileError functions llvm/llvm-project#125906
- Risk: Low. Validation failures are almost guaranteed to lead to errors at some point, so catching them earlier should not present any risk. The main risk here would be an unexpected spurious failure blocking validation or causing unnecessary reduction in cache hits after reboot. Both of the tools that adopt this change (clang scanning daemon and swift-build) have mechanisms to disable the new validation if there is any unexpected fallout.
- Testing: Regression tests added.
- Reviewers: @cachemeifyoucan
Add new CreateFileError functions that create a StringError with the specified error code and prepend the file path to it. Needed for: llvm#125345 (cherry picked from commit 2464f4b)
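As a rough illustration of the idea in this commit (an error that carries both an error code and a message prefixed with the file path), here is a minimal standalone sketch. It does not use LLVM: `FileError` and `makeFileError` are hypothetical names, and the message layout is an assumption chosen for the example, not the format the real functions produce.

```cpp
#include <string>
#include <system_error>

// Hypothetical stand-in for an error value that carries an error code
// together with a human-readable message. This is not LLVM's Error type.
struct FileError {
  std::error_code Code;
  std::string Message;
};

// Build an error whose message starts with the path of the file involved.
// The "'path': detail" layout is an assumption made for this sketch.
FileError makeFileError(const std::string &Path, std::error_code EC,
                        const std::string &Detail) {
  return FileError{EC, "'" + Path + "': " + Detail};
}
```

A caller might write `makeFileError("cas/index", std::make_error_code(std::errc::io_error), "unexpected zero offset")` so that whoever reports the error does not have to track the path separately.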
Validate the ActionCache hash-mapped trie structure and sanity-check the resulting values. Unlike the CAS itself, there is no direct way to check that the values are "correct", but at least we can check for invalid zero offsets, which is what we would get if we dropped page writes or truncated the file. (cherry picked from commit 2966de4)
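The zero-offset check described here can be sketched in isolation. The snippet below is an assumption-laden illustration, not the ActionCache code: `Entry`, `DataOffset`, and `validateEntries` are hypothetical names, the trie is flattened into a plain vector, and the additional bounds check against the file size is an extra assumption that the commit message does not mention.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flattened view of the values stored in the trie: each
// entry records an offset into a data file.
struct Entry {
  std::uint64_t DataOffset;
};

// Returns an empty string when every entry looks plausible, otherwise a
// description of the first problem found.
std::string validateEntries(const std::vector<Entry> &Entries,
                            std::uint64_t DataFileSize) {
  for (std::size_t I = 0; I < Entries.size(); ++I) {
    // A zero offset is what a dropped page write or a truncated file
    // would leave behind, so it is treated as invalid.
    if (Entries[I].DataOffset == 0)
      return "entry " + std::to_string(I) + ": zero offset";
    // Extra check (an assumption of this sketch): the offset must fall
    // inside the data file.
    if (Entries[I].DataOffset >= DataFileSize)
      return "entry " + std::to_string(I) + ": offset past end of file";
  }
  return "";
}
```

This kind of check cannot prove that a value is correct. It only rejects values that could not have been written by a healthy process.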
…:validate) (cherry picked from commit eb2d1ea)
Introduce a new validate-if-needed API to the UnifiedOnDiskCache and llvm-cas tool that triggers out-of-process validation of the CAS once for every machine boot, and optionally recovers from invalid data by marking it for garbage collection. This fixes a hole in the CAS data coherence when a power loss or similar failure causes the OS to not flush all of the pages in the mmapped on-disk CAS files. The intent is that clients such as the clang scanning daemon or a build system should trigger this validation at least once before using the CAS. rdar://123542312 (cherry picked from commit 17aa9cf)
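The control flow described in this commit message (validate at most once per boot, optionally recover) can be sketched as follows. This is a minimal model under stated assumptions, not the UnifiedOnDiskCache implementation: `CasState`, `ValidationResult`, and `validateIfNeeded` are hypothetical names, the boot is identified by an opaque integer, the out-of-process validation is represented by a callback, and recording the boot after a recovery is an assumption of the sketch.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical outcome of one validate-if-needed call.
enum class ValidationResult { Skipped, Valid, Recovered, Invalid };

// Hypothetical state. A real implementation would persist this next to
// the CAS data instead of keeping it in memory.
struct CasState {
  bool EverValidated = false;          // has any validation been recorded?
  std::uint64_t LastValidatedBoot = 0; // boot identifier of that validation
  bool MarkedForGC = false;            // set when invalid data is discarded
};

ValidationResult validateIfNeeded(CasState &State, std::uint64_t CurrentBoot,
                                  bool AllowRecovery,
                                  const std::function<bool()> &RunValidation) {
  // Already validated since this boot: nothing to do.
  if (State.EverValidated && State.LastValidatedBoot == CurrentBoot)
    return ValidationResult::Skipped;

  // RunValidation stands in for the out-of-process validation step.
  if (RunValidation()) {
    State.EverValidated = true;
    State.LastValidatedBoot = CurrentBoot;
    return ValidationResult::Valid;
  }

  // Invalid data and recovery not requested: report and leave data alone.
  if (!AllowRecovery)
    return ValidationResult::Invalid;

  // Recovery: mark the invalid data for garbage collection. Recording the
  // boot here assumes the client continues with a fresh, empty CAS.
  State.MarkedForGC = true;
  State.EverValidated = true;
  State.LastValidatedBoot = CurrentBoot;
  return ValidationResult::Recovered;
}
```

A client such as a daemon or build system would call something like this once at startup, before any other use of the CAS, and treat `Invalid` as a hard error.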
Ensure that the scanning daemon modifications to the CAS are captured properly by the log. (cherry picked from commit f6a775f)
Use the new validate-if-needed functionality to ensure the clang scanning daemon's CAS data is valid. (cherry picked from commit 1101d73)
(cherry picked from commit cb7bc8c)
Documentation says it can either be ENOTEMPTY (like Darwin) or EEXIST. Also print the error. (cherry picked from commit bb52d96)
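The portability point in this commit can be illustrated with a small helper that accepts either error code. This is a sketch using only the C++ standard library; `isDirectoryNotEmpty` is a hypothetical name and is not taken from the patch.

```cpp
#include <system_error>

// Treat both codes as "the directory is not empty", since a failing
// operation on a non-empty directory may report either ENOTEMPTY or
// EEXIST depending on the platform.
bool isDirectoryNotEmpty(const std::error_code &EC) {
  return EC == std::errc::directory_not_empty || EC == std::errc::file_exists;
}
```

Code that checks only for `ENOTEMPTY` would work on Darwin but could misclassify the same condition on a system that reports `EEXIST`.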
@swift-ci please test
@swift-ci please test llvm
akyrtzi
approved these changes
May 1, 2025
@swift-ci please test macOS
The macOS failure was unrelated.