build: disable stale remote http cache for local development #19603
By default, we have set up a read-only remote cache for local development. This seems to be a leftover from our initial http caching, from before RBE was enabled. Nowadays, we use a different remote cache when RBE is enabled, so the one configured for local development is never updated, unless an Angular team member has credentials provided and has the upload of results enabled.

Generally, enabling remote caching has the potential of slowing down local development (especially when there are a lot of misses), and we should only enable it as part of remote build execution.

This also mitigates a bug in Bazel where restored remote action inputs can break if their relative execroot path contains a space. This is the case for chromium on macOS: when the browser test inputs are extracted from the remote cache, the test action inputs end up incomplete on disk, so the remote cache lookup for existing test results fails with an IO exception.

One might wonder why this doesn't surface with RBE then. This is most likely (not confirmed) because the remote containers are Linux-based, and do not fail writing these paths with a space when Bazel opens a file descriptor through a JNI wrapper.
Related sources for later reference:

* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/merkletree/DirectoryTreeBuilder.java;l=217;drc=8b856f5484f0117b2aebc302f849c2a15f273310
* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/RemoteActionInputFetcher.java;l=101;drc=8b856f5484f0117b2aebc302f849c2a15f273310

```
ERROR: /Users/paul/projects/material2/src/cdk/bidi/BUILD.bazel:36:18: failed due to unexpected I/O exception: /private/var/tmp/_bazel_paul/24d3819ba30ba6e17ef227c181228208/execroot/angular_material/bazel-out/darwin-fastbuild/bin/src/cdk/bidi/unit_tests_chromium.sh.runfiles/npm_angular_dev_infra_private/browsers/chromium_archive.out/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Helpers (No such file or directory)
java.io.FileNotFoundException: /private/var/tmp/_bazel_paul/24d3819ba30ba6e17ef227c181228208/execroot/angular_material/bazel-out/darwin-fastbuild/bin/src/cdk/bidi/unit_tests_chromium.sh.runfiles/npm_angular_dev_infra_private/browsers/chromium_archive.out/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Helpers (No such file or directory)
    at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(Native Method)
    at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(NativePosixFiles.java:283)
    at com.google.devtools.build.lib.unix.UnixFileSystem.readdir(UnixFileSystem.java:160)
    at com.google.devtools.build.lib.vfs.Path.readdir(Path.java:384)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.explodeDirectory(DirectoryTreeBuilder.java:217)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.explodeDirectory(DirectoryTreeBuilder.java:210)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.lambda$buildFromActionInputs$1(DirectoryTreeBuilder.java:149)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.build(DirectoryTreeBuilder.java:201)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.buildFromActionInputs(DirectoryTreeBuilder.java:123)
    at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.fromActionInputs(DirectoryTreeBuilder.java:62)
    at com.google.devtools.build.lib.remote.merkletree.MerkleTree.build(MerkleTree.java:140)
    at com.google.devtools.build.lib.remote.RemoteSpawnCache.lookup(RemoteSpawnCache.java:127)
    at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:123)
    at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:96)
    at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:39)
    at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:62)
    at com.google.devtools.build.lib.exec.StandaloneTestStrategy.beginTestAttempt(StandaloneTestStrategy.java:308)
    at com.google.devtools.build.lib.exec.StandaloneTestStrategy.access$200(StandaloneTestStrategy.java:69)
    at com.google.devtools.build.lib.exec.StandaloneTestStrategy$StandaloneTestRunnerSpawn.beginExecution(StandaloneTestStrategy.java:442)
    at com.google.devtools.build.lib.analysis.test.TestRunnerAction.beginIfNotCancelled(TestRunnerAction.java:834)
    at com.google.devtools.build.lib.analysis.test.TestRunnerAction.beginExecution(TestRunnerAction.java:803)
    at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:859)
    at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:850)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$4.execute(SkyframeActionExecutor.java:781)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:927)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:898)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:137)
    at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:80)
    at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:418)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:897)
    at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:296)
    at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:438)
    at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:398)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
```

It seems like this is already a known issue in Bazel, where runfiles with a space in their path are not supported. If possible, these details will be added to the issue so that Bazel can fix this as it currently seems low-priority and they primarily deal with runfiles, but not with action input fetching.

Related issue: bazelbuild/bazel#4327.
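The failure mode in the stack trace (a `readdir` on a declared input directory that does not exist on disk, hit while the Merkle tree for the cache lookup is being built) can be illustrated with a small Python sketch. This is not Bazel's implementation: `tree_digest` and `lookup_key` are hypothetical names, and the hashing scheme is an assumption made only to show why one missing input aborts the whole lookup.

```python
import hashlib
import os
import tempfile


def tree_digest(path):
    """Recursively hash a directory; raises FileNotFoundError if `path` is missing."""
    h = hashlib.sha256()
    # os.listdir plays the role of the readdir call seen in the trace.
    for name in sorted(os.listdir(path)):
        child = os.path.join(path, name)
        h.update(name.encode("utf-8"))
        if os.path.isdir(child):
            h.update(tree_digest(child).encode("ascii"))
        else:
            with open(child, "rb") as f:
                h.update(hashlib.sha256(f.read()).digest())
    return h.hexdigest()


def lookup_key(declared_input_dirs):
    """Return (key, None) on success, or (None, message) if an input is missing."""
    h = hashlib.sha256()
    try:
        for d in declared_input_dirs:
            h.update(tree_digest(d).encode("ascii"))
    except FileNotFoundError as e:
        # The key cannot be computed, so no cache lookup can happen at all.
        return None, f"unexpected I/O exception: {e.filename}"
    return h.hexdigest(), None


with tempfile.TemporaryDirectory() as root:
    present = os.path.join(root, "Chromium.app", "Contents")
    os.makedirs(present)
    with open(os.path.join(present, "Info.plist"), "w") as f:
        f.write("stub")
    # Declared as an input but never written to disk (hypothetical stand-in
    # for the path with a space that was not restored).
    missing = os.path.join(root, "Chromium Framework.framework", "Helpers")

    key, error = lookup_key([present])
    broken_key, broken_error = lookup_key([present, missing])

print(error, broken_error)
```

In this sketch the first lookup produces a key, while the second returns no key and an error naming the missing `Helpers` directory, which mirrors how the test action fails before any cached result can be fetched.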
LGTM
Note: this does not surface with RBE because GCP's RBE service provides both remote execution and colocated remote caching. Since our CI relies on this, we do not run into these remote cache issues on CI. Additionally, we never update the remote http cache.
@josephperrott Yeah, that seems correct. In the remote containers, though, this issue could certainly also surface, as they would also restore/prefetch action inputs from the remote cache. That doesn't seem to be a problem here, so my guess is that this is related to the OS being Linux.
By default, we have set up a remote cache for local development that is read-only.
This seems like a leftover from our initial http caching where RBE was not enabled yet.
Nowadays, we use a different remote cache when RBE is enabled, so the one that
is configured for local development is actually never updated, except if an Angular
team member has credentials provided, and has the upload of results enabled.
Generally, enabling remote caching has the potential of slowing down local development
(especially when there are a lot of misses) and we should only enable it as part of remote
build execution.
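The slowdown from cache misses can be made concrete with a rough cost model. The sketch below assumes every action pays a lookup round trip, hits add a download, and misses fall back to local execution. The function name and all timings are hypothetical, chosen only for illustration, not measured from this repository.

```python
def expected_action_seconds(hit_rate, lookup_s, download_s, local_exec_s):
    """Expected wall time per action when a remote cache is consulted first."""
    # Every action pays the lookup; hits add a download, misses run locally.
    return lookup_s + hit_rate * download_s + (1.0 - hit_rate) * local_exec_s


# Hypothetical per-action timings in seconds.
local_only = 2.0
stale = expected_action_seconds(
    hit_rate=0.0, lookup_s=0.3, download_s=0.5, local_exec_s=local_only
)
warm = expected_action_seconds(
    hit_rate=0.9, lookup_s=0.3, download_s=0.5, local_exec_s=local_only
)
print(f"no cache: {local_only:.2f}s, stale cache: {stale:.2f}s, warm cache: {warm:.2f}s")
```

Under these assumed numbers, a cache that never hits makes each action strictly slower than having no cache, since the lookup cost is paid with nothing in return. Only a cache that is kept up to date pays for itself.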
This also mitigates a bug in Bazel where restored remote action inputs can break if their
relative execroot path contains a space. This is the case for chromium on macOS, where
the browser test inputs (including chromium) are extracted from the remote cache, but
not fully, so the test action inputs are incomplete on disk. Later, the remote cache
`SpawnResult` lookup for existing test results fails with an IO exception (as an input
file is missing on disk when the merkle tree is built up).
One might wonder why this doesn't surface with RBE then. This is most likely
(not confirmed) because the remote containers are linux based, and do not fail writing
these paths with a space when Bazel opens a file descriptor through a JNI wrapper.
TL;DR: This fixes an issue where chromium tests cannot be run on macOS.
Hence marked as P2 to unblock impeded local development.
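Related sources for later reference:

* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/merkletree/DirectoryTreeBuilder.java;l=217;drc=8b856f5484f0117b2aebc302f849c2a15f273310
* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/RemoteActionInputFetcher.java;l=101;drc=8b856f5484f0117b2aebc302f849c2a15f273310
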
It seems like this is already a known issue in Bazel, where runfiles with a space
in their path are not supported. If possible, these details will be added to the issue
so that Bazel can fix this as it currently seems low-priority and they primarily deal
with runfiles, but not with action input fetching.
Related issue: bazelbuild/bazel#4327.