build: disable stale remote http cache for local development #19603

@devversion devversion commented Jun 10, 2020

By default, we have set up a read-only remote cache for local development.
This seems to be a leftover from our initial HTTP caching setup, from before RBE was enabled.

Nowadays, we use a different remote cache when RBE is enabled, so the one that
is configured for local development is never updated unless an Angular
team member has provided credentials and has enabled uploading of results.

Generally, enabling remote caching can slow down local development
(especially when there are a lot of cache misses), so we should only enable it as part of remote
build execution.
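
To make the trade-off concrete, here is a rough back-of-the-envelope model, not a measurement. The function names and all numbers are hypothetical. It assumes every action pays one cache lookup round-trip, a download on a hit, and a full local execution on a miss.

```python
def expected_action_seconds(hit_rate, lookup_s, download_s, local_exec_s):
    """Expected wall time per action when a read-only remote cache is enabled."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    hit_cost = lookup_s + download_s      # lookup succeeded, fetch outputs
    miss_cost = lookup_s + local_exec_s   # lookup wasted, run locally anyway
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


def break_even_hit_rate(lookup_s, download_s, local_exec_s):
    """Hit rate at which the cache is as fast as always executing locally.

    Returns None when the cache can never pay off under this model.
    """
    # Solve h*(lookup+download) + (1-h)*(lookup+exec) = exec for h,
    # which gives h = lookup / (exec - download).
    saving = local_exec_s - download_s
    if saving <= 0:
        return None
    hit_rate = lookup_s / saving
    return hit_rate if hit_rate <= 1.0 else None
```

Under these assumptions, a cache that is never updated drifts toward a hit rate of zero, where every action costs the lookup on top of the local execution.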

This also mitigates a bug in Bazel where restored remote action inputs can break if their
relative execroot path contains a space. This is the case for Chromium on macOS: the
browser test inputs (including Chromium) are extracted from the remote cache, but only
partially, so the test action inputs are incomplete on disk. Later, the remote cache
SpawnResult lookup for existing test results fails with an I/O exception
because an input file is missing on disk when the Merkle tree is built.
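
The failure mode can be illustrated with a simplified stand-in for the Merkle tree step. This is not Bazel's implementation; `digest_tree` and `cache_key` are hypothetical names, and the sketch only assumes that the cache key is derived by reading every declared input from disk.

```python
import hashlib
import os


def digest_tree(path):
    """Return a SHA-256 hex digest of a file, or recursively of a directory."""
    if os.path.isdir(path):
        h = hashlib.sha256(b"dir:")
        # Children are visited in sorted order so the digest is deterministic.
        for name in sorted(os.listdir(path)):
            h.update(name.encode("utf-8"))
            h.update(digest_tree(os.path.join(path, name)).encode("ascii"))
        return h.hexdigest()
    # A declared input that was never written to disk fails here with
    # FileNotFoundError, comparable to the readdir failure in the trace.
    with open(path, "rb") as f:
        return hashlib.sha256(b"file:" + f.read()).hexdigest()


def cache_key(declared_inputs):
    """Combine the digests of all declared inputs into one lookup key."""
    h = hashlib.sha256()
    for path in sorted(declared_inputs):
        h.update(path.encode("utf-8"))
        h.update(digest_tree(path).encode("ascii"))
    return h.hexdigest()
```

In this sketch, a single declared input that is absent from disk aborts the whole key computation, so the cache lookup fails before any result can be matched.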

One might wonder why this doesn't surface with RBE. This is most likely
(not confirmed) because the remote containers are Linux-based and do not fail when writing
paths that contain a space as Bazel opens a file descriptor through a JNI wrapper.

TL;DR: This fixes an issue where Chromium tests cannot be run on macOS.
Hence it is marked as P2 to unblock impeded local development.

Error output for later reference:

ERROR: /Users/paul/projects/material2/src/cdk/bidi/BUILD.bazel:36:18:  failed due to unexpected I/O exception: /private/var/tmp/_bazel_paul/24d3819ba30ba6e17ef227c181228208/execroot/angular_material/bazel-out/darwin-fastbuild/bin/src/cdk/bidi/unit_tests_chromium.sh.runfiles/npm_angular_dev_infra_private/browsers/chromium_archive.out/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Helpers (No such file or directory)
java.io.FileNotFoundException: /private/var/tmp/_bazel_paul/24d3819ba30ba6e17ef227c181228208/execroot/angular_material/bazel-out/darwin-fastbuild/bin/src/cdk/bidi/unit_tests_chromium.sh.runfiles/npm_angular_dev_infra_private/browsers/chromium_archive.out/chrome-mac/Chromium.app/Contents/Frameworks/Chromium Framework.framework/Helpers (No such file or directory)
	at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(Native Method)
	at com.google.devtools.build.lib.unix.NativePosixFiles.readdir(NativePosixFiles.java:283)
	at com.google.devtools.build.lib.unix.UnixFileSystem.readdir(UnixFileSystem.java:160)
	at com.google.devtools.build.lib.vfs.Path.readdir(Path.java:384)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.explodeDirectory(DirectoryTreeBuilder.java:217)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.explodeDirectory(DirectoryTreeBuilder.java:210)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.lambda$buildFromActionInputs$1(DirectoryTreeBuilder.java:149)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.build(DirectoryTreeBuilder.java:201)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.buildFromActionInputs(DirectoryTreeBuilder.java:123)
	at com.google.devtools.build.lib.remote.merkletree.DirectoryTreeBuilder.fromActionInputs(DirectoryTreeBuilder.java:62)
	at com.google.devtools.build.lib.remote.merkletree.MerkleTree.build(MerkleTree.java:140)
	at com.google.devtools.build.lib.remote.RemoteSpawnCache.lookup(RemoteSpawnCache.java:127)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:123)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:96)
	at com.google.devtools.build.lib.actions.SpawnStrategy.beginExecution(SpawnStrategy.java:39)
	at com.google.devtools.build.lib.exec.SpawnStrategyResolver.beginExecution(SpawnStrategyResolver.java:62)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy.beginTestAttempt(StandaloneTestStrategy.java:308)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy.access$200(StandaloneTestStrategy.java:69)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy$StandaloneTestRunnerSpawn.beginExecution(StandaloneTestStrategy.java:442)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.beginIfNotCancelled(TestRunnerAction.java:834)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.beginExecution(TestRunnerAction.java:803)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:859)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:850)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$4.execute(SkyframeActionExecutor.java:781)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.continueAction(SkyframeActionExecutor.java:927)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:898)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:137)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:80)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:418)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:897)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:296)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:438)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:398)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

This seems to be a known issue in Bazel: runfiles with a space
in their path are not supported. If possible, these details will be added to that issue
so that Bazel can fix it. The issue currently seems to be low priority, and it primarily deals
with runfiles rather than with action input fetching.

Related issue: bazelbuild/bazel#4327.
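
To check whether a given tree is affected, a small helper can list the entries whose relative path contains a space. This is a hypothetical diagnostic sketch, not part of the change. `paths_with_spaces` is an assumed name, and the root directory to scan (for example a runfiles directory) has to be supplied by the caller.

```python
import os


def paths_with_spaces(root, followlinks=False):
    """Yield paths under root, relative to root, that contain a space."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=followlinks):
        for name in dirnames + filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            # Entries below a directory with a space are reported too,
            # because their relative path inherits that space.
            if " " in rel:
                yield rel
```

By default `os.walk` does not descend into symlinked directories, so pass `followlinks=True` if the tree being scanned is made of symlinks.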

Related sources for later reference:
* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/merkletree/DirectoryTreeBuilder.java;l=217;drc=8b856f5484f0117b2aebc302f849c2a15f273310
* https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/remote/RemoteActionInputFetcher.java;l=101;drc=8b856f5484f0117b2aebc302f849c2a15f273310

@devversion devversion requested review from jelbourn and a team as code owners June 10, 2020 23:40
@googlebot googlebot added the cla: yes PR author has agreed to Google's Contributor License Agreement label Jun 10, 2020
@devversion devversion added merge safe target: patch This PR is targeted for the next patch release P2 The issue is important to a large percentage of users, with a workaround labels Jun 10, 2020
@josephperrott josephperrott left a comment

LGTM

Note, the reason this does not surface with RBE is that GCP's RBE service provides both remote execution and colocated remote caching. Since our CI relies on this, we do not run into these remote cache issues on CI. Additionally, we never update the remote HTTP cache.

@josephperrott josephperrott added lgtm action: merge The PR is ready for merge by the caretaker labels Jun 11, 2020
@devversion

@josephperrott Yeah, that seems correct. In the remote containers, though, this issue could certainly surface as well, since they also restore/prefetch action inputs from the remote cache. That doesn't seem to be a problem here, so my guess is that this is related to the OS being Linux.

@andrewseguin andrewseguin merged commit b1e2a6d into angular:master Jun 12, 2020
@angular-automatic-lock-bot angular-automatic-lock-bot bot locked and limited conversation to collaborators Jul 13, 2020