Fix caas zenith/hpctests/basic_users #662
Merged
Fixes issues preventing updating the appliance version used for Azimuth caas clusters:
502 error when trying to connect to monitoring/ondemand:
RockyLinux 9.5 upgraded to podman v5, which changes the default rootless network tool from slirp4netns to pasta.
The latter doesn't allow containers to reach the host's IP - see https://blog.podman.io/2024/03/podman-5-0-breaking-changes-in-detail/. This PR reverts the network stack for the zenith pod to slirp4netns [1]. It has been tested only on RL9.5 in caas. However, the same option is supported in podman v4.9, used by RockyLinux 8.10, so this seems safe.
It also bumps the container images for the zenith clients and proxies, and removes some now-unneeded zenith configuration.
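
To illustrate the network revert described above, here is a minimal sketch of creating a rootless pod on slirp4netns from an Ansible task. It is not the exact change made in this PR, which may set the option elsewhere (for example in a systemd unit or template). The pod name `example-zenith-pod` is hypothetical.

```yaml
# Sketch only: shows the podman option, not the appliance's actual task.
- name: Create a rootless pod on slirp4netns instead of the podman v5 default (pasta)
  ansible.builtin.command:
    argv:
      - podman
      - pod
      - create
      - --name
      - example-zenith-pod   # hypothetical pod name
      - --network
      - slirp4netns          # accepted by both podman v4.9 and v5
```

As written, the task is not idempotent: it fails if a pod with that name already exists.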
The "post-configuration validation" (hpctests) fails:
The issue is that, since Root-squash nfs exports by default #599, `become` can't be used to create directories in /home when running on the login node. Therefore `hpctests_user` is set to `azimuth`. Note that Default hpctests_group to hpctests_user #663 was also required (merged from main).
Root-squash nfs exports by default #599 changed the configuration for the `basic_users` role, to cope with root-squashed NFS shares. The appliance defaults are suitable for that case, so conditional modifications are needed for the manila case. To make this simpler, caas slurm now mounts `/home` on the control node when manila is in use, which makes it consistent with NFS and means the `azimuth` user can access the control node whichever home fileshare is in use.
No image build is required for any of these changes.
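
As a rough sketch of the hpctests change described above, the relevant variables might look like the following. Only `hpctests_user`, `hpctests_group` and the `azimuth` value come from this PR and #663. Where the variable is set (for example a group_vars file in the caas environment) is an assumption.

```yaml
# Assumed to live in a group_vars file for the caas environment (location is hypothetical).
# Run the post-configuration tests as the azimuth user, so that `become` is not
# needed to create directories under the root-squashed /home.
hpctests_user: azimuth
# hpctests_group is deliberately not set here: per #663 it defaults to hpctests_user.
```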
Footnotes
1. An alternative option may be to use pasta with the `--map-gw` option, but https://github.com/containers/podman/issues/22771 suggests this only works properly from podman v5.1, and it is not clear without testing that the gateway address is the same as the host's "main" IP address. Therefore simply restoring the previous behaviour seems preferable at the moment.