yoga: Support running multinode clusters on Leafcloud #1021

Merged
markgoddard merged 8 commits into stackhpc/yoga from yoga-multinode-ci-aio-user on Apr 11, 2024

Conversation

markgoddard
Contributor

@markgoddard markgoddard commented Apr 10, 2024

This PR includes various changes. The first affects all deployments:

  • os_capacity: Add tags to playbook, update vault docs

The remaining ones affect only the multinode environment.

  • ci-multinode: Set default Ceph release to Quincy on Rocky Linux 9
  • ci-multinode: Use skc-ci-aio user for ci-multinode env
  • ci-multinode: Use Ark package repositories to install packages
  • ci-multinode: Allow rebooting for SELinux state
  • ci-multinode: Add API FQDNs to /etc/hosts in fix-networking.yml
  • ci-multinode: Wait for connection in fix-networking.yml
  • ci-multinode: Use qemu virtualisation

Details are in individual commit messages.

This PR is required by stackhpc/terraform-kayobe-multinode#45.

ci-multinode: Use skc-ci-aio user for ci-multinode env

Similar to c338dd9, but applied to
ci-multinode instead of ci-aio.

This user only has read-only access to the package and container
repositories, so is safer than using the release-train-ci user which has
read/write permissions.

ci-multinode: Use Ark package repositories to install packages

Similar to e9130b9 but applied to
ci-multinode rather than ci-aio.

Previously we were using Test Pulp on SMS lab, but this is out of
action. Switching to Ark allows CI jobs to run on Leafcloud (or anywhere
with Internet access).

ci-multinode: Allow rebooting for SELinux state

The Yoga overcloud host images currently have SELinux disabled, but the
default config enables SELinux in permissive mode on Rocky Linux 9. This
change allows the ci-multinode environment to run on these images.
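
As an illustration of this change, the environment could opt in to rebooting hosts when the SELinux state changes. This is a minimal sketch, assuming Kayobe's `selinux_do_reboot` variable and a hypothetical file location; the actual diff in this PR may differ.

```yaml
# Hypothetical location: etc/kayobe/environments/ci-multinode/globals.yml
# Assumption: selinux_do_reboot controls whether Kayobe may reboot a host to
# apply a change in SELinux state (for example, disabled -> permissive).
selinux_do_reboot: true
```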

ci-multinode: Add API FQDNs to /etc/hosts in fix-networking.yml

This avoids using the add-fqdn.yml playbook in
terraform-kayobe-multinode, which requires the Terraform/Ansible client
to have access to all hosts.
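
A task along the following lines could add the API FQDNs on each host. This is a sketch only: the task name and the choice of `blockinfile` are assumptions, and the `kolla_*` variables are assumed to be defined by the Kayobe configuration. The real task in fix-networking.yml may look different.

```yaml
# Sketch of a task list entry; not the task from this PR.
- name: Ensure API FQDNs resolve via /etc/hosts
  become: true
  ansible.builtin.blockinfile:
    path: /etc/hosts
    # The {mark} placeholder is expanded to BEGIN/END by blockinfile.
    marker: "# {mark} ANSIBLE MANAGED BLOCK: API FQDNs"
    block: |
      {{ kolla_internal_vip_address }} {{ kolla_internal_fqdn }}
      {{ kolla_external_vip_address }} {{ kolla_external_fqdn }}
```

Because the task runs against each host in the play, no separate client needs direct access to every host.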

ci-multinode: Wait for connection in fix-networking.yml

This allows us to drop the fix-homedir-ownership.yml playbook in
terraform-kayobe-multinode, which also performed the function of waiting
for hosts to become reachable.
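
Waiting for hosts to become reachable can be sketched with the `ansible.builtin.wait_for_connection` module. The play name, host group, and timeout below are illustrative assumptions, not values taken from this PR.

```yaml
- name: Wait for hosts to become reachable  # hypothetical play name
  hosts: overcloud                          # hypothetical host group
  # Fact gathering would fail on a host that is not yet reachable.
  gather_facts: false
  tasks:
    - name: Wait for connection
      ansible.builtin.wait_for_connection:
        timeout: 600  # seconds; illustrative value
```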

ci-multinode: Use qemu virtualisation

Most multinode environments will use nested virtualisation, and we
can't guarantee that nested KVM support is available. Use QEMU as a
lowest common denominator.

We might consider setting this dynamically based on the hypervisor in
future.
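
For reference, selecting QEMU in Kolla Ansible is typically a one-line setting. The file location below is an assumption about where the ci-multinode environment keeps its Kolla globals.

```yaml
# Hypothetical location: etc/kayobe/environments/ci-multinode/kolla/globals.yml
# nova_compute_virt_type is the Kolla Ansible variable for the libvirt
# virtualisation type. "qemu" uses software emulation, so it does not depend
# on nested KVM support being available.
nova_compute_virt_type: qemu
```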
seunghun1ee
seunghun1ee previously approved these changes Apr 11, 2024
Member

@seunghun1ee seunghun1ee left a comment

LGTM

Alex-Welsh
Alex-Welsh previously approved these changes Apr 11, 2024

ci-multinode: Set default Ceph release to Quincy on Rocky Linux 9

Pacific is not supported on Rocky Linux 9, so it does not make sense as a default.
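
One way to express such a distribution-dependent default is a Jinja2 conditional. This is a sketch under assumptions: `cephadm_ceph_release` is assumed to be the variable read by the Ceph deployment tooling, and `os_distribution` and `os_release` are assumed to be Kayobe variables holding string values. The PR may implement this differently.

```yaml
# Assumed variable names; see the note above.
# Falls back to Pacific on anything other than Rocky Linux 9.
cephadm_ceph_release: "{{ 'quincy' if os_distribution == 'rocky' and os_release == '9' else 'pacific' }}"
```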

os_capacity: Add tags to playbook, update vault docs

Previously the first deployment of a system with a Vault CA for internal
TLS and os_capacity enabled would fail when deploying HAProxy.
os_capacity deployment requires admin-openrc.sh to exist, but because of
the use of -kt haproxy the post-deploy tasks that create it will be
skipped.

This change fixes the issue by adding an os_capacity tag to the relevant
plays, and updating the Vault docs to skip the new tag when deploying
HAProxy.
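
The tagging approach can be sketched as follows. The play name, host group, and placeholder task are hypothetical; only the `os_capacity` tag name comes from the commit message.

```yaml
- name: Deploy os_capacity  # hypothetical play name
  hosts: localhost          # placeholder host group
  gather_facts: false
  tags:
    - os_capacity           # lets the whole play be skipped by tag
  tasks:
    - name: Placeholder for the tasks that require admin-openrc.sh
      ansible.builtin.debug:
        msg: "os_capacity deployment would run here"
```

With the tag in place, an HAProxy-only deployment could skip the play with something like `kayobe overcloud service deploy --skip-tags os_capacity -kt haproxy`; the exact command in the updated Vault docs may differ.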
@markgoddard markgoddard dismissed stale reviews from Alex-Welsh and seunghun1ee via 5037816 April 11, 2024 10:32
@markgoddard markgoddard force-pushed the yoga-multinode-ci-aio-user branch from 80d1c8d to 5037816 Compare April 11, 2024 10:32
@markgoddard markgoddard requested a review from Alex-Welsh April 11, 2024 11:00
@markgoddard markgoddard merged commit ee07cd3 into stackhpc/yoga Apr 11, 2024
@markgoddard markgoddard deleted the yoga-multinode-ci-aio-user branch April 11, 2024 12:21