Support deploying multinodes on Leafcloud #45

Merged: 20 commits, Apr 11, 2024

markgoddard commented Apr 10, 2024

This PR contains the changes required to deploy a multinode cluster on Leafcloud. It also contains a number of other changes, fixes and reorganisations.

  • Update Terraform provider hashes
  • Remove unused configure-local-networking.sh and hello.sh in scripts/
  • Grow Ansible control host root volume
  • Remove Ansible playbooks targeting seed & overcloud hosts
  • Remove hosts from Ansible inventory except for Ansible control host
  • Change default root_domain to multinode.stackhpc.com
  • Support attaching a floating IP to the Ansible control host
  • Use new name of Tempest container when following logs
  • Move Tempest test results to ~/tempest-artifacts
  • Improve Tempest test result handling
  • Use SSH key defined by ssh_key_path when connecting to Ansible control host
  • Skip os_capacity when deploying HAProxy for Vault
  • Add an example tfvars file for Leafcloud
  • README: Reorganise and various fixes
  • Improve prechecks for ssh_key, vault_password, vxlan_vni

Details are in individual commit messages.

This PR depends on stackhpc/stackhpc-kayobe-config#1021

Grow Ansible control host root volume

The StackHPC overcloud host image ships with small logical volumes that
must be expanded after deployment.

Remove Ansible playbooks targeting seed & overcloud hosts

- add-fqdn.yml: This functionality has been moved to the
  fix-networking.yml custom playbook in stackhpc-kayobe-config.
- fix-homedir-ownership.yml: This workaround should no longer be
  required due to fixes in the images. If the issues resurface, it is
  better to be made aware of them than to hide them with a workaround.
  This playbook also waited for the hosts to become reachable using
  wait_for_connection. That task has been moved to
  grow-control-host.yml for the Ansible control host, and to the
  fix-networking.yml custom playbook in stackhpc-kayobe-config for
  the other hosts.

Remove hosts from Ansible inventory except for Ansible control host

These are no longer required, since in this repository we only use
Ansible to connect to the Ansible control host.

Support attaching a floating IP to the Ansible control host

The floating IP may be used for SSH access to the Ansible control host
when no direct access to the multinode network is available.

Move Tempest test results to ~/tempest-artifacts

This location is easier to find.

Improve Tempest test result handling

- Back up previous results to avoid mistaking them for new ones
- Ignore docker logs check failure because it's racy
- Check for the failed-tests artifact and exit 1 if any tests fail
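
The three steps above could look roughly like the following. This is a minimal sketch, not the script from this PR: the timestamped backup naming, the ARTIFACTS_DIR and TEMPEST_CONTAINER variables, the container name "tempest" and the function names are all assumptions made for illustration. Only the ~/tempest-artifacts location and the failed-tests artifact name come from the commit messages.

```shell
# Sketch only: variable names, backup naming and container name are assumptions.
set -e

ARTIFACTS_DIR="${ARTIFACTS_DIR:-$HOME/tempest-artifacts}"
TEMPEST_CONTAINER="${TEMPEST_CONTAINER:-tempest}"  # hypothetical container name

# Move results from a previous run aside so they cannot be mistaken for new ones,
# then start with an empty artifacts directory.
backup_previous_results() {
    if [ -d "$ARTIFACTS_DIR" ]; then
        mv "$ARTIFACTS_DIR" "${ARTIFACTS_DIR}.$(date +%Y%m%d-%H%M%S)"
    fi
    mkdir -p "$ARTIFACTS_DIR"
}

# Following the logs is racy (the container may not exist yet, or may already
# have exited), so a failure here is deliberately ignored.
follow_tempest_logs() {
    docker logs -f "$TEMPEST_CONTAINER" || true
}

# Return 1 if the failed-tests artifact exists and is non-empty.
check_failed_tests() {
    if [ -s "$ARTIFACTS_DIR/failed-tests" ]; then
        echo "Some Tempest tests failed:" >&2
        cat "$ARTIFACTS_DIR/failed-tests" >&2
        return 1
    fi
}
```

A caller would run backup_previous_results before starting Tempest, follow_tempest_logs while it runs, and check_failed_tests at the end. Under set -e, a failing check_failed_tests ends the script with exit status 1.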

Use SSH key defined by ssh_key_path when connecting to Ansible control host

This avoids problems such as "Too many authentication failures" when
using an SSH agent.
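
For context, "Too many authentication failures" typically appears when an SSH agent offers several keys and the server's authentication attempt limit is reached before the right key is tried. The sketch below shows one way to pin a manual connection to a single key with the OpenSSH client. It is an illustration under assumptions, not the change made in this PR: the function name, user name and address are hypothetical, and the use of IdentitiesOnly is an addition that the commit message does not mention.

```shell
# Print an ssh command line for the Ansible control host, pinned to one key.
# -i selects the identity file; IdentitiesOnly=yes stops the client from
# also offering every key loaded in the SSH agent.
# Function name and arguments are hypothetical.
control_host_ssh_cmd() {
    key_path="$1"
    user="$2"
    host="$3"
    echo "ssh -i ${key_path} -o IdentitiesOnly=yes ${user}@${host}"
}

# Example values only: the key path stands in for ssh_key_path, and
# 192.0.2.10 is a documentation address.
control_host_ssh_cmd "~/.ssh/id_rsa" "cloud-user" "192.0.2.10"
```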

Skip os_capacity when deploying HAProxy for Vault

Previously the first deployment of a system with a Vault CA for internal
TLS and os_capacity enabled would fail when deploying HAProxy.
os_capacity deployment requires admin-openrc.sh to exist, but because of
the use of -kt haproxy, the post-deploy tasks that create it are
skipped.

This change fixes the issue by skipping the os_capacity tag when
deploying HAProxy.
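
As a rough illustration, the HAProxy-only deployment might be invoked as below. This is a sketch under assumptions: the commit message confirms only the use of -kt haproxy and that the os_capacity tag is skipped. Passing the tag through Kayobe's --skip-tags option and the wrapper function name are assumptions, so check the deployment script in this repository for the exact command.

```shell
# Sketch: HAProxy-only deployment for Vault.
# -kt haproxy restricts Kolla Ansible to the haproxy tag, so the post-deploy
# tasks that create admin-openrc.sh do not run. Skipping the os_capacity tag
# keeps the os_capacity deployment, which needs that file, from running here.
# The function name and the use of --skip-tags are assumptions.
deploy_haproxy_for_vault() {
    kayobe overcloud service deploy --skip-tags os_capacity -kt haproxy
}
```

A later full deployment without -kt would run the post-deploy tasks and create admin-openrc.sh, at which point os_capacity can be deployed as usual.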

seunghun1ee previously approved these changes Apr 11, 2024
markgoddard deleted the leafcloud branch April 11, 2024 10:49