Merge Stackhpc work #27

Merged: 36 commits merged into prod2312 on Feb 27, 2024
@sjpb sjpb commented Feb 23, 2024

Rolled-up merge of the PRs below. They have been merged here first to provide a total diff against the prod2312 branch before applying.

TODO

  • Check state volume size is appropriate
  • Check openhpc config doesn't clash - this has been moved around a bit
  • Remove os_manila_mount* config (in unpushed commit)
  • Update openhpc role branch, now RL9 support has merged to main
  • Test in lab

PRs

#19

#20

Requires a volume with name ${var.cluster_name}-state to be manually created for prod and vtest environments.

  • With ~165 hosts this should probably be ~300GB
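A minimal sketch of creating the state volume with the OpenStack CLI, assuming the client is installed and credentials for the target project are loaded. The `run` and `create_state_volume` helpers and the cluster name `prod` are hypothetical placeholders, not part of this repo; substitute the real value of `var.cluster_name` for each environment. It only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Create the state volume for one cluster.
# $1: cluster name (assumed to equal var.cluster_name), $2: size in GB.
create_state_volume() {
    run openstack volume create --size "$2" "$1-state"
}

# Example: a 300 GB volume for a hypothetical cluster named "prod".
create_state_volume prod 300
```

The same call would need repeating for the vtest environment, with its own cluster name and an appropriate size.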

#23

  1. Run terraform apply. Terraform will destroy the openstack_networking_floatingip_v2 resources. The FIPs will be released from the project and the fixed-to-floating IP associations will be lost, but terraform will not error on this.
  2. Add the FIPs back into the project manually.
  3. Run terraform apply again. The FIPs will be reassociated with the fixed IPs.
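The sequence above could be scripted roughly as follows. This is a sketch only: the `run` and `reapply_with_fips` helpers, the network name `external` and the address `203.0.113.10` are hypothetical, and requesting a specific address with `--floating-ip-address` may need elevated rights depending on the cloud's policy, in which case the FIPs have to be re-added by an admin instead. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1: external network name; remaining arguments: FIP addresses to re-add.
reapply_with_fips() {
    network="$1"
    shift
    # First apply: the floating IP resources are destroyed and the FIPs released.
    run terraform apply
    # Re-allocate the same addresses into the project.
    for ip in "$@"; do
        run openstack floating ip create --floating-ip-address "$ip" "$network"
    done
    # Second apply: the FIPs are reassociated with the fixed IPs.
    run terraform apply
}

# Example with placeholder values.
reapply_with_fips external 203.0.113.10
```

Recording the existing FIP addresses before the first apply is worth doing, since the associations are lost without any terraform error.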

#24

This modifies galaxy-installed roles so ./dev/setup-env.sh must be run to pull them in.

#25

This modifies galaxy-installed roles so ./dev/setup-env.sh must be run to pull them in.

#26

IMPORTANT: The ansible/roles/opensearch/tasks/migrate-opendistro.yml migration playbook only runs automatically when migrating a live system. It won't work after a rebuild as there will be no opendistro unit file. So we should migrate data manually before running site.yml.

Matt Pryor and others added 30 commits January 30, 2024 13:07
* Changes to support explicit quota checks

* Add count parameter to compute flavor

---------

Co-authored-by: Steve Brasier
* Remove pulling the previous image

* Simplify logic around cluster_previous_image
* add support for manila to common environment

* use manila share for /scratch in stackhpc env w/ tests

* make home volume optional in skeleton TF

* remove share creation from CI TF

* add manila UI for caas

* support manila- or nfs/volume- based home dirs in caas

* remove manila config from UI

* add optional platform-lifecycle manila share for homedirs for caas

* add home and project manila config for caas

* tweak home volume size UI description to account for shares

* fix caas manila config typo

* tidy PR diff

* bump fatimage to include manila client

* Revert commit "tweak home volume size UI description to account for shares"

This reverts commit 3d9cfbadc141654cfbbb334b82ef1415362cc16c.

* add manila UI for caas

* add defaults for new caas manila extravars, where possible

* make cluster_home_manila_share_type optional for when default share type is defined

* address review comments

* bump manila requirement after role release

* default usage of home manila share to match project share for caas

* remove caas manila-specific ui-meta
* add manila /home specific UI for Arcus

* fixup description for home share size for arcus manila-only UI
* Add requires_ssh_key:true into ui meta

* require ssh key for manila too

---------

Co-authored-by: Steve Brasier
* remove mention of home share from slurm-infra UI

* add description re. /project
Get lab *infra* functional based on prod
Prevent FIPs being released from project on terraform destroy
@sjpb sjpb changed the base branch from nrel to prod2312 February 23, 2024 15:32
@sjpb sjpb marked this pull request as ready for review February 27, 2024 15:20
@sjpb sjpb merged commit 272c12f into prod2312 Feb 27, 2024