-
Notifications
You must be signed in to change notification settings - Fork 34
Deploy rebuild machinery #44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Need to:
|
ansible/validate.yml
Outdated
|
||
- name: Validate rebuild configuration | ||
hosts: openhpc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks a bit openstack specific. Is there a way we could only run this only if we are using openstack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rebuild functionality is openstack specific - maybe we need to add a group?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah or possibly just a variable that sets the cloud type? E.g
appliance_cloud_type: openstack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have gone with a group as its consistent with all other features and there could be reasons why you only want to enable rebuild on a subset of nodes I guess.
ansible/validate.yml
Outdated
that: openhpc_clouds_path.stat.exists | ||
fail_msg: "openrc file {{ openhpc_rebuild_clouds }} on localhost (specified by `openhpc_rebuild_clouds`) does not exist" | ||
- name: Check OpenRC file (openhpc_rebuild_clouds) is usable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add no_log here so that we don't leak a token in stdout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All I can find on the manpage is:
--log-file ¶
Specify a file to log output. Disabled by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Validate tasks now defined in collection's role which does set this.
Needs stackhpc/ansible_collection_slurm_openstack_tools#26 and stackhpc/ansible_collection_slurm_openstack_tools#25 merging before fixing this. |
|
||
[rebuild:children] | ||
compute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only issue here is that the vagrant example is using this file and it will fall the validation as it is not an openstack cloud, see: https://github.com/stackhpc/openhpc-demo/blob/c9fc16510513f1a655049c7576f9ce672a850f18/environments/vagrant-example/ansible.cfg#L8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now failing with this error: https://github.com/stackhpc/openhpc-demo/pull/44/checks?check_run_id=2085683879#step:7:106
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've got some incompatible pieces here:
- Defining a group to turn on rebuild seems "right" as it's consistent with the way we do everything else
- Testing the
everything
template using vagrant seems right
Seems like there's a few approaches which would keep both of the above:
- Modify the inventory groups when running vagrant. Not easy to do directly
- Special-case the rebuild validate & deploy when running vagrant. Messy, and probably the problems above are really a specific case of a more general "we can't do this in vagrant" problem.
- Make vagrant pass the validation task - the "main" looks like it should be ok - might need
openhpc_rebuild_reconfigure: false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've got some incompatible pieces here:
- Defining a group to turn on rebuild seems "right" as it's consistent with the way we do everything else
- Testing the
everything
template using vagrant seems rightSeems like there's a few approaches which would keep both of the above:
- Modify the inventory groups when running vagrant. Not easy to do directly
- Special-case the rebuild validate & deploy when running vagrant. Messy, and probably the problems above are really a specific case of a more general "we can't do this in vagrant" problem.
- Make vagrant pass the validation task - the "main" looks like it should be ok - might need
openhpc_rebuild_reconfigure: false
To me it seems like the right place to declare an openstack cloud is in the inventory generated by the provisioning script as it has all the info about how the machines were provisioned. We should only install the openstack rebuild script on an openstack cloud. We could do this with a feature flag or via a constructed inventory (put the hosts in openstack_rebuild if the cloud is openstack). Unsupported clouds should ship with rebuild feature disabled. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure why we hit: https://github.com/stackhpc/openhpc-demo/runs/2085005006#step:7:80 - ansible version related? Think I hit the same issue here and had to use the unqualified name: Edit: this looks like a red herring as on re-run it found the role, see: https://github.com/stackhpc/openhpc-demo/pull/44/checks?check_run_id=2085683879#step:7:106 |
dev/vagrant-example-configure.sh
Outdated
echo "fake openstack command" | ||
EOL | ||
chmod u+x /tmp/vagrant-example/openstack | ||
PATH=/tmp/vagrant-example:$PATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to export this for it to be visible in the ansible process.
Seems to have got missed in move to multi-environments.