Enable build of environment-specific control images #160
Currently failing on:
Didn't see this in dev deployment as that had no group defined!
cce41b2 to ad4f769
8.4.7-1 appears to ignore admin username/password in grafana.ini, which causes TASK [grafana-datasources : Ensure datasources exist (via API)] to fail with: Unauthorized to perform action 'POST' on 'http://<control-hostname>:3000/api/user/using/1'
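One way to confirm whether the credentials from grafana.ini are actually being accepted is to send the same API call by hand with HTTP basic auth. The following is only a sketch: the helper name `build_org_switch_request` and the example credentials are hypothetical, and it assumes Grafana is reachable on port 3000 as in the error message above.

```python
import base64
import urllib.request


def build_org_switch_request(base_url, username, password, org_id=1):
    """Build (but do not send) a POST to /api/user/using/<org_id> with basic auth.

    Hypothetical helper for manual debugging; not part of the appliance.
    """
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/api/user/using/{org_id}",
        method="POST",
        headers={"Authorization": f"Basic {token}"},
    )


if __name__ == "__main__":
    # Placeholder host and credentials: substitute the real control hostname
    # and the admin user/password configured in grafana.ini.
    req = build_org_switch_request("http://localhost:3000", "admin", "secret")
    print(req.get_method(), req.full_url)
    # Sending it with urllib.request.urlopen(req) raises urllib.error.HTTPError
    # if the server rejects the request.
```

If the request is rejected with the configured credentials, that would be consistent with the ini settings not being applied.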
…/ansible-slurm-appliance into feature/control-images2
Think CI ran out of instances, can retry later.
@m-bull I've added 91e91e5 to this, to skip
Enables building environment-specific control images in the same way as for login/compute builds.
Ticket: https://stackhpc.atlassian.net/browse/DEV-695
Note this does NOT move state off-board from the controller, so reimaging the controller will lose all state. However, it does at least provide an image which can be used to create a working controller for an existing cluster (regardless of e.g. upstream package changes).
Note that ansible/site.yml will need running after imaging a node with this image, to set:
Fixes #133. Replaces #136.
Requires:
Note that with openondemand enabled, building a control image BEFORE deploying the login node fails in the smslabs environment, as grafana needs to know the openondemand_servername, which requires the private IP (defined as ansible_host) for the login node. This doesn't occur in CI, as a direct deployment of control/login/2x compute is done first, which generates the hosts inventory file with this information.
This limitation could be fixed in a later PR / for other environments by changing the TF to be two stages:
hosts file using this info (instead of from instances).
and only doing step 1 before running the image build. That should also allow login & compute image build without the control node actually existing.
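As a rough illustration of the two-stage idea, the hosts inventory could be rendered from addresses that are known before any instance exists. The sketch below is not the appliance's Terraform: the function `render_hosts`, the group names, hostnames and IPs are all hypothetical, and it simply assumes stage 1 yields a mapping of hostname to private IP per group.

```python
def render_hosts(groups):
    """Render an Ansible INI inventory from pre-allocated addresses.

    `groups` maps a group name to {hostname: private_ip}. The input shape is
    an assumption about what a first Terraform stage could output.
    """
    lines = []
    for group, hosts in groups.items():
        lines.append(f"[{group}]")
        for hostname, ip in sorted(hosts.items()):
            # ansible_host carries the private IP needed by later templating.
            lines.append(f"{hostname} ansible_host={ip}")
        lines.append("")  # blank line between groups
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values only; real names and addresses come from the environment.
    example = {
        "control": {"demo-control": "10.0.0.4"},
        "login": {"demo-login-0": "10.0.0.5"},
    }
    print(render_hosts(example))
```

With an inventory produced this way, the login node's ansible_host would be available to the image build even though the login instance has not been created.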
CI does not currently try the built image.
NB: CI is failing until requirements.yml is updated, but that is waiting for another PR to merge on the openhpc role before bumping the version.
Dev deployment: vglabs-steveb-ansible-rocky85:/home/rocky/slurm-app-control-images