Commit 1a23c33

Deploy rebuild machinery (#44)
* deploy rebuild machinery
* add rebuild group
* fix groups
* mock openstack command
* add debug flags temporarily
* export new PATH
* remove rebuild group from everything layout - configure in generated inventory instead
1 parent 36deefa commit 1a23c33

6 files changed: 26 additions, 4 deletions

README.md

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ Although most of the inventory uses the group convention described above there a
 - An inventory group `<cluster_name>_<partition_name>` defining the hosts it contains - these must be homogenous w.r.t CPU and memory.
 - An entry in the `openhpc_slurm_partitions` mapping in `environments/<environment>/inventory/group_vars/openhpc/overrides.yml`.
 See the [openhpc role documentation](https://github.com/stackhpc/ansible-role-openhpc#slurmconf) for more options.
-
+- On an OpenStack cloud, rebuilding/reimaging compute nodes from Slurm can be enabled by defining a `rebuild` group containing the relevant compute hosts (e.g. in the generated `hosts` file).
 
 ## Adding new functionality
 TODO: this is just rough notes:

ansible/slurm.yml

Lines changed: 9 additions & 0 deletions
@@ -18,6 +18,15 @@
     - import_role:
         name: stackhpc.openhpc
 
+- name: Setup slurm-driven reimage
+  hosts: rebuild
+  become: yes
+  tags:
+    - rebuild
+  tasks:
+    - import_role:
+        name: stackhpc.slurm_openstack_tools.rebuild
+
 - name: Set locked memory limits on user-facing nodes
   hosts:
     - compute

ansible/validate.yml

Lines changed: 11 additions & 1 deletion
@@ -27,4 +27,14 @@
     - import_role:
         name: opendistro
         tasks_from: validate.yml
-      tags: validate
+      tags: validate
+
+- name: Validate rebuild configuration
+  hosts: rebuild
+  gather_facts: false
+  tags: rebuild
+  tasks:
+    - import_role:
+        name: stackhpc.slurm_openstack_tools.rebuild
+        tasks_from: validate.yml
+      tags: validate

dev/vagrant-example-configure.sh

Lines changed: 1 addition & 1 deletion
@@ -20,4 +20,4 @@ ansible-playbook ansible/adhoc/generate-passwords.yml
 
 echo "Running site.yml"
 
-ansible-playbook ansible/site.yml
+ansible-playbook -vvvv ansible/site.yml -e "openhpc_rebuild_clouds=/tmp/vagrant-example/openstack"

environments/common/inventory/groups

Lines changed: 3 additions & 0 deletions
@@ -66,3 +66,6 @@ cluster # TODO: FIXME in branch
 [selinux:children]
 # Define selinux status for these
 cluster
+
+[rebuild]
+# nodes in 'compute' group which can be rebuilt from slurm (on an OpenStack cloud)
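
Note: the `[rebuild]` group above is deliberately left empty in the common inventory; per the README change, hosts are expected to be added in the generated `hosts` file. A minimal sketch of what that might look like follows. The host names and the `mycluster` prefix are hypothetical and not taken from this commit.

```ini
# Hypothetical excerpt of environments/<environment>/inventory/hosts
[compute]
mycluster-compute-0
mycluster-compute-1

[rebuild]
# Compute hosts that may be rebuilt/reimaged from Slurm.
# The plays with `hosts: rebuild` in ansible/slurm.yml and
# ansible/validate.yml run only against hosts listed here.
mycluster-compute-0
mycluster-compute-1
```

If every compute node should be rebuildable, declaring `[rebuild:children]` with `compute` as its only member would be an equivalent, shorter form in Ansible's INI inventory syntax.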

environments/common/layouts/everything

Lines changed: 1 addition & 1 deletion
@@ -35,4 +35,4 @@ control
 control
 
 [filebeat:children]
-slurm_stats
+slurm_stats
