Commit cccd6c9

Compute-Init: wait for cloud-init before NFS mount
We are seeing issues where compute-init hits:

TASK [Check if hostvars exist] FAILED! => {"changed": false, "msg": "Permission denied"}

We have found that we are ignoring errors on the mount. It's possible the mount will fail if the host networking has not been set up.

Let's wait until we can talk to NFS before attempting the NFS mount, mostly because the host networking stack might not yet be set up correctly.

We could run "cloud-init status --wait" and block until cloud-init has finished; however, we don't really depend on all parts of cloud-init being complete. Equally, we could make the ansible-init systemd unit file depend on cloud-init or on the network being available, but there are cases where we do not want that.
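For comparison, the rejected "cloud-init status --wait" alternative could look roughly like the task below. This is only a sketch and is not part of this commit: the task name is hypothetical, and it assumes the `cloud-init` CLI is on the node's PATH when compute-init runs.

```yaml
# Hypothetical alternative, NOT what this commit does:
# block until cloud-init reports that it has finished.
- name: Wait for cloud-init to finish (illustrative only)
  ansible.builtin.command: cloud-init status --wait
  # The command only reads status, so never report a change.
  changed_when: false
```

Such a task waits for every cloud-init stage, which is more than the NFS mount needs. The commit instead probes the NFS port directly with `ansible.builtin.wait_for`.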
1 parent 0aec76c commit cccd6c9

1 file changed, +7 -3 lines changed


ansible/roles/compute_init/files/compute-init.yml

Lines changed: 7 additions & 3 deletions
@@ -61,7 +61,13 @@
         owner: slurm
         group: root
         mode: u=rX,g=rwX,o=
-
+
+    - name: Wait for NFS to reachable (checks host network up)
+      ansible.builtin.wait_for:
+        port: 2049
+        host: '{{ server_node_ip }}'
+        timeout: 120
+
     - name: Mount /mnt/cluster
       mount:
         path: /mnt/cluster
@@ -70,8 +76,6 @@
         opts: ro,sync
         state: mounted
       register: _mount_mnt_cluster
-      ignore_errors: true
-      # TODO: add some retries here?
 
     - block:
         - name: Report skipping initialization if cannot mount nfs
