@@ -24,6 +24,9 @@ This guide covers the following types of hosts:
- Compute hosts
- Storage hosts
- Seed
+
+ The following types of hosts will be covered in future:
+
- Seed hypervisor
- Ansible control host
- Wazuh manager
@@ -61,8 +64,9 @@ Configuration
Make the following changes to your Kayobe configuration:

- - Set ``os_distribution`` to ``rocky`` in ``etc/kayobe/globals.yml``
- - Set ``os_release`` to ``"9"`` in ``etc/kayobe/globals.yml``
+ - Merge in the latest ``stackhpc-kayobe-config`` ``stackhpc/yoga`` branch.
+ - Set ``os_distribution`` to ``rocky`` in ``etc/kayobe/globals.yml``.
+ - Set ``os_release`` to ``"9"`` in ``etc/kayobe/globals.yml`` (see the
+   example below).
- If you are using Kayobe multiple environments, add the following into
  ``kayobe-config/etc/kayobe/environments/<env>/kolla/config/nova.conf``
  (as Kolla custom service config environment merging is not supported in
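For reference, a minimal sketch of the resulting ``etc/kayobe/globals.yml``
entries, based on the values listed above:

.. code-block:: yaml

   os_distribution: "rocky"
   os_release: "9"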
@@ -166,16 +170,11 @@ Deploy latest CentOS Stream 8 images
------------------------------------

Make sure you deploy the latest CentOS Stream 8 containers prior to
- this migration.
-
- The usual steps apply:
-
- - Merge in the latest changes from the ``stackhpc-kayobe-config`` ``stackhpc/yoga`` branch
- - Upgrade services
+ this migration:

- .. code-block:: console
+ .. code-block:: console

-    kayobe overcloud service deploy
+    kayobe overcloud service deploy

Controllers
===========
@@ -220,43 +219,82 @@ Full procedure for one host
      kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers

- 4. Deprovision the controller:
+ 4. If the controller is running Ceph services:
+
+    1. Set the host in maintenance mode:
+
+       .. code-block:: console
+
+          ceph orch host maintenance enter <hostname>
+
+    2. Check there's nothing remaining on the host:
+
+       .. code-block:: console
+
+          ceph orch ps <hostname>
+
+ 5. Deprovision the controller:

    .. code:: console

       kayobe overcloud deprovision -l <hostname>

- 5. Reprovision the controller:
+ 6. Reprovision the controller:

    .. code:: console

       kayobe overcloud provision -l <hostname>

- 6. Host configure:
+ 7. Host configure:

    .. code:: console

       kayobe overcloud host configure -l <hostname> -kl <hostname>

- 7. Service deploy on all controllers:
+ 8. If the controller is running Ceph OSD services:
+
+    1. Make sure the cephadm public key is in ``authorized_keys`` for the
+       stack or root user, depending on your setup. For example, your SSH
+       key may already be defined in ``users.yml``. If in doubt, run the
+       cephadm deploy playbook to copy the SSH key and install the cephadm
+       binary:
+
+       .. code-block:: console
+
+          kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml
+
+    2. Take the host out of maintenance mode:
+
+       .. code-block:: console
+
+          ceph orch host maintenance exit <hostname>
+
+    3. Make sure that everything is back in working condition before moving
+       on to the next host:
+
+       .. code-block:: console
+
+          ceph -s
+          ceph -w
+
+ 9. Service deploy on all controllers:

    .. code:: console

       kayobe overcloud service deploy -kl controllers

- 8. If using OVN, check OVN northbound DB cluster state on all controllers to see if the new host has joined:
+ 10. If using OVN, check OVN northbound DB cluster state on all controllers to see if the new host has joined:

-    .. code:: console
+     .. code:: console

-       kayobe overcloud host command run --command 'docker exec -it ovn_nb_db ovs-appctl -t /run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' --show-output -l controllers
+        kayobe overcloud host command run --command 'docker exec -it ovn_nb_db ovs-appctl -t /run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' --show-output -l controllers

- 9. If using OVN, check OVN southbound DB cluster state on all controllers to see if the new host has joined:
+ 11. If using OVN, check OVN southbound DB cluster state on all controllers to see if the new host has joined:

-    .. code:: console
+     .. code:: console

-       kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers
+        kayobe overcloud host command run --command 'docker exec -it ovn_sb_db ovs-appctl -t /run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound' --show-output -l controllers

- 10. Some MariaDB instability has been observed. The exact cause is unknown but
+ 12. Some MariaDB instability has been observed. The exact cause is unknown but
      the simplest fix seems to be to run the Kayobe database recovery tool
      between migrations.
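A sketch of that recovery step, using Kayobe's database recovery command
(verify the exact invocation against your Kayobe version):

.. code-block:: console

   kayobe overcloud database recover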
@@ -291,25 +329,64 @@ Full procedure for one batch of hosts
      kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/nova-compute-{disable,drain}.yml --limit <host>

- 2. Deprovision the compute node:
+ 2. If the compute node is running Ceph OSD services:
+
+    1. Set the host in maintenance mode:
+
+       .. code-block:: console
+
+          ceph orch host maintenance enter <hostname>
+
+    2. Check there's nothing remaining on the host:
+
+       .. code-block:: console
+
+          ceph orch ps <hostname>
+
+ 3. Deprovision the compute node:

    .. code:: console

       kayobe overcloud deprovision -l <hostname>

- 3. Reprovision the compute node:
+ 4. Reprovision the compute node:

    .. code:: console

       kayobe overcloud provision -l <hostname>

- 4. Host configure:
+ 5. Host configure:

    .. code:: console

       kayobe overcloud host configure -l <hostname> -kl <hostname>

- 5. Service deploy:
+ 6. If the compute node is running Ceph OSD services:
+
+    1. Make sure the cephadm public key is in ``authorized_keys`` for the
+       stack or root user, depending on your setup. For example, your SSH
+       key may already be defined in ``users.yml``. If in doubt, run the
+       cephadm deploy playbook to copy the SSH key and install the cephadm
+       binary:
+
+       .. code-block:: console
+
+          kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml
+
+    2. Take the host out of maintenance mode:
+
+       .. code-block:: console
+
+          ceph orch host maintenance exit <hostname>
+
+    3. Make sure that everything is back in working condition before moving
+       on to the next host:
+
+       .. code-block:: console
+
+          ceph -s
+          ceph -w
+
+ 7. Service deploy:

    .. code:: console
@@ -320,8 +397,6 @@ If any VMs were powered off, they may now be powered back on.
Wait for Prometheus alerts and errors in OpenSearch Dashboards to resolve, or
address them.

- After updating controllers or network hosts, run any appropriate smoke tests.
-
Once happy that the system has been restored to full health, move on to the next
host or batch of hosts.
@@ -380,13 +455,13 @@ Full procedure for any storage host
      kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/cephadm-deploy.yml

- 6. Take the host out of maintenance mode:
+ 7. Take the host out of maintenance mode:

    .. code-block:: console

       ceph orch host maintenance exit <hostname>

- 7. Make sure that everything is back in working condition before moving
+ 8. Make sure that everything is back in working condition before moving
    on to the next host:

    .. code-block:: console
@@ -426,75 +501,81 @@ Full procedure
      lsblk

- 2. If the data volume is not mounted at either ``/var/lib/docker`` or
+ 2. Use `mysqldump
+    <https://docs.openstack.org/kayobe/yoga/administration/seed.html#database-backup-restore>`_
+    to take a backup of the MariaDB database. Copy the backup file to one of
+    the Bifrost container's persistent volumes, such as ``/var/lib/ironic/`` in
+    the ``bifrost_deploy`` container.
+
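A minimal sketch of this backup step, assuming passwordless MariaDB root
access within the ``bifrost_deploy`` container (the backup filename is
illustrative; adjust credentials and paths to your deployment):

.. code-block:: console

   sudo docker exec bifrost_deploy mysqldump --all-databases \
       --single-transaction --result-file=/var/lib/ironic/seed-db-backup.sql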
+ 3. If the data volume is not mounted at either ``/var/lib/docker`` or
    ``/var/lib/docker/volumes``, make an external copy of the data
    somewhere on the seed hypervisor.
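One possible way to take that copy, assuming SSH access from the seed
hypervisor to the seed VM (the ``seed`` host alias and destination path are
illustrative):

.. code-block:: console

   rsync -a --rsync-path='sudo rsync' seed:/var/lib/docker/ ~/seed-docker-backup/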
- 3. On the seed, stop the MariaDB process within the bifrost_deploy
+ 4. On the seed, stop the MariaDB process within the bifrost_deploy
    container:

    .. code:: console

       sudo docker exec bifrost_deploy systemctl stop mariadb

- 4. On the seed, stop Docker:
+ 5. On the seed, stop Docker:

    .. code:: console

       sudo systemctl stop docker

- 5. On the seed, shut down the host:
+ 6. On the seed, shut down the host:

    .. code:: console

       sudo systemctl poweroff

- 6. Wait for the VM to shut down:
+ 7. Wait for the VM to shut down:

    .. code:: console

       watch sudo virsh list --all

- 7. Back up the VM volumes on the seed hypervisor:
+ 8. Back up the VM volumes on the seed hypervisor:

    .. code:: console

       sudo mkdir /var/lib/libvirt/images-backup
       sudo cp -a /var/lib/libvirt/images/. /var/lib/libvirt/images-backup/

- 8. Delete the seed root volume (check the structure & naming
+ 9. Delete the seed root volume (check the structure & naming
    conventions first):

    .. code:: console

       sudo virsh vol-delete seed-root --pool default

- 9. Reprovision the seed:
+ 10. Reprovision the seed:

-    .. code:: console
+     .. code:: console

-       kayobe seed vm provision
+        kayobe seed vm provision

- 10. Seed host configure:
+ 11. Seed host configure:

     .. code:: console

        kayobe seed host configure

- 11. Rebuild seed container images (if using locally-built rather than
+ 12. Rebuild seed container images (if using locally-built rather than
      release train images):

     .. code:: console

        kayobe seed container image build --push

- 12. Service deploy:
+ 13. Service deploy:

     .. code:: console

        kayobe seed service deploy

- 13. Verify that Bifrost/Ironic is healthy.
+ 14. Verify that Bifrost/Ironic is healthy.
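One possible health check, assuming the ``bifrost_deploy`` container and the
``bifrost`` cloud defined in its ``clouds.yaml`` (output will vary with your
deployment):

.. code-block:: console

   sudo docker exec -it bifrost_deploy \
       bash -c 'export OS_CLOUD=bifrost && openstack baremetal node list'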
Seed hypervisor
===============