Commit 76e133a

Merge pull request #799 from stackhpc/rl9-migrations-ceph-issues
Document new issues seen with Storage hosts
2 parents e8c7879 + 5f00e91 commit 76e133a

1 file changed, +46 -5 lines changed

doc/source/operations/rocky-linux-9.rst

Lines changed: 46 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -233,11 +233,14 @@ Potential issues
233 233
.. code-block:: yaml
234 234
235 235
mariabackup_image_full: "{{ docker_registry }}/stackhpc/rocky-source-mariadb-server:yoga-20230310T170929"
236-
- When using Octavia load balancers, restarting Neutron causes load balancers
237-
with floating IPs to stop processing traffic. See `LP#2042938
238-
<https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239-
may be worked around after Neutron has been restarted by detaching then
240-
reattaching the floating IP to the load balancer's virtual IP.
236+
- When using Octavia load balancers, restarting Neutron causes load balancers
237+
with floating IPs to stop processing traffic. See `LP#2042938
238+
<https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239+
may be worked around after Neutron has been restarted by detaching then
240+
reattaching the floating IP to the load balancer's virtual IP.
241+
242+
- If you are using hyper-converged Ceph, please also note the potential issues
243+
in the Storage section below.
241 244

242 245
Full procedure for one host
243 246
---------------------------
@@ -466,6 +469,44 @@ Potential issues
466 469
be identical, now that the "maintenance mode approach" is being used.
467 470
It is still recommended to do the bootstrap host last.
468 471

472+
- Prior to reprovisioning the bootstrap host, it can be beneficial to back up
473+
``/etc/ceph`` and ``/var/lib/ceph``, as the keys, config, etc. stored there
474+
are sometimes not moved or recreated correctly.
475+
476+
- When a host is taken out of maintenance, you may see errors relating to
477+
the permissions of ``/tmp/etc`` and ``/tmp/var``. These issues should be
478+
resolved in Ceph version 17.2.7; see https://github.com/ceph/ceph/pull/50736.
479+
In the meantime, you can work around this by running the command below. You
480+
may need to omit one or the other of ``/tmp/etc`` and ``/tmp/var``, and you
481+
will likely need to run it multiple times. Run ``ceph -W cephadm`` to monitor
482+
the logs and see when permission issues are hit.
483+
484+
.. code-block:: console
485+
486+
kayobe overcloud host command run --command "chown -R stack:stack /tmp/etc /tmp/var" -b -l storage
487+
488+
- Sometimes the Ceph containers do not come up after reprovisioning. This
489+
seems to be related to ``/var/lib/ceph`` persisting through the reprovision
490+
(e.g. this was seen at a customer site, in a volume with software RAID).
491+
Further investigation is needed to establish the root cause. When this
492+
occurs, you will need to redeploy the daemons:
493+
494+
List the daemons on the host:
495+
496+
.. code-block:: console
497+
498+
ceph orch ps <hostname>
499+
500+
501+
Redeploy the daemons, one at a time. It is recommended that you start with
502+
the crash daemon, as this will have the least impact if unexpected issues
503+
occur.
504+
505+
.. code-block:: console
506+
507+
ceph orch daemon redeploy <daemon name>
508+
509+
469 510
- Commands starting with ``ceph`` are all run on the cephadm bootstrap
470 511
host in a cephadm shell unless stated otherwise.
471 512
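
The redeploy order recommended in this change (crash daemon first, then the
others one at a time) could be sketched in shell as follows. This is only an
illustration and is not part of the commit: the function names
``order_daemons`` and ``print_redeploy_commands`` are hypothetical, and the
daemon names are assumed to be supplied by hand, one per line, from the
``NAME`` column of ``ceph orch ps <hostname>``. The sketch only prints the
``ceph orch daemon redeploy`` commands, so each one can be reviewed and run
individually.

.. code-block:: shell

   # Hypothetical helper: read daemon names on stdin (one per line) and
   # emit them with any "crash.*" daemons first, then the rest in their
   # original order.
   order_daemons() (
       names="$(cat)"
       printf '%s\n' "$names" | grep '^crash\.' || true
       printf '%s\n' "$names" | grep -v '^crash\.' || true
   )

   # Hypothetical helper: print (do not run) one redeploy command per
   # daemon, in the order produced by order_daemons.
   print_redeploy_commands() {
       order_daemons | while IFS= read -r daemon; do
           if [ -n "$daemon" ]; then
               printf 'ceph orch daemon redeploy %s\n' "$daemon"
           fi
       done
   }

For example, ``printf 'mon.host1\ncrash.host1\nosd.3\n' | print_redeploy_commands``
lists the command for ``crash.host1`` before those for ``mon.host1`` and
``osd.3``. Printing rather than executing keeps the operator in control of the
"one at a time" step, which matters if a redeploy fails part way through.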
