Commit 0a12762

Document new issues seen with Storage hosts
1 parent 533ee57 commit 0a12762

1 file changed

+47
-5
lines changed

doc/source/operations/rocky-linux-9.rst

Lines changed: 47 additions & 5 deletions
@@ -233,11 +233,14 @@ Potential issues
233233
.. code-block:: yaml
234234
235235
mariabackup_image_full: "{{ docker_registry }}/stackhpc/rocky-source-mariadb-server:yoga-20230310T170929"
236-
- When using Octavia load balancers, restarting Neutron causes load balancers
237-
with floating IPs to stop processing traffic. See `LP#2042938
238-
<https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239-
may be worked around after Neutron has been restarted by detaching then
240-
reattaching the floating IP to the load balancer's virtual IP.
236+
- When using Octavia load balancers, restarting Neutron causes load balancers
237+
with floating IPs to stop processing traffic. See `LP#2042938
238+
<https://bugs.launchpad.net/neutron/+bug/2042938>`__ for details. The issue
239+
may be worked around after Neutron has been restarted by detaching then
240+
reattaching the floating IP to the load balancer's virtual IP.
241+
242+
- If you are using hyper-converged Ceph, please also note the potential issues
243+
in the Storage section below.
241244

242245
Full procedure for one host
243246
---------------------------
@@ -466,6 +469,45 @@ Potential issues
466469
be identical, now that the "maintenance mode approach" is being used.
467470
It is still recommended to do the bootstrap host last.
468471

472+
- Prior to reprovisioning the bootstrap host, it can be beneficial to back up
473+
``/etc/ceph`` and ``/var/lib/ceph``, as sometimes the keys, config, etc.
474+
stored here will not be moved/recreated correctly.
475+
476+
- When a host is taken out of maintenance, you may see errors relating to
477+
permissions of ``/tmp/etc`` and ``/tmp/var``. These issues should be resolved in
478+
Ceph version 17.2.7. See pull request: https://github.com/ceph/ceph/pull/50736. In
479+
the meantime, you can work around this by running the command below. You may
480+
need to omit one or the other of ``/tmp/etc`` and ``/tmp/var``. You will
481+
likely need to run this multiple times. Run ``ceph -W cephadm`` to monitor
482+
the logs and see when permissions issues are hit.
483+
484+
.. code-block:: console
485+
486+
kayobe overcloud host command run --command "chown -R stack:stack /tmp/etc /tmp/var" -b -l storage
487+
488+
489+
- It has been seen that sometimes the Ceph containers do not come up after
490+
reprovisioning. This seems to be related to having ``/var/lib/ceph``
491+
persisted through the reprovision (e.g. seen at a customer in a volume
492+
with software RAID). (Note: further investigation is needed for the root
493+
cause). When this occurs, you will need to redeploy the daemons:
494+
495+
List the daemons on the host:
496+
497+
.. code-block:: console
498+
499+
ceph orch ps <hostname>
500+
501+
502+
Redeploy the daemons, one at a time. It is recommended that you start with
503+
the crash daemon, as this will have the least impact if unexpected issues
504+
occur.
505+
506+
.. code-block:: console
507+
508+
ceph orch daemon redeploy <daemon name>
509+
510+
469511
- Commands starting with ``ceph`` are all run on the cephadm bootstrap
470512
host in a cephadm shell unless stated otherwise.
471513
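The redeploy ordering described in the Storage hunk above (crash daemon first, then the rest, one at a time) can be sketched as a small shell helper. This is a minimal illustration, not part of Kayobe or Ceph: the function name ``redeploy_plan`` and the example daemon names are hypothetical, and the helper only prints the ``ceph orch daemon redeploy`` commands rather than running them, so each one can be reviewed and executed by hand in the cephadm shell.

```shell
#!/usr/bin/env bash
# Hypothetical helper (assumption, not shipped with Kayobe or Ceph).
# Takes daemon names as arguments, as listed by `ceph orch ps <hostname>`,
# and prints one redeploy command per daemon, crash daemons first.
# Dry run only: nothing is executed against the cluster.
redeploy_plan() {
    local name

    # First pass: crash daemons, which have the least impact if
    # unexpected issues occur.
    for name in "$@"; do
        case "$name" in
            crash.*) echo "ceph orch daemon redeploy $name" ;;
        esac
    done

    # Second pass: all remaining daemons, in the order they were given.
    for name in "$@"; do
        case "$name" in
            crash.*) ;;
            *) echo "ceph orch daemon redeploy $name" ;;
        esac
    done
}
```

For example, ``redeploy_plan mon.storage-01 crash.storage-01 osd.3`` (made-up names) prints the crash daemon's command first, followed by the others. Running the printed commands one at a time, and checking ``ceph orch ps <hostname>`` between them, keeps to the one-daemon-at-a-time recommendation.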
