Commit e27a120

Merge branch 'stackhpc/2024.1' into caracal-odds-and-ends
2 parents 8fe2d3a + 88a6397 commit e27a120

52 files changed: +285 -77 lines changed

.github/workflows/stackhpc-all-in-one.yml

Lines changed: 42 additions & 1 deletion
@@ -264,6 +264,28 @@ jobs:
         run: |
           docker image pull $KAYOBE_IMAGE

+      # Rocky 9 OVN deployments will fail when the hostname contains a '.'
+      - name: Fix hostname
+        run: |
+          docker run -t --rm \
+            -v $(pwd):/stack/kayobe-automation-env/src/kayobe-config \
+            -e KAYOBE_ENVIRONMENT -e KAYOBE_VAULT_PASSWORD -e KAYOBE_AUTOMATION_SSH_PRIVATE_KEY \
+            ${{ steps.kayobe_image.outputs.kayobe_image }} \
+            /stack/kayobe-automation-env/src/kayobe-config/.automation/pipeline/playbook-run.sh etc/kayobe/ansible/fix-hostname.yml
+        env:
+          KAYOBE_AUTOMATION_SSH_PRIVATE_KEY: ${{ steps.ssh_key.outputs.ssh_key }}
+
+      # Reboot to Apply hostname change
+      - name: Reboot
+        run: |
+          docker run -t --rm \
+            -v $(pwd):/stack/kayobe-automation-env/src/kayobe-config \
+            -e KAYOBE_ENVIRONMENT -e KAYOBE_VAULT_PASSWORD -e KAYOBE_AUTOMATION_SSH_PRIVATE_KEY \
+            ${{ steps.kayobe_image.outputs.kayobe_image }} \
+            /stack/kayobe-automation-env/src/kayobe-config/.automation/pipeline/playbook-run.sh etc/kayobe/ansible/reboot.yml -e reboot_with_bootstrap_user=true
+        env:
+          KAYOBE_AUTOMATION_SSH_PRIVATE_KEY: ${{ steps.ssh_key.outputs.ssh_key }}
+
       - name: Run growroot
         run: |
           docker run -t --rm \
@@ -304,10 +326,29 @@ jobs:
         env:
           KAYOBE_AUTOMATION_SSH_PRIVATE_KEY: ${{ steps.ssh_key.outputs.ssh_key }}

+      - name: Change rabbit queues from HA to Quorum
+        run: |
+          sed -i -e 's/om_enable_rabbitmq_high_availability: true/om_enable_rabbitmq_high_availability: false/' \
+            -e 's/om_enable_rabbitmq_quorum_queues: false/om_enable_rabbitmq_quorum_queues: true/' \
+            etc/kayobe/environments/ci-aio/kolla/globals.yml
+        if: inputs.upgrade
+
+      - name: Migrate RabbitMQ queues
+        run: |
+          docker run -t --rm \
+            -v $(pwd):/stack/kayobe-automation-env/src/kayobe-config \
+            -e KAYOBE_ENVIRONMENT -e KAYOBE_VAULT_PASSWORD -e KAYOBE_AUTOMATION_SSH_PRIVATE_KEY \
+            ${{ steps.kayobe_image.outputs.kayobe_image }} \
+            /stack/kayobe-automation-env/src/kayobe-config/.automation/pipeline/script-run.sh tools/rabbitmq-quorum-migration.sh
+        env:
+          KAYOBE_AUTOMATION_SSH_PRIVATE_KEY: ${{ steps.ssh_key.outputs.ssh_key }}
+        if: inputs.upgrade
+
       # If testing upgrade, checkout the current release branch
       # Stash changes to tracked files, and set clean=false to avoid removing untracked files.
+      # Revert changes to RabbitMQ Queue types to avoid a merge conflict
       - name: Stash config changes
-        run: git stash
+        run: git restore etc/kayobe/environments/ci-aio/kolla/globals.yml && git stash
         if: inputs.upgrade

       - name: Checkout current release config
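
The queue-type switch made by the "Change rabbit queues from HA to Quorum" step can be tried in isolation. The sketch below runs the same two `sed` expressions over an in-memory copy of the two settings; the two-line input is an assumption for illustration and is not the real contents of `globals.yml`. It pipes through `sed` rather than editing in place with `-i`, so it does not depend on GNU sed.

```shell
#!/bin/sh
set -eu

# Hypothetical stand-in for the relevant lines of
# etc/kayobe/environments/ci-aio/kolla/globals.yml.
original='om_enable_rabbitmq_high_availability: true
om_enable_rabbitmq_quorum_queues: false'

# Same two expressions as the workflow step: turn HA queues off and quorum queues on.
migrated=$(printf '%s\n' "$original" | sed \
    -e 's/om_enable_rabbitmq_high_availability: true/om_enable_rabbitmq_high_availability: false/' \
    -e 's/om_enable_rabbitmq_quorum_queues: false/om_enable_rabbitmq_quorum_queues: true/')

printf '%s\n' "$migrated"
```

Each expression only matches when the file holds the exact old value, so a file that already uses quorum queues is left unchanged.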

doc/source/_static/images/release-train.svg

Lines changed: 1 addition & 1 deletion

doc/source/configuration/cephadm.rst

Lines changed: 16 additions & 1 deletion
@@ -345,6 +345,10 @@ should be used in the Kolla Manila configuration e.g.:
 RADOS Gateways
 --------------

+RADOS Gateway integration is described in the :kolla-ansible-doc:`Kolla Ansible
+documentation
+<https://docs.openstack.org/kolla-ansible/latest/reference/storage/external-ceph-guide.html#radosgw>`.
+
 RADOS Gateways (RGWs) are defined with the following:

 .. code:: yaml
@@ -375,7 +379,7 @@ The set of commands below configure all of these.
      - "config set client.rgw rgw_enable_apis 's3, swift, swift_auth, admin'"
      - "config set client.rgw rgw_enforce_swift_acls true"
      - "config set client.rgw rgw_keystone_accepted_admin_roles 'admin'"
-     - "config set client.rgw rgw_keystone_accepted_roles 'member, Member, _member_, admin'"
+     - "config set client.rgw rgw_keystone_accepted_roles 'member, admin'"
      - "config set client.rgw rgw_keystone_admin_domain Default"
      - "config set client.rgw rgw_keystone_admin_password {{ secrets_ceph_rgw_keystone_password }}"
      - "config set client.rgw rgw_keystone_admin_project service"
@@ -391,6 +395,12 @@ The set of commands below configure all of these.
      - "config set client.rgw rgw_swift_account_in_url true"
      - "config set client.rgw rgw_swift_versioning_enabled true"

+Enable the Kolla Ansible RADOS Gateway integration in ``kolla.yml``:
+
+.. code:: yaml
+
+   kolla_enable_ceph_rgw: true
+
 As we have configured Ceph to respond to Swift APIs, you will need to tell
 Kolla to account for this when registering Swift endpoints with Keystone. Also,
 when ``rgw_swift_account_in_url`` is set, the equivalent Kolla variable should
@@ -412,6 +422,11 @@ before deploying the RADOS gateways. If you are using the Kolla load balancer

    kayobe overcloud service deploy -kt ceph-rgw,keystone,haproxy,loadbalancer

+There are two options for load balancing RADOS Gateway:
+
+1. HA with Ceph Ingress services
+2. RGWs with hyper-converged Ceph (using the Kolla Ansible deployed HAProxy
+   load balancer)

 .. _RGWs-with-hyper-converged-Ceph:

doc/source/configuration/magnum-capi.rst

Lines changed: 4 additions & 9 deletions
@@ -60,12 +60,12 @@ To deploy the CAPI management cluster using this site-specific environment, run

 .. code-block:: bash

-    # Activate the environment
-    ./bin/activate <site-specific-name>
-
     # Install or update the local Ansible Python venv
     ./bin/ensure-venv

+    # Activate the environment
+    source bin/activate <site-specific-name>
+
     # Install or update Ansible dependencies
     ansible-galaxy install -f -r ./requirements.yml

@@ -103,12 +103,7 @@ To configure the Magnum service with the Cluster API driver enabled, first ensur

 Next, copy the CAPI management cluster's kubeconfig file into your stackhpc-kayobe-config environment (e.g. ``<your-skc-environment>/kolla/config/magnum/kubeconfig``). This file must be Ansible vault encrypted.

-The following config should also be set in your stackhpc-kayobe-config environment:
-
-.. code-block:: yaml
-    :caption: kolla/globals.yml
-
-    magnum_capi_helm_driver_enabled: true
+The presence of a kubeconfig file in the Magnum config directory is used by Kolla to determine whether the CAPI Helm driver should be enabled.

 To apply the configuration, run ``kayobe overcloud service reconfigure -kt magnum``.

doc/source/configuration/wazuh.rst

Lines changed: 0 additions & 2 deletions
@@ -12,7 +12,6 @@ The short version
    particular the defaults assume that the ``provision_oc_net`` network will be
    used.
 #. Generate secrets: ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-secrets.yml``
-#. Encrypt the secrets: ``ansible-vault encrypt --vault-password-file ~/vault.password $KAYOBE_CONFIG_PATH/environments/ci-multinode/wazuh-secrets.yml``
 #. Deploy the Wazuh manager: ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-manager.yml``
 #. Deploy the Wazuh agents: ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-agent.yml``

@@ -250,7 +249,6 @@ It will be used by wazuh secrets playbook to generate wazuh secrets vault file.
 .. code-block:: console

    kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-secrets.yml
-   ansible-vault encrypt --vault-password-file ~/vault.pass $KAYOBE_CONFIG_PATH/wazuh-secrets.yml

 Configure Wazuh Dashboard's Server Host
 ---------------------------------------

doc/source/operations/upgrading-ceph.rst

Lines changed: 4 additions & 3 deletions
@@ -63,7 +63,7 @@ Place the host or batch of hosts into maintenance mode:

 .. code-block:: console

-   sudo cephadm shell -- ceph orch host maintenance enter <host>
+   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-enter-maintenance.yml -l <host>

 To update all eligible packages, use ``*``, escaping if necessary:

@@ -72,7 +72,8 @@ To update all eligible packages, use ``*``, escaping if necessary:
    kayobe overcloud host package update --packages "*" --limit <host>

 If the kernel has been upgraded, reboot the host or batch of hosts to pick up
-the change:
+the change. While running this playbook, consider setting ``ANSIBLE_SERIAL`` to
+the maximum number of hosts that can safely reboot concurrently.

 .. code-block:: console

@@ -82,7 +83,7 @@ Remove the host or batch of hosts from maintenance mode:

 .. code-block:: console

-   sudo cephadm shell -- ceph orch host maintenance exit <host>
+   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-exit-maintenance.yml -l <host>

 Wait for Ceph health to return to ``HEALTH_OK``:

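The `ANSIBLE_SERIAL` advice in the documentation change above relies on the `serial` expression in `etc/kayobe/ansible/reboot.yml`, which reads the environment variable and falls back to 1 when it is unset or empty. A rough shell equivalent of that fallback is sketched below; the function name `resolve_serial` is hypothetical and exists only for illustration.

```shell
#!/bin/sh
set -eu

# Shell counterpart of "lookup('env', 'ANSIBLE_SERIAL') | default(1, true)":
# use the variable's value, or 1 when it is unset or an empty string.
resolve_serial() {
    printf '%s\n' "${ANSIBLE_SERIAL:-1}"
}

echo "Hosts rebooted per batch: $(resolve_serial)"
```

Under this reading, exporting `ANSIBLE_SERIAL=3` before running the reboot playbook would reboot hosts in batches of three, and leaving it unset keeps the one-at-a-time default.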
etc/kayobe/ansible/ceph-enter-maintenance.yml

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+---
+- name: Ensure a Ceph host has entered maintenance
+  gather_facts: true
+  any_errors_fatal: true
+  # We need to check whether it is OK to stop hosts after previous hosts have
+  # entered maintenance.
+  serial: 1
+  hosts: ceph
+  become: true
+  tasks:
+    - name: Ensure a Ceph host has entered maintenance
+      ansible.builtin.import_role:
+        name: stackhpc.cephadm.enter_maintenance

etc/kayobe/ansible/ceph-exit-maintenance.yml

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+---
+- name: Ensure a Ceph host has exited maintenance
+  gather_facts: true
+  any_errors_fatal: true
+  hosts: ceph
+  # The role currently requires hosts to exit maintenance serially.
+  serial: 1
+  become: true
+  tasks:
+    - name: Ensure a Ceph host has exited maintenance
+      ansible.builtin.import_role:
+        name: stackhpc.cephadm.exit_maintenance

etc/kayobe/ansible/cis.yml

Lines changed: 0 additions & 2 deletions
@@ -35,9 +35,7 @@
     - include_role:
         name: ansible-lockdown.rhel9_cis
       when: ansible_facts.os_family == 'RedHat' and ansible_facts.distribution_major_version == '9'
-      tags: always

     - include_role:
         name: ansible-lockdown.ubuntu22_cis
       when: ansible_facts.distribution == 'Ubuntu' and ansible_facts.distribution_major_version == '22'
-      tags: always

etc/kayobe/ansible/fix-hostname.yml

Lines changed: 2 additions & 2 deletions
@@ -1,6 +1,6 @@
 ---
-- name: Fix hostname on storage nodes for cephadm
-  hosts: storage
+- name: Ensure hostnames match inventory hostnames
+  hosts: fix-hostname
   gather_facts: false
   vars:
     ansible_user: "{{ bootstrap_user }}"

etc/kayobe/ansible/prometheus-network-names.yml

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+---
 - name: Prometheus friendly network names
   hosts: overcloud
   gather_facts: no

etc/kayobe/ansible/reboot.yml

Lines changed: 17 additions & 0 deletions
@@ -2,9 +2,26 @@
 - name: Reboot the host
   hosts: seed-hypervisor:seed:overcloud:infra-vms
   serial: "{{ lookup('env', 'ANSIBLE_SERIAL') | default(1, true) }}"
+  gather_facts: false
+  vars:
+    reboot_timeout_s: "{{ 20 * 60 }}"
+    reboot_with_bootstrap_user: false
+    ansible_user: "{{ bootstrap_user if reboot_with_bootstrap_user | bool else kayobe_ansible_user }}"
+    ansible_ssh_common_args: "{{ '-o StrictHostKeyChecking=no' if reboot_with_bootstrap_user | bool else '' }}"
+    ansible_python_interpreter: "/usr/bin/python3"
   tags:
     - reboot
   tasks:
     - name: Reboot and wait
       become: true
       reboot:
+        reboot_timeout: "{{ reboot_timeout_s }}"
+        search_paths:
+          # Systems running molly-guard hang waiting for confirmation before rebooting without this.
+          - "/lib/molly-guard"
+          # Default list:
+          - "/sbin"
+          - "/bin"
+          - "/usr/sbin"
+          - "/usr/bin"
+          - "/usr/local/sbin"
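
The order of `search_paths` matters because `/lib/molly-guard` is listed ahead of the default directories. The sketch below illustrates a first-match lookup over the same directory list. It assumes the `reboot` module uses the first `shutdown` binary it finds in list order; the helper name `first_match` is hypothetical and this is not the module's actual implementation.

```shell
#!/bin/sh
set -eu

# Print the first "<dir>/<command>" that is an executable regular file,
# trying the directories in the order given. Returns 1 if none matches.
first_match() {
    cmd=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    return 1
}

# Same directory order as search_paths in the playbook.
first_match shutdown /lib/molly-guard /sbin /bin /usr/sbin /usr/bin /usr/local/sbin \
    || echo "shutdown not found in any search path"
```

On a host with molly-guard installed, a lookup like this would report the binary under `/lib/molly-guard` before any copy in `/sbin`.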

etc/kayobe/ansible/requirements.yml

Lines changed: 2 additions & 2 deletions
@@ -1,15 +1,15 @@
 ---
 collections:
   - name: stackhpc.cephadm
-    version: 1.15.1
+    version: 1.18.0
   # NOTE: Pinning pulp.squeezer to 0.0.13 because 0.0.14+ depends on the
   # pulp_glue Python library being installed.
   - name: pulp.squeezer
     version: 0.0.13
   - name: stackhpc.pulp
     version: 0.5.5
   - name: stackhpc.hashicorp
-    version: 2.5.0
+    version: 2.5.1
   - name: stackhpc.kayobe_workflows
     version: 1.0.3
 roles:

etc/kayobe/ansible/stackhpc-openstack-tests.yml

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@
         depth: 1
         single_branch: true

-    - name: Ensure the latest versions of pip and setuptools are installed # noqa package-latest
+    - name: Ensure the latest versions of pip and setuptools are installed # noqa package-latest
       ansible.builtin.pip:
         name: "{{ item.name }}"
         state: latest

etc/kayobe/ansible/templates/wazuh-secrets.yml.j2

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ secrets_wazuh:
   # Strengthen default wazuh api user pass
   wazuh_api_users:
     - username: "wazuh"
-      password: "{{ secrets_wazuh.wazuh_api_users[0].password | default(lookup('community.general.random_string', min_lower=1, min_upper=1, min_special=1, min_numeric=1, length=30)) }}"
+      password: "{{ secrets_wazuh.wazuh_api_users[0].password | default(lookup('community.general.random_string', min_lower=1, min_upper=1, min_special=1, min_numeric=1, length=30, override_special=override_special_characters)) }}"
   # OpenSearch 'admin' user pass
   opendistro_admin_password: "{{ secrets_wazuh.opendistro_admin_password | default(lookup('password', '/dev/null'), true) }}"
   # OpenSearch 'kibanaserver' user pass

etc/kayobe/ansible/ubuntu-upgrade.yml

Lines changed: 18 additions & 0 deletions
@@ -40,6 +40,15 @@
       reboot:
         reboot_timeout: "{{ reboot_timeout_s }}"
         connect_timeout: 600
+        search_paths:
+          # Systems running molly-guard hang waiting for confirmation before rebooting without this.
+          - "/lib/molly-guard"
+          # Default list:
+          - "/sbin"
+          - "/bin"
+          - "/usr/sbin"
+          - "/usr/bin"
+          - "/usr/local/sbin"
       become: true
       when: file_status.stat.exists

@@ -101,6 +110,15 @@
       reboot:
         reboot_timeout: "{{ reboot_timeout_s }}"
         connect_timeout: 600
+        search_paths:
+          # Systems running molly-guard hang waiting for confirmation before rebooting without this.
+          - "/lib/molly-guard"
+          # Default list:
+          - "/sbin"
+          - "/bin"
+          - "/usr/sbin"
+          - "/usr/bin"
+          - "/usr/local/sbin"
       become: true

     - name: Update distribution facts

etc/kayobe/ansible/wazuh-secrets.yml

Lines changed: 1 addition & 0 deletions
@@ -3,6 +3,7 @@
   gather_facts: false
   vars:
     wazuh_secrets_path: "{{ kayobe_env_config_path }}/wazuh-secrets.yml"
+    override_special_characters: '"#$%&()*+,-./:;<=>?@[\]^_{|}~'
   tasks:
     - name: install passlib[bcrypt]
       pip:
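
To see which special characters `override_special_characters` leaves out, the sketch below compares it with a default set. It assumes the default special characters of the `community.general.random_string` lookup match Python's `string.punctuation`; that default is an assumption here and is not stated in this commit.

```shell
#!/bin/sh
set -eu

# Assumed default special characters (Python's string.punctuation).
default_special='!"#$%&'\''()*+,-./:;<=>?@[\]^_`{|}~'
# Value of override_special_characters added in wazuh-secrets.yml.
override_special='"#$%&()*+,-./:;<=>?@[\]^_{|}~'

excluded=''
rest=$default_special
while [ -n "$rest" ]; do
    # Take the first character, then drop it from the remainder.
    char=$(printf '%.1s' "$rest")
    rest=${rest#?}
    # Fixed-string search, so characters such as * or \ are matched literally.
    if printf '%s' "$override_special" | grep -qF -- "$char"; then
        :
    else
        excluded=$excluded$char
    fi
done

printf 'Excluded characters: %s\n' "$excluded"
```

Under that assumption, the override drops the exclamation mark, the single quote and the backtick from generated Wazuh API passwords.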

etc/kayobe/apt.yml

Lines changed: 3 additions & 3 deletions
@@ -74,15 +74,15 @@ stackhpc_apt_repositories:
 # Do not replace apt configuration for non-overcloud hosts. This can result in
 # errors if apt reconfiguration is performed before local repository mirrors
 # are deployed.
-apt_repositories: "{{ stackhpc_apt_repositories | selectattr('required') | list if 'overcloud' in group_names else [] }}"
+apt_repositories: "{{ stackhpc_apt_repositories | selectattr('required') | list if stackhpc_repos_enabled | bool else [] }}"

 # Whether to disable repositories in /etc/apt/sources.list. This may be used
 # when replacing the distribution repositories via apt_repositories.
 # Default is false.
 # Do not disable the default apt configuration for non-overcloud hosts. This
 # can result in errors if apt reconfiguration is performed before local
 # repository mirrors are deployed.
-apt_disable_sources_list: "{{ 'overcloud' in group_names }}"
+apt_disable_sources_list: "{{ stackhpc_repos_enabled | bool }}"

 # Apt auth configuration for accessing the package repository mirror.
 stackhpc_apt_auth:
@@ -98,7 +98,7 @@ stackhpc_apt_auth:
 # * filename: Name of a file in which to store the auth configuration. The
 # extension should be '.conf'.
 # Default is an empty list.
-apt_auth: "{{ stackhpc_apt_auth if 'overcloud' in group_names and stackhpc_repo_mirror_username is truthy else [] }}"
+apt_auth: "{{ stackhpc_apt_auth if stackhpc_repos_enabled | bool and stackhpc_repo_mirror_username is truthy else [] }}"

 ###############################################################################
 # Dummy variable to allow Ansible to accept this file.

etc/kayobe/dnf.yml

Lines changed: 2 additions & 2 deletions
@@ -41,10 +41,10 @@
 # file: myrepo
 # gpgkey: http://gpgkey
 # gpgcheck: yes
-#dnf_custom_repos:
+dnf_custom_repos: "{{ stackhpc_dnf_repos if stackhpc_repos_enabled | bool else [] }}"

 # A dict of custom repositories that point to the local Pulp server.
-# To use these repos, set dnf_custom_repos to the value of stackhpc_dnf_repos.
+# To use these repos, set stackhpc_repos_enabled to true.
 # This is done by default for hosts in the overcloud group via a group_vars
 # file.
 stackhpc_dnf_repos: "{{ dnf_custom_repos_el9 | combine(dnf_custom_repos_rocky_9) | combine(dnf_custom_repos_elrepo_9 if dnf_install_elrepo_9 | bool else {}) }}"

etc/kayobe/environments/aufn-ceph/configure-openstack.sh

Lines changed: 1 addition & 1 deletion
@@ -25,4 +25,4 @@ ansible-galaxy collection install -p ansible/collections -r requirements.yml
 source $BASE_PATH/src/kayobe-config/etc/kolla/public-openrc.sh

 # Run script to configure openstack cloud
-tools/openstack-config
+tools/openstack-config

etc/kayobe/environments/aufn-ceph/inventory/groups

Lines changed: 1 addition & 1 deletion
@@ -26,4 +26,4 @@ storage-ceph
 # Monitoring groups

 [monitoring:children]
-controllers
+controllers

etc/kayobe/environments/ci-aio/inventory/groups

Lines changed: 3 additions & 0 deletions
@@ -3,3 +3,6 @@
 [container-image-builders:children]
 # Build container images on the all-in-one controller.
 controllers
+
+[fix-hostname:children]
+controllers
