Commit 813673a

Merge pull request #692 from stackhpc/zed-yoga-merge

    zed: yoga merge

2 parents c431721 + 4ef1fcc; commit 813673a

15 files changed: +138 −22 lines
.github/workflows/overcloud-host-image-build.yml

Lines changed: 1 addition & 1 deletion

@@ -130,7 +130,7 @@ jobs:
       - name: Install OpenStack client
         run: |
           source venvs/kayobe/bin/activate &&
-          pip install python-openstackclient
+          pip install python-openstackclient -c https://opendev.org/openstack/requirements/raw/branch/stable/${{ steps.openstack_release.outputs.openstack_release }}/upper-constraints.txt
 
       - name: Build a Rocky Linux 9 overcloud host image
         id: build_rocky_9

.github/workflows/stackhpc-container-image-build.yml

Lines changed: 7 additions & 4 deletions

@@ -145,6 +145,10 @@ jobs:
         env:
           KAYOBE_VAULT_PASSWORD: ${{ secrets.KAYOBE_VAULT_PASSWORD }}
 
+      - name: Prune local Kolla container images over 1 week old
+        run: |
+          sudo docker image prune --all --force --filter until=168h --filter="label=kolla_version"
+
       - name: Build and push kolla overcloud images
         run: |
           args="${{ github.event.inputs.regexes }}"
@@ -180,17 +184,16 @@ jobs:
         run: |
           sudo docker image ls --filter "reference=ark.stackhpc.com/stackhpc-dev/*:*${{ matrix.distro }}*${{ needs.generate-tag.outputs.datetime_tag }}" > ${{ matrix.distro }}-container-images
 
+      - name: Fail if no images have been built
+        run: if [ $(wc -l < ${{ matrix.distro }}-container-images) -le 1 ]; then exit 1; fi
+
       - name: Upload container images artifact
         uses: actions/upload-artifact@v3
         with:
           name: ${{ matrix.distro }} container images
           path: ${{ matrix.distro }}-container-images
           retention-days: 7
 
-      - name: Prune local Kolla container images over 1 week old
-        run: |
-          sudo docker image prune --all --force --filter until=168h --filter="label=kolla_version"
-
   sync-container-repositories:
     name: Trigger container image repository sync
     needs:
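The new "Fail if no images have been built" step works because ``docker image ls`` always prints a header row, so a listing with one line or fewer contains no image rows. A minimal sketch of that logic (the helper name is illustrative, not part of the workflow):

```python
def images_were_built(listing: str) -> bool:
    # `docker image ls` always emits a header row (REPOSITORY, TAG, ...),
    # so a listing with <= 1 line means no images matched the filter.
    return len(listing.splitlines()) > 1

header_only = "REPOSITORY   TAG   IMAGE ID   CREATED   SIZE"
one_image = header_only + "\nark.stackhpc.com/stackhpc-dev/foo   zed   0123abcd   2 hours ago   1.2GB"

print(images_were_built(header_only))  # False: the workflow step would exit 1
print(images_were_built(one_image))    # True: at least one image was built
```

This mirrors the shell check `if [ $(wc -l < file) -le 1 ]; then exit 1; fi` in the workflow.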

doc/source/configuration/release-train.rst

Lines changed: 10 additions & 0 deletions

@@ -186,6 +186,16 @@ promoted to production:
 
       kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/pulp-repo-promote-production.yml
 
+Synchronising all Kolla container images can take a long time. A limited list
+of images can be synchronised using the ``stackhpc_pulp_images_kolla_filter``
+variable, which accepts a whitespace-separated list of regular expressions
+matching Kolla image names. Usage is similar to ``kolla-build`` CLI arguments.
+For example:
+
+.. code-block:: console
+
+   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/pulp-container-sync.yml -e stackhpc_pulp_images_kolla_filter='"^glance nova-compute$"'
+
 Initial seed deployment
 -----------------------
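The documentation above describes ``stackhpc_pulp_images_kolla_filter`` as a whitespace-separated list of regular expressions. A rough sketch of those semantics (the exact matching behaviour inside the playbook is an assumption here, and the helper name is hypothetical):

```python
import re

def filter_kolla_images(image_names, kolla_filter):
    # Split the filter on whitespace into individual regexes, then keep any
    # image whose name matches at least one of them (kolla-build style).
    patterns = [re.compile(p) for p in kolla_filter.split()]
    return [n for n in image_names if any(p.search(n) for p in patterns)]

images = ["glance-api", "nova-compute", "nova-compute-ironic", "keystone"]
print(filter_kolla_images(images, "^glance nova-compute$"))
# → ['glance-api', 'nova-compute']
```

Note how the ``$`` anchor in ``nova-compute$`` excludes ``nova-compute-ironic``, which is why the documented example quotes the anchors.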

doc/source/configuration/vault.rst

Lines changed: 12 additions & 0 deletions

@@ -229,6 +229,18 @@ Enable the required TLS variables in kayobe and kolla
 
       kayobe overcloud service deploy
 
+If VM provisioning fails with an error of the following format:
+
+.. code-block::
+
+   Unable to establish connection to http://<kolla internal vip/fqdn>:9696/v2.0/ports/some-sort-of-uuid: Connection aborted
+
+restart the nova-compute container on all hypervisors:
+
+.. code-block::
+
+   kayobe overcloud host command run --command "docker restart nova_compute" --become --show-output -l compute
+
 Barbican integration
 ====================

doc/source/configuration/walled-garden.rst

Lines changed: 2 additions & 1 deletion

@@ -77,7 +77,8 @@ proxy:
   - "127.0.0.1"
   - "localhost"
   - "{{ ('http://' ~ docker_registry) | urlsplit('hostname') if docker_registry else '' }}"
-  - "{{ admin_oc_net_name | net_ip(inventory_hostname=groups['seed'][0]) }}"
+  - "{{ lookup('vars', admin_oc_net_name ~ '_ips')[groups.seed.0] }}"
+  - "{{ lookup('vars', admin_oc_net_name ~ '_ips')[inventory_hostname] }}"
   - "{{ kolla_external_fqdn }}"
   - "{{ kolla_internal_fqdn }}"

doc/source/configuration/wazuh.rst

Lines changed: 6 additions & 8 deletions

@@ -17,8 +17,8 @@ The short version
 #. Deploy the Wazuh agents: ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-agent.yml``
 
-Wazuh Manager
-=============
+Wazuh Manager Host
+==================
 
 Provision using infra-vms
 -------------------------
@@ -303,7 +303,7 @@ Encrypt the keys (and remember to commit to git):
 ``ansible-vault encrypt --vault-password-file ~/vault.pass $KAYOBE_CONFIG_PATH/ansible/wazuh/certificates/certs/*.key``
 
 Verification
-==============
+------------
 
 The Wazuh portal should be accessible on port 443 of the Wazuh
 manager’s IPs (using HTTPS, with the root CA cert in ``etc/kayobe/ansible/wazuh/certificates/wazuh-certificates/root-ca.pem``).
@@ -315,11 +315,9 @@ Troubleshooting
 
 Logs are in ``/var/log/wazuh-indexer/wazuh.log``. There are also logs in the journal.
 
-============
 Wazuh agents
 ============
 
-
 Wazuh agent playbook is located in ``etc/kayobe/ansible/wazuh-agent.yml``.
 
 Wazuh agent variables file is located in ``etc/kayobe/inventory/group_vars/wazuh-agent/wazuh-agent``.
@@ -333,13 +331,13 @@ Deploy the Wazuh agents:
 ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-agent.yml``
 
 Verification
-=============
+------------
 
 The Wazuh agents should register with the Wazuh manager. This can be verified via the agents page in Wazuh Portal.
 Check CIS benchmark output in agent section.
 
-Additional resources:
-=====================
+Additional resources
+--------------------
 
 For times when you need to upgrade wazuh with elasticsearch to version with opensearch or you just need to deinstall all wazuh components:
 Wazuh purge script: https://github.com/stackhpc/wazuh-server-purge
etc/kayobe/ansible/ovn-fix-chassis-priorities.yml (new file)

Lines changed: 69 additions & 0 deletions

---
# Sometimes, typically after restarting OVN services, the priorities of entries
# in the ha_chassis and gateway_chassis tables in the OVN northbound database
# can become misaligned. This results in broken routing for external (bare
# metal/SR-IOV) ports.

# This playbook can be used to fix the issue by realigning the priorities of
# the table entries. It does so by assigning the highest priority to the
# "first" (sorted alphabetically) OVN NB DB host. This results in all gateways
# being scheduled to a single host, but is less complicated than trying to
# balance them (and it's also not clear to me how to map between individual
# ha_chassis and gateway_chassis entries).

# The playbook can be run as follows:
# kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ovn-fix-chassis-priorities.yml

# If the 'controllers' group does not align with the group used to deploy the
# OVN NB DB, this can be overridden by passing the following:
# '-e ovn_nb_db_group=some_other_group'

- name: Find OVN NB DB leader
  hosts: "{{ ovn_nb_db_group | default('controllers') }}"
  tasks:
    - name: Find the OVN NB DB leader
      command: docker exec -it ovn_nb_db ovn-nbctl get-connection
      changed_when: false
      failed_when: false
      register: ovn_check_result
      check_mode: no

    - name: Group hosts by leader/follower role
      group_by:
        key: "ovn_nb_{{ 'leader' if ovn_check_result.rc == 0 else 'follower' }}"
      changed_when: false

    - name: Assert one leader exists
      assert:
        that:
          - groups['ovn_nb_leader'] | default([]) | length == 1

- name: Fix OVN chassis priorities
  hosts: ovn_nb_leader
  vars:
    ovn_nb_db_group: controllers
    ovn_nb_db_hosts_sorted: "{{ query('inventory_hostnames', ovn_nb_db_group) | sort | list }}"
    ha_chassis_max_priority: 32767
    gateway_chassis_max_priority: "{{ ovn_nb_db_hosts_sorted | length }}"
  tasks:
    - name: Fix ha_chassis priorities
      command: >-
        docker exec -it ovn_nb_db
        bash -c '
        ovn-nbctl find ha_chassis chassis_name={{ item }} |
        awk '\''$1 == "_uuid" { print $3 }'\'' |
        while read uuid; do ovn-nbctl set ha_chassis $uuid priority={{ priority }}; done'
      loop: "{{ ovn_nb_db_hosts_sorted }}"
      vars:
        priority: "{{ ha_chassis_max_priority | int - ovn_nb_db_hosts_sorted.index(item) }}"

    - name: Fix gateway_chassis priorities
      command: >-
        docker exec -it ovn_nb_db
        bash -c '
        ovn-nbctl find gateway_chassis chassis_name={{ item }} |
        awk '\''$1 == "_uuid" { print $3 }'\'' |
        while read uuid; do ovn-nbctl set gateway_chassis $uuid priority={{ priority }}; done'
      loop: "{{ ovn_nb_db_hosts_sorted }}"
      vars:
        priority: "{{ gateway_chassis_max_priority | int - ovn_nb_db_hosts_sorted.index(item) }}"
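The playbook's priority scheme can be summarised as: sort the OVN NB DB hosts alphabetically and give the first host the highest priority in both tables. A standalone sketch of that computation (the helper name is illustrative, not from the playbook):

```python
HA_CHASSIS_MAX_PRIORITY = 32767  # matches ha_chassis_max_priority above

def chassis_priorities(hosts):
    # gateway_chassis priorities count down from the number of hosts;
    # ha_chassis priorities count down from 32767. In both tables the
    # alphabetically-first host gets the top priority, so all gateways
    # are scheduled onto a single host.
    hosts_sorted = sorted(hosts)
    gateway_max = len(hosts_sorted)
    return {
        host: {"ha_chassis": HA_CHASSIS_MAX_PRIORITY - i,
               "gateway_chassis": gateway_max - i}
        for i, host in enumerate(hosts_sorted)
    }

print(chassis_priorities(["ctl2", "ctl0", "ctl1"]))
# ctl0 ends up with ha_chassis 32767 and gateway_chassis 3
```

Making the priorities a strict function of the sorted host list is what keeps the two tables aligned after OVN service restarts.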

etc/kayobe/ansible/reboot.yml

Lines changed: 1 addition & 1 deletion

@@ -1,7 +1,7 @@
 ---
 - name: Reboot the host
   hosts: seed-hypervisor:seed:overcloud:infra-vms
-  serial: "{{ lookup('env', 'ANSIBLE_SERIAL') | default(0, true) }}"
+  serial: "{{ lookup('env', 'ANSIBLE_SERIAL') | default(1, true) }}"
   tags:
     - reboot
   tasks:
Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 ---
 # Overcloud host image versioning tags
 # These images must be in SMS, since they are used by our AIO CI runners
-stackhpc_rocky_9_overcloud_host_image_version: "yoga-20230515T145140"
+stackhpc_rocky_9_overcloud_host_image_version: "yoga-20230929T133006"
 stackhpc_ubuntu_jammy_overcloud_host_image_version: "yoga-20230609T120720"

etc/kayobe/pulp-repo-versions.yml

Lines changed: 6 additions & 6 deletions

@@ -7,7 +7,7 @@ stackhpc_pulp_repo_centos_stream_9_opstools_version: 20230615T071742
 stackhpc_pulp_repo_centos_stream_9_storage_ceph_quincy_version: 20230712T025152
 stackhpc_pulp_repo_docker_ce_ubuntu_version: 20230921T005001
 stackhpc_pulp_repo_elrepo_9_version: 20230907T075311
-stackhpc_pulp_repo_epel_9_version: 20230921T005001
+stackhpc_pulp_repo_epel_9_version: 20230929T005202
 stackhpc_pulp_repo_grafana_version: 20230921T005001
 stackhpc_pulp_repo_opensearch_2_x_version: 20230725T013015
 stackhpc_pulp_repo_opensearch_dashboards_2_x_version: 20230725T013015
@@ -21,11 +21,11 @@ stackhpc_pulp_repo_rocky_9_1_baseos_version: 20230921T005001
 stackhpc_pulp_repo_rocky_9_1_crb_version: 20230921T005001
 stackhpc_pulp_repo_rocky_9_1_extras_version: 20230921T005001
 stackhpc_pulp_repo_rocky_9_1_highavailability_version: 20230921T005001
-stackhpc_pulp_repo_rocky_9_2_appstream_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_baseos_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_crb_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_extras_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_highavailability_version: 20230805T012805
+stackhpc_pulp_repo_rocky_9_2_appstream_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_baseos_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_crb_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_extras_version: 20230915T001040
+stackhpc_pulp_repo_rocky_9_2_highavailability_version: 20230918T015928
 stackhpc_pulp_repo_ubuntu_jammy_security_version: 20230908T053616
 stackhpc_pulp_repo_ubuntu_jammy_version: 20230908T053616
 stackhpc_pulp_repo_ubuntu_cloud_archive_version: 20230908T112533

etc/kayobe/stackhpc-overcloud-dib.yml

Lines changed: 3 additions & 0 deletions

@@ -67,12 +67,15 @@ stackhpc_overcloud_dib_packages:
   - "vim"
   - "git"
   - "less"
+  - "python3"
   - "{% if os_distribution == 'ubuntu' %}netbase{% endif %}"
   - "{% if os_distribution == 'ubuntu' %}iputils-ping{% endif %}"
   - "{% if os_distribution == 'ubuntu' %}curl{% endif %}"
   - "{% if os_distribution == 'ubuntu' %}apt-utils{% endif %}"
   - "{% if os_distribution == 'rocky' %}NetworkManager-config-server{% endif %}"
   - "{% if os_distribution == 'rocky' %}linux-firmware{% endif %}"
+  - "{% if os_distribution == 'rocky' %}cloud-utils-growpart{% endif %}"
+  - "{% if os_distribution == 'ubuntu' %}cloud-guest-utils{% endif %}"
 
 # StackHPC overcloud DIB image block device configuration.
 # This image layout conforms to the CIS partition benchmarks.
Lines changed: 7 additions & 0 deletions (new file)

---
security:
  - |
    The Rocky 8 minor version has been bumped to 8.8 and new snapshots have
    been created to include fixes for Zenbleed (CVE-2023-20593) and Downfall
    (CVE-2022-40982). It is recommended that you update your OS packages and
    reboot into the new kernel as soon as possible.
Lines changed: 7 additions & 0 deletions (new file)

---
security:
  - |
    The snapshots for Rocky 9.2 have been refreshed to include fixes for
    Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended
    that you update your OS packages and reboot into the new kernel as soon as
    possible.
Lines changed: 6 additions & 0 deletions (new file)

---
upgrade:
  - |
    The ``reboot.yml`` custom Ansible playbook now defaults to rebooting only
    one host at a time. The existing behaviour can be retained by setting
    ``ANSIBLE_SERIAL=0``.
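The `default(1, true)` filter in the playbook's new `serial` expression treats an unset or empty `ANSIBLE_SERIAL` as 1, while `serial: 0` in Ansible means all hosts in one batch. A sketch of the fallback logic (assuming standard Jinja2 `default` semantics; the function name is illustrative):

```python
def reboot_serial(ansible_serial_env: str) -> int:
    # lookup('env', 'ANSIBLE_SERIAL') returns "" when the variable is unset;
    # default(1, true) replaces any falsy value (including "") with 1.
    return int(ansible_serial_env) if ansible_serial_env else 1

print(reboot_serial(""))   # 1 -> new default: one host at a time
print(reboot_serial("0"))  # 0 -> old behaviour: all hosts in one batch
print(reboot_serial("5"))  # 5 -> five hosts per batch
```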
