zed: yoga merge #692

Merged
merged 29 commits into from
Oct 12, 2023
3a4fa4c
Add a custom playbook to fix OVN chassis priorities
markgoddard Sep 22, 2023
e57542c
Document stackhpc_pulp_images_kolla_filter variable
priteau Sep 28, 2023
c22c0f0
Merge pull request #673 from stackhpc/document-kolla-filter
markgoddard Sep 29, 2023
ed2aadd
Use upper constraints to install openstackclient
jovial Sep 29, 2023
9ed844e
Merge pull request #674 from stackhpc/yoga/host-image-build
markgoddard Sep 29, 2023
dae64dd
Merge pull request #661 from stackhpc/wallaby-ovn-fix-chassis-priorit…
markgoddard Sep 29, 2023
e441f49
Add python and growpart to images (#676)
GregWhiteyBialas Sep 29, 2023
0c026e1
Bump Rocky 9 host image to 9.2
jovial Sep 29, 2023
b0fbc2d
Bump Rocky tag for python and growpart additions
MoteHue Sep 29, 2023
8a8e846
Bump Rocky 9 snapshots
jovial Sep 29, 2023
36af591
Change default to reboot one host at a time
priteau Oct 3, 2023
3062cc7
Bump Rocky 8 snapshots
jovial Sep 29, 2023
ffe9c4b
Add debugging info to tls deployment docs
Alex-Welsh Oct 4, 2023
ec606c6
Fail container build workflow when no images build
Alex-Welsh Sep 27, 2023
6740a20
Merge pull request #679 from stackhpc/yoga/rocky9-snapshots
cityofships Oct 4, 2023
fc16f34
Merge pull request #684 from stackhpc/container-build-check
markgoddard Oct 5, 2023
d56edd5
Merge pull request #681 from stackhpc/reboot-serial
markgoddard Oct 5, 2023
47efa99
Update walled garden guide no_proxy defaults
Alex-Welsh Oct 5, 2023
0ae4734
Merge pull request #687 from stackhpc/walled-garden-no-proxy-docs
markgoddard Oct 5, 2023
b0fbc05
Merge pull request #683 from stackhpc/tls-docs
markgoddard Oct 5, 2023
763c275
Merge pull request #677 from stackhpc/yoga/rocky-8-snapshots
markgoddard Oct 5, 2023
7bb0de4
Moving bifrost config into its proper folder.
grzegorzkoper Oct 6, 2023
1722ffa
docs: fix wazuh headings
markgoddard Sep 27, 2023
bad2942
Merge pull request #665 from stackhpc/xena-wazuh-docs-headings
markgoddard Oct 6, 2023
13b7955
Merge pull request #688 from stackhpc/bifrost_config
markgoddard Oct 6, 2023
e34f498
Merge stackhpc/wallaby into stackhpc/xena
markgoddard Oct 6, 2023
2f60888
Merge stackhpc/xena into stackhpc/yoga
markgoddard Oct 6, 2023
c090564
Merge pull request #690 from stackhpc/yoga-xena-merge
markgoddard Oct 6, 2023
4ef1fcc
Merge stackhpc/yoga into stackhpc/zed
markgoddard Oct 6, 2023
2 changes: 1 addition & 1 deletion .github/workflows/overcloud-host-image-build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -130,7 +130,7 @@ jobs:
       - name: Install OpenStack client
         run: |
           source venvs/kayobe/bin/activate &&
-          pip install python-openstackclient
+          pip install python-openstackclient -c https://opendev.org/openstack/requirements/raw/branch/stable/${{ steps.openstack_release.outputs.openstack_release }}/upper-constraints.txt

       - name: Build a Rocky Linux 9 overcloud host image
         id: build_rocky_9
Expand Down
11 changes: 7 additions & 4 deletions .github/workflows/stackhpc-container-image-build.yml
Original file line number Diff line number Diff line change
Expand Up @@ -145,6 +145,10 @@ jobs:
         env:
           KAYOBE_VAULT_PASSWORD: ${{ secrets.KAYOBE_VAULT_PASSWORD }}

+      - name: Prune local Kolla container images over 1 week old
+        run: |
+          sudo docker image prune --all --force --filter until=168h --filter="label=kolla_version"
+
       - name: Build and push kolla overcloud images
         run: |
           args="${{ github.event.inputs.regexes }}"
Expand Down Expand Up @@ -180,17 +184,16 @@ jobs:
         run: |
           sudo docker image ls --filter "reference=ark.stackhpc.com/stackhpc-dev/*:*${{ matrix.distro }}*${{ needs.generate-tag.outputs.datetime_tag }}" > ${{ matrix.distro }}-container-images

+      - name: Fail if no images have been built
+        run: if [ $(wc -l < ${{ matrix.distro }}-container-images) -le 1 ]; then exit 1; fi
+
       - name: Upload container images artifact
         uses: actions/upload-artifact@v3
         with:
           name: ${{ matrix.distro }} container images
           path: ${{ matrix.distro }}-container-images
           retention-days: 7

-      - name: Prune local Kolla container images over 1 week old
-        run: |
-          sudo docker image prune --all --force --filter until=168h --filter="label=kolla_version"
-
   sync-container-repositories:
     name: Trigger container image repository sync
     needs:
Expand Down
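The new "Fail if no images have been built" step relies on `docker image ls` always printing a header row, so a listing of one line or fewer means that nothing matched the filter. A minimal Python sketch of the same check follows; the function name and the sample listings are hypothetical and only illustrate the line-count logic.

```python
def no_images_built(listing: str) -> bool:
    """Mirror the shell test `[ $(wc -l < file) -le 1 ]`."""
    # wc -l counts newline characters. The header row printed by
    # `docker image ls` accounts for one line, so a count of 0 or 1
    # means no image rows were written to the file.
    return listing.count("\n") <= 1


# Hypothetical listings, for illustration only.
header_only = "REPOSITORY   TAG   IMAGE ID   CREATED   SIZE\n"
with_image = header_only + "example/image   some-tag   0123456789ab   1 hour ago   1GB\n"

print(no_images_built(header_only))  # the workflow step would exit 1
print(no_images_built(with_image))   # the workflow step would pass
```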
10 changes: 10 additions & 0 deletions doc/source/configuration/release-train.rst
Original file line number Diff line number Diff line change
Expand Up @@ -186,6 +186,16 @@ promoted to production:

kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/pulp-repo-promote-production.yml

Synchronising all Kolla container images can take a long time. A limited list
of images can be synchronised using the ``stackhpc_pulp_images_kolla_filter``
variable, which accepts a whitespace-separated list of regular expressions
matching Kolla image names. Usage is similar to ``kolla-build`` CLI arguments.
For example:

.. code-block:: console

   kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/pulp-container-sync.yml -e stackhpc_pulp_images_kolla_filter='"^glance nova-compute$"'
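
To illustrate how a whitespace-separated list of regular expressions selects images, here is a minimal Python sketch. It assumes each expression is applied with search semantics (a match anywhere in the name unless anchored), which is an assumption about the filter rather than something stated in this change. The function name and the image names are hypothetical.

```python
import re


def filter_images(image_names, kolla_filter):
    """Return the image names matched by any expression in kolla_filter."""
    # Split on whitespace to obtain the individual regular expressions.
    patterns = [re.compile(expr) for expr in kolla_filter.split()]
    # Assumption: search semantics, so anchors such as ^ and $ must be
    # written explicitly when an exact prefix or suffix is wanted.
    return [
        name
        for name in image_names
        if any(pattern.search(name) for pattern in patterns)
    ]


# Hypothetical image names, for illustration only.
images = [
    "glance-api",
    "nova-api",
    "nova-compute",
    "nova-compute-ironic",
    "neutron-server",
]
selected = filter_images(images, "^glance nova-compute$")
print(selected)
```

Under these assumptions, ``^glance`` selects every image whose name starts with ``glance``, while ``nova-compute$`` selects ``nova-compute`` but not ``nova-compute-ironic``.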

Initial seed deployment
-----------------------

Expand Down
12 changes: 12 additions & 0 deletions doc/source/configuration/vault.rst
Original file line number Diff line number Diff line change
Expand Up @@ -229,6 +229,18 @@ Enable the required TLS variables in kayobe and kolla

kayobe overcloud service deploy

If VM provisioning fails with an error of the following form:

.. code-block::

   Unable to establish connection to http://<kolla internal vip/fqdn>:9696/v2.0/ports/some-sort-of-uuid: Connection aborted

then restart the nova-compute container on all hypervisors:

.. code-block::

   kayobe overcloud host command run --command "docker restart nova_compute" --become --show-output -l compute

Barbican integration
====================

Expand Down
3 changes: 2 additions & 1 deletion doc/source/configuration/walled-garden.rst
Original file line number Diff line number Diff line change
Expand Up @@ -77,7 +77,8 @@ proxy:
- "127.0.0.1"
- "localhost"
- "{{ ('http://' ~ docker_registry) | urlsplit('hostname') if docker_registry else '' }}"
- "{{ admin_oc_net_name | net_ip(inventory_hostname=groups['seed'][0]) }}"
- "{{ lookup('vars', admin_oc_net_name ~ '_ips')[groups.seed.0] }}"
- "{{ lookup('vars', admin_oc_net_name ~ '_ips')[inventory_hostname] }}"
- "{{ kolla_external_fqdn }}"
- "{{ kolla_internal_fqdn }}"

Expand Down
14 changes: 6 additions & 8 deletions doc/source/configuration/wazuh.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,8 +17,8 @@ The short version
#. Deploy the Wazuh agents: ``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-agent.yml``


-Wazuh Manager
-=============
+Wazuh Manager Host
+==================

Provision using infra-vms
-------------------------
Expand Down Expand Up @@ -303,7 +303,7 @@ Encrypt the keys (and remember to commit to git):
``ansible-vault encrypt --vault-password-file ~/vault.pass $KAYOBE_CONFIG_PATH/ansible/wazuh/certificates/certs/*.key``

 Verification
-==============
+------------

The Wazuh portal should be accessible on port 443 of the Wazuh
manager’s IPs (using HTTPS, with the root CA cert in ``etc/kayobe/ansible/wazuh/certificates/wazuh-certificates/root-ca.pem``).
Expand All @@ -315,11 +315,9 @@ Troubleshooting

Logs are in ``/var/log/wazuh-indexer/wazuh.log``. There are also logs in the journal.

-============
 Wazuh agents
 ============

-
Wazuh agent playbook is located in ``etc/kayobe/ansible/wazuh-agent.yml``.

Wazuh agent variables file is located in ``etc/kayobe/inventory/group_vars/wazuh-agent/wazuh-agent``.
Expand All @@ -333,13 +331,13 @@ Deploy the Wazuh agents:
``kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/wazuh-agent.yml``

 Verification
-=============
+------------

The Wazuh agents should register with the Wazuh manager. This can be verified via the agents page in Wazuh Portal.
Check CIS benchmark output in agent section.

-Additional resources:
-=====================
+Additional resources
+--------------------

If you need to upgrade a Wazuh deployment that uses Elasticsearch to a version that uses OpenSearch, or simply need to uninstall all Wazuh components, use the Wazuh purge script: https://github.com/stackhpc/wazuh-server-purge
69 changes: 69 additions & 0 deletions etc/kayobe/ansible/ovn-fix-chassis-priorities.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
---
# Sometimes, typically after restarting OVN services, the priorities of entries
# in the ha_chassis and gateway_chassis tables in the OVN northbound database
# can become misaligned. This results in broken routing for external (bare
# metal/SR-IOV) ports.

# This playbook can be used to fix the issue by realigning the priorities of
# the table entries. It does so by assigning the highest priority to the
# "first" (sorted alphabetically) OVN NB DB host. This results in all gateways
# being scheduled to a single host, but is less complicated than trying to
# balance them (and it's also not clear to me how to map between individual
# ha_chassis and gateway_chassis entries).

# The playbook can be run as follows:
# kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ovn-fix-chassis-priorities.yml

# If the 'controllers' group does not align with the group used to deploy the
# OVN NB DB, this can be overridden by passing the following:
# '-e ovn_nb_db_group=some_other_group'

- name: Find OVN NB DB leader
  hosts: "{{ ovn_nb_db_group | default('controllers') }}"
  tasks:
    - name: Find the OVN NB DB leader
      command: docker exec -it ovn_nb_db ovn-nbctl get-connection
      changed_when: false
      failed_when: false
      register: ovn_check_result
      check_mode: no

    - name: Group hosts by leader/follower role
      group_by:
        key: "ovn_nb_{{ 'leader' if ovn_check_result.rc == 0 else 'follower' }}"
      changed_when: false

    - name: Assert one leader exists
      assert:
        that:
          - groups['ovn_nb_leader'] | default([]) | length == 1

- name: Fix OVN chassis priorities
  hosts: ovn_nb_leader
  vars:
    ovn_nb_db_group: controllers
    ovn_nb_db_hosts_sorted: "{{ query('inventory_hostnames', ovn_nb_db_group) | sort | list }}"
    ha_chassis_max_priority: 32767
    gateway_chassis_max_priority: "{{ ovn_nb_db_hosts_sorted | length }}"
  tasks:
    - name: Fix ha_chassis priorities
      command: >-
        docker exec -it ovn_nb_db
        bash -c '
        ovn-nbctl find ha_chassis chassis_name={{ item }} |
        awk '\''$1 == "_uuid" { print $3 }'\'' |
        while read uuid; do ovn-nbctl set ha_chassis $uuid priority={{ priority }}; done'
      loop: "{{ ovn_nb_db_hosts_sorted }}"
      vars:
        priority: "{{ ha_chassis_max_priority | int - ovn_nb_db_hosts_sorted.index(item) }}"

    - name: Fix gateway_chassis priorities
      command: >-
        docker exec -it ovn_nb_db
        bash -c '
        ovn-nbctl find gateway_chassis chassis_name={{ item }} |
        awk '\''$1 == "_uuid" { print $3 }'\'' |
        while read uuid; do ovn-nbctl set gateway_chassis $uuid priority={{ priority }}; done'
      loop: "{{ ovn_nb_db_hosts_sorted }}"
      vars:
        priority: "{{ gateway_chassis_max_priority | int - ovn_nb_db_hosts_sorted.index(item) }}"
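
The priority arithmetic in the two tasks above can be sketched in Python as follows. This is an illustration only, not part of the playbook: the function name and hostnames are hypothetical, and it assumes lowercase hostnames so that Python's `sorted` agrees with the ordering produced by the Jinja2 `sort` filter.

```python
HA_CHASSIS_MAX_PRIORITY = 32767  # same value as ha_chassis_max_priority


def chassis_priorities(hosts):
    """Map each host to its ha_chassis and gateway_chassis priority."""
    # Assumption: hostnames are lowercase, so this ordering matches the
    # playbook's sorted host list.
    ordered = sorted(hosts)
    # gateway_chassis_max_priority is the number of OVN NB DB hosts.
    gateway_max = len(ordered)
    return {
        host: {
            # Each priority is the maximum minus the host's position.
            "ha_chassis": HA_CHASSIS_MAX_PRIORITY - index,
            "gateway_chassis": gateway_max - index,
        }
        for index, host in enumerate(ordered)
    }


# Hypothetical controller names, for illustration only.
priorities = chassis_priorities(["controller-02", "controller-01", "controller-03"])
for host, values in priorities.items():
    print(host, values)
```

The first host in sorted order receives the highest priority in both tables, which is why all gateways end up scheduled on a single host.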
2 changes: 1 addition & 1 deletion etc/kayobe/ansible/reboot.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
 ---
 - name: Reboot the host
   hosts: seed-hypervisor:seed:overcloud:infra-vms
-  serial: "{{ lookup('env', 'ANSIBLE_SERIAL') | default(0, true) }}"
+  serial: "{{ lookup('env', 'ANSIBLE_SERIAL') | default(1, true) }}"
   tags:
     - reboot
   tasks:
Expand Down
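The effect of `default(1, true)` on the `serial` value can be sketched in Python. The sketch assumes that the `env` lookup yields an empty string when the variable is unset, and that the second argument to `default` makes it replace any falsy value. The function name is hypothetical.

```python
def resolve_serial(environ):
    """Approximate lookup('env', 'ANSIBLE_SERIAL') | default(1, true)."""
    # Assumption: an unset variable is seen as an empty string.
    value = environ.get("ANSIBLE_SERIAL", "")
    # default(1, true) substitutes 1 for falsy values such as "".
    # The string "0" is not falsy, so ANSIBLE_SERIAL=0 is passed through.
    return value if value else 1


print(resolve_serial({}))                       # unset: one host at a time
print(resolve_serial({"ANSIBLE_SERIAL": "0"}))  # "0" is kept as-is
print(resolve_serial({"ANSIBLE_SERIAL": "3"}))  # batches of three hosts
```

Because `"0"` is a non-empty string, setting `ANSIBLE_SERIAL=0` is not overridden by the new default, which is how the previous behaviour can still be requested.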
2 changes: 1 addition & 1 deletion etc/kayobe/pulp-host-image-versions.yml
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
---
# Overcloud host image versioning tags
# These images must be in SMS, since they are used by our AIO CI runners
-stackhpc_rocky_9_overcloud_host_image_version: "yoga-20230515T145140"
+stackhpc_rocky_9_overcloud_host_image_version: "yoga-20230929T133006"
stackhpc_ubuntu_jammy_overcloud_host_image_version: "yoga-20230609T120720"
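The image version tags above appear to combine an OpenStack release name with a build timestamp. Assuming that layout (`<release>-<YYYYMMDD>T<HHMMSS>`), a small Python sketch can split a tag and compare two builds. The function name is hypothetical.

```python
from datetime import datetime


def parse_image_tag(tag):
    """Split a tag such as 'yoga-20230929T133006' into release and time."""
    # Assumption: the text before the first hyphen is the release name
    # and the remainder is a timestamp in YYYYMMDDTHHMMSS form.
    release, _, stamp = tag.partition("-")
    return release, datetime.strptime(stamp, "%Y%m%dT%H%M%S")


old_release, old_time = parse_image_tag("yoga-20230515T145140")
new_release, new_time = parse_image_tag("yoga-20230929T133006")
print(new_release, new_time.isoformat())
print(new_time > old_time)
```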
12 changes: 6 additions & 6 deletions etc/kayobe/pulp-repo-versions.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ stackhpc_pulp_repo_centos_stream_9_opstools_version: 20230615T071742
stackhpc_pulp_repo_centos_stream_9_storage_ceph_quincy_version: 20230712T025152
stackhpc_pulp_repo_docker_ce_ubuntu_version: 20230921T005001
stackhpc_pulp_repo_elrepo_9_version: 20230907T075311
-stackhpc_pulp_repo_epel_9_version: 20230921T005001
+stackhpc_pulp_repo_epel_9_version: 20230929T005202
stackhpc_pulp_repo_grafana_version: 20230921T005001
stackhpc_pulp_repo_opensearch_2_x_version: 20230725T013015
stackhpc_pulp_repo_opensearch_dashboards_2_x_version: 20230725T013015
Expand All @@ -21,11 +21,11 @@ stackhpc_pulp_repo_rocky_9_1_baseos_version: 20230921T005001
stackhpc_pulp_repo_rocky_9_1_crb_version: 20230921T005001
stackhpc_pulp_repo_rocky_9_1_extras_version: 20230921T005001
stackhpc_pulp_repo_rocky_9_1_highavailability_version: 20230921T005001
-stackhpc_pulp_repo_rocky_9_2_appstream_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_baseos_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_crb_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_extras_version: 20230825T131407
-stackhpc_pulp_repo_rocky_9_2_highavailability_version: 20230805T012805
+stackhpc_pulp_repo_rocky_9_2_appstream_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_baseos_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_crb_version: 20230928T024829
+stackhpc_pulp_repo_rocky_9_2_extras_version: 20230915T001040
+stackhpc_pulp_repo_rocky_9_2_highavailability_version: 20230918T015928
stackhpc_pulp_repo_ubuntu_jammy_security_version: 20230908T053616
stackhpc_pulp_repo_ubuntu_jammy_version: 20230908T053616
stackhpc_pulp_repo_ubuntu_cloud_archive_version: 20230908T112533
3 changes: 3 additions & 0 deletions etc/kayobe/stackhpc-overcloud-dib.yml
Original file line number Diff line number Diff line change
Expand Up @@ -67,12 +67,15 @@ stackhpc_overcloud_dib_packages:
- "vim"
- "git"
- "less"
- "python3"
- "{% if os_distribution == 'ubuntu' %}netbase{% endif %}"
- "{% if os_distribution == 'ubuntu' %}iputils-ping{% endif %}"
- "{% if os_distribution == 'ubuntu' %}curl{% endif %}"
- "{% if os_distribution == 'ubuntu' %}apt-utils{% endif %}"
- "{% if os_distribution == 'rocky' %}NetworkManager-config-server{% endif %}"
- "{% if os_distribution == 'rocky' %}linux-firmware{% endif %}"
- "{% if os_distribution == 'rocky' %}cloud-utils-growpart{% endif %}"
- "{% if os_distribution == 'ubuntu' %}cloud-guest-utils{% endif %}"

# StackHPC overcloud DIB image block device configuration.
# This image layout conforms to the CIS partition benchmarks.
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
security:
  - |
    The Rocky 8 minor version has been bumped to 8.8 and new snapshots have
    been created to include fixes for Zenbleed (CVE-2023-20593) and Downfall
    (CVE-2022-40982). It is recommended that you update your OS packages and
    reboot into the new kernel as soon as possible.
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
---
security:
  - |
    The snapshots for Rocky 9.2 have been refreshed to include fixes for
    Zenbleed (CVE-2023-20593) and Downfall (CVE-2022-40982). It is recommended
    that you update your OS packages and reboot into the new kernel as soon as
    possible.
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
---
upgrade:
  - |
    The ``reboot.yml`` custom Ansible playbook now defaults to rebooting only
    one host at a time. The previous behaviour can be retained by setting the
    ``ANSIBLE_SERIAL`` environment variable to ``0``.