Commit 1eef82c

Merge pull request #1385 from stackhpc/2024.1-2023.1-merge
2024.1: 2023.1 merge
2 parents 21b147e + 94c132c commit 1eef82c

19 files changed: +646 -80 lines changed

.automation

doc/source/configuration/release-train.rst

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,3 +1,5 @@
1+
.. _stackhpc_release_train:
2+
13
======================
24
StackHPC Release Train
35
======================

doc/source/contributor/package-updates.rst

Lines changed: 28 additions & 15 deletions
@@ -63,18 +63,20 @@ The following steps describe the process to test the new package and container r
6363
Creating the multinode environments
6464
-----------------------------------
6565

66-
There is a comprehensive guide to setting up a multinode environment with Terraform, found here: https://github.com/stackhpc/terraform-kayobe-multinode. There are some things to note:
66+
The `Multinode deployment workflow <https://github.com/stackhpc/stackhpc-kayobe-config/actions/workflows/stackhpc-multinode.yml>`_ can be used to automatically test changes.
67+
68+
To test the changes manually, follow the comprehensive guide to setting up a multinode environment with Terraform: https://github.com/stackhpc/terraform-kayobe-multinode. There are some things to note:
6769

6870
* OVN is enabled by default. For the OVS multinode environment, override it by setting ``kolla_enable_ovn: false`` in ``etc/kayobe/environments/ci-multinode/kolla.yml``.
6971

70-
* Remember to set different vxlan_vnis for each.
72+
* Remember to set a different ``vxlan_vni`` for each.
7173

72-
* Before starting any tests, run ``dnf distro-sync`` on each host to ensure you are using the same snapshots as in the release train. You can do this using the following commands:
74+
* Before starting any tests, run ``dnf distro-sync -y`` on each host to ensure you are using the same snapshots as in the release train. The ``-y`` option prevents hosts from hanging while they wait for confirmation input. You can do this using the following commands:
7375

7476
.. code-block:: console
7577
76-
kayobe seed host command run -b --command "dnf distro-sync"
77-
kayobe overcloud host command run -b --command "dnf distro-sync"
78+
kayobe seed host command run -b --command "dnf distro-sync -y"
79+
kayobe overcloud host command run -b --command "dnf distro-sync -y"
7880
7981
* This may have installed a new kernel version. If so, you will need to reboot the overcloud hosts. You can check the installed kernels and the currently running kernel with the following commands. If the latest listed version is not running, you will need to reboot.
8082

@@ -85,7 +87,7 @@ There is a comprehensive guide to setting up a multinode environment with Terraf
8587
8688
kayobe playbook run --limit seed,overcloud $KAYOBE_CONFIG_PATH/ansible/reboot.yml
8789
88-
* The tempest tests run automatically at the end of deploy-openstack.sh. If you have the time, it is worth fixing any failing tests you can so that there is greater coverage for the package updates. (Also remember to propose these fixes in the relevant repos where applicable.)
90+
* The tempest tests run automatically at the end of the multinode deployment script. If you have the time, it is worth fixing any failing tests you can so that there is greater coverage for the package updates. (Also remember to propose these fixes in the relevant repos where applicable.)
8991

9092
Upgrading host packages
9193
-----------------------
@@ -102,6 +104,7 @@ For Rocky Linux 9, bump the snapshot versions in /etc/yum/repos.d with:
102104

103105
.. code-block:: console
104106
107+
kayobe seed host configure -t dnf
105108
kayobe overcloud host configure -t dnf
106109
107110
Install new packages:
@@ -112,22 +115,32 @@ Install new packages:
112115
113116
Perform a rolling reboot of hosts:
114117

118+
.. note::
119+
In the Multinode environment, the seed-hypervisor cannot access control
120+
plane instances with the OpenStack client. To use the OpenStack client, connect
121+
to the Seed instance via SSH first. For authentication, use scp to copy
122+
``public-openrc.sh`` to the Seed.
123+
115124
.. code-block:: console
116125
117-
export ANSIBLE_SERIAL=1
118-
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit controllers
119-
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[0]
126+
# Check your hypervisor hostname
127+
(seed) openstack hypervisor list
128+
129+
# Reboot controller instances and zeroth compute instance
130+
(seed-hypervisor) export ANSIBLE_SERIAL=1
131+
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit controllers
132+
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[0]
120133
121134
# Test live migration
122-
openstack server create --image cirros --flavor m1.tiny --network external --hypervisor-hostname antelope-pkg-refresh-ovs-compute-02.novalocal --os-compute-api-version 2.74 server1
123-
openstack server migrate --live-migration server1
124-
watch openstack server show server1
135+
(seed) openstack server create --image cirros --flavor m1.tiny --network external --hypervisor-hostname <Your Hypervisor Hostname> --os-compute-api-version 2.74 server1
136+
(seed) openstack server migrate --live-migration server1
137+
(seed) watch openstack server show server1
125138
126-
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[1]
139+
(seed-hypervisor) kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/reboot.yml --limit compute[1]
127140
128141
# Try to migrate back
129-
openstack server migrate --live-migration server1
130-
watch openstack server show server1
142+
(seed) openstack server migrate --live-migration server1
143+
(seed) watch openstack server show server1
131144
132145
Upgrading containers within a release
133146
-------------------------------------

doc/source/operations/upgrading-openstack.rst

Lines changed: 16 additions & 7 deletions
@@ -449,9 +449,8 @@ To upgrade the Ansible control host:
449449
Syncing Release Train artifacts
450450
-------------------------------
451451

452-
New `StackHPC Release Train <../configuration/release-train>` content should be
453-
synced to the local Pulp server. This includes host packages (Deb/RPM) and
454-
container images.
452+
New :ref:`stackhpc_release_train` content should be synced to the local Pulp
453+
server. This includes host packages (Deb/RPM) and container images.
455454

456455
.. _sync-rt-package-repos:
457456

@@ -968,17 +967,27 @@ would be applied:
968967
kayobe overcloud host configure --check --diff
969968
970969
When ready to apply the changes, it may be advisable to do so in batches, or at
971-
least start with a small number of hosts.:
970+
least start with a small number of hosts:
972971

973972
.. code-block:: console
974973
975974
kayobe overcloud host configure --limit <host>
976975
977-
Alternatively, to apply the configuration to all hosts:
978976
979-
.. code-block:: console
977+
.. warning::
978+
979+
Take extra care when configuring Ceph hosts. Set the hosts to maintenance
980+
mode before reconfiguring them, and unset when done:
981+
982+
.. code-block:: console
983+
984+
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-enter-maintenance.yml --limit <host>
985+
kayobe overcloud host configure --limit <host>
986+
kayobe playbook run $KAYOBE_CONFIG_PATH/ansible/ceph-exit-maintenance.yml --limit <host>
980987
981-
kayobe overcloud host configure
988+
**Always** reconfigure hosts in small batches or one-by-one. Check the Ceph
989+
state after each host configuration. Ensure all warnings and errors are
990+
resolved before moving on.
982991

983992
.. _building_ironic_deployment_images:
984993

etc/kayobe/ansible/deploy-os-capacity-exporter.yml

Lines changed: 53 additions & 51 deletions
@@ -15,59 +15,61 @@
1515
tags: os_capacity
1616
gather_facts: false
1717
tasks:
18-
- name: Create os-capacity directory
19-
ansible.builtin.file:
20-
path: /opt/kayobe/os-capacity/
21-
state: directory
22-
when: stackhpc_enable_os_capacity
23-
24-
- name: Read admin-openrc credential file
25-
ansible.builtin.command:
26-
cmd: "cat {{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
18+
- name: Check if admin-openrc.sh exists
19+
ansible.builtin.stat:
20+
path: "{{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
2721
delegate_to: localhost
28-
register: credential
29-
when: stackhpc_enable_os_capacity
30-
changed_when: false
22+
register: openrc_file_stat
23+
run_once: true
3124

32-
- name: Set facts for admin credentials
33-
ansible.builtin.set_fact:
34-
stackhpc_os_capacity_auth_url: "{{ credential.stdout_lines | select('match', '.*OS_AUTH_URL*.') | first | split('=') | last | replace(\"'\",'') }}"
35-
stackhpc_os_capacity_project_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
36-
stackhpc_os_capacity_domain_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_DOMAIN_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
37-
stackhpc_os_capacity_openstack_region_name: "{{ credential.stdout_lines | select('match', '.*OS_REGION_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
38-
stackhpc_os_capacity_username: "{{ credential.stdout_lines | select('match', '.*OS_USERNAME*.') | first | split('=') | last | replace(\"'\",'') }}"
39-
stackhpc_os_capacity_password: "{{ credential.stdout_lines | select('match', '.*OS_PASSWORD*.') | first | split('=') | last | replace(\"'\",'') }}"
40-
when: stackhpc_enable_os_capacity
25+
- block:
26+
- name: Create os-capacity directory
27+
ansible.builtin.file:
28+
path: /opt/kayobe/os-capacity/
29+
state: directory
4130

42-
- name: Template clouds.yml
43-
ansible.builtin.template:
44-
src: templates/os_capacity-clouds.yml.j2
45-
dest: /opt/kayobe/os-capacity/clouds.yaml
46-
when: stackhpc_enable_os_capacity
47-
register: clouds_yaml_result
31+
- name: Read admin-openrc credential file
32+
ansible.builtin.command:
33+
cmd: "cat {{ lookup('ansible.builtin.env', 'KOLLA_CONFIG_PATH') }}/admin-openrc.sh"
34+
delegate_to: localhost
35+
register: credential
36+
changed_when: false
4837

49-
- name: Copy CA certificate to OpenStack Capacity nodes
50-
ansible.builtin.copy:
51-
src: "{{ stackhpc_os_capacity_openstack_cacert }}"
52-
dest: /opt/kayobe/os-capacity/cacert.pem
53-
when:
54-
- stackhpc_enable_os_capacity
55-
- stackhpc_os_capacity_openstack_cacert | length > 0
56-
register: cacert_result
38+
- name: Set facts for admin credentials
39+
ansible.builtin.set_fact:
40+
stackhpc_os_capacity_auth_url: "{{ credential.stdout_lines | select('match', '.*OS_AUTH_URL*.') | first | split('=') | last | replace(\"'\",'') }}"
41+
stackhpc_os_capacity_project_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
42+
stackhpc_os_capacity_domain_name: "{{ credential.stdout_lines | select('match', '.*OS_PROJECT_DOMAIN_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
43+
stackhpc_os_capacity_openstack_region_name: "{{ credential.stdout_lines | select('match', '.*OS_REGION_NAME*.') | first | split('=') | last | replace(\"'\",'') }}"
44+
stackhpc_os_capacity_username: "{{ credential.stdout_lines | select('match', '.*OS_USERNAME*.') | first | split('=') | last | replace(\"'\",'') }}"
45+
stackhpc_os_capacity_password: "{{ credential.stdout_lines | select('match', '.*OS_PASSWORD*.') | first | split('=') | last | replace(\"'\",'') }}"
5746

58-
- name: Ensure os_capacity container is running
59-
community.docker.docker_container:
60-
name: os_capacity
61-
image: ghcr.io/stackhpc/os-capacity:master
62-
env:
63-
OS_CLOUD: openstack
64-
OS_CLIENT_CONFIG_FILE: /etc/openstack/clouds.yaml
65-
mounts:
66-
- type: bind
67-
source: /opt/kayobe/os-capacity/
68-
target: /etc/openstack/
69-
network_mode: host
70-
restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
71-
restart_policy: unless-stopped
72-
become: true
73-
when: stackhpc_enable_os_capacity
47+
- name: Template clouds.yml
48+
ansible.builtin.template:
49+
src: templates/os_capacity-clouds.yml.j2
50+
dest: /opt/kayobe/os-capacity/clouds.yaml
51+
register: clouds_yaml_result
52+
53+
- name: Copy CA certificate to OpenStack Capacity nodes
54+
ansible.builtin.copy:
55+
src: "{{ stackhpc_os_capacity_openstack_cacert }}"
56+
dest: /opt/kayobe/os-capacity/cacert.pem
57+
when: stackhpc_os_capacity_openstack_cacert | length > 0
58+
register: cacert_result
59+
60+
- name: Ensure os_capacity container is running
61+
community.docker.docker_container:
62+
name: os_capacity
63+
image: ghcr.io/stackhpc/os-capacity:{{ stackhpc_os_capacity_version }}
64+
env:
65+
OS_CLOUD: openstack
66+
OS_CLIENT_CONFIG_FILE: /etc/openstack/clouds.yaml
67+
mounts:
68+
- type: bind
69+
source: /opt/kayobe/os-capacity/
70+
target: /etc/openstack/
71+
network_mode: host
72+
restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
73+
restart_policy: unless-stopped
74+
become: true
75+
when: stackhpc_enable_os_capacity and openrc_file_stat.stat.exists
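
The "Set facts for admin credentials" task above extracts each value from ``admin-openrc.sh`` with the filter chain ``select('match', ...) | first | split('=') | last | replace("'", '')``. The following is a minimal Python sketch of what that chain appears to do to each line, not part of the commit. The sample file contents, the address and the helper name ``extract_openrc_value`` are assumptions made for illustration; the two patterns are copied verbatim from the playbook.

```python
import re

# Hypothetical admin-openrc.sh contents; real files will differ.
sample = [
    "# Example credentials (illustrative only)",
    "export OS_AUTH_URL='http://192.0.2.10:5000'",
    "export OS_USERNAME='admin'",
    "export OS_PASSWORD='s3cret'",
]


def extract_openrc_value(lines, pattern):
    """Mimic select('match', pattern) | first | split('=') | last | replace("'", '')."""
    # Ansible's 'match' test anchors at the start of the string, like re.match.
    matching = [line for line in lines if re.match(pattern, line)]
    # 'first' takes the first matching line; IndexError here if nothing matched.
    first = matching[0]
    # split('=') | last keeps only the text after the final '='.
    value = first.split("=")[-1]
    # replace("'", '') strips the single quotes around the value.
    return value.replace("'", "")


if __name__ == "__main__":
    print(extract_openrc_value(sample, ".*OS_AUTH_URL*."))
    print(extract_openrc_value(sample, ".*OS_USERNAME*."))
```

Because only the text after the final ``=`` is kept, a value that itself contains ``=`` would be truncated under this reading of the filter chain. That may be worth checking against real credential files.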

etc/kayobe/inventory/group_vars/cis-hardening/cis

Lines changed: 6 additions & 0 deletions
@@ -51,6 +51,9 @@ rhel9cis_rule_6_1_15: false
5151
# filesystem. We do not want to change /var/lib/docker permissions.
5252
rhel9cis_no_world_write_adjust: false
5353

54+
# Prevent hardening from recursively changing permissions on log files
55+
rhel9cis_rule_4_2_3: false
56+
5457
# Configure log rotation to prevent audit logs from filling the disk
5558
rhel9cis_auditd:
5659
space_left_action: syslog
@@ -153,6 +156,9 @@ ubtu22cis_no_owner_adjust: false
153156
ubtu22cis_no_world_write_adjust: false
154157
ubtu22cis_suid_adjust: false
155158

159+
# Prevent hardening from recursively changing permissions on log files
160+
ubtu22cis_rule_4_2_3: false
161+
156162
# Configure log rotation to prevent audit logs from filling the disk
157163
ubtu22cis_auditd:
158164
action_mail_acct: root

etc/kayobe/ipa.yml

Lines changed: 12 additions & 3 deletions
@@ -30,7 +30,9 @@
3030

3131
# List of additional Diskimage Builder (DIB) elements to use when building IPA
3232
# images. Default is none.
33-
#ipa_build_dib_elements_extra:
33+
ipa_build_dib_elements_extra:
34+
- extra-hardware
35+
- mellanox
3436

3537
# List of Diskimage Builder (DIB) elements to use when building IPA images.
3638
# Default is combination of ipa_build_dib_elements_default and
@@ -117,7 +119,10 @@
117119
#ipa_collectors_default:
118120

119121
# List of additional inspection collectors to run.
120-
#ipa_collectors_extra:
122+
ipa_collectors_extra:
123+
- "dmi-decode"
124+
- "extra-hardware"
125+
- "numa-topology"
121126

122127
# List of inspection collectors to run.
123128
#ipa_collectors:
@@ -135,7 +140,11 @@
135140
#ipa_kernel_options_default:
136141

137142
# List of additional kernel parameters for Ironic python agent.
138-
#ipa_kernel_options_extra:
143+
ipa_kernel_options_extra:
144+
# Useful until NTP is configured by default
145+
- ipa-insecure=1
146+
# Avoid disk benchmark failures on some NVMe drives
147+
- nvme_core.multipath=N
139148

140149
# List of kernel parameters for Ironic python agent.
141150
#ipa_kernel_options:

etc/kayobe/kolla-image-tags.yml

Lines changed: 3 additions & 0 deletions
@@ -6,6 +6,9 @@ kolla_image_tags:
66
openstack:
77
rocky-9: 2024.1-rocky-9-20240903T113235
88
ubuntu-jammy: 2024.1-ubuntu-jammy-20240917T091559
9+
blazar:
10+
rocky-9: 2024.1-rocky-9-20241125T093138
11+
ubuntu-jammy: 2024.1-ubuntu-jammy-20241125T093138
912
heat:
1013
rocky-9: 2024.1-rocky-9-20240805T142526
1114
nova:

etc/kayobe/kolla.yml

Lines changed: 4 additions & 0 deletions
@@ -150,6 +150,10 @@ kolla_sources:
150150
type: git
151151
location: https://github.com/stackhpc/octavia.git
152152
reference: stackhpc/{{ openstack_release }}
153+
blazar-base:
154+
type: git
155+
location: https://github.com/stackhpc/blazar
156+
reference: stackhpc/master
153157

154158
###############################################################################
155159
# Kolla image build configuration.
