2023.1: zed merge #1095

Merged 24 commits on Jun 11, 2024

Commits
24bf2b8
Bump horizon to fix CVE-2023-31047
seunghun1ee May 14, 2024
67440ee
Bump grafana to fix CVE-2023-49569
seunghun1ee May 14, 2024
76c3d69
Bump prometheus-msteams to fix CVE-2022-40083 and CVE-2021-4238
seunghun1ee May 14, 2024
645b179
Add releasenote for yoga security patch q2 2024
seunghun1ee May 14, 2024
1d2946a
Merge pull request #1071 from stackhpc/q2-yoga-security-patch
markgoddard May 14, 2024
689e533
Bump horizon to fix CVE-2023-31047
seunghun1ee May 14, 2024
9c8439f
Bump grafana to fix CVE-2023-49569
seunghun1ee May 14, 2024
249e60b
Bump prometheus-msteams to fix CVE-2021-4238 and CVE-2022-40083
seunghun1ee May 14, 2024
08e9d23
Add releasenote for q2 2024 security patch for zed
seunghun1ee May 14, 2024
4d63427
Merge pull request #1072 from stackhpc/q2-2024-zed-security-patch
markgoddard May 15, 2024
8a64c5c
Add alerts for low available swap space
seunghun1ee May 16, 2024
a0331ca
OS Capacity: Support providing a CA certificate
markgoddard May 20, 2024
b45b8b9
Merge pull request #1079 from stackhpc/os-capacity-cacert
markgoddard May 21, 2024
e54a5f4
Merge pull request #1066 from stackhpc/zed-yoga-merge
markgoddard May 21, 2024
643aa78
Add releasenote for swap space monitoring
seunghun1ee May 21, 2024
6c46bce
docs: Fix link in secret rotation page
priteau May 27, 2024
4b0dc54
Support synchronising custom container images
priteau May 30, 2024
2884d3c
Merge pull request #1076 from stackhpc/pulp-container-extra
priteau May 30, 2024
f23d52c
Merge pull request #1075 from stackhpc/monitor-swap-usage
seunghun1ee May 31, 2024
76b181a
Use Rocky Linux 9 as base for kayobe-automation
Alex-Welsh Aug 22, 2023
40b00d7
CI: Fix default kayobe base image when built on push
markgoddard Jan 15, 2024
9a5cc9e
Merge pull request #1086 from stackhpc/yoga-ci-rl9-base-image
markgoddard Jun 4, 2024
7e96cb3
Merge stackhpc/yoga into stackhpc/zed
markgoddard Jun 10, 2024
a826dca
Merge stackhpc/zed into stackhpc/2023.1
markgoddard Jun 10, 2024

18 changes: 15 additions & 3 deletions doc/source/configuration/monitoring.rst
@@ -126,6 +126,8 @@ depending on your configuration, you may need set the
``kolla_enable_prometheus_ceph_mgr_exporter`` variable to ``true`` in order to
enable the ceph mgr exporter.

.. _os-capacity:

OpenStack Capacity
==================

@@ -149,9 +151,19 @@ project domain name in ``stackhpc-monitoring.yml``:
stackhpc_os_capacity_openstack_region_name: <openstack_region_name>

Additionally, you should ensure these credentials have the correct permissions
-for the exporter. If you are deploying in a cloud with internal TLS, you may be required
-to disable certificate verification for the OpenStack Capacity exporter
-if your certificate is not signed by a trusted CA.
+for the exporter.

If you are deploying in a cloud with internal TLS, you may be required
to provide a CA certificate for the OpenStack Capacity exporter if your
certificate is not signed by a trusted CA. For example, to use a CA certificate
named ``vault.crt`` that is also added to the Kolla containers:

.. code-block:: yaml

   stackhpc_os_capacity_openstack_cacert: "{{ kayobe_env_config_path }}/kolla/certificates/ca/vault.crt"

Alternatively, to disable certificate verification for the OpenStack Capacity
exporter:

.. code-block:: yaml

27 changes: 27 additions & 0 deletions doc/source/configuration/release-train.rst
@@ -147,6 +147,33 @@ By default, HashiCorp images (Consul and Vault) are not synced from Docker Hub
to the local Pulp. To sync these images, set ``stackhpc_sync_hashicorp_images``
to ``true``.

Custom container images
-----------------------

A custom list of container images can be synced to the local Pulp using the
``stackhpc_pulp_repository_container_repos_extra`` and
``stackhpc_pulp_distribution_container_extra`` variables.

.. code-block:: yaml

   # List of extra container image repositories.
   stackhpc_pulp_repository_container_repos_extra:
     - name: "certbot/certbot"
       url: "https://registry-1.docker.io"
       policy: on_demand
       proxy_url: "{{ pulp_proxy_url }}"
       state: present
       include_tags: "nightly"
       required: True

   # List of extra container image distributions.
   stackhpc_pulp_distribution_container_extra:
     - name: certbot
       repository: certbot/certbot
       base_path: certbot/certbot
       state: present
       required: True

Usage
=====

2 changes: 2 additions & 0 deletions doc/source/configuration/vault.rst
@@ -241,6 +241,8 @@ Enable the required TLS variables in kayobe and kolla
# Whether TLS is enabled for the internal API endpoints. Default is 'no'.
kolla_enable_tls_internal: yes

See :ref:`os-capacity` for information on adding CA certificates to the trust store when deploying the OpenStack Capacity exporter.

3. Set the following in etc/kayobe/kolla/globals.yml or if environments are being used etc/kayobe/environments/$KAYOBE_ENVIRONMENT/kolla/globals.yml

.. code-block::
2 changes: 1 addition & 1 deletion doc/source/operations/secret-rotation.rst
@@ -46,7 +46,7 @@ process easier.

This was previously mitigated with a change to the StackHPC fork of
Kolla-Ansible, which has since been reverted due to an unforeseen issue. See
-`here <https://github.com/stackhpc/kolla-ansible/pull/503>` for more
+`here <https://github.com/stackhpc/kolla-ansible/pull/503>`__ for more
details.

#. A change to Nova, to automate :ref:`this<nova-change>` step to change the
12 changes: 12 additions & 0 deletions etc/kayobe/ansible/deploy-os-capacity-exporter.yml
@@ -27,6 +27,7 @@
delegate_to: localhost
register: credential
when: stackhpc_enable_os_capacity
changed_when: false

- name: Set facts for admin credentials
ansible.builtin.set_fact:
@@ -43,6 +44,16 @@
src: templates/os_capacity-clouds.yml.j2
dest: /opt/kayobe/os-capacity/clouds.yaml
when: stackhpc_enable_os_capacity
register: clouds_yaml_result

- name: Copy CA certificate to OpenStack Capacity nodes
ansible.builtin.copy:
src: "{{ stackhpc_os_capacity_openstack_cacert }}"
dest: /opt/kayobe/os-capacity/cacert.pem
when:
- stackhpc_enable_os_capacity
- stackhpc_os_capacity_openstack_cacert | length > 0
register: cacert_result

- name: Ensure os_capacity container is running
community.docker.docker_container:
@@ -56,6 +67,7 @@
source: /opt/kayobe/os-capacity/
target: /etc/openstack/
network_mode: host
restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
restart_policy: unless-stopped
become: true
when: stackhpc_enable_os_capacity
3 changes: 3 additions & 0 deletions etc/kayobe/ansible/templates/os_capacity-clouds.yml.j2
@@ -10,6 +10,9 @@ clouds:
interface: "internal"
identity_api_version: 3
auth_type: "password"
{% if stackhpc_os_capacity_openstack_cacert | length > 0 %}
cacert: /etc/openstack/cacert.pem
{% endif %}
{% if not stackhpc_os_capacity_openstack_verify | bool %}
verify: False
{% endif %}
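
As a rough sketch of what this template change produces: assuming ``stackhpc_os_capacity_openstack_cacert`` is set to a non-empty path and ``stackhpc_os_capacity_openstack_verify`` is left at ``true``, the lines covered by this hunk would render approximately as follows. The enclosing keys and their indentation are not visible in the hunk, so the layout here is an assumption.

```yaml
# Illustrative rendering only; enclosing keys and indentation are assumed.
interface: "internal"
identity_api_version: 3
auth_type: "password"
# Emitted because the cacert variable is non-empty. The path is the in-container
# location: /opt/kayobe/os-capacity/ on the host is mounted at /etc/openstack/.
cacert: /etc/openstack/cacert.pem
# No "verify: False" line is emitted, because verification stays enabled.
```

The template always points at the fixed in-container path, so whichever CA file the variable names ends up as ``cacert.pem`` next to ``clouds.yaml``.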
@@ -0,0 +1,3 @@
---
# Path to a CA certificate file to trust in the OpenStack Capacity exporter.
stackhpc_os_capacity_openstack_cacert: "{{ kayobe_env_config_path }}/kolla/certificates/ca/vault.crt"
1 change: 0 additions & 1 deletion etc/kayobe/kolla.yml
@@ -420,7 +420,6 @@ kolla_build_blocks:
ARG prometheus_url=https://github.com/prometheus/prometheus/releases/download/v${prometheus_version}/prometheus-${prometheus_version}.linux-{{debian_arch}}.tar.gz
{% endraw %}


# Dict mapping image customization variable names to their values.
# Each variable takes the form:
# <image name>_<customization>_<operation>
18 changes: 18 additions & 0 deletions etc/kayobe/kolla/config/prometheus/system.rules
@@ -24,6 +24,24 @@ groups:
summary: "Prometheus exporter at {{ $labels.instance }} reports low memory"
description: "Available memory is {{ $value }} GiB."

- alert: LowSwapSpace
expr: (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) < {% endraw %}{{ alertmanager_node_free_swap_warning_threshold_ratio }}{% raw %}
for: 1m
labels:
severity: warning
annotations:
summary: "Prometheus exporter at {{ $labels.instance }} reports low swap space"
description: "Available swap space is {{ $value | humanizePercentage }}. Running out of swap space causes OOM Kills."

- alert: LowSwapSpace
expr: (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) < {% endraw %}{{ alertmanager_node_free_swap_critical_threshold_ratio }}{% raw %}
for: 1m
labels:
severity: critical
annotations:
summary: "Prometheus exporter at {{ $labels.instance }} reports low swap space"
description: "Available swap space is {{ $value | humanizePercentage }}. Running out of swap space causes OOM Kills."

- alert: HostOomKillDetected
expr: increase(node_vmstat_oom_kill[5m]) > 0
for: 5m
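
For orientation, a sketch of how the warning rule might look once Ansible has templated the file, assuming the default ``alertmanager_node_free_swap_warning_threshold_ratio`` of ``0.25`` from ``stackhpc-monitoring.yml``. The ``{% endraw %}`` and ``{% raw %}`` markers around the threshold appear to be there so that Ansible substitutes only that value and leaves the Prometheus ``{{ ... }}`` expressions untouched.

```yaml
# Sketch of the rendered warning rule; the threshold is the default 0.25.
- alert: LowSwapSpace
  expr: (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) < 0.25
  for: 1m
  labels:
    severity: warning
  annotations:
    # These expressions are evaluated later by Prometheus, not by Ansible.
    description: "Available swap space is {{ $value | humanizePercentage }}. Running out of swap space causes OOM Kills."
```

The critical rule has the same shape, with the critical threshold (default ``0.1``) and ``severity: critical``.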
12 changes: 10 additions & 2 deletions etc/kayobe/pulp.yml
@@ -652,14 +652,22 @@ stackhpc_pulp_distribution_container_hashicorp:
state: present
required: "{{ stackhpc_sync_hashicorp_images | bool }}"

# List of extra container image repositories.
stackhpc_pulp_repository_container_repos_extra: []

# List of extra container image distributions.
stackhpc_pulp_distribution_container_extra: []

# List of container image repositories.
stackhpc_pulp_repository_container_repos: >-
{{ (stackhpc_pulp_repository_container_repos_kolla +
stackhpc_pulp_repository_container_repos_ceph +
-stackhpc_pulp_repository_container_repos_hashicorp) | selectattr('required') }}
+stackhpc_pulp_repository_container_repos_hashicorp +
+stackhpc_pulp_repository_container_repos_extra) | selectattr('required') }}

# List of container image distributions.
stackhpc_pulp_distribution_container: >-
{{ (stackhpc_pulp_distribution_container_kolla +
stackhpc_pulp_distribution_container_ceph +
-stackhpc_pulp_distribution_container_hashicorp) | selectattr('required') }}
+stackhpc_pulp_distribution_container_hashicorp +
+stackhpc_pulp_distribution_container_extra) | selectattr('required') }}
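
Because the extra lists pass through the same ``selectattr('required')`` filter, an extra image could in principle be made optional in the same way as the HashiCorp images. The following is a minimal sketch, not part of this PR: ``example_sync_certbot_images`` is a hypothetical toggle variable, and the certbot entries are copied from the documentation example.

```yaml
# Hypothetical toggle; this variable is not defined anywhere in the PR.
example_sync_certbot_images: true

stackhpc_pulp_repository_container_repos_extra:
  - name: "certbot/certbot"
    url: "https://registry-1.docker.io"
    policy: on_demand
    proxy_url: "{{ pulp_proxy_url }}"
    state: present
    include_tags: "nightly"
    # Mirrors the pattern used for stackhpc_sync_hashicorp_images:
    # entries whose 'required' is false are dropped by selectattr('required').
    required: "{{ example_sync_certbot_images | bool }}"

stackhpc_pulp_distribution_container_extra:
  - name: certbot
    repository: certbot/certbot
    base_path: certbot/certbot
    state: present
    required: "{{ example_sync_certbot_images | bool }}"
```

With both defaults left as empty lists, the combined repository and distribution lists are unchanged from before this PR.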
9 changes: 9 additions & 0 deletions etc/kayobe/stackhpc-monitoring.yml
@@ -12,6 +12,12 @@ alertmanager_low_memory_threshold_gib: 5
# link. Change to false to disable this alert.
alertmanager_warn_network_bond_single_link: true

# Threshold to trigger a LowSwapSpace alert on swap space depletion (ratio).
# When the ratio of free swap space is lower than each of these values, warning
# and critical alerts will be triggered respectively.
alertmanager_node_free_swap_warning_threshold_ratio: 0.25
alertmanager_node_free_swap_critical_threshold_ratio: 0.1

###############################################################################
# Exporter configuration

@@ -20,6 +26,9 @@ alertmanager_warn_network_bond_single_link: true
# targets being templated during deployment.
stackhpc_enable_os_capacity: true

# Path to a CA certificate file to trust in the OpenStack Capacity exporter.
stackhpc_os_capacity_openstack_cacert: ""

# Whether TLS certificate verification is enabled for the OpenStack Capacity
# exporter during Keystone authentication.
stackhpc_os_capacity_openstack_verify: true
@@ -0,0 +1,13 @@
---
features:
- |
Added two alerts (warning and critical) that are triggered when the ratio
of free swap space to total swap space falls below a threshold.
Each threshold can be modified by altering the value of
``alertmanager_node_free_swap_warning_threshold_ratio`` and
``alertmanager_node_free_swap_critical_threshold_ratio``.

Currently, this solution is limited to a one-size-fits-all policy.
This can cause unwanted alerts for hosts that use swap heavily.
It is therefore recommended to tune the thresholds or apply silence rules
as needed.
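
As an example of the tuning suggested in this note, a deployment whose hosts are expected to use swap heavily might lower both thresholds in ``stackhpc-monitoring.yml``. The values below are illustrative assumptions, not recommendations from the PR.

```yaml
# Illustrative override only; pick values that suit the hosts being monitored.
# Warning fires when less than 15% of swap is free (default: 0.25).
alertmanager_node_free_swap_warning_threshold_ratio: 0.15
# Critical fires when less than 5% of swap is free (default: 0.1).
alertmanager_node_free_swap_critical_threshold_ratio: 0.05
```

Both variables apply to every host covered by the rule, so per-host exceptions would still need silence rules.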
4 changes: 4 additions & 0 deletions releasenotes/notes/os-capacity-cacert-8b800b22d84ae0b1.yaml
@@ -0,0 +1,4 @@
---
features:
- |
Adds support for providing a CA certificate for the OpenStack Capacity exporter.
@@ -0,0 +1,6 @@
---
features:
- |
Allows synchronising a custom list of container images to Pulp using the
``stackhpc_pulp_repository_container_repos_extra`` and
``stackhpc_pulp_distribution_container_extra`` variables.