Commit 40a1526

Merge pull request #1095 from stackhpc/2023.1-zed-merge
2023.1: zed merge
2 parents 9549f1b + a826dca commit 40a1526

14 files changed: 123 additions, 7 deletions

doc/source/configuration/monitoring.rst

Lines changed: 15 additions & 3 deletions
@@ -126,6 +126,8 @@ depending on your configuration, you may need set the
 ``kolla_enable_prometheus_ceph_mgr_exporter`` variable to ``true`` in order to
 enable the ceph mgr exporter.

+.. _os-capacity:
+
 OpenStack Capacity
 ==================

@@ -149,9 +151,19 @@ project domain name in ``stackhpc-monitoring.yml``:
 stackhpc_os_capacity_openstack_region_name: <openstack_region_name>

 Additionally, you should ensure these credentials have the correct permissions
-for the exporter. If you are deploying in a cloud with internal TLS, you may be required
-to disable certificate verification for the OpenStack Capacity exporter
-if your certificate is not signed by a trusted CA.
+for the exporter.
+
+If you are deploying in a cloud with internal TLS, you may be required
+to provide a CA certificate for the OpenStack Capacity exporter if your
+certificate is not signed by a trusted CA. For example, to use a CA certificate
+named ``vault.crt`` that is also added to the Kolla containers:
+
+.. code-block:: yaml
+
+stackhpc_os_capacity_openstack_cacert: "{{ kayobe_env_config_path }}/kolla/certificates/ca/vault.crt"
+
+Alternatively, to disable certificate verification for the OpenStack Capacity
+exporter:

 .. code-block:: yaml

doc/source/configuration/release-train.rst

Lines changed: 27 additions & 0 deletions
@@ -147,6 +147,33 @@ By default, HashiCorp images (Consul and Vault) are not synced from Docker Hub
 to the local Pulp. To sync these images, set ``stackhpc_sync_hashicorp_images``
 to ``true``.

+Custom container images
+-----------------------
+
+A custom list of container images can be synced to the local Pulp using the
+``stackhpc_pulp_repository_container_repos_extra`` and
+``stackhpc_pulp_distribution_container_extra`` variables.
+
+.. code-block:: yaml
+
+# List of extra container image repositories.
+stackhpc_pulp_repository_container_repos_extra:
+- name: "certbot/certbot"
+url: "https://registry-1.docker.io"
+policy: on_demand
+proxy_url: "{{ pulp_proxy_url }}"
+state: present
+include_tags: "nightly"
+required: True
+
+# List of extra container image distributions.
+stackhpc_pulp_distribution_container_extra:
+- name: certbot
+repository: certbot/certbot
+base_path: certbot/certbot
+state: present
+required: True
+
 Usage
 =====

doc/source/configuration/vault.rst

Lines changed: 2 additions & 0 deletions
@@ -241,6 +241,8 @@ Enable the required TLS variables in kayobe and kolla
 # Whether TLS is enabled for the internal API endpoints. Default is 'no'.
 kolla_enable_tls_internal: yes

+See :ref:`os-capacity` for information on adding CA certificates to the trust store when deploying the OpenStack Capacity exporter.
+
 3. Set the following in etc/kayobe/kolla/globals.yml or if environments are being used etc/kayobe/environments/$KAYOBE_ENVIRONMENT/kolla/globals.yml

 .. code-block::

doc/source/operations/secret-rotation.rst

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ process easier.

 This was previously mitigated with a change to the StackHPC fork of
 Kolla-Ansible, which has since been reverted due to an unforeseen issue. See
-`here <https://github.com/stackhpc/kolla-ansible/pull/503>` for more
+`here <https://github.com/stackhpc/kolla-ansible/pull/503>`__ for more
 details.

 #. A change to Nova, to automate :ref:`this<nova-change>` step to change the

etc/kayobe/ansible/deploy-os-capacity-exporter.yml

Lines changed: 12 additions & 0 deletions
@@ -27,6 +27,7 @@
 delegate_to: localhost
 register: credential
 when: stackhpc_enable_os_capacity
+changed_when: false

 - name: Set facts for admin credentials
 ansible.builtin.set_fact:
@@ -43,6 +44,16 @@
 src: templates/os_capacity-clouds.yml.j2
 dest: /opt/kayobe/os-capacity/clouds.yaml
 when: stackhpc_enable_os_capacity
+register: clouds_yaml_result
+
+- name: Copy CA certificate to OpenStack Capacity nodes
+ansible.builtin.copy:
+src: "{{ stackhpc_os_capacity_openstack_cacert }}"
+dest: /opt/kayobe/os-capacity/cacert.pem
+when:
+- stackhpc_enable_os_capacity
+- stackhpc_os_capacity_openstack_cacert | length > 0
+register: cacert_result

 - name: Ensure os_capacity container is running
 community.docker.docker_container:
@@ -56,6 +67,7 @@
 source: /opt/kayobe/os-capacity/
 target: /etc/openstack/
 network_mode: host
+restart: "{{ clouds_yaml_result is changed or cacert_result is changed }}"
 restart_policy: unless-stopped
 become: true
 when: stackhpc_enable_os_capacity

etc/kayobe/ansible/templates/os_capacity-clouds.yml.j2

Lines changed: 3 additions & 0 deletions
@@ -10,6 +10,9 @@ clouds:
 interface: "internal"
 identity_api_version: 3
 auth_type: "password"
+{% if stackhpc_os_capacity_openstack_cacert | length > 0 %}
+cacert: /etc/openstack/cacert.pem
+{% endif %}
 {% if not stackhpc_os_capacity_openstack_verify | bool %}
 verify: False
 {% endif %}
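
As a rough illustration of the two conditionals in this template hunk, the Python sketch below mirrors which TLS-related keys would end up in the rendered ``clouds.yaml``. The function name ``tls_options`` and the dictionary representation are assumptions made for illustration only; the variable semantics and the ``/etc/openstack/cacert.pem`` path are taken from the diff.

```python
def tls_options(cacert: str, verify: bool) -> dict:
    """Mirror the template's two conditionals (hypothetical helper, not part of the commit)."""
    options = {}
    # Corresponds to: {% if stackhpc_os_capacity_openstack_cacert | length > 0 %}
    if len(cacert) > 0:
        # The playbook copies the CA file to /opt/kayobe/os-capacity/cacert.pem,
        # and that directory is mounted into the container at /etc/openstack/.
        options["cacert"] = "/etc/openstack/cacert.pem"
    # Corresponds to: {% if not stackhpc_os_capacity_openstack_verify | bool %}
    if not verify:
        options["verify"] = False
    return options
```

With the defaults from ``stackhpc-monitoring.yml`` (an empty ``stackhpc_os_capacity_openstack_cacert`` and ``stackhpc_os_capacity_openstack_verify: true``), neither key is emitted, so existing deployments should render the same file as before.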
Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+---
+# Path to a CA certificate file to trust in the OpenStack Capacity exporter.
+stackhpc_os_capacity_openstack_cacert: "{{ kayobe_env_config_path }}/kolla/certificates/ca/vault.crt"

etc/kayobe/kolla.yml

Lines changed: 0 additions & 1 deletion
@@ -420,7 +420,6 @@ kolla_build_blocks:
 ARG prometheus_url=https://github.com/prometheus/prometheus/releases/download/v${prometheus_version}/prometheus-${prometheus_version}.linux-{{debian_arch}}.tar.gz
 {% endraw %}

-
 # Dict mapping image customization variable names to their values.
 # Each variable takes the form:
 # <image name>_<customization>_<operation>

etc/kayobe/kolla/config/prometheus/system.rules

Lines changed: 18 additions & 0 deletions
@@ -24,6 +24,24 @@ groups:
 summary: "Prometheus exporter at {{ $labels.instance }} reports low memory"
 description: "Available memory is {{ $value }} GiB."

+- alert: LowSwapSpace
+expr: (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) < {% endraw %}{{ alertmanager_node_free_swap_warning_threshold_ratio }}{% raw %}
+for: 1m
+labels:
+severity: warning
+annotations:
+summary: "Swap space at {{ $labels.instance }} reports low memory"
+description: "Available swap space is {{ $value | humanizePercentage }}. Running out of swap space causes OOM Kills."
+
+- alert: LowSwapSpace
+expr: (node_memory_SwapFree_bytes / node_memory_SwapTotal_bytes) < {% endraw %}{{ alertmanager_node_free_swap_critical_threshold_ratio }}{% raw %}
+for: 1m
+labels:
+severity: critical
+annotations:
+summary: "Swap space at {{ $labels.instance }} reports low memory"
+description: "Available swap space is {{ $value | humanizePercentage }}. Running out of swap space causes OOM Kills."
+
 - alert: HostOomKillDetected
 expr: increase(node_vmstat_oom_kill[5m]) > 0
 for: 5m

etc/kayobe/pulp.yml

Lines changed: 10 additions & 2 deletions
@@ -652,14 +652,22 @@ stackhpc_pulp_distribution_container_hashicorp:
 state: present
 required: "{{ stackhpc_sync_hashicorp_images | bool }}"

+# List of extra container image repositories.
+stackhpc_pulp_repository_container_repos_extra: []
+
+# List of extra container image distributions.
+stackhpc_pulp_distribution_container_extra: []
+
 # List of container image repositories.
 stackhpc_pulp_repository_container_repos: >-
 {{ (stackhpc_pulp_repository_container_repos_kolla +
 stackhpc_pulp_repository_container_repos_ceph +
-stackhpc_pulp_repository_container_repos_hashicorp) | selectattr('required') }}
+stackhpc_pulp_repository_container_repos_hashicorp +
+stackhpc_pulp_repository_container_repos_extra) | selectattr('required') }}

 # List of container image distributions.
 stackhpc_pulp_distribution_container: >-
 {{ (stackhpc_pulp_distribution_container_kolla +
 stackhpc_pulp_distribution_container_ceph +
-stackhpc_pulp_distribution_container_hashicorp) | selectattr('required') }}
+stackhpc_pulp_distribution_container_hashicorp +
+stackhpc_pulp_distribution_container_extra) | selectattr('required') }}
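
To make the effect of this hunk concrete, here is a minimal Python sketch of what the Jinja expression does: concatenate the per-source lists, then keep only entries whose ``required`` attribute is truthy. The helper name ``select_required`` and the sample entries are hypothetical, and a missing ``required`` key is treated as false here, which is an assumption about how an operator would want it to behave rather than a statement about Ansible's handling of undefined attributes.

```python
def select_required(*lists):
    """Concatenate lists of dicts and keep entries with a truthy 'required' value."""
    combined = [item for entries in lists for item in entries]
    # Sketch of: (... + ... + ..._extra) | selectattr('required')
    return [item for item in combined if item.get("required")]


# Hypothetical sample data; only the certbot entry mirrors the documentation example.
kolla = [{"name": "kolla/nova-api", "required": True}]
hashicorp = [{"name": "hashicorp/vault", "required": False}]
extra = [{"name": "certbot/certbot", "required": True}]

selected = select_required(kolla, hashicorp, extra)
names = [item["name"] for item in selected]
```

Because the new ``*_extra`` variables default to empty lists, appending them leaves the combined lists unchanged unless an operator sets them, and an extra entry is only synced when its ``required`` value is true.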

etc/kayobe/stackhpc-monitoring.yml

Lines changed: 9 additions & 0 deletions
@@ -12,6 +12,12 @@ alertmanager_low_memory_threshold_gib: 5
 # link. Change to false to disable this alert.
 alertmanager_warn_network_bond_single_link: true

+# Threshold to trigger an LowSwapSpace alert on swap space depletion (ratio).
+# When the ratio of free swap space is lower than each of these values, warning
+# and critical alerts will be triggered respectively.
+alertmanager_node_free_swap_warning_threshold_ratio: 0.25
+alertmanager_node_free_swap_critical_threshold_ratio: 0.1
+
 ###############################################################################
 # Exporter configuration

@@ -20,6 +26,9 @@ alertmanager_warn_network_bond_single_link: true
 # targets being templated during deployment.
 stackhpc_enable_os_capacity: true

+# Path to a CA certificate file to trust in the OpenStack Capacity exporter.
+stackhpc_os_capacity_openstack_cacert: ""
+
 # Whether TLS certificate verification is enabled for the OpenStack Capacity
 # exporter during Keystone authentication.
 stackhpc_os_capacity_openstack_verify: true
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+---
+features:
+- |
+Added two alerts (Warning and critical) that are triggered when the ratio
+of (free_swap_sppace / total_swap_space) is below thresholds.
+Each threshold can be modified by alterting value of
+``alertmanager_node_free_swap_warning_threshold_ratio`` and
+``alertmanager_node_free_swap_critical_threshold_ratio``.
+
+Currently this solution has limitation of having one-size fits all policy.
+This can cause unwanted alerts for the hosts which utilise swap heavily
+Therefore it is recommended to tune the thresholds or apply silence rules
+for the needs.
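
The threshold behaviour described in this release note can be sketched in Python as follows. The function name and the sample byte counts are hypothetical; the default ratios (0.25 and 0.1) are the values added to ``stackhpc-monitoring.yml`` in this commit. The sketch assumes, as I understand PromQL semantics, that a host with no swap configured produces no sample for the comparison and so raises no alert, and it ignores the ``for: 1m`` hold period.

```python
def swap_alert_severities(swap_free_bytes, swap_total_bytes,
                          warning_ratio=0.25, critical_ratio=0.1):
    """Return the severities of the LowSwapSpace rules whose expression would match."""
    if swap_total_bytes == 0:
        # Assumption: 0/0 is NaN in PromQL and the '<' filter drops it, so nothing fires.
        return []
    ratio = swap_free_bytes / swap_total_bytes
    severities = []
    if ratio < warning_ratio:
        severities.append("warning")
    if ratio < critical_ratio:
        # The two rules are independent, so a critical match also matches the warning rule.
        severities.append("critical")
    return severities
```

For example, a host with 20% of its swap free matches only the warning rule, while one with 5% free matches both, which is one reason the release note suggests tuning thresholds or silencing alerts for hosts that use swap heavily.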
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+---
+features:
+- |
+Adds support for providing a CA certificate for OpenStack Capacity exporter.
Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+features:
+- |
+Allows to synchronise a custom list of containers to Pulp using the
+``stackhpc_pulp_repository_container_repos_extra`` and
+``stackhpc_pulp_distribution_container_extra`` variables.
