Commit 5c3e93c

add basic auth with default user for alertmanager
1 parent ba1a95e commit 5c3e93c

7 files changed: +91 -27 lines changed

ansible/roles/alertmanager/README.md

Lines changed: 11 additions & 3 deletions
@@ -45,8 +45,10 @@ General variables:
 The following variables are equivalent to similarly-named arguments to the
 `alertmanager` binary. See `man alertmanager` for more info:
 
-- `alertmanager_config_file`: String, path alertmanager config file will be
-written to. Parent directory will be created if necessary.
+- `alertmanager_config_file`: String, path the main alertmanager config file
+will be written to. Parent directory will be created if necessary.
+- `alertmanager_web_config_file`: String, path alertmanager web config file
+will be written to. Parent directory will be created if necessary.
 - `alertmanager_storage_path`: String, base path for data storage.
 - `alertmanager_web_listen_addresses`: List of strings, defining addresses to listeen on.
 - `alertmanager_web_external_url`: String, the URL under which Alertmanager is
@@ -59,7 +61,7 @@ The following variables are equivalent to similarly-named arguments to the
 alertmanager commandline as `--{{ key }}={{ value }}`.
 - `alertmanager_default_receivers`:
 
-The following variables are templated into the [alertmanager configuration](https://prometheus.io/docs/alerting/latest/configuration/):
+The following variables are templated into the alertmanager [main configuration](https://prometheus.io/docs/alerting/latest/configuration/):
 - `alertmanager_config_template`: String, path to configuration template. The default
 is to template in `alertmanager_config_default` and `alertmanager_config_extra`.
 - `alertmanager_config_default`: Mapping with default configuration for the
@@ -85,3 +87,9 @@ The following variables are templated into the [alertmanager configuration](http
 - weekdays: ['monday:friday']
 ```
 Note that `route` and `receivers` keys should not be added here.
+
+The following variables are templated into the alertmanager [web configuration](https://prometheus.io/docs/alerting/latest/https/):
+- `alertmanager_web_config_default`: Mapping with default configuration for
+`basic_auth_users` providing the default web user.
+- `alertmanager_alertmanager_web_config_extra`: Mapping with additional web
+configuration. Keys in this become top-level keys in the web configuration.
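
As an illustration of the `alertmanager_alertmanager_web_config_extra` variable documented in this README change, the following is a minimal sketch of enabling TLS alongside the default basic-auth user. The `tls_server_config`, `cert_file` and `key_file` keys come from the Prometheus web configuration format linked in the README; the certificate paths are hypothetical, and the files are assumed to already exist on the alertmanager host and be readable by the alertmanager system user.

```yaml
# environments/site/inventory/group_vars/all/alertmanager.yml
# Sketch only: the paths below are placeholders, not role defaults.
alertmanager_alertmanager_web_config_extra:
  tls_server_config:
    cert_file: /etc/alertmanager/tls/server.crt  # hypothetical path
    key_file: /etc/alertmanager/tls/server.key   # hypothetical path
```

With TLS enabled this way, `alertmanager_web_external_url` would presumably also need an `https://` scheme.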

ansible/roles/alertmanager/defaults/main.yml

Lines changed: 9 additions & 2 deletions
@@ -8,19 +8,26 @@ alertmanager_enabled: true
 alertmanager_system_user: alertmanager
 alertmanager_system_group: "{{ alertmanager_system_user }}"
 alertmanager_config_file: /etc/alertmanager/alertmanager.yml
+alertmanager_web_config_file: /etc/alertmanager/alertmanager-web.yml
 alertmanager_storage_path: /var/lib/alertmanager
 
 alertmanager_port: '9093'
 alertmanager_web_listen_addresses:
   - ":{{ alertmanager_port }}"
-alertmanager_web_external_url: "http://{{ hostvars[groups['alertmanager'].0].ansible_host }}:{{ alertmanager_port}}/"
+alertmanager_web_external_url: '' # defined in environments/common/inventory/group_vars/all/alertmanager.yml for visibility
 
 alertmanager_data_retention: '120h'
 alertmanager_data_maintenance_interval: '15m'
 alertmanager_config_flags: {} # other command-line parameters as shown by `man alertmanager`
 alertmanager_config_template: alertmanager.yml.j2
+alertmanager_web_config_template: alertmanager-web.yml.j2
 
-# everything below here is interpolated into alertmanager_config_default:
+alertmanager_web_config_default:
+  basic_auth_users:
+    alertmanager: "{{ vault_alertmanager_admin_password | password_hash('bcrypt', '1234567890123456789012', ident='2b') }}"
+alertmanager_alertmanager_web_config_extra: {} # top-level only
+
+# Variables below are interpolated into alertmanager_config_default:
 
 # Uncomment below and add Slack bot app creds for Slack integration
 # alertmanager_slack_integration:
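
For orientation, with these defaults and no extra web configuration, the file written to `alertmanager_web_config_file` would plausibly look like the sketch below. The `alertmanager-web.yml.j2` template is not shown in this diff, so the exact layout is an assumption, and the hash is a placeholder rather than real output.

```yaml
# /etc/alertmanager/alertmanager-web.yml (assumed rendering, not actual output)
basic_auth_users:
  # username: bcrypt hash of vault_alertmanager_admin_password
  alertmanager: "<bcrypt hash with ident 2b>"
```

The fixed 22-character salt passed to `password_hash` presumably keeps the rendered hash stable between runs, so the template task would only report a change (and restart the service) when the password itself changes.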

ansible/roles/alertmanager/tasks/configure.yml

Lines changed: 10 additions & 1 deletion
@@ -7,6 +7,7 @@
     mode: u=rwX,go=rX
   loop:
     - "{{ alertmanager_config_file | dirname }}"
+    - "{{ alertmanager_web_config_file | dirname }}"
     - "{{ alertmanager_storage_path }}"
 
 - name: Create alertmanager service file with immutable options
@@ -19,7 +20,6 @@
   register: _alertmanager_service
   notify: Restart alertmanager
 
-
 - name: Template alertmanager config
   ansible.builtin.template:
     src: "{{ alertmanager_config_template }}"
@@ -29,6 +29,15 @@
     mode: u=rw,go=
   notify: Restart alertmanager
 
+- name: Template alertmanager web config
+  ansible.builtin.template:
+    src: "{{ alertmanager_web_config_template }}"
+    dest: "{{ alertmanager_web_config_file }}"
+    owner: "{{ alertmanager_system_user }}"
+    group: "{{ alertmanager_system_group }}"
+    mode: u=rw,go=
+  notify: Restart alertmanager
+
 - meta: flush_handlers
 
 - name: Ensure alertmanager service state

ansible/roles/alertmanager/templates/alertmanager.service.j2

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ ExecStart={{ alertmanager_binary_dir }}/alertmanager \
 --web.listen-address={{ address }} \
 {% endfor %}
 --web.external-url={{ alertmanager_web_external_url }} \
+--web.config.file={{ alertmanager_web_config_file }} \
 {% for flag, flag_value in alertmanager_config_flags.items() %}
 --{{ flag }}={{ flag_value }} \
 {% endfor %}

ansible/roles/passwords/defaults/main.yml

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ slurm_appliance_secrets:
   vault_k3s_node_password: "{{ vault_k3s_node_password | default(lookup('ansible.builtin.password', '/dev/null', length=64)) }}"
   vault_pulp_admin_password: "{{ vault_pulp_admin_password | default(lookup('password', '/dev/null', chars=['ascii_letters', 'digits'])) }}"
   vault_demo_user_password: "{{ vault_demo_user_password | default(lookup('password', '/dev/null')) }}"
+  vault_alertmanager_admin_password: "{{ vault_alertmanager_admin_password | default(lookup('password', '/dev/null')) }}"
 
 secrets_openhpc_mungekey_default:
   content: "{{ lookup('pipe', 'dd if=/dev/urandom bs=1 count=1024 2>/dev/null | base64') }}"

docs/alerting.md

Lines changed: 57 additions & 21 deletions
@@ -9,30 +9,65 @@ describe the overall alerting process:
 sending out notifications via methods such as email, on-call notification
 systems, and chat platforms.
 
-By default, both a `prometheus` server and an `alertmanager` server are
-deployed on the control node for new environments:
+The general Prometheus configuration is described in
+[monitoring-and-logging.md](./monitoring-and-logging.md#defaults-3) - note that
+section specifies some role variables which commonly need modification.
+
+The alertmanager server is defined by the [ansible/roles/alertmanager](../ansible/roles/alertmanager/README.md),
+and all the configuration options and defaults are defined there. The defaults
+are fully functional, except that a [receiver](https://prometheus.io/docs/alerting/latest/configuration/#receiver)
+must be configured to generate notifications.
+
+## Enabling alertmanager
+
+1. Ensure both the `prometheus` and `alertmanager` servers are deployed on the
+control node - for new environments the `cookiecutter` tool will have done
+this:
+
+```ini
+# environments/site/groups:
+[prometheus:children]
+control
+
+[alertmanager:children]
+control
+```
+
+2. If the appliance was deployed before the alertmanager functionality was included,
+generate a password for the alertmanager UI user:
 
-```ini
-# environments/site/groups:
-[prometheus:children]
-control
+```shell
+ansible-playbook ansible/adhoc/generate-passwords.yml
+```
 
-[alertmanager:children]
-control
+3. Configure a receiver to generate notifications from alerts. Currently a Slack
+integration is provided (see below) but alternative receivers could be defined
+via overriding role defaults.
+
+4. If desired, any other [role defaults](../ansible/roles/alertmanager/README.md)
+may be overriden in e.g. `environments/site/inventory/group_vars/all/alertmanager.yml`.
+
+5. Run the `monitoring.yml` playbook (if the cluster is already up) to configure
+both alertmanager and prometheus:
+
+```shell
+ansible-playbook ansible/monitoring.yml
+```
+
+## Access
+
+There is a web interface provided by the alertmanager server. The default
+address can be seen using:
+
+```shell
+ansible localhost -m debug -a var=alertmanager_web_external_url
 ```
 
-The general Prometheus configuration is described in
-[monitoring-and-logging.md](./monitoring-and-logging.md#defaults-3) - note this
-section specifies some role variables which commonly need modification.
+The user is `alertmanager` and the autogenerated password can be seen using:
 
-The alertmanager server is defined by the [ansible/roles/alertmanager](../ansible/roles/alertmanager/README.md),
-and all the configuration options and defaults are defined there. By default
-it will be fully functional but:
-- `alertmanager_web_external_url` is likely to require modification.
-- A [receiver](https://prometheus.io/docs/alerting/latest/configuration/#receiver)
-must be defined to actually provide notifications. Currently a Slack receiver
-integration is provided (see below) but alternative receivers
-could be defined using the provided role variables.
+```shell
+ansible localhost -m debug -a var=vault_alertmanager_admin_password
+```
 
 ## Slack receiver
 
@@ -72,10 +107,11 @@ of alerts via Slack.
 ## Alerting Rules
 
 These are part of [Prometheus configuration](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
-which is defined appliance at
+which is defined for the appliance at
 [environments/common/inventory/group_vars/all/prometheus.yml](../environments/common/inventory/group_vars/all/prometheus.yml).
 
-Two `cloudalchemy.prometheus` role variables are relevant:
+Two [cloudalchemy.prometheus](https://github.com/cloudalchemy/ansible-prometheus)
+role variables are relevant:
 - `prometheus_alert_rules_files`: Paths to check for files providing rules.
 Note these are copied to Prometheus config directly, so jinja expressions for
 Prometheus do not need escaping.
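
Relating to the new "Access" section in `docs/alerting.md`, one way to confirm that the default web user works after running `monitoring.yml` is a small check play such as the sketch below. The playbook itself is hypothetical and not part of this commit. It assumes `alertmanager_web_external_url` ends with a trailing slash (as the default does) and that the alertmanager host can reach that URL. The `/-/healthy` path is Alertmanager's standard health endpoint.

```yaml
# check-alertmanager.yml (hypothetical file, for illustration only)
- name: Check alertmanager accepts the default web user
  hosts: alertmanager
  gather_facts: false
  tasks:
    - name: Query the health endpoint with basic auth
      ansible.builtin.uri:
        # Assumes the external URL ends in "/", giving ".../-/healthy"
        url: "{{ alertmanager_web_external_url }}-/healthy"
        url_username: alertmanager
        url_password: "{{ vault_alertmanager_admin_password }}"
        force_basic_auth: true
        status_code: 200
```

Running the same request without credentials would be expected to return 401 once the web config is in effect.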

environments/common/inventory/group_vars/all/alertmanager.yml

Lines changed: 2 additions & 0 deletions
@@ -12,3 +12,5 @@ alertmanager_slack_receiver: # defined here as needs prometheus address
 text: "{{ '{{' }} .GroupLabels.alertname {{ '}}' }} : {{ '{{' }} .CommonAnnotations.description {{ '}}' }}"
 title_link: "{{ prometheus_web_external_url }}/alerts?receiver=slack-receiver"
 send_resolved: true
+
+alertmanager_web_external_url: "http://{{ hostvars[groups['alertmanager'].0].ansible_host }}:{{ alertmanager_port}}/"
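
Since the default URL is now defined in the common environment's group vars, a site whose users reach alertmanager by a different name could override it in the site environment. A minimal sketch, assuming the override file path mentioned in `docs/alerting.md` and a purely hypothetical hostname:

```yaml
# environments/site/inventory/group_vars/all/alertmanager.yml
# Hypothetical hostname: replace with the address users actually browse to.
alertmanager_web_external_url: "http://alerts.example.org:9093/"
```

Keeping the trailing slash matches the form of the default value.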
