fix: use rabbitmq length for RabbitMQNodeDown #1579
The `RabbitMQNodeDown` rule assumed that every deployment has exactly three RabbitMQ nodes. That is not always the case: we also support deployments with a single node or with more than three. Previously, the rule caused false alerts in deployments with a single RabbitMQ node and concealed alerts in deployments with more than three nodes.
61b564c to e183052
Title changed: "controller length for RabbitMQNodeDown" to "rabbitmq length for RabbitMQNodeDown"
LGTM now, thanks!
Co-authored-by: Matt Crees <[email protected]>
This fails to template correctly.
Kolla-Ansible uses copy, not template, for rules files [1], so they can either be hard-coded or templated by Kayobe.

Possible Kayobe groups are: all, ungrouped, seed, seed-hypervisor, container-image-builders, hypervisors, infra-vms, wazuh-manager, wazuh-agent, github-runners, github-writer, controllers, network, monitoring, storage, compute-vgpu, compute, overcloud, vgpu, iommu, mlnx, docker, docker-registry, ntp, baremetal-compute, mgmt-switches, ctl-switches, hs-switches, switches, ceph, mons, mgrs, osds, rgws, cis-hardening, redfish_exporter_targets, fix-hostname, tempest_runner, controllers_with_ironic_enabled_False

Short term, I'd say we make a new variable in SKC that defaults to the length of the controller group, and add a backlog task to make the Prometheus rules files templatable in KA.
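A minimal sketch of the short-term approach described above, for illustration only. The variable name `rabbitmq_expected_node_count` is hypothetical, and the threshold expression on `rabbitmq_build_info` is an assumption about how the rule counts nodes, not a copy of the rule in this repository. It assumes Kayobe renders the rules file with Jinja before Kolla-Ansible copies it.

```yaml
# Kayobe variables (hypothetical variable name, not taken from this PR).
# Defaults to the number of hosts in the "controllers" group and can be
# overridden where RabbitMQ does not run on every controller.
rabbitmq_expected_node_count: "{{ groups['controllers'] | length }}"
---
# Prometheus rules file, templated by Kayobe (assumed layout and expression).
groups:
  - name: rabbitmq
    rules:
      - alert: RabbitMQNodeDown
        # Fire when fewer nodes report than the deployment expects,
        # rather than comparing against a hard-coded 3.
        expr: sum(rabbitmq_build_info) < {{ rabbitmq_expected_node_count }}
        for: 5m
        labels:
          severity: critical
```

With a default like this, a single-controller deployment would render the threshold as `< 1` and a five-controller deployment as `< 5`, which is the behaviour the PR description asks for.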
Good idea. Happy to +1 once it's in a ready-to-review state.
Thanks Jack, I think this solution works well. Just need to update the release note
releasenotes/notes/use-length-for-rabbitmq-node-down-rule-c9e9c6b09f57954d.yaml
e62f3fb to 747181f
The `RabbitMQNodeDown` rule assumed that every deployment has exactly three controllers. That is not always the case: we also support deployments with a single controller or with more than three. Previously, the rule caused false alerts in deployments with a single controller and concealed alerts in deployments with more than three controllers.