-
Notifications
You must be signed in to change notification settings - Fork 23
Improve rabbitmq reset playbook matching #1103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is this always true? I reckon a container could get into a failed state due to rabbit issues, it might be better to match against the description as this is the last field and always matches the kolla container name. |
89b65e2
to
e220685
Compare
Feels a bit wrong to match against the description rather than the ID but fair point |
@@ -68,4 +68,4 @@ | |||
# The following services use RabbitMQ. | |||
- name: Restart OpenStack services | |||
shell: >- | |||
systemctl -a | egrep '(barbican|blazar|cinder|cloudkitty|designate|heat|ironic|keystone|magnum|manila|neutron|nova|octavia)' | awk '{ print $1 }' | xargs systemctl restart | |||
systemctl list-units --type=service | egrep '(barbican|blazar|cinder|cloudkitty|designate|heat|ironic|keystone|magnum|manila|neutron|nova|octavia)' | egrep -v ceph | awk '{ print $NF }' | xargs systemctl restart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure about $NF
in your awk here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this is to match the description. A bit dangerous if this changes in kolla-ansible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New approach, splitting every word out to a new line and selecting unique entries with better regex matching (must start with kolla-
, end with -container.service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this now pick up each container twice, from the name and description? Could there be any issues from calling the restart twice in quick succession?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
systemctl list-units
supports passing a pattern in, can we do that instead of the grep? https://www.freedesktop.org/software/systemd/man/latest/systemctl.html#list-units%20PATTERN%E2%80%A6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added | uniq
to stop it duplicating lines.
I don't think the systemctl pattern really helps. Grep is more versatile here
e220685
to
e82e0fe
Compare
e82e0fe
to
22a35a8
Compare
@Alex-Welsh It's always good to check if a PR attempting to fix your issue is not already opened :) |
Oh, I see that you've seen that one (as you've upvoted Michal's suggestion) :) |
I can barely remember what I had for breakfast this morning, let alone anything that happened in May 😆. I'm happy to close this one and continue with yours if you update it |
@Alex-Welsh actually, it's my bad! I fixed matching aswell and never pushed to that branch I pasted earler! That's why I found this redundant. Great job and thanks! |
https://github.com/stackhpc/stackhpc-kayobe-config/blob/stackhpc/2023.1/etc/kayobe/ansible/stop-openstack-services.yml Just realised we might want to change this too, as it will also be matching against the dot on failed services. Less impactful as stopping a failed service probably won't do much, but still seems worth doing. |
Closing in favor of #1073 |
Two issues were picked up during the Sussex deployment. The original command matched against the correct lines but also these:
The first issue is that it was matching against ceph containers which happen to have
manila
in the nameThe second was that it was matching against failed units, which show up in the list like this:
So it's matching against the dot
Stopped containers don't need to be stopped and restarted so we should only match against running units