Skip to content

Do not fail on bind/unbind operations if the binding records are inconsistent. #1884

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 16, 2019

Conversation

hairyhum
Copy link
Contributor

If there is a record for the rabbit_durable_route table but no record
for rabbit_route table, the binding operations should still proceed to
create/remove bindings. This will allow the clients to fix data inconsistency
that server did not fix during recovery.

[#163952284]

This PR should be tested with exchanges, which subscribe to binding actions.

…nsistent.

If there is a record for the rabbit_durable_route table but no record
for rabbit_route table, the binding operations should still proceed to
create/remove bindings. This will allow the clients to fix data inconsistency
that server did not fix during recovery.

[#163952284]
@michaelklishin
Copy link
Collaborator

@hairyhum what does "which subscribe to binding actions" mean? If I manually remove a rabbit_route row and then try queue.{bind,unbind} on the exchange the row belonged to, would this be close enough?

@hairyhum
Copy link
Contributor Author

@michaelklishin some exchanges have callbacks, which are triggered inside/outside of transaction when a binding added or deleted. I'm not sure if there will not be any side-effects like there were with #1589
There should not be, but still worth checking.

Yes, for testing it should be enough to delete rabbit_route records (and/or rabbit_reverse_route) while there are rabbit_durable_route records

Older versions still can return binding_not_found error.
@michaelklishin
Copy link
Collaborator

michaelklishin commented Feb 16, 2019

Managed to reproduce the NOT_FOUND error with the following synthetic test on a 3.7.12 node and in master at the time of writing:

  • Declare a few durable queues
  • Bind them to amq.fanout since default exchange bindings are now implicit and leave route rows
  • In the running node REPL, inspect routes with ets:tab2list(rabbit_durable_route). and ets:tab2list(rabbit_route).
  • Wipe all objects in rabbit_route with ets:delete_all_objects(rabbit_route).
  • Inspect both tables again
  • Try to bind the above queues to amq.fanout again
  • Observe a channel exception that says NOT_FOUND - no binding between exchange 'amq.fanout' in vhost '/' and queue 'q.durable.0' in vhost '/'

With the same test on this branch I observe no channel exceptions and all binding rows for the queues in question in rabbit_route are reinstated 👍. I observed no binding table inconsistencies or issues of any kind with subsequent queue operations.

@michaelklishin
Copy link
Collaborator

Conducted the above test with non-durable queues (and thus semi-durable bindings) and the outcome is the same 👍.

@michaelklishin
Copy link
Collaborator

There is a concern that this can affect stateful exchange types, namely rabbitmq_consistent_hash_exchange. That exchange only allows one binding between a queue and exchange starting with rabbitmq/rabbitmq-consistent-hash-exchange#37 and that's the right direction for it anyway.

I could not identify any regressions and the above tests look promising in addressing #1873 at least in part.

@michaelklishin michaelklishin merged commit 1117739 into master Feb 16, 2019
@michaelklishin michaelklishin deleted the allow_binding_if_tables_inconsistent branch February 16, 2019 12:47
@michaelklishin
Copy link
Collaborator

Backported to v3.7.x.

@michaelklishin michaelklishin added this to the 3.7.13 milestone Feb 17, 2019
@michaelklishin
Copy link
Collaborator

Setting milestone on the PR here because we are not yet ready to consider #1873 resolved (and thus can't know what milestone it should use).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants