-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Handle premature connection termination in connection tracking handler #1898
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think rabbit_connection_tracking:register_connection
should be updated, instead, as there are other mnesia
operations in that module that handle no_exists
Happy for @lukebakken to take this over 👍 |
); | ||
ThisNode -> | ||
try | ||
rabbit_connection_tracking:register_connection( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@michaelklishin if we're going to catch these errors, how about moving try ... catch
into rabbit_connection_tracking:register_connection
itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's the handler's choice/responsibility to handle them. I don't know if we want to mask those errors for every caller and what the return value should be in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
If a connection disappears before the handler can get the relevant info from a connection_created event, the handler crashes: 2019-02-21 16:47:03.086 [info] <0.1283.0> accepting AMQP connection <0.1283.0> (10.44.1.56:34784 -> 10.44.0.54:5672) 2019-02-21 16:47:03.239 [info] <0.1283.0> Connection <0.1283.0> (10.44.1.56:34784 -> 10.44.0.54:5672) has a client-provided name: perf-test-configuration 2019-02-21 16:47:03.254 [info] <0.1283.0> connection <0.1283.0> (10.44.1.56:34784 -> 10.44.0.54:5672 - perf-test-configuration): user 'test' authenticated and granted access to vhost '/' 2019-02-21 16:47:03.256 [error] <0.392.0> ** gen_event handler rabbit_connection_tracking_handler crashed. ** Was installed in rabbit_event ** Last event was: {event,connection_created,[{user_provided_name,<<"perf-test-configuration">>},{type,network},{pid,<0.1283.0>},{name,<<"10.44.1.56:34784 -> 10.44.0.54:5672">>},{port,5672},{peer_port,34784},{host,{0,0,0,0,0,65535,2604,54}},{peer_host,{0,0,0,0,0,65535,2604,312}},{ssl,false},{peer_cert_subject,''},{peer_cert_issuer,''},{peer_cert_validity,''},{auth_mechanism,<<"PLAIN">>},{ssl_protocol,''},{ssl_key_exchange,''},{ssl_cipher,''},{ssl_hash,''},{protocol,{0,9,1}},{user,<<"test">>},{vhost,<<"/">>},{timeout,60},{frame_max,131072},{channel_max,2047},{client_properties,[{<<"connection_name">>,longstr,<<"perf-test-configuration">>},{<<"product">>,longstr,<<"RabbitMQ">>},{<<"copyright">>,longstr,<<"Copyright (c) 2007-2019 Pivotal Software, Inc.">>},{<<"capabilities">>,table,[{<<"exchange_exchange_bindings">>,bool,true},{<<"connection.blocked">>,bool,true},{<<"authentication_failure_close">>,bool,true},{<<"basic.nack">>,bool,true},{<<"publisher_confirms">>,bool,true},{<<"consumer_cancel_notify">>,bool,true}]},{<<"information">>,longstr,<<"Licensed under the MPL. See http://www.rabbitmq.com/">>},{<<"version">>,longstr,<<"5.6.0">>},{<<"platform">>,longstr,<<"Java">>}]},{connected_at,1550767623053},{node,'[email protected]'},...],...} ** When handler state == [] ** Reason == {error,{no_exists,['tracked_connection_on_node_rabbit@rabbitmq-2.rabbitmq-headless.rabbitmq.svc.cluster.local',{'[email protected]',<<"10.44.1.56:34784 -> 10.44.0.54:5672">>}]}} 2019-02-21 16:47:03.302 [info] <0.1292.0> accepting AMQP connection <0.1292.0> (10.44.1.56:34788 -> 10.44.0.54:5672) 2019-02-21 16:47:03.333 [info] <0.1292.0> Connection <0.1292.0> (10.44.1.56:34788 -> 10.44.0.54:5672) has a client-provided name: perf-test-consumer-0 2019-02-21 16:47:03.336 [info] <0.1292.0> connection <0.1292.0> (10.44.1.56:34788 -> 10.44.0.54:5672 - perf-test-consumer-0): user 'test' authenticated and granted access to vhost '/' 2019-02-21 16:47:03.942 [info] <0.1315.0> accepting AMQP connection <0.1315.0> (10.44.0.56:49878 -> 10.44.0.54:5672) 2019-02-21 16:47:03.963 [info] <0.1315.0> Connection <0.1315.0> (10.44.0.56:49878 -> 10.44.0.54:5672) has a client-provided name: perf-test-consumer-0 2019-02-21 16:47:03.972 [info] <0.1315.0> connection <0.1315.0> (10.44.0.56:49878 -> 10.44.0.54:5672 - perf-test-consumer-0): user 'test' authenticated and granted access to vhost '/' 2019-02-21 16:47:04.038 [info] <0.1337.0> accepting AMQP connection <0.1337.0> (10.44.0.56:49880 -> 10.44.0.54:5672) 2019-02-21 16:47:04.044 [info] <0.1337.0> Connection <0.1337.0> (10.44.0.56:49880 -> 10.44.0.54:5672) has a client-provided name: perf-test-producer-0 2019-02-21 16:47:04.048 [info] <0.1337.0> connection <0.1337.0> (10.44.0.56:49880 -> 10.44.0.54:5672 - perf-test-producer-0): user 'test' authenticated and granted access to vhost '/' cc @mkuratczyk rabbit_connection_tracking_handler: handle more errors when registering a connection Due to #1869, a connection can be accepted before connection tracking tables were initialized/synced from a peer. The handler should log a warning and not terminate, even if it would be restarted, to reduce log noise and scare fewer operators for no real reason: the condition is transient and should not normally last for more than a few tenths of a second. Use tracked connection ID in error message
663f213
to
0dccf40
Compare
Handle premature connection termination in connection tracking handler (cherry picked from commit 161c716)
Backported to |
Proposed Changes
If a connection disappears before the handler can get the relevant info from a connection_created event, the handler crashes.
Types of Changes
What types of changes does your code introduce to this project?
Put an
x
in the boxes that applyChecklist
Put an
x
in the boxes that apply. You can also fill these out after creatingthe PR. If you're unsure about any of them, don't hesitate to ask on the
mailing list. We're here to help! This is simply a reminder of what we are
going to look for before merging your code.
CONTRIBUTING.md
documentFurther Comments
Not sure if this is a good idea, submitting it while in WIP to get some feedback. I'm looking at how to add a test for this - feels very complicated - is it worth it?
Discovered by @mkuratczyk