Commit d0c13b4

Revert "rabbit_feature_flags: Retry after erpc:call() fails with noconnection"
This reverts commit 8749c60.

[Why]
The patch was supposed to solve an issue that we didn't understand and that was likely a network/DNS problem outside of RabbitMQ. We know it didn't solve that issue because it was reported again 6 months after the initial pull request (#8411).

What we are sure of, however, is that it increased the testing time of RabbitMQ significantly because the code loops for 10+ minutes if the remote node is not running.

The retry in the Feature flags subsystem was not the right place either. The `noconnection` error is visible there because it runs earlier during RabbitMQ startup. But retrying there won't magically solve a network issue.

There are two ways to create a cluster:
1. peer discovery, and this subsystem takes care of retries if necessary and appropriate
2. manually using the CLI, in which case the user is responsible for starting RabbitMQ nodes and clustering them

Let's revert it until the root cause is really understood.
1 parent 7146274 commit d0c13b4


deps/rabbit/src/rabbit_ff_controller.erl

Lines changed: 0 additions & 23 deletions
@@ -1390,32 +1390,9 @@ this_node_first(Nodes) ->
       Ret :: term() | {error, term()}.
 
 rpc_call(Node, Module, Function, Args, Timeout) ->
-    SleepBetweenRetries = 5000,
-    T0 = erlang:monotonic_time(),
     try
         erpc:call(Node, Module, Function, Args, Timeout)
     catch
-        %% In case of `noconnection' with `Timeout'=infinity, we don't retry
-        %% at all. This is because the infinity "timeout" is used to run
-        %% callbacks on remote node and they can last an indefinite amount of
-        %% time, for instance, if there is a lot of data to migrate.
-        error:{erpc, noconnection} = Reason
-          when is_integer(Timeout) andalso Timeout > SleepBetweenRetries ->
-            ?LOG_WARNING(
-               "Feature flags: no connection to node `~ts`; "
-               "retrying in ~b milliseconds",
-               [Node, SleepBetweenRetries],
-               #{domain => ?RMQLOG_DOMAIN_FEAT_FLAGS}),
-            timer:sleep(SleepBetweenRetries),
-            T1 = erlang:monotonic_time(),
-            TDiff = erlang:convert_time_unit(T1 - T0, native, millisecond),
-            Remaining = Timeout - TDiff,
-            Timeout1 = erlang:max(Remaining, 0),
-            case Timeout1 of
-                0 -> {error, Reason};
-                _ -> rpc_call(Node, Module, Function, Args, Timeout1)
-            end;
-
         Class:Reason:Stacktrace ->
             Message0 = erl_error:format_exception(Class, Reason, Stacktrace),
             Message1 = lists:flatten(Message0),
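
To illustrate why the removed clause keeps a caller busy for about the whole `Timeout` when the remote node is down, here is a minimal standalone sketch of the same retry shape. It is not RabbitMQ code: the module `retry_sketch`, the function `call_with_retry/3`, and the attempt counter are hypothetical names introduced for illustration, and a plain fun stands in for `erpc:call/5` so the sketch runs without a cluster.

```erlang
-module(retry_sketch).
-export([call_with_retry/3]).

%% Hypothetical helper, not part of RabbitMQ. Runs Fun() and, on
%% `{erpc, noconnection}', sleeps `Sleep' ms and retries with whatever
%% is left of the `Timeout' budget (ms), like the removed clause did.
call_with_retry(Fun, Timeout, Sleep) ->
    call_with_retry(Fun, Timeout, Sleep, 1).

call_with_retry(Fun, Timeout, Sleep, Attempt) ->
    T0 = erlang:monotonic_time(),
    try
        {ok, Fun(), Attempt}
    catch
        %% Retry only for a finite timeout larger than the sleep interval;
        %% `infinity' never retries, as in the removed code.
        error:{erpc, noconnection} = Reason
          when is_integer(Timeout) andalso Timeout > Sleep ->
            timer:sleep(Sleep),
            T1 = erlang:monotonic_time(),
            TDiff = erlang:convert_time_unit(T1 - T0, native, millisecond),
            case erlang:max(Timeout - TDiff, 0) of
                0 ->
                    {error, Reason, Attempt};
                Remaining ->
                    call_with_retry(Fun, Remaining, Sleep, Attempt + 1)
            end;
        %% Budget too small to retry (or not an integer): give up.
        %% The real function falls through to its generic clause here.
        error:{erpc, noconnection} = Reason ->
            {error, Reason, Attempt}
    end.
```

Called with a fun that always raises `{erpc, noconnection}`, the sketch only returns once the budget is spent, so the number of attempts grows with roughly `Timeout / Sleep`. That is the behaviour the commit message describes as looping for 10+ minutes when the remote node is not running.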
