-
Notifications
You must be signed in to change notification settings - Fork 114
Stop gen_server2 stats emissions when preparing to hibernate #220
Stop gen_server2 stats emissions when preparing to hibernate #220
Conversation
[#150544915] Signed-off-by: Gerhard Lazu <[email protected]>
We need to call stats emissions without timer when we prepare to hibernate and also stop stats timer without "destroying" stats. If the stats emission interval is lower than hibernate backoff timeout, stats emission will prevent the process from hibernating. A process not entering hibernation will not run a full GC sweep which is problematic for cases when shared binaries need garbage collecting. This will cause memory bloat and can potentially result in the memory alarm not clearing, meaning that RabbitMQ will end up in a permanently blocked state. [#150544915] Signed-off-by: Gerhard Lazu <[email protected]>
gen_server2 uses stats emission by default, so emission settings should be defined even if the rabbit app is not loaded.
Make `call_init_stats` start a stats timer. Return state from all the stats callbacks.
Test stats timer. Test rabbit_core_metrics collection. Test that stats timer stops on hibernation. Test that stats timer stops on backoff.
pre-hibernation queue length will be 0, because we hibernate if there are no messages left.
The PR requires more work. |
Stats timer can be restarted after a stats message, waking up grom hibernation and a message during backoff period. Only stats message should reset the old timer, because stop_stats_timer resets valid timer and leaves it if the timer already triggered, which means that the message is in the process message box and will be processed, and the timer will be reset. After waking up and message during backoff period the timer will be started only if it was successfully stopped. [#150544915]
src/gen_server2.erl
Outdated
|
||
init_stats(State) -> State. | ||
|
||
emit_stats(State = #gs2_state{queue = Queue}) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emit_stats/1
could be removed, as there is no need for InitStatsFun
anymore as it does nothing with or without metrics emission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean init_stats/1
?
Callbacks part can be refactored if we are confident about overall correctness of the behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I meant that one
src/gen_server2.erl
Outdated
init_stats(GS2State) -> | ||
emit_stats(rabbit_event:init_stats_timer(GS2State, #gs2_state.timer)). | ||
call_init_stats(State = #gs2_state{ init_stats_fun = InitStats }) -> | ||
StateWithInitTimer = rabbit_event:init_stats_timer(State, #gs2_state.timer), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we start the timer even in the case stats are not in use? That is, no core metrics available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to separate stats emission process from core metrics.
So stats emission can be stopped and started without emitting stats and stats can be emitted without restarting the timer.
Stats emission is controlled globally by collect_statistics
setting. And metrics collection function depend on ets table.
Remove init_stats_fun callback, it is not used. Do not use helper functions to call the callbacks, it can cause weird performance issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@gerhard @dcorbacho @kjnilsson is it ready to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, happy to take it for an LRE spin
Just realised that this should have been merged into |
Fixes #196, follow up to #197
Stop the stats emission timer before waiting for the backoff timeout.
Includes some changes to the
gen_server2
stats emission callback logic. Callbacks now only emit stats, while timer andgs2_state
is managed bygen_server2
itself. It makes it easier to control the timer and test.