Skip to content
This repository was archived by the owner on Nov 17, 2020. It is now read-only.

Stop gen_server2 stats emissions when preparing to hibernate #220

Merged

Conversation

hairyhum
Copy link
Contributor

Fixes #196, follow up to #197

Stop the stats emission timer before waiting for the backoff timeout.

Includes some changes to the gen_server2 stats emission callback logic. Callbacks now only emit stats, while timer and gs2_state is managed by gen_server2 itself. It makes it easier to control the timer and test.

Daniil Fedotov and others added 7 commits August 25, 2017 13:02
We need to call stats emissions without timer when we prepare to
hibernate and also stop stats timer without "destroying" stats. If the
stats emission interval is lower than hibernate backoff timeout, stats
emission will prevent the process from hibernating. A process not
entering hibernation will not run a full GC sweep which is problematic
for cases when shared binaries need garbage collecting. This will cause
memory bloat and can potentially result in the memory alarm not
clearing, meaning that RabbitMQ will end up in a permanently blocked
state.

[#150544915]

Signed-off-by: Gerhard Lazu <[email protected]>
gen_server2 uses stats emission by default, so emission settings
should be defined even if the rabbit app is not loaded.
Make `call_init_stats` start a stats timer.
Return state from all the stats callbacks.
Test stats timer.
Test rabbit_core_metrics collection.
Test that stats timer stops on hibernation.
Test that stats timer stops on backoff.
pre-hibernation queue length will be 0, because we
hibernate if there are no messages left.
@hairyhum
Copy link
Contributor Author

The PR requires more work.
We are stopping the stats timer before backoff, but if we get a message during the backoff timeout, the timer is never restarted.

Daniil Fedotov added 2 commits August 30, 2017 12:08
Stats timer can be restarted after a stats message,
waking up grom hibernation and a message during backoff
period.
Only stats message should reset the old timer, because
stop_stats_timer resets valid timer and leaves it
if the timer already triggered, which means that
the message is in the process message box and will
be processed, and the timer will be reset.
After waking up and message during backoff period
the timer will be started only if it was successfully
stopped.

[#150544915]
@hairyhum hairyhum changed the title DO NOT MERGE Stop gen_server2 stats emissions when preparing to hibernate Stop gen_server2 stats emissions when preparing to hibernate Aug 30, 2017

init_stats(State) -> State.

emit_stats(State = #gs2_state{queue = Queue}) ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emit_stats/1 could be removed, as there is no need for InitStatsFun anymore as it does nothing with or without metrics emission.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean init_stats/1?
Callbacks part can be refactored if we are confident about overall correctness of the behaviour.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant that one

init_stats(GS2State) ->
emit_stats(rabbit_event:init_stats_timer(GS2State, #gs2_state.timer)).
call_init_stats(State = #gs2_state{ init_stats_fun = InitStats }) ->
StateWithInitTimer = rabbit_event:init_stats_timer(State, #gs2_state.timer),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we start the timer even in the case stats are not in use? That is, no core metrics available

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to separate stats emission process from core metrics.
So stats emission can be stopped and started without emitting stats and stats can be emitted without restarting the timer.
Stats emission is controlled globally by collect_statistics setting. And metrics collection function depend on ets table.

Remove init_stats_fun callback, it is not used.
Do not use helper functions to call the callbacks,
it can cause weird performance issues.
Copy link
Contributor

@kjnilsson kjnilsson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@hairyhum
Copy link
Contributor Author

@gerhard @dcorbacho @kjnilsson is it ready to merge?

Copy link
Contributor

@gerhard gerhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, happy to take it for an LRE spin

@gerhard gerhard merged commit 8cafbf4 into master Aug 31, 2017
@gerhard gerhard deleted the stop-gen_server2-stats-emissions-when-preparing-to-hibernate branch August 31, 2017 09:41
@gerhard
Copy link
Contributor

gerhard commented Aug 31, 2017

Just realised that this should have been merged into stable - merging manually

@gerhard gerhard restored the stop-gen_server2-stats-emissions-when-preparing-to-hibernate branch August 31, 2017 09:48
@gerhard gerhard deleted the stop-gen_server2-stats-emissions-when-preparing-to-hibernate branch August 31, 2017 09:49
@michaelklishin michaelklishin added this to the 3.6.12 milestone Sep 1, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants