Stable GM memory usage during constant redelivery #1302
Conversation
We don't want to use the backoff/hibernate feature because we have observed that the GM process is suspended half of the time.

We really wanted to replace gen_server2 with gen_server, but it was more important to keep changes in 3.6 to a minimum. GM will eventually be replaced, so switching it from gen_server2 to gen_server would soon be redundant. We simply do not understand some of the gen_server2 trade-offs well enough to feel strongly about this change.

[#148892851]

Signed-off-by: Gerhard Lazu <[email protected]>
👍 nice way to integrate garbage_collect()
src/gm.erl (outdated)
```
flush_timeout(_) -> 0.

ensure_force_gc_timer(State = #state { force_gc_timer = TRef })
  when TRef =/= undefined ->
```
`TRef =/= undefined` could be `is_reference(TRef)`, but it doesn't really matter 😄
You're right, we'll change that. Thanks!
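For readers following along, here is a minimal, self-contained sketch of the pattern being discussed: a gen_server that arms a timer and garbage-collects itself when the timer fires, with the `is_reference/1` guard suggested above. The module name and the `force_gc` message are illustrative assumptions; the 250ms interval and the `force_gc_timer` field mirror the change under review, but this is not the actual gm.erl code.

```erlang
%% force_gc_sketch.erl -- illustrative sketch only, not the actual
%% gm.erl implementation: a gen_server that periodically forces its
%% own garbage collection.
-module(force_gc_sketch).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

-define(FORCE_GC_INTERVAL, 250). %% milliseconds

-record(state, {force_gc_timer = undefined :: undefined | reference()}).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    {ok, ensure_force_gc_timer(#state{})}.

%% Arm the timer only when one is not already running; the guard uses
%% is_reference/1 as suggested in the review above.
ensure_force_gc_timer(State = #state{force_gc_timer = TRef})
  when is_reference(TRef) ->
    State;
ensure_force_gc_timer(State = #state{force_gc_timer = undefined}) ->
    TRef = erlang:send_after(?FORCE_GC_INTERVAL, self(), force_gc),
    State#state{force_gc_timer = TRef}.

handle_call(_Msg, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State)        -> {noreply, State}.

%% When the timer fires, collect garbage and re-arm the timer.
handle_info(force_gc, State) ->
    garbage_collect(),
    {noreply, ensure_force_gc_timer(State#state{force_gc_timer = undefined})};
handle_info(_Msg, State) ->
    {noreply, State}.
```

`erlang:garbage_collect/0` performs a major collection of the calling process, so binaries the process no longer references are released promptly instead of lingering until the VM is under memory pressure.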
Force-pushed from 1ac9ea3 to 7f84083
In high throughput scenarios, e.g. `basic.reject` or `basic.nack`, messages which belong to a mirrored queue and are replicated within a GM group are quickly promoted to the old heap. This means that garbage collection happens only when the Erlang VM is under memory pressure, which might be too late. When a process is under memory pressure, garbage collection slows it down even further, to the point of RabbitMQ nodes running out of memory and crashing. To avoid this scenario, we want the GM process to garbage collect binaries regularly, i.e. every 250ms. The variable queue does the same for a similar reason: #289

Initially, we wanted to use the number of messages as the trigger for garbage collection, but we soon discovered that different workloads (e.g. small vs large messages) would result in unpredictable and sub-optimal GC schedules.

Before setting `fullsweep_after` to 0, memory usage was 2x higher (400MB vs 200MB) and throughput was 10% lower (18k vs 20k). With this `spawn_opt` setting, the generational collection algorithm is disabled, meaning that all live data is copied at every garbage collection: http://erlang.org/doc/man/erlang.html#spawn_opt-3

The RabbitMQ deployment used for testing this change:

* AWS, c4.2xlarge, bosh-aws-xen-hvm-ubuntu-trusty-go_agent 3421.11
* 3 RabbitMQ nodes running OTP 20.0.1
* 3 durable & auto-delete queues with 3 replicas each
* each queue master was defined on a different RabbitMQ node
* every RabbitMQ node was running 1 queue master & 2 queue slaves
* 1 consumer per queue with QOS 100
* 100 durable messages @ 1KiB each
* `basic.reject` operations

```
| Node | Message throughput | Memory usage  |
| ---- | ------------------ | ------------- |
| rmq0 | 12K - 20K msg/s    | 400 - 900 MB  |
| rmq1 | 12K - 20K msg/s    | 500 - 1000 MB |
| rmq2 | 12K - 20K msg/s    | 500 - 800 MB  |
```

[#148892851]

Signed-off-by: Gerhard Lazu <[email protected]>
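As a side note on the `spawn_opt` part of the message above, this is roughly what starting a gen_server with `fullsweep_after` set to 0 looks like; the module below is a hypothetical sketch for illustration, not the gm.erl code.

```erlang
%% fullsweep_sketch.erl -- illustrative only, not the actual gm.erl
%% code: start a gen_server whose process is spawned with
%% {fullsweep_after, 0}, so every garbage collection is a full sweep.
-module(fullsweep_sketch).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    %% The spawn_opt list is passed through to erlang:spawn_opt,
    %% see http://erlang.org/doc/man/erlang.html#spawn_opt-3
    gen_server:start_link(?MODULE, [],
                          [{spawn_opt, [{fullsweep_after, 0}]}]).

init([]) ->
    {ok, #{}}.

handle_call(_Msg, _From, State) -> {reply, ok, State}.
handle_cast(_Msg, State)        -> {noreply, State}.
```

With `{fullsweep_after, 0}` there is effectively no old heap: every collection copies all live data, which trades some CPU for keeping binary references from accumulating, in line with the 400MB vs 200MB difference reported above.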
Force-pushed from 7f84083 to 7d0e49c
Having run the benchmark for 18h, we are confident that high message redelivery rates for mirrored queues no longer affect cluster stability.

Even though there is a noticeable memory fluctuation after 11h, it is not significant: memory usage grows from 400MB to 1000MB, remains stable for 4h, then drops back to 400MB before another short increase. Eventually, memory returns to 400MB and remains stable for the rest of the 18h. We observe the same behaviour on all nodes. Since memory returns to normal and the cluster is stable throughout, we are happy to just take note of this and not investigate further.
@michaelklishin @dcorbacho ready to review & merge
We've left rmq-148892851 around, in case you need to use it for further benchmarks.
In a large OpenStack deployment, the ovs-agents on 1000 compute nodes will create 10K HA queues. https://groups.google.com/forum/?nomobile=true#!topic/rabbitmq-users/6jGtaHINmNM
@langyxxl please keep discussions to the mailing list. Thank you.
The Erlang VM spends 44.80% in