Use system process rss memory when reporting vm memory #1259

hairyhum · 2017-06-12T12:07:53Z

Memory reported by erlang:memory(total) is not supposed to
be equal to the total size of all pages mapped to the emulator,
according to http://erlang.org/doc/man/erlang.html#memory-0
erlang:memory(total) under-reports memory usage by around 20%

Closes #1223.
[#145451399]

Memory reported by erlang:memory(total) is not supposed to be equal to the total size of all pages mapped to the emulator, according to http://erlang.org/doc/man/erlang.html#memory-0 erlang:memory(total) under-reports memory usage by around 20% [#1223] [#145451399]

lukebakken · 2017-06-12T13:38:52Z

src/vm_memory_monitor.erl

+    get_ps_memory();
+
+get_system_process_resident_memory({unix, linux}) ->
+    get_ps_memory();


Reading from /proc/PID/smaps is an option, I'll code up an example.

https://gist.github.com/lukebakken/03d92dcdaecce8bd48d1639f9545295d

The only reason I'm suggesting this is that reading from /proc should be faster than starting up a sub-process.

Output:

$ ./get-rss.escript $$ RSS: 8588 KiB $ ps -h -o rss -p $$ 8588

I agree with @lukebakken that this is a better solution for Linux. However, note that this is specific to Linux.

Yes, we went with ps mainly because it works the same across both Linux & BSD.

I would prefer to go with ps to begin with, then improve the approach as/when we learn about its limitations.

Would it be worth doing some benchmarks?

AFAIK this code is not executed on the hot path. Compatibility with more OSes seems like a bigger gain to me.

OK I didn't know how often this stat is collected 😄

dumbbell · 2017-06-12T14:51:02Z

src/vm_memory_monitor.erl

+    get_ps_memory();
+
+get_system_process_resident_memory({unix,freebsd}) ->
+    {error, not_implemented_for_os};


The current implementation of get_ps_memory() will work on FreeBSD and probably other *BSD as well.

The command line you used is portable to POSIX systems by the way. The only non-portable part is the field name (rss).

I looked at various systems' manpages: all common *BSD support the rss field (DragonFlyBSD, FreeBSD, NetBSD, OpenBSD). Solaris should support it too.

Plan9 and HP-UX do not provide a POSIX ps(1), and I couldn't find manpages for AIX.

/me wants to add Plan9 VMs in CI!

dumbbell · 2017-06-12T14:53:57Z

src/vm_memory_monitor.erl

+    {error, not_implemented_for_os};
+
+get_system_process_resident_memory({win32,_OSname}) ->
+    {error, not_implemented_for_os};


I confirm the difference on Windows too: for a simple Erlang shell, Windows reports 19 MiB but erlang:memory(total). reports 14 MiB.

This was a bad example, please disregard this comment.

dumbbell · 2017-06-13T13:22:21Z

@gerhard and @hairyhum, do you have examples where erlang:memory(total) underestimates the memory footprint?

A freshly started RabbitMQ without messages (or an Erlang shell) is a bad example: it doesn't reflect a production situation and is the worst case scenario because the memory allocated by the kernel to start the process (so outside of the Erlang VM scope) is significant compared to the memory allocated by the Erlang VM.

According to AIX, SunOS and BSD docs, ps command can have rss format, so get_ps_memory should work. We already requiring tasklist for wait command, so we can use it here.

hairyhum · 2017-06-13T14:51:58Z

@dumbbell
Log in to long-running-environment

ps -p `pgrep beam` -o rss
#   RSS
# 450784
rabbitmqctl eval 'erlang:memory(total).'
# 170767520

gerhard · 2017-06-13T15:14:48Z

It's actually worse than we thought:

Nodename	`ps` (bytes)	`erlang` (bytes)	Off by (%)
rabbit@rmq0-rmq-36-20170608	441290752	175692152	61%
rabbit@rmq1-rmq-36-20170608	663388160	327849080	51%
rabbit@rmq2-rmq-36-20170608	335593472	165004064	51%

ps_bytes_mem() {
  KiB=$(ps -p $(pgrep beam) -o rss=)
  echo $((KiB * 1024))
}

erlang_bytes_mem() {
  rabbitmqctl eval 'erlang:memory(total).'
}

diff() {
  ps=$(ps_bytes_mem)
  erl=$(erlang_bytes_mem)
  echo $((100 - erl * 100 / ps))
}

echo "
| Nodename | \`ps\` (bytes) | \`erlang\` (bytes) | Off by (%) |
|-|-|-|-|
| $RABBITMQ_NODENAME | $(ps_bytes_mem) | $(erlang_bytes_mem) | $(diff)% |
"

@dumbbell can you sanity check this? The difference seems too big to me.

Account for extra memory that was reported in other_ets [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

[#145451399] rabbitmq/rabbitmq-server#1259 Signed-off-by: Gerhard Lazu <[email protected]>

gerhard · 2017-06-14T13:18:05Z

@michaelklishin feel free to use this in the release notes:

As of this release, RabbitMQ will display the memory usage as reported by the operating system, not by the Erlang VM. According to the official Erlang documentation, total [memory] value is not supposed to be equal to the total size of all pages mapped to the emulator. Also, because of fragmentation and pre[-]reservation of memory areas, the size of the memory segments containing the dynamically allocated memory blocks can be much larger than the total size of the dynamically allocated memory blocks. This means that in some cases, RabbitMQ would significantly under-report the used memory. For example, Erlang VM would report 167MB used memory while the OS would report 420MB used by the Erlang VM process.

While it may appear that as of this version RabbitMQ is using more memory, it's just a result of switching from Erlang VM memory reporting to OS memory reporting.

Since memory usage is used to trigger memory alarms, publishers will be blocked more often. This change will improve stability since RabbitMQ is less likely to be killed by the OS due to out-of-memory errors.

OS memory reporting is now enabled by default. To disable, set {vm_memory_use_process_rss, false} in the rabbit config.

gerhard · 2017-06-14T13:21:08Z

@michaelklishin @dumbbell ready for QA

rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

lukebakken · 2017-06-14T13:46:42Z

src/vm_memory_monitor.erl

+%% erlang:memory(total) under-reports memory usage by around 20%
+-spec get_used_memory() -> Bytes :: integer().
+get_used_memory() ->
+    case application:get_env(rabbit, vm_memory_use_process_rss, false) of


Should this match the value in PROJECT_ENV in the Makefile (true)?

No because this module is moved to rabbitmq-common in the resolve-deps-circles branch to solve interdependencies between the broker, the Erlang client and the rabbitmq-common. But the configuration key will remain in the rabbit application.

lukebakken

I opened #1264 to add the new setting to the cuttlefish schema.

dumbbell · 2017-06-14T14:53:39Z

I have one concern with this approach: RSS reports the physical memory footprint, unlike erlang:memory(total) which reports the used memory.

We see that RSS is often larger compared to erlang:memory(total), probably because of pre-allocations done by the Erlang VM in order to improve performance. However, if memory pages used by the Erlang VM are paged out (by the kernel; I'm not talking about our internal paging of queue messages) for whatever reason, the RSS value will remain low while erlang:memory(total) could grow beyond the actual physical memory size.

It's possible that today, RabbitMQ doesn't behave nicely w.r.t. queue messages paging and blocking producers when the host starts to swap. But if we use RSS, the behavior will change for sure: I'm wondering if it will be better or worse.

michaelklishin · 2017-06-14T14:56:41Z

@dumbbell that's a valid concern. With this change it is more likely that publishers will be blocked earlier, so kernel swapping is less likely with this change. Would you agree?

If RabbitMQ node process is being swapped out, I'm afraid our throttling mechanisms won't be very effective either way.

dumbbell · 2017-06-14T15:10:59Z

Yes, I agree because in this patch, we are comparing comparable numbers (physical memory footprint vs. physical memory size), unlike before (memory consumption from Erlang applications PoV vs. physical memory size). Thus the new calculation is not just more accurate: it starts to make sense :-)

After thinking a bit more about this, the only situations I see where this could be worse would be:

Another process is eating all available memory.
Memory ballooning is being used and the "physical" memory size was reduced.

We do not support the second situation anyway because we only read the physical memory size during startup. So if it changes later, we don't know and calculations will be off anyway.

Perhaps in the future, RabbitMQ should decide about queue messages paging and blocking producers by looking at the free memory (+ cache), i.e. something close to what we do with disk space. I don't know, just a random thought.

michaelklishin · 2017-06-14T15:26:39Z

@dumbbell @lukebakken @hairyhum @gerhard @kjnilsson @dcorbacho @acogoluegnes we already have a feature flag for going back to the old calculation method.

I suggest that we rename rabbit.vm_memory_use_process_rss (a boolean) to rabbit.vm_memory_calculation_strategy (which can be rss, erlang, and who knows what else in the future, defaulting to rss). This will leave the door open for N more options in the future without having to add N more feature flags.

I'm also curious what @jerrykuch and @carlhoerberg think.

dumbbell · 2017-06-14T15:39:59Z

Good idea about the parameter name!

gerhard · 2017-06-14T15:59:28Z

Good point, happy to change the flag.

What should we do when a user sets an unsupported flag? I would prefer defaulting to rss & logging a warning over crashing.

michaelklishin · 2017-06-14T16:00:20Z

@gerhard default to rss and log a warning.

michaelklishin · 2017-06-14T16:01:50Z

@gerhard note that with Cuttlefish unsupported values for enum types will terminate boot. So will unreadable certificate and key files and other config validation failures.

gerhard · 2017-06-14T16:11:38Z

That's good to know, glad that cuttlefish guards against incorrect configuration in 3.7

Will default to rss & log a warning in 3.6

…tegy [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

gerhard · 2017-06-14T16:57:11Z

@michaelklishin how about now?

michaelklishin · 2017-06-14T20:01:59Z

I'm reliably getting 7 failures in unit_inbroker_non_parallel_SUITE:

unit_inbroker_non_parallel_SUITE > non_parallel_tests > app_management
    #1. {error,
            {{badmatch,
                 {badrpc,
                     {'EXIT',
                         {noproc,
                             {gen_server,call,
                                 [vm_memory_monitor,get_memory_limit,
                                  infinity]}}}}},
             [{unit_inbroker_non_parallel_SUITE,app_management1,1,
                  [{file,"test/unit_inbroker_non_parallel_SUITE.erl"},
                   {line,109}]},
              {rpc,'-handle_call_call/6-fun-0-',5,
                  [{file,"rpc.erl"},{line,197}]}]}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > channel_statistics
    #1. {error,
            {badarg,
                [{ets,lookup,
                     [rabbit_exchange,
                      {resource,<<"/">>,exchange,<<"amq.rabbitmq.trace">>}],
                     []},
                 {rabbit_misc,dirty_read,1,
                     [{file,"src/rabbit_misc.erl"},{line,394}]},
                 {rabbit_trace,init,1,
                     [{file,"src/rabbit_trace.erl"},{line,48}]},
                 {rabbit_channel,init,1,
                     [{file,"src/rabbit_channel.erl"},{line,386}]},
                 {gen_server2,init_it,6,
                     [{file,"src/gen_server2.erl"},{line,554}]},
                 {proc_lib,init_p_do_apply,3,
                     [{file,"proc_lib.erl"},{line,247}]}]}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > disk_monitor
    #1. {error,{noproc,{gen_server,call,
                                   [rabbit_sup,
                                    {terminate_child,rabbit_disk_monitor_sup},
                                    infinity]}}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > disk_monitor_enable
    #1. {error,{noproc,{gen_server,call,
                                   [rabbit_sup,
                                    {terminate_child,rabbit_disk_monitor_sup},
                                    infinity]}}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > file_handle_cache
    #1. {error,{noproc,{gen_server2,call,
                                    [file_handle_cache,get_limit,infinity]}}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > head_message_timestamp_statistics
    #1. {error,
            {{badmatch,
                 {error,
                     {badarg,
                         [{ets,lookup,
                              [rabbit_exchange,
                               {resource,<<"/">>,exchange,
                                   <<"amq.rabbitmq.trace">>}],
                              []},
                          {rabbit_misc,dirty_read,1,
                              [{file,"src/rabbit_misc.erl"},{line,394}]},
                          {rabbit_trace,init,1,
                              [{file,"src/rabbit_trace.erl"},{line,48}]},
                          {rabbit_channel,init,1,
                              [{file,"src/rabbit_channel.erl"},{line,386}]},
                          {gen_server2,init_it,6,
                              [{file,"src/gen_server2.erl"},{line,554}]},
                          {proc_lib,init_p_do_apply,3,
                              [{file,"proc_lib.erl"},{line,247}]}]}}},
             [{rabbit_ct_broker_helpers,test_channel,0,
                  [{file,"src/rabbit_ct_broker_helpers.erl"},{line,1191}]},
              {unit_inbroker_non_parallel_SUITE,test_spawn,0,
                  [{file,"test/unit_inbroker_non_parallel_SUITE.erl"},
                   {line,709}]},
              {unit_inbroker_non_parallel_SUITE,head_message_timestamp1,1,
                  [{file,"test/unit_inbroker_non_parallel_SUITE.erl"},
                   {line,651}]},
              {rpc,'-handle_call_call/6-fun-0-',5,
                  [{file,"rpc.erl"},{line,197}]}]}}

unit_inbroker_non_parallel_SUITE > non_parallel_tests > log_management
    #1. {error,
            {badarg,
                [{erlang,process_info,[undefined,group_leader],[]},
                 {unit_inbroker_non_parallel_SUITE,override_group_leader,0,
                     [{file,"test/unit_inbroker_non_parallel_SUITE.erl"},
                      {line,386}]},
                 {unit_inbroker_non_parallel_SUITE,log_management1,1,
                     [{file,"test/unit_inbroker_non_parallel_SUITE.erl"},
                      {line,187}]},
                 {rpc,'-handle_call_call/6-fun-0-',5,
                     [{file,"rpc.erl"},{line,197}]}]}}

unit_inbroker_parallel_SUITE > parallel_tests > confirms
    #1. {error,killed}

Here's a CT log directory for that suite:

deps.rabbit.unit_inbroker_non_parallel_SUITE.logs.1.zip

michaelklishin · 2017-06-14T21:34:30Z

I can reproduce this on two machines reliably even after gmake distclean.

michaelklishin · 2017-06-14T22:40:05Z

I cannot reproduce the failures in stable.

gerhard · 2017-06-15T07:50:14Z

You're too good (sounds better than we're terrible). Thanks for spotting this, will go over it more thoroughly.

Tests were failing because they required vm_memory_monitor process to be running re #1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

gerhard · 2017-06-15T09:33:53Z

All failing tests are now passing. Running all broker tests, ok to QA again.

michaelklishin · 2017-06-16T16:33:07Z

The tests now pass, we are doing more observatory testing on more OS'es. There's also at least one place in management-agent that needs updating.

lukebakken

Windows 10 looks good:

C:\Users\lbakken>tasklist /fi "pid eq 8248" /fo LIST 2>&1

Image Name:   erl.exe
PID:          8248
Session Name: Console
Session#:     1
Mem Usage:    65,860 K

(rabbit@DSCH-WIN)2> os:getpid().
"8248"
(rabbit@DSCH-WIN)3> rabbit_vm:total_memory().
67440640

gerhard · 2017-06-19T08:29:21Z

@michaelklishin just PR-ed management-agent, anything else that you can think of?

dumbbell · 2017-06-21T06:47:51Z

src/rabbit_vm.erl

+    CmdOutput = os:cmd(Cmd),
+    %% Memory usage is displayed in kilobytes
+    %% with comma-separated thousands
+    case re:run(CmdOutput, "Mem Usage:\\s+([0-9,]+)\\s+K", [{capture, all_but_first, list}]) of


The output of tasklist is localized, thus this function can only work on an English Windows.

Here is a sample output from a French Windows:

C:\Users\Jean-Sébastien>tasklist /fi "pid eq 7116" /fo LIST Nom de l’image: explorer.exe PID: 7116 Nom de la session: Console Numéro de session: 1 Utilisation mémoire: 72 008 Ko C:\Users\Jean-Sébastien>

Thus:

the label is translated ("Utilisation mémoire")

the number is localized (a space to separate thousands).

Perhaps there is a way to set the locale for a particular command, but I don't know how.

Is it acceptable to fallback to Erlang memory calculation for RabbitMQ installations on French Windows?

@dumbbell good find. I'll look into this today.

I managed to find this but no details about how that can be adapted for one-off commands.

Apparently chcp does not change local-specific formatting of numbers/dates/etc. So it's not an option here.

lukebakken reviewed Jun 12, 2017

View reviewed changes

dumbbell reviewed Jun 12, 2017

View reviewed changes

Current process system memory functions for various OSes.

6a84ef4

According to AIX, SunOS and BSD docs, ps command can have rss format, so get_ps_memory should work. We already requiring tasklist for wait command, so we can use it here.

gerhard mentioned this pull request Jun 14, 2017

Add endpoints for memory: absolute and relative rabbitmq/rabbitmq-management#181

Merged

Use OS memory total when reporting memory usage

b6efa2d

Account for extra memory that was reported in other_ets [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

gerhard pushed a commit to rabbitmq/rabbitmq-management-agent that referenced this pull request Jun 14, 2017

Use vm_memory_monitor when reporting total mem_used

a5c9b40

[#145451399] rabbitmq/rabbitmq-server#1259 Signed-off-by: Gerhard Lazu <[email protected]>

gerhard pushed a commit to rabbitmq/rabbitmq-website that referenced this pull request Jun 14, 2017

Add new vm_memory_use_process_rss config value

6c88dc9

rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

gerhard mentioned this pull request Jun 14, 2017

Add new vm_memory_use_process_rss config value rabbitmq/rabbitmq-website#395

Merged

lukebakken reviewed Jun 14, 2017

View reviewed changes

lukebakken mentioned this pull request Jun 14, 2017

Add vm_memory_use_process_rss to cuttlefish schema #1264

Closed

lukebakken reviewed Jun 14, 2017

View reviewed changes

lukebakken mentioned this pull request Jun 14, 2017

Add vm_memory_calculation_strategy to cuttlefish schema #1265

Merged

Change vm_memory_use_process_rss config to vm_memory_calculation_stra…

bc0309d

…tegy [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

gerhard pushed a commit to rabbitmq/rabbitmq-website that referenced this pull request Jun 14, 2017

Change vm_memory_use_process_rss to vm_memory_calculation_strategy

970bca7

rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

Move function that reports total used memory

a0151e4

Tests were failing because they required vm_memory_monitor process to be running re #1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>

lukebakken reviewed Jun 16, 2017

View reviewed changes

gerhard mentioned this pull request Jun 19, 2017

Use total system process rss memory when reporting vm memory rabbitmq/rabbitmq-management-agent#48

Merged

Merge branch 'stable' into rabbitmq-server-1223

bbc69fe

michaelklishin merged commit d817366 into stable Jun 19, 2017

lukebakken deleted the rabbitmq-server-1223 branch June 20, 2017 12:49

dumbbell reviewed Jun 21, 2017

View reviewed changes

lukebakken mentioned this pull request Jun 21, 2017

Use wmic instead of tasklist to get Windows process memory #1270

Merged

Use system process rss memory when reporting vm memory #1259

Use system process rss memory when reporting vm memory #1259

Uh oh!

Conversation

hairyhum commented Jun 12, 2017 • edited by michaelklishin Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lukebakken Jun 12, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lukebakken Jun 12, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dumbbell Jun 12, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dumbbell commented Jun 13, 2017

Uh oh!

hairyhum commented Jun 13, 2017

Uh oh!

gerhard commented Jun 13, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gerhard commented Jun 14, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gerhard commented Jun 14, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

lukebakken left a comment

Choose a reason for hiding this comment

Uh oh!

dumbbell commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

dumbbell commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

dumbbell commented Jun 14, 2017

Uh oh!

gerhard commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

gerhard commented Jun 14, 2017

Uh oh!

gerhard commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

michaelklishin commented Jun 14, 2017

Uh oh!

hairyhum commented Jun 12, 2017 •

edited by michaelklishin

Loading

lukebakken Jun 12, 2017 •

edited

Loading

lukebakken Jun 12, 2017 •

edited

Loading

dumbbell Jun 12, 2017 •

edited

Loading

gerhard commented Jun 13, 2017 •

edited

Loading

gerhard commented Jun 14, 2017 •

edited

Loading

lukebakken left a comment •

edited

Loading