-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Use system process rss memory when reporting vm memory #1259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Memory reported by erlang:memory(total) is not supposed to be equal to the total size of all pages mapped to the emulator, according to http://erlang.org/doc/man/erlang.html#memory-0 erlang:memory(total) under-reports memory usage by around 20% [#1223] [#145451399]
src/vm_memory_monitor.erl
Outdated
get_ps_memory(); | ||
|
||
get_system_process_resident_memory({unix, linux}) -> | ||
get_ps_memory(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reading from /proc/PID/smaps
is an option, I'll code up an example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://gist.github.com/lukebakken/03d92dcdaecce8bd48d1639f9545295d
The only reason I'm suggesting this is that reading from /proc
should be faster than starting up a sub-process.
Output:
$ ./get-rss.escript $$
RSS: 8588 KiB
$ ps -h -o rss -p $$
8588
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @lukebakken that this is a better solution for Linux. However, note that this is specific to Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we went with ps
mainly because it works the same across both Linux & BSD.
I would prefer to go with ps
to begin with, then improve the approach as/when we learn about its limitations.
Would it be worth doing some benchmarks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK this code is not executed on the hot path. Compatibility with more OSes seems like a bigger gain to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK I didn't know how often this stat is collected 😄
src/vm_memory_monitor.erl
Outdated
get_ps_memory(); | ||
|
||
get_system_process_resident_memory({unix,freebsd}) -> | ||
{error, not_implemented_for_os}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation of get_ps_memory()
will work on FreeBSD and probably other *BSD as well.
The command line you used is portable to POSIX systems by the way. The only non-portable part is the field name (rss
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at various systems' manpages: all common *BSD support the rss
field (DragonFlyBSD, FreeBSD, NetBSD, OpenBSD). Solaris should support it too.
Plan9 and HP-UX do not provide a POSIX ps(1), and I couldn't find manpages for AIX.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/me wants to add Plan9 VMs in CI!
src/vm_memory_monitor.erl
Outdated
{error, not_implemented_for_os}; | ||
|
||
get_system_process_resident_memory({win32,_OSname}) -> | ||
{error, not_implemented_for_os}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I confirm the difference on Windows too: for a simple Erlang shell, Windows reports 19 MiB but erlang:memory(total).
reports 14 MiB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a bad example, please disregard this comment.
@gerhard and @hairyhum, do you have examples where A freshly started RabbitMQ without messages (or an Erlang shell) is a bad example: it doesn't reflect a production situation and is the worst case scenario because the memory allocated by the kernel to start the process (so outside of the Erlang VM scope) is significant compared to the memory allocated by the Erlang VM. |
According to AIX, SunOS and BSD docs, ps command can have rss format, so get_ps_memory should work. We already requiring tasklist for wait command, so we can use it here.
@dumbbell
|
It's actually worse than we thought:
ps_bytes_mem() {
KiB=$(ps -p $(pgrep beam) -o rss=)
echo $((KiB * 1024))
}
erlang_bytes_mem() {
rabbitmqctl eval 'erlang:memory(total).'
}
diff() {
ps=$(ps_bytes_mem)
erl=$(erlang_bytes_mem)
echo $((100 - erl * 100 / ps))
}
echo "
| Nodename | \`ps\` (bytes) | \`erlang\` (bytes) | Off by (%) |
|-|-|-|-|
| $RABBITMQ_NODENAME | $(ps_bytes_mem) | $(erlang_bytes_mem) | $(diff)% |
" @dumbbell can you sanity check this? The difference seems too big to me. |
Account for extra memory that was reported in other_ets [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>
[#145451399] rabbitmq/rabbitmq-server#1259 Signed-off-by: Gerhard Lazu <[email protected]>
@michaelklishin feel free to use this in the release notes: As of this release, RabbitMQ will display the memory usage as reported by the operating system, not by the Erlang VM. According to the official Erlang documentation, total [memory] value is not supposed to be equal to the total size of all pages mapped to the emulator. Also, because of fragmentation and pre[-]reservation of memory areas, the size of the memory segments containing the dynamically allocated memory blocks can be much larger than the total size of the dynamically allocated memory blocks. This means that in some cases, RabbitMQ would significantly under-report the used memory. For example, Erlang VM would report 167MB used memory while the OS would report 420MB used by the Erlang VM process. While it may appear that as of this version RabbitMQ is using more memory, it's just a result of switching from Erlang VM memory reporting to OS memory reporting. Since memory usage is used to trigger memory alarms, publishers will be blocked more often. This change will improve stability since RabbitMQ is less likely to be killed by the OS due to out-of-memory errors. OS memory reporting is now enabled by default. To disable, set |
@michaelklishin @dumbbell ready for QA |
rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>
src/vm_memory_monitor.erl
Outdated
%% erlang:memory(total) under-reports memory usage by around 20% | ||
-spec get_used_memory() -> Bytes :: integer(). | ||
get_used_memory() -> | ||
case application:get_env(rabbit, vm_memory_use_process_rss, false) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this match the value in PROJECT_ENV
in the Makefile
(true
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No because this module is moved to rabbitmq-common in the resolve-deps-circles
branch to solve interdependencies between the broker, the Erlang client and the rabbitmq-common. But the configuration key will remain in the rabbit
application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opened #1264 to add the new setting to the cuttlefish schema.
I have one concern with this approach: RSS reports the physical memory footprint, unlike We see that RSS is often larger compared to It's possible that today, RabbitMQ doesn't behave nicely w.r.t. queue messages paging and blocking producers when the host starts to swap. But if we use RSS, the behavior will change for sure: I'm wondering if it will be better or worse. |
@dumbbell that's a valid concern. With this change it is more likely that publishers will be blocked earlier, so kernel swapping is less likely with this change. Would you agree? If RabbitMQ node process is being swapped out, I'm afraid our throttling mechanisms won't be very effective either way. |
Yes, I agree because in this patch, we are comparing comparable numbers (physical memory footprint vs. physical memory size), unlike before (memory consumption from Erlang applications PoV vs. physical memory size). Thus the new calculation is not just more accurate: it starts to make sense :-) After thinking a bit more about this, the only situations I see where this could be worse would be:
We do not support the second situation anyway because we only read the physical memory size during startup. So if it changes later, we don't know and calculations will be off anyway. Perhaps in the future, RabbitMQ should decide about queue messages paging and blocking producers by looking at the free memory (+ cache), i.e. something close to what we do with disk space. I don't know, just a random thought. |
@dumbbell @lukebakken @hairyhum @gerhard @kjnilsson @dcorbacho @acogoluegnes we already have a feature flag for going back to the old calculation method. I suggest that we rename I'm also curious what @jerrykuch and @carlhoerberg think. |
Good idea about the parameter name! |
Good point, happy to change the flag. What should we do when a user sets an unsupported flag? I would prefer defaulting to |
@gerhard default to |
@gerhard note that with Cuttlefish unsupported values for enum types will terminate boot. So will unreadable certificate and key files and other config validation failures. |
That's good to know, glad that cuttlefish guards against incorrect configuration in 3.7 Will default to |
…tegy [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>
rabbitmq/rabbitmq-server#1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>
@michaelklishin how about now? |
I'm reliably getting 7 failures in
Here's a CT log directory for that suite: |
I can reproduce this on two machines reliably even after |
I cannot reproduce the failures in |
You're too good (sounds better than we're terrible). Thanks for spotting this, will go over it more thoroughly. |
Tests were failing because they required vm_memory_monitor process to be running re #1259 [#145451399] Signed-off-by: Gerhard Lazu <[email protected]>
All failing tests are now passing. Running all broker tests, ok to QA again. |
The tests now pass, we are doing more observatory testing on more OS'es. There's also at least one place in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Windows 10 looks good:
C:\Users\lbakken>tasklist /fi "pid eq 8248" /fo LIST 2>&1
Image Name: erl.exe
PID: 8248
Session Name: Console
Session#: 1
Mem Usage: 65,860 K
(rabbit@DSCH-WIN)2> os:getpid().
"8248"
(rabbit@DSCH-WIN)3> rabbit_vm:total_memory().
67440640
@michaelklishin just PR-ed |
CmdOutput = os:cmd(Cmd), | ||
%% Memory usage is displayed in kilobytes | ||
%% with comma-separated thousands | ||
case re:run(CmdOutput, "Mem Usage:\\s+([0-9,]+)\\s+K", [{capture, all_but_first, list}]) of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The output of tasklist
is localized, thus this function can only work on an English Windows.
Here is a sample output from a French Windows:
C:\Users\Jean-Sébastien>tasklist /fi "pid eq 7116" /fo LIST
Nom de l’image: explorer.exe
PID: 7116
Nom de la session: Console
Numéro de session: 1
Utilisation mémoire: 72 008 Ko
C:\Users\Jean-Sébastien>
Thus:
- the label is translated ("Utilisation mémoire")
- the number is localized (a space to separate thousands).
Perhaps there is a way to set the locale for a particular command, but I don't know how.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it acceptable to fallback to Erlang memory calculation for RabbitMQ installations on French Windows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dumbbell good find. I'll look into this today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to find this but no details about how that can be adapted for one-off commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently chcp
does not change local-specific formatting of numbers/dates/etc. So it's not an option here.
Memory reported by erlang:memory(total) is not supposed to
be equal to the total size of all pages mapped to the emulator,
according to http://erlang.org/doc/man/erlang.html#memory-0
erlang:memory(total) under-reports memory usage by around 20%
Closes #1223.
[#145451399]