Skip to content

CDRIVER-4627 Avoid unexpected crash with SIGBUS when trying to allocate a /dev/shm shared memory but there is not enough memory #1244

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 9, 2023

Conversation

Romain-Geissler-1A
Copy link
Contributor

Hi,

This patch tries to fix an unexpected crash we have been witnessing in one of our production environment as /dev/shm was somehow (almost) full.

Indeed ftruncate doesn't ensure memory is properly allocated, it only changes the file size from the VFS point of view, but doesn't actually allocate any memory. So ftruncate might work despite we have no memory left, and later when trying to zero-memset the mmapped buffer, we might actually get a SIGBUS signal crashing the whole process.

Instead, make sure we can allocate the whole shared memory using posix_fallocate and gracefully handle allocation problems, without crashing.

The chromium project faced a similar issue in the past: https://bugs.chromium.org/p/chromium/issues/detail?id=951431

As a complementary, here is how to reproduce this rather simply if you have an OCI engine like podman. I started a fedoracontainer with podman and force using a very small /dev/shm, large enough to contain a single page (4k on x86_64):

podman run -t -i --rm --shm-size=4096 fedora

Inside this container, we can install basic tools like this: dnf install -y vim gcc gdb and we can check the /dev/shm size:

[root@b08c0508c3f5 /]# df -h /dev/shm/
Filesystem      Size  Used Avail Use% Mounted on
shm             4.0K     0  4.0K   0% /dev/shm

Now compiling this simple test program:

[root@b08c0508c3f5 /]# cat /shared-volume/test.c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/mman.h>  
#include <sys/shm.h>   

int main(int argc, char* argv[])
{
        assert(argc == 3);

        int use_fallocate = atoi(argv[1]);
        size_t size = atoll(argv[2]);

        int fd = shm_open("/shared-memory", O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
        assert(fd != -1);
        printf("shm_open: ok\n");
        
        if (use_fallocate)
        {
                printf("Will posix_fallocate %llu bytes", size);
                assert(posix_fallocate(fd, 0, size) == 0);
                printf("posix_fallocate: ok\n");
        }       
        else
        {
                printf("Will ftruncate %llu bytes", size);
                assert(ftruncate(fd, size) == 0);
                printf("ftruncate: ok\n");
        }       
        
        void* mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        assert(mem != MAP_FAILED);
        printf("mmap: ok\n");
        
        // Implement a non optimized memset to see at which offset we get SIGBUS
        char* char_mem = (char*)mem;
        for (size_t i = 0; i < size; ++i)
        {
                char_mem[i] = '\0';
        }       
        printf("memset: ok\n");
}


[root@b08c0508c3f5 /]# gcc -o /shared-volume/test -g /shared-volume/test.c

We can try to run it with and without fallocate, for 4096 (always work) and 4097 (fails) bytes:

[root@b08c0508c3f5 /]# echo "With ftruncate" && rm -f /dev/shm/shared-memory && /shared-volume/test 0 4096
With ftruncate
shm_open: ok
Will ftruncate 4096 bytesftruncate: ok
mmap: ok
memset: ok


[root@b08c0508c3f5 /]# echo "With posix_fallocate" && rm -f /dev/shm/shared-memory && /shared-volume/test 1 4096
With posix_fallocate
shm_open: ok
Will posix_fallocate 4096 bytesposix_fallocate: ok
mmap: ok
memset: ok


[root@b08c0508c3f5 /]# echo "With ftruncate" && rm -f /dev/shm/shared-memory && /shared-volume/test 0 4097
With ftruncate
shm_open: ok
Will ftruncate 4097 bytesftruncate: ok
mmap: ok
Bus error (core dumped)


[root@b08c0508c3f5 /]# echo "With posix_fallocate" && rm -f /dev/shm/shared-memory && /shared-volume/test 1 4097
With posix_fallocate
shm_open: ok
test: /shared-volume/test.c:26: main: Assertion `posix_fallocate(fd, 0, size) == 0' failed.
Aborted (core dumped)

The first core indeed cores when trying to memset the allocated memory to zero, checking with gdb:

Core was generated by `/shared-volume/test 0 4097'.
Program terminated with signal SIGBUS, Bus error.
#0  0x0000000000401359 in main (argc=3, argv=0x7ffe3680eb88) at /shared-volume/test.c:44
44                      char_mem[i] = '\0';
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.36-9.fc37.x86_64
(gdb) p i
$1 = 4096

Cheers,
Romain

@kevinAlbs kevinAlbs self-requested a review April 24, 2023 19:22
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the thorough analysis and the fix. Left a suggested edit to fix build on macOS.

Also consider: the Performance counters are an optional feature and can be disabled with the cmake option -DENABLE_SHM_COUNTERS=OFF.

@@ -184,10 +184,10 @@ mongoc_counters_alloc (size_t size)
/*
* NOTE:
*
* ftruncate() will cause reads to be zero. Therefore, we don't need to
* posix_fallocate() will cause reads to be zero. Therefore, we don't need to
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

posix_fallocate() will cause reads to be zero.

Is (or was) this note correct? The call to memset (mem, 0, size); below zeros the memory, suggesting writing zeros may be required.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is correct (it's made explicit in the man page of ftruncate, it's not so clear in the man page of posix_fallocate but the caveats about the emulation case written by glibc folks mentions explicitly this feature of zeroing-out the "new" memory region allocated by this).

Yet indeed, the rest of the code seems to memset things to zero anyway. So, I don't know if you just want to drop this comment entirely ? Or remove the memset ?

Copy link
Collaborator

@kevinAlbs kevinAlbs May 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not so clear in the man page of posix_fallocate but the caveats about the emulation case written by glibc folks mentions explicitly this feature of zeroing-out the "new" memory region allocated by this

I suggest removing the comment. If this is not documented behavior in the POSIX standard, I err towards keeping the memset.

@kevinAlbs kevinAlbs changed the title Avoid unexpected crash with SIGBUS when trying to allocate a /dev/shm shared memory but there is not enough memory CDRIVER-4627 Avoid unexpected crash with SIGBUS when trying to allocate a /dev/shm shared memory but there is not enough memory Apr 26, 2023
@kevinAlbs kevinAlbs self-requested a review May 9, 2023 13:26
Copy link
Collaborator

@kevinAlbs kevinAlbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the fix. Changes look good with the comment removal.

… shared memory but there is not enough memory.

Indeed ftruncate doesn't ensure memory is properly allocated, it ony
changes the file size from the VFS point of view, but doesn't actually
allocate any memory. So ftruncate might work despite we have no memory
left, and later when trying to zero-memset the mmapped buffer, we might
actually get a SIGBUS signal crashing the whole process.

Instead, make sure we can allocate the whole shared memory using
posix_fallocate and gracefully handle allocation problems, without
crashing.

The chromium project faced a similar issue in the past:
https://bugs.chromium.org/p/chromium/issues/detail?id=951431
@kevinAlbs kevinAlbs merged commit d83922b into mongodb:master May 9, 2023
kevinAlbs pushed a commit that referenced this pull request May 9, 2023
Avoids an unexpected crash with SIGBUS when trying to allocate a /dev/shm shared memory but there is not enough memory. 

Indeed ftruncate doesn't ensure memory is properly allocated, it ony
changes the file size from the VFS point of view, but doesn't actually
allocate any memory. So ftruncate might work despite we have no memory
left, and later when trying to zero-memset the mmapped buffer, we might
actually get a SIGBUS signal crashing the whole process.

Instead, make sure we can allocate the whole shared memory using
posix_fallocate and gracefully handle allocation problems, without
crashing.

The chromium project faced a similar issue in the past:
https://bugs.chromium.org/p/chromium/issues/detail?id=951431
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants