-
Notifications
You must be signed in to change notification settings - Fork 9
Segfault if used with tarantool debug build #59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
After a brief research of the bug, it seems that the problem is in different versions of the “small” library used in the tarantool and memcached module.
|
As I understood the problem is this:
Memcached uses the next functions: The reason of the bug is the next: when initializing a struct mempool in mempool_create(memcached version) mempool_create_with_order is called(tarantool version with other struct mempool size), where some fields of the struct mempool are assigned. After that, the corrupted struct mempool will be used in mempool_free(memcached version). And, "BAM!", segfault when trying to use a NULL pointer. |
Hot fix: |
I vote to workaround this for now for tarantool >= 1.10 and leave the issue open to investigate further. |
The "small" library version has been updated. Part of #59
Crutch. If SMALL_EMBEDDED is used, "small" will try to compile with the default flags, but it can't be compiled with std=c89. Look like a "small" library bug. Part of #59
NB: Would some ABI compliance checker (say, this) help us with detecting ABI incompatibility in small? |
If I understand correctly it's not solve the problem, because now we have a situation when the memcached version implicitly depends on the tarantool "small" version. And now it should be synchronously update or rollback. |
The main reason of update: When the memcached module is linked with tarantool, it uses some of the "small" functions from tarantool (which were exported by tarantool, for example: mempool_create_with_order()) and some from the "small" with wich it was built (for example: mempool_create(), mempool_free()). This can lead to undefined behavior (segfault for example). To avoid the problem, will use the same (close) version of the "small" as tarantool. Part of #59
If SMALL_EMBEDDED is used, "small" will try to compile with the default flags, but it can't be compiled with std=c89. Look like a "small" library bug. So, we will use std=c99 globally as workaround. Part of #59
If SMALL_EMBEDDED is used, "small" will try to compile with the default flags, but it can't be compiled with std=c89. Look like a "small" library bug. So, we will use std=c99 globally as workaround. See tarantool/small#25 Part of #59 @Totktonada: added the comment.
The main reason of update: When the memcached module is linked with tarantool, it uses some of the "small" functions from tarantool (which were exported by tarantool, for example: mempool_create_with_order()) and some from the "small" with wich it was built (for example: mempool_create(), mempool_free()). This can lead to undefined behavior (segfault for example). To avoid the problem, will use the same (close) version of the "small" as tarantool. Part of #59
If SMALL_EMBEDDED is used, "small" will try to compile with the default flags, but it can't be compiled with std=c89. Look like a "small" library bug. So, we will use std=c99 globally as workaround. See tarantool/small#25 Part of #59 @Totktonada: added the comment.
Worked around in PR #60. |
But a proper fix should be implemented in a long term. |
Curious fact: everything works until 2.5.0-42-g03790ac55 ('cmake: remove dynamic-list linker option'), which is fix of #2971 ('Unhide back all symbols in 1.7+'). (I verified it.) |
Yep. If tarantool will hides symbols, the problem go away). Memcached will use only its own version of small. |
Background informationWe're going to decouple tarantool's runtime arena and a slab cache upward the And now I want to return here to make sure that all problems around The key fact here is that we have certain range of tarantool versions, where Symbols was visible by default since 2.5.0-42-g03790ac55 ('cmake: remove
memcached's small version is 1.1-63-g3df5050. Tarantool's small versions:
Note: small was bumped in 2.5.0-42-g03790ac55 by a mistake and reverted back AnalysisHere I want to do an attepmt to estimate small's changes in terms of ABI
So the changes look safe in terms of ABI compatibility (taking into account the ExperimentBuild Tarantool in Debug configuration: $ git clean -xffd; git submodule foreach --recursive 'git clean -xffd'
$ git checkout -- .
$ git checkout $TAG
$ rm -r third_party test-run
$ git checkout -- third_party
$ git submodule update --init --recursive
$ <..apply tweaks (see below)..>
$ cmake . -DCMAKE_BUILD_TYPE=Debug -DENABLE_BACKTRACE=ON -DENABLE_DIST=ON -DENABLE_FEEDBACK_DAEMON=OFF -DENABLE_BUNDLED_LIBCURL=OFF && make -j
$ make DESTDIR=../$(git describe) install Build Tarantool in RelWithDebInfo configuration: <..same as above..>
$ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo -DENABLE_BACKTRACE=ON -DENABLE_DIST=ON -DENABLE_FEEDBACK_DAEMON=OFF -DENABLE_BUNDLED_LIBCURL=OFF && make -j
$ make DESTDIR=../$(git describe)-release install Build tweaks
Build memcached in Debug configuration: $ cmake . -DCMAKE_BUILD_TYPE=Debug && make
<..ensure that the following line is present..>
-- Could NOT find Small (missing: SMALL_INCLUDE_DIR SMALL_LIBRARY)
<...> Build memcached in RelWithDebInfo configuration: $ cmake . -DCMAKE_BUILD_TYPE=RelWithDebInfo && make
<..ensure that the following line is present..>
-- Could NOT find Small (missing: SMALL_INCLUDE_DIR SMALL_LIBRARY)
<...> Run memcached tests (against Debug tarantool): $ (export PATH="$(realpath ../${TAG}/usr/local/bin):${PATH}"; make test-memcached && make test-memcached-bench)
# or for all tarantool builds at once
$ for TAG in 2.5.{1..3} 2.6.{1..3} 2.7.{1..3} 2.8.{1..3} 2.8.3-63-gb96997083; do (export PATH="$(realpath ../${TAG}/usr/local/bin):${PATH}"; make test-memcached && make test-memcached-bench); done Run memcached tests (against RelWithDebInfo tarantool): $ (export PATH="$(realpath ../${TAG}-release/usr/local/bin):${PATH}"; make test-memcached && make test-memcached-bench)
# or for all tarantool builds at once
$ for TAG in 2.5.{1..3} 2.6.{1..3} 2.7.{1..3} 2.8.{1..3} 2.8.3-63-gb96997083; do (export PATH="$(realpath ../${TAG}-release/usr/local/bin):${PATH}"; make test-memcached && make test-memcached-bench); done Note: I have problems with libmemcached build on Gentoo (it seems, it requires memcached master: 1.0.1-40-g5aaac83. Tarantool Debug + memcached Debug:
Tarantool Debug + memcached RelWithDebInfo:
Tarantool RelWithDebInfo + memcached Debug:
Tarantool RelWithDebInfo + memcached RelWithDebInfo:
Tests from PR #102 to run on memcached master are
|
Just copy and change includes. It is necessary to change function names in a next commit. Part of #59
See [1] for the problem root and [2] for analysis. In brief: `ibuf_*()` symbols may clash with the same named symbols in tarantool, so it worth to rename them to avoid effects of different binary layout: assertion fails and memory corruptions. For example, `ibuf_reserve_slow()` calls `slab_get()` and `slab_put()`. If something will be changed in tarantool's slab cache, we can meet problems in the module. This patch just renames the `ibuf_*()` functions. It is the most simple and durable workaround of the problem. It is likely not necessary to rename inline functions, but since I'm not dead sure, I renamed them all. It is the last known problem from #59. All others are confirmed as safe to ignore if we'll keep current libsmall version in the module (see the analysis in [2]). [1]: tarantool/tarantool#6873 [2]: #59 (comment) Fixes #59
I wrote it mainly to clarify when the hack could be removed. Follows up #59
See [1] for the problem root and [2] for analysis. In brief: `ibuf_*()` symbols may clash with the same named symbols in tarantool, so it worth to rename them to avoid effects of different binary layouts: assertion fails and memory corruptions. For example, `ibuf_reserve_slow()` calls `slab_get()` and `slab_put()`. If something will be changed in tarantool's slab cache, we can meet problems in the module. This patch just renames the `ibuf_*()` functions. It is the most simple and durable workaround of the problem. It is likely not necessary to rename inline functions, but since I'm not dead sure, I renamed them all. It is the last known problem from #59. All others are confirmed as safe to ignore if we'll keep current libsmall version in the module (see the analysis in [2]). [1]: tarantool/tarantool#6873 [2]: #59 (comment) Fixes #59
I wrote it mainly to clarify when the hack could be removed. Follows up #59
Just copy and change includes. It is necessary to change function names in a next commit. The source is the small version, which is bundled into the module: 1.1-63-g3df5050. Part of #59
See [1] for the problem root and [2] for analysis. In brief: `ibuf_*()` symbols may clash with the same named symbols in tarantool, so it worth to rename them to avoid effects of different binary layouts: assertion fails and memory corruptions. For example, `ibuf_reserve_slow()` calls `slab_get()` and `slab_put()`. If something will be changed in tarantool's slab cache, we can meet problems in the module. This patch just renames the `ibuf_*()` functions. It is the most simple and durable workaround of the problem. It is likely not necessary to rename inline functions, but since I'm not dead sure, I renamed them all. It is the last known problem from #59. All others are confirmed as safe to ignore if we'll keep current libsmall version in the module (see the analysis in [2]). [1]: tarantool/tarantool#6873 [2]: #59 (comment) Fixes #59
I wrote it mainly to clarify when the hack could be removed. Follows up #59
Historically memcached module uses slab cache created by Tarantool. Fibers get access to it via call of function cord_slab_cache() that is a part of Tarantool C API. The problem is happen when memcached and Tarantool use different versions of Small library where ABI is broken. Issues #59 and #96 are examples of such problems. We decided use independent slab arena in memcached module to avoid such problems in future and this patch implements it. Below I'll describe what is important in switching to own arena. memcached module uses different types of Small's allocators (ibuf and obuf used for every network connection, one slab cache used per fiber). There are two important things with used Small's allocators: 1. some allocators depends on other and sequence of allocations should be correct. README in small source code repository has a clear description of relationships between allocators [1]. 2. some allocators are thread-safe and others not. slab arena is shared between all memcached instances, when slab cache allocated on top of slab arena is not thread-safe and must be used in scope of fiber. memcached performs memory allocations and deallocations on different stages and I would list all these stages explicitly: - memcached module initialization - memcached module deinitialization - memcached instance gc - memcached instance initialization - memcached instance start - memcached instance stop - memcached instance deinitialization Most part of these cases covered in tests test/common/instance_api.test.lua and test/common/module_api.test.lua that were added in previous commits. The exception here is gc that was checked manually. In tests function is_port_open() has been replaced to memcached minimalistic client that do set value, get value with the same key and compare both values. This check is more rigorous than check port openess. 1. https://github.com/tarantool/small Fixes #96
Historically memcached module uses slab cache created by Tarantool. Fibers get access to it via call of function cord_slab_cache() that is a part of Tarantool C API. The problem is happen when memcached and Tarantool use different versions of Small library where ABI is broken. Issues #59 and #96 are examples of such problems. We decided use independent slab arena in memcached module to avoid such problems in future and this patch implements it. Below I'll describe what is important in switching to own arena. memcached module uses different types of Small's allocators (ibuf and obuf used for every network connection, one slab cache used per fiber). There are two important things with used Small's allocators: 1. some allocators depends on other and sequence of allocations should be correct. README in small source code repository has a clear description of relationships between allocators [1]. 2. some allocators are thread-safe and others not. slab arena is shared between all memcached instances, when slab cache allocated on top of slab arena is not thread-safe and must be used in scope of fiber. memcached performs memory allocations and deallocations on different stages and I would list all these stages explicitly: - memcached module initialization - memcached module deinitialization - memcached instance gc - memcached instance initialization - memcached instance start - memcached instance stop - memcached instance deinitialization Most part of these cases covered in tests test/common/instance_api.test.lua and test/common/module_api.test.lua that were added in previous commits. The exception here is gc that was checked manually. In tests function is_port_open() has been replaced to memcached minimalistic client that do set value, get value with the same key and compare both values. This check is more rigorous than check port openess. 1. https://github.com/tarantool/small Fixes #96
Historically memcached module uses slab cache created by Tarantool. Fibers get access to it via call of function cord_slab_cache() that is a part of Tarantool C API. The problem is happen when memcached and Tarantool use different versions of Small library where ABI is broken. Issues #59 and #96 are examples of such problems. We decided use independent slab arena in memcached module to avoid such problems in future and this patch implements it. Below I'll describe what is important in switching to own arena. memcached module uses different types of Small's allocators (ibuf and obuf used for every network connection, one slab cache used per fiber). There are two important things with used Small's allocators: 1. some allocators depends on other and sequence of allocations should be correct. README in small source code repository has a clear description of relationships between allocators [1]. 2. some allocators are thread-safe and others not. slab arena is shared between all memcached instances, when slab cache allocated on top of slab arena is not thread-safe and must be used in scope of fiber. memcached performs memory allocations and deallocations on different stages and I would list all these stages explicitly: - memcached module initialization - memcached module deinitialization - memcached instance gc - memcached instance initialization - memcached instance start - memcached instance stop - memcached instance deinitialization Most part of these cases covered in tests test/common/instance_api.test.lua and test/common/module_api.test.lua that were added in previous commits. The exception here is gc that was checked manually. In tests function is_port_open() has been replaced to memcached minimalistic client that do set value, get value with the same key and compare both values. This check is more rigorous than check port openess. 1. https://github.com/tarantool/small Fixes #96
Historically memcached module uses slab cache created by Tarantool. Fibers get access to it via call of function cord_slab_cache() that is a part of Tarantool C API. The problem is happen when memcached and Tarantool use different versions of Small library where ABI is broken. Issues #59 and #96 are examples of such problems. We decided use independent slab arena in memcached module to avoid such problems in future and this patch implements it. Below I'll describe what is important in switching to own arena. memcached module uses different types of Small's allocators (ibuf and obuf used for every network connection, one slab cache used per fiber). There are two important things with used Small's allocators: 1. some allocators depends on other and sequence of allocations should be correct. README in small source code repository has a clear description of relationships between allocators [1]. 2. some allocators are thread-safe and others not. slab arena is shared between all memcached instances, when slab cache allocated on top of slab arena is not thread-safe and must be used in scope of fiber. memcached performs memory allocations and deallocations on different stages and I would list all these stages explicitly: - memcached module initialization - memcached module deinitialization - memcached instance gc - memcached instance initialization - memcached instance start - memcached instance stop - memcached instance deinitialization Most part of these cases covered in tests test/common/instance_api.test.lua and test/common/module_api.test.lua that were added in previous commits. The exception here is gc that was checked manually. In tests function is_port_open() has been replaced to memcached minimalistic client that do set value, get value with the same key and compare both values. This check is more rigorous than check port openess. 1. https://github.com/tarantool/small Fixes #96
Historically memcached module uses slab cache created by Tarantool. Fibers get access to it via call of function cord_slab_cache() that is a part of Tarantool C API. The problem is happen when memcached and Tarantool use different versions of Small library where ABI is broken. Issues #59 and #96 are examples of such problems. We decided use independent slab arena in memcached module to avoid such problems in future and this patch implements it. Below I'll describe what is important in switching to own arena. memcached module uses different types of Small's allocators (ibuf and obuf used for every network connection, one slab cache used per fiber). There are two important things with used Small's allocators: 1. some allocators depends on other and sequence of allocations should be correct. README in small source code repository has a clear description of relationships between allocators [1]. 2. some allocators are thread-safe and others not. slab arena is shared between all memcached instances, when slab cache allocated on top of slab arena is not thread-safe and must be used in scope of fiber. memcached performs memory allocations and deallocations on different stages and I would list all these stages explicitly: - memcached module initialization - memcached module deinitialization - memcached instance gc - memcached instance initialization - memcached instance start - memcached instance stop - memcached instance deinitialization Most part of these cases covered in tests test/common/instance_api.test.lua and test/common/module_api.test.lua that were added in previous commits. The exception here is gc that was checked manually. In tests function is_port_open() has been replaced to memcached minimalistic client that do set value, get value with the same key and compare both values. This check is more rigorous than check port openess. 1. https://github.com/tarantool/small Fixes #96
Just copy and change includes. It is necessary to change function names in a next commit. The source is the small version, which is bundled into the module: 1.1-63-g3df5050. Part of #59
See [1] for the problem root and [2] for analysis. In brief: `ibuf_*()` symbols may clash with the same named symbols in tarantool, so it worth to rename them to avoid effects of different binary layouts: assertion fails and memory corruptions. For example, `ibuf_reserve_slow()` calls `slab_get()` and `slab_put()`. If something will be changed in tarantool's slab cache, we can meet problems in the module. This patch just renames the `ibuf_*()` functions. It is the most simple and durable workaround of the problem. It is likely not necessary to rename inline functions, but since I'm not dead sure, I renamed them all. It is the last known problem from #59. All others are confirmed as safe to ignore if we'll keep current libsmall version in the module (see the analysis in [2]). [1]: tarantool/tarantool#6873 [2]: #59 (comment) Fixes #59
I wrote it mainly to clarify when the hack could be removed. Follows up #59
Just copy and change includes. It is necessary to change function names in a next commit. The source is the small version, which is bundled into the module: 1.1-63-g3df5050. Part of #59
See [1] for the problem root and [2] for analysis. In brief: `ibuf_*()` symbols may clash with the same named symbols in tarantool, so it worth to rename them to avoid effects of different binary layouts: assertion fails and memory corruptions. For example, `ibuf_reserve_slow()` calls `slab_get()` and `slab_put()`. If something will be changed in tarantool's slab cache, we can meet problems in the module. This patch just renames the `ibuf_*()` functions. It is the most simple and durable workaround of the problem. It is likely not necessary to rename inline functions, but since I'm not dead sure, I renamed them all. It is the last known problem from #59. All others are confirmed as safe to ignore if we'll keep current libsmall version in the module (see the analysis in [2]). [1]: tarantool/tarantool#6873 [2]: #59 (comment) Fixes #59
I wrote it mainly to clarify when the hack could be removed. Follows up #59
Uh oh!
There was an error while loading. Please reload this page.
Got segfault on both master and this patchset (#54) on tarantool 2.5.0-62-gbb7c3d167 (debug) when run tests (cd test && ./test-run.py).
#54 (comment)
The text was updated successfully, but these errors were encountered: