Skip to content

Add proxy library (for Linux only now) #226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Feb 21, 2024

Conversation

ldorau
Copy link
Contributor

@ldorau ldorau commented Feb 12, 2024

Ref: #240

@ldorau ldorau force-pushed the Add_proxy_library branch 4 times, most recently from 0485390 to 2e549bc Compare February 12, 2024 16:09
@ldorau ldorau force-pushed the Add_proxy_library branch 2 times, most recently from 253de2c to cdb5445 Compare February 12, 2024 20:23
@ldorau ldorau force-pushed the Add_proxy_library branch 3 times, most recently from 81b5d11 to 0551c74 Compare February 13, 2024 11:22
@ldorau
Copy link
Contributor Author

ldorau commented Feb 13, 2024

@igchor There is an issue with #229. I run ubench with proxy library in LD_PRELOAD:

$ LD_PRELOAD=./lib/libumf_proxy.so ./benchmark/ubench
  1. The constructor of libumf_proxy.so is called.
  2. The constructor runs umfMemoryProviderCreate(umfOsMemoryProviderOps()) that creates the global base allocator no. 1
  3. The constructor runs umfPoolCreate(umfJemallocPoolOps()) that creates the global base allocator no. 2 and allocates je_malloc pool inside it.
  4. The benchmark runs and ends.
  5. atexit(umf_ba_destroy_global); is called and the global base allocator no. 2 is destroyed.
  6. The destructor of libumf_proxy.so is called.
  7. The destructor of libumf_proxy.so runs umfPoolDestroy(pool) and it blows up, because the memory it uses has already been freed in 5)

See: https://github.com/oneapi-src/unified-memory-framework/actions/runs/7887458469/job/21523176454?pr=226

@ldorau ldorau changed the title [DRAFT] Add proxy library Add proxy library (for Linux only now) Feb 13, 2024
@ldorau ldorau marked this pull request as ready for review February 13, 2024 12:14
@ldorau ldorau requested a review from a team as a code owner February 13, 2024 12:14
@github-advanced-security
Copy link

This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.

@ldorau
Copy link
Contributor Author

ldorau commented Feb 13, 2024

@igchor
Copy link
Member

igchor commented Feb 13, 2024

Build issue on Windows: https://github.com/oneapi-src/unified-memory-framework/actions/runs/7877470520/job/21493692037?pr=226

This seems to be caused by the fact that we include stdlib.h where malloc and others are defined (probably with dllimport). The fix is to not include stdlib.h in the file that implements proxy_lib

@ldorau ldorau requested a review from igchor February 14, 2024 14:36
@igchor
Copy link
Member

igchor commented Feb 15, 2024

IMO the scenario where someone uses both the proxy and the UMF API is invalid. I don't think we should support this, or at least it shouldn't be our priority right now. LGTM

To be honest, I think it SHOULD work in this scenario. If both proxy_lib and app uses libumf.so (dynamic lib) then there should be only one base_alloc and tracker. I think there is some other issue that we should solve here.

@ldorau
Copy link
Contributor Author

ldorau commented Feb 16, 2024

IMO the scenario where someone uses both the proxy and the UMF API is invalid. I don't think we should support this, or at least it shouldn't be our priority right now. LGTM

To be honest, I think it SHOULD work in this scenario. If both proxy_lib and app uses libumf.so (dynamic lib) then there should be only one base_alloc and tracker. I think there is some other issue that we should solve here.

The occurring issues can be caused by the fact that there is only one base_alloc and one tracker in case of dynamic linking. I think that the proxy library should use a separate base_alloc and a separate tracker than an app.

@vinser52
Copy link
Contributor

IMO the scenario where someone uses both the proxy and the UMF API is invalid. I don't think we should support this, or at least it shouldn't be our priority right now. LGTM

Did we discussed the scenarios that we are going to cover with this proxy library at our Memory WG? We also have not discussed the architecture and potential issues. Probably TBB malloc already solves such issues somehow, we have not investigated this. Furthermore, if we consider migrating TBB malloc and building it on top of UMF then there is a TBB malloc proxy that intercepts regular malloc/free flow- do we need to duplicate such functionality with umf_proxy? Why not to discuss this first and get feedback from our Memory WG attendees? Do we have any urgency with this proxy library?
Also, as I remember CAL needs some functionality to intercept malloc/free for GPU use cases.

@ldorau ldorau force-pushed the Add_proxy_library branch 2 times, most recently from 886c2ae to 43f80ca Compare February 16, 2024 10:28
@ldorau
Copy link
Contributor Author

ldorau commented Feb 16, 2024

IMO the scenario where someone uses both the proxy and the UMF API is invalid. I don't think we should support this, or at least it shouldn't be our priority right now. LGTM

To be honest, I think it SHOULD work in this scenario. If both proxy_lib and app uses libumf.so (dynamic lib) then there should be only one base_alloc and tracker. I think there is some other issue that we should solve here.

@bratpiorka @igchor
It DOES WORK in this scenario too with scalable pool. It turned out that those issues occur only with jemalloc pool, so there can be an issue with jemalloc pool.

The proxy library with scalable pool works well.

@igchor
Copy link
Member

igchor commented Feb 16, 2024

@ldorau Will you add some tests to this PR?

I have added a separate workflow running:

          LD_PRELOAD=./lib/libumf_proxy.so /usr/bin/date
          LD_PRELOAD=./lib/libumf_proxy.so /usr/bin/ls
          LD_PRELOAD=./lib/libumf_proxy.so ./benchmark/ubench

@igchor is it enough?

For now that's fine I think. But we should add some extra tests with a lot of mallocs/callocs/reallocs (even from multiple threads) as a next step.

@ldorau
Copy link
Contributor Author

ldorau commented Feb 19, 2024

The linear base allocator does not implement free(), so we can only destroy the whole allocator and free all allocations or - the second option - free nothing. We do not know which allocations come from _dl_init() and which come from hwloc for example. So the only possible option is to free nothing. When there were three allocators, two of them were linear base allocators. The first was a "leak" allocator and the second was destroyed, but this was wrong. It turned out that none of them can be destroyed, so only one "leak" linear base allocator is enough.

@bratpiorka @igchor Moreover, for the same reason we cannot destroy the UMF pool either! It turned out that in the phase of loading libraries, when allocations come from the UMF pool, the program loader (maybe _dl_init()) allocates new memory for the app's environment that is read at the very end after unloading most of libraries - getenv() is called and there is a segfault caused by use-after-free, because this part of environment was freed earlier during destroying the UMF pool in the destructor of the proxy library ...

@ldorau ldorau force-pushed the Add_proxy_library branch 6 times, most recently from 16633ba to 4b246b2 Compare February 20, 2024 11:29
@ldorau
Copy link
Contributor Author

ldorau commented Feb 20, 2024

The proxy library works correctly with all following applications now:

$ LD_PRELOAD=./lib/libumf_proxy.so ctest --output-on-failure
$ LD_PRELOAD=./lib/libumf_proxy.so ./benchmark/ubench
$ LD_PRELOAD=./lib/libumf_proxy.so ./test/umf_test-memoryPool
$ LD_PRELOAD=./lib/libumf_proxy.so /usr/bin/ls
$ LD_PRELOAD=./lib/libumf_proxy.so /usr/bin/date

@ldorau ldorau requested review from igchor and pbalcer February 20, 2024 11:40
@bratpiorka
Copy link
Contributor

With the latest changes LGTM. We should make more improvements in separate PRs.

#endif /* _WIN32 */

// check if we are running in the proxy library
static inline int is_running_in_proxy_lib(void) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work if someone links with proxy_lib? (I think we should support this scenario - from what I understand, this is how it will work on windows) Or is there no problem in that case?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, it would be nice to have a test for such scenario.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this work if someone links with proxy_lib? (I think we should support this scenario - from what I understand, this is how it will work on windows) Or is there no problem in that case?

Do you mean the dynamic linking instead of LD_PRELOD?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Add umf_ba_linear_pool_contains_pointer().
It returns:
- 0 if ptr does not belong to the pool or
- size (> 0) of the memory region from ptr
  to the end of the pool if ptr belongs to the pool.

It will be useful to implement realloc() in the proxy library.

Added also the baseAllocLinearPoolContainsPointer test.

Signed-off-by: Lukasz Dorau <[email protected]>
UMF proxy library is a library for intercepting user allocation requests.
It intercepts following APIs:
- aligned_alloc()
- calloc()
- free()
- malloc()
- malloc_usable_size()
- realloc()

Signed-off-by: Lukasz Dorau <[email protected]>
@lukaszstolarczuk
Copy link
Contributor

Thx! LGTM

@bratpiorka bratpiorka merged commit 1682326 into oneapi-src:main Feb 21, 2024
@ldorau ldorau deleted the Add_proxy_library branch February 21, 2024 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants