Fix OOM test for HOST ALL memspace #294

Merged
merged 1 commit into from
Mar 7, 2024

kswiecicki
Contributor

@kswiecicki kswiecicki commented Mar 5, 2024

Adjust alloc size to 4MB. Add an access to each allocation so that pages are actually allocated on the selected NUMA nodes.

Fixes: #285

This also fixes the nightly workflow failures seen here: https://github.com/oneapi-src/unified-memory-framework/actions/runs/8148821384.
Nightly workflow run with these changes on my fork: https://github.com/kswiecicki/unified-memory-framework/actions/runs/8188227540.

@ldorau
Contributor

ldorau commented Mar 5, 2024

@kswiecicki the CI build fails.

@ldorau
Contributor

ldorau commented Mar 6, 2024

@kswiecicki #285 is already fixed by #295

@kswiecicki kswiecicki force-pushed the mem-policy-fail branch 3 times, most recently from 68fb9f5 to 2dcc063 Compare March 6, 2024 09:24
@lukaszstolarczuk
Contributor

@kswiecicki, I can see you're still making changes. Is this ready to review/merge at the moment? 😉 If you're still testing it, please consider changing this PR to a draft.

@kswiecicki kswiecicki marked this pull request as draft March 6, 2024 09:33
Add an access to each allocation so that pages are actually
allocated on the selected NUMA nodes.

@ldorau
Contributor

ldorau commented Mar 6, 2024

The Nightly build with your PR fails: https://github.com/ldorau/unified-memory-framework/actions/runs/8169938615/job/22335017162

@kswiecicki
Contributor Author

After talking to @igchor, I discovered that the provider_os_memory_config tests had similar problems with memory policy assertions under valgrind. For that reason, I've added a similar suppression for the memspace_host_all test. The nightly tests shouldn't fail now.

@kswiecicki
Contributor Author

kswiecicki commented Mar 7, 2024

I've rewritten the test case according to what we came up with at standup, @bratpiorka. It seems that the OOM killer no longer kills it. The test also worked correctly on a platform with 2 NUMA nodes with #ifdef 0 removed.

Also, I've added a skip for this test case when it is built with TSan, because it exits with error code 143 without providing any further info.
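
As an illustration of how a TSan-only skip can be wired up, here is a minimal sketch. It assumes compile-time detection: Clang reports the sanitizer through `__has_feature(thread_sanitizer)`, and GCC defines `__SANITIZE_THREAD__`. The `BUILT_WITH_TSAN` macro and the `should_skip_oom_test` helper are hypothetical names. The PR's actual skip mechanism is not shown in this conversation and may differ.

```c
#include <stdbool.h>
#include <stdio.h>

/* Clang exposes sanitizers through __has_feature. */
#if defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define BUILT_WITH_TSAN 1
#endif
#endif

/* GCC defines __SANITIZE_THREAD__ when -fsanitize=thread is used. */
#if !defined(BUILT_WITH_TSAN) && defined(__SANITIZE_THREAD__)
#define BUILT_WITH_TSAN 1
#endif

#ifndef BUILT_WITH_TSAN
#define BUILT_WITH_TSAN 0
#endif

/* Hypothetical helper: returns true when the OOM test should be skipped
 * because the binary was built with ThreadSanitizer. */
static bool should_skip_oom_test(void) {
    if (BUILT_WITH_TSAN) {
        fprintf(stderr, "skipping OOM test: built with ThreadSanitizer\n");
        return true;
    }
    return false;
}
```

A test body would call `should_skip_oom_test()` first and return early, or invoke the test framework's own skip facility, when it reports `true`.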

@kswiecicki kswiecicki marked this pull request as ready for review March 7, 2024 13:32
Contributor

@ldorau ldorau left a comment

@bratpiorka bratpiorka merged commit 822250e into oneapi-src:main Mar 7, 2024
Development

Successfully merging this pull request may close these issues.

The umf-memspace_host_all test fails on 2 nodes Ubuntu machine