Skip to content

Draft: [SYCL] Implement resource pool for implementation allocations #5662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

steffenlarsen
Copy link
Contributor

These changes add a "resource pool" used by the SYCL implementation to get auxiliary device memory resources. When a resource is deleted it returns its memory to the pool to be reacquired by other internal logic. Additionally, reductions that need additional resources are made to use the pool when getting these.

These changes add a "resource pool" used by the SYCL implementation to
get auxiliary device memory resources. When a resource is deleted it
returns its memory to the pool to be reacquired by other internal logic.
Additionally, reductions that need additional resources are made to use
the pool when getting these.

Signed-off-by: Steffen Larsen <[email protected]>
Signed-off-by: Steffen Larsen <[email protected]>
@steffenlarsen
Copy link
Contributor Author

/summary:run

@steffenlarsen
Copy link
Contributor Author

Functionality tested by intel/llvm-test-suite#873. Verification with this requires #5653.

@steffenlarsen steffenlarsen marked this pull request as ready for review March 8, 2022 11:01
@steffenlarsen steffenlarsen requested review from a team as code owners March 8, 2022 11:01
@steffenlarsen steffenlarsen changed the title Draft: [SYCL] Implement resource pool for implementation allocations [SYCL] Implement resource pool for implementation allocations Mar 8, 2022
Signed-off-by: Steffen Larsen <[email protected]>
@vladimirlaz
Copy link
Contributor

+@smaslov-intel. As far as I remember where is some memory pool in L0 plugin/BE. Are there any drawbacks of having one more pool of memory in the SW stack?

@smaslov-intel
Copy link
Contributor

We need to make the current L0-only pooling serve the needs of other SYCL backends (and other language runtimes).
https://github.com/intel/llvm/blob/sycl/sycl%2Fplugins%2Flevel_zero%2Fusm_allocator.cpp

bader
bader previously approved these changes Mar 21, 2022
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sycl/doc/EnvironmentVariables.md looks okay, but there are merge conflicts in other files.

@steffenlarsen
Copy link
Contributor Author

/verify with intel/llvm-test-suite#873

Signed-off-by: Steffen Larsen <[email protected]>
Signed-off-by: Steffen Larsen <[email protected]>
Copy link
Contributor

@bader bader left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please, fix pre-commit tests.

Signed-off-by: Steffen Larsen <[email protected]>
Copy link
Contributor

@s-kanaev s-kanaev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it feasible to have resource pool unit-tested?

Comment on lines +87 to +89
SYCLMemObjT(RT::PiMem MemObject, const context &SyclContext,
const size_t SizeInBytes, event AvailableEvent,
std::unique_ptr<SYCLMemObjAllocator> Allocator);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this c-tor supersede the other one? That is, should the other c-tor be marked with TODO for removal when ABI break is allowed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so. The other constructor is intended to be user-facing and as such has additional checks for argument validity. The new constructor is meant for internal use so we do not need the additional checks.

@steffenlarsen
Copy link
Contributor Author

Is it feasible to have resource pool unit-tested?

There is not currently a way to test reductions in unittests, but it may be possible to use the resource pool directly. I see there is a failure in testing that is likely related, so I will be investigating that and will see what is possible with unittests.

Signed-off-by: Steffen Larsen <[email protected]>
Signed-off-by: Steffen Larsen <[email protected]>
@s-kanaev
Copy link
Contributor

but it may be possible to use the resource pool directly

That's exactly what I mean. Thanks.

Signed-off-by: Steffen Larsen <[email protected]>
Signed-off-by: Steffen Larsen <[email protected]>
@steffenlarsen
Copy link
Contributor Author

but it may be possible to use the resource pool directly

That's exactly what I mean. Thanks.

Absolutely. I have added such tests. Please let me know if there are test cases I've missed.

@s-kanaev s-kanaev requested a review from dm-vodopyanov March 22, 2022 19:21
@steffenlarsen
Copy link
Contributor Author

@dm-vodopyanov | @vladimirlaz | @smaslov-intel - Friendly ping.

@smaslov-intel
Copy link
Contributor

@steffenlarsen: could you react on my previous comment:

We need to make the current L0-only pooling serve the needs of other SYCL backends (and other language runtimes).
https://github.com/intel/llvm/blob/sycl/sycl%2Fplugins%2Flevel_zero%2Fusm_allocator.cpp

We already have a memory pool underneath, and it looks that this new pooling adds overheads without clear benefits.

@steffenlarsen
Copy link
Contributor Author

steffenlarsen commented Mar 28, 2022

@steffenlarsen: could you react on my previous comment:

We need to make the current L0-only pooling serve the needs of other SYCL backends (and other language runtimes).
https://github.com/intel/llvm/blob/sycl/sycl%2Fplugins%2Flevel_zero%2Fusm_allocator.cpp

We already have a memory pool underneath, and it looks that this new pooling adds overheads without clear benefits.

@smaslov-intel - Sorry, I think I misunderstood your comment. We may be able to generalize the pooling, but it seems that the L0-only implementation is primarily focused on USM memory allocations. Originally, the plan for this pool was to use USM memory as well, but it would make reductions dependent on USM support. That would mean any backend that does not support USM would not be able to support reductions either. Do you think we should make the pooling work for non-USM allocations? Should we consider using this as an intermediate solution and retire it when we have generalized the L0-only pooling?

Correction: Reading it, it seem to be possible to enable it for buffers as well. Is that correct? If so, what would be needed to make it not L0-only? Is this enabled for buffers by default? If not, what would be the cost of enabling it? Could we enable it for certain allocations and not others? E.g. enable it when allocating auxiliary resource but not when creating other buffers?

@smaslov-intel
Copy link
Contributor

Do you think we should make the pooling work for non-USM allocations?

You mean SYCL buffers specifically? Or pi_mem objects? Or something else?

Reading it, it seem to be possible to enable it for buffers as well. Is that correct?

In Level-Zero backend SYCL buffers are backed by regular USM allocations, and those USM allocations are already pooled (but not pi_mem objects enclosing them). For backends that don't have USM and maintain distinct buffer objects the pooling needs to be extended. And your changes are probably good for that. But I think it should eventually be outside of SYCL RT, but in the code shared between plugins for different backends.

@steffenlarsen
Copy link
Contributor Author

You mean SYCL buffers specifically? Or pi_mem objects? Or something else?

I just meant pi_mem in general. This PR was mostly motivated by L0, but the pooling may already be solving the issue now that the leak fix is in. I will investigate to see if that is the case.

@cperkinsintel
Copy link
Contributor

may not be relevant, but there is another pooling PR open here: #5438 from @rdeodhar

@steffenlarsen steffenlarsen marked this pull request as draft April 7, 2022 08:27
@steffenlarsen steffenlarsen changed the title [SYCL] Implement resource pool for implementation allocations Draft: [SYCL] Implement resource pool for implementation allocations Apr 7, 2022
@steffenlarsen
Copy link
Contributor Author

I am currently investigating whether the existing pooling in the L0 plugin is enough to solve the problem. Converted this back to draft.

@steffenlarsen
Copy link
Contributor Author

The L0 pooling has shown to be enough to achieve this. We should consider expanding it to other backends, but as such we do not need this pooling patch.

@steffenlarsen steffenlarsen deleted the steffen/rework_redu_res_acq branch December 6, 2023 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants