
[SYCL] Clear extensions functions cache upon context release #5282


Merged

Conversation

s-kanaev
Contributor

This eliminates reuse of invalid cached values
after the context has been released.
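
A minimal sketch of the idea, not the plugin's actual code: extension function pointers are cached per context, and the entry for a context is erased when that context is released, so a later lookup cannot return a stale pointer. All names here (`ContextHandle`, `ExtFuncCache`, `put`, `get`, `clear`) are hypothetical stand-ins chosen for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-ins, not the plugin's real types.
using ContextHandle = int;     // stands in for a context handle
using ExtFuncPtr = void (*)(); // stands in for an extension function pointer

class ExtFuncCache {
  std::mutex Mtx;
  std::unordered_map<ContextHandle, ExtFuncPtr> Map;

public:
  // Remember the pointer that was looked up for this context.
  void put(ContextHandle Ctx, ExtFuncPtr F) {
    std::lock_guard<std::mutex> Lock(Mtx);
    Map[Ctx] = F;
  }

  // Return the cached pointer, or nullptr if nothing is cached for Ctx.
  ExtFuncPtr get(ContextHandle Ctx) {
    std::lock_guard<std::mutex> Lock(Mtx);
    auto It = Map.find(Ctx);
    return It == Map.end() ? nullptr : It->second;
  }

  // Called on context release: drop everything cached for Ctx.
  void clear(ContextHandle Ctx) {
    std::lock_guard<std::mutex> Lock(Mtx);
    Map.erase(Ctx);
  }
};

inline void dummyExtFunc() {}

// Returns true if the entry is visible before release and gone after it.
inline bool staleEntryIsDropped() {
  ExtFuncCache Cache;
  Cache.put(1, &dummyExtFunc);
  bool CachedBefore = Cache.get(1) == &dummyExtFunc;
  Cache.clear(1); // context 1 is released
  return CachedBefore && Cache.get(1) == nullptr;
}
```

After `clear`, a lookup for the released context misses the cache and has to query the function pointer again instead of reusing the old value.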


Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev s-kanaev requested a review from a team as a code owner January 11, 2022 13:47
@s-kanaev s-kanaev force-pushed the clear-ext-func-cache-on-context-release branch 2 times, most recently from c7d36c3 to 68811ba Compare January 11, 2022 14:18
@s-kanaev s-kanaev marked this pull request as draft January 11, 2022 14:42
Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev s-kanaev force-pushed the clear-ext-func-cache-on-context-release branch from 68811ba to 6878a50 Compare January 11, 2022 14:57
@s-kanaev s-kanaev marked this pull request as ready for review January 11, 2022 15:03
@s-kanaev
Contributor Author

/summary:run

s-kanaev pushed a commit to s-kanaev/llvm-test-suite that referenced this pull request Jan 11, 2022
    The test is going to be disabled until intel/llvm#5282 is merged

Signed-off-by: Sergey Kanaev <[email protected]>
Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev
Contributor Author

/summary:run

bader pushed a commit to intel/llvm-test-suite that referenced this pull request Jan 12, 2022
The test is going to be disabled until intel/llvm#5282 is merged

Signed-off-by: Sergey Kanaev <[email protected]>
Contributor

@steffenlarsen steffenlarsen left a comment

New approach LGTM. Just have some minor suggestions.

sergei and others added 2 commits January 13, 2022 22:55
Co-authored-by: Steffen Larsen <[email protected]>
Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev
Contributor Author

/summary:run

steffenlarsen
steffenlarsen previously approved these changes Jan 14, 2022
Contributor

@steffenlarsen steffenlarsen left a comment

Good stuff! Have a 👍 .

@s-kanaev
Contributor Author

The failure on HIP AMD isn't related to this patch and is resolved by XFAIL-ing the tests in intel/llvm-test-suite#740

@s-kanaev s-kanaev requested review from romanovvlad and bader January 17, 2022 07:38
Contributor

@bader bader left a comment

I've made a few style comments, as I'm not an expert in this code and can't review the design.
I'll let @romanovvlad follow up on this PR.

#undef _EXT_FUNCTION_COMMON
} // namespace detail

struct ExtFuncsCachesT {
Contributor

Why plural form?

Suggested change
struct ExtFuncsCachesT {
struct ExtFuncsCacheT {

Contributor Author

It's plural because each context has its own cache.

Contributor

I think it would be reasonable to call the whole map a cache instead of ExtFuncsPerContextT instances.

@@ -1397,6 +1480,8 @@ pi_result piPluginInit(pi_plugin *PluginInit) {
// PI interface supports higher version or the same version.
strncpy(PluginInit->PluginVersion, SupportedVersion, 4);

ExtFuncsCaches = new ExtFuncsCachesT;
Contributor

Contributor Author

It might be an option if the following drawback is resolved. Smart pointers (both shared_ptr and unique_ptr) are not trivially destructible. Having a smart pointer here may lead to a potential memory leak or other problems related to the order of destructor calls.

@alexbatashev probably has something to add here if I'm wrong.

Contributor

Having a smart pointer here may lead to potential mem leak or anything else related to order of destructors calls.

Could you clarify the scenario in which a "mem leak or anything else related to order of destructors calls" happens, and how using a raw pointer solves the problem in such scenarios?

From my POV, you can always call the release method where you would call delete for a raw pointer, and you get additional memory leak protection for the cases when delete is not called.

Contributor Author

The memory leak and destructor ordering I refer to here are not related to the owned object, i.e. the cache, but to the smart pointer itself. shared_ptr has two atomic counters. My blind guess is that they're on the heap. Both unique_ptr and shared_ptr have a 'complex' d-tor which has to check whether the pointer owns the object.

Sure thing. If a smart pointer is used, I'll put a call to its release method in piTearDown. Bearing this in mind, is there any real need for a smart pointer if I won't use its features?

Contributor

If we change the type of ExtFuncsCaches to a unique_ptr, we may have a problem, since the unique_ptr can be destructed (as a regular global variable) earlier than the last call to an API which uses this variable.

Contributor

How is that possible? Isn't calling functions from an unloaded shared library UB?

Contributor

How is that possible? Isn't calling functions from an unloaded shared library UB?

A global object can be destroyed earlier than the library is unloaded.

Contributor

Right, so I expected that the plug-in has a single global object managing the lifetime of plug-in data (the "plug-in context"), and that after this global object is destroyed there can't be any "calls to an API which uses this var".

Contributor

there can't be any "calls to an API which uses this var".

How can this be guaranteed?
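
To make the trade-off in this thread concrete, here is a small sketch under stated assumptions. It shows only the property being argued about: a raw pointer global is trivially destructible, so nothing runs for it during static destruction, and its lifetime is bounded by explicit init and teardown calls. `CacheT`, `GlobalCache`, `pluginInit` and `pluginTearDown` are hypothetical names, not the plugin's real entry points.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct CacheT { int Entries = 0; }; // hypothetical cache type

// A raw pointer global has no destructor to run at static-destruction time.
static_assert(std::is_trivially_destructible<CacheT *>::value,
              "raw pointers are trivially destructible");
// A unique_ptr global has a non-trivial destructor, which runs at a point
// during static destruction that the plugin does not choose.
static_assert(!std::is_trivially_destructible<std::unique_ptr<CacheT>>::value,
              "unique_ptr is not trivially destructible");

static CacheT *GlobalCache = nullptr;

// Hypothetical counterparts of the plugin's init and teardown entry points.
inline void pluginInit() { GlobalCache = new CacheT; }
inline void pluginTearDown() {
  delete GlobalCache;
  GlobalCache = nullptr;
}

// Returns true if the cache exists exactly between init and teardown.
inline bool lifetimeIsExplicit() {
  pluginInit();
  bool AliveAfterInit = GlobalCache != nullptr;
  pluginTearDown();
  return AliveAfterInit && GlobalCache == nullptr;
}
```

The cost of the raw pointer is that the cache leaks if teardown is never called, which is the protection the reviewers point out a smart pointer would add.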

@s-kanaev
Contributor Author

/verify with intel/llvm-test-suite#745

// if cached that extension is not available return nullptr and
// PI_INVALID_VALUE
*fptr = F;
return F ? PI_SUCCESS : PI_INVALID_VALUE;
*fptr = FuncInitialized.first;
Contributor

Not sure we get a lot from having lazy initialization for these function pointers. Suggest moving all the initialization to piContextCreate. This should simplify the logic.

Contributor Author

Per what was discussed with @steffenlarsen, the main drawback of eager (as opposed to lazy) initialization in piContextCreate is that use cases which create lots of contexts without using extensions will take a performance hit. I reckon these are only synthetic cases, but still...

Contributor

I'm OK with leaving it as is. But I still think that if an application creates lots of contexts, this initialization will be the last thing we should worry about. Eager initialization would also make the "common" case faster and the code simpler, because fewer mutexes and checks would be needed.
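
A rough sketch of the lazy variant discussed here, assuming the `FuncInitialized` seen in the diff is a pair of function pointer and "already looked up" flag (the diff only shows `FuncInitialized.first`, so the pair type is a guess). `expensiveLookup`, `LazyExtFunc` and `addOne` are hypothetical; the real lookup would query the driver.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

using ExtFuncPtr = int (*)(int);

inline int addOne(int X) { return X + 1; }

static int LookupCalls = 0;

// Hypothetical stand-in for the expensive query of an extension function.
inline ExtFuncPtr expensiveLookup(bool Available) {
  ++LookupCalls;
  return Available ? &addOne : nullptr;
}

struct LazyExtFunc {
  std::mutex Mtx;
  // first: cached pointer; second: whether the lookup has already been done.
  std::pair<ExtFuncPtr, bool> FuncInitialized{nullptr, false};

  // Looks the function up on first use only. A cached "not available"
  // result (nullptr) is also remembered and reported as failure.
  bool get(bool Available, ExtFuncPtr *Out) {
    std::lock_guard<std::mutex> Lock(Mtx);
    if (!FuncInitialized.second) {
      FuncInitialized.first = expensiveLookup(Available);
      FuncInitialized.second = true;
    }
    *Out = FuncInitialized.first;
    return *Out != nullptr;
  }
};

// Calls get() twice and reports how many real lookups happened.
inline int lookupsAfterTwoGets() {
  LookupCalls = 0;
  LazyExtFunc Slot;
  ExtFuncPtr F = nullptr;
  Slot.get(true, &F);
  Slot.get(true, &F);
  assert(F(41) == 42);
  return LookupCalls;
}
```

Every call takes the mutex and checks the flag, which is the per-call overhead the reviewer suggests avoiding by filling all pointers once in piContextCreate.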


Sergey Kanaev added 2 commits January 17, 2022 16:02
@s-kanaev
Contributor Author

/verify with intel/llvm-test-suite#745

@s-kanaev s-kanaev requested review from romanovvlad and bader January 18, 2022 07:36
@bader bader removed their request for review January 18, 2022 08:12
@romanovvlad romanovvlad merged commit 2eed402 into intel:sycl Jan 18, 2022
bader pushed a commit that referenced this pull request Feb 6, 2022
…5282)" (#5433)

This reverts commit 2eed402.

The revert is due to a massive performance regression introduced by this change.
Employing a spin-lock and/or pre-initialization didn't help reduce the performance impact.
aelovikov-intel pushed a commit to aelovikov-intel/llvm that referenced this pull request Mar 27, 2023
The test is going to be disabled until intel#5282 is merged

Signed-off-by: Sergey Kanaev <[email protected]>