-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL] Add environment variable to disable in-memory program caching #11751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
If a new environment variable is added, the EnvironmentVariables.md file should be updated to document it. But, also, when I look at llvm/sycl/doc/EnvironmentVariables.md I see there are already several environment variables for controlling cacheing there. Are we sure we need another? How is this new environment variable different than neutering the |
@cperkinsintel Those environment variables control the persistent caching i.e. the storing of compiled code on disk, while this new environment variable intends to give an option to disable the in-memory cache. Even if persistent caching is disabled, programs still go through the in-memory cache. |
Should we make it explicit in the name that this refers to in-memory, non-persistent, caching? E.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, LGTM
// nullptr for the mutex. | ||
auto [Kernel, ArgMask] = BuildF(); | ||
return make_tuple(Kernel, nullptr, ArgMask, Program); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this logic be moved to getOrBuild
to avoid having to do it before each call to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I think it'd probably be better suited for another PR. There are really two types of uses of getOrBuild
, one with kernels and one with programs. The returns types differ between the two case, so it'd be a little awkward to insert this special logic for the program case in there. Alternatively, one could make a getOrBuildProgram
and getOrBuildKernel
program and then put that special logic in the getOrBuildKernel
function, but there would still be a oddity with the return type: when caching, we wrap the value in a BuildResult
and return a pointer to that (owned by the cache), but when are not caching, we bypass the BuildResult
object, and would only want to return the value. This can still be resolved by further modifying these getOrBuild
functions to return only the values we extract from the BuildResult
anyways, but I believe this'll most likely create a lot of changes unrelated to the original goal of PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I remember of the code, it is due for a bit of an overhaul anyway. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
In #11751, ref counting of kernels objects was changed to be more accurate in order to allow for in-memory caching to be disabled. When getting a kernel form the cache, the ref count the kernel handle is now incremented (when caching is enabled). Thus, a method like `ProgramManager::getOrCreateKernel` will increment the ref count of the kernel it gets. However, in `enqueueImpKernel`, when enqueuing a kernel with a kernel bundle,`ProgramManager::getOrCreateKernel` is called twice, first indirectly by: https://github.com/intel/llvm/blob/c43a90f28eebfcdf1bc1d55430485e2834790a60/sycl/source/detail/scheduler/commands.cpp#L2527-L2528 and second directly by: https://github.com/intel/llvm/blob/c43a90f28eebfcdf1bc1d55430485e2834790a60/sycl/source/detail/scheduler/commands.cpp#L2538-L2548 This means that the ref count of the acquired kernel is incremented twice, yet the rest of the function will only free once, which leads to a leak of the kernel. As the second comment and asserts say, the only need for the second call to `getOrCreateKernel` is to fetch the mutex associated to the cached kernel retrieved from the first call, so this PR adjusts `get_kernel` to save this mutex and forgo this extra `getOrCreateKernel` call and unintentional additional ref count.
This PR adds an environment variable
SYCL_CACHE_IN_MEM
to control the in-memory caching of programs. Currently, every program/kernel is saved in the globalKernelProgramCache
, which means that every program/kernel will not be released until end of the program when the destructor ofKernelProgramCache
is ran. By enabling this environment variables, caching is not performed and the resources program and kernels use are freed after use.