-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][USM] Enable per-context USM behavior. #467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: James Brodman <[email protected]>
@@ -108,23 +105,24 @@ class CLUSM { | |||
} // namespace usm | |||
|
|||
namespace cliext { | |||
bool initializeExtensions(cl_platform_id platform); | |||
bool initializeExtensions(cl_context context, cl_platform_id platform); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, follow llvm coding guidelines. name variables starting from big letter, functions from small.
cl::sycl::detail::usm::CLUSM *retVal = nullptr; | ||
if (gCLUSM.find(ctxt) == gCLUSM.end()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest rewriting it to:
cl::sycl::detail::usm::CLUSM *&retVal = gCLUSM[ctxt];
if(nullptr == retVal)
retVal = new cl::sycl::detail::usm::CLUSM();
And move check for backend to the function beginning. Now you are allocating CLUSM even for non OpenCL BE.
cl::sycl::detail::usm::CLUSM::Create(gCLUSM); | ||
} | ||
|
||
__SYCL_EXPORTED extern std::map<cl_context, cl::sycl::detail::usm::CLUSM *> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't you store pointer to CLUSM in the context object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. Wasn't sure if you guys preferred it to be that invasive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually - not sure I can really do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below in the opencl_shim part.
detail::getSyclObjImpl(Context)->getHandleRef()); | ||
auto clusm = GetCLUSM(context); | ||
if (usesUSM && clusm) { | ||
cl_bool t = CL_TRUE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please make this variable name more descriptive?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
which one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t
auto theKernel = pi::pi_cast<cl_kernel>(Kernel); | ||
// Enable USM Indirect Access for Kernels | ||
if (clusm->useCLUSM()) { | ||
CHECK_OCL_CODE(clusm->setKernelExecInfo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you do the same 3 calls to setKernelExecInfo in both branches of if. Could you please move it outside of the if?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are different calls. The ones through the clusm object do extra book keeping.
static clHostMemAllocINTEL_fn pfn_clHostMemAllocINTEL = NULL; | ||
// Changing to per-context tracking | ||
// TODO: use piContext everywhere | ||
static std::map<cl_context, clHostMemAllocINTEL_fn> pfn_clHostMemAllocINTEL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please avoid using global objects. It seems you can easily store them in context object or in some object which depends on context object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought. By the time we're here, we've left sycl land. We don't have a handle on the sycl context. Not sure I can avoid global objects here unless we break the CL extension interface in a really non-CL friendly way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's take hostMemAlloc for example, we can have something like this:
// The class holds all data you currently using globals for.
// CLUSM and USMHandle can be merged.
class USMHandle {
FuncType *pfn_clDeviceMemAllocINTEL = nullptr;
...
CLUSM CLUSM;
};
class context_impl {
context_impl () {
...
MUSMHandle.reset(getUSMHandleFor(m_Context));
}
void *alignedAlloc(...) {
// The following logic can be implemented in USMHandle
if(m_USMHandle->pfn_clDeviceMemAllocINTEL) {
pfn_clHostMemAllocINTEL(m_Context, ...);
}
else {
USMHandle->CLUSM->hostMemAlloc(...);
}
}
...
unique_ptr<USMHandle> m_USMHandle;
};
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is what I meant. Now we're driving all this branching logic into the implementation instead of having a nice clean shim. Would you guys prefer this? It's a non-trivial rewrite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the only way forward here is to also do the PI rewrite - this would let us do a clean interface using SYCL objects instead of CL objects and let us do this type of book keeping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer this as having global objects cause a lot of problems.
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Keep track of USM "state" per-context to handle multiple device scenarios.
Signed-off-by: James Brodman [email protected]