-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][USM][PI] Initial Commit to move all USM implementation detail inside the PI in… #937
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, just a few code style comments, but I'd like @smaslov-intel to review and maybe address some questions in the code comments.
@smaslov-intel Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, just few requests mostly for more comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 33 files reviewed, 10 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, @romanovvlad, and @smaslov-intel)
sycl/include/CL/sycl/detail/pi.h, line 989 at r3 (raw file):
Previously, jbrodman wrote…
- USM allocations aren't reference counted, so I'm not sure what you're asking.
I am asking if we should make them reference counted like other objects in PI.
No - they're raw pointers. We can't shove a wrapper around them. Users can use a std::shared_ptr or std::unique_ptr on the host if they want. Won't work across host and device.
OK, please make a note of this in the documentation comment of the alloc/free interfaces, because this is really different from the rest of PI API.
sycl/include/CL/sycl/detail/pi.h, line 1037 at r4 (raw file):
Previously, jbrodman wrote…
Mirrors the OpenCL extension.
I am OK for now.
sycl/plugins/opencl/pi_opencl.cpp, line 511 at r4 (raw file):
Previously, jbrodman wrote…
Just pushed a new commit that seems to handle this in a thread safe way.
OK
sycl/plugins/opencl/pi_opencl.cpp, line 591 at r4 (raw file):
Previously, jbrodman wrote…
Yes. It's doing extra indirection b/c OpenCL, in all its infinite wisdom, decided that pointer args should be set by value instead of by address + size.
I think the OpenCL runtime might also be doing something special on its end, not sure.
OK. Please make it more general than for USM, e.g. name it piextKernelSetArgPointer
sycl/plugins/opencl/pi_opencl.cpp, line 756 at r4 (raw file):
Previously, jbrodman wrote…
Device defined, which is why they're not enumerated.
No, because it's mean to dynamically change behavior during program execution.
OK, but let's define the enumeration of the hints (OK to be ignored by devices that don't know or don't care)
sycl/plugins/opencl/pi_opencl.cpp, line 36 at r5 (raw file):
template <typename T> pi_result getExtFuncFromContext(pi_context context, const char *func, T *fptr) { thread_local static std::map<pi_context, T> FuncPtrs;
I think this is good for now, but there is still redundancy: every context in every thread will repeat it while it is only really needed once for each PI plugin. Please add a TODO to fix this once we settle the pi_plugin infrastructure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 12 files at r4, 1 of 3 files at r5.
Reviewable status: 1 of 33 files reviewed, 10 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, and @romanovvlad)
sycl/include/CL/sycl/detail/pi.h, line 1054 at r4 (raw file):
Previously, smaslov-intel wrote…
Please do add doxygen comments (at least something short initially) so that the new interfaces are clearer.
The exsiting PI interfaces are mostly taken from OpenCL so not having doc for them is kind of OK for now.
But we shall do that for them too.
Please copy doxygen comment to the header too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 22 files at r3, 1 of 12 files at r4, 1 of 3 files at r5.
Reviewable status: 5 of 33 files reviewed, 10 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, and @romanovvlad)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 5 of 33 files reviewed, 10 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, and @romanovvlad)
sycl/plugins/opencl/pi_opencl.cpp, line 794 at r4 (raw file):
Previously, smaslov-intel wrote…
Please describe different types of info that can be queried
Is this the full list? You should also specify the type of the data that is returned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 11 files at r1, 18 of 22 files at r3, 8 of 12 files at r4, 1 of 3 files at r5.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, and @romanovvlad)
…terface. Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
How does it work now? Say we have two different opencl platforms on a system - How many pi opencl's get loaded? We need to to this at a minimum once per context. |
Signed-off-by: James Brodman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 6 files at r6, 4 of 4 files at r7.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @bader, @jbrodman, @MichalMrozek, and @romanovvlad)
sycl/include/CL/sycl/detail/pi.def, line 102 at r7 (raw file):
_PI_API(piextUSMSharedAlloc) _PI_API(piextUSMFree) _PI_API(piextKernelSetArgPointer)
Please move this to the where piKernelSetArg is (and also in the header)
sycl/plugins/opencl/pi_opencl.cpp, line 622 at r4 (raw file):
clSetKernelExecInfo
Yes, that's what I think we need. Could you change the piextUSMKernelSetIndirectAccess into piKernelSetExecInfo ?
Also we'd move that to where other piKernel* API are.
sycl/plugins/opencl/pi_opencl.cpp, line 794 at r4 (raw file):
Previously, smaslov-intel wrote…
Is this the full list? You should also specify the type of the data that is returned
would you add more info about supported parameters?
Ok.
No - you're missing my point. KernelSetIndirectAccess is USM-specific behavior. It is not general I cannot make it into piKernelSetExecInfo. You could make a piKernelSetExecInfo by wiring up clSetKernelExecInfo straight up, but this has nothing to do with USM. I could implement KernelSetIndirectAccess by calling KernelSetExeciInfo inside a lot but there's no need.
I mean - they're all listed - what do you want? A link to the Spec in the comments? Also can we please just use Github comments instead of some 3rd party site for comments? This "reviewable" site is a PITA. |
I am asking for a general interface "piKernelSetExecInfo" to be added instead with an extension parameter (not supported by plugins not supporting USM), e.g. PIEXT_USM_INDIRECT_POINTERS.
The comment currently says "examples", not the full list, which it should be. |
…xplain queries. Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
Signed-off-by: James Brodman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few test improvement suggestions.
sycl/test/usm/smemllaligned.cpp
Outdated
for (int i = 0; i < numNodes; i++) { | ||
s_cur->Num = i * 2; | ||
if (dev.get_info<info::device::usm_shared_allocations>()) { | ||
Node *s_head = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node *s_head = nullptr; | |
Node *s_head = (Node *)aligned_alloc_shared(alignof(Node), sizeof(Node), dev, ctxt); |
sycl/test/usm/smemllaligned.cpp
Outdated
} | ||
s_cur = s_head; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s_cur = s_head; | |
Node *s_cur = s_head; |
sycl/test/usm/smemllaligned.cpp
Outdated
s_cur->Num = i * 2; | ||
if (dev.get_info<info::device::usm_shared_allocations>()) { | ||
Node *s_head = nullptr; | ||
Node *s_cur = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node *s_cur = nullptr; |
sycl/test/usm/smemllaligned.cpp
Outdated
if (s_cur->Num != want) { | ||
return -1; | ||
s_cur = s_head; | ||
int mismatches = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mismatches
is declared, but not used.
sycl/test/usm/smemll.cpp
Outdated
if (s_cur->Num != want) { | ||
return -1; | ||
s_cur = s_head; | ||
int mismatches = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mismatches
is declared, but not used.
sycl/test/usm/allocatorll.cpp
Outdated
|
||
event e0 = q.memcpy(d_cur, &h_cur, sizeof(Node)); | ||
e0.wait(); | ||
Node *d_head = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node *d_head = nullptr; | |
Node *d_head = alloc.allocate(1); |
sycl/test/usm/allocatorll.cpp
Outdated
event e0 = q.memcpy(d_cur, &h_cur, sizeof(Node)); | ||
e0.wait(); | ||
Node *d_head = nullptr; | ||
Node *d_cur = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Node *d_cur = nullptr; | |
Node *d_cur = d_head; |
sycl/test/usm/allocatorll.cpp
Outdated
|
||
d_cur = h_cur.pNext; | ||
} | ||
d_head = alloc.allocate(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d_head = alloc.allocate(1); |
sycl/test/usm/allocatorll.cpp
Outdated
d_cur = h_cur.pNext; | ||
} | ||
d_head = alloc.allocate(1); | ||
d_cur = d_head; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d_cur = d_head; |
sycl/test/usm/allocator_vector.cpp
Outdated
@@ -25,32 +25,33 @@ int main() { | |||
auto dev = q.get_device(); | |||
auto ctxt = q.get_context(); | |||
|
|||
usm_allocator<int, usm::alloc::host> alloc(ctxt, dev); | |||
if (dev.get_info<info::device::usm_host_allocations>()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of wrapping test code with if (device supports USM) { ... }
, it's better to use early exit.
if (!dev.get_info<info::device::usm_host_allocations>())
return 0;
... // tests using USM
Signed-off-by: James Brodman <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
…ptors I tracked the appearance of `_nocheck` down to intel#937 and James doesn't remember anything specific requiring that vs "checked" version. Also, some similar USM-related aspects (those that don't just delegate to `get_info`) do use "checked" APIs, so it seems logical to unify this processing. The main reason I do this is that it would be a bit easier to cache the values of "checked" interfaces by pre-computing USM support in `device_impl`'s ctor. Note that we perform at least some of those every time we allocate USM memory, so doing the caching is desirable (even for the sake of cleaning up our traces).
…ptors (#18453) I tracked the appearance of `_nocheck` down to #937 and James doesn't remember anything specific requiring that vs "checked" version. Also, some similar USM-related aspects (those that don't just delegate to `get_info`) do use "checked" APIs, so it seems logical to unify this processing. The main reason I do this is that it would be a bit easier to cache the values of "checked" interfaces by pre-computing USM support in `device_impl`'s ctor. Note that we perform at least some of those every time we allocate USM memory, so doing the caching is desirable (even for the sake of cleaning up our traces).
…terface.
Removes need for USM Dispatcher. CLUSM is also now unused as CL USM is supported on CPU.
Signed-off-by: James Brodman [email protected]