Skip to content

[SYCL] Add implementation of host-interop-task and test. #1748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 38 commits into from
Jun 30, 2020

Conversation

s-kanaev
Copy link
Contributor

@s-kanaev s-kanaev commented May 25, 2020

This patch is number two in series of patches for interop part of host task.
This patch introduces an API to enqueue host-task with interop_handle argument
See the proposal [1].

Depends on #1937

[1] https://github.com/codeplaysoftware/standards-proposals/blob/master/host_task/host_task.md

Sergey Kanaev added 2 commits May 25, 2020 15:01
@s-kanaev s-kanaev mentioned this pull request May 25, 2020
3 tasks
Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev s-kanaev force-pushed the private/s-kanaev/ht-interop-task-iface branch from a73440f to e04ea75 Compare May 25, 2020 13:13
Sergey Kanaev added 3 commits May 25, 2020 16:28
@s-kanaev s-kanaev marked this pull request as ready for review June 10, 2020 19:36
@s-kanaev s-kanaev requested a review from a team as a code owner June 10, 2020 19:36
@s-kanaev s-kanaev requested review from v-klochkov and Ruyk June 10, 2020 19:36
@s-kanaev
Copy link
Contributor Author

s-kanaev commented Jun 10, 2020

@Ruyk, @StuartDAdams I bet you know how to improve testing here.

@Ruyk
Copy link
Contributor

Ruyk commented Jun 11, 2020

Some ideas:

  • A test that uses OpenCL interop to copy data from buffer A to buffer B , by getting cl_mem objects and calling the clEnqueueBufferCopy. Then run a SYCL kernel that modifies the data in place for B, e.g. increment one, then copy back to buffer A. Run it on a loop, to ensure the dependencies and the reference counting of the objects is not leaked. We could easily do the CUDA variant even with a macro later on.
  • Same as above, but performing each command group on a separate SYCL queue (on the same or different devices). This ensures the dependency tracking works well but also there is no accidental side effects on other queues.
  • A test that does a clEnqueueWait inside the interop scope, for an event captured outside the command group. The OpenCl event can be set after the command group finishes. Must not deadlock according to implementation and proposal, sketch below:
cl_event userEvent = clCreateUserEvent(...)
q.submit([&](handler& ) {
   h.codeplay_host_task([=](interop_handler& ih) {
     clWaitForEvents(1, &userEvent); 
   }
});
clSetUserEventStatus(userEvent, CL_COMPLETE);
q.wait();

Copy link
Contributor

@nyalloc nyalloc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff. Got a few comments on code style and a few things I'd like to clarify, but happy to see this ball rolling.

@s-kanaev s-kanaev marked this pull request as draft June 17, 2020 16:17
@s-kanaev
Copy link
Contributor Author

@Ruyk, @StuartDAdams ping

Sergey Kanaev added 2 commits June 25, 2020 23:43
Signed-off-by: Sergey Kanaev <[email protected]>
Signed-off-by: Sergey Kanaev <[email protected]>
Ruyk
Ruyk previously approved these changes Jun 26, 2020
Copy link
Contributor

@Ruyk Ruyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two really minor changes required for the CUDA backend to work, but they should not block this PR. It seems to work fine otherwise.

@Ruyk
Copy link
Contributor

Ruyk commented Jun 26, 2020

Are all the lit testing passing for you? host-task lit test sometimes deadlocks on my system,

[New LWP 30329]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
0x00007f8f775c5f30 in cl::sycl::detail::Command::enqueue(cl::sycl::detail::EnqueueResultT&, cl::sycl::detail::BlockingT) ()
   from /home/ruyman/open-source/build/lib/libsycl.so.2
(gdb) up
#1  0x00007f8f775db425 in cl::sycl::detail::Scheduler::GraphProcessor::waitForEvent(std::shared_ptr<cl::sycl::detail::event_impl>) ()
   from /home/ruyman/open-source/build/lib/libsycl.so.2
(gdb)
#2  0x00007f8f775d735d in cl::sycl::detail::Scheduler::waitForRecordToFinish(cl::sycl::detail::MemObjRecord*) ()
   from /home/ruyman/open-source/build/lib/libsycl.so.2
(gdb)
#3  0x00007f8f775d9158 in cl::sycl::detail::Scheduler::removeMemoryObject(cl::sycl::detail::SYCLMemObjI*) ()
   from /home/ruyman/open-source/build/lib/libsycl.so.2
(gdb)
#4  0x00007f8f775e8ba8 in cl::sycl::detail::SYCLMemObjT::updateHostMemory() () from /home/ruyman/open-source/build/lib/libsycl.so.2
(gdb)
#5  0x000000000040a343 in cl::sycl::detail::buffer_impl::~buffer_impl() ()
(gdb)
#6  0x000000000040a319 in void __gnu_cxx::new_allocator<cl::sycl::detail::buffer_impl>::destroy<cl::sycl::detail::buffer_impl>(cl::sycl::detail::buffer_impl*) ()
(gdb) quit

@s-kanaev
Copy link
Contributor Author

Are all the lit testing passing for you? host-task lit test sometimes deadlocks on my system

@Ruyk, please, try again with #1937 fix.

@Ruyk
Copy link
Contributor

Ruyk commented Jun 26, 2020

Yes, that seems to fix the problem.

v-klochkov
v-klochkov previously approved these changes Jun 26, 2020
Signed-off-by: Sergey Kanaev <[email protected]>
@s-kanaev s-kanaev dismissed stale reviews from v-klochkov and Ruyk via e5c6cf5 June 29, 2020 12:00
@s-kanaev s-kanaev requested review from v-klochkov and Ruyk June 29, 2020 12:04
@s-kanaev
Copy link
Contributor Author

@StuartDAdams , @Ruyk the only latest changes are some stylistic ones. Please review and approve if you have no objections.

Copy link
Contributor

@Ruyk Ruyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CUDA variant I have seems to work locally (bearing two minor changes), so looks good!

@bader bader merged commit f088e38 into intel:sycl Jun 30, 2020
@s-kanaev s-kanaev deleted the private/s-kanaev/ht-interop-task-iface branch September 2, 2020 09:42
Fznamznon pushed a commit to Fznamznon/llvm that referenced this pull request Dec 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants