This patch fixes #4171.
The issue highlighted in that ticket is that CUDA contexts are bound to
threads and PI calls are executed both in the main thread and in threads
of the thread pool.
To ensure the correct context is active, the CUDA plugin uses the
`ScopedContext` RAII struct to set the active context to the PI context
and restore the previous active context at the end of the PI call.
However, for optimization purposes, `ScopedContext` skips the context
recovery if there was no active context on the thread originally, which
means it leaves the PI context active on the thread.
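
The behaviour described above can be sketched with a simplified model.
This is not the plugin source: `Ctx`, `tl_current`,
`RestoringScopedContext` and `active_after_call` are hypothetical names,
and a `thread_local` integer stands in for the per-thread current
context that the CUDA driver tracks.

```cpp
#include <cassert>

// Simplified model, not the plugin source: an int handle and a
// thread_local variable stand in for CUcontext and the driver's
// per-thread current context.
using Ctx = int;
constexpr Ctx kNoContext = 0;
thread_local Ctx tl_current = kNoContext;

struct RestoringScopedContext {
  explicit RestoringScopedContext(Ctx desired) : original_(tl_current) {
    assert(desired != kNoContext);  // a PI context must be valid
    if (original_ != desired) {
      tl_current = desired;
      // Optimization described above: only schedule a restore when
      // some other context was active before this call.
      need_restore_ = (original_ != kNoContext);
    }
  }
  ~RestoringScopedContext() {
    if (need_restore_) tl_current = original_;
  }
  RestoringScopedContext(const RestoringScopedContext&) = delete;
  RestoringScopedContext& operator=(const RestoringScopedContext&) = delete;

 private:
  Ctx original_;
  bool need_restore_ = false;
};

// Runs one guarded "PI call" starting from `before` and returns the
// context left active on this thread afterwards.
Ctx active_after_call(Ctx before, Ctx pi_context) {
  tl_current = before;
  { RestoringScopedContext guard(pi_context); }
  return tl_current;
}
```

In this model, a call made on a thread with no active context leaves
the PI context behind, while a call made on a thread that already had a
different context puts that earlier context back.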
In addition, deleting a CUDA context only deactivates it on the thread
where it is deleted; it stays active on other threads after being
deleted.
This means that if you start from an application with no active CUDA
context, create a SYCL queue, run an operation, then delete the SYCL
queue, the context on the current thread will be created, set active,
deactivated and deleted properly.
However, it won't be deactivated in the threads of the thread pool,
which means that if we create another queue and run SYCL operations on
the thread pool again, that second queue will set up its own context in
the threads but then try to restore the deleted context from the previous
queue.
This patch aims to fix that issue by simply never restoring the previous
active context, which means that PI calls from the second queue running
in the thread pool just overwrite the deleted context and do not try
to restore it.
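
A minimal sketch of the "set, never restore" approach, replaying the
two-queue scenario as seen from a single pool thread. The names `Ctx`,
`tl_current`, `SetOnlyScopedContext` and `replay_two_queues` are
hypothetical, and a `thread_local` integer stands in for the driver's
per-thread current context; the actual change in the plugin may differ
in detail.

```cpp
#include <cassert>

// Simplified model, not the plugin source.
using Ctx = int;
constexpr Ctx kNoContext = 0;
thread_local Ctx tl_current = kNoContext;

// Makes the desired context active if it is not already, and leaves it
// active when the scope ends: there is no destructor and no restore.
struct SetOnlyScopedContext {
  explicit SetOnlyScopedContext(Ctx desired) {
    assert(desired != kNoContext);  // a PI context must be valid
    if (tl_current != desired) tl_current = desired;
  }
};

// Replays the scenario from one pool thread's point of view and
// returns the context left active on that thread at the end.
Ctx replay_two_queues() {
  const Ctx first = 1;   // context of the first queue
  const Ctx second = 2;  // context of the second queue
  tl_current = kNoContext;

  { SetOnlyScopedContext guard(first); }  // PI call for the first queue
  // The first queue is deleted on the main thread here. On this pool
  // thread `first` is now a deleted context that is still current.

  { SetOnlyScopedContext guard(second); }  // PI call for the second queue
  // The deleted context was overwritten and nothing tries to restore it.
  return tl_current;
}
```

In this model the pool thread ends up with the second queue's context
active, and the deleted handle is never passed back to the driver.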
This should work well in SYCL-only code, as all the PI calls are guarded
by `ScopedContext` and will change the active context accordingly.
In fact, it may even improve performance in certain multi-context
scenarios: the current implementation only really prevents context
switches for the first context used, whereas this patch prevents
context switches for the latest context used.
In CUDA interop scenarios, however, it does mean that after running any
SYCL code, CUDA interop code cannot make assumptions about the active
context and needs to reset it to whatever context it needs. But as far
as I'm aware, this is already the current practice in `oneMKL` and
`oneDNN`, where they also use a `ScopedContext` mechanism.
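
As an illustration of what this means for interop code, the sketch
below models an interop entry point that establishes its own context
instead of assuming one. It is not taken from `oneMKL` or `oneDNN`;
`Ctx`, `tl_current`, `sycl_pi_call` and `interop_call` are hypothetical
names, and a `thread_local` integer stands in for the driver's
per-thread current context.

```cpp
#include <cassert>

// Simplified model, not real interop code.
using Ctx = int;
constexpr Ctx kNoContext = 0;
thread_local Ctx tl_current = kNoContext;

// Models a SYCL PI call after this patch: it makes its own context
// active and does not put the previous one back.
void sycl_pi_call(Ctx pi_context) {
  assert(pi_context != kNoContext);
  tl_current = pi_context;
}

// Hypothetical interop entry point: it sets the context it needs on
// every call rather than relying on whatever was active earlier.
// Returns the context the interop work would run under.
Ctx interop_call(Ctx needed) {
  assert(needed != kNoContext);
  if (tl_current != needed) tl_current = needed;
  return tl_current;
}
```

With the application's context set to 7 and a SYCL call made on context
3, the thread is left on 3; `interop_call(7)` then runs under 7 again
because it resets the context itself.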
In summary, this patch should:
* Fix attempts to restore deleted contexts in internal SYCL threads
* Possibly improve performance in certain multi-context scenarios
* Break assumptions on the active context for CUDA interop code