-
Notifications
You must be signed in to change notification settings - Fork 788
Changes to lock and unlock _pi_queue as a single shared object, excep… #2588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
EventList[I]->Queue->ZeCommandListFenceMapMutex.lock(); | ||
auto Queue = EventList[I]->Queue; | ||
|
||
if (Queue->RefCount > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if really need this now that we lock the whole Queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we do. We only lock the queue while a call to one of the pi interfaces is using the queue. The retain/release calls still need to be used to reference count the queue itself to keep a queue from being deleted while it was still "in use" by another thread. say the execution goes like this: t1 - retain, t1 piEnqueBegin q locked, t1 EnqueEnd q unlocked, t2 - retain, t2 enqueBegin q locked, t2 end q unlocked, t2 release, t1 release. That first call from t2 to release cannot get rid of the queue, or undo any of it's data structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Shouldn't the Queue lock be acquired before accessing its RefCount then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, RefCount is atomic, and is only used to make sure the _pi_queue doesn't get deleted/freed before all threads are finished with it. You cannot lock the queue before checking it's reference count. RefCount checking is the first thing that must happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You cannot lock the queue before checking it's reference count.
Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Thinking on it more, you are probably right. The code probably needs to lock before incrementing or decrementing RefCount, and before reading RefCount when checking for RefCount > 0. I'll update the code.
if (Event->Queue->RefCount > 0) { | ||
Event->Queue->ZeCommandListFenceMapMutex.lock(); | ||
auto Queue = Event->Queue; | ||
if (Queue->RefCount > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if really need this now that we lock the whole Queue
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think we do, See reply to similar comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approve to get testing done, please do reply on the remaining comment though.
…ting any const members.
The latest changes look correct. Just one more thing: please add a control (env var) that will skip locking. This can be useful for debugging and to measure performance impact on single-threaded applications, where such locking isn't needed. |
* sycl: [SYCL] Fix test failure due to size discrepancy in 32 and 64 bit environment (intel#2594) [BuildBot] Linux GPU driver uplift to 20.39.17972 (intel#2600) [SYCL][NFC] Remove cyclic dependency in headers (intel#2601) [SYCL][Doc] Fix Sphinx docs index (intel#2599) [SYCL][PI][L0] Add an assert to piEnqueueKernelLaunch() when the GlobalWorkOffset!=0 (intel#2593) [SYCL][Driver] Turn on -spirv-allow-extra-diexpressions option (intel#2578) [SYCL] Serialize queue related PI API in the Level-Zero plugin (intel#2588) Added missing math APIs for devicelib. (intel#2558) [SYCL][DOC] Fix path to FPGA device selector (intel#2563) [SYCL][CUDA] Add basic sub-group functionality (intel#2587) [SYCL] Add specialization constant feature design doc. (intel#2572) [SYCL] Expand release lit test to CPU and ACC (intel#2589) [SYCL] Add clang support for FPGA kernel attribute scheduler_target_fmax_mhz (intel#2511) [SYCL][ESIMD] Fixed compiler crash in LowerESIMDVecArg pass (intel#2556)
Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@0fd9882186c73e1
Remove the prerelease.yml job
Remove the prerelease.yml job
…ting
any const members.