Skip to content

Changes to lock and unlock _pi_queue as a single shared object, excep… #2588

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 6, 2020
Merged

Changes to lock and unlock _pi_queue as a single shared object, excep… #2588

merged 1 commit into from
Oct 6, 2020

Conversation

kbsmith-intel
Copy link
Contributor

…ting

any const members.

EventList[I]->Queue->ZeCommandListFenceMapMutex.lock();
auto Queue = EventList[I]->Queue;

if (Queue->RefCount > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if really need this now that we lock the whole Queue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we do. We only lock the queue while a call to one of the pi interfaces is using the queue. The retain/release calls still need to be used to reference count the queue itself to keep a queue from being deleted while it was still "in use" by another thread. say the execution goes like this: t1 - retain, t1 piEnqueBegin q locked, t1 EnqueEnd q unlocked, t2 - retain, t2 enqueBegin q locked, t2 end q unlocked, t2 release, t1 release. That first call from t2 to release cannot get rid of the queue, or undo any of it's data structures.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Shouldn't the Queue lock be acquired before accessing its RefCount then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, RefCount is atomic, and is only used to make sure the _pi_queue doesn't get deleted/freed before all threads are finished with it. You cannot lock the queue before checking it's reference count. RefCount checking is the first thing that must happen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot lock the queue before checking it's reference count.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Thinking on it more, you are probably right. The code probably needs to lock before incrementing or decrementing RefCount, and before reading RefCount when checking for RefCount > 0. I'll update the code.

if (Event->Queue->RefCount > 0) {
Event->Queue->ZeCommandListFenceMapMutex.lock();
auto Queue = Event->Queue;
if (Queue->RefCount > 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if really need this now that we lock the whole Queue

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we do, See reply to similar comment.

smaslov-intel
smaslov-intel previously approved these changes Oct 5, 2020
Copy link
Contributor

@smaslov-intel smaslov-intel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve to get testing done, please do reply on the remaining comment though.

@smaslov-intel smaslov-intel dismissed their stale review October 5, 2020 07:51

approved to start the testing

@smaslov-intel
Copy link
Contributor

The latest changes look correct. Just one more thing: please add a control (env var) that will skip locking. This can be useful for debugging and to measure performance impact on single-threaded applications, where such locking isn't needed.

@romanovvlad romanovvlad merged commit 7e8934d into intel:sycl Oct 6, 2020
@kbsmith-intel kbsmith-intel deleted the queue_lock branch October 6, 2020 17:54
alexbatashev pushed a commit to alexbatashev/llvm that referenced this pull request Oct 7, 2020
* sycl:
  [SYCL] Fix test failure due to size discrepancy in 32 and 64 bit environment (intel#2594)
  [BuildBot] Linux GPU driver uplift to 20.39.17972 (intel#2600)
  [SYCL][NFC] Remove cyclic dependency in headers (intel#2601)
  [SYCL][Doc] Fix Sphinx docs index (intel#2599)
  [SYCL][PI][L0] Add an assert to piEnqueueKernelLaunch() when the GlobalWorkOffset!=0 (intel#2593)
  [SYCL][Driver] Turn on -spirv-allow-extra-diexpressions option (intel#2578)
  [SYCL] Serialize queue related PI API in the Level-Zero plugin (intel#2588)
  Added missing math APIs for devicelib. (intel#2558)
  [SYCL][DOC] Fix path to FPGA device selector (intel#2563)
  [SYCL][CUDA] Add basic sub-group functionality (intel#2587)
  [SYCL] Add specialization constant feature design doc. (intel#2572)
  [SYCL] Expand release lit test to CPU and ACC (intel#2589)
  [SYCL] Add clang support for FPGA kernel attribute scheduler_target_fmax_mhz (intel#2511)
  [SYCL][ESIMD] Fixed compiler crash in LowerESIMDVecArg pass (intel#2556)
jsji pushed a commit that referenced this pull request May 30, 2024
kbenzie added a commit to kbenzie/intel-llvm that referenced this pull request Feb 17, 2025
Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants