Skip to content

[SYCL][L0] Force submit of dependent open command batches #6199

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 1, 2022

Conversation

smaslov-intel
Copy link
Contributor

@smaslov-intel smaslov-intel commented May 26, 2022

Previously we were only force-submitting batches seen as dependency in different PI queue.
Now, we also properly submit copy/compute open command-lists within the same PI queue too.

E2E test: https://github.com/alexanderfle/llvm-test-suite/blob/reuse_events_in_L0_tests/SYCL/Plugin/level_zero_inorder_interleaving_kernel_copy.cpp

Signed-off-by: Sergey V Maslov [email protected]

@smaslov-intel smaslov-intel requested a review from a team as a code owner May 26, 2022 05:54
@smaslov-intel smaslov-intel requested a review from againull May 26, 2022 05:55
Signed-off-by: Sergey V Maslov <[email protected]>
Comment on lines +1715 to +1723
if (Queue == CurQueue &&
OpenCommandList->second.isCopy(Queue) == UseCopyEngine) {
// Don't force execute the batch yet since the new command
// is going to the same open batch as the dependent event.
} else {
if (auto Res = Queue->executeOpenCommandList(
OpenCommandList->second.isCopy(Queue)))
return Res;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From this code it looks like we always force execute open command list if it is compute command list. Could you please explain why behavior needs to be different for copy and compute command lists? I think comment https://github.com/intel/llvm/pull/6199/files#diff-15dd1eb076d2164bd9e87d9057f05f652a716498e8cdf5975e564c65309a0985R1706-R1709 should mention about this difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not different for copy vs. compute. This code force-submits copy if the dependent command is compute, and vice versa. Perhaps, you were confused by this check OpenCommandList->second.isCopy(Queue) == UseCopyEngine looking like it is about copy-only, but I think it is not :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, yes, I was confused by this check.

Comment on lines 1700 to 1704
// The caller of createAndRetainPiZeEventList must already hold
// a lock of the CurQueue. Additionally lock the Queue if it
// is different from CurQueue.
auto Lock = ((Queue == CurQueue) ? std::unique_lock<pi_shared_mutex>()
: std::unique_lock(Queue->Mutex));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Problem is that this can cause a deadlock itself if we have one thread where the caller locks Queue1 and the callee locks Queue2 and another thread where the caller locks Queue2 and the callee locks Queue1.
I understand that this bug existed before your patch, so I am ok this to be fixed separately. I believe we will need to move createAndRetainPiZeEventList out of queue lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Added TODO comment to rework this further

Signed-off-by: Sergey V Maslov <[email protected]>
Comment on lines +1715 to +1723
if (Queue == CurQueue &&
OpenCommandList->second.isCopy(Queue) == UseCopyEngine) {
// Don't force execute the batch yet since the new command
// is going to the same open batch as the dependent event.
} else {
if (auto Res = Queue->executeOpenCommandList(
OpenCommandList->second.isCopy(Queue)))
return Res;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I see, yes, I was confused by this check.

@againull againull merged commit e8bff05 into intel:sycl Jun 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants