Skip to content

[UR] fix: synchronize resubmission of the same command buffer #18566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 3, 2025

Conversation

EuphoricThinking
Copy link
Contributor

@EuphoricThinking EuphoricThinking commented May 20, 2025

Fixes #18610

ur_exp_command_buffer_desc_t desc{UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_DESC,
nullptr, false, false, false};
nullptr, false, true /* WAS: false */, false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change the buffer to in order?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to extend the time of the buffer execution, but as @pbalcer mentioned, I'll revert this to out-of-order

@@ -112,6 +113,7 @@ struct urCommandBufferExpExecutionTest : uur::urKernelExecutionTest {
}

ur_exp_command_buffer_handle_t cmd_buf_handle = nullptr;
ur_exp_command_buffer_handle_t cmd_buf_in_order = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is it used?

Copy link
Contributor

@pbalcer pbalcer May 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the interest of time, and knowing that right now there's no difference between the two, I suggest not adding in order cmd buf here for now, and just testing out of order for now (both should exhibit the issue we are trying to solve).

Copy link
Contributor

@pbalcer pbalcer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, just a few nits.

@@ -98,7 +98,7 @@ struct urCommandBufferExpExecutionTest : uur::urKernelExecutionTest {
UUR_RETURN_ON_FATAL_FAILURE(checkCommandBufferSupport(device));

ur_exp_command_buffer_desc_t desc{UR_STRUCTURE_TYPE_EXP_COMMAND_BUFFER_DESC,
nullptr, false, false, false};
nullptr, false, false /* WAS: false */, false};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
nullptr, false, false /* WAS: false */, false};
nullptr, false, false, false};

Copy link
Contributor

@EwanC EwanC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


// Tests that the same command-buffer submitted across different in-order
// queues has an implicit dependency on first submission
TEST_P(urEnqueueCommandBufferExpTest, SerializeAcrossQueues) {
// Execute command-buffer to first in-order queue (created by parent
// urQueueTest fixture)
// urQueueTestWithParam fixture)
ASSERT_SUCCESS(
urEnqueueCommandBufferExp(queue, cmd_buf_handle, 0, nullptr, nullptr));

// Execute command-buffer to second in-order queue, should have implicit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Execute command-buffer to second in-order queue, should have implicit
// Execute command-buffer to second queue, should have implicit

@aelovikov-intel aelovikov-intel changed the title fix: synchronize resubmission of the same command buffer [UR] fix: synchronize resubmission of the same command buffer May 30, 2025
add new test fixture
test serialization of the command buffer execution
in an in-order queue
synchronization implemented as the host thread waiting
for an event to be signalled after the completion of the command
buffer execution
@pbalcer
Copy link
Contributor

pbalcer commented Jun 3, 2025

@intel/llvm-gatekeepers please merge

@kbenzie kbenzie merged commit 29a2502 into intel:sycl Jun 3, 2025
53 of 58 checks passed
sommerlukas pushed a commit that referenced this pull request Jun 5, 2025
PR #18566 fixes the DG2 SYCL-Graph E2E
tests fails seen on DG2 for the V2 adapter. Re-enable tests and add a
new test with a reduced version of `buffer_ordering.cpp` that also
exposed the bug.

See:
* #18668
* #18579
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[UR][L0 v2] urEnqueueCommandBufferExpTest.SerializeAcrossQueues flaky on PVC
6 participants