Skip to content

[UR] [V2] Fix synchronization between command_list_manager usages #17297

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 14, 2025

Conversation

Xewar313
Copy link
Contributor

@Xewar313 Xewar313 commented Mar 4, 2025

Changes:
Command_list_manager no longer synchronize its calls, instead the responsibility to ensure exclusivity belongs to the caller.
To add synchronization I implemented the mechanism similar to rust lock as suggested in #17061 (comment).

@Xewar313 Xewar313 requested review from a team as code owners March 4, 2025 15:47
@Xewar313 Xewar313 requested a review from konradkusiak97 March 4, 2025 15:47
@Xewar313 Xewar313 marked this pull request as draft March 5, 2025 08:27
@Xewar313 Xewar313 marked this pull request as ready for review March 5, 2025 11:03
Copy link
Contributor

@pbalcer pbalcer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine, but remove the shared_ptr. I don't see its value, and shared_ptrs are expensive to use.

@@ -140,20 +142,20 @@ ur_result_t ur_queue_immediate_in_order_t::queueGetNativeHandle(
ur_queue_native_desc_t *pDesc, ur_native_handle_t *phNativeQueue) {
std::ignore = pDesc;
*phNativeQueue = reinterpret_cast<ur_native_handle_t>(
this->commandListManager.getZeCommandList());
this->commandListManager.get_no_lock()->getZeCommandList());
return UR_RESULT_SUCCESS;
}

ur_result_t ur_queue_immediate_in_order_t::queueFinish() {
TRACK_SCOPE_LATENCY("ur_queue_immediate_in_order_t::queueFinish");

std::unique_lock<ur_shared_mutex> lock(this->Mutex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we still use this->mutex (here and in other places)? shouldn't we remove it and use command list locking exclusively?

@Xewar313
Copy link
Contributor Author

@intel/llvm-gatekeepers please merge. One test fails on unrelated changes and it has an issue #17392

@Xewar313
Copy link
Contributor Author

@intel/llvm-gatekeepers please merge, and if there is any issue with merge, I would be really grateful if someone could tell how to solve it

@Xewar313
Copy link
Contributor Author

@intel/llvm-gatekeepers bump

@sarnex
Copy link
Contributor

sarnex commented Mar 14, 2025

Unlikely ESIMD fail is related

@sarnex sarnex merged commit fb582a7 into intel:sycl Mar 14, 2025
30 of 31 checks passed
Comment on lines +497 to +506
ur_result_t urCommandBufferEnqueueExp(
ur_exp_command_buffer_handle_t CommandBuffer, ur_queue_handle_t UrQueue,
uint32_t NumEventsInWaitList, const ur_event_handle_t *EventWaitList,
ur_event_handle_t *Event) try {
return UrQueue->get().enqueueCommandBufferExp(
CommandBuffer, NumEventsInWaitList, EventWaitList, Event);
} catch (...) {
return exceptionToResult(std::current_exception());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this function reintroduced? I think it might be a rebase error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops, let me know if i should revert this

Copy link
Contributor

@EwanC EwanC Mar 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened a PR to fix this #17476 (not worth a revert, it's dead code but not breaking anything)

uditagarwal97 pushed a commit that referenced this pull request Mar 14, 2025
`urCommandBufferEnqueueExp` was changed in
#16984 to `urEnqueueCommandBufferExp`
but added back accidentally to the v2 L0 adapter in
#17297
kbenzie pushed a commit to oneapi-src/unified-runtime that referenced this pull request Mar 17, 2025
`urCommandBufferEnqueueExp` was changed in
intel/llvm#16984 to `urEnqueueCommandBufferExp`
but added back accidentally to the v2 L0 adapter in
intel/llvm#17297
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants