
[SYCL] Fast launch of kernels that have no dependencies #4188


alexanderfle
Contributor

@alexanderfle alexanderfle commented Jul 26, 2021

This patch reduces launch time for the specific case in which a kernel has no dependencies.
If the kernel doesn't use accessors, the vector MRequirements is empty.
If the kernel doesn't depend on other events, the vector MEvents is empty.
In most cases this is a kernel that uses only USM memory; the patch targets such kernels.

Because MRequirements and MEvents are both empty for the ExecCGCommand of such a kernel, the command doesn't affect the command graph and is not added to it as a node.
Since the command doesn't depend on the graph in any way, it can safely be executed independently, bypassing the enqueue mechanism, which is meaningless in this case. Thus, the command can be executed immediately.

This saves significant time that would otherwise be spent on:

  • many extra checks while trying to add the command to the graph, which takes 1 write lock on the graph mutex.
  • many extra checks during the enqueue process, which takes 1 read lock and 1 write lock.

The printGraphAsDot call in this path is not needed, because the graph isn't changed; it is kept only to preserve the old behavior and can be removed if desired.
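For readers unfamiliar with the scheduler, the fast path described above can be sketched roughly as follows. This is a simplified model and not the runtime's actual code: only the names MRequirements and MEvents come from the description, while `Requirement`, `Event`, `CommandGroup`, `ToyScheduler`, `EnqueueMutex` and the lock counters are hypothetical stand-ins, and the exact locks taken on the regular path are an assumption based on the list above.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical stand-ins for the runtime's requirement and event types.
struct Requirement {};
struct Event {};

// Simplified command group: only the two dependency vectors are modeled.
struct CommandGroup {
  std::vector<Requirement> MRequirements; // non-empty when accessors are used
  std::vector<Event> MEvents;             // non-empty when events are depended on
  std::function<void()> Kernel;           // stands in for the kernel launch
};

// Toy scheduler (hypothetical) that counts lock acquisitions so the cost
// difference between the two paths is visible.
class ToyScheduler {
public:
  void submit(const CommandGroup &CG) {
    if (CG.MRequirements.empty() && CG.MEvents.empty()) {
      // Fast path: the command would not become a graph node, so it is
      // launched without touching the graph or any of its mutexes.
      CG.Kernel();
      return;
    }
    {
      // Regular path, step 1: add the command to the graph (1 write lock).
      std::unique_lock<std::shared_mutex> GraphWrite(GraphMutex);
      ++Locks;
      ++Nodes;
    }
    // Regular path, step 2: enqueue (assumed: 1 read lock + 1 write lock).
    std::shared_lock<std::shared_mutex> GraphRead(GraphMutex);
    ++Locks;
    std::unique_lock<std::mutex> EnqueueWrite(EnqueueMutex);
    ++Locks;
    CG.Kernel();
  }

  int lockCount() const { return Locks.load(); }
  int graphNodes() const { return Nodes.load(); }

private:
  std::shared_mutex GraphMutex;
  std::mutex EnqueueMutex;
  std::atomic<int> Locks{0};
  std::atomic<int> Nodes{0};
};
```

In this model, submitting a command group whose two vectors are empty runs the kernel with zero lock acquisitions and leaves the graph untouched, whereas a command group with at least one event or requirement goes through all three locks and adds one node.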

Signed-off-by: Alexander Flegontov [email protected]

@alexanderfle
Contributor Author

/summary:run

@alexanderfle alexanderfle marked this pull request as ready for review July 30, 2021 15:30
@alexanderfle alexanderfle requested a review from a team as a code owner July 30, 2021 15:30
@alexanderfle alexanderfle requested a review from againull July 30, 2021 15:30
Contributor

@againull againull left a comment


Could you please add a commit message explaining the change?

@alexanderfle
Copy link
Contributor Author

Sure, I've updated the description.

@alexanderfle alexanderfle requested a review from againull August 2, 2021 11:21
Contributor

@againull againull left a comment


LGTM

@againull againull merged commit 441dc3b into intel:sycl Aug 4, 2021
romanovvlad pushed a commit that referenced this pull request Nov 10, 2021
These changes further optimize the launch of kernels
that have no dependencies, continuing the work started in
#4188. If a kernel doesn't change the execution
command graph, the time spent allocating and
deallocating objects of the ExecCGCommand and
CGExecKernel classes can be saved by simply not
creating them and directly calling the kernel submission
function (i.e., enqueueImpKernel).

Signed-off-by: Alexander Flegontov [email protected]
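
The idea in the follow-up commit can be illustrated with a small model. This is only a sketch under assumptions: `KernelArgs`, `LaunchStats`, `CGExecKernelSketch`, `ExecCGCommandSketch`, `enqueueKernelSketch` and `submitKernel` are hypothetical names made up for illustration, and the real enqueueImpKernel takes different parameters that are not reproduced here.

```cpp
#include <functional>
#include <memory>
#include <utility>

// Counters used only to make the saved allocations observable.
struct LaunchStats {
  int CommandObjects = 0; // heap-allocated command objects created
  int Launches = 0;       // kernels actually submitted
};

// Hypothetical bundle of everything needed to launch a kernel.
struct KernelArgs {
  std::function<void()> Body;
};

// Stand-in for the kernel submission function.
void enqueueKernelSketch(const KernelArgs &Args, LaunchStats &Stats) {
  Args.Body();
  ++Stats.Launches;
}

// Hypothetical counterparts of the CGExecKernel and ExecCGCommand classes.
struct CGExecKernelSketch {
  KernelArgs Args;
};
struct ExecCGCommandSketch {
  std::unique_ptr<CGExecKernelSketch> CG;
  void enqueue(LaunchStats &Stats) const { enqueueKernelSketch(CG->Args, Stats); }
};

void submitKernel(const KernelArgs &Args, bool ChangesGraph, LaunchStats &Stats) {
  if (!ChangesGraph) {
    // No graph change: skip both objects and call the submission
    // function directly.
    enqueueKernelSketch(Args, Stats);
    return;
  }
  // Otherwise build the command objects as before; both are freed when
  // Cmd goes out of scope.
  auto CG = std::make_unique<CGExecKernelSketch>(CGExecKernelSketch{Args});
  ++Stats.CommandObjects;
  auto Cmd = std::make_unique<ExecCGCommandSketch>(
      ExecCGCommandSketch{std::move(CG)});
  ++Stats.CommandObjects;
  Cmd->enqueue(Stats);
}
```

In this model a kernel that doesn't change the graph is launched with no command objects allocated, while the regular path allocates and then frees two objects per launch.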