-
Notifications
You must be signed in to change notification settings - Fork 787
[SYCL][L0][Plugin] Add support for batching copy commands #5155
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][L0][Plugin] Add support for batching copy commands #5155
Conversation
Signed-off-by: Arvind Sudarsanam <[email protected]>
/summary:run |
Signed-off-by: Arvind Sudarsanam <[email protected]>
pi_uint32 QueueBatchSize = {0}; | ||
} command_batch; | ||
|
||
command_batch ComputeCommandBatch, CopyCommandBatch; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
// be batched together. | ||
// For copy commands, IsCopy is set to 'true'. | ||
// For non-copy commands, IsCopy is set to 'false'. | ||
bool isBatchingAllowed(bool IsCopy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool isBatchingAllowed(bool IsCopy); | |
bool isBatchingAllowed(bool IsCopy) const; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done as part of bigger commit.
pi_result executeOpenCommandList(bool IsCopy); | ||
|
||
// Wrapper function to execute both OpenCommandLists (Copy and Compute). | ||
pi_result executeOpenCommandLists() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make it more distinct from the executeOpenCommandList
:
pi_result executeOpenCommandLists() { | |
pi_result executeAllOpenCommandLists() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also please add comment about how one should choose between the 2 versions. I.e. in what circumstances both batches need to be closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
@@ -2817,6 +2831,15 @@ pi_result piContextRelease(pi_context Context) { | |||
return ContextReleaseHelper(Context); | |||
} | |||
|
|||
// This function will initialize batch sizes for the compute anc copy command | |||
// lists | |||
static void initializeQueueBatchSizes(pi_queue Queue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move _pi_queue ctor implementation to pi_level_zero.cpp, and remove this function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks.
if (Event->Queue->hasOpenCommandList(false /* IsCopy */) && | ||
Event->Queue->ComputeCommandBatch.OpenCommandList->first == | ||
Event->ZeCommandList) { | ||
if (auto Res = Event->Queue->executeOpenCommandList(false /* IsCopy */)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe
using IsCopy = bool;
and then you can write arguments like
IsCopy{false}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
bool UseCopyEngine = | ||
(!(Queue->isInOrderQueue()) || UseCopyEngineForInOrderQueue) && | ||
PreferCopyEngine && Queue->Device->hasCopyEngine(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make it into utility pi_queue::useCopyEngine(bool PreferCopyEngine = true)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
@@ -5551,7 +5590,8 @@ static pi_result enqueueMemCopyHelper(pi_command_type CommandType, | |||
pi_cast<std::uintptr_t>(ZeEvent)); | |||
printZeEventList(WaitList); | |||
|
|||
if (auto Res = Queue->executeCommandList(CommandList, BlockingWrite)) | |||
if (auto Res = | |||
Queue->executeCommandList(CommandList, BlockingWrite, OkToBatch)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the commands that we still don't batch? Maybe we should just make the last argument (OkToBatch) true by default now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have considerable number of call-sites where this argument is 'false' (wait commands, membuffermap, memadvise, etc). So, we don't need to change the default now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, few minor suggestions
Signed-off-by: Arvind Sudarsanam <[email protected]>
…changes Signed-off-by: Arvind Sudarsanam <[email protected]>
…changes Signed-off-by: Arvind Sudarsanam <[email protected]>
Signed-off-by: Arvind Sudarsanam <[email protected]>
Thanks Sergey for all your comments. I have tried my best to handle them. Please take a look. |
}(); | ||
|
||
// Static variable that holds batch config info for copy command batching. | ||
static const zeCommandListBatchConfig ZeCommandListBatchCopyConfig = [] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please re-use the initialization lambda from the compute config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean turn lambda into a named function and us it to init both of configs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it....I made the appropriate changes. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just one comment to re-use code for env var inititalization
…changes Signed-off-by: Arvind Sudarsanam <[email protected]>
Signed-off-by: Arvind Sudarsanam <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sycl/doc/EnvironmentVariables.md looks good to me.
* upstream/sycl: (1382 commits) [SYCL][XPTI] Report memory allocation info from SYCL runtime (intel#5172) [CI] Switch labels for OCL x64 job (intel#5185) [SYCL] Add basic support for the generic_space address space (intel#5148) [CI] Update CODEOWNERS for SYCL printf support passes (intel#5199) [SYCL][Matrix] Enable wi_slice for joint_matrix (intel#4979) [SYCL][Group algorithms] Move group sort extension to experimental (intel#5169) [SYCL] Fix kernel bundles don't really carry kernel IDs (intel#5121) [SYCL] Initial printf support for non-constant AS format strings (intel#5069) [SYCL][NFC] Fix static code analysis concerns (intel#5189) [SYCL][Doc] Fix typos to fix doc build (intel#5190) [Driver][SYCL] Turn on -fsycl-dead-args-optimization by default (intel#3004) [SYCL][L0][Plugin] Add support for batching copy commands (intel#5155) [CI] Add cache checkout script to docker containers (intel#5184) [SYCL][Doc] Add HIP backend to the filter selector (intel#5176) [Doc] Add documentation for Docker images (intel#4778) [LIBCLC] Add functionality for in-kernel asserts for CUDA backend (intel#5174) Force opt to use new pass manager in exponential-deferred-inlining test after a8c2ba1 [SYCL] Add vec and marray support to known_identity type trait (intel#5163) Correctly resolve merge conflicts Update SPV_INTEL_hw_thread_queries to rev 2 ...
Signed-off-by: Arvind Sudarsanam [email protected]