[SYCL] update for run-on-host-intel #5414

Merged
56 changes: 37 additions & 19 deletions sycl/source/detail/scheduler/commands.cpp
@@ -1925,20 +1925,27 @@ static pi_result SetKernelParamsAndLaunch(
// The function is used as argument to piEnqueueNativeKernel which requires
// that the passed function takes one void* argument.
void DispatchNativeKernel(void *Blob) {
// First value is a pointer to Corresponding CGExecKernel object.
CGExecKernel *HostTask = *(CGExecKernel **)Blob;
bool ShouldDeleteCG = static_cast<void **>(Blob)[1] != nullptr;
void **CastedBlob = (void **)Blob;

std::vector<Requirement *> *Reqs =
static_cast<std::vector<Requirement *> *>(CastedBlob[0]);

std::unique_ptr<HostKernelBase> *HostKernel =
static_cast<std::unique_ptr<HostKernelBase> *>(CastedBlob[1]);

NDRDescT *NDRDesc = static_cast<NDRDescT *>(CastedBlob[2]);

// Other values are pointers to the buffers.
void **NextArg = static_cast<void **>(Blob) + 2;
for (detail::Requirement *Req : HostTask->MRequirements)
void **NextArg = CastedBlob + 3;
for (detail::Requirement *Req : *Reqs)
Req->MData = *(NextArg++);
HostTask->MHostKernel->call(HostTask->MNDRDesc, nullptr);

// The command group will (if not already was) be released in scheduler.
// Hence we're free to deallocate it here.
if (ShouldDeleteCG)
delete HostTask;
(*HostKernel)->call(*NDRDesc, nullptr);

// Ownership of these objects has been passed to us, so we need to clean up.
delete Reqs;
delete HostKernel;
delete NDRDesc;
}

cl_int enqueueImpKernel(
@@ -2118,15 +2125,26 @@ cl_int ExecCGCommand::enqueueImp() {

// piEnqueueNativeKernel takes an arguments blob which is passed to the user
// function.
// Reserve extra space for the pointer to CGExecKernel to restore context.
std::vector<void *> ArgsBlob(HostTask->MArgs.size() + 2);
ArgsBlob[0] = (void *)HostTask;
{
std::intptr_t ShouldDeleteCG =
static_cast<std::intptr_t>(MDeps.size() == 0 && MUsers.size() == 0);
ArgsBlob[1] = reinterpret_cast<void *>(ShouldDeleteCG);
}
void **NextArg = ArgsBlob.data() + 2;
// Need the following items to restore context in the host task.
// Make a copy on heap to "detach" from the command group as it can be
// released before the host task completes.
std::vector<void *> ArgsBlob(HostTask->MArgs.size() + 3);

std::vector<Requirement *> *CopyReqs =
new std::vector<Requirement *>(HostTask->MRequirements);

// Not actually a copy, but move. Should be OK as it's not expected that
// MHostKernel will be used elsewhere.
std::unique_ptr<HostKernelBase> *CopyHostKernel =
new std::unique_ptr<HostKernelBase>(std::move(HostTask->MHostKernel));

NDRDescT *CopyNDRDesc = new NDRDescT(HostTask->MNDRDesc);

ArgsBlob[0] = (void *)CopyReqs;
ArgsBlob[1] = (void *)CopyHostKernel;
ArgsBlob[2] = (void *)CopyNDRDesc;
Comment on lines +2143 to +2145
NIT: Would it be better to wrap these into a struct passed as a single first argument?
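
A minimal sketch of what that suggestion might look like, not code from this PR. `NativeKernelContext`, `DispatchNativeKernelSketch`, `RecordingKernel`, and `RunSketch` are hypothetical names, and `Requirement`, `NDRDescT`, and `HostKernelBase` are reduced stand-ins for the runtime types, so the real signatures may differ.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Reduced stand-ins for the runtime types named in the diff (assumptions).
struct Requirement { void *MData = nullptr; };
struct NDRDescT { int Dims = 1; };
struct HostKernelBase {
  virtual ~HostKernelBase() = default;
  virtual void call(const NDRDescT &NDRDesc, void *Profiling) = 0;
};

// Hypothetical struct bundling everything the host task needs, so the blob
// carries one context pointer instead of three separate heap objects.
struct NativeKernelContext {
  std::vector<Requirement *> Reqs;
  std::unique_ptr<HostKernelBase> HostKernel;
  NDRDescT NDRDesc;
};

// Same shape as DispatchNativeKernel: a single void* argument.
void DispatchNativeKernelSketch(void *Blob) {
  void **CastedBlob = static_cast<void **>(Blob);

  // Take ownership of the context; it is freed when Ctx goes out of scope.
  std::unique_ptr<NativeKernelContext> Ctx(
      static_cast<NativeKernelContext *>(CastedBlob[0]));

  // The remaining values are pointers to the buffers.
  void **NextArg = CastedBlob + 1;
  for (Requirement *Req : Ctx->Reqs)
    Req->MData = *(NextArg++);

  Ctx->HostKernel->call(Ctx->NDRDesc, nullptr);
}

// Toy kernel that records the buffer pointer its requirement received.
struct RecordingKernel : HostKernelBase {
  Requirement *Req;
  void **Seen;
  RecordingKernel(Requirement *R, void **S) : Req(R), Seen(S) {}
  void call(const NDRDescT &, void *) override { *Seen = Req->MData; }
};

// Builds a blob the way the enqueue side might, then runs the dispatcher.
// Returns true if the kernel observed the buffer pointer from the blob.
bool RunSketch() {
  int Buffer = 42;
  Requirement Req;
  void *Seen = nullptr;

  NativeKernelContext *Ctx = new NativeKernelContext;
  Ctx->Reqs.push_back(&Req);
  Ctx->HostKernel.reset(new RecordingKernel(&Req, &Seen));

  std::vector<void *> ArgsBlob = {Ctx, &Buffer};
  DispatchNativeKernelSketch(ArgsBlob.data());
  return Seen == &Buffer;
}
```

With a single context object there is one allocation and one owner to release, and the buffer pointers start at a fixed offset of 1 regardless of how many fields the context grows.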


void **NextArg = ArgsBlob.data() + 3;

if (MQueue->is_host()) {
for (ArgDesc &Arg : HostTask->MArgs) {