Implement SYCL kernels in noncentral_chisquare #1054
Is this an optimization? If so, how is this proven? Is there any measurement data?
        kernel_parallel_for_func2);
};
event_out = DPNP_QUEUE.submit(kernel_func2);
event_out.wait();
How many kernels are you submitting here?
Co-authored-by: densmirn <[email protected]>
…central_chisquare
The purpose of these changes is not optimization but to run the computation on the device and so avoid copying data to the host.
Is it really copying data to the host? How was this found? Even if this is not an optimization, and assuming copying does occur here, do these changes really not degrade the execution time?
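One way to produce the measurement data asked about here is a small timing harness. The sketch below is plain C++ and assumes nothing about the dpnp code base: `median_time_ms` is a hypothetical helper name, and the callable passed to it is assumed to block until the work is complete (for SYCL code, by waiting on the returned event).

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of dpnp): runs `fn` `repeats` times and
// returns the middle sample of the sorted wall-clock times, in milliseconds
// (the upper median when `repeats` is even).
// Assumption: `fn` blocks until its work has finished, e.g. by waiting on
// the event returned from a queue submission.
double median_time_ms(const std::function<void()>& fn, std::size_t repeats = 11)
{
    if (repeats == 0) {
        return 0.0; // nothing to measure
    }

    std::vector<double> samples;
    samples.reserve(repeats);

    for (std::size_t i = 0; i < repeats; ++i) {
        const auto start = std::chrono::steady_clock::now();
        fn();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }

    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];
}
```

Calling this helper once with the previous implementation and once with the new one, on the same input sizes, would give two medians to compare. The first iterations may include one-time compilation or initialization cost, so a few warm-up calls before measuring are usually worthwhile.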
…central_chisquare
…central_chisquare
Please note that a lot of kernels are added to the queue here.
974ca82
to
a0a098d
Compare
👍
No description provided.