-
Notifications
You must be signed in to change notification settings - Fork 788
[SYCL] Use a global flush buffer in stream #1678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Change stream implementation to use a global buffer for all the work-items to flush their outputs into. Previously a set of local flush buffers were used for this purpose. This change is made to avoid the issue of some devices running out of local memory when available space is less than statement size times the number of work-items. Signed-off-by: Shahab Layeghi <[email protected]>
Is there any impact on performance within this implementation? |
In previous implementation where we used flush buffer in local memory users saw 10x slowdown in some cases because local memory was exhausted. So I expect that mainly this change has positive performance effect. |
@@ -22,6 +23,15 @@ AccessorImplHost::~AccessorImplHost() { | |||
} | |||
} | |||
|
|||
void AccessorImplHost::resize(size_t GlobalSize) { | |||
if (GlobalSize != 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 1 is special value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No special significance anymore, but now it prevents some unnecessary operations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, then, when do you call resize
method with GlobalSize
equal to one?
@@ -59,7 +59,7 @@ int main() { | |||
check_size<program, 16>(); | |||
check_size<range<1>, 8>(); | |||
check_size<sampler, 16>(); | |||
check_size<stream, 208>(); | |||
check_size<stream, 144>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you'll have to update version of library now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something besides the change already done in sycl/CMakeLists.txt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, refer to this document: https://github.com/intel/llvm/blob/sycl/sycl/doc/ABIPolicyGuide.md
Add a comment per review. Signed-off-by: Shahab Layeghi <[email protected]>
Minor format fix Signed-off-by: Shahab Layeghi <[email protected]>
@s-kanaev, @v-klochkov Merging is blocked, I believe because your approve is required on behalf of runtime reviewers. Please approve or provide your feedback. |
Change stream implementation to use a global buffer for all the
work-items to flush their outputs into. Previously a set of local flush
buffers were used for this purpose.
This change is made to avoid the issue of some devices running out of
local memory when available space is less than statement size times the
number of work-items.
Signed-off-by: Shahab Layeghi [email protected]