-
Notifications
You must be signed in to change notification settings - Fork 772
[SYCL] Support USM buffer location property in malloc_shared #6218
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
buffer_location property spec update is here: #6219 |
Amazing, thank you Peter! |
@intel/llvm-reviewers-runtime Are the failures in L0 GEN9 and HIP AMDGPU related to this PR? I am unsure how to interpret the test output. |
@pcolberg I restarted the test run. @steffenlarsen only had a moment to look at it very briefly and mentioned a theory about Hip and L0 not handling empty property lists properly, but that was a bit rushed, off the cuff. Does that remark contain any clue for yourself? |
Thanks @cperkinsintel and @steffenlarsen! Yes, that explains the failures, I am indeed always passing an empty property list. I will rework the change to pass |
Hi @steffenlarsen is this PR now good for merging? |
@steffenlarsen Thanks for adressing this so quickly 🙂 Which variant of the change would you prefer, the single case with empty or nonempty null-terminated list, or the two cases with either null pointer or null-terminated nonempty list? I tend to branchless code for readability. On the other hand, the heap allocation in the absence of properties is not that pretty either; not that the overhead matters for this high-level call, just in principle. |
@tiwaria1 Apart from the pending review, we should align this PR with #6220, which still fails in L0 for unknown reason #6220 (comment). If the resolution requires a design change of the implementation, this PR should be modified in the same way. |
ping @cperkinsintel @intel/llvm-reviewers-runtime , please review the change. |
ESIMD Emu runner is broken, unrelated to this PR:
Update: This appears to be an infrastructure issue and has since been resolved. |
I agree, the branchless version was cleaner. If you want to minimize the heap allocation, you could make a static sized array with the maximum number of possible properties, then insert 0 after the last present property. |
Thanks @steffenlarsen, I have changed the properties list to a stack array. @cperkinsintel This PR is now ready for review and merge. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@intel/llvm-gatekeepers This PR is ready for merge. |
This ports commit 12c988a from
malloc_device
tomalloc_shared
for use with the FPGA Runtime for OpenCL.See
malloc_device
implementation in #5634See SYCL extension specification in #6219
@tiwaria1 @bsyrowik @GarveyJoe @aditikum @ajaykumarkannan