Skip to content

[SYCL] Support USM buffer location property in malloc_shared #6218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 7, 2022
Merged

[SYCL] Support USM buffer location property in malloc_shared #6218

merged 1 commit into from
Jun 7, 2022

Conversation

pcolberg
Copy link
Contributor

@pcolberg pcolberg commented May 31, 2022

This ports commit 12c988a from malloc_device to malloc_shared for use with the FPGA Runtime for OpenCL.

See malloc_device implementation in #5634

See SYCL extension specification in #6219

@tiwaria1 @bsyrowik @GarveyJoe @aditikum @ajaykumarkannan

@pcolberg pcolberg requested a review from a team as a code owner May 31, 2022 17:51
@pcolberg pcolberg self-assigned this May 31, 2022
@pcolberg pcolberg requested a review from cperkinsintel May 31, 2022 17:51
@pcolberg pcolberg marked this pull request as draft May 31, 2022 17:51
@tiwaria1
Copy link
Contributor

buffer_location property spec update is here: #6219

@bsyrowik
Copy link

Amazing, thank you Peter!

@pcolberg
Copy link
Contributor Author

pcolberg commented May 31, 2022

@intel/llvm-reviewers-runtime Are the failures in L0 GEN9 and HIP AMDGPU related to this PR? I am unsure how to interpret the test output.

@cperkinsintel
Copy link
Contributor

@pcolberg I restarted the test run. @steffenlarsen only had a moment to look at it very briefly and mentioned a theory about Hip and L0 not handling empty property lists properly, but that was a bit rushed, off the cuff. Does that remark contain any clue for yourself?

@pcolberg
Copy link
Contributor Author

pcolberg commented May 31, 2022

@pcolberg I restarted the test run. @steffenlarsen only had a moment to look at it very briefly and mentioned a theory about Hip and L0 not handling empty property lists properly, but that was a bit rushed, off the cuff. Does that remark contain any clue for yourself?

Thanks @cperkinsintel and @steffenlarsen! Yes, that explains the failures, I am indeed always passing an empty property list. I will rework the change to pass nullptr instead.

@steffenlarsen
Copy link
Contributor

Apologies for the inconvenience, @pcolberg. The problem with empty lists of properties should be fixed with #6225.

@tiwaria1
Copy link
Contributor

tiwaria1 commented Jun 1, 2022

Hi @steffenlarsen is this PR now good for merging?

@pcolberg
Copy link
Contributor Author

pcolberg commented Jun 1, 2022

@steffenlarsen Thanks for adressing this so quickly 🙂

Which variant of the change would you prefer, the single case with empty or nonempty null-terminated list, or the two cases with either null pointer or null-terminated nonempty list? I tend to branchless code for readability. On the other hand, the heap allocation in the absence of properties is not that pretty either; not that the overhead matters for this high-level call, just in principle.

@pcolberg
Copy link
Contributor Author

pcolberg commented Jun 1, 2022

@tiwaria1 Apart from the pending review, we should align this PR with #6220, which still fails in L0 for unknown reason #6220 (comment). If the resolution requires a design change of the implementation, this PR should be modified in the same way.

@tiwaria1
Copy link
Contributor

tiwaria1 commented Jun 3, 2022

ping @cperkinsintel @intel/llvm-reviewers-runtime , please review the change.

@pcolberg
Copy link
Contributor Author

pcolberg commented Jun 3, 2022

ESIMD Emu runner is broken, unrelated to this PR:

##[error]Path does not exist /localdisk2/github/_work/llvm/llvm/build/results_esimd_emu_default.json
##[error]Exit code 1 returned from process: file name '/localdisk2/github/bin.2.292.0/Runner.PluginHost', arguments 'action "GitHub.Runner.Plugins.Artifact.PublishArtifact, Runner.Plugins"'.

Update: This appears to be an infrastructure issue and has since been resolved.

@steffenlarsen
Copy link
Contributor

Which variant of the change would you prefer, the single case with empty or nonempty null-terminated list, or the two cases with either null pointer or null-terminated nonempty list? I tend to branchless code for readability. On the other hand, the heap allocation in the absence of properties is not that pretty either; not that the overhead matters for this high-level call, just in principle.

I agree, the branchless version was cleaner. If you want to minimize the heap allocation, you could make a static sized array with the maximum number of possible properties, then insert 0 after the last present property.

This ports commit 12c988a from
malloc_device to malloc_shared for use with the FPGA Runtime for OpenCL.

See malloc_device implementation in #5634

See extension specification in #5665
@pcolberg
Copy link
Contributor Author

pcolberg commented Jun 6, 2022

If you want to minimize the heap allocation, you could make a static sized array with the maximum number of possible properties, then insert 0 after the last present property.

Thanks @steffenlarsen, I have changed the properties list to a stack array.

@cperkinsintel This PR is now ready for review and merge.

Copy link
Contributor

@steffenlarsen steffenlarsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@pcolberg
Copy link
Contributor Author

pcolberg commented Jun 6, 2022

@intel/llvm-gatekeepers This PR is ready for merge.

pvchupin pushed a commit that referenced this pull request Jun 7, 2022
…locations (#6219)

Spec states that the runtime buffer_location property will not have an effect on shared/host allocations. This is incorrect as the property is meant to be used with host/shared allocations with the same behavior as device allocations.

#6218
#6220
@pvchupin pvchupin merged commit 6e89821 into intel:sycl Jun 7, 2022
@pcolberg pcolberg deleted the buffer_location branch June 7, 2022 02:56
steffenlarsen pushed a commit that referenced this pull request Jun 8, 2022
See malloc_shared implementation in #6218

See extension specification in #5665
pvchupin pushed a commit that referenced this pull request Jun 8, 2022
… to malloc_shared (#6268)

This aligns the implementation with malloc_host and avoids an unneeded
call in the case where the FPGA backend is used without buffer location.

This amends #6218

See also #6220
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants