Skip to content

[SYCL] Refactor address space attributes to align with llorg. #3634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
May 14, 2021

Conversation

bader
Copy link
Contributor

@bader bader commented Apr 27, 2021

This PR contains two patches:

  1. [SYCL] Implement SYCL address space attributes handling - cherry-pick from llorg of llvm/llvm-project@7818906.
  2. Adjust DPC++ to address space handling changes from llorg - changes on top of llorg commit to keep DPC++ working.

Today use re-use OpenCL attributes in SYCL mode, so second patch updates these uses with SYCL attributes (not all of them though).
Second patch includes content of https://reviews.llvm.org/D100396 + additional changes:

  1. Enable opencl_constant address space in SYCL mode.
  2. SYCL mode reuses a lot of OpenCL code, which rely on OpenCL language address spaces. Switching to new SYCL address spaces breaks re-use "as is". I updated uses of opencl_local address space, but there are a lot of uses for opencl_global (e.g. special types like images, pipes, etc. are annotated with OpenCL language address space). We will have to introduce SYCL address spaces in less invasive way possible.
  3. Identified that llvm/llvm-project@7818906 drops address space attributes for global variables. I'll upload fix for this issue to llorg shortly.
  4. I left a couple of FIXME comments, which should be addressed eventually.
  5. clang/lib/Sema/SPIRVBuiltins.td updated to use SYCL address spaces. @Naghasan, please, take a look. I'm not sure if SPIRVBuiltins.td is supposed to be used for SYCL language mode only.
  6. Reverted a whitespace few changes and one LIT test (clang/test/SemaOpenCLCXX/address-space-lambda.clcpp) to match llorg source code.
  7. Reverted 761b2ea, which was a workaround for custom SYCL mangling scheme for NVPTX target.

bader added 3 commits April 27, 2021 14:44
Default address space (applies when no explicit address space was
specified) maps to generic (4) address space.

Added SYCL named address spaces `sycl_global`, `sycl_local` and
`sycl_private` defined as sub-sets of the default address space.

Static variables without address space now reside in global address
space when compile for SPIR target, unless they have an explicit address
space qualifier in source code.

Differential Revision: https://reviews.llvm.org/D89909
@bader bader requested a review from AaronBallman April 27, 2021 16:19
@AaronBallman
Copy link
Contributor

This PR is marked as a draft -- do you want me to hold off on reviewing it until you remove that, or would you like review now?

@bader
Copy link
Contributor Author

bader commented Apr 29, 2021

/summary:run

@bader bader marked this pull request as ready for review April 29, 2021 13:49
@bader
Copy link
Contributor Author

bader commented Apr 29, 2021

This PR is marked as a draft -- do you want me to hold off on reviewing it until you remove that, or would you like review now?

I think it should be ready for review now. Hopefully CI checks won't detect new issues with this patch.

Copy link
Contributor

@premanandrao premanandrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bader, I have tried to best review the changes and they mostly look good, but I do not understand it fully. I just have a few nits and some minor questions.

Comment on lines 136 to 138
// FIXME: SYCL specification considers unannotated pointers and references
// to be pointing to the generic address space. See section 5.9.3 of
// SYCL 2020 specification.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the SYCL spec maintainers aware of this implementation issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is super strange, AMDGPU target is doing this exact thing (default is generic or flat address space) with a view on killing the map with private as default (see AMDGPU's adjust comment). NVPTX doesn't because the generic and private (local in the CUDA terminology) maps to the same address space, however the semantic used for Default matches the semantic of a flat address space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be happy to avoid these SYCL customizations, but I don't think we can do better w/o breaking SPIR ABI.
For instance, SPIR-V translator relies on mapping address space 0 to private address space (in OpenCL terms).
Let me know if you have other ideas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure to whom you are answering :) but

I would be happy to avoid these SYCL customizations, but I don't think we can do better w/o breaking SPIR ABI.

That was the underlying reason of my comment: I don't think you can avoid this but this is not unprecedented anyway. Because of that, this comment is strange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was suggested by Anastasia here: https://reviews.llvm.org/D89909#inline-946388, https://reviews.llvm.org/D89909#inline-948325.
I'm open to improvements. Do you suggest removing "FIXME:" word from the comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding properly, it sounds like the FIXME is wrong -- the code doesn't need to be fixed, it's basically behaving as the implementation requires it, and what needs to change is the SYCL spec based on implementation feedback. So perhaps change if it from a FIXME to a NOTE and file an issue with the SYCL spec authors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed FIXME to NOTE.

I'm not sure if there is an issue with the SYCL spec. The problem we discussed https://reviews.llvm.org/D89909 is that we now have two different mappings from AST/language address space to LLVM address space. The only difference between these is how Default address space is mapped. I see two different points of view on how Default should be treated for GPU programming languages. Anastasia thinks Default should be mapped to OpenCL private address space, but this logic doesn't work in case of SYCL and other language mode, which do not infer address space attribute at AST level. SYCL spec authors are well aware of this issue and I think SYCL 2020 spec has clarifications for this.

@Naghasan, feel free to correct me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the code doesn't need to be fixed, it's basically behaving as the implementation requires it

That's my interpretation as well. The change to NOTE make sense.

Anastasia thinks Default should be mapped to OpenCL private address space

I fail to see any rational for that. SPIR is a target, it is not bound to an input language.

I'm not sure if there is an issue with the SYCL spec.

Oh there is a lot of issues in the spec, but I don't think this is one of them. And I'm not fully convinced there is a clang design issue either: the underlying issue is we are trying to map the SYCL/CUDA/OpenCL memory model onto C/C++ which will have limitations (pointer escape for instances).

Comment on lines 136 to 138
// FIXME: SYCL specification considers unannotated pointers and references
// to be pointing to the generic address space. See section 5.9.3 of
// SYCL 2020 specification.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is super strange, AMDGPU target is doing this exact thing (default is generic or flat address space) with a view on killing the map with private as default (see AMDGPU's adjust comment). NVPTX doesn't because the generic and private (local in the CUDA terminology) maps to the same address space, however the semantic used for Default matches the semantic of a flat address space.

@bader bader linked an issue May 11, 2021 that may be closed by this pull request
premanandrao
premanandrao previously approved these changes May 13, 2021
Copy link
Contributor

@premanandrao premanandrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my review comments. I am okay with this change.

@bader
Copy link
Contributor Author

bader commented May 13, 2021

Thanks!
@AaronBallman, @Naghasan, does it look good to you?

AaronBallman
AaronBallman previously approved these changes May 13, 2021
Copy link
Contributor

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM aside from the commenting suggestion.

Comment on lines 136 to 138
// FIXME: SYCL specification considers unannotated pointers and references
// to be pointing to the generic address space. See section 5.9.3 of
// SYCL 2020 specification.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm understanding properly, it sounds like the FIXME is wrong -- the code doesn't need to be fixed, it's basically behaving as the implementation requires it, and what needs to change is the SYCL spec based on implementation feedback. So perhaps change if it from a FIXME to a NOTE and file an issue with the SYCL spec authors?

@bader bader dismissed stale reviews from AaronBallman and premanandrao via f57217b May 13, 2021 16:28
@bader
Copy link
Contributor Author

bader commented May 14, 2021

@kbobrovs, @DenisBakhvalov, I see failures in ESIMD/private_memory/pm_access_2.cpp and ESIMD/private_memory/pm_access_3.cpp after comment update:

warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi1EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi1EE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
LLVM ERROR: Cannot find pointer replacement

Are these are known issues?

@bader
Copy link
Contributor Author

bader commented May 14, 2021

@kbobrovs, @DenisBakhvalov, I see failures in ESIMD/private_memory/pm_access_2.cpp and ESIMD/private_memory/pm_access_3.cpp after comment update:

warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi1EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi1EE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi2EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi2EE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTSN2cl4sycl6detail19__pf_kernel_wrapperI8KernelIDILi3EEEE allocation size is too big: using TPM
warning: GenXPromoteArray: _ZTS8KernelIDILi3EE allocation size is too big: using TPM
LLVM ERROR: Cannot find pointer replacement

Are these are known issues?

I see these tests also fail in other PRs - #3614, so these doesn't seem to be caused by changes from this PR.

@bader bader merged commit 870b840 into intel:sycl May 14, 2021
@bader bader deleted the sycl-address-spaces branch May 14, 2021 15:10
@Naghasan
Copy link
Contributor

Thanks!
@AaronBallman, @Naghasan, does it look good to you?

LGTM :)

@bader
Copy link
Contributor Author

bader commented May 14, 2021

Thanks!
@AaronBallman, @Naghasan, does it look good to you?

LGTM :)

Thanks a lot for review! I'll try to address your comments and remove the difference between llvm/llvm-project@7818906 and what we landed to sycl branch in follow-up patches.

npmiller added a commit to npmiller/llvm that referenced this pull request Aug 6, 2021
This patch fixes the `core/GroupAsyncCopy.cl` and `ocl/prefetch.cl` lit
tests in `libclc`.

These started failing with the introduction of the SYCL address spaces
in intel#3634.

The issue is that the builtins are declared with the SYCL address spaces
in the tablegen so when used in an OpenCL setting it ends up with
address space mismatches between things like `opencl_global` and
`sycl_global`.

These two were the only tests affected because they're also the only
tested builtins that do not have generic address space variants in
tablegen, the other tests would just fall back on the generic address
space variants.

With this patch the builtins should automatically get the appropriate
version of the address space depending on the context.
bader pushed a commit that referenced this pull request Aug 19, 2021
This patch fixes the `core/GroupAsyncCopy.cl` and `ocl/prefetch.cl` lit
tests in `libclc`.

These started failing with the introduction of the SYCL address spaces
in #3634.

The issue is that the builtins are declared with the SYCL address spaces
in the tablegen so when used in an OpenCL setting it ends up with
address space mismatches between things like `opencl_global` and
`sycl_global`.

These two were the only tests affected because they're also the only
tested builtins that do not have generic address space variants in
tablegen, the other tests would just fall back on the generic address
space variants.

With this patch the builtins should automatically get the appropriate
version of the address space depending on the context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash on unreachable that is reachable
4 participants