[SYCL][NVPTX] Split max_work_group_size into 3 NVVM annotations #14420

frasercrmck · 2024-07-03T13:21:43Z

NVVM IR supports separate maxntidx, maxntidy, and maxntidz annotations. The backend prints them individually, one per dimension. This preserves programmer intent better than prematurely flattening them into a single value.

Note that the semantics are in fact identical: the CUDA implementation internally multiplies all dimensions together and guarantees only that the total is never exceeded, not that any individual dimension stays within its own limit. Thus 64,1,1 is equivalent to 4,4,4.
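To illustrate the equivalence described above, here is a minimal sketch. The helper name `flattenedLimit` is hypothetical and not part of the patch; it only models the "multiply all dimensions together" behaviour.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): models the flattened thread
// limit that results from multiplying the three per-dimension limits.
constexpr std::uint64_t flattenedLimit(std::uint32_t x, std::uint32_t y,
                                       std::uint32_t z) {
  // Widen before multiplying so large limits cannot overflow 32 bits.
  return static_cast<std::uint64_t>(x) * y * z;
}

// Under this model, 64,1,1 and 4,4,4 impose the same total limit.
static_assert(flattenedLimit(64, 1, 1) == flattenedLimit(4, 4, 4),
              "both shapes flatten to the same total");
```

Under this model only the product matters, which is why splitting the annotation into three values does not change what is enforced.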

We try to preserve a logical mapping of dimensions by flipping the indices between SYCL (z,y,x) and NVVM (x,y,z), in CUDA terminology, even though, as mentioned above, the mapping is largely irrelevant.
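The index flip can be sketched as follows. The struct and function names here are hypothetical illustrations rather than code from the patch, and the sketch assumes the SYCL operands arrive in (z,y,x) order as described above.

```cpp
#include <array>
#include <cstdint>

// Hypothetical type (not from the patch): limits in NVVM/CUDA order,
// corresponding to the maxntidx, maxntidy, and maxntidz annotations.
struct NvvmMaxNTid {
  std::uint32_t x;
  std::uint32_t y;
  std::uint32_t z;
};

// Assumption: SYCL operand 0 corresponds to z and operand 2 to x, so the
// array is reversed when producing the NVVM ordering.
constexpr NvvmMaxNTid toNvvmOrder(const std::array<std::uint32_t, 3> &syclDims) {
  return NvvmMaxNTid{syclDims[2], syclDims[1], syclDims[0]};
}
```

For example, a SYCL limit written as 8,4,2 would map to x = 2, y = 4, z = 8 under this assumption.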

This patch also simplifies the attribute's getter functions: all three dimensions are mandatory, whereas the getters appear to have been copied from the reqd_work_group_size attribute, where some dimensions are optional.

We could probably improve the code further by making the operands "unsigned" and not "Expr", and renaming them from X,Y,Z to Dim{0,1,2} as per the SYCL spec. This has been left for future work, however, as there's a non-trivial amount of code that expects to be able to treat the max_work_group_size and reqd_work_group_size attributes identically through templates and identical helper methods.

steffenlarsen

Great improvement! 🚀

smanna12

LGTM. Thank you

jchlanda

🏅

frasercrmck · 2024-07-04T15:24:26Z

@intel/llvm-gatekeepers this is good to merge, thank you!

frasercrmck requested a review from a team as a code owner July 3, 2024 13:21

frasercrmck requested review from jchlanda and steffenlarsen July 3, 2024 13:21

steffenlarsen approved these changes Jul 3, 2024

smanna12 approved these changes Jul 3, 2024

jchlanda approved these changes Jul 4, 2024

clang/include/clang/Basic/Attr.td (outdated, resolved)

Merge remote-tracking branch 'origin/sycl' into sycl-split-maxntid-an…

e6bf196

…notation

remove isa<>

8c111cb

martygrant merged commit ef62cad into intel:sycl Jul 4, 2024
14 checks passed

frasercrmck deleted the sycl-split-maxntid-annotation branch July 4, 2024 15:31
