[SYCL] UR_KERNEL_SUB_GROUP_INFO_SUB_GROUP_SIZE_INTEL on Cuda and HIP #17137
Conversation
Fixes: #14357
UR LGTM
```cpp
  return ReturnValue(0);
  // The only supported value of required sub-group size for CUDA devices is
  // 32.
  return ReturnValue(32);
```
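The two paths being discussed (hard-coded CUDA value versus metadata lookup for HIP) could be sketched roughly like this. This is a hypothetical helper, not the actual adapter code; the function name and map type are assumptions for illustration:

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical sketch: HIP reads the per-kernel required sub-group size from
// the program's intel_reqd_sub_group_size metadata, while CUDA hard-codes 32,
// the only value the compiler ever emits for that backend.
using SubGroupSizeMD = std::unordered_map<std::string, std::uint32_t>;

std::uint32_t querySubGroupSizeIntel(bool IsCuda, const SubGroupSizeMD &MD,
                                     const std::string &KernelName) {
  if (IsCuda)
    return 32; // CUDA: enforced to 32 at compile time.
  auto It = MD.find(KernelName); // HIP: 32 or 64, taken from metadata.
  return It != MD.end() ? It->second : 0; // 0 when no attribute was used.
}
```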
If I'm understanding correctly, the CUDA binaries will also have the same metadata as HIP, just we're choosing to hard-code 32?
Yes, you are right, decorating a kernel with `sycl::reqd_sub_group_size(SG_SIZE)` results in it having `!intel_reqd_sub_group_size !14` attached. For Cuda, the value of that node can only ever be `32`.
I'm in two minds about hard coding that value, but copy-pasting the code that handles the metadata in question, when it can only ever generate `32`, doesn't seem great either.
Yeah... I lean ever so slightly towards not hard coding it, but it's a toss-up. If we could more programmatically share the `32` between the compiler and UR then that would also be an option. But I'm not a huge fan of individually held assumptions.
I think what would help would be sharing more code between HIP and CUDA. Right now they're often just copy/pastes of each other. There's nothing CUDA or HIP-specific about fetching and interpreting metadata from the program, for instance. And I suppose another argument for not hard-coding it would be that it makes any eventual refactor for code sharing more trivial as the copy/paste becomes obvious: it'd be in the identical metadata code, not in the value retrieval code.
Anyway, these are ultimately just ideas; I don't know if it's worth it.
Yeah, a GPU adapter that would abstract away the duplication would be nice. You make a good point about providing the implementation to make a future refactor easier. Will extend the CUDA path as well.
```diff
@@ -39,6 +39,7 @@ struct ur_program_handle_t_ {
   std::unordered_map<std::string, std::string> GlobalIDMD;
   std::unordered_map<std::string, std::tuple<uint32_t, uint32_t, uint32_t>>
       KernelReqdWorkGroupSizeMD;
+  std::unordered_map<std::string, uint32_t> KernelReqdSubGroupSizeMD;
```
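Populating such a per-kernel map requires stripping the metadata tag off the entry name first. The sketch below assumes metadata names take the form `<kernel-name>@intel_reqd_sub_group_size` (the exact tag format is an assumption here, and `splitMetadataName` is a hypothetical helper):

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch: split a program metadata name at the last '@' so that
// the value can be keyed by kernel name in a map like KernelReqdSubGroupSizeMD.
std::optional<std::pair<std::string, std::string>>
splitMetadataName(const std::string &Name) {
  auto Pos = Name.rfind('@');
  if (Pos == std::string::npos)
    return std::nullopt; // no tag: not kernel-scoped metadata
  return std::make_pair(Name.substr(0, Pos), Name.substr(Pos));
}
```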
Last time I changed the program metadata stuff I was wondering if all these separate maps for each program metadata item are the best idea, in terms of memory usage, cache efficiency, access times, etc. Would `std::unordered_map<std::string, struct KernelMetadata>` bring any benefits, do we think?
An honest answer is that I don't know, and without micro-benchmarking it, it's impossible to say. I'm not too keen on the idea of bundling all possible metadata into one struct and storing that as the per-kernel-name value. Intuitively, I'd say most of the time that struct would be storing `0`-initialised bytes. I'm tempted to leave it as it is.
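For illustration, the single-map alternative under discussion might look roughly like this. It is a hypothetical sketch (field set taken from the maps in the diff above); using `std::optional` members would at least make the "mostly empty" entries explicit rather than zero-initialised:

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <tuple>
#include <unordered_map>

// Hypothetical sketch of one consolidated per-kernel metadata entry,
// replacing the separate GlobalIDMD / KernelReqdWorkGroupSizeMD /
// KernelReqdSubGroupSizeMD maps.
struct KernelMetadata {
  std::optional<std::string> GlobalID;
  std::optional<std::tuple<std::uint32_t, std::uint32_t, std::uint32_t>>
      ReqdWorkGroupSize;
  std::optional<std::uint32_t> ReqdSubGroupSize;
};

using KernelMetadataMap = std::unordered_map<std::string, KernelMetadata>;
```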
Lots of food for thought, but it's not essential.
Force-pushed from 301e2c6 to 09a31f0.
Friendly ping: @uditagarwal97 @intel/llvm-reviewers-runtime
Force-pushed from 14aa75c to 5c99321.
For HIP the value of sub group size can either be 32 or 64, it can be retrieved from `intel_reqd_sub_group_size` metadata node. Cuda only supports 32, which is enforced in the compiler, see [SemaSYCL::addIntelReqdSubGroupSizeAttr](https://github.com/intel/llvm/blob/sycl/clang/lib/Sema/SemaSYCLDeclAttr.cpp#L828).
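The compiler-side constraint described here could be sketched as follows. This is a hypothetical helper, not the actual SemaSYCL code; it only mirrors the rule stated above (CUDA accepts only 32, HIP accepts 32 or 64):

```cpp
#include <cstdint>

// Hypothetical sketch of the validity rule enforced around
// SemaSYCL::addIntelReqdSubGroupSizeAttr: CUDA's only legal required
// sub-group size is 32; HIP wavefronts allow 32 or 64.
enum class Backend { Cuda, Hip };

bool isValidReqdSubGroupSize(Backend B, std::uint32_t Size) {
  switch (B) {
  case Backend::Cuda:
    return Size == 32;
  case Backend::Hip:
    return Size == 32 || Size == 64;
  }
  return false;
}
```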
Force-pushed from 5c99321 to 4b4fc14.
Friendly ping @npmiller, @AlexeySachkov.
@AlexeySachkov are you happy with the changes in the patch?
@intel/llvm-gatekeepers I think this should be ready to land.
@jchlanda GitHub UI still says that merge is blocked due to a missing review from @intel/dpcpp-tools-reviewers, so the change to code owners might only take effect for new PRs. @intel/dpcpp-tools-reviewers can we get a quick review on this one?
tools part LGTM.
@intel/llvm-gatekeepers this should be ready to merge now.
CI test results are 2 weeks old. Can you please rebase and re-run the CI?
@maksimsab while you're here, could I kindly ask you to have a look at this patch as well? It's missing a tools review and I didn't have much luck with pinging.
@intel/llvm-gatekeepers the fresh CI run passed, it should be good to merge now.