Skip to content

[Offload] Add ol_dimensions_t and convert ranges from size_t -> uint32_t #143901

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 12, 2025

Conversation

RossBrunton
Copy link
Contributor

This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.

This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.
@llvmbot
Copy link
Member

llvmbot commented Jun 12, 2025

@llvm/pr-subscribers-offload

Author: Ross Brunton (RossBrunton)

Changes

This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.


Full diff: https://github.com/llvm/llvm-project/pull/143901.diff

4 Files Affected:

  • (modified) offload/liboffload/API/Common.td (+10)
  • (modified) offload/liboffload/API/Kernel.td (+2-6)
  • (modified) offload/liboffload/src/OffloadImpl.cpp (+6-6)
  • (modified) offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp (+4-9)
diff --git a/offload/liboffload/API/Common.td b/offload/liboffload/API/Common.td
index 7674da0438c29..79356029972fa 100644
--- a/offload/liboffload/API/Common.td
+++ b/offload/liboffload/API/Common.td
@@ -148,6 +148,16 @@ def : Struct {
   ];
 }
 
+def : Struct {
+  let name = "ol_dimensions_t";
+  let desc = "A three element vector";
+  let members = [
+    StructMember<"size_t", "x", "X">,
+    StructMember<"size_t", "y", "Y">,
+    StructMember<"size_t", "z", "Z">,
+  ];
+}
+
 def : Function {
   let name = "olInit";
   let desc = "Perform initialization of the Offload library and plugins";
diff --git a/offload/liboffload/API/Kernel.td b/offload/liboffload/API/Kernel.td
index 45e3d8112791c..0913a036fa04f 100644
--- a/offload/liboffload/API/Kernel.td
+++ b/offload/liboffload/API/Kernel.td
@@ -29,12 +29,8 @@ def : Struct {
     let desc = "Size-related arguments for a kernel launch.";
     let members = [
         StructMember<"size_t", "Dimensions", "Number of work dimensions">,
-        StructMember<"size_t", "NumGroupsX", "Number of work groups on the X dimension">,
-        StructMember<"size_t", "NumGroupsY", "Number of work groups on the Y dimension">,
-        StructMember<"size_t", "NumGroupsZ", "Number of work groups on the Z dimension">,
-        StructMember<"size_t", "GroupSizeX", "Size of a work group on the X dimension.">,
-        StructMember<"size_t", "GroupSizeY", "Size of a work group on the Y dimension.">,
-        StructMember<"size_t", "GroupSizeZ", "Size of a work group on the Z dimension.">,
+        StructMember<"struct ol_dimensions_t", "NumGroups", "Number of work groups in each dimension">,
+        StructMember<"struct ol_dimensions_t", "GroupSize", "Size of a work group in each dimension">,
         StructMember<"size_t", "DynSharedMemory", "Size of dynamic shared memory in bytes.">
     ];
 }
diff --git a/offload/liboffload/src/OffloadImpl.cpp b/offload/liboffload/src/OffloadImpl.cpp
index d2b331905ab77..0a784cddeaecb 100644
--- a/offload/liboffload/src/OffloadImpl.cpp
+++ b/offload/liboffload/src/OffloadImpl.cpp
@@ -499,12 +499,12 @@ Error olLaunchKernel_impl(ol_queue_handle_t Queue, ol_device_handle_t Device,
   auto *QueueImpl = Queue ? Queue->AsyncInfo : nullptr;
   AsyncInfoWrapperTy AsyncInfoWrapper(*DeviceImpl, QueueImpl);
   KernelArgsTy LaunchArgs{};
-  LaunchArgs.NumTeams[0] = LaunchSizeArgs->NumGroupsX;
-  LaunchArgs.NumTeams[1] = LaunchSizeArgs->NumGroupsY;
-  LaunchArgs.NumTeams[2] = LaunchSizeArgs->NumGroupsZ;
-  LaunchArgs.ThreadLimit[0] = LaunchSizeArgs->GroupSizeX;
-  LaunchArgs.ThreadLimit[1] = LaunchSizeArgs->GroupSizeY;
-  LaunchArgs.ThreadLimit[2] = LaunchSizeArgs->GroupSizeZ;
+  LaunchArgs.NumTeams[0] = LaunchSizeArgs->NumGroups.x;
+  LaunchArgs.NumTeams[1] = LaunchSizeArgs->NumGroups.y;
+  LaunchArgs.NumTeams[2] = LaunchSizeArgs->NumGroups.z;
+  LaunchArgs.ThreadLimit[0] = LaunchSizeArgs->GroupSize.x;
+  LaunchArgs.ThreadLimit[1] = LaunchSizeArgs->GroupSize.y;
+  LaunchArgs.ThreadLimit[2] = LaunchSizeArgs->GroupSize.z;
   LaunchArgs.DynCGroupMem = LaunchSizeArgs->DynSharedMemory;
 
   KernelLaunchParamsTy Params;
diff --git a/offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp b/offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp
index d466799c1acaa..157f33a363700 100644
--- a/offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp
+++ b/offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp
@@ -19,13 +19,8 @@ struct LaunchKernelTestBase : OffloadQueueTest {
                                    DeviceBin->getBufferSize(), &Program));
     ASSERT_SUCCESS(olGetKernel(Program, kernel, &Kernel));
     LaunchArgs.Dimensions = 1;
-    LaunchArgs.GroupSizeX = 64;
-    LaunchArgs.GroupSizeY = 1;
-    LaunchArgs.GroupSizeZ = 1;
-
-    LaunchArgs.NumGroupsX = 1;
-    LaunchArgs.NumGroupsY = 1;
-    LaunchArgs.NumGroupsZ = 1;
+    LaunchArgs.GroupSize = {64, 1, 1};
+    LaunchArgs.NumGroups = {1, 1, 1};
 
     LaunchArgs.DynSharedMemory = 0;
   }
@@ -60,7 +55,7 @@ OFFLOAD_TESTS_INSTANTIATE_DEVICE_FIXTURE(olLaunchKernelNoArgsTest);
 TEST_P(olLaunchKernelTest, Success) {
   void *Mem;
   ASSERT_SUCCESS(olMemAlloc(Device, OL_ALLOC_TYPE_MANAGED,
-                            LaunchArgs.GroupSizeX * sizeof(uint32_t), &Mem));
+                            LaunchArgs.GroupSize.x * sizeof(uint32_t), &Mem));
   struct {
     void *Mem;
   } Args{Mem};
@@ -88,7 +83,7 @@ TEST_P(olLaunchKernelNoArgsTest, Success) {
 TEST_P(olLaunchKernelTest, SuccessSynchronous) {
   void *Mem;
   ASSERT_SUCCESS(olMemAlloc(Device, OL_ALLOC_TYPE_MANAGED,
-                            LaunchArgs.GroupSizeX * sizeof(uint32_t), &Mem));
+                            LaunchArgs.GroupSize.x * sizeof(uint32_t), &Mem));
 
   struct {
     void *Mem;

LaunchArgs.ThreadLimit[0] = LaunchSizeArgs->GroupSizeX;
LaunchArgs.ThreadLimit[1] = LaunchSizeArgs->GroupSizeY;
LaunchArgs.ThreadLimit[2] = LaunchSizeArgs->GroupSizeZ;
LaunchArgs.NumTeams[0] = LaunchSizeArgs->NumGroups.x;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this isn't giving a warning of narrowing from u64 -> u32

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://godbolt.org/z/8Enn4z8o8

C++ doesn't care, I guess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that was an LLVM default on warning? Maybe liboffload isn't using the right flags to build.

let name = "ol_dimensions_t";
let desc = "A three element vector";
let members = [
StructMember<"size_t", "x", "X">,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should probably be u32, that's how it is in the launch args and the CUDA structs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SYCL/UR and OpenCL use u32 for their ranges, while Level Zero seems to use uint32_t... I guess it makes sense for Offload to use uint32_t as the most widely supported; I'll change it.

@RossBrunton RossBrunton changed the title [Offload] Add ol_dimensions_t [Offload] Add ol_dimensions_t and convert ranges from size_t -> uint32_t Jun 12, 2025
@jhuber6 jhuber6 merged commit 4f60321 into llvm:main Jun 12, 2025
7 checks passed
RossBrunton added a commit to RossBrunton/intel-llvm that referenced this pull request Jun 12, 2025
tomtor pushed a commit to tomtor/llvm-project that referenced this pull request Jun 14, 2025
…t32_t (llvm#143901)

This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.
uditagarwal97 pushed a commit to intel/llvm that referenced this pull request Jun 18, 2025
github-actions bot pushed a commit to oneapi-src/unified-runtime that referenced this pull request Jun 19, 2025
kbenzie pushed a commit to oneapi-src/unified-runtime that referenced this pull request Jun 19, 2025
akuhlens pushed a commit to akuhlens/llvm-project that referenced this pull request Jun 24, 2025
…t32_t (llvm#143901)

This is a three element x, y, z size_t vector that can be used any place
where a 3D vector is required. This ensures that all vectors across
liboffload are the same and don't require any resizing/reordering
dances.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants