Commit 06ac828

[libc] Fix flipped AMDGPU kernel launch arguments (#83648)
Summary: These values were incorrectly flipped: the block counts were written to the thread-size fields and vice versa. When I originally wrote the thread utilities, the loader was using COV4, which used an implicit format. When I updated it, I accidentally flipped the values and never noticed, because nothing depended on the size of the threads until I checked it manually.
1 parent 057e725 commit 06ac828

1 file changed, +6 -6 lines changed

libc/utils/gpu/loader/amdgpu/Loader.cpp

Lines changed: 6 additions & 6 deletions
@@ -230,12 +230,12 @@ hsa_status_t launch_kernel(hsa_agent_t dev_agent, hsa_executable_t executable,
   implicit_args_t *implicit_args = reinterpret_cast<implicit_args_t *>(
       reinterpret_cast<uint8_t *>(args) + sizeof(args_t));
   implicit_args->grid_dims = dims;
-  implicit_args->grid_size_x = params.num_threads_x;
-  implicit_args->grid_size_y = params.num_threads_y;
-  implicit_args->grid_size_z = params.num_threads_z;
-  implicit_args->workgroup_size_x = params.num_blocks_x;
-  implicit_args->workgroup_size_y = params.num_blocks_y;
-  implicit_args->workgroup_size_z = params.num_blocks_z;
+  implicit_args->grid_size_x = params.num_blocks_x;
+  implicit_args->grid_size_y = params.num_blocks_y;
+  implicit_args->grid_size_z = params.num_blocks_z;
+  implicit_args->workgroup_size_x = params.num_threads_x;
+  implicit_args->workgroup_size_y = params.num_threads_y;
+  implicit_args->workgroup_size_z = params.num_threads_z;
 
   // Obtain a packet from the queue.
   uint64_t packet_id = hsa_queue_add_write_index_relaxed(queue, 1);
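
The corrected mapping in this hunk can be illustrated in isolation. The following is a minimal sketch, not the loader's actual code: `LaunchParams`, `ImplicitArgs`, and `make_implicit_args` are hypothetical stand-ins that only mirror the field names visible in the diff, and the real `implicit_args_t` likely has more fields and different field widths.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the loader's launch parameters; only the
// field names are taken from the diff.
struct LaunchParams {
  uint32_t num_threads_x, num_threads_y, num_threads_z;
  uint32_t num_blocks_x, num_blocks_y, num_blocks_z;
};

// Hypothetical, simplified stand-in for the implicit kernel arguments.
// The field widths here are an assumption made for illustration.
struct ImplicitArgs {
  uint32_t grid_size_x, grid_size_y, grid_size_z;
  uint32_t workgroup_size_x, workgroup_size_y, workgroup_size_z;
};

// Mirrors the assignments on the '+' side of the diff: the grid-size
// fields receive the block counts and the workgroup-size fields receive
// the thread counts.
constexpr ImplicitArgs make_implicit_args(const LaunchParams &p) {
  ImplicitArgs a{};
  a.grid_size_x = p.num_blocks_x;
  a.grid_size_y = p.num_blocks_y;
  a.grid_size_z = p.num_blocks_z;
  a.workgroup_size_x = p.num_threads_x;
  a.workgroup_size_y = p.num_threads_y;
  a.workgroup_size_z = p.num_threads_z;
  return a;
}

// Small self-check: 64 threads per block and 4 blocks along x.
inline void check_example() {
  constexpr LaunchParams params{64, 1, 1, 4, 1, 1};
  constexpr ImplicitArgs args = make_implicit_args(params);
  static_assert(args.grid_size_x == 4, "grid size should hold the block count");
  assert(args.workgroup_size_x == 64);
}
```

With the pre-fix assignments, the same input would have produced `grid_size_x == 64` and `workgroup_size_x == 4`, which goes unnoticed unless something reads the per-workgroup thread size directly.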
