Commit 3ae4c35

AMDGPU: Implicit kernel arguments related optimization when uniform-workgroup-size=true
Summary: Under code object version 5, ockl_get_local_size returns the value computed by the expression

    workgroup_id < hidden_block_count ? hidden_group_size : hidden_remainder

For functions with the attribute uniform-work-group-size=true, we can evaluate workgroup_id < hidden_block_count as true, so ockl_get_local_size returns hidden_group_size. With uniform-work-group-size=true, this change also sets all remainders to zero and, if reqd_work_group_size is present, sets the work-group size to the required value from the metadata.

Reviewers: arsenm, bcahoon

Differential Revision: https://reviews.llvm.org/D131276
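The folding described in the summary can be illustrated with a minimal one-dimensional sketch. This is not the code from the patch: the struct HiddenArgs, its field widths, and the function names local_size_general, local_size_uniform, and check_fold are hypothetical stand-ins chosen for illustration. Only the select expression itself comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the hidden kernel arguments of one dimension.
// Field names mirror the commit summary; the widths are an assumption.
struct HiddenArgs {
    std::uint32_t block_count;  // hidden_block_count
    std::uint16_t group_size;   // hidden_group_size
    std::uint16_t remainder;    // hidden_remainder
};

// General case: workgroups below block_count are full-sized; any other
// workgroup (the trailing partial one) reports the remainder.
constexpr std::uint32_t local_size_general(std::uint32_t workgroup_id,
                                           const HiddenArgs &a) {
    return workgroup_id < a.block_count ? a.group_size : a.remainder;
}

// With uniform-work-group-size=true the comparison is assumed to be true,
// so the select collapses to a plain read of the group size.
constexpr std::uint32_t local_size_uniform(const HiddenArgs &a) {
    return a.group_size;
}

// Checks that both forms agree for every full workgroup id.
inline void check_fold(const HiddenArgs &a) {
    for (std::uint32_t id = 0; id < a.block_count; ++id)
        assert(local_size_general(id, a) == local_size_uniform(a));
}
```

The two forms differ only for a workgroup id at or beyond block_count, which is exactly the partial-workgroup case that a uniform work-group size rules out.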
1 parent 2d3b54f commit 3ae4c35

2 files changed, +441 -108 lines changed
