Commit daa207d

Update on "[ET-VK] Fixing out_limits_scaled calculation for conv2d pw ops."
This fix changes the calculation of `out_limits_scaled` from:

```glsl
const int out_limits_scaled[2] = {out_limits.x + (TILE_SIZE_X - 1) * TILE_SIZE_X, out_limits.y + (TILE_SIZE_Y - 1) * TILE_SIZE_Y};
```

to:

```glsl
const int out_limits_scaled[2] = {(out_limits.x + (TILE_SIZE_X - 1)) / TILE_SIZE_X, (out_limits.y + (TILE_SIZE_Y - 1)) / TILE_SIZE_Y};
```

With this change, `out_limits_scaled` holds each output limit divided by the tile size, rounded up, i.e. the number of tiles needed to cover the convolution output along x and y.

Differential Revision: [D75575662](https://our.internmc.facebook.com/intern/diff/D75575662/)

[ghstack-poisoned]
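
The arithmetic in the corrected expression is ordinary integer ceiling division. The following is a minimal sketch in plain C rather than GLSL, assuming non-negative limits and a positive tile size; the helper names `tiles_needed` and `tiles_needed_old` are hypothetical and do not appear in the shaders.

```c
#include <assert.h>

/* Hypothetical helper: number of tiles of width tile_size needed to cover
 * `limit` output elements. For non-negative limit and positive tile_size
 * this equals ceil(limit / tile_size), matching the corrected expression. */
static int tiles_needed(int limit, int tile_size) {
    assert(limit >= 0 && tile_size > 0);
    return (limit + (tile_size - 1)) / tile_size;
}

/* The pre-fix expression as quoted in the commit message, kept only for
 * comparison: it adds a multiple of the tile size instead of dividing. */
static int tiles_needed_old(int limit, int tile_size) {
    return limit + (tile_size - 1) * tile_size;
}
```

For example, with a limit of 11 and a tile size of 2, `tiles_needed` gives 6 (five full tiles plus one partial tile), whereas the old expression gives 13, which is larger than the limit itself.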
2 parents a0eb272 + 680d935 commit daa207d

2 files changed: +6 -2 lines changed
backends/vulkan/runtime/graph/ops/glsl/conv2d_pw.glsl

Lines changed: 3 additions & 1 deletion
@@ -46,7 +46,9 @@ layout(local_size_x_id = 0, local_size_y_id = 1, local_size_z_id = 2) in;
  * size is only 1x1, making it easier to re-use loaded texels from t_kernel.
  */
 void main() {
-  const int out_limits_scaled[2] = {(out_limits.x + (TILE_SIZE_X - 1)) / TILE_SIZE_X, (out_limits.y + (TILE_SIZE_Y - 1)) / TILE_SIZE_Y};
+  const int out_limits_scaled[2] =
+      {(out_limits.x + (TILE_SIZE_X - 1)) / TILE_SIZE_X,
+       (out_limits.y + (TILE_SIZE_Y - 1)) / TILE_SIZE_Y};

   const int div_by_x = int(gl_GlobalInvocationID.x / out_limits_scaled[0]);
   const int out_pos[3] = {int(gl_GlobalInvocationID.x % out_limits_scaled[0]), div_by_x, int(gl_GlobalInvocationID.y)};
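
The two context lines after the changed statement show why the tile count matters: the x component of the global invocation ID is treated as a flattened index and split into a tile column and a tile row by dividing by `out_limits_scaled[0]`. A minimal plain-C sketch of that decoding follows; `TilePos` and `decode_tile_pos` are hypothetical names used only for illustration, and the parameters stand in for `gl_GlobalInvocationID.x`, `gl_GlobalInvocationID.y` and `out_limits_scaled[0]`.

```c
#include <assert.h>

/* Hypothetical struct standing in for the shader's out_pos[3]. */
typedef struct {
    int x; /* tile column: gid_x % scaled_x */
    int y; /* tile row:    gid_x / scaled_x */
    int z; /* taken directly from gid_y     */
} TilePos;

/* Split a flattened invocation index into (column, row) using the number
 * of tiles along x, mirroring the div_by_x / out_pos lines in the diff. */
static TilePos decode_tile_pos(unsigned gid_x, unsigned gid_y, int scaled_x) {
    assert(scaled_x > 0);
    TilePos p;
    p.x = (int)(gid_x % (unsigned)scaled_x);
    p.y = (int)(gid_x / (unsigned)scaled_x);
    p.z = (int)gid_y;
    return p;
}
```

If the tile count along x is too large, as it would be with the pre-fix expression quoted in the commit message, the same flattened index decodes to a different column and row.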

backends/vulkan/runtime/graph/ops/glsl/conv2d_pw_s1p0.glsl

Lines changed: 3 additions & 1 deletion
@@ -48,7 +48,9 @@ layout(local_size_x_id = 0, local_size_y_id = 1, local_size_z_id = 2) in;
  * size is only 1x1, making it easier to re-use loaded texels from t_kernel.
  */
 void main() {
-  const int out_limits_scaled[2] = {(out_limits.x + (TILE_SIZE_X - 1)) / TILE_SIZE_X, (out_limits.y + (TILE_SIZE_Y - 1)) / TILE_SIZE_Y};
+  const int out_limits_scaled[2] =
+      {(out_limits.x + (TILE_SIZE_X - 1)) / TILE_SIZE_X,
+       (out_limits.y + (TILE_SIZE_Y - 1)) / TILE_SIZE_Y};

   const uint16_t div_by_x = uint16_t(gl_GlobalInvocationID.x / out_limits_scaled[0]);
   const uint16_t out_pos_xy[2] = {uint16_t(gl_GlobalInvocationID.x % out_limits_scaled[0]), div_by_x};
