Commit dd6b089

Update on "[ET-VK] Removed shared memory usage and simplified conv2d dw op shader to improve performance."
This diff removes shared memory usage in the `conv2d_dw_output_tile.glsl` shader to improve performance. It also makes `sum` a one-dimensional array and moves bias application to just before the texel is stored. Differential Revision: [D75499165](https://our.internmc.facebook.com/intern/diff/D75499165/) [ghstack-poisoned]
2 parents d187bfb + 302825b commit dd6b089

1 file changed, with 4 additions and 4 deletions

backends/vulkan/runtime/graph/ops/glsl/conv2d_pw_s1p0.glsl

Lines changed: 4 additions & 4 deletions
@@ -14,8 +14,8 @@
 
 #define VEC4_T ${texel_type(DTYPE)}
 
-#define TILE_SIZE_X ${TILE_SIZE_X}us
-#define TILE_SIZE_Y ${TILE_SIZE_Y}us
+#define TILE_SIZE_X uint16_t(${TILE_SIZE_X})
+#define TILE_SIZE_Y uint16_t(${TILE_SIZE_Y})
 
 #define op(X, A, B) ${OPERATOR}
 
@@ -67,8 +67,8 @@ void main() {
   // | pos[2] | pos[3] |
   // +--------+--------+
   uint16_t pos[TILE_SIZE_X * TILE_SIZE_Y * 2];
-  for (uint16_t y = 0us, i = 0us; y < TILE_SIZE_Y; ++y) {
-    for (uint16_t x = 0us; x < TILE_SIZE_X; ++x) {
+  for (uint16_t y = uint16_t(0), i = uint16_t(0); y < TILE_SIZE_Y; ++y) {
+    for (uint16_t x = uint16_t(0); x < TILE_SIZE_X; ++x) {
       pos[i * 2] = out_pos_xy[0] * TILE_SIZE_X + x;
       pos[i * 2 + 1] = out_pos_xy[1] * TILE_SIZE_Y + y;
       i++;
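
For readers who want to check the indexing in the loop above without a GPU, here is a minimal host-side C++ sketch of the same arithmetic. It is not part of the commit: the function name `tile_positions` is hypothetical, the tile sizes are passed as arguments rather than substituted by the shader template, and the output tile coordinate is assumed to fit in 16 bits.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side mirror of the shader loop (not code from the commit).
// Returns interleaved (x, y) pairs for every output position in one tile:
// pos[2*i] is the x coordinate and pos[2*i + 1] is the y coordinate.
std::vector<std::uint16_t> tile_positions(
    std::uint16_t out_x, std::uint16_t out_y,
    std::uint16_t tile_x, std::uint16_t tile_y) {
  std::vector<std::uint16_t> pos(static_cast<std::size_t>(tile_x) * tile_y * 2);
  std::size_t i = 0;
  for (std::uint16_t y = 0; y < tile_y; ++y) {
    for (std::uint16_t x = 0; x < tile_x; ++x) {
      // Integer promotion widens the operands to int, so cast back explicitly,
      // much as the shader wraps its constants in uint16_t(...).
      pos[i * 2] = static_cast<std::uint16_t>(out_x * tile_x + x);
      pos[i * 2 + 1] = static_cast<std::uint16_t>(out_y * tile_y + y);
      ++i;
    }
  }
  return pos;
}
```

With a 2x2 tile and output tile coordinate (3, 5), `tile_positions(3, 5, 2, 2)` yields the pairs (6, 10), (7, 10), (6, 11), (7, 11), walking the tile row by row.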
