Commit ef1ea85

Update on "[ET-VK] Modify quantized linear tiling shader to linearly dispatch work to improve thread occupancy and performance."
This diff changes the tiled 8-bit quantized linear matmul op to dispatch work linearly, which increases thread occupancy and improves performance. Differential Revision: [D73751979](https://our.internmc.facebook.com/intern/diff/D73751979/) [ghstack-poisoned]
1 parent 6bac017 commit ef1ea85

1 file changed, +3 -3 lines changed

backends/vulkan/runtime/graph/ops/impl/QuantizedLinearInt8.cpp

Lines changed: 3 additions & 3 deletions
@@ -180,7 +180,7 @@ void add_q_8w_linear_tiled_node(

   std::vector<int64_t> mat1_sizes = graph.sizes_of(mat1);
   const int64_t M = utils::val_at(-2, mat1_sizes);
-  int out_tile_nrows = 4;
+  uint32_t out_tile_nrows = 4;
   if (M % 6 == 0) {
     kernel_name += "_o4x2";
     out_tile_nrows = 2;
@@ -197,9 +197,9 @@ void add_q_8w_linear_tiled_node(

   utils::uvec3 out_limits = graph.logical_limits_of(out);
   utils::uvec3 global_wg_size = {
-      out_limits[0] * (utils::div_up(out_limits, out_tile_nrows)),
+      out_limits[0] * (utils::div_up(out_limits[1], out_tile_nrows)),
       1,
-      out_limit[2]};
+      out_limits[2]};

   utils::uvec3 local_wg_size{64, 1, 1};
   if (use_coop_algorithm) {
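
The flattened `global_wg_size` above can be illustrated with a small standalone sketch. It is not the ExecuTorch implementation: `div_up`, `Extent3`, `linear_global_size`, `launched_x`, and `decode_tile` are hypothetical stand-ins written for this example, and the decode step is only an assumption about how a shader might map a linear x index back to a (column, row tile) pair. The comparison against a per-row-tile layout is likewise an assumed baseline, used only to show why packing work along x can leave fewer idle threads in a 64-wide workgroup.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for utils::div_up: ceiling division on unsigned ints.
constexpr uint32_t div_up(uint32_t n, uint32_t d) {
  return (n + d - 1) / d;
}

// Hypothetical stand-in for utils::uvec3.
struct Extent3 {
  uint32_t x, y, z;
};

// Flattened global size in the shape used by the diff: every
// (column, row tile) pair gets one invocation along x, and y collapses to 1.
constexpr Extent3 linear_global_size(Extent3 out_limits, uint32_t tile_rows) {
  return {out_limits.x * div_up(out_limits.y, tile_rows), 1u, out_limits.z};
}

// Invocations actually launched along x once the global size is rounded up
// to whole workgroups of local_x threads.
constexpr uint32_t launched_x(uint32_t global_x, uint32_t local_x) {
  return div_up(global_x, local_x) * local_x;
}

// Assumed decode of a linear x index back to tile coordinates.
struct TileCoord {
  uint32_t col;
  uint32_t row_tile;
};

constexpr TileCoord decode_tile(uint32_t linear_x, uint32_t width) {
  return {linear_x % width, linear_x / width};
}

// 10 columns, 8 rows, 4-row tiles: 2 row tiles, so 20 invocations along x.
static_assert(linear_global_size({10, 8, 1}, 4).x == 20, "flattened x extent");
// One 64-wide workgroup covers all 20 invocations.
static_assert(launched_x(20, 64) == 64, "one workgroup suffices");
// An assumed per-row-tile layout rounds 10 up to 64 for each of the 2 row tiles.
static_assert(2 * launched_x(10, 64) == 128, "padding paid per row tile");

// Runtime spot check: index 13 in a 10-wide grid is column 3 of row tile 1.
inline void check_decode_example() {
  const TileCoord t = decode_tile(13, 10);
  assert(t.col == 3 && t.row_tile == 1);
}
```

In this toy case the flattened layout launches 64 invocations for 20 units of work, while the assumed per-row-tile layout would launch 128. Making `out_tile_nrows` a `uint32_t` also keeps the ceiling division in unsigned arithmetic alongside the `uvec3` components.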
