Commit 9432f70

[MLIR][NVGPU-Tests] Fix a failing sm90 test (#111731)
The memref.expand_shape op now takes an explicit output_shape. This patch adds output_shape to the op in the test, which fixes the failure.

Signed-off-by: Durgadoss R <[email protected]>
1 parent c86edd0 commit 9432f70

1 file changed: +1 -1 lines changed

mlir/test/Integration/GPU/CUDA/sm90/tma_load_128x128_stride_noswizzle.mlir

Lines changed: 1 addition & 1 deletion
@@ -57,7 +57,7 @@ module {
 %s4 = gpu.memcpy async [%s3] %srcMemref, %srcMemref_host : memref<128x128xf16>, memref<128x128xf16>
 %s5 = gpu.memcpy async [%s4] %dstMemref, %dstMemref_host : memref<128x128xf16>, memref<128x128xf16>

-%expand_shape = memref.expand_shape %srcMemref [[0, 1], [2, 3]] : memref<128x128xf16> into memref<2x64x2x64xf16>
+%expand_shape = memref.expand_shape %srcMemref [[0, 1], [2, 3]] output_shape [2, 64, 2, 64] : memref<128x128xf16> into memref<2x64x2x64xf16>
 %transpose = memref.transpose %expand_shape (d0, d1, d2, d3) -> (d0, d2, d1, d3) : memref<2x64x2x64xf16> to memref<2x2x64x64xf16, strided<[8192, 64, 128, 1]>>
 %cast = memref.cast %transpose : memref<2x2x64x64xf16, strided<[8192, 64, 128, 1]>> to memref<*xf16>
 %24 = nvgpu.tma.create.descriptor %cast box[%c2, %c2, %c64, %c64] : memref<*xf16> -> <tensor = memref<2x2x64x64xf16, 3>, swizzle = none, l2promo = none, oob = zero, interleave = none>
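
The shapes and strides in the hunk above can be checked by hand. The following is a minimal Python sketch of that arithmetic, not MLIR's implementation: the helper names (row_major_strides, expand_shape, transpose) are hypothetical, and it assumes the source memref is contiguous and row-major, as memref<128x128xf16> with the default layout is.

```python
from math import prod


def row_major_strides(shape):
    """Strides (in elements) of a contiguous row-major buffer with this shape."""
    strides = [1] * len(shape)
    for i in range(len(shape) - 2, -1, -1):
        strides[i] = strides[i + 1] * shape[i + 1]
    return strides


def expand_shape(src_shape, reassociation, output_shape):
    """Hypothetical model of expand_shape on a contiguous source.

    Each reassociation group lists the result dims that one source dim
    splits into; the product of those result sizes must equal the source size.
    """
    for src_dim, group in zip(src_shape, reassociation):
        if prod(output_shape[i] for i in group) != src_dim:
            raise ValueError(f"group {group} does not multiply to {src_dim}")
    return list(output_shape), row_major_strides(output_shape)


def transpose(shape, strides, perm):
    """Permute dims: result dim i takes the shape and stride of source dim perm[i]."""
    return [shape[p] for p in perm], [strides[p] for p in perm]


# 128x128 -> 2x64x2x64, using the output_shape the patch adds.
shape, strides = expand_shape([128, 128], [[0, 1], [2, 3]], [2, 64, 2, 64])
# (d0, d1, d2, d3) -> (d0, d2, d1, d3)
t_shape, t_strides = transpose(shape, strides, [0, 2, 1, 3])
print(t_shape, t_strides)
```

Under these assumptions the result agrees with the transposed type in the test, memref<2x2x64x64xf16, strided<[8192, 64, 128, 1]>>.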
