
[mlir][linalg] fix linalg.batch_reduce_matmul auto cast #102585


Merged: 1 commit into llvm:main on Aug 12, 2024

Conversation

zhczhong
Member

@zhczhong zhczhong commented Aug 9, 2024

Fix the auto-cast of linalg.batch_reduce_matmul from cast_to_T(A * cast_to_T(B)) + C to cast_to_T(A) * cast_to_T(B) + C, so that both operands are promoted to the accumulator/output type before the multiplication.
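The cast placement matters for narrow element types such as bf16. The sketch below is a hypothetical pure-Python illustration, not the actual MLIR lowering: `to_bf16` is an assumed helper that models bf16 by truncating an f32 bit pattern. It shows that forming the product at bf16 precision and widening only afterwards loses low-order bits before the accumulate, whereas casting each operand first keeps the product at full accumulator precision.

```python
import struct

def to_bf16(x):
    """Hypothetical helper: model a bf16 value by truncating an f32 bit
    pattern to its top 16 bits (1 sign + 8 exponent + 7 mantissa bits)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

a = to_bf16(1.5)        # exactly representable in bf16
b = to_bf16(1.0078125)  # 1 + 1/128, also exact in bf16

# Product formed at narrow precision and only then widened:
# the low-order bit of the product is truncated away.
buggy = to_bf16(a * b)

# Operands widened first, product kept at full precision (the fixed order):
fixed = a * b

print(buggy)  # 1.5078125
print(fixed)  # 1.51171875
```

The two results differ in the last retained mantissa bit, which is exactly the kind of drift the operand-level cast avoids.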

@llvmbot
Member

llvmbot commented Aug 9, 2024

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-linalg

Author: zhicong zhong (zhczhong)

Changes

Fix the auto-cast of linalg.batch_reduce_matmul from cast_to_T(A * cast_to_T(B)) + C to cast_to_T(A) * cast_to_T(B) + C, so that both operands are promoted to the accumulator/output type before the multiplication.


Full diff: https://github.com/llvm/llvm-project/pull/102585.diff

3 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml (+13-14)
  • (modified) mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py (+1-2)
  • (modified) mlir/test/Dialect/Linalg/generalize-named-ops.mlir (+27)
diff --git a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
index 46b3ec0f60ebfa..249b0f56477cc8 100644
--- a/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
+++ b/mlir/include/mlir/Dialect/Linalg/IR/LinalgNamedStructuredOps.yaml
@@ -1,5 +1,3 @@
-### AUTOGENERATED from core_named_ops.py
-### To regenerate, run: bin/update_core_linalg_named_ops.sh
 --- !LinalgOpConfig
 metadata: !LinalgOpMetadata
   name: copy
@@ -1908,25 +1906,25 @@ structured_op: !LinalgStructuredOpConfig
           scalar_arg: C
         - !ScalarExpression
           scalar_fn:
-            kind: type
-            fn_name: cast_signed
-            type_var: U
+            kind: binary
+            fn_name: mul
             operands:
             - !ScalarExpression
               scalar_fn:
-                kind: binary
-                fn_name: mul
+                kind: type
+                fn_name: cast_signed
+                type_var: U
                 operands:
                 - !ScalarExpression
                   scalar_arg: A
+            - !ScalarExpression
+              scalar_fn:
+                kind: type
+                fn_name: cast_signed
+                type_var: U
+                operands:
                 - !ScalarExpression
-                  scalar_fn:
-                    kind: type
-                    fn_name: cast_signed
-                    type_var: U
-                    operands:
-                    - !ScalarExpression
-                      scalar_arg: B
+                  scalar_arg: B
 --- !LinalgOpConfig
 metadata: !LinalgOpMetadata
   name: matvec
@@ -6509,3 +6507,4 @@ structured_op: !LinalgStructuredOpConfig
                           scalar_const: '2.3283063999999999E-10 : f64'
             - !ScalarExpression
               scalar_arg: min
+
diff --git a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
index 67bde8f736ef46..afb68b471d347a 100644
--- a/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
+++ b/mlir/python/mlir/dialects/linalg/opdsl/ops/core_named_ops.py
@@ -593,8 +593,7 @@ def batch_reduce_matmul(
     domain(D.b, D.m, D.n, D.k)
     implements(ContractionOpInterface)
     C[D.m, D.n] += TypeFn.cast_signed(
-        U, A[D.b, D.m, D.k] * TypeFn.cast_signed(U, B[D.b, D.k, D.n])
-    )
+        U, A[D.b, D.m, D.k]) * TypeFn.cast_signed(U, B[D.b, D.k, D.n]) 
 
 
 @linalg_structured_op
diff --git a/mlir/test/Dialect/Linalg/generalize-named-ops.mlir b/mlir/test/Dialect/Linalg/generalize-named-ops.mlir
index 31fac9b4b41659..1e8f1435ca0fa5 100644
--- a/mlir/test/Dialect/Linalg/generalize-named-ops.mlir
+++ b/mlir/test/Dialect/Linalg/generalize-named-ops.mlir
@@ -329,6 +329,33 @@ func.func @batch_reduce_gemm(%lhs: memref<7x8x9xf32>, %rhs: memref<7x9x8xf32>, %
 // CHECK:         %[[ADD:.+]] = arith.addf %[[BBARG2]], %[[MUL]] : f32
 // CHECK:         linalg.yield %[[ADD]] : f32
 
+// -----
+
+func.func @generalize_batch_reduce_gemm_bf16(%lhs: memref<7x8x9xbf16>, %rhs: memref<7x9x8xbf16>, %out: memref<8x8xf32>) {
+  linalg.batch_reduce_matmul ins(%lhs, %rhs: memref<7x8x9xbf16>, memref<7x9x8xbf16>)
+                             outs(%out: memref<8x8xf32>)
+  return
+}
+
+// CHECK-DAG: #[[MAP0:.+]] = affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)>
+// CHECK-DAG: #[[MAP1:.+]] = affine_map<(d0, d1, d2, d3) -> (d0, d3, d2)>
+// CHECK-DAG: #[[MAP2:.+]] = affine_map<(d0, d1, d2, d3) -> (d1, d2)>
+
+// CHECK: @generalize_batch_reduce_gemm_bf16
+
+// CHECK: linalg.generic
+// CHECK-SAME: indexing_maps = [#[[MAP0]], #[[MAP1]], #[[MAP2]]]
+// CHECK-SAME: iterator_types = ["reduction", "parallel", "parallel", "reduction"]}
+// CHECK-SAME: ins(%{{.+}}, %{{.+}} : memref<7x8x9xbf16>, memref<7x9x8xbf16>)
+// CHECK-SAME: outs(%{{.+}} : memref<8x8xf32>
+// CHECK:         ^{{.+}}(%[[BBARG0:.+]]: bf16, %[[BBARG1:.+]]: bf16, %[[BBARG2:.+]]: f32)
+// CHECK:         %[[EXTBF16_0:.+]] = arith.extf %[[BBARG0]] : bf16 to f32
+// CHECK:         %[[EXTBF16_1:.+]] = arith.extf %[[BBARG1]] : bf16 to f32
+// CHECK:         %[[MUL:.+]] = arith.mulf %[[EXTBF16_0]], %[[EXTBF16_1]] : f32
+// CHECK:         %[[ADD:.+]] = arith.addf %[[BBARG2]], %[[MUL]] : f32
+// CHECK:         linalg.yield %[[ADD]] : f32
+
+
 // -----
 
 // CHECK-LABEL: generalize_linalg_map


github-actions bot commented Aug 9, 2024

✅ With the latest revision this PR passed the Python code formatter.

@rengolin
Member

rengolin commented Aug 9, 2024

@shahidact

@rengolin
Member

rengolin commented Aug 9, 2024

Quick question: did you change the Python file and then regenerate the YAML file, or did you change both manually?

OpDSL doesn't make it easy to know the difference and I made that mistake myself already once. 😅

@shahidact
Contributor

@zhczhong Good catch! Please also include the related changes from core_named_ops.py in this PR.

Contributor

@xurui1995 xurui1995 left a comment


LGTM

@zhczhong
Member Author

Quick question: did you change the Python file and then regenerate the YAML file, or did you change both manually?

OpDSL doesn't make it easy to know the difference and I made that mistake myself already once. 😅

Thanks for the reminder! I changed the Python file and used it to regenerate the YAML file.

@zhczhong Good catch! Please also include the related changes from core_named_ops.py in this PR.

Thanks! The change has been included here.

"""Performs a batch-reduce matrix multiplication of two 3D inputs.
The partial multiplication results are reduced into a 2D output.
Numeric casting is performed on the operands to the inner multiply, promoting
them to the same data type as the accumulator/output.
"""
domain(D.b, D.m, D.n, D.k)
implements(ContractionOpInterface)
C[D.m, D.n] += TypeFn.cast_signed(U, A[D.b, D.m, D.k]) * TypeFn.cast_signed(
    U, B[D.b, D.k, D.n]
)
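Read concretely, the docstring and index expression above amount to the following loop nest. This is a hypothetical pure-Python reference model, not the generated IR; the `cast` parameter stands in for `TypeFn.cast_signed` and defaults to plain promotion via `float`.

```python
def batch_reduce_matmul(A, B, C, cast=float):
    """Accumulate C[m, n] += cast(A[b, m, k]) * cast(B[b, k, n]),
    reducing over both the batch dimension b and the contraction
    dimension k, per the fixed operand-level cast order."""
    batches, M, K = len(A), len(A[0]), len(A[0][0])
    N = len(B[0][0])
    for b in range(batches):
        for m in range(M):
            for n in range(N):
                for k in range(K):
                    # Both operands are cast before the multiply.
                    C[m][n] += cast(A[b][m][k]) * cast(B[b][k][n])
    return C

# Two batches of 2x2 matrices; the second operand is the identity in
# each batch, so the result is the sum of the A slices over the batch.
A = [[[1, 2], [3, 4]], [[1, 1], [1, 1]]]
B = [[[1, 0], [0, 1]], [[1, 0], [0, 1]]]
print(batch_reduce_matmul(A, B, [[0, 0], [0, 0]]))  # [[2.0, 3.0], [4.0, 5.0]]
```

The partial products of every batch fold into the same 2D accumulator, which is what distinguishes this op from linalg.batch_matmul.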

Contributor

@MaheshRavishankar MaheshRavishankar left a comment


This looks right to me. Thanks for fixing it.

@zhczhong zhczhong merged commit 558d7ad into llvm:main Aug 12, 2024
8 checks passed
7 participants