[mlir][sparse] Add more tests and verification for n:m #81186


Merged: 9 commits into llvm:main from nm_test on Feb 9, 2024

Conversation

@yinying-lisa-li (Contributor) commented Feb 8, 2024

  1. Add a Python test for n out of m
  2. Add more methods to the Python bindings
  3. Add verification for n:m and invalid-encoding tests
  4. Add an e2e test for n:m

Previous PRs for n:m: #80501, #79935
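
For context, a structured[n, m] level stores at most n nonzero entries in every block of m consecutive elements (2:4 being the well-known hardware-accelerated case). Below is a minimal sketch of exercising the new Python surface, assuming an environment where the in-tree MLIR Python bindings are importable as the mlir package, as in the dialect.py test in this PR; the encoding text is taken from that test.

# Hedged sketch: parse a 2:4 structured encoding and read back n and m.
from mlir.ir import Context, Attribute
from mlir.dialects import sparse_tensor as st

with Context():
    parsed = Attribute.parse(
        "#sparse_tensor.encoding<{"
        "  map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense,"
        "  d1 mod 4 : structured[2, 4])"
        "}>"
    )
    casted = st.EncodingAttr(parsed)
    print(casted.structured_n, casted.structured_m)  # 2 4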


github-actions bot commented Feb 8, 2024

✅ With the latest revision this PR passed the Python code formatter.

@yinying-lisa-li yinying-lisa-li marked this pull request as ready for review February 9, 2024 18:00
@llvmbot llvmbot added mlir:sparse Sparse compiler in MLIR mlir labels Feb 9, 2024
@llvmbot (Member) commented Feb 9, 2024

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-sparse

Author: Yinying Li (yinying-lisa-li)

Changes
  1. Add a Python test for n out of m
  2. Add more methods to the Python bindings
  3. Add verification for n:m and invalid-encoding tests
  4. Add an e2e test for n:m

Full diff: https://github.com/llvm/llvm-project/pull/81186.diff

8 Files Affected:

  • (modified) mlir/include/mlir-c/Dialect/SparseTensor.h (+10)
  • (modified) mlir/lib/Bindings/Python/DialectSparseTensor.cpp (+37-1)
  • (modified) mlir/lib/CAPI/Dialect/SparseTensor.cpp (+18)
  • (modified) mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp (+15-7)
  • (modified) mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp (+31)
  • (modified) mlir/test/Dialect/SparseTensor/invalid_encoding.mlir (+106)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_ds.mlir (+22)
  • (modified) mlir/test/python/dialects/sparse_tensor/dialect.py (+84)
diff --git a/mlir/include/mlir-c/Dialect/SparseTensor.h b/mlir/include/mlir-c/Dialect/SparseTensor.h
index 2c71b0008ad16a..d549f5dddc1318 100644
--- a/mlir/include/mlir-c/Dialect/SparseTensor.h
+++ b/mlir/include/mlir-c/Dialect/SparseTensor.h
@@ -84,6 +84,16 @@ mlirSparseTensorEncodingAttrGetPosWidth(MlirAttribute attr);
 MLIR_CAPI_EXPORTED int
 mlirSparseTensorEncodingAttrGetCrdWidth(MlirAttribute attr);
 
+MLIR_CAPI_EXPORTED unsigned
+mlirSparseTensorEncodingAttrGetStructuredN(MlirSparseTensorLevelType lvlType);
+
+MLIR_CAPI_EXPORTED unsigned
+mlirSparseTensorEncodingAttrGetStructuredM(MlirSparseTensorLevelType lvlType);
+
+MLIR_CAPI_EXPORTED MlirSparseTensorLevelType
+mlirSparseTensorEncodingAttrBuildLvlType(
+    enum MlirBaseSparseTensorLevelType lvlType, unsigned n, unsigned m);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/mlir/lib/Bindings/Python/DialectSparseTensor.cpp b/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
index 607534c6156439..74f4d2413a6993 100644
--- a/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
+++ b/mlir/lib/Bindings/Python/DialectSparseTensor.cpp
@@ -60,6 +60,15 @@ static void populateDialectSparseTensorSubmodule(const py::module &m) {
           py::arg("lvl_to_dim"), py::arg("pos_width"), py::arg("crd_width"),
           py::arg("context") = py::none(),
           "Gets a sparse_tensor.encoding from parameters.")
+      .def_classmethod(
+          "build_level_type",
+          [](py::object cls, MlirBaseSparseTensorLevelType lvlType, unsigned n,
+             unsigned m) {
+            return mlirSparseTensorEncodingAttrBuildLvlType(lvlType, n, m);
+          },
+          py::arg("cls"), py::arg("lvl_type"), py::arg("n") = 0,
+          py::arg("m") = 0,
+          "Builds a sparse_tensor.encoding.level_type from parameters.")
       .def_property_readonly(
           "lvl_types",
           [](MlirAttribute self) {
@@ -89,7 +98,34 @@ static void populateDialectSparseTensorSubmodule(const py::module &m) {
       .def_property_readonly("pos_width",
                              mlirSparseTensorEncodingAttrGetPosWidth)
       .def_property_readonly("crd_width",
-                             mlirSparseTensorEncodingAttrGetCrdWidth);
+                             mlirSparseTensorEncodingAttrGetCrdWidth)
+      .def_property_readonly(
+          "structured_n",
+          [](MlirAttribute self) -> unsigned {
+            const int lvlRank = mlirSparseTensorEncodingGetLvlRank(self);
+            return mlirSparseTensorEncodingAttrGetStructuredN(
+                mlirSparseTensorEncodingAttrGetLvlType(self, lvlRank - 1));
+          })
+      .def_property_readonly(
+          "structured_m",
+          [](MlirAttribute self) -> unsigned {
+            const int lvlRank = mlirSparseTensorEncodingGetLvlRank(self);
+            return mlirSparseTensorEncodingAttrGetStructuredM(
+                mlirSparseTensorEncodingAttrGetLvlType(self, lvlRank - 1));
+          })
+      .def_property_readonly("lvl_types_enum", [](MlirAttribute self) {
+        const int lvlRank = mlirSparseTensorEncodingGetLvlRank(self);
+        std::vector<MlirBaseSparseTensorLevelType> ret;
+        ret.reserve(lvlRank);
+        for (int l = 0; l < lvlRank; l++) {
+          // Convert level type to 32 bits to ignore n and m for n_out_of_m
+          // format.
+          ret.push_back(
+              static_cast<MlirBaseSparseTensorLevelType>(static_cast<uint32_t>(
+                  mlirSparseTensorEncodingAttrGetLvlType(self, l))));
+        }
+        return ret;
+      });
 }
 
 PYBIND11_MODULE(_mlirDialectsSparseTensor, m) {
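
Taken together, the bindings added above expose both directions: build_level_type packs n and m into the 64-bit level type, and structured_n/structured_m read them back from the trailing level. A minimal sketch under the same assumptions as the earlier snippet (bindings importable as mlir; names exactly as registered in this diff):

# Hedged sketch: build structured[2, 4] programmatically instead of parsing it.
from mlir.ir import Context
from mlir.dialects import sparse_tensor as st

with Context():
    built_2_4 = st.EncodingAttr.build_level_type(st.LevelType.n_out_of_m, 2, 4)
    print(built_2_4)
    # Note: lvl_types_enum (added above) truncates each level type to its
    # 32-bit base enum, so n and m are deliberately masked off for
    # n_out_of_m levels.
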
diff --git a/mlir/lib/CAPI/Dialect/SparseTensor.cpp b/mlir/lib/CAPI/Dialect/SparseTensor.cpp
index a34b9a29b0e90a..4e1bd45863fdac 100644
--- a/mlir/lib/CAPI/Dialect/SparseTensor.cpp
+++ b/mlir/lib/CAPI/Dialect/SparseTensor.cpp
@@ -94,3 +94,21 @@ int mlirSparseTensorEncodingAttrGetPosWidth(MlirAttribute attr) {
 int mlirSparseTensorEncodingAttrGetCrdWidth(MlirAttribute attr) {
   return cast<SparseTensorEncodingAttr>(unwrap(attr)).getCrdWidth();
 }
+
+MlirSparseTensorLevelType
+mlirSparseTensorEncodingAttrBuildLvlType(MlirBaseSparseTensorLevelType lvlType,
+                                         unsigned n, unsigned m) {
+  LevelType lt = static_cast<LevelType>(lvlType);
+  return static_cast<MlirSparseTensorLevelType>(*buildLevelType(
+      *getLevelFormat(lt), isOrderedLT(lt), isUniqueLT(lt), n, m));
+}
+
+unsigned
+mlirSparseTensorEncodingAttrGetStructuredN(MlirSparseTensorLevelType lvlType) {
+  return getN(static_cast<LevelType>(lvlType));
+}
+
+unsigned
+mlirSparseTensorEncodingAttrGetStructuredM(MlirSparseTensorLevelType lvlType) {
+  return getM(static_cast<LevelType>(lvlType));
+}
diff --git a/mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp b/mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp
index 752d6e6481dfee..a585928c3fa3ee 100644
--- a/mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp
+++ b/mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp
@@ -41,8 +41,16 @@ FailureOr<uint64_t> LvlTypeParser::parseLvlType(AsmParser &parser) const {
     ParseResult res = parser.parseCommaSeparatedList(
         mlir::OpAsmParser::Delimiter::OptionalSquare,
         [&]() -> ParseResult { return parseStructure(parser, &structure); },
-        " in block n out of m");
+        " in structure n out of m");
     FAILURE_IF_FAILED(res)
+    if (structure.size() != 2) {
+      parser.emitError(loc, "expected exactly 2 structure sizes");
+      return failure();
+    }
+    if (structure[0] > structure[1]) {
+      parser.emitError(loc, "expected n <= m in n_out_of_m");
+      return failure();
+    }
   }
 
   ParseResult res = parser.parseCommaSeparatedList(
@@ -57,10 +65,6 @@ FailureOr<uint64_t> LvlTypeParser::parseLvlType(AsmParser &parser) const {
   } else if (base.compare("compressed") == 0) {
     properties |= static_cast<uint64_t>(LevelFormat::Compressed);
   } else if (base.compare("structured") == 0) {
-    if (structure.size() != 2) {
-      parser.emitError(loc, "expected exactly 2 structure sizes");
-      return failure();
-    }
     properties |= static_cast<uint64_t>(LevelFormat::NOutOfM);
     properties |= nToBits(structure[0]) | mToBits(structure[1]);
   } else if (base.compare("loose_compressed") == 0) {
@@ -102,13 +106,17 @@ LvlTypeParser::parseStructure(AsmParser &parser,
   OptionalParseResult intValParseResult = parser.parseOptionalInteger(intVal);
   if (intValParseResult.has_value()) {
     if (failed(*intValParseResult)) {
-      parser.emitError(loc, "failed to parse block size");
+      parser.emitError(loc, "failed to parse structure size");
+      return failure();
+    }
+    if (intVal < 0) {
+      parser.emitError(loc, "expected structure size to be >= 0");
       return failure();
     }
     structure->push_back(intVal);
     return success();
   }
-  parser.emitError(loc, "expected valid integer for block size");
+  parser.emitError(loc, "expected valid integer for structure size");
   return failure();
 }
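
Note that the two-sizes check now runs as soon as a structure list is parsed, and the new bounds checks reject negative sizes and n > m up front. A hedged sketch of the parse-time failure, with the diagnostic text quoted from the diff above (same bindings assumptions as earlier):

# Hedged sketch: structured[5, 4] violates n <= m and should fail to parse.
from mlir.ir import Context, Attribute

with Context():
    try:
        Attribute.parse(
            "#sparse_tensor.encoding<{"
            "  map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense,"
            "  d1 mod 4 : structured[5, 4])"
            "}>"
        )
    except Exception:
        pass  # expected diagnostic: "expected n <= m in n_out_of_m"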
 
diff --git a/mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp b/mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp
index 67b1d7974fa259..aed43f26d54f11 100644
--- a/mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp
+++ b/mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp
@@ -657,6 +657,37 @@ LogicalResult SparseTensorEncodingAttr::verify(
       return emitError() << "expected all singleton lvlTypes "
                             "following a singleton level";
   }
+  // TODO: audit formats that actually are supported by backend.
+  if (auto it = std::find_if(lvlTypes.begin(), lvlTypes.end(), isNOutOfMLT);
+      it != std::end(lvlTypes)) {
+    if (it != lvlTypes.end() - 1)
+      return emitError() << "expected n_out_of_m to be the last level type";
+    if (!std::all_of(lvlTypes.begin(), it,
+                     [](LevelType i) { return isDenseLT(i); }))
+      return emitError() << "expected all dense lvlTypes "
+                            "before a n_out_of_m level";
+    if (dimToLvl && (dimToLvl.getNumDims() != dimToLvl.getNumResults())) {
+      if (!isBlockSparsity(dimToLvl)) {
+        return emitError()
+               << "expected 1xm block structure for n_out_of_m level";
+      }
+      auto sizes = getBlockSize(dimToLvl);
+      unsigned coefficient = 0;
+      for (const auto &elem : sizes) {
+        if (elem != 0) {
+          if (elem != coefficient && coefficient != 0) {
+            return emitError() << "expected only one blocked level "
+                                  "with the same coefficients";
+          }
+          coefficient = elem;
+        }
+      }
+      if (coefficient != getM(*it)) {
+        return emitError() << "expected coeffiencts of Affine expressions "
+                              "to be equal to m of n_out_of_m level";
+      }
+    }
+  }
   // Before we can check that the level-rank is consistent/coherent
   // across all fields, we need to define it.  The source-of-truth for
   // the `getLvlRank` method is the length of the level-types array,
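
In words, the new verifier clauses require an n_out_of_m level to be the last level, all preceding levels to be dense, and, when dimToLvl is a non-trivial block map, every nonzero floordiv/mod coefficient to agree and to equal m. A hedged sketch of a map the coefficient check rejects (coefficient 2 from floordiv 2 / mod 2 versus m = 4; same bindings assumptions as earlier):

# Hedged sketch: the block coefficient (2) must equal m (4), so this fails.
from mlir.ir import Context, Attribute

with Context():
    try:
        Attribute.parse(
            "#sparse_tensor.encoding<{"
            "  map = (d0, d1) -> (d0 : dense, d1 floordiv 2 : dense,"
            "  d1 mod 2 : structured[2, 4])"
            "}>"
        )
    except Exception:
        pass  # expected diagnostic: coefficients of affine expressions
              # must equal m of the n_out_of_m level
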
diff --git a/mlir/test/Dialect/SparseTensor/invalid_encoding.mlir b/mlir/test/Dialect/SparseTensor/invalid_encoding.mlir
index 2d189cc94c15e2..b48e66cf762bd9 100644
--- a/mlir/test/Dialect/SparseTensor/invalid_encoding.mlir
+++ b/mlir/test/Dialect/SparseTensor/invalid_encoding.mlir
@@ -315,3 +315,109 @@ func.func private @BSR(%arg0: tensor<?x?xf64, #BSR>) {
 func.func private @BSR_explicit(%arg0: tensor<?x?xf64, #BSR_explicit>) {
   return
 }
+
+// -----
+
+// expected-error@+6 {{expected structure size to be >= 0}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 4 : dense,
+    j            : dense,
+    k mod 4      : structured[-2, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+6 {{expected n <= m in n_out_of_m}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 4 : dense,
+    j            : dense,
+    k mod 4      : structured[5, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+1 {{expected all dense lvlTypes before an n_out_of_m level}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 4 : compressed,
+    j            : dense,
+    k mod 4      : structured[2, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+1 {{expected n_out_of_m to be the last level type}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 4 : structured[2, 4],
+    j            : dense,
+    k mod 4      : compressed
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+1 {{expected 1xm block structure for n_out_of_m level}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 2 : dense,
+    j            : dense,
+    k mod 4      : structured[2, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+1 {{expected coefficients of affine expressions to be equal to m of n_out_of_m level}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i            : dense,
+    k floordiv 2 : dense,
+    j            : dense,
+    k mod 2      : structured[2, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
+
+// -----
+
+// expected-error@+1 {{expected only one blocked level with the same coefficients}}
+#NOutOfM = #sparse_tensor.encoding<{
+  map = ( i, j, k ) ->
+  ( i floordiv 2 : dense,
+    i mod 2      : dense,
+    j            : dense,
+    k floordiv 4 : dense,
+    k mod 4      : structured[2, 4]
+  )
+}>
+func.func private @NOutOfM(%arg0: tensor<?x?x?xf64, #NOutOfM>) {
+  return
+}
diff --git a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_ds.mlir b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_ds.mlir
index ec5c7580657cd7..251944c657cbac 100644
--- a/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_ds.mlir
+++ b/mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_ds.mlir
@@ -45,6 +45,13 @@
   crdWidth = 8
 }>
 
+#NV_58 = #sparse_tensor.encoding<{
+  map = ( i, j ) -> ( i            : dense,
+                      j floordiv 8 : dense,
+                      j mod 8      : structured[5, 8]),
+  crdWidth = 8
+}>
+
 module {
 
   func.func private @getTensorFilename(index) -> (!Filename)
@@ -65,6 +72,7 @@ module {
     %A1 = sparse_tensor.new %fileName : !Filename to tensor<?x?xf64, #CSR>
     %A2 = sparse_tensor.new %fileName : !Filename to tensor<?x?xf64, #CSR_hi>
     %A3 = sparse_tensor.new %fileName : !Filename to tensor<?x?xf64, #NV_24>
+    %A4 = sparse_tensor.new %fileName : !Filename to tensor<?x?xf64, #NV_58>
 
     //
     // CSR:
@@ -113,10 +121,24 @@ module {
     %vecv3 = vector.transfer_read %val3[%c0], %f0 : memref<?xf64>, vector<12xf64>
     vector.print %vecv3 : vector<12xf64>
 
+    //
+    // NV_58
+    //
+    // CHECK-NEXT: ( 2, 3, 5, 7, 1, 2, 4, 7, 0, 2, 4, 5 )
+    // CHECK-NEXT: ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 )
+    //
+    %crd4 = sparse_tensor.coordinates %A4 {level = 2 : index } : tensor<?x?xf64, #NV_58> to memref<?xi8>
+    %vecc4 = vector.transfer_read %crd4[%c0], %u0 : memref<?xi8>, vector<12xi8>
+    vector.print %vecc4 : vector<12xi8>
+    %val4 = sparse_tensor.values %A4 : tensor<?x?xf64, #NV_58> to memref<?xf64>
+    %vecv4 = vector.transfer_read %val4[%c0], %f0 : memref<?xf64>, vector<12xf64>
+    vector.print %vecv4 : vector<12xf64>
+
     // Release the resources.
     bufferization.dealloc_tensor %A1: tensor<?x?xf64, #CSR>
     bufferization.dealloc_tensor %A2: tensor<?x?xf64, #CSR_hi>
     bufferization.dealloc_tensor %A3: tensor<?x?xf64, #NV_24>
+    bufferization.dealloc_tensor %A4: tensor<?x?xf64, #NV_58>
 
     return
   }
diff --git a/mlir/test/python/dialects/sparse_tensor/dialect.py b/mlir/test/python/dialects/sparse_tensor/dialect.py
index 412c5797067b7a..1fa7030ca1be91 100644
--- a/mlir/test/python/dialects/sparse_tensor/dialect.py
+++ b/mlir/test/python/dialects/sparse_tensor/dialect.py
@@ -52,6 +52,90 @@ def testEncodingAttr1D():
         print(f"created_pos_width: {created.pos_width}")
 
 
+# CHECK-LABEL: TEST: testEncodingAttrStructure
+@run
+def testEncodingAttrStructure():
+    with Context() as ctx:
+        parsed = Attribute.parse(
+            "#sparse_tensor.encoding<{"
+            "  map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense,"
+            "  d1 mod 4 : structured[2, 4]),"
+            "  posWidth = 16,"
+            "  crdWidth = 32"
+            "}>"
+        )
+        # CHECK: #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense, d1 mod 4 : structured[2, 4]), posWidth = 16, crdWidth = 32 }>
+        print(parsed)
+
+        casted = st.EncodingAttr(parsed)
+        # CHECK: equal: True
+        print(f"equal: {casted == parsed}")
+
+        # CHECK: lvl_types: [65536, 65536, 4406637494272]
+        print(f"lvl_types: {casted.lvl_types}")
+        # CHECK: lvl_types_enum: [<LevelType.dense: 65536>, <LevelType.dense: 65536>, <LevelType.n_out_of_m: 1048576>]
+        print(f"lvl_types_enum: {casted.lvl_types_enum}")
+        # CHECK: structured_n: 2
+        print(f"structured_n: {casted.structured_n}")
+        # CHECK: structured_m: 4
+        print(f"structured_m: {casted.structured_m}")
+        # CHECK: dim_to_lvl: (d0, d1) -> (d0, d1 floordiv 4, d1 mod 4)
+        print(f"dim_to_lvl: {casted.dim_to_lvl}")
+        # CHECK: lvl_to_dim: (d0, d1, d2) -> (d0, d1 * 4 + d2)
+        print(f"lvl_to_dim: {casted.lvl_to_dim}")
+        # CHECK: pos_width: 16
+        print(f"pos_width: {casted.pos_width}")
+        # CHECK: crd_width: 32
+        print(f"crd_width: {casted.crd_width}")
+
+        created = st.EncodingAttr.get(
+            casted.lvl_types, casted.dim_to_lvl, casted.lvl_to_dim, 0, 0
+        )
+        # CHECK: #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense, d1 mod 4 : structured[2, 4]) }>
+        print(created)
+        # CHECK: created_equal: False
+        print(f"created_equal: {created == casted}")
+
+        built_2_4 = st.EncodingAttr.build_level_type(st.LevelType.n_out_of_m, 2, 4)
+        dim_to_lvl = AffineMap.get(
+            2,
+            0,
+            [
+                AffineExpr.get_dim(0),
+                AffineExpr.get_floor_div(AffineExpr.get_dim(1), 4),
+                AffineExpr.get_mod(AffineExpr.get_dim(1), 4),
+            ],
+        )
+        lvl_to_dim = AffineMap.get(
+            3,
+            0,
+            [
+                AffineExpr.get_dim(0),
+                AffineExpr.get_add(
+                    AffineExpr.get_mul(AffineExpr.get_dim(1), 4),
+                    AffineExpr.get_dim(2),
+                ),
+            ],
+        )
+        built = st.EncodingAttr.get(
+            [st.LevelType.dense, st.LevelType.dense, built_2_4],
+            dim_to_lvl,
+            lvl_to_dim,
+            0,
+            0,
+        )
+        # CHECK: #sparse_tensor.encoding<{ map = (d0, d1) -> (d0 : dense, d1 floordiv 4 : dense, d1 mod 4 : structured[2, 4]) }>
+        print(built)
+        # CHECK: built_equal: True
+        print(f"built_equal: {built == created}")
+
+        # Verify that the factory creates an instance of the proper type.
+        # CHECK: is_proper_instance: True
+        print(f"is_proper_instance: {isinstance(created, st.EncodingAttr)}")
+        # CHECK: created_pos_width: 0
+        print(f"created_pos_width: {created.pos_width}")
+
+
 # CHECK-LABEL: TEST: testEncodingAttr2D
 @run
 def testEncodingAttr2D():

@yinying-lisa-li yinying-lisa-li merged commit 2a6b521 into llvm:main Feb 9, 2024
@yinying-lisa-li yinying-lisa-li deleted the nm_test branch February 9, 2024 19:34