[mlir][bufferization] Add tensor-like and buffer-like interfaces #134220


Merged (4 commits, Apr 15, 2025)

Conversation

Contributor

@andrey-golubev andrey-golubev commented Apr 3, 2025

The current one-shot bufferization infrastructure operates on top of TensorType and BaseMemRefType. These are non-extensible base classes of the respective builtin types, tensor and memref, so the infrastructure is bound to work only with builtin tensor/memref types. At the same time, there are customization points that allow one to provide custom logic to control the bufferization behavior.

This patch introduces new type interfaces, tensor-like and buffer-like, that aim to supersede TensorType/BaseMemRefType within the bufferization dialect and allow custom tensor/memref types to be used. Additionally, these new type interfaces are attached to the respective builtin types so that the switch is seamless.

Note that this patch does very minimal initial work; it does NOT refactor the bufferization infrastructure.

@llvmbot llvmbot added mlir mlir:bufferization Bufferization infrastructure labels Apr 3, 2025
Member

llvmbot commented Apr 3, 2025

@llvm/pr-subscribers-mlir-core
@llvm/pr-subscribers-mlir-bufferization

@llvm/pr-subscribers-mlir

Author: Andrei Golubev (andrey-golubev)



Full diff: https://github.com/llvm/llvm-project/pull/134220.diff

8 Files Affected:

  • (added) mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h (+21)
  • (added) mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.td (+51)
  • (modified) mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt (+6)
  • (modified) mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp (+23)
  • (modified) mlir/test/lib/Dialect/Test/CMakeLists.txt (+1)
  • (modified) mlir/test/lib/Dialect/Test/TestTypeDefs.td (+40)
  • (modified) mlir/test/lib/Dialect/Test/TestTypes.h (+1)
  • (modified) mlir/unittests/IR/InterfaceTest.cpp (+77)
diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h
new file mode 100644
index 0000000000000..9e83f0e3ad2de
--- /dev/null
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h
@@ -0,0 +1,21 @@
+//===- BufferizationTypeInterfaces.h - Bufferization type interfaces -*- C++
+//-*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===---------------------------------------------------------------------------===//
+
+#ifndef MLIR_DIALECT_BUFFERIZATION_IR_BUFFERIZATIONTYPEINTERFACES_H_
+#define MLIR_DIALECT_BUFFERIZATION_IR_BUFFERIZATIONTYPEINTERFACES_H_
+
+#include "mlir/IR/BuiltinTypeInterfaces.h"
+
+//===----------------------------------------------------------------------===//
+// Bufferization Type Interfaces
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h.inc"
+
+#endif // MLIR_DIALECT_BUFFERIZATION_IR_BUFFERIZATIONTYPEINTERFACES_H_
diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.td b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.td
new file mode 100644
index 0000000000000..c4c760d38203c
--- /dev/null
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.td
@@ -0,0 +1,51 @@
+//===- BufferizationTypeInterfaces.td - Bufferization type interfaces -*- tablegen -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===---------------------------------------------------------------------------------===//
+//
+// This is the definition file for type interfaces used in Bufferization.
+//
+//===---------------------------------------------------------------------------------===//
+
+#ifndef BUFFERIZATION_TYPE_INTERFACES
+#define BUFFERIZATION_TYPE_INTERFACES
+
+include "mlir/IR/OpBase.td"
+include "mlir/IR/BuiltinTypeInterfaces.td"
+
+def Bufferization_TensorLikeTypeInterface
+    : TypeInterface<"TensorLikeType", [ShapedTypeInterface]> {
+  let cppNamespace = "::mlir::bufferization";
+  let description = [{
+    Indicates that the type that attaches this interface can be treated as a
+    tensor type (similarly to a MLIR builtin tensor) within the Bufferization
+    dialect.
+
+    Implementing this interface means that the type also implements
+    ShapedTypeInterface.
+
+    The interface currently has no methods as it is used by types to opt into
+    being supported by the bufferization procedures.
+  }];
+}
+
+def Bufferization_MemRefLikeTypeInterface
+    : TypeInterface<"MemRefLikeType", [ShapedTypeInterface]> {
+  let cppNamespace = "::mlir::bufferization";
+  let description = [{
+    Indicates that the type that attaches this interface can be treated as a
+    memref type (similarly to a MLIR builtin memref) within the Bufferization
+    dialect.
+
+    Implementing this interface means that the type also implements
+    ShapedTypeInterface.
+
+    The interface currently has no methods as it is used by types to opt into
+    being supported by the bufferization procedures.
+  }];
+}
+
+#endif // BUFFERIZATION_TYPE_INTERFACES
diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt b/mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt
index 13a5bc370a4fc..3ead52148c208 100644
--- a/mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/CMakeLists.txt
@@ -10,3 +10,9 @@ mlir_tablegen(BufferizationEnums.h.inc -gen-enum-decls)
 mlir_tablegen(BufferizationEnums.cpp.inc -gen-enum-defs)
 add_public_tablegen_target(MLIRBufferizationEnumsIncGen)
 add_dependencies(mlir-headers MLIRBufferizationEnumsIncGen)
+
+set(LLVM_TARGET_DEFINITIONS BufferizationTypeInterfaces.td)
+mlir_tablegen(BufferizationTypeInterfaces.h.inc -gen-type-interface-decls)
+mlir_tablegen(BufferizationTypeInterfaces.cpp.inc -gen-type-interface-defs)
+add_public_tablegen_target(MLIRBufferizationTypeInterfacesIncGen)
+add_dependencies(mlir-headers MLIRBufferizationTypeInterfacesIncGen)
diff --git a/mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp b/mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp
index e5a0c3c45b09e..31faa96695379 100644
--- a/mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp
+++ b/mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp
@@ -9,8 +9,11 @@
 #include "mlir/Dialect/Affine/IR/AffineOps.h"
 #include "mlir/Dialect/Bufferization/IR/BufferizableOpInterface.h"
 #include "mlir/Dialect/Bufferization/IR/Bufferization.h"
+#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
 #include "mlir/Dialect/MemRef/IR/MemRef.h"
 #include "mlir/Dialect/Tensor/IR/Tensor.h"
+#include "mlir/ExecutionEngine/CRunnerUtils.h"
+#include "mlir/IR/BuiltinTypes.h"
 #include "mlir/Interfaces/FunctionInterfaces.h"
 #include "mlir/Transforms/InliningUtils.h"
 
@@ -51,6 +54,16 @@ struct BufferizationInlinerInterface : public DialectInlinerInterface {
     return true;
   }
 };
+
+template <typename Tensor>
+struct BuiltinTensorExternalModel
+    : TensorLikeType::ExternalModel<BuiltinTensorExternalModel<Tensor>,
+                                    Tensor> {};
+
+template <typename MemRef>
+struct BuiltinMemRefExternalModel
+    : MemRefLikeType::ExternalModel<BuiltinMemRefExternalModel<MemRef>,
+                                    MemRef> {};
 } // namespace
 
 //===----------------------------------------------------------------------===//
@@ -63,6 +76,16 @@ void mlir::bufferization::BufferizationDialect::initialize() {
 #include "mlir/Dialect/Bufferization/IR/BufferizationOps.cpp.inc"
       >();
   addInterfaces<BufferizationInlinerInterface>();
+
+  assert(getContext() != nullptr);
+  RankedTensorType::attachInterface<
+      BuiltinTensorExternalModel<RankedTensorType>>(*getContext());
+  UnrankedTensorType::attachInterface<
+      BuiltinTensorExternalModel<UnrankedTensorType>>(*getContext());
+  MemRefType::attachInterface<BuiltinMemRefExternalModel<MemRefType>>(
+      *getContext());
+  UnrankedMemRefType::attachInterface<
+      BuiltinMemRefExternalModel<UnrankedMemRefType>>(*getContext());
 }
 
 LogicalResult BufferizationDialect::verifyRegionArgAttribute(
diff --git a/mlir/test/lib/Dialect/Test/CMakeLists.txt b/mlir/test/lib/Dialect/Test/CMakeLists.txt
index a48ac24ca056d..6e608e4772391 100644
--- a/mlir/test/lib/Dialect/Test/CMakeLists.txt
+++ b/mlir/test/lib/Dialect/Test/CMakeLists.txt
@@ -93,6 +93,7 @@ mlir_target_link_libraries(MLIRTestDialect PUBLIC
   MLIRTransformUtils
   MLIRTransforms
   MLIRValueBoundsOpInterface
+  MLIRBufferizationDialect
 )
 
 add_mlir_translation_library(MLIRTestFromLLVMIRTranslation
diff --git a/mlir/test/lib/Dialect/Test/TestTypeDefs.td b/mlir/test/lib/Dialect/Test/TestTypeDefs.td
index f1c31658c13ac..76f3644345215 100644
--- a/mlir/test/lib/Dialect/Test/TestTypeDefs.td
+++ b/mlir/test/lib/Dialect/Test/TestTypeDefs.td
@@ -19,6 +19,7 @@ include "TestAttrDefs.td"
 include "TestInterfaces.td"
 include "mlir/IR/BuiltinTypes.td"
 include "mlir/Interfaces/DataLayoutInterfaces.td"
+include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.td"
 
 // All of the types will extend this class.
 class Test_Type<string name, list<Trait> traits = []>
@@ -403,4 +404,43 @@ def TestTypeOpAsmTypeInterface : Test_Type<"TestTypeOpAsmTypeInterface",
   let mnemonic = "op_asm_type_interface";
 }
 
+def TestTensorType : Test_Type<"TestTensor", [Bufferization_TensorLikeTypeInterface]> {
+  let mnemonic = "test_tensor";
+  let parameters = (ins
+    ArrayRefParameter<"int64_t">:$shape,
+    "mlir::Type":$elementType
+  );
+  let assemblyFormat = "`<` `[` $shape `]` `,` $elementType `>`";
+
+  let extraClassDeclaration = [{
+    // ShapedTypeInterface:
+    bool hasRank() const {
+      return true;
+    }
+    test::TestTensorType cloneWith(std::optional<llvm::ArrayRef<int64_t>> shape, mlir::Type elementType) const {
+      return test::TestTensorType::get(getContext(), shape.value_or(getShape()), elementType);
+    }
+  }];
+}
+
+def TestMemrefType : Test_Type<"TestMemref", [Bufferization_MemRefLikeTypeInterface]> {
+  let mnemonic = "test_memref";
+  let parameters = (ins
+    ArrayRefParameter<"int64_t">:$shape,
+    "mlir::Type":$elementType,
+    DefaultValuedParameter<"mlir::Attribute", "nullptr">:$memSpace
+  );
+  let assemblyFormat = "`<` `[` $shape `]` `,` $elementType (`,` $memSpace^)? `>`";
+
+  let extraClassDeclaration = [{
+    // ShapedTypeInterface:
+    bool hasRank() const {
+      return true;
+    }
+    test::TestMemrefType cloneWith(std::optional<llvm::ArrayRef<int64_t>> shape, mlir::Type elementType) const {
+      return test::TestMemrefType::get(getContext(), shape.value_or(getShape()), elementType, getMemSpace());
+    }
+  }];
+}
+
 #endif // TEST_TYPEDEFS
diff --git a/mlir/test/lib/Dialect/Test/TestTypes.h b/mlir/test/lib/Dialect/Test/TestTypes.h
index cef3f056a7986..6499a96f495d0 100644
--- a/mlir/test/lib/Dialect/Test/TestTypes.h
+++ b/mlir/test/lib/Dialect/Test/TestTypes.h
@@ -18,6 +18,7 @@
 #include <tuple>
 
 #include "TestTraits.h"
+#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
 #include "mlir/IR/Diagnostics.h"
 #include "mlir/IR/Dialect.h"
 #include "mlir/IR/DialectImplementation.h"
diff --git a/mlir/unittests/IR/InterfaceTest.cpp b/mlir/unittests/IR/InterfaceTest.cpp
index 42196b003e7da..4547e2dd3c8d0 100644
--- a/mlir/unittests/IR/InterfaceTest.cpp
+++ b/mlir/unittests/IR/InterfaceTest.cpp
@@ -6,6 +6,8 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "mlir/Dialect/Bufferization/IR/Bufferization.h"
+#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
 #include "mlir/IR/BuiltinAttributes.h"
 #include "mlir/IR/BuiltinDialect.h"
 #include "mlir/IR/BuiltinOps.h"
@@ -84,3 +86,78 @@ TEST(InterfaceTest, TestImplicitConversion) {
   typeA = typeB;
   EXPECT_EQ(typeA, typeB);
 }
+
+TEST(InterfaceTest, TestBuiltinTensorIsTensorLikeType) {
+  MLIRContext context;
+  // Note: attaches external model to builtins
+  context.loadDialect<bufferization::BufferizationDialect>();
+
+  auto builtinRankedTensor = mlir::RankedTensorType::get(
+      {1, 2, 3}, mlir::IntegerType::get(&context, 32));
+  EXPECT_TRUE(mlir::isa<bufferization::TensorLikeType>(builtinRankedTensor));
+  EXPECT_FALSE(mlir::isa<bufferization::MemRefLikeType>(builtinRankedTensor));
+
+  auto builtinUnrankedTensor =
+      mlir::UnrankedTensorType::get(mlir::IntegerType::get(&context, 32));
+  EXPECT_TRUE(mlir::isa<bufferization::TensorLikeType>(builtinUnrankedTensor));
+  EXPECT_FALSE(mlir::isa<bufferization::MemRefLikeType>(builtinUnrankedTensor));
+}
+
+TEST(InterfaceTest, TestCustomTensorIsTensorLikeType) {
+  MLIRContext context;
+  context.loadDialect<test::TestDialect>();
+
+  auto customTensorType = test::TestTensorType::get(
+      &context, {1, 2, 3}, mlir::IntegerType::get(&context, 32));
+  EXPECT_TRUE(mlir::isa<bufferization::TensorLikeType>(customTensorType));
+
+  auto customCloneType = customTensorType.cloneWith(
+      ArrayRef<int64_t>{3, 4, 5}, customTensorType.getElementType());
+  EXPECT_EQ(customTensorType.getElementType(),
+            customCloneType.getElementType());
+  EXPECT_TRUE(mlir::isa<bufferization::TensorLikeType>(customCloneType));
+  EXPECT_TRUE(mlir::isa<test::TestTensorType>(customCloneType));
+
+  // user-specified conversions
+  bufferization::TensorLikeType baseCopy = customTensorType;
+  std::ignore = baseCopy;
+}
+
+TEST(InterfaceTest, TestBuiltinMemrefIsMemRefLikeType) {
+  MLIRContext context;
+  // Note: attaches external model to builtins
+  context.loadDialect<bufferization::BufferizationDialect>();
+
+  auto builtinRankedMemref =
+      mlir::MemRefType::get({1, 2, 3}, mlir::IntegerType::get(&context, 32));
+  EXPECT_TRUE(mlir::isa<bufferization::MemRefLikeType>(builtinRankedMemref));
+  EXPECT_FALSE(mlir::isa<bufferization::TensorLikeType>(builtinRankedMemref));
+
+  auto builtinUnrankedMemref = mlir::UnrankedMemRefType::get(
+      mlir::IntegerType::get(&context, 32), nullptr);
+  EXPECT_TRUE(mlir::isa<bufferization::MemRefLikeType>(builtinUnrankedMemref));
+  EXPECT_FALSE(mlir::isa<bufferization::TensorLikeType>(builtinUnrankedMemref));
+}
+
+TEST(InterfaceTest, TestCustomMemrefIsMemRefLikeType) {
+  MLIRContext context;
+  context.loadDialect<test::TestDialect>();
+
+  auto customMemrefType = test::TestMemrefType::get(
+      &context, {1, 2, 3}, mlir::IntegerType::get(&context, 32),
+      mlir::StringAttr::get(&context, "some_memspace"));
+  EXPECT_TRUE(mlir::isa<bufferization::MemRefLikeType>(customMemrefType));
+
+  auto customCloneType = customMemrefType.cloneWith(
+      ArrayRef<int64_t>{3, 4, 5}, customMemrefType.getElementType());
+  EXPECT_EQ(customMemrefType.getElementType(),
+            customCloneType.getElementType());
+  EXPECT_TRUE(mlir::isa<bufferization::MemRefLikeType>(customCloneType));
+  EXPECT_TRUE(mlir::isa<test::TestMemrefType>(customCloneType));
+  EXPECT_EQ(customMemrefType.getMemSpace(),
+            mlir::cast<test::TestMemrefType>(customCloneType).getMemSpace());
+
+  // user-specified conversions
+  bufferization::MemRefLikeType baseCopy = customMemrefType;
+  std::ignore = baseCopy;
+}

include "mlir/IR/BuiltinTypeInterfaces.td"

def Bufferization_TensorLikeTypeInterface
: TypeInterface<"TensorLikeType", [ShapedTypeInterface]> {
Contributor Author

note: this attaches ShapedTypeInterface but TensorType (base class of ranked tensor) also attaches ShapedTypeInterface. is there any risk that we can run into trouble due to:

ranked tensor type -> TensorType -> **ShapedTypeInterface**
                  |-> TensorLikeTypeInterface -> **ShapedTypeInterface**

Member

Does it have to inherit the ShapedTypeInterface? If not, let's keep this simpler.

Contributor Author (Apr 4, 2025)

looking at the code, TensorLike doesn't need it (or I haven't seen it). MemRefLike does need it: there are .hasRank() and .getShape() API usages (albeit the ones I've seen are in asserts)

(also, there are memref.getMemorySpace() usages - this is an "API" of BaseMemRefType but I guess I could introduce it later once switch to these new type interfaces happens.)

overall, my motivation is to have ShapedTypeInterface APIs available to avoid boilerplate of cast<ShapedTypeInterface>(tensorLike).getBlah(). however, since TensorLike doesn't seem to need it, maybe i can drop it in that one at least? (but then it's going to be "tensor like" - not shaped type, "memref like" - shaped type which is kind of dumb given that memref is "created" from tensor).

it feels like - in MLIR - tensor and memref are both shaped types "by design" but as we don't have a generic type interfaces for those, this design has to kind of leak into bufferization.

Member

Which functions require these? The main places to look at are in BufferizableOpInterface.cpp. There are functions such as bufferization::getMemRefType. These will probably have to become interface methods on TensorTypeInterface. Apart from that, I don't think the bufferization driver itself really needs anything from the shape type interface.

So I would recommend going without ShapedTypeInterface for now. Are there any places where you'd have to insert explicit casts because of this today?

Contributor Author

I think I only saw the shaped type APIs around here (note NDEBUG):

#ifndef NDEBUG
  if (auto rankedTensorType = dyn_cast<RankedTensorType>(tensorType)) {
    assert(bufferType.hasRank() && callerType.hasRank() &&
           "expected ranked memrefs");
    assert(llvm::all_equal({bufferType.getShape(), callerType.getShape(),
                            rankedTensorType.getShape()}) &&
           "expected same shape");
  } else {
    assert(!bufferType.hasRank() && !callerType.hasRank() &&
           "expected unranked memrefs");
  }
#endif // NDEBUG

Are there any places where you'd have to insert explicit casts because of this today?

I don't think I can see such places (well, maybe in the user code but then those places are likely to cast down to the actual type so kind of not an issue).

Member

We're going to need a function such as MemRefTypeInterface::toTensorType(). It should be possible to reimplement these asserts based on that function. We then use operator== instead of checking shape, rank, etc.

Contributor Author

ack. then let's proceed without enforcing shaped type interface. worst case, can always be added later once the bulk of the code is migrated and new cases are discovered "lazily".

Contributor Author

removed ShapedTypeInterface propagation.

@@ -6,6 +6,8 @@
//
//===----------------------------------------------------------------------===//

#include "mlir/Dialect/Bufferization/IR/Bufferization.h"
#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
Contributor Author

note: i haven't seen bufferization-specific unit tests, so ended up adding all tests here. these seem to test "TestDialect".

Member

As suggested by the path, this tests libIR capabilities. We rarely use this style of unit tests. The way to test this would be to write a test-only pass that, e.g., looks at function signatures and adds an attribute to the function indicating whether the types of the function results implement certain interface. It can then live in mlir/test/lib/Dialect/Bufferization. There is already a precedent.

Contributor Author (Apr 4, 2025)

thanks for the pointer! i'll add it a bit later (seems to be quite an endeavour).

Contributor Author

this is now re-done according to the suggestion. thanks!

@andrey-golubev
Contributor Author

cc @joker-eph @ftynse

@joker-eph
Collaborator

these new type interfaces are attached to the respective builtin types so that the switch is seamless.

Where is it done? I'm not sure how you get around the fact that these are in the bufferization folder which I don't think the core IR can depend on.

@andrey-golubev
Contributor Author

andrey-golubev commented Apr 3, 2025

these new type interfaces are attached to the respective builtin types so that the switch is seamless.

Where is it done? I'm not sure how you get around the fact that these are in the bufferization folder which I don't think the core IR can depend on.

This is done in BufferizationDialect::initialize(). So the flow is: users load the dialect and the new type interfaces get attached to the builtin types.
Update: This is probably very implicit but at the same time very seamless for the users: they just load the dialect - happens anyway? - and get builtin support (out of the box experience).

Indeed, as we cannot depend on bufferization in builtins, this is done via external model. So, TypeInterface x = rankedTensor is not going to work ever without an explicit llvm::cast, but llvm::isa<> works. You can find isa<> check in the tests.

@joker-eph
Collaborator

So the flow is: users load the dialect

Well we're talking about the builtin dialect, user never has to load it. Are you talking about something else here?

Indeed, as we cannot depend on bufferization in builtins, this is done via external model.

Right, but we should never have external interface with a promise in the first place.
It is interestingly OK here though because the special combination of builtin types (always loaded) and the external interface being attached in the initialization of the pass. This makes it impossible to misuse the pass.
There is still a concern that people using bufferization without reusing the pass as-is will hit silent issues here because the lack of promise won't trigger a failure if they don't attach the model (we should refactor the initialization to be reusable by other bufferization pass using the bufferization infra here I think).

@andrey-golubev
Contributor Author

andrey-golubev commented Apr 3, 2025

So the flow is: users load the dialect

Well we're talking about the builtin dialect, user never has to load it. Are you talking about something else here?

I meant loading bufferization dialect. I assume that if the users want to use bufferization infrastructure, they have to load bufferization dialect first? And this is when interface attachment happens (for builtins).

Indeed, as we cannot depend on bufferization in builtins, this is done via external model.

Right, but we should never have external interface with a promise in the first place. It is interestingly OK here though because the special combination of builtin types (always loaded) and the external interface being attached in the initialization of the pass. This makes it impossible to misuse the pass. There is still a concern that people using bufferization without reusing the pass as-is will hit silent issues here because the lack of promise won't trigger a failure if they don't attach the model (we should refactor the initialization to be reusable by other bufferization pass using the bufferization infra here I think).

I'm not sure I follow. Do you think there could be a case where we don't load the bufferization dialect but still use --some-pass-X that relies on it? Wouldn't this introduce other problems also (e.g. "bufferization Op A not found")?
Do you imply that I need to add declarePromisedInterface<ExternalModel, RankedTensorType>(), ... to some place outside of bufferization dialect? So that at least there's an LLVM error when we call a pass but bufferization is not loaded?

Edit: wouldn't this actually mean we always fail unless bufferization dialect is loaded? Even for the programs that don't need it.

@joker-eph
Collaborator

I'm not sure I follow. Do you think there could be a case where we don't load the bufferization dialect but still use

Nevermind, I thought it was the initialize function of the bufferization pass instead of the dialect!

@@ -0,0 +1,21 @@
//===- BufferizationTypeInterfaces.h - Bufferization type interfaces -*- C++
Member

Nit: 80 cols.

Contributor Author (Apr 4, 2025)

I haven't figured out how to do it in 80 :| are there any tips when the file name is too long? should i drop the middle part?

Contributor Author

trimmed the middle part, please tell if it's acceptable.

#ifndef MLIR_DIALECT_BUFFERIZATION_IR_BUFFERIZATIONTYPEINTERFACES_H_
#define MLIR_DIALECT_BUFFERIZATION_IR_BUFFERIZATIONTYPEINTERFACES_H_

#include "mlir/IR/BuiltinTypeInterfaces.h"
Member

Why is this necessary here?

Contributor Author

for ShapedTypeInterface

Contributor Author (Apr 4, 2025)

added a comment stating why the include is needed. i guess, let's figure out in another discussion what we plan generally. then either this stays or is to be removed.


let description = [{
Indicates that the type that attaches this interface can be treated as a
tensor type (similarly to a MLIR builtin tensor) within the Bufferization
dialect.
Member

Nit: I'd rather "during bufferization", the dialect may have little to do with it.

Contributor Author

done

#include "mlir/Dialect/MemRef/IR/MemRef.h"
#include "mlir/Dialect/Tensor/IR/Tensor.h"
#include "mlir/ExecutionEngine/CRunnerUtils.h"
Member

We don't want to depend on ExecutionEngine from lib/Dialect.

Contributor Author

ouch, good catch. i think this is some random clangd added include. i don't even know what that is!

Contributor Author

removed!

@@ -63,6 +76,16 @@ void mlir::bufferization::BufferizationDialect::initialize() {
#include "mlir/Dialect/Bufferization/IR/BufferizationOps.cpp.inc"
>();
addInterfaces<BufferizationInlinerInterface>();

assert(getContext() != nullptr);
Member

Nit: assert(condition && "message") is the convention in MLIR. I'm also relatively confident that the context is never null if you have a live dialect object.

Contributor Author

dropped the assert. these days i seem to overdo them.

I'm also relatively confident that the context is never null if you have a live dialect object.

agree. i guess initialization only (normally) happens through dialect loading which happens through the context (i.e. it cannot be invalid here).

bool hasRank() const { return true; }
test::TestTensorType cloneWith(std::optional<llvm::ArrayRef<int64_t>> shape, mlir::Type elementType) const {
Member

Nit: rewrap at 80 cols.

Contributor Author

done

@@ -6,6 +6,8 @@
//
//===----------------------------------------------------------------------===//

#include "mlir/Dialect/Bufferization/IR/Bufferization.h"
#include "mlir/Dialect/Bufferization/IR/BufferizationTypeInterfaces.h"
Member

As suggested by the path, this tests libIR capabilities. We rarely use this style of unit tests. The way to test this would be to write a test-only pass that, e.g., looks at function signatures and adds an attribute to the function indicating whether the types of the function results implement certain interface. It can then live in mlir/test/lib/Dialect/Bufferization. There is already a precedent.

@ftynse
Member

ftynse commented Apr 4, 2025

I'll leave it up to @matthias-springer, but I find it weird to attach interfaces in the dialect initialization of another, irrelevant dialect. We already have external models for bufferization op interfaces, e.g., `void mlir::vector::registerBufferizableOpInterfaceExternalModels(...)`. I'd expect we do the same for bufferization type interfaces. AFAICS, one-shot-bufferize does not list the bufferization dialect as dependent, so there is no guarantee it is loaded when the pass runs. I'd find it worryingly inconsistent if built-in types wouldn't bufferize when another dialect is not loaded.

@andrey-golubev
Contributor Author

andrey-golubev commented Apr 4, 2025

I'll leave it up to @matthias-springer , but I find it weird to attach interfaces in dialect initialization of another, irrelevant dialect. We already have external models for bufferization op interfaces, e.g.,

I think it's just me being unaware of the correct way of doing this. Thanks for the suggestion!

I'd expect we do the same for bufferization type interfaces. AFAICS, one-shot-bufferize does not list the bufferization dialect as dependent, so there is no guarantee it is loaded when the pass runs. I'd find it worryingly inconsistent if built-in types wouldn't bufferize when another dialect is not loaded.

I agree. Which is why I thought that dialect initialization is the best place.

For the "external model" my only concern is: can we actually have builtin::registerBufferizableOpInterfaceExternalModels()? Wouldn't it introduce unintentional coupling of builtins and bufferization?

@matthias-springer
Member

matthias-springer commented Apr 4, 2025

For the "external model" my only concern is: can we actually have builtin::registerBufferizableOpInterfaceExternalModels()? Wouldn't it introduce unintentional coupling of builtins and bufferization?

This has to be in the bufferization dialect: bufferization::registerTensorLikeTypeInterfaceExternalModels and bufferization::registerMemRefLikeTypeInterfaceExternalModels. Or better, just a single function called bufferization::registerBuiltinTypeInterfaceExternalModels.

I would prefer doing this in the separate functions instead of the dialect initializer for consistency reasons. You're also going to have to add this to InitAllDialects.h. And we should mention in the commit message that additional external models must now be registered. (Note to LLVM integration: ...)

@joker-eph
Collaborator

joker-eph commented Apr 5, 2025

I'll leave it up to @matthias-springer , but I find it weird to attach interfaces in dialect initialization of another, irrelevant dialect.

I would prefer doing this in the separate functions instead of the dialect initializer for consistency reasons.

This is not doable in any other way: we must have "promised interface" declarations in-tree, and you can't do this on a builtin type using an interface defined in bufferization, from a layering point of view.

I noticed this in my comment earlier: #134220 (comment)

@matthias-springer
Member

So you’re saying the external model should be attached in the bufferization dialect initializer because that won’t require a promise?

@joker-eph
Collaborator

So you’re saying the external model should be attached in the bufferization dialect initializer because that won’t require a promise?

That seems like an OK-ish alternative to a promise (I'm assuming we can't write the promise right now). Caveat: Alex noticed that the bufferization dialect isn't a dependency of the actual bufferization.

@matthias-springer
Member

matthias-springer commented Apr 7, 2025

AFAICS, one-shot-bufferize does not list the bufferization dialect as dependent, so there is no guarantee it is loaded when the pass runs. I'd find it worryingly inconsistent if built-in types wouldn't bufferize when another dialect is not loaded.

I just looked into this.

The bufferization dialect is a dependency of the pass:

struct OneShotBufferizePass
    : public bufferization::impl::OneShotBufferizePassBase<
          OneShotBufferizePass> {
  using Base::Base;

  void getDependentDialects(DialectRegistry &registry) const override {
    registry
        .insert<bufferization::BufferizationDialect, memref::MemRefDialect>();
  }

The reason for this is that we are creating bufferization.to_tensor / bufferization.to_memref at the bufferization boundaries.

These two ops will also have to be updated to accept an arbitrary type.

And improve the op verifiers to check that the operands/results implement the new type interfaces. Actually, I'm not sure about the verifiers... Is it OK for ops to pass/fail verification depending on whether a type interface external model was loaded or not? (edit: nvm, this won't be a problem for the builtin tensor/memref types if we load the external model in the bufferization initializer. And for other user-defined types, there will be a promise.)

I would recommend to make the to_memref/to_tensor change in a separate, follow-up PR. (Or a PR that will be merged before this PR.) The assembly format will have to change and the diff will likely be quite large.

: TypeInterface<"TensorLikeType", [ShapedTypeInterface]> {
  let cppNamespace = "::mlir::bufferization";
  let description = [{
    Indicates that the type that attaches this interface can be treated as a
Member

nit: Could be shortened to: Indicates that this type is a tensor type for bufferization purposes. Same for memref.

Contributor Author

Applied, albeit I left the piece about "similarly to an MLIR builtin X" to have a clearer reference to the builtin tensor / memref.

@andrey-golubev
Contributor Author

Ok, so for interface attachment: leave it as it is now; dialect initialization would attach tensor-like / memref-like for builtins (so my unawareness turned out to be a good thing, huh).

Caveat: Alex noticed that the bufferization dialect isn't a dependency of the actual bufferization.

If in terms of the pass, is it true though? It seems to just be added in the C++ definition of the pass (according to Matthias).

As an aside, should I move:

  void getDependentDialects(DialectRegistry &registry) const override {
    registry
        .insert<bufferization::BufferizationDialect, memref::MemRefDialect>();
  }

into tablegen though? Is there a strong preference to keep this in C++ (e.g. include dependencies)?
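For reference, the TableGen equivalent of the C++ override above would look roughly like this in the pass definition (a sketch only; the exact def name and its location in Passes.td may differ):

```tablegen
def OneShotBufferizePass : Pass<"one-shot-bufferize"> {
  let summary = "One-shot bufferization of tensor programs";
  // Equivalent of the C++ getDependentDialects() override quoted above.
  let dependentDialects = [
    "bufferization::BufferizationDialect",
    "memref::MemRefDialect"
  ];
}
```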

And improve the op verifiers to check that the operands/results implement the new type interfaces.

My plan so far is to switch over to "MemRefLike" / "TensorLike" in the operands / results. I guess this way we don't need to change much (verification related) as the "signature" would require types to implement the new interfaces?


I'll proceed with test restructuring as suggested by Alex for now. Then, I guess, that's about it?
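The plan above ("switch over to MemRefLike / TensorLike in the operands / results") could be sketched in ODS roughly as follows; every def name below is an illustrative assumption, not the final patch:

```tablegen
// Illustrative sketch: constrain operands/results via the new type
// interfaces instead of hardcoding builtin tensor/memref.
def Bufferization_TensorLike : Type<
    CPred<"::llvm::isa<::mlir::bufferization::TensorLikeType>($_self)">,
    "tensor-like type">;
def Bufferization_MemRefLike : Type<
    CPred<"::llvm::isa<::mlir::bufferization::MemRefLikeType>($_self)">,
    "memref-like type">;

// to_tensor would then accept any memref-like operand and produce a
// tensor-like result; the generated verifier checks the interfaces.
def Bufferization_ToTensorOp : Bufferization_Op<"to_tensor"> {
  let arguments = (ins Bufferization_MemRefLike:$memref);
  let results = (outs Bufferization_TensorLike:$result);
}
```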

@matthias-springer
Member

My plan so far is to switch over to "MemRefLike" / "TensorLike" in the operands / results. I guess this way we don't need to change much (verification related) as the "signature" would require types to implement the new interfaces?

You're right. I thought we infer the type of the tensors in the assembly format. But that is no longer the case. So there won't be any change in assembly format.

If in terms of the pass, is it true though?

What do you mean by that? From a functional point of view, whether it's defined in C++ or TableGen should not matter. Feel free to move it over to TableGen.

@llvmbot llvmbot added the mlir:core MLIR Core Infrastructure label Apr 7, 2025
@andrey-golubev
Contributor Author

If in terms of the pass, is it true though?

What do you mean by that? From a functional point of view, whether it's defined in C++ or TableGen should not matter. Feel free to move it over to TableGen.

I was wondering whether Alex could've missed the dependency on the bufferization dialect because it was added in C++ (less obvious than let dependentDialects in TableGen imho). Moved to TableGen.

I think at this point I've addressed all outstanding issues, so @ftynse @matthias-springer please take a look once again.

}];
}

def Bufferization_MemRefLikeTypeInterface
Member

Can I ask for one more change? The term used in bufferization is "buffer", not "memref". Can we change this to BufferLikeTypeInterface? E.g., we have getBufferType in BufferizableOpInterface.

Contributor Author

I have thought of this as well but wasn't sure it's better than memref. If you prefer "buffer", sure.

Contributor Author

done!

@andrey-golubev
Contributor Author

Reminder to myself: "memref-like" -> "buffer-like" in PR title and commit message (to be done during squash once all approvals are collected)

@joker-eph
Collaborator

Why don't you update the PR title/description right now? This is what GitHub will use for the squash.

@andrey-golubev
Contributor Author

andrey-golubev commented Apr 8, 2025

Why don't you update the PR title/description right now? This is what GitHub will use for the squash.

I don't have the rights to merge myself, so will squash everything manually in git anyway.

Does the merge actually use GitHub's info? I thought it would use the commit message as is (I guess it depends on the repo settings). The commit message comes from git, not GitHub, so there's little reason to update GitHub's title / description without doing the same for the underlying commit.

@joker-eph
Collaborator

joker-eph commented Apr 8, 2025

I don't have the rights to merge myself, so will squash everything manually in git anyway.

You don't need to do this.

Does the merge actually use GitHub's info? I thought it would use the commit message as is (I guess it depends on the repo settings). The commit message comes from git, not GitHub, so there's little reason to update GitHub's title / description without doing the same for the underlying commit.

No: the LLVM GitHub setup is that your commits and their messages are ignored. Only the PR title and description are used to form the final commit (which won't be a "merge commit" in git parlance, but a squashed and rebased single commit).

This setting allows us reviewers to validate that the title and description are accurate before approving.

@andrey-golubev andrey-golubev changed the title [mlir][bufferization] Add tensor-like and memref-like interfaces [mlir][bufferization] Add tensor-like and buffer-like interfaces Apr 8, 2025
@andrey-golubev
Contributor Author

I see. I guess the procedure changed at some point since my last visit to LLVM repo, nice!

Updated the PR title / description just now.

@joker-eph
Collaborator

I see. I guess the procedure changed at some point since my last visit to LLVM repo, nice!

It has been like this ever since we've been using GitHub pull requests (replicating the Phabricator workflow was another motivation).

@andrey-golubev
Contributor Author

andrey-golubev commented Apr 8, 2025

It has been like this ever since we've been using GitHub pull requests (replicating the Phabricator workflow was another motivation).

Hmm, pretty sure there was some older process (just after the move?), or maybe the lack of one ;)
I see "force-pushed" in older patches of mine. That was also before the codeowners lists, I think.
Anyhow, thanks for keeping me up to date!

Does it mean I also don't need to rebase onto the latest main before merging (to re-run CI on the latest state)?

@joker-eph
Collaborator

Hmm, pretty sure there was some older process (just after the move?), or maybe the lack of one ;)

The setting has always been this one, since day 0 :)

I see "force-pushed" in older patches of mine.

Yes: I always rebase, amend, and force push. I never have multiple commits in my PR.
The thing is that this has no bearing on the resulting merging action: GitHub will "rebase+squash" and use the PR description :)

@ftynse ftynse merged commit 00eaff3 into llvm:main Apr 15, 2025
11 checks passed
nithinsubbiah added a commit to iree-org/llvm-project that referenced this pull request Apr 16, 2025
nithinsubbiah added a commit to nithinsubbiah/iree that referenced this pull request Apr 16, 2025
nithinsubbiah added a commit to iree-org/iree that referenced this pull request Apr 16, 2025
@andrey-golubev andrey-golubev deleted the bufferization_interfaces branch April 17, 2025 07:27
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…m#134220)

Current one-shot bufferization infrastructure operates on top of
TensorType and BaseMemRefType. These are non-extensible base classes of
the respective builtins: tensor and memref. Thus, the infrastructure is
bound to work only with builtin tensor/memref types. At the same time,
there are customization points that allow one to provide custom logic to
control the bufferization behavior.

This patch introduces new type interfaces: tensor-like and buffer-like
that aim to supersede TensorType/BaseMemRefType within the bufferization
dialect and allow custom tensors / memrefs to be used. Additionally,
these new type interfaces are attached to the respective builtin types
so that the switch is seamless.

Note that this patch does very minimal initial work, it does NOT
refactor bufferization infrastructure.

See https://discourse.llvm.org/t/rfc-changing-base-types-for-tensors-and-memrefs-from-c-base-classes-to-type-interfaces/85509
Labels
mlir:bufferization Bufferization infrastructure mlir:core MLIR Core Infrastructure mlir

5 participants