Skip to content

[MLIR] EmitC: Add subscript operator #84783

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 15, 2024

Conversation

mgehre-amd
Copy link
Contributor

@mgehre-amd mgehre-amd commented Mar 11, 2024

Introduces a SubscriptOp that allows to write IR like

func.func @load_store(%arg0: !emitc.array<4x8xf32>, %arg1: !emitc.array<3x5xf32>, %arg2: index, %arg3: index) {
  %0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>, index, index
  %1 = emitc.subscript %arg1[%arg2, %arg3] : <3x5xf32>, index, index
  emitc.assign %0 : f32 to %1 : f32
  return
}

which gets translated into the C++ code

v1[v2][v3] = v0[v1][v2];

To make this happen, this

  • adds the SubscriptOp
  • allows the subscript op as rhs of emitc.assign
  • updates the emitter to print SubscriptOps

The emitter prints emitc.subscript in a delayed fashing to allow it being used as lvalue.
I.e. while processing

%0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>, index, index

it will not emit any text, but record in the valueMapper that the name for %0 is v0[v1][v2], see CppEmitter::getSubscriptName. Only when that result is then used (here in emitc.assign), that name is inserted into the text.

@llvmbot
Copy link
Member

llvmbot commented Mar 11, 2024

@llvm/pr-subscribers-mlir-emitc

@llvm/pr-subscribers-mlir

Author: Matthias Gehre (mgehre-amd)

Changes

Introduces a SubscriptOp that allows to write IR like

func.func @<!-- -->load_store(%arg0: !emitc.array&lt;4x8xf32&gt;, %arg1: !emitc.array&lt;3x5xf32&gt;, %arg2: index, %arg3: index) {
  %0 = emitc.subscript %arg0[%arg2, %arg3] : &lt;4x8xf32&gt;
  %1 = emitc.subscript %arg1[%arg2, %arg3] : &lt;3x5xf32&gt;
  emitc.assign %0 : f32 to %1 : f32
  return
}

which gets translated into the C++ code

v1[v2][v3] = v0[v1][v2];

To make this happen, this

  • adds the SubscriptOp
  • allows the subscript op as rhs of emitc.assign
  • updates the emitter to print SubscriptOps

The emitter prints emitc.subscript in a delayed fashing to allow it being used as lvalue.
I.e. while processing

%0 = emitc.subscript %arg0[%arg2, %arg3] : &lt;4x8xf32&gt;

it will not emit any text, but record in the valueMapper that the name for %0 is v0[v1][v2], see CppEmitter::getSubscriptName. Only when that result is then used (here in emitc.assign), that name is inserted into the text.


Full diff: https://github.com/llvm/llvm-project/pull/84783.diff

5 Files Affected:

  • (modified) mlir/include/mlir/Dialect/EmitC/IR/EmitC.td (+32)
  • (modified) mlir/lib/Dialect/EmitC/IR/EmitC.cpp (+17-2)
  • (modified) mlir/lib/Target/Cpp/TranslateToCpp.cpp (+34-7)
  • (modified) mlir/test/Dialect/EmitC/invalid_ops.mlir (+8)
  • (added) mlir/test/Target/Cpp/subscript.mlir (+12)
diff --git a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
index ac1e38a5506da0..bcdd001528c46d 100644
--- a/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
+++ b/mlir/include/mlir/Dialect/EmitC/IR/EmitC.td
@@ -1125,4 +1125,36 @@ def EmitC_IfOp : EmitC_Op<"if",
   let hasCustomAssemblyFormat = 1;
 }
 
+def EmitC_SubscriptOp : EmitC_Op<"subscript",
+  [TypesMatchWith<"result type matches element type of 'array'",
+                  "array", "result",
+                  "::llvm::cast<ArrayType>($_self).getElementType()">]> {
+  let summary = "Array subscript operation";
+  let description = [{
+    With the `subscript` operation the subscript operator `[]` can be applied
+    to variables or arguments of array type.
+
+    Example:
+
+    ```mlir
+    %i = index.constant 1
+    %j = index.constant 7
+    %0 = emitc.subscript %arg0[%i][%j] : (!emitc.array<4x8xf32>) -> f32
+    ```
+  }];
+  let arguments = (ins Arg<EmitC_ArrayType, "the reference to load from">:$array,
+                       Variadic<Index>:$indices);
+  let results = (outs AnyType:$result);
+
+  let builders = [
+    OpBuilder<(ins "Value":$array, "ValueRange":$indices), [{
+      build($_builder, $_state, cast<ArrayType>(array.getType()).getElementType(), array, indices);
+    }]>
+  ];
+
+  let hasVerifier = 1;
+  let assemblyFormat = "$array `[` $indices `]` attr-dict `:` type($array)";
+}
+
+
 #endif // MLIR_DIALECT_EMITC_IR_EMITC
diff --git a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
index 9426bbbe2370f0..e401a83bcb42e6 100644
--- a/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
+++ b/mlir/lib/Dialect/EmitC/IR/EmitC.cpp
@@ -132,9 +132,10 @@ LogicalResult ApplyOp::verify() {
 LogicalResult emitc::AssignOp::verify() {
   Value variable = getVar();
   Operation *variableDef = variable.getDefiningOp();
-  if (!variableDef || !llvm::isa<emitc::VariableOp>(variableDef))
+  if (!variableDef ||
+      !llvm::isa<emitc::VariableOp, emitc::SubscriptOp>(variableDef))
     return emitOpError() << "requires first operand (" << variable
-                         << ") to be a Variable";
+                         << ") to be a Variable or subscript";
 
   Value value = getValue();
   if (variable.getType() != value.getType())
@@ -746,6 +747,20 @@ LogicalResult emitc::YieldOp::verify() {
   return success();
 }
 
+//===----------------------------------------------------------------------===//
+// SubscriptOp
+//===----------------------------------------------------------------------===//
+
+LogicalResult emitc::SubscriptOp::verify() {
+  if (getIndices().size() != (size_t)getArray().getType().getRank()) {
+    return emitOpError() << "requires number of indices ("
+                         << getIndices().size()
+                         << ") to match the rank of the array type ("
+                         << getArray().getType().getRank() << ")";
+  }
+  return success();
+}
+
 //===----------------------------------------------------------------------===//
 // TableGen'd op method definitions
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Target/Cpp/TranslateToCpp.cpp b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
index 3cf137c1d07c0e..6e477a34fc4ba9 100644
--- a/mlir/lib/Target/Cpp/TranslateToCpp.cpp
+++ b/mlir/lib/Target/Cpp/TranslateToCpp.cpp
@@ -171,6 +171,9 @@ struct CppEmitter {
   /// Return the existing or a new name for a Value.
   StringRef getOrCreateName(Value val);
 
+  // Returns the textual representation of a subscript operation.
+  std::string getSubscriptName(emitc::SubscriptOp op);
+
   /// Return the existing or a new label of a Block.
   StringRef getOrCreateName(Block &block);
 
@@ -340,8 +343,7 @@ static LogicalResult printOperation(CppEmitter &emitter,
 
 static LogicalResult printOperation(CppEmitter &emitter,
                                     emitc::AssignOp assignOp) {
-  auto variableOp = cast<emitc::VariableOp>(assignOp.getVar().getDefiningOp());
-  OpResult result = variableOp->getResult(0);
+  OpResult result = assignOp.getVar().getDefiningOp()->getResult(0);
 
   if (failed(emitter.emitVariableAssignment(result)))
     return failure();
@@ -349,6 +351,13 @@ static LogicalResult printOperation(CppEmitter &emitter,
   return emitter.emitOperand(assignOp.getValue());
 }
 
+static LogicalResult printOperation(CppEmitter &emitter,
+                                    emitc::SubscriptOp subscriptOp) {
+  // Add name to cache so that `hasValueInScope` works.
+  emitter.getOrCreateName(subscriptOp.getResult());
+  return success();
+}
+
 static LogicalResult printBinaryOperation(CppEmitter &emitter,
                                           Operation *operation,
                                           StringRef binaryOperator) {
@@ -1067,12 +1076,28 @@ CppEmitter::CppEmitter(raw_ostream &os, bool declareVariablesAtTop)
   labelInScopeCount.push(0);
 }
 
+std::string CppEmitter::getSubscriptName(emitc::SubscriptOp op) {
+  std::string out;
+  llvm::raw_string_ostream ss(out);
+  ss << getOrCreateName(op.getArray());
+  for (auto index : op.getIndices()) {
+    ss << "[" << getOrCreateName(index) << "]";
+  }
+  return out;
+}
+
 /// Return the existing or a new name for a Value.
 StringRef CppEmitter::getOrCreateName(Value val) {
   if (auto literal = dyn_cast_if_present<emitc::LiteralOp>(val.getDefiningOp()))
     return literal.getValue();
-  if (!valueMapper.count(val))
-    valueMapper.insert(val, formatv("v{0}", ++valueInScopeCount.top()));
+  if (!valueMapper.count(val)) {
+    if (auto subscript =
+            dyn_cast_if_present<emitc::SubscriptOp>(val.getDefiningOp())) {
+      valueMapper.insert(val, getSubscriptName(subscript));
+    } else {
+      valueMapper.insert(val, formatv("v{0}", ++valueInScopeCount.top()));
+    }
+  }
   return *valueMapper.begin(val);
 }
 
@@ -1312,6 +1337,8 @@ LogicalResult CppEmitter::emitVariableAssignment(OpResult result) {
 
 LogicalResult CppEmitter::emitVariableDeclaration(OpResult result,
                                                   bool trailingSemicolon) {
+  if (isa<emitc::SubscriptOp>(result.getDefiningOp()))
+    return success();
   if (hasValueInScope(result)) {
     return result.getDefiningOp()->emitError(
         "result variable for the operation already declared");
@@ -1387,8 +1414,8 @@ LogicalResult CppEmitter::emitOperation(Operation &op, bool trailingSemicolon) {
                 emitc::ExpressionOp, emitc::ForOp, emitc::FuncOp, emitc::IfOp,
                 emitc::IncludeOp, emitc::LogicalAndOp, emitc::LogicalNotOp,
                 emitc::LogicalOrOp, emitc::MulOp, emitc::RemOp, emitc::ReturnOp,
-                emitc::SubOp, emitc::UnaryMinusOp, emitc::UnaryPlusOp,
-                emitc::VariableOp, emitc::VerbatimOp>(
+                emitc::SubOp, emitc::SubscriptOp, emitc::UnaryMinusOp,
+                emitc::UnaryPlusOp, emitc::VariableOp, emitc::VerbatimOp>(
               [&](auto op) { return printOperation(*this, op); })
           // Func ops.
           .Case<func::CallOp, func::FuncOp, func::ReturnOp>(
@@ -1401,7 +1428,7 @@ LogicalResult CppEmitter::emitOperation(Operation &op, bool trailingSemicolon) {
   if (failed(status))
     return failure();
 
-  if (isa<emitc::LiteralOp>(op))
+  if (isa<emitc::LiteralOp, emitc::SubscriptOp>(op))
     return success();
 
   if (getEmittedExpression() ||
diff --git a/mlir/test/Dialect/EmitC/invalid_ops.mlir b/mlir/test/Dialect/EmitC/invalid_ops.mlir
index 58b3a11ed93e15..cc718e190484a8 100644
--- a/mlir/test/Dialect/EmitC/invalid_ops.mlir
+++ b/mlir/test/Dialect/EmitC/invalid_ops.mlir
@@ -387,3 +387,11 @@ func.func @logical_or_resulterror(%arg0: i32, %arg1: i32) {
   %0 = "emitc.logical_or"(%arg0, %arg1) : (i32, i32) -> i32
   return
 }
+
+// -----
+
+func.func @test_subscript_indices_mismatch(%arg0: !emitc.array<4x8xf32>, %arg2: index) {
+  // expected-error @+1 {{'emitc.subscript' op requires number of indices (1) to match the rank of the array type (2)}}
+  %0 = emitc.subscript %arg0[%arg2] : <4x8xf32>
+  return
+}
diff --git a/mlir/test/Target/Cpp/subscript.mlir b/mlir/test/Target/Cpp/subscript.mlir
new file mode 100644
index 00000000000000..0f9bc515b48dc2
--- /dev/null
+++ b/mlir/test/Target/Cpp/subscript.mlir
@@ -0,0 +1,12 @@
+// RUN: mlir-translate -mlir-to-cpp %s | FileCheck %s
+// RUN: mlir-translate -mlir-to-cpp -declare-variables-at-top %s | FileCheck %s
+
+func.func @load_store(%arg0: !emitc.array<4x8xf32>, %arg1: !emitc.array<3x5xf32>, %arg2: index, %arg3: index) {
+  %0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>
+  %1 = emitc.subscript %arg1[%arg2, %arg3] : <3x5xf32>
+  emitc.assign %0 : f32 to %1 : f32
+  return
+}
+// CHECK: void load_store(float [[ARR1:[^ ]*]][4][8], float [[ARR2:[^ ]*]][3][5],
+// CHECK-SAME:            size_t [[I:[^ ]*]], size_t [[J:[^ ]*]])
+// CHECK-NEXT: [[ARR2]][[[I]]][[[J]]] = [[ARR1]][[[I]]][[[J]]];

@marbre
Copy link
Member

marbre commented Mar 12, 2024

Thanks for working on this! To make this more useful, I think the subscript op should also be usable on !emitc.ptr pointing to arrays.

@simon-camp
Copy link
Contributor

Thanks for the PR @mgehre-amd. Some early comments before looking into the changes in the emitter.

I can imagine different types to be subscriptable with different constraints, some of which you already check for.

array type
  • indices should be integer like, we use IntegerIndexOrOpaqueTypein other places for this
  • element type and result type should match
  • number of indices should match rank of the array type
pointer type
  • index should be integer like, we use IntegerIndexOrOpaqueTypein other places for this
  • pointee type and result type should match
  • only one index allowed
opaque type
  • indices can be any type
  • result can be any type
  • number of indices can be variadic since C++23

@mgehre-amd
Copy link
Contributor Author

Thanks for the PR @mgehre-amd. Some early comments before looking into the changes in the emitter.

I can imagine different types to be subscriptable with different constraints, some of which you already check for.

array type
  • indices should be integer like, we use IntegerIndexOrOpaqueTypein other places for this
  • element type and result type should match
  • number of indices should match rank of the array type
pointer type
  • index should be integer like, we use IntegerIndexOrOpaqueTypein other places for this
  • pointee type and result type should match
  • only one index allowed
opaque type
  • indices can be any type
  • result can be any type
  • number of indices can be variadic since C++23

Thanks for your comments! Do you think we can split support for SubscriptOp for this additional types into separate PRs?

Copy link
Contributor

@simon-camp simon-camp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me.

I don't really like that we are accumulating more and more special cases in the emitter. With emitc.literal, emitc.expression and emitc.subscript we now have three ops for which the text emission is deferred in one way or another. I think we can refactor this in a follow up though.

Introduces a SubscriptOp that allows to write IR like
```
func.func @load_store(%arg0: !emitc.array<4x8xf32>, %arg1: !emitc.array<3x5xf32>, %arg2: index, %arg3: index) {
  %0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>
  %1 = emitc.subscript %arg1[%arg2, %arg3] : <3x5xf32>
  emitc.assign %0 : f32 to %1 : f32
  return
}
```
which gets translated into the C++ code
```
v1[v2][v3] = v0[v1][v2];
```

To make this happen, this
- adds the SubscriptOp
- allows the subscript op as rhs of emitc.assign
- updates the emitter to print SubscriptOps

The emitter prints emitc.subscript in a delayed fashing to allow
it being used as lvalue.
I.e. while processing
```
%0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>
```
it will not emit any text, but record in the `valueMapper`
that the name for `%0` is `v0[v1][v2]`, see `CppEmitter::getSubscriptName`.
Only when that result is then used (here in `emitc.assign`),
that name is inserted into the text.
@mgehre-amd mgehre-amd force-pushed the mgehre.add_subscript_op branch from 3bc50bf to 8d85dd6 Compare March 14, 2024 22:10
@mgehre-amd mgehre-amd requested a review from simon-camp March 15, 2024 07:19
Copy link
Contributor

@simon-camp simon-camp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot to mention, can you update the syntax in your PR description please.

@mgehre-amd
Copy link
Contributor Author

I forgot to mention, can you update the syntax in your PR description please.

Good catch, done!

@mgehre-amd mgehre-amd requested a review from simon-camp March 15, 2024 10:05
Copy link
Contributor

@simon-camp simon-camp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

@mgehre-amd mgehre-amd merged commit 01a31ce into llvm:main Mar 15, 2024
@mgehre-amd mgehre-amd deleted the mgehre.add_subscript_op branch March 15, 2024 10:08
mgehre-amd added a commit to Xilinx/llvm-project that referenced this pull request Mar 22, 2024
Introduces a SubscriptOp that allows to write IR like
```
func.func @load_store(%arg0: !emitc.array<4x8xf32>, %arg1: !emitc.array<3x5xf32>, %arg2: index, %arg3: index) {
  %0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>, index, index
  %1 = emitc.subscript %arg1[%arg2, %arg3] : <3x5xf32>, index, index
  emitc.assign %0 : f32 to %1 : f32
  return
}
```
which gets translated into the C++ code
```
v1[v2][v3] = v0[v1][v2];
```

To make this happen, this
- adds the SubscriptOp
- allows the subscript op as rhs of emitc.assign
- updates the emitter to print SubscriptOps

The emitter prints emitc.subscript in a delayed fashing to allow it
being used as lvalue.
I.e. while processing
```
%0 = emitc.subscript %arg0[%arg2, %arg3] : <4x8xf32>, index, index
```
it will not emit any text, but record in the `valueMapper` that the name
for `%0` is `v0[v1][v2]`, see `CppEmitter::getSubscriptName`. Only when
that result is then used (here in `emitc.assign`), that name is inserted
into the text.
benvanik pushed a commit to iree-org/iree that referenced this pull request Aug 23, 2024
Replace array related helper macros with the `emitc.subscript` operation
introduced in llvm/llvm-project#84783.

Signed-off-by: Simon Camphausen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants