
Commit 0050e8f

[mlir][Tutorial] Add a section to Toy Ch.2 detailing the custom assembly format.
Summary: This details the C++ format as well as the new declarative format. This
has been one of the major missing pieces from the toy tutorial.

Differential Revision: https://reviews.llvm.org/D74938
1 parent 9eb436f commit 0050e8f


41 files changed (+1176 −344 lines)

mlir/docs/Tutorials/Toy/Ch-2.md

Lines changed: 180 additions & 6 deletions
@@ -517,12 +517,7 @@ def ConstantOp : Toy_Op<"constant", [NoSideEffect]> {
 }
 ```

-Above we introduce several of the concepts for defining operations in the ODS
-framework, but there are many more that we haven't had a chance to: regions,
-variadic operands, etc. Check out the
-[full specification](../../OpDefinitions.md) for more details.
-
-## Complete Toy Example
+#### Specifying a Custom Assembly Format

 At this point we can generate our "Toy IR". A simplified version of the previous
 example:
@@ -565,6 +560,185 @@ module {
 } loc("test/codegen.toy":0:0)
 ```

+One thing to notice here is that all of our Toy operations are printed using the
+generic assembly format. This format is the one shown when breaking down
+`toy.transpose` at the beginning of this chapter. MLIR allows for operations to
+define their own custom assembly format, either
+[declaratively](../../OpDefinitions.md#declarative-assembly-format) or
+imperatively via C++. Defining a custom assembly format allows for tailoring the
+generated IR into something a bit more readable by removing a lot of the fluff
+that is required by the generic format. Let's walk through an example of an
+operation format that we would like to simplify.
+
+##### `toy.print`
+
+The current form of `toy.print` is a little verbose. There are a lot of
+additional characters that we would like to strip away. Let's begin by thinking
+of what a good format of `toy.print` would be, and see how we can implement it.
+Looking at the basics of `toy.print` we get:
+
+```mlir
+toy.print %5 : tensor<*xf64> loc(...)
+```
+
+Here we have stripped much of the format down to the bare essentials, and it has
+become much more readable. To provide a custom assembly format, an operation can
+either override the `parser` and `printer` fields for a C++ format, or the
+`assemblyFormat` field for the declarative format. Let's look at the C++ variant
+first, as this is what the declarative format maps to internally.
+
+```tablegen
+/// Consider a stripped definition of `toy.print` here.
+def PrintOp : Toy_Op<"print"> {
+  let arguments = (ins F64Tensor:$input);
+
+  // Divert the printer and parser to static functions in our .cpp
+  // file that correspond to 'print' and 'parsePrintOp'. 'printer' and 'parser'
+  // here correspond to an instance of an 'OpAsmPrinter' and 'OpAsmParser'. More
+  // details on these classes are shown below.
+  let printer = [{ return ::print(printer, *this); }];
+  let parser = [{ return ::parse$cppClass(parser, result); }];
+}
+```
+
+A C++ implementation for the printer and parser is shown below:
+
+```c++
+/// The 'OpAsmPrinter' class is a stream that allows for formatting
+/// strings, attributes, operands, types, etc.
+static void print(mlir::OpAsmPrinter &printer, PrintOp op) {
+  printer << "toy.print " << op.input();
+  printer.printOptionalAttrDict(op.getAttrs());
+  printer << " : " << op.input().getType();
+}
+
+/// The 'OpAsmParser' class provides a collection of methods for parsing
+/// various punctuation, as well as attributes, operands, types, etc. Each of
+/// these methods returns a `ParseResult`. This class is a wrapper around
+/// `LogicalResult` that can be converted to a boolean `true` value on failure,
+/// or `false` on success. This allows for easily chaining together a set of
+/// parser rules. These rules are used to populate an `mlir::OperationState`
+/// similarly to the `build` methods described above.
+static mlir::ParseResult parsePrintOp(mlir::OpAsmParser &parser,
+                                      mlir::OperationState &result) {
+  // Parse the input operand, the attribute dictionary, and the type of the
+  // input.
+  mlir::OpAsmParser::OperandType inputOperand;
+  mlir::Type inputType;
+  if (parser.parseOperand(inputOperand) ||
+      parser.parseOptionalAttrDict(result.attributes) || parser.parseColon() ||
+      parser.parseType(inputType))
+    return mlir::failure();
+
+  // Resolve the input operand to the type we parsed in.
+  if (parser.resolveOperand(inputOperand, inputType, result.operands))
+    return mlir::failure();
+
+  return mlir::success();
+}
+```
+
+With the C++ implementation defined, let's see how this can be mapped to the
+[declarative format](../../OpDefinitions.md#declarative-assembly-format). The
+declarative format is largely composed of three different components:
+
+*   Directives
+    -   A type of builtin function, with an optional set of arguments.
+*   Literals
+    -   A keyword or punctuation surrounded by \`\`.
+*   Variables
+    -   An entity that has been registered on the operation itself, i.e. an
+        argument (attribute or operand), result, successor, etc. In the
+        `PrintOp` example above, a variable would be `$input`.
+
+A direct mapping of our C++ format looks something like:
+
+```tablegen
+/// Consider a stripped definition of `toy.print` here.
+def PrintOp : Toy_Op<"print"> {
+  let arguments = (ins F64Tensor:$input);
+
+  // In the following format we have two directives, `attr-dict` and `type`.
+  // These correspond to the attribute dictionary and the type of a given
+  // variable respectively.
+  let assemblyFormat = "$input attr-dict `:` type($input)";
+}
+```
+
+The [declarative format](../../OpDefinitions.md#declarative-assembly-format) has
+many more interesting features, so be sure to check it out before implementing a
+custom format in C++. After beautifying the format of a few of our operations we
+now get something much more readable:
+
+```mlir
+module {
+  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
+    %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:10)
+    %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:25)
+    %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/codegen.toy":5:25)
+    toy.return %2 : tensor<*xf64> loc("test/codegen.toy":5:3)
+  } loc("test/codegen.toy":4:1)
+  func @main() {
+    %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/codegen.toy":9:17)
+    %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/codegen.toy":9:3)
+    %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/codegen.toy":10:17)
+    %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/codegen.toy":10:3)
+    %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":11:11)
+    %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":12:11)
+    toy.print %5 : tensor<*xf64> loc("test/codegen.toy":13:3)
+    toy.return loc("test/codegen.toy":8:1)
+  } loc("test/codegen.toy":8:1)
+} loc("test/codegen.toy":0:0)
+```
+
+Above we introduce several of the concepts for defining operations in the ODS
+framework, but there are many more that we haven't had a chance to cover:
+regions, variadic operands, etc. Check out the
+[full specification](../../OpDefinitions.md) for more details.
+
+## Complete Toy Example
+
+At this point we can generate our "Toy IR". A simplified version of the previous
+example:
+
+```toy
+# User defined generic function that operates on unknown shaped arguments.
+def multiply_transpose(a, b) {
+  return transpose(a) * transpose(b);
+}
+
+def main() {
+  var a<2, 3> = [[1, 2, 3], [4, 5, 6]];
+  var b<2, 3> = [1, 2, 3, 4, 5, 6];
+  var c = multiply_transpose(a, b);
+  var d = multiply_transpose(b, a);
+  print(d);
+}
+```
+
+Results in the following IR:
+
+```mlir
+module {
+  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
+    %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:10)
+    %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:25)
+    %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/codegen.toy":5:25)
+    toy.return %2 : tensor<*xf64> loc("test/codegen.toy":5:3)
+  } loc("test/codegen.toy":4:1)
+  func @main() {
+    %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/codegen.toy":9:17)
+    %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/codegen.toy":9:3)
+    %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/codegen.toy":10:17)
+    %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/codegen.toy":10:3)
+    %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":11:11)
+    %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":12:11)
+    toy.print %5 : tensor<*xf64> loc("test/codegen.toy":13:3)
+    toy.return loc("test/codegen.toy":8:1)
+  } loc("test/codegen.toy":8:1)
+} loc("test/codegen.toy":0:0)
+```
+
 You can build `toyc-ch2` and try yourself: `toyc-ch2
 test/Examples/Toy/Ch2/codegen.toy -emit=mlir -mlir-print-debuginfo`. We can also
 check our RoundTrip: `toyc-ch2 test/Examples/Toy/Ch2/codegen.toy -emit=mlir

mlir/docs/Tutorials/Toy/Ch-3.md

Lines changed: 15 additions & 17 deletions
@@ -38,9 +38,9 @@ Which corresponds to the following IR:

 ```mlir
 func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
-  %0 = "toy.transpose"(%arg0) : (tensor<*xf64>) -> tensor<*xf64>
-  %1 = "toy.transpose"(%0) : (tensor<*xf64>) -> tensor<*xf64>
-  "toy.return"(%1) : (tensor<*xf64>) -> ()
+  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
+  %1 = toy.transpose(%0 : tensor<*xf64>) to tensor<*xf64>
+  toy.return %1 : tensor<*xf64>
 }
 ```

@@ -133,8 +133,8 @@ observe our pattern in action:

 ```mlir
 func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
-  %0 = "toy.transpose"(%arg0) : (tensor<*xf64>) -> tensor<*xf64>
-  "toy.return"(%arg0) : (tensor<*xf64>) -> ()
+  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
+  toy.return %arg0 : tensor<*xf64>
 }
 ```

@@ -154,7 +154,7 @@ Let's retry now `toyc-ch3 test/transpose_transpose.toy -emit=mlir -opt`:

 ```mlir
 func @transpose_transpose(%arg0: tensor<*xf64>) -> tensor<*xf64> {
-  "toy.return"(%arg0) : (tensor<*xf64>) -> ()
+  toy.return %arg0 : tensor<*xf64>
 }
 ```

@@ -229,13 +229,12 @@ def main() {
 ```mlir
 module {
   func @main() {
-    %0 = "toy.constant"() {value = dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf64>}
-        : () -> tensor<2xf64>
-    %1 = "toy.reshape"(%0) : (tensor<2xf64>) -> tensor<2x1xf64>
-    %2 = "toy.reshape"(%1) : (tensor<2x1xf64>) -> tensor<2x1xf64>
-    %3 = "toy.reshape"(%2) : (tensor<2x1xf64>) -> tensor<2x1xf64>
-    "toy.print"(%3) : (tensor<2x1xf64>) -> ()
-    "toy.return"() : () -> ()
+    %0 = toy.constant dense<[1.000000e+00, 2.000000e+00]> : tensor<2xf64>
+    %1 = toy.reshape(%0 : tensor<2xf64>) to tensor<2x1xf64>
+    %2 = toy.reshape(%1 : tensor<2x1xf64>) to tensor<2x1xf64>
+    %3 = toy.reshape(%2 : tensor<2x1xf64>) to tensor<2x1xf64>
+    toy.print %3 : tensor<2x1xf64>
+    toy.return
   }
 }
 ```
@@ -246,10 +245,9 @@ our pattern in action:
 ```mlir
 module {
   func @main() {
-    %0 = "toy.constant"() {value = dense<[[1.000000e+00], [2.000000e+00]]> \
-        : tensor<2x1xf64>} : () -> tensor<2x1xf64>
-    "toy.print"(%0) : (tensor<2x1xf64>) -> ()
-    "toy.return"() : () -> ()
+    %0 = toy.constant dense<[[1.000000e+00], [2.000000e+00]]> : tensor<2x1xf64>
+    toy.print %0 : tensor<2x1xf64>
+    toy.return
   }
 }
 ```
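The prettified `toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>` form seen throughout these hunks can be produced declaratively as well. A hedged sketch of what such a definition might look like (illustrative only; the commit's actual `TransposeOp` definition may differ):

```tablegen
// Illustrative sketch: a transpose-like op whose assembly format reproduces
// the `(`, `:`, `)`, and `to` punctuation via literals, with the `attr-dict`
// and `type` directives filling in the rest.
def TransposeOp : Toy_Op<"transpose"> {
  let arguments = (ins F64Tensor:$input);
  let results = (outs F64Tensor);

  let assemblyFormat = [{
    `(` $input `:` type($input) `)` attr-dict `to` type(results)
  }];
}
```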

mlir/docs/Tutorials/Toy/Ch-4.md

Lines changed: 16 additions & 16 deletions
@@ -150,20 +150,20 @@ Now let's look at a working example:

 ```mlir
 func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
-  %0 = "toy.transpose"(%arg0) : (tensor<*xf64>) -> tensor<*xf64>
-  %1 = "toy.transpose"(%arg1) : (tensor<*xf64>) -> tensor<*xf64>
-  %2 = "toy.mul"(%0, %1) : (tensor<*xf64>, tensor<*xf64>) -> tensor<*xf64>
-  "toy.return"(%2) : (tensor<*xf64>) -> ()
+  %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>
+  %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64>
+  %2 = toy.mul %0, %1 : tensor<*xf64>
+  toy.return %2 : tensor<*xf64>
 }
 func @main() {
-  %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
-  %1 = "toy.reshape"(%0) : (tensor<2x3xf64>) -> tensor<2x3xf64>
-  %2 = "toy.constant"() {value = dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>} : () -> tensor<6xf64>
-  %3 = "toy.reshape"(%2) : (tensor<6xf64>) -> tensor<2x3xf64>
-  %4 = "toy.generic_call"(%1, %3) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
-  %5 = "toy.generic_call"(%3, %1) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
-  "toy.print"(%5) : (tensor<*xf64>) -> ()
-  "toy.return"() : () -> ()
+  %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>
+  %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64>
+  %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>
+  %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64>
+  %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
+  %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64>
+  toy.print %5 : tensor<*xf64>
+  toy.return
 }
 ```

@@ -226,8 +226,8 @@ func @main() {
   %4 = "toy.transpose"(%2) : (tensor<*xf64>) -> tensor<*xf64>
   %5 = "toy.transpose"(%3) : (tensor<*xf64>) -> tensor<*xf64>
   %6 = "toy.mul"(%4, %5) : (tensor<*xf64>, tensor<*xf64>) -> tensor<*xf64>
-  "toy.print"(%6) : (tensor<*xf64>) -> ()
-  "toy.return"() : () -> ()
+  toy.print %6 : tensor<*xf64>
+  toy.return
 }
 ```

@@ -374,8 +374,8 @@ func @main() {
   %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
   %1 = "toy.transpose"(%0) : (tensor<2x3xf64>) -> tensor<3x2xf64>
   %2 = "toy.mul"(%1, %1) : (tensor<3x2xf64>, tensor<3x2xf64>) -> tensor<3x2xf64>
-  "toy.print"(%2) : (tensor<3x2xf64>) -> ()
-  "toy.return"() : () -> ()
+  toy.print %2 : tensor<3x2xf64>
+  toy.return
 }
 ```

mlir/docs/Tutorials/Toy/Ch-5.md

Lines changed: 7 additions & 7 deletions
@@ -239,11 +239,11 @@ Looking back at our current working example:

 ```mlir
 func @main() {
-  %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
-  %2 = "toy.transpose"(%0) : (tensor<2x3xf64>) -> tensor<3x2xf64>
-  %3 = "toy.mul"(%2, %2) : (tensor<3x2xf64>, tensor<3x2xf64>) -> tensor<3x2xf64>
-  "toy.print"(%3) : (tensor<3x2xf64>) -> ()
-  "toy.return"() : () -> ()
+  %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>
+  %2 = toy.transpose(%0 : tensor<2x3xf64>) to tensor<3x2xf64>
+  %3 = toy.mul %2, %2 : tensor<3x2xf64>
+  toy.print %3 : tensor<3x2xf64>
+  toy.return
 }
 ```

@@ -291,7 +291,7 @@ func @main() {
   }

   // Print the value held by the buffer.
-  "toy.print"(%0) : (memref<3x2xf64>) -> ()
+  toy.print %0 : memref<3x2xf64>
   dealloc %2 : memref<2x3xf64>
   dealloc %1 : memref<3x2xf64>
   dealloc %0 : memref<3x2xf64>
@@ -340,7 +340,7 @@ func @main() {
   }

   // Print the value held by the buffer.
-  "toy.print"(%0) : (memref<3x2xf64>) -> ()
+  toy.print %0 : memref<3x2xf64>
   dealloc %1 : memref<2x3xf64>
   dealloc %0 : memref<3x2xf64>
   return
mlir/docs/Tutorials/Toy/Ch-6.md

Lines changed: 5 additions & 5 deletions
@@ -115,11 +115,11 @@ Looking back at our current working example:

 ```mlir
 func @main() {
-  %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
-  %2 = "toy.transpose"(%0) : (tensor<2x3xf64>) -> tensor<3x2xf64>
-  %3 = "toy.mul"(%2, %2) : (tensor<3x2xf64>, tensor<3x2xf64>) -> tensor<3x2xf64>
-  "toy.print"(%3) : (tensor<3x2xf64>) -> ()
-  "toy.return"() : () -> ()
+  %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>
+  %2 = toy.transpose(%0 : tensor<2x3xf64>) to tensor<3x2xf64>
+  %3 = toy.mul %2, %2 : tensor<3x2xf64>
+  toy.print %3 : tensor<3x2xf64>
+  toy.return
 }
 ```
