@@ -517,12 +517,7 @@ def ConstantOp : Toy_Op<"constant", [NoSideEffect]> {
517
517
}
518
518
```
519
519
520
- Above we introduce several of the concepts for defining operations in the ODS
521
- framework, but there are many more that we haven't had a chance to: regions,
522
- variadic operands, etc. Check out the
523
- [full specification](../../OpDefinitions.md) for more details.
524
-
525
- ## Complete Toy Example
520
+ #### Specifying a Custom Assembly Format
526
521
527
522
At this point we can generate our "Toy IR". A simplified version of the previous
528
523
example:
@@ -565,6 +560,185 @@ module {
565
560
} loc("test/codegen.toy":0:0)
566
561
```
567
562
563
+ One thing to notice here is that all of our Toy operations are printed using the
564
+ generic assembly format. This format is the one shown when breaking down
565
+ `toy.transpose` at the beginning of this chapter. MLIR allows for operations to
566
+ define their own custom assembly format, either
567
+ [declaratively](../../OpDefinitions.md#declarative-assembly-format) or
568
+ imperatively via C++. Defining a custom assembly format allows for tailoring the
569
+ generated IR into something a bit more readable by removing a lot of the fluff
570
+ that is required by the generic format. Let's walk through an example of an
571
+ operation format that we would like to simplify.
572
+
573
+ ##### `toy.print`
574
+
575
+ The current form of `toy.print` is a little verbose. There are a lot of
576
+ additional characters that we would like to strip away. Let's begin by thinking
577
+ of what a good format of `toy.print` would be, and see how we can implement it.
578
+ Looking at the basics of `toy.print` we get:
579
+
580
+ ```mlir
581
+ toy.print %5 : tensor<*xf64> loc(...)
582
+ ```
583
+
584
+ Here we have stripped much of the format down to the bare essentials, and it has
585
+ become much more readable. To provide a custom assembly format, an operation can
586
+ either override the `parser` and `printer` fields for a C++ format, or the
587
+ `assemblyFormat` field for the declarative format. Let's look at the C++ variant
588
+ first, as this is what the declarative format maps to internally.
589
+
590
+ ```tablegen
591
+ /// Consider a stripped definition of `toy.print` here.
592
+ def PrintOp : Toy_Op<"print"> {
593
+ let arguments = (ins F64Tensor:$input);
594
+
595
+ // Divert the printer and parser to static functions in our .cpp
596
+ // file that correspond to 'print' and 'parsePrintOp'. 'printer' and 'parser'
597
+ // here correspond to instances of 'OpAsmPrinter' and 'OpAsmParser'. More
598
+ // details on these classes are shown below.
599
+ let printer = [{ return ::print(printer, *this); }];
600
+ let parser = [{ return ::parse$cppClass(parser, result); }];
601
+ }
602
+ ```
603
+
604
+ A C++ implementation for the printer and parser is shown below:
605
+
606
+ ```c++
607
+ /// The 'OpAsmPrinter' class is a stream that allows for formatting
608
+ /// strings, attributes, operands, types, etc.
609
+ static void print(mlir::OpAsmPrinter &printer, PrintOp op) {
610
+ printer << "toy.print " << op.input();
611
+ printer.printOptionalAttrDict(op.getAttrs());
612
+ printer << " : " << op.input().getType();
613
+ }
614
+
615
+ /// The 'OpAsmParser' class provides a collection of methods for parsing
616
+ /// various punctuation, as well as attributes, operands, types, etc. Each of
617
+ /// these methods returns a `ParseResult`. This class is a wrapper around
618
+ /// `LogicalResult` that can be converted to a boolean `true` value on failure,
619
+ /// or `false` on success. This allows for easily chaining together a set of
620
+ /// parser rules. These rules are used to populate an `mlir::OperationState`
621
+ /// similarly to the `build` methods described above.
622
+ static mlir::ParseResult parsePrintOp(mlir::OpAsmParser &parser,
623
+ mlir::OperationState &result) {
624
+ // Parse the input operand, the attribute dictionary, and the type of the
625
+ // input.
626
+ mlir::OpAsmParser::OperandType inputOperand;
627
+ mlir::Type inputType;
628
+ if (parser.parseOperand(inputOperand) ||
629
+ parser.parseOptionalAttrDict(result.attributes) || parser.parseColon() ||
630
+ parser.parseType(inputType))
631
+ return mlir::failure();
632
+
633
+ // Resolve the input operand to the type we parsed in.
634
+ if (parser.resolveOperand(inputOperand, inputType, result.operands))
635
+ return mlir::failure();
636
+
637
+ return mlir::success();
638
+ }
639
+ ```
640
+
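The `||` chaining used in `parsePrintOp` works because a failed parse converts to `true`, so the first failing rule short-circuits the rest. The sketch below illustrates that convention without depending on MLIR. `ParseResult`, `MiniParser`, `parsePrintLike`, and `checkParsePrintLike` are hypothetical stand-ins written for this illustration, not the real `mlir::ParseResult` or `mlir::OpAsmParser`, and the mini parser handles only an `%operand : type` string.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for mlir::ParseResult: converts to `true` on failure.
class ParseResult {
public:
  explicit ParseResult(bool failed = false) : isFailure(failed) {}
  explicit operator bool() const { return isFailure; }

private:
  bool isFailure;
};

inline ParseResult success() { return ParseResult(false); }
inline ParseResult failure() { return ParseResult(true); }

// Hypothetical mini parser over a string; not the real OpAsmParser.
class MiniParser {
public:
  explicit MiniParser(std::string input) : text(std::move(input)) {}

  // Parses an SSA-like name such as `%5`.
  ParseResult parseOperand(std::string &name) {
    skipSpace();
    if (pos >= text.size() || text[pos] != '%')
      return failure();
    std::size_t start = pos++;
    while (pos < text.size() &&
           (std::isalnum(static_cast<unsigned char>(text[pos])) ||
            text[pos] == '_'))
      ++pos;
    if (pos == start + 1)
      return failure();
    name = text.substr(start, pos - start);
    return success();
  }

  // Parses a single `:` character.
  ParseResult parseColon() {
    skipSpace();
    if (pos < text.size() && text[pos] == ':') {
      ++pos;
      return success();
    }
    return failure();
  }

  // Parses a run of non-space characters as an opaque type string.
  ParseResult parseType(std::string &type) {
    skipSpace();
    std::size_t start = pos;
    while (pos < text.size() &&
           !std::isspace(static_cast<unsigned char>(text[pos])))
      ++pos;
    if (pos == start)
      return failure();
    type = text.substr(start, pos - start);
    return success();
  }

private:
  void skipSpace() {
    while (pos < text.size() &&
           std::isspace(static_cast<unsigned char>(text[pos])))
      ++pos;
  }

  std::string text;
  std::size_t pos = 0;
};

// Chains the rules with `||`; the first failing rule stops the chain.
ParseResult parsePrintLike(const std::string &input, std::string &operand,
                           std::string &type) {
  MiniParser parser(input);
  if (parser.parseOperand(operand) || parser.parseColon() ||
      parser.parseType(type))
    return failure();
  return success();
}

// Usage check: a well-formed input succeeds, a missing colon fails.
inline void checkParsePrintLike() {
  std::string operand, type;
  assert(!parsePrintLike("%5 : tensor<*xf64>", operand, type));
  assert(operand == "%5" && type == "tensor<*xf64>");
  assert(static_cast<bool>(parsePrintLike("%5 tensor<*xf64>", operand, type)));
}
```

Making failure the `true` value is what lets a sequence of rules be written as one `if` condition: no rule after the first failure runs, and the body of the `if` is the single error path.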
641
+ With the C++ implementation defined, let's see how this can be mapped to the
642
+ [declarative format](../../OpDefinitions.md#declarative-assembly-format). The
643
+ declarative format is largely composed of three different components:
644
+
645
+ * Directives
646
+ - A type of builtin function, with an optional set of arguments.
647
+ * Literals
648
+ - A keyword or punctuation surrounded by \`\`.
649
+ * Variables
650
+ - An entity that has been registered on the operation itself, i.e. an
651
+ argument (attribute or operand), result, successor, etc. In the `PrintOp`
652
+ example above, a variable would be `$input`.
653
+
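To make the three component kinds concrete, here is a rough sketch that classifies the elements of a format string by their leading character. It is an illustration only and not how MLIR parses assembly formats: `ElementKind`, `FormatElement`, and `classifyFormat` are hypothetical names, and the sketch assumes elements are separated by whitespace and contain no internal spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical classification of declarative-format elements.
enum class ElementKind { Directive, Literal, Variable };

struct FormatElement {
  ElementKind kind;
  std::string text;
};

// Splits on whitespace and classifies each element:
//   `$name`    -> Variable
//   `` `:` ``  -> Literal (surrounded by backticks)
//   otherwise  -> Directive (e.g. attr-dict, type(...))
std::vector<FormatElement> classifyFormat(const std::string &format) {
  std::vector<FormatElement> elements;
  std::istringstream stream(format);
  std::string token;
  while (stream >> token) {
    ElementKind kind = ElementKind::Directive;
    if (token.front() == '$')
      kind = ElementKind::Variable;
    else if (token.front() == '`')
      kind = ElementKind::Literal;
    elements.push_back({kind, token});
  }
  return elements;
}
```

Under these assumptions, a format such as ``$input attr-dict `:` type($input)`` splits into a variable, a directive, a literal, and a directive that takes a variable as its argument.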
654
+ A direct mapping of our C++ format looks something like:
655
+
656
+ ```tablegen
657
+ /// Consider a stripped definition of `toy.print` here.
658
+ def PrintOp : Toy_Op<"print"> {
659
+ let arguments = (ins F64Tensor:$input);
660
+
661
+ // In the following format we have two directives, `attr-dict` and `type`.
662
+ // These correspond to the attribute dictionary and the type of a given
663
+ // variable respectively.
664
+ let assemblyFormat = "$input attr-dict `:` type($input)";
665
+ }
666
+ ```
667
+
668
+ The [declarative format](../../OpDefinitions.md#declarative-assembly-format) has
669
+ many more interesting features, so be sure to check it out before implementing a
670
+ custom format in C++. After beautifying the format of a few of our operations we
671
+ now get much more readable IR:
672
+
673
+ ```mlir
674
+ module {
675
+ func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
676
+ %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:10)
677
+ %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:25)
678
+ %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/codegen.toy":5:25)
679
+ toy.return %2 : tensor<*xf64> loc("test/codegen.toy":5:3)
680
+ } loc("test/codegen.toy":4:1)
681
+ func @main() {
682
+ %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/codegen.toy":9:17)
683
+ %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/codegen.toy":9:3)
684
+ %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/codegen.toy":10:17)
685
+ %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/codegen.toy":10:3)
686
+ %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":11:11)
687
+ %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":12:11)
688
+ toy.print %5 : tensor<*xf64> loc("test/codegen.toy":13:3)
689
+ toy.return loc("test/codegen.toy":8:1)
690
+ } loc("test/codegen.toy":8:1)
691
+ } loc("test/codegen.toy":0:0)
692
+ ```
693
+
694
+ Above we introduce several of the concepts for defining operations in the ODS
695
+ framework, but there are many more that we haven't had a chance to: regions,
696
+ variadic operands, etc. Check out the
697
+ [full specification](../../OpDefinitions.md) for more details.
698
+
699
+ ## Complete Toy Example
700
+
701
+ At this point we can generate our "Toy IR". A simplified version of the previous
702
+ example:
703
+
704
+ ```toy
705
+ # User defined generic function that operates on unknown shaped arguments.
706
+ def multiply_transpose(a, b) {
707
+ return transpose(a) * transpose(b);
708
+ }
709
+
710
+ def main() {
711
+ var a<2, 3> = [[1, 2, 3], [4, 5, 6]];
712
+ var b<2, 3> = [1, 2, 3, 4, 5, 6];
713
+ var c = multiply_transpose(a, b);
714
+ var d = multiply_transpose(b, a);
715
+ print(d);
716
+ }
717
+ ```
718
+
719
+ Results in the following IR:
720
+
721
+ ```mlir
722
+ module {
723
+ func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
724
+ %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:10)
725
+ %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/codegen.toy":5:25)
726
+ %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/codegen.toy":5:25)
727
+ toy.return %2 : tensor<*xf64> loc("test/codegen.toy":5:3)
728
+ } loc("test/codegen.toy":4:1)
729
+ func @main() {
730
+ %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/codegen.toy":9:17)
731
+ %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/codegen.toy":9:3)
732
+ %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/codegen.toy":10:17)
733
+ %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/codegen.toy":10:3)
734
+ %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":11:11)
735
+ %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/codegen.toy":12:11)
736
+ toy.print %5 : tensor<*xf64> loc("test/codegen.toy":13:3)
737
+ toy.return loc("test/codegen.toy":8:1)
738
+ } loc("test/codegen.toy":8:1)
739
+ } loc("test/codegen.toy":0:0)
740
+ ```
741
+
568
742
You can build `toyc-ch2` and try it yourself: `toyc-ch2
569
743
test/Examples/Toy/Ch2/codegen.toy -emit=mlir -mlir-print-debuginfo`. We can also
570
744
check our RoundTrip: `toyc-ch2 test/Examples/Toy/Ch2/codegen.toy -emit=mlir