.. _conversion:

FX Converters
==================
The converter library in Torch-TensorRT is located in ``TensorRT/py/torch_tensorrt/fx/converters``.
The converters are categorized into ``aten_ops_converters``, ``acc_ops_converters`` and ``nn_ops_converters``.
The individual converters present are useful for the quantization workflow.
The converters are registered using the ``tensorrt_converter`` decorator.

Steps
==================

External Interface
------------------
Depending on whether the operation is generated using acc_trace, aten_trace or fx_trace, the converter is included in
``acc_ops_converters``, ``aten_ops_converters`` or ``nn_ops_converters`` respectively. The converters are registered using the ``tensorrt_converter`` decorator. The decorated function
has the arguments ``network, target, args, kwargs, name``, which are common across all operator schemas. The registration pattern is sketched after the list below.

    * acc_ops_converters
        * acc_trace is produced by ``torch_tensorrt.fx.tracer.acc_tracer.acc_tracer.trace``.
    * aten_ops_converters
        There are two options for this at present:
        * Dynamo: aten_trace is produced by ``torch_tensorrt.dynamo.backend.compile``. The second round of trace is produced by ``aot_torch_tensorrt_aten_backend`` by invoking ``aot_module_simplified`` from ``torch._functorch.aot_autograd``.
        * FX: aten_trace is produced by ``torch_tensorrt.fx.tracer.dispatch_tracer.aten_tracer.trace``. This flow is currently the more common one, but it will soon be deprecated in torch_tensorrt.
    * nn_ops_converters
        * symbolic_trace is produced by ``torch.fx._symbolic_trace``.
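
Regardless of which of the three files a converter lives in, the registration pattern is the same. A minimal sketch is shown below; the import path and the ``torch.ops.aten.relu.default`` key are assumptions for illustration, not the exact in-tree registration:

.. code-block:: python

    import torch

    # assumed import path for the registration decorator
    from torch_tensorrt.fx.converter_registry import tensorrt_converter


    @tensorrt_converter(torch.ops.aten.relu.default)  # opcode emitted by the aten trace
    def aten_ops_relu(network, target, args, kwargs, name):
        # args/kwargs carry the traced operands; the body builds TensorRT layers
        # on `network` and returns the resulting TRT tensor(s)
        ...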

Converter implementation
------------------------
Three different kinds of implementations are illustrated below.

    * activation type
        We illustrate the implementation of the ``leaky_relu`` operator.
        * The implementation is present in ``py/torch_tensorrt/fx/impl/activation.py``

            .. code-block:: python

                def leaky_relu(
                    network: TRTNetwork,
                    target: Target,
                    source_ir: Optional[SourceIR],
                    name: str,
                    input_val: TRTTensor,
                    alpha: Optional[Any],
                ):
                    # implementation

            As illustrated above, we define the function with the args: network, name (defined in the trace), target (defined in the trace), and source_ir, which
            can take the enum values SourceIR.ATEN, SourceIR.ACC or SourceIR.NN. The input_val, alpha and beta (in some cases) are extracted from the created trace.
            In the case of ``acc_trace`` the input parameters are in kwargs, whereas in ``aten_trace`` the input parameters are in args.
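
            The ``# implementation`` placeholder above elides the TensorRT layer construction. A minimal sketch of what such a body can look like, written directly against the TensorRT Python API (the in-tree implementation routes through a shared activation helper instead), is shown below; the default slope value is an assumption:

            .. code-block:: python

                import tensorrt as trt


                def leaky_relu(network, target, source_ir, name, input_val, alpha):
                    # Add an IActivationLayer configured as LEAKY_RELU and return its output.
                    layer = network.add_activation(input_val, trt.ActivationType.LEAKY_RELU)
                    layer.alpha = alpha if alpha is not None else 0.01  # negative slope
                    layer.name = name
                    return layer.get_output(0)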

        * The converter should be registered with the appropriate opcode in the ``tensorrt_converter`` decorator.

            The acc op converter is defined in ``py/torch_tensorrt/fx/converters/acc_ops_converters``

            .. code-block:: python

                def acc_ops_leaky_relu(
                    network: TRTNetwork,
                    target: Target,
                    args: Tuple[Argument, ...],
                    kwargs: Dict[str, Argument],
                    name: str,
                ) -> Union[TRTTensor, Sequence[TRTTensor]]:
                    # acc_trace passes the operands through kwargs
                    input_val = kwargs["input"]
                    negative_slope = kwargs["negative_slope"]
                    return activation.leaky_relu(
                        network, target, SourceIR.ACC, name, input_val, negative_slope
                    )

            The aten op converter is defined in ``py/torch_tensorrt/fx/converters/aten_ops_converters``

            .. code-block:: python

                def aten_ops_leaky_relu(
                    network: TRTNetwork,
                    target: Target,
                    args: Tuple[Argument, ...],
                    kwargs: Dict[str, Argument],
                    name: str,
                ) -> Union[TRTTensor, Sequence[TRTTensor]]:
                    # aten_trace passes the operands positionally through args
                    return activation.leaky_relu(
                        network, target, SourceIR.ATEN, name, args[0], args[1]
                    )

            As pointed out above, acc_trace and aten_trace supply the input arguments in different forms: acc converters read them from kwargs, while aten converters read them positionally from args.
    * operation type
        We illustrate the implementation of the ``fmod`` operator.
        * The implementation is present in ``py/torch_tensorrt/fx/impl/elementwise/ops.py``

            .. code-block:: python

                def fmod(
                    network: TRTNetwork,
                    target: Target,
                    source_ir: Optional[SourceIR],
                    name: str,
                    input: TRTTensor,
                    other: TRTTensor,
                ) -> TRTTensor:
                    # implementation

            The implementations of these operators are segregated according to the operator type:

            * Since fmod is elementwise, it is present in ``py/torch_tensorrt/fx/impl/elementwise``.
              ``py/torch_tensorrt/fx/impl/elementwise/base`` contains the base functions for elementwise operators.
            * Operators like sqrt are present in ``py/torch_tensorrt/fx/impl/unary``.
              ``py/torch_tensorrt/fx/impl/unary/base`` contains the base functions for unary operators.
            * Operators like softmax, layer_norm and batch_norm are present in ``py/torch_tensorrt/fx/impl/normalization``.
              Since there are no base operations common to all of them, there is no base file.
            * Operators like slice, select, where and embedding are present in ``py/torch_tensorrt/fx/impl/*.py``.
              They have individual operator implementations with the same API structure as above, but with different arguments.

        * The converter should be registered with the appropriate opcode, in the same way as in the activation example (see the sketch below).
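
            A minimal sketch of what such a registration can look like for the aten trace is shown below; the ``torch.ops.aten.fmod.Tensor`` key is an example, and the decorator, ``fmod`` and ``SourceIR`` imports are omitted as in the snippets above:

            .. code-block:: python

                @tensorrt_converter(torch.ops.aten.fmod.Tensor)  # example opcode key
                def aten_ops_fmod(network, target, args, kwargs, name):
                    # aten_trace passes the operands positionally through args
                    return fmod(network, target, SourceIR.ATEN, name, args[0], args[1])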

    * lowering type
        Some converters can be decomposed into sub-operations and do not need a separate converter registration.
        Such converters can be implemented via ``lowering passes``.
        We illustrate this via the ``addmm`` operator.
        * The decompositions are registered via ``register_decomposition``. We define ``addmm_replacement`` and replace it with the torch ops, which will have their corresponding converters called (a sketch of the registration itself follows the code block).

            .. code-block:: python

                def addmm_replacement(
                    input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
                ) -> torch.Tensor:
                    return torch.add(
                        torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
                    )
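
            A minimal sketch of how such a replacement can be registered is shown below; the decomposition table name ``DECOMPOSITIONS`` and the exact op key are assumptions for illustration:

            .. code-block:: python

                import torch
                from torch._decomp import register_decomposition

                # hypothetical registry holding the decompositions applied before conversion
                DECOMPOSITIONS = {}


                @register_decomposition(torch.ops.aten.addmm, registry=DECOMPOSITIONS)
                def addmm_replacement(
                    input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
                ) -> torch.Tensor:
                    # addmm(input, mat1, mat2) == beta * input + alpha * (mat1 @ mat2)
                    return torch.add(
                        torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
                    )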

Tests
----------------

* FX testing
    * The fx tests are present in ``test/converters/aten_op``.
    * The test class is derived from ``DispatchTestCase``, with parameterized testing used to implement the different test cases.
    * The results for ``dispatch_tracer.aten_trace`` and torch are compared.
    * The ``expected_op`` is also tested. This op will be called by the model and needs to be specified so that the appropriate converter is invoked (a sketch of the test pattern follows this list).
    * The tests throw an error if any of the above conditions fail.
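
    A minimal sketch of such a test is shown below; the import path, ``run_test`` and ``expected_ops`` follow the pattern of the existing converter tests, so treat the exact names as assumptions:

    .. code-block:: python

        import torch
        import torch.nn as nn

        # assumed import path for the test harness used by the fx converter tests
        from torch_tensorrt.fx.tools.common_fx2trt import DispatchTestCase


        class TestLeakyReLUConverter(DispatchTestCase):
            def test_leaky_relu(self):
                class TestModule(nn.Module):
                    def forward(self, x):
                        return nn.functional.leaky_relu(x, negative_slope=0.05)

                inputs = [torch.randn(1, 10)]
                # run_test traces the module, converts it, and compares TRT vs. torch
                # outputs while checking that the expected op appears in the trace
                self.run_test(
                    TestModule(), inputs, expected_ops={torch.ops.aten.leaky_relu.default}
                )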

* Dynamo testing
    * The dynamo tests are present for the lowering ops. The converter tests above will soon be ported to dynamo tests.
    * The lowering op test is present in ``dynamo/backend/test/test_decompositions.py``.
    * The results for ``fx.symbolic_trace`` and ``torch_tensorrt.compile`` are compared.
    * The tests also check for the ``expected_op`` and the ``unexpected_op``. ``expected_op`` refers to the ops that the operation is lowered to, while ``unexpected_op`` is the original op (a sketch of the pattern follows this list).
    * The tests throw an error if any of the above conditions fail.
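
    A minimal sketch of the output-comparison part of such a test is shown below; the ``ir="dynamo"`` argument and the tolerances are placeholders, and the op presence/absence checks rely on the helpers in the test file, so they are only indicated by comments here:

    .. code-block:: python

        import torch
        import torch_tensorrt


        class AddmmModule(torch.nn.Module):
            def forward(self, x, mat1, mat2):
                return torch.addmm(x, mat1, mat2)

        inputs = [torch.randn(2, 2).cuda() for _ in range(3)]

        # reference: plain fx trace run eagerly
        fx_graph = torch.fx.symbolic_trace(AddmmModule())
        ref = fx_graph(*inputs)

        # compiled: the lowering pass decomposes addmm into mul/matmul/add,
        # whose converters are then invoked during conversion
        trt_module = torch_tensorrt.compile(AddmmModule(), ir="dynamo", inputs=inputs)
        out = trt_module(*inputs)

        # the real test also asserts that the expected (decomposed) ops are present
        # in the lowered graph and that the original addmm op is absent
        assert torch.allclose(ref, out, atol=1e-3, rtol=1e-3)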