.. _dynamo_conversion:

Dynamo Converters
==================
The dynamo converter library in Torch-TensorRT is located in ``TensorRT/py/torch_tensorrt/dynamo/conversion``.


Steps
==================

Operation Set
-------------------
The converters in dynamo are produced by ``aten_trace`` and fall under ``aten_ops_converters`` (FX earlier had ``acc_ops_converters``, ``aten_ops_converters`` or ``nn_ops_converters`` depending on the trace through which it was produced). The converters are registered using the ``dynamo_tensorrt_converter`` decorator. The decorated function
takes the arguments ``network, target, args, kwargs, name``, which are common across all operator schemas.
These functions are mapped in the ``aten`` converter registry dictionary (at present a combination of FX and dynamo converters; FX will be deprecated soon), keyed on the function target.

* ``aten_trace`` is produced by ``torch_tensorrt.dynamo.trace(..)`` for the export path and ``torch_tensorrt.compile(ir="dynamo")`` for the compile path.
  The export path makes use of ``aten_tracer`` whereas the alternate trace in compile is produced by the AOT Autograd library.
  Both of these simplify the torch operators to a reduced set of ATen operations.


As mentioned above, if you would like to add a new converter, its implementation will be included in ``TensorRT/py/torch_tensorrt/dynamo/conversion/impl``.
Although there is a corresponding implementation of the converters in the common implementation library ``TensorRT/py/torch_tensorrt/fx/impl`` for FX converters, this documentation focuses on the implementation of the ``aten_ops`` converters in dynamo.


Converter implementation
------------------------
In this section, we illustrate the steps for writing a converter. We divide them into activation, operator, lowering-pass, and evaluator implementations.
Each of them is detailed with the help of an example.

* Registration

  The converter needs to be registered with the appropriate opcode via the ``dynamo_tensorrt_converter`` decorator.

  * Activation type

    Example: ``leaky_relu``


    * aten_ops_converters: Dynamo converters

      Defined in ``py/torch_tensorrt/dynamo/conversion/aten_ops_converters``. One needs to register the opcode generated in the trace with the ``dynamo_tensorrt_converter`` decorator. The opcode used for the registration, i.e. the converter registry key, in this case is ``torch.ops.aten.leaky_relu.default``.

      .. code-block:: python

          @dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
          def aten_ops_leaky_relu(
              network: TRTNetwork,
              target: Target,
              args: Tuple[Argument, ...],
              kwargs: Dict[str, Argument],
              name: str,
          ) -> Union[TRTTensor, Sequence[TRTTensor]]:
              return activation.leaky_relu(network, target, SourceIR.ATEN, name, args[0], args[1])

      The ``tensorrt_converter`` (used for FX registration) and ``dynamo_tensorrt_converter`` are similar decorator functions with some differences.

      #. Both register the converters in registries (Python dictionaries) - ``CONVERTERS`` and ``DYNAMO_CONVERTERS`` respectively. These two dictionaries are merged to form the overall converter registry.
      #. The dictionary is keyed on the ``OpOverload``, which is described in more detail below with examples.
      #. Both return the decorated converter implementation.
      #. ``CONVERTERS`` directly registers the decorated ``converter_implementation`` function, while ``DYNAMO_CONVERTERS`` takes additional arguments and registers a ``ConverterSupport`` object.
      #. The additional arguments are:
| 64 | + |
| 65 | + .. code-block:: python |
| 66 | + def dynamo_tensorrt_converter( |
| 67 | + key: Target, |
| 68 | + enabled: bool = True, |
| 69 | + capability_validator: Optional[Callable[[Node], bool]] = None, |
| 70 | + priority: ConverterPriority = ConverterPriority.STANDARD, |
| 71 | + ) -> Callable[[Any], Union[TRTTensor, Sequence[TRTTensor]]]: |
| 72 | +
|
         #. key: Node target for which the converter is implemented (for example, ``torch.ops.aten.leaky_relu.default``)
         #. enabled: Whether the converter should be enabled/cached or not
         #. capability_validator: Function which evaluates whether a node is valid for conversion by the decorated converter. It defaults to ``None``, in which case the validator is treated as always true, meaning all nodes of the "key" kind are supported by this converter by default. See the ``embedding`` example for more details.
         #. priority: Converter's level of priority relative to other converters with the same target

      #. ``ConverterSupport`` is a pairing of ``converter_implementation`` and ``capability_validator``.

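      The registration mechanics above can be sketched in plain Python. The following is a simplified, torch-free stand-in (the real registry, ``ConverterSupport`` class and decorator live in Torch-TensorRT and carry more state; the names mirror the text but the bodies are illustrative only):

      .. code-block:: python

          from dataclasses import dataclass
          from typing import Any, Callable, Dict, Optional

          @dataclass
          class ConverterSupport:
              # Pairs the implementation with its validator, as described above.
              converter_implementation: Callable
              capability_validator: Callable[[Any], bool]

          DYNAMO_CONVERTERS: Dict[Any, ConverterSupport] = {}

          def dynamo_tensorrt_converter(
              key: Any,
              enabled: bool = True,
              capability_validator: Optional[Callable[[Any], bool]] = None,
          ) -> Callable:
              def register(converter_implementation: Callable) -> Callable:
                  if enabled:
                      # A missing validator means "always convertible".
                      validator = capability_validator or (lambda node: True)
                      DYNAMO_CONVERTERS[key] = ConverterSupport(
                          converter_implementation, validator
                      )
                  # The decorated implementation is returned unchanged.
                  return converter_implementation
              return register

          @dynamo_tensorrt_converter("aten.leaky_relu.default")
          def aten_ops_leaky_relu(network, target, args, kwargs, name):
              return "leaky_relu_layer"

          support = DYNAMO_CONVERTERS["aten.leaky_relu.default"]
          print(support.capability_validator(None))  # True: default validator accepts any node

      This also shows why the registry can be keyed and queried per node: partitioning looks up the node target, then asks the paired validator whether this particular node is convertible.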
      The function decorated by ``tensorrt_converter`` and ``dynamo_tensorrt_converter`` has the following arguments, which are automatically generated by the trace functions mentioned above.

      #. network: The TensorRT network being built, to which the converted layers are added
      #. target: Target key of the ``call_module`` or ``call_function`` node, e.g. ``torch.ops.aten.leaky_relu.default``. Note that ``torch.ops.aten.leaky_relu`` is the ``OpOverloadPacket`` while ``torch.ops.aten.leaky_relu.default`` is the ``OpOverload``.
      #. args: The arguments passed in the ``call_module`` or ``call_function`` above
      #. kwargs: The kwargs passed in the ``call_module`` or ``call_function`` above
      #. name: String containing the name of the target

      As a user writing new converters, one just needs to ensure that the appropriate arguments are extracted from the generated trace and passed to the implementation function in the implementation library, here ``activation.leaky_relu`` (which we discuss below in detail).

  * Operation type

    Example: ``fmod``

    It follows the same steps as the above converter. In this case the opcode is ``torch.ops.aten.fmod.Scalar`` or ``torch.ops.aten.fmod.Tensor``.
    Hence both opcodes are registered in ``py/torch_tensorrt/dynamo/conversion/aten_ops_converters``.
    Note that ``torch.ops.aten.fmod`` is the ``OpOverloadPacket`` while the registry is keyed on ``torch.ops.aten.fmod.Scalar`` or ``torch.ops.aten.fmod.Tensor``, each of which is an ``OpOverload``.

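    Registering one implementation under several overload keys amounts to stacking the decorator, one application per ``OpOverload``. A torch-free sketch of the pattern (the registry and string keys here are illustrative stand-ins, not the real Torch-TensorRT objects):

    .. code-block:: python

        from typing import Callable, Dict

        CONVERTER_REGISTRY: Dict[str, Callable] = {}  # illustrative stand-in registry

        def dynamo_tensorrt_converter(key: str) -> Callable:
            # Simplified decorator: one registry entry per OpOverload key.
            def register(fn: Callable) -> Callable:
                CONVERTER_REGISTRY[key] = fn
                return fn
            return register

        # Stacking registers the same implementation under both overloads.
        @dynamo_tensorrt_converter("aten.fmod.Scalar")
        @dynamo_tensorrt_converter("aten.fmod.Tensor")
        def aten_ops_fmod(network, target, args, kwargs, name):
            return "fmod_layer"

        print(sorted(CONVERTER_REGISTRY))  # ['aten.fmod.Scalar', 'aten.fmod.Tensor']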
    Example: ``embedding``

    It follows the same steps as the above converter. In this case the opcode is ``torch.ops.aten.embedding.default``.
    There are some converters which have special cases to be accounted for. In those cases, one should use a ``capability_validator`` when registering the converter with ``@dynamo_tensorrt_converter``.
    We illustrate this through ``torch.ops.aten.embedding.default``. It has the parameters ``scale_grad_by_freq`` and ``sparse``, which are not currently supported by the implementation.
    In such cases we can write a validator, ``embedding_param_validator``, which reports the converter as unsupported when those parameters are set, and register the converter by

    .. code-block:: python

        @dynamo_tensorrt_converter(
            torch.ops.aten.embedding.default, capability_validator=embedding_param_validator
        )

    So if there is a new converter for which certain special cases are not to be supported, those cases can be excluded through the ``capability_validator``.

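    Such a validator is just a function from an FX node to ``bool``. A hand-rolled, torch-free sketch of the idea (``FakeNode`` stands in for ``torch.fx.Node``, and the real ``embedding_param_validator`` may inspect the node differently):

    .. code-block:: python

        from dataclasses import dataclass, field
        from typing import Any, Dict, Tuple

        @dataclass
        class FakeNode:
            # Illustrative stand-in for torch.fx.Node carrying the call's arguments.
            args: Tuple[Any, ...] = ()
            kwargs: Dict[str, Any] = field(default_factory=dict)

        def embedding_param_validator(node: FakeNode) -> bool:
            # Conversion is only supported when both flags keep their defaults.
            scale_grad_by_freq = node.kwargs.get("scale_grad_by_freq", False)
            sparse = node.kwargs.get("sparse", False)
            return not scale_grad_by_freq and not sparse

        print(embedding_param_validator(FakeNode()))                         # True
        print(embedding_param_validator(FakeNode(kwargs={"sparse": True})))  # False

    Nodes the validator rejects are simply left to PyTorch to execute rather than being converted.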
  * Evaluator type

    Example: ``operator.getitem``

    Evaluators are categorized as such because they do not make any modification to the graph. They are implemented in ``py/torch_tensorrt/dynamo/conversion/op_evaluators.py``, with the corresponding ``capability_validator``.
    The opcode is ``operator.getitem``.

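    Because an evaluator computes its result directly rather than adding TensorRT layers, ``operator.getitem`` can be serviced by applying the Python operator to the already-converted arguments. A minimal sketch (``generic_evaluator`` is an illustrative name; the actual code in ``op_evaluators.py`` differs in detail):

    .. code-block:: python

        import operator

        def generic_evaluator(network, target, args, kwargs, name):
            # No TensorRT layer is added; the target is evaluated directly
            # on the converted arguments and the result returned as-is.
            return target(*args, **kwargs)

        # getitem on a previous node's output tuple, resolved at conversion time
        outputs = ("trt_tensor_a", "trt_tensor_b")
        result = generic_evaluator(None, operator.getitem, (outputs, 1), {}, "getitem_1")
        print(result)  # trt_tensor_b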

* Implementation Library

  The dynamo converters are located in ``py/torch_tensorrt/dynamo/conversion/impl``.

  * Activation

    Example: ``leaky_relu``

    The implementation is to be placed in ``py/torch_tensorrt/dynamo/conversion/impl/activation.py``. This is where all the activation functions are defined and implemented.
| 130 | + |
| 131 | + .. code-block:: python |
| 132 | +
|
| 133 | + def leaky_relu( |
| 134 | + network: TRTNetwork, |
| 135 | + target: Target, |
| 136 | + source_ir: Optional[SourceIR], |
| 137 | + name: str, |
| 138 | + input_val: TRTTensor, |
| 139 | + alpha: Optional[Any], |
| 140 | + ): |
| 141 | + #implementation |
| 142 | +
|
    The implementation function has the following arguments.

    #. network: ``network`` passed from the decorated function registration
    #. target: ``target`` passed from the decorated function registration
    #. source_ir: Enum attribute. The ``SourceIR`` enum is defined in ``py/torch_tensorrt/dynamo/conversion/impl/converter_utils``
    #. name: ``name`` passed from the decorated function registration
    #. input_val: Appropriate argument extracted from the decorated function registration, from args or kwargs
    #. alpha: Appropriate argument extracted from the decorated function registration, from args or kwargs. If not ``None``, it sets the alpha attribute of the created TensorRT activation layer, e.g. used in ``leaky_relu``, ``elu``, ``hardtanh``
    #. beta: Appropriate argument extracted from the decorated function registration, from args or kwargs. If not ``None``, it sets the beta attribute of the created TensorRT activation layer, e.g. used in ``hardtanh``
    #. dyn_range_fn: An optional function which takes the dynamic range of a TensorRT tensor and returns the output dynamic range

    The implementation functions call the ``convert_activation`` function in ``py/torch_tensorrt/dynamo/conversion/impl/activation.py``. This function will add the appropriate activation layer via ``network.add_activation``.

  * Operator

    The implementation is to be placed in ``py/torch_tensorrt/dynamo/conversion/impl/elementwise/ops.py`` for dynamo. This is where all the elementwise functions are defined and implemented.
    For a new operator, one should identify the category to which it belongs. Following are some examples.

    #. Elementwise operators like ``fmod`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/elementwise``. The ``py/torch_tensorrt/dynamo/conversion/impl/elementwise/base`` module contains base functions for elementwise operators.
    #. Unary operators like ``sqrt`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/unary``. The ``py/torch_tensorrt/dynamo/conversion/impl/unary/base`` module contains base functions for unary operators.
    #. Normalization operators like ``softmax``, ``layer_norm``, and ``batch_norm`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/normalization``. Since there are no base operations common to all, there is no base file, but one can choose to implement one if common functions emerge across normalization operations.
    #. Individual operators like ``slice``, ``select``, ``where``, and ``embedding`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/*.py``. They have individual operator implementations with the same API structure as above but with different individual arguments.

    Please note that the above operators may share common helper functions, which should be placed in
    ``py/torch_tensorrt/dynamo/conversion/impl/converter_utils.py``.


* Lowering type

  There are some converters which can be decomposed into sub-operations and need not have a separate converter registration.
  Such converters can be implemented via lowering passes.

  Example: ``addmm``

  The decompositions are registered via ``register_decomposition`` in ``py/torch_tensorrt/dynamo/backend/lowering/_decompositions.py``.
  We define ``addmm_replacement`` and replace it with the torch ops, which will have their corresponding converters called.

  .. code-block:: python

      @register_decomposition(torch.ops.aten.addmm, registry=DECOMPOSITIONS)
      def addmm_replacement(
          input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
      ) -> torch.Tensor:
          return torch.add(
              torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
          )

  Note that there are some pre-existing dynamo decompositions in the torch directory, in which case they should be used.
  To do so, enable those decompositions in ``torch_enabled_decompositions`` in ``py/torch_tensorrt/dynamo/lowering/_decomposition_groups.py``.
  Similarly, you can choose to disable any in ``torch_disabled_decompositions``. Please note that the ones already defined in the lowering will take precedence over torch lowering ops.
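
  The decomposition above rewrites ``addmm`` as ``beta * input + alpha * (mat1 @ mat2)``. A quick torch-free sanity check of that identity on nested lists (the helpers are purely illustrative):

  .. code-block:: python

      def matmul(a, b):
          # Naive matrix multiply on nested lists.
          return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
                   for j in range(len(b[0]))] for i in range(len(a))]

      def addmm(inp, mat1, mat2, beta=1, alpha=1):
          # Reference semantics: beta * inp + alpha * (mat1 @ mat2).
          prod = matmul(mat1, mat2)
          return [[beta * inp[i][j] + alpha * prod[i][j]
                   for j in range(len(inp[0]))] for i in range(len(inp))]

      inp = [[1, 1], [1, 1]]
      mat1 = [[1, 2], [3, 4]]
      mat2 = [[5, 6], [7, 8]]

      # The decomposed form (add of mul and matmul) matches addmm entry by entry.
      prod = matmul(mat1, mat2)
      decomposed = [[2 * inp[i][j] + 3 * prod[i][j] for j in range(2)] for i in range(2)]
      print(addmm(inp, mat1, mat2, beta=2, alpha=3) == decomposed)  # True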


Tests
-----

* Dynamo testing:

  Dynamo tests are present for the lowering ops in ``tests/py/dynamo/lowering/test_decompositions.py``. The above converters will soon be ported to dynamo tests.

  #. Compare the results for ``fx.symbolic_trace`` and ``torch_tensorrt.dynamo.compile``.
  #. Test for the ``expected_op`` and the ``unexpected_op``.

     #. ``expected_op``: Operations the original operation is lowered to, e.g. ``mul`` and ``add`` for ``addmm``
     #. Note: specify ``disable_passes=True`` for cases where you do not want lowering passes (which should be the default when testing converters)
     #. ``unexpected_op``: The original operation, e.g. ``addmm`` for ``addmm``

The tests should fail if either of the above two conditions fails.
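
The ``expected_op``/``unexpected_op`` conditions can be sketched as a scan over the call targets left in the lowered graph. A simplified, torch-free stand-in for what the test harness checks (the real tests walk an ``fx.GraphModule``; the op-name strings below are illustrative):

.. code-block:: python

    from typing import Iterable, Set

    def check_lowering(graph_targets: Iterable[str],
                       expected_ops: Set[str],
                       unexpected_ops: Set[str]) -> bool:
        # Pass only if every expected op appears and no unexpected op survives.
        targets = set(graph_targets)
        return expected_ops <= targets and not (unexpected_ops & targets)

    # After lowering, addmm should be replaced by mul/add (and matmul).
    lowered = ["aten.mul", "aten.matmul", "aten.mul", "aten.add"]
    print(check_lowering(lowered, {"aten.mul", "aten.add"}, {"aten.addmm"}))        # True
    print(check_lowering(["aten.addmm"], {"aten.mul", "aten.add"}, {"aten.addmm"}))  # False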