.. _dynamo_conversion:

Dynamo Converters
==================
The dynamo converter library in Torch-TensorRT is located in ``TensorRT/py/torch_tensorrt/dynamo/conversion``.


Steps
==================

Operation Set
-------------------
The converters in dynamo are produced by ``aten_trace`` and fall under ``aten_ops_converters`` (FX earlier had ``acc_ops_converters``, ``aten_ops_converters`` or ``nn_ops_converters`` depending on the trace through which it was produced). The converters are registered using the ``dynamo_tensorrt_converter`` decorator. The decorated function
takes the arguments ``network, target, args, kwargs, name``, which are common across all operator schemas.
These functions are mapped in the ``aten`` converter registry dictionary (at present a combination of FX and dynamo converters; FX will be deprecated soon), keyed on the function target.

* ``aten_trace`` is produced by ``torch_tensorrt.dynamo.trace(..)`` for the export path and ``torch_tensorrt.compile(ir="dynamo")`` for the compile path.
  The export path makes use of ``aten_tracer`` whereas the alternate trace in compile is produced by the AOT Autograd library.
  Both of these simplify the torch operators to a reduced set of ATen operations.


As mentioned above, if you would like to add a new converter, its implementation will be included in ``TensorRT/py/torch_tensorrt/dynamo/conversion/impl``.
Although there is a corresponding implementation of the converters in the common implementation library ``TensorRT/py/torch_tensorrt/fx/impl`` for FX converters, this documentation focuses on the implementation of the ``aten_ops`` converters in dynamo.


Converter implementation
------------------------
In this section, we illustrate the steps for writing a converter. We divide them into activation, operator, lowering-pass, and evaluator implementations.
Each of them is detailed with the help of an example.

* Registration

  The converter needs to be registered with the appropriate opcode via the ``dynamo_tensorrt_converter`` decorator.

  * Activation type

    Example: ``leaky_relu``


    * aten_ops_converters: Dynamo converters

      Defined in ``py/torch_tensorrt/dynamo/conversion/aten_ops_converters``. One needs to register the opcode generated in the trace with the ``dynamo_tensorrt_converter`` decorator. The opcode used for the registration, i.e. the converter registry key, in this case is ``torch.ops.aten.leaky_relu.default``.

      .. code-block:: python

          @dynamo_tensorrt_converter(torch.ops.aten.leaky_relu.default)
          def aten_ops_leaky_relu(
              network: TRTNetwork,
              target: Target,
              args: Tuple[Argument, ...],
              kwargs: Dict[str, Argument],
              name: str,
          ) -> Union[TRTTensor, Sequence[TRTTensor]]:
              return activation.leaky_relu(network, target, SourceIR.ATEN, name, args[0], args[1])

      The ``tensorrt_converter`` (used for FX registration) and ``dynamo_tensorrt_converter`` are similar decorator functions with some differences.

      #. Both register the converters in registries (Python dictionaries) - ``CONVERTERS`` and ``DYNAMO_CONVERTERS`` respectively. These two dictionaries are merged to form the overall converter registry.
      #. The dictionary is keyed on the ``OpOverload``, which is described in more detail below with examples.
      #. Both return the decorated converter implementation.
      #. ``CONVERTERS`` directly registers the decorated ``converter_implementation`` function, while ``DYNAMO_CONVERTERS`` takes additional arguments and registers a ``ConverterSupport`` object.
      #. The additional arguments are:
| 64 | + |
| 65 | + .. code-block:: python |
| 66 | + def dynamo_tensorrt_converter( |
| 67 | + key: Target, |
| 68 | + enabled: bool = True, |
| 69 | + capability_validator: Optional[Callable[[Node], bool]] = None, |
| 70 | + priority: ConverterPriority = ConverterPriority.STANDARD, |
| 71 | + ) -> Callable[[Any], Union[TRTTensor, Sequence[TRTTensor]]]: |
| 72 | +
|
         #. key: Node target for which the converter is implemented (for example, ``torch.ops.aten.leaky_relu.default``)
         #. enabled: Whether the converter should be enabled/cached or not
         #. capability_validator: Function which evaluates whether a node is valid for conversion by the decorated converter. It defaults to ``None``, in which case the validator is treated as always true, meaning all nodes of the "key" kind are supported by this converter by default. See the ``embedding`` example for more details.
         #. priority: Converter's level of priority relative to other converters with the same target

      #. ``ConverterSupport`` is a pairing of ``converter_implementation`` and ``capability_validator``.

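      The registration mechanics above can be sketched in plain Python. The following is a simplified, torch-free stand-in (the real registry, ``ConverterSupport`` class and decorator live in Torch-TensorRT and carry more state; the names mirror the text but the bodies are illustrative only):

      .. code-block:: python

          from dataclasses import dataclass
          from typing import Any, Callable, Dict, Optional

          @dataclass
          class ConverterSupport:
              # Pairs the implementation with its validator, as described above.
              converter_implementation: Callable
              capability_validator: Callable[[Any], bool]

          DYNAMO_CONVERTERS: Dict[Any, ConverterSupport] = {}

          def dynamo_tensorrt_converter(
              key: Any,
              enabled: bool = True,
              capability_validator: Optional[Callable[[Any], bool]] = None,
          ) -> Callable:
              def register(converter_implementation: Callable) -> Callable:
                  if enabled:
                      # A missing validator means "always convertible".
                      validator = capability_validator or (lambda node: True)
                      DYNAMO_CONVERTERS[key] = ConverterSupport(
                          converter_implementation, validator
                      )
                  # The decorated implementation is returned unchanged.
                  return converter_implementation
              return register

          @dynamo_tensorrt_converter("aten.leaky_relu.default")
          def aten_ops_leaky_relu(network, target, args, kwargs, name):
              return "leaky_relu_layer"

          support = DYNAMO_CONVERTERS["aten.leaky_relu.default"]
          print(support.capability_validator(None))  # True: default validator accepts any node

      This also shows why the registry can be keyed and queried per node: partitioning looks up the node target, then asks the paired validator whether this particular node is convertible.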
      The function decorated by ``tensorrt_converter`` and ``dynamo_tensorrt_converter`` has the following arguments, which are automatically generated by the trace functions mentioned above.

      #. network: The TensorRT network being built, to which the converted layers are added
      #. target: Target key of the ``call_module`` or ``call_function`` node, e.g. ``torch.ops.aten.leaky_relu.default``. Note that ``torch.ops.aten.leaky_relu`` is the ``OpOverloadPacket`` while ``torch.ops.aten.leaky_relu.default`` is the ``OpOverload``.
      #. args: The arguments passed in the ``call_module`` or ``call_function`` above
      #. kwargs: The kwargs passed in the ``call_module`` or ``call_function`` above
      #. name: String containing the name of the target

      As a user writing new converters, one just needs to ensure that the appropriate arguments are extracted from the generated trace and passed to the implementation function in the implementation library, here ``activation.leaky_relu`` (which we discuss below in detail).

  * Operation type

    Example: ``fmod``

    It follows the same steps as the above converter. In this case the opcode is ``torch.ops.aten.fmod.Scalar`` or ``torch.ops.aten.fmod.Tensor``.
    Hence both opcodes are registered in ``py/torch_tensorrt/dynamo/conversion/aten_ops_converters``.
    Note that ``torch.ops.aten.fmod`` is the ``OpOverloadPacket`` while the registry is keyed on ``torch.ops.aten.fmod.Scalar`` or ``torch.ops.aten.fmod.Tensor``, each of which is an ``OpOverload``.

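    Registering one implementation under several overload keys amounts to stacking the decorator, one application per ``OpOverload``. A torch-free sketch of the pattern (the registry and string keys here are illustrative stand-ins, not the real Torch-TensorRT objects):

    .. code-block:: python

        from typing import Callable, Dict

        CONVERTER_REGISTRY: Dict[str, Callable] = {}  # illustrative stand-in registry

        def dynamo_tensorrt_converter(key: str) -> Callable:
            # Simplified decorator: one registry entry per OpOverload key.
            def register(fn: Callable) -> Callable:
                CONVERTER_REGISTRY[key] = fn
                return fn
            return register

        # Stacking registers the same implementation under both overloads.
        @dynamo_tensorrt_converter("aten.fmod.Scalar")
        @dynamo_tensorrt_converter("aten.fmod.Tensor")
        def aten_ops_fmod(network, target, args, kwargs, name):
            return "fmod_layer"

        print(sorted(CONVERTER_REGISTRY))  # ['aten.fmod.Scalar', 'aten.fmod.Tensor']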
    Example: ``embedding``

    It follows the same steps as the above converter. In this case the opcode is ``torch.ops.aten.embedding.default``.
    There are some converters which have special cases to be accounted for. In those cases, one should use a ``capability_validator`` when registering the converter with ``@dynamo_tensorrt_converter``.
    We illustrate this through ``torch.ops.aten.embedding.default``. It has the parameters ``scale_grad_by_freq`` and ``sparse``, which are not currently supported by the implementation.
    In such cases we can write a validator, ``embedding_param_validator``, which reports the converter as unsupported when those parameters are set, and register the converter by

    .. code-block:: python

        @dynamo_tensorrt_converter(
            torch.ops.aten.embedding.default, capability_validator=embedding_param_validator
        )

    So if there is a new converter for which certain special cases are not to be supported, those cases can be excluded through the ``capability_validator``.

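    Such a validator is just a function from an FX node to ``bool``. A hand-rolled, torch-free sketch of the idea (``FakeNode`` stands in for ``torch.fx.Node``, and the real ``embedding_param_validator`` may inspect the node differently):

    .. code-block:: python

        from dataclasses import dataclass, field
        from typing import Any, Dict, Tuple

        @dataclass
        class FakeNode:
            # Illustrative stand-in for torch.fx.Node carrying the call's arguments.
            args: Tuple[Any, ...] = ()
            kwargs: Dict[str, Any] = field(default_factory=dict)

        def embedding_param_validator(node: FakeNode) -> bool:
            # Conversion is only supported when both flags keep their defaults.
            scale_grad_by_freq = node.kwargs.get("scale_grad_by_freq", False)
            sparse = node.kwargs.get("sparse", False)
            return not scale_grad_by_freq and not sparse

        print(embedding_param_validator(FakeNode()))                         # True
        print(embedding_param_validator(FakeNode(kwargs={"sparse": True})))  # False

    Nodes the validator rejects are simply left to PyTorch to execute rather than being converted.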
  * Evaluator type

    Example: ``operator.getitem``

    Evaluators are categorized as such because they do not make any modification to the graph. They are implemented in ``py/torch_tensorrt/dynamo/conversion/op_evaluators.py``, with the corresponding ``capability_validator``.
    The opcode is ``operator.getitem``.

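    Because an evaluator computes its result directly rather than adding TensorRT layers, ``operator.getitem`` can be serviced by applying the Python operator to the already-converted arguments. A minimal sketch (``generic_evaluator`` is an illustrative name; the actual code in ``op_evaluators.py`` differs in detail):

    .. code-block:: python

        import operator

        def generic_evaluator(network, target, args, kwargs, name):
            # No TensorRT layer is added; the target is evaluated directly
            # on the converted arguments and the result returned as-is.
            return target(*args, **kwargs)

        # getitem on a previous node's output tuple, resolved at conversion time
        outputs = ("trt_tensor_a", "trt_tensor_b")
        result = generic_evaluator(None, operator.getitem, (outputs, 1), {}, "getitem_1")
        print(result)  # trt_tensor_b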

* Implementation Library

  The dynamo converters are located in ``py/torch_tensorrt/dynamo/conversion/impl``.

  * Activation

    Example: ``leaky_relu``

    The implementation is to be placed in ``py/torch_tensorrt/dynamo/conversion/impl/activation.py``. This is where all the activation functions are defined and implemented.
| 130 | + |
| 131 | + .. code-block:: python |
| 132 | +
|
| 133 | + def leaky_relu( |
| 134 | + network: TRTNetwork, |
| 135 | + target: Target, |
| 136 | + source_ir: Optional[SourceIR], |
| 137 | + name: str, |
| 138 | + input_val: TRTTensor, |
| 139 | + alpha: Optional[Any], |
| 140 | + ): |
| 141 | + #implementation |
| 142 | +
|
    The implementation function has the following arguments.

    #. network: ``network`` passed from the decorated function registration
    #. target: ``target`` passed from the decorated function registration
    #. source_ir: Enum attribute. The ``SourceIR`` enum is defined in ``py/torch_tensorrt/dynamo/conversion/impl/converter_utils``
    #. name: ``name`` passed from the decorated function registration
    #. input_val: Appropriate argument extracted from the decorated function registration, from args or kwargs
    #. alpha: Appropriate argument extracted from the decorated function registration, from args or kwargs. If not ``None``, it sets the alpha attribute of the created TensorRT activation layer, e.g. used in ``leaky_relu``, ``elu``, ``hardtanh``
    #. beta: Appropriate argument extracted from the decorated function registration, from args or kwargs. If not ``None``, it sets the beta attribute of the created TensorRT activation layer, e.g. used in ``hardtanh``
    #. dyn_range_fn: An optional function which takes the dynamic range of a TensorRT tensor and returns the output dynamic range

    The implementation functions call the ``convert_activation`` function in ``py/torch_tensorrt/dynamo/conversion/impl/activation.py``. This function will add the appropriate activation layer via ``network.add_activation``.

  * Operator

    The implementation is to be placed in ``py/torch_tensorrt/dynamo/conversion/impl/elementwise/ops.py`` for dynamo. This is where all the elementwise functions are defined and implemented.
    For a new operator, one should identify the category to which it belongs. Following are some examples.

    #. Elementwise operators like ``fmod`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/elementwise``. The ``py/torch_tensorrt/dynamo/conversion/impl/elementwise/base`` module contains base functions for elementwise operators.
    #. Unary operators like ``sqrt`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/unary``. The ``py/torch_tensorrt/dynamo/conversion/impl/unary/base`` module contains base functions for unary operators.
    #. Normalization operators like ``softmax``, ``layer_norm``, and ``batch_norm`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/normalization``. Since there are no base operations common to all, there is no base file, but one can choose to implement one if common functions emerge across normalization operations.
    #. Individual operators like ``slice``, ``select``, ``where``, and ``embedding`` are present in ``py/torch_tensorrt/dynamo/conversion/impl/*.py``. They have individual operator implementations with the same API structure as above but with different individual arguments.

    Please note that the above operators may share common helper functions, which should be placed in
    ``py/torch_tensorrt/dynamo/conversion/impl/converter_utils.py``.


* Lowering type

  There are some converters which can be decomposed into sub-operations and need not have a separate converter registration.
  Such converters can be implemented via lowering passes.

  Example: ``addmm``

  The decompositions are registered via ``register_decomposition`` in ``py/torch_tensorrt/dynamo/backend/lowering/_decompositions.py``.
  We define ``addmm_replacement`` and replace it with the torch ops, which will have their corresponding converters called.

  .. code-block:: python

      @register_decomposition(torch.ops.aten.addmm, registry=DECOMPOSITIONS)
      def addmm_replacement(
          input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
      ) -> torch.Tensor:
          return torch.add(
              torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
          )

  Note that there are some pre-existing dynamo decompositions in the torch directory, in which case they should be used.
  To do so, enable those decompositions in ``torch_enabled_decompositions`` in ``py/torch_tensorrt/dynamo/lowering/_decomposition_groups.py``.
  Similarly, you can choose to disable any in ``torch_disabled_decompositions``. Please note that the ones already defined in the lowering will take precedence over torch lowering ops.
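
  The decomposition above rewrites ``addmm`` as ``beta * input + alpha * (mat1 @ mat2)``. A quick torch-free sanity check of that identity on nested lists (the helpers are purely illustrative):

  .. code-block:: python

      def matmul(a, b):
          # Naive matrix multiply on nested lists.
          return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
                   for j in range(len(b[0]))] for i in range(len(a))]

      def addmm(inp, mat1, mat2, beta=1, alpha=1):
          # Reference semantics: beta * inp + alpha * (mat1 @ mat2).
          prod = matmul(mat1, mat2)
          return [[beta * inp[i][j] + alpha * prod[i][j]
                   for j in range(len(inp[0]))] for i in range(len(inp))]

      inp = [[1, 1], [1, 1]]
      mat1 = [[1, 2], [3, 4]]
      mat2 = [[5, 6], [7, 8]]

      # The decomposed form (add of mul and matmul) matches addmm entry by entry.
      prod = matmul(mat1, mat2)
      decomposed = [[2 * inp[i][j] + 3 * prod[i][j] for j in range(2)] for i in range(2)]
      print(addmm(inp, mat1, mat2, beta=2, alpha=3) == decomposed)  # True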


Tests
-----

* Dynamo testing:

  Dynamo tests are present for the lowering ops in ``tests/py/dynamo/lowering/test_decompositions.py``. The above converters will soon be ported to dynamo tests.

  #. Compare the results for ``fx.symbolic_trace`` and ``torch_tensorrt.dynamo.compile``.
  #. Test for the ``expected_op`` and the ``unexpected_op``.

     #. ``expected_op``: Operations the original operation is lowered to, e.g. ``mul`` and ``add`` for ``addmm``
     #. Note: specify ``disable_passes=True`` for cases where you do not want lowering passes (which should be the default when testing converters)
     #. ``unexpected_op``: The original operation, e.g. ``addmm`` for ``addmm``

The tests should fail if either of the above two conditions fails.
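
The ``expected_op``/``unexpected_op`` conditions can be sketched as a scan over the call targets left in the lowered graph. A simplified, torch-free stand-in for what the test harness checks (the real tests walk an ``fx.GraphModule``; the op-name strings below are illustrative):

.. code-block:: python

    from typing import Iterable, Set

    def check_lowering(graph_targets: Iterable[str],
                       expected_ops: Set[str],
                       unexpected_ops: Set[str]) -> bool:
        # Pass only if every expected op appears and no unexpected op survives.
        targets = set(graph_targets)
        return expected_ops <= targets and not (unexpected_ops & targets)

    # After lowering, addmm should be replaced by mul/add (and matmul).
    lowered = ["aten.mul", "aten.matmul", "aten.mul", "aten.add"]
    print(check_lowering(lowered, {"aten.mul", "aten.add"}, {"aten.addmm"}))        # True
    print(check_lowering(["aten.addmm"], {"aten.mul", "aten.add"}, {"aten.addmm"}))  # False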