
Commit af6810d

FX converter doc
1 parent 98376b3 commit af6810d

File tree

1 file changed: +133 -0 lines changed


docsrc/contributors/fx_converters.rst

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
.. _conversion:

FX Converters
==================
The converter library in Torch-TensorRT is located in ``TensorRT/py/torch_tensorrt/fx/converters``.
They are categorized into ``aten_ops_converters``, ``acc_ops_converters`` and ``nn_ops_converters``.
The individual converters present are useful for the quantization workflow.
The converters are registered using the ``tensorrt_converter`` decorator.

Steps
==================

External Interface
------------------
Depending on whether the operation is generated using acc_trace, aten_trace or fx_trace, the converters are included in one of
``aten_ops_converters``, ``acc_ops_converters`` or ``nn_ops_converters``. The converters are registered using the ``tensorrt_converter`` decorator. The decorated function
has the arguments ``network, target, args, kwargs, name``, which are common across all the operator schemas.
A sketch of how a converter is registered with the decorator is shown after the list below.

* acc_ops_converters

  * acc_trace is produced by ``torch_tensorrt.fx.tracer.acc_tracer.acc_tracer.trace``.

* aten_ops

  There are two options at present for this:

  * Dynamo: aten_trace is produced by ``torch_tensorrt.dynamo.backend.compile``. The second round of trace is produced by ``aot_torch_tensorrt_aten_backend`` by invoking ``aot_module_simplified`` from ``torch._functorch.aot_autograd``.
  * FX: aten_trace is produced by ``torch_tensorrt.fx.tracer.dispatch_tracer.aten_tracer.trace``. This flow is more common currently, but it will soon be deprecated in torch_tensorrt.

* nn_ops

  * symbolic_trace is produced by ``torch.fx._symbolic_trace``.
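
A minimal sketch of such a registration, assuming the decorator lives in ``torch_tensorrt.fx.converter_registry`` and using an illustrative opcode (both are assumptions, not taken from this commit):

.. code-block:: python

    import torch

    # import path is an assumption for illustration
    from torch_tensorrt.fx.converter_registry import tensorrt_converter


    @tensorrt_converter(torch.ops.aten.relu.default)
    def aten_ops_relu(network, target, args, kwargs, name):
        # The converter receives the traced inputs through args/kwargs, adds the
        # corresponding TensorRT layers to ``network`` and returns the TRT tensor(s).
        ...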

Converter implementation
------------------------
Three different kinds of implementations are illustrated below:

* activation type

  We illustrate the implementation of the ``leaky_relu`` operator.

  * The implementation is present in ``py/torch_tensorrt/fx/impl/activation.py``

    .. code-block:: python

        def leaky_relu(
            network: TRTNetwork,
            target: Target,
            source_ir: Optional[SourceIR],
            name: str,
            input_val: TRTTensor,
            alpha: Optional[Any],
        ):
            # implementation

  As illustrated above, we define the function with the args ``network``, ``name`` (defined in the trace), ``target`` (defined in the trace) and ``source_ir``, which
  can take the enum values ``SourceIR.ATEN``, ``SourceIR.ACC`` or ``SourceIR.NN``. The ``input_val``, ``alpha`` and ``beta`` (in some cases) are extracted from the trace created.
  In the case of ``acc_trace`` the input parameters are in kwargs, whereas in ``aten_trace`` they are in args.

  * The converter should be registered with the appropriate opcode in the ``tensorrt_converter`` decorator.

    Acc ops are defined in ``py/torch_tensorrt/fx/converters/acc_ops_converters``:

    .. code-block:: python

        def acc_ops_leaky_relu(
            network: TRTNetwork,
            target: Target,
            args: Tuple[Argument, ...],
            kwargs: Dict[str, Argument],
            name: str,
        ) -> Union[TRTTensor, Sequence[TRTTensor]]:
            input_val = kwargs["input"]
            negative_slope = kwargs["negative_slope"]
            operation_type = trt.ActivationType.LEAKY_RELU

            return activation.leaky_relu(
                network, target, SourceIR.ACC, name, kwargs["input"], kwargs["negative_slope"]
            )

    Aten ops are defined in ``py/torch_tensorrt/fx/converters/aten_ops_converters``:

    .. code-block:: python

        def aten_ops_leaky_relu(
            network: TRTNetwork,
            target: Target,
            args: Tuple[Argument, ...],
            kwargs: Dict[str, Argument],
            name: str,
        ) -> Union[TRTTensor, Sequence[TRTTensor]]:
            return activation.leaky_relu(network, target, SourceIR.ATEN, name, args[0], args[1])

  As pointed out above, acc_trace and aten_trace supply the input arguments in different forms: acc has them in kwargs, while aten has them in args.

* operation type

  We illustrate the implementation of the ``fmod`` operator.

  * The implementation is present in ``py/torch_tensorrt/fx/impl/elementwise/ops.py``

    .. code-block:: python

        def fmod(
            network: TRTNetwork,
            target: Target,
            source_ir: Optional[SourceIR],
            name: str,
            input: TRTTensor,
            other: TRTTensor,
        ) -> TRTTensor:
            # implementation

  The implementations of these operators are segregated according to the operator type:

  * Since fmod is elementwise, it is present in ``py/torch_tensorrt/fx/impl/elementwise``.
    ``py/torch_tensorrt/fx/impl/elementwise/base`` contains the base functions for elementwise operators (a sketch of how ``fmod`` can be composed from such base calls follows this list).
  * Operators like sqrt are present in ``py/torch_tensorrt/fx/impl/unary``.
    ``py/torch_tensorrt/fx/impl/unary/base`` contains the base functions for unary operators.
  * Operators like softmax, layer_norm and batch_norm are present in ``py/torch_tensorrt/fx/impl/normalization``.
    Since there are no base operations common to all of them, there is no base file.
  * Operators like slice, select, where and embedding are present in ``py/torch_tensorrt/fx/impl/*.py``.
    They have individual operator implementations with the same API structure as above but with different arguments.
  * The converter should be registered with the appropriate opcode, in the same way as the activation example.
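
  A hypothetical sketch of how ``fmod`` could be composed from the elementwise base functions; the helper names, argument order and TensorRT enums used here are assumptions for illustration, not the repository's exact API:

  .. code-block:: python

      def fmod(network, target, source_ir, name, input, other):
          # fmod(a, b) = a - trunc(a / b) * b, built from elementwise base calls
          quotient = trunc_div(network, target, source_ir, f"{name}_trunc_div", input, other)
          product = convert_binary_elementwise(
              network, target, source_ir, f"{name}_prod",
              trt.ElementWiseOperation.PROD, quotient, other,
          )
          return convert_binary_elementwise(
              network, target, source_ir, f"{name}_sub",
              trt.ElementWiseOperation.SUB, input, product,
          )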

* lowering type

  Some operators can be decomposed into sub-operations and do not need separate converter registration.
  Such operators can be implemented via ``lowering passes``.
  We illustrate this via the ``addmm`` operator.

  * The decompositions are registered via ``register_decomposition``. We define ``addmm_replacement``, which replaces ``addmm`` with torch ops that have their own converters (a sketch of the decorated form follows the code block below).

    .. code-block:: python

        def addmm_replacement(
            input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
        ) -> torch.Tensor:
            return torch.add(
                torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
            )

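    A rough sketch of the decorated form, assuming ``register_decomposition`` from ``torch._decomp`` and a hypothetical local registry (both assumptions, not taken from this commit):

    .. code-block:: python

        import torch
        from torch._decomp import register_decomposition

        aten = torch.ops.aten
        DECOMPOSITIONS = {}  # hypothetical registry of lowering decompositions


        @register_decomposition(aten.addmm, registry=DECOMPOSITIONS)
        def addmm_replacement(
            input_: torch.Tensor, mat1: torch.Tensor, mat2: torch.Tensor, *, beta=1, alpha=1
        ) -> torch.Tensor:
            # body as shown above: beta * input + alpha * (mat1 @ mat2)
            return torch.add(
                torch.mul(input_, beta), torch.mul(torch.matmul(mat1, mat2), alpha)
            )
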
Tests
----------------

* FX testing

  * The fx tests are present in ``test/converters/aten_op``.
  * The test class is derived from ``DispatchTestCase``, with parameterized testing used to implement the different test cases (a sketch follows this list).
  * The results for ``dispatch_tracer.aten_trace`` and torch are compared.
  * The ``expected_op`` is also tested. This op will be called by the model and needs to be specified so that the appropriate converter is invoked.
  * The tests throw an error if any of the above conditions fail.

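  A minimal sketch of such a test; the harness import path and the exact ``run_test`` signature are assumptions, not taken from this commit:

  .. code-block:: python

      import torch
      from parameterized import parameterized

      # harness import path is an assumption
      from torch_tensorrt.fx.tools.common_fx2trt import DispatchTestCase


      class TestLeakyReLUConverter(DispatchTestCase):
          @parameterized.expand([("default", 0.01), ("custom_slope", 0.05)])
          def test_leaky_relu(self, _, negative_slope):
              class TestModule(torch.nn.Module):
                  def forward(self, x):
                      return torch.nn.functional.leaky_relu(x, negative_slope)

              inputs = [torch.randn(1, 10)]
              # the aten trace of the module is run through TensorRT and compared
              # against torch; expected_ops pins the op the converter must handle
              self.run_test(
                  TestModule(),
                  inputs,
                  expected_ops={torch.ops.aten.leaky_relu.default},
              )
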
* Dynamo testing

  * The dynamo tests are present for the lowering ops. The converters above will soon be ported to dynamo tests.
  * The lowering op test is present in ``dynamo/backend/test/test_decompositions.py``.
  * The results for ``fx.symbolic_trace`` and ``torch_tensorrt.compile`` are compared.
  * The tests also check the ``expected_op`` and the ``unexpected_op``: ``expected_op`` refers to the operations the original op is lowered to, while ``unexpected_op`` is the original op itself (a sketch follows this list).
  * The tests throw an error if any of the above conditions fail.
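
  A rough sketch of what such a lowering test checks; the compile call and the op-inspection step are illustrative assumptions, not the repository's actual helpers:

  .. code-block:: python

      import torch
      import torch_tensorrt


      class AddMM(torch.nn.Module):
          def forward(self, bias, mat1, mat2):
              return torch.addmm(bias, mat1, mat2)


      # After lowering, addmm should be gone and its decomposition should appear.
      expected_ops = {torch.ops.aten.add.Tensor, torch.ops.aten.mul.Tensor, torch.ops.aten.mm.default}
      unexpected_ops = {torch.ops.aten.addmm.default}

      # TensorRT execution would normally use CUDA tensors; CPU shown for brevity.
      inputs = [torch.rand(2, 2), torch.rand(2, 3), torch.rand(3, 2)]

      # Reference result from the symbolically traced graph ...
      fx_graph = torch.fx.symbolic_trace(AddMM())
      ref = fx_graph(*inputs)

      # ... compared against the Torch-TensorRT compiled module, while the lowered
      # graph is inspected for expected/unexpected ops (inspection helper omitted).
      trt_module = torch_tensorrt.compile(fx_graph, inputs=inputs)
      torch.testing.assert_close(trt_module(*inputs), ref, rtol=1e-3, atol=1e-3)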
