|
| 1 | +<h1> Errors </h1> |
| 2 | + |
| 3 | +In this section we discuss errors that can commonly arise during export. |
| 4 | + |
| 5 | +# Expected Graph Breaks |
| 6 | +## Unsupported Features |
| 7 | +In PT2 Export, we primarily reuse Dynamo, the same tracing mechanism that we use in eager mode. Recall that in eager mode graph breaks are expected, because there is always a fallback option. A consequence of this design is that Dynamo has incomplete coverage of PyTorch and Python features. (That said, fewer graph breaks generally mean better performance, even in eager mode, because optimizations can then apply over larger regions of code. We are therefore actively working on filling coverage gaps to avoid graph breaks where possible.) Unfortunately, this means that you may encounter graph breaks during export due to Dynamo coverage gaps. In such cases, you should expect an error that includes a link to [ExportDB](./export_db.md). The corresponding entry shows a minimal negative example (failure) and a minimal positive example (success), which should help you understand the limitation and the workaround, i.e., how to fix the error by rewriting code. |
| 8 | + |
| 9 | +# Constraint Violations |
| 10 | +Recall that you can specify constraints on dynamic dimensions, which encode the soundness conditions for export. It is possible that these constraints are not valid. |
| | +## Various Cases |
| 11 | +Specifically, the compiler may find that: |
| 12 | +- A dynamic dimension must be equal to a constant. |
| 13 | + - In this case, this dimension must be static: you cannot mark it dynamic. |
| 14 | +- A dynamic dimension must be in a range that does not follow the specified range, i.e., is not entirely included between the specified lower and upper bounds. |
| 15 | + - In this case, you need to adjust the specified bounds. |
| 16 | + - Note that when bounds are not specified, they are implicitly assumed to be [2, infinity). |
| 17 | + - For technical reasons that are difficult to explain here, dynamic dimensions are assumed to be neither 0 nor 1. This is not a bug, and does not necessarily mean that your exported program will not work for dimensions 0 or 1. It does mean, though, that you should test these cases. |
| 18 | +- A dynamic dimension must be equal to another dynamic dimension that it is not specified equal to. |
| 19 | + - In this case, you need to add the missing equality. |
| 20 | + - By default, all dynamic dimensions are assumed to be independent. |
| 21 | + - For legacy reasons that are difficult to explain here, you might find spurious implicitly assumed equalities when dimensions in your example inputs happen to be equal. If you ever encounter such a case, please report it as a bug. |
| 22 | + |
| 23 | +## Using the Compiler as a Guide |
| 24 | +See [this overview](./soundness.md/#Constraint Violations and How to Fix Them) of how to fix such errors. Briefly: |
| 25 | +* You should see two generated functions on the console, `specializations` and `specify_constraints`, which respectively summarize which dimensions are assumed static and what the necessary constraints on the remaining dynamic dimensions are. |
| 26 | +* If you agree with this information, you can copy-paste `specify_constraints` and call it with your example inputs to specify constraints, and you can copy-paste `specializations` and call it on your example inputs to assert their constant values. |
| 27 | +* If you do not agree and would like to provide tighter constraints, feel free to modify `specify_constraints`; the compiler will accept them. |
| 28 | +* If you do not agree and would like looser constraints, set `TORCH_LOGS=dynamic` to enable INFO-level dynamic-shape logging, which shows where the inferred constraints come from. You can also try `TORCH_LOGS=+dynamic` to enable more verbose DEBUG-level logging. |
| 29 | + * Note that you might have to change your code or your expectations based on this information. If you are absolutely convinced that the compiler has a bug, please report it! For example, there are tricky cases where the constraints may come from non-user code, like a fast path in the compiler itself. We encourage you to try different example inputs to avoid such constraints. |
| 30 | + |
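As a minimal sketch of the logging setup described above, the helper below builds an environment with `TORCH_LOGS` set to either the INFO-level or the DEBUG-level value, so that it can be handed to a child process that runs your export. The helper name `env_with_dynamic_logs` and the script name `export_script.py` are hypothetical and not part of PyTorch.

```python
import os


def env_with_dynamic_logs(verbose=False):
    """Return a copy of the current environment with dynamic-shape logging on.

    "dynamic" requests INFO-level logs; "+dynamic" requests DEBUG-level logs.
    """
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["TORCH_LOGS"] = "+dynamic" if verbose else "dynamic"
    return env


# Hypothetical usage: rerun your own export script with logging enabled.
# import subprocess, sys
# subprocess.run([sys.executable, "export_script.py"], env=env_with_dynamic_logs())
```

Setting the variable on the command line when launching the script has the same effect; the helper is only convenient when the export is driven from another Python process.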
| 31 | +# Missing META Kernels for Operators |
| 32 | +## ATen Operators |
| 33 | +In the unfortunate case where your model uses an ATen operator that is not supported yet, you may get an obscure error of the form: |
| 34 | +```text |
| 35 | +Unable to find op(FakeTensor, FakeTensor, ...) |
| 36 | +``` |
| 37 | +Please report a bug if you encounter this error. |
| 38 | + |
| 39 | +## Custom Operators |
| 40 | +In this case you should follow the instructions at [Custom Operators](./custom_operators.md). Note that the current mechanism is not ideal, but will be updated soon to make it easy for you to register custom operators. |
| 41 | + |
| 42 | +# Validation Errors |
| 43 | +Note that we do not do any validation of the exported program yet; this is planned for the near future. In these cases you should report a bug since the issue is likely in PyTorch. |
| 44 | +## Correctness |
| 45 | +The export workflow should run the example inputs through both the eager program and the exported program, and complain when their behavior differs. |
| | +## Serialization Roundtrip Failure |
| 46 | +The export workflow should serialize and deserialize the exported program, and then run the correctness test again. |
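Until such validation exists, the two checks described above can be approximated by hand. The following is a minimal sketch under stated assumptions: plain Python callables stand in for the eager and exported programs, `pickle` stands in for the real serialization format, and the function names `check_same_outputs` and `check_roundtrip` are hypothetical.

```python
import operator
import pickle


def check_same_outputs(eager_fn, exported_fn, example_inputs):
    """Run the example inputs through both callables and compare the results."""
    expected = eager_fn(*example_inputs)
    actual = exported_fn(*example_inputs)
    # Plain equality is enough for this sketch; tensor outputs would need a
    # tolerance-based comparison instead.
    if actual != expected:
        raise AssertionError(
            f"exported output {actual!r} != eager output {expected!r}"
        )


def check_roundtrip(eager_fn, exported_fn, example_inputs,
                    serialize=pickle.dumps, deserialize=pickle.loads):
    """Serialize and deserialize the exported callable, then recheck outputs."""
    restored = deserialize(serialize(exported_fn))
    check_same_outputs(eager_fn, restored, example_inputs)


# Stand-ins: a lambda plays the eager program, operator.add the exported one.
check_same_outputs(lambda a, b: a + b, operator.add, (2, 3))
check_roundtrip(lambda a, b: a + b, operator.add, (2, 3))
```

Passing `serialize` and `deserialize` as parameters keeps the roundtrip check independent of any particular serialization format.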