# Doesn't torch.compile already capture the backward graph?
# ------------
# And it does, **partially**. AOTAutograd captures the backward graph ahead-of-time, but with certain limitations:
# 1. Graph breaks in the forward lead to graph breaks in the backward
# 2. `Backward hooks <https://pytorch.org/docs/stable/notes/autograd.html#backward-hooks-execution>`_ are not captured
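#
# For illustration, both patterns can be reproduced with a small sketch like the one below
# (an illustrative example, not the tutorial's own model; ``torch._dynamo.graph_break()`` is
# only used to force a break artificially, real breaks usually come from unsupported Python):
#
# .. code-block:: python
#
#    import torch
#
#    model = torch.nn.Linear(8, 8)
#
#    def fn(x):
#        y = model(x)
#        torch._dynamo.graph_break()  # 1. a (forced) graph break in the forward
#        return y.relu()
#
#    x = torch.randn(4, 8, requires_grad=True)
#    out = torch.compile(fn)(x)
#    out.register_hook(lambda grad: grad * 2)  # 2. a backward hook on the output tensor
#    out.sum().backward()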
#
# Compiled Autograd addresses these limitations by directly integrating with the autograd engine, allowing
# it to capture the full backward graph at runtime. Models with these two characteristics should try
# Compiled Autograd, and potentially observe better performance.
#
# However, Compiled Autograd has its own limitations:
# 1. Additional runtime overhead at the start of the backward
# 2. Dynamic autograd structure leads to recompiles
#
# .. note:: Compiled Autograd is under active development and is not yet compatible with all existing PyTorch features. For the latest status on a particular feature, refer to `Compiled Autograd Landing Page <https://docs.google.com/document/d/11VucFBEewzqgkABIjebZIzMvrXr3BtcY1aGKpX61pJY>`_.
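#
# One way to enable Compiled Autograd is the ``torch._dynamo.config.compiled_autograd`` flag,
# set before calling ``torch.compile``. The snippet below is a minimal sketch of that usage
# (the toy model and training step are illustrative, not the tutorial's own code):
#
# .. code-block:: python
#
#    import torch
#
#    # Capture the full backward graph at runtime via Compiled Autograd.
#    torch._dynamo.config.compiled_autograd = True
#
#    model = torch.nn.Linear(8, 8)
#
#    @torch.compile
#    def train_step(x):
#        loss = model(x).sum()
#        loss.backward()  # the backward runs under Compiled Autograd
#
#    train_step(torch.randn(2, 8))
#
# Saved as ``example.py``, a sketch like this gives the logging commands shown below something to run against.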
# Run the script with the ``TORCH_LOGS`` environment variable:
# - To only print the compiled autograd graph, use ``TORCH_LOGS="compiled_autograd" python example.py``
# - To print the graph with more tensor metadata and recompile reasons, at the cost of performance, use ``TORCH_LOGS="compiled_autograd_verbose" python example.py``
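#
# Logs can also be enabled programmatically through the private ``torch._logging._internal.set_logs``
# API. A minimal sketch (as a private API, its exact signature may change between releases):
#
# .. code-block:: python
#
#    import torch
#
#    # Roughly equivalent to running with TORCH_LOGS="compiled_autograd_verbose".
#    torch._logging._internal.set_logs(compiled_autograd_verbose=True)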
# The compiled autograd graph should now be logged to stderr. Certain graph nodes will have names that are prefixed by ``aot0_``,
# these correspond to the nodes previously compiled ahead of time in AOTAutograd backward graph 0, e.g. ``aot0_view_2`` corresponds to ``view_2`` of the AOT backward graph with id=0.
#
stderr_output = """
DEBUG:torch._dynamo.compiled_autograd.__compiled_autograd_verbose:Cache miss due to new autograd node: torch::autograd::GraphRoot (NodeCall 0) with key size 39, previous key sizes=[]
"""
# .. note:: This is the graph on which we will call ``torch.compile``, NOT the optimized graph. Compiled Autograd generates some Python code to represent the entire C++ autograd execution.
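#
# Backward hooks are one of the cases where Compiled Autograd goes beyond AOTAutograd. The snippet
# below is a minimal sketch that registers a tensor hook and then runs the backward under
# ``torch.compile`` (illustrative only; it assumes the ``torch._dynamo.config.compiled_autograd``
# flag from the earlier sketch, and the tutorial's own hook example may differ):
#
# .. code-block:: python
#
#    import torch
#
#    torch._dynamo.config.compiled_autograd = True
#    model = torch.nn.Linear(8, 8)
#
#    @torch.compile
#    def train_step(x):
#        loss = model(x).sum()
#        loss.backward()
#
#    x = torch.randn(4, 8, requires_grad=True)
#    x.register_hook(lambda grad: grad + 10)  # a backward (tensor) hook
#    train_step(x)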
# There should be a ``call_hook`` node in the graph, which dynamo will later inline:
#
stderr_output = """
DEBUG:torch._dynamo.compiled_autograd.__compiled_autograd_verbose:Cache miss due to new autograd node: torch::autograd::GraphRoot (NodeCall 0) with key size 39, previous key sizes=[]
"""
# Conclusion
# ----------
# In this tutorial, we went over the high-level ecosystem of torch.compile with compiled autograd, the basics of compiled autograd and a few common recompilation reasons.
#
# For feedback on this tutorial, please file an issue on https://github.com/pytorch/tutorials.