pytorch · Olivia-liu · Oct 2, 2024
@@ -116,7 +116,7 @@ python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --no-use_fp
 cd executorch
 python3 -m examples.apple.mps.scripts.mps_example --model_name="mv3" --generate_etrecord -b
 ```
-2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./sdk-etdump.md).
+2. Run your Program on the ExecuTorch runtime and generate an [ETDump](./etdump.md).
 ```
 ./cmake-out/examples/apple/mps/mps_executor_runner --model_path mv3_mps_bundled_fp16.pte --bundled_program --dump-outputs
 ```

@@ -100,15 +100,15 @@ python3 -m examples.apple.coreml.scripts.export --model_name mv3 --generate_etre
 # Builds `coreml_executor_runner`.
 ./examples/apple/coreml/scripts/build_executor_runner.sh
 ```
-3. Run and generate an [ETDump](./sdk-etdump.md).
+3. Run and generate an [ETDump](./etdump.md).
 ```bash
 cd executorch
 
 # Generate the ETDump file.
 ./coreml_executor_runner --model_path mv3_coreml_all.pte --profile_model --etdump_path etdump.etdp
 ```
 
-4. Create an instance of the [Inspector API](./sdk-inspector.rst) by passing in the [ETDump](./sdk-etdump.md) you have sourced from the runtime along with the optionally generated [ETRecord](./etrecord.rst) from step 1 or execute the following command in your terminal to display the profiling data table.
+4. Create an instance of the [Inspector API](./model-inspector.rst) by passing in the [ETDump](./etdump.md) you sourced from the runtime, along with the optional [ETRecord](./etrecord.rst) generated in step 1. Alternatively, run the following command in your terminal to display the profiling data table.
 ```bash
 python examples/apple/coreml/scripts/inspector_cli.py --etdump_path etdump.etdp --etrecord_path mv3_coreml.bin
 ```

@@ -127,7 +127,7 @@ A demo of the runtime code can be found [here](https://github.com/pytorch/execut
 
 ## Surfacing custom metadata from delegate events
 
-As seen in the runtime logging API's above, users can log an array of bytes along with their delegate profiling event. We make this data available for users in post processing via the [Inspector API](./sdk-inspector.rst).
+As seen in the runtime logging APIs above, users can log an array of bytes along with their delegate profiling event. This data is made available to users in post-processing via the [Inspector API](./model-inspector.rst).
 
 Users can pass a metadata parser when creating an instance of the Inspector. The parser is a callable that deserializes the data and returns a list of strings or a dictionary containing key-value pairs. The deserialized data is then added back to the corresponding event in the event block for user consumption. Here's an example of how to write this parser:
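The hunk ends before the documented example, so here is a minimal sketch of what such a parser could look like. It rests on two assumptions that the text above does not confirm: that the callable receives the logged data as a list of `bytes` objects, and that each blob is a UTF-8 encoded JSON object. The function name `parse_delegate_metadata` is chosen here for illustration only.

```python
import json
from typing import Any, Dict, List, Union


def parse_delegate_metadata(
    delegate_metadatas: List[bytes],
) -> Union[List[str], Dict[str, Any]]:
    """Deserialize the byte blobs logged alongside one delegate event.

    Assumption (not from the docs): each blob is a UTF-8 JSON object.
    Blobs that cannot be decoded are kept as hex strings so no data is lost.
    """
    merged: Dict[str, Any] = {}
    undecoded: List[str] = []
    for blob in delegate_metadatas:
        try:
            payload = json.loads(blob.decode("utf-8"))
        except (UnicodeDecodeError, json.JSONDecodeError):
            # Not valid UTF-8 JSON: preserve the raw bytes in readable form.
            undecoded.append(blob.hex())
            continue
        if isinstance(payload, dict):
            merged.update(payload)
        else:
            # Valid JSON but not an object: keep its string form.
            undecoded.append(str(payload))
    if undecoded:
        merged["undecoded"] = undecoded
    return merged
```

A callable of this shape would be handed to the Inspector at construction time. The exact keyword argument should be checked against the Inspector API reference.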
 

@@ -36,10 +36,10 @@ ETDump (ExecuTorch Dump) is the binary blob that is generated by the runtime aft
 If you only care about looking at the raw performance data without linking back to source code and other extensive features, an ETDump alone will be enough to leverage the basic features of the Developer Tools. For the full experience, it is recommended that the users also generate an ETRecord.
 ```
 
-More details are available in the [ETDump documentation](sdk-etdump.md) on how to generate and store an ETDump from the runtime.
+More details are available in the [ETDump documentation](etdump.md) on how to generate and store an ETDump from the runtime.
 
 
 ### Inspector APIs
 The Inspector Python APIs are the main user entry point into the Developer Tools. They join the data sourced from ETDump and ETRecord to give users access to all the performance and debug data sourced from the runtime, along with linkage back to eager model source code and module hierarchy, in an easy-to-use API.
 
-More details are available in the [Inspector API documentation](sdk-inspector.rst) on how to use the Inspector APIs.
+More details are available in the [Inspector API documentation](model-inspector.rst) on how to use the Inspector APIs.
@@ -0,0 +1,44 @@
+# Prerequisite | ETDump - ExecuTorch Dump
+
+ETDump (ExecuTorch Dump) is one of the core components of the ExecuTorch Developer Tools. It is the mechanism through which all forms of profiling and debugging data are extracted from the runtime. Users can't parse an ETDump directly; instead, they should pass it into the Inspector API, which deserializes the data and offers interfaces for flexible analysis and debugging.
+
+
+## Generating an ETDump
+
+Generating an ETDump is a relatively straightforward process. Users can follow the steps detailed below to integrate it into their application that uses ExecuTorch.
+
+1. ***Include*** the ETDump header in your code.
+```C++
+#include <executorch/devtools/etdump/etdump_flatcc.h>
+```
+
+2. ***Create*** an instance of the `ETDumpGen` class and pass it into the `load_method` call that is invoked in the runtime.
+
+```C++
+torch::executor::ETDumpGen etdump_gen = torch::executor::ETDumpGen();
+Result<Method> method =
+      program->load_method(method_name, &memory_manager, &etdump_gen);
+```
+
+3. ***Dump Out the ETDump Buffer*** - after the inference iterations have completed, users can dump out the ETDump buffer. On a device with a filesystem, they can simply write it to a file. On more constrained embedded devices, users will have to extract the ETDump buffer through whatever mechanism suits them best (e.g. UART, JTAG, etc.).
+
+```C++
+etdump_result result = etdump_gen.get_etdump_data();
+if (result.buf != nullptr && result.size > 0) {
+  // On a device with a filesystem, write the buffer out to a file.
+  // Open in binary mode so the bytes are written unmodified.
+  FILE* f = fopen(FLAGS_etdump_path.c_str(), "wb");
+  if (f != nullptr) {
+    fwrite((uint8_t*)result.buf, 1, result.size, f);
+    fclose(f);
+  }
+  // Release the buffer returned by get_etdump_data() once it is written.
+  free(result.buf);
+}
+```
+
+4. ***Compile*** your binary using CMake with the `ET_EVENT_TRACER_ENABLED` pre-processor flag to enable events to be traced and logged into ETDump inside the ExecuTorch runtime. This flag needs to be added to the ExecuTorch library and any operator library that you are compiling into your binary. For reference, you can take a look at `examples/sdk/CMakeLists.txt`. The lines of interest are:
+```
+target_compile_options(executorch INTERFACE -DET_EVENT_TRACER_ENABLED)
+target_compile_options(portable_ops_lib INTERFACE -DET_EVENT_TRACER_ENABLED)
+```
+## Using an ETDump
+
+Pass this ETDump into the [Inspector API](./model-inspector.rst) to access this data and do post-run analysis.
@@ -18,7 +18,7 @@ them to debug and visualize their model.
 * Delegate debug handle maps
 
 The ``ETRecord`` object itself is intended to be opaque to users and they should not access any components inside it directly.
-It should be provided to the `Inspector API <sdk-inspector.html>`__ to link back performance and debug data sourced from the runtime back to the Python source code.
+It should be provided to the `Inspector API <model-inspector.html>`__ to link performance and debug data sourced from the runtime back to the Python source code.
 
 Generating an ``ETRecord``
 --------------------------
@@ -37,4 +37,4 @@ they are interested in working with via our tooling.
 Using an ``ETRecord``
 ---------------------
 
-Pass the ``ETRecord`` as an optional argument into the `Inspector API <sdk-inspector.html>`__ to access this data and  do post-run analysis.
+Pass the ``ETRecord`` as an optional argument into the `Inspector API <model-inspector.html>`__ to access this data and do post-run analysis.
@@ -134,7 +134,7 @@ Most of the ExecuTorch APIs, including those described above, return either `Res
 
 ### Profile the Module
 
-Use [ExecuTorch Dump](sdk-etdump.md) to trace model execution. Create an instance of the `ETDumpGen` class and pass it to the `Module` constructor. After executing a method, save the `ETDump` to a file for further analysis. You can capture multiple executions in a single trace if desired.
+Use [ExecuTorch Dump](etdump.md) to trace model execution. Create an instance of the `ETDumpGen` class and pass it to the `Module` constructor. After executing a method, save the `ETDump` to a file for further analysis. You can capture multiple executions in a single trace if desired.
 
 ```cpp
 #include <fstream>

@@ -203,10 +203,10 @@ Topics in this section will help you get started with ExecuTorch.
    devtools-overview
    bundled-io
    etrecord
-   sdk-etdump
+   etdump
    sdk-profiling
-   sdk-debugging
-   sdk-inspector
+   model-debugging
+   model-inspector
    memory-planning-inspection
    delegate-debugging
    devtools-tutorial

@@ -774,7 +774,7 @@ Run the export script and the ETRecord will be generated as `etrecord.bin`.
 
 ##### ETDump generation
 
-An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../sdk-etdump.md).
+An ETDump is an artifact generated at runtime containing a trace of the model execution. For more information, see [the ETDump docs](../etdump.md).
 
 Include the ETDump header in your code.
 ```cpp
@@ -843,7 +843,7 @@ This prints the performance data in a tabular format in “inspector_out.txt”,
 ![](../_static/img/llm_manual_print_data_tabular.png)
 <a href="../_static/img/llm_manual_print_data_tabular.png" target="_blank">View in full size</a>
 
-To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../sdk-inspector.md).
+To learn more about the Inspector and the rich functionality it provides, see the [Inspector API Reference](../model-inspector.md).
 
 ## Custom Kernels
 With the ExecuTorch custom operator APIs, custom operator and kernel authors can easily bring in their kernel into PyTorch/ExecuTorch.

@@ -0,0 +1,82 @@
+# Debugging Models in ExecuTorch
+
+With the ExecuTorch Developer Tools, users can debug their models for numerical inaccuracies and extract model outputs from their device to do quality analysis (such as signal-to-noise ratio, mean squared error, etc.).
+
+Currently, ExecuTorch supports the following debugging flows:
+- Extraction of model level outputs via ETDump.
+- Extraction of intermediate outputs (outside of delegates) via ETDump:
+  - Linking of these intermediate outputs back to the eager model python code.
+
+
+## Steps to debug a model in ExecuTorch
+
+### Runtime
+For a real example reflecting the steps below, please refer to [example_runner.cpp](https://github.com/pytorch/executorch/blob/main/examples/devtools/example_runner/example_runner.cpp).
+
+1. [Optional] Generate an [ETRecord](./etrecord.rst) while exporting your model. When provided, this enables users to link profiling information back to the eager model source code (with stack traces and module hierarchy).
+2. Integrate [ETDump generation](./etdump.md) into the runtime and set the debugging level by configuring the `ETDumpGen` object. Then, provide an additional buffer to which intermediate outputs and program outputs will be written. Currently we support two levels of debugging:
+    - Program level outputs
+    ```C++
+    Span<uint8_t> buffer((uint8_t*)debug_buffer, debug_buffer_size);
+    etdump_gen.set_debug_buffer(buffer);
+    etdump_gen.set_event_tracer_debug_level(
+        EventTracerDebugLogLevel::kProgramOutputs);
+    ```
+
+    - Intermediate outputs of executed (non-delegated) operations (will include the program level outputs too)
+    ```C++
+    Span<uint8_t> buffer((uint8_t*)debug_buffer, debug_buffer_size);
+    etdump_gen.set_debug_buffer(buffer);
+    etdump_gen.set_event_tracer_debug_level(
+        EventTracerDebugLogLevel::kIntermediateOutputs);
+    ```
+3. Build the runtime with the pre-processor flag that enables tracking of debug events. Instructions are in the [ETDump documentation](./etdump.md).
+4. Run your model and dump out the ETDump buffer as described [here](./etdump.md). (Do the same for the debug buffer if one was configured above.)
+
+
+### Accessing the debug outputs post run using the Inspector APIs
+Once a model has been run, users can pass the generated ETDump and debug buffers to the [Inspector APIs](./model-inspector.rst) to inspect these debug outputs.
+
+```python
+from executorch.devtools import Inspector
+
+# Create an Inspector instance with etdump and the debug buffer.
+inspector = Inspector(
+    etdump_path=etdump_path,
+    buffer_path=buffer_path,
+    # etrecord is optional; if provided, it links the runtime
+    # events back to the eager model Python source code.
+    etrecord=etrecord_path,
+)
+
+# Accessing program outputs is as simple as this:
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        print(event_block.run_output)
+
+# Accessing intermediate outputs from each event (an event here is essentially an instruction that executed in the runtime).
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        for event in event_block.events:
+            print(event.debug_data)
+            # If an ETRecord was provided by the user during Inspector initialization, users
+            # can print the stacktraces and module hierarchy of these events.
+            print(event.stack_traces)
+            print(event.module_hierarchy)
+```
+
+We've also provided a simple set of utilities that let users perform quality analysis of their model outputs with respect to a set of reference outputs (possibly from the eager mode model).
+
+
+```python
+from executorch.devtools.inspector import compare_results
+
+# Run a simple quality analysis between the model outputs sourced from the
+# runtime and a set of reference outputs.
+#
+# Setting plot to True graphs the quality metrics, displays them (when run
+# from a notebook), and writes them out to the filesystem. A dictionary
+# containing the results is always returned.
+for event_block in inspector.event_blocks:
+    if event_block.name == "Execute":
+        compare_results(event_block.run_output, ref_outputs, plot=True)
+```
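To make the quality metrics named in this document concrete, the following is a minimal, dependency-free sketch of mean squared error and signal-to-noise ratio between runtime outputs and reference outputs. It is not the implementation behind `compare_results`. It assumes outputs have already been flattened into plain sequences of floats, and the function names are hypothetical.

```python
import math
from typing import Sequence


def mean_squared_error(actual: Sequence[float], reference: Sequence[float]) -> float:
    """Average of squared element-wise differences."""
    if len(actual) != len(reference):
        raise ValueError("actual and reference must have the same length")
    if not reference:
        raise ValueError("inputs must not be empty")
    return sum((a - r) ** 2 for a, r in zip(actual, reference)) / len(reference)


def signal_to_noise_db(actual: Sequence[float], reference: Sequence[float]) -> float:
    """Ratio of reference signal power to error power, in decibels."""
    noise_power = mean_squared_error(actual, reference)
    signal_power = sum(r * r for r in reference) / len(reference)
    if noise_power == 0.0:
        # Outputs match the reference exactly.
        return math.inf
    if signal_power == 0.0:
        # An all-zero reference carries no signal to compare against.
        return -math.inf
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative values only; real inputs would come from the runtime
# outputs and the eager-mode reference outputs.
reference_outputs = [1.0, 2.0, 3.0, 4.0]
runtime_outputs = [1.0, 2.0, 3.0, 5.0]
print(mean_squared_error(runtime_outputs, reference_outputs))
print(signal_to_noise_db(runtime_outputs, reference_outputs))
```

A higher signal-to-noise value means the runtime outputs track the reference more closely. An infinite value means they are identical.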
@@ -0,0 +1,159 @@
+Inspector APIs
+==============
+
+Overview
+--------
+
+The Inspector APIs provide a convenient interface for analyzing the
+contents of `ETRecord <etrecord.html>`__ and
+`ETDump <etdump.html>`__, helping developers get insights about model
+architecture and performance statistics. They are built on top of the `EventBlock Class <#eventblock-class>`__ data structure,
+which organizes a group of `Event <#event-class>`__\ s for easy access to details of profiling events.
+
+There are multiple ways in which users can interact with the Inspector
+APIs:
+
+* By using `public methods <#inspector-methods>`__ provided by the ``Inspector`` class.
+* By accessing the `public attributes <#inspector-attributes>`__ of the ``Inspector``, ``EventBlock``, and ``Event`` classes.
+* By using a `CLI <#cli>`__ tool for basic functionalities.
+
+Please refer to the `e2e use case doc <tutorials/devtools-integration-tutorial.html>`__ to get an understanding of how to use these in a real-world example.
+
+
+Inspector Methods
+-----------------
+
+Constructor
+~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.__init__
+
+**Example Usage:**
+
+.. code:: python
+
+    from executorch.devtools import Inspector
+
+    inspector = Inspector(etdump_path="/path/to/etdump.etdp", etrecord="/path/to/etrecord.bin")
+
+to_dataframe
+~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.to_dataframe
+
+
+print_data_tabular
+~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.print_data_tabular
+
+.. _example-usage-1:
+
+**Example Usage:**
+
+.. code:: python
+
+    inspector.print_data_tabular()
+
+.. image:: _static/img/print_data_tabular.png
+
+Note that the unit of delegate profiling events is "cycles". We're working on providing a way to set different units in the future.
+
+
+find_total_for_module
+~~~~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.find_total_for_module
+
+.. _example-usage-2:
+
+**Example Usage:**
+
+.. code:: python
+
+    print(inspector.find_total_for_module("L__self___conv_layer"))
+
+::
+
+    0.002
+
+
+get_exported_program
+~~~~~~~~~~~~~~~~~~~~
+
+.. autofunction:: executorch.devtools.Inspector.get_exported_program
+
+.. _example-usage-3:
+
+**Example Usage:**
+
+.. code:: python
+
+    print(inspector.get_exported_program())
+
+::
+
+    ExportedProgram:
+        class GraphModule(torch.nn.Module):
+            def forward(self, arg0_1: f32[4, 3, 64, 64]):
+                # No stacktrace found for following nodes
+                _param_constant0 = self._param_constant0
+                _param_constant1 = self._param_constant1
+
+                ### ... Omit part of the program for documentation readability ... ###
+
+    Graph signature: ExportGraphSignature(parameters=[], buffers=[], user_inputs=['arg0_1'], user_outputs=['aten_tan_default'], inputs_to_parameters={}, inputs_to_buffers={}, buffers_to_mutate={}, backward_signature=None, assertion_dep_token=None)
+    Range constraints: {}
+    Equality constraints: []
+
+
+Inspector Attributes
+--------------------
+
+``EventBlock`` Class
+~~~~~~~~~~~~~~~~~~~~
+
+Access ``EventBlock`` instances through the ``event_blocks`` attribute
+of an ``Inspector`` instance, for example:
+
+.. code:: python
+
+    inspector.event_blocks
+
+.. autoclass:: executorch.devtools.inspector.EventBlock
+
+``Event`` Class
+~~~~~~~~~~~~~~~
+
+Access ``Event`` instances through the ``events`` attribute of an
+``EventBlock`` instance.
+
+.. autoclass:: executorch.devtools.inspector.Event
+
+**Example Usage:**
+
+.. code:: python
+
+    for event_block in inspector.event_blocks:
+        for event in event_block.events:
+            if event.name == "Method::execute":
+                print(event.perf_data.raw)
+
+::
+
+    [175.748, 78.678, 70.429, 122.006, 97.495, 67.603, 70.2, 90.139, 66.344, 64.575, 134.135, 93.85, 74.593, 83.929, 75.859, 73.909, 66.461, 72.102, 84.142, 77.774, 70.038, 80.246, 59.134, 68.496, 67.496, 100.491, 81.162, 74.53, 70.709, 77.112, 59.775, 79.674, 67.54, 79.52, 66.753, 70.425, 71.703, 81.373, 72.306, 72.404, 94.497, 77.588, 79.835, 68.597, 71.237, 88.528, 71.884, 74.047, 81.513, 76.116]
+
+
+CLI
+---
+
+Execute the following command in your terminal to display the data
+table. This command produces the same table as calling the
+`print_data_tabular <#print-data-tabular>`__ method described earlier:
+
+.. code:: bash
+
+    python3 -m devtools.inspector.inspector_cli --etdump_path <path_to_etdump> --etrecord_path <path_to_etrecord>
+
+Note that the ``etrecord_path`` argument is optional.
+
+We plan to extend the capabilities of the CLI in the future.
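The `Event` example in this file prints a raw list of per-run measurements via `event.perf_data.raw`. A list like that can be summarized with the standard library alone. The sketch below assumes only that the raw data is a non-empty sequence of numbers. `summarize_latencies` is a hypothetical helper and not part of the Inspector API.

```python
import statistics
from typing import Dict, Sequence


def summarize_latencies(raw: Sequence[float]) -> Dict[str, float]:
    """Return min, max, average, and median for a list of raw measurements."""
    if not raw:
        raise ValueError("raw must contain at least one measurement")
    ordered = sorted(raw)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "avg": statistics.fmean(ordered),
        "median": statistics.median(ordered),
    }


# The first few values from the example output in this file.
sample = [175.748, 78.678, 70.429, 122.006, 97.495]
for name, value in summarize_latencies(sample).items():
    print(f"{name}: {value:.3f}")
```

Comparing the median with the average is a quick way to spot a slow first iteration, such as the leading value in the example output.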