
Commit 9619b9c

Update docs on Module new APIs. (#5968)
Update docs on Module new APIs. (#5952)

Summary: Pull Request resolved: #5952
Reviewed By: kirklandsign
Differential Revision: D64005568
fbshipit-source-id: 7cd8ab9fe33d5745064aca7720d34ce1d9f4f06b
(cherry picked from commit 0424eef)
Co-authored-by: Anthony Shoumikhin <[email protected]>
1 parent fef1299 commit 9619b9c

File tree


docs/source/extension-module.md

Lines changed: 104 additions & 23 deletions
@@ -2,7 +2,7 @@

**Author:** [Anthony Shoumikhin](https://github.com/shoumikhin)

In the [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md), we explored the lower-level ExecuTorch APIs for running an exported model. While these APIs offer zero overhead, great flexibility, and control, they can be verbose and complex for regular use. To simplify this and resemble PyTorch's eager mode in Python, we introduce the `Module` facade APIs over the regular ExecuTorch runtime APIs. The `Module` APIs provide the same flexibility but default to commonly used components like `DataLoader` and `MemoryAllocator`, hiding most intricate details.

## Example

@@ -37,7 +37,7 @@ The code now boils down to creating a `Module` and calling `forward()` on it, wi

### Creating a Module

Creating a `Module` object is a fast operation that does not involve significant processing time or memory allocation. The actual loading of a `Program` and a `Method` happens lazily on the first inference unless explicitly requested with a dedicated API.

```cpp
Module module("/path/to/model.pte");
@@ -60,31 +60,32 @@ const auto error = module.load_method("forward");

assert(module.is_method_loaded("forward"));
```

You can also use the convenience function to load the `forward` method:

```cpp
const auto error = module.load_forward();

assert(module.is_method_loaded("forward"));
```

**Note:** The `Program` is loaded automatically before any `Method` is loaded. Subsequent attempts to load them have no effect if a previous attempt was successful.

### Querying for Metadata

Get a set of method names that a `Module` contains using the `method_names()` function:

```cpp
const auto method_names = module.method_names();

if (method_names.ok()) {
  assert(method_names->count("forward"));
}
```

**Note:** `method_names()` will force-load the `Program` when called for the first time.

To introspect miscellaneous metadata about a particular method, use the `method_meta()` function, which returns a `MethodMeta` struct:

```cpp
const auto method_meta = module.method_meta("forward");
@@ -94,47 +95,123 @@ if (method_meta.ok()) {
  assert(method_meta->num_inputs() > 1);

  const auto input_meta = method_meta->input_tensor_meta(0);
  if (input_meta.ok()) {
    assert(input_meta->scalar_type() == ScalarType::Float);
  }

  const auto output_meta = method_meta->output_tensor_meta(0);
  if (output_meta.ok()) {
    assert(output_meta->sizes().size() == 1);
  }
}
```

**Note:** `method_meta()` will also force-load the `Method` the first time it is called.

### Performing an Inference

Assuming the `Program`'s method names and their input format are known ahead of time, you can run methods directly by name using the `execute()` function:

```cpp
const auto result = module.execute("forward", tensor);
```

For the standard `forward()` method, the above can be simplified:

```cpp
const auto result = module.forward(tensor);
```

**Note:** `execute()` or `forward()` will load the `Program` and the `Method` the first time they are called. Therefore, the first inference will take longer, as the model is loaded lazily and prepared for execution unless it was explicitly loaded earlier.

### Setting Input and Output

You can set individual input and output values for methods with the following APIs.

#### Setting Inputs

Inputs can be any `EValue`, which includes tensors, scalars, lists, and other supported types. To set a specific input value for a method:

```cpp
module.set_input("forward", input_value, input_index);
```

- `input_value` is an `EValue` representing the input you want to set.
- `input_index` is the zero-based index of the input to set.

For example, to set the first input tensor:

```cpp
module.set_input("forward", tensor_value, 0);
```

You can also set multiple inputs at once:

```cpp
std::vector<runtime::EValue> inputs = {input1, input2, input3};
module.set_inputs("forward", inputs);
```

**Note:** You can skip the method name argument for the `forward()` method.

By pre-setting all inputs, you can perform an inference without passing any arguments:

```cpp
const auto result = module.forward();
```

Alternatively, set some inputs ahead of time and pass the rest at execution time:

```cpp
// Set the second input ahead of time.
module.set_input(input_value_1, 1);

// Execute the method, providing the first input at call time.
const auto result = module.forward(input_value_0);
```

**Note:** Pre-set inputs are stored in the `Module` and can be reused across subsequent executions.

Don't forget to clear or reset inputs you no longer need by setting them to a default-constructed `EValue`:

```cpp
module.set_input(runtime::EValue(), 1);
```

#### Setting Outputs

Only outputs of type `Tensor` can be set at runtime, and they must not be memory-planned at model export time. Memory-planned tensors are preallocated during model export and cannot be replaced.

To set the output tensor for a specific method:

```cpp
module.set_output("forward", output_tensor, output_index);
```

- `output_tensor` is an `EValue` containing the tensor you want to set as the output.
- `output_index` is the zero-based index of the output to set.

**Note:** Ensure that the output tensor you're setting matches the expected shape and data type of the method's output.

You can skip the method name for `forward()` and the index for the first output:

```cpp
module.set_output(output_tensor);
```

**Note:** Pre-set outputs are stored in the `Module` and can be reused across subsequent executions, just like inputs.

### Result and Error Types

Most of the ExecuTorch APIs return either `Result` or `Error` types:

- [`Error`](https://github.com/pytorch/executorch/blob/main/runtime/core/error.h) is a C++ enum containing valid error codes. The default is `Error::Ok`, denoting success.

- [`Result`](https://github.com/pytorch/executorch/blob/main/runtime/core/result.h) can hold either an `Error` if the operation fails, or a payload such as an `EValue` wrapping a `Tensor` if successful. To check if a `Result` is valid, call `ok()`. To retrieve the `Error`, use `error()`, and to get the data, use `get()` or dereference operators like `*` and `->`.

### Profiling the Module

Use [ExecuTorch Dump](etdump.md) to trace model execution. Create an `ETDumpGen` instance and pass it to the `Module` constructor. After executing a method, save the `ETDump` data to a file for further analysis:

```cpp
#include <fstream>
@@ -147,7 +224,7 @@ using namespace ::executorch::extension;

Module module("/path/to/model.pte", Module::LoadMode::MmapUseMlock, std::make_unique<ETDumpGen>());

// Execute a method, e.g., module.forward(...); or module.execute("my_method", ...);

if (auto* etdump = dynamic_cast<ETDumpGen*>(module.event_tracer())) {
  const auto trace = etdump->get_etdump_data();
@@ -162,3 +239,7 @@ if (auto* etdump = dynamic_cast<ETDumpGen*>(module.event_tracer())) {
  }
}
```

# Conclusion

The `Module` APIs provide a simplified interface for running ExecuTorch models in C++, closely resembling the experience of PyTorch's eager mode. By abstracting away the complexities of the lower-level runtime APIs, developers can focus on model execution without worrying about the underlying details.
