# Using ExecuTorch with C++

In order to support a wide variety of devices, from high-end mobile phones down to tiny embedded systems, ExecuTorch provides an API surface with a high degree of customizability. The C++ APIs expose advanced configuration options, such as controlling memory allocation, placement, and data loading. To meet the needs of both application and embedded programming, ExecuTorch provides a low-level, highly customizable core set of APIs and a set of high-level extensions that abstract away the low-level details that are not relevant for mobile application programming.

## High-Level APIs

The C++ `Module` class provides the high-level interface for loading and executing a model from C++. It is responsible for loading the .pte file, configuring memory allocation and placement, and running the model. The `Module` constructor takes a path to the .pte file, and the class provides a simplified `forward()` method to run the model.

In addition to the `Module` class, the tensor extension provides an encapsulated interface for defining and managing tensor memory. It provides the `TensorPtr` class, a "fat" smart pointer that owns the tensor data along with its metadata, such as sizes and strides. The `make_tensor_ptr` and `from_blob` functions, defined in `tensor.h`, provide owning and non-owning tensor creation APIs, respectively.

```cpp
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

// Load the model.
Module module("/path/to/model.pte");

// Create an input tensor.
float input[1 * 3 * 256 * 256];
auto tensor = from_blob(input, {1, 3, 256, 256});

// Perform an inference.
const auto result = module.forward(tensor);

if (result.ok()) {
  // Retrieve the output data.
  const auto output = result->at(0).toTensor().const_data_ptr<float>();
}
```
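Since `from_blob` is non-owning, the `input` buffer above must outlive the tensor that wraps it. When the tensor should own its storage instead, `make_tensor_ptr` creates an owning tensor. A minimal sketch of owning creation (the sizes and values here are illustrative):

```cpp
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

// Create an owning 2x2 float tensor. The data lives inside the TensorPtr
// and is released when the last reference to it goes out of scope.
auto owned = make_tensor_ptr({2, 2}, {1.0f, 2.0f, 3.0f, 4.0f});
```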
For more information on the `Module` class, see [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md). For information on the high-level tensor APIs, see [Managing Tensor Memory in C++](extension-tensor.md).

## Low-Level APIs

Running a model using the low-level runtime APIs allows for a high degree of control over memory allocation, placement, and loading. This enables advanced use cases, such as placing allocations in specific memory banks or loading a model without a file system. For an end-to-end example using the low-level runtime APIs, see [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md).
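At this level, the application loads the program through a `DataLoader`, supplies the method's memory-planned buffers itself, then loads and executes a method. The sketch below condenses the flow from the linked tutorial; error checks are omitted for brevity, and the file path and 4 KB scratch pool are illustrative:

```cpp
#include <executorch/extension/data_loader/file_data_loader.h>
#include <executorch/runtime/executor/method.h>
#include <executorch/runtime/executor/program.h>
#include <executorch/runtime/platform/runtime.h>

#include <cstdint>
#include <memory>
#include <vector>

using executorch::extension::FileDataLoader;
using executorch::runtime::HierarchicalAllocator;
using executorch::runtime::MemoryAllocator;
using executorch::runtime::MemoryManager;
using executorch::runtime::Method;
using executorch::runtime::MethodMeta;
using executorch::runtime::Program;
using executorch::runtime::Result;
using executorch::runtime::Span;

int main() {
  executorch::runtime::runtime_init();

  // Load the program. A different DataLoader (e.g. one backed by a memory
  // buffer) can be substituted on systems without a file system.
  Result<FileDataLoader> loader = FileDataLoader::from("/path/to/model.pte");
  Result<Program> program = Program::load(&loader.get());

  // Allocate the memory-planned buffers the method requires. The application
  // controls this memory, so each span could be placed in a specific bank.
  Result<MethodMeta> method_meta = program->method_meta("forward");
  std::vector<std::unique_ptr<uint8_t[]>> planned_buffers;
  std::vector<Span<uint8_t>> planned_spans;
  for (size_t id = 0; id < method_meta->num_memory_planned_buffers(); ++id) {
    const size_t buffer_size = static_cast<size_t>(
        method_meta->memory_planned_buffer_size(id).get());
    planned_buffers.push_back(std::make_unique<uint8_t[]>(buffer_size));
    planned_spans.push_back({planned_buffers.back().get(), buffer_size});
  }
  HierarchicalAllocator planned_memory(
      {planned_spans.data(), planned_spans.size()});

  // Scratch memory used while loading the method; 4 KB is an arbitrary size.
  static uint8_t method_allocator_pool[4 * 1024];
  MemoryAllocator method_allocator(
      sizeof(method_allocator_pool), method_allocator_pool);
  MemoryManager memory_manager(&method_allocator, &planned_memory);

  // Load the method and run it, after setting inputs via set_input().
  Result<Method> method = program->load_method("forward", &memory_manager);
  method->execute();
  return 0;
}
```

Because the application allocates `planned_buffers` itself, an embedded target can direct each span to a specific memory region, which is the kind of placement control the core APIs are designed to expose.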
## Next Steps

- [Runtime API Reference](executorch-runtime-api-reference.md) for documentation on the available C++ runtime APIs.
- [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md) for information on the high-level Module API.
- [Managing Tensor Memory in C++](extension-tensor.md) for information on the high-level tensor APIs.
- [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md) for information on the low-level runtime APIs.