# Using ExecuTorch with C++

In order to support a wide variety of devices, from high-end mobile phones down to tiny embedded systems, ExecuTorch provides an API surface with a high degree of customizability. The C++ APIs expose advanced configuration options, such as controlling memory allocation, placement, and data loading. To meet the needs of both application and embedded programming, ExecuTorch provides a low-level, highly customizable core set of APIs and a set of high-level extensions that abstract away the low-level details that are not relevant for mobile application programming.

## High-Level APIs

The C++ `Module` class provides the high-level interface for loading and executing a model from C++. It is responsible for loading the .pte file, configuring memory allocation and placement, and running the model. The `Module` constructor takes a path to the .pte file, and the class provides a simplified `forward()` method to run the model.

In addition to the `Module` class, the tensor extension provides an encapsulated interface for defining and managing tensor memory. It provides the `TensorPtr` class, a "fat" smart pointer that owns the tensor data along with its metadata, such as sizes and strides. The `make_tensor_ptr` and `from_blob` functions, defined in `tensor.h`, provide owning and non-owning tensor creation APIs, respectively.

```cpp
#include <executorch/extension/module/module.h>
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

// Load the model.
Module module("/path/to/model.pte");

// Create an input tensor.
float input[1 * 3 * 256 * 256];
auto tensor = from_blob(input, {1, 3, 256, 256});

// Perform an inference.
const auto result = module.forward(tensor);

if (result.ok()) {
  // Retrieve the output data.
  const auto output = result->at(0).toTensor().const_data_ptr<float>();
}
```
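Since `from_blob` is non-owning, the `input` buffer above must outlive the tensor that wraps it. When the tensor should own its storage instead, `make_tensor_ptr` creates an owning tensor. A minimal sketch of owning creation (the sizes and values here are illustrative):

```cpp
#include <executorch/extension/tensor/tensor.h>

using namespace ::executorch::extension;

// Create an owning 2x2 float tensor. The data lives inside the TensorPtr
// and is released when the last reference to it goes out of scope.
auto owned = make_tensor_ptr({2, 2}, {1.0f, 2.0f, 3.0f, 4.0f});
```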
For more information on the `Module` class, see [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md). For information on the high-level tensor APIs, see [Managing Tensor Memory in C++](extension-tensor.md).

## Low-Level APIs

Running a model using the low-level runtime APIs allows for a high degree of control over memory allocation, placement, and loading. This enables advanced use cases, such as placing allocations in specific memory banks or loading a model without a file system. For an end-to-end example using the low-level runtime APIs, see [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md).
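At this level, the application loads the program through a `DataLoader`, supplies the method's memory-planned buffers itself, then loads and executes a method. The sketch below condenses the flow from the linked tutorial; error checks are omitted for brevity, and the file path and 4 KB scratch pool are illustrative:

```cpp
#include <executorch/extension/data_loader/file_data_loader.h>
#include <executorch/runtime/executor/method.h>
#include <executorch/runtime/executor/program.h>
#include <executorch/runtime/platform/runtime.h>

#include <cstdint>
#include <memory>
#include <vector>

using executorch::extension::FileDataLoader;
using executorch::runtime::HierarchicalAllocator;
using executorch::runtime::MemoryAllocator;
using executorch::runtime::MemoryManager;
using executorch::runtime::Method;
using executorch::runtime::MethodMeta;
using executorch::runtime::Program;
using executorch::runtime::Result;
using executorch::runtime::Span;

int main() {
  executorch::runtime::runtime_init();

  // Load the program. A different DataLoader (e.g. one backed by a memory
  // buffer) can be substituted on systems without a file system.
  Result<FileDataLoader> loader = FileDataLoader::from("/path/to/model.pte");
  Result<Program> program = Program::load(&loader.get());

  // Allocate the memory-planned buffers the method requires. The application
  // controls this memory, so each span could be placed in a specific bank.
  Result<MethodMeta> method_meta = program->method_meta("forward");
  std::vector<std::unique_ptr<uint8_t[]>> planned_buffers;
  std::vector<Span<uint8_t>> planned_spans;
  for (size_t id = 0; id < method_meta->num_memory_planned_buffers(); ++id) {
    const size_t buffer_size = static_cast<size_t>(
        method_meta->memory_planned_buffer_size(id).get());
    planned_buffers.push_back(std::make_unique<uint8_t[]>(buffer_size));
    planned_spans.push_back({planned_buffers.back().get(), buffer_size});
  }
  HierarchicalAllocator planned_memory(
      {planned_spans.data(), planned_spans.size()});

  // Scratch memory used while loading the method; 4 KB is an arbitrary size.
  static uint8_t method_allocator_pool[4 * 1024];
  MemoryAllocator method_allocator(
      sizeof(method_allocator_pool), method_allocator_pool);
  MemoryManager memory_manager(&method_allocator, &planned_memory);

  // Load the method and run it, after setting inputs via set_input().
  Result<Method> method = program->load_method("forward", &memory_manager);
  method->execute();
  return 0;
}
```

Because the application allocates `planned_buffers` itself, an embedded target can direct each span to a specific memory region, which is the kind of placement control the core APIs are designed to expose.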
## Next Steps

- [Runtime API Reference](executorch-runtime-api-reference.md) for documentation on the available C++ runtime APIs.
- [Running an ExecuTorch Model Using the Module Extension in C++](extension-module.md) for information on the high-level Module API.
- [Managing Tensor Memory in C++](extension-tensor.md) for information on the high-level tensor APIs.
- [Running an ExecuTorch Model in C++ Tutorial](running-a-model-cpp-tutorial.md) for information on the low-level runtime APIs.