
Commit 50b711e (1 parent: 1df6f97)

Minor changes to event tracing in executor_runner

- Make flag 'num_executions' available in the executor_runner irrespective of the event tracing
- Update docs to explain usage of 'ENABLE_XNNPACK_PROFILING' for additional profiling info

Signed-off-by: Benjamin Klimczak <[email protected]>
Change-Id: I35abbd2d913880cb129bddb80514992f4dd84004

File tree: 3 files changed (+5, −7 lines)

docs/source/native-delegates-executorch-xnnpack-delegate.md

Lines changed: 1 addition & 1 deletion

@@ -74,7 +74,7 @@ Since weight packing creates an extra copy of the weights inside XNNPACK, We fre
 When executing the XNNPACK subgraphs, we prepare the tensor inputs and outputs and feed them to the XNNPACK runtime graph. After executing the runtime graph, the output pointers are filled with the computed tensors.
 
 #### **Profiling**
-We have enabled basic profiling for XNNPACK delegate that can be enabled with the following compiler flag `-DEXECUTORCH_ENABLE_EVENT_TRACER`. With ExecuTorch's SDK integration, you can also now use the SDK tools to profile the model. You can follow the steps in [Using the ExecuTorch SDK to Profile a Model](./tutorials/sdk-integration-tutorial) on how to profile ExecuTorch models and use SDK's Inspector API to view XNNPACK's internal profiling information. An example implementation is available in the `xnn_executor_runner` (see [tutorial here](tutorial-xnnpack-delegate-lowering.md#profiling)).
+We have enabled basic profiling for the XNNPACK delegate that can be enabled with the compiler flag `-DEXECUTORCH_ENABLE_EVENT_TRACER` (add `-DENABLE_XNNPACK_PROFILING` for additional details). With ExecuTorch's SDK integration, you can also now use the SDK tools to profile the model. You can follow the steps in [Using the ExecuTorch SDK to Profile a Model](./tutorials/sdk-integration-tutorial) on how to profile ExecuTorch models and use SDK's Inspector API to view XNNPACK's internal profiling information. An example implementation is available in the `xnn_executor_runner` (see [tutorial here](tutorial-xnnpack-delegate-lowering.md#profiling)).
 
 
 [comment]: <> (TODO: Refactor quantizer to a more official quantization doc)

docs/source/tutorial-xnnpack-delegate-lowering.md

Lines changed: 1 addition & 1 deletion

@@ -173,4 +173,4 @@ Now you should be able to find the executable built at `./cmake-out/backends/xnn
 You can build the XNNPACK backend [CMake target](https://github.com/pytorch/executorch/blob/main/backends/xnnpack/CMakeLists.txt#L83), and link it with your application binary such as an Android or iOS application. For more information on this you may take a look at this [resource](demo-apps-android.md) next.
 
 ## Profiling
-To enable profiling in the `xnn_executor_runner` pass the flags `-DEXECUTORCH_ENABLE_EVENT_TRACER=ON` and `-DEXECUTORCH_BUILD_SDK=ON` to the build command. This will enable ETDump generation when running the inference and enables command line flags for profiling (see `xnn_executor_runner --help` for details).
+To enable profiling in the `xnn_executor_runner` pass the flags `-DEXECUTORCH_ENABLE_EVENT_TRACER=ON` and `-DEXECUTORCH_BUILD_SDK=ON` to the build command (add `-DENABLE_XNNPACK_PROFILING=ON` for additional details). This will enable ETDump generation when running the inference and enables command line flags for profiling (see `xnn_executor_runner --help` for details).
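The tutorial change above names three CMake options. As context, a configure-and-run sequence enabling all of them might look like the following sketch; the repository layout, output directory, and model path are assumptions based on the paths mentioned in the tutorial, not taken verbatim from this commit:

```shell
# From the ExecuTorch repository root (path assumed).
# Enable event tracing, the SDK, and XNNPACK's extra per-operator profiling.
cmake -B cmake-out \
  -DEXECUTORCH_ENABLE_EVENT_TRACER=ON \
  -DEXECUTORCH_BUILD_SDK=ON \
  -DENABLE_XNNPACK_PROFILING=ON

cmake --build cmake-out

# Run inference; the event tracer adds profiling-related command line flags
# (see `xnn_executor_runner --help` for the authoritative list).
./cmake-out/backends/xnnpack/xnn_executor_runner --model_path model.pte
```

Without `-DEXECUTORCH_ENABLE_EVENT_TRACER=ON`, the runner still accepts `--num_executions` after this commit, but no ETDump is generated.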

examples/portable/executor_runner/executor_runner.cpp

Lines changed: 3 additions & 5 deletions

@@ -40,12 +40,12 @@ DEFINE_string(
     model_path,
     "model.pte",
     "Model serialized in flatbuffer format.");
+DEFINE_uint32(num_executions, 1, "Number of times to run the model.");
 #ifdef ET_EVENT_TRACER_ENABLED
 DEFINE_string(
     etdump_path,
     "model.etdump",
     "If ETDump generation is enabled an ETDump will be written out to this path.");
-DEFINE_uint32(num_executions, 10, "Number of times to run the model.");
 #endif // ET_EVENT_TRACER_ENABLED
 
 using namespace torch::executor;
@@ -153,7 +153,6 @@ int main(int argc, char** argv) {
   // the method can mutate the memory-planned buffers, so the method should only
   // be used by a single thread at at time, but it can be reused.
   //
-  uint32_t num_executions = 1;
   EventTracer* event_tracer_ptr = nullptr;
 #ifdef ET_EVENT_TRACER_ENABLED
   std::unique_ptr<FILE, decltype(&fclose)> etdump_file(
@@ -163,7 +162,6 @@ int main(int argc, char** argv) {
       "Failed to open ETDump file at %s.",
       FLAGS_etdump_path.c_str());
 
-  num_executions = FLAGS_num_executions;
   torch::executor::ETDumpGen etdump_gen = torch::executor::ETDumpGen();
   event_tracer_ptr = &etdump_gen;
 #endif // ET_EVENT_TRACER_ENABLED
@@ -187,15 +185,15 @@ int main(int argc, char** argv) {
   ET_LOG(Info, "Inputs prepared.");
 
   // Run the model.
-  for (uint32_t i = 0; i < num_executions; i++) {
+  for (uint32_t i = 0; i < FLAGS_num_executions; i++) {
     Error status = method->execute();
     ET_CHECK_MSG(
         status == Error::Ok,
         "Execution of method %s failed with status 0x%" PRIx32,
         method_name,
         (uint32_t)status);
   }
-  ET_LOG(Info, "Model executed successfully %i time(s).", num_executions);
+  ET_LOG(Info, "Model executed successfully %i time(s).", FLAGS_num_executions);
 
   // Print the outputs.
   std::vector<EValue> outputs(method->outputs_size());
