Commit 6bbf5b8

Update base for Update on "[5.1/ N] set_option/get_option API with {backend_name, backend options} only"

This PR exposes the set_option/get_option API only via the pair {backend_name, backend_options}, without the backend options map. The backend options map and its corresponding API will be exposed in a separate PR; see #11758, which exposes set_option/get_option with the backend options map as well.

Differential Revision: [D77190316](https://our.internmc.facebook.com/intern/diff/D77190316/)

[ghstack-poisoned]

2 parents: 1a175f1 + 222d9e3


55 files changed: +4639 / -559 lines

.ci/docker/requirements-ci.txt: 3 additions & 0 deletions

@@ -28,3 +28,6 @@ matplotlib>=3.9.4
 myst-parser==0.18.1
 sphinx_design==0.4.1
 sphinx-copybutton==0.5.0
+
+# script unit test requirements
+yaspin==3.1.0
Lines changed: 172 additions & 0 deletions

@@ -0,0 +1,172 @@
# Executorch Benchmark Tooling

A library providing tools for fetching, processing, and analyzing ExecutorchBenchmark data from the HUD Open API. This tooling helps compare performance metrics between private and public devices with identical settings.

## Table of Contents

- [Overview](#overview)
- [Installation](#installation)
- [Tools](#tools)
  - [get_benchmark_analysis_data.py](#get_benchmark_analysis_datapy)
    - [Quick Start](#quick-start)
    - [Command Line Options](#command-line-options)
    - [Example Usage](#example-usage)
    - [Working with Output Files](#working-with-output-files-csv-and-excel)
    - [Python API Usage](#python-api-usage)
- [Running Unit Tests](#running-unit-tests)

## Overview

The Executorch Benchmark Tooling provides a suite of utilities designed to:

- Fetch benchmark data from the HUD Open API for specified time ranges
- Clean and process data by filtering out failures
- Compare metrics between private and public devices with matching configurations
- Generate analysis reports in various formats (CSV, Excel, JSON)
- Support filtering by device pools, backends, and models

This tooling is particularly useful for performance analysis, regression testing, and cross-device comparisons.
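The failure-filtering step mentioned above can be sketched in a few lines of pandas. This is an illustration only, not the tooling's actual implementation: the column names and the `FAILURE` marker format are assumptions.

```python
import pandas as pd

# Hedged sketch of failure filtering: drop benchmark rows whose value
# carries a FAILURE indicator. Column names are illustrative only.
raw = pd.DataFrame({
    "model": ["mv3", "mv3", "ic4"],
    "metric": ["latency_ms", "latency_ms", "latency_ms"],
    "value": ["12.1", "FAILURE_REPORT", "9.8"],
})

# Keep only rows whose value does not contain a FAILURE marker
clean = raw[~raw["value"].astype(str).str.contains("FAILURE")]
print(len(clean))  # 2 rows survive
```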

## Installation

Install dependencies:

```bash
pip install -r requirements.txt
```

## Tools

### get_benchmark_analysis_data.py

This script is mainly used to generate analysis data comparing private devices with public devices using the same settings.

It fetches benchmark data from the HUD Open API for a specified time range, cleans the data by removing entries with FAILURE indicators, and retrieves all private device metrics along with the equivalent public device metrics based on matching [model, backend, device_pool_names, arch] configurations. Users can filter the data by specifying private device_pool_names, backends, and models.
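The configuration-matching idea described above can be illustrated with a small pandas sketch. The column names, values, and the join on `(model, backend, arch)` are hypothetical simplifications of the script's actual grouping:

```python
import pandas as pd

# Illustrative sketch (not the script's actual implementation): pair private
# rows with public rows that share the same (model, backend, arch) settings.
private = pd.DataFrame([
    {"model": "mv3", "backend": "xnnpack_q8", "arch": "ios_17",
     "device_pool": "apple_iphone_15_private", "latency_ms": 12.1},
])
public = pd.DataFrame([
    {"model": "mv3", "backend": "xnnpack_q8", "arch": "ios_17",
     "device_pool": "apple_iphone_15", "latency_ms": 13.4},
])

# Inner join keeps only configurations present on both device types
matched = private.merge(
    public, on=["model", "backend", "arch"], suffixes=("_private", "_public")
)
print(matched[["model", "latency_ms_private", "latency_ms_public"]])
```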
45+
46+
#### Quick Start
47+
48+
```bash
49+
# generate excel sheets for all private devices with public devices using the same settings
50+
python3 .ci/scripts/benchmark_tooling/get_benchmark_analysis_data.py \
51+
--startTime "2025-06-11T00:00:00" \
52+
--endTime "2025-06-17T18:00:00" \
53+
--outputType "excel"
54+
55+
# generate the benchmark stability analysis
56+
python3 .ci/scripts/benchmark_tooling/analyze_benchmark_stability.py \
57+
--primary-file private.xlsx \
58+
--reference-file public.xlsx
59+
```
60+
61+
#### Command Line Options

##### Basic Options

- `--startTime`: Start time in ISO format (e.g., "2025-06-11T00:00:00") (required)
- `--endTime`: End time in ISO format (e.g., "2025-06-17T18:00:00") (required)
- `--env`: Environment to query ("local" or "prod"; default: "prod")
- `--no-silent`: Show processing logs (by default, only results and minimal logging are shown)

##### Output Options

- `--outputType`: Output format (default: "print")
  - `print`: Display results in the console
  - `json`: Generate a JSON file
  - `df`: Return results as DataFrames: `{'private': List[{'groupInfo': Dict, 'df': DF}, ...], 'public': List[{'groupInfo': Dict, 'df': DF}, ...]}`
  - `excel`: Generate Excel files with multiple sheets; the cell in the first row and first column of each sheet contains a JSON string of the raw metadata
  - `csv`: Generate CSV files in separate folders; the cell in the first row and first column of each file contains a JSON string of the raw metadata
- `--outputDir`: Directory to save output files (default: current directory)
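The first-cell metadata convention used by the `excel` and `csv` outputs can be demonstrated with a stdlib-only sketch. The metadata fields here are made up for illustration; only the layout (JSON string in the first cell, table below it) reflects the description above:

```python
import csv
import io
import json

# Hedged sketch of the first-cell metadata convention: the first row's
# first cell holds a JSON string of the group metadata, and the actual
# table follows. Field names are illustrative only.
meta = {"model": "mv3", "backend": "xnnpack_q8", "device": "samsung_s22"}

buf = io.StringIO()
w = csv.writer(buf)
w.writerow([json.dumps(meta)])                 # metadata row
w.writerow(["metric", "value"])                # table header
w.writerow(["avg_inference_latency", "12.3"])  # table data

# Reading it back: parse the metadata cell, then consume the table rows
buf.seek(0)
rows = list(csv.reader(buf))
recovered = json.loads(rows[0][0])
print(recovered["backend"])  # xnnpack_q8
```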

##### Filtering Options

- `--device-pools`: Filter by private device pool names (e.g., "samsung-galaxy-s22-5g", "samsung-galaxy-s22plus-5g")
- `--backends`: Filter by specific backend names (e.g., "xnnpack_q8")
- `--models`: Filter by specific model names (e.g., "mv3", "meta-llama-llama-3.2-1b-instruct-qlora-int4-eo8")

#### Example Usage

Filter by multiple private device pools and models:

```bash
# Fetch all private table data for models 'llama-3.2-1B' and 'mv3'
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' 'samsung_s22_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```

Filter by a specific device pool and models:

```bash
# Fetch all private iPhone table data for models 'llama-3.2-1B' and 'mv3',
# along with the associated public iPhone data
python3 get_benchmark_analysis_data.py \
  --startTime "2025-06-01T00:00:00" \
  --endTime "2025-06-11T00:00:00" \
  --device-pools 'apple_iphone_15_private' \
  --models 'meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8' 'mv3'
```

#### Working with Output Files (CSV and Excel)

You can use the methods in `common.py` to convert file data back to DataFrame format. These methods read the metadata stored in the first row of the CSV/Excel files and return results in the format `List[{"groupInfo": Dict, "df": pd.DataFrame}]`.

```python
import logging
import sys

logging.basicConfig(level=logging.INFO)

# The tooling lives under the hidden ".ci" directory, which is not an
# importable package name, so add it to sys.path and import directly.
sys.path.append(".ci/scripts/benchmark_tooling")
from common import read_all_csv_with_metadata, read_excel_with_json_header

# For CSV files (assuming the 'private' folder is in the current directory)
folder_path = "./private"
res = read_all_csv_with_metadata(folder_path)
logging.info(res)

# For Excel files (assuming the Excel file is in the current directory)
file_path = "./private.xlsx"
res = read_excel_with_json_header(file_path)
logging.info(res)
```

#### Python API Usage

To use the benchmark fetcher in your own scripts:

```python
import sys

# The tooling lives under the hidden ".ci" directory, so add it to
# sys.path rather than using a relative import.
sys.path.append(".ci/scripts/benchmark_tooling")
from get_benchmark_analysis_data import ExecutorchBenchmarkFetcher

# Initialize the fetcher
fetcher = ExecutorchBenchmarkFetcher(env="prod", disable_logging=False)

# Fetch data for a specific time range
fetcher.run(
    start_time="2025-06-11T00:00:00",
    end_time="2025-06-17T18:00:00",
)

# Get results in different formats
# As DataFrames
df_results = fetcher.to_df()

# Export to Excel
fetcher.to_excel(output_dir="./results")

# Export to CSV
fetcher.to_csv(output_dir="./results")

# Export to JSON
json_path = fetcher.to_json(output_dir="./results")

# Get raw dictionary results
dict_results = fetcher.to_dict()

# Use the output_data method for flexible output
results = fetcher.output_data(output_type="excel", output_dir="./results")
```

## Running Unit Tests

The benchmark tooling includes unit tests to ensure functionality.

### Using pytest for unit tests

```bash
# From the executorch root directory
pytest -c /dev/null .ci/scripts/tests/test_get_benchmark_analysis_data.py
```

.ci/scripts/benchmark_tooling/__init__.py

Whitespace-only changes.
