# ExecuTorch Benchmark App for Apple Platforms

## Introduction

The **Benchmark App** is a tool designed to help developers measure the performance of PyTorch models on Apple devices using the ExecuTorch runtime.
It provides a flexible framework for dynamically generating and running performance tests on your models, allowing you to assess metrics such as load time, inference speed, and memory usage.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app.png" alt="Benchmark App" style="width:800px">
</p>

## Prerequisites

- [Xcode](https://apps.apple.com/us/app/xcode/id497799835?mt=12/) 15.0 or later, with the command-line tools installed (`xcode-select --install`) if they aren't already.
- [CMake](https://cmake.org/download/) 3.19 or later:
  - Download and open the macOS `.dmg` installer and move the CMake app to the `/Applications` folder.
  - Install the CMake command-line tools: `sudo /Applications/CMake.app/Contents/bin/cmake-gui --install`
- A development provisioning profile with the [`increased-memory-limit`](https://developer.apple.com/documentation/bundleresources/entitlements/com_apple_developer_kernel_increased-memory-limit) entitlement if targeting iOS devices.
## Setting Up the App

### Get the Code

To get started, clone the ExecuTorch repository and `cd` into the source code directory:

```bash
git clone https://github.com/pytorch/executorch.git --depth 1 --recurse-submodules --shallow-submodules
cd executorch
```

This command performs a shallow clone of the repository and its submodules to speed up the process.

### Set Up the Frameworks

The Benchmark App relies on prebuilt ExecuTorch frameworks.
You have two options:

<details>
<summary>Option 1: Download Prebuilt Frameworks</summary>
<br/>

Run the provided script to download the prebuilt frameworks:

```bash
./extension/apple/Benchmark/Frameworks/download_frameworks.sh
```
</details>

<details>
<summary>Option 2: Build Frameworks Locally</summary>
<br/>

Alternatively, you can build the frameworks yourself by following the [guide](https://pytorch.org/executorch/main/apple-runtime.html#local-build).
</details>

Once the frameworks are downloaded or built, verify that the `Frameworks` directory contains the necessary `.xcframework` files:

```bash
ls extension/apple/Benchmark/Frameworks
```

You should see:

```
backend_coreml.xcframework
backend_mps.xcframework
backend_xnnpack.xcframework
executorch.xcframework
kernels_custom.xcframework
kernels_optimized.xcframework
kernels_portable.xcframework
kernels_quantized.xcframework
```
## Adding Models and Resources

Place your exported model files (`.pte`) and any other resources (e.g., `tokenizer.bin`) into the `extension/apple/Benchmark/Resources` directory:

```bash
cp <path/to/my_model.pte> <path/to/llama3.pte> <path/to/tokenizer.bin> extension/apple/Benchmark/Resources
```

Optionally, check that the files are there:

```bash
ls extension/apple/Benchmark/Resources
```

For this example, you should see:

```
llama3.pte
my_model.pte
tokenizer.bin
```

The app automatically bundles these resources and makes them available to the test suite.
## Running the Tests

### Build and Run the Tests

Open the Benchmark Xcode project:

```bash
open extension/apple/Benchmark/Benchmark.xcodeproj
```

Select the destination device or simulator and press `Command+U`, or click `Product` > `Test` in the menu to run the test suite.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_tests.png" alt="Benchmark App Tests" style="width:800px">
</p>

### Configure Signing (if necessary)

If you plan to run the app on a physical device, you may need to set up code signing:

1. Open the **Project Navigator** by pressing `Command+1` and click the `Benchmark` root of the file tree.
2. Under the **Targets** section, go to the **Signing & Capabilities** tab of both the `App` and `Tests` targets.
3. Select your development team. Alternatively, manually pick a provisioning profile that supports the increased-memory-limit entitlement, and modify the bundle identifier if needed.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_signing.png" alt="Benchmark App Signing" style="width:800px">
</p>

## Viewing Test Results and Metrics

After running the tests, you can view the results in Xcode:

1. Open the **Test Report Navigator** by pressing `Command+9`.
2. Select the most recent test run.
3. You'll see a list of tests that ran, along with their status (passed or failed).
4. To view metrics for a specific test:
   - Double-click the test in the list.
   - Switch to the **Metrics** tab to see detailed performance data.

**Note**: The tests use `XCTMeasureOptions` to run each test multiple times (usually five) to obtain average performance metrics.
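
If you extend the suite and need a different number of iterations, XCTest's standard measurement options can be passed alongside the metrics. A minimal sketch, not taken from the app's sources:

```objective-c
// Hypothetical example of customizing the measurement options in a test method.
XCTMeasureOptions *options = [XCTMeasureOptions defaultOptions];
options.iterationCount = 10; // Run the measured block 10 times instead of the default 5.

[self measureWithMetrics:@[ [XCTClockMetric new], [XCTMemoryMetric new] ]
                 options:options
                   block:^{
  // Code to measure goes here.
}];
```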
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_test_load.png" alt="Benchmark App Test Load" style="width:800px">
</p>
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_test_forward.png" alt="Benchmark App Test Forward" style="width:800px">
</p>
<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_test_generate.png" alt="Benchmark App Test Generate" style="width:800px">
</p>

## Understanding the Test Suite

The Benchmark App uses a dynamic test generation framework to create tests based on the resources you provide.

### Dynamic Test Generation

The key components are:

- **`DynamicTestCase`**: A subclass of `XCTestCase` that allows for the dynamic creation of test methods.
- **`ResourceTestCase`**: Builds upon `DynamicTestCase` to generate tests based on resources that match specified criteria.

### How It Works

1. **Define Directories and Predicates**: Override the `directories` and `predicates` methods to specify where to look for resources and how to match them.

2. **Generate Resource Combinations**: The framework searches the specified `directories` for files matching the `predicates`, generating all possible combinations.

3. **Create Dynamic Tests**: For each combination of resources, it calls `dynamicTestsForResources`, where you define the tests to run.

4. **Test Naming**: Test names are dynamically formed using the format:

   ```
   test_<TestName>_<Resource1>_<Resource2>_..._<OS>_<Version>_<DeviceModel>
   ```

   This ensures that each test is uniquely identifiable based on the resources and device.

### Example: Generic Model Tests

Here's how you might create a test to measure model load and inference times:

```objective-c
@interface GenericTests : ResourceTestCase
@end

@implementation GenericTests

+ (NSArray<NSString *> *)directories {
  return @[ @"Resources" ];
}

+ (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
  return @{
    @"model" : ^BOOL(NSString *filename) {
      return [filename hasSuffix:@".pte"];
    },
  };
}

+ (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
  NSString *modelPath = resources[@"model"];
  return @{
    @"load" : ^(XCTestCase *testCase) {
      [testCase measureWithMetrics:@[ [XCTClockMetric new], [XCTMemoryMetric new] ]
                             block:^{
        XCTAssertEqual(Module(modelPath.UTF8String).load_forward(), Error::Ok);
      }];
    },
    @"forward" : ^(XCTestCase *testCase) {
      // Set up and measure the forward pass...
    },
  };
}

@end
```

In this example:

- We look for `.pte` files in the `Resources` directory.
- For each model found, we create two tests: `load` and `forward`.
- The tests measure the time and memory usage of loading and running the model.
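
The `forward` body above is left as a comment. As a rough sketch of one way to fill it in, assuming the C++ `Module` API used elsewhere in this example (the input construction is a hypothetical placeholder that depends on your model):

```objective-c
@"forward" : ^(XCTestCase *testCase) {
  // Load the model once up front so that only inference is timed.
  __block std::unique_ptr<Module> module = std::make_unique<Module>(modelPath.UTF8String);
  XCTAssertEqual(module->load_forward(), Error::Ok);
  // Hypothetical placeholder: build inputs matching the model's expected
  // shapes and dtypes (e.g. random tensors derived from the method metadata).
  __block std::vector<EValue> inputs;
  [testCase measureWithMetrics:@[ [XCTClockMetric new], [XCTMemoryMetric new] ]
                         block:^{
    XCTAssertEqual(module->forward(inputs).error(), Error::Ok);
  }];
},
```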

## Extending the Test Suite

You can create custom tests by subclassing `ResourceTestCase` and overriding the necessary methods.

### Steps to Create Custom Tests

1. **Subclass `ResourceTestCase`**:

   ```objective-c
   @interface MyCustomTests : ResourceTestCase
   @end
   ```

2. **Override `directories` and `predicates`**:

   Specify where to look for resources and how to match them.

   ```objective-c
   + (NSArray<NSString *> *)directories {
     return @[ @"Resources" ];
   }

   + (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
     return @{
       @"model" : ^BOOL(NSString *filename) {
         return [filename hasSuffix:@".pte"];
       },
       @"config" : ^BOOL(NSString *filename) {
         return [filename isEqualToString:@"config.json"];
       },
     };
   }
   ```

3. **Implement `dynamicTestsForResources`**:

   Define the tests to run for each combination of resources.

   ```objective-c
   + (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
     NSString *modelPath = resources[@"model"];
     NSString *configPath = resources[@"config"];
     return @{
       @"customTest" : ^(XCTestCase *testCase) {
         // Implement your test logic here.
       },
     };
   }
   ```

4. **Add the Test Class to the Test Target**:

   Ensure your new test class is included in the test target in Xcode.

### Example: LLaMA Token Generation Test

An example of a more advanced test is measuring the tokens per second during text generation with the LLaMA model.

```objective-c
@interface LLaMATests : ResourceTestCase
@end

@implementation LLaMATests

+ (NSArray<NSString *> *)directories {
  return @[ @"Resources" ];
}

+ (NSDictionary<NSString *, BOOL (^)(NSString *)> *)predicates {
  return @{
    @"model" : ^BOOL(NSString *filename) {
      return [filename hasSuffix:@".pte"] && [filename containsString:@"llama"];
    },
    @"tokenizer" : ^BOOL(NSString *filename) {
      return [filename isEqualToString:@"tokenizer.bin"];
    },
  };
}

+ (NSDictionary<NSString *, void (^)(XCTestCase *)> *)dynamicTestsForResources:(NSDictionary<NSString *, NSString *> *)resources {
  NSString *modelPath = resources[@"model"];
  NSString *tokenizerPath = resources[@"tokenizer"];
  return @{
    @"generate" : ^(XCTestCase *testCase) {
      // Implement the token generation test...
    },
  };
}

@end
```

In this test:

- We look for LLaMA model files and a `tokenizer.bin`.
- We measure tokens per second and memory usage during text generation.
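
One hedged way to fill in the `generate` body is to pair a custom tokens-per-second metric (sketched in the next section) with the standard memory metric; the generation loop itself is a hypothetical placeholder here:

```objective-c
@"generate" : ^(XCTestCase *testCase) {
  // TokensPerSecondMetric is a custom XCTMetric; see the sketch in the next section.
  TokensPerSecondMetric *tokensMetric = [TokensPerSecondMetric new];
  [testCase measureWithMetrics:@[ tokensMetric, [XCTMemoryMetric new] ]
                         block:^{
    // Hypothetical placeholder: run the text-generation loop using modelPath
    // and tokenizerPath, incrementing tokensMetric.tokenCount per emitted token.
  }];
},
```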

## Measuring Performance

The Benchmark App leverages Apple's performance-testing APIs to measure metrics such as execution time and memory usage.

- **Measurement Options**: By default, each test is run five times to calculate average metrics.
- **Available Metrics**:
  - `XCTClockMetric`: Measures wall-clock time.
  - `XCTMemoryMetric`: Measures memory usage.
- **Custom Metrics**: You can define your own metrics by implementing the `XCTMetric` protocol. For example, the LLaMA test includes a `TokensPerSecondMetric`; a sketch follows below.
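
As a rough illustration of how such a metric could be built on the public `XCTMetric` protocol (a hedged sketch, not necessarily the app's actual implementation; the `tokenCount` property and identifier are hypothetical):

```objective-c
@interface TokensPerSecondMetric : NSObject <XCTMetric>
// Incremented by the test as tokens are generated (hypothetical property).
@property (nonatomic) NSUInteger tokenCount;
@end

@implementation TokensPerSecondMetric

- (id)copyWithZone:(NSZone *)zone {
  TokensPerSecondMetric *copy = [[TokensPerSecondMetric allocWithZone:zone] init];
  copy.tokenCount = self.tokenCount;
  return copy;
}

- (NSArray<XCTPerformanceMeasurement *> *)
    reportMeasurementsFromStartTime:(XCTPerformanceMeasurementTimestamp *)startTime
                          toEndTime:(XCTPerformanceMeasurementTimestamp *)endTime
                              error:(NSError **)error {
  // Divide the tokens generated during the measured block by the elapsed wall-clock time.
  double elapsed = endTime.date.timeIntervalSinceReferenceDate -
                   startTime.date.timeIntervalSinceReferenceDate;
  return @[ [[XCTPerformanceMeasurement alloc]
      initWithIdentifier:@"com.example.tokens-per-second" // hypothetical identifier
             displayName:@"Tokens per Second"
             doubleValue:(elapsed > 0 ? (double)self.tokenCount / elapsed : 0)
              unitSymbol:@"t/s"] ];
}

@end
```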

## Running Tests from the Command Line

You can also run the tests using `xcodebuild`:

```bash
# Run on an iOS Simulator.
xcodebuild test -project extension/apple/Benchmark/Benchmark.xcodeproj \
  -scheme Benchmark \
  -destination 'platform=iOS Simulator,name=<SimulatorName>' \
  -testPlan Tests

# Run on a physical iOS device.
xcodebuild test -project extension/apple/Benchmark/Benchmark.xcodeproj \
  -scheme Benchmark \
  -destination 'platform=iOS,name=<DeviceName>' \
  -testPlan Tests \
  -allowProvisioningUpdates DEVELOPMENT_TEAM=<YourTeamID>
```

Replace `<SimulatorName>`, `<DeviceName>`, and `<YourTeamID>` with your simulator or device name and your Apple development team ID.

## macOS

The app can be built and run on macOS as well; just add it as the destination platform.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_macos.png" alt="Benchmark App macOS" style="width:700px">
</p>

Also, set up app signing in order to run the app locally.

<p align="center">
<img src="https://raw.githubusercontent.com/pytorch/executorch/refs/heads/main/docs/source/_static/img/ios_benchmark_app_macos_signing.png" alt="Benchmark App macOS Signing" style="width:800px">
</p>

## Conclusion

The ExecuTorch Benchmark App provides a flexible and powerful framework for testing and measuring the performance of PyTorch models on Apple devices. By leveraging dynamic test generation, you can easily add your models and resources to assess their performance metrics. Whether you're optimizing existing models or developing new ones, this tool can help you gain valuable insights into their runtime behavior.
