Add DeviceInfo in iOS benchmark run #5410

Closed

@huydhn commented Sep 17, 2024

Given that the iOS benchmark app measures model load time, inference time, and memory usage using `measureWithMetrics` with `XCTClockMetric` and `XCTMemoryMetric`, I think the easiest way to gather the benchmark metrics is to wait until the test finishes and parse its output.

In the same spirit as #5332, this PR adds more information about the device so that it can be parsed later. I add the information to the test name. @shoumikhin, please let me know if you know a better way to pass this information around.

The output looks like this, with the device information in the test case name, e.g. `test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1`:

Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started.
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 125171.731, relative standard deviation: 0.010%, values: [125158.624000, 125175.008000, 125175.008000, 125158.624000, 125191.392000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: -29.491, relative standard deviation: -228.792%, values: [-49.152000, -16.384000, 16.384000, 49.152000, -147.456000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.160, relative standard deviation: 3.460%, values: [0.163377, 0.165837, 0.164974, 0.152334, 0.154970], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (1.322 seconds).
Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started.
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 127403.280, relative standard deviation: 0.000%, values: [127403.280000, 127403.280000, 127403.280000, 127403.280000, 127403.280000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: 0.000, relative standard deviation: 0.000%, values: [0.000000, 0.000000, 0.000000, 0.000000, 0.000000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.000, relative standard deviation: 41.029%, values: [0.000001, 0.000001, 0.000001, 0.000001, 0.000001], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (0.132 seconds).
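As a rough illustration of the "parse the output afterwards" step, here is a minimal Python sketch that extracts the metrics and the device information from log lines like the ones above. It is not part of this PR: the regular expressions, the function names, and the field names (`method`, `model`, `device`, `machine`, `os_version`) are assumptions inferred only from this sample output, and the name pattern assumes that the device and machine identifiers contain no underscores.

```python
import re
from typing import Dict, List, Optional

# Matches one "measured" line emitted by XCTest, as seen in the sample output.
MEASURED_RE = re.compile(
    r"Test Case '-\[(?P<suite>\w+) (?P<test>[^\]]+)\]' measured "
    r"\[(?P<metric>[^,\]]+), (?P<unit>[^\]]+)\] "
    r"average: (?P<average>-?[\d.]+), .*?"
    r"values: \[(?P<values>[^\]]*)\]"
)

# Assumed layout (inferred from the sample, not confirmed by the PR):
#   test_<method>_<model>_<device>_<machine>_iOS_<os_version>
NAME_RE = re.compile(
    r"^test_(?P<method>load|forward)_(?P<model>.+)"
    r"_(?P<device>[^_]+)_(?P<machine>[^_]+)_iOS_(?P<os_version>[\d.]+)$"
)


def parse_test_name(name: str) -> Optional[Dict[str, str]]:
    """Split a test case name into its assumed components, or return None."""
    match = NAME_RE.match(name)
    return match.groupdict() if match else None


def parse_xctest_output(text: str) -> List[Dict[str, object]]:
    """Return one record per 'measured' line found in the log text."""
    records: List[Dict[str, object]] = []
    for match in MEASURED_RE.finditer(text):
        values = [float(v) for v in match.group("values").split(",") if v.strip()]
        records.append(
            {
                "test": match.group("test"),
                "metric": match.group("metric"),
                "unit": match.group("unit"),
                "average": float(match.group("average")),
                "values": values,
                "device_info": parse_test_name(match.group("test")),
            }
        )
    return records
```

Calling `parse_xctest_output` on the captured log text would yield one dictionary per metric line; lines that do not contain `measured` (the `started.` and `passed` lines) are simply skipped.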

pytorch-bot bot commented Sep 17, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5410

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 93bdc9b with merge base fdc7e45:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label Sep 17, 2024
@huydhn changed the title from "Export benchmark metrics ios" to "Add DeviceInfo in iOS benchmark run" Sep 17, 2024
@huydhn marked this pull request as ready for review September 17, 2024 21:53
@facebook-github-bot

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 left a comment

Instead of relying on parsing the test name or file name, or carrying the info over from the input to the final outcome, I think we can create `benchmark_results.json` as soon as the job starts and begin populating it with the info we already know. Creating the file is independent of the outcome of the run: the metrics field is populated only if the run finishes successfully; otherwise it is left blank and the status is populated with the errors. We can refactor this after the PTC.
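To make the suggestion concrete, here is a minimal Python sketch of that flow. It is only an illustration under stated assumptions: the JSON layout (`status`, `error`, `metrics`), the status strings, and both function names are hypothetical and do not come from this PR or from any existing ExecuTorch schema.

```python
import json
from pathlib import Path
from typing import Dict, Optional


def create_results_file(path: str, known_info: Dict[str, str]) -> Dict[str, object]:
    """Create the results file at job start with everything known up front."""
    # Hypothetical schema: metrics start empty, status starts as "pending".
    record: Dict[str, object] = {
        **known_info,
        "status": "pending",
        "error": None,
        "metrics": {},
    }
    Path(path).write_text(json.dumps(record, indent=2))
    return record


def finalize_results_file(
    path: str,
    metrics: Optional[Dict[str, float]] = None,
    error: Optional[str] = None,
) -> Dict[str, object]:
    """Fill in metrics on success; otherwise leave them blank and record the error."""
    record = json.loads(Path(path).read_text())
    if error is None and metrics is not None:
        record["status"] = "success"
        record["metrics"] = metrics
    else:
        record["status"] = "failed"
        record["error"] = error or "run finished without metrics"
    Path(path).write_text(json.dumps(record, indent=2))
    return record
```

With this shape, the device and model details would be written once at the start via `create_results_file`, so nothing has to be recovered from the test name afterwards, and a failed run still leaves a file behind that says what went wrong.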

facebook-github-bot pushed a commit that referenced this pull request Sep 18, 2024
Reviewed By: guangy10

Differential Revision: D62902327

Pulled By: huydhn
@facebook-github-bot force-pushed the export-benchmark-metrics-ios branch from 49fbcc7 to c9ebb2b September 18, 2024 18:26
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D62902327

@facebook-github-bot force-pushed the export-benchmark-metrics-ios branch from c9ebb2b to 93bdc9b September 18, 2024 18:26
@facebook-github-bot

@huydhn merged this pull request in 0a9bbaa.

Labels
CLA Signed, fb-exported, Merged