-
Notifications
You must be signed in to change notification settings - Fork 607
Add DeviceInfo in iOS benchmark run #5410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5410
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 93bdc9b with merge base fdc7e45 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
1 similar comment
@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of relying on parsing the testname/filename or carry over the info from input to the final outcome, I think we can start creating the benchmark_results.json
once the job starts and start populating the file with known info. The creation of the file is independent from the outcome of the run, it will populate the metrics field only if the run finished successfully otherwise it's left blank, and the status is populated with errors. We can refactor it after the PTC
Summary: Given the way iOS benchmark app measures model load time, inference time, and memory usage using `measureWithMetrics` with `XCTClockMetric` and `XCTMemoryMetric`. I think the easiest way to gather the benchmark metric is to do it after the test finishes and parse the output. In the same spirit as #5332, this PR adds more information about the device so that it can be parsed later. I add the information into the test name, shoumikhin plz let me know if you know a better way to pass this information around. The output looks like this with the information in the test case name, i.e. `test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1` ``` Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started. /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 125171.731, relative standard deviation: 0.010%, values: [125158.624000, 125175.008000, 125175.008000, 125158.624000, 125191.392000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: -29.491, relative standard deviation: -228.792%, values: [-49.152000, -16.384000, 16.384000, 49.152000, -147.456000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.160, relative standard deviation: 3.460%, values: [0.163377, 0.165837, 0.164974, 0.152334, 0.154970], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (1.322 seconds). Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started. /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 127403.280, relative standard deviation: 0.000%, values: [127403.280000, 127403.280000, 127403.280000, 127403.280000, 127403.280000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: 0.000, relative standard deviation: 0.000%, values: [0.000000, 0.000000, 0.000000, 0.000000, 0.000000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.000, relative standard deviation: 41.029%, values: [0.000001, 0.000001, 0.000001, 0.000001, 0.000001], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (0.132 seconds). ``` Reviewed By: guangy10 Differential Revision: D62902327 Pulled By: huydhn
49fbcc7
to
c9ebb2b
Compare
This pull request was exported from Phabricator. Differential Revision: D62902327 |
Summary: Given the way iOS benchmark app measures model load time, inference time, and memory usage using `measureWithMetrics` with `XCTClockMetric` and `XCTMemoryMetric`. I think the easiest way to gather the benchmark metric is to do it after the test finishes and parse the output. In the same spirit as #5332, this PR adds more information about the device so that it can be parsed later. I add the information into the test name, shoumikhin plz let me know if you know a better way to pass this information around. The output looks like this with the information in the test case name, i.e. `test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1` ``` Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started. /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 125171.731, relative standard deviation: 0.010%, values: [125158.624000, 125175.008000, 125175.008000, 125158.624000, 125191.392000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: -29.491, relative standard deviation: -228.792%, values: [-49.152000, -16.384000, 16.384000, 49.152000, -147.456000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:134: Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.160, relative standard deviation: 3.460%, values: [0.163377, 0.165837, 0.164974, 0.152334, 0.154970], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 Test Case '-[Tests test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (1.322 seconds). Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' started. /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Peak Physical, kB] average: 127403.280, relative standard deviation: 0.000%, values: [127403.280000, 127403.280000, 127403.280000, 127403.280000, 127403.280000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Memory Physical, kB] average: 0.000, relative standard deviation: 0.000%, values: [0.000000, 0.000000, 0.000000, 0.000000, 0.000000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 /Users/huydo/Storage/mine/executorch/extension/apple/Benchmark/Tests/Tests.mm:85: Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' measured [Clock Monotonic Time, s] average: 0.000, relative standard deviation: 41.029%, values: [0.000001, 0.000001, 0.000001, 0.000001, 0.000001], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000 Test Case '-[Tests test_load_models_llama2_iPhone_iPhone14,2_iOS_17.6.1]' passed (0.132 seconds). ``` Reviewed By: guangy10 Differential Revision: D62902327 Pulled By: huydhn
c9ebb2b
to
93bdc9b
Compare
This pull request was exported from Phabricator. Differential Revision: D62902327 |
Given the way iOS benchmark app measures model load time, inference time, and memory usage using
measureWithMetrics
withXCTClockMetric
andXCTMemoryMetric
. I think the easiest way to gather the benchmark metric is to do it after the test finishes and parse the output.In the same spirit as #5332, this PR adds more information about the device so that it can be parsed later. I add the information into the test name, @shoumikhin plz let me know if you know a better way to pass this information around.
The output looks like this with the information in the test case name, i.e.
test_forward_models_llama2_iPhone_iPhone14,2_iOS_17.6.1