Skip to content

Switch Apple benchmark workflow to use the generic ET benchmark iOS app #5212

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

huydhn
Copy link
Contributor

@huydhn huydhn commented Sep 10, 2024

This requires @shoumikhin change in #5208, so I will rebase and test it out again after #5208 lands

Testing

https://github.com/pytorch/executorch/actions/runs/10787058020

Copy link

pytorch-bot bot commented Sep 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5212

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 31cf557 with merge base f9da675 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 10, 2024
@huydhn huydhn marked this pull request as ready for review September 10, 2024 00:43
@facebook-github-bot
Copy link
Contributor

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

CODE_SIGNING_ALLOWED=No
fi
ARTIFACTS_DIR_NAME="$1"
APP_PATH="extension/apple/Benchmark/Benchmark"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shoumikhin what tokenizer is the app built with? w/ both bpe and tiktoken?

${CONDA_RUN} --no-capture-output \
build/build_apple_llm_demo.sh ${{ matrix.tokenizer }} ${ARTIFACTS_DIR_NAME}
build/build_apple_llm_demo.sh ${ARTIFACTS_DIR_NAME}

upload-ios-apps:
needs: build-llm-demo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we rename previous step to build-benchmark-app? Also should we update ios-ipa-archive and ios-xctestrun-zip to point to the new benchmark app

Copy link
Contributor

@guangy10 guangy10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left few comments, otherwise lgtm

@facebook-github-bot
Copy link
Contributor

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@huydhn
Copy link
Contributor Author

huydhn commented Sep 11, 2024

The benchmark is running on device now, but it crashes once before being retrying successfully (weird) https://github.com/pytorch/executorch/actions/runs/10804559720/job/29970780023:

Here is the test output:

[DEVICEFARM] ########### Entering phase test ###########
 
[DeviceFarm] xcodebuild test-without-building -destination id=$DEVICEFARM_DEVICE_UDID -xctestrun $DEVICEFARM_TEST_PACKAGE_PATH/*.xctestrun -derivedDataPath $DEVICEFARM_LOG_DIR
Command line invocation:
    /Applications/Xcode_15.app/Contents/Developer/usr/bin/xcodebuild test-without-building -destination id=00008120-000424DC1122201E -xctestrun /tmp/devicefarm-workspace/execution-tvdc_j_e/test-package-ruiiiicu/Benchmark_Tests_iphoneos17.5-arm64.xctestrun -derivedDataPath /tmp/devicefarm-workspace/execution-tvdc_j_e/logs-3gvb1hak

User defaults from command line:
    IDEDerivedDataPathOverride = /tmp/devicefarm-workspace/execution-tvdc_j_e/logs-3gvb1hak
    IDEPackageSupportUseBuiltinSCM = YES

2024-09-10 21:07:33.547 xcodebuild[1200:8254]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError
Test Suite 'All tests' started at 2024-09-10 21:07:34.514.
Test Suite 'Tests.xctest' started at 2024-09-10 21:07:34.514.
Test Suite 'Tests' started at 2024-09-10 21:07:34.514.
Test Case '-[Tests test_forward_aatp_data_llama2]' started.
2024-09-10 21:07:35.608 xcodebuild[1200:8342]  DVTDevice: Error locating DeviceSupport directory using Optional("arm64e") or Optional("arm64e"): nilError

Restarting after unexpected exit, crash, or test timeout in -[Tests test_forward_aatp_data_llama2]; summary will include totals from previous launches.

Test Suite 'Selected tests' started at 2024-09-10 21:07:36.076.
Test Suite 'Tests.xctest' started at 2024-09-10 21:07:36.077.
Test Suite 'Tests' started at 2024-09-10 21:07:36.077.
Test Case '-[Tests test_load_aatp_data_llama2]' started.
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/apple/Benchmark/Tests/Tests.mm:58: Test Case '-[Tests test_load_aatp_data_llama2]' measured [Memory Peak Physical, kB] average: 122225.936, relative standard deviation: 0.000%, values: [122225.936000, 122225.936000, 122225.936000, 122225.936000, 122225.936000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical_peak, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/apple/Benchmark/Tests/Tests.mm:58: Test Case '-[Tests test_load_aatp_data_llama2]' measured [Memory Physical, kB] average: 0.000, relative standard deviation: 0.000%, values: [0.000000, 0.000000, 0.000000, 0.000000, 0.000000], performanceMetricID:com.apple.dt.XCTMetric_Memory.physical, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
/Users/runner/work/executorch/executorch/pytorch/executorch/extension/apple/Benchmark/Tests/Tests.mm:58: Test Case '-[Tests test_load_aatp_data_llama2]' measured [Clock Monotonic Time, s] average: 0.000, relative standard deviation: 26.339%, values: [0.000001, 0.000001, 0.000001, 0.000001, 0.000001], performanceMetricID:com.apple.dt.XCTMetric_Clock.time.monotonic, baselineName: "", baselineAverage: , polarity: prefers smaller, maxPercentRegression: 10.000%, maxPercentRelativeStandardDeviation: 10.000%, maxRegression: 0.000, maxStandardDeviation: 0.000
Test Case '-[Tests test_load_aatp_data_llama2]' passed (0.152 seconds).
Test Suite 'Tests' failed at 2024-09-10 21:07:36.229.
	 Executed 2 tests, with 1 failure (0 unexpected) in 0.152 (0.152) seconds
Test Suite 'Tests.xctest' failed at 2024-09-10 21:07:36.229.
	 Executed 2 tests, with 1 failure (0 unexpected) in 0.152 (0.152) seconds
Test Suite 'Selected tests' failed at 2024-09-10 21:07:36.229.
	 Executed 2 tests, with 1 failure (0 unexpected) in 0.152 (0.153) seconds
2024-09-10 21:07:42.661 xcodebuild[1200:8040] [MT] IDETestOperationsObserverDebug: 17.357 elapsed -- Testing started completed.
2024-09-10 21:07:42.661 xcodebuild[1200:8040] [MT] IDETestOperationsObserverDebug: 0.000 sec, +0.000 sec -- start
2024-09-10 21:07:42.661 xcodebuild[1200:8040] [MT] IDETestOperationsObserverDebug: 17.357 sec, +17.357 sec -- end

Test session results, code coverage, and logs:
	/tmp/devicefarm-workspace/execution-tvdc_j_e/logs-3gvb1hak/Logs/Test/Test-Benchmark-2024.09.10_21-07-25--0700.xcresult

Failing tests:
	-[Tests test_forward_aatp_data_llama2]

** TEST EXECUTE FAILED **

@shoumikhin has some leads on the cause of the crash:

et_view.cpp:88] In function et_view(), assert failed: self.numel() == out.numel()

So, I guess we need to figure this out and fix the issue before this can be landed.

@guangy10
Copy link
Contributor

guangy10 commented Sep 11, 2024

Is this PR ready to test other models w/ different delegates. If the crash is llama specific, we can verify the app with nongenai models in parallel.

Scheduled one: https://github.com/pytorch/executorch/actions/runs/10817524101

@facebook-github-bot
Copy link
Contributor

@huydhn has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@huydhn
Copy link
Contributor Author

huydhn commented Sep 11, 2024

D62511672 works https://github.com/pytorch/executorch/actions/runs/10818420721/job/30015121518 :)

facebook-github-bot pushed a commit that referenced this pull request Sep 11, 2024
…pp (#5212)

Summary:
This requires shoumikhin change in #5208, so I will rebase and test it out again after #5208 lands

### Testing

https://github.com/pytorch/executorch/actions/runs/10787058020


Reviewed By: shoumikhin, guangy10

Differential Revision: D62415898

Pulled By: huydhn
@facebook-github-bot facebook-github-bot force-pushed the switch-to-generic-apple-benchmark-app branch from ecf2ede to ef50595 Compare September 11, 2024 21:49
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D62415898

…pp (#5212)

Summary:
This requires shoumikhin change in #5208, so I will rebase and test it out again after #5208 lands

### Testing

https://github.com/pytorch/executorch/actions/runs/10787058020


Reviewed By: shoumikhin, guangy10

Differential Revision: D62415898

Pulled By: huydhn
@facebook-github-bot facebook-github-bot force-pushed the switch-to-generic-apple-benchmark-app branch from ef50595 to 31cf557 Compare September 11, 2024 23:46
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D62415898

@facebook-github-bot
Copy link
Contributor

@huydhn merged this pull request in a4be79f.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants