Refactor android demo job #4288

guangy10
Contributor

We want to reuse the same demo app to benchmark as many models as possible. It may not be easy to create a fully generic app for all types of models, but we can reuse our existing demo apps and swap in different models that perform the same task, e.g. our llama demo should be able to benchmark different causal LLMs without problems. To do this, we need to organize the build vertically by demo app. We currently have two demo apps for android (the ios demo app would follow the same rule), and this PR addresses the llama demo. The android job 'build-llm-demo' will build different flavors of the same app by android-abi and tokenizer library. Downstream, an app built for arm with the bpe tokenizer could be used to benchmark all LLMs that use the bpe tokenizer on a physical android device.
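A minimal sketch of the flavor matrix described above, assuming one build per (android-abi, tokenizer) pair. The ABI names, the `tiktoken` entry, and the `llm-demo-<abi>-<tokenizer>` naming scheme are illustrative assumptions rather than values taken from this PR; only `bpe` is named in the description.

```shell
set -eu

# Hypothetical flavor axes; the real CI matrix may list different values.
ABIS="arm64-v8a x86_64"
TOKENIZERS="bpe tiktoken"

# Print one flavor name per (abi, tokenizer) combination.
list_flavors() {
  for abi in $ABIS; do
    for tokenizer in $TOKENIZERS; do
      echo "llm-demo-${abi}-${tokenizer}"
    done
  done
}

list_flavors
```

Each printed name stands for one app build that a downstream benchmarking job could pick up for every model sharing that tokenizer.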

pytorch-bot bot commented Jul 17, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4288

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d67bd34 with merge base 037cfcf:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jul 17, 2024
@guangy10 guangy10 force-pushed the refactor_android_demo_job branch 4 times, most recently from d71ad1b to 389732d Compare July 17, 2024 04:38
@guangy10 guangy10 requested review from huydhn and kirklandsign July 17, 2024 16:27
@guangy10 guangy10 marked this pull request as ready for review July 17, 2024 16:27
@guangy10 guangy10 force-pushed the refactor_android_demo_job branch from 389732d to 096ec07 Compare July 17, 2024 19:07
@facebook-github-bot
Contributor

@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@guangy10 guangy10 force-pushed the refactor_android_demo_job branch from 096ec07 to 5820161 Compare July 18, 2024 00:13
# between Java and JNI
find jni -type f -name "libexecutorch_jni.so" -exec bash -c 'mv "$1" "${1/_jni/}"' bash {} \;
# Zip all necessary files into the AAR file
zip -r executorch.aar libs jni/*/libexecutorch.so AndroidManifest.xml
Contributor Author

@kirklandsign Is this for llama demo or for the other demo app?

Contributor

This one is for the other demo app (image processing)

Contributor Author

Ok, I will split it out. Are there other things for the vision model demo?

Contributor

We need to publish two aars (llama and other models) for now, because the APIs are very dissimilar. However, the underlying ops and backends are the same. So in the future, it might be worth using one aar that has both functionalities, but the size will be larger.

Contributor

> Ok, I will split it out. Are there other things for the vision model demo?

Nothing else. Just a different .so for each of these two .aar files.

Contributor Author

Size should not be a concern since we are not deploying it to production. Test efficiency is our main consideration. So what's your suggestion? Should we keep the aars for all demos together? Imagine that in the future we have more demo apps with different user interfaces.

Contributor

I see. For now let's just keep two separate aars as we do today, because I don't feel that it blocks our test efficiency, and having two .so files with duplicated backends and kernels really wastes space. So let's land this PR for now. I will then try building a single .so file for llama and non-llama. I will probably need to do some refactoring, because right now we have two entry points for loading the .so files from Java, one for llama and one for non-llama.
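The rename step in the snippet that opens this thread relies on bash pattern substitution (`${1/_jni/}`), which deletes the first occurrence of `_jni` from the path passed as `$1`. A minimal sketch on a sample path, where the `arm64-v8a` directory name is only an illustrative assumption:

```shell
# Example path; the ABI directory name is illustrative, not taken from the PR.
path="jni/arm64-v8a/libexecutorch_jni.so"

# Same idiom as the find -exec call: hand the path to an inline bash script
# as $1 and strip the first "_jni" with pattern substitution.
renamed="$(bash -c 'printf "%s\n" "${1/_jni/}"' bash "$path")"

echo "$path -> $renamed"
```

The leading `jni/` directory is left alone because the pattern includes the underscore, so only the library file name changes.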

Comment on lines +47 to +48
# Build LLM Demo for Android
bash build/build_android_llm_demo.sh ${{ matrix.tokenizer }} ${ARTIFACTS_DIR_NAME}
Contributor Author

I'm consolidating all of this logic into the reusable build script so that we can reuse it for the on-device benchmarking jobs. As mentioned in another thread, this android job should focus only on validating the build and basic functionality; it's not suitable for perf measurement because it's on the critical path.
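A hedged sketch of what the entry point of such a reusable script could look like, assuming it takes the same two positional arguments as the call above (tokenizer and artifacts directory name). The `parse_args` helper and the sample values are hypothetical; this is not the content of `build/build_android_llm_demo.sh`.

```shell
set -eu

# Hypothetical argument handling for a reusable build script.
# Expects exactly two positional arguments: <tokenizer> <artifacts_dir_name>.
parse_args() {
  if [ "$#" -ne 2 ]; then
    echo "usage: build_script <tokenizer> <artifacts_dir_name>" >&2
    return 1
  fi
  TOKENIZER="$1"
  ARTIFACTS_DIR_NAME="$2"
}

# Sample invocation with illustrative values.
parse_args bpe artifacts
echo "Would build the LLM demo with tokenizer=${TOKENIZER}, output in ${ARTIFACTS_DIR_NAME}/"
```

Keeping the interface this small is what would let both the validation job and a later benchmarking job call the same script with different arguments.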

@guangy10 guangy10 force-pushed the refactor_android_demo_job branch from 5820161 to fc617a8 Compare July 18, 2024 00:53
@guangy10 guangy10 force-pushed the refactor_android_demo_job branch from fc617a8 to 3a56ca9 Compare July 18, 2024 04:10
@guangy10 guangy10 force-pushed the refactor_android_demo_job branch from 3a56ca9 to d67bd34 Compare July 18, 2024 16:32
Contributor

@huydhn huydhn left a comment

LGTM! Btw, #4287 removes the upload to S3 part. I think we should put it back as it has nothing to do with the test on device part. We need it to expose all the Android artifacts here on S3.

@guangy10
Contributor Author

> LGTM! Btw, #4287 removes the upload to S3 part. I think we should put it back as it has nothing to do with the test on device part. We need it to expose all the Android artifacts here on S3.

@huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running it on-device in this job, why do we want to waste resources uploading and storing it? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective.

There will be a new periodic job that runs the benchmarking end to end, from build, to upload, to on-device deployment. For the build part, it will reuse the same build script refactored in this PR. The job setup for the build will be lightweight because we have already consolidated all the logic in the script.

@huydhn
Contributor

huydhn commented Jul 18, 2024

> @huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running it on-device in this job, why do we want to waste resources uploading and storing it? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective.
>
> There will be a new periodic job that runs the benchmarking end to end, from build, to upload, to on-device deployment. For the build part, it will reuse the same build script refactored in this PR. The job setup for the build will be lightweight because we have already consolidated all the logic in the script.

Yup, there are more artifacts there than the app and the test suite used by the on-device test job. IIRC, executorch.aar, executorch-llama.aar, and those .so libraries are consumed elsewhere (TorchChat). Also, these files are just kept on S3 like regular artifacts, and it's cheap to keep them there.

@guangy10
Contributor Author

> > @huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running it on-device in this job, why do we want to waste resources uploading and storing it? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective.
> > There will be a new periodic job that runs the benchmarking end to end, from build, to upload, to on-device deployment. For the build part, it will reuse the same build script refactored in this PR. The job setup for the build will be lightweight because we have already consolidated all the logic in the script.
>
> Yup, there are more artifacts there than the app and the test suite used by the on-device test job. IIRC, executorch.aar, executorch-llama.aar, and those .so libraries are consumed elsewhere (TorchChat). Also, these files are just kept on S3 like regular artifacts, and it's cheap to keep them there.

I see. I can add that step back if TorchChat needs those artifacts and it's not expensive.

@kirklandsign
Contributor

> > > @huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running it on-device in this job, why do we want to waste resources uploading and storing it? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective.
> > > There will be a new periodic job that runs the benchmarking end to end, from build, to upload, to on-device deployment. For the build part, it will reuse the same build script refactored in this PR. The job setup for the build will be lightweight because we have already consolidated all the logic in the script.
> >
> > Yup, there are more artifacts there than the app and the test suite used by the on-device test job. IIRC, executorch.aar, executorch-llama.aar, and those .so libraries are consumed elsewhere (TorchChat). Also, these files are just kept on S3 like regular artifacts, and it's cheap to keep them there.
>
> I see. I can add that step back if TorchChat needs those artifacts and it's not expensive.

Yes. When we upload artifacts, they are kept for a month and then removed, so it's not expensive. We need to use these artifacts and upload them to persistent storage for release candidates.

@facebook-github-bot
Contributor

@guangy10 merged this pull request in 282e7fe.

kirklandsign added a commit to kirklandsign/executorch that referenced this pull request Jul 18, 2024
kirklandsign added a commit to kirklandsign/executorch that referenced this pull request Jul 18, 2024
facebook-github-bot pushed a commit that referenced this pull request Jul 18, 2024
Summary:
Per the comment in #4288, add the upload step back so that TorchChat can consume the artifacts from S3.

Pull Request resolved: #4300

Reviewed By: huydhn, kirklandsign

Differential Revision: D59925505

Pulled By: guangy10

fbshipit-source-id: ce389fb16adb30d51240fdff655111580f07130b
@guangy10 guangy10 deleted the refactor_android_demo_job branch August 21, 2024 21:04