Refactor android demo job #4288
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4288
Note: Links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit d67bd34 with merge base 037cfcf.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
# between Java and JNI
find jni -type f -name "libexecutorch_jni.so" -exec bash -c 'mv "$1" "${1/_jni/}"' bash {} \;
# Zip all necessary files into the AAR file
zip -r executorch.aar libs jni/*/libexecutorch.so AndroidManifest.xml
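For context on the rename above: bash's `${var/pattern/replacement}` expansion drops the `_jni` infix from each matched path. A minimal standalone sketch (the example path is illustrative, not the job's actual layout):

```bash
# Illustration of the ${1/_jni/} expansion used by the find/mv line above.
# The path below is a made-up example of the per-ABI jni/ layout.
path="jni/arm64-v8a/libexecutorch_jni.so"
echo "${path/_jni/}"   # prints: jni/arm64-v8a/libexecutorch.so
```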
@kirklandsign Is this for the llama demo or for the other demo app?
This one is for the other demo app (image processing).
Ok, I will split it out. Are there other things for the vision model demo?
We need to publish two AARs (one for llama and one for the other demo app) for now, because the APIs are very different. However, the underlying ops and backends are the same, so in the future it might be worth using a single AAR with both sets of functionality, though the size will be larger.
> Ok, I will split it out. Are there other things for the vision model demo?
Nothing else. Just a different .so for each of the two .aar files.
Size should not be a concern since we are not deploying it to production; test efficiency is our main consideration. So what's your suggestion? Should we keep the AARs for all demos together? Imagine that in the future we have more demo apps with different user interfaces.
I see. For now let's keep the two separate AARs as they are, because I don't feel it blocks our test efficiency, and having two .so files with duplicated backends and kernels really wastes space. So let's land this PR for now. I will then try building a single .so file for llama and non-llama. I'll probably need to do some refactoring, because right now we have two entry points for loading the .so files from Java, one for llama and one for non-llama.
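Purely for illustration, the combined artifact could be packaged the same way as the snippet above, just with one merged library so backends and kernels exist only once; the archive name here is hypothetical until that refactor happens:

```bash
# Hypothetical packaging once llama and non-llama share a single .so;
# the AAR name is made up for this sketch.
zip -r executorch-combined.aar libs jni/*/libexecutorch.so AndroidManifest.xml
```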
# Build LLM Demo for Android
bash build/build_android_llm_demo.sh ${{ matrix.tokenizer }} ${ARTIFACTS_DIR_NAME}
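The script body isn't shown in this thread; the sketch below only illustrates the interface implied by the invocation above (two positional arguments), and everything inside it is an assumption:

```bash
#!/usr/bin/env bash
# Sketch of build/build_android_llm_demo.sh's implied interface -- the real
# script's internals are not part of this excerpt.
set -euo pipefail

TOKENIZER="$1"            # from the CI matrix, e.g. a BPE tokenizer flavor
ARTIFACTS_DIR_NAME="$2"   # directory where built artifacts are collected

mkdir -p "${ARTIFACTS_DIR_NAME}"
# ... cmake/gradle steps building the demo app and AAR would go here ...
# cp <built .aar/.apk files> "${ARTIFACTS_DIR_NAME}/"
```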
I'm consolidating all of this logic into the reusable build script so that we can reuse it for the on-device benchmarking jobs. As mentioned in another thread, this Android job should focus only on validating the build and basic functionality; it's not suitable for perf measurement because it's on the critical path.
LGTM! BTW, #4287 removes the upload-to-S3 part. I think we should put it back, as it has nothing to do with the on-device test part; we need it to expose all the Android artifacts on S3.
@huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running anything on-device in this job, why would we want to waste resources uploading and storing the artifacts? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective. There will be a new periodic job that runs the benchmarking end-to-end: build, upload, and on-device deployment. For the build part, it will reuse the same build script refactored in this PR, and the job setup for the build will be lightweight because we have already consolidated all the logic in the script.
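For what it's worth, a minimal sketch of the kind of upload step being discussed; the bucket and key prefix are assumptions, not the repo's actual configuration:

```bash
# Hypothetical S3 upload of the build outputs using the AWS CLI; the bucket
# name and prefix are placeholders. GITHUB_* vars come from GitHub Actions.
aws s3 cp "${ARTIFACTS_DIR_NAME}" \
  "s3://example-artifacts-bucket/${GITHUB_REPOSITORY}/${GITHUB_RUN_ID}/" \
  --recursive
```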
Yup, there are more artifacts there than just the app and the test suite used by the on-device test job. IIRC, …
I see. I can add that step back if TorchChat needs those artifacts and it's not expensive.
Yes. When we upload artifacts, they are kept for a month and then removed, so it's not expensive. We need to use these artifacts and upload them to persistent storage for release candidates.
This reverts commit 282e7fe.
Summary: Per the comment in #4288, add the uploading step back so that TorchChat can consume the artifacts from S3.
Pull Request resolved: #4300
Reviewed By: huydhn, kirklandsign
Differential Revision: D59925505
Pulled By: guangy10
fbshipit-source-id: ce389fb16adb30d51240fdff655111580f07130b
We want to reuse the same demo app to benchmark as many models as possible. It may not be easy to create a super-generic app for all types of models, but we can reuse our existing demo apps and swap in different models that perform the same task; e.g., our llama demo should be able to benchmark different causal LLMs without problems. To do this, we need to organize the build vertically by demo app. Currently we have two demo apps for Android (the iOS demo apps would follow the same rule), and this PR addresses the llama demo. The Android job 'build-llm-demo' builds different flavors of the same app, keyed by Android ABI and tokenizer library. Downstream, an app built for arm with the BPE tokenizer could be used to benchmark all LLMs that use a BPE tokenizer on a physical Android device.
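As a rough illustration of the flavor fan-out described above (the ABI and tokenizer values, and the ANDROID_ABI variable, are assumptions for this sketch, not the job's exact matrix):

```bash
# Hypothetical local fan-out over the matrix dimensions mentioned above;
# ANDROID_ABI and the value lists are illustrative only.
for abi in arm64-v8a x86_64; do
  for tokenizer in bpe tiktoken; do
    ANDROID_ABI="$abi" bash build/build_android_llm_demo.sh \
      "$tokenizer" "artifacts-${abi}-${tokenizer}"
  done
done
```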