Refactor android demo job #4288
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/4288
Note: Links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit d67bd34 with merge base 037cfcf.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@guangy10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
# between Java and JNI
find jni -type f -name "libexecutorch_jni.so" -exec bash -c 'mv "$1" "${1/_jni/}"' bash {} \;
# Zip all necessary files into the AAR file
zip -r executorch.aar libs jni/*/libexecutorch.so AndroidManifest.xml
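For context on the rename above: bash's `${var/pattern/replacement}` expansion drops the `_jni` infix from each matched path. A minimal standalone sketch (the example path is illustrative, not the job's actual layout):

```bash
# Illustration of the ${1/_jni/} expansion used by the find/mv line above.
# The path below is a made-up example of the per-ABI jni/ layout.
path="jni/arm64-v8a/libexecutorch_jni.so"
echo "${path/_jni/}"   # prints: jni/arm64-v8a/libexecutorch.so
```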
@kirklandsign Is this for the llama demo or for the other demo app?
This one is for the other demo app (image processing).
Ok, I will split it out. Are there other things for the vision model demo?
We need to publish two AARs (one for llama and one for the other demo app) for now, because the APIs are very different. However, the underlying ops and backends are the same, so in the future it might be worth using a single AAR with both sets of functionality, though the size will be larger.
> Ok, I will split it out. Are there other things for the vision model demo?
Nothing else. Just a different .so for each of the two .aar files.
Size should not be a concern since we are not deploying it to production; test efficiency is our main consideration. So what's your suggestion? Should we keep the AARs for all demos together? Imagine that in the future we have more demo apps with different user interfaces.
I see. For now let's keep the two separate AARs as they are, because I don't feel it blocks our test efficiency, and having two .so files with duplicated backends and kernels really wastes space. So let's land this PR for now. I will then try building a single .so file for llama and non-llama. I'll probably need to do some refactoring, because right now we have two entry points for loading the .so files from Java, one for llama and one for non-llama.
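Purely for illustration, the combined artifact could be packaged the same way as the snippet above, just with one merged library so backends and kernels exist only once; the archive name here is hypothetical until that refactor happens:

```bash
# Hypothetical packaging once llama and non-llama share a single .so;
# the AAR name is made up for this sketch.
zip -r executorch-combined.aar libs jni/*/libexecutorch.so AndroidManifest.xml
```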
# Build LLM Demo for Android
bash build/build_android_llm_demo.sh ${{ matrix.tokenizer }} ${ARTIFACTS_DIR_NAME}
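The script body isn't shown in this thread; the sketch below only illustrates the interface implied by the invocation above (two positional arguments), and everything inside it is an assumption:

```bash
#!/usr/bin/env bash
# Sketch of build/build_android_llm_demo.sh's implied interface -- the real
# script's internals are not part of this excerpt.
set -euo pipefail

TOKENIZER="$1"            # from the CI matrix, e.g. a BPE tokenizer flavor
ARTIFACTS_DIR_NAME="$2"   # directory where built artifacts are collected

mkdir -p "${ARTIFACTS_DIR_NAME}"
# ... cmake/gradle steps building the demo app and AAR would go here ...
# cp <built .aar/.apk files> "${ARTIFACTS_DIR_NAME}/"
```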
I'm consolidating all of this logic into the reusable build script so that we can reuse it for the on-device benchmarking jobs. As mentioned in another thread, this Android job should focus only on validating the build and basic functionality; it's not suitable for perf measurement because it's on the critical path.
LGTM! BTW, #4287 removes the upload-to-S3 part. I think we should put it back, as it has nothing to do with the on-device test part; we need it to expose all the Android artifacts on S3.
@huydhn You mean the artifact-uploading part, right? I think it's closer to the on-device part. Since we are not running anything on-device in this job, why would we want to waste resources uploading and storing the artifacts? I'm thinking about the cost of leasing AWS S3 and the device farm, and I want to be cost-effective. There will be a new periodic job that runs the benchmarking end-to-end: build, upload, and on-device deployment. For the build part, it will reuse the same build script refactored in this PR, and the job setup for the build will be lightweight because we have already consolidated all the logic in the script.
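For what it's worth, a minimal sketch of the kind of upload step being discussed; the bucket and key prefix are assumptions, not the repo's actual configuration:

```bash
# Hypothetical S3 upload of the build outputs using the AWS CLI; the bucket
# name and prefix are placeholders. GITHUB_* vars come from GitHub Actions.
aws s3 cp "${ARTIFACTS_DIR_NAME}" \
  "s3://example-artifacts-bucket/${GITHUB_REPOSITORY}/${GITHUB_RUN_ID}/" \
  --recursive
```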
Yup, there are more artifacts there than just the app and the test suite used by the on-device test job. IIRC, …
I see. I can add that step back if TorchChat needs those artifacts and it's not expensive.
Yes. When we upload artifacts, they are kept for a month and then removed, so it's not expensive. We need to use these artifacts and upload them to persistent storage for release candidates.
This reverts commit 282e7fe.
Summary: Per the comment in #4288, add the uploading step back so that TorchChat can consume the artifacts from S3.
Pull Request resolved: #4300
Reviewed By: huydhn, kirklandsign
Differential Revision: D59925505
Pulled By: guangy10
fbshipit-source-id: ce389fb16adb30d51240fdff655111580f07130b
We want to reuse the same demo app to benchmark as many models as possible. It may not be easy to create a super-generic app for all types of models, but we can reuse our existing demo apps and swap in different models that perform the same task; e.g., our llama demo should be able to benchmark different causal LLMs without problems. To do this, we need to organize the build vertically by demo app. Currently we have two demo apps for Android (the iOS demo apps would follow the same rule), and this PR addresses the llama demo. The Android job 'build-llm-demo' builds different flavors of the same app, keyed by Android ABI and tokenizer library. Downstream, an app built for arm with the BPE tokenizer could be used to benchmark all LLMs that use a BPE tokenizer on a physical Android device.
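As a rough illustration of the flavor fan-out described above (the ABI and tokenizer values, and the ANDROID_ABI variable, are assumptions for this sketch, not the job's exact matrix):

```bash
# Hypothetical local fan-out over the matrix dimensions mentioned above;
# ANDROID_ABI and the value lists are illustrative only.
for abi in arm64-v8a x86_64; do
  for tokenizer in bpe tiktoken; do
    ANDROID_ABI="$abi" bash build/build_android_llm_demo.sh \
      "$tokenizer" "artifacts-${abi}-${tokenizer}"
  done
done
```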